Sorry for not responding @kdere. I just saw this comment. For some reason GitHub didn't send me a notification or I missed it somehow...
PyTables may be a better way to go. It seems more full-featured, but has a bit of a steeper learning curve. I've just used h5py because I have experience with it and it is simpler. If any of the functionality I've built can be replaced or improved by PyTables, I'm all for switching over.
The increase in read speed is one of my main motivations for doing this as well. Better to read the ASCII data once so that all subsequent reads come from the HDF5 "database". Additionally, this means you can easily slice into portions of a large file (e.g. some of the Fe .wgfa files) without loading the whole thing into memory. I've found that ChiantiPy can be a bit sluggish for some of these ions with many, many transitions.
Though the container is HDF5, the layout is essentially the same as the ASCII database: a group for each element, a subgroup for each ion, and then, within those subgroups, a dataset for each of the individual files (.elvlc, .wgfa, etc.).
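To make that concrete, here's a rough sketch using h5py. The group and dataset names (`fe`, `fe_2`, `wgfa/avalue`) are placeholders I made up to illustrate the element → ion → file-type hierarchy, not the package's actual schema:

```python
import os
import tempfile

import h5py
import numpy as np

# Build a tiny file mirroring the described layout:
# element group -> ion subgroup -> one dataset per ASCII file type.
path = os.path.join(tempfile.mkdtemp(), 'chianti_demo.h5')
with h5py.File(path, 'w') as f:
    ion = f.create_group('fe/fe_2')  # h5py creates intermediate groups from the path
    ion.create_dataset('wgfa/avalue', data=np.arange(100000, dtype='f8'))

# Slicing an h5py dataset reads only the requested rows from disk,
# so a large .wgfa-style dataset never has to be loaded whole.
with h5py.File(path, 'r') as f:
    first_ten = f['fe/fe_2/wgfa/avalue'][:10]

print(first_ten)
```

The partial read in the second `with` block is what makes the large-ion case cheap: only the sliced rows ever leave the disk.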
If you have some time, play around with the package and let me know what you think (here or in an issue on the main package page).
I had started building HDF5 files of the CHIANTI database using PyTables, which I had some experience with and which allowed me to insert the references as a table. The main reason for doing this was to speed up reading some of the data files. It turned out that some of the biggest files, such as some .wgfa files, kept every single calculated transition, even some with an A value of 0.0, making them very large. Enforcing a minimum branching ratio on the A values, something like 1.e-4 or 1.e-5, made most of the problem go away.
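For reference, here's roughly how such a branching-ratio cut could be applied with NumPy. The arrays and the 1e-5 threshold are made up for illustration; the idea is just to divide each A value by the total A out of its upper level and drop transitions below the cutoff:

```python
import numpy as np

# Hypothetical .wgfa-style columns: upper-level index and A value per transition.
upper = np.array([2, 2, 2, 3, 3])
avalue = np.array([1.0, 1e-6, 0.0, 5.0, 4e-4])

# Sum the A values out of each upper level, then form the branching ratio.
totals = np.zeros(upper.max() + 1)
np.add.at(totals, upper, avalue)
branching = avalue / totals[upper]

# Keep only transitions above the minimum branching ratio;
# the 1e-6 and 0.0 entries are dropped.
keep = branching >= 1e-5
print(avalue[keep])
```

This kind of filter could run once, when the ASCII files are converted, so the HDF5 datasets never carry the negligible transitions at all.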
Also, the plan was to keep the ASCII files as our basic file structure.