#odML-HDF5-Converter
###GitHub Repository for Library
https://github.com/ila057/odML-HDF5-Converter/
###List of Commits
- https://github.com/ila057/odML-HDF5-Converter/commits/master?author=ila057
- https://github.com/ila057/odML-HDF5-Converter/commits/basicConverter?author=ila057
- https://github.com/INCF/eeg-database/commits/hdf5Converter?author=ila057
###Background
Data from electroencephalography (EEG) and event-related potential (ERP) experiments is stored in the EEGBase web portal through a system of templates. These templates let users store metadata in odML format, while the data itself is stored in its raw format (.eeg or .avg). However, initiatives within the neuroinformatics community have proposed unifying data models around open, standardized formats such as HDF5 and odML. This open-data goal motivates the library, which allows conversion to either odML or HDF5 as required.

The library is designed to be integrated with EEGBase, so that users can download the data package in HDF5 format in addition to the existing option of downloading the odML metadata and raw data.
###Working
The library has been developed in two versions:
- Basic-Converter, which contains the core of the Complete-Converter and takes the following inputs:
  - name of the HDF5 file to be created
  - metadata.xml
  - .eeg/.avg file containing the raw data
  - header file name
  - marker file name

  The Basic-Converter extracts the relevant information from metadata.xml, the .eeg/.avg file, the header file, and the marker file, and creates an HDF5 file with the given name. It can be used on its own when integrating into another piece of software, as in this sample code:

  ```java
  ODMLParserImpl odmlParser = new ODMLParserImpl("Experiment_208_Driver's_attention_with_visual_stimulation_and_audio_disturbance", "metadata.xml", "LED_26_3_2014_0004.eeg", "LED_26_3_2014_0004.vhdr", "LED_26_3_2014_0004.vmrk");
  ```

  Link to the code of Basic-Converter: https://github.com/ila057/odML-HDF5-Converter/tree/basicConverter
- Complete-Converter, which does everything the Basic-Converter does but is developed for integration into EEGBase. It takes a zip data package downloaded from EEGBase; the package must follow the directory structure templates used in EEGBase. The converter extracts the data, marker, and header files and converts them into HDF5 format, retaining the directory structure and the scenario in which the experiment was conducted. Example usage:

      String finalConvertedFolderToBeZipped = dataProcessor.generateConvertedDataSet("EEG_ERP.zip");

  Link to the code of Complete-Converter: https://github.com/ila057/odML-HDF5-Converter/
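
In the Basic-Converter sample, the data, header, and marker files share one base name (`LED_26_3_2014_0004`). Assuming that naming convention holds for a given recording, the two companion file names can be derived from the data file name. The sketch below is only an illustration: `CompanionFiles` and its methods are hypothetical helpers, not part of this library.

```java
// Hypothetical helper (not part of odML-HDF5-Converter): derives the header
// and marker file names from a data file name, ASSUMING all three files
// share the same base name, as in the README's sample call.
final class CompanionFiles {

    // Strips the extension from a plain file name (no directory part).
    static String baseName(String dataFile) {
        int dot = dataFile.lastIndexOf('.');
        return dot < 0 ? dataFile : dataFile.substring(0, dot);
    }

    // Name of the header file expected next to the data file.
    static String headerFor(String dataFile) {
        return baseName(dataFile) + ".vhdr";
    }

    // Name of the marker file expected next to the data file.
    static String markerFor(String dataFile) {
        return baseName(dataFile) + ".vmrk";
    }

    public static void main(String[] args) {
        String data = "LED_26_3_2014_0004.eeg";
        System.out.println(headerFor(data));
        System.out.println(markerFor(data));
    }
}
```

For example, `CompanionFiles.headerFor("LED_26_3_2014_0004.eeg")` returns `"LED_26_3_2014_0004.vhdr"`. If a recording does not follow this convention, the names have to be supplied explicitly, as the Basic-Converter's constructor already allows.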
###Implementation
Although the code is documented and explains the role of each component, a few salient points about how it works, and why, are described below:
- The Complete-Converter starts from `generateConvertedDataSet(String inputZipFile)`, which unzips the data package and hands its contents over to `processAllDataSetsFinal()`. This function extracts the data files (.eeg/.avg files, header files, and marker files) and forwards them for conversion to HDF5. Analysis of the data packages created by EEGBase showed that each follows one of the directory structures below:
  1. `<Data_Package.zip>/<Several_Experiments>/Data/<Several_Experiment_Datasets.zip>/<data_files>`
  2. `<Data_Package.zip>/<Several_Experiments>/Data/<Several_Experiment_Datasets.zip>/<Experiment_Dataset_Name>/<data_files>`
  3. `<Data_Package.zip>/<Several_Experiments>/Data/<data_files>`

  The `<data_files>` must be converted to HDF5 and are therefore passed to `ODMLParser`. The role of `ODMLParser` is to call `DataParser` and `MetadataParser`, which convert the data and the metadata, respectively, into the HDF5 file.
- The `DataParser` converts the raw data from `<file_name>.eeg/.avg` and stores it as double values in HDF5. The `MetadataParser` must read three files to consolidate the entire metadata:
  - metadata.xml: metadata corresponding to a dataset
  - `<file_name>.vmrk`: marker file
  - `<file_name>.vhdr`: header file
- The converted HDF5 file can also be converted back to the set of odML metadata and data files if required. Sample usage:

      MetadataCreator metadataCreator = new MetadataCreatorImpl(hdf5FileName);
      metadataCreator.createOdml(odMLMetadataFile);
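
The three package layouts listed above differ only in how deeply the data files are nested, so one recursive scan can cover all of them. The following is a minimal sketch of that idea, not the library's actual `processAllDataSetsFinal()` implementation: the class `DataFileScanner`, its methods, and the `!/` separator used to show paths inside nested archives are hypothetical names chosen for illustration. It relies only on `java.util.zip` from the standard library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical sketch (not the library's code): lists every data, header,
// marker, and metadata file in a package, descending into nested zips.
final class DataFileScanner {

    // True for the file types the converter consumes.
    static boolean isDataFile(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".eeg") || lower.endsWith(".avg")
                || lower.endsWith(".vhdr") || lower.endsWith(".vmrk")
                || lower.endsWith("metadata.xml");
    }

    // Reads a zip stream and appends matching entry paths to 'found'.
    // A nested .zip entry is scanned in place by wrapping the current
    // entry's bytes in another ZipInputStream. The nested stream is
    // deliberately not closed, because closing it would close the outer one.
    static void collect(InputStream in, String prefix, List<String> found)
            throws IOException {
        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue;
            }
            String path = prefix + entry.getName();
            if (path.toLowerCase(Locale.ROOT).endsWith(".zip")) {
                collect(zip, path + "!/", found);
            } else if (isDataFile(path)) {
                found.add(path);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java DataFileScanner <package.zip>");
            return;
        }
        List<String> found = new ArrayList<>();
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            collect(in, "", found);
        }
        found.forEach(System.out::println);
    }
}
```

Scanning in place avoids unpacking nested archives to disk just to discover what they contain. A real converter would still need to extract the files it finds, since the header, marker, and data files are read together.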
###Limitations
This library is only compatible with Java 8 or later and needs a 64-bit OS to work properly. These hardware and Java restrictions are imposed by the nix-java bindings, which the project uses as a dependency.
###Future Work
- The eegloader library is used to extract the channel and marker information from the .vhdr and .vmrk files, respectively. It does not extract all the information in those files, but supports the following:
  - Channel
    - Name
    - Number
    - Units
    - Resolution
  - Marker
    - Name
    - Position
    - Stimulus
    - Channel

  When eegloader is extended, this library can also be extended to include further information from the header and marker files.
- The EEGBase live server currently runs on Java 6. This prevents merging our library, as well as some commits in EEGBase, until its servers are moved to Java 8. Merging the library with EEGBase therefore remains to be done after the migration to Java 8.
- This library's dependencies (nix-java bindings) require a 64-bit OS. The existing VM used for developing and testing EEGBase is 32-bit, so end-to-end testing with EEGBase could not be carried out and remains future work. However, the library has been tested separately, and the code that enables integration into EEGBase has also been thoroughly tested.
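
To make the channel fields listed under Future Work more concrete, here is a minimal sketch of reading one channel entry from a BrainVision-style header, where entries take the form `Ch<number>=<name>,<reference>,<resolution>,<unit>`. This is not how eegloader is implemented; `ChannelInfo` and `parse` are hypothetical names, and treating an empty resolution as 1.0 is an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not eegloader's API): holds the four channel fields
// the README lists and parses them from one header line.
final class ChannelInfo {
    final int number;
    final String name;
    final double resolution;
    final String units;

    ChannelInfo(int number, String name, double resolution, String units) {
        this.number = number;
        this.name = name;
        this.resolution = resolution;
        this.units = units;
    }

    private static final Pattern CHANNEL_LINE = Pattern.compile("^Ch(\\d+)=(.*)$");

    // Parses a line such as "Ch1=Fp1,,0.5,uV".
    // Field order after '=': name, reference, resolution, unit.
    static ChannelInfo parse(String line) {
        Matcher m = CHANNEL_LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a channel line: " + line);
        }
        // Limit -1 keeps empty fields, e.g. the empty reference.
        String[] fields = m.group(2).split(",", -1);
        String name = fields[0].trim();
        String res = fields.length > 2 ? fields[2].trim() : "";
        String units = fields.length > 3 ? fields[3].trim() : "";
        // ASSUMPTION of this sketch: a missing resolution means 1.0.
        double resolution = res.isEmpty() ? 1.0 : Double.parseDouble(res);
        return new ChannelInfo(Integer.parseInt(m.group(1)), name, resolution, units);
    }

    public static void main(String[] args) {
        ChannelInfo c = parse("Ch1=Fp1,,0.5,uV");
        System.out.println(c.number + " " + c.name + " " + c.resolution + " " + c.units);
    }
}
```

Any additional header fields that a future eegloader version exposes could be carried the same way, as extra members on the value class that are then written into the HDF5 metadata.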