- Conference
-
11th IEEE International Conference on eScience (08/31-09/04/2015 in Munich)
- i) Full Paper
-
submit for main conference by March 23rd (camera-ready & reviewed by June 7), OR
- ii) Short Paper
-
submit to the "Works in progress" workshop (camera-ready form due on June 21), accompanied by a poster due on May 20
- Title
-
'Development of the Materials Project's open-source framework enabling seamless integration of generic user-contributed data for Computational Materials Design'
- Authors
-
Patrick Huck, Anubhav Jain, Dan Gunter, Kristin Persson (LBNL)
- Abstract (305 words, include flow chart)
-
As a key player in the U.S. Materials Genome Initiative, the 'Materials Project' (www.materialsproject.org) uses HPC resources to determine the energetic and structural properties of over 50,000 inorganic compounds by means of high-throughput ab-initio calculations. The calculation results and analysis tools are disseminated to the public via modern web and application interfaces. The Materials Project serves to accelerate the discovery, design and creation of next-generation materials for use in applications such as batteries, photovoltaics, and semiconductors. However, the materials science research community has a continually growing supply of experimental and theoretical materials data that is not calculated by the Materials Project. With a growing user base of over 10,000 registered and hundreds of active users, it becomes increasingly important and valuable for the Materials Project to also enable community-driven submissions, which would extend the project's scope and improve the integrity and quality of the provided datasets.
In this paper we describe further computing and software infrastructure within the Materials Project to integrate and organize contributions of computed or measured materials data from users. The framework enables users to add their own materials calculations to the Materials Project database and provides dynamic programmable interfaces for viewing and analyzing the data. The contributed data is immediately comparable to the project’s core data. Quality control of contributed data is a key concern, and we describe how the infrastructure is able to isolate this data from the core data set while still allowing integrated analysis capabilities that aggregate metrics across the entire data set. The resulting framework is expected to enhance user collaborations on materials properties and maximize the impact of each contributor’s dataset on the community.
The immediate outcome is more efficient research through the centralized exchange of data, techniques and best practices via a common platform. In the long term, we envision the Materials Project as a data management platform that serves its users as an institutional, and thus community-wide, memory for computed and experimental materials science.
Note: Outline below is work in progress.
- Outline
-
introduce the Materials Project and put the work in context
- DESIGN GOALS
-
-
generalized & non-specific
-
modular & extensible enough to allow for wide range of contribution formats & sizes from different projects
-
split & gather contributions to be assigned to MP (cat.) IDs
-
render in basic default/generic layout
-
provide mechanism for full user-control over graphs
-
vs. expectations/visions for project-specific App: “automagically process raw data & visualize”
-
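The "split & gather" goal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `(identifier, payload)` record layout, the function name `split_and_gather`, the identifier strings and the values are all made up for the example and are not taken from the framework.

```python
from collections import defaultdict


def split_and_gather(records):
    """Group contribution records by the identifier they are assigned to.

    `records` is an iterable of (identifier, payload) pairs. Both the pair
    layout and the identifier strings are assumptions made for this sketch.
    """
    gathered = defaultdict(list)
    for identifier, payload in records:
        # Records keep their submission order within each identifier.
        gathered[identifier].append(payload)
    return dict(gathered)


# Hypothetical contribution: three records touching two materials
# (identifiers and numbers are placeholders, not real data).
records = [
    ("mp-0001", {"band_gap_eV": 1.1}),
    ("mp-0002", {"band_gap_eV": 3.4}),
    ("mp-0001", {"band_gap_eV": 1.2}),
]
by_id = split_and_gather(records)
```

Grouping by identifier first means one submitted file can be split into per-material pieces, while several files touching the same material are gathered under one key.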
- USE CASES
-
-
band gap calculations at different conditions
-
measured XAS/XMCD-spectra compared to FEFF calculations
-
properties of nano-porous materials, adsorption, carbon capture
-
VASP calculations for photovoltaic & diffusion
-
- FLOW CHART
-
-
MP Contribution File Format
-
Input File Faker
-
Recursive Parser & Mongo Adapter
-
Builder Stage
-
Generic Frontend & User Apps
-
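To make the "Recursive Parser & Mongo Adapter" stage more concrete, here is a minimal sketch. The input syntax is purely hypothetical (section headers marked with one or more `#` characters, `key: value` lines underneath) and is not the actual MP contribution file format; `parse_sections` and `to_document` are likewise illustrative names. The adapter step is reduced to building a plain nested dict of the kind a document store such as MongoDB accepts.

```python
def parse_sections(lines, level=1, start=0):
    """Recursively parse lines into a nested dict.

    Returns (document, next_index). A header with fewer '#' characters
    than `level` ends the current section and is left for the caller.
    """
    doc = {}
    i = start
    while i < len(lines):
        line = lines[i].strip()
        if not line:
            i += 1
            continue
        if line.startswith("#"):
            depth = len(line) - len(line.lstrip("#"))
            if depth < level:
                break  # header belongs to an enclosing section
            title = line.lstrip("#").strip()
            # Everything until the next header of this depth (or shallower)
            # becomes the content of the subsection.
            doc[title], i = parse_sections(lines, depth + 1, i + 1)
            continue
        key, _, value = line.partition(":")
        doc[key.strip()] = value.strip()  # values are kept as strings
        i += 1
    return doc, i


def to_document(text, contributor):
    """Wrap the parsed tree in a dict suitable for a document store."""
    content, _ = parse_sections(text.splitlines())
    return {"contributor": contributor, "content": content}


# Hypothetical input file (placeholder names and values).
example = """
# conditions
temperature: 300
## sample
name: A1
# results
band_gap: 1.1
"""
document = to_document(example, contributor="jane.doe")
```

Keeping the parser free of any database calls lets the same nested dict feed both a later builder stage and the generic display.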
- FRONTEND GENERIC DISPLAY
-
-
table-like data (search- & sortable)
-
tree-like data (expandable)
-
interactive user-controlled graphs via plotly
-
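One way the table-like and tree-like views could share a single data source is to flatten the nested data into rows that can be searched and sorted. The sketch below assumes contributions are held as nested dicts; the names `flatten` and `search` and the sample values are hypothetical.

```python
def flatten(tree, prefix=""):
    """Yield (dotted_path, value) rows for every leaf of a nested dict."""
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the subtree, extending the dotted path.
            yield from flatten(value, path)
        else:
            yield path, value


def search(rows, term):
    """Return rows whose path or value contains `term`, sorted by path."""
    term = term.lower()
    hits = [
        row for row in rows
        if term in row[0].lower() or term in str(row[1]).lower()
    ]
    return sorted(hits, key=lambda row: row[0])


# Placeholder tree-like data, as an expandable view might receive it.
tree = {
    "conditions": {"temperature": "300", "sample": {"name": "A1"}},
    "results": {"band_gap": "1.1"},
}
rows = list(flatten(tree))
sample_rows = search(rows, "sample")
```

The nested dict can drive the expandable tree directly, while the flattened rows back the search- and sortable table.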
- SUMMARY/FEATURES
-
-
solution to automate repetitive task of extracting information from instrumental/calculated data & generate publication-ready graphs
-
organize scientific output of wide materials science community using the standards established in Materials Project
-
maximize exposure (to a broad multi-disciplinary audience), impact and usability of produced data through modern web tools and programming interfaces
-
by default, provide interactive, user-controlled and shareable graphs including programmatic access to data & underlying code
-
possibly support streaming / live-updating graphs as data collected (both for experimental & theoretical)
-
streamline/pipeline process of experimental and theoretical comparisons from different institutions accelerating scientific advances
-
serve as well-organized institutional/community-wide memory reducing redundancy and loss of data/figures
-
period of protected data submission and presentation, one-click GitHub-style "make public"
-
contribute during paper publishing process (via CIF etc.)
-