@dangunter
Forked from tschaume/MPContribs.asc
Last active August 29, 2015 14:16
Conference

11th IEEE International Conference on eScience (08/31-09/04/2015 in Munich)

i) Full Paper

submit for main conference by March 23rd (camera-ready & reviewed by June 7), OR

ii) Short Paper

submit for "Works in progress" workshop due in camera-ready form on June 21, and accompany with poster due on May 20

Title

Development of the Materials Project's open-source framework enabling seamless integration of generic user-contributed data for Computational Materials Design

Authors

Patrick Huck, Anubhav Jain, Dan Gunter, Kristin Persson (LBNL)

Abstract (305 words, include flow chart)

As a key player in the U.S. Materials Genome Initiative, the Materials Project (www.materialsproject.org) uses HPC resources to determine the energetic and structural information of over 50,000 inorganic compounds by means of high-throughput ab-initio calculations. The calculation results and analysis tools are disseminated to the public via modern web and application interfaces. The Materials Project serves to accelerate the discovery, design and creation of next-generation materials for use in applications such as batteries, photovoltaics, and semiconductors. However, the materials science research community has a continually growing supply of experimental and theoretical materials that are not calculated by the Materials Project. With a growing user base of over 10,000 registered and hundreds of active users, it becomes increasingly important and valuable for the Materials Project to also enable community-driven submissions, which would extend the scope and improve the integrity/quality of the provided datasets.

In this paper we describe further computing and software infrastructure within the Materials Project to integrate and organize contributions of computed or measured materials data from users. The framework enables users to add their own materials calculations to the Materials Project database and provides dynamic programmable interfaces for viewing and analyzing the data. The contributed data is immediately comparable to the project’s core data. Quality control of contributed data is a key concern, and we describe how the infrastructure is able to isolate this data from the core data set while still allowing integrated analysis capabilities that aggregate metrics across the entire data set. The resulting framework is expected to enhance user collaborations on materials properties and maximize the impact of each contributor’s dataset on the community.

The immediate outcome is more efficient research activities due to the centralized exchange of data, techniques and best practices via a common platform, with a long-term view of the Materials Project as a data management platform serving its users as an institutional, and thus community-wide, memory for computed and experimental materials science.

Note

Outline below is work in progress.
Please ignore (03/05/2015).

Outline

Introduce the Materials Project and put the work in context.

DESIGN GOALS
  • generalized & non-specific

  • modular & extensible enough to allow for wide range of contribution formats & sizes from different projects

  • split & gather contributions to be assigned to MP (cat.) IDs

  • render in basic default/generic layout

  • provide mechanism for full user-control over graphs

  • vs. expectations/visions for project-specific App: “automagically process raw data & visualize”
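The "split & gather" goal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the field name `mp_id`, and the placeholder identifiers are hypothetical and not taken from the framework itself.

```python
from collections import defaultdict


def gather_by_material(contributions):
    """Group contribution records under the material ID they refer to.

    Assumes each record is a dict with a hypothetical "mp_id" field.
    """
    gathered = defaultdict(list)
    for contrib in contributions:
        gathered[contrib["mp_id"]].append(contrib)
    return dict(gathered)


# Placeholder records; IDs, contributors, and keys are made up for illustration.
contributions = [
    {"mp_id": "mp-1", "contributor": "group-a", "data": {"note": "first"}},
    {"mp_id": "mp-2", "contributor": "group-a", "data": {"note": "second"}},
    {"mp_id": "mp-1", "contributor": "group-b", "data": {"note": "third"}},
]

by_material = gather_by_material(contributions)
```

Grouping by identifier keeps each contribution intact while letting a per-material view collect everything that different projects submitted for the same entry.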

USE CASES
  • band gap calculations at different conditions

  • measured XAS/XMCD-spectra compared to FEFF calculations

  • properties of nano-porous materials, adsorption, carbon capture

  • VASP calculations for photovoltaic & diffusion

FLOW CHART
  • MP Contribution File Format

  • Input File Faker

  • Recursive Parser & Mongo Adapter

  • Builder Stage

  • Generic Frontend & User Apps
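As a rough sketch of the "Recursive Parser" stage listed above, the snippet below turns a sectioned text file into a nested dict of the kind a document store such as MongoDB can hold. The outline does not specify the MP contribution file format, so the `#`-prefixed section headers and `key: value` lines used here are an invented stand-in, and `parse_sections` is a hypothetical helper name.

```python
def parse_sections(lines, level=1, start=0):
    """Recursively parse lines into a nested dict.

    Assumed (invented) format: a line starting with N '#' characters opens
    a section at depth N; any other non-empty line is "key: value".
    Returns (document, index of the first unconsumed line).
    """
    doc = {}
    i = start
    while i < len(lines):
        line = lines[i].strip()
        if not line:
            i += 1
            continue
        if line.startswith("#"):
            depth = len(line) - len(line.lstrip("#"))
            if depth < level:
                break  # header belongs to an enclosing section
            name = line.lstrip("#").strip()
            # Parse the section body one level deeper, then resume after it.
            doc[name], i = parse_sections(lines, depth + 1, i + 1)
        else:
            key, _, value = line.partition(":")
            doc[key.strip()] = value.strip()
            i += 1
    return doc, i


# Placeholder input; identifiers and keys are made up for illustration.
text = """\
# mp-1
contributor: group-a
## conditions
temperature_K: 300
# mp-2
contributor: group-b
"""

document, _ = parse_sections(text.splitlines())
```

All values stay strings in this sketch; type conversion and the actual database insert would belong to the adapter and builder stages.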

FRONTEND GENERIC DISPLAY
  • table-like data (search- & sortable)

  • tree-like data (expandable)

  • interactive, user-controlled graphs via plotly
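One way the same contribution could feed both the tree-like and the table-like display is to flatten the nested document into (path, value) rows. The sketch below is an assumption about how that might look, not the frontend's actual code; `flatten` and the dotted-path convention are hypothetical.

```python
def flatten(tree, prefix=""):
    """Flatten a nested dict into a list of (dotted path, value) rows."""
    rows = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the subtree, carrying the path so far.
            rows.extend(flatten(value, path))
        else:
            rows.append((path, value))
    return rows


# Placeholder tree; keys and values are made up for illustration.
tree = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
rows = flatten(tree)
```

The nested dict can drive an expandable tree view directly, while the flattened rows suit a sortable, searchable table.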

SUMMARY/FEATURES
  • solution to automate repetitive task of extracting information from instrumental/calculated data & generate publication-ready graphs

  • organize scientific output of wide materials science community using the standards established in Materials Project

  • maximize exposure (to a broad multi-disciplinary audience), impact and usability of produced data through modern web tools and programming interfaces

  • by default, provide interactive, user-controlled and shareable graphs including programmatic access to data & underlying code

  • possibly support streaming / live-updating graphs as data is collected (both for experimental & theoretical)

  • streamline/pipeline process of experimental and theoretical comparisons from different institutions accelerating scientific advances

  • serve as well-organized institutional/community-wide memory reducing redundancy and loss of data/figures

  • period of protected data submission and presentation, one-click GitHub-style "make public"

  • contribute during paper publishing process (via CIF etc.)

@computron

My comments:

  1. seemless -> seamless
  2. You should mention a little bit about how you are merging available tools (Pandas, Plot.ly, MongoDB) with some custom solutions, i.e. provide a few implementation details. This is a CS conference, and just mentioning those things will give people some explicit idea of what's going on.
  3. Since this is a CS conference and not an MS conference, the ideal situation is probably to rework this so that the problem is framed in a more general CS-ey way, and then this paper takes the approach of MP being more of a community solution to that problem, i.e. see Dan's paper "Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project" for some guidance on how to take this viewpoint and approach the problem from the CS perspective.

@computron

also REST (not just in the figure, but in the text)
