GSOC 2016 Work Product Submission Jeff Kinnison

In Situ Simulation Analysis Using Airavata

Jeff Kinnison

Summary

Apache Airavata is a Science Gateway designed for easy interaction with remote computing resources and easy customization to address the concerns of heterogeneous scientific applications. All parts of my GSoC project have been completed to the satisfaction of my mentors and merged into the development version of the Airavata project. This project involved 1) data streaming and 2) data sharing for scientific applications. Data streaming allows users to monitor remote applications and meaningfully visualize their state with regard to the science being carried out. Data sharing allows users to easily share and replicate computational science experiments while respecting security concerns.

Introduction

Despite progress in making scientific computing accessible, science gateways still face the challenges of providing feedback to users and sharing research among them. To address these challenges, this summer I added two capabilities to Apache Airavata: streaming data from remote computing nodes, and sharing projects and experiments among users.

Data streaming allows for application-level remote monitoring using secure communications protocols. The data to be streamed is defined at the application level and may be incorporated into gateways using a WebSockets server deployed alongside Airavata and JavaScript client-side code. Project and experiment sharing allows multiple users to access experiment inputs and outputs, and also allows users to clone shared projects. User permissions are set coarsely at the project level and can be fine-tuned on a per-experiment basis to allow easy, secure collaboration.

SimStream Application

Original Repository

Integration into Airavata

Merging Pull Request

Description

SimStream is a data collection and message sending application that is configured to periodically run user-defined functions and send the results to a remote message broker. In Airavata, it is used to parse remote application output files and pipe data to the PGA (PHP Gateway for Airavata). The current repository has example code that parses outputs from the OpenMM Molecular Dynamics simulation suite and sends log entries and Root Mean Squared Deviation (RMSD) calculations as JSON data.
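To make the RMSD-as-JSON idea concrete, here is a minimal sketch of a collection function of the kind described above. It is not taken from the SimStream repository: the function names, the JSON field names (`step`, `rmsd`), and the sample coordinates are all hypothetical, and the calculation assumes the two coordinate sets are already aligned (no superposition step).

```python
import json
import math


def rmsd(reference, current):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(reference) != len(current) or not reference:
        raise ValueError("coordinate sets must be non-empty and the same length")
    squared = sum(
        (rx - cx) ** 2 + (ry - cy) ** 2 + (rz - cz) ** 2
        for (rx, ry, rz), (cx, cy, cz) in zip(reference, current)
    )
    return math.sqrt(squared / len(reference))


def rmsd_message(step, reference, current):
    """Serialize one RMSD sample as a JSON string (field names are illustrative only)."""
    return json.dumps({"step": step, "rmsd": rmsd(reference, current)})


# Two atoms, each displaced by 1.0 along z relative to the reference frame.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
current = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
message = rmsd_message(100, reference, current)
```

A function like `rmsd_message` is what a user would hand to the collector; the transport to the broker is a separate concern.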

To accommodate heterogeneous scientific applications, SimStream is designed to run user-defined parsing functions. In effect, it is a long-running polling application that facilitates customized analyses. All data collection is concurrent (not parallel) using Python threads. Messages are sent to a RabbitMQ server using the pika library.
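The polling design described above can be sketched roughly as follows. This is an illustration under stated assumptions, not SimStream's actual code: `PeriodicCollector` and `make_pika_publisher` are hypothetical names, and the demo swaps RabbitMQ for an in-memory list so it runs without a broker. The pika calls used in the publisher (`BlockingConnection`, `URLParameters`, `channel.basic_publish`) are real, but how SimStream wires them together is not shown in this report.

```python
import json
import threading
import time


class PeriodicCollector(threading.Thread):
    """Call `collect` every `interval` seconds and hand each result to `publish`."""

    def __init__(self, name, collect, publish, interval=1.0):
        super().__init__(name=name, daemon=True)
        self.collect = collect
        self.publish = publish
        self.interval = interval
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait acts as an interruptible sleep: it returns True once stop() is called.
        while not self._stop_event.wait(self.interval):
            result = self.collect()
            if result is not None:
                self.publish(self.name, json.dumps(result))

    def stop(self):
        self._stop_event.set()


def make_pika_publisher(url, exchange):
    """Build a publish(routing_key, body) function backed by RabbitMQ (hypothetical helper)."""
    import pika  # imported lazily so the rest of the sketch runs without pika installed

    # Assumption: one publisher per collector thread, since a pika
    # BlockingConnection should not be used from several threads at once.
    connection = pika.BlockingConnection(pika.URLParameters(url))
    channel = connection.channel()

    def publish(routing_key, body):
        channel.basic_publish(exchange=exchange, routing_key=routing_key, body=body)

    return publish


# Demo with an in-memory publisher standing in for the broker.
published = []
ticks = iter(range(1000))
collector = PeriodicCollector(
    "demo.ticks",
    collect=lambda: {"tick": next(ticks)},
    publish=lambda key, body: published.append((key, body)),
    interval=0.01,
)
collector.start()
time.sleep(0.2)
collector.stop()
collector.join()
```

Because the threads spend most of their time waiting on the interval or on file and network I/O, concurrency without parallelism is usually enough for this kind of workload.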

Status

All work described in the proposal has been completed and merged into the Airavata project, with the exception of event monitoring and handling. That feature was determined to be low-priority: while it is an interesting idea, it generated little interest. It may be included in future versions of SimStream.

Continuing Work

  1. Writing up a how-to guide for designing data collection functions
  2. Dynamically configuring SimStream on an application-by-application basis
  3. Continuing testing, profiling, and maintenance

AMQPWSTunnel Application

Original Repository

Integration into Airavata

Merging Pull Request

Description

AMQPWSTunnel is the solution designed to distribute SimStream data to the PGA. It uses the Tornado WebSocket framework and the pika library to consume data from RabbitMQ and stream it to clients over the wss protocol. In the PGA, a JavaScript WebSocket client can read the data and render it in a meaningful format.

This module was necessary because the PGA was not designed to incorporate both standard HTTP and WebSockets endpoints. A preferred solution would be to incorporate WebSockets endpoints directly into the PGA to unify data access and security.
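The relay role that AMQPWSTunnel plays between the broker and browser clients can be illustrated with a simplified stand-in. The sketch below deliberately uses only the standard library's asyncio rather than Tornado or pika, so it shows the fan-out logic and nothing about the real module's API; `StreamHub`, its methods, and the routing keys are hypothetical. In a real deployment each queue would be drained by a WebSocket handler that writes to its client, and `dispatch` would be called from the RabbitMQ consumer callback.

```python
import asyncio


class StreamHub:
    """Route broker messages to the clients subscribed to each routing key."""

    def __init__(self):
        self._subscribers = {}  # routing key -> set of asyncio.Queue

    def subscribe(self, routing_key):
        """Register a client and return the queue its messages will arrive on."""
        queue = asyncio.Queue()
        self._subscribers.setdefault(routing_key, set()).add(queue)
        return queue

    def unsubscribe(self, routing_key, queue):
        self._subscribers.get(routing_key, set()).discard(queue)

    def dispatch(self, routing_key, body):
        """Deliver one consumed message; return the number of clients reached."""
        queues = self._subscribers.get(routing_key, ())
        for queue in queues:
            queue.put_nowait(body)
        return len(queues)


async def demo():
    hub = StreamHub()
    rmsd_client = hub.subscribe("openmm.rmsd")
    log_client = hub.subscribe("openmm.log")
    delivered = hub.dispatch("openmm.rmsd", '{"step": 100, "rmsd": 1.0}')
    received = await rmsd_client.get()
    # The log subscriber must not see RMSD traffic.
    return delivered, received, log_client.empty()


result = asyncio.run(demo())
```

Keying subscriptions by routing key is one plausible way to keep each client's stream limited to the data it asked for; it does not by itself address the access-control policies still being defined.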

Status

AMQPWSTunnel has been merged into Airavata as a submodule. Security policies and deployment methods are still being defined.

Continuing Work

  1. Integrating into the Airavata startup process (start AMQPWSTunnel with Airavata)
  2. Defining security policies to ensure streams are only available to the correct users
  3. Writing a how-to guide for creating client-side data visualizations of SimStream data

Grouper Integration

Integration into Airavata PGA

Merging Pull Request

Commit List

Description

Grouper is a database interface for creating and managing group membership. Though not included in the original proposal, enabling project and experiment sharing was a major part of my work this summer. Using Grouper, I added coarse- and fine-grained data sharing controls to the PGA to support collaboration and replication of research.
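The interplay of coarse project-level permissions and per-experiment fine-tuning can be sketched as a simple lookup. This is a hypothetical model for illustration only: it does not use Grouper's API, and the rule that an experiment-level grant overrides the project-level one is an assumption, since the sharing policies are still being defined.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2


@dataclass
class Project:
    name: str
    grants: dict = field(default_factory=dict)       # user -> Access for the whole project
    experiments: dict = field(default_factory=dict)  # experiment name -> {user -> Access}

    def access(self, user, experiment):
        """Assumed rule: an experiment-level grant wins; otherwise use the project grant."""
        overrides = self.experiments.get(experiment, {})
        if user in overrides:
            return overrides[user]
        return self.grants.get(user, Access.NONE)


project = Project("md-study", grants={"alice": Access.WRITE, "bob": Access.READ})
project.experiments["run-1"] = {"bob": Access.WRITE}  # widen bob's access on one experiment
project.experiments["run-2"] = {"bob": Access.NONE}   # hide another experiment from bob
```

With this model, `project.access("bob", "run-3")` falls back to the project-level `Access.READ`, while users with no grant at all get `Access.NONE`.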

Status

Sharing policies are still being defined; however, basic sharing features have been enabled in the PGA and Airavata.

Continuing Work

  1. Refining sharing policies
  2. Modifying the API to work well with use cases

Presentation

The results of this project were presented at the XSEDE Gateways & Workflows Symposium Series on August 19, 2016. This presentation involved a demonstration of each aspect of the project from the client perspective.
