@magentanova
Last active October 13, 2018 23:15
Mountain Workshops Readme

This document is a brief layout of the technology underlying Mountain Workshops, specifically as it pertains to the production of mountainworkshops.org. It is intended as a lay of the land for all participants and staff, not just developers, but some light technical descriptions are provided, as well as links to documentation for specific software. Mountain Workshops involves many interlocking components, and it's important for technologists and other stakeholders to get a bird's-eye view of what those components are, how they depend on each other, and how they are sequenced along the timeline of the workshop.

The focus here is on the transmission of photojournalism stories. Videos are partially accommodated by this system. Great strides were made toward full automation in 2017, but as of now there are still manual steps that need to be taken for data visualization projects.

Timeline

  1. This year's workshop and participants are created and saved as rows of data in the Django admin panel (see note) that feeds mountainworkshops.org. The workshop and participants should all be stored this way before stories are created in WordPress, because the story editors must select the participants from a drop-down menu that is fed by the Django database.

  2. Throughout the week, story editors enter and edit stories in a WordPress content-management system at mwstory.org/wp-admin. These stories are entered as "posts" in that admin panel.

    • The MWS engineering team has injected a small JavaScript app into this WordPress panel, so that publishing a "post" submits the story directly to the back end of mountainworkshops.org. As of October 2017, this feature works for slideshow and video stories.
    • When a story is published through this system, it is crucial that the editor save it with the correct assignment number. For photojournalism stories, this is the only way that the photos imported later in this process will be associated with the right stories. This assignment number, or assignment id, must correspond to the one that's published with a photo's IPTC data when it's uploaded. It travels under different names, and by the time the photo is saved to S3 (see note), this assignment number is stored as "original_transmission_reference" in the IPTC data.
  3. At the end of the week, when the final photos have been selected and uploaded to S3, they must be imported into the database used by mountainworkshops.org, and they must be associated with their respective stories.

    • For our import procedure, we make use of distributed computing to resize all the images and extract their IPTC data in parallel. As a result, processing 1,000 photos takes roughly as long as processing one.
    • To do this import, simply log in to Mountain's Amazon Web Services account, click this link -- https://us-west-2.console.aws.amazon.com/lambda/home?region=us-west-2#/functions/mws-conductor -- and click the orange button labeled "test". Even though it says test, this will actually do the import. Due to some work in progress, the import often needs to be run twice. This will be addressed at a later date. In any case, it shouldn't take more than a minute or two each time.
    • This import sends all the photo data to the Django app that serves the website. That app only knows what photos go with what story if they share the same assignment number. That's why it's so important that the assignment number was saved in WordPress and saved correctly. In this step, the photos will also be associated with the workshop and participant that were selected in the WordPress app. When it's done, you should be able to visit the URL for a given story and see it all come together.
  4. That's it! The photos have been imported into Django and automatically folded into the right stories. As far as photojournalism goes, the process is complete and stories are ready to be grokked on the website.
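The matching rule in steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration, not the site's actual import code: the dictionary shapes, the function name attach_photos_to_stories, and the sample assignment numbers are all hypothetical. The one detail taken from the text above is the IPTC field name, "original_transmission_reference".

```python
from collections import defaultdict


def attach_photos_to_stories(stories, photos):
    """Group photos under the story that shares their assignment number."""
    photos_by_assignment = defaultdict(list)
    for photo in photos:
        # Field name comes from the text above; the dict layout is assumed.
        key = photo["iptc"].get("original_transmission_reference")
        photos_by_assignment[key].append(photo["filename"])

    # Each story receives exactly the photos carrying its assignment number.
    matched = {
        story["title"]: photos_by_assignment.get(story["assignment_id"], [])
        for story in stories
    }

    # Photos whose number matches no story are left unattached.
    known_ids = {story["assignment_id"] for story in stories}
    orphans = [
        photo["filename"]
        for photo in photos
        if photo["iptc"].get("original_transmission_reference") not in known_ids
    ]
    return matched, orphans


# Hypothetical sample data.
stories = [
    {"title": "County Fair", "assignment_id": "MW-101"},
    {"title": "Night Shift", "assignment_id": "MW-102"},
]
photos = [
    {"filename": "fair_01.jpg", "iptc": {"original_transmission_reference": "MW-101"}},
    {"filename": "fair_02.jpg", "iptc": {"original_transmission_reference": "MW-101"}},
    {"filename": "typo_01.jpg", "iptc": {"original_transmission_reference": "MW-120"}},
]

matched, orphans = attach_photos_to_stories(stories, photos)
```

In this sketch, a photo saved with a mistyped assignment number ends up in orphans rather than in any story, which is why the number has to be entered correctly in WordPress.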

Documentation and Code

Some of the documentation for these apps is still in progress, but the source code, at least, is collected below.

  • The Django app is stored and documented here.

  • The WordPress backend, including the JavaScript layer that captures post data and submits it to Django, lives here.

  • The AWS Lambda function is actually two functions. The first, which we call the "conductor", scans Mountain's S3 bucket, collects all the photos pertaining to a given workshop, and sends them out to hundreds of instances of the second function, the "orchestra", for simultaneous processing. It then composes a JSON object containing data on all the photos and sends it to Django for batch processing and storage. Both the conductor and orchestra live here, as mws-conductor and prepareAndSendPhotoData.
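The conductor/orchestra pattern is a fan-out followed by a gather. The sketch below imitates it locally with a thread pool instead of real Lambda invocations, so it runs anywhere. The function bodies, the record fields, and the "photos" key in the JSON batch are assumptions made for illustration, not the contents of mws-conductor or prepareAndSendPhotoData.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def orchestra(photo_key):
    """Stand-in for the per-photo worker: returns a record for one photo."""
    # The real worker resizes the image and reads its IPTC data; both are faked here.
    return {"key": photo_key, "resized": True}


def conductor(photo_keys, max_workers=8):
    """Fan the keys out to workers, then gather the results into one JSON batch."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even though the work runs concurrently.
        records = list(pool.map(orchestra, photo_keys))
    return json.dumps({"photos": records})


# Hypothetical S3 keys.
batch = conductor(["2017/a.jpg", "2017/b.jpg", "2017/c.jpg"])
```

Because each photo is handled independently, adding more photos mostly adds more concurrent workers rather than more waiting time.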

Tooling

Some of the technological terms used and platforms involved are briefly described here.

Django

Django is a tool, written in Python, that facilitates the creation of web applications. It stores data, for example photo stories, in a SQL database. Every time a browser opens a path that displays a photo story, that photo story will move along a path like this: Database --> Django --> web site. Django and the database live on a computer that we are renting from Digital Ocean. The web site lives very short lives on browsers around the country and around the world.

In full, a web browser will request a photo story from the Django app, the Django app will query its database for the content, the database will send that content back to Django, and finally Django will respond to the browser with a screenful of content.
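The round trip described above can be imitated without Django at all. The following sketch uses Python's built-in sqlite3 module as the database and a plain function in the role of the Django app. The table layout, the URL path, and the sample story are invented for illustration and do not reflect the real site's schema.

```python
import sqlite3


def make_database():
    """Create an in-memory SQL database holding one sample story."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE story (slug TEXT PRIMARY KEY, title TEXT, byline TEXT)")
    db.execute(
        "INSERT INTO story VALUES (?, ?, ?)",
        ("county-fair", "County Fair", "A. Participant"),
    )
    db.commit()
    return db


def handle_request(db, path):
    """Play the role of the web app: query the database, then build a response."""
    slug = path.strip("/").split("/")[-1]
    row = db.execute(
        "SELECT title, byline FROM story WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return 404, "Not found"
    title, byline = row
    return 200, f"<h1>{title}</h1><p>By {byline}</p>"


db = make_database()
# The "browser" asks for a story; the app answers with content from the database.
status, body = handle_request(db, "/stories/county-fair/")
```

Django does considerably more than this (URL routing, templates, models), but the order of events is the same: request in, database query, response out.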

Django offers some very handy tooling for non-developers. The Django admin panel allows users an interface to add, manipulate and delete the data (story bylines, for example) that feeds the web site. The engineers behind the site control what is exposed in that panel and how.

Amazon Web Services

We make heavy use of Amazon Web Services (AWS), specifically:

  • S3, Amazon's cloud storage system for file storage.
  • Lambda, an Amazon service that allows developers to publish a piece of code online that will be run in its own computing environment, on demand. We make use of this tool by running hundreds of simultaneous lambdas to scan and resize all of the Mountain Workshops images at the same time.
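To give a sense of what a "piece of code" published to Lambda looks like, here is a minimal sketch of a Python handler. The (event, context) signature is the standard shape of a Python Lambda entry point. Everything else is assumed for illustration: the event fields, the 1600-pixel default, and the helper fit_within. It only computes the target dimensions for a resize; it does not touch S3 or real image data, and it is not the code Mountain Workshops runs.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side, keeping the aspect ratio."""
    scale = max_side / max(width, height)
    if scale >= 1:
        return width, height  # already small enough; never upscale
    return round(width * scale), round(height * scale)


def handler(event, context):
    """Entry point in the (event, context) shape that AWS Lambda uses for Python."""
    new_width, new_height = fit_within(
        event["width"], event["height"], event.get("max_side", 1600)
    )
    return {"key": event["key"], "width": new_width, "height": new_height}


# Local call with a hypothetical event; Lambda would supply a real context object.
result = handler({"key": "2017/a.jpg", "width": 6000, "height": 4000}, None)
```

Each invocation handles one photo and knows nothing about the others, which is what lets hundreds of copies run side by side.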

Digital Ocean

We use Digital Ocean's hosting to store and run the app that presents mountainworkshops.org.
