# SageMath Cloud backend architecture talk at Sage Days 56
## Guiding principles
- A place for everybody to use *all* math-related software easily, especially (but not only!) open source
- High availability: automatically survive failure of any proper subset of datacenters
- Make it very hard to permanently lose work:
  - everything is synchronized
  - you can kill/close your browser at pretty much any moment without losing state (in sharp contrast to IPython's design)
## Architecture
Users <--...
- Web Browser <-(todo: Route 53)-> Stunnel: SSL (elliptic curve)
- Stunnel <--> HAProxy:
  - WebSocket (or many old-fashioned fallbacks via SockJS)
  - proxy servers
- HAProxy <--> Hubs (load balanced)
  - servers written in Node.js; several on each of 20+ VMs
- Hub <--> Local Hub (one for each project)
  - simpler server written in Node.js
  - 23 VMs run projects right now
- Local Hub <--> Sage Server (one for each project)
- Local Hub <--> IPython Servers (one for each directory of each project)
- Local Hub <--> TTY Console server (Node.js -- one for each project)
---> sage, consoles, files, ipython, etc.
- Database:
  - Cassandra (version 1.2.9):
    - distributed peer-to-peer database
    - no single points of failure
    - multi-datacenter aware
    - written in Java
- Distributed project storage:
  - Each project is stored on the computer where it runs, which minimizes latency while still allowing the cluster to span a slow, large, distributed, encrypted network.
  - ZFS
    - next-generation filesystem... but ready for production use (it has been around for over 10 years)
    - file system *and* local volume management (easy to expand with more disks)
    - snapshots
    - replication
    - compression
    - deduplication
    - verification
  - Each project is stored in 2 locations in each data center
    - where? Consistent hashing
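
As a rough illustration of the "2 locations in each data center, chosen by consistent hashing" idea, here is a minimal Python sketch. It is not the SageMath Cloud code: the data center names, host names, the `vnodes` value, and the helper names (`Ring`, `placement`) are all hypothetical, and it assumes one hash ring per data center from which the first two distinct hosts clockwise of the project's hash are taken.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (SHA-1 digest read as an integer)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Consistent-hash ring over the storage hosts of one data center."""

    def __init__(self, hosts, vnodes=64):
        # Each host gets `vnodes` points on the ring so load spreads evenly.
        # The value 64 is an arbitrary assumption for this sketch.
        self._points = sorted(
            (_hash(f"{host}#{i}"), host) for host in hosts for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def replicas(self, project_id, n=2):
        """Walk clockwise from the project's point; return the first n distinct hosts."""
        start = bisect.bisect(self._keys, _hash(project_id))
        chosen = []
        for step in range(len(self._points)):
            host = self._points[(start + step) % len(self._points)][1]
            if host not in chosen:
                chosen.append(host)
                if len(chosen) == n:
                    break
        return chosen


def placement(rings, project_id):
    """Return the two storage hosts for a project in every data center."""
    return {dc: ring.replicas(project_id, n=2) for dc, ring in rings.items()}


# Hypothetical data centers and host names, for illustration only.
DATACENTERS = {
    "dc-a": ["a1", "a2", "a3", "a4"],
    "dc-b": ["b1", "b2", "b3"],
}
RINGS = {dc: Ring(hosts) for dc, hosts in DATACENTERS.items()}

if __name__ == "__main__":
    print(placement(RINGS, "project-1234"))
```

The property that makes this attractive for storage placement is that adding or removing a host only moves the projects that hashed next to that host's points; every other project keeps its two locations, and any machine can compute the placement without consulting a central directory.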
- Network:
  - Tinc
    - the most mature general peer-to-peer VPN
    - no single points of failure
    - self-healing
- Synchronization:
  - I wrote (and BSD licensed!) a new implementation in CoffeeScript of Neil Fraser's Differential Synchronization algorithm: https://neil.fraser.name/writing/sync/
  - This is NOT "Operational Transforms"
  - All files, worksheets, etc. use this
  - I bolted it around IPython as well.
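
To give a feel for how Differential Synchronization works, here is a heavily simplified Python sketch of one round trip between a client and a server. It is not the CoffeeScript implementation mentioned above. It assumes each side keeps its live text plus a "shadow" copy of what the other side last saw, uses the standard library's `difflib` for diffs, and replaces the fuzzy patching of Fraser's design with a crude context search that silently drops an edit it cannot place. The names `Endpoint`, `make_edits`, `patch_exact`, and `patch_fuzzy` are made up for this example.

```python
import difflib


def make_edits(old, new):
    """Return (start, end, replacement) edits that turn `old` into `new`."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def patch_exact(text, edits):
    """Apply edits to the exact text they were computed against."""
    # Work right to left so earlier offsets stay valid.
    for i1, i2, replacement in reversed(edits):
        text = text[:i1] + replacement + text[i2:]
    return text


def patch_fuzzy(target, base, edits, ctx=4):
    """Best-effort patch of `target`, which may have drifted away from `base`.

    Each edit is located by searching `target` for the edited span plus up to
    `ctx` characters of surrounding context taken from `base`. An edit whose
    context cannot be found is dropped (a simplification of fuzzy patching).
    """
    for i1, i2, replacement in reversed(edits):
        lo, hi = max(0, i1 - ctx), min(len(base), i2 + ctx)
        pos = target.find(base[lo:hi])
        if pos < 0:
            continue
        start = pos + (i1 - lo)
        target = target[:start] + replacement + target[start + (i2 - i1):]
    return target


class Endpoint:
    """One side of the sync loop: live text plus a shadow of the peer's view."""

    def __init__(self, text):
        self.text = text
        self.shadow = text

    def outgoing(self):
        """Diff live text against the shadow, then advance the shadow."""
        edits = make_edits(self.shadow, self.text)
        self.shadow = self.text
        return edits

    def incoming(self, edits):
        """Patch the shadow exactly and the live text on a best-effort basis."""
        base = self.shadow
        self.shadow = patch_exact(base, edits)
        self.text = patch_fuzzy(self.text, base, edits)


client = Endpoint("The cat sat on the mat.")
server = Endpoint("The cat sat on the mat.")

# Both sides edit concurrently before any sync happens.
client.text = "The cat sat on the red mat."
server.text = "The big cat sat on the mat."

# One full cycle: client -> server, then server -> client.
server.incoming(client.outgoing())
client.incoming(server.outgoing())

if __name__ == "__main__":
    print(client.text)
    print(server.text)
```

After the cycle both sides hold the merged text. Because each side only ever sends a diff against its shadow, a lost or late message is repaired by the next cycle rather than requiring a strictly ordered operation log, which is the main contrast with operational-transform designs.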
## Business
- My motivation for SMC in May 2010:
  - meeting at Simons Foundation HQ on "Funding for Open Source Software in Math and Physics".
  - I want to generate more (and more sustainable) revenue to support Sage development. I.e., I want to be able to hire *you* to work full-time on Sage. We can all want...
- Discuss more later today...
## Functionality
- Try it out and you'll see. This could be for another talk tomorrow.