@bridgethillyer
Last active March 28, 2016 18:46
Walkthrough of onyx code
Onyx code walkthrough 2/14/2016
with Michael, Gardner, Bridget
* onyx.peer.min-peers-test
- most basic test
- every test needs onyx.api last - loads all defmethods
- minimum peers needed = number of tasks
- put channels in atoms
- so fresh channels every time you reload
- onyx-id - use a UUID so there are no collisions in tests
- test-config.edn - regular configs are in here
- with-test-env - test-helper
- CatchThreadInterrupted - handles Ctrl-B
- lifecycles - this dance you need for any plugin interaction
- take-segments!
- just takes segments/reads the channel until done
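The channel and onyx-id notes above can be sketched roughly as follows. This is a minimal illustration, assuming core.async is on the classpath. `fresh-channels!`, `take-until-done!`, `test-id`, and the buffer size of 100 are hypothetical names and values chosen for this sketch, not the actual Onyx test helpers.

```clojure
(require '[clojure.core.async :as a])

;; Hold the channels in atoms so that reloading the namespace or
;; re-running a test can swap in brand-new channels.
(def in-chan (atom nil))
(def out-chan (atom nil))

(defn fresh-channels!
  "Hypothetical helper: replace both channels with new buffered ones."
  []
  (reset! in-chan (a/chan 100))
  (reset! out-chan (a/chan 100)))

(defn take-until-done!
  "Hypothetical stand-in for the idea behind take-segments!: block on
  the channel and collect segments until :done arrives (or the channel
  is closed). This sketch keeps the :done sentinel in the result."
  [ch]
  (loop [segments []]
    (let [segment (a/<!! ch)]
      (cond
        (nil? segment)    segments
        (= :done segment) (conj segments :done)
        :else             (recur (conj segments segment))))))

;; A unique id per run keeps repeated or concurrent tests from colliding.
(defn test-id []
  (str (java.util.UUID/randomUUID)))
```

Calling `(fresh-channels!)` at the start of each test, then putting segments followed by `:done` on `@out-chan`, lets `(take-until-done! @out-chan)` return everything that was written.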
* Start ZooKeeper
- in memory on 2188
- so you can use the ZK Client
- zkCli.sh -server localhost:2188
- /onyx
- directory for onyx-id
- /workflow
- Nippy (edn) - the actual workflow
- all the parts of the job are in there:
workflows, catalog, lifecycles
- keyed by job id
- /znode
- file gets deleted when process ends
- everyone gets notified
- "pulse" does this
- /task
- the tasks
- /job-scheduler
- race to write, then know if they won or failed
- /exception
- might be removed, don't worry about it
- /chunk - arbitrary blob storage
- /ledger - BookKeeper thing
- coordinates which BookKeeper server?
- /log
- what we see in console dashboard - local state in peers
- /origin
- the replica
- onyx gc - needs the whole log up to a point - puts it into origin to make a smaller log
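The relationship between /log, /origin, and gc can be pictured with a toy model: the replica is a fold of log entries over an origin value, and gc folds a prefix of the log into a new origin. The sketch below is an assumption-laden illustration, not Onyx's implementation. The entry shapes (`:join`, `:leave`) and the function names are hypothetical.

```clojure
;; Toy model only: real Onyx log entries and replicas are richer.

(defn apply-entry
  "Hypothetical entry handler that only understands :join and :leave."
  [replica entry]
  (case (:fn entry)
    :join  (update replica :peers (fnil conj #{}) (:peer entry))
    :leave (update replica :peers (fnil disj #{}) (:peer entry))
    replica))

(defn replay
  "Rebuild a replica by applying every entry, in order, to the origin."
  [origin entries]
  (reduce apply-entry origin entries))

(defn compact
  "Fold the first n entries into a new origin and keep only the rest,
  so later readers replay a shorter log and reach the same replica."
  [{:keys [origin log]} n]
  {:origin (replay origin (take n log))
   :log    (vec (drop n log))})
```

Replaying the compacted state gives the same replica as replaying the full log, which is why a peer that starts late only needs the origin plus the remaining entries.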
* Back to tests
- with-test-env
- use OnyxTestEnv component - to trap errors for repl/testing purposes
* onyx.system
- onyx.api requires this
- all the magic comes together
- Env
- onyx-development-env
- starts in-memory ZK
:zookeeper/server? true - in config
- onyx.log.zookeeper
- uses 3rd party Clojure ZK library
- persist all the ZK conn stuff
- Peer Group
- onyx-peer-group
- resources for all peers
- messaging group -> Aeron
- multiplex subscribers because serialization is expensive
- do it once above all the peers - ie in the peer group
- subscriber has a handler that buffers the messages
- picking out a channel to route it to the appropriate peer
- PeerManager
- super hash map of peers
- try-start-peers (test-helper)
- publications - peer-peer
- but pooled/ .....
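The peer-group multiplexing idea (decode once, then route to the right peer) can be sketched like this. It is only an illustration under stated assumptions: EDN strings stand in for the serialized messages, atoms holding vectors stand in for the per-peer buffers, and `new-inboxes` and `make-handler` are hypothetical names. None of this is the Aeron messaging code itself.

```clojure
(require '[clojure.edn :as edn])

(defn new-inboxes
  "One buffer per peer, owned at the peer-group level."
  [peer-ids]
  (into {} (map (fn [id] [id (atom [])])) peer-ids))

(defn make-handler
  "Returns a subscriber handler. Each raw message is decoded exactly
  once here, then appended to the inbox of the peer it is addressed to.
  Messages for unknown peers are dropped in this sketch."
  [inboxes]
  (fn [raw]
    (let [message (edn/read-string raw)]
      (when-let [inbox (get inboxes (:peer-id message))]
        (swap! inbox conj message)))))
```

With a single handler shared by the group, the decoding cost is paid once per message rather than once per peer.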
* onyx.peer
- BackPressurePool - back pressure!
- Virtual peer
- inbox/outbox - the log
- Conj talk/slides explains this process
- join - uses a protocol to join the cluster
- processing-loop
- deal with all of the messages
- controls starting and stopping of tasks
- apply-logentry ??
- updating the replica
this pattern is used a lot
- run-task-lifecycle
- big sequential loop
- launch-aux-threads!
- what Aeron is messaging for - segments and progress
- passes segments
- root of tree is complete
- restart because something went wrong
- Onyx is masterless, so every peer sees every message, determines what everyone is doing
- reconfigure-cluster-workload
- look at jobs in replica and figure out how many peers should work on each job
- btr-place-scheduling
- uses constraints, then figures out the right solution
- peers -> tasks (not tasks -> peers)
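The "every peer sees every message" point, together with the apply-entry-then-update-replica pattern, can be sketched as a pure fold. This is a toy model under assumptions: the `:assign` / `:unassign` entries, the flat `:allocations` map, and all function names are hypothetical and much simpler than the real log entries and replica.

```clojure
(defn apply-log-entry
  "Hypothetical pure step: old replica + entry -> new replica."
  [replica entry]
  (case (:fn entry)
    :assign   (assoc-in replica [:allocations (:peer entry)] (:task entry))
    :unassign (update replica :allocations dissoc (:peer entry))
    replica))

(defn reactions
  "What one peer should do locally after the replica changes."
  [peer-id old new]
  (let [before (get-in old [:allocations peer-id])
        after  (get-in new [:allocations peer-id])]
    (cond
      (= before after) []
      (nil? before)    [[:start after]]
      (nil? after)     [[:stop before]]
      :else            [[:stop before] [:start after]])))

(defn process-log
  "One peer's view: apply every entry in order, collecting the task
  starts and stops that concern this peer."
  [peer-id entries]
  (reduce (fn [{:keys [replica actions]} entry]
            (let [new-replica (apply-log-entry replica entry)]
              {:replica new-replica
               :actions (into actions
                              (reactions peer-id replica new-replica))}))
          {:replica {} :actions []}
          entries))
```

Because `apply-log-entry` is deterministic and every peer reads the same entries in the same order, two peers running `process-log` end up with identical replicas and differ only in the local actions they take.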
* Replica
- (timbre/info replica)
- :tasks - topological sort of all IDs
- :saturation - related to min and max-peers
- :pairs - slots in the ring
- EVERYTHING IS IN HERE
- AND ALL PEERS SEE THE SAME MESSAGE
- :allocations - peer -> job
- so scheduler touches this key
- :peer-sites - where my peer is
- peer-task-id if single host
- :task-slot-ids - stateful things
- like Kafka - topic w/partitions
- so you know which partition to be on
"mind reading"
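The :task-slot-ids note (a stable slot per peer, so a peer knows which partition to read) can be illustrated with a small sketch. This is an assumption about the general idea, not Onyx's actual slot-assignment code. `assign-slots` and the "lowest free slot" rule are hypothetical.

```clojure
(defn assign-slots
  "Give each peer on a task a slot number. Peers that already hold a
  slot keep it; new peers get the lowest slot not in use. Slots of
  peers that have left are released."
  [slots peer-ids]
  (reduce (fn [acc peer]
            (if (contains? acc peer)
              acc
              (let [used (set (vals acc))
                    free (first (remove used (range)))]
                (assoc acc peer free))))
          (select-keys slots peer-ids)
          peer-ids))
```

For example, if the peer holding slot 1 leaves and a new peer joins, the new peer picks up slot 1 and can take over the matching partition while the other peers stay where they are.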