14:02 * josephg reads up
14:03 < josephg> koppor, rawtaz: ShareJS does all the actual OT
14:03 < josephg> racer is now a wrapper around it which does things like refs, reflists
14:03 < josephg> ... it manages subscriptions for you (so if you change pages, you don't have to manually unsubscribe)
14:03 < josephg> stuff like that.
14:03 < josephg> ShareJS just does the document editing.
14:04 < josephg> Redis is currently important for 3 things:
14:05 < josephg> - We need to be able to atomically append to the op log. We're using redis's lua scripting to do atomic commits
14:05 < josephg> - Redis is also used for pubsub between your backend servers
14:05 < josephg> (well, between your servers)
14:06 -!- liorix [~liorix@cpe-98-14-229-103.nyc.res.rr.com] has quit [Remote host closed the connection]
14:06 < josephg> (remember the new version of derby is designed to scale across many backend processes - even the derby examples are currently running on 3 load-balanced processes just to test it out)
14:07 < josephg> And finally, redis is used to store the operation log. This is a bad idea, because it means all your ops have to fit in memory. I want to fix this sometime in the next few weeks.
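
A minimal sketch of that atomic append plus pubsub with node_redis; the key names, the Lua script, and the "-1 means success" convention are all invented here, not the actual livedb code:

```js
// Hypothetical sketch, not the real livedb script: append an op to a
// per-document oplog only if the client's base version matches the current
// log length, and publish it so other backend processes hear about it.
var redis = require('redis');
var client = redis.createClient();

var APPEND_OP = [
  "local v = redis.call('LLEN', KEYS[1])",
  "if v ~= tonumber(ARGV[1]) then return v end", // stale: caller must transform against newer ops
  "redis.call('RPUSH', KEYS[1], ARGV[2])",
  "redis.call('PUBLISH', KEYS[2], ARGV[2])",
  "return -1" // -1 signals success in this sketch
].join('\n');

// e.g. appendOp('ops:users:jane', 'ch:users:jane', 12, JSON.stringify(op), cb)
function appendOp(opsKey, channel, version, opJson, cb) {
  client.eval(APPEND_OP, 2, opsKey, channel, version, opJson, cb);
}
```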
14:08 < koppor> josephg: Thank you for the information!
14:08 < josephg> Mongo isn't blessed or special in the same way. You can use any database you like to store your data - take a look at share/livedb-mongo for an example of what the api it implements needs to look like
14:09 < josephg> (we haven't published details on this yet because I might need to add more methods to support presence, cursors and the oplog)
14:09 < k1i> so
14:10 < k1i> i've been following this oplog issue quite closely
14:10 < k1i> is there any reason the oplog can't be capped at a specific size, and clients trying to commit operations older than X version get discarded?
14:10 -!- dascher [~dascher@85-171-206-178.rev.numericable.fr] has quit [Remote host closed the connection]
14:10 < josephg> Yeah we can do that
14:10 < k1i> IMO that needs to be implemented, as it's a simple solution - more complicated lifetime-oplog-storage techniques can be implemented later
14:11 < k1i> but for 99% of webapps, a capped oplog in Redis, to a specific amount of memory, will be enough
14:11 < josephg> ... the only problem is dealing with the error correctly in the client.
14:11 < k1i> most apps aren't going to be doing offline transformations over a long period of time - and if they are unique in that use case, they can add more memory or do disk-based caching
14:11 < k1i> Derby's problem.
14:11 < josephg> :)
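
A sketch of that capping idea with node_redis; MAX_OPS and the key name are made up, and since trimming means the list length no longer equals the document version, a separate version counter would be needed:

```js
// Capped per-document oplog: keep only the newest MAX_OPS entries. Clients
// submitting against a version that has been trimmed away get an error and
// have to refetch the snapshot instead of transforming.
var MAX_OPS = 1000; // assumption: tune per collection / per deployment

function appendAndCap(client, opsKey, opJson, cb) {
  client.multi()
    .rpush(opsKey, opJson)
    .ltrim(opsKey, -MAX_OPS, -1) // drop everything but the last MAX_OPS ops
    .exec(cb);
}
```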
14:12 < k1i> but Share needs to be able to have a limited oplog
14:12 < k1i> also
14:12 < josephg> Yeah - the other thing to do is actually cleaning up / removing old ops
14:12 < k1i> it would be cool to be able to limit the oplog on specific collections
14:12 < k1i> redis-LRU, would probably be ideal
14:12 * josephg looks up redis-lru
14:12 < k1i> least-recently-used
14:12 < k1i> it's built into redis as a garbage-collection mechanism
14:13 < k1i> specific collections may need more lengthy oplogs, etc.
14:13 < k1i> although
14:13 < k1i> nevermind, not an issue
14:13 < josephg> I'm currently storing oplogs in a redis list
14:13 < josephg> we might have to put each op in a separate redis key to do that
14:14 < josephg> ... which would make it slower when you try and get ops
14:15 < josephg> - I guess we could write a lrange-equivalent command using lua scripting
14:15 < k1i> yep
14:15 < josephg> but I'm not sure how that'll interact with an externally (not in redis) stored oplog
14:15 < k1i> redis's garbage collection is really quite good
14:16 < k1i> especially across a cluster
14:16 < josephg> probably, I won't do that straight away. Instead I'll first do the code to move the oplog out of redis
14:16 < k1i> what do you mean?
14:16 < k1i> out of redis, into where?
14:17 < josephg> into - something else. mongo as a default, who knows
14:17 < josephg> but the point is, into something that isn't locked in memory
14:17 < k1i> well
14:17 < josephg> (leveldb would be ideal - alas no network protocol)
14:17 < k1i> mongo is going to give you a headache when it comes to write contention
14:17 < k1i> redis is just really solid
14:17 < josephg> and it's really slow
14:17 < k1i> yea
14:17 < josephg> yeah I know, I love redis
14:17 < k1i> I don't see why redis is an issue
14:18 < josephg> it's an issue because you get lots and lots of ops
14:18 < k1i> I wrote a multi-server syncing cache/session store for Redis a while ago, it's a fine piece of technology
14:18 < k1i> yeah, but, keeping a long oplog forever isn't an option
14:18 < k1i> pruning needs to happen at the end of the day
14:18 < k1i> unless you need to be able to support long-term playback
14:18 < k1i> which 99% of apps can do without for the costs associated
14:19 < josephg> right. I guess there's a couple of options here:
14:19 < k1i> I can see a nice DOS-style attack being done via abusing old transformation versions
14:19 < josephg> 1. We leave everything in redis, but remove old operations when we run out of ram
14:20 < josephg> 2. Redis is used as the ultimate source of truth on the last op (so we still use it for contention control) but operations are shifted out into a secondary store once they've been applied to redis
14:20 < josephg> ie, mongo or something
14:20 < josephg> then we can prune (manually or using lru or something) stuff from redis with near impunity - it's just a cache
14:20 < josephg> + locking system for doing atomic increments
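
A sketch of what option 2 could look like, reusing the appendOp sketch from earlier; the mongo collection layout and error handling are assumptions, not the eventual livedb design:

```js
// Strategy 2: Redis stays the lock / version authority, but once an op has
// been accepted it is copied into a durable store (mongo here) so the Redis
// copy can be pruned like a cache.
function submitOp(opsColl, docKey, version, op, cb) {
  var opJson = JSON.stringify(op);
  // 1. Atomic accept in Redis (see the Lua sketch earlier).
  appendOp('ops:' + docKey, 'ch:' + docKey, version, opJson, function (err, res) {
    if (err) return cb(err);
    if (res !== -1) return cb(new Error('op is stale, transform against v' + res));
    // 2. Persist to the durable oplog; Redis entries older than this can be trimmed.
    opsColl.insert({ doc: docKey, v: version, op: op }, cb);
  });
}
```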
14:21 < josephg> as for DOSing the server with old ops, the easy way to fix that is to just not allow any ops older than some age
14:21 < k1i> -- aka deleting them
14:21 < josephg> not necessarily.
14:21 < josephg> we can just force clients to do all the OT work
14:21 < k1i> if they aren't going to be allowed for use in transformation
14:21 < k1i> ah
14:22 < k1i> I like strategy 1
14:22 < josephg> - although sending a bunch of ops over the wire is probably more expensive than transforming anyway.
14:22 < k1i> because big corps are going to deploy massive redis server clusters
14:22 < k1i> they do
14:22 < k1i> (already)
14:22 < k1i> it's capable of handling it, it's simple, and it already works
14:22 < k1i> smaller users (who probably don't care about longterm playback anyway) can affordably implement OT for a limited timescale
14:22 < k1i> also
14:23 < k1i> you throw the error to Racer; racer could choose to implement some other kind of conflict resolution
14:23 < k1i> manual, last-winner
14:23 < josephg> no, there's nothing good racer can do there.
14:23 < josephg> I catch a plane and do some work on the plane. I don't connect to the internet again for 2 days.
14:24 < k1i> Racer can alert Derby, in which user apps can show the manual conflict?
14:24 < josephg> there's been some changes to that document I was working on in the meantime
14:24 < k1i> (resolution process)
14:24 < josephg> ... well, I don't even have the diff of what other people have done.
14:24 < josephg> I just have my own ops, my view of the document and the server's (changed) view of the document
14:25 < josephg> I mean, we could punt to the application in that case
14:25 < josephg> ... and make them figure out a diff, and do that whole dance
14:25 < josephg> but it's not fun. And most people won't bother.
14:25 < k1i> personally?
14:25 < k1i> I'll discard the user's changes
14:25 < josephg> right - yeah most people will.
14:25 < k1i> as in my use case, an extended absence from online is not a big deal (because it can't technically happen)
14:26 < k1i> I want OT, but don't need extended replay
14:26 < k1i> and if I do, I will throw more hardware at redis
14:26 * josephg nods
14:26 < josephg> for us, we're writing hiring software
14:26 < josephg> and we want the oplog anyway for auditing
14:27 < josephg> so if someone does something bad, we want to see exactly who did it and when
14:27 < k1i> ah
14:27 < k1i> I am writing transactional point of sale software
14:27 < k1i> I was planning on creating a manual log
14:27 < k1i> but, that's actually not a bad idea - abuse the log left by OT
14:27 < josephg> right. Yeah, I guess we could do that instead
14:27 < k1i> it seems like there is some overhead though in finding an operation
14:27 < k1i> rather than creating a dedicated log on an action-by-action basis
14:28 < josephg> yeah, maybe. You can play the operations back
14:28 < josephg> actually, playback would be a fun thing to add to the godbox
14:28 < josephg> should be pretty easy to do, too.
14:29 < k1i> right now my main scare with Derby in general is the oplog growth issue (and validations, but that's another story)
14:29 < josephg> Brian is adding schema validation at the moment for our app
14:30 < josephg> sharejs exposes a validate function, so you can plug in your own schema validation / whatever logic in there
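
However that validate hook ends up being plugged in, the logic itself could be as small as this; the field names and the return-a-string-on-error convention are invented for illustration:

```js
// Illustrative validator: inspect the would-be resulting snapshot and either
// return an error message (rejecting the op) or return nothing (accepting it).
function validateUser(snapshot) {
  var data = snapshot.data;
  if (typeof data.email !== 'string' || data.email.indexOf('@') === -1) {
    return 'email must be a valid address';
  }
  if (data.age != null && typeof data.age !== 'number') {
    return 'age must be a number';
  }
  // returning undefined accepts the op in this sketch
}
```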
14:30 < josephg> but yeah, the oplog growth issue is important
14:30 < josephg> - and I want to solve that in the next few weeks in some form or other.
14:30 < josephg> we also don't have any decent benchmarks about how the whole system performs
14:31 < josephg> which is important for me - for example, if we move redis to have all the ops in their own key, how does that perform?
14:31 < josephg> (although redis being redis, probably still waaaay better than any of the javascript)
14:31 < k1i> also
14:32 < k1i> sorry
14:32 < k1i> this is very important
14:32 < k1i> the fact Racer doesn't support Projections/ShareJS not supporting Mongo projections is hugely problematic for me
14:32 < k1i> and I would expect most users
14:32 < k1i> I shouldn't have to define a User's password field in a separate collection just to get it away from public eyes
14:33 < josephg> yep.... I had this exact conversation on friday night with brian.
14:33 < josephg> he's strongly of the opinion that we should support collections, and I don't want to add more parts to sharejs
14:33 < k1i> again, it goes against conventional data modeling to not be able to do those kinds of operations
14:34 < k1i> no matter the datastore
14:34 < josephg> well, redis doesn't do projections
14:34 < k1i> PGSQL (row), Mongo (document) - fields need to be hidden
14:34 < josephg> but yeah - mongo and couch both do.
14:34 < k1i> enterprises generally don't use redis as a persistent datastore either, though
14:34 < josephg> true. nate and I have been talking about first adding filters
14:34 < k1i> and I personally wouldn't bank an entire framework on an edge case persistent datastore (redis)
14:35 < k1i> I saw that
14:35 < k1i> and it looked interesting
14:35 < josephg> yep - so that's probably what v1 will look like -
14:35 < k1i> filtering specific 'fields' from being operated on
14:35 < josephg> yep, and from being visible to a client.
14:35 < josephg> so a client will have a specific view of a document. For example, a user can see their own entire profile
14:35 < josephg> but only some fields of other users' profiles
14:36 < josephg> we'll need to edit operations going to that client, but if we do it right, the client won't be able to tell that there even are more fields in the document
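
A rough sketch of that filtering, assuming json0-style ops whose components carry a path `p`; the whitelist and function names are invented:

```js
// Invented example of a per-collection field whitelist applied both to the
// snapshot a client receives and to the ops forwarded to it afterwards.
var PUBLIC_FIELDS = ['name', 'avatarUrl'];

function projectSnapshot(doc) {
  var out = {};
  PUBLIC_FIELDS.forEach(function (f) {
    if (f in doc) out[f] = doc[f];
  });
  return out;
}

// json0 ops are lists of components with a path `p`; drop any component that
// touches a field outside the whitelist so the client can't tell the hidden
// fields exist.
function projectOp(op) {
  return op.filter(function (c) {
    return PUBLIC_FIELDS.indexOf(c.p[0]) !== -1;
  });
}
```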
14:36 < k1i> yep
14:36 < k1i> that would be ideal
14:36 < josephg> that's the 'permanent projection' system
14:36 < k1i> this is something none of the realtime 'frameworks' that exist now have solved
14:36 < josephg> interesting.
14:37 < k1i> everyone can stop access to a specific document because a query can be built on it
14:37 < k1i> but app-level security on individual fields is absolutely imperative
14:37 < josephg> yep.
14:37 < k1i> if Derby or Meteor are going to win over Rails in 'framework choice'
14:37 < k1i> it's not even a passable option
14:37 < josephg> ... the other thing that would be nice to have is a way for queries to only return part of a document
14:37 < k1i> yes
14:37 < k1i> that would increase efficiency
14:38 < josephg> for example, if I'm viewing a list of documents, I probably only want a couple of fields
14:38 < k1i> it can spawn weird edge cases
14:38 < josephg> ... then if I click on one, I should see all the rest of the fields too
14:38 < josephg> it sure can.
14:38 < k1i> when certain fields are based on another
14:38 < josephg> so yeah, thats going to take some more thought.
14:38 < josephg> but we'll probably start with the filter thing - though for me it's a lower priority than doing a bunch of benchmarks
14:38 < josephg> and solving the oplog issue
14:38 < k1i> well
14:39 < k1i> yeah
14:39 < k1i> some people can't even migrate to derby .5 due to a massive oplog
14:39 < josephg> yeah exactly.
14:39 < k1i> also, I am of the opinion that the oplog should be completely transient -
14:39 < k1i> if I go in and 'redis-flush' everything away
14:39 < k1i> that should be completely OK
14:39 < k1i> and the app should be able to handle any issues associated with that
14:39 < josephg> well, if we move the oplog out into something that mongo / whatever could provide
14:39 < josephg> then you could always just store it in something that sometimes forgets ops
14:39 < josephg> and we should make the system be able to deal with that too.
14:40 < k1i> yea
14:40 < k1i> I like redis
14:40 < k1i> but
14:40 < k1i> the memory thing is a bit tricky
14:40 * josephg nods
14:40 < josephg> koppor: are you still around?
14:40 < josephg> ... koppor was asking about socket.io
14:41 < k1i> yes
14:41 < k1i> id like to ask you about that as well
14:41 < k1i> what is the current issue with native websocket?
14:41 < josephg> I dunno if it's gotten better since, but I hate socket.io because of all the grief it caused me while doing sharejs
14:41 < k1i> when was the last time you used it
14:41 < josephg> it's just unreliable, it doesn't guarantee message ordering
14:41 < josephg> um, about 18 months ago
14:41 < k1i> can you try engine.io
14:42 < josephg> ... and it can tell you a client disconnected, then give you more ops for that client
14:42 < josephg> I dunno man - I don't trust it.
14:42 < k1i> https://github.com/LearnBoost/engine.io
14:42 < k1i> engine.io is heavily actively developed
14:42 * josephg shrugs
14:42 < josephg> does it order operations?
14:42 < josephg> ... anyway, the new architecture of sharejs means that you can use whatever you want.
14:42 < k1i> native websockets have a huge, huge advantage
14:43 < josephg> in performance, yeah
14:43 < k1i> in that they don't require a sticky-sessioning LB to maintain efficiency on the server-side
14:43 < k1i> much easier to scale
14:43 < k1i> obviously you will want one for fallback clients, but, still
14:44 < josephg> ... you don't?
14:44 < k1i> for native websockets?
14:44 < josephg> hm I guess not.
14:44 < k1i> the TCP connection is maintained by whatever LB you are running
14:44 < k1i> it's inherently 'sticky' as it's an open socket
14:44 < k1i> the LB can then round-robin, least-load, etc. any other connection
14:45 < josephg> right, but you aren't just doing request-response over the socket
14:45 < k1i> that's probably my favorite feature about websockets
14:45 < k1i> but the connection remains open throughout the duration of a client's visit, though, right?
14:45 < josephg> you also need to be able to send to the client when one of the subscribed documents changes
14:45 < k1i> yes
14:45 < josephg> ... and to do that you need a server to be 'responsible' for the client anyway
14:45 < k1i> I am saying just at an LB-level
14:46 < josephg> hm - I guess you could have any server able to send to the client
14:46 < k1i> the LB has to put less thought into maintaining a stateful websocket than into stateless polling
14:46 < k1i> no, the client still gets talked to by their associated server
14:47 < k1i> if the client refreshes, they reconnect and set up a new copy of the redis-stored session on another backend server
14:48 < josephg> ... so which server sends a client ops for its subscriptions?
14:48 < k1i> the server that they are connected to via websocket
14:49 < k1i> initially
14:49 < josephg> oooooooh
14:49 < k1i> the websocket has no reason to ever close
14:49 < josephg> right, because the load balancer will send the websocket *somewhere* it doesn't matter where
14:49 < k1i> so the client has no reason to ever get connected to a different server
14:49 < k1i> yes
14:49 < josephg> and that server is responsible for that client forever.
14:49 < k1i> and it stays open
14:49 < josephg> yeah
14:49 < k1i> the LB never touches the websocket again after it's opened
14:49 < k1i> they know how to pass socketed traffic
14:49 < josephg> yep - it's just that the load balancer doesn't have to know. It just pipes
14:49 < k1i> now
14:49 < k1i> sticky-sessioning is something you want for efficiency and fallback clients
14:50 < josephg> yeah - lovely.
14:50 < k1i> but, it makes LB a lot easier in high-scalability environments
14:50 < k1i> to be able to roundrobin, etc.
14:50 < k1i> also
14:50 < k1i> LBs such as HAProxy will eventually have bindings written for them, for derby, etc.
14:50 < k1i> to be able to contact them for client count
14:51 < josephg> so in sharejs, because I got sick of people filing bugs about socket.io being broken, etc
14:51 < josephg> I've moved to a system where the user is responsible for making the server-client connection
14:52 < josephg> on the server, you pass sharejs a node 0.10 stream which it can use to talk to a client that just connected
14:52 < k1i> yeah, that's probably good for node-like compatibility and abstraction
14:52 < josephg> and on the client, you pass a websocket-like object which it'll use to talk to the server
14:52 < k1i> I personally have total control over my clients
14:52 < josephg> and then you can send sidechannel messages in the stream, etc.
14:52 < josephg> yep
14:52 < k1i> and will be forcing them all to be websocket-enabled browsers
14:52 < josephg> ... yeah, so then you can use websockets
14:53 < josephg> there's probably a couple issues you'll run into at the moment because I think I'm taking advantage of the fact that browserchannel lets you send messages while it's connecting
14:53 < josephg> - but let me know and I can fix them, or you can fix them.
14:53 < josephg> but it should work.
14:53 < josephg> thats the idea
14:54 < josephg> there's racer-browserchannel kicking around somewhere that has the 2 files or whatever which does the work
14:54 < josephg> so yeah, go ahead and make a racer-websocket or whatever
14:54 < josephg> and slot it in.
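
A rough sketch of that wiring with the `ws` module; the `share.listen(stream)` call and the exact message framing are assumptions about the 0.7-era API, not a verified recipe. On the client you would pass the browser's WebSocket (or a small wrapper) to the sharejs connection in the same spirit.

```js
// Server side: wrap each incoming websocket in an object-mode Duplex stream
// and hand it to sharejs, which then talks to that client over the stream.
var WebSocketServer = require('ws').Server;
var Duplex = require('stream').Duplex;

var wss = new WebSocketServer({ port: 8000 });
wss.on('connection', function (ws) {
  var stream = new Duplex({ objectMode: true });
  stream._write = function (msg, enc, cb) { ws.send(JSON.stringify(msg)); cb(); };
  stream._read = function () {};
  ws.on('message', function (data) { stream.push(JSON.parse(data)); });
  ws.on('close', function () { stream.push(null); });

  share.listen(stream); // assumption: `share` is a sharejs server instance
});
```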
14:55 < k1i> yeah
14:55 < josephg> and koppor: likewise, use socket.io if you want. But if the server crashes because messages arrive out of order, it's not my bug.
14:55 < k1i> I think engine.io allows queued messages
14:55 < josephg> if all your browsers support websocket, why bother?
14:56 < josephg> websocket over https works great (better than websocket over http because proxies don't get in the way)
14:56 < k1i> node-browserchannel doesn't use websockets?
14:56 < josephg> nope. I wanted to add it, but I'd need to add websocket support to the closure library
14:56 < k1i> yeah
14:56 < josephg> it was on my nice-to-have list and less important than adding cursors to sharejs
@digitalsanity

I know this is old, but could you clean up old oplog entries by storing them in an 'archive' store (still maintaining the full audit trail) and then storing hashes of changes or states, aggregated to a full day, in Redis? The client would then hash its local data according to the same aggregation rules, compare its full-day hashes with the full-day hashes in Redis to determine when the timeline broke, request the transactions for those full days to catch up to current, and finally read the live log from Redis.
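
A sketch of the daily-hash comparison described in this comment; everything here (the hash choice, the per-day grouping) is hypothetical:

```js
// Hash each day's ops in order; the client does the same over its local copy
// and walks the days backwards until the hashes agree, then replays only the
// archived days after that point before reading the live log from Redis.
var crypto = require('crypto');

function hashDay(ops) {
  var h = crypto.createHash('sha1');
  ops.forEach(function (op) { h.update(JSON.stringify(op)); });
  return h.digest('hex');
}
```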
