@kroitor

kroitor/WS.md Secret

Last active June 25, 2019 13:44

Take 1

Take 1 was the initial plan, but it was replaced by Take 2 following below.

Nils Diefenbach @nlsdfnbch Nov 10 14:39
I'm mostly curious as to how far along you are with the websockets, what you'd 
like my help with most and, most importantly - do you need me to write JS? 
Because that may well get a bit bumpy.

Igor Kroitor @kroitor Nov 10 14:54
@nlsdfnbch no need for JS from you, we will merge it all and will port 
everything, but we thought of making it as close to being portable as 
possible, so, we believe this should be a two-tier API:
tier 1: you set the subscriptions you want and those get inited upon 
instantiation, then a derived exchange method gets called upon receiving a 
subscription event, for the user we will need to have the current state and the 
set of changes introduced by that particular event
tier 2: if it was a request-response call, the return should be forwarded to the 
caller in async manner of course (the problem here is their streaming nature, 
they don't identify request-response pairs, so we have to deal with overlapping 
requests)

Nils Diefenbach @nlsdfnbch Nov 10 15:03
@kroitor , ok, tier one sounds simple enough, if I got you right. It sounds 
similar to the pusher client implementation (you register a callback with a 
channel event, which then gets called whenever that event happens).
What exactly do you mean by 'request-response' call ?

Igor Kroitor @kroitor Nov 10 15:04
the exchanges have a public and a private API
WS sub and WS pub
WS sub stands for subscriptions

Nils Diefenbach @nlsdfnbch Nov 10 15:05
that much is clear 

Igor Kroitor @kroitor Nov 10 15:05
WS pub is for publishing (sending requests)
Some exchanges don't support request UUIDs

Nils Diefenbach @nlsdfnbch Nov 10 15:06
Ah, ok. So for example order placements and their confirmation, etc.

Igor Kroitor @kroitor Nov 10 15:06
so if you send many of them in some short period of time
it will be hard to match responses with requests
because the ordering is not guaranteed
also, the subscription may be either authenticated or not
say, you can subscribe to a public order book, or to your private balance
so, it's a little more involved than just setting up a raw callback

Nils Diefenbach @nlsdfnbch Nov 10 15:07
and do you want to guarantee this, or is it just essential for you to guarantee 
to be able to match them properly?
this, being ordering

Igor Kroitor @kroitor Nov 10 15:09
@nlsdfnbch well, we think if the user calls an async procedure named 
createWSOrder or something like that, the user should get the response from that 
call, instead of just throwing the answer to some callback defined somewhere else
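
To illustrate the point about returning the reply to the caller, here is a minimal Python asyncio sketch. It assumes the exchange echoes a client-chosen request id back in its reply, which, as noted above, not every exchange does. `RequestRouter`, `request`, `handle_message` and the `id` field are hypothetical names, not part of ccxt.

```python
import asyncio
import itertools


class RequestRouter:
    """Hypothetical helper: pairs each outgoing request with an awaitable reply."""

    def __init__(self, send):
        self._send = send               # coroutine function that writes one message
        self._ids = itertools.count(1)  # locally generated request ids
        self._pending = {}              # request id -> Future awaiting the reply

    async def request(self, payload, timeout=5.0):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await self._send({**payload, 'id': request_id})
            # The caller gets the reply as the return value, not via a callback.
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(request_id, None)

    def handle_message(self, message):
        """Call for every incoming message; True if it resolved a pending request."""
        future = self._pending.get(message.get('id'))
        if future is None or future.done():
            return False
        future.set_result(message)
        return True
```

Because replies are looked up by id, their arrival order does not matter. For exchanges without request ids this lookup is not available, and some other pairing rule would be needed.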

Nils Diefenbach @nlsdfnbch Nov 10 15:11
@kroitor, that makes sense.
So, websockets should quietly replace the REST interfaces, where possible - is 
that fair to say?
I'm just trying to wrap my head around the requirements, sorry if the questions 
are tedious - but I'd rather ask stupid questions first, just to get the lay of 
the land right off the bat 

Igor Kroitor @kroitor Nov 10 15:19
@nlsdfnbch sure, questions are absolutely ok, i'd say yes, most users would want 
faster reaction times, so we should replace HTTP rest with WS... But we would 
prefer to keep WS optional, until it is stable enough, or to migrate gradually... 
there's a lot more that can go wrong with a live connection in comparison to flat 
http rest... i'd start with a feature-rich ws api, like Bitfinex, they support 
most of functionality that we need, if we have a reference implementation of it 
in Python, we can carry on with the other exchanges from there... We also have 
several exchanges, including Bitfinex, implemented, but that's JS code, don't 
know if it makes any difference, because there's a lot of implementation on 
GitHub as well, for most of exchanges.

Nils Diefenbach @nlsdfnbch Nov 10 15:21
@kroitor cool, cool. Which implementation are you using? Mine, I hope :D
I have found that the biggest issue with the websocket APIs is the diversity of 
protocols used. Bitfinex uses a plain WS, bitstamp uses Pusher and poloniex uses 
(god knows why) the WAMP protocol. So that makes implementing them a pain.
But if you're fine with using existing implementations and simply adjusting them 
to your needs, I gather you do not require a 'standardized' websocket class ? 
Basically just hook it up to the Tier1 API and let fly?

Igor Kroitor @kroitor Nov 10 15:27
@nlsdfnbch I mean the implementations that we did use are in JS, but yes, your 
bitfinex api looks very good, so maybe we should start integrating some parts of 
it into our async version...
I'm still not sure, but I think I would still say that we may want a standardized 
ws baseclass, because we support three languages at once. And we don't want to 
implement ws for bitfinex 3 times. So we have to make it in one language and 
transpile it to other languages. We've chosen JS as a source language, but this 
is not a problem, because most of asyncio code can be mapped to JS one-to-one 
(we will do that if needed).
As for the protocols, it's a real pain indeed, Bittrex uses SignalR (it's not 
documented, but we managed to reverse-engineer it)...

Nils Diefenbach @nlsdfnbch Nov 10 15:29
But why do you require it in all three languages (that cross-compiler is really 
impressive, btw)? Couldn't you just provide it in one, and have the API cross-compiled?
Looking at this problem, I find it's quite similar to the one I'm currently 
solving in another project I work on - it uses ZMQ to transport websocket data 
from Python, Java and Node.js to C.
But I'm not sure how open you are to this, since this would introduce a pretty 
big dependency to the project (albeit I believe it would definitely make work 
on the websockets easier)

Igor Kroitor @kroitor Nov 10 15:35
By "implementing WS for bitfinex" I don't mean the low-level connectivity, that 
thing is language-specific, I mean their higher level exchange protocol/format 
over ws. That thing should be transpiled.
Our code is split into two main parts:
the base exchange class that is language-specific, because most of base 
functions cannot be transpiled to other languages without a significant effort, 
and this isn't our goal... the base class handles all low-level networking in 
language-specific fashion... PHP doesn't know anything about asyncio. So, the 
need for the base classes is obvious.
the derived exchange classes, those are transpiled, and those don't have any 
language-specific code in them, they mostly look like plain c with single-line 
short statements, the transpiler maps lines one-to-one...
@DeeDab the issue with Livecoin quoteVolume was fixed as of v.1.10.53, also note 
that the former quoteVolume is now moved to baseVolume field, and the quoteVolume 
field now equals baseVolume * VWAP
@nlsdfnbch the main practical consideration behind our structuring: we don't want 
repetitive work, so we mostly have everything language-specific in the base class, 
we still have to maintain three of them for each particular language, but at least 
we don't have to maintain each derived exchange class in three languages.

Igor Kroitor @kroitor Nov 10 15:41
transpiling the base class is beyond our scope for now, as that would be 
comparable to writing our own AST recompiler, which is an overkill we think
we can do three base classes in three days, but we can't do in three days a 
full-featured AST recompiler that would allow any-to-any recompilation, so we 
choose the path of least resistance here

Nils Diefenbach @nlsdfnbch Nov 10 15:55
Ok, so you implement a low-level websocket connection object (as a base class) 
in each language, and then write a single JS wrapper for it, which is then 
compiled into php and python. is that about right as far as steps in development 
goes?
Where the connection object does nothing else but connect and accept/send data 
to and from the WS API, and the wrapper then does the rest (formatting, 
callbacks etc)?

Igor Kroitor @kroitor Nov 10 16:01
> Ok, so you implement a low-level websocket connection object (as a base class) 
> in each language, and then write a single JS wrapper for it, which is then 
> compiled into php and python. is that about right as far as steps in development 
> goes?

This is correct, exactly.

> Where the connection object does nothing else but connect and accept/send data 
> to and from the WS API, and the wrapper then does the rest (formatting, callbacks 
> etc)?

Apart from the callbacks everything is correct. We don't use callbacks in common 
sense (as passable function pointers). Those are poorly-transpileable. What we 
do instead, is allow the user to redefine/override pure methods in order to 
receive notifications or change some behaviour of an exchange instance...
I'd put it like this:
The connection object does everything that is common to all exchanges and 
everything that is language-specific at the same time.
The wrapper does everything that is exchange-specific. To understand what is 
language-specific and what isn't here's our outdated CONTRIBUTING doc: 
https://github.com/ccxt/ccxt/blob/master/CONTRIBUTING.md
the general rule of thumb is: when it looks like C then it is transpileable
we have to abandon the syntax-sugar that is also language-specific
so our derived classes only contain code that is designed to be transpileable,
 and everything inside derived classes looks like C, because it gets mapped from 
 JS to other languages line-to-line (this is a "wrapper" in your terms)

Igor Kroitor @kroitor Nov 10 16:10
So, if you envisioned some straightforward userland API like 
subscribe (event_type, callback) - we can't do that in a portable way. Mostly 
because of the differences between async JS and Python asyncio + PHP is 
synchronous by nature, parallelism in PHP is mostly threaded

Igor Kroitor @kroitor Nov 10 16:15
The userland API will look something like:
foo = new ccxt.bar ({
  subscribeToOrderBookFeeds: symbolsList,
  onSubscriptionChannelChange: function (args) {  // this is the only method of userland API
      // it's not clear what should be passed in the args
      // we think all data should be cached on the exchange instance
      // so we don't get notifications in the args, and instead of that we will probably
      // work with cached WS data on this./self. in a 'dirty' manner
  },
})
↑ this is a sub part only

Nils Diefenbach @nlsdfnbch Nov 10 16:16
Honestly, I'm not even a fan of using callbacks - I just had to put up with the 
pusher client (which does this, extensively) and I had absolutely no joy using it.
My first idea would have been to set up websocket to be available programmatically, 
sans the fuss with authentication and subscribing etc. So basically, subscribe 
to all and everything, format it, so it's unified and let the rest of the program 
sort out what it needs from the stream.

Igor Kroitor @kroitor Nov 10 16:17
in the above example the overridden pure method that gets passed in the 
constructor kinda replaces the callback
we can override it by subclassing as well (in userland)
the example is pseudocode of course, but it should be clear
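
A Python rendering of the same idea, as a sketch only: the notification is an overridable method rather than a registered callback, and it can be replaced either through the constructor config or by subclassing. `BaseExchange`, `handle_order_book_update` and `on_subscription_channel_change` are hypothetical names chosen for illustration, not the ccxt API.

```python
class BaseExchange:
    """Hypothetical base class; not the actual ccxt Exchange."""

    def __init__(self, config=None):
        self.orderbooks = {}  # cached state, keyed by symbol
        # Attach config entries as instance attributes, so a method can be
        # overridden at construction time as well as by subclassing.
        for key, value in (config or {}).items():
            setattr(self, key, value)

    def on_subscription_channel_change(self, symbol):
        """Overridable hook; the base implementation does nothing."""

    def handle_order_book_update(self, symbol, orderbook):
        self.orderbooks[symbol] = orderbook          # update the cached state first
        self.on_subscription_channel_change(symbol)  # then notify the hook


class MyExchange(BaseExchange):
    def on_subscription_channel_change(self, symbol):
        # Read the cached data from the instance instead of from the arguments.
        print(symbol, 'best bid:', self.orderbooks[symbol]['bids'][0])
```

A function passed through the config is stored on the instance, so it is called with the symbol only and has no `self`; the subclass variant is the one that can read the cached state.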

Igor Kroitor @kroitor Nov 10 16:29
@nlsdfnbch let me know if it doesn't make sense

Nils Diefenbach @nlsdfnbch Nov 10 16:32
It makes sense. I think. Pretty sure it'll be clearer once I got a chance to 
look at the codebase properly.

Igor Kroitor @kroitor Nov 10 17:06
let me know if you have any difficulties with it, there may be some not very 
obvious code here and there, but i'm sure we will get over all of that eventually

samholt @samholt Nov 10 17:20
Just caught up with everything, great comments so far, I agree exactly with the 
derived base class for the Websockets, and then the wrapper in C like fashion 
for transpilation. I'll submit a PR later tonight with an example of how I see 
this for async python, for Bitfinex, be interesting to then talk about the 
implementation further. @nlsdfnbch not sure if you're new to async python, however 
I found this tutorial excellent, and prepares you well for working with it
 https://pymotw.com/3/asyncio/

Nils Diefenbach @nlsdfnbch Nov 10 17:32
Hey @samholt , I am somewhat new; I understand the concepts, but as I mentioned 
in my previous message to you, I'm versed in designing stateless systems, and 
thus haven't had need to use asyncio. I'll give the tutorial a swing.
I believe you also asked what I use my bitex library for - it's really our legacy 
data acquisition framework for crypto currencies. We wanted to do some tests and 
needed data, hence an interface was required. So I hacked that together, and it 
since has evolved into a half-way decent framework.
Currently, the bitfinex websocket client I maintain is used in one of my closed 
source projects that I work on. It too started out as a dirty hack and became a 
bit more sophisticated over time.

Igor Kroitor @kroitor Nov 10 17:58
i guess this happens to all opensource projects... we are evolving in exactly 
the same way, started off as a closed-source (for our own needs), then decided 
to publish it

Take 2

This is the current accepted plan.

Nils Diefenbach @nlsdfnbch Nov 12 12:41
@kroitor, I'd start working on the base class if you haven't already started on it?

Igor Kroitor @kroitor Nov 12 13:51
Hi, @nlsdfnbch !
samholt wrote:
> Just a quick update on the web sockets, have a few questions on the implementation. 
> Current version is here: https://github.com/samholt/ccxt/tree/master/python/ccxt/async/ws
> With the appropriate tests: https://github.com/samholt/ccxt/blob/master/python/test/test_ws_async.py

So, I guess, he has started that work and we can carry on from there
I contacted him and asked to join this channel in order to synchronize the effort

samholt @samholt Nov 12 13:55
Hi @nlsdfnbch Yes, I'll make you a contributor

Igor Kroitor @kroitor Nov 12 14:01
> Also for the web socket order books, the incoming deltas are used to 
> build / correct the order book in memory. Was thinking only to convert the order 
> books to standard output order book format of ccxt.exchange.fetch_order_book, 
> when the user wants the order book ? Otherwise just keep building the latest 
> order book using the deltas.

Makes sense, we don't want to waste on excessive conversions, so, we may need an 
"internal" representation of an orderbook, indexed by an order_id (GDAX L3) or 
order price (most of exchanges return L2)

> Also what are the biggest challenges / things I need to change before considering 
> to merge this ?

The biggest to my mind would be to decouple the "subscribed" string and all 
other format-related string literals and values from base class to derived 
classes, that particular one ("subscribed") is Bitfinex-specific
I mean this method of the base class:
async def event_handler(self, response):
        """ Handles the incoming responses"""
        data = ujson.loads(response.data)
        if isinstance(data, dict):
            if data['event'] == 'subscribed':
                print('Subscribed to channel: {0}, for pair: {1}, on channel ID: {2}'.format(data['channel'], data['pair'], data['chanId']))
                self.channel_mapping[data['chanId']] = (data['channel'], data['pair'])
            elif data['event'] == 'info':
                print('Exchange: {0} Websocket version: {1}'.format(self.id, data['version']))
        elif isinstance(data, list):
            if isinstance(data[1], str):
                print('Heartbeat on channel {0}'.format(data[0]))
            else:
                # Published data, time stamp and send to appropriate queue
                timestamp = self.microseconds() / 1000
                datetime = self.iso8601(timestamp)
                if self.channel_mapping[data[0]][0] == 'book':
                    pair_id = self.channel_mapping[data[0]][1]
                    await self.queues['orderbooks'][pair_id].put((data, timestamp, datetime))
the data isn't guaranteed to have 'event' key if you apply the same code to other exchanges
all request/response formats differ and depend on the exchange (exchange-specific)
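
One way the decoupling could look, sketched under assumptions: the base class only decodes the frame and delegates, while every format-specific key and literal lives in the derived class. `WebsocketBase`, `BitfinexLike` and `handle_message` are hypothetical names, and the standard `json` module stands in for `ujson` to keep the sketch dependency-free.

```python
import json


class WebsocketBase:
    """Hypothetical exchange-agnostic base: decodes, then delegates."""

    def __init__(self):
        self.channel_mapping = {}

    def event_handler(self, raw):
        # No exchange-specific keys or literals here.
        self.handle_message(json.loads(raw))

    def handle_message(self, data):
        raise NotImplementedError('derived exchange classes must implement this')


class BitfinexLike(WebsocketBase):
    """Keeps the Bitfinex-style literals from the snippet above in the derived class."""

    def handle_message(self, data):
        # .get() avoids a KeyError for messages that carry no 'event' key.
        if isinstance(data, dict) and data.get('event') == 'subscribed':
            self.channel_mapping[data['chanId']] = (data['channel'], data['pair'])
```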

Igor Kroitor @kroitor Nov 12 14:07
also, I have an implementation of Bitfinex in JS
I'll need some time to find it in my older backups, and I'll post a gist of it 
here (will do that asap), we may be able to use them for comparison in order to 
align further development efforts
The topmost priority would be to remain backward-compatible, so, users would 
have all the same external API methods and just set { websocket: true } in the 
constructor. This would be an ideal solution for us (and for the majority of 
users, we believe)

Igor Kroitor @kroitor Nov 12 14:14
we don't want onOrderBookUpdated() in the userland or that kind of callbackish-nonsense
what we want is just basic promises / futures / coroutines and awaits
So, in terms of @samholt's question, the method that will do a conversion from 
an internal representation of an orderbook to the format documented in the Manual 
would be called... surprise-surprise, fetchOrderBook! )
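
A minimal sketch of that split between internal state and on-demand conversion, for a price-indexed (L2) book. `L2OrderBook`, `apply_delta` and `to_unified` are hypothetical names; the delta convention (amount 0 removes a level) is an assumption, since real exchanges differ; and the output is a simplified bids/asks layout rather than the full documented structure.

```python
class L2OrderBook:
    """Hypothetical internal order book, indexed by price on each side."""

    def __init__(self):
        self.bids = {}  # price -> amount
        self.asks = {}  # price -> amount

    def apply_delta(self, side, price, amount):
        """Assumed convention: amount 0 removes the level, otherwise it replaces it."""
        levels = self.bids if side == 'bids' else self.asks
        if amount == 0:
            levels.pop(price, None)
        else:
            levels[price] = amount

    def to_unified(self):
        """Convert only when asked: bids sorted descending, asks ascending."""
        return {
            'bids': [[p, a] for p, a in sorted(self.bids.items(), reverse=True)],
            'asks': [[p, a] for p, a in sorted(self.asks.items())],
        }
```

Applying a delta is a dictionary update, while the sort happens only in `to_unified`, so frequent updates stay cheap when the user reads the book rarely.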

Igor Kroitor @kroitor Nov 12 14:20
And the main difficulty here: that same fetchOrderBook should subscribe you to
 order book updates for that particular pair, if you are not subscribed yet
The same is true for all other fetch* methods
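
The subscribe-on-first-fetch behaviour could be sketched as follows. Everything here is hypothetical: `WsExchangeSketch` is not ccxt code, and `subscribe` fakes the exchange handshake with an empty snapshot so that the example is self-contained.

```python
import asyncio


class WsExchangeSketch:
    """Hypothetical; subscribe() stands in for the exchange-specific handshake."""

    def __init__(self):
        self.subscriptions = {}  # symbol -> Event set once the first snapshot arrived
        self.orderbooks = {}     # symbol -> latest cached book
        self.subscribe_calls = 0

    async def subscribe(self, symbol):
        self.subscribe_calls += 1
        await asyncio.sleep(0)   # placeholder for sending the subscribe request
        self.orderbooks[symbol] = {'bids': [], 'asks': []}  # pretend snapshot
        self.subscriptions[symbol].set()

    async def fetch_order_book(self, symbol):
        ready = self.subscriptions.get(symbol)
        if ready is None:
            # First call for this symbol: register the subscription, then subscribe.
            ready = self.subscriptions[symbol] = asyncio.Event()
            await self.subscribe(symbol)
        await ready.wait()       # concurrent callers wait for the same snapshot
        return self.orderbooks[symbol]
```

Registering the `Event` before awaiting `subscribe` means that two overlapping calls for the same symbol result in a single subscription.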

Igor Kroitor @kroitor Nov 12 14:35
This task in itself may be harder than unifying HTTP REST, because of the 
diversity of exchange formats, constants, there are also some differences in 
WS subscription/auth logic
For example, with OKCoin you can subscribe for an orderbook feed and... they 
send you the entire orderbook once per second (no deltas, dunno what's the 
point of ws then).

samholt @samholt Nov 12 14:53
Excellent suggestions, I just renamed/implemented the fetchOrderBook method. 
Please download the version here: https://github.com/samholt/ccxt/tree/master/python/ccxt/async/ws
and run test: https://github.com/samholt/ccxt/blob/master/python/test/test_ws_async.py
Will add the subscribe to order book check shortly
Anyone wishing to contribute, please do contact me and I will add you to the fork 
@kroitor I agree the event handler in its current state overfits massively for 
the bitfinex api, it was used to get something working, then start adding further 
exchanges, and change the implementation to a more generalised one as more 
exchanges are added, so we can clearly work out how to handle the overlap, and 
the uniqueness of each exchange.
I'll focus on getting GDAX order books in next

samholt @samholt Nov 12 15:00
@nlsdfnbch I found this helpful: 
http://aiohttp.readthedocs.io/en/stable/client_reference.html?highlight=ws_connect#aiohttp.ClientSession.ws_connect. 
We may have to think how we handle errors such as timeouts, and/or overflowing 
queues if the user's CPU cannot process the incoming queues fast enough
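
For the overflowing-queue case, one possible policy is a bounded `asyncio.Queue` that drops the oldest item when full. This is a sketch of one option, not what the branch does; `put_latest` is a hypothetical helper name.

```python
import asyncio


def put_latest(queue, item):
    """Enqueue without blocking; when full, drop the oldest item to make room.

    Returns the number of items that were dropped.
    """
    dropped = 0
    while True:
        try:
            queue.put_nowait(item)
            return dropped
        except asyncio.QueueFull:
            try:
                queue.get_nowait()  # discard the stalest update
                dropped += 1
            except asyncio.QueueEmpty:
                pass                # the queue was drained in the meantime; retry
```

Dropping is only safe for feeds where each message supersedes the previous one, such as tickers or full snapshots. For delta-based order books a dropped message would corrupt the in-memory book, so a resync would be needed instead.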

Igor Kroitor @kroitor Nov 12 15:07
@samholt awesome! thx and respect! feel free to dm me anytime for answers, i 
hope to join the development going in that branch asap

samholt @samholt Nov 12 16:43
Added automatic subscribe if not already subscribed to exchange.fetchOrderBook(symbol)

Igor Kroitor @kroitor Nov 12 16:49
Some of our old JS implementations (those are really outdated and don't fit into current ccxt as it is):
Bitfinex: https://gist.github.com/kroitor/4d4638c0f6e89602105c6438f5acd057
Bitmex: https://gist.github.com/kroitor/abcbca36976d8142ef0276da30c9be5b
OKCoin: https://gist.github.com/kroitor/0350cfd07815607c70d36b36c25d8aa5
Don't know if it makes any sense to post that now, but... just in case, maybe we 
would need this later, who knows.
my handleEvent for Bitfinex from the above link looks like this:
handleEvent (message) {

        const [ channelId, ... data ] = message
        const event = this.events[channelId]

        if (event) {
            if (event.channel === 'book')
                this.handleOrderBookEvent (event, data)
            else if (event.channel === 'trades')
                this.handleTradeEvent (event, data)
            else if (event.channel === 'ticker')
                this.handleTickerEvent (event, data)
            else if (event.channel === 'auth')
                this.handleUserEvent (event, data)
            else
                log.e (this.logid, (new Date ()).toString (), 'message in unknown channel', channelId, message)
        }
    },
@samholt looks quite similar to what you had in Python

FOLLOW UP: ccxt/ccxt#751
