@kroitor
Last active October 24, 2019 01:03
ws draft

Streaming

The streaming WebSocket API is currently under development. Below are the key design considerations for supporting WebSockets in the ccxt library.

In general, not all exchanges offer WebSockets, but many of them do. Exchanges' WebSocket APIs can be classified into two different categories:

  • sub or subscribe allows receiving only
  • pub or publish allows sending and receiving

Sub

A sub interface usually allows the user to subscribe to a stream of data and listen for incoming updates. Most exchanges that support WebSockets offer a sub type of API only. The sub type includes streaming public market data. Sometimes exchanges also allow subscribing to private user data. After the user subscribes to a data feed, the channel effectively works one-way, continuously sending updates from the exchange to the user.

  • Commonly appearing types of public market data streams:
    • order book (most common) - updates on added, edited and deleted orders (aka change deltas)
    • ticker - updates upon a change of 24 hour stats
    • fills feed (also common) - a live stream of public trades
    • exchange chat
  • Less common types of private user data streams:
    • the stream of trades of the user
    • balance updates
    • custom streams
    • exchange-specific and other streams

Pub

A pub interface usually allows users to send data requests towards the server. This usually includes common user actions, like:

  • placing and canceling orders
  • placing withdrawal requests
  • posting chat messages
  • etc

Most exchanges do not offer a pub WS API; they offer a sub WS API only. However, some exchanges do have a complete WebSocket API.

Unified WS API

In most cases a user cannot operate effectively with the WebSocket API alone. Exchanges will stream public market data (sub), and the HTTP REST API will still be required for the pub part wherever it is missing from the WebSocket API.

The goal of ccxt is to seamlessly combine all available types of networking in a unified interface, if possible without introducing backward-incompatible changes.

The WebSocket API in ccxt consists of the following:

  • the pull (on-demand) interface
  • the push (notification-based) interface

The pull WebSocket interface replicates the async REST interface one-to-one. So, to switch from REST to pull WebSocket + REST, the user only has to pass a { ws: true } option in the constructor params. From there, any call to the unified API will be switched to WebSockets wherever the exchange supports it.

The pull interface means that the user pulls data from the library by calling its methods, while the data is fetched and merged in the background. For example, whenever the user calls the fetchOrderBook (symbol, params) method, the following sequence of events takes place:

  1. If the user is already subscribed to the orderbook updates feed for that particular symbol, the returned value would represent the current state of that orderbook in memory with all updates up to the moment of the call.
  2. If the user is not subscribed to the orderbook updates for that symbol yet, the library will subscribe the user to it upon first call.
  3. After subscribing, the library will receive a snapshot of the current orderbook. This is returned to the caller right away.
  4. It will continue to receive partial updates from the exchange in real time, merging all updates with the orderbook in memory. Each incoming update is called a delta. Deltas represent changes to the orderbook (order added, edited or deleted) that have to be merged on top of the last known snapshot of the orderbook. These deltas keep arriving continuously, as soon as the exchange sends them.
  5. The ccxt library merges deltas into the orderbook in the background.
  6. If the user calls the same fetchOrderBook method again – the library will return the up-to-date orderbook with all current deltas merged into it (return to step 1 at this point).
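
The snapshot-plus-deltas sequence above can be sketched roughly as follows. This is only an illustration under stated assumptions: the delta format ({ side, price, amount }, with an amount of 0 meaning removal) and the applyDelta helper are hypothetical, since each exchange defines its own message format, and none of this is the actual ccxt implementation.

```javascript
// Hypothetical delta format: { side: 'bids' | 'asks', price, amount }.
// An amount of 0 is assumed to mean "this price level was removed".
function applyDelta(orderbook, delta) {
    const levels = orderbook[delta.side]
    const index = levels.findIndex(([price]) => price === delta.price)
    if (delta.amount === 0) {
        if (index >= 0) levels.splice(index, 1)      // order deleted
    } else if (index >= 0) {
        levels[index][1] = delta.amount              // order edited
    } else {
        levels.push([delta.price, delta.amount])     // order added
        // keep bids sorted from the highest price, asks from the lowest
        const direction = (delta.side === 'bids') ? -1 : 1
        levels.sort((a, b) => direction * (a[0] - b[0]))
    }
    return orderbook
}

// Step 3: the snapshot received right after subscribing.
const orderbook = {
    bids: [[100, 1.5], [99, 2.0]],
    asks: [[101, 0.7], [102, 3.0]],
}

// Steps 4-5: deltas arrive and are merged in the background.
const deltas = [
    { side: 'bids', price: 100.5, amount: 0.3 }, // added
    { side: 'asks', price: 101, amount: 0 },     // deleted
    { side: 'bids', price: 99, amount: 2.5 },    // edited
]
for (const delta of deltas) applyDelta(orderbook, delta)

// Step 6: the next fetchOrderBook call would return this merged state.
console.log(orderbook)
```

After the three deltas the bids are 100.5, 100 and 99 (the last one with the edited amount), and only the 102 ask remains.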

The above behaviour is the same for all methods that can get data feeds from an exchange's WebSocket API. The library will fall back to HTTP REST and send an HTTP request if the exchange does not offer streaming for a given type of data.
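
A rough sketch of how such routing could look is given below. The SketchExchange class, its streamable set and the viaWebSocket / viaRest stubs are all hypothetical names made up for this illustration; only the { ws: true } constructor option and the unified method names come from the text above.

```javascript
// Hypothetical stand-in for an exchange class; these are not real ccxt internals.
class SketchExchange {
    constructor({ ws = false } = {}) {
        this.ws = ws
        // unified methods this imaginary exchange is assumed to be able to stream
        this.streamable = new Set(['fetchOrderBook', 'fetchTicker'])
    }

    // stubs standing in for the two transports
    async viaWebSocket(method, symbol) {
        return { method, symbol, transport: 'ws' }
    }

    async viaRest(method, symbol) {
        return { method, symbol, transport: 'rest' }
    }

    // one entry point: WebSocket when enabled and supported, HTTP REST otherwise
    async call(method, symbol) {
        if (this.ws && this.streamable.has(method)) {
            return this.viaWebSocket(method, symbol)
        }
        return this.viaRest(method, symbol)
    }

    fetchOrderBook(symbol) { return this.call('fetchOrderBook', symbol) }
    fetchTrades(symbol) { return this.call('fetchTrades', symbol) }
}

async function demo() {
    const exchange = new SketchExchange({ ws: true })
    const orderbook = await exchange.fetchOrderBook('BTC/USD') // streamed
    const trades = await exchange.fetchTrades('BTC/USD')       // falls back to REST
    return [orderbook.transport, trades.transport]
}

demo().then((transports) => console.log(transports))
```

The caller's code is the same in both cases; only the transport behind the call differs, which is the point of keeping the pull interface identical to the REST one.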

The list of related unified API methods is:

  • fetchOrderBook
  • fetchOrderBooks
  • fetchTicker
  • fetchTickers
  • fetchTrades
  • fetchBalances
  • fetchOrders
  • fetchOpenOrders
  • fetchClosedOrders
  • fetchMyTrades
  • fetchTransactions
  • fetchDeposits
  • fetchWithdrawals

The push interface contains all of the above methods but works in reverse: the library pushes the updates to the user. This is done in two ways:

  • callbacks (JS, Python 2 & 3, PHP)
  • async generators (JS, Python 3.6+)

Async generators are the preferred modern way of reading and writing streams of data. They do the work of callbacks with a much more natural syntax that is built into the language itself. A callback is a mechanism of inverted flow control. An async generator, on the other hand, is a mechanism of direct flow control. Async generators make code much cleaner and sometimes even faster, in both development and production.
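
To make the difference concrete, here is a minimal sketch that consumes the same feed both ways. The subscribeWithCallback and subscribeAsGenerator helpers are hypothetical stand-ins for a push interface (they simply replay a fixed array of ticker updates) and are not part of ccxt.

```javascript
// Hypothetical feed: emits the given updates one by one on separate timer ticks.
function subscribeWithCallback(updates, onUpdate, onEnd) {
    let i = 0
    const next = () => {
        if (i < updates.length) {
            onUpdate(updates[i++])
            setTimeout(next, 0)
        } else {
            onEnd()
        }
    }
    setTimeout(next, 0)
}

// The same hypothetical feed exposed as an async generator.
async function* subscribeAsGenerator(updates) {
    for (const update of updates) {
        await new Promise((resolve) => setTimeout(resolve, 0))
        yield update
    }
}

const tickers = [{ last: 100 }, { last: 101 }, { last: 99 }]

// Inverted flow control: the feed decides when the handler runs.
function consumeWithCallback() {
    return new Promise((resolve) => {
        const seen = []
        subscribeWithCallback(tickers, (ticker) => seen.push(ticker.last), () => resolve(seen))
    })
}

// Direct flow control: the consumer is an ordinary loop that can break or await at will.
async function consumeWithGenerator() {
    const seen = []
    for await (const ticker of subscribeAsGenerator(tickers)) {
        seen.push(ticker.last)
    }
    return seen
}

consumeWithCallback().then((seen) => console.log('callback:', seen))
consumeWithGenerator().then((seen) => console.log('generator:', seen))
```

Both consumers collect the same three prices. In the generator version the consuming code keeps its local state in a plain loop instead of spreading it across handlers.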

@pursehouse

I haven't used it but the PECL event extension seems like it might add the async to a PHP implementation.

It seems like adding that "ws: true" command would make things more complicated to deal with. Basically a hidden parameter that would have to be checked and passed in every method. And wouldn't reveal which are available as a socket compared to only as rest. Which would affect coding decisions.
Wouldn't it be better if the rest and socket classes were separate and extended from a base class per exchange?

@TimNZ

TimNZ commented Jan 27, 2018

Ambitious for a v1 release.

First implement WS standalone with unified API and release that, and then you can build on there with integrated between REST and WS and merging of data and the pipeline in #1 -> #6

@kroitor
Author

kroitor commented Jan 28, 2018

Ambitious for a v1 release.

That was the whole point, we don't want a less convenient interface, because there's a ton of that non-unified shit available online already. Most of existing websocket implementations are unusable in real action because they aren't designed with unification in mind.

First implement WS standalone with unified API and release that, and then you can build on there with integrated between REST and WS and merging of data and the pipeline in #1 -> #6

It is still impossible for the moment and nobody has proven the opposite yet. If it was doable in a manner like that, "first do ws, then think how to attach it" – we would have done it three times already. This is why we are doing it the other way from the very beginning.

@pursehouse, @TimNZ

Wouldn't it be better if the rest and socket classes were separate and extended from a base class per exchange?

The article above explains that most of the time the exchanges won't provide all necessary data via WS. So, 99% of the time the user will have to use REST+WS, and it logically follows that there's no point in separating the classes – you wouldn't want to have all the internal exchange communication burden. Practically in 100% of cases the REST will be used for loading markets (at the very least, but, probably, for many other things as well). In other words, I would not write the above text, if it was possible to dissect the task like that. You would not want to load internal market data from one class, then supply them to another separate class-call-chain.

When I say it's hard to design properly, I mean it. This is why it takes so long.

@TimNZ

TimNZ commented Jan 28, 2018

@kroitor I don't want to debate the different points of view on this.

You've been saying WS has been coming since July, and it has dragged out because it has been over-complicated by focusing on the ideal end Goal, instead of releasing in usable Steps.

As I commented in the related repo issue, I am not entitled, and appreciative of what the team has done, but a reasonable dose of constructive candidness is warranted here to help tangible progress happen faster.

Today I am happy with a simple unified WS API for exchanges that support it.
I believe a lot of people would be, and are waiting for this.
They can figure out how to mash REST and WS together for the time being.

Refinement can come next.

At this stage I'm also only after push/read-only for trades and depths and orders.
Don't need to push anything yet.
If I do, I can use REST.

I'd do a PR but I parked my crypto project for a little while, due to waiting for this, and a couple of other projects I prioritised.

Also I don't like doing large PR for repos unless there is a discussion before hand, and even more so when there are quite strongly opinionated repo members, where architecture approaches and implementation choices could be very different.

You seem really stuck on what You think you should be doing, vs what perhaps a large portion of the community would be happy with, in stages.

Just trying to be helpful, and more opinionated and vocal than most in how I go about doing that.

@kroitor
Author

kroitor commented Jan 29, 2018

@TimNZ have you actually tried adding at least two exchanges in three languages in a way that would not require writing exchange-specific code in all three languages later upon adding each next exchange? Because if you think of just a few exchanges in one language – there's a ton of libs for that out there already available on GitHub, npm, PyPI and elsewhere... do you see the point? We welcome all contributions and if you want to help – try doing it without breaking anything first. Theorising seems counter-productive from here. I will be happy to help, if you have any questions.

@TimNZ

TimNZ commented Jan 30, 2018

"@TimNZ have you actually tried adding at least two exchanges in three languages in a way that would not require writing exchange-specific code in all three languages later upon adding each next exchange?"

I wouldn't simultaneously try to support 3 languages and all the exchanges at once in a big bang approach.
I would start with one language, and the top 5 exchanges.

@illera88

I think, as Tim does, that focusing on getting just ticker information and order book information is good enough to start. Actually a lot of exchanges only provide that information through websockets.

I've been implementing code to collect that info using ws and I think there is not a single silver bullet to make it work in all exchanges. I think it is better to focus on a few ones and keep going.

Thanks for your effort

@kroitor
Author

kroitor commented Jan 31, 2018

@TimNZ

I wouldn't simultaneously try to support 3 languages and all the exchanges at once in a big bang approach.
I would start with one language, and the top 5 exchanges.

This is the exact reason why we have a thousand of single-language-wrappers for 5 exchanges ) Already available on GitHub in all flavors ) And they usually stop right there, no further progress, guess why? ) I can post links to 10 examples of such wrappers right away, but think of it, do you want the ccxt project to be a yet another one wrapper of that kind?

Oh, and yes, we also made one... and it is open-sourced here:

Igor Kroitor @kroitor Nov 12 16:49
Some of our old JS implementations (those are really outdated and don't fit into current ccxt as it is):
Bitfinex: https://gist.github.com/kroitor/4d4638c0f6e89602105c6438f5acd057
Bitmex: https://gist.github.com/kroitor/abcbca36976d8142ef0276da30c9be5b
OKCoin: https://gist.github.com/kroitor/0350cfd07815607c70d36b36c25d8aa5

Those implementations are crap, I would never merge code like that. These links have been published in November and available since then...

@mmehrle

mmehrle commented Feb 3, 2018

Just caught this page today - great summary and it ups the ante on a comprehensive and very ambitious project. I won't be able to contribute on this end but wanted to throw in that I really enjoy using ccxt in its current REST-only form. It's extremely well designed/structured and despite its short existence whatever is being offered works almost flawlessly. Over the long term these are exactly the aspects that people are looking for when choosing an API for serious projects. It's important that adding new functionality doesn't make a mess of things just to squeeze in a bag of new tricks. Especially since exchanges are constantly evolving and thus extensibility is a prime consideration. And I haven't even considered the multi-language equation, so Kroitor et al. have their work cut out for them. Anyway, keep on rocking - I'll be happy to devote some time for alpha/beta testing when it's ready.

@dmitriz

dmitriz commented Apr 4, 2018

I have been following this interesting discussion and wonder how natural it would be to use
a streaming library, such as https://github.com/paldepind/flyd or https://github.com/cujojs/most

Here are some simple examples of streams generated from clicks or web sockets:
https://github.com/paldepind/flyd#creating-streams

Would it not be nice to have a single function call

var dataStream = read(source)
// creates read stream from the source

where the source object contains the entire information about the data source and how the stream is retrieved.
Then you can consume the stream in functional way like

dataStream.map(console.log)
// prints the stream content

Even better, the ordinary REST request can also be wrapped into the same stream type with single event as

var reqStream = fromPromise(apiReqPromise)
// stream wrapping the promise

That would make both REST and Socket returning exactly the same stream type,
that could be used interchangeably in the same code:

reqStream.map(console.log)
// prints the promise content with exactly the same syntax

So I can create the same general purpose function to see the stream's content in both cases:

var printStream = stream => stream.map(console.log)
printStream(dataStream)
printStream(reqStream)

Any thoughts?

@kroitor
Author

kroitor commented Apr 5, 2018

@dmitriz will look into the above libraries, however, I would avoid wrappers at all costs and would stick to returning plain JSON structs for compatibility and portability. Your proposal is JS-only, however, the entire library should be language-agnostic, so we can't really use language-specific wrappers and dependencies. The interface described in the spec above is designed to keep the existing ccxt API and returned types as per Manual.

@dmitriz

dmitriz commented Apr 6, 2018

@kroitor
It is of course a pity no plain structs are available for streams. On the other hand, it is perhaps easy to wrap the callback based API into proper stream types. The actual reason I brought up these streams, was the mention of the Async Generators, that are not yet part of the official API, as far as I understand.

I would be curious to compare the Generator based code with the Stream based one for typical tasks such as executing orders based on stream data feed. In particular, my concern is Generators might not be as convenient for real time data streams, because they give control to the consumer who does not know when data is pushed. It might be more suitable for simulations based on pre-recorded data, which can be implemented by a generator wrapper, but it is not quite clear how to combine both real time trades and simulation that way with maximum reusable code.

Your proposal is JS-only, however, the entire library should be language-agnostic, so we can't really use language-specific wrappers and dependencies.

There might be equivalent libraries offered by other langs if needed.
Then perhaps, a callback based api is enough, to be wrapped after.
Also, it might be less daunting task to begin with the JS version first,
and even focus on a limited subset such as streaming market data.

@kvdveer

kvdveer commented Apr 10, 2018

@dmitriz
Your proposal is very js-esque in its handling of events.

Javascript really embraces anonymous callbacks, whereas other languages generally avoid those. In languages like Java or c++, I'd expect to subclass something in order to add event handlers. In python and PHP, I'd expect to either subclass, or assign event handlers. I would personally prefer it if each language would use its own preferred idiom, so that's callback-heavy js, and event-handler-based php, py. For all languages, exchanges should be extendable using subclassing, I suppose - as that's probably easy to do.

Keep in mind that neither Java, nor Python, nor PHP offers multi-line anonymous callables. C++ does offer them, but they are not really commonly used (although this probably differs a lot per environment). This really precludes anonymous callbacks.

Note: I really can't judge other languages, as I'm not fluent in them, but I'm already pushing the scope by mentioning Java and C++.

@kvdveer

kvdveer commented Apr 11, 2018

I've implemented several WS api clients now, and I've found that one sometimes needs to do something with the raw (unparsed) message. For example: OkCoin/OkEx can send compressed messages. The message will be received as a binary, which you then decompress, and parse. Similarly, I've noticed that some exchanges bundle the messages, so one will receive one message, which will need to be dispatched to several receivers.

Ergo: we will need a method to parse the message (default will probably just load the JSON), and dispatch it to a more general dispatch function. I propose a signature like parseWsMessage(messageType, messageContent), which will call dispatchAsyncMessage(message). I believe this could be implemented in all languages, and would allow exchange-specific message handling. Perhaps this could even unify FIX and WS, where FIX would call dispatchAsyncMessage.

In summary, I propose the following exchange API:

function parseWsMessage(messageType:string, messageContent:object) {
    if (messageType === 'JSON')  {
        this.dispatchAsyncMessage(messageContent)
    }
}

function dispatchAsyncMessage(message:object) { /* to be implemented per exchange */ }

A common pattern I've seen is tokenized call/response message, e.g. you submit a RPC with a token, and you can identify the response as the token will be repeated in it. This pattern is present in most WS interfaces, so we will definitely need generic 'rpc waiting for answers' support. Sadly, there's also a common anti-pattern where RPC calls aren't tokenized, and you can only recognize the response by seeing what type of response it is. This disallows parallel tokens. To work around this, we need some locking system. Instead of porting locking primitives to all languages, I propose the following API, where untokenized exchanges use a fixed token, and the code prevents duplicate tokens in play:

const _pending_rpc_tokens = {} // token -> { promise, resolve, reject } for calls in flight
function generateWsRpcToken():string /* Language specific implementation */

async function wsRpc(method:string, args:object) {
   /* Language specific implementation */
   let rpc_token:string
   if (this.api.websocket[method].fixedToken) {
      rpc_token = this.api.websocket[method].fixedToken
   } else {
      rpc_token = this.generateWsRpcToken()
   }

   // disallow parallel reuse of tokens: wait until the previous call with this token settles
   while (_pending_rpc_tokens[rpc_token] !== undefined) {
      await _pending_rpc_tokens[rpc_token].promise.catch(() => {})
   }
   // a native promise has no resolve/reject methods, so keep the callbacks next to it
   const pending = {}
   pending.promise = new Promise((resolve, reject) => {
      pending.resolve = resolve
      pending.reject = reject
   })
   _pending_rpc_tokens[rpc_token] = pending
   await this.sendWsRpc(method, args, rpc_token)
   return await pending.promise
}

async function sendWsRpc(method:string, args:object, token:string) { /* to be implemented per exchange */}

function wsRpcResponse(token:string, result, isException) {
   /* Language specific implementation */
   // typically called from dispatchAsyncMessage
   const pending = _pending_rpc_tokens[token]
   if (pending !== undefined) {
      delete _pending_rpc_tokens[token]
      if (isException) {
         pending.reject(result)
      } else {
         pending.resolve(result)
      }
   }
}

@kroitor
Author

kroitor commented Apr 20, 2018

@kvdveer, thanks for your considerations on this! I'd agree with @dmitriz – let's talk in ccxt/ccxt#56 as, unfortunately, Gist format does not have means for notifications of new messages...

@pursehouse

pursehouse commented Apr 20, 2018

I think a step #1 would be to have a setup to compile the data you'd send to the web socket, and then a method to process the result of the web socket... then let the developers be able to at least use that then create their individual socket interaction solution.

and then, potentially... full solutions using whatever libraries could be built into this package for each language or extension packages for each style... like... "ccxt-php-websocket-pecl-eventlib" type of thing..

For sure people need to use the websockets on top of just the rest calls. I need websocket ability right now for EVERY exchange I can manage. I doubt I'm alone :)

@HoDaDor

HoDaDor commented Oct 24, 2019

Is this still being worked on? I notice that work on websocket implementation for ccxt has been discussed since 2017...

@kroitor
Author

kroitor commented Oct 24, 2019

@HoDaDor yes, read down from here for most recent news
