begin;
delete
from order_event
where trade_date < '2020-06-02'
and event_type != 'FILL'
and event_type != 'PARTIAL';
commit;
{"1":["cross","1","1","1","1","0","00FFFF","000000","FF0000","5","2","0","0","1","persistent"],"2":["circle","0","FFFFFF","000000","FF0000","2","18","2","50","0","1","persistent"],"3":["circle","1","FFFFFF00","000000","FF0000","2","22","1","50","0","0","persistent"],"d":1080} | |
{"1":["dot",0,"00FFFF","000000","FF0000","3","2","round","persistent","0"],"2":["circle","1","FFFFFF80","000000","FF0000","3","40","1","33","60","0","persistent"],"3":["none"],"d":1080} | |
6333f91b-22b6-434b-a068-a33f4936224e |
CREATE OR REPLACE VIEW view_dealbook AS
SELECT p.trade_date,
    p.source,
    p.group_id,
    p.create_us,
    p.market_type,
    p.hold_time,
    p.symbol,
    p.settlement_ccy,
    p.size,
update_spread_tiers_for_agents naive loop -> ~30s
w/ multiprocessing on update_spread_tiers_1_agent_from_file -> 6s
w/ multiprocessing on push_file_to_prod -> ?s
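For reference, a minimal sketch of the multiprocessing fan-out behind the 6s number above; the per-agent function here is only a stub standing in for the real update, and the agent list is made up.

# hedged sketch: parallelize the per-agent spread tier update with a process pool
from multiprocessing import Pool

def update_spread_tiers_1_agent_from_file(agent):
    """Placeholder for the real per-agent update described in the notes."""
    pass

def update_spread_tiers_for_agents(agents, processes=8):
    # fan the per-agent work out over a pool instead of a naive serial loop
    with Pool(processes=processes) as pool:
        pool.map(update_spread_tiers_1_agent_from_file, agents)

if __name__ == "__main__":
    update_spread_tiers_for_agents(["agent%03d" % i for i in range(100)])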
When kafkacat (or any other consumer) reads from a new topic, it automatically creates a 0-partition topic. As a result, the mdcs can't publish and the mdp and mbs can't consume, throwing .unwrap errors in the lb logs (UnknownTopicOrPartition).
On each broker (deploy@lucera-ld4-kz-<00|01|02>):
cd ~/src/kafka_2.10-0.8.2.1/config
add delete.topic.enable=true to server.properties to enable topic deletion
restart broker: stop_k; sleep 5; stop_zk; start_zk; sleep 5; start_k
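A quick way to confirm a topic got auto-created with no usable partitions: dump partition counts from cluster metadata. This is a hedged sketch reusing the confluent_kafka client and broker default from the producer snippet further down; list_topics just issues a metadata request.

# hedged sketch: print topic -> partition count from cluster metadata
from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'lumefx-mdp-prod00'})  # broker default from the producer snippet below
md = p.list_topics(timeout=10)
for name, topic in md.topics.items():
    # an accidentally auto-created topic shows up here with no (or the wrong number of) partitions
    print(name, len(topic.partitions))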
https://colab.research.google.com/drive/1YjGz2f74abea8SqOp09o3u7vaHsHFfrb
I've been experimenting with the main umap repo and testing / reproducing results in a Colab high-memory test env.
The project is still under very active development and it looks like the author is very responsive to issues and feature requests. I ran into a few quirks trying to build the latest master (0.4dev), which includes a lot of new optimizations and things like an inverse transform.
If you build the latest version without having pynndescent installed, it falls back to a more naive approximate nearest neighbors algorithm, which is very expensive RAM-wise (I was able to run it on up to ~7000 samples of MNIST, which used about 18 GB of RAM). Reverting to the latest stable release (0.3, with numba==0.46.0 and llvmlite==0.30), the memory usage went down to 2 GB, which tracks with the original paper, which ran on an 8 GB instance (https://arxiv.org/pdf/1802.03426.pdf#page=26&zoom=auto,-205,300).
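A minimal sketch of that stable-release setup (the "<0.4" pin spelling and the sklearn MNIST fetch/subsample are assumptions; the numba/llvmlite versions are the ones above):

# hedged sketch: pinned stable-release run on an MNIST subsample
# assumes: pip install "umap-learn<0.4" numba==0.46.0 llvmlite==0.30 scikit-learn
from sklearn.datasets import fetch_openml
import umap

X, _ = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X[:7000]  # in line with the ~7000-sample runs mentioned above

embedding = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)
print(embedding.shape)  # (7000, 2)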
Building the latest master with `
WITH cte AS (
    SELECT trade_date, source
    FROM (
        SELECT generate_series::date AS trade_date
        FROM generate_series(
            ($__timeFrom() AT TIME ZONE 'America/New_York' + interval '6h 58m')::date,
            ($__timeTo() AT TIME ZONE 'America/New_York' + interval '6h 58m')::date,
            '1 day'
        )
    ) c
import argparse
from random import randint

from confluent_kafka import Producer

parser = argparse.ArgumentParser()
parser.add_argument("--broker", default="lumefx-mdp-prod00", help="kafka broker")
parser.add_argument("--topic", default="test", help="topic to publish to (placeholder default)")
args = parser.parse_args()

p = Producer({'bootstrap.servers': args.broker})
# publish a random integer payload (placeholder message body) and block until delivered
p.produce(args.topic, value=str(randint(0, 100)).encode())
p.flush()
/ save in-memory tables to the db/fnfx_ny4 HDB under the previous trade date partition (parted on sym) and tell the HDB on port 5000 to reload; .data.tradedate appears to map a timestamp to its trade date
.Q.hdpf[`::5000;`:db/fnfx_ny4/;(.data.tradedate .z.p) - 1;`sym]