Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
(by @andrestaltz)
If you prefer to watch video tutorials with live-coding, then check out this series I recorded with the same contents as in this article: Egghead.io - Introduction to Reactive Programming.
I was talking to a coworker recently about general techniques that almost always form the core of any effort to write very fast, down-to-the-metal hot path code on the JVM, and they pointed out that there really isn't a particularly good place to go for this information. It occurred to me that, really, I had more or less picked up all of it by word of mouth and experience, and there just aren't any good reference sources on the topic. So… here's my word of mouth.
This is by no means a comprehensive gist. It's also important to understand that the techniques that I outline in here are not 100% absolute either. Performance on the JVM is an incredibly complicated subject, and while there are rules that almost always hold true, the "almost" remains very salient. Also, for many or even most applications, there will be other techniques that I'm not mentioning which will have a greater impact. JMH, Java Flight Recorder, and a good profiler are your very best friend! Mea
task :environment do | |
require './dj-sinatra' | |
end | |
namespace :jobs do | |
desc "Clear the delayed_job queue." | |
task :clear => :environment do | |
Delayed::Job.delete_all | |
end |
Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggerganov/llama.cpp#5962
In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.
See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix
-- show running queries (pre 9.2) | |
SELECT procpid, age(clock_timestamp(), query_start), usename, current_query | |
FROM pg_stat_activity | |
WHERE current_query != '<IDLE>' AND current_query NOT ILIKE '%pg_stat_activity%' | |
ORDER BY query_start desc; | |
-- show running queries (9.2) | |
SELECT pid, age(clock_timestamp(), query_start), usename, query | |
FROM pg_stat_activity | |
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' |
Producer | |
Setup | |
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1 | |
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3 | |
Single thread, no replication | |
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 |