What is an AI Coding Assistant?
If the coding assistant can't run ITERATIVE CRUD on ALL of your code, it's not a True AI Coding Assistant (TACA)
Must work on existing codebases
Must have a file context mechanism
On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of lossless compression tools and libraries. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed.
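If you want to check whether a machine picked up one of the affected releases (the compromised versions were 5.6.0 and 5.6.1), a quick starting point is:

# Print the installed xz version; the backdoor shipped in 5.6.0 and 5.6.1.
xz --version

Exact packaging and patch status vary by distribution, so treat the version string as a first check rather than a verdict.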
This worked as of 14 May 2023. The instructions will probably require updating in the future.
LLaMA is a text prediction model similar to GPT-2, or to the version of GPT-3 that has not been fine-tuned yet. It is also possible to run fine-tuned versions (like Alpaca or Vicuna) with this, I think; those versions are more focused on answering questions.
Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.
It is now possible to run LLaMA 13B with a 6GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change is CUDA/cuBLAS support, which lets you pick an arbitrary number of the transformer layers to run on the GPU. This is perfect for low-VRAM setups.
08737ef720f0510c7ec2aa84d7f70c691073c35d
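As a concrete sketch of what that looks like (the flag spelling is from llama.cpp around this commit and may have changed since; the model path is a placeholder):

# Offload 20 of the model's transformer layers to the GPU, keeping the rest in system RAM.
./main -m ./models/13B/ggml-model-q4_0.bin --n-gpu-layers 20 -p "Once upon a time"

Raising --n-gpu-layers uses more VRAM and runs faster; lower it until the model fits on your card.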
{
  "schema_version": "v1",
  "name_for_model": "twilio",
  "name_for_human": "Twilio Plugin",
  "description_for_model": "Plugin for integrating the Twilio API to send SMS messages and make phone calls. Use it whenever a user wants to send a text message or make a call using their Twilio account.",
  "description_for_human": "Send text messages and make phone calls with Twilio.",
  "auth": {
    "type": "user_http",
    "authorization_type": "basic"
  },
mp=1; size=7B    # to run 7B
# mp=8; size=65B # to run 65B instead

for seed in $(shuf -i 1-1000000 -n 1)  # one random seed; the original used a non-standard randint helper
do
export TARGET_FOLDER=~/ml/data/llama/LLaMA
time python3 -m torch.distributed.run --nproc_per_node $mp example.py --ckpt_dir $TARGET_FOLDER/$size --tokenizer_path $TARGET_FOLDER/tokenizer.model --seed $seed --max_seq_len 2048 --max_gen_len 2048 --count 0 | tee -a ${size}_startrek.txt
done
Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts on this (and similar) models, and where we stand with respect to language understanding.
Around 2014-2017, during the rise of neural-network-based methods for NLP, I was giving a semi-academic, semi-popsci lecture built around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labor costs", to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!"
ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?
I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.
The easiest way to "convert" MKV to MP4 is to copy the existing video and audio streams into a new container. This avoids any encoding, so no quality is lost; it is also a fairly quick process and requires very little CPU power. The main factor is disk read/write speed.
With ffmpeg this can be achieved with -c copy. Older examples may use -vcodec copy -acodec copy, which does the same thing. These examples assume ffmpeg is in your PATH; if not, just substitute the full path to your ffmpeg binary.
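For example, remuxing input.mkv (a placeholder filename) into an MP4 container:

# Copy the video and audio streams as-is into the new container; no re-encoding is done.
ffmpeg -i input.mkv -c copy output.mp4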
This is a translation of grugbrain.dev into clear English. All props to the original author.
This is a collection of thoughts on software development, originally written by a pseudonymous author styling themselves the "grug brain developer," and translated into clear English by Raph Levien.
I am not an extremely smart developer, but I have many years of experience and have learned some things, although I still don't know everything.
L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns = 3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns = 20 µs
SSD random read ........................ 150,000 ns = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs