In programming, a paradigm is an abstract way to understand and solve a problem.
A paradigm is like a perspective, a high point from which you can survey the terrain and try to decide the path your journey will take.
Today, there are three major programming paradigms:
Imperative Programming.
Object Oriented Programming (OOP).
Functional Programming (FP).
In principle any language can be used to program in any paradigm, but in practice certain languages tend to favor certain paradigms.
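As a rough illustration of that last point, here is one small task (summing the squares of the even numbers in a list) written three ways in Python, a language that accommodates all three paradigms. The task, the function names, and the `EvenSquareAccumulator` class are made up for this sketch; it is only meant to show how the same problem looks from each perspective, not to serve as a definition of any paradigm.

```python
from functools import reduce


# Imperative: spell out the steps and mutate a running total.
def sum_even_squares_imperative(values):
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v * v
    return total


# Object oriented: bundle the state (the total) with the behavior that updates it.
class EvenSquareAccumulator:
    def __init__(self):
        self.total = 0

    def add(self, value):
        if value % 2 == 0:
            self.total += value * value
        return self  # returning self allows chained calls


def sum_even_squares_oop(values):
    accumulator = EvenSquareAccumulator()
    for v in values:
        accumulator.add(v)
    return accumulator.total


# Functional: compose pure functions, with no mutable state of our own.
def sum_even_squares_functional(values):
    evens = filter(lambda v: v % 2 == 0, values)
    return reduce(lambda acc, v: acc + v * v, evens, 0)


numbers = [1, 2, 3, 4, 5, 6]
print(sum_even_squares_imperative(numbers))
print(sum_even_squares_oop(numbers))
print(sum_even_squares_functional(numbers))
```

All three functions return the same result for the same input; what differs is where the state lives and how the steps are expressed.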
Hello there! I’m trying to train a custom LLM similar to Andrej Karpathy’s nanogpt and nanochat tutorials. My issue is that training loss and gradient norms go to nearly zero after around a hundred steps. I’m using the MLX framework on an M1 Max.
I have a rather Llama-like architecture with RoPE. Unlike Llama, I am using GELU (like GPT-2) instead of SwiGLU in the MLP to save a few parameters on the gate matrices. I’m using an embedding dimension of 768 and 12 layers, an MLP up-projection ratio of 4, and grouped-query attention with a key-value head ratio of 4 (all like GPT-2 small and Llama). I’m using the GPT-2 tokenizer with a vocab dimension of 50304. This comes out to around 114M parameters and seems like I’m on the beaten path f
When programmers talk about typing, most of the time they aren't talking about
the odious task of pressing keys on a keyboard (watch any programmer and see
how little of their time they actually spend typing out code; what you'll
see instead is a lot of frowning and staring at the screen with an expression
of great consternation as they think "why the hell didn't my code
do what I thought?"). Instead, they're talking about the types of variables. Now
you're probably familiar with the idea that there are numbers and strings and
A place where I can keep notes on fuzz testing daala.
All tests here were done against commit sha d8daca8e9aadb1f6ba53e089b89824f170d59703 from Fri May 1, 2015.
16:50:05 radens | Do you guys run fuzz testers on daala? I was playing around with afl-fuzz
| today and was thinking of the recent android bug.
16:53:38 +TD-Linux | radens, no, and we should
16:54:03 +TD-Linux | tons of fuzzing was done on opus, though.