Cameron R. Wolfe wolfecameron

## masked_self_attention.py
"""
Source: https://github.com/karpathy/nanoGPT/blob/master/model.py
"""

import math
import torch
from torch import nn
import torch.nn.functional as F

class MaskedSelfAttention(nn.Module):

## gpt.py
"""
Source: https://github.com/karpathy/nanoGPT/blob/master/model.py
"""

import torch
from torch import nn
import torch.nn.functional as F

class GPT(nn.Module):

## decoder_only_block.py
"""
Source: https://github.com/karpathy/nanoGPT/blob/master/model.py
"""

from torch import nn

class Block(nn.Module):
    def __init__(
        self,
        d,

## transformer_ffnn.py
"""
Source: https://github.com/karpathy/nanoGPT/blob/master/model.py
"""

from torch import nn

class FFNN(nn.Module):

    def __init__(
        self,
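
The preview above is cut off inside the constructor signature. As a rough illustration only, here is a minimal sketch of a transformer feed-forward block in the style of the nanoGPT source cited above. The class name `FFNNSketch`, the parameters `bias` and `dropout`, and the 4x hidden width are assumptions for illustration, not the contents of the truncated file.

```python
import torch
from torch import nn


class FFNNSketch(nn.Module):
    """Hypothetical feed-forward block: expand, apply GELU, project back."""

    def __init__(self, d: int, bias: bool = False, dropout: float = 0.2):
        super().__init__()
        # assumed: hidden width is 4x the model width, as in nanoGPT's MLP
        self.c_fc = nn.Linear(d, 4 * d, bias=bias)
        self.gelu = nn.GELU()
        self.c_proj = nn.Linear(4 * d, d, bias=bias)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.c_fc(x)      # [B, T, d] -> [B, T, 4d]
        x = self.gelu(x)
        x = self.c_proj(x)    # [B, T, 4d] -> [B, T, d]
        return self.dropout(x)
```

The block is applied independently at every token position, so the output shape matches the input shape.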

## exploding_activations.py
import torch

# experiment settings
d = 5
nlayers = 100
normalize = False # set True to use normalization

# create vector with random entries in [-1, 1)
input_vector = (torch.rand(d) - 0.5) * 2.0
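
The preview stops once the input vector is created. One way such an experiment might continue is sketched below, reusing the `d` and `nlayers` settings shown above. The helper `run`, the standard-normal weight matrices, the float64 dtype, and the per-layer standardization are all assumptions for illustration, not the author's code.

```python
import torch

torch.manual_seed(0)

# experiment settings (same values as the preview)
d = 5
nlayers = 100


def run(normalize: bool) -> float:
    """Pass a random vector through `nlayers` random linear layers and
    return the largest absolute activation at the end."""
    # float64 is an assumption, used to keep the unnormalized run finite
    x = (torch.rand(d, dtype=torch.float64) - 0.5) * 2.0
    for _ in range(nlayers):
        # assumed: each layer is a fresh d x d matrix with N(0, 1) entries
        W = torch.randn(d, d, dtype=torch.float64)
        x = W @ x
        if normalize:
            # standardize to zero mean and unit standard deviation
            x = (x - x.mean()) / (x.std() + 1e-6)
    return x.abs().max().item()


print("without normalization:", run(normalize=False))
print("with normalization:   ", run(normalize=True))
```

Under these assumptions the unnormalized activations grow by many orders of magnitude over 100 layers, while the normalized run stays on the order of 1.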

## causal_self_attention.py
"""
Source: https://github.com/karpathy/nanoGPT/blob/master/model.py
"""

import math
import torch
from torch import nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
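
The class body is not shown in this preview. For orientation, here is a minimal sketch of multi-head causal self-attention loosely following the nanoGPT source cited above. The class name `CausalSelfAttentionSketch` and the constructor parameters `d` (model width), `H` (number of heads), `T` (maximum sequence length), `bias`, and `dropout` are assumptions for illustration and may differ from the truncated file.

```python
import math
import torch
from torch import nn
import torch.nn.functional as F


class CausalSelfAttentionSketch(nn.Module):
    """Hypothetical causal self-attention layer (not the author's code)."""

    def __init__(self, d: int, H: int, T: int, bias: bool = False, dropout: float = 0.0):
        super().__init__()
        assert d % H == 0, "model width must be divisible by the number of heads"
        self.d, self.H = d, H
        # one projection produces queries, keys, and values together
        self.c_attn = nn.Linear(d, 3 * d, bias=bias)
        self.c_proj = nn.Linear(d, d, bias=bias)
        self.attn_dropout = nn.Dropout(dropout)
        self.resid_dropout = nn.Dropout(dropout)
        # lower-triangular mask: position i may attend only to positions <= i
        self.register_buffer("mask", torch.tril(torch.ones(T, T)).view(1, 1, T, T))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.size()
        q, k, v = self.c_attn(x).split(self.d, dim=2)
        # reshape each to [B, H, T, d // H]
        q = q.view(B, T, self.H, self.d // self.H).transpose(1, 2)
        k = k.view(B, T, self.H, self.d // self.H).transpose(1, 2)
        v = v.view(B, T, self.H, self.d // self.H).transpose(1, 2)
        # scaled dot-product scores: [B, H, T, T]
        att = (q @ k.transpose(-2, -1)) * (1.0 / math.sqrt(k.size(-1)))
        # hide future positions before the softmax
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        att = self.attn_dropout(F.softmax(att, dim=-1))
        y = att @ v  # [B, H, T, d // H]
        # merge heads back into [B, T, d]
        y = y.transpose(1, 2).contiguous().view(B, T, self.d)
        return self.resid_dropout(self.c_proj(y))
```

Because of the mask, changing a later token leaves the outputs at earlier positions unchanged, which is the property that makes next-token training valid.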

## cartier_session2_links
Summaries and Overviews:
- Modern LLMs: https://cameronrwolfe.substack.com/p/modern-llms-mt-nlg-chinchilla-gopher
- Specialized LLMs: https://cameronrwolfe.substack.com/p/specialized-llms-chatgpt-lamda-galactica
- Practical Prompt Engineering: https://cameronrwolfe.substack.com/p/practical-prompt-engineering-part
- Advanced Prompt Engineering: https://cameronrwolfe.substack.com/p/advanced-prompt-engineering
- LLM Training and Inference: https://cameronrwolfe.substack.com/p/language-model-training-and-inference
- Understanding SFT: https://cameronrwolfe.substack.com/p/understanding-and-using-supervised
- RLHF and Alternatives: https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives
- Data is the foundation of language models: https://cameronrwolfe.substack.com/p/data-is-the-foundation-of-language
- RLAIF: https://cameronrwolfe.substack.com/p/rlaif-reinforcement-learning-from

## cartier_session1_links.txt
Summaries and Overviews:
- History of AI: https://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/
- GPT and GPT-2: https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
- Modern LLMs: https://cameronrwolfe.substack.com/p/modern-llms-mt-nlg-chinchilla-gopher
- Scaling Laws and GPT-3: https://cameronrwolfe.substack.com/p/language-model-scaling-laws-and-gpt
- The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/
- Language Model Mechanics: https://cameronrwolfe.substack.com/i/135273362/the-mechanics-of-a-language-model
- BERT: https://cameronrwolfe.substack.com/p/language-understanding-with-bert
- Transformer Architecture (T5): https://cameronrwolfe.substack.com/p/t5-text-to-text-transformers-part
- Foundation Models: https://crfm.stanford.edu

## llm_preso_links_2.txt
[LLM Training and Fundamentals]
- GPT and GPT-2: https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
- GPT-3 and LLM Scaling: https://cameronrwolfe.substack.com/p/language-model-scaling-laws-and-gpt
- Modern LLMs: https://cameronrwolfe.substack.com/p/modern-llms-mt-nlg-chinchilla-gopher
- Specialized LLMs: https://cameronrwolfe.substack.com/p/specialized-llms-chatgpt-lamda-galactica

[Open Source LLMs]
- LLaMA: https://cameronrwolfe.substack.com/p/llama-llms-for-everyone
- Beyond LLaMA (Imitation Models): https://cameronrwolfe.substack.com/p/beyond-llama-the-power-of-open-llms
- False Promise of Imitation: https://cameronrwolfe.substack.com/p/imitation-models-and-the-open-source

## llm_preso_links.txt
Summaries and Overviews:
- GPT and GPT-2: https://cameronrwolfe.substack.com/p/language-models-gpt-and-gpt-2
- Scaling Laws and GPT-3: https://cameronrwolfe.substack.com/p/language-model-scaling-laws-and-gpt
- OPT-175B (Open-Source GPT-3): https://cameronrwolfe.substack.com/p/understanding-the-open-pre-trained-transformers-opt-library-193a29c14a15
- Modern LLMs: https://cameronrwolfe.substack.com/p/modern-llms-mt-nlg-chinchilla-gopher
- Specialized LLMs: https://cameronrwolfe.substack.com/p/specialized-llms-chatgpt-lamda-galactica
- Why does ChatGPT work?: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
- Orca: https://cameronrwolfe.substack.com/p/orca-properly-imitating-proprietary
- LLaMA: https://cameronrwolfe.substack.com/p/llama-llms-for-everyone
- MPT: https://cameronrwolfe.substack.com/p/democratizing-ai-mosaicmls-impact