Shahbaz Syed (shahbazsyed)
shahbazsyed / apple-silicon.preset.json
Created February 21, 2024 20:32 — forked from ingridstevens/apple-silicon.preset.json
Apple Metal (GPU Acceleration on) + use_mlock = off
{
  "name": "Apple Silicon",
  "load_params": {
    "n_ctx": 2048,
    "n_batch": 512,
    "rope_freq_base": 10000,
    "rope_freq_scale": 1,
    "n_gpu_layers": 1,
    "use_mlock": false,
    "main_gpu": 0
  }
}

Notes on Automatic Summarizing: factors and directions

This position paper outlines the various context factors to be considered in order to develop effective methods for summarization and its evaluation. A key argument is that we cannot develop useful summarization systems unless we pay close attention to both the context (where summarization is applied) and the purpose (why it is done).

The paper analyses three key factors: (1) the input to the summarization model, (2) the purpose of the output summaries, and (3) the output format of the summaries.

What is a summary?

A summary is loosely defined as a reductive transformation of source text through content reduction by selection and/or generalization on what is important in the source. A possible three-step model to achieve this can be:

  • I : source text interpretation (to source text representation)
  • T : source representation transformation (to summary text representation)
  • G : summary text generation (from the summary text representation)
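
A minimal sketch of this three-step model as a toy extractive summarizer in Python; the frequency-based sentence scoring and all function names are illustrative assumptions, not something prescribed by the paper:

import re
from collections import Counter

def interpret(source_text):
    # I: build a source representation (sentences plus word frequencies)
    sentences = re.split(r"(?<=[.!?])\s+", source_text.strip())
    words = re.findall(r"\w+", source_text.lower())
    return {"sentences": sentences, "frequencies": Counter(words)}

def transform(representation, num_sentences=1):
    # T: reduce the source representation to a summary representation
    # by selecting the sentences whose words are most frequent overall
    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(representation["frequencies"][t] for t in tokens) / max(len(tokens), 1)
    selected = set(sorted(representation["sentences"], key=score, reverse=True)[:num_sentences])
    return [s for s in representation["sentences"] if s in selected]

def generate(summary_representation):
    # G: render the summary representation as output text
    return " ".join(summary_representation)

document = (
    "Automatic summarization reduces a source text to its essential content. "
    "Effective systems must account for the context in which they are applied. "
    "They must also account for the purpose the summary serves."
)
print(generate(transform(interpret(document))))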

Notes on Summarizing Information

This book by Brigitte Endres-Niggemeyer (1998) details the concept of summarizing information, its connection to cognitive psychology, how professionals summarize information, and some computational approaches to automatic summarization.

Communication and Cognition

At its core, summarizing is the process of reducing textual information to its most essential parts. It is a situationally and communicatively bound cognitive task where three principal components of human communication are employed: the storage of knowledge in memory, understanding or learning knowledge from the environment, and the generation of utterances (imparting the learnt knowledge).

Communication is tied to the principle of relevance, i.e., one communication partner expects the statements of the other to influence their cognitive state in the current situation. This forms the communicative function of a discourse. Frequent functions are to

shahbazsyed / An overview of multi-task learning in NLP.md
Created May 8, 2023 12:20
Notes on multi-task learning for NLP

Notes on Multi-task learning for NLP

Multi-task learning (MTL) tackles the overfitting and data scarcity problems of deep learning methods by introducing useful information from related tasks to achieve simultaneous performance improvement on multiple related tasks.

MTL trains machine learning models on multiple related tasks simultaneously or enhances the model for a specific task using auxiliary tasks. Learning from multiple tasks makes it possible for models to capture generalized and complementary knowledge from the tasks at hand besides task-specific features. MTL architectures used in NLP tasks are categorized into four classes: the parallel, hierarchical, modular, and generative adversarial architectures.

The parallel architecture shares the bulk of the model among multiple tasks while each task has its own task-specific output layer. The hierarchical architecture models the hierarchical relationships between tasks. Such an architecture can hierarchically combine features from differe
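
For concreteness, here is a minimal sketch of the parallel architecture described above (a shared encoder with one task-specific head per task), assuming PyTorch; the two example tasks and all dimensions are illustrative assumptions:

import torch
import torch.nn as nn

class ParallelMTLModel(nn.Module):
    # Shared encoder (the bulk of the model) plus one task-specific output head per task.
    def __init__(self, vocab_size=10000, hidden_dim=128, task_classes=None):
        super().__init__()
        task_classes = task_classes or {"sentiment": 2, "topic": 5}
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        self.encoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n_classes)
            for task, n_classes in task_classes.items()
        })

    def forward(self, token_ids, task):
        embedded = self.embedding(token_ids)         # (batch, seq_len, hidden_dim)
        _, (hidden, _) = self.encoder(embedded)      # final hidden state: (1, batch, hidden_dim)
        return self.heads[task](hidden[-1])          # logits for the requested task

model = ParallelMTLModel()
batch = torch.randint(0, 10000, (4, 16))             # 4 sequences of 16 token ids
print(model(batch, task="sentiment").shape)          # torch.Size([4, 2])
print(model(batch, task="topic").shape)              # torch.Size([4, 5])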

shahbazsyed / SOTA in Summarization according to HELM benchmark.md
Last active May 8, 2023 12:17
Notes on SOTA in Summarization according to HELM benchmark

SOTA in Summarization according to the HELM benchmark

Listed here are some key points relevant to the task of text summarization by large language models and their evaluation as per the HELM benchmark.

Problem setting

Text summarization is formulated as an unstructured sequence-to-sequence problem, where a document is the input and the LM is tasked with generating a summary resembling the reference summary.
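
As a concrete illustration of this formulation, here is a minimal sketch assuming the Hugging Face transformers library and the facebook/bart-large-cnn checkpoint; these are illustrative choices, not HELM's exact prompting setup:

from transformers import pipeline

# Document in, summary out: the model maps an unstructured input sequence
# to an output sequence that should resemble the reference summary.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
document = (
    "The HELM benchmark evaluates language models across many scenarios, "
    "including text summarization on datasets such as CNN/DailyMail and XSUM."
)
result = summarizer(document, max_length=30, min_length=5, do_sample=False)
print(result[0]["summary_text"])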

Automatic Evaluation

  • ROUGE-2 correlated with more accurate models; in particular, a strong correlation was found with model size.
  • The relationship between model quality and abstraction was highly variable.
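
For reference, a minimal sketch of how ROUGE-2 can be computed between a reference and a generated summary, assuming Google's rouge-score package; the example strings are made up:

from rouge_score import rouge_scorer

# ROUGE-2 measures bigram overlap between the generated and reference summaries.
scorer = rouge_scorer.RougeScorer(["rouge2"], use_stemmer=True)
scores = scorer.score(
    "the cat sat on the mat",   # reference summary
    "the cat lay on the mat",   # model-generated summary
)
print(scores["rouge2"].precision, scores["rouge2"].recall, scores["rouge2"].fmeasure)
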
shahbazsyed / Argument and Argumentation Theory.md
Last active May 8, 2023 12:17
Notes from the SEP article on Argument and Argumentation Theory

Argument and Argumentation Theory

Terminology

  1. An argument can be defined as a complex symbolic structure where some parts, known as the premises, offer support to another part, the conclusion.
  2. The relation of support between premises and conclusion can be cashed out in different ways: the premises may guarantee the truth of the conclusion, or make its truth more probable; the premises may imply the conclusion; the premises may make the conclusion more acceptable (or assertible).
  3. Argumentation is the exchange of arguments.
  4. The study of arguments and argumentation is also closely connected to the study of reasoning, understood as the process of reaching conclusions on the basis of careful, reflective consideration of the available information, i.e., by an examination of reasons.
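
As a minimal sketch, the terminology above can be represented as a small data structure; the Python class and field names here are illustrative assumptions:

from dataclasses import dataclass
from typing import List

@dataclass
class Argument:
    # Premises offer support to the conclusion; the kind of support can vary
    # (deductive guarantee, probabilistic strengthening, increased acceptability).
    premises: List[str]
    conclusion: str
    support: str = "deductive"

socrates = Argument(
    premises=["All humans are mortal.", "Socrates is a human."],
    conclusion="Socrates is mortal.",
    support="deductive",
)
print(socrates)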

Types of Arguments

shahbazsyed / LLM Zoo.md
Last active May 8, 2023 09:36
Ongoing list of LLMs
shahbazsyed / RLHF.md
Last active May 8, 2023 12:18
Running notes on reinforcement learning from human feedback

Notes on RLHF

Three stages of training an LLM

  1. Pretraining: an LLM is pretrained on indiscriminate web data
  2. Supervised finetuning (SFT): the pretrained language model (PLM) is then finetuned on higher-quality data
  3. RLHF: the finetuned model is further polished using RLHF to make it appropriate for a broad audience

Pretraining is the most resource-intensive phase; SFT and RLHF can be seen as unlocking existing capabilities of the pretrained model that are hard for users to access via prompting alone.
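
A minimal, purely illustrative sketch of how the three stages fit together; the function names and toy data are hypothetical placeholders showing only the order of operations and the kind of data each stage consumes, not an actual training implementation:

def pretrain(model, web_corpus):
    # Stage 1: next-token prediction over large-scale, indiscriminate web data.
    for document in web_corpus:
        model["seen_tokens"] = model.get("seen_tokens", 0) + len(document.split())
    return model

def supervised_finetune(model, demonstrations):
    # Stage 2: finetune on a smaller set of higher-quality (prompt, response) pairs.
    model["sft_examples"] = len(demonstrations)
    return model

def rlhf(model, preference_comparisons):
    # Stage 3: fit a reward model on human preference comparisons,
    # then optimize the SFT model against it (e.g., with PPO).
    model["preference_comparisons"] = len(preference_comparisons)
    return model

llm = {}
llm = pretrain(llm, web_corpus=["a scraped web page", "another page"])
llm = supervised_finetune(llm, demonstrations=[("prompt", "high-quality response")])
llm = rlhf(llm, preference_comparisons=[("response A", "response B", "A preferred")])
print(llm)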

There are two types of data required besides the scraped web data used for pretraining:

shahbazsyed / LLM.md
Created March 29, 2023 10:34 — forked from rain-1/LLM.md
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP. With a bias/focus to GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

# Activate the conda environment, then BPE-encode each data split with fairseq's encoder.
source ~/miniconda3/bin/activate allen
LANG=en
TASK=qa_en_small
for SPLIT in train valid
do
  # The flags after --inputs are reconstructed from the standard fairseq BPE-encoding recipe;
  # adjust the output path and worker count as needed.
  python -m examples.roberta.multiprocessing_bpe_encoder \
    --encoder-json encoder.json \
    --vocab-bpe vocab.bpe \
    --inputs "$TASK/$SPLIT.$LANG" \
    --outputs "$TASK/$SPLIT.bpe.$LANG" \
    --workers 8 \
    --keep-empty
done