Austin Liu austin362667

## LLM_notes.md

      
Last active: February 29, 2024 06:39

A short explainer for Ariel

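Notes from a discussion with a friend about LLM evaluation and some of the benchmark datasets used for it.

1. LLM evaluation is hard. So what is model evaluation?


Language tasks usually have no absolutely right or wrong answer, so we design datasets to test an LLM's abilities. These benchmark datasets can only serve as a proxy for the model's abilities.
Each dataset also specializes in its own domain, and the range is wide: logic, sentiment, translation, code, math problem solving, commonsense reasoning, and many more.


Take the BBH benchmark dataset as an example:

Text input: False or not ( True ) and False is


Expected model text output: False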
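
To make the BBH example concrete, here is a minimal Python sketch of exact-match scoring on boolean-expression prompts. The helper names `gold_answer` and `exact_match` are hypothetical, and the sketch assumes prompts use only the tokens True, False, and, or, not, and parentheses, followed by a trailing "is".

```python
ALLOWED_TOKENS = {"True", "False", "and", "or", "not", "(", ")"}

def gold_answer(prompt: str) -> str:
    """Compute the reference answer for a boolean-expression prompt."""
    expr = prompt.strip()
    if expr.endswith(" is"):
        expr = expr[: -len(" is")]
    # Tokenize and reject anything that is not a plain boolean expression.
    tokens = expr.replace("(", " ( ").replace(")", " ) ").split()
    if not set(tokens) <= ALLOWED_TOKENS:
        raise ValueError(f"unexpected token in prompt: {prompt!r}")
    # Safe to evaluate: only boolean keywords and parentheses remain.
    return str(eval(" ".join(tokens), {"__builtins__": {}}, {}))

def exact_match(predictions, references) -> float:
    """Fraction of predictions that equal the reference after trimming."""
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

prompt = "False or not ( True ) and False is"
reference = gold_answer(prompt)
score = exact_match(["False"], [reference])
```

Exact match is only one possible scoring rule; it works here because the task has a single short answer, which is not true of most language tasks.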


## image.svg

      
Created: August 1, 2023 08:51

## gpt2_embed_relatedness.py
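import numpy as np

# Assumes `enc` (a GPT-2 tokenizer) and `model` (a GPT-2 model exposing
# transformer.wte and transformer.wpe) are defined elsewhere.
def sim(ta, tb):
    # Token ids for each text
    a_t = enc.encode(ta)
    b_t = enc.encode(tb)
    # Token embeddings, one row per token
    a_te = np.array(model.transformer.wte.weight[a_t].tolist())
    b_te = np.array(model.transformer.wte.weight[b_t].tolist())
    # Position embeddings for positions 0..len-1
    a_pe = np.array(model.transformer.wpe.weight[list(range(len(a_t)))].tolist())
    b_pe = np.array(model.transformer.wpe.weight[list(range(len(b_t)))].tolist())
    # Sum token + position embeddings over the sequence axis
    x = np.add.reduce(a_te+a_pe)
    y = np.add.reduce(b_te+b_pe)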
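
The excerpt stops after pooling each text into one vector. As a self-contained illustration of the same idea, the sketch below uses small random tables in place of the GPT-2 embedding weights and plain Python lists in place of NumPy arrays. Comparing the two pooled vectors with cosine similarity is an assumption, since the excerpt does not show how `x` and `y` are compared; all names here are hypothetical.

```python
import math
import random

random.seed(0)
VOCAB, MAX_POS, DIM = 100, 16, 8
# Stand-ins for the token and position embedding tables.
wte = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
wpe = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(MAX_POS)]

def pooled(token_ids):
    """Sum (token embedding + position embedding) over the sequence."""
    vec = [0.0] * DIM
    for pos, tok in enumerate(token_ids):
        for d in range(DIM):
            vec[d] += wte[tok][d] + wpe[pos][d]
    return vec

def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def sim_sketch(a_ids, b_ids):
    return cosine(pooled(a_ids), pooled(b_ids))
```

Because position embeddings are added before pooling, two texts with the same tokens in a different order generally get different vectors.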


## wolframalphaAPI.yaml
openapi: 3.0.0
info:
  title: WolframAlpha API
  version: 1.0.0
servers:
  - url: https://api.wolframalpha.com/v1
paths:
  /result:
    get:
      summary: Returns a simple answer from WolframAlpha
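
The spec excerpt describes a GET on `/result` under the listed server URL. The Python sketch below only builds such a request URL with the standard library and does not send it. The query parameter names `appid` and `i` are assumptions that the excerpt does not show, and `DEMO` is a placeholder key.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.wolframalpha.com/v1"  # from the `servers` entry

def build_result_url(app_id: str, query: str) -> str:
    """Build the GET URL for the /result path (parameter names assumed)."""
    params = urlencode({"appid": app_id, "i": query})
    return f"{BASE_URL}/result?{params}"

url = build_result_url("DEMO", "2+2")
```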

## autotrader.cc
// Copyright 2023 Optiver Asia Pacific Pty. Ltd.
//
// This file is part of Ready Trader Go.
//
//     Ready Trader Go is free software: you can redistribute it and/or
//     modify it under the terms of the GNU Affero General Public License
//     as published by the Free Software Foundation, either version 3 of
//     the License, or (at your option) any later version.
//
//     Ready Trader Go is distributed in the hope that it will be useful,

## READ_ONCE-and-WRITE_ONCEㄎ.md

      
Created: January 15, 2023 17:37

Why kernel code should use READ_ONCE and WRITE_ONCE for shared memory accesses

There are several reasons to use at least READ_ONCE and WRITE_ONCE for all concurrent non-read-only shared memory accesses:

1. It makes code easier to understand
2. It is required by relevant standards
3. It enables automatic data race detection
4. It is required for the kernel memory model
5. It may improve performance


## parachain_fee_structure.md

      
Created: December 19, 2022 10:20

Polkadot Parachain custom fee structure

Fees are handled by the transaction-payment pallet.
The functions withdraw_fee and correct_and_deposit_fee of its CurrencyAdapter handle the fees.
These fees are then handed over to the OnUnbalanced handler. This handler is an injected trait implementation and can be configured by the runtime.
You can see how this works by analyzing the on_unbalanceds function in the common Polkadot runtime.
It looks like this:
fn on_unbalanceds<B>(mut fees_then_tips: impl Iterator<Item = NegativeImbalance<R>>) {
    if let Some(fees) = fees_then_tips.next() {
        // for fees, 80% to treasury, 20% to author
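
The comment in the snippet describes an 80/20 split of fees between the treasury and the block author. Here is a minimal Python sketch of that arithmetic, using integer math so that no unit is lost to rounding. The function name `split_fee` is hypothetical, and sending the whole tip to the author is an assumption, because the excerpt stops before any tip handling.

```python
def split_fee(fee: int, tip: int = 0, treasury_percent: int = 80):
    """Split a fee into (treasury_share, author_share).

    The treasury share is rounded down; the author receives the
    remainder, so the two shares always add up to fee + tip.
    """
    treasury_share = fee * treasury_percent // 100
    author_share = fee - treasury_share
    # Assumption: any tip goes entirely to the block author.
    author_share += tip
    return treasury_share, author_share

treasury, author = split_fee(1000)
```

Giving the remainder to one side, rather than computing both percentages independently, keeps the total conserved when the fee is not divisible by 100.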

  
## discussion.md

      
Last active: December 14, 2022 11:21

random discussion on Processor Performance(critical path & regs size) & Pipelining

    Processor Performance

The critical path latencies for the 7 major blocks in a simple processor are given below.


| CPU | IMem | Add | Mux | ALU | Regs | DMem | Control |
|---|---|---|---|---|---|---|---|
| a | 400ps | 100ps | 30ps | 120ps | 200ps | 350ps | 100ps |
| b | 500ps | 150ps | 100ps | 180ps | 220ps | 1000ps | 65ps |


For each part, answer the following questions:
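
As one illustration of how such a latency table is typically used, the Python sketch below sums block latencies along a path through a single-cycle datapath. The path chosen for a load instruction (IMem, Regs, Mux, ALU, DMem, Mux) is an assumption about the datapath, not something stated in the text; the names `LATENCY_PS`, `LOAD_PATH`, and `path_latency` are hypothetical.

```python
# Block latencies in picoseconds, copied from the table for CPUs a and b.
LATENCY_PS = {
    "a": {"IMem": 400, "Add": 100, "Mux": 30, "ALU": 120,
          "Regs": 200, "DMem": 350, "Control": 100},
    "b": {"IMem": 500, "Add": 150, "Mux": 100, "ALU": 180,
          "Regs": 220, "DMem": 1000, "Control": 65},
}

# Assumed sequence of blocks a load passes through, in order.
LOAD_PATH = ["IMem", "Regs", "Mux", "ALU", "DMem", "Mux"]

def path_latency(cpu: str, path) -> int:
    """Total latency in ps of the given block sequence for one CPU."""
    return sum(LATENCY_PS[cpu][block] for block in path)

for cpu in ("a", "b"):
    print(cpu, path_latency(cpu, LOAD_PATH), "ps")
```

Under this assumed path, the slowest instruction's total sets the minimum clock period of a single-cycle design.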

  
## transformer.md

      
Last active: December 13, 2022 10:16

Transformer QA

    Large-Scale Pretraining with Transformers
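From a higher-level view, BERT uses the Transformer's encoder, while GPT uses the Transformer's decoder.


What do Q, K, and V mean in the attention mechanism, and how are they produced?
Imagine a scene: on a white table there is a sheet of white paper and a red apple.
Here, the Values are the whole scene.
The Keys are the salient objects you notice without trying (e.g., the red apple).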
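
On how Q, K, and V are produced: in standard scaled dot-product attention, each is a linear projection of the same input token vectors. The Python sketch below shows that computation with plain lists; the projection matrices are toy values chosen for illustration, and the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(x, w_q, w_k, w_v):
    """Return (attention weights, output) for input token vectors x."""
    # Q, K, V are all projections of the same input.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # how well each query matches each key
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return weights, matmul(weights, v)  # weighted mix of the values

identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
weights, output = attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the value vectors.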


## NTU_106_HW_Q1.md

      
Last active: December 9, 2022 08:03

ILP Benchmarking + Cache

5-Stage Pipelined MIPS CPU,
L1:
  I-Cache: 256KB, D-Cache: 32KB
MEM Latency: 200ps (raised to 250ps),
Base CPI (ideal cache) = 1,
Miss Penalty: 200 cycles, I-Cache Miss Rate: 5%, D-Cache Miss Rate: 10% (reduced to 5%).
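
With these parameters, the usual way to estimate the effective CPI is to add memory stall cycles to the base CPI. The Python sketch below does that for the D-cache miss rate before and after the reduction. The fraction of instructions that access data memory is not given in the text, so `MEM_FRACTION = 0.36` is a hypothetical value used only for illustration.

```python
def effective_cpi(base_cpi, miss_penalty, icache_miss_rate,
                  dcache_miss_rate, mem_access_fraction):
    """Base CPI plus average stall cycles per instruction."""
    # Every instruction is fetched, so every instruction can miss in the I-cache.
    instr_stalls = icache_miss_rate * miss_penalty
    # Only loads and stores can miss in the D-cache.
    data_stalls = mem_access_fraction * dcache_miss_rate * miss_penalty
    return base_cpi + instr_stalls + data_stalls

MEM_FRACTION = 0.36  # hypothetical share of loads and stores

cpi_before = effective_cpi(1, 200, 0.05, 0.10, MEM_FRACTION)
cpi_after = effective_cpi(1, 200, 0.05, 0.05, MEM_FRACTION)
speedup = cpi_before / cpi_after  # ignores the change in memory latency
```

The comparison holds the clock rate fixed; the change in memory latency mentioned in the text would have to be folded into the miss penalty separately.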