Chunhua Liao (chunhualiao)
@chunhualiao
chunhualiao / intelligence.md
Last active June 29, 2019 17:16
My definition of Intelligence

Intelligence = collect information + build models + predict the outcomes of different choices + decide on the optimal choice + execute it

The key is to find or build models of the important things (yourself, others, and the environment), to quickly make good decisions based on the outcomes the models predict, and finally to act on the chosen option.
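
As a toy illustration (entirely my own sketch; world and model are hypothetical objects, not a real API), the definition maps onto a simple decision loop in Python:

# Toy sketch of the definition above; `world` and `model` are hypothetical objects.
def act_intelligently(world, model, choices):
    observations = world.observe()                      # collect info
    model.update(observations)                          # build/refine the models
    outcomes = {c: model.predict(c) for c in choices}   # predict outcomes of different choices
    best = max(choices, key=lambda c: model.value(outcomes[c]))  # decide on the optimal choice
    world.execute(best)                                 # execute the choice
    return best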

Dictionary definition

noun: intelligence

  1. the ability to acquire and apply knowledge and skills.
@chunhualiao
chunhualiao / data analytics
Created August 26, 2019 20:46
Data analytics
Four levels of data analytics
* descriptive analytics
* diagnostic analytics
* predictive analytics
* prescriptive analytics
@chunhualiao
chunhualiao / schema.org.notes.md
Last active March 12, 2021 17:20

There are rdfs:domain and rdfs:range already. Why does schema.org use schema:domainIncludes and schema:rangeIncludes instead?

Answer:

Schema.org doesn't want you to draw type inferences from these properties. If I knew that

schema:name rdfs:domain schema:Person

then whenever I saw a schema:name on some resource, I could infer that the resource was of type schema:Person. That inference would be wrong for anything else that has a name, such as an organization or a product. schema:domainIncludes merely documents the expected domains without licensing any entailment.
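
A small demonstration of the danger, as a sketch assuming rdflib and the owlrl reasoner (the example triples are mine, not schema.org's):

from rdflib import Graph, Literal, Namespace, RDF, RDFS
import owlrl  # RDFS reasoner, assumed installed via pip

SCHEMA = Namespace("https://schema.org/")
EX = Namespace("http://example.org/")

g = Graph()
# Hypothetical: pretend schema.org had used rdfs:domain.
g.add((SCHEMA.name, RDFS.domain, SCHEMA.Person))
g.add((EX.acme, SCHEMA.name, Literal("Acme Corp")))  # acme is really an organization

owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)
print((EX.acme, RDF.type, SCHEMA.Person) in g)  # True: the reasoner forces the wrong type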

@chunhualiao
chunhualiao / deepspeed-chat.ipynb
Created April 23, 2023 03:06
DeepSpeed-Chat.ipynb
@chunhualiao
chunhualiao / QuantitativeMeasuresofIndependence.md
Last active July 4, 2023 06:20
Contemplating Quantitative Measures of Independence on the Eve of American Independence Day

The original draft was written on July 22, 2022; it was polished with GPT-4 this year.

This year's American Independence Day holiday prompted me to reflect on the meaning of independence and consider if there are any quantitative measures to assess it. After pondering the topic intermittently for several days, I'm finally taking the time to jot down some of my thoughts. Examining potential indicators of independence proves valuable, as it allows us to evaluate the level of independence for ourselves and those around us. Furthermore, we can analyze and enhance our independence through various methods. Like many issues globally, the independence of a country's people is not a binary (0 vs. 1) matter but rather a multi-dimensional continuum. All nations can make ongoing improvements across various dimensions.

Historically, one of the main factors that drove American colonists to pursue independence was the

@chunhualiao
chunhualiao / CodeLlama-7b-Instruct-hf.md
Last active April 16, 2024 12:17
Example code to use CodeLlama 7B Instruct model HuggingFace version
@chunhualiao
chunhualiao / torchrun-multiple-gpus-input.md
Last active September 4, 2023 05:05
A lesson using torchrun, multiple GPUs, and Python input() together

I was playing with CodeLlama, using 2 or 4 GPUs for the 7B and 30B models, respectively. I changed the official instruct example's Python code to accept user instructions via input() inside a while loop, so that I could keep giving instructions and getting results from the model.

But whenever I ran the code on 2 or 4 GPUs, it hung as soon as I typed in an instruction. The same code works fine on a single GPU.
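
The root cause, as far as I can tell, is that under torchrun only rank 0 has a usable stdin; the other ranks block forever inside input(). A minimal sketch of a fix, assuming torch.distributed is already initialized (as the official example does):

import torch.distributed as dist

def read_instruction(prompt="instruction> "):
    # Only rank 0 reads from stdin; every other rank would otherwise hang on input().
    holder = [input(prompt) if dist.get_rank() == 0 else None]
    dist.broadcast_object_list(holder, src=0)  # ship the string to all ranks
    return holder[0]

Each rank then calls read_instruction() at the top of the while loop, so all processes receive the same instruction string.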

# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.

I ran the following code on perlmutter.nersc.gov, on a single node with 4 A100 GPUs (80 GB memory each):

Timing information

  • Loading the tokenizer and model: 749.4 seconds.
  • Creating the pipeline: 0.00013 seconds.
  • Inferencing with the model: 455.8 seconds.
    • This time depends on the output token length; I used 512 tokens, and longer outputs take even longer.
# Use a pipeline as a high-level helper
from transformers import pipeline
# (hypothetical completion; the preview is truncated, and the model id is my assumption)
pipe = pipeline("text-generation", model="codellama/CodeLlama-7b-Instruct-hf")
@chunhualiao
chunhualiao / SimulatingMontyHallProblem.md
Last active December 26, 2023 19:03
Simulating the Monty Hall Problem

The Monty Hall Problem (where is the goat?) is fascinating: you are presented with three doors. Behind one door is a car; behind the other two are goats. You choose one of the three doors, hoping the car is behind it so you can win it, but the door stays closed for now.

I asked GPT-4 to create a Python program that simulates both strategies (staying vs. switching) and compares their winning rates.

import random

# Play one game; switch_doors is True to switch after the host opens a door, False to stay.
def play_game(switch_doors):
    doors = [0, 0, 1]  # two goats (0) and one car (1)
    random.shuffle(doors)  # the prizes behind the doors are randomly shuffled
    choice = random.randrange(3)  # the player's initial pick
    # Reconstruction of the truncated part: the host opens a goat door the player did not pick.
    opened = next(i for i in range(3) if i != choice and doors[i] == 0)
    if switch_doors:
        choice = next(i for i in range(3) if i != choice and i != opened)
    return doors[choice] == 1  # win if the car is behind the final choice
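
To compare the two strategies (a short driver of my own; the gist preview is truncated here):

# Estimate the winning rate of each strategy over many simulated games.
n = 100_000
stay = sum(play_game(False) for _ in range(n)) / n
switch = sum(play_game(True) for _ in range(n)) / n
print(f"stay wins: {stay:.3f}, switch wins: {switch:.3f}")  # roughly 0.333 vs 0.667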
@chunhualiao
chunhualiao / ggml-llama-cpp.md
Last active May 3, 2024 14:29
Initial source code understanding of ggml (llama.cpp)

I have taken quite a few machine learning courses and have already done a few projects. I think I know the math behind transformers and GPT models, but I have always wondered how they work in practice. The best way for me to find out is to read and understand the source code implementing these models. I am mostly a C/C++ programmer, so I am more comfortable reading C/C++ programs. Recently I started to read, run, and debug ggml's GPT-2 inference example, since ggml is written entirely in C and can run many transformer models on a laptop: https://github.com/ggerganov/ggml/tree/master/examples/gpt-2 . The famous llama.cpp is closely connected to this library. My experiment environment is a MacBook Pro + Visual Studio Code + cmake + CodeLLDB (gdb does not work with my M2 chip), and the GPT-2 117M model. Here is what I have learned so far:

The high-level main function has the following structure: https://github.com/ggerganov/ggml/blob/master/examples/gpt-2/main-backend.cpp

  • load the model: ggml specific format us