llama.cpp 65B run
(venv) # Exit:0 2023-03-12 16:59:27 [r2q2@Reformer#[:~/opt/llama.cpp]
$(: !605 ) ./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128
main: seed = 1678658429
llama_model_load: loading model from './models/65B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 512
llama_model_load: n_embd = 8192
llama_model_load: n_mult = 256
llama_model_load: n_head = 64
llama_model_load: n_layer = 80
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 22016
llama_model_load: n_parts = 8
llama_model_load: ggml ctx size = 41477.73 MB
llama_model_load: memory_size = 2560.00 MB, n_mem = 40960
llama_model_load: loading model part 1/8 from './models/65B/ggml-model-q4_0.bin'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 2/8 from './models/65B/ggml-model-q4_0.bin.1'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 3/8 from './models/65B/ggml-model-q4_0.bin.2'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 4/8 from './models/65B/ggml-model-q4_0.bin.3'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 5/8 from './models/65B/ggml-model-q4_0.bin.4'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 6/8 from './models/65B/ggml-model-q4_0.bin.5'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 7/8 from './models/65B/ggml-model-q4_0.bin.6'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 8/8 from './models/65B/ggml-model-q4_0.bin.7'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
main: prompt: 'If'
main: number of tokens in prompt = 2
1 -> ''
3644 -> 'If'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
If you’re looking to work in one of the most diverse, exciting and fast-paced industries around – we want YOU!
From early education all through college students are taught that great careers have titles like doctor or lawyer. Students learn about a variety of professions but they may not be exposed to what it takes to run an event successfully from start to finish; the overall big picture process and its effect on those involved in corporations, hotels, convention centres as well many other areas with smaller budgets which are also dependent upon meeting planners. This job profile will provide you information about
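For readers curious what the sampling parameters above mean, here is a hypothetical minimal sketch of temperature / top-k / top-p (nucleus) sampling in pure Python. This is illustrative only, not llama.cpp's actual implementation, and it omits the repeat penalty:

```python
import math
import random

def sample(logits, temp=0.8, top_k=40, top_p=0.95):
    """Illustrative sketch of temperature + top-k + top-p token sampling."""
    # 1. Temperature: scale logits (lower temp -> sharper distribution).
    scaled = [l / temp for l in logits]
    # 2. Top-k: keep only the k highest-logit token indices, best first.
    idx = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the kept tokens (subtract max for numerical stability).
    m = max(scaled[i] for i in idx)
    exps = [(i, math.exp(scaled[i] - m)) for i in idx]
    z = sum(e for _, e in exps)
    probs = [(i, e / z) for i, e in exps]
    # 3. Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # Draw from the renormalized kept set.
    z = sum(p for _, p in kept)
    r = random.random() * z
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

With a strongly peaked distribution the top token is chosen almost surely; with a flat one, any surviving token can be drawn.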
main: mem per token = 70897348 bytes
main: load time = 19427.11 ms
main: sample time = 440.50 ms
main: predict time = 70716.00 ms / 548.19 ms per token
main: total time = 96886.10 ms
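From the timing lines above, the generation speed works out to roughly 1.8 tokens per second (a back-of-envelope calculation using only the numbers the log reports):

```python
# Speed derived from the run's own timing lines.
predict_ms = 70716.00    # "predict time" from the log
ms_per_token = 548.19    # "ms per token" from the log

tokens = round(predict_ms / ms_per_token)
tokens_per_sec = 1000.0 / ms_per_token

print(tokens)                    # 129 token evaluations
print(round(tokens_per_sec, 2))  # 1.82 tokens/s
```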
zitterbewegung commented Mar 13, 2023

Memory use is 41 GB, about 62.7% of this machine's RAM. It should fit on the newer MacBook Pros.
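As a sanity check, the 62.7% figure together with the ggml ctx size reported in the log implies a machine with roughly 64 GB of RAM (a rough sketch; GB/GiB rounding accounts for the small discrepancies):

```python
# Infer total RAM from the log's ctx size and the reported 62.7% usage.
ctx_mb = 41477.73        # "ggml ctx size" from the log
fraction_used = 0.627    # reported share of total RAM

used_gb = ctx_mb / 1024.0
total_gb = used_gb / fraction_used

print(round(used_gb, 1))   # 40.5
print(round(total_gb, 1))  # 64.6 -> roughly a 64 GB machine
```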

@hakusaro

Thank you for sharing the memory figures! A lot of people have said it works, but few give precise numbers.

@iMacker2020

How fast is the output? Could you share some examples of it, please?
