Forked from zitterbewegung/Output using 65B on a M1 MacBook Pro 14.2 with 64GB
Created July 19, 2023 13:28
llama.cpp 65B run
(venv) # Exit:0 2023-03-12 16:59:27 [r2q2@Reformer#[:~/opt/llama.cpp]
$(: !605 ) ./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128
main: seed = 1678658429
llama_model_load: loading model from './models/65B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 8192
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 64
llama_model_load: n_layer = 80
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 22016
llama_model_load: n_parts = 8
llama_model_load: ggml ctx size = 41477.73 MB
llama_model_load: memory_size = 2560.00 MB, n_mem = 40960
llama_model_load: loading model part 1/8 from './models/65B/ggml-model-q4_0.bin'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 2/8 from './models/65B/ggml-model-q4_0.bin.1'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 3/8 from './models/65B/ggml-model-q4_0.bin.2'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 4/8 from './models/65B/ggml-model-q4_0.bin.3'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 5/8 from './models/65B/ggml-model-q4_0.bin.4'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 6/8 from './models/65B/ggml-model-q4_0.bin.5'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 7/8 from './models/65B/ggml-model-q4_0.bin.6'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 8/8 from './models/65B/ggml-model-q4_0.bin.7'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
main: prompt: 'If'
main: number of tokens in prompt = 2
     1 -> ''
  3644 -> 'If'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
If you’re looking to work in one of the most diverse, exciting and fast-paced industries around – we want YOU!
From early education all through college students are taught that great careers have titles like doctor or lawyer. Students learn about a variety of professions but they may not be exposed to what it takes to run an event successfully from start to finish; the overall big picture process and its effect on those involved in corporations, hotels, convention centres as well many other areas with smaller budgets which are also dependent upon meeting planners. This job profile will provide you information about
main: mem per token = 70897348 bytes
main: load time    = 19427.11 ms
main: sample time  = 440.50 ms
main: predict time = 70716.00 ms / 548.19 ms per token
main: total time   = 96886.10 ms
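The loader's memory figures above follow directly from the reported hyperparameters. A minimal sketch, assuming the KV cache holds keys and values in f32 (4 bytes per element), as llama.cpp builds of that era did:

```python
# Reconstruct the reported KV-cache figures from the log's hyperparameters.
n_layer = 80    # transformer layers (from the log)
n_ctx   = 512   # context length
n_embd  = 8192  # embedding width

# One cache slot per layer per context position
n_mem = n_layer * n_ctx
print(n_mem)  # 40960, matching "n_mem = 40960"

# K and V caches, assuming 4 bytes (f32) per embedding element
memory_bytes = 2 * n_mem * n_embd * 4
print(memory_bytes / 1024**2)  # 2560.0, matching "memory_size = 2560.00 MB"
```

That the numbers line up exactly suggests the f32 assumption is right for this build; later llama.cpp versions store the KV cache in f16, halving this figure.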
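The closing timings imply the effective generation speed. A quick check of the arithmetic (values copied from the log; the token count is inferred from the reported totals, not printed directly):

```python
# Derive throughput from the timing lines at the end of the run.
predict_ms   = 70716.00  # "predict time" (from the log)
ms_per_token = 548.19    # reported per-token cost

tokens = predict_ms / ms_per_token
print(round(tokens))            # ~129 tokens evaluated, consistent with -n 128

tok_per_s = 1000 / ms_per_token
print(round(tok_per_s, 2))      # ~1.82 tokens/second on the 8 threads used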
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment