
@boxabirds
Created February 16, 2024 12:30
Testing ifioravanti/lwm
Running on ollama 0.1.25 / web-ui v1.0.0-alpha.100 / Ubuntu 22.04
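
The failure in the log below is a KV-cache allocation error. As a back-of-envelope check (a sketch using the dimensions printed by llm_load_print_meta further down, and assuming an fp16 cache at 2 bytes per element, which the log itself does not confirm):

```python
# KV-cache size implied by the model metadata in the log below.
n_ctx = 1048575          # llama_new_context_with_model: n_ctx
n_layer = 32             # llm_load_print_meta: n_layer
n_embd_kv = 4096 + 4096  # n_embd_k_gqa + n_embd_v_gqa, per layer
bytes_per_elem = 2       # assumed fp16 cache entries

kv_bytes = n_ctx * n_layer * n_embd_kv * bytes_per_elem
print(f"{kv_bytes:,} bytes = {kv_bytes / 2**30:.0f} GiB")
# 549,755,289,600 bytes = 512 GiB -- matching (modulo a small header) the
# "failed to allocate buffer of size 549755289632" line in the log.
```

So loading this model at its full 1,048,576-token trained context needs roughly 512 GiB for the KV cache alone, which is why both the GPU attempt and the CPU-only fallback fail.

Full log: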
Feb 16 12:27:52 gruntus ollama[57441]: {"timestamp":1708086472,"level":"ERROR","function":"load_model","line":378,"message":"unable to load model","model":"/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942"}
Feb 16 12:27:52 gruntus ollama[57441]: time=2024-02-16T12:27:52.809Z level=WARN source=llm.go:162 msg="Failed to load dynamic library /tmp/ollama2601240208/cpu_avx2/libext_server.so error loading model /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959"
Feb 16 12:27:52 gruntus ollama[57441]: [GIN] 2024/02/16 - 12:27:52 | 500 | 203.311184ms | 127.0.0.1 | POST "/api/chat"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=gpu.go:146 msg="CUDA Compute Capability detected: 8.9"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=gpu.go:146 msg="CUDA Compute Capability detected: 8.9"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=llm.go:111 msg="not enough vram available, falling back to CPU only"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama2601240208/cpu_avx2/libext_server.so"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=dyn_ext_server.go:145 msg="Initializing llama server"
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942 (version GGUF V3 (latest))
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 0: general.architecture str = llama
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 1: general.name str = ollama_conversion
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 2: llama.context_length u32 = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 3: llama.embedding_length u32 = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 4: llama.block_count u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 11008
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 6: llama.rope.dimension_count u32 = 128
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 7: llama.attention.head_count u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 8: llama.attention.head_count_kv u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 10: llama.rope.freq_base f32 = 50000000.000000
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 11: general.file_type u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 12: tokenizer.ggml.model str = llama
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 13: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 14: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 15: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 16: tokenizer.ggml.bos_token_id u32 = 1
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 17: tokenizer.ggml.eos_token_id u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 18: tokenizer.ggml.padding_token_id u32 = 0
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 19: tokenizer.ggml.add_bos_token bool = true
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 20: tokenizer.ggml.add_eos_token bool = false
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type f32: 65 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type q4_0: 225 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type q6_K: 1 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_vocab: special tokens definition check successful ( 259/32000 ).
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: format = GGUF V3 (latest)
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: arch = llama
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: vocab type = SPM
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_vocab = 32000
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_merges = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_ctx_train = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_head = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_head_kv = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_layer = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_rot = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_head_k = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_head_v = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_gqa = 1
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_k_gqa = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_v_gqa = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_norm_eps = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_norm_rms_eps = 1.0e-06
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_clamp_kqv = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_ff = 11008
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_expert = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_expert_used = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: rope scaling = linear
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: freq_base_train = 50000000.0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: freq_scale_train = 1
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_yarn_orig_ctx = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: rope_finetuned = unknown
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model type = 7B
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model ftype = Q4_0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model params = 6.74 B
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model size = 3.56 GiB (4.54 BPW)
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: general.name = ollama_conversion
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: BOS token = 1 '<s>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: EOS token = 2 '</s>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: UNK token = 0 '<unk>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: PAD token = 0 '<unk>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: LF token = 13 '<0x0A>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_tensors: ggml ctx size = 0.11 MiB
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_tensors: CPU buffer size = 3647.87 MiB
Feb 16 12:27:53 gruntus ollama[57441]: ..................................................................................................
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: n_ctx = 1048575
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: freq_base = 50000000.0
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: freq_scale = 1
Feb 16 12:27:53 gruntus ollama[57441]: ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 549755289632
Feb 16 12:27:53 gruntus ollama[57441]: llama_kv_cache_init: failed to allocate buffer for kv cache
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: llama_kv_cache_init() failed for self-attention cache
Feb 16 12:27:53 gruntus ollama[57441]: llama_init_from_gpt_params: error: failed to create context with model '/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942'
Feb 16 12:27:53 gruntus ollama[57441]: loading library /tmp/ollama2601240208/cpu_avx2/libext_server.so
Feb 16 12:27:53 gruntus ollama[57441]: {"timestamp":1708086473,"level":"ERROR","function":"load_model","line":378,"message":"unable to load model","model":"/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942"}
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.157Z level=WARN source=llm.go:162 msg="Failed to load dynamic library /tmp/ollama2601240208/cpu_avx2/libext_server.so error loading model /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959"
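
A possible workaround (untested here): cap the context window in a Modelfile so ollama doesn't size the KV cache for the full 1M-token window:

```
FROM ifioravanti/lwm
PARAMETER num_ctx 4096
```

Then `ollama create lwm-4k -f Modelfile` and `ollama run lwm-4k` (the lwm-4k name is just an example). Any num_ctx that fits in available RAM/VRAM should load; it just forfeits the long context this model was trained for.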