Created February 16, 2024 12:30
Testing ifioravanti/lwm
Running on ollama 0.1.25 / web-ui v1.0.0-alpha.100 / Ubuntu 22.04
Feb 16 12:27:52 gruntus ollama[57441]: {"timestamp":1708086472,"level":"ERROR","function":"load_model","line":378,"message":"unable to load model","model":"/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942"}
Feb 16 12:27:52 gruntus ollama[57441]: time=2024-02-16T12:27:52.809Z level=WARN source=llm.go:162 msg="Failed to load dynamic library /tmp/ollama2601240208/cpu_avx2/libext_server.so error loading model /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959"
Feb 16 12:27:52 gruntus ollama[57441]: [GIN] 2024/02/16 - 12:27:52 | 500 | 203.311184ms | 127.0.0.1 | POST "/api/chat"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=gpu.go:146 msg="CUDA Compute Capability detected: 8.9"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=gpu.go:146 msg="CUDA Compute Capability detected: 8.9"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=llm.go:111 msg="not enough vram available, falling back to CPU only"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama2601240208/cpu_avx2/libext_server.so"
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.013Z level=INFO source=dyn_ext_server.go:145 msg="Initializing llama server"
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942 (version GGUF V3 (latest))
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   0: general.architecture str = llama
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   1: general.name str = ollama_conversion
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   2: llama.context_length u32 = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   3: llama.embedding_length u32 = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   4: llama.block_count u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   5: llama.feed_forward_length u32 = 11008
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   6: llama.rope.dimension_count u32 = 128
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   7: llama.attention.head_count u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   8: llama.attention.head_count_kv u32 = 32
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv   9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  10: llama.rope.freq_base f32 = 50000000.000000
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  11: general.file_type u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  12: tokenizer.ggml.model str = llama
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  13: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  14: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  15: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  16: tokenizer.ggml.bos_token_id u32 = 1
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  17: tokenizer.ggml.eos_token_id u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  18: tokenizer.ggml.padding_token_id u32 = 0
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  19: tokenizer.ggml.add_bos_token bool = true
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  20: tokenizer.ggml.add_eos_token bool = false
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - kv  21: general.quantization_version u32 = 2
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type f32:   65 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type q4_0: 225 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llama_model_loader: - type q6_K:   1 tensors
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_vocab: special tokens definition check successful ( 259/32000 ).
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: format = GGUF V3 (latest)
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: arch = llama
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: vocab type = SPM
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_vocab = 32000
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_merges = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_ctx_train = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_head = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_head_kv = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_layer = 32
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_rot = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_head_k = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_head_v = 128
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_gqa = 1
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_k_gqa = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_embd_v_gqa = 4096
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_norm_eps = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_norm_rms_eps = 1.0e-06
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_clamp_kqv = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_ff = 11008
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_expert = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_expert_used = 0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: rope scaling = linear
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: freq_base_train = 50000000.0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: freq_scale_train = 1
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: n_yarn_orig_ctx = 1048576
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: rope_finetuned = unknown
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model type = 7B
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model ftype = Q4_0
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model params = 6.74 B
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: model size = 3.56 GiB (4.54 BPW)
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: general.name = ollama_conversion
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: BOS token = 1 '<s>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: EOS token = 2 '</s>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: UNK token = 0 '<unk>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: PAD token = 0 '<unk>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_print_meta: LF token = 13 '<0x0A>'
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_tensors: ggml ctx size = 0.11 MiB
Feb 16 12:27:53 gruntus ollama[57441]: llm_load_tensors: CPU buffer size = 3647.87 MiB
Feb 16 12:27:53 gruntus ollama[57441]: ..................................................................................................
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: n_ctx = 1048575
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: freq_base = 50000000.0
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: freq_scale = 1
Feb 16 12:27:53 gruntus ollama[57441]: ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 549755289632
Feb 16 12:27:53 gruntus ollama[57441]: llama_kv_cache_init: failed to allocate buffer for kv cache
Feb 16 12:27:53 gruntus ollama[57441]: llama_new_context_with_model: llama_kv_cache_init() failed for self-attention cache
Feb 16 12:27:53 gruntus ollama[57441]: llama_init_from_gpt_params: error: failed to create context with model '/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942'
Feb 16 12:27:53 gruntus ollama[57441]: loading library /tmp/ollama2601240208/cpu_avx2/libext_server.so
Feb 16 12:27:53 gruntus ollama[57441]: {"timestamp":1708086473,"level":"ERROR","function":"load_model","line":378,"message":"unable to load model","model":"/usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959942"}
Feb 16 12:27:53 gruntus ollama[57441]: time=2024-02-16T12:27:53.157Z level=WARN source=llm.go:162 msg="Failed to load dynamic library /tmp/ollama2601240208/cpu_avx2/libext_server.so error loading model /usr/share/ollama/.ollama/models/blobs/sha256:883e646838a09df3315b167aac81293affbf48dbccf927eb144191dc9c959" |
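The failure above is not the model weights (only 3.56 GiB) but the KV cache: the GGUF metadata advertises a 1,048,576-token training context, and the context was created at nearly full size (n_ctx = 1048575). A rough sketch of the arithmetic, using only values printed in the log; the K+V factor of 2 and the f16 element size are assumptions about llama.cpp's default KV-cache layout, and the small difference from the logged 549755289632-byte figure is presumably allocator/tensor overhead:

```python
# Back-of-the-envelope check of the failed KV-cache allocation.
n_ctx = 1048575      # llama_new_context_with_model: n_ctx
n_layer = 32         # llm_load_print_meta: n_layer
n_embd_kv = 4096     # llm_load_print_meta: n_embd_k_gqa / n_embd_v_gqa
f16_bytes = 2        # assumed default KV-cache element type (f16)

# K and V tensors, one pair per layer, one slot per context position
kv_bytes = 2 * n_layer * n_ctx * n_embd_kv * f16_bytes
print(f"KV cache: {kv_bytes:,} bytes ≈ {kv_bytes / 2**30:.0f} GiB")
# KV cache: 549,755,289,600 bytes ≈ 512 GiB
```

So a full-context load needs roughly 512 GiB of RAM for the cache alone, far beyond this machine. A plausible workaround (not tested here) is to cap the context before loading, e.g. an ollama Modelfile with a line such as `PARAMETER num_ctx 2048` and then `ollama create` a derived model from it.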