Error: llama runner process no longer running: -1 #3904

Closed · parthV2 opened this issue Apr 25, 2024 · 7 comments · Labels: bug

parthV2 commented Apr 25, 2024

What is the issue?

I was trying to run a fine-tuned version of Llama 2 with a 13.5 GB GGUF.

[Screenshot from 2024-04-25 11-30-55]
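
For context, a custom GGUF like this is typically imported with a Modelfile and ollama create; a minimal sketch (file and model names here are illustrative, not the actual ones used):

# Modelfile pointing at the fine-tuned GGUF
echo 'FROM ./agri-llama-f16.gguf' > Modelfile
ollama create agri-llama -f Modelfile
ollama run agri-llama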

OS: Linux
GPU: Nvidia
CPU: Intel
Ollama version: v0.1.32

parthV2 added the bug label Apr 25, 2024
@kannon92

Can you get logs for the server? I’ve seen this error when I couldn’t load some linear algebra libraries.
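
On a systemd-based Linux install, the server logs can be pulled with journalctl, e.g.:

# Follow the Ollama service log live
journalctl -u ollama -f
# Or dump the most recent entries without paging
journalctl -e -u ollama --no-pager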

@MoonRide303

Also happens when trying to load a Phi-3 GGUF created with current llama.cpp versions:

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'phi3'
llama_load_model_from_file: exception loading model
time=2024-04-25T18:30:30.049+02:00 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: 3221226505 "
[GIN] 2024/04/25 - 18:30:30 | 500 |    536.2714ms |       127.0.0.1 | POST     "/api/chat"
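
The "unknown model architecture" error suggests the llama.cpp build bundled with this Ollama version simply predates phi3 support. The architecture string a GGUF declares can be checked with the gguf Python package's dump script (a sketch; the file name is illustrative):

pip install gguf
gguf-dump --no-tensors phi-3-mini.gguf | grep general.architecture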

parthV2 (Author) commented Apr 26, 2024

@kannon92 Here are the server logs:

Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: [GIN] 2024/04/25 - 11:18:52 | 200 | 345.721µs | 127.0.0.1 | HEAD "/"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: [GIN] 2024/04/25 - 11:18:52 | 200 | 1.668953ms | 127.0.0.1 | POST "/api/show"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: [GIN] 2024/04/25 - 11:18:52 | 200 | 327.883µs | 127.0.0.1 | POST "/api/show"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.272+05:30 level=INFO source=gpu.go:121 msg="Detecting GPU type"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.272+05:30 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.276+05:30 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama695377432/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.4.127 /usr/lib/x86_64-linux-gnu/libcudart.so.10.1.243]"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.417+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /tmp/ollama695377432/runners/cuda_v11/libcudart.so.11.0: cudart init failure: 100"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.562+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /usr/local/cuda/lib64/libcudart.so.12.4.127: cudart init failure: 100"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.720+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /usr/lib/x86_64-linux-gnu/libcudart.so.10.1.243: cudart init failure: 100"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.720+05:30 level=INFO source=gpu.go:268 msg="Searching for GPU management library libnvidia-ml.so"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.727+05:30 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.67 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00 /usr/lib32/libnvidia-ml.so.550.67]"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.849+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.67: nvml vram init failure: 9"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.877+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00: nvml vram init failure: 9"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.878+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib32/libnvidia-ml.so.550.67: Unable to load /usr/lib32/libnvidia-ml.so.550.67 library to query for Nvidia GPUs: /usr/lib32/libnvidia-ml.so.550.67: wrong ELF class: ELFCLASS32"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.878+05:30 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.879+05:30 level=INFO source=gpu.go:121 msg="Detecting GPU type"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.879+05:30 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
Apr 25 11:18:52 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:52.887+05:30 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama695377432/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.4.127 /usr/lib/x86_64-linux-gnu/libcudart.so.10.1.243]"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.004+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /tmp/ollama695377432/runners/cuda_v11/libcudart.so.11.0: cudart init failure: 100"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.120+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /usr/local/cuda/lib64/libcudart.so.12.4.127: cudart init failure: 100"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.252+05:30 level=INFO source=gpu.go:343 msg="Unable to load cudart CUDA management library /usr/lib/x86_64-linux-gnu/libcudart.so.10.1.243: cudart init failure: 100"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.254+05:30 level=INFO source=gpu.go:268 msg="Searching for GPU management library libnvidia-ml.so"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.262+05:30 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.67 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00 /usr/lib32/libnvidia-ml.so.550.67]"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.308+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.67: nvml vram init failure: 9"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.321+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.226.00: nvml vram init failure: 9"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.321+05:30 level=INFO source=gpu.go:326 msg="Unable to load NVML management library /usr/lib32/libnvidia-ml.so.550.67: Unable to load /usr/lib32/libnvidia-ml.so.550.67 library to query for Nvidia GPUs: /usr/lib32/libnvidia-ml.so.550.67: wrong ELF class: ELFCLASS32"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.321+05:30 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.321+05:30 level=INFO source=server.go:127 msg="offload to gpu" reallayers=0 layers=0 required="13791.0 MiB" used="193.0 MiB" available="0 B" kv="1024.0 MiB" fulloffload="164.0 MiB" partialoffload="193.0 MiB"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.322+05:30 level=INFO source=server.go:264 msg="starting llama server" cmd="/tmp/ollama695377432/runners/cpu_avx2/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-86b805d4fd5c7881ae110aeedafc8062d6a504c6aa80145d41e00d0076b6cebe --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 0 --port 41521"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.322+05:30 level=INFO source=server.go:389 msg="waiting for llama runner to start responding"
Apr 25 11:18:53 vassar-Latitude-5490 ollama[12750]: {"function":"server_params_parse","level":"INFO","line":2603,"msg":"logging to file is disabled.","tid":"139646131890048","timestamp":1714024133}
Apr 25 11:18:53 vassar-Latitude-5490 ollama[12750]: {"function":"server_params_parse","level":"WARN","line":2380,"msg":"Not compiled with GPU offload support, --n-gpu-layers option will be ignored. See main README.md for information on enabling GPU BLAS support","n_gpu_layers":-1,"tid":"139646131890048","timestamp":1714024133}
Apr 25 11:18:53 vassar-Latitude-5490 ollama[12750]: {"build":1,"commit":"7593639","function":"main","level":"INFO","line":2819,"msg":"build info","tid":"139646131890048","timestamp":1714024133}
Apr 25 11:18:53 vassar-Latitude-5490 ollama[12750]: {"function":"main","level":"INFO","line":2822,"msg":"system info","n_threads":4,"n_threads_batch":-1,"system_info":"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | ","tid":"139646131890048","timestamp":1714024133,"total_threads":8}
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: loaded meta data with 26 key-value pairs and 323 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-86b805d4fd5c7881ae110aeedafc8062d6a504c6aa80145d41e00d0076b6cebe (version GGUF V3 (latest))
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 0: general.architecture str = llama
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 1: general.name str = Agri-llama
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 3: llama.context_length u32 = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 11008
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 8: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 9: general.file_type u32 = 1
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 10: llama.vocab_size u32 = 32000
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 11: llama.rope.dimension_count u32 = 128
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 12: tokenizer.ggml.model str = llama
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 13: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 14: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 15: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 16: tokenizer.ggml.bos_token_id u32 = 1
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 17: tokenizer.ggml.eos_token_id u32 = 2
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 18: tokenizer.ggml.unknown_token_id u32 = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 20: tokenizer.chat_template str = {% if messages[0]['role'] == 'system'...
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 21: tokenizer.ggml.prefix_token_id u32 = 32007
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 22: tokenizer.ggml.suffix_token_id u32 = 32008
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 23: tokenizer.ggml.middle_token_id u32 = 32009
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 24: tokenizer.ggml.eot_token_id u32 = 32010
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - kv 25: tokenizer.chat_template str = {% if messages[0]['role'] == 'system'...
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - type f32: 97 tensors
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_loader: - type f16: 226 tensors
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_vocab: special tokens definition check successful ( 259/32000 ).
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: format = GGUF V3 (latest)
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: arch = llama
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: vocab type = SPM
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_vocab = 32000
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_merges = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_ctx_train = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_embd = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_head = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_head_kv = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_layer = 32
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_rot = 128
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_embd_head_k = 128
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_embd_head_v = 128
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_gqa = 1
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_embd_k_gqa = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_embd_v_gqa = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: f_norm_eps = 0.0e+00
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: f_norm_rms_eps = 1.0e-05
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: f_clamp_kqv = 0.0e+00
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: f_logit_scale = 0.0e+00
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_ff = 11008
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_expert = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_expert_used = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: causal attn = 1
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: pooling type = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: rope type = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: rope scaling = linear
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: freq_base_train = 10000.0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: freq_scale_train = 1
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: n_yarn_orig_ctx = 4096
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: rope_finetuned = unknown
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: ssm_d_conv = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: ssm_d_inner = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: ssm_d_state = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: ssm_dt_rank = 0
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: model type = 7B
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: model ftype = F16
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: model params = 6.74 B
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: model size = 12.55 GiB (16.00 BPW)
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: general.name = Agri-llama
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: BOS token = 1 '<s>'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: EOS token = 2 '</s>'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: UNK token = 0 '<unk>'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: PAD token = 0 '<unk>'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_print_meta: LF token = 13 '<0x0A>'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llm_load_tensors: ggml ctx size = 0.12 MiB
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 323, got 291
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: llama_load_model_from_file: exception loading model
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: terminate called after throwing an instance of 'std::runtime_error'
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: what(): done_getting_tensors: wrong number of tensors; expected 323, got 291
Apr 25 11:18:53 vassar-Latitude-5490 ollama[6500]: time=2024-04-25T11:18:53.736+05:30 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: -1 "
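
The "wrong number of tensors; expected 323, got 291" failure means the file contains fewer tensors than its GGUF header declares, which typically points at a truncated copy or an inconsistent conversion rather than a runner bug. Since Ollama stores blobs content-addressed by digest, one quick sanity check (path taken from the log above):

# The blob filename encodes the expected sha256; a mismatch means the file was corrupted after import
sha256sum /usr/share/ollama/.ollama/models/blobs/sha256-86b805d4fd5c7881ae110aeedafc8062d6a504c6aa80145d41e00d0076b6cebe

If the digest matches, the GGUF itself was written inconsistently and re-running the conversion/export is the next step.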

@sanyuan0704

I have the same issue with the Phi3 model.


lvoz2 commented Apr 30, 2024

I have the same issue after updating manually (this is on a Raspberry Pi 4B), but with Apple's new OpenELM. I used the conversion and quantization scripts from the PR by joshcarp and successfully built the GGUF file of OpenELM-270M. I want to test it, but this error is preventing me. Interestingly, I can still run gemma:2b, although that was pulled with an older version of Ollama.
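
For reference, the conversion flow in llama.cpp at the time was convert-then-quantize, roughly as follows (a sketch; joshcarp's PR may ship differently named scripts, and the paths are illustrative):

# Convert the Hugging Face checkpoint to an f16 GGUF
python convert-hf-to-gguf.py ./OpenELM-270M --outfile openelm-270m-f16.gguf
# Optionally quantize to a smaller format
./quantize openelm-270m-f16.gguf openelm-270m-q4_0.gguf q4_0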

parthV2 (Author) commented Apr 30, 2024

@lvoz2 I have 6 models in Ollama; every one of them works well except this one.

dhiltgen (Collaborator) commented May 1, 2024

Initial support for phi3 was added in 0.1.32, and conversion should be working in 0.1.33. Please give the latest RC a try and let us know if you're still having problems.

https://github.com/ollama/ollama/releases
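
On Linux, the documented way to upgrade is to re-run the official install script:

curl -fsSL https://ollama.com/install.sh | sh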

dhiltgen self-assigned this May 2, 2024
parthV2 closed this as completed May 3, 2024