Qwen3.5-27B
| bash-5.3$ ./sandbox.sh llm chat | |
| Chatting with llama-server | |
| Type 'exit' or 'quit' to exit | |
| Type '!multi' to enter multiple lines, then '!end' to finish | |
| Type '!edit' to open your default editor and modify the prompt | |
| Type '!fragment <my_fragment> [<another_fragment> ...]' to insert one or more fragments | |
| > !edit | |
| Here is a detailed proofread of your blog post. Overall, it's a solid technical piece with a clear value proposition. However, there are several grammatical errors, numbering inconsistencies, and a few technical phrasing issues that should be addressed to make it polished. | |
| ### 🔴 Critical Errors & Inconsistencies | |
| 1. **Model Name Consistency:** | |
| * **Issue:** You refer to the model as **Qwen3.5** in the title, excerpt, and intro, but switch to **Qwen3** in the Conclusion. | |
| * **Fix:** Standardize on `Qwen3.5` throughout the post (or `Qwen 3.5` for readability). | |
| * **Location:** Conclusion ("On a MacBook Pro... Qwen3 `35B-A3B`..."). | |
| 2. **List Numbering:** | |
| * **Issue:** In the "Sandboxing" and "Local clients" sections, your Markdown lists use `1.` for every item, rendering as `1, 1, 1`. | |
| * **Fix:** Change the second and third items in these lists to `2.` and `3.`. | |
| * **Location:** "Sandboxing" section (list of 3 items) and "Local clients" section (list of 2 items). | |
| 3. **HTTP Protocol Typo:** | |
| * **Issue:** In the QEMU VM code block, the curl command is missing a slash in the protocol. | |
| * **Fix:** Change `http:/10.0.2.2` to `http://10.0.2.2`. | |
| * **Location:** "In a QEMU VM" code block. | |
| 4. **Technical Inaccuracy (Bytecode):** | |
| * **Issue:** You mention allowing access to "executable bytecode". On macOS (and Linux), native executables are typically Mach-O or ELF binaries, not "bytecode" (which implies Java, Python, etc.). | |
| * **Fix:** Change to "the executable binary" or just "the executable". | |
| * **Location:** "Server" section ("...allows access to the executable bytecode..."). | |
| ### 🟠 Grammar & Syntax | |
| 1. **Verb Agreement:** | |
| * **Original:** "The `sandbox.sh` scripts automatically **takes** care..." | |
| * **Correction:** "The `sandbox.sh` scripts automatically **take** care..." (Plural subject "scripts" needs plural verb). | |
| * **Original:** "Now that the big AI companies **using** it..." | |
| * **Correction:** "Now that the big AI companies **are using** it..." | |
| 2. **Spelling & Word Choice:** | |
| * **Original:** "...has **lead** to [code execution vulnerabilities]..." | |
| * **Correction:** "...has **led** to..." (Past participle of lead). | |
| * **Original:** "...guarantees both integrity and **confidentially**..." | |
| * **Correction:** "...guarantees both integrity and **confidentiality**..." (Noun form needed). | |
| * **Original:** "...64GB of **RAMs**..." | |
| * **Correction:** "...64GB of **RAM**..." (RAM is uncountable). | |
| * **Original:** "...these relatively small **model** are able..." | |
| * **Correction:** "...these relatively small **models** are able..." | |
| * **Original:** "...preventing `opencode` **to leak** your prompts..." | |
| * **Correction:** "...preventing `opencode` **from leaking** your prompts..." | |
| 3. **Sentence Structure:** | |
| * **Original:** "...so `llama-server` from the host **available** at `10.0.2.2:8080`..." | |
| * **Correction:** "...so `llama-server` from the host **is available** at `10.0.2.2:8080`..." (Missing verb). | |
| * **Original:** "YOLO **modo**" | |
| * **Correction:** "YOLO **mode**" (Unless you specifically want the Italian/Spanish flair, "mode" is the standard English term for this context). | |
| ### 🟡 Flow & Clarity | |
| 1. **Excerpt Clarity:** | |
| * **Current:** "...plus local Qwen3.5 from a QEMU VM." | |
| * **Critique:** This implies the model runs *inside* the VM. Your post explains the model runs on the host and is accessed *by* the VM. | |
| * **Suggestion:** "...plus local Qwen3.5 **accessed via** a QEMU VM." | |
| 2. **Confusing Parenthetical:** | |
| * **Current:** "...preventing `opencode` to leak your prompts (that's how it gives a name to sessions)." | |
| * **Critique:** This parenthetical is confusing. Does `opencode` name sessions based on the prompts? If the sandbox prevents leaking, how does it name them? | |
| * **Suggestion:** Clarify or remove. If the point is that the sandbox stops the prompts from leaving the machine, the session naming part might be distracting. | |
| 3. **The "Intel CPUs" Non-Sequitur:** | |
| * **Current:** "Plus, their OCR capabilities are impressive, even on Intel CPUs!" | |
| * **Critique:** You are testing on an M3 Max MacBook. Mentioning Intel CPUs here feels like a non-sequitur or a typo. Did you test it elsewhere? | |
| * **Suggestion:** If you haven't tested on Intel, remove the reference to avoid confusing readers. If you mean the model *architecture* supports Intel, rephrase: "...impressive across different architectures, including Intel CPUs." | |
| 4. **Code Block Artifact:** | |
| * **Current:** `{"status":"ok"}⏎` | |
| * **Critique:** The `⏎` symbol looks like an artifact from a terminal copy-paste. | |
| * **Suggestion:** Remove the symbol to keep the code block clean. | |
| 5. **Footnote Link:** | |
| * **Current:** `manpath=macOS+26.3` (Footnote 3) | |
| * **Critique:** macOS version numbers are currently around 15 (Sequoia). `26.3` looks suspicious (possibly a FreeBSD version number mixed in?). | |
| * **Suggestion:** Verify the link or remove the version number if it's dynamic. | |
| ### 🟢 Specific Edits (Before/After) | |
| **Section: Sandboxing** | |
| > *Before:* "when they can wreak havoc in YOLO modo (`--dangerously-skip-permissions`)." | |
| > *After:* "when they can wreak havoc in YOLO mode (`--dangerously-skip-permissions`)." | |
| > *Before:* "since parsing the GGUF files representing the models has lead to [code execution vulnerabilities][10]";" | |
| > *After:* "since parsing GGUF files has led to [code execution vulnerabilities][10]";" | |
| **Section: Local clients** | |
| > *Before:* "The `sandbox.sh` scripts automatically takes care of the configuration files..." | |
| > *After:* "The `sandbox.sh` script automatically takes care of the configuration files..." (Or "scripts... take"). | |
| **Section: In a QEMU VM** | |
| > *Before:* "so `llama-server` from the host available at `10.0.2.2:8080`:" | |
| > *After:* "so `llama-server` from the host is available at `10.0.2.2:8080`:" | |
| **Section: Conclusion** | |
| > *Before:* "On a MacBook Pro with an M3 Max and 64GB of RAMs, Qwen3 `35B-A3B` is quick enough..." | |
| > *After:* "On a MacBook Pro with an M3 Max and 64GB of RAM, Qwen3.5 `35B-A3B` is quick enough..." | |
| > *Before:* "Unsurprisingly, the quality of results is lower than frontier hosted models, but it's a big step forward: these relatively small model are able to make small, interactive changes." | |
| > *After:* "Unsurprisingly, the quality of results is lower than frontier hosted models, but it's a big step forward: these relatively small models are capable of making small, interactive edits." | |
| ### Summary | |
| The content is excellent and the "default-deny" approach is a great hook. Once you fix the grammar, list numbering, and model name consistency, this will be a very high-quality technical post. Good luck! |
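
The single-slash URL typo flagged under "Critical Errors" (`http:/10.0.2.2`) is the kind of mistake a small script can catch before publishing. Below is a minimal sketch, assuming the post is available as plain text. The function name `fix_single_slash_urls` and the sample `curl` line (including its `/health` path) are hypothetical and are not taken from the original post.

```python
import re

# Matches "http:/" or "https:/" only when it is NOT followed by a second slash.
_SINGLE_SLASH = re.compile(r"\b(https?):/(?!/)")


def fix_single_slash_urls(text: str) -> str:
    """Rewrite 'http:/host' as 'http://host'; well-formed URLs are left untouched."""
    return _SINGLE_SLASH.sub(r"\1://", text)


# Hypothetical sample line; the real command from the post is not shown in this log.
print(fix_single_slash_urls("curl http:/10.0.2.2:8080/health"))
```

The negative lookahead keeps the substitution idempotent, so running it twice over the same text does not produce `http:///`.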
| srv params_from_: Chat format: peg-native | |
| slot get_availabl: id 3 | task -1 | selected slot by LRU, t_last = -1 | |
| slot launch_slot_: id 3 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> temp-ext -> dist | |
| slot launch_slot_: id 3 | task 0 | processing task, is_child = 0 | |
| slot update_slots: id 3 | task 0 | new prompt, n_ctx_slot = 262144, n_keep = 0, task.n_tokens = 11 | |
| slot update_slots: id 3 | task 0 | n_tokens = 0, memory_seq_rm [0, end) | |
| slot init_sampler: id 3 | task 0 | init sampler, took 0.00 ms, tokens: text = 11, total = 11 | |
| slot update_slots: id 3 | task 0 | prompt processing done, n_tokens = 11, batch.n_tokens = 11 | |
| srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200 | |
| slot print_timing: id 3 | task 0 | | |
| prompt eval time = 1229.25 ms / 11 tokens ( 111.75 ms per token, 8.95 tokens per second) | |
| eval time = 18463.71 ms / 194 tokens ( 95.17 ms per token, 10.51 tokens per second) | |
| total time = 19692.96 ms / 205 tokens | |
| slot release: id 3 | task 0 | stop processing: n_tokens = 204, truncated = 0 | |
| srv update_slots: all slots are idle | |
| srv params_from_: Chat format: peg-native | |
| slot get_availabl: id 2 | task -1 | selected slot by LRU, t_last = -1 | |
| slot launch_slot_: id 2 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> temp-ext -> dist | |
| slot launch_slot_: id 2 | task 195 | processing task, is_child = 0 | |
| slot update_slots: id 2 | task 195 | new prompt, n_ctx_slot = 262144, n_keep = 0, task.n_tokens = 2573 | |
| slot update_slots: id 2 | task 195 | n_tokens = 0, memory_seq_rm [0, end) | |
| slot update_slots: id 2 | task 195 | prompt processing progress, n_tokens = 2048, batch.n_tokens = 2048, progress = 0.795958 | |
| slot update_slots: id 2 | task 195 | n_tokens = 2048, memory_seq_rm [2048, end) | |
| slot update_slots: id 2 | task 195 | prompt processing progress, n_tokens = 2061, batch.n_tokens = 13, progress = 0.801010 | |
| slot update_slots: id 2 | task 195 | n_tokens = 2061, memory_seq_rm [2061, end) | |
| slot init_sampler: id 2 | task 195 | init sampler, took 0.23 ms, tokens: text = 2573, total = 2573 | |
| slot update_slots: id 2 | task 195 | prompt processing done, n_tokens = 2573, batch.n_tokens = 512 | |
| slot update_slots: id 2 | task 195 | created context checkpoint 1 of 32 (pos_min = 2060, pos_max = 2060, n_tokens = 2061, size = 149.626 MiB) | |
| srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200 | |
| slot print_timing: id 2 | task 195 | | |
| prompt eval time = 14795.45 ms / 2573 tokens ( 5.75 ms per token, 173.90 tokens per second) | |
| eval time = 688980.45 ms / 5453 tokens ( 126.35 ms per token, 7.91 tokens per second) | |
| total time = 703775.90 ms / 8026 tokens | |
| slot release: id 2 | task 195 | stop processing: n_tokens = 8025, truncated = 0 | |
| srv update_slots: all slots are idle |
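
The throughput figures in the `print_timing` lines above follow directly from the elapsed milliseconds and the token counts. Here is a small sketch that recomputes them from a log line, assuming the line layout shown in this log. The helper name `tokens_per_second` is made up for illustration and is not part of llama.cpp.

```python
import re

# Matches timing lines such as
# "eval time = 18463.71 ms / 194 tokens ( 95.17 ms per token, ... )".
_TIMING = re.compile(
    r"(?P<label>prompt eval|eval|total) time\s*=\s*"
    r"(?P<ms>[\d.]+) ms\s*/\s*(?P<tokens>\d+) tokens"
)


def tokens_per_second(line: str):
    """Return (label, tokens/second) for a timing line, or None if it does not match."""
    m = _TIMING.search(line)
    if m is None:
        return None
    seconds = float(m["ms"]) / 1000.0
    return m["label"], int(m["tokens"]) / seconds


# Generation phase of the second request (task 195) in the log above.
line = "| eval time = 688980.45 ms / 5453 tokens ( 126.35 ms per token, 7.91 tokens per second) | |"
label, tps = tokens_per_second(line)
print(f"{label}: {tps:.2f} tokens/s")
```

For the second request this shows where the time went: prompt processing ran at about 174 tokens per second, while generating 5453 tokens at about 7.9 tokens per second accounts for nearly all of the 703.8 s total.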
| llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. | |
| llama_model_loader: - kv 0: general.architecture str = qwen35 | |
| llama_model_loader: - kv 1: general.type str = model | |
| llama_model_loader: - kv 2: general.sampling.top_k i32 = 20 | |
| llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000 | |
| llama_model_loader: - kv 4: general.sampling.temp f32 = 0.600000 | |
| llama_model_loader: - kv 5: general.name str = Qwen3.5-27B | |
| llama_model_loader: - kv 6: general.basename str = Qwen3.5-27B | |
| llama_model_loader: - kv 7: general.quantized_by str = Unsloth | |
| llama_model_loader: - kv 8: general.size_label str = 27B | |
| llama_model_loader: - kv 9: general.license str = apache-2.0 | |
| llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3.5-2... | |
| llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth | |
| llama_model_loader: - kv 12: general.base_model.count u32 = 1 | |
| llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3.5 27B | |
| llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen | |
| llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3.5-27B | |
| llama_model_loader: - kv 16: general.tags arr[str,3] = ["qwen3_5_moe", "unsloth", "image-tex... | |
| llama_model_loader: - kv 17: qwen35.block_count u32 = 64 | |
| llama_model_loader: - kv 18: qwen35.context_length u32 = 262144 | |
| llama_model_loader: - kv 19: qwen35.embedding_length u32 = 5120 | |
| llama_model_loader: - kv 20: qwen35.feed_forward_length u32 = 17408 | |
| llama_model_loader: - kv 21: qwen35.attention.head_count u32 = 24 | |
| llama_model_loader: - kv 22: qwen35.attention.head_count_kv u32 = 4 | |
| llama_model_loader: - kv 23: qwen35.rope.dimension_sections arr[i32,4] = [11, 11, 10, 0] | |
| llama_model_loader: - kv 24: qwen35.rope.freq_base f32 = 10000000.000000 | |
| llama_model_loader: - kv 25: qwen35.attention.layer_norm_rms_epsilon f32 = 0.000001 | |
| llama_model_loader: - kv 26: qwen35.attention.key_length u32 = 256 | |
| llama_model_loader: - kv 27: qwen35.attention.value_length u32 = 256 | |
| llama_model_loader: - kv 28: qwen35.ssm.conv_kernel u32 = 4 | |
| llama_model_loader: - kv 29: qwen35.ssm.state_size u32 = 128 | |
| llama_model_loader: - kv 30: qwen35.ssm.group_count u32 = 16 | |
| llama_model_loader: - kv 31: qwen35.ssm.time_step_rank u32 = 48 | |
| llama_model_loader: - kv 32: qwen35.ssm.inner_size u32 = 6144 | |
| llama_model_loader: - kv 33: qwen35.full_attention_interval u32 = 4 | |
| llama_model_loader: - kv 34: qwen35.rope.dimension_count u32 = 64 | |
| llama_model_loader: - kv 35: tokenizer.ggml.model str = gpt2 | |
| llama_model_loader: - kv 36: tokenizer.ggml.pre str = qwen35 | |
| llama_model_loader: - kv 37: tokenizer.ggml.tokens arr[str,248320] = ["!", "\"", "#", "$", "%", "&", "'", ... | |
| llama_model_loader: - kv 38: tokenizer.ggml.token_type arr[i32,248320] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | |
| llama_model_loader: - kv 39: tokenizer.ggml.merges arr[str,247587] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",... | |
| llama_model_loader: - kv 40: tokenizer.ggml.eos_token_id u32 = 248046 | |
| llama_model_loader: - kv 41: tokenizer.ggml.padding_token_id u32 = 248055 | |
| llama_model_loader: - kv 42: tokenizer.chat_template str = {%- set image_count = namespace(value... | |
| llama_model_loader: - kv 43: general.quantization_version u32 = 2 | |
| llama_model_loader: - kv 44: general.file_type u32 = 7 | |
| llama_model_loader: - kv 45: quantize.imatrix.file str = Qwen3.5-27B-GGUF/imatrix_unsloth.gguf | |
| llama_model_loader: - kv 46: quantize.imatrix.dataset str = unsloth_calibration_Qwen3.5-27B.txt | |
| llama_model_loader: - kv 47: quantize.imatrix.entries_count u32 = 496 | |
| llama_model_loader: - kv 48: quantize.imatrix.chunks_count u32 = 80 | |
| llama_model_loader: - type f32: 353 tensors | |
| llama_model_loader: - type q8_0: 498 tensors | |
| print_info: file format = GGUF V3 (latest) | |
| print_info: file type = Q8_0 | |
| print_info: file size = 26.62 GiB (8.50 BPW) | |
| load: 0 unused tokens | |
| load: printing all EOG tokens: | |
| load: - 248044 ('<|endoftext|>') | |
| load: - 248046 ('<|im_end|>') | |
| load: - 248063 ('<|fim_pad|>') | |
| load: - 248064 ('<|repo_name|>') | |
| load: - 248065 ('<|file_sep|>') | |
| load: special tokens cache size = 33 | |
| load: token to piece cache size = 1.7581 MB | |
| print_info: arch = qwen35 | |
| print_info: vocab_only = 0 | |
| print_info: no_alloc = 0 | |
| print_info: n_ctx_train = 262144 | |
| print_info: n_embd = 5120 | |
| print_info: n_embd_inp = 5120 | |
| print_info: n_layer = 64 | |
| print_info: n_head = 24 | |
| print_info: n_head_kv = 4 | |
| print_info: n_rot = 64 | |
| print_info: n_swa = 0 | |
| print_info: is_swa_any = 0 | |
| print_info: n_embd_head_k = 256 | |
| print_info: n_embd_head_v = 256 | |
| print_info: n_gqa = 6 | |
| print_info: n_embd_k_gqa = 1024 | |
| print_info: n_embd_v_gqa = 1024 | |
| print_info: f_norm_eps = 0.0e+00 | |
| print_info: f_norm_rms_eps = 1.0e-06 | |
| print_info: f_clamp_kqv = 0.0e+00 | |
| print_info: f_max_alibi_bias = 0.0e+00 | |
| print_info: f_logit_scale = 0.0e+00 | |
| print_info: f_attn_scale = 0.0e+00 | |
| print_info: n_ff = 17408 | |
| print_info: n_expert = 0 | |
| print_info: n_expert_used = 0 | |
| print_info: n_expert_groups = 0 | |
| print_info: n_group_used = 0 | |
| print_info: causal attn = 1 | |
| print_info: pooling type = 0 | |
| print_info: rope type = 40 | |
| print_info: rope scaling = linear | |
| print_info: freq_base_train = 10000000.0 | |
| print_info: freq_scale_train = 1 | |
| print_info: n_ctx_orig_yarn = 262144 | |
| print_info: rope_yarn_log_mul = 0.0000 | |
| print_info: rope_finetuned = unknown | |
| print_info: mrope sections = [11, 11, 10, 0] | |
| print_info: ssm_d_conv = 4 | |
| print_info: ssm_d_inner = 6144 | |
| print_info: ssm_d_state = 128 | |
| print_info: ssm_dt_rank = 48 | |
| print_info: ssm_n_group = 16 | |
| print_info: ssm_dt_b_c_rms = 0 | |
| print_info: model type = 27B | |
| print_info: model params = 26.90 B | |
| print_info: general.name = Qwen3.5-27B | |
| print_info: vocab type = BPE | |
| print_info: n_vocab = 248320 | |
| print_info: n_merges = 247587 | |
| print_info: BOS token = 11 ',' | |
| print_info: EOS token = 248046 '<|im_end|>' | |
| print_info: EOT token = 248046 '<|im_end|>' | |
| print_info: PAD token = 248055 '<|vision_pad|>' | |
| print_info: LF token = 198 'Ċ' | |
| print_info: FIM PRE token = 248060 '<|fim_prefix|>' | |
| print_info: FIM SUF token = 248062 '<|fim_suffix|>' | |
| print_info: FIM MID token = 248061 '<|fim_middle|>' | |
| print_info: FIM PAD token = 248063 '<|fim_pad|>' | |
| print_info: FIM REP token = 248064 '<|repo_name|>' | |
| print_info: FIM SEP token = 248065 '<|file_sep|>' | |
| print_info: EOG token = 248044 '<|endoftext|>' | |
| print_info: EOG token = 248046 '<|im_end|>' | |
| print_info: EOG token = 248063 '<|fim_pad|>' | |
| print_info: EOG token = 248064 '<|repo_name|>' | |
| print_info: EOG token = 248065 '<|file_sep|>' | |
| print_info: max token length = 256 |
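
Several values in the loader output above can be cross-checked against each other. The sketch below recomputes `n_gqa`, `n_embd_k_gqa`, and the bits-per-weight figure from the other printed fields. The formulas are my reading of how these quantities relate, not a quotation of llama.cpp's source, and the inputs are the rounded values from the log, so the BPW result is approximate.

```python
# Values copied from the loader output above.
N_HEAD = 24            # print_info: n_head
N_HEAD_KV = 4          # print_info: n_head_kv
N_EMBD_HEAD_K = 256    # print_info: n_embd_head_k
FILE_SIZE_GIB = 26.62  # print_info: file size (rounded in the log)
MODEL_PARAMS = 26.90e9 # print_info: model params (rounded in the log)


def gqa_factor(n_head: int, n_head_kv: int) -> int:
    """Number of query heads that share each key/value head."""
    return n_head // n_head_kv


def kv_width(n_embd_head: int, n_head_kv: int) -> int:
    """Total key (or value) width per token across all KV heads."""
    return n_embd_head * n_head_kv


def bits_per_weight(file_size_gib: float, n_params: float) -> float:
    """File size in bits divided by the parameter count."""
    return file_size_gib * 2**30 * 8 / n_params


print("n_gqa        =", gqa_factor(N_HEAD, N_HEAD_KV))
print("n_embd_k_gqa =", kv_width(N_EMBD_HEAD_K, N_HEAD_KV))
print("BPW          = %.2f" % bits_per_weight(FILE_SIZE_GIB, MODEL_PARAMS))
```

The results agree with the logged `n_gqa = 6`, `n_embd_k_gqa = 1024`, and `8.50 BPW`. The half bit above 8 is what you would expect if Q8_0 stores a 16-bit scale for each block of 32 weights, with the remaining difference coming from the f32 tensors.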