Qwen3.6-35B-A3B
Mixed-precision MLX build of Qwen/Qwen3.6-35B-A3B, prepared by baa.ai.
Built at the predicted global quality-maximum operating point for this memory budget.
| Metric | Value |
|---|---|
| In-memory footprint | ~25 GiB |
| Size on disk | 26.3 GB |
| Average bits per weight | 5.29 |
| Group size | 64 |
| Framework | MLX (Apple Silicon) |
| Source | Qwen/Qwen3.6-35B-A3B (BF16, 71.9 GB) |
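The footprint figures above are roughly consistent with the stated average bits per weight. A back-of-envelope check, assuming the "35B" in the model name is the total parameter count (an assumption; the exact count is not given here):

```python
params = 35e9      # assumed total parameter count, from the model name
avg_bits = 5.29    # average bits per weight, from the table above

weight_gb = params * avg_bits / 8 / 1e9  # bits -> bytes -> GB
print(f"estimated weight storage: {weight_gb:.1f} GB")  # ~23.1 GB
```

The estimate lands a few GB under the 26.3 GB on-disk size, which is plausible if the average excludes quantization metadata (per-group scales and biases) or if some tensors are kept at higher precision; that breakdown is not specified in the card.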
Benchmarks pending.
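The group size of 64 in the table refers to group-wise affine quantization: weights are split into groups of 64, and each group gets its own scale and offset. A minimal NumPy sketch of the idea (an illustration, not MLX's actual kernel):

```python
import numpy as np

def quantize_group(w, bits=4):
    """Affine quantization of one weight group: w ~= q * scale + bias."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (2**bits - 1)
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize_group(q, scale, bias):
    return q.astype(np.float32) * scale + bias

rng = np.random.default_rng(0)
group = rng.standard_normal(64).astype(np.float32)  # one group of 64 weights

q, scale, bias = quantize_group(group, bits=4)
recon = dequantize_group(q, scale, bias)
max_err = float(np.abs(group - recon).max())  # bounded by scale / 2
```

A mixed-precision build assigns different bit widths per tensor (hence the fractional 5.29-bit average), trading a little quality in insensitive layers for memory.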
Recommended sampling parameters:

```python
sampler_params = {
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 40,
    "repetition_penalty": 1.1,
    "max_tokens": 8192,
}
```
Usage with `mlx_lm`:

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_logits_processors, make_sampler

model, tokenizer = load("baa-ai/Qwen3.6-35B-A3B-RAM-25GB-MLX")

# Sampler and logits processors matching the recommended parameters above
sampler = make_sampler(temp=1.0, top_p=0.95, top_k=40)
logits_processors = make_logits_processors(repetition_penalty=1.1)

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a Python function that reverses a string."}],
    tokenize=False,
    add_generation_prompt=True,
)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=8192,
    sampler=sampler,
    logits_processors=logits_processors,
)
print(response)
```
| Variant (RAM budget) | Size on disk | Link |
|---|---|---|
| 19 GB | 20.6 GB | baa-ai/Qwen3.6-35B-A3B-RAM-19GB-MLX |
| 25 GB | 26.3 GB | baa-ai/Qwen3.6-35B-A3B-RAM-25GB-MLX |
License: Apache 2.0, inherited from Qwen/Qwen3.6-35B-A3B.