# Qwen3.6-35B-A3B-Abliterated-Heretic-GGUF
This is a GGUF release of an abliterated version of Qwen's Qwen3.6-35B-A3B.
Heretic was applied to the Qwen3.6 sparse-MoE text stack to remove the base model's refusal behavior at the weight level. The result keeps Qwen3.6-35B-A3B's multimodal architecture and general capability profile while no longer defaulting to the original refusal pattern.
## Quick Benchmarks
| Check | Original Qwen3.6-35B-A3B | Abliterated Heretic |
|---|---|---|
| Official 25-prompt refusal check | 22/25 refusals | 1/25 refusals |
| Heretic run KL divergence | - | ≈ 0.0107 |
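The KL divergence figure measures how far the abliterated model's next-token distribution drifts from the original's; values near zero indicate the intervention left general behavior largely intact. A minimal sketch of the metric, using made-up toy distributions rather than real model outputs:

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a 3-token vocabulary:
# original model vs. abliterated model on the same prompt.
original    = [0.70, 0.20, 0.10]
abliterated = [0.68, 0.21, 0.11]

# Small drift between the two distributions gives a KL value near zero.
print(round(kl_divergence(original, abliterated), 6))
```

In practice the reported number would be averaged over many prompts and token positions, but the per-position quantity is exactly this sum.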
## Methodology & Model Notes
Qwen3.6-35B-A3B is a 35.95B-parameter sparse-MoE vision-language model with roughly 3B active parameters per token, 40 text layers, and 256 routed experts, of which 8 are active per token.
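The gap between total and active parameters comes from top-k expert routing: each token's router scores all experts, but only the highest-scoring k experts run. A toy sketch of that selection, with the 256/8 figures from the spec table and made-up router scores:

```python
import random

def top_k_experts(router_scores, k=8):
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])

random.seed(0)
scores = [random.random() for _ in range(256)]  # one token's router scores
active = top_k_experts(scores, k=8)
print(len(active), "of", len(scores), "experts active")  # 8 of 256
```

Because only 8 of 256 expert MLPs execute per token, the compute cost tracks the ~3B active parameters rather than the 35.95B total.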
This release was produced with a Heretic MPOA/SOMA-style sibling-transfer run, finalized with a split-MoE input-side intervention on the accepted candidate.
The accepted candidate scored 1/25 refusals on the official 25-prompt marker suite, the same suite used for the MiniMax M2.7 abliterated run.
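A marker suite of this kind scores each response by checking it against a list of refusal phrases. The actual 25 prompts and markers are not reproduced here, so everything below is illustrative:

```python
# Hypothetical refusal-marker check; the real suite's prompts and
# marker list are not public here, so these values are illustrative.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    """Flag a response whose opening matches a known refusal marker."""
    text = response.strip().lower()
    return any(text.startswith(m) for m in REFUSAL_MARKERS)

responses = [
    "I can't help with that request.",
    "Sure - here is a step-by-step overview...",
    "I'm sorry, but I won't assist with this.",
]
refusals = sum(is_refusal(r) for r in responses)
print(f"Refusals: {refusals}/{len(responses)}")  # Refusals: 2/3
```

The reported 22/25 vs. 1/25 numbers are this kind of count taken over the official prompt set for the original and abliterated checkpoints respectively.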
The resulting abliterated checkpoint was exported to BF16 and then converted to GGUF for llama.cpp-compatible deployment.
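The export path can be sketched with llama.cpp's standard conversion and quantization tools. The checkpoint directory, output filenames, and build path below are illustrative, not the exact commands used for this release:

```shell
# Convert the Hugging Face-format checkpoint to a BF16 GGUF source file,
# assuming a local llama.cpp checkout and the abliterated checkpoint
# in ./Qwen3.6-35B-A3B-Abliterated-Heretic (paths are illustrative).
python llama.cpp/convert_hf_to_gguf.py ./Qwen3.6-35B-A3B-Abliterated-Heretic \
  --outtype bf16 --outfile model-BF16.gguf

# Quantize the BF16 source down to the released sizes.
llama.cpp/build/bin/llama-quantize model-BF16.gguf model-Q8_0.gguf Q8_0
llama.cpp/build/bin/llama-quantize model-BF16.gguf model-Q6_K.gguf Q6_K
llama.cpp/build/bin/llama-quantize model-BF16.gguf model-Q4_K_M.gguf Q4_K_M
```

Each quant in the Files list below is produced from the same BF16 source, trading file size against fidelity.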
## Files
- `Qwen3.6-35B-A3B-Abliterated-Heretic-BF16/`: BF16 GGUF source
- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q8_0/`: highest-fidelity quant
- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q6_K/`: near-lossless practical quant
- `Qwen3.6-35B-A3B-Abliterated-Heretic-Q4_K_M/`: smaller general-use quant
- `mmproj-Qwen3.6-35B-A3B-Abliterated-Heretic.gguf`: matching multimodal projector file for llama.cpp vision use
## Running

```shell
llama-server \
  -m <quant-file.gguf> \
  --mmproj <mmproj-file.gguf> \
  -ngl 999 -c 32768 --jinja -fa
```
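Once running, llama-server exposes an OpenAI-compatible chat endpoint. A minimal sketch of the request body a client would POST to `http://localhost:8080/v1/chat/completions`, assuming the default port; the sampling parameters are illustrative:

```python
import json

def chat_payload(user_message: str, temperature: float = 0.7) -> str:
    """Build an OpenAI-style chat request body for llama-server."""
    body = {
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "max_tokens": 512,
    }
    return json.dumps(body)

# Example request body for a single-turn text prompt.
print(chat_payload("Summarize the GGUF file format.", temperature=0.2))
```

With `--jinja` enabled the server applies the model's own chat template to these messages, so the client does not need to format the prompt itself.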
## Model Architecture
| Spec | Value |
|---|---|
| Total Parameters | 35.95B (sparse MoE) |
| Active Parameters | ~3B per token |
| Experts | 256 routed, 8 per token |
| Layers | 40 |
| Hidden Size | 2048 |
| Family | qwen3_5_moe |
| Modality | Vision-language |
| Base Model | Qwen/Qwen3.6-35B-A3B |
## Disclaimer
This model has had refusal behavior removed at the weight level. It will answer prompts that the base model would normally refuse. You are responsible for how you use it.
## Credits
- Base model: Qwen/Qwen3.6-35B-A3B
- Refusal removal pipeline: Heretic
- GGUF runtime and quantization: llama.cpp
## License

This release inherits the base Qwen3.6-35B-A3B license: Apache-2.0.