Currently working with Qwen 3.5/6 35B-A3B in the lab ; learning the "quirks" ; still a ways to go.
I noticed the chat template got updated, and tried it on the E4B, with surprising results in stabilizing the model's behavior.
New numbers:

quant   arc     arc/e   boolq   hswag   obkqa   piqa    wino
mxfp8   0.480   0.656   0.797   0.608   0.400   0.755   0.665
mxfp4   0.455   0.607   0.851   0.585   0.402   0.744   0.651

Quant   Perplexity        Peak Memory   Tokens/sec
mxfp8   35.937 ± 0.525    14.80 GB      1153
mxfp4   36.746 ± 0.534    11.06 GB      1030

Old numbers:

quant   arc     arc/e   boolq   hswag   obkqa   piqa    wino
mxfp8   0.404   0.489   0.825   0.586   0.392   0.734   0.661
mxfp4   0.414   0.508   0.854   0.562   0.378   0.717   0.645

Quant   Perplexity        Peak Memory   Tokens/sec
mxfp8   34.652 ± 0.502    14.80 GB      1146
mxfp4   35.203 ± 0.506    11.06 GB      1200

I will re-do all baselines soon based on the new template. It is completely expected that model behavior will change as a result.
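For context, a minimal sketch of how baselines like these can be re-run with EleutherAI's lm-evaluation-harness (the model path and batch size here are assumptions, not the exact lab setup):

```python
import lm_eval

# Placeholder checkpoint path -- substitute the model under test.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./gemma-4-E4B-it,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "piqa", "winogrande"],
    batch_size=8,
)

# Print the headline accuracy for each task.
for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"))
```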
Here are the effects of the new template on a few known distills from DavidAU.

gemma-4-E4B-it-The-DECKARD-Expresso-Universe-HERETIC-UNCENSORED

quant   arc     arc/e   boolq   hswag   obkqa   piqa    wino

New template:
mxfp8   0.518   0.709   0.755   0.657   0.418   0.759   0.626
mxfp4   0.485   0.682   0.792   0.641   0.432   0.746   0.635

Old template:
mxfp8   0.506   0.697   0.754   0.661   0.416   0.757   0.627
mxfp4   0.487   0.670   0.792   0.644   0.430   0.748   0.624

gemma-4-E4B-it-GLM-4.7-Flash-HERETIC-UNCENSORED-Thinking

New template:
mxfp8   0.461   0.599   0.779   0.630   0.406   0.766   0.629

Old template:
mxfp8   0.456   0.580   0.786   0.629   0.410   0.764   0.633

gemma-4-E4B-it-Claude-Opus-4.5-HERETIC-UNCENSORED-Thinking

New template:
mxfp8   0.509   0.705   0.806   0.646   0.416   0.773   0.650

Old template:
mxfp8   0.502   0.692   0.809   0.650   0.420   0.771   0.651

RE: 16-18B; yes, something is running in the lab right now (Gemma 4).
I can also make Qwen 3 (version 3) MoEs, like the Llama3.2-8X3B ones; I have some of these in my repo too.
I have built a few GPT-OSS models, some 12Bs (Mistral Nemo), and some Mistral Nemo "large" 15-17Bs...
A lot of options; maybe in the future. At the moment I am still learning and addressing quirks with these new Gemmas.
Google released three different architectures here: "E", "MOE", and 31B dense.
I also plan to create larger Gemma 4s, which may work better for specific applications, or simply work better period.
These are in the plans for next week.
De-censored, tuned, and tuned again via Unsloth using custom in-house datasets and methods:
DavidAU/gemma-4-E4B-it-The-DECKARD-Expresso-Universe-HERETIC-UNCENSORED-Thinking
Exceeds Gemma4 26B-A4B in critical benchmarks.
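As a rough illustration of the "uncensor first, tune second" pipeline, a minimal Unsloth sketch (the base path, dataset file, and hyperparameters are placeholders, not the in-house recipe):

```python
from unsloth import FastLanguageModel  # import unsloth before trl/transformers
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder path: an already-Heretic'ed (uncensored) checkpoint.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./gemma-4-E4B-it-heretic",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Placeholder dataset file, assumed to hold pre-formatted text in a "text" column.
dataset = load_dataset("json", data_files="custom_dataset.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=200,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```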
Training a Gemma 4 REAP 19B-A4B right now; should be done tomorrow, then testing.
RE: Franken-merge 26B-A3B; yes, I just need to make a map for Mergekit; this is also in progress.
RE: Claudes; depends on how the REAP turns out.
There are a lot of updates still in progress with Unsloth/llama.cpp RE: Gemma 4s at the moment too;
there are also some dataset issues to address when training with Gemma 4s.
NOTE:
Just finished a number of fine-tunes of Gemma 4's E4B, which is a MOE-like model. These will release in the next day or so, pending final testing.
Uncensored first, then tuned.
Some benchmarks posted, others pending.
Examples posted, with detailed instructions.
Some GGUFs are up; others pending as of this writing.
Enjoy:
DavidAU/gemma-4-31B-it-Mystery-Fine-Tune-HERETIC-UNCENSORED-Thinking
DavidAU/gemma-4-31B-it-Grand-Horror-X-INTENSE-HERETIC-UNCENSORED-Thinking
DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking
Qwen 3.5 40B Claude Opus Deckard UNCENSORED.
Expanded, and trained with a Claude Opus 4.6 dataset; but first it was Heretic'ed and trained with DECKARD: 5 hand-crafted datasets to give the model character, point of view, and intelligence... and a lot more.
Examples posted.
Several quant types available under quantizations:
DavidAU/Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking
Drastically larger, with performance to match.
Upgraded Jinja template too.
DavidAU/Qwen3.5-40B-Claude-4.5-Opus-High-Reasoning-Thinking
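Since template updates come up repeatedly here, a minimal sketch of how to inspect the Jinja template a repo ships with, using the Hugging Face transformers API (the repo id below is a placeholder, not one of the models listed):

```python
from transformers import AutoTokenizer

# Placeholder repo id -- substitute the model you are checking.
tokenizer = AutoTokenizer.from_pretrained("DavidAU/some-model")

# The Jinja chat template ships with the tokenizer; re-downloading the
# repo picks up any template update pushed to the Hub.
print(tokenizer.chat_template)

# Render a turn to see exactly what the model will be fed.
messages = [{"role": "user", "content": "Hello!"}]
print(tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the rendered string
    add_generation_prompt=True,  # append the assistant-turn header
))
```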
UPDATE:
All of these are now up and can be downloaded.
Awaiting quants.
RE: 13B:
=> one is upscaled + trained; the other is a merge of two 9B fine-tunes (and upscaled).
They are hidden as of this writing (undergoing private testing), awaiting final metrics/eval.
If they "pass", they will be made public.
These will be active within 24-48 hrs, pending results.
I currently have a fully running 13B (GLM 4.7 Flash), which is very strong, and experimental 21Bs of Qwen 3.5.
These are trained.
These are in testing, and access is limited as of this writing.
As for MOEs:
This is a little more complicated, as scripting must be written for Mergekit to "moe together" 0.8B, 2B, 4B, 9B models, etc.
A draft (by me) has been completed to do this, but it is not tested/debugged yet; see the sketch below for the general shape.
No timeline here; too many variables.
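For reference, mergekit's MoE mode is driven by a YAML "map" rather than code; a minimal sketch of the shape such a config can take (all model paths and prompts below are placeholders, not the actual draft):

```yaml
# Run with: mergekit-moe config.yaml ./output-moe
# All paths and prompts below are placeholders.
base_model: ./base-4b
gate_mode: hidden        # route tokens by hidden-state similarity to the prompts
dtype: bfloat16
experts:
  - source_model: ./fine-tune-a-4b
    positive_prompts:
      - "Write a story about"
  - source_model: ./fine-tune-b-4b
    positive_prompts:
      - "Write a Python function that"
```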
RE: 35B MOEs; it is possible to address this in a different way, but I have not tried it yet.
This is a different approach than REAP.
All are benchmarked against the original ("org") model.
Many exceed all benchmarks of the org model.
Claude, GLM, Gemini and other distills.
Thinking AND dedicated Instruct versions.
Core goal: Increase benchmarks, and address long thinking blocks.
Highlights:
9B and 27B instruct "Claude" versions hit 0.624 and 0.675 on ARC-C (the hard challenge).
Thinking fine tunes exceed org model performance (in thinking mode).
In many cases there is a drastic reduction in thinking block size.
9B Claude Heretic Uncensored, GGUF:
- Neo, Code Imatrix (dual imatrix)
- Updated Jinja template
- Custom tensor enhancements.
DavidAU/Qwen3.5-9B-Claude-4.6-OS-Auto-Variable-HERETIC-UNCENSORED-THINKING-MAX-NEOCODE-Imatrix-GGUF
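A minimal sketch of loading one of these GGUF quants with llama-cpp-python (the file name is a placeholder; pick the actual quant file from the repo):

```python
from llama_cpp import Llama

# Placeholder file name -- choose the actual quant file from the repo.
llm = Llama(
    model_path="./Qwen3.5-9B-Claude-Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU when available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
)
print(out["choices"][0]["message"]["content"])
```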
COLLECTION [21 models]:
https://huggingface.co/collections/DavidAU/qwen-35-08-2-4-9-27-35b-regular-uncensored
UPDATE:
Now 31 models, including experimental 21B and new 13B models.
arc_challenge  arc_easy  boolq  hellaswag  openbookqa  piqa   winogrande
0.661          0.816     0.878  0.763      0.464       0.808  0.762

For comparison:
Qwen3.5-27B-Text (qx86-hi):
0.443          0.498     0.857  0.701      0.372       0.770  0.752
Trained on a HERETIC uncensored base too;
DavidAU/Gemma3-27B-it-vl-Polaris-HI16-Heretic-Uncensored-INSTRUCT
3 Ernie 21B-A3B MOE models (64 experts), fine-tuned with Unsloth using Gemini Pro 3, Claude 4.5 Opus, and GLM 4.7 Flash high-reasoning datasets.
All benched, all exceeding org model specs too.
https://huggingface.co/DavidAU/models?search=ernie
Enjoy the freedom and added power.
20 Gemma 3 models (1B, 4B, 12B, and 27B) with full reasoning, using GLM 4.7 Flash, GPT, Claude, and Gemini datasets and more, fully fine-tuned using Unsloth.
Most models are Heretic'ed (uncensored) first, and tuned second.
This vastly improves the model.
Models are also benchmarked, and in almost all cases exceed org model metrics - in some cases by a lot.
Enjoy the freedom and more powerful THINKING/REASONING and UNCENSORED Gemma 3s !
https://huggingface.co/collections/DavidAU/gemma-3-reasoning-thinking-models-incl-uncensored
UPDATE: Benchmarks added for almost all models, including "VS" comparisons against the Heretic (untuned) source models too.
9 Heretic Uncensored LFM fine-tunes are now up at my repo:
https://huggingface.co/DavidAU/models?sort=created&search=lfm
Model card updates in progress as I write this.
The merges will take a wee bit longer.
...and 5 more new "non-heretic" ones too.
@muxodious: Excellent.
In the queue.
Important note:
I can make the base models with reasoning datasets; however, the "Kimi Mega Brain" is a complex merge of these base models (trained with different datasets) by Nightmedia.
I will query Nightmedia to see if he will do an updated "Heretic" mega-brain merge after the "heretic" versions are complete.
Waiting for updates to Heretic/Transformers to make this possible with a "thinking" LFM base.