could you make an uncensored version of DeepSeek v4 Flash?
I absolutely would if I could. The issue is that to run Heretic you basically need the whole model in VRAM, and DeepSeek-V4-Flash is a 160GB model; I don't have that much VRAM. The only way would be to add an RTX PRO 6000 Blackwell Workstation Edition (or one of the more expensive datacenter cards like a B300), and that's not going to happen unless someone donates 11,000+ Euros.
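As a rough illustration of the VRAM problem, here is a back-of-envelope fit check. The model size and the H200/B200/B300 capacities are the figures discussed in this thread; the 96GB for the RTX PRO 6000 Blackwell is its published spec, and the 25% headroom factor is my own assumption, since Heretic needs working memory on top of the weights.

```python
# Rough single-card VRAM fit check: weights plus an assumed headroom factor,
# ignoring activations, KV cache, and Heretic's exact buffer requirements.

MODEL_GB = 160       # DeepSeek-V4-Flash weights, as stated in the thread
HEADROOM = 1.25      # assumed 25% extra working memory (illustration only)

CARDS_GB = {
    "RTX PRO 6000 Blackwell": 96,
    "H200": 141,
    "B200": 192,
    "B300": 288,
}

needed = MODEL_GB * HEADROOM  # 200 GB with these assumptions
for card, vram in CARDS_GB.items():
    verdict = "fits" if vram >= needed else "does not fit"
    print(f"{card}: {vram} GB -> {verdict} ({needed:.0f} GB needed)")
```

With these assumptions, only the B300 fits on a single card, which matches the point made later in the thread that even a B200 leaves no comfortable headroom.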
I see, thanks for explaining. Would it be possible to consider renting GPUs? On a platform I’m familiar with, an H200 costs about €0.5/hour. Do you have an estimate of how long Heretic would take for a model of this size?
I guess it needs multiple rounds of work to get the best result, so a time estimate is not reliable.
I see, thanks for explaining. Would it be possible to consider renting GPUs?
Yes, it's possible. The problem is that there is a lot of trial and error involved in the process: many experiments and tests to run, all of which take a lot of time. Many models have their own quirks, and you have to find what "makes them tick", which means testing and re-running trials until you get it right. Every model needs to be tweaked and tested multiple times, and each one takes its own time; typically, the more parameters a model has, the longer Heretication takes. You also don't know how many trials you should run: some models give you the best result at Trial 150, others at Trial 248, others at Trial 600. It's impossible to know in advance.
A great example is this:
https://huggingface.co/RadicalNotionAI/Qwen3.5-397B-A17B-heretic
This is the only person to ever Hereticate this model (the rest are just cloned repos of this release), and the model is genuinely Hereticated, as it now has low refusals. But look at the cost:
You can already see the bad sign: a KL divergence of 0.3823. Then you look at the UGI Leaderboard:
Notice the large amount of quality loss caused by that 0.3823 KL divergence. The issue is that running Heretic on this model costs a lot of money very fast; you can't just retry until you get a great result, because every hour of multi-GPU renting costs real money. As far as I can tell, nobody else has run Heretic on this model, for obvious reasons.
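For readers unfamiliar with the metric: KL divergence here measures how far the abliterated model's next-token distribution has drifted from the original model's, so a higher value means more behavioral damage. A toy sketch of the calculation, where only the 0.3823 figure comes from the model card and the probability distributions are made up for illustration:

```python
# Toy KL divergence demo: KL(P || Q) over a 3-token vocabulary.
# P is the base model's (made-up) next-token distribution; the two Q's
# show mild vs. heavy drift. Real evaluations average over many tokens.

import math

def kl(p, q):
    """KL(P || Q) in nats, skipping zero-probability terms of P."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p       = [0.70, 0.20, 0.10]  # base model (hypothetical)
q_mild  = [0.65, 0.23, 0.12]  # small drift -> tiny KL
q_heavy = [0.30, 0.25, 0.45]  # large drift -> KL near the 0.38-0.40 range

print(f"mild drift:  KL = {kl(p, q_mild):.4f}")
print(f"heavy drift: KL = {kl(p, q_heavy):.4f}")
```

The point of the sketch: a KL around 0.38 corresponds to the model putting substantially different probability mass on its top tokens than the base model did, which is exactly the kind of drift that shows up as quality loss on a leaderboard.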
When I am at my own PC, I can make sure I release the best models and re-run trials with no issue. When I am renting GPUs, I have to operate everything through an interface every time instead of having all my own tools on my own machine, and I am limited in how much time I can use because money drains away every hour, so I cannot afford trial and error, testing, experiments, etc. As you can guess, that is a very different workflow.
On a platform I’m familiar with, an H200 costs about €0.5/hour. Do you have an estimate of how long Heretic would take for a model of this size?
The H200 is old and outdated; it "only" has 141GB of VRAM. The H200 was supplanted by the B200 (192GB of VRAM), and the B200 has now been supplanted by the B300 (288GB of VRAM). Runpod doesn't mention offering the B300 on their website, and on Runpod a B200 costs $5.49 per hour. Also, a B200 doesn't have enough VRAM to Hereticate DeepSeek-V4-Flash comfortably, as everything needs a good amount of headroom.
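To make the cost concern concrete, here is a back-of-envelope estimate using the $5.49/hr B200 price quoted above. The GPU count and hours-per-trial are my own assumptions for illustration; the trial counts are the examples mentioned earlier in this thread (best result landing anywhere from Trial 150 to Trial 600).

```python
# Back-of-envelope rental cost for a full Heretic run.
# Only the $5.49/hr figure comes from the thread; everything else
# (GPU count, time per trial) is an assumption for illustration.

RATE_USD_PER_HR = 5.49   # Runpod B200 price quoted above
N_GPUS = 2               # assumed: two B200s for headroom on a 160GB model
HOURS_PER_TRIAL = 0.25   # assumed average; varies heavily by model

for trials in (150, 248, 600):
    cost = RATE_USD_PER_HR * N_GPUS * HOURS_PER_TRIAL * trials
    print(f"{trials} trials: ~${cost:,.0f}")
```

Even under these optimistic assumptions, not knowing in advance whether the best trial comes at 150 or 600 means the budget can easily vary by a factor of four.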
Now, I am not saying I cannot do it; I definitely can. The issue is that I would also need a good amount of money to rent cloud GPUs, and I have no idea whether I would be able to release a high-quality version, due to all the issues and constraints mentioned in this post.
I guess it needs multiple rounds of work to get the best result, so a time estimate is not reliable.
Yes exactly, see my post above.

