How to suppress the CoT info from the output?

#11
by nailding1 - opened

I ran the Q4 and Q6 models via llama.cpp with the option "--reasoning off" and called them from openclaw. Both still output the thinking process, which is bad for end users. How can I suppress the CoT output?

For these distilled reasoning models, CoT is no longer a hard switch. Unlike earlier Qwen3 models, you can't reliably disable it with a boolean flag; you'll usually need to control it via the prompt/template or post-processing.
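
For the post-processing route, here is a minimal sketch rather than a definitive fix. It assumes the model wraps its reasoning in Qwen-style `<think>...</think>` tags, which this thread does not confirm, so check your raw output and adjust the tag names if they differ. The helper name `strip_reasoning` is hypothetical.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> (Qwen-style tags).
# Change the tag names if your model emits something different.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Return the model output with any reasoning block removed."""
    # Remove every complete <think>...</think> block, including newlines inside it.
    cleaned = _THINK_BLOCK.sub("", text)
    # Some chat templates pre-fill the opening tag, so only the closing tag
    # shows up in the generated text; drop everything up to and including it.
    if "</think>" in cleaned:
        cleaned = cleaned.split("</think>", 1)[1]
    return cleaned.strip()


if __name__ == "__main__":
    raw = "<think>\nThe user wants a greeting.\n</think>\n\nHello!"
    print(strip_reasoning(raw))
```

This only works on a complete response. With streaming, you would need to buffer the text until the closing tag arrives before forwarding anything to the user.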

Any workaround? I tried prompting in several different ways, but none of them worked. 😒 I confirmed that my suppression rules were successfully injected into the system prompt sent to the model. Could you please offer a template example for me to try?
Btw, post-processing is rather intrusive for my setup, so I would only turn to it as a low-priority fallback.

Thanks.
