Perfect parameters for coding!!! SHARE YOURS
Hey everybody,
Please share your parameters for webdev/coding use. I use opencode and it seems fine, though it sometimes crashes; I'm not sure whether that's the oh-my-openagent plugin or this gguf, and I don't want to dig further. But if anybody has good configs to share, please do. Here's mine:
c:\0_llama_server\llama-server ^
-m a:\0_LM_Studio\Jackrong\Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF\Qwen3.5-27B.Q6_K.gguf ^
-ngl 999 --threads 22 --ctx-size 196610 --alias qwen3.5:27b ^
--flash-attn on ^
--batch-size 1024 ^
--ubatch-size 256 ^
--host 0.0.0.0 ^
--repeat-penalty 1.1 ^
--presence-penalty 0.0 ^
--temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.00 ^
--port 8080 --no-context-shift ^
--spec-type ngram-mod --spec-ngram-size-n 24 --draft-min 48 --draft-max 64 ^
--jinja ^
--seed 3407 --reasoning-format deepseek --ctx-checkpoints 128 --reasoning 1
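For anyone tuning the sampler flags above, here's roughly what the --top-k/--top-p/--min-p/--temp chain does to the next-token distribution. This is a simplified pure-Python sketch for intuition, not llama.cpp's actual sampler code (the real chain also applies penalties and can reorder stages):

```python
import math
import random

def sample_next_token(logits, temp=0.6, top_k=20, top_p=0.95, min_p=0.0):
    """Sketch of a top-k -> top-p -> min-p -> temperature sampler chain.

    logits: dict mapping token -> raw logit.
    """
    # 1. top-k: keep only the k highest-logit tokens
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # 2. softmax over the survivors (subtract max for numerical stability)
    m = max(v for _, v in items)
    exps = [(t, math.exp(v - m)) for t, v in items]
    z = sum(e for _, e in exps)
    probs = sorted(((t, e / z) for t, e in exps), key=lambda kv: kv[1], reverse=True)

    # 3. top-p (nucleus): keep the smallest prefix whose cumulative prob >= top_p
    kept, cum = [], 0.0
    for t, p in probs:
        kept.append((t, p))
        cum += p
        if cum >= top_p:
            break

    # 4. min-p: drop tokens below min_p * (top prob); a no-op at 0.00 as in the config
    cap = kept[0][1] * min_p
    kept = [(t, p) for t, p in kept if p >= cap]

    # 5. temperature: p ** (1/T) is equivalent to dividing logits by T, then sample
    weights = [p ** (1.0 / temp) for _, p in kept]
    r = random.random() * sum(weights)
    for (t, _), w in zip(kept, weights):
        r -= w
        if r <= 0:
            return t
    return kept[-1][0]
```

With min-p at 0.00 that stage never removes anything, so in practice the config above filters with top-k 20 and top-p 0.95 only.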
Deployed on one DGX Spark; the Mac connects via LM Studio link + CC-Switch, VSCode terminal, single Claude session: it sometimes crashes mid-generation. However, when I create 4 agents to work together it looks fine, prompting in and out at 200-500 tokens.
And the LM Studio config with the Q4_K_M:
LM_Studio\Jackrong\Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF\Qwen3.5-27B.Q4_K_M.gguf
Context Length 262144, GPU Offload 64, CPU Thread Pool Size 15, Eval Batch Size 512, Max Concurrent Predictions 4,
Temp 0.5, Context Overflow: Rolling Window
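For sanity-checking either server before pointing opencode at it, a plain request to the OpenAI-compatible /v1/chat/completions endpoint works (llama-server exposes it on the --port above). A stdlib-only sketch; the host, port, and prompt are placeholders for your setup:

```python
import json
import urllib.request

def build_payload(prompt, temp=0.6, top_p=0.95, max_tokens=256):
    """Build an OpenAI-style chat request mirroring the server's sampler settings."""
    return {
        "model": "qwen3.5:27b",  # matches the --alias in the llama-server command
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temp,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

def ask(prompt, host="http://127.0.0.1:8080"):
    """Send one chat completion request and return the assistant's reply text."""
    req = urllib.request.Request(
        host + "/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Write a Python one-liner that reverses a string."))
```

If this round-trips cleanly but opencode still crashes mid-generation, that points at the client/plugin side rather than the gguf or the server flags.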