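🦅 Shaheen-Gemma4-Urdu 🦅
"نہیں تیرا نشیمن قصرِ سلطانی کے گنبد پر"
"تو شاہیں ہے، بسیرا کر پہاڑوں کی چٹانوں میں"
(English: "Your abode is not on the dome of a royal palace; you are a falcon, so make your home among the mountain rocks.")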
📖 Overview
Shaheen-Gemma4-Urdu is a high-performance Urdu language model developed by Khurram Pervez (Khurram123). It is fine-tuned on 51,686 high-quality Urdu instruction samples to provide deep linguistic understanding, formal vocabulary, and cultural nuance.
This repository contains both the full 16-bit Safetensors weights and a quantized GGUF version.
🚀 Key Features
- Dual-Format: Includes Safetensors for `transformers` and GGUF for `llama.cpp`.
- Architecture: Based on the state-of-the-art Gemma 4 (2B).
- Urdu Fluency: Specifically tuned to handle complex Urdu grammar and formal literature.
- Speed: Delivers an impressive ~94 tokens per second on an NVIDIA RTX 4060 Ti.
- Academic Precision: Draws on the author's background in mathematics to encourage logical consistency in Urdu reasoning.
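As a rough illustration of what the quoted throughput means in practice, the sketch below estimates how long generation takes for a given number of tokens. It assumes the ~94 tokens per second figure holds steadily and ignores model loading and prompt processing time. The helper name `estimate_seconds` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: estimate generation time from a tokens-per-second rate.
# Assumes steady throughput; ignores model loading and prompt processing.
estimate_seconds() {
  # $1 = number of tokens to generate, $2 = tokens per second
  awk -v n="$1" -v r="$2" 'BEGIN { printf "%.2f\n", n / r }'
}

# 128 tokens at ~94 tokens/s
estimate_seconds 128 94   # prints 1.36
```

Actual timings depend on quantization, context length, and how many layers are offloaded to the GPU, so treat the result as an order-of-magnitude guide only.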
🛠️ Technical Specifications
- Dataset: `large-traversaal/urdu-instruct`
- Training Time: ~2 hours (1 full epoch).
- Final Loss: 1.118, showing strong generalization.
- Format 1: `model.safetensors` (10.2 GB)
- Format 2: `Shaheen-Gemma4-Urdu-Q4_K_M.gguf` (3.43 GB)
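To make the size difference between the two formats concrete, the following sketch computes the ratio from the file sizes listed above. The variable names are illustrative, and the figures describe download size only; they say nothing about output quality.

```shell
# File sizes in GB, taken from the list above.
full_gb=10.2
quant_gb=3.43

# Ratio of the 16-bit file to the Q4_K_M file, and the GGUF size as a percentage.
size_ratio=$(awk -v f="$full_gb" -v q="$quant_gb" 'BEGIN { printf "%.2f", f / q }')
size_percent=$(awk -v f="$full_gb" -v q="$quant_gb" 'BEGIN { printf "%.1f", 100 * q / f }')

echo "Safetensors is ${size_ratio}x the GGUF size; GGUF is ${size_percent}% of the Safetensors size"
```

By these numbers the GGUF file is roughly a third of the size of the Safetensors file.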
💻 Usage Instructions
1. Using with llama.cpp (GGUF)
Run the quantized version directly on your GPU:
./llama-cli -m Shaheen-Gemma4-Urdu-Q4_K_M.gguf \
-p "<start_of_turn>user\nاسلام علیکم! آپ کیسے ہیں؟<end_of_turn>\n<start_of_turn>model\n" \
-n 128 --n-gpu-layers 33
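The prompt in the command above wraps the user message in Gemma turn markers. A minimal sketch of a shell helper that builds the same prompt for any message is shown below. The function name `build_prompt` is hypothetical, and the helper handles a single user turn only.

```shell
# Hypothetical helper: wrap one user message in the Gemma turn markers
# that appear in the command above.
build_prompt() {
  printf '<start_of_turn>user\n%s<end_of_turn>\n<start_of_turn>model\n' "$1"
}

# Print the prompt for the greeting used in the example.
build_prompt "اسلام علیکم! آپ کیسے ہیں؟"

# Possible use with llama-cli (requires the GGUF file; not run here):
# ./llama-cli -m Shaheen-Gemma4-Urdu-Q4_K_M.gguf -p "$(build_prompt 'your question')" -n 128 --n-gpu-layers 33
```

The helper emits real newlines rather than the literal `\n` sequences used in the command above, and `$(...)` strips the final newline after `model`. Whether that trailing newline affects the output is not something this card states, so compare both forms if the results look off.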
Model tree for Khurram123/Shaheen-Gemma4-Urdu
- Base model: `google/gemma-4-E2B-it`