🦅 Shaheen-Gemma4-Urdu 🦅

*(Shaheen logo)*

"نہیں تیرا نشیمن قصرِ سلطانی کے گنبد پر"
"تو شاہیں ہے، بسیرا کر پہاڑوں کی چٹانوں میں"

*"Your nest is not on the dome of the royal palace; you are a shaheen (falcon), so make your abode on the rocks of the mountains."* (Allama Iqbal)


📖 Overview

Shaheen-Gemma4-Urdu is a high-performance Urdu language model developed by Khurram Pervez (Khurram123). It is fine-tuned on 51,686 high-quality Urdu instruction samples to provide deep linguistic understanding, formal vocabulary, and cultural nuance.

This repository contains both the Full 16-bit Safetensors and the Quantized GGUF versions.

🚀 Key Features

  • Dual-Format: Includes Safetensors for transformers and GGUF for llama.cpp.
  • Architecture: Based on the state-of-the-art Gemma 4 (2B).
  • Urdu Fluency: Specifically tuned to handle complex Urdu grammar and formal literature.
  • Speed: Delivers an impressive ~94 tokens per second on an NVIDIA RTX 4060 Ti.
  • Academic Precision: Leverages the developer's background in Mathematics to promote logical consistency in Urdu reasoning.

🛠️ Technical Specifications

  • Dataset: large-traversaal/urdu-instruct
  • Training Time: ~2 hours (1 full epoch).
  • Final Training Loss: 1.118 after one epoch, indicating a good fit to the instruction data.
  • Format 1: model.safetensors (10.2 GB)
  • Format 2: Shaheen-Gemma4-Urdu-Q4_K_M.gguf (3.43 GB)
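
The two file sizes are consistent with a back-of-the-envelope check: a 16-bit checkpoint stores 2 bytes per parameter, while Q4_K_M averages roughly 4.8 bits per weight (an assumed figure; the exact average varies by tensor, and embedding/output layers are usually kept at higher precision, which is why the real GGUF is slightly larger than the estimate). A rough sketch:

```python
# Rough size check for the two formats listed above.
# Assumptions: 2 bytes/param for the 16-bit safetensors file,
# ~4.8 bits/weight average for Q4_K_M quantization.
fp16_bytes = 10.2e9           # reported model.safetensors size
params = fp16_bytes / 2       # ≈ 5.1e9 weights implied by the 16-bit file
q4_km_bytes = params * 4.8 / 8

print(f"implied parameter count: {params / 1e9:.1f}B")
print(f"estimated Q4_K_M size:  {q4_km_bytes / 1e9:.2f} GB (reported: 3.43 GB)")
```

The small gap between the estimate and the reported 3.43 GB is expected: GGUF files also carry metadata and mixed-precision tensors.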

💻 Usage Instructions

1. Using with llama.cpp (GGUF)

Run the quantized version directly on your GPU:

```bash
./llama-cli -m Shaheen-Gemma4-Urdu-Q4_K_M.gguf \
    -e -p "<start_of_turn>user\nالسلام علیکم! آپ کیسے ہیں؟<end_of_turn>\n<start_of_turn>model\n" \
    -n 128 --n-gpu-layers 33
```

(The `-e` flag tells llama.cpp to process the `\n` escape sequences in the prompt.)
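
The `-p` argument above hand-writes Gemma's chat-turn template. As a minimal sketch (the `gemma_prompt` helper is my own name, not a library function), the same template can be built programmatically before passing it to the CLI or another inference API:

```python
def gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat-turn template,
    leaving the model turn open so generation continues from it."""
    return (
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Reproduces the prompt string passed to llama-cli above.
prompt = gemma_prompt("السلام علیکم! آپ کیسے ہیں؟")
print(prompt)
```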
