🦅 Shaheen-Gemma4-Urdu 🦅

*(Shaheen logo)*

"نہیں تیرا نشیمن قصرِ سلطانی کے گنبد پر"
"تو شاہیں ہے، بسیرا کر پہاڑوں کی چٹانوں میں"

*"Your nest is not on the dome of the royal palace; you are a shaheen (falcon), so make your abode on the rocks of the mountains."* (Allama Iqbal)


📖 Overview

Shaheen-Gemma4-Urdu is a high-performance Urdu language model developed by Khurram Pervez (Khurram123). It is fine-tuned on 51,686 high-quality Urdu instruction samples to provide deep linguistic understanding, formal vocabulary, and cultural nuance.

This repository contains both the Full 16-bit Safetensors and the Quantized GGUF versions.

🚀 Key Features

  • Dual-Format: Includes Safetensors for transformers and GGUF for llama.cpp.
  • Architecture: Based on the state-of-the-art Gemma 4 (2B).
  • Urdu Fluency: Specifically tuned to handle complex Urdu grammar and formal literature.
  • Speed: Delivers an impressive ~94 tokens per second on an NVIDIA RTX 4060 Ti.
  • Academic Precision: Leverages the developer's background in Mathematics to promote logical consistency in Urdu reasoning.

🛠️ Technical Specifications

  • Dataset: large-traversaal/urdu-instruct
  • Training Time: ~2 hours (1 full epoch).
  • Final Training Loss: 1.118 after one epoch, indicating a good fit to the instruction data.
  • Format 1: model.safetensors (10.2 GB)
  • Format 2: Shaheen-Gemma4-Urdu-Q4_K_M.gguf (3.43 GB)
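
The two file sizes are consistent with a back-of-the-envelope check: a 16-bit checkpoint stores 2 bytes per parameter, while Q4_K_M averages roughly 4.8 bits per weight (an assumed figure; the exact average varies by tensor, and embedding/output layers are usually kept at higher precision, which is why the real GGUF is slightly larger than the estimate). A rough sketch:

```python
# Rough size check for the two formats listed above.
# Assumptions: 2 bytes/param for the 16-bit safetensors file,
# ~4.8 bits/weight average for Q4_K_M quantization.
fp16_bytes = 10.2e9           # reported model.safetensors size
params = fp16_bytes / 2       # ≈ 5.1e9 weights implied by the 16-bit file
q4_km_bytes = params * 4.8 / 8

print(f"implied parameter count: {params / 1e9:.1f}B")
print(f"estimated Q4_K_M size:  {q4_km_bytes / 1e9:.2f} GB (reported: 3.43 GB)")
```

The small gap between the estimate and the reported 3.43 GB is expected: GGUF files also carry metadata and mixed-precision tensors.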

💻 Usage Instructions

1. Using with llama.cpp (GGUF)

Run the quantized version directly on your GPU:

```bash
./llama-cli -m Shaheen-Gemma4-Urdu-Q4_K_M.gguf \
    -e -p "<start_of_turn>user\nالسلام علیکم! آپ کیسے ہیں؟<end_of_turn>\n<start_of_turn>model\n" \
    -n 128 --n-gpu-layers 33
```

(The `-e` flag tells llama.cpp to process the `\n` escape sequences in the prompt.)
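
The `-p` argument above hand-writes Gemma's chat-turn template. As a minimal sketch (the `gemma_prompt` helper is my own name, not a library function), the same template can be built programmatically before passing it to the CLI or another inference API:

```python
def gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat-turn template,
    leaving the model turn open so generation continues from it."""
    return (
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Reproduces the prompt string passed to llama-cli above.
prompt = gemma_prompt("السلام علیکم! آپ کیسے ہیں؟")
print(prompt)
```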
