tawkeed-embedding

tawkeed-embedding is an Arabic-first text embedding model built by Tawkeed, fine-tuned for on-device and edge AI deployment.

Forked from BAAI/bge-m3 and fine-tuned on Arabic semantic similarity and retrieval data, this model powers Arabic search, RAG, and similarity tasks running natively on Tawkeed devices.

Highlights

  • Arabic-first embeddings: fine-tuned and evaluated on Arabic text for semantic similarity and retrieval
  • Edge-optimized: efficient enough to run embedding pipelines on Tawkeed edge hardware
  • Production-ready: validated on Arabic retrieval and similarity benchmarks
  • Multilingual: retains the multilingual capability of its BGE-M3 base

Model Details

Property      Value
Base Model    BAAI/bge-m3
Language      Arabic (ar), English (en), + multilingual
License       MIT
Task          Text Embedding / Retrieval / Similarity
Fine-tuning   Arabic semantic similarity & retrieval data
Deployment    On-device / Edge / Cloud

Training

The model was produced in three steps:

  1. Fork the BGE-M3 multilingual embedding model
  2. Fine-tune it on Arabic semantic similarity and retrieval datasets
  3. Evaluate it on Arabic retrieval benchmarks
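
The benchmarks and metrics used in step 3 are not named here. As an illustration only, retrieval quality is often summarized with metrics such as recall@k and mean reciprocal rank (MRR). The sketch below computes both for made-up results; the function names, query ids, and document ids are hypothetical and are not taken from Tawkeed's evaluation.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0


# Hypothetical evaluation data: ranked results per query and gold labels.
results = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9", "d4"],
}
gold = {
    "q1": {"d1"},
    "q2": {"d4", "d5"},
}

# Average each metric over all queries.
mean_recall = sum(recall_at_k(results[q], gold[q], k=2) for q in results) / len(results)
mrr = sum(reciprocal_rank(results[q], gold[q]) for q in results) / len(results)
print(mean_recall, mrr)
```

In a real evaluation, the ranked ids would come from sorting a corpus by embedding similarity to each query, and the gold labels from the benchmark's relevance judgments.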

Usage

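# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded on first use).
model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")

sentences = [
    # "Artificial intelligence is changing the world"
    "الذكاء الاصطناعي يغير العالم",
    # "Deep learning techniques are evolving rapidly"
    "تقنيات التعلم العميق تتطور بسرعة",
    # "The weather is nice today"
    "الطقس جميل اليوم",
]

# encode() returns one embedding vector per input sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (number of sentences, embedding dimension)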
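
The vectors returned by model.encode can be compared with cosine similarity to rank documents against a query, which is the basis of the search and similarity tasks described earlier. The following is a minimal sketch, not Tawkeed's implementation: cosine and rank are hypothetical helper names, and the short vectors are stand-ins for real embeddings so the example runs without downloading the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional stand-ins for embeddings (illustrative values only).
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank(query, docs)
print([i for i, _ in ranking])  # [1, 2, 0]
```

With the real model, one row of embeddings can serve as the query and the remaining rows as the documents. The sentence-transformers library also provides util.cos_sim for the same comparison on whole batches.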

Tawkeed Model Family

A complete suite of Arabic AI models — from compact edge models to large-scale MoE — all fine-tuned and tested for Arabic.

Model                          Size        Type
tawkeed-sa/tawkeed-0.8b        0.8b        Arabic LLM
tawkeed-sa/tawkeed-2b          2b          Arabic LLM
tawkeed-sa/tawkeed-4b          4b          Arabic LLM
tawkeed-sa/tawkeed-9b          9b          Arabic LLM
tawkeed-sa/tawkeed-27b         27b         Arabic LLM
tawkeed-sa/tawkeed-40b         40b         Arabic LLM
tawkeed-sa/tawkeed-27b-MLX     27b 8-bit   LLM for Apple Silicon (MLX)
tawkeed-sa/tawkeed-27b-GGUF    27b Q8_0    LLM for Ollama / llama.cpp
tawkeed-sa/tawkeed-ocr         -           OCR
tawkeed-sa/tawkeed-embedding   -           Embedding

About Tawkeed

Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.

Built by Tawkeed.
