GTE Multilingual Base
Hugging Face
A compact yet high-performance embedding model from the GTE family, designed for multilingual retrieval and long-context text representation. It focuses on delivering strong retrieval accuracy while keeping hardware and inference requirements low, making it well suited for production RAG systems that need speed, scalability, and multilingual coverage without relying on large decoder-only models.
Key features:
- Strong multilingual retrieval: Achieves state-of-the-art results on multilingual and cross-lingual retrieval benchmarks for models of similar size
- Efficient architecture: Uses an encoder-only transformer design that delivers significantly faster inference and lower hardware requirements than decoder-only embedding models of comparable quality
- Long-context support: Handles inputs up to 8192 tokens for long-document retrieval
- Elastic embeddings: Supports flexible output dimensions to reduce storage costs while preserving downstream performance
- Hybrid retrieval support: Generates both dense embeddings and sparse token weights for dense, sparse, or hybrid search pipelines
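The elastic-embedding and hybrid-retrieval ideas above can be illustrated without loading the model itself. A minimal sketch, assuming Matryoshka-style truncation for the elastic embeddings and a simple weighted sum for hybrid scoring (the dimensions, `alpha` weight, and function names are illustrative assumptions, not the model's actual API):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Elastic embeddings: keep the first `dim` components, then
    L2-renormalize so cosine similarity stays meaningful."""
    v = vec[:dim]
    return v / np.linalg.norm(v)

def hybrid_score(dense_sim: float, sparse_sim: float, alpha: float = 0.7) -> float:
    """Combine dense and sparse similarities with a weighted sum.
    `alpha` is a tuning knob, not a value prescribed by the model."""
    return alpha * dense_sim + (1 - alpha) * sparse_sim

# Toy unit-norm vectors standing in for full-dimension embeddings.
rng = np.random.default_rng(0)
q, d = rng.normal(size=768), rng.normal(size=768)
q, d = q / np.linalg.norm(q), d / np.linalg.norm(d)

# Cosine similarity at full vs. reduced dimension; storage shrinks
# proportionally while rankings should stay broadly similar.
full_sim = float(q @ d)
small_sim = float(truncate_embedding(q, 256) @ truncate_embedding(d, 256))
```

In a real pipeline the dense vectors and sparse token weights would come from the model's two output heads; the truncate-and-renormalize step is what lets the same index trade storage for a small accuracy cost.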
Resources
Articles, Blogs, Essays
Tags:
ai
model
embedding
Last modified 07 May 2026