Difference between revisions of "OSGeo-Knowledge-Embedding-Server"
Revision as of 10:00, 5 June 2026
BGE-M3 Embedding Server - Semantic search embeddings via llama.cpp
Runs in Gallery_Container
Listens only on the localhost interface, port 8094. Model: bge-m3-Q8_0.gguf
The embedding model running on osgeo7-gallery is BGE-M3 (bge-m3-Q8_0.gguf), a multilingual model from BAAI supporting 100+ languages. It produces 1024-dimensional vectors and runs via llama-server on port 8094 (localhost only). The Q8_0 quantization keeps it light: 606 MB on disk, ~457 MB RAM, ~12 ms per query.
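As an illustration of how a client might query the server described above, here is a minimal Python sketch. It assumes that llama-server was started with embeddings enabled and exposes its OpenAI-compatible /v1/embeddings route on port 8094; the route, the request shape, and the helper names (build_request, parse_embeddings, embed, cosine_similarity) are assumptions for illustration, not taken from the actual deployment.

```python
import json
import math
import urllib.request

# Assumption: llama-server serves the OpenAI-compatible embeddings route here.
EMBEDDING_URL = "http://127.0.0.1:8094/v1/embeddings"
EXPECTED_DIM = 1024  # BGE-M3 vector size stated on this page


def build_request(texts):
    """Build the POST request for a batch of input strings."""
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDING_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload, expected_dim=EXPECTED_DIM):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    for vec in vectors:
        if len(vec) != expected_dim:
            raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    return vectors


def embed(texts, timeout=10.0):
    """Send the texts to the local server and return one vector per text."""
    with urllib.request.urlopen(build_request(texts), timeout=timeout) as resp:
        return parse_embeddings(json.load(resp))


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Because the server listens only on localhost, a script like this would have to run inside the container itself. A call such as `embed(["map projection", "Kartenprojektion"])` would then return two 1024-dimensional vectors that can be compared with `cosine_similarity`.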
Contact: ominiverdi (Lorenzo Becchi) or SAC channel