Difference between revisions of "OSGeo-Knowledge-Embedding-Server"

From OSGeo
Line 1: Line 1:
'''BGE-M3 Embedding Server''' - Semantic search embeddings via llama.cpp
+
= OSGeo-Knowledge-Embedding-Server =
  
Runs in [[Gallery_Container]]
+
BGE-M3 embedding server for semantic search, running on [[Gallery_Container]] when active.
  
Listens on localhost only, port 8094. Model: bge-m3-Q8_0.gguf
+
It provides local embeddings for services such as [[Osgeo-library]] and other OSGeo AI/search tooling.
  
The embedding model running on osgeo7-gallery is BGE-M3 (bge-m3-Q8_0.gguf), a multilingual model from BAAI supporting 100+ languages. It produces 1024-dimensional vectors and runs via llama-server on port 8094 (localhost only). The Q8_0 quantization keeps it light: 606 MB on disk, ~457 MB of RAM, and ~12 ms per query.
+
== Model ==
  
'''Contact:''' ominiverdi (Lorenzo Becchi) or SAC channel
+
Verified model file on <code>osgeo7-gallery</code>:
 +
 
 +
<pre>
 +
/home/ominiverdi/models/bge-m3-Q8_0.gguf
 +
</pre>
 +
 
 +
Model size:
 +
 
 +
<pre>
 +
606 MB
 +
</pre>
 +
 
 +
Model details:
 +
 
 +
* Model family: BGE-M3
 +
* Upstream: BAAI
 +
* Quantization: Q8_0
 +
* Vector size: 1024 dimensions
 +
* Multilingual: supports 100+ languages
 +
 
 +
== Endpoint ==
 +
 
 +
Configured by [[Osgeo-library]] as:
 +
 
 +
<pre>
 +
http://localhost:8094/embedding
 +
</pre>
 +
 
 +
Verified config files:
 +
 
 +
<pre>
 +
/home/ominiverdi/github/osgeo-library/config.example.toml
 +
/home/ominiverdi/github/osgeo-library/config.toml
 +
</pre>
 +
 
 +
The endpoint is intended to listen on localhost only.
 +
 
 +
== Runtime ==
 +
 
 +
The service is expected to run through llama.cpp / <code>llama-server</code>.
 +
 
 +
Check listener status:
 +
 
 +
<pre>
 +
ss -ltnp | grep 8094
 +
</pre>
 +
 
 +
Check process status:
 +
 
 +
<pre>
 +
pgrep -af "llama|bge|8094|embedding"
 +
</pre>
 +
 
 +
At the time of the last check, the model file and client configuration were present, but no active listener on port <code>8094</code> was observed.
 +
 
 +
== Consumers ==
 +
 
 +
* [[Osgeo-library]]
 +
* [[LLM-based Chat Assistant]]
 +
 
 +
'''Contact:''' ominiverdi (Lorenzo Becchi) or the SAC channel.
  
 
[[Category:Services]]
 
 
[[Category:AI-Services]]
 

Revision as of 10:59, 5 June 2026

OSGeo-Knowledge-Embedding-Server

BGE-M3 embedding server for semantic search, running on Gallery_Container when active.

It provides local embeddings for services such as Osgeo-library and other OSGeo AI/search tooling.

Model

Verified model file on osgeo7-gallery:

/home/ominiverdi/models/bge-m3-Q8_0.gguf

Model size:

606 MB

Model details:

  • Model family: BGE-M3
  • Upstream: BAAI
  • Quantization: Q8_0
  • Vector size: 1024 dimensions
  • Multilingual: supports 100+ languages
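
A common way to use 1024-dimensional embeddings for semantic search is to compare a query vector against document vectors by cosine similarity. The sketch below illustrates that idea with the standard library only. The names <code>EXPECTED_DIM</code>, <code>check_dim</code> and <code>cosine_similarity</code> are hypothetical and are not taken from Osgeo-library or any other consumer.

```python
import math

EXPECTED_DIM = 1024  # vector size listed in the model details above


def check_dim(vec, expected=EXPECTED_DIM):
    """Raise ValueError if the vector does not have the expected length."""
    if len(vec) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vec)}")
    return vec


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Calling <code>check_dim</code> on each returned vector is a cheap guard against a different model being loaded on the same port by mistake.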

Endpoint

Configured by Osgeo-library as:

http://localhost:8094/embedding

Verified config files:

/home/ominiverdi/github/osgeo-library/config.example.toml
/home/ominiverdi/github/osgeo-library/config.toml

The endpoint is intended to listen on localhost only.
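
A client request to this endpoint might look like the following sketch. It assumes the llama-server <code>/embedding</code> route accepts a JSON body with a <code>content</code> field, and that the response is either an object with an <code>embedding</code> key or a list of such objects. The response shape differs between llama.cpp versions and has not been verified against this deployment. The function names are hypothetical.

```python
import json
import urllib.request

EMBEDDING_URL = "http://localhost:8094/embedding"  # endpoint from the section above


def extract_vector(payload):
    """Pull one flat list of floats out of a decoded JSON response.

    Assumption: the response is either {"embedding": [...]} or a list of
    such objects, and the vector may be wrapped in one extra list.
    """
    if isinstance(payload, list):
        if not payload:
            raise ValueError("empty response list")
        payload = payload[0]
    vec = payload["embedding"]
    if vec and isinstance(vec[0], list):
        vec = vec[0]
    return [float(x) for x in vec]


def embed(text, url=EMBEDDING_URL, timeout=10.0):
    """POST text to the embedding endpoint and return its vector."""
    body = json.dumps({"content": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_vector(json.load(resp))
```

Since the server is bound to localhost, <code>embed("some query text")</code> only works when run on the container itself, and it raises <code>urllib.error.URLError</code> when no listener is active.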

Runtime

The service is expected to run through llama.cpp / llama-server.

Check listener status:

ss -ltnp | grep 8094

Check process status:

pgrep -af "llama|bge|8094|embedding"

At the time of the last check, the model file and client configuration were present, but no active listener on port 8094 was observed.
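
The listener check above can also be scripted, for example so that a consumer can fail early when the server is down. A minimal sketch, assuming a plain TCP connect to the loopback address is an acceptable stand-in for <code>ss</code>; the function name <code>is_listening</code> is hypothetical.

```python
import socket


def is_listening(host="127.0.0.1", port=8094, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

This only shows that something accepts connections on the port. It does not confirm that the process is llama-server or that the BGE-M3 model is loaded.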

Consumers

  • Osgeo-library
  • LLM-based Chat Assistant

Contact: ominiverdi (Lorenzo Becchi) or the SAC channel.