Latest revision as of 11:15, 5 June 2026

OSGeo-Knowledge-Embedding-Server

BGE-M3 embedding server for semantic search, running on Gallery_Container.

It provides local embeddings for Osgeo-library and other OSGeo AI/search tooling.

Model

Verified model file on osgeo7-gallery:

/home/ominiverdi/models/bge-m3-Q8_0.gguf

Model details:

Model family: BGE-M3
Upstream: BAAI
Quantization: Q8_0
File size: 606 MB
Vector size: 1024 dimensions
Multilingual: supports 100+ languages
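Semantic search over these vectors usually ranks documents by cosine similarity to the query vector. This page does not say which metric Osgeo-library uses, so the following is only a minimal sketch under that assumption. The helper names are hypothetical; the only value taken from this page is the 1024-dimension vector size.

```python
import math

EXPECTED_DIM = 1024  # vector size listed in the model details above


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


def check_dimension(vector, expected=EXPECTED_DIM):
    """Return True when a vector has the size this model should produce."""
    return len(vector) == expected
```

A dimension check like this is a cheap way to notice that a client is talking to a different model than expected.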

Endpoint

The service listens on localhost only:

http://localhost:8094/embedding

Configured by Osgeo-library in:

/home/ominiverdi/github/osgeo-library/config.toml
/home/ominiverdi/github/osgeo-library/config.example.toml
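A client on the same host could call the endpoint roughly as sketched below. The URL comes from this page; the "content" request field and the response shapes handled here are assumptions about the installed llama-server version and should be checked against its documentation. All function names are hypothetical.

```python
import json
import urllib.request

EMBEDDING_URL = "http://localhost:8094/embedding"  # from the Endpoint section


def build_request(text, url=EMBEDDING_URL):
    """Build a JSON POST request; the "content" field name is an assumption."""
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_vector(payload):
    """Pull the first embedding out of a decoded JSON response.

    Handles two assumed shapes: {"embedding": [...]} and
    [{"embedding": [...]}], where the vector may be nested one level deeper.
    """
    item = payload[0] if isinstance(payload, list) else payload
    vector = item["embedding"]
    if vector and isinstance(vector[0], list):
        vector = vector[0]
    return vector


def embed(text, url=EMBEDDING_URL, timeout=10):
    """Send text to the server and return its embedding vector."""
    request = build_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_vector(json.load(resp))
```

Because the service listens on localhost only, embed() works only when run on the host itself or through a tunnel to it.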

Runtime

The service runs through llama.cpp / llama-server.

Startup script:

/home/ominiverdi/github/osgeo-library/servers/bge-m3-cpu.sh

The script uses:

/home/ominiverdi/llama.cpp/build/bin/llama-server
/home/ominiverdi/models/bge-m3-Q8_0.gguf

It binds to:

127.0.0.1:8094

Automatic startup

The service is started at reboot from the ominiverdi crontab:

@reboot cd ~/github/osgeo-library && nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null

Related API startup entry:

@reboot ~/github/osgeo-library/servers/start-server.sh >> ~/logs/osgeo-library.log 2>&1

Operations

Check listener status:

ss -ltnp | grep 8094

Expected listener:

127.0.0.1:8094
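The same listener check can be scripted, for example from a monitoring job. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the host and port defaults are the values listed above.

```python
import socket


def is_listening(host="127.0.0.1", port=8094, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A successful connection only shows that something accepts connections on the port; it does not confirm that the process is llama-server.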

Check process status:

pgrep -af "llama-server.*8094|bge-m3"

Check health:

curl http://127.0.0.1:8094/health

Expected result:

{"status":"ok"}
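For scripted checks, the health response can be parsed instead of compared as a string. The sketch below assumes the response body is the JSON object shown above; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

HEALTH_URL = "http://127.0.0.1:8094/health"  # from the health check above


def is_ok(body):
    """Return True when a health response body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_health(url=HEALTH_URL, timeout=5):
    """Return True when the server answers the health check with status "ok"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_ok(resp.read())
    except (urllib.error.URLError, OSError):
        # Server down, still starting, or not reachable.
        return False
```

check_health() returns False rather than raising, so it can be used directly in a cron or monitoring script.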

Check from Osgeo-library:

cd /home/ominiverdi/github/osgeo-library
/home/ominiverdi/github/osgeo-library/.venv/bin/python - <<'PY'
import asyncio
from doclibrary.servers.mcp import get_library_status
async def main():
    print(await get_library_status())
asyncio.run(main())
PY

Expected status includes:

Embedding server: OK
Database: OK

View logs:

tail -f /home/ominiverdi/logs/bge-m3-cpu.log

Start manually:

cd /home/ominiverdi/github/osgeo-library
nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null &

Stop manually:

pkill -f "llama-server.*8094"

Consumers

Osgeo-library
LLM-based Chat Assistant

Contact: ominiverdi, Lorenzo Becchi, or SAC channel.

@@ Line 1: / Line 1: @@
-* '''BGE-M3 Embedding Server''' - Semantic search embeddings via llama.cpp
+= OSGeo-Knowledge-Embedding-Server =
-** Port: 8094 (localhost only), Model: bge-m3-Q8_0.gguf
-Runs in [[Gallery_Container]]
+BGE-M3 embedding server for semantic search, running on [[Gallery_Container]].
-The embedding model running on osgeo7-gallery is BGE-M3 (bge-m3-Q8_0.gguf), a multilingual model from BAAI supporting 100+ languages. It produces 1024-dimensional vectors and runs via llama-server on port 8094 (localhost only). The Q8_0 quantization keeps it light: 606 MB on disk, ~457 MB RAM, ~12ms per query.
+It provides local embeddings for [[Osgeo-library]] and other OSGeo AI/search tooling.
-'''Contact:''' ominiverdi (Lorenzo Becchi) or SAC channel
+== Model ==
+Verified model file on <code>osgeo7-gallery</code>:
+<pre>
+/home/ominiverdi/models/bge-m3-Q8_0.gguf
+</pre>
+Model details:
+* Model family: BGE-M3
+* Upstream: BAAI
+* Quantization: Q8_0
+* File size: 606 MB
+* Vector size: 1024 dimensions
+* Multilingual: supports 100+ languages
+== Endpoint ==
+The service listens on localhost only:
+<pre>
+http://localhost:8094/embedding
+</pre>
+Configured by [[Osgeo-library]] in:
+<pre>
+/home/ominiverdi/github/osgeo-library/config.toml
+/home/ominiverdi/github/osgeo-library/config.example.toml
+</pre>
+== Runtime ==
+The service runs through llama.cpp / <code>llama-server</code>.
+Startup script:
+<pre>
+/home/ominiverdi/github/osgeo-library/servers/bge-m3-cpu.sh
+</pre>
+The script uses:
+<pre>
+/home/ominiverdi/llama.cpp/build/bin/llama-server
+/home/ominiverdi/models/bge-m3-Q8_0.gguf
+</pre>
+It binds to:
+<pre>
+127.0.0.1:8094
+</pre>
+== Automatic startup ==
+The service is started at reboot from the <code>ominiverdi</code> crontab:
+<pre>
+@reboot cd ~/github/osgeo-library && nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null
+</pre>
+Related API startup entry:
+<pre>
+@reboot ~/github/osgeo-library/servers/start-server.sh >> ~/logs/osgeo-library.log 2>&1
+</pre>
+== Operations ==
+Check listener status:
+<pre>
+ss -ltnp | grep 8094
+</pre>
+Expected listener:
+<pre>
+127.0.0.1:8094
+</pre>
+Check process status:
+<pre>
+pgrep -af "llama-server.*8094|bge-m3"
+</pre>
+Check health:
+<pre>
+curl http://127.0.0.1:8094/health
+</pre>
+Expected result:
+<pre>
+{"status":"ok"}
+</pre>
+Check from [[Osgeo-library]]:
+<pre>
+cd /home/ominiverdi/github/osgeo-library
+/home/ominiverdi/github/osgeo-library/.venv/bin/python - <<'PY'
+import asyncio
+from doclibrary.servers.mcp import get_library_status
+async def main():
+    print(await get_library_status())
+asyncio.run(main())
+PY
+</pre>
+Expected status includes:
+<pre>
+Embedding server: OK
+Database: OK
+</pre>
+View logs:
+<pre>
+tail -f /home/ominiverdi/logs/bge-m3-cpu.log
+</pre>
+Start manually:
+<pre>
+cd /home/ominiverdi/github/osgeo-library
+nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null &
+</pre>
+Stop manually:
+<pre>
+pkill -f "llama-server.*8094"
+</pre>
+== Consumers ==
+* [[Osgeo-library]]
+* [[LLM-based Chat Assistant]]
+'''Contact:''' ominiverdi, Lorenzo Becchi, or SAC channel.
 [[Category:Services]]
 [[Category:AI-Services]]

Difference between revisions of "OSGeo-Knowledge-Embedding-Server"