Difference between revisions of "OSGeo-Knowledge-Embedding-Server"

From OSGeo
(3 intermediate revisions by 2 users not shown)
Earlier revision:

* '''BGE-M3 Embedding Server''' - Semantic search embeddings via llama.cpp
** Port: 8094 (localhost only), Model: bge-m3-Q8_0.gguf

Runs in [[Gallery_Container]]

The embedding model running on osgeo7-gallery is BGE-M3 (bge-m3-Q8_0.gguf), a multilingual model from BAAI supporting 100+ languages. It produces 1024-dimensional vectors and runs via llama-server on port 8094 (localhost only). The Q8_0 quantization keeps it light: 606 MB on disk, ~457 MB RAM, ~12ms per query.

'''Contact:''' ominiverdi (Lorenzo Becchi) or SAC channel

[[Category:Services]]
[[Category:AI-Services]]

Latest revision as of 11:15, 5 June 2026

OSGeo-Knowledge-Embedding-Server

BGE-M3 embedding server for semantic search, running on Gallery_Container.

It provides local embeddings for Osgeo-library and other OSGeo AI/search tooling.

Model

Verified model file on osgeo7-gallery:

/home/ominiverdi/models/bge-m3-Q8_0.gguf

Model details:

  • Model family: BGE-M3
  • Upstream: BAAI
  • Quantization: Q8_0
  • File size: 606 MB
  • Vector size: 1024 dimensions
  • Multilingual: supports 100+ languages
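
As an illustration of how 1024-dimensional vectors like these are commonly compared in semantic search, the sketch below computes cosine similarity and rejects vectors of the wrong length. The function names are hypothetical and are not taken from Osgeo-library; whether its search uses cosine similarity is an assumption.

```python
import math

EXPECTED_DIM = 1024  # vector size listed in the model details


def check_dim(vector, expected=EXPECTED_DIM):
    """Raise ValueError if the vector does not have the expected length."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")
    return vector


def cosine_similarity(a, b):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    check_dim(a)
    check_dim(b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)
```

Checking the length first catches the case where a different model, with a different vector size, has been loaded by mistake.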

Endpoint

The service listens on localhost only:

http://localhost:8094/embedding
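
A minimal client sketch for this endpoint, using only the Python standard library. It assumes the request body is JSON with a "content" field and that the response is either {"embedding": [...]} or a list of such objects with a nested vector; the response shape has differed between llama-server versions, so check it against the running build. The function names are hypothetical.

```python
import json
import urllib.request

EMBEDDING_URL = "http://localhost:8094/embedding"  # endpoint listed above


def build_request(text, url=EMBEDDING_URL):
    """Build the POST request; the {"content": ...} body is an assumption."""
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_vector(payload):
    """Return the embedding as a flat list of floats.

    Accepts {"embedding": [...]} or [{"index": 0, "embedding": [[...]]}]
    (both shapes are assumptions about the server response).
    """
    item = payload[0] if isinstance(payload, list) else payload
    vector = item["embedding"]
    if vector and isinstance(vector[0], list):  # nested form: take first row
        vector = vector[0]
    return [float(x) for x in vector]


def embed(text, url=EMBEDDING_URL, timeout=10.0):
    """Request one embedding; only works on the host running the server."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return extract_vector(json.load(resp))
```

Calling embed("map projections") on osgeo7-gallery should then return a list of 1024 floats if the assumptions hold.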

Osgeo-library configures this endpoint in:

/home/ominiverdi/github/osgeo-library/config.toml
/home/ominiverdi/github/osgeo-library/config.example.toml
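
To confirm which URL a config file points at without relying on its key names (which are not documented here), the sketch below scans the file text for any URL with an explicit port of 8094. The helper name is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Matches http(s) URLs up to the next whitespace or quote character.
URL_PATTERN = re.compile(r"https?://[^\s\"']+")


def urls_with_port(config_text, port=8094):
    """Return every URL in the text whose explicit port equals `port`."""
    matches = []
    for url in URL_PATTERN.findall(config_text):
        try:
            if urlsplit(url).port == port:
                matches.append(url)
        except ValueError:  # malformed port, e.g. ":abc"
            continue
    return matches
```

For example, urls_with_port(open("/home/ominiverdi/github/osgeo-library/config.toml").read()) lists the configured URLs that target the embedding server port.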

Runtime

The service runs on llama.cpp, using its llama-server binary.

Startup script:

/home/ominiverdi/github/osgeo-library/servers/bge-m3-cpu.sh

The script uses this binary and model file:

/home/ominiverdi/llama.cpp/build/bin/llama-server
/home/ominiverdi/models/bge-m3-Q8_0.gguf

It binds to:

127.0.0.1:8094
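
Because the server binds to the loopback address, a plain TCP connection test only succeeds from the same host. The sketch below is one way to check this from Python; the function name is hypothetical.

```python
import socket


def port_is_open(host="127.0.0.1", port=8094, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False
```

An open port only shows that something is listening; it does not prove that the model has finished loading.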

Automatic startup

The service is started at reboot from the ominiverdi crontab:

@reboot cd ~/github/osgeo-library && nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null

Related API startup entry:

@reboot ~/github/osgeo-library/servers/start-server.sh >> ~/logs/osgeo-library.log 2>&1
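
To verify that both entries are still present, the output of crontab -l can be filtered for @reboot lines. This is a sketch with hypothetical helper names; it only inspects text and does not modify the crontab.

```python
def reboot_commands(crontab_text):
    """Return the command part of every active @reboot line."""
    commands = []
    for line in crontab_text.splitlines():
        parts = line.strip().split(None, 1)
        # Commented-out entries start with "#", so they never match "@reboot".
        if len(parts) == 2 and parts[0] == "@reboot":
            commands.append(parts[1])
    return commands


def starts_script(commands, script_name):
    """True if any @reboot command mentions the given script name."""
    return any(script_name in command for command in commands)
```

The text can be obtained with subprocess.run(["crontab", "-l"], capture_output=True, text=True).stdout when run as the ominiverdi user.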

Operations

Check listener status:

ss -ltnp | grep 8094

Expected listener:

127.0.0.1:8094
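
The same check can be scripted. The sketch below parses the text printed by ss -ltn and reports whether port 8094 is bound to loopback addresses only. It assumes the column layout in which the local address is the fourth field; the function names are hypothetical.

```python
import ipaddress


def listen_addresses(ss_output, port=8094):
    """Collect local addresses listening on `port` from `ss -ltn` output."""
    addresses = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue  # header or unrelated line
        address, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addresses.append(address.strip("[]"))  # drop IPv6 brackets
    return addresses


def is_loopback_only(addresses):
    """True if there is at least one listener and all are loopback."""
    def is_loopback(address):
        try:
            return ipaddress.ip_address(address).is_loopback
        except ValueError:  # wildcard such as "*" is not a plain IP address
            return False

    return bool(addresses) and all(is_loopback(a) for a in addresses)
```

A result of False with a non-empty address list would mean the server is reachable from outside the host, which contradicts the intended localhost-only setup.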

Check process status:

pgrep -af "llama-server.*8094|bge-m3"

Check health:

curl http://127.0.0.1:8094/health

Expected result:

{"status":"ok"}
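
After a restart the model may take a moment to load, so a script may prefer to poll the health endpoint rather than check it once. The following is a sketch under that assumption; the function names, attempt count and delay are illustrative choices, not values used by Osgeo-library.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://127.0.0.1:8094/health"  # health endpoint shown above


def is_ok(body):
    """True if the response body is JSON with status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:  # not valid JSON
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def fetch_health(url=HEALTH_URL, timeout=2.0):
    """Return the response body, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (urllib.error.URLError, OSError):  # includes HTTP error statuses
        return None


def wait_until_healthy(fetch=fetch_health, attempts=30, delay=1.0):
    """Poll until the health check passes; return True on success."""
    for attempt in range(attempts):
        body = fetch()
        if body is not None and is_ok(body):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the fetch function as a parameter keeps the polling logic testable without a running server.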

Check from Osgeo-library:

cd /home/ominiverdi/github/osgeo-library
/home/ominiverdi/github/osgeo-library/.venv/bin/python - <<'PY'
import asyncio
from doclibrary.servers.mcp import get_library_status
async def main():
    print(await get_library_status())
asyncio.run(main())
PY

Expected status includes:

Embedding server: OK
Database: OK

View logs:

tail -f /home/ominiverdi/logs/bge-m3-cpu.log

Start manually:

cd /home/ominiverdi/github/osgeo-library
nohup ./servers/bge-m3-cpu.sh >> ~/logs/bge-m3-cpu.log 2>&1 </dev/null &

Stop manually:

pkill -f "llama-server.*8094"

Consumers

  • Osgeo-library
  • LLM-based Chat Assistant

Contact: ominiverdi (Lorenzo Becchi) or the SAC channel.