Difference between revisions of "Osgeo-library"

From OSGeo
= osgeo-library =

Service for PDF figure, table, and equation extraction and semantic search, running on [[Gallery_Container]].

* GitHub: https://github.com/ominiverdi/osgeo-library
* Local path: <code>/home/ominiverdi/github/osgeo-library</code>
* Started at reboot from the <code>ominiverdi</code> crontab:
<pre>
~/github/osgeo-library/servers/start-server.sh
</pre>

Verified running processes on <code>osgeo7-gallery</code> include:

<pre>
python -m doclibrary.servers.api
python -m doclibrary.servers.mcp
</pre>
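The process check above can be scripted. The following is a minimal sketch, not part of the osgeo-library repository: the helper names <code>missing_servers</code> and <code>current_command_lines</code> are hypothetical, and it assumes a <code>ps</code> that accepts <code>-eo args</code> (as procps on Linux does).

```python
import subprocess

# Module names taken from the process list above.
EXPECTED_MODULES = ("doclibrary.servers.api", "doclibrary.servers.mcp")


def missing_servers(ps_lines, expected=EXPECTED_MODULES):
    """Return the expected modules that no command line in ps_lines mentions."""
    return [
        module
        for module in expected
        if not any(f"-m {module}" in line for line in ps_lines)
    ]


def current_command_lines():
    """Collect the command lines of all processes via `ps -eo args`."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Calling <code>missing_servers(current_command_lines())</code> on the container would return an empty list when both servers are up, and otherwise the names of the modules that are not running.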
  
== MCP server ==

The MCP tools are provided by the <code>doclibrary</code> MCP server:

<pre>
/home/ominiverdi/github/osgeo-library/.venv/bin/python -m doclibrary.servers.mcp
</pre>

In the Matrix chat bridge, these tools appear with the <code>doclibrary_</code> prefix.
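The prefixing can be pictured as a simple name mapping. This sketch assumes the bridge joins the server name and the tool name with an underscore, and that the server itself registers unprefixed names such as <code>search_documents</code>; neither detail is confirmed here, and both helper functions are hypothetical.

```python
def bridge_tool_name(server_name: str, tool_name: str) -> str:
    """Build the name a tool would have in the bridge (assumed scheme)."""
    return f"{server_name}_{tool_name}"


def server_tool_name(bridge_name: str, server_name: str) -> str:
    """Strip the assumed '<server>_' prefix to recover the server-side name."""
    prefix = f"{server_name}_"
    if not bridge_name.startswith(prefix):
        raise ValueError(f"{bridge_name!r} does not belong to {server_name!r}")
    return bridge_name[len(prefix):]


# Assumed example: the unprefixed name is a guess, the prefixed one is documented.
name = bridge_tool_name("doclibrary", "search_documents")
```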
  
== MCP tools ==

<pre>
[doclibrary_search_documents]
</pre>
Search document text and extracted visual elements using semantic search.

<pre>
[doclibrary_search_visual_elements]
</pre>
Search only visual elements such as figures, tables, equations, charts, and diagrams.

<pre>
[doclibrary_list_elements]
</pre>
List extracted elements from a document, optionally filtered by type or page.

<pre>
[doclibrary_get_element_details]
</pre>
Get metadata and text description for a specific extracted element.

<pre>
[doclibrary_get_element_image]
</pre>
Return the cropped image for a specific figure, table, equation, chart, or diagram.

<pre>
[doclibrary_get_page_image]
</pre>
Return the full page image for a document page.

<pre>
[doclibrary_get_document_info]
</pre>
Show metadata for a document.

<pre>
[doclibrary_find_document]
</pre>
Find a document by name or query.

<pre>
[doclibrary_list_documents]
</pre>
List indexed documents.

<pre>
[doclibrary_get_library_status]
</pre>
Check document library and embedding/search service status.

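For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with the method <code>tools/call</code>. The sketch below only builds such a message; it does not talk to the server. The unprefixed tool name <code>search_documents</code> and the <code>query</code> argument are assumptions, since the tool schemas are not listed on this page, and a real client must complete the MCP <code>initialize</code> handshake before sending it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per connection


def tools_call_request(tool_name, arguments):
    """Build an MCP 'tools/call' request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Tool name and argument name are assumed for illustration only.
request = tools_call_request("search_documents", {"query": "map projection"})

# On the stdio transport each message is sent as a single line of JSON.
wire_message = json.dumps(request)
```

Running <code>tools/list</code> against the server first is the reliable way to learn the actual tool names and argument schemas.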
== Related services ==

* [[LLM-based Chat Assistant]]
* [[OSGeo-Knowledge-Embedding-Server]]

'''Contact:''' ominiverdi (Lorenzo Becchi) or the SAC channel.

[[Category:Services]]
[[Category:AI-Services]]

Revision as of 10:28, 5 June 2026
