Osgeo-library
A service running on Gallery_Container that extracts figures, tables, and equations from PDF documents, indexes the documents, and provides semantic search over them.
It is used by the Matrix chat assistant to search OSGeo-related documents and retrieve extracted visual elements.
- GitHub: https://github.com/ominiverdi/osgeo-library
- Local path: /home/ominiverdi/github/osgeo-library
- Database: PostgreSQL on Gallery
- MCP server name: doclibrary
Started at reboot from the ominiverdi crontab:
~/github/osgeo-library/servers/start-server.sh
Verified running processes on osgeo7-gallery include:
python -m doclibrary.servers.api
python -m doclibrary.servers.mcp
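A quick way to confirm that both processes listed above are alive is to scan the process table for their module names. The following is a minimal sketch, not part of the osgeo-library code: the helper names `missing_services` and `current_process_list` are hypothetical, and it assumes a POSIX `ps` is available on the host.

```python
import subprocess

# Module names taken from the verified process list above.
EXPECTED_MODULES = ("doclibrary.servers.api", "doclibrary.servers.mcp")


def missing_services(ps_output: str) -> list[str]:
    """Return the expected modules that do not appear in the given ps output."""
    return [m for m in EXPECTED_MODULES if f"-m {m}" not in ps_output]


def current_process_list() -> str:
    """Capture the command lines of all processes (assumes a POSIX `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `missing_services(current_process_list())` on the host returns an empty list when both servers are running.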
What can users ask?
Examples for the Matrix chat assistant:
!oc list documents in the OSGeo library
!oc find the Snyder map projections document
!oc show information about the usgs_snyder document
!oc list equations on page 44 of usgs_snyder
!oc show metadata for page 44 of usgs_snyder
!oc find figures or equations about Mercator projection
Semantic search requires the OSGeo-Knowledge-Embedding-Server to be running.
MCP server
The MCP tools are provided by the doclibrary MCP server:
/home/ominiverdi/github/osgeo-library/.venv/bin/python -m doclibrary.servers.mcp
In the Matrix chat bridge, these tools appear with the doclibrary_ prefix.
Live status
At last check, the service reported:
Database: OK (107 documents)
Embedding server: NOT AVAILABLE
Embedding URL: http://localhost:8094/embedding
When the embedding server is unavailable, semantic search tools such as doclibrary_search_documents and doclibrary_search_visual_elements return an error.
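A client that wants to avoid calling a semantic search tool while the embedding server is down could inspect the status text first. The sketch below parses the status report in the form quoted above. The function names are hypothetical, and it assumes that any embedding state other than `NOT AVAILABLE` means the server is usable, since the wording of the healthy state is not shown on this page.

```python
import re
from typing import Optional


def document_count(status_text: str) -> Optional[int]:
    """Return the document count from a 'Database: OK (N documents)' entry."""
    match = re.search(r"Database:\s*OK\s*\((\d+) documents\)", status_text)
    return int(match.group(1)) if match else None


def embedding_available(status_text: str) -> bool:
    """Return False when the report says 'Embedding server: NOT AVAILABLE'.

    Assumption: every other reported state counts as available.
    """
    match = re.search(
        r"Embedding server:\s*([A-Z ]+?)\s*(?:Embedding URL:|$)",
        status_text,
        re.MULTILINE,
    )
    return match is not None and match.group(1) != "NOT AVAILABLE"
```

The patterns accept the report either on one line or with one field per line.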
MCP tools and examples
doclibrary_get_library_status
Checks document library, database, and embedding server status.
Example visible tool call:
[doclibrary_get_library_status]
Live result included:
Database: OK (107 documents)
Embedding server: NOT AVAILABLE
doclibrary_list_documents
Lists available documents with slugs, titles, page counts, and summaries.
Example visible tool call:
[doclibrary_list_documents]
Live result examples included:
aibench
aiseg
alpine_change
usgs_snyder
doclibrary_find_document
Finds documents by title, slug, or filename.
Example visible tool call:
query=snyder, limit=3 [doclibrary_find_document]
Live result example:
slug: usgs_snyder
title: Usgs Snyder
source_file: usgs_snyder1987.pdf
pages: 397
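The visible tool calls on this page all follow the pattern `key=value, key=value [tool_name]`. For log analysis or testing, such a line can be split back into a tool name and its arguments. This is an illustrative sketch based only on the examples shown here, not the bridge's own parser, and `parse_visible_call` is a hypothetical name. All values are kept as strings.

```python
import re

# Optional argument list, then the tool name in square brackets.
CALL_RE = re.compile(r"^(?P<args>.*?)\s*\[(?P<tool>\w+)\]$")


def parse_visible_call(line: str) -> tuple[str, dict[str, str]]:
    """Split 'k=v, k=v [tool]' into (tool, {k: v}); values stay strings."""
    match = CALL_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a visible tool call: {line!r}")
    args: dict[str, str] = {}
    # Split only at commas that are followed by a new 'key=' so that a value
    # such as 'Equation (5-11) and (5-12)' would stay in one piece.
    for part in re.split(r",\s*(?=\w+=)", match.group("args")):
        if part:
            key, _, value = part.partition("=")
            args[key.strip()] = value.strip()
    return match.group("tool"), args
```

For example, `parse_visible_call("query=snyder, limit=3 [doclibrary_find_document]")` yields the tool name `doclibrary_find_document` with the arguments `query` and `limit`.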
doclibrary_get_document_info
Shows metadata for a document, including page count, summary, keywords, license, and indexed element counts.
Example visible tool call:
document_slug=usgs_snyder [doclibrary_get_document_info]
Live result included:
total_pages: 397
figures: 63
tables: 69
equations: 909
doclibrary_search_documents
Semantic search over document text and extracted visual elements.
Example visible tool call:
query=mercator projection, limit=3 [doclibrary_search_documents]
Requires the embedding server.
doclibrary_search_visual_elements
Semantic search over figures, tables, equations, charts, and diagrams.
Example visible tool call:
query=mercator equation, element_type=equation, document_slug=usgs_snyder, limit=3 [doclibrary_search_visual_elements]
Requires the embedding server.
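Because both semantic search tools depend on the embedding server, a caller may want a fallback for when it is down. The sketch below picks `doclibrary_find_document` in that case. This is only one possible policy, not documented behaviour of the Matrix bridge: `pick_search_tool` is a hypothetical helper, and the fallback matches titles, slugs, and filenames only, so it is a weaker substitute for semantic search.

```python
# The two tools this page marks as requiring the embedding server.
SEMANTIC_TOOLS = frozenset(
    {"doclibrary_search_documents", "doclibrary_search_visual_elements"}
)


def pick_search_tool(
    query: str, embedding_ok: bool, limit: int = 3
) -> tuple[str, dict]:
    """Choose a search tool and its arguments for the given query."""
    arguments = {"query": query, "limit": limit}
    if embedding_ok:
        return "doclibrary_search_documents", arguments
    # Fallback: title/slug/filename matching works without embeddings.
    return "doclibrary_find_document", arguments
```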
doclibrary_list_elements
Lists extracted elements from a document, optionally filtered by type or page.
Example visible tool call:
document_slug=usgs_snyder, element_type=equation, page=44, limit=5 [doclibrary_list_elements]
Live result examples included:
Equation (5-10a)
Equation (5-10b)
Equation (5-11) and (5-12)
Equation (5-12a)
doclibrary_get_element_details
Gets metadata and description for a specific extracted element.
Example visible tool call:
document_slug=usgs_snyder, element_label=Equation (5-10a), page_number=44 [doclibrary_get_element_details]
doclibrary_get_element_image
Returns the cropped image for a specific figure, table, equation, chart, or diagram.
Example visible tool call:
document_slug=usgs_snyder, element_label=Equation (5-10a), page_number=44 [doclibrary_get_element_image]
doclibrary_get_page_image
Returns the full page image for a document page.
Example visible tool call:
document_slug=usgs_snyder, page_number=44 [doclibrary_get_page_image]
doclibrary_get_page_metadata
Returns page text, summary, keywords, size, and visual elements without transferring the page image.
Example visible tool call:
document_slug=usgs_snyder, page_number=44 [doclibrary_get_page_metadata]
Live result included page 44 summary and keywords for oblique/transverse map projection formulas.
doclibrary_list_documents_paginated
Lists documents with pagination, summaries, keywords, and license information.
Example visible tool call:
page=1, page_size=3 [doclibrary_list_documents_paginated]
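To estimate how many calls are needed to walk the whole library, the page arithmetic can be sketched as follows. It assumes 1-based page numbers, as the `page=1` example suggests. The server's actual numbering and its behaviour past the last page are not confirmed here, and `page_window` is a hypothetical helper.

```python
import math


def page_window(page: int, page_size: int, total: int) -> tuple[int, int, int]:
    """Return (offset, items_on_page, total_pages) for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    total_pages = math.ceil(total / page_size)
    offset = (page - 1) * page_size
    # Pages past the end hold no items; the last page may be partly filled.
    items_on_page = max(0, min(page_size, total - offset))
    return offset, items_on_page, total_pages
```

With the 107 documents reported at last check and `page_size=3`, this gives 36 pages, the last one holding two documents.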
Related services
Contact: ominiverdi, Lorenzo Becchi, or SAC channel.