by gpetraroli
An MCP server for advanced PDF text extraction, search, and analysis.
MCP PDF Reader Enhanced is a comprehensive Model Context Protocol (MCP) server designed to provide advanced functionalities for interacting with PDF files. It allows users to extract text, search for specific content, and retrieve metadata from PDF documents.
To use MCP PDF Reader Enhanced, you first need to install it using npm install. Once installed, you can integrate it with your Cursor settings by adding the provided JSON configuration. The server exposes three main tools:
read-pdf: For extracting text from PDF files with options for page ranges, metadata inclusion, and text cleaning.
{"file": "/path/to/document.pdf", "clean_text": true}
search-pdf: For searching specific text within PDF documents with options for case sensitivity and whole-word matching.
{"file": "/path/to/document.pdf", "query": "important term", "case_sensitive": true}
pdf-metadata: For extracting comprehensive metadata from PDF files without text extraction.
{"file": "/path/to/document.pdf"}
A comprehensive Model Context Protocol (MCP) server that provides advanced PDF text extraction, search, and analysis functionality.
npm install
read-pdf - Enhanced PDF Reading
Extract text from PDF files with customizable options.
Parameters:
file (string, required): Path to the PDF file
pages (string, optional): Page range (e.g., '1-5', '1,3,5', 'all'). Default: 'all'
include_metadata (boolean, optional): Include PDF metadata. Default: true
clean_text (boolean, optional): Clean and normalize text. Default: false
Example Usage:
// Basic extraction
{ "file": "/path/to/document.pdf" }
// Extract with clean text and no metadata
{
  "file": "/path/to/document.pdf",
  "clean_text": true,
  "include_metadata": false
}
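The pages parameter accepts ranges such as '1-5', lists such as '1,3,5', or 'all'. The sketch below shows one way such a string could be expanded into page numbers. The function name parsePageRange and the totalPages argument are hypothetical and are not taken from the server's source; its actual parsing and error handling may differ.

```javascript
// Hypothetical helper (not the server's code): expands a `pages` string
// into a sorted list of unique, 1-based page numbers.
function parsePageRange(spec, totalPages) {
  const text = String(spec).trim().toLowerCase();
  if (text === "all") {
    return Array.from({ length: totalPages }, (_, i) => i + 1);
  }
  const pages = new Set();
  for (const part of text.split(",")) {
    // Each segment is either a single page ("3") or a range ("1-5").
    const match = /^(\d+)(?:-(\d+))?$/.exec(part.trim());
    if (!match) {
      throw new Error(`Invalid page range segment: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start || end > totalPages) {
      throw new Error(`Page range out of bounds: "${part}"`);
    }
    for (let page = start; page <= end; page++) pages.add(page);
  }
  return [...pages].sort((a, b) => a - b);
}

console.log(parsePageRange("1-3,5", 10)); // → [1, 2, 3, 5]
```

Rejecting out-of-bounds segments is a design choice made for this sketch; clamping them to the document length would be an equally reasonable alternative.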
search-pdf - Search Within PDFs
Search for specific text within PDF documents.
Parameters:
file (string, required): Path to the PDF file
query (string, required): Text to search for
case_sensitive (boolean, optional): Case sensitive search. Default: false
whole_word (boolean, optional): Match whole words only. Default: false
Example Usage:
// Case-insensitive search
{ "file": "/path/to/document.pdf", "query": "important term" }
// Whole word, case-sensitive search
{
  "file": "/path/to/document.pdf",
  "query": "API",
  "case_sensitive": true,
  "whole_word": true
}
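To illustrate how the case_sensitive and whole_word options interact, here is a minimal sketch that applies them to already extracted text with a regular expression. The helpers escapeRegExp and findMatches are hypothetical names introduced for this example; the server's real matching logic and result format are not shown in this document and may differ.

```javascript
// Escape regex metacharacters so the query is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper (not the server's code): returns every match of
// `query` in `text`, honoring the two boolean options described above.
function findMatches(text, query, { case_sensitive = false, whole_word = false } = {}) {
  let pattern = escapeRegExp(query);
  // \b only behaves as a word boundary when the query starts and ends
  // with word characters; that limitation is accepted in this sketch.
  if (whole_word) pattern = `\\b${pattern}\\b`;
  const flags = case_sensitive ? "g" : "gi";
  const regex = new RegExp(pattern, flags);
  return [...text.matchAll(regex)].map((m) => ({ index: m.index, match: m[0] }));
}

const sample = "The API docs mention api keys and the RAPID API.";
const hits = findMatches(sample, "API", { case_sensitive: true, whole_word: true });
console.log(hits.length); // → 2 ("api" and the "API" inside "RAPID" are skipped)
```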
pdf-metadata - Extract Metadata Only
Get comprehensive metadata from PDF files without extracting text.
Parameters:
file (string, required): Path to the PDF file
Returns:
Add to your Cursor settings:
{
  "mcp": {
    "servers": {
      "mcp-gp-pdf-reader": {
        "command": "node",
        "args": ["/absolute/path/to/mcp_gp_pdf_reader/index.js"]
      }
    }
  }
}
# Via MCP client
"Extract all text from /documents/report.pdf"
# Via MCP client
"Search for 'quarterly results' in /documents/financial-report.pdf"
# Via MCP client
"Get metadata from /documents/contract.pdf"
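Behind prompts like these, an MCP client sends a JSON-RPC 2.0 request with the method tools/call, naming the tool and passing its arguments. The sketch below builds such a request for search-pdf. The helper buildToolCall and the id counter are hypothetical; how a given client turns a natural-language prompt into arguments is up to that client and its model, and real clients would normally use an MCP SDK rather than assemble messages by hand.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 message an MCP client
// sends to invoke a tool. Ids only need to be unique per connection.
let nextId = 1;

function buildToolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Roughly what "Search for 'quarterly results' in ..." might become.
const request = buildToolCall("search-pdf", {
  file: "/documents/financial-report.pdf",
  query: "quarterly results",
});

console.log(JSON.stringify(request, null, 2));
```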
This MCP server is designed to be extensible. Key areas for contribution:
MIT License