by cyberchitta
Fetch HTML or markdown from bot‑protected websites, delivering clean text content that AI assistants can consume.
Scrapling Fetch MCP provides a lightweight MCP server that retrieves text‑only content (HTML or markdown) from websites that employ anti‑automation measures. It bridges the gap between what a human sees in a browser and what an LLM can access, focusing on low‑volume documentation and reference material.
Installation (requires uv):
uv tool install scrapling
scrapling install
uv tool install scrapling-fetch-mcp
Add this configuration to your Claude client's MCP server configuration:
{
  "mcpServers": {
    "Cyber-Chitta": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}
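If the client already has a configuration file, the entry can be merged in rather than pasted by hand. The following is a minimal sketch: the helper name `add_mcp_server` and the pre-existing `other-server` entry are assumptions for illustration, and where the configuration file lives depends on your client.

```python
import json


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one MCP server entry added or replaced.

    Hypothetical helper: it only manipulates the JSON structure of the config.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Assumed example: a client config that already lists another (made-up) server.
existing = {"mcpServers": {"other-server": {"command": "uvx", "args": ["other"]}}}
updated = add_mcp_server(existing, "Cyber-Chitta", "uvx", ["scrapling-fetch-mcp"])
print(json.dumps(updated, indent=2))
```

The helper copies the dictionaries it changes, so the original configuration is left untouched if you decide not to write the result back.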
s-fetch-page: retrieve a full page (with optional pagination).
s-fetch-pattern: extract text matching a regular-expression pattern, with surrounding context.
The basic, stealth, and max-stealth modes trade speed for success on heavily guarded sites.
Q: Do I need a browser driver? A: No. Scrapling handles the anti-automation challenges internally; no Selenium or Playwright setup is required.
Q: Can it log in to sites requiring authentication? A: Authentication is not supported; the tool works with publicly accessible pages.
Q: What is the typical response time? A: basic mode returns in 1–2 seconds, stealth in 3–8 seconds, and max-stealth may take 10 seconds or more.
Q: Is high‑volume scraping allowed? A: The project is intentionally limited to low‑volume, documentation‑type retrieval and should not be used for mass data harvesting.
Q: How does pagination work? A: Use the start_index and max_length parameters of s-fetch-page to fetch large documents in chunks.
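To illustrate the chunking idea, here is a minimal sketch that pages through a document with `start_index` and `max_length`. The `fetch_chunk` function is a local stand-in that slices a string; it is an assumption for illustration and does not call the server, whose exact response format is not described here.

```python
def fetch_chunk(document: str, start_index: int, max_length: int) -> str:
    """Stand-in for one s-fetch-page call: return at most max_length characters."""
    return document[start_index:start_index + max_length]


def fetch_all(document: str, max_length: int) -> list[str]:
    """Collect consecutive chunks until the document is exhausted."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunks = []
    start_index = 0
    while True:
        chunk = fetch_chunk(document, start_index, max_length)
        if not chunk:
            break  # an empty chunk means we have read past the end
        chunks.append(chunk)
        start_index += len(chunk)  # the next request starts where this one ended
    return chunks


page = "x" * 12_000  # pretend page of 12,000 characters
chunks = fetch_all(page, max_length=5_000)
print([len(c) for c in chunks])  # [5000, 5000, 2000]
```

Advancing `start_index` by the length actually returned, rather than by `max_length`, keeps the loop correct even when the final chunk is shorter than requested.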
An MCP server that helps AI assistants access text content from websites that implement bot detection, bridging the gap between what you can see in your browser and what the AI can access.
This tool is optimized for low-volume retrieval of documentation and reference materials (text/HTML only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.
Note: This project was developed in collaboration with Claude Sonnet 3.7, using LLM Context.
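This package provides two distinct tools: s-fetch-page, which retrieves a full page, and s-fetch-pattern, which extracts text matching a regular expression along with surrounding context. The two example conversations that follow show each tool in use.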
Human: Please fetch and summarize the documentation at https://example.com/docs
Claude: I'll help you with that. Let me fetch the documentation.
<mcp:function_calls>
<mcp:invoke name="s-fetch-page">
<mcp:parameter name="url">https://example.com/docs</mcp:parameter>
<mcp:parameter name="mode">basic</mcp:parameter>
</mcp:invoke>
</mcp:function_calls>
Based on the documentation I retrieved, here's a summary...
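Under the Model Context Protocol, a tool invocation like the one in this transcript travels as a JSON-RPC 2.0 `tools/call` request. The sketch below builds that message for `s-fetch-page`; the helper name `build_tool_call` and the request id are assumptions for illustration, and an MCP client library would normally construct and send this for you.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Same tool name and parameters as in the transcript; the id is arbitrary.
request = build_tool_call(
    1,
    "s-fetch-page",
    {"url": "https://example.com/docs", "mode": "basic"},
)
print(json.dumps(request, indent=2))
```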
Human: Please find all mentions of "API keys" on the documentation page.
Claude: I'll search for that specific information.
<mcp:function_calls>
<mcp:invoke name="s-fetch-pattern">
<mcp:parameter name="url">https://example.com/docs</mcp:parameter>
<mcp:parameter name="mode">basic</mcp:parameter>
<mcp:parameter name="search_pattern">API\s+keys?</mcp:parameter>
<mcp:parameter name="context_chars">150</mcp:parameter>
</mcp:invoke>
</mcp:function_calls>
I found several mentions of API keys in the documentation:
...
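What `search_pattern` and `context_chars` express can be sketched locally with Python's `re` module. This illustrates the idea and is not the server's implementation: the function name `find_with_context` and the sample text are assumptions, and the server may merge or format matches differently.

```python
import re


def find_with_context(text: str, search_pattern: str, context_chars: int) -> list[str]:
    """Return each regex match padded with up to context_chars on both sides."""
    snippets = []
    for match in re.finditer(search_pattern, text):
        start = max(0, match.start() - context_chars)  # clamp at start of text
        end = min(len(text), match.end() + context_chars)  # clamp at end of text
        snippets.append(text[start:end])
    return snippets


# Made-up sample text; the pattern is the one used in the transcript.
sample = "Create API keys in the dashboard. Each API key can be revoked."
snippets = find_with_context(sample, r"API\s+keys?", context_chars=10)
for snippet in snippets:
    print(repr(snippet))
```

The optional `s` in `keys?` lets the pattern match both "API keys" and "API key", and `\s+` tolerates line breaks or repeated spaces between the two words.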
Protection Levels:
basic: Fast retrieval (1-2 seconds) but lower success with heavily protected sites
stealth: Balanced protection (3-8 seconds) that works with most sites
max-stealth: Maximum protection (10+ seconds) for heavily protected sites
Content Targeting Options:
Full-page retrieval with pagination (start_index and max_length)
Pattern extraction with surrounding context (search_pattern and context_chars)
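Since the three protection levels trade speed for success rate, one reasonable calling strategy is to try them from fastest to slowest. The sketch below shows that ladder. `fetch` is a hypothetical callable supplied by the caller (it is not part of this package's API), and `fake_fetch` only simulates a site that blocks `basic` mode.

```python
from typing import Callable, Optional

# Mode names as listed under Protection Levels, ordered fastest to slowest.
MODES = ["basic", "stealth", "max-stealth"]


def fetch_with_escalation(
    url: str, fetch: Callable[[str, str], Optional[str]]
) -> tuple[str, str]:
    """Try each mode in order and return (mode, content) for the first success."""
    for mode in MODES:
        content = fetch(url, mode)
        if content:
            return mode, content
    raise RuntimeError(f"all modes failed for {url}")


def fake_fetch(url: str, mode: str) -> Optional[str]:
    """Simulated fetcher (assumption): pretends the site rejects basic mode."""
    return None if mode == "basic" else f"content of {url}"


mode, content = fetch_with_escalation("https://example.com/docs", fake_fetch)
print(mode)  # stealth
```

Stopping at the first mode that returns content keeps the common case fast and reserves the slower modes for sites that actually need them.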
Tips:
Start with basic mode and only escalate to higher protection levels if needed.
For large documents, use the pagination parameters of s-fetch-page.
Use s-fetch-pattern when looking for specific information on large pages.
License: Apache 2