by hyperbrowserai
Enables browser automation powered by large language models, offering natural‑language commands, AI‑driven data extraction, and seamless fallback to regular Playwright functionality.
HyperAgent extends Playwright with AI capabilities, allowing tasks such as navigation, interaction, and data extraction to be described in plain English. It integrates LLM providers, supports stealth browsing, and can run in the cloud via Hyperbrowser.
npm install @hyperbrowser/agent
# or
yarn add @hyperbrowser/agent
npx @hyperbrowser/agent -c "Find a route from Miami to New Orleans, and provide the detailed route information."
import { HyperAgent } from "@hyperbrowser/agent";
import { ChatOpenAI } from "@langchain/openai";
const agent = new HyperAgent({
llm: new ChatOpenAI({ openAIApiKey: process.env.OPENAI_API_KEY, modelName: "gpt-4o" })
});
const result = await agent.executeTask("Navigate to amazon.com, search for 'laptop', and extract the prices of the first 5 results");
console.log(result.output);
await agent.closeAgent();
To run in the cloud, set browserProvider: "Hyperbrowser" and provide HYPERBROWSER_API_KEY.
Use page.ai(), page.extract(), and executeTask() for natural‑language driven automation.
Q: Which LLM models are supported?
A: Any LangChain BaseChatModel implementation, such as OpenAI's GPT‑4o, Anthropic's Claude, etc.
Q: Do I need an OpenAI key?
A: Only if you choose an OpenAI model. You can switch to Anthropic, Google Gemini, or other providers.
Q: How does the cloud scaling work?
A: Set browserProvider to "Hyperbrowser" and provide a HYPERBROWSER_API_KEY. HyperAgent will spin up headless browsers on Hyperbrowser’s infrastructure.
Q: Can I run HyperAgent without an LLM?
A: Yes, you can use the regular Playwright APIs directly, bypassing AI.
Q: How do I integrate with other services?
A: Use the built‑in MCP client to start external MCP servers (e.g., Composio) and invoke their actions from your tasks.
HyperAgent is Playwright supercharged with AI. No more brittle scripts, just powerful natural language commands. Just looking for scalable headless browsers or scraping infra? Go to Hyperbrowser to get started for free!
Use page.ai(), page.extract(), and executeTask() for any AI automation.
# Using npm
npm install @hyperbrowser/agent
# Using yarn
yarn add @hyperbrowser/agent
$ npx @hyperbrowser/agent -c "Find a route from Miami to New Orleans, and provide the detailed route information."
The CLI supports options for debugging or for using Hyperbrowser instead of a local browser:
-d, --debug                       Enable debug mode
-c, --command <task description>  Command to run
--hyperbrowser                    Use Hyperbrowser for the browser provider
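When the CLI is launched from another script, the flags listed above can be assembled programmatically. The sketch below is a minimal illustration: buildCliArgs and CliOptions are hypothetical names, not part of HyperAgent, and the helper only uses the three documented flags.

```typescript
// Hypothetical helper (not part of @hyperbrowser/agent): builds the argument
// list for `npx` from the three CLI flags documented above.
interface CliOptions {
  command: string;        // task description, passed to -c
  debug?: boolean;        // adds -d
  hyperbrowser?: boolean; // adds --hyperbrowser
}

function buildCliArgs(opts: CliOptions): string[] {
  const args = ["@hyperbrowser/agent", "-c", opts.command];
  if (opts.debug) args.push("-d");
  if (opts.hyperbrowser) args.push("--hyperbrowser");
  return args;
}

const cliArgs = buildCliArgs({
  command: "Find a route from Miami to New Orleans",
  debug: true,
});
// Printed for inspection only; the joined string is not shell-quoted.
console.log(["npx", ...cliArgs].join(" "));
```

Passing the array directly to a process launcher such as Node's child_process.spawn("npx", cliArgs) avoids having to quote the task description for a shell.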
import { HyperAgent } from "@hyperbrowser/agent";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";
// Initialize the agent
const agent = new HyperAgent({
llm: new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: "gpt-4o",
}),
});
// Execute a task
const result = await agent.executeTask(
"Navigate to amazon.com, search for 'laptop', and extract the prices of the first 5 results"
);
console.log(result.output);
// Use page.ai and page.extract
const page = await agent.newPage();
await page.goto("https://flights.google.com", { waitUntil: "load" });
await page.ai("search for flights from Rio to LAX from July 16 to July 22");
const res = await page.extract(
"give me the flight options",
z.object({
flights: z.array(
z.object({
price: z.number(),
departure: z.string(),
arrival: z.string(),
})
),
})
);
console.log(res);
// Clean up
await agent.closeAgent();
You can scale HyperAgent with cloud headless browsers using Hyperbrowser. Provide your API key as HYPERBROWSER_API_KEY and set browserProvider to "Hyperbrowser":
const agent = new HyperAgent({
browserProvider: "Hyperbrowser",
});
const response = await agent.executeTask(
"Go to hackernews, and list me the 5 most recent article titles"
);
console.log(response);
await agent.closeAgent();
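A script that should run both locally and in the cloud can choose the provider from the environment. The following is a minimal sketch under stated assumptions: pickBrowserProvider is a hypothetical helper, and "Local" as the name of the non-cloud provider is an assumption that should be checked against the HyperAgent documentation.

```typescript
// Hypothetical helper: selects the browser provider based on whether a
// Hyperbrowser API key is present. The "Local" value is an assumption.
type BrowserProvider = "Local" | "Hyperbrowser";

function pickBrowserProvider(
  env: Record<string, string | undefined>
): BrowserProvider {
  const key = env["HYPERBROWSER_API_KEY"];
  return key !== undefined && key.trim() !== "" ? "Hyperbrowser" : "Local";
}

// In a real script you would pass process.env instead of a literal object.
const provider = pickBrowserProvider({ HYPERBROWSER_API_KEY: "example-key" });
console.log(provider);
```

The returned value could then be passed as browserProvider when constructing the agent.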
// Create and manage multiple pages
const page1 = await agent.newPage();
const page2 = await agent.newPage();
// Execute tasks on specific pages
const page1Response = await page1.ai(
"Go to google.com/travel/explore and set the starting location to New York. Then, return to me the first recommended destination that shows up. Return to me only the name of the location."
);
const page2Response = await page2.ai(
`I want to plan a trip to ${page1Response.output}. Recommend me places to visit there.`
);
console.log(page2Response.output);
// Get all active pages
const pages = await agent.getPages();
await agent.closeAgent();
HyperAgent can extract data that conforms to a specified schema. The schema is passed in per task through the outputSchema option:
import { HyperAgent } from "@hyperbrowser/agent";
import { z } from "zod";
const agent = new HyperAgent();
const agentResponse = await agent.executeTask(
"Navigate to imdb.com, search for 'The Matrix', and extract the director, release year, and rating",
{
outputSchema: z.object({
director: z.string().describe("The name of the movie director"),
releaseYear: z.number().describe("The year the movie was released"),
rating: z.string().describe("The IMDb rating of the movie"),
}),
}
);
console.log(agentResponse.output);
await agent.closeAgent();
{
"director": "Lana Wachowski, Lilly Wachowski",
"releaseYear": 1999,
"rating": "8.7/10"
}
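Downstream code may want to re-check the extracted object before using it. The sketch below is a dependency-free illustration that assumes the output has already been parsed into a plain object with the shape of the sample above; MovieInfo, isMovieInfo, and parseRating are hypothetical names, not HyperAgent APIs.

```typescript
// Shape of the sample output shown above.
interface MovieInfo {
  director: string;
  releaseYear: number;
  rating: string;
}

// Runtime type guard: true only if all three fields have the expected types.
function isMovieInfo(value: unknown): value is MovieInfo {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["director"] === "string" &&
    typeof v["releaseYear"] === "number" &&
    typeof v["rating"] === "string"
  );
}

// Extracts the leading number from a rating such as "8.7/10".
// Returns null when the string does not start with a number.
function parseRating(rating: string): number | null {
  const match = /^\s*(\d+(?:\.\d+)?)/.exec(rating);
  return match ? Number(match[1]) : null;
}

const sample: unknown = JSON.parse(
  '{"director":"Lana Wachowski, Lilly Wachowski","releaseYear":1999,"rating":"8.7/10"}'
);
if (isMovieInfo(sample)) {
  console.log(sample.director.split(", "), parseRating(sample.rating));
}
```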
HyperAgent supports multiple LLM providers. A provider can be any class that extends the LangChain BaseChatModel class.
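import { HyperAgent } from "@hyperbrowser/agent";
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
// Using OpenAI
const openAIAgent = new HyperAgent({
  llm: new ChatOpenAI({
    openAIApiKey: process.env.OPENAI_API_KEY,
    modelName: "gpt-4o",
  }),
});
// Using Anthropic's Claude
const claudeAgent = new HyperAgent({
  llm: new ChatAnthropic({
    anthropicApiKey: process.env.ANTHROPIC_API_KEY,
    modelName: "claude-3-7-sonnet-latest",
  }),
});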
HyperAgent functions as a fully functional MCP client. For best results, we recommend using gpt-4o as your LLM.
Here is an example that reads from Wikipedia and inserts information into a Google Sheet using the Composio Google Sheets MCP. For the full example, see here
const agent = new HyperAgent({
llm: llm,
debug: true,
});
await agent.initializeMCPClient({
servers: [
{
command: "npx",
args: [
"@composio/mcp@latest",
"start",
"--url",
"https://mcp.composio.dev/googlesheets/...",
],
env: {
npm_config_yes: "true",
},
},
],
});
const response = await agent.executeTask(
"Go to https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population and get the data on the top 5 most populous states from the table. Then insert that data into a google sheet. You may need to first check if there is an active connection to google sheet, and if there isn't connect to it and present me with the link to sign in. "
);
console.log(response);
await agent.closeAgent();
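Each entry in the servers array above is a plain object describing how to launch an MCP server. As a minimal sketch, the entry can be produced by a small factory so the URL is the only thing that varies; McpServerConfig and composioServer are hypothetical names introduced here, and the field set simply mirrors the example above rather than the full configuration type.

```typescript
// Mirrors the fields used in the example above; the real config type
// accepted by HyperAgent may have additional options.
interface McpServerConfig {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Hypothetical factory: builds a Composio MCP server entry for a given URL.
function composioServer(url: string): McpServerConfig {
  return {
    command: "npx",
    args: ["@composio/mcp@latest", "start", "--url", url],
    env: { npm_config_yes: "true" },
  };
}

// The URL is elided in the source and is kept as-is here.
const servers = [composioServer("https://mcp.composio.dev/googlesheets/...")];
console.log(JSON.stringify({ servers }, null, 2));
```

The resulting object has the same shape as the argument passed to initializeMCPClient in the example above.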
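HyperAgent's capabilities can be extended with custom actions. Custom actions require 3 things: a type that names the action, an actionParams schema that describes its parameters, and a run function that performs it.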
Here is an example that performs a search using Exa
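import Exa from "exa-js";
import { z } from "zod";
import { HyperAgent } from "@hyperbrowser/agent";
// Assumption: the action types are exported from the package's types entry
// point; adjust the path if your version differs.
import {
  AgentActionDefinition,
  ActionContext,
  ActionOutput,
} from "@hyperbrowser/agent/types";

const exaInstance = new Exa(process.env.EXA_API_KEY);

// Parameter schema the agent fills in when it calls the action
const searchSchema = z
  .object({
    search: z
      .string()
      .describe(
        "The search query for something you want to search about. Keep the search query concise and to-the-point."
      ),
  })
  .describe("Search and return the results for a given query.");

export const RunSearchActionDefinition: AgentActionDefinition = {
  type: "perform_search",
  actionParams: searchSchema,
  run: async function (
    ctx: ActionContext,
    params: z.infer<typeof searchSchema>
  ): Promise<ActionOutput> {
    // Run the search and flatten each hit into a single line
    const results = (await exaInstance.search(params.search, {})).results
      .map(
        (res) =>
          `title: ${res.title} || url: ${res.url} || relevance: ${res.score}`
      )
      .join("\n");
    return {
      success: true,
      message: `Successfully performed search for query ${params.search}. Got results: \n${results}`,
    };
  },
};

// Register the custom action, then run a task that can use it
const agent = new HyperAgent({
  customActions: [RunSearchActionDefinition],
});
const result = await agent.executeTask(
  "Search about the news for today in New York"
);
console.log(result.output);
await agent.closeAgent();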
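The result-formatting step inside run can be pulled out into a pure function so it can be checked without calling the search API. This is a minimal sketch: SearchResult is a simplified, hypothetical shape used only for illustration (the real Exa result type has more fields), and the fallbacks for a missing title or score are design choices made here, not behavior of the original example.

```typescript
// Simplified result shape for illustration only.
interface SearchResult {
  title: string | null;
  url: string;
  score?: number;
}

// Formats results one per line, using the same
// "title || url || relevance" layout as the action above.
function formatResults(results: SearchResult[]): string {
  return results
    .map(
      (res) =>
        `title: ${res.title ?? "(untitled)"} || url: ${res.url} || relevance: ${res.score ?? "n/a"}`
    )
    .join("\n");
}

const formatted = formatResults([
  { title: "Example", url: "https://example.com", score: 0.9 },
  { title: null, url: "https://example.org" },
]);
console.log(formatted);
```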
We welcome contributions to HyperAgent! Here's how you can help:
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
{ "mcpServers": { "composio-mcp": { "command": "npx", "args": [ "@composio/mcp@latest", "start", "--url", "https://mcp.composio.dev/googlesheets/..." ], "env": { "npm_config_yes": "true" } } } }