Cartesia MCP Server

What is Cartesia MCP Server about?

Cartesia MCP Server is a local MCP (Model Context Protocol) server that connects clients such as Claude Desktop, Cursor, and OpenAI agents to Cartesia's speech AI platform. It supports listing voices, converting text to audio, localizing speech into other languages, infilling missing audio segments, and re‑voicing existing files.

How to use Cartesia MCP Server?

Create a Cartesia account and obtain an API key from the Cartesia playground.

Install the package:

pip install cartesia-mcp
which cartesia-mcp   # get the absolute path to the executable

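Configure the client (e.g., Claude Desktop or Cursor) by pointing it at the absolute path of the executable and supplying CARTESIA_API_KEY (and optionally OUTPUT_DIRECTORY) as environment variables.

Start the server through the client (clients such as Claude Desktop launch the configured executable themselves), then issue MCP commands from the client to perform audio operations.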
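
The configuration step can be sketched in Python. This is a minimal sketch, not the project's official snippet: it assumes the `mcpServers` layout that Claude Desktop and Cursor read from their JSON config files, and the helper name `build_mcp_config`, the entry label `cartesia-mcp`, and the placeholder values are illustrative.

```python
import json
import shutil


def build_mcp_config(executable, api_key, output_directory=None):
    """Return a client config dict with a single Cartesia server entry."""
    env = {"CARTESIA_API_KEY": api_key}
    if output_directory:
        # OUTPUT_DIRECTORY is optional, so only include it when given.
        env["OUTPUT_DIRECTORY"] = output_directory
    # "cartesia-mcp" is only a label for this entry (an assumption here).
    return {"mcpServers": {"cartesia-mcp": {"command": executable, "env": env}}}


if __name__ == "__main__":
    # Same lookup as `which cartesia-mcp`; uses a placeholder if not installed.
    path = shutil.which("cartesia-mcp") or "<absolute-path-to-cartesia-mcp>"
    config = build_mcp_config(path, "<your-cartesia-api-key>", "<output-directory>")
    print(json.dumps(config, indent=2))
```

Paste the printed JSON into the client's MCP configuration file, replacing the placeholders with real values, and check the client's own documentation for where that file lives.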

Key Features

Voice listing – Retrieve all available Cartesia voices.
Text‑to‑audio – Generate speech from any text using a chosen voice.
Localization – Translate and synthesize speech in a different language while preserving the speaker's style.
Audio infill – Seamlessly generate audio between two existing clips.
Voice swapping – Change the voice of an existing audio file to another Cartesia voice.
Simple integration – Ready‑to‑use config snippets for Claude Desktop and Cursor.
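
Under the hood, an MCP client reaches features like these by sending JSON-RPC 2.0 messages to the server. The sketch below only builds the two request shapes defined by the MCP specification (`tools/list` and `tools/call`). The tool name `text_to_speech` and its arguments in the usage note are hypothetical; the real names are whatever the server reports in its `tools/list` response. Clients such as Claude Desktop send these messages for you, so this is only an illustration of the protocol.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests need a unique id per request


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_tools_request():
    """Ask the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name, arguments):
    """Invoke one tool by name with a dict of arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})
```

For example, `json.dumps(call_tool_request("text_to_speech", {"text": "Hello"}))` yields a message of the kind a client would write to the server's standard input, assuming a tool with that name and argument exists.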

Use Cases

Developers building AI assistants that need on‑demand speech synthesis.
Content creators generating multilingual voice‑overs.
Applications requiring dynamic audio generation such as interactive games or virtual agents.
Automated pipelines that need to stitch or replace audio segments programmatically.

FAQ

Q: Do I need a paid Cartesia account? A: No, a free tier provides 20,000 credits per month, sufficient for most development and testing.

Q: Which languages are supported for localization? A: All languages offered by Cartesia’s API can be used; refer to the Cartesia documentation for the full list.

Q: Can I run the server without installing Python packages? A: No. The server is distributed as a Python package, so installation via pip is required.

Q: How do I specify the output folder for generated audio? A: Set the OUTPUT_DIRECTORY environment variable in the client configuration; if omitted, files are saved in the current working directory.
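
The fallback described in the last answer can be illustrated in a few lines of Python. This is not the server's actual code, only a sketch of the stated behaviour. The function name `resolve_output_dir`, the `~` expansion, and the treatment of an empty value as unset are assumptions.

```python
import os
from pathlib import Path


def resolve_output_dir(env=None):
    """Return OUTPUT_DIRECTORY if set, else the current working directory."""
    env = os.environ if env is None else env
    configured = env.get("OUTPUT_DIRECTORY")
    # An empty string is treated as "not set" (an assumption of this sketch).
    base = Path(configured).expanduser() if configured else Path.cwd()
    return base.resolve()
```

Calling `resolve_output_dir()` with no arguments reads the real process environment, so it shows where files would land for the environment the client passes to the server.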