by mberg
Kokoro Text to Speech (TTS) MCP Server that generates .mp3 files with option to upload to S3.
Kokoro Text to Speech (TTS) MCP Server is a Python-based application that converts text into MP3 audio files. It offers the functionality to optionally upload these generated MP3s to an Amazon S3 bucket, providing a convenient solution for managing and distributing audio content.
To use Kokoro TTS MCP Server, you need to:
1. Clone the project, then download the kokoro-v1.0.onnx and voices-v1.0.bin files from the Kokoro Onnx Weights repository and place them in the same directory as the cloned project.
2. Install FFmpeg (on macOS: brew install ffmpeg).
3. Create a .env file (or set the values directly in your MCP configs) with AWS credentials (if using S3), S3 bucket details, and other server settings such as default voice, speed, and language.
4. Start the server: uv run mcp-tts.py
5. Use the mcp_client.py script to send text-to-speech requests to the server. You can provide text directly, read from a file, customize voice and speed, and control S3 uploads.
Q: What is FFmpeg needed for?
A: FFmpeg is required to convert the generated WAV audio files into MP3 format.
Q: How do I configure S3 uploads?
A: You need to set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET_NAME, and AWS_S3_REGION in your environment variables or MCP configs. You can also enable or disable S3 uploads using the S3_ENABLED variable or the client's --no-s3 option.
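As a rough illustration of how these variables could drive an upload, here is a minimal sketch that assumes the boto3 library is installed. The helper names s3_enabled, build_s3_key, and upload_mp3 are hypothetical and are not taken from mcp-tts.py; only the environment variable names come from this page.

```python
import os
from pathlib import Path


def s3_enabled(env=os.environ) -> bool:
    """Treat "true" or "1" (case-insensitive) as enabled."""
    return env.get("S3_ENABLED", "").strip().lower() in ("true", "1")


def build_s3_key(local_path: str, folder: str = "") -> str:
    """Join the optional AWS_S3_FOLDER prefix with the file name."""
    name = Path(local_path).name
    folder = folder.strip("/")
    return f"{folder}/{name}" if folder else name


def upload_mp3(local_path: str, env=os.environ) -> str:
    """Upload one file and return its object key (hypothetical helper)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client(
        "s3",
        aws_access_key_id=env["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        region_name=env.get("AWS_S3_REGION"),
        # Assumption: an empty value means "use the default AWS endpoint".
        endpoint_url=env.get("AWS_S3_ENDPOINT_URL") or None,
    )
    key = build_s3_key(local_path, env.get("AWS_S3_FOLDER", ""))
    client.upload_file(local_path, env["AWS_S3_BUCKET_NAME"], key)
    return key
```

Keeping the key-building logic separate from the network call makes it possible to check the resulting object path without touching AWS.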
Q: Can I change the default voice or speed?
A: Yes, you can set the TTS_VOICE and TTS_SPEED environment variables for default values, or specify them per request using the client's --voice and --speed options.
Q: How are MP3 files managed locally?
A: MP3 files are stored in the directory specified by MP3_FOLDER. You can configure MP3_RETENTION_DAYS to automatically delete old files, or DELETE_LOCAL_AFTER_S3_UPLOAD to remove them after a successful S3 upload.
Q: How do I run the server and client on the same machine?
A: Ensure the server binds to 0.0.0.0 or 127.0.0.1 and the client connects to localhost or 127.0.0.1.
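If the client cannot connect, a quick reachability check can help separate network problems from server problems. The sketch below assumes the server accepts plain TCP connections on its configured port (9876 is the MCP_PORT default given on this page); the function name server_reachable is hypothetical. Note that 0.0.0.0 is a bind address, so the check targets 127.0.0.1.

```python
import socket


def server_reachable(host: str = "127.0.0.1", port: int = 9876, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("server reachable:", server_reachable())
```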
Uses: https://huggingface.co/spaces/hexgrad/Kokoro-TTS
Add the following to your MCP configs. Update with your own values.
"kokoro-tts-mcp": {
  "command": "uv",
  "args": [
    "--directory",
    "/path/toyourlocal/kokoro-tts-mcp",
    "run",
    "mcp-tts.py"
  ],
  "env": {
    "TTS_VOICE": "af_heart",
    "TTS_SPEED": "1.0",
    "TTS_LANGUAGE": "en-us",
    "AWS_ACCESS_KEY_ID": "",
    "AWS_SECRET_ACCESS_KEY": "",
    "AWS_REGION": "us-east-1",
    "AWS_S3_FOLDER": "mp3",
    "S3_ENABLED": "true",
    "MP3_FOLDER": "/path/to/mp3"
  }
}
FFmpeg is needed to convert .wav files to .mp3 files.
On macOS:
brew install ffmpeg
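How mcp-tts.py invokes FFmpeg is not shown on this page, so the following is only a sketch of one common way to do the WAV to MP3 step from Python. It assumes ffmpeg is on PATH and was built with the libmp3lame encoder; the function names are hypothetical.

```python
import shutil
import subprocess


def ffmpeg_command(wav_path: str, mp3_path: str) -> list[str]:
    """Build an ffmpeg invocation that encodes a WAV file as MP3."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", wav_path,            # input WAV file
        "-codec:a", "libmp3lame",  # MP3 encoder
        "-qscale:a", "2",          # VBR quality (lower means higher quality)
        mp3_path,
    ]


def wav_to_mp3(wav_path: str, mp3_path: str) -> None:
    """Run ffmpeg, failing early with a clear error if it is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH; install it first")
    subprocess.run(ffmpeg_command(wav_path, mp3_path), check=True, capture_output=True)
```

Checking for the binary up front gives a clearer error than a failed subprocess call when FFmpeg has not been installed yet.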
To run locally, add these to your .env file. See env.example: copy it to .env and modify it with your own values.
AWS_ACCESS_KEY_ID: Your AWS access key ID
AWS_SECRET_ACCESS_KEY: Your AWS secret access key
AWS_S3_BUCKET_NAME: S3 bucket name
AWS_S3_REGION: S3 region (e.g., us-east-1)
AWS_S3_FOLDER: Folder path within the S3 bucket
AWS_S3_ENDPOINT_URL: Optional custom endpoint URL for S3-compatible storage
MCP_HOST: Host to bind the server to (default: 0.0.0.0)
MCP_PORT: Port to listen on (default: 9876)
MCP_CLIENT_HOST: Hostname for client connections to the server (default: localhost)
DEBUG: Enable debug mode (set to "true" or "1")
S3_ENABLED: Enable S3 uploads (set to "true" or "1")
MP3_FOLDER: Path to store MP3 files (default: the mp3 folder in the script directory)
MP3_RETENTION_DAYS: Number of days to keep MP3 files before automatic deletion
DELETE_LOCAL_AFTER_S3_UPLOAD: Whether to delete local MP3 files after a successful S3 upload (set to "true" or "1")
TTS_VOICE: Default voice for the TTS client (default: af_heart)
TTS_SPEED: Default speed for the TTS client (default: 1.0)
TTS_LANGUAGE: Default language for the TTS client (default: en-us)
The preferred way to run the server is with uv:
uv run mcp-tts.py
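To make the defaults above concrete, here is a minimal sketch of how a script could read a subset of these variables at startup. The Settings class and the load_settings and env_flag helpers are hypothetical; the variable names and default values are the ones listed on this page, and the actual parsing in mcp-tts.py may differ.

```python
import os
from dataclasses import dataclass


def env_flag(name: str, env, default: bool = False) -> bool:
    """Interpret "true" or "1" (case-insensitive) as True."""
    value = env.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("true", "1")


@dataclass
class Settings:
    host: str
    port: int
    voice: str
    speed: float
    language: str
    s3_enabled: bool
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "9876")),
        voice=env.get("TTS_VOICE", "af_heart"),
        speed=float(env.get("TTS_SPEED", "1.0")),
        language=env.get("TTS_LANGUAGE", "en-us"),
        s3_enabled=env_flag("S3_ENABLED", env),
        debug=env_flag("DEBUG", env),
    )
```

Passing the mapping in as an argument, rather than reading os.environ directly inside the function, makes the parsing easy to try out with a plain dictionary.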
The mcp_client.py script allows you to send TTS requests to the server. When running the server and client on the same machine, the server should bind to 0.0.0.0 (all interfaces) or 127.0.0.1 (localhost only), and the client should connect to localhost or 127.0.0.1. The client can be used as follows:
python mcp_client.py --text "Hello, world!"
python mcp_client.py --file my_text.txt
python mcp_client.py --text "Hello, world!" --voice "en_female" --speed 1.2
python mcp_client.py --text "Hello, world!" --no-s3
python mcp_client.py --help
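The commands above imply a small command-line interface. The sketch below shows how such an interface could be declared with argparse. It is an illustrative reconstruction, not the actual mcp_client.py: the flag names come from the examples above, while the mutually exclusive --text/--file group and the helper names are assumptions.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Send a TTS request (illustrative sketch)")
    # Assumption: exactly one text source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--text", help="text to convert to speech")
    source.add_argument("--file", help="path to a text file to read")
    parser.add_argument("--voice", default=os.environ.get("TTS_VOICE", "af_heart"))
    parser.add_argument("--speed", type=float,
                        default=float(os.environ.get("TTS_SPEED", "1.0")))
    parser.add_argument("--no-s3", action="store_true",
                        help="skip the S3 upload for this request")
    return parser


def read_text(args: argparse.Namespace) -> str:
    """Return the text to speak, from --text or from the file given by --file."""
    if args.text is not None:
        return args.text
    with open(args.file, encoding="utf-8") as handle:
        return handle.read()
```

Reading the defaults for --voice and --speed from TTS_VOICE and TTS_SPEED mirrors the behavior described for those variables, so a per-request flag overrides the environment default.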
The TTS server generates MP3 files that are stored locally and optionally uploaded to S3. You can configure how these files are managed:
Set MP3_FOLDER in your .env file to specify where MP3 files are stored.
Set MP3_RETENTION_DAYS=30 (or any number) to automatically delete files older than that number of days.
Set DELETE_LOCAL_AFTER_S3_UPLOAD=true to delete local files immediately after a successful S3 upload.
Enable or disable S3 uploads with S3_ENABLED=true or DISABLE_S3=true in your .env file.
Disable S3 for an individual request with the client's --no-s3 option.
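The retention behavior described above can be pictured as a periodic cleanup pass over the MP3 folder. The following is a minimal sketch under the assumption that file age is judged by modification time; the function name delete_old_mp3s is hypothetical and the server's actual cleanup logic is not shown on this page.

```python
import time
from pathlib import Path
from typing import Optional


def delete_old_mp3s(folder: str, retention_days: int, now: Optional[float] = None) -> list[str]:
    """Delete *.mp3 files older than retention_days and return their names."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    deleted = []
    for path in Path(folder).glob("*.mp3"):
        # Compare the last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```

For example, delete_old_mp3s("mp3", 30) would remove files in the mp3 folder last modified more than 30 days ago and leave newer ones in place.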