by datastrato
mcp-server-gravitino is an MCP (Model Context Protocol) server for Apache Gravitino. It provides a simplified interface for interacting with Gravitino APIs, enabling LLMs (Large Language Models) to explore metadata of structured and unstructured data and perform data governance tasks such as tagging and classification. It offers metadata operations and flexible authorization.
To use mcp-server-gravitino, first install uv for dependency and virtual environment management. Then clone the repository, navigate into the directory, create and activate a virtual environment, and install dependencies using uv sync.
Configuration involves setting environment variables for GRAVITINO_METALAKE and GRAVITINO_URI. For authorization, you can use either token-based authentication (GRAVITINO_JWT_TOKEN) or basic authentication (GRAVITINO_USERNAME, GRAVITINO_PASSWORD). You can also control which tools are activated using GRAVITINO_ACTIVE_TOOLS.
To launch the server, execute a uv run command with arguments that include fastmcp, httpx, and mcp-server-gravitino as runtime dependencies, and then run the server module. An example configuration for integrating with a Goose client is also provided, demonstrating how to set up the command and environment variables.
Q: What is the primary purpose of mcp-server-gravitino? A: It acts as an MCP server for Apache Gravitino, allowing LLMs to interact with Gravitino metadata and perform data governance tasks.
Q: What authentication methods are supported? A: Both token-based authentication (JWT) and basic authentication (username/password) are supported.
Q: Can I control which Gravitino APIs are exposed? A: Yes, you can specify which tools (APIs) to activate using the GRAVITINO_ACTIVE_TOOLS environment variable.
Q: What dependency management tool does this project use? A: It uses uv for dependency and virtual environment management.
Q: Does it expose all Gravitino APIs? A: No, it provides a selected set of optimized tools designed to return concise and relevant metadata, staying within LLM token limits while maintaining semantic integrity.
MCP server providing Gravitino APIs - A FastMCP integration for Apache Gravitino services.
This project uses uv as the dependency and virtual environment management tool. Please ensure uv is installed on your system.
Clone the repository:
git clone git@github.com:datastrato/mcp-server-gravitino.git
Navigate into the project directory:
cd mcp-server-gravitino
Create a virtual environment:
uv venv
Activate the virtual environment:
source .venv/bin/activate
Install dependencies:
uv sync
Regardless of the authorization method, the following environment variables must be set:
GRAVITINO_METALAKE=<YOUR_METALAKE> # default: "metalake_demo"
GRAVITINO_URI=<YOUR_GRAVITINO_URI>
GRAVITINO_URI: The base URL of your Gravitino server.
GRAVITINO_METALAKE: The name of the metalake to use.
mcp-server-gravitino supports both token-based and basic authentication methods. These mechanisms allow secure access to MCP tools and prompts and are suitable for integration with external systems.
For token-based authentication, set the following environment variable:
GRAVITINO_JWT_TOKEN=<YOUR_GRAVITINO_JWT_TOKEN>
GRAVITINO_JWT_TOKEN: The JWT token for authentication.
Alternatively, you can use basic authentication:
GRAVITINO_USERNAME=<YOUR_GRAVITINO_USERNAME>
GRAVITINO_PASSWORD=<YOUR_GRAVITINO_PASSWORD>
GRAVITINO_USERNAME: The username for Gravitino authentication.
GRAVITINO_PASSWORD: The corresponding password.
Tool activation is currently based on method names (e.g., get_list_of_tables). You can specify which tools to activate by setting the optional environment variable GRAVITINO_ACTIVE_TOOLS. The default value is *, which activates all tools. To activate only the get_list_of_roles tool, set the environment variable as follows:
GRAVITINO_ACTIVE_TOOLS=get_list_of_roles
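The configuration rules above can be sketched as a small loader. This is a minimal illustration and not the server's actual code: the `GravitinoConfig` class and `load_config` function are hypothetical names, and giving a token precedence over basic credentials when both are set is an assumption that this README does not confirm.

```python
import os
from dataclasses import dataclass


@dataclass
class GravitinoConfig:
    """Hypothetical container for the settings described above."""
    uri: str
    metalake: str
    auth: dict
    active_tools: str = "*"


def load_config(env=None):
    """Build a config from environment variables (illustrative only)."""
    env = os.environ if env is None else env

    uri = env.get("GRAVITINO_URI")
    if not uri:
        raise ValueError("GRAVITINO_URI must be set")
    # The README lists "metalake_demo" as the default metalake.
    metalake = env.get("GRAVITINO_METALAKE", "metalake_demo")

    token = env.get("GRAVITINO_JWT_TOKEN")
    username = env.get("GRAVITINO_USERNAME")
    password = env.get("GRAVITINO_PASSWORD")
    if token:
        # Assumption: a token wins when both methods are configured.
        auth = {"type": "token", "token": token}
    elif username and password:
        auth = {"type": "basic", "username": username, "password": password}
    else:
        raise ValueError(
            "Set GRAVITINO_JWT_TOKEN, or GRAVITINO_USERNAME and GRAVITINO_PASSWORD"
        )

    # "*" is the documented default and activates all tools.
    active_tools = env.get("GRAVITINO_ACTIVE_TOOLS", "*")
    return GravitinoConfig(uri, metalake, auth, active_tools)
```

Passing a plain dictionary instead of the real environment, for example `load_config({"GRAVITINO_URI": "http://localhost:8090", "GRAVITINO_JWT_TOKEN": "..."})`, makes it easy to check a configuration before launching the server.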
To launch the Gravitino MCP Server, run the following command:
uv \
--directory /path/to/mcp-gravitino \
run \
--with fastmcp \
--with httpx \
--with mcp-server-gravitino \
python -m mcp_server_gravitino.server
The meaning of each argument is as follows:
Argument | Description |
---|---|
uv | Launches the UV CLI tool |
--directory /path/to/mcp-gravitino | Specifies the working project directory with pyproject.toml |
run | Indicates that a command will be executed in the managed environment |
--with fastmcp | Adds fastmcp to the runtime environment without altering project deps |
--with httpx | Adds httpx dependency for async HTTP functionality |
--with mcp-server-gravitino | Adds the local module as a runtime dependency |
python -m mcp_server_gravitino.server | Starts the MCP server using the package's entry module |
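As a sketch of how the arguments in the table combine, the same command can be assembled programmatically. The helper names `build_launch_command` and `launch` are hypothetical and the project directory is a placeholder. Only the argument order is taken from this README.

```python
import os
import subprocess


def build_launch_command(project_dir):
    """Return the argv list for the launch command described above."""
    return [
        "uv",
        "--directory", project_dir,  # project directory containing pyproject.toml
        "run",
        "--with", "fastmcp",
        "--with", "httpx",
        "--with", "mcp-server-gravitino",
        "python", "-m", "mcp_server_gravitino.server",
    ]


def launch(project_dir, gravitino_env):
    """Start the server as a child process (illustrative only).

    gravitino_env holds the GRAVITINO_* variables; it is merged into the
    current environment so that PATH and similar settings are preserved.
    """
    env = {**os.environ, **gravitino_env}
    return subprocess.Popen(build_launch_command(project_dir), env=env)
```

Keeping the arguments in a list rather than a single shell string avoids quoting problems when the project path contains spaces.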
Example configuration to run the server using Goose:
{
"mcpServers": {
"Gravitino": {
"command": "uv",
"args": [
"--directory",
"/Users/user/workspace/mcp-server-gravitino",
"run",
"--with",
"fastmcp",
"--with",
"httpx",
"--with",
"mcp-server-gravitino",
"python",
"-m",
"mcp_server_gravitino.server"
],
"env": {
"GRAVITINO_URI": "http://localhost:8090",
"GRAVITINO_USERNAME": "admin",
"GRAVITINO_PASSWORD": "admin",
"GRAVITINO_METALAKE": "metalake_demo"
}
}
}
}
mcp-server-gravitino does not expose all Gravitino APIs, but provides a selected set of optimized tools:
get_list_of_catalogs: Retrieve a list of catalogs
get_list_of_schemas: Retrieve a list of schemas
get_list_of_tables: Retrieve a paginated list of tables
get_table_by_fqn: Fetch detailed information for a specific table
get_table_columns_by_fqn: Retrieve column information for a table
get_list_of_tags: Retrieve all tags
associate_tag_to_entity: Attach a tag to a table or column
list_objects_by_tag: List objects associated with a specific tag
get_list_of_roles: Retrieve all roles
get_list_of_users: Retrieve all users
grant_role_to_user: Assign a role to a user
revoke_role_from_user: Revoke a user's role
get_list_of_models: Retrieve a list of models
get_list_of_model_versions_by_fqn: Get versions of a model by fully qualified name
Each tool is designed to return concise and relevant metadata to stay within LLM token limits while maintaining semantic integrity.
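To illustrate how name-based activation might interact with this tool list, the sketch below filters the tool names by a GRAVITINO_ACTIVE_TOOLS value. It is an illustration under stated assumptions and not the server's implementation: the `select_tools` helper is hypothetical, and both the comma-separated syntax for several names and the choice to reject unknown names are assumptions.

```python
# Tool names as listed in this README.
AVAILABLE_TOOLS = [
    "get_list_of_catalogs",
    "get_list_of_schemas",
    "get_list_of_tables",
    "get_table_by_fqn",
    "get_table_columns_by_fqn",
    "get_list_of_tags",
    "associate_tag_to_entity",
    "list_objects_by_tag",
    "get_list_of_roles",
    "get_list_of_users",
    "grant_role_to_user",
    "revoke_role_from_user",
    "get_list_of_models",
    "get_list_of_model_versions_by_fqn",
]


def select_tools(active_tools="*"):
    """Return the tool names enabled by a GRAVITINO_ACTIVE_TOOLS value."""
    if active_tools.strip() == "*":
        # Documented default: activate every tool.
        return list(AVAILABLE_TOOLS)

    # Assumption: several names are separated by commas.
    requested = [name.strip() for name in active_tools.split(",") if name.strip()]
    unknown = [name for name in requested if name not in AVAILABLE_TOOLS]
    if unknown:
        # Assumption: a misspelled name is an error rather than silently ignored.
        raise ValueError("Unknown tool name(s): " + ", ".join(unknown))

    # Keep the order of the list above, not the order of the request.
    return [name for name in AVAILABLE_TOOLS if name in requested]
```

Restricting the active set this way is useful when a client should only read metadata: leaving out grant_role_to_user, revoke_role_from_user, and associate_tag_to_entity would keep the model from making changes.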
This project is licensed under the Apache License Version 2.0.