Groq
Ultra-fast AI inference with Groq LPU. Run Llama, Mixtral, and Gemma models at lightning speed. OpenAI-compatible API.
🔑 Setup Required — Bring Your Own Key
This integration requires API credentials. Store them once via setup_integration or the Dashboard Vault; they're encrypted with AES-256-GCM and never exposed.
Required Credentials
groq_api_key (API Key)
API key from console.groq.com → API Keys. Starts with "gsk_". Get your API key at groq.com.
Endpoints (2)
POST chat-completion
Send a conversation and get an AI response. Ultra-fast inference. OpenAI-compatible format.
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | ✓ Yes | Model: "llama-3.3-70b-versatile" (best), "llama-3.1-8b-instant" (fastest), "mixtral-8x7b-32768" (long context), "gemma2-9b-it" (Google). |
| messages | array | ✓ Yes | Conversation messages. Example: [{"role": "user", "content": "Hello"}]. OpenAI format. |
| max_tokens | integer | No | Max response tokens (default: model-dependent). |
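Since the endpoint accepts OpenAI-format payloads, a request can be built and sent with only the standard library. A minimal sketch, assuming Groq's documented OpenAI-compatible base URL (https://api.groq.com/openai/v1) and a GROQ_API_KEY environment variable; verify both against console.groq.com:

```python
import json
import os
import urllib.request

# Assumed Groq OpenAI-compatible endpoint; confirm at console.groq.com.
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(prompt: str,
                       model: str = "llama-3.3-70b-versatile",
                       max_tokens: int = 256) -> dict:
    """Build an OpenAI-format chat-completion payload for Groq."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send_chat_request(payload: dict) -> dict:
    """POST the payload to Groq. Requires GROQ_API_KEY in the environment."""
    req = urllib.request.Request(
        GROQ_CHAT_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Build (but don't send) a request payload.
payload = build_chat_request("Hello")
print(payload["model"])  # → llama-3.3-70b-versatile
```

The assistant's reply would be read from `choices[0]["message"]["content"]` of the returned JSON, as in the OpenAI format.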
GET list-models
List available models on Groq with their capabilities and context window sizes.
No input parameters required.
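A sketch of fetching and parsing the model catalog. The /models URL and the per-model context_window field are assumptions based on Groq's OpenAI-compatible API; check the actual response shape against Groq's docs:

```python
import json
import urllib.request

# Assumed Groq OpenAI-compatible models endpoint.
GROQ_MODELS_URL = "https://api.groq.com/openai/v1/models"

def list_models(api_key: str) -> list:
    """Fetch the model catalog; returns the 'data' list of model objects."""
    req = urllib.request.Request(
        GROQ_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

def context_windows(models_response: dict) -> dict:
    """Map model id -> context window from a list-models response.
    'context_window' is an assumed field name; adjust if the API differs."""
    return {m["id"]: m.get("context_window") for m in models_response["data"]}

# Parse a hypothetical sample response offline.
sample = {"data": [{"id": "llama-3.1-8b-instant", "context_window": 131072}]}
print(context_windows(sample))  # → {'llama-3.1-8b-instant': 131072}
```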
MCP Tool Names
When using this integration through an AI assistant (Claude, ChatGPT, Cursor, etc.), the endpoints are available as MCP tools:
| Endpoint | MCP Tool Name |
|---|---|
| chat-completion | groq_chat_completion |
| list-models | groq_list_models |
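The mapping in the table above appears to follow a simple convention, inferred here (not stated by the source): the integration prefix plus the endpoint slug with hyphens replaced by underscores.

```python
def mcp_tool_name(endpoint: str, prefix: str = "groq") -> str:
    """Derive an MCP tool name from an endpoint slug.
    Inferred convention: '<prefix>_' + slug with hyphens as underscores."""
    return f"{prefix}_{endpoint.replace('-', '_')}"

print(mcp_tool_name("chat-completion"))  # → groq_chat_completion
```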