Ultimate LLM API Documentation

Comprehensive reference for all major Language Model APIs - Updated 2025

📋 Quick Navigation

🤖 OpenAI
Industry-leading AI with GPT-5 (released Aug 7, 2025) - unified reasoning and fast responses
🔗 API Endpoints
Base URL:
https://api.openai.com
Chat Completions:
POST https://api.openai.com/v1/chat/completions
Completions (Legacy):
POST https://api.openai.com/v1/completions
Images:
POST https://api.openai.com/v1/images/generations
Audio (Speech-to-Text):
POST https://api.openai.com/v1/audio/transcriptions
Audio (Text-to-Speech):
POST https://api.openai.com/v1/audio/speech
🎯 Available Models (Updated Aug 8, 2025)
gpt-5 🆕
Latest flagship with built-in reasoning - released Aug 7, 2025
gpt-5-mini 🆕
Lightweight version for cost-sensitive applications
gpt-5-nano 🆕
Ultra-low latency for instant responses
gpt-5-chat 🆕
Advanced natural conversations for enterprise
gpt-4.1
Specialized coding model with 1M token context
gpt-4.1-mini
Fast, efficient coding assistant
gpt-4.1-nano
Fastest and cheapest for low-latency tasks
gpt-4o
Multimodal model (superseded by GPT-5)
o3
Advanced reasoning model
o4-mini
Fast, cost-efficient reasoning model
gpt-image-1
Professional image generation model
📝 Text 🖼️ Images 🎵 Audio 💻 Code 🔧 Function Calling 🧠 Built-in Reasoning ⚡ Real-time Router 🔄 Streaming 🆓 Free Access (GPT-5)
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY

🚨 Model Status Updates:
• GPT-5 released Aug 7, 2025 (available to ALL users including free)
• GPT-4.5 being deprecated July 14, 2025
• GPT-4 retired from ChatGPT April 2025 (API still available)
• GPT-4o mini replaced by GPT-4.1 mini
🧠 Anthropic (Claude)
Constitutional AI with advanced reasoning and safety features
🔗 API Endpoints
Base URL:
https://api.anthropic.com
Messages:
POST https://api.anthropic.com/v1/messages
Models List:
GET https://api.anthropic.com/v1/models
OpenAI Compatible:
POST https://api.anthropic.com/v1/chat/completions
🎯 Available Models
claude-sonnet-4-20250514
Latest Claude 4 Sonnet with thinking capabilities
claude-opus-4.1
Most powerful Claude model for complex tasks
claude-3.7-sonnet
Extended thinking model with step-by-step reasoning
claude-3.5-sonnet
Balanced performance and efficiency
claude-3.5-haiku
Fast model for lightweight tasks
📝 Text 🖼️ Images 📄 Documents 💻 Code 🔧 Function Calling 🧠 Extended Thinking 🔄 Streaming
🔑 Authentication
Headers: x-api-key: YOUR_API_KEY, anthropic-version: 2023-06-01
🌟 Google (Gemini)
Multimodal AI with native understanding of text, images, video, and audio
🔗 API Endpoints
Base URL (AI Studio):
https://generativelanguage.googleapis.com
Generate Content:
POST https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
Vertex AI:
https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent
Live API (Real-time):
wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService/BidiGenerateContent
🎯 Available Models
gemini-2.5-pro
State-of-the-art thinking model with advanced reasoning
gemini-2.5-flash
Best price-performance multimodal model
gemini-2.0-flash
Latest multimodal model with enhanced capabilities
gemini-1.5-pro
Large context window, handles 2M tokens
gemini-1.5-flash
Fast and efficient for everyday tasks
📝 Text 🖼️ Images 🎥 Video 🎵 Audio 💻 Code 🔧 Function Calling 🔍 Grounding 🎨 Image Generation 🎭 TTS
🔑 Authentication
API Key: ?key=YOUR_API_KEY or Header: Authorization: Bearer YOUR_ACCESS_TOKEN
🚀 xAI (Grok)
Real-time information and reasoning capabilities with live search integration
🔗 API Endpoints
Base URL:
https://api.x.ai/v1
Chat Completions:
POST https://api.x.ai/v1/chat/completions
Models:
GET https://api.x.ai/v1/models
🎯 Available Models
grok-4
Most intelligent model with native tool use and real-time search
grok-4-heavy
Most powerful version of Grok 4
grok-3
Advanced reasoning and code generation
grok-3-mini
Lightweight version for faster responses
grok-beta
Latest experimental model (128k context)
📝 Text 🖼️ Images 💻 Code 🔧 Function Calling 🔍 Live Search 🧠 Reasoning 🔄 Streaming
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
🇫🇷 Mistral AI
European AI with strong multilingual capabilities and specialized models
🔗 API Endpoints
Base URL:
https://api.mistral.ai
Chat Completions:
POST https://api.mistral.ai/v1/chat/completions
Embeddings:
POST https://api.mistral.ai/v1/embeddings
Fine-tuning:
POST https://api.mistral.ai/v1/fine_tuning/jobs
🎯 Available Models
mistral-large-latest
Flagship model for complex reasoning and analysis
mistral-medium-2505
Balanced frontier-class multimodal performance
mistral-small-latest
Cost-effective model for general tasks
codestral-2501
Specialized coding model
mistral-embed
High-quality text embeddings
mistral-ocr-2505
Document processing and OCR
📝 Text 🖼️ Images 💻 Code 🔧 Function Calling 🌍 Multilingual 📄 Document Processing 🎯 Fine-tuning
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
🔍 DeepSeek
High-performance models with advanced reasoning capabilities at competitive pricing
🔗 API Endpoints
Base URL:
https://api.deepseek.com
Chat Completions:
POST https://api.deepseek.com/chat/completions
OpenAI Compatible:
POST https://api.deepseek.com/v1/chat/completions
Models:
GET https://api.deepseek.com/models
🎯 Available Models
deepseek-reasoner (R1-0528)
Advanced reasoning model with step-by-step thinking
deepseek-chat (V3-0324)
General-purpose model with 671B parameters, 37B active
deepseek-r1
Reasoning model comparable to OpenAI o1
deepseek-v3
Mixture-of-Experts model for general tasks
📝 Text 💻 Code 🧠 Reasoning 🔧 Function Calling 🎯 Mathematics 💰 Cost-Effective 🔄 Streaming
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
🦙 Meta (Llama)
Open-source multimodal models with native tool use and extended context
🔗 API Endpoints
Official API (Preview):
https://api.llama.com/v1
Chat Completions:
POST https://api.llama.com/v1/chat/completions
Models:
GET https://api.llama.com/v1/models
Via Partners (Groq, Cerebras):
Multiple partner APIs available
🎯 Available Models
llama-4-scout
17B active params, 10M context, best multimodal in class
llama-4-maverick
17B active params, 128 experts, beats GPT-4o
llama-4-behemoth
288B active params, teacher model (in training)
llama-3.3-70b
Latest 70B model with improved capabilities
llama-3.1-405b
Largest open model with 405B parameters
llama-3.1-70b
Balanced performance and efficiency
📝 Text 🖼️ Images 💻 Code 🔧 Function Calling 🌍 Multilingual 📖 Long Context 🆓 Open Source
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY (limited preview access)
🇨🇳 Alibaba (Qwen)
Multilingual models with strong Asian language support and coding capabilities
🔗 API Endpoints
International Base URL:
https://dashscope-intl.aliyuncs.com
OpenAI Compatible:
POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
China Base URL:
https://dashscope.aliyuncs.com
Multimodal Generation:
POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
🎯 Available Models
qwen-max-2025-01-25
Latest flagship model, outperforms DeepSeek V3
qwen3-235b-a22b-instruct-2507
Large model with 256K context, 1M token support
qwen3-30b-a3b-instruct-2507
Mid-size model with excellent performance
qwen3-coder-480b-a35b
Specialized coding model with 480B params
qwen2.5-72b
Powerful model for general tasks
qwen-vl
Vision-language model for multimodal tasks
📝 Text 🖼️ Images 🎥 Video 💻 Code 🔧 Function Calling 🌏 Asian Languages 🧠 Thinking Mode
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY (Alibaba Cloud API Key)
🤝 Together AI
Platform with 200+ open-source models for various tasks and use cases
🔗 API Endpoints
Base URL:
https://api.together.xyz
Chat Completions:
POST https://api.together.xyz/v1/chat/completions
Completions:
POST https://api.together.xyz/v1/completions
Image Generation:
POST https://api.together.xyz/v1/images/generations
Fine-tuning:
POST https://api.together.xyz/v1/fine-tuning/jobs
🎯 Popular Models
meta-llama/Llama-4-Maverick-17B-128E-Instruct
Latest Llama 4 model
deepseek-ai/DeepSeek-V3
DeepSeek's latest model
Qwen/Qwen2.5-Coder-32B-Instruct
Qwen coding specialist
mistralai/Mixtral-8x7B-Instruct-v0.1
Mistral's MoE model
togethercomputer/StripedHyena-Nous-7B
Fast alternative architecture
📝 Text 🖼️ Images 💻 Code 🎨 Image Generation 🎯 Fine-tuning 🔧 Custom Models ⚡ Fast Inference
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
⚡ Groq
Ultra-fast inference with Language Processing Units (LPUs) for speed-critical applications
🔗 API Endpoints
Base URL:
https://api.groq.com
Chat Completions:
POST https://api.groq.com/openai/v1/chat/completions
Models:
GET https://api.groq.com/openai/v1/models
🎯 Available Models
llama-3.3-70b-versatile
Latest Llama model optimized for Groq
llama-3.1-8b-instant
Ultra-fast 8B model for quick responses
mixtral-8x7b-32768
Mistral's MoE model with extended context
gemma2-9b-it
Google's Gemma model
gpt-oss-120b
OpenAI open-source model
📝 Text 💻 Code ⚡ Ultra-Fast 🔧 Function Calling 🏢 Enterprise 🔄 Streaming
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
🎯 Cohere
Enterprise-focused AI with specialized models for RAG, embeddings, and reranking
🔗 API Endpoints
Base URL:
https://api.cohere.ai
Chat:
POST https://api.cohere.ai/v1/chat
Embeddings:
POST https://api.cohere.ai/v1/embed
Rerank:
POST https://api.cohere.ai/v1/rerank
Classify:
POST https://api.cohere.ai/v1/classify
🎯 Available Models
command-r-plus
Most powerful model with 128K context for RAG
command-r
Balanced model for complex workflows
command-a
Advanced model for enterprise use
command-a-vision
Multimodal model with image understanding
embed-english-v3.0
High-quality English embeddings
rerank-english-v3.0
Advanced reranking for search
📝 Text 🖼️ Images 🔍 RAG 📊 Embeddings 🔄 Reranking 🏷️ Classification 🏢 Enterprise
🔑 Authentication
Header: Authorization: Bearer YOUR_API_KEY
📅 Last updated: August 8, 2025 | 🔄 Continuously refreshed with the latest API information
🚨 BREAKING: GPT-5 released August 7, 2025 - Available to ALL users including free!
💡 For the most current pricing and model availability, always check the official documentation