Portkey Docs
HomeAPIIntegrationsChangelog
  • Introduction
    • What is Portkey?
    • Make Your First Request
    • Feature Overview
  • Integrations
    • LLMs
      • OpenAI
        • Structured Outputs
        • Prompt Caching
      • Anthropic
        • Prompt Caching
      • Google Gemini
      • Groq
      • Azure OpenAI
      • AWS Bedrock
      • Google Vertex AI
      • Bring Your Own LLM
      • AI21
      • Anyscale
      • Cerebras
      • Cohere
      • Fireworks
      • Deepbricks
      • Deepgram
      • Deepinfra
      • Deepseek
      • Google Palm
      • Huggingface
      • Inference.net
      • Jina AI
      • Lingyi (01.ai)
      • LocalAI
      • Mistral AI
      • Monster API
      • Moonshot
      • Nomic
      • Novita AI
      • Ollama
      • OpenRouter
      • Perplexity AI
      • Predibase
      • Reka AI
      • SambaNova
      • Segmind
      • SiliconFlow
      • Stability AI
      • Together AI
      • Voyage AI
      • Workers AI
      • ZhipuAI / ChatGLM / BigModel
      • Suggest a new integration!
    • Agents
      • Autogen
      • Control Flow
      • CrewAI
      • Langchain Agents
      • LlamaIndex
      • Phidata
      • Bring Your own Agents
    • Libraries
      • Autogen
      • DSPy
      • Instructor
      • Langchain (Python)
      • Langchain (JS/TS)
      • LlamaIndex (Python)
      • LibreChat
      • Promptfoo
      • Vercel
        • Vercel [Depricated]
  • Product
    • Observability (OpenTelemetry)
      • Logs
      • Tracing
      • Analytics
      • Feedback
      • Metadata
      • Filters
      • Logs Export
      • Budget Limits
    • AI Gateway
      • Universal API
      • Configs
      • Multimodal Capabilities
        • Image Generation
        • Function Calling
        • Vision
        • Speech-to-Text
        • Text-to-Speech
      • Cache (Simple & Semantic)
      • Fallbacks
      • Automatic Retries
      • Load Balancing
      • Conditional Routing
      • Request Timeouts
      • Canary Testing
      • Virtual Keys
        • Budget Limits
    • Prompt Library
      • Prompt Templates
      • Prompt Partials
      • Retrieve Prompts
      • Advanced Prompting with JSON Mode
    • Guardrails
      • List of Guardrail Checks
        • Patronus AI
        • Aporia
        • Pillar
        • Bring Your Own Guardrails
      • Creating Raw Guardrails (in JSON)
    • Autonomous Fine-tuning
    • Enterprise Offering
      • Org Management
        • Organizations
        • Workspaces
        • User Roles & Permissions
        • API Keys (AuthN and AuthZ)
      • Access Control Management
      • Budget Limits
      • Security @ Portkey
      • Logs Export
      • Private Cloud Deployments
        • Architecture
        • AWS
        • GCP
        • Azure
        • Cloudflare Workers
        • F5 App Stack
      • Components
        • Log Store
          • MongoDB
    • Open Source
    • Portkey Pro & Enterprise Plans
  • API Reference
    • Introduction
    • Authentication
    • OpenAPI Specification
    • Headers
    • Response Schema
    • Gateway Config Object
    • SDK
  • Provider Endpoints
    • Supported Providers
    • Chat
    • Embeddings
    • Images
      • Create Image
      • Create Image Edit
      • Create Image Variation
    • Audio
      • Create Speech
      • Create Transcription
      • Create Translation
    • Fine-tuning
      • Create Fine-tuning Job
      • List Fine-tuning Jobs
      • Retrieve Fine-tuning Job
      • List Fine-tuning Events
      • List Fine-tuning Checkpoints
      • Cancel Fine-tuning
    • Batch
      • Create Batch
      • List Batch
      • Retrieve Batch
      • Cancel Batch
    • Files
      • Upload File
      • List Files
      • Retrieve File
      • Retrieve File Content
      • Delete File
    • Moderations
    • Assistants API
      • Assistants
        • Create Assistant
        • List Assistants
        • Retrieve Assistant
        • Modify Assistant
        • Delete Assistant
      • Threads
        • Create Thread
        • Retrieve Thread
        • Modify Thread
        • Delete Thread
      • Messages
        • Create Message
        • List Messages
        • Retrieve Message
        • Modify Message
        • Delete Message
      • Runs
        • Create Run
        • Create Thread and Run
        • List Runs
        • Retrieve Run
        • Modify Run
        • Submit Tool Outputs to Run
        • Cancel Run
      • Run Steps
        • List Run Steps
        • Retrieve Run Steps
    • Completions
    • Gateway for Other API Endpoints
  • Portkey Endpoints
    • Configs
      • Create Config
      • List Configs
      • Retrieve Config
      • Update Config
    • Feedback
      • Create Feedback
      • Update Feedback
    • Guardrails
    • Logs
      • Insert a Log
      • Log Exports [BETA]
        • Retrieve a Log Export
        • Update a Log Export
        • List Log Exports
        • Create a Log Export
        • Start a Log Export
        • Cancel a Log Export
        • Download a Log Export
    • Prompts
      • Prompt Completion
      • Render
    • Virtual Keys
      • Create Virtual Key
      • List Virtual Keys
      • Retrieve Virtual Key
      • Update Virtual Key
      • Delete Virtual Key
    • Analytics
      • Graphs - Time Series Data
        • Get Requests Data
        • Get Cost Data
        • Get Latency Data
        • Get Tokens Data
        • Get Users Data
        • Get Requests per User
        • Get Errors Data
        • Get Error Rate Data
        • Get Status Code Data
        • Get Unique Status Code Data
        • Get Rescued Requests Data
        • Get Cache Hit Rate Data
        • Get Cache Hit Latency Data
        • Get Feedback Data
        • Get Feedback Score Distribution Data
        • Get Weighted Feeback Data
        • Get Feedback Per AI Models
      • Summary
        • Get All Cache Data
      • Groups - Paginated Data
        • Get User Grouped Data
        • Get Model Grouped Data
        • Get Metadata Grouped Data
    • API Keys [BETA]
      • Update API Key
      • Create API Key
      • Delete an API Key
      • Retrieve an API Key
      • List API Keys
    • Admin
      • Users
        • Retrieve a User
        • Retrieve All Users
        • Update a User
        • Remove a User
      • User Invites
        • Invite a User
        • Retrieve an Invite
        • Retrieve All User Invites
        • Delete a User Invite
      • Workspaces
        • Create Workspace
        • Retrieve All Workspaces
        • Retrieve a Workspace
        • Update Workspace
        • Delete a Workspace
      • Workspace Members
        • Add a Workspace Member
        • Retrieve All Workspace Members
        • Retrieve a Workspace Member
        • Update Workspace Member
        • Remove Workspace Member
  • Guides
    • Getting Started
      • A/B Test Prompts and Models
      • Tackling Rate Limiting
      • Function Calling
      • Image Generation
      • Getting started with AI Gateway
      • Llama 3 on Groq
      • Return Repeat Requests from Cache
      • Trigger Automatic Retries on LLM Failures
      • 101 on Portkey's Gateway Configs
    • Integrations
      • Llama 3 on Portkey + Together AI
      • Introduction to GPT-4o
      • Anyscale
      • Mistral
      • Vercel AI
      • Deepinfra
      • Groq
      • Langchain
      • Mixtral 8x22b
      • Segmind
    • Use Cases
      • Few-Shot Prompting
      • Enforcing JSON Schema with Anyscale & Together
      • Detecting Emotions with GPT-4o
      • Build an article suggestion app with Supabase pgvector, and Portkey
      • Setting up resilient Load balancers with failure-mitigating Fallbacks
      • Run Portkey on Prompts from Langchain Hub
      • Smart Fallback with Model-Optimized Prompts
      • How to use OpenAI SDK with Portkey Prompt Templates
      • Setup OpenAI -> Azure OpenAI Fallback
      • Fallback from SDXL to Dall-e-3
      • Comparing Top10 LMSYS Models with Portkey
      • Build a chatbot using Portkey's Prompt Templates
  • Support
    • Contact Us
    • Developer Forum
    • Common Errors & Resolutions
    • December '23 Migration
    • Changelog
Powered by GitBook
On this page
  • Example Configs
  • Schema Details
  • Strategy Object Details
  • Cache Object Details
  • Retry Object Details
  • Cloud Provider Params (Azure OpenAI, Google Vertex, AWS Bedrock)
  • Notes
  • Examples

Was this helpful?

Edit on GitHub
  1. API Reference

Gateway Config Object

The config object is used to configure API interactions with various providers. It supports multiple modes such as single provider access, load balancing between providers, and fallback strategies.

The following JSON schema is used to validate the config object:

JSON Schema
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "strategy": {
      "type": "object",
      "properties": {
        "mode": {
          "type": "string",
          "enum": [
            "single",
            "loadbalance",
            "fallback"
          ]
        },
        "on_status_codes": {
          "type": "array",
          "items": {
            "type": "number"
          },
          "optional": true
        }
      }
    },
    "provider": {
      "type": "string",
      "enum": [
        "openai",
        "anthropic",
        "azure-openai",
        "anyscale",
        "cohere",
        "palm"
      ]
    },
    "resource_name": {
      "type": "string",
      "optional": true
    },
    "deployment_id": {
      "type": "string",
      "optional": true
    },
    "api_version": {
      "type": "string",
      "optional": true
    },
    "override_params": {
      "type": "object"
    },
    "api_key": {
      "type": "string"
    },
    "virtual_key": {
      "type": "string"
    },
    "cache": {
      "type": "object",
      "properties": {
        "mode": {
          "type": "string",
          "enum": [
            "simple",
            "semantic"
          ]
        },
        "max_age": {
          "type": "integer",
          "optional": true
        }
      },
      "required": [
        "mode"
      ]
    },
    "retry": {
      "type": "object",
      "properties": {
        "attempts": {
          "type": "integer"
        },
        "on_status_codes": {
          "type": "array",
          "items": {
            "type": "integer"
          },
          "optional": true
        }
      },
      "required": [
        "attempts"
      ]
    },
    "weight": {
      "type": "number"
    },
    "on_status_codes": {
      "type": "array",
      "items": {
        "type": "integer"
      }
    },
    "targets": {
      "type": "array",
      "items": {
        "$ref": "#"
      }
    }
  },
  "anyOf": [
    {
      "required": [
        "provider",
        "api_key"
      ]
    },
    {
      "required": [
        "virtual_key"
      ]
    },
    {
      "required": [
        "strategy",
        "targets"
      ]
    },
    {
      "required": [
        "cache"
      ]
    },
    {
      "required": [
        "retry"
      ]
    }
  ],
  "additionalProperties": false
}

Example Configs

// Simple config with cache and retry
{
  "virtual_key": "***", // Your Virtual Key
  "cache": { // Optional
    "mode": "semantic",
    "max_age": 10000
  },
  "retry": { // Optional
    "attempts": 5,
    "on_status_codes": []
  }
}

// Load balancing with 2 OpenAI keys
{
  "strategy": {
      "mode": "loadbalance"
    },
  "targets": [
    {
      "provider": "openai",
      "api_key": "sk-***"
    },
    {
      "provider": "openai",
      "api_key": "sk-***"
    }
  ]
}

Schema Details

Key Name
Description
Type
Required
Enum Values
Additional Info

strategy

Operational strategy for the config or any individual target

object

Yes (if no provider or virtual_key)

-

See Strategy Object Details

provider

Name of the service provider

string

Yes (if no mode or virtual_key)

"openai", "anthropic", "azure-openai", "anyscale", "cohere"

-

api_key

API key for the service provider

string

Yes (if provider is specified)

-

-

virtual_key

Virtual key identifier

string

Yes (if no mode or provider)

-

-

cache

Caching configuration

object

No

-

See Cache Object Details

retry

Retry configuration

object

No

-

See Retry Object Details

weight

Weight for load balancing

number

No

-

Used in loadbalance mode

on_status_codes

Status codes triggering fallback

array of strings

No

-

Used in fallback mode

targets

List of target configurations

array

Yes (if mode is specified)

-

Each item follows the config schema

request_timeout

Request timeout configuration

number

No

-

-

custom_host

Route to privately hosted model

string

No

-

Used in combination with provider + api_key

forward_headers

Forward sensitive headers directly

array of strings

No

-

-

override_params

Pass model name and other hyper parameters

object

No

"model", "temperature", "frequency_penalty", "logit_bias", "logprobs", "top_logprobs", "max_tokens", "n", "presence_penalty", "response_format", "seed", "stop", "top_p", etc.

Pass everything that's typically part of the payload

Strategy Object Details

Key Name
Description
Type
Required
Enum Values
Additional Info

mode

strategy mode for the config

string

Yes

"loadbalance", "fallback"

on_status_codes

status codes to apply the strategy. This field is only used when strategy mode is "fallback"

array of numbers

No

Optional

Cache Object Details

Key Name
Description
Type
Required
Enum Values
Additional Info

mode

Cache mode

string

Yes

"simple", "semantic"

-

max_age

Maximum age for cache entries

integer

No

-

Optional

Retry Object Details

Key Name
Description
Type
Required
Enum Values
Additional Info

attempts

Number of retry attempts

integer

Yes

-

-

on_status_codes

Status codes to trigger retries

array of strings

No

-

Optional

Cloud Provider Params (Azure OpenAI, Google Vertex, AWS Bedrock)

Azure OpenAI

Key Name
Type
Required

azure_resource_name

string

No

azure_deployment_id

string

No

azure_api_version

string

No

azure_model_name

string

No

Authorization

string ("Bearer $API_KEY")

No

Google Vertex AI

Key Name
Type
Required

vertex_project_id

string

No

vertex_region

string

No

AWS Bedrock

Key Name
Type
Required

aws_access_key_id

string

No

aws_secret_access_key

string

No

aws_region

string

No

aws_session_token

string

No

Notes

  • The strategy mode key determines the operational mode of the config. If strategy mode is not specified, a single provider mode is assumed, requiring either provider and api_key or virtual_key.

  • In loadbalance and fallback modes, the targets array specifies the configurations for each target.

  • The cache and retry objects provide additional configurations for caching and retry policies, respectively.

Examples

Single Provider with API Key
{
  "provider": "openai",
  "api_key": "sk-***"
}
Passing Model & Hyperparameters with Override Option
{
  "provider": "anthropic",
  "api_key": "xxx",
  "override_params": {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 512,
    "temperature": 0
  }
}
Single Provider with Virtual Key
{
  "virtual_key": "***"
}
Single Provider with Virtual Key, Cache and Retry
{
  "virtual_key": "***",
  "cache": {
    "mode": "semantic",
    "max_age": 10000
  },
  "retry": {
    "attempts": 5,
    "on_status_codes": [429]
  }
}
Load Balancing with Two OpenAI API Keys
{
  "strategy": {
      "mode": "loadbalance"
    },
  "targets": [
    {
      "provider": "openai",
      "api_key": "sk-***"
    },
    {
      "provider": "openai",
      "api_key": "sk-***"
    }
  ]
}
Load Balancing and Fallback Combination
{
  "strategy": {
      "mode": "loadbalance"
    },
  "targets": [
    {
      "provider": "openai",
      "api_key": "sk-***"
    },
    {
      "strategy": {
          "mode": "fallback",
          "on_status_codes": [429, 241]
        },
      "targets": [
        {
          "virtual_key": "***"
        },
        {
          "virtual_key": "***"
        }
      ]
    }
  ]
}
PreviousResponse SchemaNextSDK

Last updated 9 months ago

Was this helpful?

You can find more examples of schemas .

below