Vercel AI

Portkey is a control panel for your Vercel AI app. It makes your LLM integrations prod-ready, reliable, fast, and cost-efficient.

Use Portkey with your Vercel app for:

  1. Calling 100+ LLMs (open & closed)

  2. Logging & analysing LLM usage

  3. Caching responses

  4. Automating fallbacks, retries, timeouts, and load balancing

  5. Managing, versioning, and deploying prompts

  6. Continuously improving your app with user feedback

Portkey is powered by an open-source, universal AI Gateway with which you can route to 100+ LLMs using the same, known OpenAI spec.

Guide: Create a Portkey + OpenAI Chatbot

1. Create a Next.js app

Go ahead and create a Next.js application, and install ai, @ai-sdk/openai, and portkey-ai as dependencies.

pnpm dlx create-next-app my-ai-app
cd my-ai-app
pnpm install ai @ai-sdk/openai portkey-ai

2. Add Authentication keys to .env

  1. Log in to Portkey

  2. To integrate OpenAI with Portkey, add your OpenAI API key to Portkey’s Virtual Keys

  3. This gives you a disposable key that you can use and rotate instead of using the OpenAI API key directly

  4. Grab the Virtual Key and your Portkey API key and add them to your .env file:

# ".env"
PORTKEY_API_KEY="xxxxxxxxxx"
OPENAI_VIRTUAL_KEY="xxxxxxxxxx"
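
For illustration, the route handler in the next step can read these values from process.env instead of hardcoding them. A minimal sketch, using the same variable names as the .env file above:

// Reading the keys from the environment (names match the .env file above)
import { createOpenAI } from '@ai-sdk/openai';
import { createHeaders, PORTKEY_GATEWAY_URL } from 'portkey-ai';

const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx", // placeholder; authentication happens via the Portkey headers
    headers: createHeaders({
        apiKey: process.env.PORTKEY_API_KEY ?? "",
        virtualKey: process.env.OPENAI_VIRTUAL_KEY ?? ""
    }),
})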

3. Create Route Handler

Create a Next.js Route Handler that uses the Edge Runtime to generate a chat completion and stream it back to your Next.js app.

For this example, create a route handler at app/api/chat/route.ts that calls gpt-3.5-turbo and accepts a POST request with a messages array:

// filename="app/api/chat/route.ts"
import { streamText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
import { createHeaders, PORTKEY_GATEWAY_URL } from 'portkey-ai';

// Create an OpenAI client that routes through Portkey's gateway
const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx", // placeholder; Portkey forwards your virtual key to the provider
    headers: createHeaders({
        apiKey: "PORTKEY_API_KEY",
        virtualKey: "OPENAI_VIRTUAL_KEY"
    }),
})

// Set the runtime to edge for best performance
export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Invoke Chat Completion
  const response = await streamText({
    model: client('gpt-3.5-turbo'),
    messages
  })

  // Respond with the stream
  return response.toTextStreamResponse();
}

Portkey follows the same signature as the OpenAI SDK but extends it to work with 100+ LLMs. Here, the chat completion call will be sent to the gpt-3.5-turbo model, and the response will be streamed to your Next.js app.

4. Switch from OpenAI to Anthropic

Let’s see how you can switch from OpenAI to Claude 3 Opus by updating just a few lines of code (without breaking anything else).

  1. Add your Anthropic API key or AWS Bedrock secrets to Portkey’s Virtual Keys

  2. Update the virtual key while instantiating your Portkey client

  3. Update the model name while making your /chat/completions call

  4. Add the maxTokens field inside the streamText invocation (Anthropic requires this field)

Let’s see it in action:

const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx",
    headers: createHeaders({
        apiKey: "PORTKEY_API_KEY",
        virtualKey: "ANTHROPIC_VIRTUAL_KEY"
    }),
})

// Set the runtime to edge for best performance
export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Invoke Chat Completion
  const response = await streamText({
    model: client('claude-3-opus-20240229'),
    messages,
    maxTokens: 200
  })

  // Respond with the stream
  return response.toTextStreamResponse();
}

5. Switch to Gemini 1.5

Similarly, you can just add your Google AI Studio API key to Portkey and call Gemini 1.5:

const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx",
    headers: createHeaders({
        apiKey: "PORTKEY_API_KEY",
        virtualKey: "GEMINI_VIRTUAL_KEY"
    }),
})

// Set the runtime to edge for best performance
export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Invoke Chat Completion
  const response = await streamText({
    model: client('gemini-1.5-flash'),
    messages
  })

  // Respond with the stream
  return response.toTextStreamResponse();
}

The same will follow for all the other providers like Azure, Mistral, Anyscale, Together, and more.

6. Wire up the UI

Let's create a Client component with a form that collects the prompt from the user and streams back the completion. The useChat hook uses the POST Route Handler we created earlier (/api/chat) by default. However, you can override this by passing an api prop to useChat({ api: '...' }) (an example follows the component below).

// filename="app/page.tsx"
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();
  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.map((m) => (
        <div key={m.id} className="whitespace-pre-wrap">
          {m.role === 'user' ? 'User: ' : 'AI: '}
          {m.content}
        </div>
      ))}

      <form onSubmit={handleSubmit}>
        <input
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}
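
If your route handler lives at a different path, you can point the hook at it explicitly by replacing the useChat() call above; the /api/portkey-chat path here is purely illustrative:

// Override the default /api/chat endpoint
const { messages, input, handleInputChange, handleSubmit } = useChat({
  api: '/api/portkey-chat',
});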

7. Log the Requests

Portkey logs all the requests you send, helping you debug errors and get request-level as well as aggregate insights on costs, latency, errors, and more.

You can enhance the logging by tracing certain requests, passing custom metadata, or attaching user feedback. Learn more about tracing and feedback.
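
For example, one way to group a whole conversation under a single trace is to send Portkey's x-portkey-trace-id header alongside the headers generated by createHeaders; the trace ID value below is just an illustration:

// Group related requests under one trace in Portkey logs
const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx",
    headers: {
        ...createHeaders({
            apiKey: "PORTKEY_API_KEY",
            virtualKey: "OPENAI_VIRTUAL_KEY"
        }),
        "x-portkey-trace-id": "chat-session-42" // illustrative trace ID
    },
})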

Segmenting Requests with Metadata

While creating the client, you can pass any {"key": "value"} pairs inside the metadata header. Portkey segments the requests based on this metadata to give you granular insights.

const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx",
    headers: createHeaders({
        apiKey: "PORTKEY_API_KEY",
        virtualKey: "GEMINI_VIRTUAL_KEY",
        metadata: {
          _user: 'john doe',
          organization_name: 'acme',
          custom_key: 'custom_value'
        }
    }),
})

Guide: Handle OpenAI Failures

1. Solve 5xx, 4xx Errors

Portkey helps you automatically trigger a call to any other LLM/provider in case of primary failures. Create a fallback logic with Portkey’s Gateway Config.

For example, to set up a fallback from OpenAI to Anthropic, the Gateway Config would be:

{
  "strategy": { "mode": "fallback" },
  "targets": [{ "virtual_key": "openai-virtual-key" }, { "virtual_key": "anthropic-virtual-key" }]
}

You can save this Config in the Portkey app to get an associated Config ID, which you can then pass while instantiating your LLM client:

2. Apply Config to the Route Handler

const client = createOpenAI({
    baseURL: PORTKEY_GATEWAY_URL,
    apiKey: "xx",
    headers: createHeaders({
        apiKey: "PORTKEY_API_KEY",
        config: "CONFIG_ID"
    }),
})

3. Handle Rate Limit Errors

You can load balance your requests across multiple LLMs or accounts and prevent any one account from hitting rate limit thresholds.

For example, to route your requests between 1 OpenAI and 2 Azure OpenAI accounts:

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "virtual_key": "openai-virtual-key", "weight": 1 },
    { "virtual_key": "azure-virtual-key-1", "weight": 1 },
    { "virtual_key": "azure-virtual-key-2", "weight": 1 }
  ]
}

Save this Config in the Portkey app and pass it while instantiating the LLM client, just like we did above.

Portkey can also trigger automatic retries, set request timeouts, and more.
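
For illustration, retries and timeouts are also expressed in the Gateway Config. A minimal sketch (verify the exact keys against the Gateway Config docs before relying on it):

{
  "retry": { "attempts": 3 },
  "request_timeout": 10000,
  "targets": [{ "virtual_key": "openai-virtual-key" }]
}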

Guide: Cache Semantically Similar Requests

Portkey can cut LLM costs and reduce latencies by up to 20x by storing responses for semantically similar queries and serving them from cache.

For Q&A use cases, cache hit rates go as high as 50%. To enable semantic caching, just set the cache mode to semantic in your Gateway Config:

{
  "cache": { "mode": "semantic" }
}

Same as above, you can save your cache Config in the Portkey app and reference the Config ID while instantiating the LLM client.

Moreover, you can set the max-age of the cache and force refresh a cache. See the docs for more information.
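
As a sketch, a max-age can be set inside the same cache object; the max_age key and its unit here are assumptions, so check the cache docs before relying on them:

{
  "cache": { "mode": "semantic", "max_age": 3600 }
}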

Guide: Manage Prompts Separately

Storing prompt templates and instructions in code is messy. Using Portkey, you can create and manage all of your app’s prompts in a single place and directly hit our prompts API to get responses. Here’s more on what Prompts on Portkey can do.

To create a Prompt Template:

  1. From the Dashboard, open the Prompts page

  2. On the Prompts page, click Create

  3. Add your instructions and variables, adjust the model parameters if needed, and click Save

See the docs for more information.

Trigger the Prompt in the Route Handler

import { streamText } from 'ai'
import Portkey from 'portkey-ai'

// Portkey client used to render the saved prompt template
const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY"
})

// `client` is the same createOpenAI(...) client from the earlier route handlers


export async function POST(req: Request) {
  const { movie } = await req.json();

  const moviePromptRender = await portkey.prompts.render({
        promptID: "PROMPT_ID",
        variables: { "movie": movie }
    })
  const messages = moviePromptRender.data.messages

  const response = await streamText({
    model: client('gemini-1.5-flash'),
    messages
  })

  return response.toTextStreamResponse();
}
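
If you don't need to stream through the AI SDK, you can also run the saved prompt directly via Portkey's prompt completions API. A minimal sketch, assuming the SDK's prompts.completions.create method:

// Run the saved prompt template through Portkey without streaming
const promptCompletion = await portkey.prompts.completions.create({
    promptID: "PROMPT_ID",
    variables: { "movie": movie }
})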

Talk to the Developers

If you have any questions or issues, reach out to us on Discord. There you will also meet many other practitioners who are putting their Vercel AI + Portkey apps into production.