Smart Fallback with Model-Optimized Prompts

Portkey helps you easily create fallbacks from one LLM to another, making your application more reliable. While fallbacks ensure reliability, they also mean you'll be running a prompt optimized for one LLM on another, which can lead to significant differences in the final output.

With Portkey Prompt Templates, you can optimize your prompt for each specific model and ensure the final output stays well suited to your use case, even as different models in the fallback chain serve the request.

In this cookbook, we will explore setting up fallbacks between model-optimized prompt templates instead of using the same prompt for different models.

Let’s get started

1. Import and Authenticate Portkey SDK

Start by importing the Portkey SDK (installed via npm) into your NodeJS project and authenticate by passing your Portkey API key.

import { Portkey } from 'portkey-ai';

const portkey = new Portkey({
  apiKey: process.env.PORTKEYAI_API_KEY
});

You are now ready to call methods on the portkey instance, including the prompt completions API.
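
For a quick preview of what's coming, a saved prompt template can be triggered like this (a minimal sketch; the promptID below is a hypothetical placeholder for one of your own templates):

const completion = await portkey.prompts.completions.create({
  promptID: 'YOUR_PROMPT_ID', // placeholder: use a template ID from your Portkey dashboard
  variables: { goal: 'I want to become fit in 6 months' }
});

console.log(completion.choices[0].message.content);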

2. The Limitation with Traditional Fallbacks

Prepare the prompt for the task you want the model to perform. For this cookbook, we want the model to split a goal into actionable steps. The version that worked well with GPT-4 (with default model parameters, based on satisfactory responses in the playground) was the following prompt:

System:

You are a productivity expert. When given a task, you can smartly suggest possible subtasks. You list the subtasks in less than 10 items, keeping each short and actionable.

User:

The following is the goal I want to achieve:

I want to become fit in 6 months

GPT-4:

1. Visit a doctor for a health check-up.

2. Set specific fitness goals (like weight loss, strength, etc).

...

9. Stay hydrated and make adjustments as required.

Claude:

Here are some suggested subtasks to help you achieve six-pack abs in six months:

1. Develop a balanced and nutritious meal plan that focuses on lean proteins, vegetables, and healthy fats while limiting processed foods and sugary drinks.

2. Create a sustainable calorie deficit by tracking your daily food intake and ensuring you burn more calories than you consume.

...

9. Stay motivated by setting short-term goals, rewarding yourself for progress, and seeking support from friends, family, or a fitness coach.

This means the prompt that got a satisfactory output from GPT-4 may not fetch the same quality from Claude. In the example above, Claude's response is more elaborate than what we wanted: short and actionable items.

We will solve this problem with model-optimized prompt templates.

3. Create Model-Optimized Prompt Templates

Using Portkey Prompt Templates, you can write your prompt and instructions in one place and then just input the variables when making a call rather than passing the whole instruction again.

To create a prompt template:

  1. Log in to the Portkey Dashboard

  2. Navigate to Prompts

  3. Click Create to open the prompt creation page

The following page should open:

I am using Anthropic's claude-3-opus-20240229 model and instructing it to generate sub-tasks for a user's goal. You can declare a variable using mustache syntax to substitute a value when the prompt is triggered. For example, {{goal}} is substituted with "I want to earn six packs in six months" in the playground.
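
For reference, the template body might look something like this (an illustrative sketch assembled from the system and user messages shown earlier):

System:
You are a productivity expert. When given a task, you can smartly suggest possible subtasks. You list the subtasks in less than 10 items, keeping each short and actionable.

User:
The following is the goal I want to achieve:
{{goal}}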

Now, create another prompt template that can act as a fallback.

Create the same prompt again, but this time use a different model, such as gpt-4. You now have two prompt templates, and you may have noticed that each has a slightly different system message tailored to its model. After experimenting with each model, these prompts proved best suited for suggesting actionable steps to reach the goal.

For further exploration, try using the OpenAI SDK with Portkey Prompt Templates.

Fallback Configs using Prompt Templates

Next, prepare a gateway config that applies the fallback strategy. Take the prompt templates you created earlier, one with Anthropic and one with OpenAI, and structure them as follows:

{
  "strategy": {
    "mode": "fallback"
  },
  "targets": [
    {
      "prompt_id": "task_to_subtasks_anthropic"
    },
    {
      "prompt_id": "task_to_subtasks_openai"
    }
  ]
}

targets is an array of objects ordered by preference: Anthropic first, then OpenAI.
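
If you want the fallback to trigger only on specific failures, the strategy in a gateway config also accepts an on_status_codes field (a sketch; the status codes below are illustrative, so check the Gateway Config reference for the exact options):

{
  "strategy": {
    "mode": "fallback",
    "on_status_codes": [429, 500, 503]
  },
  "targets": [
    {
      "prompt_id": "task_to_subtasks_anthropic"
    },
    {
      "prompt_id": "task_to_subtasks_openai"
    }
  ]
}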

Pass this config when creating the Portkey client instance:

const portkey = new Portkey({
  apiKey: PORTKEY_API_KEY,
  config: {
    strategy: {
      mode: 'fallback'
    },
    targets: [
      {
        prompt_id: 'task_to_subtasks_anthropic'
      },
      {
        prompt_id: 'task_to_subtasks_openai'
      }
    ]
  }
});

With this step done, every request sent through portkey will carry the context of the gateway config above.

Read more about different ways to work with Gateway Configs.
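
One such alternative: instead of passing the config inline, you can save it from the Portkey UI and reference it by ID at instance creation (a sketch; the 'pc-...' value is a placeholder for your saved config's actual ID):

const portkey = new Portkey({
  apiKey: PORTKEY_API_KEY,
  config: 'pc-xxxxxx' // placeholder: the ID shown against your saved config in the Portkey dashboard
});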

Trigger Prompt Completions to Activate Smart Fallbacks

With the prompt templates and gateway config in place, trigger the prompt completions API through the Portkey client SDK:

const response = await portkey.prompts.completions.create({
  promptID: 'pp-test-811461',
  variables: { goal: 'I want to acquire AI engineering skills' }
});

console.log(response.choices[0].message.content); // success

The promptID identifies the prompt template to trigger via the prompt completions API. Since we already passed the gateway config as an argument during client instance creation, the value of promptID is ignored: task_to_subtasks_anthropic is treated as the first target that requests are routed to, with a fallback to task_to_subtasks_openai, as defined in targets.

Notice how variables holds the information to be substituted into the prompt templates at runtime. Even when the promptID is valid, the gateway config takes precedence.
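
Since fallbacks are transparent to the caller, the only failure you will see is when every target fails. A simple guard (plain JavaScript, nothing Portkey-specific) might look like this:

try {
  const response = await portkey.prompts.completions.create({
    promptID: 'pp-test-811461',
    variables: { goal: 'I want to acquire AI engineering skills' }
  });
  console.log(response.choices[0].message.content);
} catch (err) {
  // Reached only if every target in the fallback chain failed
  console.error('All fallback targets failed:', err);
}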

View Fallback status in the Logs

Portkey's Logs let you inspect and monitor all your requests seamlessly, surfacing valuable information about each one: date/time, model, request, response, and more. Refer to the Logs documentation to learn more.

Here is a screenshot of a log:

Great job! You learned how to create prompt templates in Portkey and set up fallbacks for thousands of requests from your app, all with just a few lines of code.

Bonus: Activate Load Balancing

Load balancing splits the volume of requests across both prompts according to their weights. As a result, you are less likely to hit rate limits or overwhelm any single model.

Here is how you can update the gateway configs:

const portkey = new Portkey({
  apiKey: PORTKEY_API_KEY,
  config: {
    strategy: {
      mode: 'loadbalance'
    },
    targets: [
      {
        prompt_id: 'task_to_subtasks_anthropic',
        weight: 0.1
      },
      {
        prompt_id: 'task_to_subtasks_openai',
        weight: 0.9
      }
    ]
  }
});

These weights send 90% of the traffic to the OpenAI prompt template and 10% to Anthropic.
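
Conceptually, the weighted split behaves like the sketch below (illustrative only, not Portkey's actual routing code; it simply shows how weights translate into a traffic split):

// Pick a target in proportion to its weight.
const targets = [
  { prompt_id: 'task_to_subtasks_anthropic', weight: 0.1 },
  { prompt_id: 'task_to_subtasks_openai', weight: 0.9 }
];

function pickTarget() {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  let r = Math.random() * total;
  for (const t of targets) {
    r -= t.weight;
    if (r < 0) return t.prompt_id;
  }
  return targets[targets.length - 1].prompt_id; // guard against floating-point rounding
}

// Over many calls, ~90% of picks resolve to the OpenAI template and ~10% to Anthropic.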

Great job! You learned how to create prompt templates in Portkey and set up fallbacks and load balancing for thousands of requests from your app, all with just a few lines of code.

Happy Coding!

See the full code:

import { Portkey } from 'portkey-ai';

const PORTKEY_API_KEY = process.env.PORTKEYAI_API_KEY; // keep your key in an environment variable

const portkey = new Portkey({
  apiKey: PORTKEY_API_KEY,
  config: {
    strategy: {
      mode: 'fallback'
    },
    targets: [
      {
        prompt_id: 'pp-task-to-su-72fbbb'
      },
      {
        prompt_id: 'pp-task-to-su-051f65'
      }
    ]
  }
});

const response = await portkey.prompts.completions.create({
  promptID: 'pp-test-811461',
  variables: { goal: 'I want to acquire AI engineering skills' }
});

console.log(response.choices[0].message.content);

The models on this page require you to save your OpenAI and Anthropic API keys to the Portkey Vault. For more information about Portkey Vault, read more on Virtual Keys. See the reference to learn more.