Vercel AI
Portkey is a control panel for your Vercel AI app. It makes your LLM integrations prod-ready, reliable, fast, and cost-efficient.
Use Portkey with your Vercel app for:
Calling 100+ LLMs (open & closed)
Logging & analysing LLM usage
Caching responses
Automating fallbacks, retries, timeouts, and load balancing
Managing, versioning, and deploying prompts
Continuously improving your app with user feedback
Go ahead and create a Next.js application, and install ai and portkey-ai as dependencies.
Log in to Portkey
To integrate OpenAI with Portkey, add your OpenAI API key to Portkey’s Virtual Keys
This will give you a disposable key that you can use and rotate instead of directly using the OpenAI API key
Grab the Virtual key & your Portkey API key and add them to the .env file:
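For example, a minimal .env could look like this (the variable names are placeholders of our choosing; use whichever names your route handler reads):

```
PORTKEY_API_KEY="your-portkey-api-key"
OPENAI_VIRTUAL_KEY="your-openai-virtual-key"
```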
Create a Next.js Route Handler that uses the Edge Runtime to generate a chat completion and stream it back to Next.js.
For this example, create a route handler at app/api/chat/route.ts that calls GPT-4 and accepts a POST request with a messages array of strings:
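A minimal sketch of such a handler is below. It assumes an ai version that still exports the OpenAIStream and StreamingTextResponse helpers (newer majors replace them with streamText), and the placeholder environment variable names from the .env above:

```typescript
// app/api/chat/route.ts
import Portkey from 'portkey-ai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Run on the Edge Runtime
export const runtime = 'edge';

// The virtual key decides which provider the request is routed to
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.OPENAI_VIRTUAL_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Same signature as the OpenAI SDK
  const response = await portkey.chat.completions.create({
    model: 'gpt-4',
    stream: true,
    messages,
  });

  // Convert the streaming response and send it back to the client
  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}
```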
Portkey follows the same signature as the OpenAI SDK but extends it to work with 100+ LLMs. Here, the chat completion call will be sent to the gpt-4 model, and the response will be streamed to your Next.js app.
Let’s see how you can switch from GPT-4 to Claude-3-Opus by updating 2 lines of code (without breaking anything else).
Add your Anthropic API key or AWS Bedrock secrets to Portkey’s Virtual Keys
Update the virtual key while instantiating your Portkey client
Update the model name while making your /chat/completions call
Add the maxTokens field inside the streamText invocation (Anthropic requires this field)
Let’s see it in action:
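A sketch of the two changes follows, assuming your Anthropic virtual key lives in a placeholder ANTHROPIC_VIRTUAL_KEY variable. With the Portkey client, the required token limit is the max_tokens parameter on chat.completions.create; if you drive the call through the AI SDK's streamText instead, use its maxTokens option:

```typescript
// Change 1: point the client at your Anthropic virtual key
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.ANTHROPIC_VIRTUAL_KEY,
});

// Change 2: swap the model name (and add a max tokens value, which Anthropic requires)
const response = await portkey.chat.completions.create({
  model: 'claude-3-opus-20240229',
  max_tokens: 512,
  stream: true,
  messages,
});
```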
Let's create a Client component with a form that collects the prompt from the user and streams back the completion. The useChat hook will, by default, use the POST Route Handler we created earlier (/api/chat). However, you can override this default by passing an api prop to useChat({ api: '...' }).
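A sketch of such a component, assuming the useChat hook exported from ai/react:

```tsx
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  // Defaults to POSTing to /api/chat; pass { api: '...' } to target another route
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          {m.role === 'user' ? 'User: ' : 'AI: '}
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Say something..."
        />
      </form>
    </div>
  );
}
```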
Portkey logs all the requests you’re sending to help you debug errors, and get request-level + aggregate insights on costs, latency, errors, and more.
You can enhance the logging by tracing certain requests, passing custom metadata or user feedback.
Segmenting Requests with Metadata
While creating the client, you can pass any {"key":"value"} pairs inside the metadata header. Portkey segments the requests based on the metadata to give you granular insights.
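For example (a sketch assuming the metadata option on the client constructor; if your SDK version differs, the same pairs can be sent in the x-portkey-metadata header). The keys are arbitrary examples, except _user, which Portkey reserves for user-level analytics:

```typescript
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.OPENAI_VIRTUAL_KEY,
  metadata: {
    _user: 'user_1337',        // reserved key for user-level analytics
    environment: 'production', // any custom key-value pairs you want to slice logs by
  },
});
```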
Portkey's Gateway Configs let you add fallbacks, load balancing, caching, and more to your requests. For example, to set up a fallback from OpenAI to Anthropic, the Gateway Config would be:
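(A sketch; the virtual_key values below are placeholders for the virtual keys you created in the Portkey app.)

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "virtual_key": "openai-virtual-key" },
    { "virtual_key": "anthropic-virtual-key" }
  ]
}
```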
You can save this Config in the Portkey app and get an associated Config ID that you can pass while instantiating your LLM client:
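For example (the Config ID below is a placeholder for the one generated for you):

```typescript
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  config: 'pc-fallback-xxxxx', // Config ID from the Portkey app
});
```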
You can load balance your requests across multiple LLMs or accounts and prevent any one account from hitting rate limit thresholds.
For example, to route your requests between 1 OpenAI and 2 Azure OpenAI accounts:
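A sketch of such a Config, with placeholder virtual keys and equal weights:

```json
{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "virtual_key": "openai-virtual-key", "weight": 1 },
    { "virtual_key": "azure-account-1-virtual-key", "weight": 1 },
    { "virtual_key": "azure-account-2-virtual-key", "weight": 1 }
  ]
}
```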
Save this Config in the Portkey app and pass it while instantiating the LLM Client, just like we did above.
Portkey can save LLM costs & reduce latencies 20x by storing responses for semantically similar queries and serving them from cache.
For Q&A use cases, cache hit rates go as high as 50%. To enable semantic caching, just set the cache mode to semantic in your Gateway Config:
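A minimal example of such a Config:

```json
{
  "cache": { "mode": "semantic" }
}
```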
Same as above, you can save your cache Config in the Portkey app, and reference the Config ID while instantiating the LLM Client.
Portkey also lets you manage and version all of your prompt templates from its dashboard. To create a Prompt Template:
From the Dashboard, open Prompts
On the Prompts page, click Create
Add your instructions and variables, adjust the model parameters if needed, and click Save
Portkey is powered by an open-source AI Gateway with which you can route to 100+ LLMs using the same, familiar OpenAI spec.
Similarly, you can just add your Google API key to Portkey as a virtual key and call Gemini 1.5:
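A sketch of that call, assuming a Google virtual key under a placeholder env variable and the gemini-1.5-pro model slug (swap in whichever Gemini model you have enabled):

```typescript
const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY,
  virtualKey: process.env.GOOGLE_VIRTUAL_KEY,
});

const response = await portkey.chat.completions.create({
  model: 'gemini-1.5-pro',
  stream: true,
  messages,
});
```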
The same flow works for all the other providers like Azure, Mistral, Anyscale, Together, and more.
Learn more about the AI Gateway and the providers it supports.
Portkey helps you automatically trigger a call to another LLM or provider in case of primary failures. You can set up this fallback logic with Portkey's Gateway Config.
Portkey can also trigger automatic retries, set request timeouts, and more.
Moreover, you can set the max-age of the cache and force refresh a cache. See the caching documentation for more information.
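For example, a semantic cache whose entries expire after an hour could look like this (a sketch; max_age is in seconds — confirm the exact field name against the caching documentation):

```json
{
  "cache": { "mode": "semantic", "max_age": 3600 }
}
```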
Storing prompt templates and instructions in code is messy. Using Portkey, you can create and manage all of your app’s prompts in a single place and directly hit our prompts API to get responses. Here’s more on prompt management.
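A sketch of calling a saved prompt from the Node SDK, assuming the prompts.completions.create method, a placeholder Prompt ID, and a hypothetical template variable named user_input:

```typescript
const promptCompletion = await portkey.prompts.completions.create({
  promptID: 'pp-my-prompt-xxxxx',           // Prompt ID from the Portkey app
  variables: { user_input: 'Hello world' }, // values for the variables in your template
});
```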
See the Portkey documentation for more information.
If you have any questions or issues, reach out to us on Discord. There, you will also meet many other practitioners who are putting their Vercel AI + Portkey apps into production.