Google Vertex AI


Last updated 8 months ago


Portkey provides a robust and secure gateway to facilitate the integration of various Large Language Models (LLMs) and embedding models into your apps, including Google Vertex AI.

With Portkey, you can take advantage of features like fast AI gateway access, observability, prompt management, and more, all while ensuring the secure management of your Vertex auth through a virtual key system.

Provider Slug: vertex-ai

Portkey SDK Integration with Google Vertex AI

Portkey provides a consistent API to interact with models from various providers. To integrate Google Vertex AI with Portkey:

1. Install the Portkey SDK

Add the Portkey SDK to your application to interact with Google Vertex AI API through Portkey's gateway.

npm install --save portkey-ai
pip install portkey-ai

2. Initialize Portkey with the Virtual Key

To integrate Vertex AI with Portkey, you'll need your Vertex Project ID & Vertex Region, with which you can set up the virtual key. See "How to Find Your Google Vertex Project Details" below.

If you are integrating through a Service Account File, see "Get Your Vertex Service Account JSON" below.

import Portkey from 'portkey-ai'
 
const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY", // defaults to process.env["PORTKEY_API_KEY"]
    virtualKey: "VERTEX_VIRTUAL_KEY", // Your Vertex AI Virtual Key
})
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VERTEX_VIRTUAL_KEY"   # Replace with your virtual key for Google
)
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from "portkey-ai";

const portkey = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "PORTKEY_VERTEX_VIRTUAL_KEY",
    authorization: "Bearer $GCLOUD AUTH PRINT-ACCESS-TOKEN" // i.e. the output of: gcloud auth print-access-token
  }),
});

3. Invoke Chat Completions with Vertex AI and Gemini

Use the Portkey instance to send requests to Gemini models hosted on Vertex AI. You can also override the virtual key directly in the API call if needed.

Vertex AI uses OAuth2 to authenticate its requests, so you also need to send an access token along with each request.

const chatCompletion = await portkey.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'gemini-pro',
}, {authorization: "vertex ai access token here"});

console.log(chatCompletion.choices);
completion = portkey.with_options(authorization="...").chat.completions.create(
    messages= [{ "role": 'user', "content": 'Say this is a test' }],
    model= 'gemini-pro'
)

print(completion)
async function main() {
  const response = await portkey.chat.completions.create({
    messages: [{ role: "user", content: "1729" }],
    model: "gemini-1.5-flash-001",
    max_tokens: 128,
  });
  console.log(response.choices[0].message.content);
}

main();

Document, Video, Audio Processing

Vertex AI supports attaching mp4, pdf, jpg, mp3, wav, and other file types to your Gemini messages.

Gemini Docs:

  • Document Processing
  • Video & Image Processing
  • Audio Processing

Using Portkey, here's how you can send these media files:

const chatCompletion = await portkey.chat.completions.create({
    messages: [
        { role: 'system', content: 'You are a helpful assistant' },
        { role: 'user', content: [
            {
                type: 'image_url',
                image_url: {
                    url: 'gs://cloud-samples-data/generative-ai/image/scones.jpg'
                }
            },
            {
                type: 'text',
                text: 'Describe the image'
            }
        ]}
    ],
    model: 'gemini-1.5-pro-001',
    max_tokens: 200
});
completion = portkey.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe the image"
                }
            ]
        }
    ],
    model='gemini-1.5-pro-001',
    max_tokens=200
)

print(completion)
curl --location 'https://api.portkey.ai/v1/chat/completions' \
--header 'x-portkey-provider: vertex-ai' \
--header 'x-portkey-vertex-region: us-central1' \
--header 'Content-Type: application/json' \
--header 'x-portkey-api-key: PORTKEY_API_KEY' \
--header 'Authorization: Bearer VERTEX_AI_ACCESS_TOKEN' \
--data '{
    "model": "gemini-1.5-pro-001",
    "max_tokens": 200,
    "stream": false,
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant"
        },
        {
            "role": "user",
            "content": [
                {    
                    "type": "image_url",
                    "image_url": {
                        "url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "describe this image"
                }
            ]
        }
    ]
}'

The same message format works for all other media types as well: just send your media file in the url field, like "url": "gs://cloud-samples-data/video/animals.mp4" for Google Cloud URLs and "url": "https://download.samplelib.com/mp3/sample-3s.mp3" for public URLs.

Your URL should include the file extension; it is used to infer the MIME_TYPE, which is a required parameter for prompting Gemini models with files.
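Because the MIME type is inferred from the file extension, it can help to sanity-check URLs before sending them. Here's a minimal sketch of that check in plain Python (independent of the Portkey SDK; the helper name is our own):

```python
import mimetypes
from urllib.parse import urlparse

def infer_mime_type(url: str) -> str:
    """Guess the MIME type from a URL's file extension.

    Raises if no extension is recognized, since Gemini requires a MIME type
    when prompting with files.
    """
    path = urlparse(url).path  # handles gs:// and https:// URLs alike
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        raise ValueError(f"URL has no recognizable file extension: {url}")
    return mime

print(infer_mime_type("gs://cloud-samples-data/generative-ai/image/scones.jpg"))  # image/jpeg
print(infer_mime_type("gs://cloud-samples-data/video/animals.mp4"))               # video/mp4
```

Running this before building the message lets you fail fast on extension-less URLs instead of getting an error back from the gateway.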

Sending base64 Image

You can also send base64-encoded image data in the url field:

"url": "data:image/png;base64,UklGRkacAABXRUJQVlA4IDqcAAC....."
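If your image lives on disk rather than behind a URL, you can build that data URI yourself. A small sketch (standard base64 data-URI encoding; the sample bytes below are just a PNG file signature, for illustration):

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URI usable in the url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# In practice, read the bytes from a file:
#   with open("scones.png", "rb") as f:
#       uri = to_data_uri(f.read(), "image/png")
uri = to_data_uri(b"\x89PNG\r\n\x1a\n", "image/png")
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```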

Text Embedding Models

You can use any of Vertex AI's English and multilingual embedding models through Portkey, using the familiar OpenAI schema.

The Gemini-specific parameter task_type is also supported on Portkey.

import Portkey from 'portkey-ai';

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "VERTEX_VIRTUAL_KEY"
});

// Generate embeddings
async function getEmbeddings() {
    const embeddings = await portkey.embeddings.create({
        input: "embed this",
        model: "text-multilingual-embedding-002",
        // @ts-ignore (if using typescript)
        task_type: "CLASSIFICATION", // Optional
    }, {authorization: "vertex ai access token here"});

    console.log(embeddings);
}
await getEmbeddings();
from portkey_ai import Portkey

# Initialize the Portkey client
portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VERTEX_VIRTUAL_KEY"
)

# Generate embeddings
def get_embeddings():
    embeddings = portkey.with_options(authorization="...").embeddings.create(
        input='The vector representation for this text',
        model='text-embedding-004',
        task_type="CLASSIFICATION" # Optional
    )
    print(embeddings)

get_embeddings()
curl 'https://api.portkey.ai/v1/embeddings' \
    -H 'Content-Type: application/json' \
    -H 'x-portkey-api-key: PORTKEY_API_KEY' \
    -H 'x-portkey-provider: vertex-ai' \
    -H 'Authorization: Bearer VERTEX_AI_ACCESS_TOKEN' \
    -H 'x-portkey-virtual-key: $VERTEX_VIRTUAL_KEY' \
    --data-raw '{
        "model": "text-embedding-004",
        "input": "An HTTP 246 code is used to signify an AI response containing hallucinations or other inaccuracies",
        "task_type": "CLASSIFICATION"
    }'

Function Calling

Portkey supports function calling mode on Google's Gemini models. Explore the Function Calling cookbook for a deep dive and examples.

Managing Vertex AI Prompts

You can manage all prompts to Google Gemini in the Prompt Library. All the models in the model garden are supported and you can easily start testing different prompts.

Once you're ready with your prompt, you can use the portkey.prompts.completions.create interface to use the prompt in your application.

Making Requests Without Virtual Keys

You can also pass your Vertex AI details & secrets directly, without using Virtual Keys in Portkey.

Vertex AI expects a region, a project ID, and an access token for a successful completion request. Here's how you can specify these fields directly in your requests:

import Portkey from 'portkey-ai'
 
const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    vertexProjectId: "sample-55646",
    vertexRegion: "us-central1",
    provider: "vertex-ai",
    Authorization: "$GCLOUD AUTH PRINT-ACCESS-TOKEN"
})

const chatCompletion = await portkey.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'gemini-pro',
});

console.log(chatCompletion.choices);
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    vertex_project_id="sample-55646",
    vertex_region="us-central1",
    provider="vertex-ai",
    Authorization="$GCLOUD AUTH PRINT-ACCESS-TOKEN"
)

completion = portkey.chat.completions.create(
    messages= [{ "role": 'user', "content": 'Say this is a test' }],
    model= 'gemini-pro'
)

print(completion)
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from "portkey-ai";

const portkey = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    provider: "vertex-ai",
    vertexRegion: "us-central1",
    vertexProjectId: "xxx",
    Authorization: "Bearer $GCLOUD AUTH PRINT-ACCESS-TOKEN",
    // forwardHeaders: ["authorization"] // You can also directly forward the auth token to Google
  }),
});

async function main() {
  const response = await portkey.chat.completions.create({
    messages: [{ role: "user", content: "1729" }],
    model: "gemini-1.5-flash-001",
    max_tokens: 32,
  });
  console.log(response.choices[0].message.content);
}

main();
curl 'https://api.portkey.ai/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'x-portkey-api-key: PORTKEY_API_KEY' \
-H 'x-portkey-provider: vertex-ai' \
-H 'Authorization: Bearer VERTEX_AI_ACCESS_TOKEN' \
-H 'x-portkey-vertex-project-id: sample-94994' \
-H 'x-portkey-vertex-region: us-central1' \
--data '{
    "messages": [{"role": "user","content": "Hello"}],
    "max_tokens": 20,
    "model": "gemini-pro"
}'
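The three required fields map directly onto gateway headers, so when calling the HTTP API you can assemble them yourself. A sketch in Python (header names taken from the curl example above; the project ID and token values are placeholders):

```python
def vertex_headers(api_key: str, project_id: str, region: str, access_token: str) -> dict:
    """Build Portkey gateway headers for a Vertex AI request without a virtual key."""
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": api_key,
        "x-portkey-provider": "vertex-ai",
        "x-portkey-vertex-project-id": project_id,
        "x-portkey-vertex-region": region,
        # Access token, e.g. the output of: gcloud auth print-access-token
        "Authorization": f"Bearer {access_token}",
    }

headers = vertex_headers("PORTKEY_API_KEY", "sample-94994", "us-central1", "ya29.EXAMPLE")
print(headers["Authorization"])  # Bearer ya29.EXAMPLE
```

You can pass this dict to any HTTP client when POSTing to https://api.portkey.ai/v1/chat/completions.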

For further questions on custom Vertex AI deployments or fine-grained access tokens, reach out to us at support@portkey.ai.


How to Find Your Google Vertex Project Details

To obtain your Vertex Project ID and Region, navigate to the Google Vertex Dashboard.

  • You can copy the Project ID located at the top left corner of your screen.

  • Find the Region dropdown on the same page to get your Vertex Region.

Get Your Vertex Service Account JSON

Follow this process to get your Service Account JSON.

Next Steps

The complete list of features supported in the SDK is available in the SDK reference.

You'll find more information in the relevant sections:

  • Add metadata to your requests
  • Add gateway configs to your Vertex AI requests
  • Tracing Vertex AI requests
  • Setup a fallback from OpenAI to Vertex AI APIs

If you do not want to add your Vertex AI details to the Portkey vault, you can pass them directly while instantiating the Portkey client, as described in "Making Requests Without Virtual Keys" above.

