Portkey Docs
HomeAPIIntegrationsChangelog
  • Introduction
    • What is Portkey?
    • Make Your First Request
    • Feature Overview
  • Integrations
    • LLMs
      • OpenAI
        • Structured Outputs
        • Prompt Caching
      • Anthropic
        • Prompt Caching
      • Google Gemini
      • Groq
      • Azure OpenAI
      • AWS Bedrock
      • Google Vertex AI
      • Bring Your Own LLM
      • AI21
      • Anyscale
      • Cerebras
      • Cohere
      • Fireworks
      • Deepbricks
      • Deepgram
      • Deepinfra
      • Deepseek
      • Google Palm
      • Huggingface
      • Inference.net
      • Jina AI
      • Lingyi (01.ai)
      • LocalAI
      • Mistral AI
      • Monster API
      • Moonshot
      • Nomic
      • Novita AI
      • Ollama
      • OpenRouter
      • Perplexity AI
      • Predibase
      • Reka AI
      • SambaNova
      • Segmind
      • SiliconFlow
      • Stability AI
      • Together AI
      • Voyage AI
      • Workers AI
      • ZhipuAI / ChatGLM / BigModel
      • Suggest a new integration!
    • Agents
      • Autogen
      • Control Flow
      • CrewAI
      • Langchain Agents
      • LlamaIndex
      • Phidata
      • Bring Your own Agents
    • Libraries
      • Autogen
      • DSPy
      • Instructor
      • Langchain (Python)
      • Langchain (JS/TS)
      • LlamaIndex (Python)
      • LibreChat
      • Promptfoo
      • Vercel
        • Vercel [Depricated]
  • Product
    • Observability (OpenTelemetry)
      • Logs
      • Tracing
      • Analytics
      • Feedback
      • Metadata
      • Filters
      • Logs Export
      • Budget Limits
    • AI Gateway
      • Universal API
      • Configs
      • Multimodal Capabilities
        • Image Generation
        • Function Calling
        • Vision
        • Speech-to-Text
        • Text-to-Speech
      • Cache (Simple & Semantic)
      • Fallbacks
      • Automatic Retries
      • Load Balancing
      • Conditional Routing
      • Request Timeouts
      • Canary Testing
      • Virtual Keys
        • Budget Limits
    • Prompt Library
      • Prompt Templates
      • Prompt Partials
      • Retrieve Prompts
      • Advanced Prompting with JSON Mode
    • Guardrails
      • List of Guardrail Checks
        • Patronus AI
        • Aporia
        • Pillar
        • Bring Your Own Guardrails
      • Creating Raw Guardrails (in JSON)
    • Autonomous Fine-tuning
    • Enterprise Offering
      • Org Management
        • Organizations
        • Workspaces
        • User Roles & Permissions
        • API Keys (AuthN and AuthZ)
      • Access Control Management
      • Budget Limits
      • Security @ Portkey
      • Logs Export
      • Private Cloud Deployments
        • Architecture
        • AWS
        • GCP
        • Azure
        • Cloudflare Workers
        • F5 App Stack
      • Components
        • Log Store
          • MongoDB
    • Open Source
    • Portkey Pro & Enterprise Plans
  • API Reference
    • Introduction
    • Authentication
    • OpenAPI Specification
    • Headers
    • Response Schema
    • Gateway Config Object
    • SDK
  • Provider Endpoints
    • Supported Providers
    • Chat
    • Embeddings
    • Images
      • Create Image
      • Create Image Edit
      • Create Image Variation
    • Audio
      • Create Speech
      • Create Transcription
      • Create Translation
    • Fine-tuning
      • Create Fine-tuning Job
      • List Fine-tuning Jobs
      • Retrieve Fine-tuning Job
      • List Fine-tuning Events
      • List Fine-tuning Checkpoints
      • Cancel Fine-tuning
    • Batch
      • Create Batch
      • List Batch
      • Retrieve Batch
      • Cancel Batch
    • Files
      • Upload File
      • List Files
      • Retrieve File
      • Retrieve File Content
      • Delete File
    • Moderations
    • Assistants API
      • Assistants
        • Create Assistant
        • List Assistants
        • Retrieve Assistant
        • Modify Assistant
        • Delete Assistant
      • Threads
        • Create Thread
        • Retrieve Thread
        • Modify Thread
        • Delete Thread
      • Messages
        • Create Message
        • List Messages
        • Retrieve Message
        • Modify Message
        • Delete Message
      • Runs
        • Create Run
        • Create Thread and Run
        • List Runs
        • Retrieve Run
        • Modify Run
        • Submit Tool Outputs to Run
        • Cancel Run
      • Run Steps
        • List Run Steps
        • Retrieve Run Steps
    • Completions
    • Gateway for Other API Endpoints
  • Portkey Endpoints
    • Configs
      • Create Config
      • List Configs
      • Retrieve Config
      • Update Config
    • Feedback
      • Create Feedback
      • Update Feedback
    • Guardrails
    • Logs
      • Insert a Log
      • Log Exports [BETA]
        • Retrieve a Log Export
        • Update a Log Export
        • List Log Exports
        • Create a Log Export
        • Start a Log Export
        • Cancel a Log Export
        • Download a Log Export
    • Prompts
      • Prompt Completion
      • Render
    • Virtual Keys
      • Create Virtual Key
      • List Virtual Keys
      • Retrieve Virtual Key
      • Update Virtual Key
      • Delete Virtual Key
    • Analytics
      • Graphs - Time Series Data
        • Get Requests Data
        • Get Cost Data
        • Get Latency Data
        • Get Tokens Data
        • Get Users Data
        • Get Requests per User
        • Get Errors Data
        • Get Error Rate Data
        • Get Status Code Data
        • Get Unique Status Code Data
        • Get Rescued Requests Data
        • Get Cache Hit Rate Data
        • Get Cache Hit Latency Data
        • Get Feedback Data
        • Get Feedback Score Distribution Data
        • Get Weighted Feeback Data
        • Get Feedback Per AI Models
      • Summary
        • Get All Cache Data
      • Groups - Paginated Data
        • Get User Grouped Data
        • Get Model Grouped Data
        • Get Metadata Grouped Data
    • API Keys [BETA]
      • Update API Key
      • Create API Key
      • Delete an API Key
      • Retrieve an API Key
      • List API Keys
    • Admin
      • Users
        • Retrieve a User
        • Retrieve All Users
        • Update a User
        • Remove a User
      • User Invites
        • Invite a User
        • Retrieve an Invite
        • Retrieve All User Invites
        • Delete a User Invite
      • Workspaces
        • Create Workspace
        • Retrieve All Workspaces
        • Retrieve a Workspace
        • Update Workspace
        • Delete a Workspace
      • Workspace Members
        • Add a Workspace Member
        • Retrieve All Workspace Members
        • Retrieve a Workspace Member
        • Update Workspace Member
        • Remove Workspace Member
  • Guides
    • Getting Started
      • A/B Test Prompts and Models
      • Tackling Rate Limiting
      • Function Calling
      • Image Generation
      • Getting started with AI Gateway
      • Llama 3 on Groq
      • Return Repeat Requests from Cache
      • Trigger Automatic Retries on LLM Failures
      • 101 on Portkey's Gateway Configs
    • Integrations
      • Llama 3 on Portkey + Together AI
      • Introduction to GPT-4o
      • Anyscale
      • Mistral
      • Vercel AI
      • Deepinfra
      • Groq
      • Langchain
      • Mixtral 8x22b
      • Segmind
    • Use Cases
      • Few-Shot Prompting
      • Enforcing JSON Schema with Anyscale & Together
      • Detecting Emotions with GPT-4o
      • Build an article suggestion app with Supabase pgvector, and Portkey
      • Setting up resilient Load balancers with failure-mitigating Fallbacks
      • Run Portkey on Prompts from Langchain Hub
      • Smart Fallback with Model-Optimized Prompts
      • How to use OpenAI SDK with Portkey Prompt Templates
      • Setup OpenAI -> Azure OpenAI Fallback
      • Fallback from SDXL to Dall-e-3
      • Comparing Top10 LMSYS Models with Portkey
      • Build a chatbot using Portkey's Prompt Templates
  • Support
    • Contact Us
    • Developer Forum
    • Common Errors & Resolutions
    • December '23 Migration
    • Changelog
Powered by GitBook
On this page
  • Components and Sizing Recommendations
  • Component: AI Gateway
  • Component: Logs Store (optional)
  • Component: Cache (Prompts, Configs & Virtual Keys)
  • Deployment Steps
  • Prerequisites
  • Step 1: Clone the Portkey Repo Containing Helm Chart
  • Step 2: Update values.yaml for Helm
  • Step 3: Deploy Using Helm Charts
  • Step 4: Verify the Deployment
  • Step 5: Port Forwarding (Optional)
  • Uninstalling the Deployment
  • Network Configuration
  • Step 1: Allow Access to the Service
  • Step 2: Ensure Outbound Network Access
  • Step 3: Configure Inbound Access for Portkey Control Plane

Was this helpful?

Edit on GitHub
  1. Product
  2. Enterprise Offering
  3. Private Cloud Deployments

GCP

This enterprise-focused document provides comprehensive instructions for deploying the Portkey software on Google Cloud Platform (GCP), tailored to meet the needs of large-scale, mission-critical applications. It includes specific recommendations for component sizing, high availability, disaster recovery, and integration with monitoring systems.

Components and Sizing Recommendations

Component: AI Gateway

  • Deployment: Deploy as a Docker container in your Kubernetes cluster using Helm Charts.

  • Instance Type: GCP n1-standard-2 instance, with at least 4GiB of memory and two vCPUs.

  • High Availability: Deploy across multiple zones for high reliability.

Component: Logs Store (optional)

  • Options: Hosted MongoDB, Google Cloud Storage (GCS), or Google Firestore.

  • Sizing: Each log document is ~10kb in size (uncompressed).

Component: Cache (Prompts, Configs & Virtual Keys)

  • Options: Google Memorystore for Redis or self-hosted Redis.

  • Deployment: Deploy in the same VPC as the Portkey Gateway.

Deployment Steps

Prerequisites

Ensure the following tools are installed:

  • Docker

  • Kubectl

  • Helm (v3 or above)

Step 1: Clone the Portkey Repo Containing Helm Chart

git clone https://github.com/Portkey-AI/helm-chart

Step 2: Update values.yaml for Helm

Modify the values.yaml file in the Helm chart directory to include the Docker registry credentials and necessary environment variables. You can find the sample file at ./helm-chart/helm/enterprise/values.yaml.

Image Credentials Configuration

imageCredentials:
  name: portkey-enterprise-registry-credentials
  create: true
  registry: kubernetes.io/dockerconfigjson
  username: <docker-username>
  password: <docker-password>

The Portkey team will share the credentials for your image.

Environment Variables Configuration

These can be stored & fetched from a vault as well

environment:
  ...
  data:
    SERVICE_NAME: 
    LOG_STORE: 
    MONGO_DB_CONNECTION_URL: 
    MONGO_DATABASE: 
    MONGO_COLLECTION_NAME: 
    LOG_STORE_REGION: 
    LOG_STORE_ACCESS_KEY: 
    LOG_STORE_SECRET_KEY: 
    LOG_STORE_GENERATIONS_BUCKET: 
    ANALYTICS_STORE: 
    ANALYTICS_STORE_ENDPOINT: 
    ANALYTICS_STORE_USER: 
    ANALYTICS_STORE_PASSWORD: 
    ANALYTICS_LOG_TABLE: 
    ANALYTICS_FEEDBACK_TABLE: 
    CACHE_STORE: 
    REDIS_URL: 
    REDIS_TLS_ENABLED: 
    PORTKEY_CLIENT_AUTH: 
    ORGANISATIONS_TO_SYNC: 

Notes on the Log Store

LOG_STORE can be:

  • Google Cloud Storage (gcs)

  • Hosted MongoDB (mongo)

If the LOG_STORE is mongo, the following environment variables are needed:

MONGO_DB_CONNECTION_URL: 
MONGO_DATABASE:
MONGO_COLLECTION_NAME:

If the LOG_STORE is gcs, the following values are mandatory:

LOG_STORE_REGION: 
LOG_STORE_ACCESS_KEY: 
LOG_STORE_SECRET_KEY: 
LOG_STORE_GENERATIONS_BUCKET: 

You need to generate the Access Key and Secret Key from the respective providers.

Notes on Cache

If CACHE_STORE is set as redis, a Redis instance will also get deployed in the cluster. If you are using custom Redis, then leave it blank. The following values are mandatory:

REDIS_URL: 
REDIS_TLS_ENABLED: 

REDIS_URL defaults to redis://redis:6379 and REDIS_TLS_ENABLED defaults to false.

Notes on Analytics Store

This is hosted in Portkey’s control plane and these credentials will be shared by the Portkey team.

The following are mandatory and are shared by the Portkey Team.

PORTKEY_CLIENT_AUTH:
ORGANISATIONS_TO_SYNC:

Step 3: Deploy Using Helm Charts

Navigate to the directory containing your Helm chart and run the following command to deploy the application:

helm install portkey-gateway ./helm/enterprise --namespace portkeyai --create-namespace

This command installs the Helm chart into the portkeyai namespace.

Step 4: Verify the Deployment

Check the status of your deployment to ensure everything is running correctly:

kubectl get pods -n portkeyai

Step 5: Port Forwarding (Optional)

To access the service over the internet, use port forwarding:

kubectl port-forward <pod-name> -n portkeyai 443:8787

Replace <pod-name> with the name of your pod.

Uninstalling the Deployment

If you need to remove the deployment, run:

helm uninstall portkey-app --namespace portkeyai

This command will uninstall the Helm release and clean up the resources.

Network Configuration

Step 1: Allow Access to the Service

To make the service accessible from outside the cluster, define a Service of type LoadBalancer in your values.yaml or Helm templates. Specify the desired port for external access.

service:
  type: LoadBalancer
  port: <desired_port>
  targetPort: 8787

Replace <desired_port> with the port number for external access with the port the application listens on internally.

Step 2: Ensure Outbound Network Access

By default, Kubernetes allows full outbound access, but if your cluster has NetworkPolicies that restrict egress, configure them to allow outbound traffic.

Example NetworkPolicy for Outbound Access:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress
  namespace: portkeyai
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0

This allows the gateway to access LLMs hosted within your VPC and outside as well. This also enables connection for the sync service to the Portkey Control Plane.

Step 3: Configure Inbound Access for Portkey Control Plane

Ensure the Portkey control plane can access the service either over the internet or through VPC peering.

Over the Internet:

  • Ensure the LoadBalancer security group allows inbound traffic on the specified port.

  • Document the public IP/hostname and port for the control plane connection.

Through VPC Peering:

  • Set up VPC peering between your GCP account and the control plane's GCP account. Requires manual setup by Portkey Team.

This guide provides the necessary steps and configurations to deploy Portkey on GCP effectively, ensuring high availability, scalability, and integration with your existing infrastructure.

PreviousAWSNextAzure

Last updated 11 months ago

Was this helpful?