Portkey Docs
HomeAPIIntegrationsChangelog
  • Introduction
    • What is Portkey?
    • Make Your First Request
    • Feature Overview
  • Integrations
    • LLMs
      • OpenAI
        • Structured Outputs
        • Prompt Caching
      • Anthropic
        • Prompt Caching
      • Google Gemini
      • Groq
      • Azure OpenAI
      • AWS Bedrock
      • Google Vertex AI
      • Bring Your Own LLM
      • AI21
      • Anyscale
      • Cerebras
      • Cohere
      • Fireworks
      • Deepbricks
      • Deepgram
      • Deepinfra
      • Deepseek
      • Google Palm
      • Huggingface
      • Inference.net
      • Jina AI
      • Lingyi (01.ai)
      • LocalAI
      • Mistral AI
      • Monster API
      • Moonshot
      • Nomic
      • Novita AI
      • Ollama
      • OpenRouter
      • Perplexity AI
      • Predibase
      • Reka AI
      • SambaNova
      • Segmind
      • SiliconFlow
      • Stability AI
      • Together AI
      • Voyage AI
      • Workers AI
      • ZhipuAI / ChatGLM / BigModel
      • Suggest a new integration!
    • Agents
      • Autogen
      • Control Flow
      • CrewAI
      • Langchain Agents
      • LlamaIndex
      • Phidata
      • Bring Your own Agents
    • Libraries
      • Autogen
      • DSPy
      • Instructor
      • Langchain (Python)
      • Langchain (JS/TS)
      • LlamaIndex (Python)
      • LibreChat
      • Promptfoo
      • Vercel
        • Vercel [Depricated]
  • Product
    • Observability (OpenTelemetry)
      • Logs
      • Tracing
      • Analytics
      • Feedback
      • Metadata
      • Filters
      • Logs Export
      • Budget Limits
    • AI Gateway
      • Universal API
      • Configs
      • Multimodal Capabilities
        • Image Generation
        • Function Calling
        • Vision
        • Speech-to-Text
        • Text-to-Speech
      • Cache (Simple & Semantic)
      • Fallbacks
      • Automatic Retries
      • Load Balancing
      • Conditional Routing
      • Request Timeouts
      • Canary Testing
      • Virtual Keys
        • Budget Limits
    • Prompt Library
      • Prompt Templates
      • Prompt Partials
      • Retrieve Prompts
      • Advanced Prompting with JSON Mode
    • Guardrails
      • List of Guardrail Checks
        • Patronus AI
        • Aporia
        • Pillar
        • Bring Your Own Guardrails
      • Creating Raw Guardrails (in JSON)
    • Autonomous Fine-tuning
    • Enterprise Offering
      • Org Management
        • Organizations
        • Workspaces
        • User Roles & Permissions
        • API Keys (AuthN and AuthZ)
      • Access Control Management
      • Budget Limits
      • Security @ Portkey
      • Logs Export
      • Private Cloud Deployments
        • Architecture
        • AWS
        • GCP
        • Azure
        • Cloudflare Workers
        • F5 App Stack
      • Components
        • Log Store
          • MongoDB
    • Open Source
    • Portkey Pro & Enterprise Plans
  • API Reference
    • Introduction
    • Authentication
    • OpenAPI Specification
    • Headers
    • Response Schema
    • Gateway Config Object
    • SDK
  • Provider Endpoints
    • Supported Providers
    • Chat
    • Embeddings
    • Images
      • Create Image
      • Create Image Edit
      • Create Image Variation
    • Audio
      • Create Speech
      • Create Transcription
      • Create Translation
    • Fine-tuning
      • Create Fine-tuning Job
      • List Fine-tuning Jobs
      • Retrieve Fine-tuning Job
      • List Fine-tuning Events
      • List Fine-tuning Checkpoints
      • Cancel Fine-tuning
    • Batch
      • Create Batch
      • List Batch
      • Retrieve Batch
      • Cancel Batch
    • Files
      • Upload File
      • List Files
      • Retrieve File
      • Retrieve File Content
      • Delete File
    • Moderations
    • Assistants API
      • Assistants
        • Create Assistant
        • List Assistants
        • Retrieve Assistant
        • Modify Assistant
        • Delete Assistant
      • Threads
        • Create Thread
        • Retrieve Thread
        • Modify Thread
        • Delete Thread
      • Messages
        • Create Message
        • List Messages
        • Retrieve Message
        • Modify Message
        • Delete Message
      • Runs
        • Create Run
        • Create Thread and Run
        • List Runs
        • Retrieve Run
        • Modify Run
        • Submit Tool Outputs to Run
        • Cancel Run
      • Run Steps
        • List Run Steps
        • Retrieve Run Steps
    • Completions
    • Gateway for Other API Endpoints
  • Portkey Endpoints
    • Configs
      • Create Config
      • List Configs
      • Retrieve Config
      • Update Config
    • Feedback
      • Create Feedback
      • Update Feedback
    • Guardrails
    • Logs
      • Insert a Log
      • Log Exports [BETA]
        • Retrieve a Log Export
        • Update a Log Export
        • List Log Exports
        • Create a Log Export
        • Start a Log Export
        • Cancel a Log Export
        • Download a Log Export
    • Prompts
      • Prompt Completion
      • Render
    • Virtual Keys
      • Create Virtual Key
      • List Virtual Keys
      • Retrieve Virtual Key
      • Update Virtual Key
      • Delete Virtual Key
    • Analytics
      • Graphs - Time Series Data
        • Get Requests Data
        • Get Cost Data
        • Get Latency Data
        • Get Tokens Data
        • Get Users Data
        • Get Requests per User
        • Get Errors Data
        • Get Error Rate Data
        • Get Status Code Data
        • Get Unique Status Code Data
        • Get Rescued Requests Data
        • Get Cache Hit Rate Data
        • Get Cache Hit Latency Data
        • Get Feedback Data
        • Get Feedback Score Distribution Data
        • Get Weighted Feeback Data
        • Get Feedback Per AI Models
      • Summary
        • Get All Cache Data
      • Groups - Paginated Data
        • Get User Grouped Data
        • Get Model Grouped Data
        • Get Metadata Grouped Data
    • API Keys [BETA]
      • Update API Key
      • Create API Key
      • Delete an API Key
      • Retrieve an API Key
      • List API Keys
    • Admin
      • Users
        • Retrieve a User
        • Retrieve All Users
        • Update a User
        • Remove a User
      • User Invites
        • Invite a User
        • Retrieve an Invite
        • Retrieve All User Invites
        • Delete a User Invite
      • Workspaces
        • Create Workspace
        • Retrieve All Workspaces
        • Retrieve a Workspace
        • Update Workspace
        • Delete a Workspace
      • Workspace Members
        • Add a Workspace Member
        • Retrieve All Workspace Members
        • Retrieve a Workspace Member
        • Update Workspace Member
        • Remove Workspace Member
  • Guides
    • Getting Started
      • A/B Test Prompts and Models
      • Tackling Rate Limiting
      • Function Calling
      • Image Generation
      • Getting started with AI Gateway
      • Llama 3 on Groq
      • Return Repeat Requests from Cache
      • Trigger Automatic Retries on LLM Failures
      • 101 on Portkey's Gateway Configs
    • Integrations
      • Llama 3 on Portkey + Together AI
      • Introduction to GPT-4o
      • Anyscale
      • Mistral
      • Vercel AI
      • Deepinfra
      • Groq
      • Langchain
      • Mixtral 8x22b
      • Segmind
    • Use Cases
      • Few-Shot Prompting
      • Enforcing JSON Schema with Anyscale & Together
      • Detecting Emotions with GPT-4o
      • Build an article suggestion app with Supabase pgvector, and Portkey
      • Setting up resilient Load balancers with failure-mitigating Fallbacks
      • Run Portkey on Prompts from Langchain Hub
      • Smart Fallback with Model-Optimized Prompts
      • How to use OpenAI SDK with Portkey Prompt Templates
      • Setup OpenAI -> Azure OpenAI Fallback
      • Fallback from SDXL to Dall-e-3
      • Comparing Top10 LMSYS Models with Portkey
      • Build a chatbot using Portkey's Prompt Templates
  • Support
    • Contact Us
    • Developer Forum
    • Common Errors & Resolutions
    • December '23 Migration
    • Changelog
Powered by GitBook
On this page

Was this helpful?

Edit on GitHub
  1. Provider Endpoints
  2. Assistants API
  3. Runs

Create Run

PreviousRunsNextCreate Thread and Run

Last updated 9 months ago

Was this helpful?

Supported Providers
  • OpenAI

Create a run.

post
Authorizations
Path parameters
thread_idstringRequired

The ID of the thread to run.

Body
assistant_idstringRequired

The ID of the assistant to use to execute this run.

modelany of | nullableOptional

The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.

Example: gpt-4-turbo
stringOptional
or
string · enumOptionalPossible values:
instructionsstring | nullableOptional

Overrides the instructions of the assistant. This is useful for modifying the behavior on a per-run basis.

additional_instructionsstring | nullableOptional

Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.

metadataobject | nullableOptional

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maxium of 512 characters long.

temperaturenumber | nullableOptional

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Default: 1Example: 1
top_pnumber | nullableOptional

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.

Default: 1Example: 1
streamboolean | nullableOptional

If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.

max_prompt_tokensinteger | nullableOptional

The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.

max_completion_tokensinteger | nullableOptional

The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.

tool_choiceone of | nullableOptional

Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools before responding to the user. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

string · enumOptional

none means the model will not call any tools and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools before responding to the user.

Possible values:
or
parallel_tool_callsbooleanOptional

Whether to enable parallel function calling during tool use.

Default: true
response_formatone of | nullableOptional

Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

string · enumOptional

auto is the default value

Possible values:
or
Responses
200
OK
application/json
200

OK

{
  "id": "text",
  "object": "thread.run",
  "created_at": 1,
  "thread_id": "text",
  "assistant_id": "text",
  "status": "queued",
  "required_action": {
    "type": "submit_tool_outputs",
    "submit_tool_outputs": {
      "tool_calls": [
        {
          "id": "text",
          "type": "function",
          "function": {
            "name": "text",
            "arguments": "text"
          }
        }
      ]
    }
  },
  "last_error": {
    "code": "server_error",
    "message": "text"
  },
  "expires_at": 1,
  "started_at": 1,
  "cancelled_at": 1,
  "failed_at": 1,
  "completed_at": 1,
  "incomplete_details": {
    "reason": "max_completion_tokens"
  },
  "model": "text",
  "instructions": "text",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "metadata": {},
  "usage": {
    "completion_tokens": 1,
    "prompt_tokens": 1,
    "total_tokens": 1
  },
  "temperature": 1,
  "top_p": 1,
  "max_prompt_tokens": 1,
  "max_completion_tokens": 1,
  "truncation_strategy": {
    "type": "auto",
    "last_messages": 1
  },
  "tool_choice": "none",
  "parallel_tool_calls": true,
  "response_format": "none"
}