Portkey's AI gateway supports speech-to-text (STT) models such as OpenAI's Whisper.
Transcription & Translation Usage
Portkey supports both Transcription and Translation methods for STT models and follows the OpenAI signature, where you can send the file (in flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm formats) as part of the API request.
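Because only the formats listed above are accepted, it can help to check a file's extension locally before sending it. The following is a minimal sketch, not part of the Portkey or OpenAI SDKs: the helper name `is_supported_audio` and the constant `SUPPORTED_AUDIO_FORMATS` are hypothetical, and the check looks only at the file extension, not at the actual audio encoding.

```python
from pathlib import Path

# Formats listed in this section as accepted for transcription and translation.
SUPPORTED_AUDIO_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def is_supported_audio(path: str) -> bool:
    """Hypothetical helper: True if the file extension is a listed format."""
    # Path.suffix returns e.g. ".MP3"; normalise case and drop the leading dot.
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_AUDIO_FORMATS


if __name__ == "__main__":
    for candidate in ("/path/to/file.mp3", "/path/to/file.aiff"):
        print(candidate, is_supported_audio(candidate))
```

A check like this only avoids an obviously unsupported upload; the provider still decides whether the file contents are valid.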
Here's an example:
OpenAI NodeJS:
import fs from "fs";
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from "portkey-ai";

const openai = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "OPENAI_VIRTUAL_KEY"
  })
});

// Transcription
async function transcribe() {
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file.mp3"),
    model: "whisper-1",
  });
  console.log(transcription.text);
}
transcribe();

// Translation
async function translate() {
  const translation = await openai.audio.translations.create({
    file: fs.createReadStream("/path/to/file.mp3"),
    model: "whisper-1",
  });
  console.log(translation.text);
}
translate();
OpenAI Python:
from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="PORTKEY_API_KEY",
        virtual_key="OPENAI_VIRTUAL_KEY"
    )
)

audio_file = open("/path/to/file.mp3", "rb")

# Transcription
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file
)
print(transcription.text)

# Rewind the file so the second request does not upload from end-of-file
audio_file.seek(0)

# Translation
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(translation.text)
audio_file.close()
REST:
For Transcriptions:
curl "https://api.portkey.ai/v1/audio/transcriptions" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/audio.mp3 \
  --form model=whisper-1
For Translations:
curl "https://api.portkey.ai/v1/audio/translations" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/audio.mp3 \
  --form model=whisper-1
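The two REST calls differ only in the final path segment, so the URL and headers can be assembled by one small function. The sketch below mirrors the curl commands above using only values that appear in them. The function name `build_audio_request` and the constant `PORTKEY_BASE_URL` are hypothetical, and the `Content-Type` header is deliberately left out on the assumption that your HTTP client sets the multipart boundary itself.

```python
# Base URL taken from the curl examples in this section.
PORTKEY_BASE_URL = "https://api.portkey.ai/v1"


def build_audio_request(task, portkey_api_key, provider_api_key, provider="openai"):
    """Hypothetical helper: return (url, headers) for an audio REST call.

    task must be "transcriptions" or "translations", matching the two
    endpoints shown in the curl examples.
    """
    if task not in ("transcriptions", "translations"):
        raise ValueError(f"unsupported task: {task!r}")
    url = f"{PORTKEY_BASE_URL}/audio/{task}"
    headers = {
        "x-portkey-api-key": portkey_api_key,
        "x-portkey-provider": provider,
        "Authorization": f"Bearer {provider_api_key}",
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_audio_request(
        "transcriptions", "PORTKEY_API_KEY", "OPENAI_API_KEY"
    )
    print(url)
    print(sorted(headers))
```

The returned URL and headers can then be passed to an HTTP client of your choice, with `file` and `model` sent as multipart form fields as in the curl commands.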
Python SDK:
from portkey_ai import Portkey

# Initialize the Portkey client
portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VIRTUAL_KEY"   # Add your provider's virtual key
)

audio_file = open("/path/to/file.mp3", "rb")

# Transcription
transcription = portkey.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file
)
print(transcription.text)

# Rewind the file so the second request does not upload from end-of-file
audio_file.seek(0)

# Translation
translation = portkey.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(translation.text)
audio_file.close()
On completion, the request is logged in the logs UI, where you can see the transcribed or translated text along with the cost and latency incurred.
Supported Providers and Models
Transcription
Translation