@@ -5,13 +5,13 @@ description: Latest preview data plane inference documentation generated from Op
manager: nitinme
ms.service: azure-ai-openai
ms.topic: include
-ms.date: 01/08/2024
+ms.date: 01/29/2025
---
## Completions - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-01-01-preview
```
Creates a completion for the provided prompt, parameters and chosen model.
@@ -41,18 +41,36 @@ Creates a completion for the provided prompt, parameters and chosen model.
| echo | boolean | Echo back the prompt in addition to the completion<br> | No | False |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.<br> | No | 0 |
| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion.<br><br>Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.<br><br>As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated.<br> | No | None |
-| logprobs | integer | Include the log probabilities on the `logprobs` most likely output tokens, as well the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the five most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response.<br><br>The maximum value for `logprobs` is 5.<br> | No | None |
-| max_tokens | integer | The maximum number of tokens that can be generated in the completion.<br><br>The token count of your prompt plus `max_tokens` can't exceed the model's context length. <br> | No | 16 |
+| logprobs | integer | Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response.<br><br>The maximum value for `logprobs` is 5.<br> | No | None |
+| max_tokens | integer | The maximum number of tokens that can be generated in the completion.<br><br>The token count of your prompt plus `max_tokens` can't exceed the model's context length. | No | 16 |
| n | integer | How many completions to generate for each prompt.<br><br>**Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.<br> | No | 1 |
+| modalities | [ChatCompletionModalities](#chatcompletionmodalities) | Output types that you would like the model to generate for this request.<br>Most models are capable of generating text, which is the default:<br><br>`["text"]`<br><br>The `gpt-4o-audio-preview` model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:<br><br>`["text", "audio"]`<br> | No | |
+| prediction | [PredictionContent](#predictioncontent) | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. | No | |
+| audio | object | Parameters for audio output. Required when audio output is requested with `modalities: ["audio"]`. | No | |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.<br> | No | 0 |
| seed | integer | If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result.<br><br>Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.<br> | No | |
| stop | string or array | Up to four sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.<br> | No | |
-| stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. <br> | No | False |
+| stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. | No | False |
| suffix | string | The suffix that comes after a completion of inserted text.<br><br>This parameter is only supported for `gpt-3.5-turbo-instruct`.<br> | No | None |
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.<br><br>We generally recommend altering this or `top_p` but not both.<br> | No | 1 |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.<br><br>We generally recommend altering this or `temperature` but not both.<br> | No | 1 |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse.<br> | No | |
+
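+A minimal sketch of a `prediction` value using the `content` type; the predicted text is illustrative, and [PredictionContent](#predictioncontent) remains the authoritative schema:
+
+```json
+{
+  "prediction": {
+    "type": "content",
+    "content": "def add(a, b):\n    return a + b"
+  }
+}
+```
+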
+### Properties for audio
+
+#### voice
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| voice | string | Specifies the voice type. Supported voices are `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer`. | |
+
+#### format
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| format | string | Specifies the output audio format. Must be one of `wav`, `mp3`, `flac`, `opus`, or `pcm16`. | |
+
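+A hedged sketch of how the audio properties above combine with `modalities` in a request body (the voice and format choices are illustrative):
+
+```json
+{
+  "modalities": ["text", "audio"],
+  "audio": {
+    "voice": "alloy",
+    "format": "wav"
+  }
+}
+```
+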
### Responses
**Status Code:** 200
@@ -79,7 +97,7 @@ Creates a completion for the provided prompt, parameters and chosen model.
Creates a completion for the provided prompt, parameters and chosen model.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-01-01-preview
{
"prompt": [
@@ -119,7 +137,7 @@ Status Code: 200
## Embeddings - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-01-01-preview
```
Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.
@@ -190,7 +208,7 @@ Get a vector representation of a given input that can be easily consumed by mach
Return the embeddings for a given prompt.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-01-01-preview
{
"input": [
@@ -249,8 +267,7 @@ Status Code: 200
-0.021560553,
0.016515596,
-0.015572986,
- 0.0038666942,
- -8.432463e-05
+ 0.0038666942
]
}
],
@@ -265,7 +282,7 @@ Status Code: 200
## Chat completions - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
```
Creates a completion for the chat message
@@ -292,19 +309,19 @@ Creates a completion for the chat message
|------|------|-------------|----------|---------|
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.<br><br>We generally recommend altering this or `top_p` but not both.<br> | No | 1 |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.<br><br>We generally recommend altering this or `temperature` but not both.<br> | No | 1 |
-| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. <br> | No | False |
+| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. | No | False |
| stop | string or array | Up to four sequences where the API will stop generating further tokens.<br> | No | |
-| max_tokens | integer | The maximum number of tokens that can be generated in the chat completion.<br><br>The total length of input tokens and generated tokens is limited by the model's context length. <br> | No | |
+| max_tokens | integer | The maximum number of tokens that can be generated in the chat completion.<br><br>The total length of input tokens and generated tokens is limited by the model's context length. | No | |
-| max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Will expand the support to other models in future API release. | No | |
+| max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will be expanded to other models in a future API release. | No | |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.<br> | No | 0 |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.<br> | No | 0 |
| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion.<br><br>Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.<br> | No | None |
| store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
| metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse.<br> | No | |
-| messages | array | A list of messages comprising the conversation so far. | Yes | |
+| messages | array | A list of messages comprising the conversation so far. | Yes | |
| data_sources | array | The configuration entries for Azure OpenAI chat extensions that use them.<br> This additional specification is only compatible with Azure OpenAI. | No | |
-| reasoning_effort | enum | **o1 models only** <br><br> Constrains effort on reasoning for <br>reasoning models.<br><br>Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.<br>Possible values: low, medium, high | No | |
+| reasoning_effort | enum | **o1 models only** <br><br> Constrains effort on reasoning for reasoning models.<br><br>Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.<br>Possible values: low, medium, high | No | |
| logprobs | boolean | Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`. | No | False |
| top_logprobs | integer | An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used. | No | |
| n | integer | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs. | No | 1 |
@@ -343,7 +360,7 @@ Creates a completion for the chat message
Creates a completion for the provided prompt, parameters and chosen model.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -391,7 +408,7 @@ Status Code: 200
Creates a completion based on Azure Search data and system-assigned managed identity.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -459,7 +476,7 @@ Status Code: 200
Creates a completion based on Azure Search image vector data.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -522,7 +539,7 @@ Status Code: 200
Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -623,7 +640,7 @@ Status Code: 200
Creates a completion for the provided Azure Cosmos DB.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -705,7 +722,7 @@ Status Code: 200
Creates a completion for the provided Mongo DB.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -790,7 +807,7 @@ Status Code: 200
Creates a completion for the provided Elasticsearch.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -860,7 +877,7 @@ Status Code: 200
Creates a completion for the provided Pinecone resource.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-01-01-preview
{
"messages": [
@@ -940,7 +957,7 @@ Status Code: 200
## Transcriptions - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-01-01-preview
```
Transcribes audio into the input language.
@@ -990,7 +1007,7 @@ Transcribes audio into the input language.
Gets transcribed text and associated metadata from provided spoken audio data.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-01-01-preview
```
@@ -1009,7 +1026,7 @@ Status Code: 200
Gets transcribed text and associated metadata from provided spoken audio data.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-01-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
@@ -1027,7 +1044,7 @@ Status Code: 200
## Translations - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-01-01-preview
```
Transcribes and translates input audio into English text.
@@ -1075,7 +1092,7 @@ Transcribes and translates input audio into English text.
Gets English language transcribed text and associated metadata from provided spoken audio data.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-01-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
@@ -1096,7 +1113,7 @@ Status Code: 200
Gets English language transcribed text and associated metadata from provided spoken audio data.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-01-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
@@ -1114,7 +1131,7 @@ Status Code: 200
## Speech - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-01-01-preview
```
Generates audio from the input text.
@@ -1161,7 +1178,7 @@ Generates audio from the input text.
Synthesizes audio from the provided text.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-01-01-preview
{
"input": "Hi! What are you going to make?",
@@ -1182,7 +1199,7 @@ Status Code: 200
## Image generations - Create
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-01-01-preview
```
-Generates a batch of images from a text caption on a given DALLE model deployment
+Generates a batch of images from a text caption on a given DALL-E model deployment.
@@ -1240,7 +1257,7 @@ Generates a batch of images from a text caption on a given DALLE model deploymen
Creates images given a prompt.
```HTTP
-POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-01-01-preview
{
"prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
@@ -1314,7 +1331,7 @@ Status Code: 200
## List - Assistants
```HTTP
-GET https://{endpoint}/openai/assistants?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/assistants?api-version=2025-01-01-preview
```
Returns a list of assistants.
@@ -1353,7 +1370,7 @@ Returns a list of assistants.
Returns a list of assistants.
```HTTP
-GET https://{endpoint}/openai/assistants?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/assistants?api-version=2025-01-01-preview
```
@@ -1424,7 +1441,7 @@ Status Code: 200
## Create - Assistant
```HTTP
-POST https://{endpoint}/openai/assistants?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/assistants?api-version=2025-01-01-preview
```
Create an assistant with a model and instructions.
@@ -1492,7 +1509,7 @@ Create an assistant with a model and instructions.
Create an assistant with a model and instructions.
```HTTP
-POST https://{endpoint}/openai/assistants?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/assistants?api-version=2025-01-01-preview
{
"name": "Math Tutor",
@@ -1535,7 +1552,7 @@ Status Code: 200
## Get - Assistant
```HTTP
-GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
```
Retrieves an assistant.
@@ -1571,7 +1588,7 @@ Retrieves an assistant.
Retrieves an assistant.
```HTTP
-GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
```
@@ -1603,7 +1620,7 @@ Status Code: 200
## Modify - Assistant
```HTTP
-POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
```
Modifies an assistant.
@@ -1671,7 +1688,7 @@ Modifies an assistant.
Modifies an assistant.
```HTTP
-POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
{
"instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always response with info from either of the files.",
@@ -1718,7 +1735,7 @@ Status Code: 200
## Delete - Assistant
```HTTP
-DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
```
Delete an assistant.
@@ -1754,7 +1771,7 @@ Delete an assistant.
Deletes an assistant.
```HTTP
-DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-01-01-preview
```
@@ -1773,7 +1790,7 @@ Status Code: 200
## Create - Thread
```HTTP
-POST https://{endpoint}/openai/threads?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads?api-version=2025-01-01-preview
```
Create a thread.
@@ -1834,7 +1851,7 @@ Create a thread.
Creates a thread.
```HTTP
-POST https://{endpoint}/openai/threads?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads?api-version=2025-01-01-preview
```
@@ -1854,7 +1871,7 @@ Status Code: 200
## Get - Thread
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
```
Retrieves a thread.
@@ -1890,7 +1907,7 @@ Retrieves a thread.
Retrieves a thread.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
```
@@ -1915,7 +1932,7 @@ Status Code: 200
## Modify - Thread
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
```
Modifies a thread.
@@ -1975,7 +1992,7 @@ Modifies a thread.
Modifies a thread.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
{
"metadata": {
@@ -2006,7 +2023,7 @@ Status Code: 200
## Delete - Thread
```HTTP
-DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
```
Delete a thread.
@@ -2042,7 +2059,7 @@ Delete a thread.
Deletes a thread.
```HTTP
-DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-01-01-preview
```
@@ -2061,7 +2078,7 @@ Status Code: 200
## List - Messages
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-01-01-preview
```
Returns a list of messages for a given thread.
@@ -2102,7 +2119,7 @@ Returns a list of messages for a given thread.
List Messages
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-01-01-preview
```
@@ -2164,7 +2181,7 @@ Status Code: 200
## Create - Message
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-01-01-preview
```
Create a message.
@@ -2211,7 +2228,7 @@ Create a message.
Create a message.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-01-01-preview
{
"role": "user",
@@ -2250,7 +2267,7 @@ Status Code: 200
## Get - Message
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-01-01-preview
```
Retrieve a message.
@@ -2287,7 +2304,7 @@ Retrieve a message.
Retrieve a message.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-01-01-preview
```
@@ -2321,7 +2338,7 @@ Status Code: 200
## Modify - Message
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-01-01-preview
```
Modifies a message.
@@ -2366,7 +2383,7 @@ Modifies a message.
Modify a message.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-01-01-preview
{
"metadata": {
@@ -2410,7 +2427,7 @@ Status Code: 200
## Create - Thread And Run
```HTTP
-POST https://{endpoint}/openai/threads/runs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/runs?api-version=2025-01-01-preview
```
Create a thread and run it in one request.
@@ -2436,7 +2453,7 @@ Create a thread and run it in one request.
|------|------|-------------|----------|---------|
| assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
| thread | [createThreadRequest](#createthreadrequest) | | No | |
-| model | string | The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
+| model | string | The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
| instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
| tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
| tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.<br> | No | |
@@ -2484,7 +2501,7 @@ Create a thread and run it in one request.
Create a thread and run it in one request.
```HTTP
-POST https://{endpoint}/openai/threads/runs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/runs?api-version=2025-01-01-preview
{
"assistant_id": "asst_abc123",
@@ -2542,7 +2559,7 @@ Status Code: 200
## List - Runs
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-01-01-preview
```
Returns a list of runs belonging to a thread.
@@ -2582,7 +2599,7 @@ Returns a list of runs belonging to a thread.
Returns a list of runs belonging to a thread.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-01-01-preview
```
@@ -2696,7 +2713,7 @@ Status Code: 200
## Create - Run
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-01-01-preview
```
Create a run.
@@ -2756,7 +2773,7 @@ Create a run.
Create a run.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-01-01-preview
{
"assistant_id": "asst_abc123"
@@ -2808,7 +2825,7 @@ Status Code: 200
## Get - Run
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-01-01-preview
```
Retrieves a run.
@@ -2845,7 +2862,7 @@ Retrieves a run.
Gets a run.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-01-01-preview
```
@@ -2878,7 +2895,7 @@ Status Code: 200
## Modify - Run
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-01-01-preview
```
Modifies a run.
@@ -2923,7 +2940,7 @@ Modifies a run.
Modifies a run.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-01-01-preview
{
"metadata": {
@@ -2991,7 +3008,7 @@ Status Code: 200
## Submit - Tool Outputs To Run
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-01-01-preview
```
When a run has the `status: "requires_action"` and `required_action.type` is `submit_tool_outputs`, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.
@@ -3039,7 +3056,7 @@ When a run has the `status: "requires_action"` and `required_action.type` is `su
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-01-01-preview
{
"tool_outputs": [
@@ -3118,7 +3135,7 @@ Status Code: 200
## Cancel - Run
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-01-01-preview
```
Cancels a run that is `in_progress`.
@@ -3156,7 +3173,7 @@ Cancels a run that is `in_progress`.
```HTTP
-POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-01-01-preview
```
@@ -3203,7 +3220,7 @@ Status Code: 200
## List - Run Steps
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-01-01-preview
```
Returns a list of run steps belonging to a run.
@@ -3245,7 +3262,7 @@ Returns a list of run steps belonging to a run.
Returns a list of run steps belonging to a run.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-01-01-preview
```
@@ -3293,7 +3310,7 @@ Status Code: 200
## Get - Run Step
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-01-01-preview
```
Retrieves a run step.
@@ -3333,7 +3350,7 @@ Retrieves a run step.
Retrieves a run step.
```HTTP
-GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-01-01-preview
```
@@ -3373,7 +3390,7 @@ Status Code: 200
## List - Vector Stores
```HTTP
-GET https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores?api-version=2025-01-01-preview
```
Returns a list of vector stores.
@@ -3412,7 +3429,7 @@ Returns a list of vector stores.
Returns a list of vector stores.
```HTTP
-GET https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores?api-version=2025-01-01-preview
```
@@ -3462,7 +3479,7 @@ Status Code: 200
## Create - Vector Store
```HTTP
-POST https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores?api-version=2025-01-01-preview
```
Create a vector store.
@@ -3509,7 +3526,7 @@ Create a vector store.
Creates a vector store.
```HTTP
-POST https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores?api-version=2025-01-01-preview
```
@@ -3537,7 +3554,7 @@ Status Code: 200
## Get - Vector Store
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
```
Retrieves a vector store.
@@ -3573,7 +3590,7 @@ Retrieves a vector store.
Retrieves a vector store.
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
```
@@ -3592,7 +3609,7 @@ Status Code: 200
## Modify - Vector Store
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
```
Modifies a vector store.
@@ -3638,7 +3655,7 @@ Modifies a vector store.
Modifies a vector store.
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
{
"name": "Support FAQ"
@@ -3670,7 +3687,7 @@ Status Code: 200
## Delete - Vector Store
```HTTP
-DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
```
Delete a vector store.
@@ -3706,7 +3723,7 @@ Delete a vector store.
Deletes a vector store.
```HTTP
-DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-01-01-preview
```
@@ -3725,7 +3742,7 @@ Status Code: 200
## List - Vector Store Files
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-01-01-preview
```
Returns a list of vector store files.
@@ -3766,7 +3783,7 @@ Returns a list of vector store files.
Returns a list of vector store files.
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-01-01-preview
```
@@ -3800,7 +3817,7 @@ Status Code: 200
## Create - Vector Store File
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-01-01-preview
```
Create a vector store file by attaching a File to a vector store.
@@ -3845,7 +3862,7 @@ Create a vector store file by attaching a File to a vector store.
Create a vector store file by attaching a File to a vector store.
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-01-01-preview
{
"file_id": "file-abc123"
@@ -3872,7 +3889,7 @@ Status Code: 200
## Get - Vector Store File
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-01-01-preview
```
Retrieves a vector store file.
@@ -3909,7 +3926,7 @@ Retrieves a vector store file.
Retrieves a vector store file.
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-01-01-preview
```
@@ -3931,7 +3948,7 @@ Status Code: 200
## Delete - Vector Store File
```HTTP
-DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-01-01-preview
```
Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.
@@ -3968,7 +3985,7 @@ Delete a vector store file. This will remove the file from the vector store but
Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.
```HTTP
-DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview
+DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-01-01-preview
```
@@ -3987,7 +4004,7 @@ Status Code: 200
## Create - Vector Store File Batch
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-01-01-preview
```
Create a vector store file batch.
@@ -4032,7 +4049,7 @@ Create a vector store file batch.
Create a vector store file batch.
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-01-01-preview
{
"file_ids": [
@@ -4065,7 +4082,7 @@ Status Code: 200
## Get - Vector Store File Batch
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-01-01-preview
```
Retrieves a vector store file batch.
@@ -4102,7 +4119,7 @@ Retrieves a vector store file batch.
Retrieves a vector store file batch.
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-01-01-preview
```
@@ -4130,7 +4147,7 @@ Status Code: 200
## Cancel - Vector Store File Batch
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-01-01-preview
```
Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.
@@ -4167,7 +4184,7 @@ Cancel a vector store file batch. This attempts to cancel the processing of file
Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.
```HTTP
-POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-12-01-preview
+POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-01-01-preview
```
@@ -4195,7 +4212,7 @@ Status Code: 200
## List - Vector Store File Batch Files
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-01-01-preview
```
Returns a list of vector store files in a batch.
@@ -4237,7 +4254,7 @@ Returns a list of vector store files in a batch.
Returns a list of vector store files.
```HTTP
-GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-12-01-preview
+GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-01-01-preview
```
@@ -4576,20 +4593,38 @@ Information about the content filtering category (hate, sexual, violence, self_h
| best_of | integer | Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed.<br><br>When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return – `best_of` must be greater than `n`.<br><br>**Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.<br> | No | 1 |
| echo | boolean | Echo back the prompt in addition to the completion<br> | No | False |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.<br> | No | 0 |
-| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion.<br><br>Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.<br><br>As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated.<br> | No | None |
-| logprobs | integer | Include the log probabilities on the `logprobs` most likely output tokens, as well the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the five most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response.<br><br>The maximum value for `logprobs` is 5.<br> | No | None |
-| max_tokens | integer | The maximum number of tokens that can be generated in the completion.<br><br>The token count of your prompt plus `max_tokens` can't exceed the model's context length. <br> | No | 16 |
+| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion.<br><br>Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.<br><br>As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated.<br> | No | None |
+| logprobs | integer | Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response.<br><br>The maximum value for `logprobs` is 5.<br> | No | None |
+| max_tokens | integer | The maximum number of tokens that can be generated in the completion.<br><br>The token count of your prompt plus `max_tokens` can't exceed the model's context length. | No | 16 |
| n | integer | How many completions to generate for each prompt.<br><br>**Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.<br> | No | 1 |
+| modalities | [ChatCompletionModalities](#chatcompletionmodalities) | Output types that you would like the model to generate for this request.<br>Most models are capable of generating text, which is the default:<br><br>`["text"]`<br><br>The `gpt-4o-audio-preview` model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:<br><br>`["text", "audio"]`<br> | No | |
+| prediction | [PredictionContent](#predictioncontent) | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. | No | |
+| audio | object | Parameters for audio output. Required when audio output is requested with `modalities: ["audio"]`. | No | |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.<br> | No | 0 |
| seed | integer | If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result.<br><br>Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.<br> | No | |
| stop | string or array | Up to four sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.<br> | No | |
-| stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. <br> | No | False |
+| stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. | No | False |
| suffix | string | The suffix that comes after a completion of inserted text.<br><br>This parameter is only supported for `gpt-3.5-turbo-instruct`.<br> | No | None |
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.<br><br>We generally recommend altering this or `top_p` but not both.<br> | No | 1 |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.<br><br>We generally recommend altering this or `temperature` but not both.<br> | No | 1 |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse.<br> | No | |
+
+### Properties for audio
+
+#### voice
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| voice | string | Specifies the voice type. Supported voices are `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer`. | |
+
+#### format
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| format | string | Specifies the output audio format. Must be one of `wav`, `mp3`, `flac`, `opus`, or `pcm16`. | |
+
+
### createCompletionResponse
Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).
@@ -4615,17 +4650,17 @@ Represents a completion response from the API. Note: both the streamed and non-s
|------|------|-------------|----------|---------|
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.<br><br>We generally recommend altering this or `top_p` but not both.<br> | No | 1 |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.<br><br>We generally recommend altering this or `temperature` but not both.<br> | No | 1 |
-| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. <br> | No | False |
+| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. | No | False |
| stop | string or array | Up to four sequences where the API will stop generating further tokens.<br> | No | |
-| max_tokens | integer | The maximum number of tokens that can be generated in the chat completion.<br><br>The total length of input tokens and generated tokens is limited by the model's context length. <br> | No | |
+| max_tokens | integer | The maximum number of tokens that can be generated in the chat completion.<br><br>The total length of input tokens and generated tokens is limited by the model's context length. | No | |
-| max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Will expand the support to other models in future API release. | No | |
+| max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will be expanded to other models in a future API release. | No | |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.<br> | No | 0 |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.<br> | No | 0 |
| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion.<br><br>Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.<br> | No | None |
| store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
| metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse.<br> | No | |
-| messages | array | A list of messages comprising the conversation so far. | Yes | |
+| messages | array | A list of messages comprising the conversation so far. | Yes | |
| data_sources | array | The configuration entries for Azure OpenAI chat extensions that use them.<br> This additional specification is only compatible with Azure OpenAI. | No | |
-| reasoning_effort | enum | **o1 models only** <br><br> Constrains effort on reasoning for <br>reasoning models.<br><br>Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.<br>Possible values: low, medium, high | No | |
+| reasoning_effort | enum | **o1 models only** <br><br> Constrains effort on reasoning for reasoning models.<br><br>Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.<br>Possible values: low, medium, high | No | |
| logprobs | boolean | Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`. | No | False |
@@ -4662,7 +4697,7 @@ User security context contains several parameters that describe the AI applicati
|------|------|-------------|----------|---------|
| description | string | A description of what the function does, used by the model to choose when and how to call the function. | No | |
| name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | Yes | |
-| parameters | [FunctionParameters](#functionparameters) | The parameters the functions accepts, described as a JSON Schema object. See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. <br><br>Omitting `parameters` defines a function with an empty parameter list. | No | |
+| parameters | [FunctionParameters](#functionparameters) | The parameters the functions accepts, described as a JSON Schema object. [See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. <br><br>Omitting `parameters` defines a function with an empty parameter list. | No | |
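+
+As a hedged illustration, a function definition whose `parameters` field is a small JSON Schema object (the function name and fields are hypothetical):
+
+```json
+{
+  "name": "get_weather",
+  "description": "Get the current weather for a given city.",
+  "parameters": {
+    "type": "object",
+    "properties": {
+      "city": {
+        "type": "string",
+        "description": "The city to look up, for example Seattle."
+      }
+    },
+    "required": ["city"]
+  }
+}
+```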
### chatCompletionFunctionCallOption
@@ -4743,7 +4778,7 @@ With o1 models and newer, `developer` messages replace the previous `system` mes
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
+| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
#### name
@@ -4819,6 +4854,30 @@ This component can be one of the following:
| text | string | The text content. | Yes | |
+### chatCompletionRequestMessageContentPartAudio
+
+
+| Name | Type | Description | Required | Default |
+|------|------|-------------|----------|---------|
+| type | enum | The type of the content part. Always `input_audio`.<br>Possible values: input_audio | Yes | |
+| input_audio | object | | Yes | |
+
+
+### Properties for input_audio
+
+#### data
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| data | string | Base64 encoded audio data. | |
+
+#### format
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| format | string | The format of the encoded audio data. Currently supports "wav" and "mp3".<br> | |
+
+
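As a sketch, a user message carrying an `input_audio` content part might be shaped like this; the Base64 payload is truncated to a placeholder:

```json
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Please transcribe this clip." },
    {
      "type": "input_audio",
      "input_audio": {
        "data": "<base64-encoded audio>",
        "format": "wav"
      }
    }
  ]
}
```
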
### chatCompletionRequestMessageContentPartImage
@@ -4981,7 +5040,7 @@ MongoDB vCore.
|------|------|-------------|----------|---------|
| authentication | [onYourDataConnectionStringAuthenticationOptions](#onyourdataconnectionstringauthenticationoptions) | The authentication options for Azure OpenAI On Your Data when using a connection string. | Yes | |
| top_n_documents | integer | The configured top number of documents to feature for the configured query. | No | |
-| max_search_queries | integer | The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. | No | |
+| max_search_queries | integer | The max number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send. | No | |
| allow_partial_result | boolean | If specified as true, the system will allow partial search results to be used, and the request will fail only if all of the queries fail. If not specified, or specified as false, the request will fail if any search query fails. | No | False |
| in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
| strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
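
A partial sketch of a `data_sources` entry using these options follows. The `type` value and the `authentication` shape are assumptions based on the surrounding schema names, and required connection details such as database, collection, index, and embedding settings are omitted for brevity:

```json
{
  "data_sources": [
    {
      "type": "azure_cosmos_db",
      "parameters": {
        "authentication": {
          "type": "connection_string",
          "connection_string": "<connection string>"
        },
        "top_n_documents": 5,
        "max_search_queries": 3,
        "allow_partial_result": false,
        "in_scope": true,
        "strictness": 3
      }
    }
  ]
}
```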
@@ -5464,7 +5523,7 @@ The filtering reason of the retrieved document.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
+| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
### toolCallType
@@ -5556,7 +5615,7 @@ A chat completion delta generated by streamed model responses.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
+| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
#### name
@@ -5589,7 +5648,7 @@ A chat completion delta generated by streamed model responses.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
+| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | |
### chatCompletionStreamOptions
@@ -5635,9 +5694,37 @@ A chat completion message generated by the model.
| content | string | The contents of the message. | Yes | |
| tool_calls | array | The tool calls generated by the model, such as function calls. | No | |
| function_call | [chatCompletionFunctionCall](#chatcompletionfunctioncall) | Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model. | No | |
+| audio | object | If the audio output modality is requested, this object contains data<br>about the audio response from the model. | No | |
| context | [azureChatExtensionsMessageContext](#azurechatextensionsmessagecontext) | A representation of the additional context information available when Azure OpenAI chat extensions are involved<br> in the generation of a corresponding chat completions response. This context information is only populated when<br> using an Azure OpenAI request configured to use a matching extension. | No | |
+### Properties for audio
+
+#### id
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| id | string | Unique identifier for this audio response. | |
+
+#### expires_at
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| expires_at | integer | The Unix timestamp (in seconds) for when this audio response will<br>no longer be accessible on the server for use in multi-turn<br>conversations.<br> | |
+
+#### data
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| data | string | Base64 encoded audio bytes generated by the model, in the format<br>specified in the request.<br> | |
+
+#### transcript
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| transcript | string | Transcript of the audio generated by the model. | |
+
+
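Putting these properties together, an assistant message carrying audio output might look like the following sketch; the identifier, timestamp, and transcript are illustrative, and the Base64 data is truncated to a placeholder:

```json
{
  "role": "assistant",
  "content": null,
  "audio": {
    "id": "audio_abc123",
    "expires_at": 1738368000,
    "data": "<base64-encoded audio>",
    "transcript": "Hello! How can I help you today?"
  }
}
```
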
### chatCompletionResponseMessageRole
The role of the author of the response message.
@@ -5686,21 +5773,48 @@ Whether to enable parallel function calling during tool use.
No properties defined for this component.
+### PredictionContent
+
+Static predicted output content, such as the content of a text file that is being regenerated.
+
+| Name | Type | Description | Required | Default |
+|------|------|-------------|----------|---------|
+| type | enum | The type of the predicted content you want to provide. This type is currently always `content`.<br>Possible values: content | Yes | |
+| content | string or array | The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. | Yes | |
+
+
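For instance, when regenerating a file with only small edits, a request might pass the existing file contents as the prediction, as in this sketch (the file content is shortened):

```json
{
  "prediction": {
    "type": "content",
    "content": "class Greeter {\n  greet() {\n    return \"Hello, world!\";\n  }\n}\n"
  }
}
```
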
### chatCompletionMessageToolCalls
The tool calls generated by the model, such as function calls.
No properties defined for this component.
+### ChatCompletionModalities
+
+Output types that you would like the model to generate for this request.
+Most models are capable of generating text, which is the default:
+
+`["text"]`
+
+The `gpt-4o-audio-preview` model can also be used to generate audio. To
+request that this model generate both text and audio responses, you can
+use:
+
+`["text", "audio"]`
+
+
+No properties defined for this component.
+
+
### chatCompletionFunctionCall
Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.
| Name | Type | Description | Required | Default |
|------|------|-------------|----------|---------|
| name | string | The name of the function to call. | Yes | |
-| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | Yes | |
+| arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | Yes | |
### completionUsage
@@ -5718,6 +5832,12 @@ Usage statistics for the completion request.
### Properties for prompt_tokens_details
+#### audio_tokens
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| audio_tokens | integer | Audio input tokens present in the prompt. | |
+
#### cached_tokens
| Name | Type | Description | Default |
@@ -5727,12 +5847,30 @@ Usage statistics for the completion request.
### Properties for completion_tokens_details
+#### accepted_prediction_tokens
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| accepted_prediction_tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. | |
+
+#### audio_tokens
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| audio_tokens | integer | Audio output tokens generated by the model. | |
+
#### reasoning_tokens
| Name | Type | Description | Default |
|------|------|-------------|--------|
| reasoning_tokens | integer | Tokens generated by the model for reasoning. | |
+#### rejected_prediction_tokens
+
+| Name | Type | Description | Default |
+|------|------|-------------|--------|
+| rejected_prediction_tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits. | |
+
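Taken together, the `usage` object on a response that used audio input and Predicted Outputs might look like the following sketch. The counts are illustrative, and the top-level `prompt_tokens`, `completion_tokens`, and `total_tokens` fields are assumed from the standard usage shape:

```json
{
  "prompt_tokens": 120,
  "completion_tokens": 45,
  "total_tokens": 165,
  "prompt_tokens_details": {
    "audio_tokens": 40,
    "cached_tokens": 0
  },
  "completion_tokens_details": {
    "accepted_prediction_tokens": 20,
    "audio_tokens": 0,
    "reasoning_tokens": 0,
    "rejected_prediction_tokens": 5
  }
}
```
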
### chatCompletionTool
@@ -5746,7 +5884,7 @@ Usage statistics for the completion request.
### FunctionParameters
-The parameters the functions accepts, described as a JSON Schema object. See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format.
+The parameters the function accepts, described as a JSON Schema object. [See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format.
Omitting `parameters` defines a function with an empty parameter list.
@@ -5761,7 +5899,7 @@ No properties defined for this component.
|------|------|-------------|----------|---------|
| description | string | A description of what the function does, used by the model to choose when and how to call the function. | No | |
| name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | Yes | |
-| parameters | [FunctionParameters](#functionparameters) | The parameters the functions accepts, described as a JSON Schema object. See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. <br><br>Omitting `parameters` defines a function with an empty parameter list. | No | |
+| parameters | [FunctionParameters](#functionparameters) | The parameters the function accepts, described as a JSON Schema object. [See the guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. <br><br>Omitting `parameters` defines a function with an empty parameter list. | No | |
| strict | boolean | Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the `parameters` field. Only a subset of JSON Schema is supported when `strict` is `true`. | No | False |
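
As a sketch, a function definition with strict schema adherence enabled might look like this; note that strict mode typically also requires `"additionalProperties": false` and that every property be listed in `required` (an assumption based on the subset of JSON Schema supported when `strict` is `true`):

```json
{
  "name": "lookup_order",
  "description": "Looks up an order by its ID.",
  "strict": true,
  "parameters": {
    "type": "object",
    "properties": {
      "order_id": { "type": "string" }
    },
    "required": ["order_id"],
    "additionalProperties": false
  }
}
```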
@@ -6170,7 +6308,7 @@ Represents an `assistant` that can call the model and use tools.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| file_ids | array | A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.<br> | [] |
+| file_ids | array | A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.<br> | [] |
#### file_search
@@ -6375,7 +6513,7 @@ Represents an `assistant` that can call the model and use tools.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| parameters | [chatCompletionFunctionParameters](#chatcompletionfunctionparameters) | The parameters the functions accepts, described as a JSON Schema object. See the [guide/](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. | |
+| parameters | [chatCompletionFunctionParameters](#chatcompletionfunctionparameters) | The parameters the function accepts, described as a JSON Schema object. See the [guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. | |
@@ -6640,7 +6778,7 @@ Tool call objects
|------|------|-------------|----------|---------|
| assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
| thread | [createThreadRequest](#createthreadrequest) | | No | |
-| model | string | The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
+| model | string | The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
| instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
| tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
| tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.<br> | No | |
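
A minimal request body for this operation might look like the following sketch; the IDs are placeholders, and the `thread.messages` shape is assumed from `createThreadRequest`:

```json
{
  "assistant_id": "asst_abc123",
  "model": "gpt-4o-mini",
  "instructions": "Answer in one short paragraph.",
  "thread": {
    "messages": [
      { "role": "user", "content": "What are vector stores used for?" }
    ]
  }
}
```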
@@ -7209,7 +7347,7 @@ Represents a step in execution of a run.
| created_at | integer | The Unix timestamp (in seconds) for when the run step was created. | Yes | |
| assistant_id | string | The ID of the assistant associated with the run step. | Yes | |
| thread_id | string | The ID of the thread that was run. | Yes | |
-| run_id | string | The ID of the run) that this run step is a part of. | Yes | |
+| run_id | string | The ID of the run that this run step is a part of. | Yes | |
| type | string | The type of run step, which can be either `message_creation` or `tool_calls`. | Yes | |
| status | string | The status of the run step, which can be either `in_progress`, `cancelled`, `failed`, `completed`, or `expired`. | Yes | |
| step_details | [runStepDetailsMessageCreationObject](#runstepdetailsmessagecreationobject) or [runStepDetailsToolCallsObject](#runstepdetailstoolcallsobject) | The details of the run step. | Yes | |
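
A `tool_calls` run step might be shaped like the following sketch; the `id` field and the inner `step_details` layout are assumptions, and the IDs and timestamp are placeholders:

```json
{
  "id": "step_abc123",
  "created_at": 1738368000,
  "assistant_id": "asst_abc123",
  "thread_id": "thread_abc123",
  "run_id": "run_abc123",
  "type": "tool_calls",
  "status": "completed",
  "step_details": {
    "type": "tool_calls",
    "tool_calls": []
  }
}
```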
@@ -7647,7 +7785,7 @@ A result instance of the file search.
| Name | Type | Description | Default |
|------|------|-------------|--------|
-| output | string | The output of the function. This will be `null` if the outputs have not been submitted yet. | |
+| output | string | The output of the function. This will be `null` if the outputs have not been [submitted](/docs/api-reference/runs/submitToolOutputs) yet. | |