# Highlights
This documentation update adds new features, refines detailed field descriptions, and improves terminology consistency across many sections. The vector search documentation in particular benefits from streamlined configuration guidance and expanded rescoring options, improving both comprehension and usability. The OneLake name is also standardized as "Microsoft OneLake" for consistent branding.
## New features
- Documented support for custom subdomains on Azure OpenAI resources, enabling more flexible API endpoint configurations.
- Support for a new stable REST API version and general availability of the Microsoft OneLake indexer.
- New rescoring options for vector search.
## Breaking changes
- Documentation on vector search index creation and quantization techniques has been substantially revised; existing configurations may be affected by compatibility changes.
## Other updates
- Terminology and wording were standardized across documents; in particular, OneLake is now written as "Microsoft OneLake".
- Document dates were refreshed so the content stays current and reflects the latest specifications.
# Insights
This documentation update adds important information for getting more out of Azure AI Search and improves existing content. The vector search updates stand out, helping users keep pace with the latest capabilities. The rescoring and quantization options improve result quality and can also make query execution more efficient.
Standardizing on the name "Microsoft OneLake" likewise improves consistency across the documentation and makes the relationship between Azure services easier to grasp. In particular, the updated API resource and connection details spelled out across multiple articles are presented in a form developers can apply immediately.
Overall, these updates raise both the user experience and the reliability of the documentation, providing a solid foundation for using Azure AI Search's rich feature set effectively. The newly added vector search capabilities in particular have the potential to significantly improve search quality and are worth adopting in future implementations.
# Summary Table
# Modified Contents
articles/search/cognitive-search-skill-azure-openai-embedding.md
Diff
@@ -9,7 +9,7 @@ ms.custom:
- ignite-2023
- build-2024
ms.topic: reference
-ms.date: 09/12/2025
+ms.date: 09/26/2025
---
# Azure OpenAI Embedding skill
@@ -45,7 +45,7 @@ Parameters are case-sensitive.
| Inputs | Description |
|---------------------|-------------|
-| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. This field is required if your Azure OpenAI resource is deployed behind a private endpoint or uses Virtual Network (VNet) integration. |
+| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. This field is required if your Azure OpenAI resource is deployed behind a private endpoint or uses Virtual Network (VNet) integration. [Azure API Management](/azure/api-management/api-management-key-concepts) endpoints are supported with URL `https://<resourcename>.azure-api.net `. Shared private links aren't supported for API Management endpoints.
| `apiKey` | The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
| `deploymentId` | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
| `authIdentity` | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system or user managed identity](search-how-to-managed-identities.md). To use a system managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |
Summary
{
"modification_type": "minor update",
"modification_title": "Azure OpenAI Embedding Skillの更新"
}
Explanation
This change updates the documentation for the Azure OpenAI Embedding skill. The last-modified date (ms.date) moves from September 12, 2025 to September 26, 2025, and the `resourceUri` description now notes that Azure API Management endpoints of the form `https://<resourcename>.azure-api.net` are supported, while shared private links are not supported for API Management endpoints. Together with the existing guidance on the `openai.azure.com` custom subdomain, this gives users a more concrete picture of the requirements and constraints for connecting to Azure OpenAI resources.
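For reference, a skillset whose embedding skill points `resourceUri` at the `openai.azure.com` custom subdomain might look like the following minimal sketch. The service, resource, deployment, and field names are placeholders; omitting `apiKey` and `authIdentity` assumes the search service's system-assigned managed identity has the Cognitive Services OpenAI User role, as described in the parameter table.

```http
PUT https://{{search-service-name}}.search.windows.net/skillsets/{{skillset-name}}?api-version=2025-09-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "embedding-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
      "context": "/document",
      "resourceUri": "https://my-aoai-resource.openai.azure.com",
      "deploymentId": "text-embedding-ada-002",
      "modelName": "text-embedding-ada-002",
      "inputs": [
        { "name": "text", "source": "/document/content" }
      ],
      "outputs": [
        { "name": "embedding", "targetName": "content_vector" }
      ]
    }
  ]
}
```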
articles/search/cognitive-search-skill-document-intelligence-layout.md
Diff
@@ -9,14 +9,12 @@ ms.custom:
- references_regions
- ignite-2024
ms.topic: reference
-ms.date: 09/19/2025
+ms.date: 09/28/2025
ms.update-cycle: 365-days
---
# Document Layout skill
-[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
-
The **Document Layout** skill analyzes a document to detect structure and characteristics, and produces a syntactical representation of the document in Markdown or Text format. You can use it to extract text and images, where image extraction includes location metadata that preserves image position within the document. Image proximity to related content is beneficial in Retrieval Augmented Generation (RAG) workloads and [multimodal search](multimodal-search-overview.md) scenarios.
This article is the reference documentation for the Document Layout skill. For usage information, see [How to chunk and vectorize by document layout](search-how-to-semantic-chunking.md).
@@ -37,11 +35,12 @@ This skill is bound to a [billable Azure AI multi-service resource](cognitive-se
This skill has the following limitations:
+ The skill isn't suitable for large documents requiring more than 5 minutes of processing in the AI Document Intelligence layout model. The skill times out, but charges still apply to the AI Services multi-services resource if it attaches to the skillset for billing purposes. Ensure documents are optimized to stay within processing limits to avoid unnecessary costs.
+
+ Since this skill calls the Azure AI Document Intelligence layout model, all documented [service behaviors for different document types](/azure/ai-services/document-intelligence/prebuilt/layout#pages) for different file types apply to its output. For example, Word (DOCX) and PDF files may produce different results due to differences in how images are handled. If consistent image behavior across DOCX and PDF is required, consider converting documents to PDF or reviewing the [multimodal search documentation](multimodal-search-overview.md) for alternative approaches.
## Supported regions
-The Document Layout skill calls the [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true).
+The Document Layout skill calls the [Document Intelligence 2024-11-30 API](/rest/api/aiservices/operation-groups).
Supported regions vary by modality and how the skill connects to the Document Intelligence layout model.
@@ -70,12 +69,6 @@ This skill recognizes the following file formats.
Refer to [Azure AI Document Intelligence layout model supported languages](/azure/ai-services/document-intelligence/language-support/ocr?view=doc-intel-3.1.0&tabs=read-print%2Clayout-print%2Cgeneral#layout&preserve-view=true) for printed text.
-## Supported parameters
-
-Several parameters are version-specific. The skills parameter table notes the API version in which a parameter was introduced so that you know how to configure the skill. To use version-specific features such as image and location metadata extraction in [2025-05-01-preview REST API](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2025-05-01-preview&preserve-view=true), you can use the Azure portal, or target 2025-05-01-preview, or check an Azure SDK change log to see if it supports the new parameters.
-
-The Azure portal supports most preview features and can be used to create or update a skillset. For updates to the Document Layout skill, edit the skillset JSON definition to add new preview parameters.
-
## @odata.type
Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
@@ -89,21 +82,21 @@ Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
## Skill parameters
-Parameters are case-sensitive.
+Parameters are case-sensitive. Several parameters were introduced in specific preview versions of the REST API. We recommend using the generally available version (2025-09-01) or the latest preview (2025-08-01-preview) for full access to all parameters.
-| Parameter name | Version | Allowed Values | Description |
-|--------------------|-------------|-------------|-------------|
-| `outputMode` | [2024-11-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) |`oneToMany` | Controls the cardinality of the output produced by the skill. |
-| `markdownHeaderDepth` | [2024-11-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) |`h1`, `h2`, `h3`, `h4`, `h5`, `h6(default)` | Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For instance, if the markdownHeaderDepth is `h3`, any sections that are deeper such as `h4`, are rolled into `h3`. |
-| `outputFormat` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`markdown(default)`, `text` | **New**. Controls the format of the output generated by the skill. |
-| `extractionOptions` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) |`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]` | **New**. Identify any extra content extracted from the document. Define an array of enums that correspond to the content to be included in the output. For instance, if the `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata which provides page location information related to where the content was extracted, such as a page number or section. This parameter applies to both output formats. |
-| `chunkingProperties` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | See below. | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
+| Parameter name | Allowed Values | Description |
+|--------------------|----------------|-------------|
+| `outputMode` |`oneToMany` | Controls the cardinality of the output produced by the skill. |
+| `markdownHeaderDepth` |`h1`, `h2`, `h3`, `h4`, `h5`, `h6(default)` | Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For instance, if the markdownHeaderDepth is `h3`, any sections that are deeper such as `h4`, are rolled into `h3`. |
+| `outputFormat` |`markdown(default)`, `text` | **New**. Controls the format of the output generated by the skill. |
+| `extractionOptions` |`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]` | **New**. Identify any extra content extracted from the document. Define an array of enums that correspond to the content to be included in the output. For instance, if the `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata which provides page location information related to where the content was extracted, such as a page number or section. This parameter applies to both output formats. |
+| `chunkingProperties` | See below. | **New**. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata. |
| ChunkingProperties Parameter | Version | Allowed Values | Description |
|--------------------|-------------|-------------|-------------|
-| `unit` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | `Characters`. currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens | **New**. Controls the cardinality of the chunk unit. |
-| `maximumLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Any integer between 300-50000 | **New**. The maximum chunk length in characters as measured by String.Length. |
-| `overlapLength` | [2025-05-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | Integer. The value needs to be less than the half of the `maximumLength` | **New**. The length of overlap provided between two text chunks. |
+| `unit` | `Characters`. currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens | **New**. Controls the cardinality of the chunk unit. |
+| `maximumLength` | Any integer between 300-50000 | **New**. The maximum chunk length in characters as measured by String.Length. |
+| `overlapLength` | Integer. The value needs to be less than the half of the `maximumLength` | **New**. The length of overlap provided between two text chunks. |
## Skill inputs
@@ -203,7 +196,7 @@ The value of the `markdownHeaderDepth` controls the number of keys in the "secti
## Example for text output mode and image and metadata extraction
-This example demonstrates how to use the new parameters introduced in the **2025-05-01-preview** to output text content in fixed-sized chunks and extract images along with location metadata from the document.
+This example demonstrates how to output text content in fixed-sized chunks and extract images along with location metadata from the document.
### Sample definition for text output mode and image and metadata extraction
Summary
{
"modification_type": "minor update",
"modification_title": "Document Layout Skillの更新"
}
Explanation
This change updates the documentation for the Document Layout skill. The preview banner is removed, the last-modified date (ms.date) moves from September 19, 2025 to September 28, 2025, and the skill now calls the Document Intelligence 2024-11-30 API instead of the 2024-07-31 preview. The limitations section calls out that large documents requiring more than 5 minutes of layout processing time out (while charges still apply) and adds a note that the layout model's documented behaviors for different file types, such as differences in how DOCX and PDF images are handled, carry over to the skill's output.
The parameter tables are also simplified: the version column and the separate "Supported parameters" section are dropped, and the text now recommends the generally available 2025-09-01 version or the latest preview (2025-08-01-preview) for full access to all parameters. Settings such as `outputFormat`, `extractionOptions`, and `chunkingProperties` let users control the skill's output format and fine-tune what is extracted from a document.
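To make that concrete, a skill definition combining these parameters could look like the sketch below. It assumes text output with character-based chunking plus image and location-metadata extraction; the output names (`text_sections`, `normalized_images`) and the chunking unit value are taken from the skill reference's text output example and should be verified there, and all other names are placeholders.

```http
PUT https://{{search-service-name}}.search.windows.net/skillsets/{{skillset-name}}?api-version=2025-09-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "layout-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "context": "/document",
      "outputMode": "oneToMany",
      "outputFormat": "text",
      "extractionOptions": [ "images", "locationMetadata" ],
      "chunkingProperties": {
        "unit": "characters",
        "maximumLength": 2000,
        "overlapLength": 200
      },
      "inputs": [
        { "name": "file_data", "source": "/document/file_data" }
      ],
      "outputs": [
        { "name": "text_sections", "targetName": "text_sections" },
        { "name": "normalized_images", "targetName": "normalized_images" }
      ]
    }
  ]
}
```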
articles/search/cognitive-search-skill-genai-prompt.md
Diff
@@ -39,7 +39,7 @@ The GenAI Prompt skill is available in the [latest preview REST API](/rest/api/s
- For image verbalization, the model you use to analyze the image determines what image formats are supported.
-- For GPT-5 model, the `temperature` parameter is not supported in the same way as previous models. If defined, it must be set to `1.0`, as other values will result in errors.
+- For GPT-5 models, the `temperature` parameter is not supported in the same way as previous models. If defined, it must be set to `1.0`, as other values will result in errors.
- Billing is based on the pricing of the model you use.
Summary
{
"modification_type": "minor update",
"modification_title": "GenAIプロンプトスキルの更新"
}
Explanation
This change is a small wording fix in the GenAI Prompt skill documentation. The note about the `temperature` parameter now reads "For GPT-5 models" instead of "For GPT-5 model," making clear that the constraint applies to the whole GPT-5 model family: if `temperature` is defined, it must be set to `1.0`, and other values result in errors. Nothing else changes; the update simply improves clarity and accuracy.
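To illustrate the constraint, a skill configuration that targets a GPT-5 deployment might pin `temperature` to `1.0` as in the sketch below. Treat this strictly as a sketch: apart from the `temperature` value itself, the property, input, and output names (`commonModelParameters`, `image`, `response`, and so on) are assumptions drawn from the preview skill reference and should be verified there; resource and deployment names are placeholders.

```http
PUT https://{{search-service-name}}.search.windows.net/skillsets/{{skillset-name}}?api-version=2025-08-01-preview
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "genai-prompt-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
      "context": "/document/normalized_images/*",
      "uri": "https://my-aoai-resource.openai.azure.com/openai/deployments/gpt-5/chat/completions",
      "commonModelParameters": {
        "model": "gpt-5",
        "temperature": 1.0
      },
      "inputs": [
        { "name": "image", "source": "/document/normalized_images/*" }
      ],
      "outputs": [
        { "name": "response", "targetName": "verbalizedImage" }
      ]
    }
  ]
}
```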
articles/search/cognitive-search-skill-image-analysis.md
Diff
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
ms.custom:
- ignite-2023
ms.topic: reference
-ms.date: 07/11/2024
+ms.date: 09/17/2025
---
# Image Analysis cognitive skill
@@ -21,7 +21,7 @@ This skill uses the machine learning models provided by [Azure AI Vision](/azure
+ The file size of the image must be less than 4 megabytes (MB)
+ The dimensions of the image must be greater than 50 x 50 pixels
-Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
This skill is implemented using the [AI Image Analysis API](/azure/ai-services/computer-vision/overview-image-analysis) version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing through [Web API custom skill](cognitive-search-custom-skill-web-api.md) or use the [ImageAnalysisV4 power skill](https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vision/ImageAnalysisV4/README.md).
Summary
{
"modification_type": "minor update",
"modification_title": "画像分析コグニティブスキルの更新"
}
Explanation
This change updates the documentation for the Image Analysis cognitive skill. The main change is the last-modified date (ms.date) moving from July 11, 2024 to September 17, 2025. The existing image requirements shown in the surrounding context are unchanged: the file size must be less than 4 megabytes (MB) and the dimensions must be greater than 50 x 50 pixels.
The data source sentence is also adjusted: supported sources are still blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and the OneLake reference now reads "Microsoft OneLake." This keeps the list of supported data sources accurate so that users can rely on it when setting up image analysis, making it a small but useful clarity update.
articles/search/cognitive-search-skill-ocr.md
Diff
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
ms.custom:
- ignite-2023
ms.topic: reference
-ms.date: 06/24/2022
+ms.date: 09/17/2025
ms.update-cycle: 365-days
---
# OCR cognitive skill
@@ -29,7 +29,7 @@ The **OCR** skill extracts text from image files and embedded images. Supported
+ .BMP
+ .TIFF
-Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
> [!NOTE]
> This skill is bound to Azure AI services and requires [a billable resource](cognitive-search-attach-cognitive-services.md) for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing [Azure AI services Standard price](https://azure.microsoft.com/pricing/details/cognitive-services/).
Summary
{
"modification_type": "minor update",
"modification_title": "OCRコグニティブスキルの更新"
}
Explanation
This change is a light touch-up of the OCR cognitive skill documentation. The main change is the last-modified date (ms.date) moving from June 24, 2022 to September 17, 2025; the supported image formats shown in the surrounding context, such as .BMP and .TIFF, are unchanged.
In addition, the data source description now spells out "Microsoft OneLake" instead of "OneLake." This improves the accuracy of the information and helps readers, keeping the document clear and up to date.
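For context, a typical OCR skill definition that processes images extracted by the indexer looks roughly like the following; the target field name is a placeholder, and the source path assumes the indexer is configured to generate normalized images.

```http
PUT https://{{search-service-name}}.search.windows.net/skillsets/{{skillset-name}}?api-version=2025-09-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "ocr-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "context": "/document/normalized_images/*",
      "defaultLanguageCode": "en",
      "detectOrientation": true,
      "inputs": [
        { "name": "image", "source": "/document/normalized_images/*" }
      ],
      "outputs": [
        { "name": "text", "targetName": "extractedText" }
      ]
    }
  ]
}
```

The same definition works whether the images come from Azure Blob Storage, ADLS Gen2, or Microsoft OneLake, since normalized images are produced by the indexer's image extraction step.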
articles/search/hybrid-search-ranking.md
Diff
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
ms.custom:
- ignite-2023
ms.topic: conceptual
-ms.date: 08/27/2025
+ms.date: 09/28/2025
---
# Relevance scoring in hybrid search using Reciprocal Rank Fusion (RRF)
@@ -17,9 +17,6 @@ Reciprocal Rank Fusion (RRF) is an algorithm that evaluates the search scores fr
RRF is based on the concept of *reciprocal rank*, which is the inverse of the rank of the first relevant document in a list of search results. The goal of the technique is to take into account the position of the items in the original rankings, and give higher importance to items that are ranked higher in multiple lists. This can help improve the overall quality and reliability of the final ranking, making it more useful for the task of fusing multiple ordered search results.
-> [!NOTE]
-> The [latest preview REST API](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-08-01-preview&preserve-view=true) can deconstruct an RRF-ranked search score into its component subscores. This gives you transparency into all-up score composition. For more information, see [Unpack search scores (preview)](#unpack-a-search-score-into-subscores-preview) in this article.
-
## How RRF ranking works
RRF works by taking the search results from multiple methods, assigning a reciprocal rank score to each document in the results, and then combining the scores to create a new ranking. The concept is that documents appearing in the top positions across multiple search methods are likely to be more relevant and should be ranked higher in the combined result.
@@ -59,22 +56,20 @@ The following chart identifies the scoring property returned on each match, algo
Semantic ranking occurs after RRF merging of results. Its score (`@search.rerankerScore`) is always reported separately in the query response. Semantic ranker can rerank full text and hybrid search results, assuming those results include fields having semantically rich content. It can rerank pure vector queries if the search documents include text fields that contain semantically relevant content.
-## Unpack a search score into subscores (preview)
-
-Using the [latest preview REST API](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-08-01-preview&preserve-view=true), you can deconstruct a search score to view its subscores.
+## Unpack a search score into subscores
-For vector queries, this information can help you determine an appropriate value for [vector weighting](vector-search-how-to-query.md#vector-weighting) or [setting minimum thresholds](vector-search-how-to-query.md#set-thresholds-to-exclude-low-scoring-results-preview).
+You can deconstruct a search score to view its subscores. For vector queries, this information can help you determine an appropriate value for [vector weighting](vector-search-how-to-query.md#vector-weighting) or [setting minimum thresholds](vector-search-how-to-query.md#set-thresholds-to-exclude-low-scoring-results-preview).
To get subscores:
-+ Use the [latest preview Search Documents REST API](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-08-01-preview&preserve-view=true#request-body) or an Azure SDK beta package that provides the feature.
++ Use the [Search Documents REST API](/rest/api/searchservice/documents/search-post#request-body) or an Azure SDK package that provides the feature.
+ Modify a query request, adding a new `debug` parameter set to either `vector`, `semantic` if using semantic ranker, or `all`.
Here's an example of hybrid query that returns subscores in debug mode:
```http
-POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-08-01-preview
+POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-09-01
{
"vectorQueries": [
@@ -114,7 +109,7 @@ POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/d
## Weighted scores
-Using the [stable REST API version](/rest/api/searchservice/documents/search-post) and newer preview API versions, you can [weight vector queries](vector-search-how-to-query.md#vector-weighting) to increase or decrease their importance in a hybrid query.
+You can also [weight vector queries](vector-search-how-to-query.md#vector-weighting) to increase or decrease their importance in a hybrid query.
Recall that when computing RRF for a certain document, the search engine looks at the rank of that document for each result set where it shows up. Assume a document shows up in three separate search results, where the results are from two vector queries and one text BM25-ranked query. The position of the document varies in each result.
@@ -142,5 +137,5 @@ For more information, see [How to work with search results](search-pagination-pa
## See also
-+ [Learn more about hybrid search](hybrid-search-overview.md)
-+ [Learn more about vector search](vector-search-overview.md)
++ [Hybrid search](hybrid-search-overview.md)
++ [Vector search](vector-search-overview.md)
Summary
{
"modification_type": "minor update",
"modification_title": "ハイブリッド検索ランキングの更新"
}
Explanation
This change updates the hybrid search ranking documentation. The last-modified date (ms.date) moves from August 27, 2025 to September 28, 2025, and a note that is no longer needed is removed now that the capability it described is generally available.
Specifically, the section on unpacking a search score into subscores drops its preview label: it now points to the stable Search Documents REST API, and the example request targets api-version 2025-09-01 instead of 2025-08-01-preview. The Weighted scores section is simplified in the same way. This gives users transparency into how the Reciprocal Rank Fusion (RRF) score is composed without requiring a preview API.
Overall, the update keeps the hybrid search guidance current and practical; the See also links are also shortened to plain article titles.
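A complete debug-enabled hybrid query against the stable API might look like the sketch below. The index setup is an assumption: the `text` vector query kind relies on integrated vectorization being configured, and `contentVector`, `id`, and `title` are placeholder field names.

```http
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-09-01
Content-Type: application/json
api-key: {{query-api-key}}

{
  "search": "sustainable building materials",
  "vectorQueries": [
    {
      "kind": "text",
      "text": "sustainable building materials",
      "fields": "contentVector",
      "k": 50
    }
  ],
  "debug": "all",
  "top": 10,
  "select": "id, title"
}
```

With `debug` set to `vector`, `semantic`, or `all`, each result carries the subscores that feed the RRF-merged `@search.score`, which is useful when tuning vector weighting or minimum score thresholds.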
articles/search/includes/quickstarts/search-get-started-skillset-new-wizard.md
Diff
@@ -83,7 +83,7 @@ If you get `Error detecting index schema from data source`, the indexer that pow
The next step is to configure AI enrichment to invoke OCR, image analysis, and entity recognition.
-OCR and image analysis are available for blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2 and for image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+OCR and image analysis are available for blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2 and for image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
To add the skills:
Summary
{
"modification_type": "minor update",
"modification_title": "新しいウィザードによるスキルセット作成に関する更新"
}
Explanation
This change fine-tunes the quickstart for creating a skillset with the new wizard. Specifically, the wording about where OCR and image analysis are available is adjusted slightly.
Where the previous text said simply "OneLake," it now reads "Microsoft OneLake." Using the full product name makes the reference clearer and helps readers recognize which resources they can use, improving the accuracy of the document.
articles/search/includes/quickstarts/search-get-started-skillset-old-wizard.md
Diff
@@ -77,7 +77,7 @@ If you get `Error detecting index schema from data source`, the indexer that pow
The next step is to configure AI enrichment to invoke OCR, image analysis, and natural-language processing.
-OCR and image analysis are available for blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2 and for image content in OneLake. Images can be standalone files or embedded images in a PDF or other files.
+OCR and image analysis are available for blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2 and for image content in Microsoft OneLake. Images can be standalone files or embedded images in a PDF or other files.
To add the skills:
Summary
{
"modification_type": "minor update",
"modification_title": "旧ウィザードによるスキルセット作成に関する更新"
}
Explanation
This change makes the same small correction in the quickstart for creating a skillset with the old wizard, again in the description of where OCR and image analysis are available.
As in the new-wizard quickstart, "OneLake" is now written as "Microsoft OneLake." It is a minor change, but it improves precision and consistency and, with them, the overall quality of the documentation.
articles/search/media/search-how-to-index-onelake-files/lakehouse-guid.png
Summary
{
"modification_type": "minor update",
"modification_title": "湖ハウスガイドの画像の更新"
}
Explanation
This change replaces the screenshot that shows the lakehouse GUID, used in the article on indexing OneLake files. No diff is shown for binary content, but the image has been swapped for a newer version, presumably to improve quality or to reflect the current portal experience.
The screenshot is an important visual aid for following the indexing steps, so refreshing it helps keep the article easy to follow.
articles/search/search-agentic-retrieval-how-to-create.md
Diff
@@ -65,6 +65,8 @@ Use Azure OpenAI or an equivalent open source model:
+ `gpt-4.1-nano`
+ `gpt-4.1-mini`
+ `gpt-5`
++ `gpt-5-nano`
++ `gpt-5-mini`
## Configure access
Summary
{
"modification_type": "minor update",
"modification_title": "エージェント的検索のためのモデルの追加"
}
Explanation
This change to the article on creating a knowledge agent for agentic retrieval adds two new entries to the list of supported models: `gpt-5-nano` and `gpt-5-mini`.
This widens the choice of chat completion models for agentic retrieval and lets users pick smaller GPT-5 variants where cost or latency matters, keeping the documentation aligned with the current model lineup.
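As a rough sketch, selecting one of these models when creating a knowledge agent could look like the following. The request shape is an assumption based on the preview knowledge agent reference (property names such as `azureOpenAIParameters` and the `knowledgeSources` reference should be verified there), and every resource name is a placeholder.

```http
PUT https://{{search-service-name}}.search.windows.net/agents/{{agent-name}}?api-version=2025-08-01-preview
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "earth-knowledge-agent",
  "models": [
    {
      "kind": "azureOpenAI",
      "azureOpenAIParameters": {
        "resourceUri": "https://my-aoai-resource.openai.azure.com",
        "deploymentId": "gpt-5-mini",
        "modelName": "gpt-5-mini"
      }
    }
  ],
  "knowledgeSources": [
    { "name": "earth-knowledge-source" }
  ]
}
```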
articles/search/search-agentic-retrieval-how-to-index.md
Diff
@@ -47,6 +47,7 @@ Here's an example index that works for agentic retrieval. It meets the criteria
```json
{
"name": "earth_at_night",
+ "description": "Contains images an descriptions of our planet in darkness as captured from space by Earth-observing satellites and astronauts on the International Space Station over the past 25 years.",
"fields": [
{
"name": "id", "type": "Edm.String",
@@ -166,6 +167,15 @@ All `searchable` fields are included in query execution. There's no support for
> + Fields selected in the response string are semantic fields (title, terms, content)
> + Fields in reference source data are all `retrievable` fields, assuming reference source data is true -->
+## Add a description
+
+An index `description` field is a user-defined string that you can use to provide guidance to LLMs and Model Context Protocol (MCP) servers when deciding to use a specific index for a query. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description.
+
+An index description is a schema update, and you can add it without having to rebuild the entire index.
+
++ String length is 4,000 characters maximum.
++ Content must be human-readable, in Unicode. Your use case should determine which language to use (for example, English or another language).
+
## Add a semantic configuration
The index must have at least one semantic configuration. The semantic configuration must have:
Summary
{
"modification_type": "minor update",
"modification_title": "インデックスの説明フィールドの追加"
}
Explanation
This change to the article on creating an index for agentic retrieval documents the new index `description` property. The example index now includes a description, and a new "Add a description" section explains how this human-readable text helps LLMs and Model Context Protocol (MCP) servers decide which index to use for a query.
The description provides an at-a-glance summary of an index's contents, which is especially useful when a system must choose among several indexes. The article also emphasizes that adding a description is a schema update and does not require rebuilding the index.
The new guidance also covers the 4,000-character maximum and the considerations for choosing the description's language. Together, these details help users make more effective use of their indexes.
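A minimal index definition with a description might look like the following sketch, assuming the property is accepted by the 2025-09-01 GA version now that it no longer appears in the preview feature list; the index and field names are placeholders.

```http
PUT https://{{search-service-name}}.search.windows.net/indexes/earth-at-night-demo?api-version=2025-09-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
  "name": "earth-at-night-demo",
  "description": "Night-time satellite imagery of Earth with accompanying captions, collected over the past 25 years.",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "content", "type": "Edm.String", "searchable": true }
  ]
}
```

To add a description to an existing index, resend the full index definition with the `description` property included; because it is a schema-only change, no reindexing is needed.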
articles/search/search-agentic-retrieval-how-to-pipeline.md
Diff
@@ -48,6 +48,8 @@ Use one of the following chat completion models with your AI agent:
+ `gpt-4.1-nano`
+ `gpt-4.1-mini`
+ `gpt-5`
++ `gpt-5-nano`
++ `gpt-5-mini`
### Package version requirements
Summary
{
"modification_type": "minor update",
"modification_title": "新しいチャット完了モデルの追加"
}
Explanation
This change to the agentic retrieval pipeline how-to adds the same two chat completion models, `gpt-5-nano` and `gpt-5-mini`, to the list of models you can use with an AI agent.
As with the previous article, the broader lineup gives users more room to balance quality, cost, and latency when building an agent, and it keeps the documentation in step with the models that are currently available.
articles/search/search-api-migration.md
Diff
@@ -11,7 +11,7 @@ ms.custom:
- build-2024
- ignite-2024
ms.topic: conceptual
-ms.date: 08/27/2025
+ms.date: 09/27/2025
---
# Upgrade to the latest REST API in Azure AI Search
@@ -22,7 +22,7 @@ Here are the most recent versions of the REST APIs:
| Targeted operations | REST API | Status |
|---------------------|----------|--------|
-| Data plane | [`2024-07-01`](/rest/api/searchservice/search-service-api-versions#2024-07-01) | Stable |
+| Data plane | [`2025-09-01`](/rest/api/searchservice/search-service-api-versions#2025-09-01) | Stable |
| Data plane | [`2025-08-01-preview`](/rest/api/searchservice/search-service-api-versions#2025-08-01-preview&preserve-view=true) | Preview |
| Control plane | [`2025-05-01`](/rest/api/searchmanagement/operation-groups?view=rest-searchmanagement-2025-05-01&preserve-view=true) | Stable |
| Control plane | [`2025-02-01-preview`](/rest/api/searchmanagement/operation-groups?view=rest-searchmanagement-2025-02-01-preview&preserve-view=true) | Preview |
@@ -90,6 +90,12 @@ See [Migrate from preview version](semantic-code-migration.md) to transition you
Upgrade guidance assumes upgrade from the most recent previous version. If your code is based on an old API version, we recommend upgrading through each successive version to get to the newest version.
+### Upgrade to 2025-09-01
+
+[`2025-09-01`](/rest/api/searchservice/search-service-api-versions#2025-09-01) is the latest stable REST API version and it adds general availability for the OneLake indexer, Document Layout skill, and other APIs.
+
+There are no breaking changes if you're upgrading from `2024-07-01` and not using any preview features. To use the new stable release, change the API version and test your code.
+
### Upgrade to 2025-08-01-preview
[`2025-08-01-preview`](/rest/api/searchservice/search-service-api-versions#2025-08-01-preview) introduces the following breaking changes to knowledge agents created using `2025-05-01-preview`:
@@ -136,7 +142,7 @@ There are breaking changes if you upgrade directly from `2023-11-01`. Follow the
### Upgrade to 2024-05-01-preview
-[`2024-05-01-preview`](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) adds OneLake index, support for binary vectors, and support for more embedding models.
+[`2024-05-01-preview`](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) adds an indexer for Microsoft OneLake, binary vectors, and more embedding models.
If you're upgrading from `2024-03-01-preview`, the AzureOpenAIEmbedding skill now requires a model name and dimensions property.
Summary
{
"modification_type": "minor update",
"modification_title": "REST APIの移行ガイドの更新"
}
Explanation
This change updates the guide for upgrading to the latest Azure AI Search REST API. The version table and dates are adjusted, and a new stable API version is documented.
Specifically, `2025-09-01` replaces `2024-07-01` as the stable data plane version, and a new "Upgrade to 2025-09-01" section notes that this release brings the OneLake indexer, the Document Layout skill, and other APIs to general availability. It also stresses that there are no breaking changes when upgrading from `2024-07-01` as long as no preview features are in use: change the API version and test your code.
The description of the 2024-05-01-preview milestone is also reworded to mention the indexer for Microsoft OneLake, binary vectors, and additional embedding models. These changes make it easier for developers to step through successive versions and land on the newest stable release.
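In practice, adopting the new stable release is mostly a matter of changing the `api-version` query parameter and retesting. For example, a request that previously listed indexes against `2024-07-01` becomes the following; the service name and key are placeholders.

```http
GET https://{{search-service-name}}.search.windows.net/indexes?api-version=2025-09-01
api-key: {{admin-api-key}}
```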
articles/search/search-api-preview.md
Diff
@@ -10,7 +10,7 @@ ms.custom:
- build-2024
- ignite-2024
ms.topic: conceptual
-ms.date: 08/27/2025
+ms.date: 09/25/2025
---
# Preview features in Azure AI Search
@@ -27,32 +27,22 @@ Preview features are removed from this list if they're retired or transition to
|---------|------------------|-------------|---------------|
| [**"Fast path" for knowledge agents**](search-agentic-retrieval-how-to-create.md) | Agentic search | New `attemptFastPath` boolean for knowledge agents. Enables a shorter processing time if queries are concise and the initial response is sufficiently relevant. | [Knowledge Agents - Create Or Update (preview)](/rest/api/searchservice/knowledge-agents/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true) |
| [**Retrieval instructions**](search-agentic-retrieval-how-to-create.md) | Agentic search | New `retrievalInstructions` property for knowledge agents guides query planning in an agentic retrieval workflow. For example, you can specify criteria for including or excluding specific knowledge sources. | [Knowledge Agents - Create Or Update (preview)](/rest/api/searchservice/knowledge-agents/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true) |
-| [**Improved indexer runtime tracking information**](search-howto-run-reset-indexers.md) | Indexers | New cumulative indexer processing information for the search service and for specific indexers. | [Get Service Statistics (preview)](/rest/api/searchservice/get-service-statistics/get-service-statistics?view=rest-searchservice-2025-08-01-preview&preserve-view=true) and [Get Status - Indexers (preview)](/rest/api/searchservice/get-service-statistics/get-status?view=rest-searchservice-2025-08-01-preview&preserve-view=true) |
+| [**Improved indexer runtime tracking information**](search-howto-run-reset-indexers.md) | Indexers | New cumulative indexer processing information for the search service and for specific indexers. | [Get Service Statistics (preview)](/rest/api/searchservice/get-service-statistics/get-service-statistics?view=rest-searchservice-2025-08-01-preview&preserve-view=true) and [Get Status - Indexers (preview)](/rest/api/searchservice/get-service-statistics/get-service-statistics?view=rest-searchservice-2025-08-01-preview&preserve-view=true) |
| [**Strict postfiltering for vector queries**](vector-search-filters.md) | Vectors | New `strictPostFilter` mode for the `vectorFilterMode` parameter. When specified, filters are applied after the global top-`k` vector results are identified, ensuring that returned documents are a subset of the unfiltered results. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
| [**Agentic retrieval**](search-agentic-retrieval-concept.md) | Query | Create a conversational search experience powered by large language models (LLMs) and your proprietary content. Agentic retrieval breaks down complex user queries into subqueries, runs the subqueries simultaneously, and either extracts grounding data or synthesizes an answer based on documents indexed in Azure AI Search. To get started, see [Quickstart: Agentic retrieval](search-get-started-agentic-retrieval.md).<p>The pipeline involves one or more [knowledge sources](search-knowledge-source-overview.md) and an associated [knowledge agent](search-agentic-retrieval-how-to-create.md), whose [response payload](search-agentic-retrieval-how-to-retrieve.md) provides full transparency into the query plan and reference data. Knowledge sources currently support [search indexes](search-knowledge-source-how-to-index.md) and [Azure blobs](search-knowledge-source-how-to-blob.md). | [Knowledge Sources (preview)](/rest/api/searchservice/knowledge-sources?view=rest-searchservice-2025-08-01-preview&preserve-view=true), [Knowledge Agents (preview)](/rest/api/searchservice/knowledge-agents?view=rest-searchservice-2025-08-01-preview&preserve-view=true), and [Knowledge Retrieval (preview)](/rest/api/searchservice/knowledge-retrieval?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
| [**Multivector support**](vector-search-multi-vector-fields.md) | Indexing | Index multiple child vectors within a single document field. You can now use vector types in nested fields of complex collections, effectively allowing multiple vectors to be associated with a single document.| [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
-| [**Scoring profiles with semantic ranking**](semantic-how-to-enable-scoring-profiles.md) | Relevance | Semantic ranker adds a new field, `@search.rerankerBoostedScore`, to help you maintain consistent relevance and greater control over final ranking outcomes in your search pipeline. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
-| [**Logic Apps integration in the portal wizard**](search-how-to-index-logic-apps-indexers.md) | Indexing | Create an automated indexing pipeline that retrieves content using a logic app workflow. Use the [**Import data (new)** wizard](search-get-started-portal-import-vectors.md) in the Azure portal to build an indexing pipeline based on Logic Apps. | Image and vectorize data wizard in the Azure portal. |
| [**Document-level access control**](search-document-level-access-overview.md) | Security | Flow document-level permissions from blobs in Azure Data Lake Storage (ADLS) Gen2 to searchable documents in an index. Queries can now filter results based on user identity for selected data sources. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
| [**GenAI Prompt skill**](cognitive-search-skill-genai-prompt.md) | Skills | A new skill that connects to a large language model (LLM) for information, using a prompt you provide. With this skill, you can populate a searchable field using content from an LLM. A primary use case for this skill is *image verbalization*, using an LLM to describe images and send the description to a searchable field in your index. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
-| [**Document Layout skill**](cognitive-search-skill-document-intelligence-layout.md)| Skills | New parameters are available for this skill if you use the 2025-05-01-preview API version or later. The new parameters support image offset metadata that improves the image search experience. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
-| [**Index "description" support**](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true#request-body) | REST | You can now write descriptions for indexes. A description is useful in agentic solutions, where the agent reads the description to decide whether to run a query or move on to another index. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true). |
| [**flightingOptIn parameter in a semantic configuration**](semantic-how-to-configure.md#opt-in-for-prerelease-semantic-ranking-models) | Queries| You can opt in to use prerelease semantic ranking models if one is available in a search service region. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-03-01-preview&preserve-view=true). |
-| [**Rescore vector queries over binary embeddings using full precision vectors**](vector-search-how-to-quantization.md#recommended-rescoring-techniques) | Relevance (scoring) | For vector indexes that contain quantized binary embeddings, you can rescore query results using a full precision query vector. The query engine uses the dot product for rescoring, which improves the quality of search results. Set `enableRescoring` and `discardOriginals` to use this feature.| [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-03-01-preview&preserve-view=true). |
| [**Facet hierarchies, aggregations, and facet filters**](search-faceted-navigation-examples.md) | Queries| New facet query parameters support nested facets. For numeric facetable fields, you can sum the values of each field. You can also specify filters on a facet to add inclusion or exclusion criteria. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-03-01-preview&preserve-view=true). |
| [**Query rewrite in the semantic reranker**](semantic-how-to-query-rewrite.md) | Relevance (scoring) | You can set options on a semantic query to rewrite the query input into a revised or expanded query that generates more relevant results from the L2 ranker. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-11-01-preview&preserve-view=true).|
-| [**Document Layout skill**](cognitive-search-skill-document-intelligence-layout.md) | Applied AI (skills) | A new skill used to analyze a document for structure and provide [structure-aware chunking](search-how-to-semantic-chunking.md). | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true). |
| [**Keyless billing for Azure AI skills processing**](cognitive-search-attach-cognitive-services.md). | Applied AI (skills) | You can now use a managed identity and roles for a keyless connection to Azure AI services for built-in skills processing. This capability removes restrictions for having both search and AI services in the same region. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true).|
| [**Markdown parsing mode**](search-how-to-index-markdown-blobs.md) | Indexer data source | With this parsing mode, indexers can generate one-to-one or one-to-many search documents from Markdown files in Azure Storage. | [Create or Update Indexer (preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true). |
-| [**Rescoring options for compressed vectors**](vector-search-how-to-quantization.md) | Relevance (scoring) | You can set options to rescore with original vectors instead of compressed vectors. Applies to HNSW and exhaustive KNN vector algorithms, using binary and scalar compression. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true).|
-| [**Lower the dimension requirements for MRL-trained text embedding models on Azure OpenAI**](vector-search-how-to-truncate-dimensions.md) | Index | Text-embedding-3-small and Text-embedding-3-large are trained using Matryoshka Representation Learning (MRL). This allows you to truncate the embedding vectors to fewer dimensions, and adjust the balance between vector index size usage and retrieval quality. A new `truncationDimension` provides the MRL behaviors as an extra parameter in a vector compression configuration. This can only be configured for new vector fields. | [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
-| [**Unpack `@search.score` to view subscores in hybrid search results**](hybrid-search-ranking.md#unpack-a-search-score-into-subscores-preview) | Relevance (scoring) | You can investigate Reciprocal Rank Fusion (RRF) ranked results by viewing the individual query subscores of the final merged and scored result. A new `debug` property unpacks the search score. `QueryResultDocumentSubscores`, `QueryResultDocumentRerankerInput`, and `QueryResultDocumentSemanticField` provide the extra detail. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
| [**Target filters in a hybrid search to just the vector queries**](hybrid-search-how-to-query.md#example-hybrid-search-with-filters-targeting-vector-subqueries-preview) | Query | A filter on a hybrid query involves all subqueries on the request, regardless of type. You can override the global filter to scope the filter to a specific subquery. A new `filterOverride` parameter provides the behaviors. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
| [**Text Split skill (token chunking)**](cognitive-search-skill-textsplit.md) | Applied AI (skills) | This skill has new parameters that improve data chunking for embedding models. A new `unit` parameter lets you specify token chunking. You can now chunk by token length, setting the length to a value that makes sense for your embedding model. You can also specify the tokenizer and any tokens that shouldn't be split during data chunking. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
| [**Azure AI Vision multimodal embedding skill**](cognitive-search-skill-vision-vectorize.md) | Applied AI (skills) | A new skill type that calls Azure AI Vision multimodal API to generate embeddings for text or images during indexing. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**Azure Machine Learning (AML) skill**](cognitive-search-aml-skill.md) | Applied AI (skills) | AML skill integrates an inferencing endpoint from Azure Machine Learning. In previous preview APIs, it supports connections to deployed custom models in an AML workspace. Starting in the 2024-05-01-preview, you can use this skill in workflows that connect to embedding models in the Azure AI Foundry model catalog. It's also available in the Azure portal, in skillset design, assuming Azure AI Search and Azure Machine Learning services are deployed in the same subscription. | [Create or Update Skillset (preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**Incremental enrichment cache**](enrichment-cache-how-to-configure.md) | Applied AI (skills) | Adds caching to an enrichment pipeline, allowing you to reuse existing output if a targeted modification, such as an update to a skillset or another object, doesn't change the content. Caching applies only to enriched documents produced by a skillset.| [Create or Update Indexer (preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
-| [**OneLake files indexer**](search-how-to-index-onelake-files.md) | Indexer data source | New data source for extracting searchable data and metadata data from a [lakehouse](/fabric/onelake/create-lakehouse-onelake) on top of [OneLake](/fabric/onelake/onelake-overview) | [Create or Update Data Source (preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**Azure Files indexer**](search-file-storage-integration.md) | Indexer data source | New data source for indexer-based indexing from [Azure Files](https://azure.microsoft.com/services/storage/files/) | [Create or Update Data Source (preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**SharePoint Online indexer**](search-howto-index-sharepoint-online.md) | Indexer data source | New data source for indexer-based indexing of SharePoint content. | [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. [Create or Update Data Source (preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) or the Azure portal. |
| [**MySQL indexer**](search-howto-index-mysql.md) | Indexer data source | New data source for indexer-based indexing of Azure MySQL data sources.| [Sign up](https://aka.ms/azure-cognitive-search/indexer-preview) to enable the feature. [Create or Update Data Source (preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true), [.NET SDK 11.2.1](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql), and Azure portal. |
@@ -61,7 +51,6 @@ Preview features are removed from this list if they're retired or transition to
| [**Native blob soft delete**](search-howto-index-changed-deleted-blobs.md) | Indexer data source | Applies to the Azure Blob Storage indexer. Recognizes blobs that are in a soft-deleted state, and removes the corresponding search document during indexing. | [Create or Update Data Source (preview)](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**Reset Documents**](search-howto-run-reset-indexers.md) | Indexer | Reprocesses individually selected search documents in indexer workloads. | [Reset Documents (preview)](/rest/api/searchservice/indexers/reset-docs?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**speller**](speller-how-to-add.md) | Query | Optional spelling correction on query term inputs for simple, full, and semantic queries. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
-| [**Normalizers**](search-normalizers.md) | Query | Normalizers provide simple text preprocessing: consistent casing, accent removal, and ASCII folding, without invoking the full text analysis chain.| [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**featuresMode parameter**](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true) | Relevance (scoring) | BM25 relevance score expansion to include details: per field similarity score, per field term frequency, and per field number of unique tokens matched. You can consume these data points in [custom scoring solutions](https://github.com/Azure-Samples/search-ranking-tutorial). | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true).|
| [**vectorQueries.threshold parameter**](vector-search-how-to-query.md#vector-weighting) | Relevance (scoring) | Exclude low-scoring search result based on a minimum score. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
| [**hybridSearch.maxTextRecallSize and countAndFacetMode parameters**](hybrid-search-how-to-query.md#set-maxtextrecallsize-and-countandfacetmode) | Relevance (scoring) | adjust the inputs to a hybrid query by controlling the amount BM25-ranked results that flow to the hybrid ranking model. | [Search Documents (preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
Summary
{
"modification_type": "minor update",
"modification_title": "プレビュー機能の更新"
}
Explanation
This change updates the list of preview features in Azure AI Search. The last-modified date (ms.date) moves from August 27, 2025 to September 25, 2025, the link target in the indexer runtime tracking row is updated, and several rows are removed.
The removed entries, including the Document Layout skill, index "description" support, rescoring options for compressed vectors, MRL dimension truncation, search score subscores, the OneLake files indexer, scoring profiles with semantic ranking, Logic Apps integration in the portal wizard, and normalizers, have either transitioned to general availability or otherwise left the preview program, in line with the stated policy that preview features are removed from this list when they're retired or become generally available. Preview entries such as the knowledge agent fast path, retrieval instructions, and strict postfiltering remain in place.
With this cleanup, the list accurately reflects which capabilities still require a preview API version, helping developers decide when they can move to the stable release and when a preview version is still needed.
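For one capability that remains in preview, strict postfiltering, a query might look like the sketch below. It assumes an index with a filterable `category` field and a `contentVector` field with integrated vectorization; all names are placeholders.

```http
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2025-08-01-preview
Content-Type: application/json
api-key: {{query-api-key}}

{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "energy efficient lighting",
      "fields": "contentVector",
      "k": 50
    }
  ],
  "vectorFilterMode": "strictPostFilter",
  "filter": "category eq 'whitepaper'",
  "top": 10
}
```

With `strictPostFilter`, the filter is applied after the global top-k vector matches are identified, so the returned documents are a subset of the unfiltered results.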
articles/search/search-data-sources-gallery.md
Diff
@@ -9,7 +9,7 @@ ms.update-cycle: 180-days
ms.custom:
- ignite-2023
ms.topic: conceptual
-ms.date: 05/21/2025
+ms.date: 09/28/2025
---
# Data sources gallery
@@ -45,30 +45,30 @@ Extract blob metadata and content, serialized into JSON documents, and imported
---
-### Azure Cosmos DB for NoSQL
+### Azure Table Storage
By [Azure AI Search](search-what-is-azure-search.md)
-Connect to Azure Cosmos DB through the SQL API to extract items from a container, serialized into JSON documents, and imported into a search index as search documents. Configure change tracking to refresh the search index with the latest changes in your database.
+Extract rows from an Azure Table, serialized into JSON documents, and imported into a search index as search documents.
-[More details](search-howto-index-cosmosdb.md)
+[More details](search-howto-indexing-azure-tables.md)
-:::image type="icon" source="media/search-data-sources-gallery/azure_cosmos_db_logo_small.png":::
+:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
:::column-end:::
:::column span="":::
---
-### Azure SQL Database
+### Azure Data Lake Storage Gen2
By [Azure AI Search](search-what-is-azure-search.md)
-Extract field values from a single table or view, serialized into JSON documents, and imported into a search index as search documents. Configure change tracking to refresh the search index with the latest changes in your database.
+Connect to Azure Storage through Azure Data Lake Storage Gen2 to extract content from a hierarchy of directories and nested subdirectories.
-[More details](search-how-to-index-sql-database.md)
+[More details](search-howto-index-azure-data-lake-storage.md)
-:::image type="icon" source="media/search-data-sources-gallery/azuresqlconnectorlogo_medium.png":::
+:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
:::column-end:::
:::row-end:::
@@ -86,36 +86,46 @@ Extract field values from a single table or view, serialized into JSON documents
---
-### Azure Table Storage
+### Azure Cosmos DB for NoSQL
By [Azure AI Search](search-what-is-azure-search.md)
-Extract rows from an Azure Table, serialized into JSON documents, and imported into a search index as search documents.
+Connect to Azure Cosmos DB through the SQL API to extract items from a container, serialized into JSON documents, and imported into a search index as search documents. Configure change tracking to refresh the search index with the latest changes in your database.
-[More details](search-howto-indexing-azure-tables.md)
+[More details](search-howto-index-cosmosdb.md)
-:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
+:::image type="icon" source="media/search-data-sources-gallery/azure_cosmos_db_logo_small.png":::
:::column-end:::
:::column span="":::
---
-### Azure Data Lake Storage Gen2
+### Azure SQL Database
By [Azure AI Search](search-what-is-azure-search.md)
-Connect to Azure Storage through Azure Data Lake Storage Gen2 to extract content from a hierarchy of directories and nested subdirectories.
+Extract field values from a single table or view, serialized into JSON documents, and imported into a search index as search documents. Configure change tracking to refresh the search index with the latest changes in your database.
-[More details](search-howto-index-azure-data-lake-storage.md)
+[More details](search-how-to-index-sql-database.md)
-:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
+:::image type="icon" source="media/search-data-sources-gallery/azuresqlconnectorlogo_medium.png":::
:::column-end:::
:::column span="":::
---
+### Microsoft OneLake files
+
+By [Azure AI Search](search-how-to-index-onelake-files.md)
+
+Connect to a OneLake lakehouse to extract supported files content from a hierarchy of directories and nested subdirectories.
+
+[More details](search-how-to-index-onelake-files.md)
+
+:::image type="icon" source="media/search-data-sources-gallery/fabric_onelake_logo.png":::
+
:::column-end:::
:::row-end:::
:::row:::
@@ -129,7 +139,7 @@ Connect to Azure Storage through Azure Data Lake Storage Gen2 to extract content
---
-## Logic app connectors (preview)
+## Logic app connectors
Pull in content [using logic app workflows](search-how-to-index-logic-apps-indexers.md) and the following supported data sources.
@@ -254,15 +264,15 @@ New data sources are issued as preview features. [Sign up](https://aka.ms/azure-
---
-### Fabric OneLake files
+### Azure Files
-By [Azure AI Search](search-how-to-index-onelake-files.md)
+By [Azure AI Search](search-what-is-azure-search.md)
-Connect to a OneLake lakehouse to extract supported files content from a hierarchy of directories and nested subdirectories.
+Connect to Azure Storage through Azure Files share to extract content serialized into JSON documents, and imported into a search index as search documents.
-[More details](search-how-to-index-onelake-files.md)
+[More details](search-file-storage-integration.md)
-:::image type="icon" source="media/search-data-sources-gallery/fabric_onelake_logo.png":::
+:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
:::column-end:::
:::column span="":::
@@ -340,16 +350,6 @@ Connect to MySQL database on Azure to extract rows in a table, serialized into J
---
-### Azure Files
-
-By [Azure AI Search](search-what-is-azure-search.md)
-
-Connect to Azure Storage through Azure Files share to extract content serialized into JSON documents, and imported into a search index as search documents.
-
-[More details](search-file-storage-integration.md)
-
-:::image type="icon" source="media/search-data-sources-gallery/azure_storage.png":::
-
:::column-end:::
:::row-end:::
:::row:::
@@ -361,9 +361,6 @@ Connect to Azure Storage through Azure Files share to extract content serialized
:::row-end:::
-<!-- :::row:::
-:::column span=""::: -->
-
---
<a name="partners"></a>
Summary
{
"modification_type": "minor update",
"modification_title": "データソースギャラリーの更新"
}
Explanation
This change updates the data sources gallery documentation. Several data source sections are reordered and new information is added, so the descriptions and related links are organized more clearly.
Specifically, the Azure Table Storage and Azure Data Lake Storage Gen2 entries are repositioned, with clearer descriptions of how content is extracted from each data source. The Azure Cosmos DB and Azure SQL Database entries are also brought up to date, and their reference links are revised.
A new "Microsoft OneLake files" section explains how content is extracted from a OneLake lakehouse, and the "Logic app connectors" heading drops its preview label. Overall, users can find the latest data sources more easily and better understand how to use each one, which helps developers integrate data with Azure AI Search more efficiently.
articles/search/search-features-list.md
Diff
@@ -25,7 +25,7 @@ The following table summarizes features by category. There's feature parity in a
| Category | Features |
|-------------------|----------|
-| Data sources | Search indexes can accept text from any source, provided it's submitted as a JSON document. </br></br> [**Indexers**](search-indexer-overview.md) are a feature that automates data import from supported data sources to extract searchable content in primary data stores. Indexers handle JSON serialization for you and most support some form of change and deletion detection. You can connect to a [variety of data sources](search-data-sources-gallery.md), including [OneLake (preview)](search-how-to-index-onelake-files.md), [Azure SQL Database](search-how-to-index-sql-database.md), [Azure Cosmos DB](search-howto-index-cosmosdb.md), or [Azure Blob storage](search-howto-indexing-azure-blob-storage.md). </br></br>[**Logic Apps connectors (preview)**](search-how-to-index-logic-apps-indexers.md) give you access to a broader range of data sources, including data on other cloud platforms. This indexing and enrichment pipeline is created in Azure AI Search but managed in Azure Logic Apps.|
+| Data sources | Search indexes can accept text from any source, provided it's submitted as a JSON document. </br></br> [**Indexers**](search-indexer-overview.md) are a feature that automates data import from supported data sources to extract searchable content in primary data stores. Indexers handle JSON serialization for you and most support some form of change and deletion detection. You can connect to a [variety of data sources](search-data-sources-gallery.md), including [Microsoft OneLake](search-how-to-index-onelake-files.md), [Azure SQL Database](search-how-to-index-sql-database.md), [Azure Cosmos DB](search-howto-index-cosmosdb.md), or [Azure Blob storage](search-howto-indexing-azure-blob-storage.md). </br></br>[**Logic Apps connectors (preview)**](search-how-to-index-logic-apps-indexers.md) give you access to a broader range of data sources, including data on other cloud platforms. This indexing and enrichment pipeline is created in Azure AI Search but managed in Azure Logic Apps.|
| Hierarchical and nested data structures | [**Complex types**](search-howto-complex-data-types.md) and collections allow you to model virtually any type of JSON structure within a search index. One-to-many and many-to-many cardinality can be expressed natively through collections, complex types, and collections of complex types.|
| Linguistic analysis | Analyzers are components used for text processing during indexing and search operations. By default, you can use the general-purpose Standard Lucene analyzer, or override the default with a language analyzer, a custom analyzer that you configure, or another predefined analyzer that produces tokens in the format you require. </br></br>[**Language analyzers**](index-add-language-analyzers.md) from Lucene or Microsoft are used to intelligently handle language-specific linguistics including verb tenses, gender, irregular plural nouns (for example, 'mouse' vs. 'mice'), word decompounding, word-breaking (for languages with no spaces), and more. </br></br>[**Custom lexical analyzers**](index-add-custom-analyzers.md) are used for complex query forms such as phonetic matching and regular expressions.</br></br> |
Summary
{
"modification_type": "minor update",
"modification_title": "データソースに関する表現の修正"
}
Explanation
This change is a minor update to the data sources description in the feature list. Specifically, the OneLake reference is changed to "Microsoft OneLake" so the naming is consistent with the other data sources.
The correction improves accuracy and consistency and reduces confusion when users evaluate data sources. The surrounding description is unchanged, so the substance stays the same, but the wording is now clearer and uses the official product name. Overall, the update makes the documentation more reliable and points readers to the correct starting point.
articles/search/search-get-started-portal-import-vectors.md
Diff
@@ -38,7 +38,7 @@ The wizard [supports a wide range of Azure data sources](search-import-data-port
|--|--|
| [Azure Blob Storage](/azure/storage/common/storage-account-create) | This data source works with blobs and tables. You must use a standard performance (general-purpose v2) account. Access tiers can be hot, cool, or cold. |
| [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/create-data-lake-storage-account) | This is an Azure Storage account with a hierarchical namespace enabled. To confirm that you have Data Lake Storage, check the **Properties** tab on the **Overview** page.<br><br> :::image type="content" source="media/search-get-started-portal-import-vectors/data-lake-storage.png" alt-text="Screenshot of an Azure Data Lake Storage account in the Azure portal." border="true" lightbox="media/search-get-started-portal-import-vectors/data-lake-storage.png"::: |
-| [OneLake](search-how-to-index-onelake-files.md) | This data source is currently in preview. For information about limitations and supported shortcuts, see [OneLake indexing](search-how-to-index-onelake-files.md). |
+| [Microsoft OneLake](search-how-to-index-onelake-files.md) | This data source connects to OneLake files and shortcuts. |
### Supported embedding models
@@ -173,7 +173,7 @@ This section points you to the content that works for this quickstart. Before yo
1. On the **Home** tab of your lakehouse, select **Upload files**, and then upload the [health-plan PDF documents](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/health-plan) used for this quickstart.
-1. At the top of your browser, copy the lakehouse URL, which has the following format: `https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111?experience=power-bi`. You specify this URL later in [Connect to your data](#connect-to-your-data).
+1. At the top of your browser, copy the lakehouse URL, which has the following format: `https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111`. Remove any query parameters, such as `?experience=power-bi`. You specify this URL later in [Connect to your data](#connect-to-your-data).
---
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeに関する表現の修正とURLのフォーマット更新"
}
Explanation
This change slightly updates the quickstart for importing and vectorizing data from the portal. The main update renames "OneLake" to "Microsoft OneLake", so the official brand name is used and the data source description is clearer.
The step for copying the lakehouse URL is also revised: users are now instructed to remove query parameters such as `?experience=power-bi` so that they specify exactly the URL the wizard expects.
These minor updates make the documentation more reliable and help users work through the guide smoothly, improving readability and overall usability.
articles/search/search-how-to-create-search-index.md
Diff
@@ -53,7 +53,7 @@ Use this checklist to assist the design decisions for your search index.
1. Review [supported data types](/rest/api/searchservice/supported-data-types). The data type affects how the field is used. For example, numeric content is filterable but not full text searchable. The most common data type is `Edm.String` for searchable text, which is tokenized and queried using the full text search engine. The most common data type for a vector field is `Edm.Single` but you can use other types as well.
-1. Provide a description of the index (preview), 4,000 character maximum. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on index name alone. An index `description` field is available in the [latest preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true), the Azure portal, or a preview package of an Azure SDK that provides the feature. For more information, see [Add an index description](search-howto-reindex.md#add-an-index-description-preview).
+1. Provide a description of the index, 4,000 character maximum. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on index name alone.
1. Identify a [document key](#document-keys). A document key is an index requirement. It's a single string field populated from a source data field that contains unique values. For example, if you're indexing from Blob Storage, the metadata storage path is often used as the document key because it uniquely identifies each blob in the container.
Summary
{
"modification_type": "minor update",
"modification_title": "インデックスの説明に関する記述の修正"
}
Explanation
This change slightly revises the index description guidance in the article on creating a search index. Specifically, the preview designation is removed and the step is simplified into a general statement.
The revised text keeps the 4,000-character maximum and stresses that the description matters when a system must access several indexes and decide between them. It also notes that the choice of index can be based on the description rather than on the index name alone.
The revision makes the guidance easier to understand and the content clearer while preserving technical accuracy, and it conveys more plainly why an index description is worth adding.
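As a minimal sketch of what that step describes, an index definition can carry a top-level `description` alongside `name` and `fields`; the index and field names below are illustrative assumptions, and the description must stay under 4,000 characters.
```json
{
  "name": "hotels-sample-index",
  "description": "Hotel listings with amenities and location details. Use this index for questions about lodging, room features, and nearby attractions.",
  "fields": [
    { "name": "HotelId", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "Description", "type": "Edm.String", "searchable": true }
  ]
}
```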
articles/search/search-how-to-define-index-projections.md
Diff
@@ -295,7 +295,7 @@ If you add new content to your data source, new chunks or child documents are ad
If you modify existing content in the data source, chunks are updated incrementally in the search index if the data source you're using supports change tracking and deletion detection. For example, if a word or sentence changes in a document, the chunk in the target index that contains that word or sentence is updated on the next indexer run. Other types of updates, such as changing a field type and some attributions, aren't supported for existing fields. For more information about allowed updates, see [Update an index schema](search-howto-reindex.md#update-an-index-schema).
-Some data sources like [Azure Storage](search-howto-index-changed-deleted-blobs.md) support change and deletion tracking by default, based on the timestamp. Other data sources such as [OneLake](search-how-to-index-onelake-files.md), [Azure SQL](search-how-to-index-sql-database.md), or [Azure Cosmos DB](search-howto-index-cosmosdb.md) must be configured for change tracking.
+Some data sources like [Azure Storage](search-howto-index-changed-deleted-blobs.md) support change and deletion tracking by default, based on the timestamp. Other data sources such as [Microsoft OneLake](search-how-to-index-onelake-files.md), [Azure SQL](search-how-to-index-sql-database.md), or [Azure Cosmos DB](search-howto-index-cosmosdb.md) must be configured for change tracking.
### Deleted content
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの名称変更と表記の統一"
}
Explanation
This change revises part of the article on defining index projections. Specifically, "OneLake" is renamed to "Microsoft OneLake" to keep the branding consistent.
The correction keeps service names consistent across Azure documentation and makes OneLake easier to recognize. The surrounding description is unchanged: Azure Storage supports change and deletion tracking by default, based on timestamps, while other data sources still need to be configured for change tracking.
Although minor, using the exact product name improves the reliability of the documentation and keeps it clear and consistent overall.
articles/search/search-how-to-index-logic-apps-indexers.md
Diff
@@ -1,34 +1,29 @@
---
title: Connect to Azure Logic Apps
titleSuffix: Azure AI Search
-description: Use an Azure Logic Apps workflow for indexer-based indexing in Azure AI Search.
+description: Use an Azure Logic Apps workflow for automated indexing in Azure AI Search.
author: HeidiSteen
ms.author: heidist
manager: nitinme
-ms.date: 08/27/2025
+ms.date: 09/28/2025
ms.service: azure-ai-search
ms.topic: how-to
ms.custom:
- references_regions
- build-2025
---
-# Use an Azure Logic Apps workflow for indexer-based indexing in Azure AI Search
+# Use an Azure Logic Apps workflow for automated indexing in Azure AI Search
-[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
+In Azure AI Search, you can use the [**Import data (new)** wizard](search-get-started-portal-import-vectors.md) in the Azure portal to create a logic app workflow that indexes and vectorizes your content. This capability is equivalent to an [indexer](search-indexer-overview.md) and data source that generates an indexing pipeline and creates searchable content.
-Support for Azure Logic Apps integration is currently in public preview and only available through the [**Import data (new)** wizard](search-get-started-portal-import-vectors.md) in the Azure portal. In Azure AI Search, a logic app workflow is used for indexing and vectorization, and it's equivalent to an indexer and data source in Azure AI Search.
-
-You can create a workflow in Azure AI Search using the **Import data (new)** wizard, and then manage the workflow in Azure Logic Apps alongside your other workflows. Behind the scenes, the wizard follows a workflow template that pulls in (ingests) content from a source for indexing in AI Search. The connectors used in this scenario are prebuilt and already exist in Azure Logic Apps, so the workflow template just provides details for those connectors to create connections to the data source, AI Search, and other items to complete the ingestion workflow.
-
-> [!NOTE]
-> A logic app workflow is a billable resource. For more information, see [Azure Logic Apps pricing](/azure/logic-apps/logic-apps-pricing).
+After you create a workflow in the wizard, you can manage the workflow in Azure Logic Apps alongside your other workflows. Behind the scenes, the wizard follows a workflow template that pulls in (ingests) content from a source for indexing in AI Search. The connectors used in this scenario are prebuilt and already exist in Azure Logic Apps, so the workflow template just provides details for those connectors to create connections to the data source, AI Search, and other items to complete the ingestion workflow.
## Key features
Azure Logic Apps integration in Azure AI Search adds support for:
-+ More data sources from Microsoft and other providers
++ [More data sources](search-data-sources-gallery.md) from Microsoft and other providers
+ Integrated vectorization
+ Scheduled or on-demand indexing
+ Change detection of new and existing documents
@@ -58,9 +53,12 @@ Review the following requirements before you start:
+ Azure Logic Apps is a [supported region](#supported-regions). It should have a [system-assigned managed identity](/azure/logic-apps/authenticate-with-managed-identity) if you want to use Microsoft Entra ID authentication on connections rather than API keys.
+> [!NOTE]
+> A logic app workflow is a billable resource. For more information, see [Azure Logic Apps pricing](/azure/logic-apps/logic-apps-pricing).
+
### Supported regions
-End-to-end functionality is available in the following regions, which provide the data source connection, document cracking, document chunks, support for Azure OpenAI embedding models, and the Azure AI indexer support for pulling the data. The following regions for Azure Logic Apps provide the `ParseDocument` action upon which Azure AI Search integration is based.
+End-to-end functionality is available in the following regions, which provide the data source connection, document cracking, document chunks, support for Azure OpenAI embedding models, and the built-in indexing support for pulling the data. The following regions for Azure Logic Apps provide the `ParseDocument` action upon which indexing integration is based.
+ East US
+ East US 2
@@ -96,15 +94,43 @@ The following connectors are useful for indexing unstructured data, as a complem
+ [Azure Queues](/connectors/azurequeues/)
+ [Service Bus](/connectors/servicebus/)
-## Limitations
+### Supported actions
+
+Logic apps integration includes the following indexing actions. For more information, see [Connect to Azure AI services from workflows in Azure Logic Apps](/azure/logic-apps/connectors/azure-ai#ingest-data-workflow).
+
++ Check for new data.
++ Get the data. An HTTP action that retrieves the uploaded document using the file URL from the trigger output.
++ Compose document details. A Data Operations action that concatenates various items.
++ Create token string. A Data Operations action that produces a token string using the output from the Compose action.
++ Create content chunks. A Data Operations action that splits the token string into pieces, based on either the number of characters or tokens per content chunk.
++ Convert tokenized data to JSON. A Data Operations action that converts the token string chunks into a JSON array.
++ Select JSON array items. A Data Operations action that selects multiple items from the JSON array.
++ Generate the embeddings. An Azure OpenAI action that creates embeddings for each JSON array item.
++ Select embeddings and other information. A Data Operations action that selects embeddings and other document information.
++ Index the data. An Azure AI Search action that indexes the data based on each selected embedding.
+
+It also supports the following query actions:
+
++ Wait for input prompt. A trigger that either polls or waits for new data to arrive, either based on a scheduled recurrence or in response to specific events respectively.
++ Input system message for the model. A Data Operations action that provides input to train the model.
++ Input sample questions and responses. A Data Operations action that provides sample customer questions and associated roles to train the model.
++ Input system message for search query. A Data Operations action that provides search query input to train the model.
++ Generate search query. An Inline Code action that uses JavaScript to create a search query for the vector store, based on the outputs from the preceding Compose actions.
++ Convert query to embedding. An Azure OpenAI action that connects to the chat completion API, which guarantees reliable responses in chat conversations.
++ Get an embedding. An Azure OpenAI action that gets a single vector embedding.
++ Search the vector database. An Azure AI Search action that executes searches in the vector store.
++ Create prompt. An Inline Code action that uses JavaScript to build prompts.
++ Perform chat completion. An Azure OpenAI action that connects to the chat completion API, which guarantees reliable responses in chat conversations.
++ Return a response. A Request action that returns the results to the caller when you use the Request trigger.
-Currently, the public preview has these limitations:
+## Limitations
+ The search index is generated using a fixed schema (document ID, content, and vectorized content), with text extraction only. You can [modify the index](#modify-existing-objects) as long as the update doesn't affect existing fields.
+ Vectorization supports text embedding only.
+ Deletion detection isn't supported. You must manually [delete orphaned documents](search-howto-reindex.md#delete-orphan-documents) from the index.
+ Duplicate documents in the search index are a known issue in this preview. Consider deleting objects and starting over if this becomes an issue.
-+ No support for private endpoints in the logic app workflow created by the portal wizard. The workflow is hosted using the [**Consumption** hosting option](/azure/logic-apps/single-tenant-overview-compare) and is subject to its constraints. To use the **Standard** hosting option, use a programmatic approach to creating the workflow. Use the [latest preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-08-01-preview&preserve-view=true) or a prerelease Azure SDK package that provides the feature.
++ No support for private endpoints in the logic app workflow created by the portal wizard. The workflow is hosted using the [**Consumption** hosting option](/azure/logic-apps/single-tenant-overview-compare) and is subject to its constraints. To use the **Standard** hosting option, use a programmatic approach to creating the workflow.
++ All actions are generally available except for
## Create a logic app workflow
Summary
{
"modification_type": "minor update",
"modification_title": "Azure Logic Appsによるインデキシングに関する説明の更新"
}
Explanation
This change updates the article on indexing with Azure Logic Apps. The main revisions are a reworded description, an updated date, and more detailed coverage of the logic app workflow.
Specifically, "indexer-based indexing" is replaced with "automated indexing" to describe the capability more accurately, and new sections detail the key features, supported actions, and limitations of the Logic Apps integration.
The article also clarifies how content is ingested from a data source and how the generated workflow fits alongside existing workflows, which helps users understand the Logic Apps indexing process and use the feature more effectively.
Overall, the clearer terminology and the additional feature details improve the quality and usefulness of the documentation.
articles/search/search-how-to-index-markdown-blobs.md
Diff
@@ -16,7 +16,7 @@ ms.update-cycle: 180-days
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
-In Azure AI Search, indexers for Azure Blob Storage, Azure Files, and OneLake support a `markdown` parsing mode for Markdown files. Markdown files can be indexed in two ways:
+In Azure AI Search, indexers for Azure Blob Storage, Azure Files, and Microsoft OneLake support a `markdown` parsing mode for Markdown files. Markdown files can be indexed in two ways:
+ One-to-many parsing mode, creating multiple search documents per Markdown file
+ One-to-one parsing mode, creating one search document per Markdown file
@@ -26,7 +26,7 @@ In Azure AI Search, indexers for Azure Blob Storage, Azure Files, and OneLake su
## Prerequisites
-+ A supported data source: Azure Blob storage, Azure File storage, OneLake in Microsoft Fabric.
++ A supported data source: Azure Blob storage, Azure File storage, Microsoft OneLake.
For OneLake, make sure you meet all of the requirements of the [OneLake indexer](search-how-to-index-onelake-files.md#prerequisites).
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの表記統一と文言の修正"
}
Explanation
This change updates the article on indexing Markdown blobs. The main revisions are consistent naming for the OneLake service and improved wording in the related descriptions.
Specifically, "OneLake" is changed to "Microsoft OneLake" so the correct brand name is used, and the description of how Markdown files are indexed now states the two parsing options more clearly: one-to-many parsing, which creates multiple search documents per Markdown file, and one-to-one parsing, which creates one search document per file.
This gives readers a better grasp of the available indexing approaches and accurate information for using the feature in Azure AI Search, while also promoting consistent terminology across the documentation.
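For context on those two options, here's a hedged sketch of an indexer configuration using the `markdown` parsing mode; the indexer, data source, and index names are placeholders, and the `markdownParsingSubmode` and `markdownHeaderDepth` parameter names are assumptions to verify against the Markdown indexing article for your API version.
```https
POST https://[service name].search.windows.net/indexers?api-version=2024-11-01-preview
{
  "name": "markdown-indexer",
  "dataSourceName": "markdown-datasource",
  "targetIndexName": "markdown-index",
  "parameters": {
    "configuration": {
      "parsingMode": "markdown",
      "markdownParsingSubmode": "oneToMany",
      "markdownHeaderDepth": "h3"
    }
  }
}
```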
articles/search/search-how-to-index-onelake-files.md
Diff
@@ -1,13 +1,13 @@
---
-title: OneLake indexer (preview)
+title: OneLake indexer
titleSuffix: Azure AI Search
-description: Set up a OneLake indexer to automate indexing of content and metadata from OneLake files and shortcuts.
+description: Set up a OneLake indexer to automate indexing of content and metadata from Microsoft OneLake files and shortcuts.
author: gmndrg
ms.author: gimondra
manager: nitinme
ms.service: azure-ai-search
ms.topic: how-to
-ms.date: 05/08/2025
+ms.date: 09/26/2025
ms.custom:
- build-2024
- ignite-2024
@@ -17,13 +17,13 @@ ms.custom:
# Index data from OneLake files and shortcuts
-In this article, learn how to configure a OneLake files indexer for extracting searchable data and metadata data from a [lakehouse](/fabric/onelake/create-lakehouse-onelake) on top of [OneLake](/fabric/onelake/onelake-overview).
+In this article, learn how to configure a OneLake files indexer for extracting searchable data and metadata data from a [lakehouse](/fabric/onelake/create-lakehouse-onelake) on top of [Microsoft OneLake](/fabric/onelake/onelake-overview).
To configure and run the indexer, you can use:
-+ [2024-05-01-preview REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&tabs=HTTP&preserve-view=true) or a newer preview REST API.
-+ An Azure SDK beta package that provides the feature.
-+ [**Import data** wizard](search-get-started-portal.md) in the Azure portal.
++ [Data Source REST API](/rest/api/searchservice/data-sources/create-or-update) with an [Indexer REST API](/rest/api/searchservice/indexers/create-or-update)
++ An Azure SDK package that provides the feature
++ [**Import data** wizard](search-get-started-portal.md) in the Azure portal
+ [**Import data (new)** wizard](search-get-started-portal-import-vectors.md) in the Azure portal.
This article uses the REST APIs to illustrate each step.
@@ -36,13 +36,13 @@ This article uses the REST APIs to illustrate each step.
+ Textual data. If you have binary data, you can use [AI enrichment](cognitive-search-concept-intro.md) image analysis to extract text or generate descriptions of images. File content can't exceed the [indexer limits](search-limits-quotas-capacity.md#indexer-limits) for your search service tier.
-+ Content in the **Files** location of your lakehouse. You can add data by:
++ Unstructured content in the **Files** location of your lakehouse. You can add data by:
+ [Upload into a lakehouse directly](/fabric/onelake/create-lakehouse-onelake#load-data-into-a-lakehouse)
+ [Use data pipelines](/fabric/data-engineering/tutorial-lakehouse-data-ingestion) from [Microsoft Fabric](https://fabric.microsoft.com/)
+ [Add shortcuts](/fabric/onelake/create-onelake-shortcut) from external data sources like [Amazon S3](/fabric/onelake/create-s3-shortcut) or [Google Cloud Storage](/fabric/onelake/create-gcs-shortcut).
-+ An AI Search service configured for either a [system managed identity](search-how-to-managed-identities.md#create-a-system-managed-identity) or [user-assigned assigned managed identity](search-how-to-managed-identities.md#create-a-user-assigned-managed-identity). The AI Search service must reside within the same tenant as the Microsoft Fabric workspace.
++ An AI Search service, basic pricing tier or higher, configured for either a [system managed identity](search-how-to-managed-identities.md#create-a-system-managed-identity) or [user-assigned assigned managed identity](search-how-to-managed-identities.md#create-a-user-assigned-managed-identity). The AI Search service must reside within the same tenant as the Microsoft Fabric workspace.
+ A Contributor role assignment in the Microsoft Fabric workspace where the lakehouse is located. Steps are outlined in the [Grant permissions](#assign-service-permissions) section of this article.
@@ -54,7 +54,7 @@ You can use this indexer for the following tasks:
+ **Data indexing and incremental indexing:** The indexer can index files and associated metadata from data paths within a lakehouse. It detects new and updated files and metadata through built-in change detection. You can configure data refresh on a schedule or on demand.
+ **Deletion detection:** The indexer can [detect deletions via custom metadata](#detect-deletions-via-custom-metadata) for most files and shortcuts. This requires adding metadata to files to signify that they have been "soft deleted", enabling their removal from the search index. Currently, it's not possible to detect deletions in Google Cloud Storage or Amazon S3 shortcut files because custom metadata isn't supported for those data sources.
-+ **Applied AI through skillsets:** [Skillsets](cognitive-search-concept-intro.md) are fully supported by the OneLake files indexer. This includes key features like [integrated vectorization](vector-search-integrated-vectorization.md) that adds data chunking and embedding steps.
++ **Applied AI enrichment through skillsets:** [Skillsets](cognitive-search-concept-intro.md) are fully supported by the OneLake files indexer. This includes key features like [integrated vectorization](vector-search-integrated-vectorization.md) that adds data chunking and embedding steps.
+ **Parsing modes:** The indexer supports [JSON parsing modes](search-howto-index-json-blobs.md) if you want to parse JSON arrays or lines into individual search documents. It also supports [Markdown parsing mode](search-how-to-index-markdown-blobs.md).
+ **Compatibility with other features:** The OneLake indexer is designed to work seamlessly with other indexer features, such as [debug sessions](cognitive-search-debug-session.md), [indexer cache for incremental enrichments](enrichment-cache-how-to-configure.md), and [knowledge store](knowledge-store-concept-intro.md).
@@ -78,7 +78,7 @@ The following OneLake shortcuts are supported by the OneLake files indexer:
+ [Google Cloud Storage shortcut](/fabric/onelake/create-gcs-shortcut)
-## Limitations in this preview
+## Limitations
+ Parquet (including delta parquet) file types aren't currently supported.
@@ -88,17 +88,22 @@ The following OneLake shortcuts are supported by the OneLake files indexer:
+ This indexer doesn't support SQL queries, but the query used in the data source configuration is exclusively to add optionally the folder or shortcut to access.
-+ There's no support to ingest files from **My Workspace** workspace in OneLake since this is a personal repository per user.
++ There's no support to ingest files from **My Workspace** workspace in OneLake since this is a personal repository per user.
+
++ Microsoft Purview Sensitivity Labels applied via Data Map are not currently supported. If sensitivity labels are applied to artifacts in OneLake using [Microsoft Purview Data Map](/purview/data-map-sensitivity-labels-apply), the indexer may fail to execute properly. To bypass this restriction, an exception must be granted by your organization’s IT team responsible for managing Purview sensitivity labels and Data Map configurations.
+
++ Workspace role-based permissions in Microsoft OneLake may affect indexer access to files. Ensure that the Azure AI Search service principal (managed identity) has sufficient permissions over the files you intend to access in the target [Microsoft Fabric workspace](/fabric/fundamentals/workspaces).
+
## Prepare data for indexing
-Before you set up indexing, review your source data to determine whether any changes should be made up front. An indexer can index content from one container at a time. By default, all files in the container are processed. You have several options for more selective processing:
+Before you set up indexing, review your source data to determine whether any changes should be made to your data in the lakehouse. An indexer can index content from one container at a time. By default, all files in the container are processed. You have several options for more selective processing:
+ Place files in a virtual folder. An indexer [data source definition](#define-the-data-source) includes a "query" parameter that can be either a lakehouse subfolder or shortcut. If this value is specified, only those files in the subfolder or shortcut within the lakehouse are indexed.
+ Include or exclude files by file type. The [supported document formats list](#SupportedFormats) can help you determine which files to exclude. For example, you might want to exclude image or audio files that don't provide searchable text. This capability is controlled through [configuration settings](#configure-and-run-the-onelake-files-indexer) in the indexer.
-+ Include or exclude arbitrary files. If you want to skip a specific file for whatever reason, you can add metadata properties and values to files in your OneLake lakehouse. When an indexer encounters this property, it skips the file or its content in the indexing run.
++ Include or exclude arbitrary files. If you want to skip a specific file for whatever reason, you can add metadata properties and values to files in your lakehouse. When an indexer encounters this property, it skips the file or its content in the indexing run.
File inclusion and exclusion are covered in the [indexer configuration](#configure-and-run-the-onelake-files-indexer) step. If you don't set criteria, the indexer reports an ineligible file as an error and moves on. If enough errors occur, processing might stop. You can specify error tolerance in the indexer [configuration settings](#configure-and-run-the-onelake-files-indexer).
@@ -162,23 +167,27 @@ The minimum role assignment for your search service identity is Contributor.
:::image type="content" source="media/search-how-to-index-onelake-files/add-user-assigned-managed-identity.png" alt-text="Screenshot showing a Contributor role assignment for a search service user-assigned managed identity in the Azure portal." lightbox="media/search-how-to-index-onelake-files/add-user-assigned-managed-identity.png":::
+## Configure a shared private link (required if using Fabric workspace-level private link)
+
+If your Fabric workspace is secured with a [private link](/fabric/security/security-workspace-level-private-links-overview), Azure AI Search won't be able to access your lakehouse data over the public internet, and you won't be able to configure the indexer or its required dependencies, such as the data source. To enable access, you must configure a [shared private link](search-indexer-howto-access-private.md) between Azure AI Search and your Fabric workspace.
+
## Define the data source
-A data source is defined as an independent resource so that it can be used by multiple indexers. You must use the 2024-05-01-preview REST API to create the data source.
+A data source is defined as an independent resource so that it can be used by multiple indexers.
-1. Use the [Create or update a data source REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&tabs=HTTP&preserve-view=true) to set its definition. These are the most significant steps of the definition.
+1. Use the [Create or update a data source REST API](/rest/api/searchservice/data-sources/create-or-update) to set its definition. These are the most significant steps of the definition.
1. Set `"type"` to `"onelake"` (required).
1. Get the Microsoft Fabric workspace GUID and the lakehouse GUID:
- + Go to the lakehouse you'd like to import data from its URL. It should look similar to this example: "https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111?experience=power-bi". Copy the following values that are used in the data source definition:
+ + In Power BI, open the lakehouse you'd like to import data from. Notice the lakehouse URL in the browser. It should look similar to this example: "https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111". The URL contains both the workspace GUID and the lakehouse GUID. If the Fabric workspace is secured with a private link, the URL would start with "https://{FabricWorkspaceGuid}.z{xy}.blob.fabric.microsoft.com".
- + Copy the workspace GUID, that we'll call `{FabricWorkspaceGuid}`, which is listed right after "groups" in the URL. In this example, it would be 00000000-0000-0000-0000-000000000000.
+ + Copy the workspace GUID, which is listed to the right of "groups" in the URL. In this example, it would be 00000000-0000-0000-0000-000000000000. In your REST file, create an environment variable for `{FabricWorkspaceGuid}` and set it to the workspace GUID. If your workspace uses a private link, the workspace GUID will appear in a different location in the URL. Be sure to reference the correct part of the URL based on your setup.
:::image type="content" source="media/search-how-to-index-onelake-files/fabric-guid.png" alt-text="Screenshot of the Fabric workspace GUID in the Azure portal." lightbox="media/search-how-to-index-onelake-files/fabric-guid.png" :::
- + Copy the lakehouse GUID that we'll call `{lakehouseGuid}`, which is listed right after "lakehouses" in the URL. In this example, it would be 11111111-1111-1111-1111-111111111111.
+ + Copy the lakehouse GUID, which is listed right after "lakehouses" in the URL. In this example, it would be 11111111-1111-1111-1111-111111111111. In your REST file, create an environment variable for `{LakehouseGuid}`and set it to the lakehouse GUID.
:::image type="content" source="media/search-how-to-index-onelake-files/lakehouse-guid.png" alt-text="Screenshot of the lakehouse GUID in the Azure portal." lightbox="media/search-how-to-index-onelake-files/lakehouse-guid.png" :::
@@ -190,16 +199,24 @@ A data source is defined as an independent resource so that it can be used by mu
}
```
-1. Set `"container.name"` to the lakehouse GUID, replacing `{lakehouseGuid}` with the value you copied in the previous step. Use `"query"` to optionally specify a lakehouse subfolder or shortcut.
+For your setup with [shared private link](search-indexer-howto-access-private.md), setup the managed identities using the following connection string, that varies from the setup using the internet for communication. Note that not only the URL is different, but also `WorkspaceEndpoint` is used, instead of `ResourceId`. Take this into consideration when configuring either the system-managed identity or user-managed identity setups.
- ```json
+ ```json
+ "credentials": {
+ "connectionString": "WorkspaceEndpoint=https://{FabricWorkspaceGuid}.z{xy}.blob.fabric.microsoft.com"
+ }
+ ```
+
+1. Set `"container.name"` to the lakehouse GUID, replacing `{LakehouseGuid}` with the value you copied in the previous step. Use `"query"` to optionally specify a lakehouse subfolder or shortcut.
+
+ ```json
"container": {
- "name": "{lakehouseGuid}",
+ "name": "{LakehouseGuid}",
"query": "{optionalLakehouseFolderOrShortcut}"
}
```
-1. Set the authentication method using the user-assigned managed identity, or skip to the next step for system-managed identity.
+1. Set the authentication method using the user-assigned managed identity, or skip to the next step for system-managed identity.
```json
{
@@ -210,7 +227,7 @@ A data source is defined as an independent resource so that it can be used by mu
"connectionString": "ResourceId={FabricWorkspaceGuid}"
},
"container": {
- "name": "{lakehouseGuid}",
+ "name": "{LakehouseGuid}",
"query": "{optionalLakehouseFolderOrShortcut}"
},
"identity": {
@@ -256,7 +273,7 @@ A data source is defined as an independent resource so that it can be used by mu
"connectionString": "ResourceId={FabricWorkspaceGuid}"
},
"container": {
- "name": "{lakehouseGuid}",
+ "name": "{LakehouseGuid}",
"query": "{optionalLakehouseFolderOrShortcut}"
}
}
@@ -304,15 +321,15 @@ There are steps to follow in both OneLake and Azure AI Search, but there are no
1. In Azure AI Search, edit the data source definition to include a "dataDeletionDetectionPolicy" property. For example, the following policy considers a file to be deleted if it has a metadata property "IsDeleted" with the value true:
```https
- PUT https://[service name].search.windows.net/datasources/file-datasource?api-version=2024-05-01-preview
+ PUT https://[service name].search.windows.net/datasources/file-datasource?api-version=2025-09-01
{
"name" : "onelake-datasource",
"type" : "onelake",
"credentials": {
"connectionString": "ResourceId={FabricWorkspaceGuid}"
},
"container": {
- "name": "{lakehouseGuid}",
+ "name": "{LakehouseGuid}",
"query": "{optionalLakehouseFolderOrShortcut}"
},
"dataDeletionDetectionPolicy" : {
@@ -404,7 +421,7 @@ Once the index and data source are created, you're ready to create the indexer.
+ "contentAndMetadata" is the default. It specifies that all metadata and textual content extracted from the file are indexed.
- + "storageMetadata" specifies that only the [standard file properties and user-specified metadata](/azure/storage/blobs/storage-blob-container-properties-metadata) are indexed. Although the properties are documented for Azure blobs, the file properties are the same for OneLkae, except for the SAS related metadata.
+ + "storageMetadata" specifies that only the [standard file properties and user-specified metadata](/azure/storage/blobs/storage-blob-container-properties-metadata) are indexed. Although the properties are documented for Azure blobs, the file properties are the same for OneLake, except for the SAS related metadata.
+ "allMetadata" specifies that standard file properties and any [metadata for found content types](search-blob-metadata-properties.md) are extracted from the file content and indexed.
@@ -414,9 +431,9 @@ Once the index and data source are created, you're ready to create the indexer.
In file indexing, you can often omit field mappings because the indexer has built-in support for mapping the "content" and metadata properties to similarly named and typed fields in an index. For metadata properties, the indexer automatically replaces hyphens `-` with underscores in the search index.
-For more information about other properties, [Create an indexer](search-howto-create-indexers.md). For the full list of parameter descriptions, see [Create Indexer (REST)](/rest/api/searchservice/indexers/create#definitions) in the REST API. The parameters are the same for OneLake.
+For more information about other properties, [Create an indexer](search-howto-create-indexers.md). For the full list of parameter descriptions, see [Create Indexer (REST)](/rest/api/searchservice/indexers/create#definitions) in the REST API. The parameters are the same for Microsoft OneLake.
-By default, an indexer runs automatically when you create it. You can change this behavior by setting "disabled" to true. To control indexer execution, [run an indexer on demand](search-howto-run-reset-indexers.md) or [put it on a schedule](search-howto-schedule-indexers.md).
+By default, an indexer runs automatically when you create it. You can change this behavior by setting "disabled" to true. If you create an indexer in a disabled state, [run an indexer on demand](search-howto-run-reset-indexers.md) when you're ready to use it, or [put it on a schedule](search-howto-schedule-indexers.md).
## Check indexer status
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeインデクサーに関するドキュメントの更新"
}
Explanation
This change updates the article on indexing OneLake files and shortcuts. The main revisions are consistent naming for the OneLake service, clearer descriptions, and more granular configuration steps.
Specifically, OneLake is now written as "Microsoft OneLake", improving brand consistency across the documentation. The step-by-step indexer setup is also revised, with more detailed guidance for defining the data source and configuring the connection.
A new section on configuring a shared private link explains what to do when the Microsoft Fabric workspace is secured with a private link so that Azure AI Search can still reach the lakehouse data.
Overall, these revisions give users clearer guidance for setting up and running the OneLake indexer, improving the quality and practical value of the documentation.
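To tie the private-link guidance together, here's a hedged sketch of a complete OneLake data source definition that uses the `WorkspaceEndpoint` connection string from the diff instead of `ResourceId`. The data source name is a placeholder, and the `z{xy}` segment and optional `query` value must be replaced with the values for your own workspace.
```https
PUT https://[service name].search.windows.net/datasources/onelake-datasource?api-version=2025-09-01
{
  "name": "onelake-datasource",
  "type": "onelake",
  "credentials": {
    "connectionString": "WorkspaceEndpoint=https://{FabricWorkspaceGuid}.z{xy}.blob.fabric.microsoft.com"
  },
  "container": {
    "name": "{LakehouseGuid}",
    "query": "{optionalLakehouseFolderOrShortcut}"
  }
}
```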
articles/search/search-how-to-integrated-vectorization.md
Diff
@@ -38,7 +38,7 @@ Integrated vectorization works with [all supported data sources](search-indexer-
|--|--|
| [Azure Blob Storage](/azure/storage/common/storage-account-create) | This data source works with blobs and tables. You must use a standard performance (general-purpose v2) account. Access tiers can be hot, cool, or cold. |
| [Azure Data Lake Storage (ADLS) Gen2](/azure/storage/blobs/create-data-lake-storage-account) | This is an Azure Storage account with a hierarchical namespace enabled. To confirm that you have Data Lake Storage, check the **Properties** tab on the **Overview** page.<br><br> :::image type="content" source="media/search-how-to-integrated-vectorization/data-lake-storage-account.png" alt-text="Screenshot of an Azure Data Lake Storage account in the Azure portal." border="true" lightbox="media/search-how-to-integrated-vectorization/data-lake-storage-account.png"::: |
-<!--| [OneLake](search-how-to-index-onelake-files.md) | This data source is currently in preview. For information about limitations and supported shortcuts, see [OneLake indexing](search-how-to-index-onelake-files.md). |-->
+| [Microsoft OneLake](search-how-to-index-onelake-files.md) | This data source connects to OneLake files and shortcuts. |
### Supported embedding models
@@ -67,7 +67,7 @@ To configure role-based access for integrated vectorization:
1. On your data source platform and embedding model provider, create role assignments that allow your search service to access data and models. See [Prepare your data](#prepare-your-data) and [Prepare your embedding model](#prepare-your-embedding-model).
> [!NOTE]
-> Free search services support role-based connections to Azure AI Search. However, they don't support managed identities on outbound connections to Azure Storage or Azure AI Vision. This lack of support requires key-based authentication on connections between free search services and other Azure resources.
+> Free search services support role-based connections to Azure AI Search. However, they don't support managed identities on outbound connections to Azure Storage or Azure AI Vision. This lack of support requires that you use key-based authentication on connections between free search services and other Azure resources.
>
> For more secure connections, use the Basic tier or higher. You can then enable roles and configure a managed identity for authorized access.
@@ -156,7 +156,7 @@ In this section, you prepare your data for integrated vectorization by uploading
1. [Add custom metadata](search-howto-index-changed-deleted-blobs.md#soft-delete-strategy-using-custom-metadata) that an indexer can scan to determine which blobs are deleted. Give your custom property a descriptive name. For example, you can name the property "IsDeleted" and set it to false. Repeat this step for every blob in the container. When you want to delete the blob, change the property to true. For more information, see [Change and delete detection when indexing from Azure Storage](search-howto-index-changed-deleted-blobs.md).
-<!--### [OneLake](#tab/prepare-data-onelake)
+### [OneLake](#tab/prepare-data-onelake)
1. Sign in to [Power BI](https://powerbi.com/) and [create a workspace](/fabric/data-engineering/tutorial-lakehouse-get-started).
@@ -184,11 +184,11 @@ In this section, you prepare your data for integrated vectorization by uploading
1. To obtain connection IDs:
- 1. At the top of your browser, locate the lakehouse URL, which has the following format: `https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111?experience=power-bi`.
+ 1. At the top of your browser, locate the lakehouse URL, which has the following format: `https://msit.powerbi.com/groups/00000000-0000-0000-0000-000000000000/lakehouses/11111111-1111-1111-1111-111111111111`. Remove any query parameters, such as `?experience=power-bi`.
1. Copy the workspace ID, which is listed after "groups" in the URL. You specify this ID later in [Set variables](#set-variables). In our example, the workspace ID is `00000000-0000-0000-0000-000000000000`.
- 1. Copy the lakehouse ID, which is listed after "lakehouses" in the URL. You specify this ID later in [Set variables](#set-variables). In our example, the lakehouse ID is `11111111-1111-1111-1111-111111111111`.-->
+ 1. Copy the lakehouse ID, which is listed after "lakehouses" in the URL. You specify this ID later in [Set variables](#set-variables). In our example, the lakehouse ID is `11111111-1111-1111-1111-111111111111`.
---
@@ -318,7 +318,7 @@ In this section, you specify the connection information for your Azure AI Search
|--|--|--|
| Azure Blob Storage | `@storageConnectionString` and `@blobContainer` | The connection string and the name of the container you created in [Prepare your data](#prepare-your-data). |
| ADLS Gen2 | `@storageConnectionString` and `@blobContainer` | The connection string and the name of the container you created in [Prepare your data](#prepare-your-data). |
- <!--| OneLake | `@workspaceId` and `@lakehouseId` | The workspace and lakehouse IDs you obtained in [Prepare your data](#prepare-your-data). |-->
+ | OneLake | `@workspaceId` and `@lakehouseId` | The workspace and lakehouse IDs you obtained in [Prepare your data](#prepare-your-data). |
1. Depending on your embedding model provider, add the following variables.
@@ -403,7 +403,7 @@ In this section, you connect to a [supported data source](#supported-data-source
1. To create the data source, select **Send request**.
-<!--1. If you're using OneLake, set `credentials.connectionString` to `ResourceId={{workspaceId}}` and `container.name` to `{{lakehouseId}}`.-->
+1. If you're using OneLake, set `credentials.connectionString` to `ResourceId={{workspaceId}}` and `container.name` to `{{lakehouseId}}`.
<!--
### [Python](#tab/connect-data-python)
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeに関する表記の修正と文言の明確化"
}
Explanation
This change revises the article on integrated vectorization. It mainly corrects the OneLake naming and clarifies some of the wording.
Specifically, the OneLake data source is renamed "Microsoft OneLake", and the related instructions are uncommented and reorganized so they read more clearly. The connection setup steps are also more concrete; in particular, the procedure for obtaining the connection IDs now highlights removing query parameters from the lakehouse URL.
These corrections give users accurate, easy-to-follow information for vectorizing data from OneLake, improving the clarity and practical value of the documentation.
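As a quick illustration of the restored OneLake step, here's a hedged sketch of the data source request using the `{{workspaceId}}` and `{{lakehouseId}}` variables mentioned in the diff; the `{{baseUrl}}` variable, data source name, and API version are assumptions that should match whatever the rest of your REST file uses.
```https
POST {{baseUrl}}/datasources?api-version=2025-09-01
{
  "name": "my-onelake-datasource",
  "type": "onelake",
  "credentials": {
    "connectionString": "ResourceId={{workspaceId}}"
  },
  "container": {
    "name": "{{lakehouseId}}"
  }
}
```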
articles/search/search-how-to-semantic-chunking.md
Diff
@@ -6,19 +6,15 @@ author: haileytap
ms.author: haileytapia
ms.service: azure-ai-search
ms.topic: how-to
-ms.date: 06/11/2025
+ms.date: 09/28/2025
ms.custom:
- references_regions
- ignite-2024
---
# Chunk and vectorize by document layout or structure
-[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
-
-Text data chunking strategies play a key role in optimizing RAG responses and performance. By using the new **Document Layout** skill that's currently in preview, you can chunk content based on document structure, capturing headings and chunking the content body based on semantic coherence, such as paragraphs and sentences. Chunks are processed independently. Because LLMs work with multiple chunks, when those chunks are of higher quality and semantically coherent, the overall relevance of the query is improved.
-
-<!-- Text data chunking strategies play a key role in optimizing RAG responses and performance. By using the new **Document Layout** skill that's currently in preview, you can chunk content based on document structure, capturing headings and chunking the content body based on semantic coherence, such as paragraphs and sentences. Chunks are processed independently and recombined as semantic representations. The inherent meaning of the text is used as a guide for the chunking process. -->
+Text data chunking strategies play a key role in optimizing RAG responses and performance. By using the **Document Layout** skill, you can chunk content based on document structure, capturing headings and chunking the content body based on semantic coherence, such as paragraphs and sentences. Chunks are processed independently. Because LLMs work with multiple chunks, when those chunks are of higher quality and semantically coherent, the overall relevance of the query is improved.
The Document Layout skill calls the [layout model](/azure/ai-services/document-intelligence/prebuilt/layout) in Document Intelligence. The model articulates content structure in JSON using Markdown syntax (headings and content), with fields for headings and content stored in a search index on Azure AI Search. The searchable content produced from the Document Layout skill is plain text but you can apply integrated vectorization to generate embeddings for any field in your source documents, including images.
@@ -52,7 +48,7 @@ The raw inputs must be in a [supported data source](search-indexer-overview.md#s
+ Supported file formats include: PDF, JPEG, JPG, PNG, BMP, TIFF, DOCX, XLSX, PPTX, HTML.
-+ Supported indexers can be any indexer that can handle the supported file formats. These indexers include [Blob indexers](search-howto-indexing-azure-blob-storage.md), [OneLake indexers](search-how-to-index-onelake-files.md), [File indexers](search-file-storage-integration.md).
++ Supported indexers can be any indexer that can handle the supported file formats. These indexers include [Blob indexers](search-howto-indexing-azure-blob-storage.md), [Microsoft OneLake indexers](search-how-to-index-onelake-files.md), [File indexers](search-file-storage-integration.md).
+ Supported regions for the portal experience of this feature include: East US, West Europe, North Central US. If you're setting up your skillset programmatically, you can use any Document Intelligence region that also provides the AI enrichment feature of Azure AI Search. For more information, see [Product availability by region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/table).
@@ -187,8 +183,6 @@ If you aren't using the wizard, the index must exist on the search service befor
## Define a skillset for structure-aware chunking and vectorization
-Because the Document Layout skill is in preview, you must use the [Create Skillset 2024-11-01-preview](/rest/api/searchservice/skillsets/create?view=rest-searchservice-2024-11-01-preview&preserve-view=true) REST API for this step. You can also use the Azure portal.
-
This section shows an example of a skillset definition that projects individual markdown sections, chunks, and their vector equivalents as fields in the search index. It uses the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) to detect headings and populate a content field based on semantically coherent paragraphs and sentences in the source document. It uses the [Text Split skill](cognitive-search-skill-textsplit.md) to split the Markdown content into chunks. It uses the [Azure OpenAI Embedding skill](cognitive-search-skill-azure-openai-embedding.md) to vectorize chunks and any other field for which you want embeddings.
Besides skills, the skillset includes `indexProjections` and `cognitiveServices`:
@@ -198,7 +192,7 @@ Besides skills, the skillset includes `indexProjections` and `cognitiveServices`
+ `cognitiveServices` [attaches an Azure AI services multi-service account](cognitive-search-attach-cognitive-services.md) for billing purposes (the Document Layout skill is available through [Standard pricing](https://azure.microsoft.com/pricing/details/ai-document-intelligence/)).
```https
-POST {endpoint}/skillsets?api-version=2024-11-01-preview
+POST {endpoint}/skillsets?api-version=2025-09-01
{
"name": "my_skillset",
@@ -330,7 +324,7 @@ When using the [Document Layout skill](cognitive-search-skill-document-intellige
Here's an example of an indexer creation request.
```https
-POST {endpoint}/indexers?api-version=2024-11-01-preview
+POST {endpoint}/indexers?api-version=2025-09-01
{
"name": "my_indexer",
Summary
{
"modification_type": "minor update",
"modification_title": "セマンティックチャンクとベクトル化に関するドキュメントの更新"
}
Explanation
This change updates the documentation on how to perform semantic chunking. The main revisions clarify the content and update the API version.
Specifically, the description of the Document Layout skill has been tightened, and the article now explains more clearly how content is chunked based on document structure, taking headings and semantic coherence into account. The API version has also been updated, most notably in the skillset and indexer creation requests.
In addition, references to OneLake have been standardized as "Microsoft OneLake" and the surrounding information has been reorganized. Readers get more accurate, consistent information and a better understanding of the document as a whole, which should help them use this feature effectively.
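As a rough illustration of the pattern the revised article describes, here's a minimal, hypothetical skillset fragment that chunks by document structure. The skill type and its `file_data` input and `markdown_document` output follow the Document Layout skill reference, but the skillset name, target names, and parameter values shown here are placeholders; check the linked articles for the full schema, index projections, and embedding skill wiring.

```http
# Sketch only: hypothetical names and parameter values.
POST {endpoint}/skillsets?api-version=2025-09-01
{
  "name": "layout-chunking-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "name": "document-layout",
      "context": "/document",
      "outputMode": "oneToMany",
      "markdownHeaderDepth": "h3",
      "inputs": [
        { "name": "file_data", "source": "/document/file_data" }
      ],
      "outputs": [
        { "name": "markdown_document", "targetName": "markdownDocument" }
      ]
    }
  ]
}
```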
articles/search/search-howto-reindex.md
Diff
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
ms.custom:
- ignite-2024
ms.topic: how-to
-ms.date: 08/27/2025
+ms.date: 09/28/2025
ms.update-cycle: 180-days
---
@@ -253,9 +253,9 @@ When you create the index, physical storage is allocated for each field in the i
To minimize disruption to application code, consider [creating an index alias](search-how-to-alias.md). Application code references the alias, but you can update the name of the index that the alias points to.
-## Add an index description (preview)
+## Add an index description
-Beginning with REST API version 2025-05-01-preview, a `description` is now supported. This human-readable text is invaluable when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on the index name alone.
+An index has a `description` property that you can specify and use when a system must access several indexes and make a decision based on the description. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on the index name alone.
An index description is a schema update, and you can add it without having to rebuild the entire index.
@@ -286,11 +286,11 @@ The Azure portal supports the latest preview API.
1. Copy the JSON so that you can use it as the basis of a new request.
-1. [Formulate an index update using a PUT request](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true) and the latest preview REST API.
+1. [Formulate an index update using a PUT request](/rest/api/searchservice/indexes/create-or-update).
1. Provide the *full* JSON of the existing schema, plus the new `description` field. The field must be a top-level field, on the same level as `name` or `fields`. The value must be less than 4,000 characters and in Unicode.
-1. To confirm the change, issue another [GET using the latest preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-08-01-preview&preserve-view=true).
+1. To confirm the change, issue another [GET](/rest/api/searchservice/indexes/get).
---
Summary
{
"modification_type": "minor update",
"modification_title": "インデックスの説明フィールドに関するドキュメントの更新"
}
Explanation
This change updates the documentation on how to re-index. The main revisions add coverage of the index description field and simplify how API versions are specified.
Specifically, the text about the index `description` property, previously introduced as supported starting with the 2025-05-01-preview version, is now shorter and clearer. An index description matters when a system must access several indexes and pick the right one based on the description rather than the index name alone; the revision emphasizes how the description supports that operational decision.
The links for the PUT and GET requests were also updated: version-specific API links were removed, and the references now point to the latest endpoints. Overall, the revision streamlines the documentation and makes it easier to understand.
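A minimal sketch of what the update request described above could look like. The index name, description text, and fields are placeholders, and a real request must echo the complete existing schema alongside the new top-level `description`.

```http
# Sketch only: hypothetical index; include the full existing schema in a real PUT.
PUT {endpoint}/indexes/hotels-sample?api-version=2025-09-01
{
  "name": "hotels-sample",
  "description": "Hotel listings with amenities and location, intended for travel-planning agents.",
  "fields": [
    { "name": "hotelId", "type": "Edm.String", "key": true },
    { "name": "hotelName", "type": "Edm.String", "searchable": true }
  ]
}
```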
articles/search/search-import-data-portal.md
Diff
@@ -113,7 +113,7 @@ The following skills might appear in a wizard-generated skillset.
<sup>1</sup> This skill is available for RAG and multimodal RAG workflows only. Keyword search isn't supported.
-<sup>2</sup> This skill is available for Azure Storage blobs and OneLake files, assuming the default parsing mode. Images can be an image content type (such as PNG or JPG) or an embedded image in an application file (such as PDF).
+<sup>2</sup> This skill is available for Azure Storage blobs and Microsoft OneLake files, assuming the default parsing mode. Images can be an image content type (such as PNG or JPG) or an embedded image in an application file (such as PDF).
<sup>3</sup> This skill is added when you configure a knowledge store.
Summary
{
"modification_type": "minor update",
"modification_title": "Azure StorageとOneLakeの用語統一"
}
Explanation
This change updates the documentation on importing data through the portal. The main revision unifies terminology and improves the description related to Azure Storage blobs.
Specifically, where the text previously said "OneLake", it now says "Microsoft OneLake", making it clear that this skill is available for both Azure Storage blobs and Microsoft OneLake files. The change helps readers understand the relationship between the two.
The rest of the wording is largely unchanged, but this clarification improves the consistency of the documentation and makes it easier to follow.
articles/search/search-indexer-howto-access-private.md
Diff
@@ -7,7 +7,7 @@ author: mrcarter8
ms.author: mcarter
ms.service: azure-ai-search
ms.topic: how-to
-ms.date: 07/31/2025
+ms.date: 09/26/2025
ms.update-cycle: 180-days
ms.custom:
- ignite-2024
@@ -134,7 +134,7 @@ You can create a shared private link for the following resources.
<sup>8</sup> Shared private links are now supported (as of November 2024) for connections to Azure AI services multi-service accounts. Azure AI Search connects to Azure AI services multi-service for [billing purposes](cognitive-search-attach-cognitive-services.md). These connections can now be private through a shared private link. Shared private link is only supported when configuring [a managed identity (keyless configuration)](cognitive-search-attach-cognitive-services.md#bill-through-a-keyless-connection) in the skillset definition.
-<sup>9</sup> Shared private link is supported for connections to OneLake workspace. To create a `privateLinkServicesForFabric` resource specific to a workspace, [register](/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider) `Microsoft.Fabric` namespace to your subscription and refer to step 2 as documented in [Create the private link service in Azure](/fabric/security/security-workspace-level-private-links-set-up#step-2-create-the-private-link-service-in-azure).
+<sup>9</sup> Shared private link is supported for connections to OneLake workspace. To create a `privateLinkServicesForFabric` resource specific to a workspace, [register](/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider) `Microsoft.Fabric` namespace to your subscription and refer to step 2 as documented in [Create the private link service in Azure](/fabric/security/security-workspace-level-private-links-set-up#step-2-create-the-private-link-service-in-azure). Note that when using a shared private link, the OneLake data source configuration must be defined with a specific connection string as outlined in the [OneLake indexer documentation](search-how-to-index-onelake-files.md#define-the-data-source).
## 1 - Create a shared private link
@@ -313,7 +313,7 @@ Approval of the private endpoint connection is granted on the Azure PaaS side. E
+ On Azure Storage, use [Private Endpoint Connections - Put](/rest/api/storagerp/private-endpoint-connections/put)
+ On Azure Cosmos DB, use [Private Endpoint Connections - Create Or Update](/rest/api/cosmos-db-resource-provider/private-endpoint-connections/create-or-update)
+ On Azure OpenAI, use [Private Endpoint Connections - Create Or Update](/rest/api/aiservices/accountmanagement/private-endpoint-connections/create-or-update?view=rest-aiservices-accountmanagement-2023-05-01&preserve-view=true)
-+ On Onelake, use [Private Endpoints - Approve via CLI](/cli/azure/network/private-endpoint-connection?view=azure-cli-latest#az-network-private-endpoint-connection-approve) or [Private Endpoints - Approve via Portal](/azure/private-link/manage-private-endpoint?tabs=manage-private-link-powershell#manage-private-endpoint-connections-on-azure-paas-resources)
++ On Microsoft Onelake, use [Private Endpoints - Approve via CLI](/cli/azure/network/private-endpoint-connection#az-network-private-endpoint-connection-approve) or [Private Endpoints - Approve via Portal](/azure/private-link/manage-private-endpoint#manage-private-endpoint-connections-on-azure-paas-resources)
Using the Azure portal, perform the following steps:
Summary
{
"modification_type": "minor update",
"modification_title": "OneLake関連の用語統一と詳細の追加"
}
Explanation
This change updates the documentation on indexer connections over private access. The main revisions are a date refresh and clearer OneLake terminology.
Specifically, the document date changed from July 31, 2025 to September 26, 2025 to reflect the latest information. The OneLake guidance is also more detailed: it now states that when a shared private link is used (via a `privateLinkServicesForFabric` resource), the OneLake data source must be defined with a specific connection string. This clarifies the data source configuration and makes the required steps easier to follow.
In addition, "OneLake" was changed to "Microsoft OneLake" in part of the text to keep the brand name consistent. These revisions improve the document's consistency and clarity and help readers understand the material.
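For orientation only, here's a hypothetical OneLake data source definition of the kind the added note refers to. The data source name, workspace GUID in the connection string, and lakehouse identifier in the container are placeholders; the exact connection-string format required for shared private links is defined in the linked OneLake indexer article.

```http
# Sketch only: placeholder GUIDs; see search-how-to-index-onelake-files.md#define-the-data-source for the required format.
POST {endpoint}/datasources?api-version=2025-09-01
{
  "name": "onelake-datasource",
  "type": "onelake",
  "credentials": {
    "connectionString": "ResourceId=00000000-0000-0000-0000-000000000000"
  },
  "container": {
    "name": "11111111-1111-1111-1111-111111111111",
    "query": null
  }
}
```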
articles/search/search-indexer-overview.md
Diff
@@ -56,13 +56,13 @@ Indexers crawl data stores on Azure and outside of Azure.
+ [Azure SQL Database](search-how-to-index-sql-database.md)
+ [Azure Table Storage](search-howto-indexing-azure-tables.md)
+ [Azure SQL Managed Instance](search-how-to-index-sql-managed-instance.md)
++ [Microsoft OneLake](search-how-to-index-onelake-files.md)
+ [SQL Server on Azure Virtual Machines](search-how-to-index-sql-server.md)
+ [Azure Files](search-file-storage-integration.md) (in preview)
+ [Azure MySQL](search-howto-index-mysql.md) (in preview)
+ [SharePoint in Microsoft 365](search-howto-index-sharepoint-online.md) (in preview)
+ [Azure Cosmos DB for MongoDB](search-howto-index-cosmosdb-mongodb.md) (in preview)
+ [Azure Cosmos DB for Apache Gremlin](search-howto-index-cosmosdb-gremlin.md) (in preview)
-+ [OneLake](search-how-to-index-onelake-files.md) (in preview)
Azure Cosmos DB for Cassandra is not supported.
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの明確化と文言修正"
}
Explanation
This change updates the indexer overview documentation. The main change is that Microsoft OneLake is now listed as a supported data store.
In particular, OneLake now appears alongside the other Azure data stores that indexers support. The entry previously listed as "OneLake" (in preview) was replaced by a standalone "Microsoft OneLake" entry higher in the list, giving it more visibility. The revision makes OneLake's role in Azure easier to understand and keeps the list complete.
The update improves the consistency of the documentation and helps readers understand the range of Azure data stores, raising the overall quality of the information provided.
articles/search/search-indexer-securing-resources.md
Diff
@@ -35,7 +35,7 @@ A list of all possible Azure resource types that an indexer might access in a ty
| Azure Storage (blobs, tables) | Skillsets (caching enrichments, debug sessions, knowledge store projections) |
| Azure Cosmos DB (various APIs) | Data source |
| Azure SQL Database | Data source |
-| OneLake (Microsoft Fabric) | Data source |
+| Microsoft OneLake | Data source |
| SQL Server on Azure virtual machines | Data source |
| SQL Managed Instance | Data source |
| Azure Functions | Attached to a skillset and used to host for custom web API skills |
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの表記修正"
}
Explanation
This change is a correction to the documentation on securing the resources that indexers access. Specifically, "OneLake (Microsoft Fabric)" is now written as "Microsoft OneLake".
The correction makes the OneLake brand name explicit and keeps it consistent with other Azure services. Because the entry now follows the same format as the other resource names in the same table, the information is easier to read.
Overall, the update improves the consistency of important information and helps readers understand the Azure resource types involved.
articles/search/search-normalizers.md
Diff
@@ -9,15 +9,12 @@ ms.service: azure-ai-search
ms.custom:
- ignite-2023
ms.topic: how-to
-ms.date: 05/19/2025
+ms.date: 09/28/2025
ms.update-cycle: 365-days
---
# Text normalization for case-insensitive filtering, faceting and sorting
-> [!IMPORTANT]
-> This feature is in public preview under [Supplemental Terms of Use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). We recommend the latest [preview REST API version](/rest/api/searchservice/search-service-api-versions) for this feature.
-
In Azure AI Search, a *normalizer* is a component that pre-processes text for keyword matching over fields marked as "filterable", "facetable", or "sortable". In contrast with full text "searchable" fields that are paired with [text analyzers](search-analyzers.md), content that's created for filter-facet-sort operations doesn't undergo analysis or tokenization. Omission of text analysis can produce unexpected results when casing and character differences show up, which is why you need a normalizer to homogenize variations in your content.
By applying a normalizer, you can achieve light text transformations that improve results:
Summary
{
"modification_type": "minor update",
"modification_title": "公開プレビュー日付の更新と注意書きの削除"
}
Explanation
This change updates the documentation on text normalizers. The main revisions are:
Date update: the document date changed from May 19, 2025 to September 28, 2025, so the information matches the current release.
Note removed: the important note stating that the feature is in public preview, including the link to the preview supplemental terms of use, was deleted.
Overall, these changes make the document more accurate and clear and give readers the latest information they need, while continuing to explain how to use normalizers.
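As a quick illustration of the feature the article covers, here's a sketch of an index definition that attaches a predefined normalizer to a filterable, facetable, sortable field. The index and field names are placeholders; `lowercase` is one of the predefined normalizers.

```http
# Sketch only: hypothetical index and field names.
PUT {endpoint}/indexes/locations?api-version=2025-09-01
{
  "name": "locations",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    {
      "name": "city",
      "type": "Edm.String",
      "filterable": true,
      "facetable": true,
      "sortable": true,
      "normalizer": "lowercase"
    }
  ]
}
```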
articles/search/search-relevance-overview.md
Diff
@@ -7,7 +7,7 @@ author: HeidiSteen
ms.author: heidist
ms.service: azure-ai-search
ms.topic: concept-article
-ms.date: 08/27/2025
+ms.date: 09/27/2025
ms.update-cycle: 180-days
---
@@ -17,16 +17,6 @@ In a query operation, the relevance of any given result is determined by a ranki
Ranking occurs whenever the query request includes full text or vector queries. It doesn't occur if the query invokes strict pattern matching, such as a filter-only query or a specialized query form like autocomplete, suggestions, geospatial search, fuzzy search, or regular expression search. A uniform search score of 1.0 indicates the absence of a ranking algorithm.
-## Relevance tuning
-
-***Relevance tuning*** is a technique for boosting search scores based on extra criteria such as weighted fields, freshness, or proximity. In Azure AI Search, relevance tuning options vary based on query type:
-
-+ For textual and numeric (nonvector) content in keyword or hybrid search, you can tune relevance through [scoring profiles](#custom-boosting-logic-using-scoring-profiles) or invoking the [semantic ranker](semantic-search-overview.md).
-
-+ For vector content in a hybrid query, you can [weight a vector field](hybrid-search-ranking.md#weighted-scores) to boost the importance of the vector component relative to the text component of the hybrid query.
-
-+ For pure vector queries, you can experiment between Hierarchical Navigable Small World (HNSW) and exhaustive K-nearest neighbors (KNN) to see if one algorithm outperforms the other for your scenario. HNSW graphing with an exhaustive KNN override at query time is the most flexible approach for comparison testing. You can also experiment with various embedding models to see which ones produce higher quality results. Finally, remember that a hybrid query or a vector query on documents that include nonvector fields are in-scope for relevance tuning, so it's just the vector fields themselves that can't participate in a relevance tuning effort.
-
## Levels of ranking
The query engine in Azure AI Search supports a multi-level approach to ranking search results, where there's a built-in ranking modality for each query type, plus extra ranking capabilities for extended relevance tuning.
@@ -39,6 +29,16 @@ This section describes the levels of scoring operations. For an illustration of
| Fused L1 | Scoring from multiple queries using the [Reciprocal Ranking Fusion (RRF) algorithm](hybrid-search-ranking.md). RRF is used for hybrid queries that include text and vector components. RRF is also used when multiple vector queries execute in parallel. A search score from RRF is reflected in `@search.score` [over a different range](#types-of-search-scores).|
| Level 2 (L2) | [Semantic ranking score (`@search.reRankerScore`)](semantic-search-overview.md) applies machine reading comprehension to the textual content retrieved by L1 ranking, rescoring the L1 results to better match the semantic intent of the query. L2 reranks L1 results because doing so saves time and money; it would be prohibitive to use semantic ranking as an L1 ranking system. Semantic ranking is a premium feature that bills for usage of the semantic ranking models. It's optional for text queries and vector queries that contain text, but required for [agentic retrieval (preview)](search-agentic-retrieval-concept.md). Although agentic retrieval sends multiple queries to the query engine, the ranking algorithm for agentic retrieval is the semantic ranker. |
+## Relevance tuning
+
+***Relevance tuning*** is a technique for boosting search scores based on extra criteria such as weighted fields, freshness, or proximity. In Azure AI Search, relevance tuning options vary based on query type:
+
++ For textual and numeric (nonvector) content in keyword or hybrid search, you can tune relevance through [scoring profiles](#custom-boosting-logic-using-scoring-profiles) or invoking the [semantic ranker](semantic-search-overview.md).
+
++ For vector content in a hybrid query, you can [weight a vector field](hybrid-search-ranking.md#weighted-scores) to boost the importance of the vector component relative to the text component of the hybrid query.
+
++ For pure vector queries, you can experiment between Hierarchical Navigable Small World (HNSW) and exhaustive K-nearest neighbors (KNN) to see if one algorithm outperforms the other for your scenario. HNSW graphing with an exhaustive KNN override at query time is the most flexible approach for comparison testing. You can also experiment with various embedding models to see which ones produce higher quality results. Finally, remember that a hybrid query or a vector query on documents that include nonvector fields are in-scope for relevance tuning, so it's just the vector fields themselves that can't participate in a relevance tuning effort.
+
## Custom boosting logic using scoring profiles
[Scoring profiles](index-add-scoring-profiles.md) are an optional feature for boosting scores based on extra user-defined criteria. Criteria can include weighted fields where a match found in a specific field is given more weight than the same match found in a different field. Criteria can also be defined through functions that boost by freshness, proximity, magnitude, or range. There's no extra costs associated with scoring profiles. To use a scoring profile, you define it in an index and then specify it on a query.
@@ -68,7 +68,7 @@ Scored results are indicated for each match in the query response. This table li
| `@search.score` | 0.333 - 1.00 | [HNSW or exhaustive KNN algorithm](vector-search-ranking.md#scores-in-a-vector-search-results) for vector search |
| `@search.score` | 0 through an upper limit determined by the number of queries | [RRF algorithm](hybrid-search-ranking.md#scores-in-a-hybrid-search-results) |
| `@search.rerankerScore` | 0.00 - 4.00 | [Semantic ranking algorithm](semantic-search-overview.md#how-results-are-scored) for L2 ranking |
-| `@search.rerankerScoreBoosted` | 0 through unlimited | [Semantic ranking with scoring profile boosting](semantic-how-to-enable-scoring-profiles.md) (scores can be significantly higher than 4) |
+| `@search.rerankerBoostedScore` | 0 through unlimited | [Semantic ranking with scoring profile boosting](semantic-how-to-enable-scoring-profiles.md) (scores can be significantly higher than 4) |
## Diagram of ranking algorithms
@@ -77,7 +77,7 @@ The following diagram illustrates how the ranking algorithms work together.
:::image type="content" source="media/scoring-profiles/scoring-over-ranked-results.png" alt-text="Diagram showing which fields have a scoring profile and when ranking occurs.":::
> [!NOTE]
-> This workflow diagram currently omits `@search.rerankerScoreBoosted` and a step for semantic ranking with boosting from a scoring profile. If you use semantic ranking with scoring profile, the scoring profile is applied after L2 ranking, and the final score is based on `@search.rerankerScoreBoosted`.
+> This workflow diagram currently omits `@search.rerankerBoostedScore` and a step for semantic ranking with boosting from a scoring profile. If you use semantic ranking with scoring profile, the scoring profile is applied after L2 ranking, and the final score is based on `@search.rerankerBoostedScore`.
## Example query inclusive of all ranking algorithms
Summary
{
"modification_type": "minor update",
"modification_title": "ランキングと関連性調整に関する情報の再配置"
}
Explanation
This change reorganizes and updates parts of the search relevance overview. The main revisions are:
Date update: the document date changed from August 27, 2025 to September 27, 2025 to match the latest information.
Relevance tuning moved: the "Relevance tuning" section that previously appeared near the top of the document was removed from that position and now appears later in the document. The section describes techniques for boosting search scores and explains which tuning options apply to each query scenario.
Terminology alignment: the field name `@search.rerankerScoreBoosted` was renamed to `@search.rerankerBoostedScore`, improving consistency.
These changes present the relevance tuning material more clearly and improve the flow of the document, giving readers a better understanding of the ranking algorithms and relevance features in Azure AI Search.
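To make the relocated relevance-tuning options concrete, here's a hedged sketch of a hybrid query that weights the vector component and forces an exhaustive KNN comparison at query time. The index name, field names, and query text are placeholders, and the `"kind": "text"` form assumes the index has a vectorizer configured for query-time embedding.

```http
# Sketch only: "exhaustive" overrides HNSW for this query, "weight" boosts the vector component in RRF.
POST {endpoint}/indexes/my-index/docs/search?api-version=2025-09-01
{
  "search": "walkable neighborhoods near transit",
  "vectorQueries": [
    {
      "kind": "text",
      "text": "walkable neighborhoods near transit",
      "fields": "contentVector",
      "k": 10,
      "exhaustive": true,
      "weight": 2.0
    }
  ],
  "select": "title,content"
}
```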
articles/search/search-what-is-data-import.md
Diff
@@ -77,7 +77,7 @@ The pull model uses *indexers* connecting to a supported data source, automatica
+ [Azure Files (preview)](search-file-storage-integration.md)
+ [Azure Cosmos DB](search-howto-index-cosmosdb.md)
+ [Azure SQL Database, SQL Managed Instance, and SQL Server on Azure VMs](search-how-to-index-sql-database.md)
-+ [OneLake files and shortcuts](search-how-to-index-onelake-files.md)
++ [Microsoft OneLake files and shortcuts](search-how-to-index-onelake-files.md)
+ [SharePoint Online (preview)](search-howto-index-sharepoint-online.md)
You can use third-party connectors, developed and maintained by Microsoft partners. For more information and links, see [Data source gallery](search-data-sources-gallery.md).
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの名称をMicrosoft OneLakeに更新"
}
Explanation
This change clarifies terminology in the documentation on what data import is. Specifically:
- Name update: "OneLake files and shortcuts" was changed to "Microsoft OneLake files and shortcuts", making it clearer that OneLake is a Microsoft offering.
The revision helps readers recognize Microsoft product names and improves consistency and recognition. It's a small update, but an important one for the overall quality of the documentation.
articles/search/semantic-how-to-enable-scoring-profiles.md
Diff
@@ -7,32 +7,20 @@ ms.author: heidist
ms.service: azure-ai-search
ms.update-cycle: 180-days
ms.topic: how-to
-ms.date: 08/27/2025
+ms.date: 09/28/2025
---
# Use scoring profiles with semantic ranker in Azure AI Search
-[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
+You can apply a [scoring profile](index-add-scoring-profiles.md) over [semantically ranked search results](semantic-search-overview.md), where the scoring profile is processed last.
-Using a [scoring profile](index-add-scoring-profiles.md) with [semantic ranker](semantic-search-overview.md) is supported in newer Azure AI Search preview REST API versions and Azure SDK preview packages. With this feature, the scoring profile is processed last. Without this feature, semantic ranking is processed last.
-
-To ensure the scoring profile provides the determining score, the semantic ranker adds a new response field, `@search.rerankerBoostedScore`, that applies scoring profile logic on semantically ranked results. In search results that include `@search.score` from level 1 ranking, `@search.rerankerScore` from semantic ranker, and `@search.reRankerBoostedScore`, results are sorted by `@search.reRankerBoostedScore`.
-
-> [!NOTE]
-> If you're using a stable API version or an earlier preview, scoring profiles are only used upstream, before the semantic ranking step. For a diagram of the scoring workflow, see [Relevance in Azure AI Search](search-relevance-overview.md).
+To ensure the scoring profile provides the determining score, the semantic ranker adds a response field, `@search.rerankerBoostedScore`, that applies scoring profile logic on semantically ranked results. In search results that include `@search.score` from level 1 ranking, `@search.rerankerScore` from semantic ranker, and `@search.rerankerBoostedScore`, results are sorted by `@search.rerankerBoostedScore`.
## Prerequisites
- [Azure AI Search](search-create-service-portal.md), Basic pricing tier or higher, with [semantic ranker enabled](semantic-how-to-enable-disable.md).
-- The [latest preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-08-01-preview&preserve-view=true) or a preview Azure SDK package that provides the new APIs.
-
-- A search index with a semantic configuration that specifies `"rankingOrder": "boostedReRankerScore"` and a scoring profile that specifies [functions](index-add-scoring-profiles.md#use-functions).
-
-- A semantic query includes the scoring profile.
-
-> [!TIP]
-> For all preview features, we recommend reviewing the Azure SDK change logs to check for feature availability: [Python SDK change log](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [.NET SDK change log](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/CHANGELOG.md), [Java SDK change log](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [JavaScript SDK change log](https://github.com/Azure/azure-sdk-for-js/blob/main/sdk/search/search-documents/CHANGELOG.md).
+- A search index with a semantic configuration that specifies `"rankingOrder": "boostedRerankerScore"` and a scoring profile that specifies [functions](index-add-scoring-profiles.md#use-functions).
## Limitations
@@ -42,7 +30,7 @@ Boosting of semantically ranked results applies to scoring profile functions onl
When you execute a semantic query associated with a scoring profile, a third search score, `@search.rerankerBoostedScore` value, is generated for every document in your search results. This boosted score, calculated by applying the scoring profile to the existing reranker score, doesn't have a guaranteed range (0–4) like a normal reranker score, and scores can be significantly higher than 4.
-Starting in API version `2025-05-01-preview`, semantic results are sorted by `@search.rerankerBoostedScore` by default if a scoring profile exists. If the `rankingOrder` property isn't specified, then `BoostedReRankerScore` is the default value in the semantic configuration.
+Semantic results are sorted by `@search.rerankerBoostedScore` by default if a scoring profile exists. If the `rankingOrder` property isn't specified, then `BoostedRerankerScore` is the default value in the semantic configuration.
In this scenario, a scoring profile is used twice.
@@ -58,16 +46,16 @@ In this scenario, a scoring profile is used twice.
## Enable scoring profiles in semantic configuration
-To enable scoring profiles with semantic ranking, use the latest preview REST API (2025-08-01-preview) to update an index by setting the `rankingOrder` property of its semantic configuration. Use the PUT method to update the index with your revisions. No index rebuild is required.
+To enable scoring profiles for semantically ranked results, [update an index](/rest/api/searchservice/indexes/create-or-update#rankingorder) by setting the `rankingOrder` property of its semantic configuration. Use the PUT method to update the index with your revisions. No index rebuild is required.
```json
-PUT https://{service-name}.search.windows.com/indexes/{index-name}?api-version=2025-08-01-preview
+PUT https://{service-name}.search.windows.com/indexes/{index-name}?api-version=2025-09-01
{
"semantic": {
"configurations": [
{
"name": "mySemanticConfig",
- "rankingOrder": "boostedReRankerScore"
+ "rankingOrder": "boostedRerankerScore"
}
]
}
@@ -79,7 +67,7 @@ PUT https://{service-name}.search.windows.com/indexes/{index-name}?api-version=2
To opt out of sorting by semantic reranker boosted score, set the `rankingOrder` field to `reRankerScore` value in the semantic configuration.
```json
-PUT https://{service-name}.search.windows.com/indexes/{index-name}?api-version=2025-08-01-preview
+PUT /indexes/{index-name}?api-version=2025-09-01
{
"semantic": {
"configurations": [
@@ -92,14 +80,14 @@ PUT https://{service-name}.search.windows.com/indexes/{index-name}?api-version=2
}
```
-Even if you opt out of sorting by `@search.rerankerBoostedScore`, the `boostedReRankerScore` field is still produced in the response, but it's no longer used to sort results.
+Even if you opt out of sorting by `@search.rerankerBoostedScore`, the `boostedRerankerScore` field is still produced in the response, but it's no longer used to sort results.
## Example query and response
-Start with a [semantic query](semantic-how-to-query-request.md) that specifies a scoring profile. The query uses the new preview REST API, and it targets a search index that has `rankingOrder` set to `boostedReRankerScore`.
+Start with a [semantic query](semantic-how-to-query-request.md) that specifies a scoring profile. This query targets a search index that has `rankingOrder` set to `boostedRerankerScore`.
```json
-POST https://{service-name}.search.windows.com/indexes/{index-name}/docs/search?api-version=2025-08-01-preview
+POST /indexes/{index-name}/docs/search?api-version=2025-09-01
{
"search": "my query to be boosted",
"scoringProfile": "myScoringProfile",
Summary
{
"modification_type": "minor update",
"modification_title": "スコアリングプロファイルに関する情報の更新"
}
Explanation
This change brings the documentation on using scoring profiles with semantic ranking in Azure AI Search up to date and reorganizes some of the content. The specific revisions are:
Date update: the document date changed from August 27, 2025 to September 28, 2025 to reflect the latest information.
Simplified and clarified text: the explanation of how a scoring profile is applied to semantically ranked results is now shorter, with the key points emphasized, in particular the `@search.rerankerBoostedScore` response field.
API version update: the API version used in the requests was updated, and the examples now show the new endpoints, so readers can work against the latest API specification.
Function clarification: the requirement to define scoring profile functions in the index is stated more clearly.
With these changes, readers can more effectively understand how to configure and use scoring profiles in Azure AI Search, and the updated content is valuable for developers evaluating the feature.
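For reference, here's a sketch of the kind of query the article builds toward: a semantic query that also names a scoring profile, so that `@search.rerankerBoostedScore` determines the final order. The index, semantic configuration, and scoring profile names are placeholders and assume the index is configured as described in the article.

```http
# Sketch only: assumes the index defines "mySemanticConfig" and "myScoringProfile".
POST {endpoint}/indexes/my-index/docs/search?api-version=2025-09-01
{
  "search": "budget hotels near the convention center",
  "queryType": "semantic",
  "semanticConfiguration": "mySemanticConfig",
  "scoringProfile": "myScoringProfile",
  "select": "title,content",
  "top": 10
}
```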
articles/search/toc.yml
Diff
@@ -342,7 +342,7 @@ items:
href: search-how-to-index-sql-managed-instance.md
- name: Azure SQL Server VMs
href: search-how-to-index-sql-server.md
- - name: OneLake files
+ - name: Microsoft OneLake
href: search-how-to-index-onelake-files.md
- name: SharePoint Online
href: search-howto-index-sharepoint-online.md
Summary
{
"modification_type": "minor update",
"modification_title": "OneLakeの名称をMicrosoft OneLakeに変更"
}
Explanation
This change keeps terminology consistent in the table of contents (toc.yml) for the Azure AI Search documentation. Specifically:
Name change: the "OneLake files" entry was renamed "Microsoft OneLake". The revision makes it clear that OneLake is a Microsoft product.
Improved TOC consistency: consistent naming in the table of contents helps readers find and understand information and reinforces recognition of the Microsoft brand.
Overall, this small update improves the quality of the documentation and makes it easier to use.
articles/search/tutorial-document-extraction-multimodal-embeddings.md
Diff
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
ms.update-cycle: 180-days
ms.custom:
ms.topic: tutorial
-ms.date: 08/27/2025
+ms.date: 09/27/2025
---
Summary
{
"modification_type": "minor update",
"modification_title": "ドキュメントの日付を更新"
}
Explanation
This change updates the date in the Azure AI Search tutorial "tutorial-document-extraction-multimodal-embeddings.md". Specifically:
Date change: the document date changed from August 27, 2025 to September 27, 2025, so readers work from the most recent version of the guide.
Freshness: refreshing the date signals that the document is current, which increases reader confidence.
Small updates like this keep the documentation on Azure AI Search features and usage current and improve the user experience.
articles/search/tutorial-document-layout-image-verbalization.md
Diff
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
ms.update-cycle: 180-days
ms.custom:
ms.topic: tutorial
-ms.date: 08/27/2025
+ms.date: 09/27/2025
---
@@ -25,7 +25,7 @@ In this tutorial, you use:
+ An indexer and skillset to create an indexing pipeline that includes AI enrichment through skills.
-+ The [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its `locationMetadata` from various documents, such as page numbers or bounding regions.
++ The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its `locationMetadata` from various documents, such as page numbers or bounding regions.
+ The [GenAI Prompt skill (preview)](cognitive-search-skill-genai-prompt.md) that calls a chat completion model to create descriptions of visual content.
Summary
{
"modification_type": "minor update",
"modification_title": "Document Layout skillの記述を修正"
}
Explanation
This change slightly revises the Azure AI Search tutorial "tutorial-document-layout-image-verbalization.md". Specifically:
Date update: the document date changed from August 27, 2025 to September 27, 2025 so the document stays current.
Wording fix: "Document Layout skill (preview)" was changed to "Document Layout skill". Dropping the preview label keeps the skill name consistent and signals that the capability is now a stable feature.
Improved consistency: consistent wording makes the feature and related information easier to understand and raises the overall quality of the document.
Small updates like this improve the reliability and readability of the documentation and deepen readers' understanding of Azure AI Search.
articles/search/tutorial-document-layout-multimodal-embeddings.md
Diff
@@ -9,7 +9,7 @@ ms.service: azure-ai-search
ms.update-cycle: 180-days
ms.custom:
ms.topic: tutorial
-ms.date: 08/27/2025
+ms.date: 09/27/2025
---
@@ -23,7 +23,7 @@ In this tutorial, you use:
+ An indexer and skillset to create an indexing pipeline that includes AI enrichment through skills.
-+ The [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its `locationMetadata` from various documents, such as page numbers or bounding regions.
++ The [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) for extracting text and normalized images with its `locationMetadata` from various documents, such as page numbers or bounding regions.
+ The [Azure AI Vision multimodal embeddings skill](cognitive-search-skill-vision-vectorize.md) to vectorize text and images.
Summary
{
"modification_type": "minor update",
"modification_title": "Document Layout skillの表現を修正"
}
Explanation
This change updates the Azure AI Search tutorial "tutorial-document-layout-multimodal-embeddings.md". The specific changes are:
Date update: the document date changed from August 27, 2025 to September 27, 2025, showing that the document is current and that readers are getting the latest information.
Wording fix: "Document Layout skill (preview)" was changed to "Document Layout skill". The revised name presents the skill as a released feature rather than a preview, so users can adopt it with more confidence.
Improved readability: the revision keeps the document consistent and helps readers find the information they need quickly.
These small changes increase the document's reliability and contribute to a better user experience, keeping the Azure AI Search information accurate and easy to use.
articles/search/vector-search-how-to-create-index.md
Diff
@@ -9,7 +9,7 @@ ms.update-cycle: 180-days
ms.custom:
- ignite-2024
ms.topic: how-to
-ms.date: 08/28/2025
+ms.date: 09/28/2025
---
# Create a vector index
@@ -75,6 +75,7 @@ Here's a basic index with a `"name"`, a `"fields"` collection, and some other co
POST https://[servicename].search.windows.net/indexes?api-version=[api-version]
{
"name": "example-index",
+ "description": "This is an example index."
"fields": [
{ "name": "documentId", "type": "Edm.String", "key": true, "retrievable": true, "searchable": true, "filterable": true },
{ "name": "myHumanReadableNameField", "type": "Edm.String", "retrievable": true, "searchable": true, "filterable": false, "sortable": true, "facetable": false },
@@ -97,20 +98,7 @@ A vector configuration includes:
+ `vectorSearch.compressions` for scalar or binary quantization, oversampling, and reranking with original vectors.
+ `vectorSearch.profiles` for specifying multiple combinations of algorithm and compression configurations.
-### [**2024-07-01**](#tab/config-2024-07-01)
-
-[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is generally available. It supports a vector configuration that has:
-
-+ Hierarchical Navigable Small World (HNSW) algorithm.
-+ Exhaustive K-Nearest Neighbor (KNN) algorithm.
-+ Scalar compression.
-+ Binary compression, which is available in 2024-07-01 only and in newer Azure SDK packages.
-+ Oversampling.
-+ Reranking with original vectors.
-
-If you choose HNSW on a field, you can opt for exhaustive KNN at query time. However, the opposite doesn’t work. If you choose exhaustive for indexing, you can’t later request HNSW search because the extra data structures that enable approximate search don’t exist.
-
-Be sure to have a strategy for [vectorizing your content](vector-search-how-to-generate-embeddings.md). We recommend [integrated vectorization](vector-search-integrated-vectorization.md) and [query-time vectorizers](vector-search-how-to-configure-vectorizer.md) for built-in encoding.
+Here are the steps:
1. Use the [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) REST API to create the index.
@@ -122,17 +110,23 @@ Be sure to have a strategy for [vectorizing your content](vector-search-how-to-g
{
"name": "scalar-quantization",
"kind": "scalarQuantization",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10.0,
- "scalarQuantizationParameters": {
- "quantizedDataType": "int8"
- }
+ "scalarQuantizationParameters": {
+ "quantizedDataType": "int8"
+ },
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "preserveOriginals"
+ }
},
{
"name": "binary-quantization",
"kind": "binaryQuantization",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10.0
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "discardOriginals"
+ }
}
],
"algorithms": [
@@ -181,9 +175,9 @@ Be sure to have a strategy for [vectorizing your content](vector-search-how-to-g
+ `vectorSearch.compressions` can be `scalarQuantization` or `binaryQuantization`. Scalar quantization compresses float values into narrower data types. Binary quantization converts floats into binary 1-bit values.
- + `vectorSearch.compressions.rerankWithOriginalVectors` uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if `stored` is false. This property is optional. Default is true.
+ + `vectorSearch.compressions.rescoringOptions` uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if `stored` is false. This property is optional. Default is true.
- + `vectorSearch.compressions.defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
+ + `vectorSearch.compressions.rescoringOptions.defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
+ `vectorSearch.compressions.scalarQuantizationParameters.quantizedDataType` must be set to `int8`. This is the only primitive data type supported at this time. This property is optional. Default is `int8`.
@@ -199,104 +193,12 @@ Be sure to have a strategy for [vectorizing your content](vector-search-how-to-g
+ `vectorSearch.profiles` add a layer of abstraction for accommodating richer definitions. A profile is defined in `vectorSearch` and referenced by name in each vector field. It's a combination of compression and algorithm configurations. You assign this property to a vector field, and it determines the fields' algorithm and compression.
-### [**2024-05-01-preview**](#tab/config-2024-05-01-Preview)
-
-[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) is the most recent preview version. It's inclusive of previous preview versions.
-
-Preview and stable API versions support the same `vectorSearch` configurations. You would choose the preview over the stable version for other reasons, such as [more compression options](vector-search-how-to-quantization.md) or [newer vectorizers](vector-search-how-to-configure-vectorizer.md) invoked at query time.
-
-1. Use the [Create or Update Index Preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to create the index.
-
-1. Add a `vectorSearch` section in the index that specifies compression settings and the search algorithms used to create the embedding space. For more information, see [Configure vector quantization](vector-search-how-to-quantization.md).
-
- ```json
- "vectorSearch": {
- "compressions": [
- {
- "name": "my-scalar-quantization",
- "kind": "scalarQuantization",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10.0,
- "scalarQuantizationParameters": {
- "quantizedDataType": "int8"
- }
- }
- ],
- "algorithms": [
- {
- "name": "hnsw-1",
- "kind": "hnsw",
- "hnswParameters": {
- "m": 4,
- "efConstruction": 400,
- "efSearch": 500,
- "metric": "cosine"
- }
- },
- {
- "name": "hnsw-2",
- "kind": "hnsw",
- "hnswParameters": {
- "m": 8,
- "efConstruction": 800,
- "efSearch": 800,
- "metric": "hamming"
- }
- },
- {
- "name": "eknn",
- "kind": "exhaustiveKnn",
- "exhaustiveKnnParameters": {
- "metric": "euclidean"
- }
- }
-
- ],
- "profiles": [
- {
- "name": "vector-profile-hnsw-1",
- "algorithm": "hnsw-1"
- }
- ]
- }
- ```
-
- **Key points**:
-
- + `vectorSearch.compressions.kind` must be `scalarQuantization`.
-
- + `vectorSearch.compressions.rerankWithOriginalVectors` uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if `stored` is false. This property is optional. Default is true.
-
- + `vectorSearch.compressions.defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
-
- + `vectorSearch.compressions.scalarQuantizationParameters.quantizedDataType` must be set to `int8`. This is the only primitive data type supported at this time. This property is optional. Default is `int8`.
-
- + `vectorSearch.algorithms.kind` is either `hnsw` or `exhaustiveKnn`. These are the Approximate Nearest Neighbors (ANN) algorithms used to organize vector content during indexing.
-
- + `vectorSearch.algorithms.m` is the bi-directional link count. Default is 4. The range is 4 to 10. Lower values should return less noise in the results.
-
- + `vectorSearch.algorithms.efConstruction` is the number of nearest neighbors used during indexing. Default is 400. The range is 100 to 1,000.
-
- + `vectorSearch.algorithms.efSearch` is the number of nearest neighbors used during search. Default is 500. The range is 100 to 1,000.
-
- + `vectorSearch.algorithms.metric` should be `cosine` if you're using Azure OpenAI, otherwise use the similarity metric associated with the embedding model you're using. Supported values are `cosine`, `dotProduct`, `euclidean`, and `hamming` (used for [indexing binary data](vector-search-how-to-index-binary-data.md)).
-
- + `vectorSearch.profiles` add a layer of abstraction for accommodating richer definitions. A profile is defined in `vectorSearch` and referenced by name in each vector field. It's a combination of compression and algorithm configurations. You assign this property to a vector field, and it determines the fields' algorithm and compression.
-
-For more information about new preview features, see [What's new in Azure AI Search](whats-new.md).
-
----
-
## Add a vector field to the fields collection
Once you have a vector configuration, you can add a vector field to the fields collection. Recall that the fields collection must include a field for the document key, vector fields, and any other nonvector fields you need for [hybrid search scenarios](hybrid-search-overview.md) or chat model completion in [RAG workloads](retrieval-augmented-generation-overview.md).
Vector fields are characterized by [their data type](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields), a `dimensions` property based on the embedding model used to output the vectors, and a vector profile that you created in a previous step.
-### [**2024-07-01**](#tab/rest-2024-07-01)
-
-[**2024-07-01**](/rest/api/searchservice/search-service-api-versions#2024-07-01) is generally available.
-
1. Use the [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) REST API to create the index and add a vector field to the fields collection.
```json
@@ -403,122 +305,6 @@ Vector fields are characterized by [their data type](/rest/api/searchservice/sup
}
```
-### [**2024-05-01-preview**](#tab/rest-2024-05-01-Preview)
-
-[**2024-05-01-preview**](/rest/api/searchservice/search-service-api-versions#2024-05-01-preview) is the most recent preview version. It supports the same vector field definitions as the stable version, including support for all [vector data types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields).
-
-1. Use the [Create or Update Index Preview REST API](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true) to create the index and add a vector field to the fields collection.
-
- ```json
- {
- "name": "example-index",
- "fields": [
- {
- "name": "contentVector",
- "type": "Collection(Edm.Single)",
- "searchable": true,
- "retrievable": false,
- "stored": false,
- "dimensions": 1536,
- "vectorSearchProfile": "vector-profile-1"
- }
- ]
- }
- ```
-
-1. Specify a vector field with the following attributes. You can store one generated embedding per document field. For each vector field:
-
- + `type` can be `Collection(Edm.Single)`, `Collection(Edm.Half)`, `Collection(Edm.Int16)`, or `Collection(Edm.SByte)`.
- + `dimensions` is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
- + `vectorSearchProfile` is the name of a profile defined elsewhere in the index.
- + `searchable` must be true.
- + `retrievable` can be true or false. True returns the raw vectors (1,536 of them) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app. False is required if `stored` is false.
- + `stored` is a new boolean property that applies to vector fields only. True stores a copy of vectors returned in search results. False discards that copy during indexing. You can search on vectors, but you can't return vectors in results.
- + `filterable`, `facetable`, and `sortable` must be false.
-
-1. Add filterable nonvector fields to the collection, such as "title" with `filterable` set to true, if you want to invoke [prefiltering, postfiltering, or strict postfiltering (preview)](vector-search-filters.md) on the [vector query](vector-search-how-to-query.md).
-
-1. Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
-
- You should also add fields that are useful in the query or in its response. The following example shows vector fields for title and content (`titleVector` and `contentVector`) that are equivalent to vectors. It also provides fields for equivalent textual content (`title` and `content`) that are useful for sorting, filtering, and reading in a search result.
-
- The following example shows the fields collection:
-
- ```http
- PUT https://my-search-service.search.windows.net/indexes/my-index?api-version=2024-05-01-preview&allowIndexDowntime=true
- Content-Type: application/json
- api-key: {{admin-api-key}}
- {
- "name": "{{index-name}}",
- "fields": [
- {
- "name": "id",
- "type": "Edm.String",
- "key": true,
- "filterable": true
- },
- {
- "name": "firstVectorfield-float32-embeddings",
- "type": "Collection(Edm.Single)",
- "searchable": true,
- "retrievable": false,
- "stored": false,
- "dimensions": 1536,
- "vectorSearchProfile": "vector-profile-1"
- },
- {
- "name": "secondVectorfield-float16-embeddings",
- "type": "Collection(Edm.Half)",
- "searchable": true,
- "retrievable": false,
- "stored": false,
- "dimensions": 1536,
- "vectorSearchProfile": "vector-profile-1"
- },
- {
- "name": "thirdVectorfield-int8-embeddings-for-my-custom-quantization-output",
- "type": "Collection(Edm.SByte)",
- "searchable": true,
- "retrievable": false,
- "stored": false,
- "dimensions": 1536,
- "vectorSearchProfile": "vector-profile-1"
- },
- {
- "name": "fourthVectorfield-for-binary-data",
- "type": "Collection(Edm.Byte)",
- "searchable": true,
- "retrievable": false,
- "stored": false,
- "dimensions": 1536,
- "vectorSearchProfile": "vector-profile-1"
- }
- ],
- "vectorSearch": {
- "algorithms": [
- {
- "name": "hnsw-1",
- "kind": "hnsw",
- "hnswParameters": {
- "m": 4,
- "efConstruction": 400,
- "efSearch": 500,
- "metric": "cosine"
- }
- }
- ],
- "profiles": [
- {
- "name": "vector-profile-1",
- "algorithm": "hnsw-1"
- }
- ]
- }
- }
- ```
-
----
-
## Load vector data for indexing
Content that you provide for indexing must conform to the index schema and include a unique string value for the document key. Prevectorized data is loaded into one or more vector fields, which can coexist with other fields containing nonvector content.
@@ -527,13 +313,10 @@ For data ingestion, you can use [push or pull methodologies](search-what-is-data
### [**Push APIs**](#tab/push)
-Use **Documents - Index** to load vector and nonvector data into an index. The push APIs for indexing are identical across all stable and preview versions. Use any of the following APIs to load documents:
-
-+ [2024-07-01](/rest/api/searchservice/documents)
-+ [2024-05-01-preview](/rest/api/searchservice/documents/?view=rest-searchservice-2024-05-01-preview&preserve-view=true)
+Use [Documents - Index](/rest/api/searchservice/documents) to load vector and nonvector data into an index. The push APIs for indexing are identical across all stable and preview versions. Use any of the following APIs to load documents:
```http
-POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/index?api-version=2024-07-01
+POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/index?api-version=2025-09-01
{
"value": [
@@ -626,7 +409,7 @@ For more information, see [Create a vector query](vector-search-how-to-query.md)
The following REST API example is a vector query, but it returns only nonvector fields (`title`, `content`, and `category`). Only fields marked as `retrievable` can be returned in search results.
```http
-POST https://my-search-service.search.windows.net/indexes/my-index/docs/search?api-version=2024-07-01
+POST https://my-search-service.search.windows.net/indexes/my-index/docs/search?api-version=2025-09-01
Content-Type: application/json
api-key: {{admin-api-key}}
{
Summary
{
"modification_type": "breaking change",
"modification_title": "ベクトル検索インデックス作成手順の大幅な改訂"
}
Explanation
This change is a major revision of the Azure AI Search article "vector-search-how-to-create-index.md". The main changes are:
Date update: the document date changed from August 28, 2025 to September 28, 2025 to reflect the new content.
Index description added: a `description` field with the text "This is an example index." was added to the sample index, making the purpose of the index clearer.
Content trimmed and reorganized: sections covering outdated API versions and redundant configuration details were removed, and the vector search configuration steps are now presented in a cleaner sequence, so readers can find what they need faster.
New options introduced: rescoring options were added. `rescoringOptions` recalculates similarity and reranks the top results returned by a query, with `enableRescoring`, `defaultOversampling`, and `rescoreStorageMethod` settings.
API version update: several API calls were updated to the latest version.
These revisions simplify and clarify the steps for creating a vector index in Azure AI Search, which is especially helpful for new users.
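Because the revised article pairs compression with rescoring, here's a hedged sketch of a query that requests oversampling at query time against a compressed vector field. The index name, field name, and values are placeholders; oversampling only takes effect when the index's compression configuration enables rescoring, and the `"kind": "text"` form assumes a vectorizer is defined on the index.

```http
# Sketch only: oversampling widens the candidate set that gets rescored before the top k are returned.
POST {endpoint}/indexes/example-index/docs/search?api-version=2025-09-01
{
  "vectorQueries": [
    {
      "kind": "text",
      "text": "scenic coastal hiking trails",
      "fields": "contentVector",
      "k": 10,
      "oversampling": 20
    }
  ],
  "select": "title,content"
}
```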
articles/search/vector-search-how-to-quantization.md
Diff
@@ -9,7 +9,7 @@ ms.update-cycle: 180-days
ms.custom:
- ignite-2024
ms.topic: how-to
-ms.date: 03/31/2025
+ms.date: 09/28/2025
---
# Compress vectors using scalar or binary quantization
@@ -48,28 +48,40 @@ Two types of quantization are supported:
>[!Note]
> While free services support quantization, they don't demonstrate the full storage savings due to the limited storage quota.
-## Recommended rescoring techniques
+### How scalar quantization works in Azure AI Search
-Rescoring is a technique used to offset information loss due to vector compression. It uses oversampling to pick up extra vectors, and supplemental information to rescore initial results found by the query. Supplemental information is either uncompressed original full-precision vectors - or for binary quantization only - you have the option of rescoring using the binary quantized document candidates against the query vector. Rescoring options are specified in the index, but you can invoke rescoring at query time if the index supports it.
+Scalar quantization reduces the resolution of each number within each vector embedding. Instead of describing each number as a 16-bit or 32-bit floating point number, it uses an 8-bit integer. It identifies a range of numbers (typically 99th percentile minimum and maximum) and divides them into a finite number of levels, or bins, assigning each bin an identifier. In 8-bit scalar quantization, there are 2^8, or 256, possible bins.
+
+Each component of the vector is mapped to the closest representative value within this set of quantization levels in a process akin to rounding a real number to the nearest integer. In the quantized 8-bit vector, the identifier number stands in place of the original value. After quantization, each vector is represented by an array of identifiers for the bins to which its components belong. These quantized vectors require much fewer bits to store compared to the original vector, thus reducing storage requirements and memory footprint.
+
+### How binary quantization works in Azure AI Search
+
+Binary quantization compresses high-dimensional vectors by representing each component as a single bit, either 0 or 1. This method drastically reduces the memory footprint and accelerates vector comparison operations, which are crucial for search and retrieval tasks. Benchmark tests show up to 96% reduction in vector index size.
+
+It's particularly effective for embeddings with dimensions greater than 1024. For smaller dimensions, we recommend testing the quality of binary quantization, or trying scalar instead. Additionally, we’ve found binary quantization performs very well when embeddings are centered around zero. Most popular embedding models offered by OpenAI, Cohere, and Mistral are centered around zero.
+
+## Supported rescoring techniques
-API versions determine which rescoring behavior is operational for your code. The most recent preview API supports a new rescoring approach for binary quantization. Indexes created with `2025-03-01-preview` can use the new rescoring behaviors.
+Rescoring is an optional technique used to offset information loss due to vector quantization. During query execution, it uses oversampling to pick up extra vectors, and supplemental information to rescore initial results found by the query. Supplemental information is either uncompressed original full-precision vectors - or for binary quantization only - you have the option of rescoring using the binary quantized document candidates against the query vector.
-| API version | Quantization type | Rescoring properties |
-|-------------|-------------------|------------------|
-| [2024-07-01](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-07-01&preserve-view=true) | Scalar and binary quantization, on vector indexes built using Hierarchical Navigable Small World (HNSW) graphs for similarity search | `rerankWithOriginalVectors` |
-| [2024-11-01-preview](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) | Scalar and binary quantization on HNSW graphs | `rescoringOptions.enableRescoring` and `rescoreStorageMethod.preserveOriginals` |
-| [2025-03-01-preview](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-03-01-preview&preserve-view=true) | Binary quantization on HNSW graphs | Previous parameter combinations are still supported but binary quantization can now be rescored if original embeddings are deleted: `rescoringOptions.enableRescoring` and `rescoringOptions.rescoreStorageMethod=discardOriginals` |
+Only HNSW graphs allow rescoring. Exhaustive KNN doesn't support rescoring because, by definition, all vectors are scanned at query time, which makes oversampling irrelevant.
-Only HNSW graphs allow rescoring. Exhaustive KNN doesn't support rescoring.
+Rescoring options are specified in the index, but you can invoke rescoring at query time by adding the `oversampling` query parameter.
-<!-- - In version 2024-11-01-preview, set `rescoringOptions.enableRescoring` and `rescoreStorageMethod.preserveOriginals`
-- In version 2025-03-01-preview, set `rescoringOptions.enableRescoring` and `rescoringOptions.rescoreStorageMethod=preserveOriginals` for scalar or binary quantization, or `rescoringOptions.enableRescoring` and `rescoringOptions.rescoreStorageMethod=discardOriginals` for binary quantization only -->
+| Object | Properties |
+|--------|------------|
+| Index | Add [`RescoringOptions`](/rest/api/searchservice/indexes/create-or-update#rescoringoptions) to the vector compressions section: `rescoringOptions.enableRescoring` (true or false), `rescoringOptions.defaultOversampling` (an integer), `rescoringOptions.rescoreStorageMethod` (`preserveOriginals` or `discardOriginals`). We recommend `preserveOriginals` for scalar quantization and `discardOriginals` for binary quantization. |
+| Query | Add `oversampling` on [`RawVectorQuery`](/rest/api/searchservice/documents/search-post#rawvectorquery) or [`VectorizableTextQuery`](/rest/api/searchservice/documents/search-post#vectorizabletextquery) definitions. |
+
+> [!NOTE]
+> Rescoring parameter names have changed over the last several releases. If you're using an older preview API, review the [upgrade instructions](search-api-migration.md#upgrade-to-2024-11-01-preview) to address breaking changes.
The generalized process for rescoring is:
1. The vector query executes over compressed vector fields.
1. The vector query returns the top k oversampled candidates.
-1. Oversampled k candidates are rescored using either the uncompressed original vectors, or the dot product of binary quantization. 1. After rescoring, results are adjusted so that more relevant matches appear first.
+1. Oversampled k candidates are rescored using either the uncompressed original vectors, or the dot product of binary quantization.
+1. After rescoring, results are adjusted so that more relevant matches appear first.
## Add "compressions" to a search index
@@ -79,15 +91,14 @@ The compression example includes both `scalarQuantization` or `binaryQuantizatio
Syntax for `vectorSearch.Compressions` varies between stable and preview REST APIs, with the preview adding more options for storage optimization, plus changes to existing syntax. Backwards compatibility is preserved through internal API mappings, but we recommend adopting the newer properties in code that targets 2024-11-01-preview and future versions.
-### [**2024-07-01**](#tab/2024-07-01)
-
Use the [Create Index](/rest/api/searchservice/indexes/create) or [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) REST API to configure compression settings.
```http
-POST https://[servicename].search.windows.net/indexes?api-version=2024-07-01
+POST https://[servicename].search.windows.net/indexes?api-version=2025-09-01
{
"name": "my-index",
+ "description": "This is a description of this index",
"fields": [
{ "name": "Id", "type": "Edm.String", "key": true, "retrievable": true, "searchable": true, "filterable": true },
{ "name": "content", "type": "Edm.String", "retrievable": true, "searchable": true },
@@ -111,95 +122,30 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-07-01
],
"compressions": [
{
- "name": "use-scalar",
- "kind": "scalarQuantization",
"scalarQuantizationParameters": {
"quantizedDataType": "int8"
},
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10
+ "name": "mySQ8",
+ "kind": "scalarQuantization",
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "preserveOriginals"
+ },
+ "truncationDimension": 2
},
{
- "name": "use-binary",
+ "name": "myBQC",
"kind": "binaryQuantization",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "discardOriginals"
+ },
+ "truncationDimension": 2
}
]
- }
-}
-```
-
-**Key points**:
-
-- `kind` must be set to `scalarQuantization` or `binaryQuantization`.
-
-- `rerankWithOriginalVectors` uses the original uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if `stored` is false. This property is optional. Default is true.
-
-- `defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
-
-- `quantizedDataType` is optional and applies to scalar quantization only. If you add it, it must be set to `int8`. This is the only primitive data type supported for scalar quantization at this time. Default is `int8`.
-
-### [**2024-11-01-preview**](#tab/2024-11-01-preview)
-
-Use the [Create Index (preview)](/rest/api/searchservice/indexes/create?view=rest-searchservice-2024-11-01-preview&preserve-view=true) or [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true) REST API to configure compression settings.
-
-Changes in this version include new `rescoringOptions` that replace `rerankWithOriginalVectors`, and extend the API with more storage options. Notice that `defaultOversampling` is now a property of `rescoringOptions`.
-
-Rescoring options are used to mitigate the effects of lossy comprehension. You can set `rescoringOptions` for scalar or binary quantization.
-
-```http
-POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-preview
-
-{
- "name": "my-index",
- "fields": [
- { "name": "Id", "type": "Edm.String", "key": true, "retrievable": true, "searchable": true, "filterable": true },
- { "name": "content", "type": "Edm.String", "retrievable": true, "searchable": true },
- { "name": "vectorContent", "type": "Collection(Edm.Single)", "retrievable": false, "searchable": true, "dimensions": 1536,"vectorSearchProfile": "vector-profile-1"},
- ],
- "vectorSearch": {
- "profiles": [
- {
- "name": "vector-profile-1",
- "algorithm": "use-hnsw",
- "compression": "use-scalar"
- }
- ],
- "algorithms": [
- {
- "name": "use-hnsw",
- "kind": "hnsw",
- "hnswParameters": { },
- "exhaustiveKnnParameters": null
- }
- ],
- "compressions": [
- {
- "name": "use-scalar",
- "kind": "scalarQuantization",
- "rescoringOptions": {
- "enableRescoring": true,
- "defaultOversampling": 10,
- "rescoreStorageMethod": "preserveOriginals"
- },
- "scalarQuantizationParameters": {
- "quantizedDataType": "int8"
- },
- "truncationDimension": 1024
- },
- {
- "name": "use-binary",
- "kind": "binaryQuantization",
- "rescoringOptions": {
- "enableRescoring": true,
- "defaultOversampling": 10,
- "rescoreStorageMethod": "preserveOriginals"
- },
- "truncationDimension": 1024
- }
- ]
- }
+  }
}
```
@@ -209,94 +155,13 @@ POST https://[servicename].search.windows.net/indexes?api-version=2024-11-01-pre
- `rescoringOptions` are a collection of properties used to offset lossy compression by rescoring query results using the original full-precision vectors that exist prior to quantization. For rescoring to work, you must have the vector instance that provides this content. Setting `rescoreStorageMethod` to `discardOriginals` prevents you from using `enableRescoring` or `defaultOversampling`. For more information about vector storage, see [Eliminate optional vector instances from storage](vector-search-how-to-storage-options.md).
-- `"rescoreStorageMethod": "preserveOriginals"` is the API equivalent of `"rerankWithOriginalVectors": true`. Rescoring vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step.
+- `"rescoreStorageMethod": "preserveOriginals"` rescores vector search results with the original full-precision vectors can result in adjustments to search score and rankings, promoting the more relevant matches as determined by the rescoring step. For binary quantization, you can set `rescoreStorageMethod` to `discardOriginals` to further reduce storage, without reducing quality. Original vectors aren't needed for binary quantization.
- `defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
- `quantizedDataType` is optional and applies to scalar quantization only. If you add it, it must be set to `int8`. This is the only primitive data type supported for scalar quantization at this time. Default is `int8`.
-- `truncationDimension` is a preview feature that taps inherent capabilities of the text-embedding-3 models to "encode information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks" (see [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147)). You can use truncated dimensions with or without rescoring options. For more information about how this feature is implemented in Azure AI Search, see [Truncate dimensions using MRL compression](vector-search-how-to-truncate-dimensions.md).
-
-### [**2025-03-01-preview**](#tab/2025-03-01-preview)
-
-Use the [Create Index (preview)](/rest/api/searchservice/indexes/create?view=rest-searchservice-2025-031-01-preview&preserve-view=true) or [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-03-01-preview&preserve-view=true) REST API to configure compression settings.
-
-Changes in this version include new guidance for *binary quantization*. If you set `enableRescoring` to true, you can set `rescoreStorageMethod` to `discardOriginals` to further reduce storage, without reducing quality.
-
-Azure AI Search supports a lossy rescoring option on the binary quantized document vectors, which helps close the quality gap between no rescoring and full-precision rescoring when using `binaryQuantization`.
-
-For scalar quantization, there are no rescoring changes in this preview.
-
-```http
-POST https://[servicename].search.windows.net/indexes?api-version=2025-03-01-preview
-
-{
- "name": "my-index",
- "fields": [
- { "name": "Id", "type": "Edm.String", "key": true, "retrievable": true, "searchable": true, "filterable": true },
- { "name": "content", "type": "Edm.String", "retrievable": true, "searchable": true },
- { "name": "vectorContent", "type": "Collection(Edm.Single)", "retrievable": false, "searchable": true, "dimensions": 1536,"vectorSearchProfile": "vector-profile-1"},
- ],
- "vectorSearch": {
- "profiles": [
- {
- "name": "vector-profile-1",
- "algorithm": "use-hnsw",
- "compression": "use-binary"
- }
- ],
- "algorithms": [
- {
- "name": "use-hnsw",
- "kind": "hnsw",
- "hnswParameters": { },
- "exhaustiveKnnParameters": null
- }
- ],
- "compressions": [
- {
- "name": "use-scalar",
- "kind": "scalarQuantization",
- "rescoringOptions": {
- "enableRescoring": true,
- "defaultOversampling": 10,
- "rescoreStorageMethod": "preserveOriginals"
- },
- "scalarQuantizationParameters": {
- "quantizedDataType": "int8"
- },
- "truncationDimension": 1024
- },
- {
- "name": "use-binary",
- "kind": "binaryQuantization",
- "rescoringOptions": {
- "enableRescoring": true,
- "defaultOversampling": 10,
- "rescoreStorageMethod": "discardOriginals"
- },
- "truncationDimension": 1024
- }
- ]
- }
-}
-```
-
-**Key points**:
-
-- `kind` must be set to `scalarQuantization` or `binaryQuantization`.
-
-- `rescoringOptions` are a collection of properties used to offset lossy compression by rescoring query results using the original full-precision vectors that exist prior to quantization.
-
-- `enableRescoring` rescores the initial results obtained by query execution over compressed data. For scalar quantization, rescoring uses uncompressed vectors to produce more relevant results and takes a dependency on `preserveOriginals`. For binary quantization, rescoring is the same as scalar quantization if you preserve originals, but you can also discard originals and still get rescoring. In this scenario, rescoring is calculated by the dot product of the full precision query and binary quantized data in the index.
-
-- `"rescoreStorageMethod": "discardOriginals"` removes original vectors. These aren't needed for binary quantization.
-
-- `defaultOversampling` considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the `k` in the query, with an oversampling multiplier. For example, if the query specifies a `k` of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top `k` reranked results are returned. This property is optional. Default is 4.
-
-- `truncationDimension` is a preview feature that taps inherent capabilities of the text-embedding-3 models to "encode information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks" (see [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147)). You can use truncated dimensions with or without rescoring options. For more information about how this feature is implemented in Azure AI Search, see [Truncate dimensions using MRL compression](vector-search-how-to-truncate-dimensions.md).
-
----
+- `truncationDimension` taps inherent capabilities of the text-embedding-3 models to "encode information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks" (see [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147)). You can use truncated dimensions with or without rescoring options. For more information about how this feature is implemented in Azure AI Search, see [Truncate dimensions using MRL compression](vector-search-how-to-truncate-dimensions.md).
## Add the vector search algorithm
@@ -365,84 +230,18 @@ To use a new quantization configuration, you must create a *new* vector profile.
1. [Load the index](search-what-is-data-import.md) using indexers for pull model indexing, or APIs for push model indexing.
-## How scalar quantization works in Azure AI Search
-
-Scalar quantization reduces the resolution of each number within each vector embedding. Instead of describing each number as a 16-bit or 32-bit floating point number, it uses an 8-bit integer. It identifies a range of numbers (typically 99th percentile minimum and maximum) and divides them into a finite number of levels or bin, assigning each bin an identifier. In 8-bit scalar quantization, there are 2^8, or 256, possible bins.
-
-Each component of the vector is mapped to the closest representative value within this set of quantization levels in a process akin to rounding a real number to the nearest integer. In the quantized 8-bit vector, the identifier number stands in place of the original value. After quantization, each vector is represented by an array of identifiers for the bins to which its components belong. These quantized vectors require much fewer bits to store compared to the original vector, thus reducing storage requirements and memory footprint.
-
-## How binary quantization works in Azure AI Search
-
-Binary quantization compresses high-dimensional vectors by representing each component as a single bit, either 0 or 1. This method drastically reduces the memory footprint and accelerates vector comparison operations, which are crucial for search and retrieval tasks. Benchmark tests show up to 96% reduction in vector index size.
-
-It's particularly effective for embeddings with dimensions greater than 1024. For smaller dimensions, we recommend testing the quality of binary quantization, or trying scalar instead. Additionally, we’ve found binary quantization performs very well when embeddings are centered around zero. Most popular embedding models such as OpenAI, Cohere, and Mistral are centered around zero.
-
## Query a quantized vector field using oversampling
-Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling and rescoring. You can add an o`versampling` parameter to invoke oversampling and rescoring at query time.
-
-### [**2024-07-01**](#tab/query-2024-07-01)
-
-Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `rerankWithOriginalVectors` and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
+Query syntax for a compressed or quantized vector field is the same as for noncompressed vector fields, unless you want to override parameters associated with oversampling and rescoring. You can add an `oversampling` parameter to invoke oversampling and rescoring at query time.
-You can set the oversampling parameter even if the index doesn't explicitly have a `rerankWithOriginalVectors` or `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with an effective `rerankWithOriginalVectors` as true.
-
-```http
-POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-07-01
-
-{
- "vectorQueries": [
- {
- "kind": "vector",
- "vector": [8, 2, 3, 4, 3, 5, 2, 1],
- "fields": "myvector",
- "oversampling": 12.0,
- "k": 5
- }
- ]
-}
-```
-
-**Key points**:
-
-- Applies to vector fields that undergo vector compression, per the vector profile assignment.
-
-- Overrides the `defaultOversampling` value or introduces oversampling at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
-
-### [**2024-11-01-preview**](#tab/query-2024-11-01-preview)
+Use [Search Documents](/rest/api/searchservice/documents/search-post) for the request.
Recall that the [vector compression definition](vector-search-how-to-quantization.md) in the index has settings for `enableRescoring`, `rescoreStorageMethod`, and `defaultOversampling` to mitigate the effects of lossy compression. You can override the default values to vary the behavior at query time. For example, if `defaultOversampling` is 10.0, you can change it to something else in the query request.
You can set the oversampling parameter even if the index doesn't explicitly have rescoring options or `defaultOversampling` definition. Providing `oversampling` at query time overrides the index settings for that query and executes the query with an effective `enableRescoring` as true.
```http
-POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2024-11-01-preview
-
-{
- "vectorQueries": [
- {
- "kind": "vector",
- "vector": [8, 2, 3, 4, 3, 5, 2, 1],
- "fields": "myvector",
- "oversampling": 12.0,
- "k": 5
- }
- ]
-}
-```
-
-**Key points**:
-
-- Oversampling applies to vector fields that undergo vector compression, per the vector profile assignment.
-
-- Oversampling in the query overrides the `defaultOversampling` value in the index, or invokes oversampling and rescoring at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
-
-### [**2025-03-01-preview**](#tab/query-2025-03-01-preview)
-
-The latest preview API is identical to the previous preview API in terms of `vectorQueries` specification. As with the previous version, we recommend oversampling as mitigation for lossy compression.
-
-```http
-POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2025-03-01-preview
+POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?api-version=2025-09-01
{
"vectorQueries": [
@@ -462,5 +261,3 @@ POST https://[service-name].search.windows.net/indexes/demo-index/docs/search?ap
- Oversampling applies to vector fields that undergo vector compression, per the vector profile assignment.
- Oversampling in the query overrides the `defaultOversampling` value in the index, or invokes oversampling and rescoring at query time, even if the index's compression configuration didn't specify oversampling or reranking options.
-
----
Summary
{
"modification_type": "breaking change",
"modification_title": "ベクトル圧縮とリスコアリング手法の改訂"
}
Explanation
This change substantially revises and updates the content of the vector-search-how-to-quantization.md document. The main changes are as follows:
Date update: The document's last-updated date changed from March 31, 2025 to September 28, 2025, reflecting the latest information.
Detailed explanation of quantization methods:
- Scalar quantization: Scalar quantization represents each number within a vector as an 8-bit integer, compressing information from the original 16-bit or 32-bit floating point values. This significantly reduces storage requirements.
- Binary quantization: Binary quantization represents each component as a single bit, 0 or 1, which improves the efficiency of the vector comparisons that are critical to search and retrieval tasks.
Clarification of rescoring techniques: Rescoring is an optional technique that compensates for information lost to quantization; the document now explains how the initial results retrieved by a query are re-evaluated at query time.
Compatibility with new API versions: The document emphasizes that newly introduced settings and properties are mapped internally for backward compatibility and recommends adopting the newer properties in code.
Compression and rescoring settings: Compression and rescoring options were added so that rescoringOptions can be set in the index definition, enabling more flexible query execution. The specific parameters are enableRescoring, defaultOversampling, and rescoreStorageMethod.
As a result, users gain a deeper understanding of how vector compression behaves in Azure AI Search and can optimize performance and efficiency with the appropriate compression and rescoring settings. Overall, the document is clearer and more direct, improving usability.
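To make the quantization mechanics described above concrete, here's a minimal Python sketch. It is purely illustrative, not the service's internal implementation: the function names and the percentile bounds are assumptions, and Azure AI Search applies quantization automatically when a compression is configured on the index.

```python
import numpy as np

def scalar_quantize(vector: np.ndarray, num_bits: int = 8):
    """Map each float component to one of 2**num_bits bins (conceptual sketch)."""
    # Percentile bounds keep outliers from dominating the range, mirroring the
    # "99th percentile minimum and maximum" idea described in the doc.
    lo, hi = np.percentile(vector, [0.5, 99.5])
    levels = 2 ** num_bits - 1
    scaled = np.clip((vector - lo) / (hi - lo), 0.0, 1.0) * levels
    return np.round(scaled).astype(np.uint8), lo, hi  # bin identifiers plus the range

def binary_quantize(vector: np.ndarray) -> np.ndarray:
    """Represent each component as a single bit: 1 if >= 0, else 0."""
    return (vector >= 0).astype(np.uint8)

embedding = np.random.default_rng(0).normal(size=1536).astype(np.float32)
sq, lo, hi = scalar_quantize(embedding)  # 1536 bytes instead of 6144 bytes of float32
bq = binary_quantize(embedding)          # 1536 bits (~192 bytes) once packed
print(sq[:8], bq[:8])
```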
articles/search/vector-search-how-to-storage-options.md
Diff
@@ -42,7 +42,7 @@ For indexes created with the 2024-11-01-preview or a later API with uncompressed
If you choose [vector compression](vector-search-how-to-configure-compression-storage.md), AI Search compresses (quantizes) the in-memory portion of the vector index. Since memory is often a primary constraint for vector indexes, this practice allows you to store more vectors within the same search service. However, lossy compression equates to less information in the index, which can affect search quality.
-To mitigate the loss in information, you can [enable "rescoring" and "oversampling" options](vector-search-how-to-quantization.md#recommended-rescoring-techniques) to help maintain quality. The effect is retrieval of a larger set of candidate documents from the compressed index, with recomputation of similarity scores using the original vectors or the dot product. For rescoring to work, original vectors must be retained in storage for certain scenarios. As a result, while quantization reduces memory usage (vector index size usage), it slightly increases storage requirements since both compressed and original vectors are stored. The extra storage is approximately equal to the size of the compressed index.
+To mitigate the loss in information, you can [enable "rescoring" and "oversampling" options](vector-search-how-to-quantization.md#supported-rescoring-techniques) to help maintain quality. The effect is retrieval of a larger set of candidate documents from the compressed index, with recomputation of similarity scores using the original vectors or the dot product. For rescoring to work, original vectors must be retained in storage for certain scenarios. As a result, while quantization reduces memory usage (vector index size usage), it slightly increases storage requirements since both compressed and original vectors are stored. The extra storage is approximately equal to the size of the compressed index.
## Remove source vectors (JSON data)
Summary
{
"modification_type": "minor update",
"modification_title": "ストレージオプションに関するドキュメントの微調整"
}
Explanation
This change makes minor adjustments to the vector-search-how-to-storage-options.md document. The main changes are as follows:
Link update: The link for the rescoring and oversampling options now points to the correct anchor (vector-search-how-to-quantization.md#supported-rescoring-techniques), giving users access to the latest information about rescoring techniques.
Wording changes: The text was fine-tuned for accuracy and clarity. Specifically, the description of rescoring and oversampling as a way to mitigate information loss was updated, making the process for preserving search quality clearer.
These changes deepen understanding of the storage options in Azure AI Search and help users make effective use of vector index compression and the related rescoring capabilities. Overall, the document is more useful.
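For readers who want to see the oversampling-then-rescore flow end to end, the following Python sketch is a conceptual approximation only. It assumes dot-product similarity and in-memory NumPy arrays; the function name and the brute-force candidate selection are illustrative assumptions, since Azure AI Search performs these steps inside the service.

```python
import numpy as np

def search_with_rescoring(query: np.ndarray,
                          quantized_docs: np.ndarray,  # compressed vectors held in memory
                          original_docs: np.ndarray,   # full-precision vectors kept in storage
                          k: int = 5,
                          oversampling: float = 10.0) -> np.ndarray:
    """Retrieve k * oversampling candidates with approximate scores, then rescore with full precision."""
    candidate_count = int(k * oversampling)
    # First pass: cheap approximate scores over the compressed representation.
    approx_scores = quantized_docs.astype(np.float32) @ query
    candidates = np.argsort(-approx_scores)[:candidate_count]
    # Second pass: rescore only the oversampled candidates with the original vectors.
    exact_scores = original_docs[candidates] @ query
    return candidates[np.argsort(-exact_scores)][:k]

rng = np.random.default_rng(1)
docs = rng.normal(size=(1000, 128)).astype(np.float32)
top_k = search_with_rescoring(rng.normal(size=128).astype(np.float32),
                              quantized_docs=(docs >= 0).astype(np.uint8),
                              original_docs=docs)
print(top_k)
```

With `discardOriginals` for binary quantization, the second pass would instead reuse the binary candidates against the full-precision query vector, which is why the original vectors can be dropped in that scenario.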
articles/search/vector-search-how-to-truncate-dimensions.md
Diff
@@ -9,13 +9,10 @@ ms.update-cycle: 180-days
ms.custom:
- ignite-2024
ms.topic: how-to
-ms.date: 09/19/2025
+ms.date: 09/28/2025
---
-# Truncate dimensions using MRL compression (preview)
-
-> [!IMPORTANT]
-> This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). We recommend the latest [preview REST API version](/rest/api/searchservice/search-service-api-versions#preview-versions) for this feature.
+# Truncate dimensions using MRL compression
Exercise the ability to use fewer dimensions on text-embedding-3 models. On Azure OpenAI, text-embedding-3 models are retrained on the [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147) (MRL) technique that produces multiple vector representations at different levels of compression. This approach produces faster searches and reduced storage costs with minimal loss of semantic information.
@@ -34,23 +31,21 @@ MRL multilevel compression saves on vector storage and improves query response t
- [New vector fields](vector-search-how-to-create-index.md) of type `Edm.Half` or `Edm.Single`. You can't add MRL compression to an existing field.
-- [Hierarchical Navigable Small World (HNSW) algorithm](vector-search-ranking.md). This preview doesn't support exhaustive KNN.
+- [Hierarchical Navigable Small World (HNSW) algorithm](vector-search-ranking.md).
- [Scalar or binary quantization](vector-search-how-to-quantization.md). Truncated dimensions can be set only when scalar or binary quantization is configured. We recommend binary quantization for MRL compression.
### Supported clients
-You can use the REST APIs or Azure SDK beta packages to implement MRL compression. At this time, there's no Azure portal or Azure AI Foundry support.
-
-- [REST API 2024-09-01-preview](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true) or later. We recommend the latest preview API.
+You can use the REST APIs or Azure SDK packages to implement MRL compression. At this time, there's no Azure portal or Azure AI Foundry support.
-- Check the change logs for each Azure SDK beta package: [Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [.NET](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/CHANGELOG.md), [Java](https://github.com/Azure/azure-sdk-for-java/blob/azure-search-documents_11.1.3/sdk/search/azure-search-documents/CHANGELOG.md), [JavaScript](https://github.com/Azure/azure-sdk-for-js/blob/main/sdk/search/search-documents/CHANGELOG.md).
+- Check the change logs for each Azure SDK package for feature support: [Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [.NET](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/CHANGELOG.md), [Java](https://github.com/Azure/azure-sdk-for-java/blob/azure-search-documents_11.1.3/sdk/search/azure-search-documents/CHANGELOG.md), [JavaScript](https://github.com/Azure/azure-sdk-for-js/blob/main/sdk/search/search-documents/CHANGELOG.md).
## Use MRL-extended text embeddings
MRL is built into the text embedding model you're already using. To use MRL capabilities in Azure AI Search:
-1. Use [Create or Update Index (preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true) or an equivalent API to specify the index schema.
+1. Use [Create or Update Index](/rest/api/searchservice/indexes/create-or-update) or an equivalent API to specify the index schema.
1. [Add vector fields](vector-search-how-to-create-index.md) to the index definition.
@@ -100,22 +95,28 @@ The following example illustrates a vector search configuration that meets the r
{
"name": "use-mrl",
"kind": "truncation",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10,
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "preserveOriginals"
+ },
"truncationDimension": 1024
},
{
"name": "use-bq",
"kind": "binaryQuantization",
- "rerankWithOriginalVectors": true,
- "defaultOversampling": 10
+ "rescoringOptions": {
+ "enableRescoring": true,
+ "defaultOversampling": 10,
+ "rescoreStorageMethod": "discardOriginals"
+ }
}
]
}
}
```
-Here's an example of a [fully specified vector field definition](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true#searchfield) that satisfies the requirements for MRL. Recall that vector fields must:
+Here's an example of a [fully specified vector field definition](/rest/api/searchservice/indexes/create-or-update#searchfield) that satisfies the requirements for MRL. Recall that vector fields must:
- Be of type `Edm.Half` or `Edm.Single`.
Summary
{
"modification_type": "minor update",
"modification_title": "次元のトランケーションに関するMRL圧縮ドキュメントの改訂"
}
Explanation
This change makes minor adjustments to the vector-search-how-to-truncate-dimensions.md document. The main changes are as follows:
Date update: The document's last-updated date changed from September 19, 2025 to September 28, 2025, reflecting the latest information.
Title simplification: The "(preview)" qualifier was removed, so the title is now simply "Truncate dimensions using MRL compression." This signals that dimension truncation with MRL compression is generally available.
Content clarification: The benefits of dimension truncation with MRL (Matryoshka Representation Learning) are described in more detail, including reduced storage costs and faster searches.
Emphasis on rescoring options: The configuration example now uses rescoringOptions, specifying enableRescoring and its related parameters, which makes it easier to understand how to configure higher-quality results.
API version references: API version references were simplified, and the document now points to the current API rather than preview-only versions.
These changes make the dimension truncation guidance for MRL compression easier to follow and more convenient for Azure AI Search users. Overall, the accuracy and usefulness of the document are improved.
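Conceptually, MRL-trained embeddings concentrate information in the leading dimensions, which is what lets `truncationDimension` discard the tail with minimal semantic loss. The Python sketch below illustrates that idea with a hypothetical helper; in Azure AI Search you only set `truncationDimension` on the compression and the service handles truncation, and the renormalization shown here is an assumption that matters only if you compute cosine similarity yourself.

```python
import numpy as np

def truncate_mrl_embedding(embedding: np.ndarray, truncation_dimension: int = 1024) -> np.ndarray:
    """Keep only the leading MRL dimensions and renormalize for cosine similarity (conceptual sketch)."""
    truncated = embedding[:truncation_dimension]
    return truncated / np.linalg.norm(truncated)

full = np.random.default_rng(2).normal(size=3072).astype(np.float32)  # e.g., a text-embedding-3-large vector
reduced = truncate_mrl_embedding(full, truncation_dimension=1024)
print(full.shape, reduced.shape)  # (3072,) (1024,)
```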
articles/search/vector-search-overview.md
Diff
@@ -87,7 +87,7 @@ Azure AI Search is deeply integrated across the Azure AI platform. The following
| Azure AI Foundry | In the chat playground, **Add your own data** uses Azure AI Search for grounding data and conversational search. The playground is the easiest and fastest way to [chat with your data](/azure/ai-services/openai/use-your-data-quickstart). |
| Azure OpenAI | Azure OpenAI provides embedding models and chat models. Demos and samples target the [text-embedding-ada-002](/azure/ai-services/openai/concepts/models#embeddings-models) model. We recommend Azure OpenAI for generating embeddings for text. |
| Azure AI Services | [Image Retrieval Vectorize Image API (preview)](/azure/ai-services/computer-vision/how-to/image-retrieval#call-the-vectorize-image-api) supports vectorization of image content. We recommend this API for generating embeddings for images. |
-| Azure data platforms: Azure Blob Storage, Azure Cosmos DB, Azure SQL, OneLake | You can use [indexers](search-indexer-overview.md) to automate data ingestion, and then use [integrated vectorization](vector-search-integrated-vectorization.md) to generate embeddings. Azure AI Search can automatically index vector data from [Azure blob indexers](search-howto-indexing-azure-blob-storage.md), [Azure Cosmos DB for NoSQL indexers](search-howto-index-cosmosdb.md), [Azure Data Lake Storage Gen2](search-howto-index-azure-data-lake-storage.md), [Azure Table Storage](search-howto-indexing-azure-tables.md), and [Fabric OneLake](search-how-to-index-onelake-files.md). For more information, see [Add vector fields to a search index.](vector-search-how-to-create-index.md). |
+| Azure data platforms: Azure Blob Storage, Azure Cosmos DB, Azure SQL, Microsoft OneLake | You can use [indexers](search-indexer-overview.md) to automate data ingestion, and then use [integrated vectorization](vector-search-integrated-vectorization.md) to generate embeddings. Azure AI Search can automatically index vector data from [Azure blob indexers](search-howto-indexing-azure-blob-storage.md), [Azure Cosmos DB for NoSQL indexers](search-howto-index-cosmosdb.md), [Azure Data Lake Storage Gen2](search-howto-index-azure-data-lake-storage.md), [Azure Table Storage](search-howto-indexing-azure-tables.md), and [Microsoft OneLake](search-how-to-index-onelake-files.md). For more information, see [Add vector fields to a search index.](vector-search-how-to-create-index.md). |
It's also commonly used in open-source frameworks like [LangChain](https://js.langchain.com/docs/integrations/vectorstores/azure_aisearch).
Summary
{
"modification_type": "minor update",
"modification_title": "Azureデータプラットフォームの名称修正"
}
Explanation
This change corrects the name of an Azure data platform in the vector-search-overview.md document. The main changes are as follows:
Name consistency: "OneLake" is now "Microsoft OneLake," reinforcing brand-name consistency and making it clearer that the service is offered by Microsoft.
Context preserved: The service name is corrected without disrupting the surrounding context or flow of information, so the document remains accurate.
Overall, this change keeps the Azure AI Search information current and clear so that users can work with the service based on accurate knowledge.
articles/search/vector-search-vectorizer-azure-open-ai.md
Diff
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
ms.custom:
- build-2024
ms.topic: reference
-ms.date: 09/12/2025
+ms.date: 09/26/2025
ms.update-cycle: 365-days
---
@@ -39,7 +39,7 @@ Parameters are case-sensitive.
| Parameter name | Description |
|--------------------|-------------|
-| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. |
+| `resourceUri` | The URI of the model provider. This parameter only supports URLs with the `openai.azure.com` domain, such as `https://<resourcename>.openai.azure.com`. If your Azure OpenAI endpoint has a URL with the `cognitiveservices.azure.com` domain, such as `https://<resourcename>.cognitiveservices.azure.com`, you must create a [custom subdomain](/azure/ai-services/openai/how-to/use-your-data-securely#enabled-custom-subdomain) with `openai.azure.com` for the Azure OpenAI resource and use `https://<resourcename>.openai.azure.com` instead. [Azure API Management](/azure/api-management/api-management-key-concepts) endpoints are supported with URL `https://<resourcename>.azure-api.net `. Shared private links aren't supported for API Management endpoints. |
| `apiKey` | The secret key used to access the model. If you provide a key, leave `authIdentity` empty. If you set both the `apiKey` and `authIdentity`, the `apiKey` is used on the connection. |
| `deploymentId` | The name of the deployed Azure OpenAI embedding model. The model should be an embedding model, such as text-embedding-ada-002. See the [List of Azure OpenAI models](/azure/ai-services/openai/concepts/models) for supported models.|
| `authIdentity` | A user-managed identity used by the search service for connecting to Azure OpenAI. You can use either a [system or user managed identity](search-how-to-managed-identities.md). To use a system managed identity, leave `apiKey` and `authIdentity` blank. The system-managed identity is used automatically. A managed identity must have [Cognitive Services OpenAI User](/azure/ai-services/openai/how-to/role-based-access-control#azure-openai-roles) permissions to send text to Azure OpenAI. |
Summary
{
"modification_type": "minor update",
"modification_title": "Azure OpenAIリソースのカスタムサブドメインに関する情報の追加"
}
Explanation
This change fine-tunes a parameter description in the vector-search-vectorizer-azure-open-ai.md document. The main changes are as follows:
Date update: The document's last-updated date changed from September 12, 2025 to September 26, 2025, reflecting the latest information.
Expanded parameter description: The resourceUri description now covers Azure API Management endpoints. Specifically, URLs of the form https://<resourcename>.azure-api.net are supported, and shared private links aren't supported for API Management endpoints.
With this additional information, users can better understand the conditions and limitations that apply when connecting to an Azure OpenAI resource. Overall, the document is clearer and more informative, improving usability.
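As a quick illustration of the endpoint constraint described above, the following Python sketch is a hypothetical client-side check (not part of any Azure SDK) that classifies an endpoint against the domains the `resourceUri` parameter accepts:

```python
from urllib.parse import urlparse

# Accepted domains per the updated resourceUri description (illustrative constant).
SUPPORTED_SUFFIXES = (".openai.azure.com", ".azure-api.net")

def check_resource_uri(resource_uri: str) -> str:
    """Hypothetical helper that mirrors the documented resourceUri domain rules."""
    host = urlparse(resource_uri).hostname or ""
    if host.endswith(SUPPORTED_SUFFIXES):
        return f"OK: {host} is an accepted domain for resourceUri."
    if host.endswith(".cognitiveservices.azure.com"):
        return ("Create a custom subdomain on the openai.azure.com domain and use "
                "https://<resourcename>.openai.azure.com instead.")
    return f"Unsupported domain for resourceUri: {host}"

print(check_resource_uri("https://contoso.cognitiveservices.azure.com"))
print(check_resource_uri("https://contoso.openai.azure.com"))
print(check_resource_uri("https://contoso.azure-api.net"))
```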
articles/search/whats-new.md
Diff
@@ -4,7 +4,7 @@ description: Announcements of new and enhanced features, including a service ren
author: HeidiSteen
ms.author: heidist
manager: nitinme
-ms.date: 09/16/2025
+ms.date: 09/28/2025
ms.service: azure-ai-search
ms.topic: overview
ms.custom:
@@ -24,6 +24,17 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| Item | Type | Description |
|--|--|--|
+| [Search Service 2025-09-01](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-09-01&preserve-view=true) | REST | New stable REST API version supports general availability for Microsoft OneLake indexer, Document Layout skill, and other APIs. |
+| [Logic app workflow integration](search-how-to-index-logic-apps-indexers.md) | Indexer | Generally available. |
+| [OneLake indexer](search-how-to-index-onelake-files.md) | Indexer | Generally available. |
+| [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) | Indexer | Generally available. |
+| [Normalizers](search-normalizers.md) | Keyword search | Generally available. |
+| [Index description](search-agentic-retrieval-how-to-index.md#add-a-description) | Agentic search | Generally available. |
+| [Rescoring of binary quantized vectors](vector-search-how-to-quantization.md#supported-rescoring-techniques) | Vector search | Generally available. |
+| [Rescoring options for scalar compressed vectors](vector-search-how-to-quantization.md#supported-rescoring-techniques) | Vector search | Generally available. |
+| [Scoring profiles for semantically ranked results](semantic-how-to-enable-scoring-profiles.md) | Relevance | Generally available. |
+| [Truncate dimensions](vector-search-how-to-truncate-dimensions.md) | Vector search | Generally available. |
+| [Unpack `@search.score` to view subscores in hybrid search results](hybrid-search-ranking.md#unpack-a-search-score-into-subscores) | Hybrid search | Generally available. |
| [Updates to import wizards (Phase 1)](search-import-data-portal.md) | Portal | The Azure portal is undergoing a three-phase rollout to unify the two import wizards. For Phase 1, the **Import and vectorize data** wizard has been renamed to **Import data (new)** and redeveloped to support keyword search, modernizing the legacy **Import data** workflow with an improved interface and user experience. <p>**Import data (new)** isn't a direct replacement for the old wizard. For example, it supports fewer skills for keyword search and doesn't offer built-in sample data.<p>Both wizards are currently available, but **Import data** will be deprecated in a future phase. |
## August 2025
@@ -36,7 +47,7 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| [Answer synthesis (preview)](search-agentic-retrieval-how-to-synthesize.md) | Agentic search | New `answerSynthesis` modality for knowledge agents. When specified, an LLM generates a natural-language answer as an embedded step in the retrieval pipeline. This differs from the default `extractiveData` modality, which returns raw search results for downstream processing. |
| ["Fast path" for knowledge agents (preview)](search-knowledge-source-overview.md#attempt-fast-path-processing) | Agentic search | New `attemptFastPath` boolean for knowledge agents. Enables a shorter processing time if queries are concise and the initial response is sufficiently relevant. |
| [Retrieval instructions (preview)](search-agentic-retrieval-how-to-create.md) | Agentic search | New `retrievalInstructions` property for knowledge agents guides query planning in an agentic retrieval workflow. For example, you can specify criteria for including or excluding specific knowledge sources. |
-| [Improved indexer runtime tracking information (preview)](search-howto-run-reset-indexers.md#check-indexer-runtime-quota-for-s3-hd-search-services) | Indexers | Applies to Standard 3 High Density (S3 HD) services only. [Get Service Statistics](/rest/api/searchservice/get-service-statistics/get-service-statistics?view=rest-searchservice-2025-08-01-preview&preserve-view=true) response now provides cumulative indexer processing information for the entire service. [Get Status - Indexers](/rest/api/searchservice/get-service-statistics/get-status?view=rest-searchservice-2025-08-01-preview&preserve-view=true) provides the same information, but for a specific indexer. |
+| [Improved indexer runtime tracking information (preview)](search-howto-run-reset-indexers.md#check-indexer-runtime-quota-for-s3-hd-search-services) | Indexers | Applies to Standard 3 High Density (S3 HD) services only. [Get Service Statistics](/rest/api/searchservice/get-service-statistics/get-service-statistics?view=rest-searchservice-2025-08-01-preview&preserve-view=true) response now provides cumulative indexer processing information for the entire service. [Get Status - Indexers](/rest/api/searchservice/indexers/get-status?view=rest-searchservice-2025-08-01-preview&preserve-view=true) provides the same information, but for a specific indexer. |
| [Strict postfiltering for vector queries (preview)](vector-search-filters.md) | Vector search | New `strictPostFilter` mode for the `vectorFilterMode` parameter. When specified, filters are applied after the global top-`k` vector results are identified, ensuring that returned documents are a subset of the unfiltered results. |
| [Increased maximum dimensions for vector fields](search-limits-quotas-capacity.md#index-limits) | Vector search | The maximum dimensions per vector field are now `4096`. This update applies to all stable and preview REST API versions that support vectors and doesn't introduce breaking changes. |
@@ -63,7 +74,7 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| [GenAI prompt skill (preview)](cognitive-search-skill-genai-prompt.md) | Skills | A new skill that connects to a large language model (LLM) for information, using a prompt you provide. With this skill, you can populate a searchable field using content from an LLM. A primary use case for this skill is *image verbalization*, using an LLM to describe images and send the description to a searchable field in your index. |
| [Document Layout skill (preview)](cognitive-search-skill-document-intelligence-layout.md)| Skills | New parameters are available for this skill if you use the 2025-05-01-preview API version or later. New parameters support image offset metadata that improves the image search experience. |
| Import and vectorize data wizard enhancements | Portal | This wizard provides two paths for creating and populating vector indexes: [Retrieval Augmented Generation (RAG)](search-get-started-portal-import-vectors.md) and [Multimodal RAG](search-get-started-portal-image-search.md). Logic apps integration is through the RAG path. |
-| [Index "description" support (preview)](search-howto-reindex.md#add-an-index-description-preview) | REST | The latest preview API adds a description to an index. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on the index name alone. The description must be human readable and under four thousand characters.|
+| [Index "description" support (preview)](search-agentic-retrieval-how-to-index.md#add-a-description) | REST | The latest preview API adds a description to an index. Consider a Model Context Protocol (MCP) server that must pick the correct index at run time. The decision can be based on the description rather than on the index name alone. The description must be human readable and under four thousand characters.|
| [2025-05-01-preview](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-05-01-preview&preserve-view=true) | REST | New data plane preview REST API version providing programmatic access to the preview features announced in this release. |
## April 2025
@@ -79,7 +90,7 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| [Service upgrade (preview)](search-how-to-upgrade.md) | Service | Upgrade your search service to higher storage limits in your region. With a one-time upgrade, you no longer need to recreate your service. Available in [Upgrade Service (2025-02-01-preview)](/rest/api/searchmanagement/services/upgrade?view=rest-searchmanagement-2025-02-01-preview&preserve-view=true) and the Azure portal. |
| [Pricing tier change (preview)](search-capacity-planning.md#change-your-pricing-tier) | Service | Change the [pricing tier](search-sku-tier.md) of your search service. This provides flexibility to scale storage, increase request throughput, and decrease latency based on your needs. Initially, this preview only supported upgrades between Basic and Standard (S1, S2, and S3) tiers, but starting in July 2025, it supports upgrades *and* downgrades between these tiers. Available in [Update Service (2025-02-01-preview)](/rest/api/searchmanagement/services/update?view=rest-searchmanagement-2025-02-01-preview&preserve-view=true#searchupdateservicewithsku) and the Azure portal. |
| [Facet hierarchies, aggregations, and facet filters (preview)](search-faceted-navigation-examples.md) | Queries | New facet query parameters support nested facets. For numeric facetable fields, you can sum the values of each field. You can also specify filters on a facet to add inclusion or exclusion criteria. Available in [Search Documents (2025-03-01-preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2025-03-01-preview&preserve-view=true) and the Azure portal.|
-| [Rescore vector queries over binary quantization using full precision vectors (preview)](vector-search-how-to-quantization.md#recommended-rescoring-techniques) | Queries | For vector indexes that contain binary quantization, you can rescore query results using a full precision vector query. The query engine uses the dot product of the binary embeddings and the vector query for rescoring, which improves the quality of search results. Set `enableRescoring` and `discardOriginals` to use this feature, and call the latest preview API version on the request.|
+| [Rescore vector queries over binary quantization using full precision vectors (preview)](vector-search-how-to-quantization.md#supported-rescoring-techniques) | Queries | For vector indexes that contain binary quantization, you can rescore query results using a full precision vector query. The query engine uses the dot product of the binary embeddings and the vector query for rescoring, which improves the quality of search results. Set `enableRescoring` and `discardOriginals` to use this feature, and call the latest preview API version on the request.|
| [Semantic ranker prerelease models (preview)](semantic-how-to-configure.md#opt-in-for-prerelease-semantic-ranking-models) | Index | Opt in to use prerelease semantic ranker models if one happens to be available in your region. Available in [Create or Update Index (2025-03-01-preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2025-03-01-preview#semanticconfiguration&preserve-view=true).|
| [Search Service REST 2025-03-01-preview](/rest/api/searchservice/search-service-api-versions?view=rest-searchservice-2025-03-01-preview&preserve-view=true) | REST | Public preview release of REST APIs for data plane operations. Adds support for multi-vector embeddings, hierarchical facets, facet aggregation, and facet filters. |
| [Search Management 2025-02-01-preview](/rest/api/searchmanagement/management-api-versions?view=rest-searchmanagement-2025-02-01-preview&preserve-view=true) | REST | Public review release of REST APIs for control plane operations. Adds support for in-place upgrade to higher capacity partitions, in-place upgrade to higher tiers, and Azure Confidential Compute. |
@@ -97,17 +108,17 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| December | Template | [RAG chat with Azure AI Search + Python](https://azure.github.io/ai-app-templates/repo/azure-samples/azure-search-openai-demo/). An AI application template for building a RAG solution using Azure AI Search and Python. |
| November | Security | [Network security perimeter](search-security-network-security-perimeter.md). Join a search service to a [network security perimeter](/azure/private-link/network-security-perimeter-concepts) to control network access to your search service. The Azure portal and the Management REST APIs in the [2024-06-01-preview](/rest/api/searchmanagement/network-security-perimeter-configurations?view=rest-searchmanagement-2024-06-01-preview&preserve-view=true) can be used to view and reconcile network security perimeter configurations. |
| November | Security | [Shared private link support for Azure AI service connections](search-indexer-howto-access-private.md). Connections to Azure AI for built-in skills processing can now be private using a shared private link on the connection. |
-| November | Relevance | [Rescoring options for compressed vectors](/azure/search/vector-search-how-to-quantization?tabs=2024-11-01-preview%2Cquery-2024-07-01#add-compressions-to-a-search-index). You can set options to rescore with original vectors instead of compressed vectors. Applies to HNSW and exhaustive KNN vector algorithms, using binary and scalar compression. Available in the [Create or Update Index (2024-11-01-preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
+| November | Relevance | [Rescoring options for compressed vectors](vector-search-how-to-quantization.md#add-compressions-to-a-search-index). You can set options to rescore with original vectors instead of compressed vectors. Applies to HNSW and exhaustive KNN vector algorithms, using binary and scalar compression. Available in the [Create or Update Index (2024-11-01-preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
| November | Vector search | [Store fewer vector instances](vector-search-how-to-storage-options.md). In vector compression scenarios, you can omit storage of full precision vectors if you don't need them for rescoring. Available in the [Create or Update Index (2024-11-01-preview)](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
| November | Relevance | [Query rewrite in the semantic reranker](semantic-how-to-query-rewrite.md). You can set options on a semantic query to rewrite the query input into a revised or expanded query that generates more relevant results from the L2 ranker. Available in the [Search Documents (2024-11-01-preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature.|
| November | Relevance | [New semantic ranker models](semantic-search-overview.md). Semantic ranker runs with improved models in all supported regions. There's no change to APIs or the Azure portal experience. |
| November | Applied AI (skills) | [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md). A new skill used to analyze a document for structure and provide [structure-aware (paragraph) chunking](search-how-to-semantic-chunking.md). This skill calls Document Intelligence and uses the Document Intelligence layout model. Available in selected regions through the [Create or Update Skillset (2024-11-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
| November | Applied AI (skills) | [Keyless billing for Azure AI skills processing](cognitive-search-attach-cognitive-services.md). You can now use a managed identity and roles for a keyless connection to Azure AI services for built-in skills processing. This capability removes restrictions for having both search and AI services in the same region. Available in the [Create or Update Skillset (2024-11-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
-| November | Indexer data source | [Markdown parsing mode](search-how-to-index-markdown-blobs.md). With this parsing mode, indexers can generate one-to-one or one-to-many search documents from Markdown files in Azure Storage and OneLake. Available in the [Create or Update Indexer (2024-11-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
+| November | Indexer data source | [Markdown parsing mode](search-how-to-index-markdown-blobs.md). With this parsing mode, indexers can generate one-to-one or one-to-many search documents from Markdown files in Azure Storage and Microsoft OneLake. Available in the [Create or Update Indexer (2024-11-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
| November | API | [2024-11-01-preview](/rest/api/searchservice/search-service-api-versions?view=rest-searchservice-2024-11-01-preview&preserve-view=true). Preview release of REST APIs for query rewrite, Document Layout skill, keyless billing for skills processing, Markdown parsing mode, and rescoring options for compressed vectors. |
| November | Feature | [Portal support for structured data](search-get-started-portal-import-vectors.md). The **Import and vectorize data** wizard now supports Azure SQL, Azure Cosmos DB, and Azure Table Storage. |
| October | Feature | [Lower the dimension requirements for MRL-trained text embedding models on Azure OpenAI](vector-search-how-to-truncate-dimensions.md). Text-embedding-3-small and Text-embedding-3-large are trained using Matryoshka Representation Learning (MRL). This allows you to truncate the embedding vectors to fewer dimensions, and adjust the balance between vector index size usage and retrieval quality. A new `truncationDimension` in the [2024-09-01-preview](/rest/api/searchservice/indexes/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true) enables access to MRL compression in text embedding models. This can only be configured for new vector fields. |
-| October | Feature | [Unpack `@search.score` to view subscores in hybrid search results](hybrid-search-ranking.md#unpack-a-search-score-into-subscores-preview). You can investigate Reciprocal Rank Fusion (RRF) ranked results by viewing the individual query subscores of the final merged and scored result. A new `debug` property unpacks the search score. `QueryResultDocumentSubscores`, `QueryResultDocumentRerankerInput`, and `QueryResultDocumentSemanticField` provide the extra detail. These definitions are available in the [2024-09-01-preview](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
+| October | Feature | [Unpack `@search.score` to view subscores in hybrid search results](hybrid-search-ranking.md#unpack-a-search-score-into-subscores). You can investigate Reciprocal Rank Fusion (RRF) ranked results by viewing the individual query subscores of the final merged and scored result. A new `debug` property unpacks the search score. `QueryResultDocumentSubscores`, `QueryResultDocumentRerankerInput`, and `QueryResultDocumentSemanticField` provide the extra detail. These definitions are available in the [2024-09-01-preview](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
| October | Feature | [Target filters in a hybrid search to just the vector queries](hybrid-search-how-to-query.md#example-hybrid-search-with-filters-targeting-vector-subqueries-preview). A filter on a hybrid query involves all subqueries on the request, regardless of type. You can override the global filter to scope the filter to a specific subquery. The new `filterOverride` parameter is available on hybrid queries using the [2024-09-01-preview](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-09-01-preview&preserve-view=true) (see the sketch after this table). |
| October | Applied AI (skills) | [Text Split skill (token chunking)](cognitive-search-skill-textsplit.md). This skill has new parameters that improve data chunking for embedding models. A new `unit` parameter lets you specify token chunking. You can now chunk by token length, setting the length to a value that makes sense for your embedding model. You can also specify the tokenizer and any tokens that shouldn't be split during data chunking. The new `unit` parameter and query subscore definitions are found in the [2024-09-01-preview](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-09-01-preview&preserve-view=true). |
| October | API | [2024-09-01-preview](/rest/api/searchservice/search-service-api-versions?view=rest-searchservice-2024-09-01-preview&preserve-view=true). Preview release of REST APIs for truncated dimensions in text-embedding-3 models, targeted vector filtering for hybrid queries, RRF subscore details for debugging, and token chunking for Text Split skill.|
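The October rows above call out two request-level additions in the 2024-09-01-preview Search Documents API: the `debug` property for unpacking `@search.score` subscores and the `filterOverride` parameter for scoping a filter to a single vector subquery. The sketch below is a minimal, non-authoritative illustration of how those parameters might be combined in one hybrid query; the service URL, index name, field names, API key, and the embedded vector are placeholders, and the exact placement and accepted values of `debug` and `filterOverride` should be confirmed against the preview REST reference.

```python
import requests

# Placeholder values: replace with your own service, index, key, and a real
# query embedding produced by your embedding model.
SERVICE = "https://<your-service>.search.windows.net"
INDEX = "hotels-vector-index"          # hypothetical index name
API_KEY = "<query-api-key>"
API_VERSION = "2024-09-01-preview"     # preview version cited in the table above

body = {
    "search": "walking distance to the beach",
    "filter": "Category eq 'Resort'",            # global filter (applies to all subqueries)
    "debug": "all",                              # assumption: requests subscore detail for each hit
    "vectorQueries": [
        {
            "kind": "vector",
            "fields": "descriptionVector",       # hypothetical vector field
            "vector": [0.0] * 1536,              # placeholder embedding
            "k": 10,
            # Preview parameter from the table above: scope a different filter
            # to this vector subquery instead of the global filter.
            "filterOverride": "Rating ge 4"
        }
    ],
    "select": "HotelName,Category,Rating",
    "top": 10
}

response = requests.post(
    f"{SERVICE}/indexes/{INDEX}/docs/search",
    params={"api-version": API_VERSION},
    headers={"api-key": API_KEY, "Content-Type": "application/json"},
    json=body,
    timeout=30,
)
response.raise_for_status()
for doc in response.json().get("value", []):
    # With debug enabled, each hit should also carry subscore detail
    # (for example, QueryResultDocumentSubscores) alongside @search.score.
    print(doc.get("@search.score"), doc.get("HotelName"))
```

If the request succeeds, the per-document debug payload is where the BM25 and vector subscores that RRF merged into the final `@search.score` should surface.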
@@ -128,7 +139,7 @@ Learn about the latest updates to Azure AI Search functionality, docs, and sampl
| July | Accelerator | [Build your own copilot](https://github.com/microsoft/Build-your-own-AI-Assistant-Solution-Accelerator). Create your own custom copilot solution that empowers [Client Advisor](https://github.com/microsoft/Build-your-own-copilot-Solution-Accelerator) to harness the power of generative AI across both structured and unstructured data. Help customers optimize daily tasks and foster better interactions with more clients. |
| June | Feature | [Image search in the Azure portal](search-get-started-portal-image-search.md). Search explorer now supports image search. In a vector index that contains vectorized image content, you can drop images into Search Explorer to query for a match. |
| May | Service limits| [Higher capacity and more vector quota at every tier (same billing rate)](search-limits-quotas-capacity.md#service-limits). For most regions, partition sizes are now even larger for Standard 2 (S2), Standard 3 (S3), and Standard 3 High Density (S3 HD) for services created after April 3, 2024. To get the larger partitions, create a new service in a [region that provides newer infrastructure](search-region-support.md). <br><br>Storage Optimized tiers (L1 and L2) also have more capacity. L1 and L2 customers must create a new service to benefit from the higher capacity. There's no in-place upgrade at this time. <br><br>Extra capacity is now available in [more regions](search-limits-quotas-capacity.md#service-limits): Germany North, Germany West Central, South Africa North, Switzerland West, and Azure Government (Texas, Arizona, and Virginia). |
-| May | Feature | [OneLake integration (preview)](search-how-to-index-onelake-files.md). New indexer for OneLake files and OneLake shortcuts. If you use Microsoft Fabric and OneLake for data access to Amazon Web Services (AWS) and Google data sources, use this indexer to import external data into a search index. This indexer is available through the Azure portal, the [2024-05-01-preview REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true), and Azure SDK beta packages. |
+| May | Feature | [OneLake integration (preview)](search-how-to-index-onelake-files.md). New indexer for OneLake files and OneLake shortcuts. If you use Microsoft OneLake for data access to Amazon Web Services (AWS) and Google data sources, use this indexer to import external data into a search index. This indexer is available through the Azure portal, the [2024-05-01-preview REST API](/rest/api/searchservice/data-sources/create-or-update?view=rest-searchservice-2024-05-01-preview&preserve-view=true), and Azure SDK beta packages. |
| May | Feature | [Vector relevance](vector-search-how-to-query.md) <br>[hybrid query relevance](hybrid-search-how-to-query.md). Four enhancements improve vector and hybrid search relevance. <br><br>First, you can now set thresholds on vector search results to exclude low-scoring results. <br><br>Second, changes in the query architecture apply scoring profiles at the end of the query pipeline for every query type. Document boosting is a common scoring profile, and it now works as expected on vector and hybrid queries.<br><br>Third, you can set [`MaxTextRecallSize` and `countAndFacetMode`](hybrid-search-how-to-query.md#set-maxtextrecallsize-and-countandfacetmode) in hybrid queries to control the quantity of BM25-ranked search results that flow into the hybrid ranking model. <br><br>Fourth, for vector and hybrid search, you can weight a vector query to boost or diminish its importance in a multiquery request (see the sketch after this table). |
| May | Feature | [Binary vectors support](/rest/api/searchservice/supported-data-types). `Collection(Edm.Byte)` is a new supported data type. This data type opens up integration with the [Cohere v3 binary embedding models](https://cohere.com/blog/int8-binary-embeddings) and custom binary quantization. Narrow data types lower the cost of large vector datasets. See [Index binary data for vector search](vector-search-how-to-index-binary-data.md) for more information. |
| May | Skill | [Azure AI Vision multimodal embeddings skill (preview)](cognitive-search-skill-vision-vectorize.md). New skill that's bound to the [multimodal embeddings API of Azure AI Vision](/azure/ai-services/computer-vision/concept-image-retrieval). You can generate embeddings for text or images during indexing. This skill is available through the Azure portal and the [2024-05-01-preview REST API](/rest/api/searchservice/operation-groups?view=rest-searchservice-2024-05-01-preview&preserve-view=true). |
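The May rows above describe the 2024-05-01-preview relevance controls: a vector query `weight`, a similarity `threshold`, and the `MaxTextRecallSize` and `countAndFacetMode` settings for hybrid ranking. The following sketch reuses the same hypothetical index and placeholder values as the previous example and shows one plausible way these knobs fit into a single hybrid request; the property placement, threshold kind, and `countAndFacetMode` value are assumptions to verify against the preview REST reference.

```python
import requests

# Placeholder values: replace with your own service, index, key, and embedding.
SERVICE = "https://<your-service>.search.windows.net"
INDEX = "hotels-vector-index"          # hypothetical index name
API_KEY = "<query-api-key>"

body = {
    "search": "quiet room near the old town",
    "count": True,
    "vectorQueries": [
        {
            "kind": "vector",
            "fields": "descriptionVector",   # hypothetical vector field
            "vector": [0.0] * 1536,          # placeholder embedding
            "k": 10,
            "weight": 2.0,                   # boost this subquery's contribution to the fused ranking
            # Assumption: drop low-scoring vector matches below a similarity
            # of 0.8 before results are merged.
            "threshold": {"kind": "vectorSimilarity", "value": 0.8}
        }
    ],
    # Assumption: cap the BM25-ranked results that flow into hybrid ranking
    # and control how counts and facets are computed.
    "hybridSearch": {
        "maxTextRecallSize": 100,
        "countAndFacetMode": "countRetrievableResults"
    },
    "top": 10
}

response = requests.post(
    f"{SERVICE}/indexes/{INDEX}/docs/search",
    params={"api-version": "2024-05-01-preview"},
    headers={"api-key": API_KEY, "Content-Type": "application/json"},
    json=body,
    timeout=30,
)
response.raise_for_status()
print(response.json().get("@odata.count"), "documents matched")
```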
Summary
{
"modification_type": "minor update",
"modification_title": "Azure AI Searchの新機能と更新情報の追加"
}
Explanation
This change adds information about the latest Azure AI Search features and improvements to the "whats-new.md" document. The main changes are as follows:
Updated date: The document's last-updated date was changed from September 16, 2025 to September 28, 2025 so that it reflects the latest information.
New features added: Numerous new features and updates are documented. Specifically, the following items were included:
- Support for a new REST API version and the general availability of the Microsoft OneLake indexer.
- General availability of the Logic Apps integration, the Document Layout skill, normalization, and related capabilities.
- New vector search features and options, with expanded coverage of scoring profiles and dimension truncation in particular.
Improved content structure: Several feature descriptions were reorganized so they are clearer and easier to follow. For example, some descriptions are now more closely tied to specific links, making it easier for readers to reach related information.
Taken together, these updates let Azure AI Search users quickly grasp the newly added features and improvements and get more value from the service, while also increasing the document's reliability and usefulness.