Diff Insight Report - search

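Last updated: 2024-10-16

Usage notes

This post is a derivative work, adapted and summarized with generative AI from Microsoft's official Azure documentation (licensed under CC BY 4.0 or MIT). The original documents are hosted at MicrosoftDocs/azure-ai-docs.

Generative AI has limits, and this post may contain mistranslations or misinterpretations. Treat it as reference material only, and always consult the original documents for accurate information.

Trademarks used in this post belong to their respective owners. They are used for technical description only and do not imply official approval or endorsement by the trademark holders.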

View Diff on GitHub

Highlights

This change set updates several Azure AI Search documentation files. The main additions are a new redirection rule and a new article on defining index projections. The breaking change is the removal of the existing index-projections-concept-intro.md file.

New features

  • A redirection rule was added to the JSON redirection file so that readers of the removed article are sent to its replacement.
  • A new guide, "Define index projections", was published.

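Breaking changes

  • index-projections-concept-intro.md was removed. Its topic is now covered by search-how-to-define-index-projections.md, and the old URL redirects there.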

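Other updates

  • Minor updates across several files keep the content current.
  • The article on modeling complex data types was revised with fuller explanations of real-world use cases, and the article on search limits received a one-line correction.
  • The tutorial on index schemas for RAG solutions received small fixes that make it easier to follow.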

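Insights

This update keeps the developer documentation for Azure AI Search current. First, the new redirection rule sends readers who follow a link to the removed article on to its replacement, which keeps existing links and bookmarks from breaking.

A new article on defining index projections replaces the removed index-projections-concept-intro.md. It is an up-to-date, detailed guide to index projections in Azure AI Search and explains, with concrete examples, how to map and index parent-child content. This makes advanced indexing techniques, such as one-to-many indexing of chunked documents, easier to apply.

In addition, the existing articles on complex data types and search limits were revised, which makes the characteristics of indexing in Azure AI Search easier to understand. These updates help developers understand the service's capabilities and limits and choose an appropriate implementation strategy.

Overall, these changes improve the experience of reading the Azure AI Search documentation and keep it internally consistent.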

Summary Table

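| Filename | Type | Title | Status | A | D | M |
|---|---|---|---|---|---|---|
| .openpublishing.redirection.search.json | new feature | Addition of a redirection rule | modified | 5 | 0 | 5 |
| index-projections-concept-intro.md | breaking change | Removal of the index projections concept introduction | removed | 0 | 128 | 128 |
| search-get-started-rag.md | minor update | Update to the generative search quickstart | modified | 162 | 41 | 203 |
| search-how-to-define-index-projections.md | new feature | Addition of a new article on defining index projections | added | 402 | 0 | 402 |
| search-howto-complex-data-types.md | minor update | Update to the article on modeling complex data types | modified | 78 | 57 | 135 |
| search-limits-quotas-capacity.md | minor update | Fix to the article on search limits, quotas, and capacity | modified | 1 | 1 | 2 |
| toc.yml | minor update | Update to links in the table of contents file | modified | 2 | 2 | 4 |
| tutorial-rag-build-solution-index-schema.md | minor update | Fix to the tutorial on index schemas for RAG solutions | modified | 2 | 2 | 4 |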

Modified Contents

articles/search/.openpublishing.redirection.search.json

Diff
@@ -1,5 +1,10 @@
 {
     "redirections": [
+        {
+            "source_path_from_root": "/articles/search/index-projections-concept-intro.md",
+            "redirect_url": "/azure/search/search-how-to-define-index-projections",
+            "redirect_document_id": true
+        },
         {
             "source_path_from_root": "/articles/search/tutorial-javascript-overview.md",
             "redirect_url": "/azure/search/tutorial-csharp-overview",

Summary

{
    "modification_type": "new feature",
    "modification_title": "Addition of a redirection rule"
}

Explanation

This change adds a new redirection rule to the JSON redirection file. Specifically, it adds a redirect from /articles/search/index-projections-concept-intro.md to /azure/search/search-how-to-define-index-projections, with redirect_document_id set to true. With this rule in place, readers who request the old article are sent to the new URL. The change adds five lines to the file and removes none.
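To make the effect of the rule concrete, here is a minimal Python sketch of how a source path could be looked up in a redirection file of this shape. The JSON is reduced to the single entry added in this diff. The function names `build_redirect_map` and `resolve` are hypothetical and are not part of any Microsoft tooling, and the real publishing system may resolve redirects differently.

```python
import json
from typing import Dict, Optional

# Excerpt of the redirection file, reduced to the entry added in this diff.
REDIRECTION_JSON = """
{
    "redirections": [
        {
            "source_path_from_root": "/articles/search/index-projections-concept-intro.md",
            "redirect_url": "/azure/search/search-how-to-define-index-projections",
            "redirect_document_id": true
        }
    ]
}
"""


def build_redirect_map(raw_json: str) -> Dict[str, str]:
    """Map each source path to its redirect URL (hypothetical helper)."""
    rules = json.loads(raw_json)["redirections"]
    return {rule["source_path_from_root"]: rule["redirect_url"] for rule in rules}


def resolve(path: str, redirect_map: Dict[str, str]) -> Optional[str]:
    """Return the redirect target for a path, or None if no rule matches."""
    return redirect_map.get(path)


redirect_map = build_redirect_map(REDIRECTION_JSON)
target = resolve("/articles/search/index-projections-concept-intro.md", redirect_map)
print(target)
```

A path without a matching rule returns `None`, which is the case a link checker would want to flag.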

articles/search/index-projections-concept-intro.md

Diff
@@ -1,128 +0,0 @@
----
-title: Index projections concepts
-titleSuffix: Azure AI Search
-description: Index projections are a way to project enriched content created by an Azure AI Search skillset to a secondary index on the search service.
-author: careyjmac
-manager: jiantaosun
-ms.author: chalton
-ms.service: azure-ai-search
-ms.custom:
-  - ignite-2023
-ms.topic: conceptual
-ms.date: 08/05/2024
----
-
-# Index projections in Azure AI Search
-
-*Index projections* are a component of a skillset definition that defines the shape of a secondary index, supporting a one-to-many index pattern, where content from an enrichment pipeline can target multiple indexes.
-
-Index projections take AI-enriched content generated by an [enrichment pipeline](cognitive-search-concept-intro.md) and index it into a secondary index (different from the one that an indexer targets by default) on your search service. Index projections also allow you to reshape the data before indexing it, in a way that uniquely allows you to separate an array of enriched items into multiple search documents in the target index, otherwise known as "one-to-many" indexing. "One-to-many" indexing is useful for data chunking scenarios, where you might want a primary index for unchunked content and a secondary index for chunked.
-
-If you've used cognitive skills in the past, you already know that *skillsets* create enriched content. Skillsets move a document through a sequence of enrichments that invoke atomic transformations, such as recognizing entities or translating text. By default, one document processed within a skillset maps to a single document in the search index. This means that if you perform chunking of an input text and then perform enrichments on each chunk, the result in the index when mapped via outputFieldMappings is an array of the generated enrichments. With index projections, you define a context at which to map each chunk of enriched data to its own search document. This allows you to apply a one-to-many mapping of a document's enriched data to the search index.
-
-<!-- TODO diagram showcasing the one-to-many abilities of index projections. -->
-
-## Index projections definition
-
-Index projections are defined inside a skillset definition, and are primarily defined as an array of **selectors**, where each selector corresponds to a different target index on the search service. Each selector requires the following parameters as part of its definition:
-
-- `targetIndexName`: The name of the index on the search service that the index projection data index into. 
-- `parentKeyFieldName`: The name of the field in the target index that contains the value of the key for the parent document.
-- `sourceContext`: The enrichment annotation that defines the granularity at which to map data into individual search documents. For more information, see [Skill context and input annotation language](cognitive-search-skill-annotation-language.md).
-- `mappings`: An array of mappings of enriched data to fields in the search index. Each mapping consists of:
-    - `name`: The name of the field in the search index that the data should be indexed into,
-    - `source`: The enrichment annotation path that the data should be pulled from.
-
-Each `mapping` can also recursively define data with an optional `sourceContext` and `inputs` field, similar to the [knowledge store](knowledge-store-concept-intro.md) or [Shaper Skill](cognitive-search-skill-shaper.md). These parameters allow you to shape data to be indexed into fields of type `Edm.ComplexType` in the search index.
-
-The index defined in the `targetIndexName` parameter has the following requirements:
-- Must already have been created on the search service before the skillset containing the index projections definition is created.
-- Must contain a field with the name defined in the `parentKeyFieldName` parameter. This field must be of type `Edm.String`, can't be the key field, and must have filterable set to true.
-- The key field must have searchable set to true and be defined with the `keyword` analyzer.
-- Must have fields defined for each of the `name`s defined in `mappings`, none of which can be the key field.
-
-Here's an example payload for an index projections definition that you might use to project individual pages output by the [Split skill](cognitive-search-skill-textsplit.md) as their own documents in the search index.
-
-```json
-"indexProjections": {
-    "selectors": [
-        {
-            "targetIndexName": "myTargetIndex",
-            "parentKeyFieldName": "ParentKey",
-            "sourceContext": "/document/pages/*",
-            "mappings": [
-                {
-                    "name": "chunk",
-                    "source": "/document/pages/*"
-                }
-            ]
-        }
-    ]
-}
-```
-
-### Handling parent documents
-
-Because index projections effectively generate "child" documents for each "parent" document that runs through a skillset, you also have the following choices as to how to handle the indexing of the "parent" documents.
-
-- To keep parent and child documents in separate indexes, you would just ensure that the `targetIndexName` for your indexer definition is different from the `targetIndexName` defined in your index projection selector.
-- To index parent and child documents into the same index, you need to make sure that the schema for the target index works with both your defined `fieldMappings` and `outputFieldMappings` in your indexer definition and the `mappings` in your index projection selector. You would then just provide the same `targetIndexName` for your indexer definition and your index projection selector.
-- To ignore parent documents and only index child documents, you still need to provide a `targetIndexName` in your indexer definition (you can just provide the same one that you do for the index projection selector). Then define a separate `parameters` object next to your `selectors` definition with a `projectionMode` key set to `skipIndexingParentDocuments`, as shown here:
-
-    ```json
-    "indexProjections": {
-        "selectors": [
-            ...
-        ],
-        "parameters": {
-            "projectionMode": "skipIndexingParentDocuments"
-        }
-    }
-    ```
-
-### [**REST**](#tab/kstore-rest)
-
-Index projections are generally available. We recommend the most recent stable API.
-
-+ [Create Skillset (api-version=2024-07-01)](/rest/api/searchservice/skillsets/create)
-
-### [**.NET**](#tab/kstore-csharp)
-
-For .NET developers, use the [IndexProjections Class](/dotnet/api/azure.search.documents.indexes.models.searchindexerskillset.indexprojections?view=azure-dotnet-preview#azure-search-documents-indexes-models-searchindexerskillset-indexprojections&preserve-view=true) in the Azure.Search.Documents client library.
-
----
-
-## Content lifecycle
-
-If the indexer data source supports change tracking and deletion detection, the indexing process can synchronize the primary and secondary indexes to pick up those changes.
-
-Each time you run the indexer and skillset, the index projections are updated if the skillset or underlying source data has changed. Any changes picked up by the indexer are propagated through the enrichment process to the projections in the index, ensuring that your projected data is a current representation of content in the originating data source. 
-
-> [!NOTE]
-> While you can manually edit the data in the projected documents using the [index push API](search-how-to-load-search-index.md), any edits will be overwritten on the next pipeline invocation, assuming the document in source data is updated. 
-
-### Projected key value
-
-Each index projection document contains a unique identifying key that the indexer generates in order to ensure uniqueness and allow for change and deletion tracking to work correctly. This key contains the following segments:
-
-- A random hash to guarantee uniqueness. This hash changes if the parent document is updated across indexer runs.
-- The parent document's key.
-- The enrichment annotation path that identifies the context that that document was generated from.
-
-For example, if you split a parent document with key value "123" into four pages, and then each of those pages is projected as its own document via index projections, the key for the third page of text would look something like "01f07abfe7ed_123_pages_2". If the parent document is then updated to add a fifth page, the new key for the third page might, for example, be "9d800bdacc0e_123_pages_2", since the random hash value changes between indexer runs even though the rest of the projection data didn't change.
-
-### Changes or additions
-
-If a parent document is changed such that the data within a projected index document changes (an example would be if a word was changed in a particular page but no net new pages were added), the data in the target index for that particular projection is updated to reflect that change.
-
-If a parent document is changed such that there are new projected child documents that weren't there before (an example would be if one or more pages worth of text were added to the document), those new child documents are added next time the indexer runs.
-
-In both of these cases, all projected documents are updated to have a new hash value in their key, regardless of if their particular content was updated.
-
-### Deletions
-
-If a parent document is changed such that a child document generated by index projections no longer exists (an example would be if a text is shortened so there are fewer chunks than before), the corresponding child document in the search index is deleted. The remaining child documents also get their key updated to include a new hash value, even if their content didn't otherwise change.
-
-If a parent document is completely deleted from the datasource, the corresponding child documents only get deleted if the deletion is detected by a `dataDeletionDetectionPolicy` defined on the datasource definition. If you don't have a `dataDeletionDetectionPolicy` configured and need to delete a parent document from the datasource, then you should manually delete the child documents if they're no longer wanted. 
-
-<!-- TODO Next steps heading with link to BYOE documentation -->

Summary

{
    "modification_type": "breaking change",
    "modification_title": "Removal of the index projections concept introduction"
}

Explanation

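This change removes the index-projections-concept-intro.md file entirely (128 lines deleted). The file explained the concept of index projections in Azure AI Search in detail, including the definition of an index projection, the requirements on the target index, the handling of parent documents, and the content lifecycle. The removal is classified as a breaking change because the file itself no longer exists. The topic is still covered, however: the same change set adds search-how-to-define-index-projections.md along with a redirection rule that points the old URL to it.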
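The removed article listed four parameters that every selector requires (`targetIndexName`, `parentKeyFieldName`, `sourceContext`, and `mappings`) and showed a `projectionMode` of `skipIndexingParentDocuments`. The following Python sketch builds that same payload as a dictionary and checks it for the required parameters before serializing it. The helper names `missing_selector_keys` and `check_index_projections` are hypothetical. The check is a simplification that only verifies that the keys are present and that each mapping has a `name`; it does not reproduce the validation the search service performs.

```python
import json
from typing import Any, Dict, List

# Selector parameters that the removed article lists as required.
REQUIRED_SELECTOR_KEYS = ("targetIndexName", "parentKeyFieldName", "sourceContext", "mappings")


def missing_selector_keys(selector: Dict[str, Any]) -> List[str]:
    """Return the required selector parameters that are absent (hypothetical helper)."""
    return [key for key in REQUIRED_SELECTOR_KEYS if key not in selector]


def check_index_projections(index_projections: Dict[str, Any]) -> List[str]:
    """Collect human-readable problems found in an indexProjections payload."""
    problems = []
    for i, selector in enumerate(index_projections.get("selectors", [])):
        for key in missing_selector_keys(selector):
            problems.append(f"selectors[{i}] is missing '{key}'")
        for j, mapping in enumerate(selector.get("mappings", [])):
            if "name" not in mapping:
                problems.append(f"selectors[{i}].mappings[{j}] is missing 'name'")
    return problems


# Same values as the example payload in the removed article.
index_projections = {
    "selectors": [
        {
            "targetIndexName": "myTargetIndex",
            "parentKeyFieldName": "ParentKey",
            "sourceContext": "/document/pages/*",
            "mappings": [{"name": "chunk", "source": "/document/pages/*"}],
        }
    ],
    "parameters": {"projectionMode": "skipIndexingParentDocuments"},
}

problems = check_index_projections(index_projections)
print(problems)
print(json.dumps({"indexProjections": index_projections}, indent=2))
```

Running a check like this before sending a skillset definition can catch a forgotten `parentKeyFieldName` early, although the service remains the final authority on what is valid.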

articles/search/search-get-started-rag.md

Diff
@@ -6,20 +6,20 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: azure-ai-search
 ms.topic: quickstart
-ms.date: 09/16/2024
+ms.date: 10/14/2024
 ---
 
 # Quickstart: Generative search (RAG) with grounding data from Azure AI Search
 
-This quickstart shows you how to send queries to a Large Language Model (LLM) for a conversational search experience over your indexed content on Azure AI Search. You use the Azure portal to set up the resources, and then run Python code to call the APIs. 
+This quickstart shows you how to send basic and complex queries to a Large Language Model (LLM) for a conversational search experience over your indexed content on Azure AI Search. You use the Azure portal to set up the resources, and then run Python code to call the APIs. 
 
 ## Prerequisites
 
 - An Azure subscription. [Create one for free](https://azure.microsoft.com/free/).
 
 - [Azure AI Search](search-create-service-portal.md), Basic tier or higher so that you can [enable semantic ranker](semantic-how-to-enable-disable.md). Region must be the same one used for Azure OpenAI.
 
-- [Azure OpenAI](https://aka.ms/oai/access) resource with a deployment of `gpt-35-turbo`, `gpt-4`, or equivalent model, in the same region as Azure AI Search.
+- [Azure OpenAI](https://aka.ms/oai/access) resource with a deployment of `gpt-4o`, `gpt-4o-mini`, or equivalent LLM, in the same region as Azure AI Search.
 
 - [Visual Studio Code](https://code.visualstudio.com/download) with the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) and the [Jupyter package](https://pypi.org/project/jupyter/). For more information, see [Python in Visual Studio Code](https://code.visualstudio.com/docs/languages/python).
 
@@ -122,28 +122,68 @@ We recommend the hotels-sample-index, which can be created in minutes and runs o
 
 1. **Save** your changes.
 
-1. Run the following query in [Search Explorer](search-explorer.md) to test your index: `hotels near the ocean with beach access and good views`.
+1. Run the following query in [Search Explorer](search-explorer.md) to test your index: `complimentary breakfast`.
 
-   Output should look similar to the following example. Results that are returned directly from the search engine consist of fields and their verbatim values, along with metadata like a search score and a semantic ranking score and caption if you use semantic ranker.
+   Output should look similar to the following example. Results that are returned directly from the search engine consist of fields and their verbatim values, along with metadata like a search score and a semantic ranking score and caption if you use semantic ranker. We used a [select statement](search-query-odata-select.md) to return just the HotelName, Description, and Tags fields.
 
    ```
-      "@search.score": 5.600783,
-      "@search.rerankerScore": 2.4191176891326904,
-      "@search.captions": [
-        {
-          "text": "Contoso Ocean Motel. Budget. pool\r\nair conditioning\r\nbar. Oceanfront hotel overlooking the beach features rooms with a private balcony and 2 indoor and outdoor pools. Various shops and art entertainment are on the boardwalk, just steps away..",
-          "highlights": "Contoso Ocean Motel. Budget.<em> pool\r\nair conditioning\r\nbar. O</em>ceanfront hotel overlooking the beach features rooms with a private balcony and 2 indoor and outdoor pools. Various shops and art entertainment are on the boardwalk, just steps away."
-        }
-      ],
-      "HotelId": "41",
-      "HotelName": "Contoso Ocean Motel",
-      "Description": "Oceanfront hotel overlooking the beach features rooms with a private balcony and 2 indoor and outdoor pools. Various shops and art entertainment are on the boardwalk, just steps away.",
-      "Category": "Budget",
-      "Tags": [
-        "pool",
-        "air conditioning",
-        "bar"
-      ],
+   {
+   "@odata.count": 18,
+   "@search.answers": [],
+   "value": [
+      {
+         "@search.score": 2.2896252,
+         "@search.rerankerScore": 2.506816864013672,
+         "@search.captions": [
+         {
+            "text": "Head Wind Resort. Suite. coffee in lobby\r\nfree wifi\r\nview. The best of old town hospitality combined with views of the river and cool breezes off the prairie. Our penthouse suites offer views for miles and the rooftop plaza is open to all guests from sunset to 10 p.m. Enjoy a **complimentary continental breakfast** in the lobby, and free Wi-Fi throughout the hotel..",
+            "highlights": ""
+         }
+         ],
+         "HotelName": "Head Wind Resort",
+         "Description": "The best of old town hospitality combined with views of the river and cool breezes off the prairie. Our penthouse suites offer views for miles and the rooftop plaza is open to all guests from sunset to 10 p.m. Enjoy a complimentary continental breakfast in the lobby, and free Wi-Fi throughout the hotel.",
+         "Tags": [
+         "coffee in lobby",
+         "free wifi",
+         "view"
+         ]
+      },
+      {
+         "@search.score": 2.2158256,
+         "@search.rerankerScore": 2.288334846496582,
+         "@search.captions": [
+         {
+            "text": "Swan Bird Lake Inn. Budget. continental breakfast\r\nfree wifi\r\n24-hour front desk service. We serve a continental-style breakfast each morning, featuring a variety of food and drinks. Our locally made, oh-so-soft, caramel cinnamon rolls are a favorite with our guests. Other breakfast items include coffee, orange juice, milk, cereal, instant oatmeal, bagels, and muffins..",
+            "highlights": ""
+         }
+         ],
+         "HotelName": "Swan Bird Lake Inn",
+         "Description": "We serve a continental-style breakfast each morning, featuring a variety of food and drinks. Our locally made, oh-so-soft, caramel cinnamon rolls are a favorite with our guests. Other breakfast items include coffee, orange juice, milk, cereal, instant oatmeal, bagels, and muffins.",
+         "Tags": [
+         "continental breakfast",
+         "free wifi",
+         "24-hour front desk service"
+         ]
+      },
+      {
+         "@search.score": 0.92481667,
+         "@search.rerankerScore": 2.221315860748291,
+         "@search.captions": [
+         {
+            "text": "White Mountain Lodge & Suites. Resort and Spa. continental breakfast\r\npool\r\nrestaurant. Live amongst the trees in the heart of the forest. Hike along our extensive trail system. Visit the Natural Hot Springs, or enjoy our signature hot stone massage in the Cathedral of Firs. Relax in the meditation gardens, or join new friends around the communal firepit. Weekend evening entertainment on the patio features special guest musicians or poetry readings..",
+            "highlights": ""
+         }
+         ],
+         "HotelName": "White Mountain Lodge & Suites",
+         "Description": "Live amongst the trees in the heart of the forest. Hike along our extensive trail system. Visit the Natural Hot Springs, or enjoy our signature hot stone massage in the Cathedral of Firs. Relax in the meditation gardens, or join new friends around the communal firepit. Weekend evening entertainment on the patio features special guest musicians or poetry readings.",
+         "Tags": [
+         "continental breakfast",
+         "pool",
+         "restaurant"
+         ]
+      },
+      . . .
+   ]}
    ```
 
 ## Get service endpoints
@@ -169,29 +209,21 @@ This section uses Visual Studio Code and Python to call the chat completion APIs
 1. Install the following Python packages.
 
    ```python
-   ! pip install azure-search-documents==11.6.0b4 --quiet
-   ! pip install azure-identity==1.16.0 --quiet
+   ! pip install azure-search-documents==11.6.0b5 --quiet
+   ! pip install azure-identity==1.16.1 --quiet
    ! pip install openai --quiet
    ! pip intall aiohttp --quiet
+   ! pip intall ipykernel --quiet
    ```
 
 1. Set the following variables, substituting placeholders with the endpoints you collected in the previous step. 
 
    ```python
     AZURE_SEARCH_SERVICE: str = "PUT YOUR SEARCH SERVICE ENDPOINT HERE"
     AZURE_OPENAI_ACCOUNT: str = "PUT YOUR AZURE OPENAI ENDPOINT HERE"
-    AZURE_DEPLOYMENT_MODEL: str = "gpt-35-turbo"
+    AZURE_DEPLOYMENT_MODEL: str = "gpt-4o"
    ```
 
-1. Run the following code to set query parameters. The query is a keyword search using semantic ranking. In a keyword search, the search engine returns up to 50 matches, but only the top 5 are provided to the model. If you can't [enable semantic rankersemantic-how-to-enable-disable.md) on your search service, set the value to false.
-
-   ```python
-   # Set query parameters for grounding the conversation on your search index
-    search_type="text"
-    use_semantic_reranker=True
-    sources_to_include=5
-    ```
-
 1. Set up clients, the prompt, query, and response.
 
    ```python
@@ -227,7 +259,7 @@ This section uses Visual Studio Code and Python to call the chat completion APIs
     """
     
     # Query is the question being asked. It's sent to the search engine and the LLM.
-    query="Can you recommend a few hotels near the ocean with beach access and good views"
+    query="Can you recommend a few hotels with complimentary breakfast?"
     
     # Set up the search results and the chat thread.
     # Retrieve the selected fields from the search index related to the question.
@@ -254,12 +286,22 @@ This section uses Visual Studio Code and Python to call the chat completion APIs
     Output is from Azure OpenAI, and it consists of recommendations for several hotels. Here's an example of what the output might look like:
 
     ```
-    Based on your criteria, we recommend the following hotels:
-    
-    - Contoso Ocean Motel: located right on the beach and has private balconies with ocean views. They also have indoor and outdoor pools. It's located on the boardwalk near shops and art entertainment.
-    - Northwind Plaza & Suites: offers ocean views, free Wi-Fi, full kitchen, and a free breakfast buffet. Although not directly on the beach, this hotel has great views and is near the aquarium. They also have a pool.
-    
-    Several other hotels have views and water features, but do not offer beach access or views of the ocean.
+   Sure! Here are a few hotels that offer complimentary breakfast:
+   
+   - **Head Wind Resort**
+   - Complimentary continental breakfast in the lobby
+   - Free Wi-Fi throughout the hotel
+   
+   - **Double Sanctuary Resort**
+   - Continental breakfast included
+   
+   - **White Mountain Lodge & Suites**
+   - Continental breakfast available
+   
+   - **Swan Bird Lake Inn**
+   - Continental-style breakfast each morning with a variety of food and drinks 
+     such as caramel cinnamon rolls, coffee, orange juice, milk, cereal, 
+     instant oatmeal, bagels, and muffins
     ```
 
     If you get a **Forbidden** error message, check Azure AI Search configuration to make sure role-based access is enabled.
@@ -272,6 +314,85 @@ This section uses Visual Studio Code and Python to call the chat completion APIs
 
     You might also try the query without semantic ranking by setting `use_semantic_reranker=False` in the query parameters step. Semantic ranking can noticably improve the relevance of query results and the ability of the LLM to return useful information. Experimentation can help you decide whether it makes a difference for your content.
 
+## Send a complex RAG query
+
+Azure AI Search supports [complex types](search-howto-complex-data-types.md) for nested JSON structures. In the hotels-sample-index, `Address` is an example of a complex type, consisting of `Address.StreetAddress`, `Address.City`, `Address.StateProvince`, `Address.PostalCode`, and `Address.Country`. The index also has complex collection of `Rooms` for each hotel.
+
+If your index has complex types, your query can provide those fields if you first convert the search results output to JSON, and then pass the JSON to the LLM. The following example adds complex types to the request. The formatting instructions include a JSON specification.
+
+```python
+import json
+
+# Query is the question being asked. It's sent to the search engine and the LLM.
+query="Can you recommend a few hotels that offer complimentary breakfast? 
+Tell me their description, address, tags, and the rate for one room that sleeps 4 people."
+
+# Set up the search results and the chat thread.
+# Retrieve the selected fields from the search index related to the question.
+selected_fields = ["HotelName","Description","Address","Rooms","Tags"]
+search_results = search_client.search(
+    search_text=query,
+    top=5,
+    select=selected_fields,
+    query_type="semantic"
+)
+sources_filtered = [{field: result[field] for field in selected_fields} for result in search_results]
+sources_formatted = "\n".join([json.dumps(source) for source in sources_filtered])
+
+response = openai_client.chat.completions.create(
+    messages=[
+        {
+            "role": "user",
+            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
+        }
+    ],
+    model=AZURE_DEPLOYMENT_MODEL
+)
+
+print(response.choices[0].message.content)
+```
+
+Output is from Azure OpenAI, and it adds content from complex types.
+
+```
+Here are a few hotels that offer complimentary breakfast and have rooms that sleep 4 people:
+
+1. **Head Wind Resort**
+   - **Description:** The best of old town hospitality combined with views of the river and 
+   cool breezes off the prairie. Enjoy a complimentary continental breakfast in the lobby, 
+   and free Wi-Fi throughout the hotel.
+   - **Address:** 7633 E 63rd Pl, Tulsa, OK 74133, USA
+   - **Tags:** Coffee in lobby, free Wi-Fi, view
+   - **Room for 4:** Suite, 2 Queen Beds (Amenities) - $254.99
+
+2. **Double Sanctuary Resort**
+   - **Description:** 5-star Luxury Hotel - Biggest Rooms in the city. #1 Hotel in the area 
+   listed by Traveler magazine. Free WiFi, Flexible check in/out, Fitness Center & espresso 
+   in room. Offers continental breakfast.
+   - **Address:** 2211 Elliott Ave, Seattle, WA 98121, USA
+   - **Tags:** View, pool, restaurant, bar, continental breakfast
+   - **Room for 4:** Suite, 2 Queen Beds (Amenities) - $254.99
+
+3. **Swan Bird Lake Inn**
+   - **Description:** Continental-style breakfast featuring a variety of food and drinks. 
+   Locally made caramel cinnamon rolls are a favorite.
+   - **Address:** 1 Memorial Dr, Cambridge, MA 02142, USA
+   - **Tags:** Continental breakfast, free Wi-Fi, 24-hour front desk service
+   - **Room for 4:** Budget Room, 2 Queen Beds (City View) - $85.99
+
+4. **Gastronomic Landscape Hotel**
+   - **Description:** Known for its culinary excellence under the management of William Dough, 
+   offers continental breakfast.
+   - **Address:** 3393 Peachtree Rd, Atlanta, GA 30326, USA
+   - **Tags:** Restaurant, bar, continental breakfast
+   - **Room for 4:** Budget Room, 2 Queen Beds (Amenities) - $66.99
+...
+   - **Tags:** Pool, continental breakfast, free parking
+   - **Room for 4:** Budget Room, 2 Queen Beds (Amenities) - $60.99
+
+Enjoy your stay! Let me know if you need any more information.
+```
+
 ## Troubleshooting errors
 
 To debug authentication errors, insert the following code before the step that calls the search engine and the LLM.

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the generative search quickstart"
}

Explanation

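This change makes several notable updates to search-get-started-rag.md. The introduction now states that the quickstart covers sending both basic and complex queries to a Large Language Model (LLM). The recommended models change from gpt-35-turbo and gpt-4 to gpt-4o and gpt-4o-mini. The demo query changes from hotels near the ocean with beach access to hotels with complimentary breakfast, and the sample output is replaced to match. The step that set query parameters (search_type, use_semantic_reranker, sources_to_include) is removed, and a new section, "Send a complex RAG query", shows how to pass complex types such as Address and Rooms to the LLM as JSON.

Dependency versions are also updated: azure-search-documents moves from 11.6.0b4 to 11.6.0b5, azure-identity moves from 1.16.0 to 1.16.1, and ipykernel is added to the install list. Note that the added ipykernel line, like the existing aiohttp line, reads `pip intall`, so it would fail if run as written. Overall, the quickstart is brought in line with current models and package versions.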
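The key step in the new "Send a complex RAG query" section is converting each search result to JSON so that nested fields survive the trip to the LLM. The following Python sketch isolates that step using plain dictionaries in place of real search results, so it runs without an Azure subscription. The mock data is illustrative only: the hotel names and addresses are taken from the sample output in the diff, while the `Rooms` sub-field names (`Type`, `SleepsCount`, `BaseRate`) and the `HotelId` values are assumptions.

```python
import json

# Mock search results standing in for the items returned by search_client.search().
# Values are illustrative; the Rooms sub-fields and HotelId values are assumptions.
mock_results = [
    {
        "HotelId": "mock-1",
        "HotelName": "Head Wind Resort",
        "Description": "Enjoy a complimentary continental breakfast in the lobby.",
        "Address": {"StreetAddress": "7633 E 63rd Pl", "City": "Tulsa", "StateProvince": "OK"},
        "Rooms": [{"Type": "Suite", "SleepsCount": 4, "BaseRate": 254.99}],
        "Tags": ["coffee in lobby", "free wifi", "view"],
    },
    {
        "HotelId": "mock-2",
        "HotelName": "Swan Bird Lake Inn",
        "Description": "We serve a continental-style breakfast each morning.",
        "Address": {"StreetAddress": "1 Memorial Dr", "City": "Cambridge", "StateProvince": "MA"},
        "Rooms": [{"Type": "Budget Room", "SleepsCount": 4, "BaseRate": 85.99}],
        "Tags": ["continental breakfast", "free wifi", "24-hour front desk service"],
    },
]

# Same field selection and formatting as the quickstart's complex query example.
selected_fields = ["HotelName", "Description", "Address", "Rooms", "Tags"]
sources_filtered = [
    {field: result[field] for field in selected_fields} for result in mock_results
]

# One JSON document per line keeps the nested Address and Rooms structures intact.
sources_formatted = "\n".join(json.dumps(source) for source in sources_filtered)
print(sources_formatted)
```

Because `json.dumps` emits each result on a single line by default, the prompt receives one self-contained JSON object per hotel, and fields outside `selected_fields` are left out.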

articles/search/search-how-to-define-index-projections.md

Diff
@@ -0,0 +1,402 @@
+---
+title: Define index projections
+titleSuffix: Azure AI Search
+description: Index projections specify how parent-child content is mapped to fields in a search index when you use integrated vectorization for data chunking.
+author: HeidiSteen
+ms.author: heidist
+ms.service: azure-ai-search
+ms.custom:
+  - ignite-2023
+ms.topic: how-to
+ms.date: 10/10/2024
+---
+
+# Define an index projection for parent-child indexing
+
+For indexes containing chunked documents, an index projection specifies how parent-child content is mapped to fields in a search index for one-to-many indexing. Through an index projection, you can send content to:
+
+- A single index, where the parent fields repeat for each chunk, but the grain of the index is at the chunk level. The [RAG tutorial](tutorial-rag-build-solution-index-schema.md) is an example of this approach.
+
+- Two or more indexes, where the parent index has fields related to the parent document, and the child index is organized around chunks. The child index is the primary search corpus, but the parent index could be used for [lookup queries](/rest/api/searchservice/documents/get) when you want to retrieve the parent fields of a particular chunk, or for independent queries.
+
+Most implementations are a single index organized around chunks with parent fields, such as the document filename, repeating for each chunk. However, the system is designed to support separate and multiple child indexes if that's your requirement. Azure AI Search doesn't support index joins so your application code must handle which index to use.
+
+An index projection is defined in a [skillset](cognitive-search-working-with-skillsets.md). It's responsible for coordinating the indexing process that sends chunks of content to a search index, along with the parent content associated with each chunk. It improves how native data chunking works by giving your more options for controlling how parent-child content is indexed.
+
+This article explains how to create the index schema and indexer projection patterns for one-to-many indexing.
+
+## Prerequisites
+
+- An [indexer-based indexing pipeline](search-indexer-overview.md).
+
+- An index (one or more) that accepts the output of the indexer pipeline.
+
+- A [supported data source](search-indexer-overview.md#supported-data-sources) having content that you want to chunk. It can be vector or nonvector content.
+
+- A skill that splits content into chunks, either the [Text Split skill](cognitive-search-skill-textsplit.md) or a custom skill that provides equivalent functionality. 
+
+The skillset contains the indexer projection that shapes the data for one-to-many indexing. A skillset could also have other skills, such as an embedding skill like [AzureOpenAIEmbedding](cognitive-search-skill-azure-openai-embedding.md) if your scenario includes integrated vectorization.
+
+### Dependency on indexer processing
+
+One-to-many indexing takes a dependency on skillsets and indexer-based indexing that includes the following four components:
+
+- A data source
+- One or more indexes for your searchable content
+- A skillset that contains an index projection
+- An indexer
+
+Your data can originate from any supported data source, but the assumption is that the content is large enough that you want to chunk it, and the reason for chunking it is that you're implementing a RAG pattern that provides grounding data to a chat model. Or, you're implementing vector search and need to meet the smaller input size requirements of embedding models.
+
+Indexers load indexed data into a predefined index. How you define the schema and whether to use one or more indexes is the first decision to make in a one-to-many indexing scenario. The next section covers index design.
+
+## Create an index for one-to-many indexing
+
+Whether you create one index that combines parent-child fields, or multiple indexes for separating parent-child fields, the primary index used for searching is designed around data chunks. It must have the following fields:
+
+- A document key field uniquely identifying each document. It must be defined as type `Edm.String` with the `keyword` analyzer.
+
+- A field associating each chunk with its parent. It must be of type `Edm.String`. It can't be the document key field, and must have `filterable` set to true. It's referred to as parent_id in the examples and as a [projected key value](#projected-key-value) in this article.
+
+- Other fields for content, such as text or vectorized chunk fields.
+
+An index must exist on the search service before you create the skillset or run the indexer.
+
+### Single index schema inclusive of parent and child fields
+
+A single index designed around chunks with parent content repeating for each chunk is the predominant pattern for RAG and vector search scenarios. The ability to associate the correct parent content with each chunk is enabled through index projections.
+
+The following schema is an example that meets the requirements for index projections. In this example, parent fields are the parent_id and the title. Child fields are the vector and nonvector chunks. The chunk_id is the document ID of this index. The parent_id and title repeat for every chunk in the index.
+
+You can use the Azure portal, REST APIs, or an Azure SDK to [create an index](search-how-to-load-search-index.md).
+
+#### [**REST**](#tab/rest-create-index)
+
+```json
+{
+    "name": "my_consolidated_index",
+    "fields": [
+        {"name": "chunk_id", "type": "Edm.String", "key": true, "filterable": true, "analyzer": "keyword"},
+        {"name": "parent_id", "type": "Edm.String", "filterable": true},
+        {"name": "title", "type": "Edm.String", "searchable": true, "filterable": true, "sortable": true, "retrievable": true},
+        {"name": "chunk", "type": "Edm.String","searchable": true,"retrievable": true},
+        {"name": "chunk_vector", "type": "Collection(Edm.Single)", "searchable": true, "retrievable": false, "stored": false, "dimensions": 1536, "vectorSearchProfile": "hnsw"}
+    ],
+    "vectorSearch": {
+        "algorithms": [{"name": "hnsw", "kind": "hnsw", "hnswParameters": {}}],
+        "profiles": [{"name": "hnsw", "algorithm": "hnsw"}]
+    }
+}
+```
+
+#### [**Python**](#tab/python-create-index)
+
+This example is similar to the [RAG tutorial](tutorial-rag-build-solution-index-schema.md). It's an index schema designed for chunked content extracted from a parent document and combines all parent-child fields in the same index.
+
+```python
+ # Create a search index  
+ index_name = "my_consolidated_index"
+ index_client = SearchIndexClient(endpoint=AZURE_SEARCH_SERVICE, credential=credential)  
+ fields = [
+     SearchField(name="document_id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True, analyzer_name="keyword"),  
+     SearchField(name="parent_id", type=SearchFieldDataType.String, filterable=True),  
+     SearchField(name="title", type=SearchFieldDataType.String, searchable=True, sortable=False, filterable=True, facetable=False, retrievable=True), 
+     SearchField(name="chunk", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False, retrievable=True),  
+     SearchField(name="chunk_vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), searchable=True, retrievable=False, vector_search_dimensions=1024, vector_search_profile_name="myHnswProfile")
+     ]  
+
+ # Configure the vector search configuration  
+ vector_search = VectorSearch(  
+     algorithms=[  
+         HnswAlgorithmConfiguration(name="myHnsw"),
+     ],  
+     profiles=[  
+         VectorSearchProfile(  
+             name="myHnswProfile",  
+             algorithm_configuration_name="myHnsw",  
+             vectorizer_name="myOpenAI",  
+         )
+     ],  
+     vectorizers=[  
+         AzureOpenAIVectorizer(  
+             vectorizer_name="myOpenAI",  
+             kind="azureOpenAI",  
+             parameters=AzureOpenAIVectorizerParameters(  
+                 resource_url=AZURE_OPENAI_ACCOUNT,  
+                 deployment_name="text-embedding-3-large",
+                 model_name="text-embedding-3-large"
+             ),
+         ),  
+     ], 
+ )  
+
+ # Create the search index
+ index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)  
+ result = index_client.create_or_update_index(index)  
+ print(f"{result.name} created")  
+```
+
+---
+
+## Add index projections to a skillset
+
+Index projections are defined inside a skillset definition and are primarily defined as an array of `selectors`, where each selector corresponds to a different target index on the search service. Each selector requires the following parameters as part of its definition:
+
+| Parameter | Definition |
+|-----------|------------|
+| `targetIndexName` | The name of the index into which index data is projected. It's either the single chunked index with repeating parent fields, or it's the child index if you're using [separate indexes](#example-of-separate-parent-child-indexes) for parent-child content. |
+| `parentKeyFieldName` | The name of the field providing the key for the parent document.|
+| `sourceContext` | The enrichment annotation that defines the granularity at which to map data into individual search documents. For more information, see [Skill context and input annotation language](cognitive-search-skill-annotation-language.md). |
+| `mappings` | An array of mappings of enriched data to fields in the search index. Each mapping consists of: <br>`name`: The name of the field in the search index that the data should be indexed into. <br>`source`: The enrichment annotation path that the data should be pulled from. <br><br>Each `mapping` can also recursively define data with an optional `sourceContext` and `inputs` field, similar to the [knowledge store](knowledge-store-concept-intro.md) or [Shaper Skill](cognitive-search-skill-shaper.md). Depending on your application, these parameters allow you to shape data into fields of type `Edm.ComplexType` in the search index. Some LLMs don't accept a complex type in search results, so the LLM you're using determines whether a complex type mapping is helpful or not.|
+
+You must explicitly map every field in the child index, except for the ID fields such as document key and the parent ID. 
+
+This requirement is in contrast with other field mapping conventions in Azure AI Search. For some data source types, the indexer can implicitly map fields based on similar names, or known characteristics (for example, blob indexers use the unique metadata storage path as the default document key). However, for indexer projections, you must explicitly specify every field mapping on the "many" side of the relationship.
+
+<!-- Avoid creating a field mapping for the parent key field. Doing so disrupts change tracking and synchronized data refresh. -->
+
+#### [**REST**](#tab/rest-create-index-projection)
+
+Index projections are generally available. We recommend the most recent stable API:
+
+- [Create Skillset (api-version=2024-07-01)](/rest/api/searchservice/skillsets/create)
+
+Here's an example payload for an index projections definition that you might use to project individual pages output by the [Text Split skill](cognitive-search-skill-textsplit.md) as their own documents in the search index.
+
+```json
+"indexProjections": {
+    "selectors": [
+        {
+            "targetIndexName": "my_consolidated_index",
+            "parentKeyFieldName": "parent_id",
+            "sourceContext": "/document/pages/*",
+            "mappings": [
+                {
+                    "name": "chunk",
+                    "source": "/document/pages/*",
+                    "sourceContext": null,
+                    "inputs": []
+                },
+                {
+                    "name": "chunk_vector",
+                    "source": "/document/pages/*/chunk_vector",
+                    "sourceContext": null,
+                    "inputs": []
+                },
+                {
+                    "name": "title",
+                    "source": "/document/title",
+                    "sourceContext": null,
+                    "inputs": []
+                }
+            ]
+        }
+    ],
+    "parameters": {
+        "projectionMode": "skipIndexingParentDocuments"
+    }
+}
+```
+
+#### [**Python**](#tab/python-create-index-projection)
+
+```python
+index_projections = SearchIndexerIndexProjection(  
+    selectors=[  
+        SearchIndexerIndexProjectionSelector(  
+            target_index_name=index_name,  
+            parent_key_field_name="parent_id",  
+            source_context="/document/pages/*",  
+            mappings=[  
+                InputFieldMappingEntry(name="chunk", source="/document/pages/*"),  
+                InputFieldMappingEntry(name="chunk_vector", source="/document/pages/*/chunk_vector"),
+                InputFieldMappingEntry(name="title", source="/document/title")
+            ],  
+        ),  
+    ],  
+    parameters=SearchIndexerIndexProjectionsParameters(  
+        projection_mode=IndexProjectionMode.SKIP_INDEXING_PARENT_DOCUMENTS  
+    ),  
+) 
+```
+
+#### [**.NET**](#tab/dotnet-create-index)
+
+For .NET developers, use the [IndexProjections Class](/dotnet/api/azure.search.documents.indexes.models.searchindexerskillset.indexprojection?view=azure-dotnet&preserve-view=true) in the Azure.Search.Documents client library.
+
+---
+
+> [!TIP]
+> We recommend setting the `skipIndexingParentDocuments` parameter for the consolidated schema scenario. If you don't set parameters for skipping parent document indexing, you get extra search documents in your index that are null for chunks, but populated with parent fields only. For example, if five documents contribute 100 chunks to the index, then the number of documents in the index is 105. The five documents created for parent fields have nulls for child fields, making them substantially different from the bulk of the documents in the index.
+
+## Handling parent documents
+
+Now that you've seen several patterns for one-to-many indexing, let's compare the key differences between the options. Index projections effectively generate "child" documents for each "parent" document that runs through a skillset. You have several choices for handling the "parent" documents.
+
+- To send parent and child documents to separate indexes, set the `targetIndexName` for your indexer definition to the parent index, and set the `targetIndexName` in the index projection selector to the child index.
+
+- To keep parent and child documents in the same index, set the indexer `targetIndexName` and the index projection `targetIndexName` to the same index.
+
+- To avoid creating parent search documents and ensure the index contains only child documents of a uniform grain, set the `targetIndexName` for both the indexer definition and the selector to the same index, but add an extra `parameters` object after `selectors`, with a `projectionMode` key set to `skipIndexingParentDocuments`, as shown here:
+
+    ```json
+    "indexProjections": {
+        "selectors": [
+            ...
+        ],
+        "parameters": {
+            "projectionMode": "skipIndexingParentDocuments"
+        }
+    }
+    ```
+
+## Review field mappings
+
+Indexers are affiliated with three different types of field mappings. Before you run the indexer, check your field mappings and know when to use each type.
+
+[Field mappings](search-indexer-field-mappings.md) are defined in an indexer and used to map a source field to an index field. Field mappings are used for data paths that lift data from the source and pass it in for indexing, with no intermediate skills processing step. Typically, an indexer can automatically map fields that have the same name and type. Explicit field mappings are only required when there are discrepancies. In one-to-many indexing and the patterns discussed thus far, you might not need field mappings.
+
+[Output field mappings](cognitive-search-output-field-mapping.md) are defined in an indexer and used to map enriched content generated by a skillset to a field in the main index. In the one-to-many patterns covered in this article, this is the parent index in a [two-index solution](#example-of-separate-parent-child-indexes). In the examples shown in this article, the parent index is sparse, with just a title field, and that field isn't populated with content from the skillset processing, so we don't need an output field mapping.
+
+Indexer projection field mappings are used to map skillset-generated content to fields in the child index. In cases where the child index also includes parent fields (as in the [consolidated index solution](#single-index-schema-inclusive-of-parent-and-child-fields)), you should set up field mappings for every field that has content, including the parent-level title field, assuming you want the title to show up in each chunked document. If you're using [separate parent and child indexes](#example-of-separate-parent-child-indexes), the indexer projections should have field mappings for just the child-level fields.
+
+> [!NOTE]
+> Both output field mappings and indexer projection field mappings accept enriched document tree nodes as source inputs. Knowing how to specify a path to each node is essential to setting up the data path. To learn more about path syntax, see [Reference a path to enriched nodes](cognitive-search-concept-annotations-syntax.md) and [skillset definition](cognitive-search-working-with-skillsets.md#skillset-definition) for examples.
+
+## Run the indexer
+
+Once you have created a data source, indexes, and skillset, you're ready to [create and run the indexer](search-howto-create-indexers.md#run-the-indexer). This step puts the pipeline into execution. 
+
+You can query your search index after processing concludes to test your solution.
+
+## Content lifecycle
+
+Depending on the underlying data source, an indexer can usually provide ongoing change tracking and deletion detection. This section explains the content lifecycle of one-to-many indexing as it relates to data refresh.
+
+For data sources that provide change tracking and deletion detection, an indexer process can pick up changes in your source data. Each time you run the indexer and skillset, the index projections are updated if the skillset or underlying source data has changed. Any changes picked up by the indexer are propagated through the enrichment process to the projections in the index, ensuring that your projected data is a current representation of content in the originating data source. Data refresh activity is captured in a projected key value for each chunk. This value gets updated when the underlying data changes.
+
+> [!NOTE]
+> While you can manually edit the data in the projected documents using the [index push API](search-how-to-load-search-index.md), you should avoid doing so. Manual updates to an index are overwritten on the next pipeline invocation, assuming the document in source data is updated and the data source has change tracking or deletion detection enabled. 
+
+### Updated content
+
+If you add new content to your data source, new chunks or child documents are added to the index on the next indexer run.
+
+If you modify existing content in the data source, chunks are updated incrementally in the search index if the data source you're using supports change tracking and deletion detection. For example, if a word or sentence changes in a document, the chunk in the target index that contains that word or sentence is updated on the next indexer run. Other types of updates, such as changing a field type and some attributions, aren't supported for existing fields. For more information about allowed updates, see [Change an index schema](search-howto-reindex.md#change-an-index-schema).
+
+Some data sources like [Azure Storage](search-howto-index-changed-deleted-blobs.md) support change and deletion tracking by default, based on the timestamp. Other data sources such as [OneLake](search-how-to-index-onelake-files.md), [Azure SQL](search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.md), or [Azure Cosmos DB](search-howto-index-cosmosdb.md) must be configured for change tracking.
+
+### Deleted content
+
+If the source content no longer exists (for example, if text is shortened to have fewer chunks), the corresponding child document in the search index is deleted. The remaining child documents also get their key updated to include a new hash value, even if their content didn't otherwise change.
+
+If a parent document is completely deleted from the datasource, the corresponding child documents only get deleted if the deletion is detected by a `dataDeletionDetectionPolicy` defined on the datasource definition. If you don't have a `dataDeletionDetectionPolicy` configured and need to delete a parent document from the datasource, then you should manually delete the child documents if they're no longer wanted. 
+
+### Projected key value
+
+To ensure data integrity for updated and deleted content, data refresh in one-to-many indexing relies on a *projected key value* on the "many" side. If you're using integrated vectorization or the [Import and vectorize data wizard](search-import-data-portal.md), the projected key value is the `parent_id` field on the chunked or "many" side of the index.
+
+A projected key value is a unique identifier that the indexer generates for each document. It ensures uniqueness and allows for change and deletion tracking to work correctly. This key contains the following segments:
+
+- A random hash to guarantee uniqueness. This hash changes if the parent document is updated on subsequent indexer runs.
+- The parent document's key.
+- The enrichment annotation path that identifies the context that the document was generated from.
+
+For example, suppose you split a parent document with key value "aa1b22c33" into four pages, and each of those pages is then projected as its own document via index projections:
+
+- aa1b22c33
+- aa1b22c33_pages_0
+- aa1b22c33_pages_1
+- aa1b22c33_pages_2
+
+If the parent document is updated in the source data, perhaps resulting in more chunked pages, the random hash changes, more pages are added, and the content of each chunk is updated to match whatever is in the source document.
+
+## Example of separate parent-child indexes
+
+This section shows examples for separate parent and child indexes. It's an uncommon pattern, but it's possible you might have application requirements that are best met using this approach. In this scenario, you're projecting parent-child content into two separate indexes.
+
+Each schema has the fields for its particular grain, with the parent ID field common to both indexes for use in a [lookup query](/rest/api/searchservice/documents/get). The primary search corpus is the child index, but you can then issue a lookup query to retrieve the parent fields for each match in the result. Azure AI Search doesn't support joins at query time, so your application code or orchestration layer would need to merge or collate results that can be passed to an app or process.
+
+The parent index has a parent_id field and title. The parent_id is the document key. You don't need vector search configuration unless you want to vectorize fields at the parent document level.
+
+```json
+{
+    "name": "my-parent-index",
+    "fields": [
+        {"name": "parent_id", "type": "Edm.String", "key": true, "filterable": true},
+        {"name": "title", "type": "Edm.String", "searchable": true, "filterable": true, "sortable": true, "retrievable": true}
+    ]
+}
+```
+
+The child index has the chunked fields, plus the parent_id field. If you're using integrated vectorization, scoring profiles, semantic ranker, or analyzers, you would set these in the child index.
+
+```json
+{
+    "name": "my-child-index",
+    "fields": [
+        {"name": "chunk_id", "type": "Edm.String", "key": true, "filterable": true, "analyzer": "keyword"},
+        {"name": "parent_id", "type": "Edm.String", "filterable": true},
+        {"name": "chunk", "type": "Edm.String","searchable": true,"retrievable": true},
+        {"name": "chunk_vector", "type": "Collection(Edm.Single)", "searchable": true, "retrievable": false, "stored": false, "dimensions": 1536, "vectorSearchProfile": "hnsw"}
+    ],
+    "vectorSearch": {
+        "algorithms": [{"name": "hnsw", "kind": "hnsw", "hnswParameters": {}}],
+        "profiles": [{"name": "hnsw", "algorithm": "hnsw"}]
+    },
+    "scoringProfiles": [],
+    "semanticConfiguration": [],
+    "analyzers": []
+}
+```
+
+Here's an example of an index projection definition that specifies the data path the indexer should use to index content. It specifies the child index name in the index projection definition, and it specifies the mappings of every child or chunk-level field. This is the only place the child index name is specified. 
+
+```json
+"indexProjections": {
+    "selectors": [
+        {
+            "targetIndexName": "my-child-index",
+            "parentKeyFieldName": "parent_id",
+            "sourceContext": "/document/pages/*",
+            "mappings": [
+                {
+                    "name": "chunk",
+                    "source": "/document/pages/*",
+                    "sourceContext": null,
+                    "inputs": []
+                },
+                {
+                    "name": "chunk_vector",
+                    "source": "/document/pages/*/chunk_vector",
+                    "sourceContext": null,
+                    "inputs": []
+                }
+            ]
+        }
+    ]
+}
+```
+
+The indexer definition specifies the components of the pipeline. In the indexer definition, the index name to provide is the parent index. If you need field mappings for the parent-level fields, define them in outputFieldMappings. For one-to-many indexing that uses separate indexes, the indexer definition might look like the following example. 
+
+```json
+{
+  "name": "my-indexer",
+  "dataSourceName": "my-ds",
+  "targetIndexName": "my-parent-index",
+  "skillsetName" : "my-skillset",
+  "parameters": { },
+  "fieldMappings": (optional) Maps fields in the underlying data source to fields in an index,
+  "outputFieldMappings" : (required) Maps skill outputs to fields in an index
+}
+```
+
+## Next step
+
+Data chunking and one-to-many indexing are part of the RAG pattern in Azure AI Search. Continue on to the following tutorial and code sample to learn more about it.
+
+> [!div class="nextstepaction"]
+> [How to build a RAG solution using Azure AI Search](tutorial-rag-build-solution.md)
\ No newline at end of file

Summary

{
    "modification_type": "new feature",
    "modification_title": "Added a new document on defining index projections"
}

Explanation

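This change adds a new file, search-how-to-define-index-projections.md. The document explains in detail how to map parent-child content to fields in a search index, focusing on defining index projections for chunked data. It covers the structure of a parent-child index built on a single index, the alternative of using multiple indexes, and the steps for configuring index projections in a skillset.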
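
The new document requires every field on the "many" side to be mapped explicitly, except the document key and the parent key field. A minimal sketch of a pre-flight check for that rule follows. It works on plain dictionaries shaped like the REST payloads in the diff, and `unmapped_fields` is a hypothetical helper written for illustration, not part of any Azure SDK.

```python
# Hypothetical helper for illustration only; it is not an Azure SDK function.
def unmapped_fields(index_schema: dict, selector: dict) -> list[str]:
    """Return index fields that the selector does not map explicitly."""
    # The document key and the parent key field are exempt from the
    # explicit-mapping requirement, so exclude them from the check.
    exempt = {f["name"] for f in index_schema["fields"] if f.get("key")}
    exempt.add(selector["parentKeyFieldName"])
    mapped = {m["name"] for m in selector["mappings"]}
    return [
        f["name"]
        for f in index_schema["fields"]
        if f["name"] not in exempt and f["name"] not in mapped
    ]


# Index schema reduced to the attributes the check needs.
index_schema = {
    "name": "my_consolidated_index",
    "fields": [
        {"name": "chunk_id", "type": "Edm.String", "key": True},
        {"name": "parent_id", "type": "Edm.String"},
        {"name": "title", "type": "Edm.String"},
        {"name": "chunk", "type": "Edm.String"},
        {"name": "chunk_vector", "type": "Collection(Edm.Single)"},
    ],
}

# A selector that forgets to map the parent-level title field.
selector = {
    "targetIndexName": "my_consolidated_index",
    "parentKeyFieldName": "parent_id",
    "sourceContext": "/document/pages/*",
    "mappings": [
        {"name": "chunk", "source": "/document/pages/*"},
        {"name": "chunk_vector", "source": "/document/pages/*/chunk_vector"},
    ],
}

print(unmapped_fields(index_schema, selector))  # ['title']
```

Running a check like this before creating the skillset could catch a missing mapping such as `title`, which would otherwise leave that field empty in every chunk document.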

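The document also covers the prerequisites for indexing, how to create one or more indexes, and how to add index projections to a skillset, with concrete JSON and Python code examples. It also describes how parent documents are handled and how the content lifecycle is managed, which helps readers use index projections effectively. This new document should be a valuable resource for developers who want to use these Azure AI Search features.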
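
The three ways of handling parent documents described in the new document differ mainly in how many search documents land in each index. The following sketch reproduces the arithmetic from the document's tip (five parents contributing 100 chunks yield 105 documents unless parent indexing is skipped). The function `expected_document_counts` and its parameters are hypothetical names chosen for this illustration, and the model assumes one search document per chunk and per parent.

```python
# Hypothetical model for illustration; names are not from the Azure SDK.
def expected_document_counts(
    chunks_per_parent: list[int],
    indexer_target: str,
    selector_target: str,
    skip_parents: bool = False,
) -> dict[str, int]:
    """Estimate search documents per index for a one-to-many indexing run."""
    # Every chunk becomes one child document in the selector's target index.
    counts = {selector_target: sum(chunks_per_parent)}
    # Unless projectionMode is skipIndexingParentDocuments, each parent also
    # becomes one document in the indexer's target index.
    if not skip_parents:
        counts[indexer_target] = counts.get(indexer_target, 0) + len(chunks_per_parent)
    return counts


chunks = [20] * 5  # five parent documents, 100 chunks in total

# Same index, parents included: chunk documents plus sparse parent documents.
print(expected_document_counts(chunks, "my_consolidated_index", "my_consolidated_index"))
# {'my_consolidated_index': 105}

# Same index, parents skipped: only chunk documents of a uniform grain.
print(expected_document_counts(chunks, "my_consolidated_index", "my_consolidated_index", skip_parents=True))
# {'my_consolidated_index': 100}

# Separate parent and child indexes.
print(expected_document_counts(chunks, "my-parent-index", "my-child-index"))
# {'my-child-index': 100, 'my-parent-index': 5}
```

Comparing an estimate like this against the document count reported by the search service after an indexer run is one way to confirm that the intended projection mode took effect.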

articles/search/search-howto-complex-data-types.md

Diff
@@ -11,12 +11,12 @@ ms.custom:
   - ignite-2023
 ms.service: azure-ai-search
 ms.topic: how-to
-ms.date: 01/18/2024
+ms.date: 10/14/2024
 ---
 
 # Model complex data types in Azure AI Search
 
-External datasets used to populate an Azure AI Search index can come in many shapes. Sometimes they include hierarchical or nested substructures. Examples might include multiple addresses for a single customer, multiple colors and sizes for a single SKU, multiple authors of a single book, and so on. In modeling terms, you might see these structures referred to as *complex*, *compound*, *composite*, or *aggregate* data types. The term Azure AI Search uses for this concept is **complex type**. In Azure AI Search, complex types are modeled using **complex fields**. A complex field is a field that contains children (subfields) which can be of any data type, including other complex types. This works in a similar way as structured data types in a programming language.
+External datasets used to populate an Azure AI Search index can come in many shapes. Sometimes they include hierarchical or nested substructures. Examples might include multiple addresses for a single customer, multiple colors and sizes for a single product, multiple authors of a single book, and so on. In modeling terms, you might see these structures referred to as *complex*, *compound*, *composite*, or *aggregate* data types. The term Azure AI Search uses for this concept is **complex type**. In Azure AI Search, complex types are modeled using **complex fields**. A complex field is a field that contains children (subfields) which can be of any data type, including other complex types. This works in a similar way as structured data types in a programming language.
 
 Complex fields represent either a single object in the document, or an array of objects, depending on the data type. Fields of type `Edm.ComplexType` represent single objects, while fields of type `Collection(Edm.ComplexType)` represent arrays of objects.
 
@@ -61,12 +61,6 @@ The following JSON document is composed of simple fields and complex fields. Com
 }
 ```
 
-## Indexing complex types
-
-During indexing, you can have a maximum of 3000 elements across all complex collections within a single document. An element of a complex collection is a member of that collection, so in the case of Rooms (the only complex collection in the Hotel example), each room is an element. In the example above, if the "Secret Point Motel" had 500 rooms, the hotel document would have 500 room elements. For nested complex collections, each nested element is also counted, in addition to the outer (parent) element.
-
-This limit applies only to complex collections, and not complex types (like Address) or string collections (like Tags).
-
 ## Create complex fields
 
 As with any index definition, you can use the portal, [REST API](/rest/api/searchservice/indexes/create), or [.NET SDK](/dotnet/api/azure.search.documents.indexes.models.searchindex) to create a schema that includes complex types. 
@@ -184,9 +178,15 @@ namespace AzureSearch.SDKHowTo
 
 ---
 
+### Complex collection limits
+
+During indexing, you can have a maximum of 3,000 elements across all complex collections within a single document. An element of a complex collection is a member of that collection. For Rooms (the only complex collection in the Hotel example), each room is an element. In the example above, if the "Secret Point Motel" had 500 rooms, the hotel document would have 500 room elements. For nested complex collections, each nested element is also counted, in addition to the outer (parent) element.
+
+This limit applies only to complex collections, and not complex types (like Address) or string collections (like Tags).
+
 ## Update complex fields
 
-All of the [reindexing rules](search-howto-reindex.md) that apply to fields in general still apply to complex fields. Restating a few of the main rules here, adding a field to a complex type doesn't require an index rebuild, but most modifications do.
+All of the [reindexing rules](search-howto-reindex.md) that apply to fields in general still apply to complex fields. Adding a new field to a complex type doesn't require an index rebuild, but most other modifications do require a rebuild.
 
 ### Structural updates to the definition
 
@@ -198,7 +198,7 @@ Notice that within a complex type, each subfield has a type and can have attribu
 
 Updating existing documents in an index with the `upload` action works the same way for complex and simple fields: all fields are replaced. However, `merge` (or `mergeOrUpload` when applied to an existing document) doesn't work the same across all fields. Specifically, `merge` doesn't support merging elements within a collection. This limitation exists for collections of primitive types and complex collections. To update a collection, you need to retrieve the full collection value, make changes, and then include the new collection in the Index API request.
 
-## Search complex fields
+## Search complex fields in text queries
 
 Free-form search expressions work as expected with complex types. If any searchable field or subfield anywhere in a document matches, then the document itself is a match.
 
@@ -208,6 +208,51 @@ Queries get more nuanced when you have multiple terms and operators, and some te
 
 Queries like this are *uncorrelated* for full-text search, unlike filters. In filters, queries over subfields of a complex collection are correlated using range variables in [`any` or `all`](search-query-odata-collection-operators.md). The Lucene query above returns documents containing both "Portland, Maine" and "Portland, Oregon", along with other cities in Oregon. This happens because each clause applies to all values of its field in the entire document, so there's no concept of a "current subdocument". For more information on this, see [Understanding OData collection filters in Azure AI Search](search-query-understand-collection-filters.md).
 
+## Search complex fields in RAG queries
+
+A RAG pattern passes search results to a chat model for generative AI and conversational search. By default, search results passed to an LLM are a flattened rowset. However, if your index has complex types, your query can provide those fields if you first convert the search results output to JSON, and then pass the JSON to the LLM.
+
+A partial example illustrates the technique:
+
++ Indicate the fields you want in the prompt or in the query
++ Make sure the fields are searchable and retrievable in the index
++ Select the fields for the search results
++ Format the results as JSON
++ Send the request for chat completion to the model provider
+
+```python
+import json
+
+# Query is the question being asked. It's sent to the search engine and the LLM.
+query="Can you recommend a few hotels that offer complimentary breakfast? Tell me their description, address, tags, and the rate for one room they have which sleep 4 people."
+
+# Set up the search results and the chat thread.
+# Retrieve the selected fields from the search index related to the question.
+selected_fields = ["HotelName","Description","Address","Rooms","Tags"]
+search_results = search_client.search(
+    search_text=query,
+    top=5,
+    select=selected_fields,
+    query_type="semantic"
+)
+sources_filtered = [{field: result[field] for field in selected_fields} for result in search_results]
+sources_formatted = "\n".join([json.dumps(source) for source in sources_filtered])
+
+response = openai_client.chat.completions.create(
+    messages=[
+        {
+            "role": "user",
+            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
+        }
+    ],
+    model=AZURE_DEPLOYMENT_MODEL
+)
+
+print(response.choices[0].message.content)
+```
+
+For the end-to-end example, see [Quickstart: Generative search (RAG) with grounding data from Azure AI Search](search-get-started-rag.md).
+
 ## Select complex fields
 
 The `$select` parameter is used to choose which fields are returned in search results. To use this parameter to select specific subfields of a complex field, include the parent field and subfield separated by a slash (`/`).
@@ -244,15 +289,20 @@ To filter on a complex collection field, you can use a **lambda expression** wit
 
 As with top-level simple fields, simple subfields of complex fields can only be included in filters if they have the **filterable** attribute set to `true` in the index definition. For more information, see the [Create Index API reference](/rest/api/searchservice/indexes/create).
 
-Azure Search has the limitation that the complex objects in the collections across a single document cannot exceed 3000.
+### Workaround for the complex collection limit
 
-Users will encounter the below error during indexing when complex collections exceed the 3000 limit.
+Recall that Azure AI Search limits complex objects in a collection to 3,000 objects per document. Exceeding this limit results in the following message:
 
-“A collection in your document exceeds the maximum elements across all complex collections limit. The document with key '1052' has '4303' objects in collections (JSON arrays). At most '3000' objects are allowed to be in collections across the entire document. Remove objects from collections and try indexing the document again."
+```
+A collection in your document exceeds the maximum elements across all complex collections limit. 
+The document with key '1052' has '4303' objects in collections (JSON arrays). 
+At most '3000' objects are allowed to be in collections across the entire document. 
+Remove objects from collections and try indexing the document again."
+```
 
-In some use cases, we might need to add more than 3000 items to a collection. In those use cases, we can pipe (|) or use any form of delimiter to delimit the values, concatenate them, and store them as a delimited string. There is no limitation on the number of strings stored in an array in Azure Search. Storing these complex values as strings avoids the limitation. The customer needs to validate whether this workaround meets their scenario requirements.
+If you need more than 3,000 items, you can pipe (`|`) or use any form of delimiter to delimit the values, concatenate them, and store them as a delimited string. There's no limitation on the number of strings stored in an array. Storing complex values as strings bypasses the complex collection limitation.
 
-For example, it wouldn't be possible to use complex types if the "searchScope" array below had more than 3000 elements.
+To illustrate, assume you have a `"searchScope"` array with more than 3,000 elements:
 
 ```json
 
@@ -267,10 +317,11 @@ For example, it wouldn't be possible to use complex types if the "searchScope" a
      "productCode": 1235,
      "categoryCode": "C200" 
   }
+  . . .
 ]
 ```
 
-Storing these complex values as strings with a delimiter avoids the limitation
+The workaround for storing the values as a delimited string might look like this:
 
 ```json
 "searchScope": [
@@ -283,26 +334,10 @@ Storing these complex values as strings with a delimiter avoids the limitation
 ]
 
 ```
-Rather than storing these with wildcards, we can also use a [custom analyzer](index-add-custom-analyzers.md)  that splits the word into | to cut down on storage size.
-
-The reason we have stored the values with wildcards instead of just storing them as below
-
->`|FRA|1234|C100|`
-
-is to cater to search scenarios where the customer might want to search for items that have country France, irrespective of products and categories. Similarly, the customer might need to search to see if the item has product 1234, irrespective of the country or the category.
-
-If we had stored only one entry
-
->`|FRA|1234|C100|`
 
-without wildcards, if the user wants to filter only on France, we cannot convert the user input to match the "searchScope" array because we don't know what combination of France is present in our "searchScope" array
+Storing all of the search variants in the delimited string is helpful in search scenarios where you want to search for items that have just "FRA" or "1234" or another combination within the array.
 
-
-If the user wants to filter only by country, let's say France. We will take the user input and construct it as a string as below:
-
->`|FRA|*|*|`
-
-which we can then use to filter in azure search as we search in an array of item values
+Here's a filter formatting snippet in C# that converts inputs into searchable strings:
 
 ```csharp
 foreach (var filterItem in filterCombinations)
@@ -312,39 +347,25 @@ foreach (var filterItem in filterCombinations)
         }
 
 ```
-Similarly, if the user searches for France and the 1234 product code, we will take the user input, construct it as a delimited string as below, and match it against our search array.
-
->`|FRA|1234|*|`
-
-If the user searches for 1234 product code, we will take the user input, construct it as a delimited string as below, and match it against our search array.
-
->`|*|1234|*|`
-
-If the user searches for the C100 category code, we will take the user input, construct it as a delimited string as below, and match it against our search array.
-
->`|*|*|C100|`
-
-If the user searches for France and the 1234 product code and C100 category code, we will take the user input, construct it as a delimited string as below, and match it against our search array.
 
->`|FRA|1234|C100|`
+The following list provides inputs and search strings (outputs) side by side:
 
-If a user tries to search for countries not present in our list, it will not match the delimited array "searchScope" stored in the search index, and no results will be returned.
-For example, a user searches for Canada and product code 1234. The user search would be converted to
++ For "FRA" country code and the "1234" product code, the formatted output is ```|FRA|1234|*|```.
 
->`|CAN|1234|*|`
++ For "1234" product code, the formatted output is ```|*|1234|*|```.
 
-This will not match any of the entries in the delimited array in our search index.
++ For "C100" category code, the formatted output is ```|*|*|C100|```.
 
-Only the above design choice requires this wild card entry; if it had been saved as a complex object, we could have simply performed an explicit search as shown below.
+Only provide the wildcard (`*`) if you're implementing the string array workaround. Otherwise, if you're using a complex type, your filter might look like this example:
 
 ```csharp
-           var countryFilter = $"searchScope/any(ss: search.in(countryCode ,'FRA'))";
-            var catgFilter = $"searchScope/any(ss: search.in(categoryCode ,'C100'))";
-            var combinedCountryCategoryFilter = "(" + countryFilter + " and " + catgFilter + ")";
+var countryFilter = $"searchScope/any(ss: search.in(countryCode ,'FRA'))";
+var catgFilter = $"searchScope/any(ss: search.in(categoryCode ,'C100'))";
+var combinedCountryCategoryFilter = "(" + countryFilter + " and " + catgFilter + ")";
 
 ```
-We can thus satisfy requirements where we need to search for a combination of values by storing it as a delimited string instead of a complex collection if our complex collections exceed the Azure Search limit. This is one of the workarounds, and the customer needs to validate if this would meet their scenario requirements.
 
+If you implement the workaround, be sure to test extensively.
 
 ## Next steps
 

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the documentation on modeling complex data types"
}

Explanation

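This change makes several notable revisions to search-howto-complex-data-types.md. The main update strengthens the guidance on modeling complex data types. In particular, the limits that apply when defining and indexing complex fields in Azure AI Search are explained more clearly.

Specifically, the diff highlights the limit of 3,000 complex collection elements per document and presents a workaround: concatenate the values with a delimiter and store them as strings. This approach sidesteps the complex collection limit while keeping the required information searchable.

In addition, a new section, "Search complex fields in RAG queries", shows with a concrete example how to pass complex fields to a chat model for generative AI and conversational search. Together, these updates help developers building on Azure AI Search work with complex data types effectively.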
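
The delimited-string workaround described above can be sketched in Python. This is a minimal illustration, not the article's code: the C# formatting snippet is elided in the diff, so the function names (`scope_variants`, `scope_search_string`, `scope_filter`), the choice to skip the all-wildcard entry, and the `any(s: s eq ...)` filter over a filterable string collection are all assumptions made for this sketch.

```python
from itertools import product

DELIMITER = "|"
WILDCARD = "*"


def _join(parts):
    """Wrap the parts in delimiters, for example |FRA|1234|*|."""
    return DELIMITER + DELIMITER.join(parts) + DELIMITER


def scope_variants(country_code, product_code, category_code):
    """Return the wildcard variants to store for one searchScope entry (hypothetical helper)."""
    values = (str(country_code), str(product_code), str(category_code))
    variants = []
    # For each position, either keep the real value or replace it with the wildcard.
    for mask in product((False, True), repeat=len(values)):
        if all(mask):
            continue  # Assumption: the all-wildcard entry is not stored.
        variants.append(_join([WILDCARD if hide else value for value, hide in zip(values, mask)]))
    return variants


def scope_search_string(country_code=None, product_code=None, category_code=None):
    """Format user input as the delimited string to match against the stored variants."""
    inputs = (country_code, product_code, category_code)
    return _join([WILDCARD if value is None else str(value) for value in inputs])


def scope_filter(search_string, field="searchScope"):
    """Build an OData filter over a string collection (assumed to be filterable)."""
    escaped = search_string.replace("'", "''")  # OData escapes a single quote by doubling it.
    return f"{field}/any(s: s eq '{escaped}')"


variants = scope_variants("FRA", 1234, "C100")
print(scope_filter(scope_search_string(country_code="FRA", product_code=1234)))
```

Storing every variant multiplies the number of strings per entry, so whether this trade-off is acceptable depends on the workload and should be tested against real data.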

articles/search/search-limits-quotas-capacity.md

Diff
@@ -53,7 +53,7 @@ Maximum limits on storage, workloads, and quantities of indexes and other object
 
 <sup>2</sup> The upper limit on fields includes both first-level fields and nested subfields in a complex collection. For example, if an index contains 15 fields and has two complex collections with five subfields each, the field count of your index is 25. Indexes with a very large fields collection can be slow. [Limit fields and attributes](search-what-is-an-index.md#physical-structure-and-size) to just those you need, and run indexing and query test to ensure performance is acceptable.
 
-<sup>3</sup> An upper limit exists for elements because having a large number of them significantly increases the storage required for your index. An element of a complex collection is defined as a member of that collection. For example, assume a [Hotel document with a Rooms complex collection](search-howto-complex-data-types.md#indexing-complex-types), each room in the Rooms collection is considered an element. During indexing, the indexing engine can safely process a maximum of 3,000 elements across the document as a whole. [This limit](search-api-migration.md#upgrade-to-2019-05-06) was introduced in `api-version=2019-05-06` and applies to complex collections only, and not to string collections or to complex fields.
+<sup>3</sup> An upper limit exists for elements because having a large number of them significantly increases the storage required for your index. An element of a complex collection is defined as a member of that collection. For example, assume a [Hotel document with a Rooms complex collection](search-howto-complex-data-types.md#complex-collection-limits), each room in the Rooms collection is considered an element. During indexing, the indexing engine can safely process a maximum of 3,000 elements across the document as a whole. [This limit](search-api-migration.md#upgrade-to-2019-05-06) was introduced in `api-version=2019-05-06` and applies to complex collections only, and not to string collections or to complex fields.
 
 <sup>4</sup> On most tiers, maximum index size is all available storage on your search service. For S2, S3, and S3 HD, the maximum size of any index is the number provided in the table. Applies to search services created after April 3, 2024.
 

Summary

{
    "modification_type": "minor update",
    "modification_title": "Fix to the documentation on search limits, quotas, and capacity"
}

Explanation

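This change is a small text fix in search-limits-quotas-capacity.md. In footnote 3, which covers the number of elements in complex collections, the link for the hotel example now points to the #complex-collection-limits anchor of search-howto-complex-data-types.md instead of #indexing-complex-types.

The rest of the footnote is unchanged: each room in the Rooms complex collection of a hotel document counts as one element, the indexing engine can safely process at most 3,000 elements across a document as a whole, and the limit was introduced in api-version=2019-05-06.

The fix sends readers who want to understand Azure AI Search indexing limits to the more relevant section of the complex types article.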
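
To make the element count concrete, here is a hedged Python sketch that estimates how many complex collection elements a document contains before it is sent for indexing. The helper names are hypothetical, and the counting rule (every JSON object inside an array counts once, at any nesting depth, while strings in arrays and standalone complex fields do not) is an assumption based on the footnote and the error message quoted earlier. The service's exact accounting might differ.

```python
COMPLEX_COLLECTION_LIMIT = 3000  # Limit quoted in the footnote above.


def count_complex_elements(value):
    """Count JSON objects that sit inside arrays, at any nesting depth."""
    if isinstance(value, dict):
        # A complex field is not an element itself, but it may contain collections.
        return sum(count_complex_elements(child) for child in value.values())
    if isinstance(value, list):
        total = 0
        for item in value:
            if isinstance(item, dict):
                total += 1  # One element of a complex collection.
            total += count_complex_elements(item)
        return total
    return 0  # Primitive values, including strings in string collections.


def exceeds_limit(document, limit=COMPLEX_COLLECTION_LIMIT):
    """Return True if the document would likely exceed the element limit."""
    return count_complex_elements(document) > limit


# Made-up sample document for illustration only.
hotel = {
    "HotelName": "Sample Inn",
    "Address": {"City": "Portland", "StateProvince": "ME"},
    "Tags": ["pool", "wifi"],
    "Rooms": [{"Type": "Suite"}, {"Type": "Standard"}],
}
print(count_complex_elements(hotel), exceeds_limit(hotel))
```

A check like this could run in a preprocessing step to flag documents that need the delimited-string workaround.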

articles/search/toc.yml

Diff
@@ -305,8 +305,8 @@ items:
         href: cognitive-search-defining-skillset.md
       - name: Attach an Azure AI resource
         href: cognitive-search-attach-cognitive-services.md
-      - name: Create an index projection for a secondary index
-        href: index-projections-concept-intro.md
+      - name: Define an index projection
+        href: search-how-to-define-index-projections.md
       - name: Debug sessions overview
         href: cognitive-search-debug-session.md
       - name: Debug a skillset

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to a link in the table of contents file"
}

Explanation

This change updates a table of contents entry in toc.yml, specifically the entry for index projections.

The main change is as follows:

  • The entry name "Create an index projection for a secondary index" becomes "Define an index projection", and its link changes from index-projections-concept-intro.md to search-how-to-define-index-projections.md. The entry now points to the new how-to article that replaces the deleted concept article.

The fix keeps the documentation consistent and helps users find the information they need quickly.

articles/search/tutorial-rag-build-solution-index-schema.md

Diff
@@ -63,7 +63,7 @@ In Azure AI Search, an index that works best for RAG workloads has these qualiti
 
 - Accommodates the queries you want create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors, such as searchable or filterable. You can only query one index at a time (no joins) so your fields collection should define all of your searchable content.
 
-- Your schema should be flat (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.
+- Your schema should either be flat (no complex types or structures), or you should [format the complex type output as JSON](search-get-started-rag.md#send-a-complex-rag-query) before sending it to the LLM. This requirement is specific to the RAG pattern in Azure AI Search.
 
 <!-- Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationship, and then use sequential queries in your search logic to pull from both (a query on the chunked data index, a lookup on the parent index). This exercise includes templates for parent-child elements in the same index and in separate indexes, where information from the parent index is retrieved using a lookup query. -->
 
@@ -209,4 +209,4 @@ Tasks:
 ## Next step
 
 > [!div class="nextstepaction"]
-> [Create an indexing pipeline](tutorial-rag-build-solution-pipeline.md)
\ No newline at end of file
+> [Create an indexing pipeline](tutorial-rag-build-solution-pipeline.md)

Summary

{
    "modification_type": "minor update",
    "modification_title": "Fix to the tutorial on index schemas for RAG solutions"
}

Explanation

This change clarifies the index schema requirements for the RAG (Retrieval-Augmented Generation) pattern in tutorial-rag-build-solution-index-schema.md.

The main changes are as follows:

  • The schema requirement is relaxed and made more precise. The statement "Your schema should be flat (no complex types or structures)" now says that the schema should either be flat or the complex type output should be formatted as JSON before it is sent to the LLM. This makes the schema design options clearer.

  • The "Next step" link is unchanged. The diff only adds the missing newline at the end of the file.

The change gives useful guidance to anyone designing or implementing an index for the RAG pattern.
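
The JSON formatting step can be tried without a search service or a model deployment. The sketch below is a minimal, self-contained illustration: the `GROUNDED_PROMPT` template text, the `format_sources` helper, and the sample hotel record are hypothetical stand-ins, and plain dictionaries take the place of real search results.

```python
import json

# Hypothetical prompt template; the quickstart defines its own GROUNDED_PROMPT.
GROUNDED_PROMPT = """Answer the query using only the sources below.
Query: {query}
Sources:
{sources}"""


def format_sources(results, selected_fields):
    """Serialize each result as one JSON line, keeping nested fields intact."""
    lines = []
    for result in results:
        source = {field: result.get(field) for field in selected_fields}
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines)


# Made-up record standing in for a search result with complex fields.
sample_results = [
    {
        "HotelName": "Sample Inn",
        "Description": "Hypothetical hotel with complimentary breakfast.",
        "Address": {"City": "Portland", "StateProvince": "ME"},
        "Rooms": [{"Type": "Suite", "BaseRate": 199.0, "SleepsCount": 4}],
        "Tags": ["breakfast", "wifi"],
        "Category": "Budget",
    }
]
selected_fields = ["HotelName", "Description", "Address", "Rooms", "Tags"]

sources_formatted = format_sources(sample_results, selected_fields)
prompt = GROUNDED_PROMPT.format(query="Which hotels offer breakfast?", sources=sources_formatted)
print(prompt)
```

Because each source is serialized as JSON, nested values such as the address and the room list reach the model with their structure preserved rather than as a flattened row.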