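Usage Notes

This post is a derivative work, adapted and summarized with generative AI from Microsoft's official Azure documentation (licensed under CC BY 4.0 or MIT). The original documents are hosted at MicrosoftDocs/azure-ai-docs.

Generative AI has limitations, and this post may contain mistranslations or misinterpretations. Treat it as reference material only, and always consult the original documents for accurate information.

Trademarks used in this post belong to their respective owners. They are used for technical explanation only and do not imply official approval or endorsement by the trademark holders.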

View Diff on GitHub

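Highlights

New features

A new image file, cluster-green-dot.png, was added.

Breaking changes

There are no breaking changes.

Other updates

Minor revisions to image files (create-notebook.png, create-seven-cells.png, install-library-from-maven.png, install-library.png).
Updated dependencies and explanatory text in the document search-synapseml-cognitive-services.md.
A renamed entry in the table of contents file (toc.yml).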

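Insights

These changes keep the visuals and navigation of the Azure AI documentation current. The details are described below.

New image

cluster-green-dot.png supports a new tutorial step that asks the reader to verify that the cluster is running: a green dot next to the cluster name confirms its status. The screenshot shows readers what to look for on the Databricks compute page.

Image file updates

Several existing image files were updated, and none of them were added or removed, so the updates most likely improve visual quality or bring the screenshots up to date. Screenshots are typically refreshed after a UI change or a change in the documented procedure.

Markdown document changes

In search-synapseml-cognitive-services.md, the dependency information was updated: the SynapseML Maven coordinate moves from version 1.0.4 to 1.0.9, the AnalyzeInvoices import changes from synapse.ml.cognitive to synapse.ml.services, and the prerequisites now call for an Azure AI multi-service account with an API kind of AIServices. These changes help readers follow the tutorial with a current library version and keep the document accurate.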
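The version bump can be checked mechanically. The following is a minimal sketch, not part of the tutorial, that splits the two Maven coordinates quoted in the diff into their parts and compares the versions. The `MavenCoordinate` type and `parse_coordinate` helper are hypothetical names introduced for illustration, and the parser assumes a plain `group:artifact:version` string with a purely numeric version.

```python
from typing import NamedTuple, Tuple


class MavenCoordinate(NamedTuple):
    """Hypothetical container for a group:artifact:version coordinate."""
    group_id: str
    artifact_id: str
    version: Tuple[int, ...]


def parse_coordinate(text: str) -> MavenCoordinate:
    """Split 'group:artifact:version' and turn the version into a tuple of ints.

    Assumes exactly three colon-separated parts and a dotted numeric version.
    """
    group_id, artifact_id, version = text.split(":")
    numeric_version = tuple(int(part) for part in version.split("."))
    return MavenCoordinate(group_id, artifact_id, numeric_version)


# Coordinates taken from the removed and added lines of the diff.
old = parse_coordinate("com.microsoft.azure:synapseml_2.12:1.0.4")
new = parse_coordinate("com.microsoft.azure:synapseml_2.12:1.0.9")

# Same artifact, newer version: tuples compare element by element.
same_artifact = (old.group_id, old.artifact_id) == (new.group_id, new.artifact_id)
is_upgrade = new.version > old.version
print(same_artifact, is_upgrade)
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "1.0.10" would sort before "1.0.9".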

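Table of contents update

The renamed entry in toc.yml helps readers reach the information they need more directly. Clear navigation contributes to a better reading experience.

Together, these changes should improve the quality and usability of the documentation. Readers can follow the provided links to see the latest documents and images with these revisions applied.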

Summary Table

Filename	Type	Title	Status	Added	Deleted	Modified
cluster-green-dot.png	new feature	Addition of a new image file	added	0	0	0
create-notebook.png	minor update	Image file update	modified	0	0	0
create-seven-cells.png	minor update	Image file revision	modified	0	0	0
install-library-from-maven.png	minor update	Image file update	modified	0	0	0
install-library.png	minor update	Image file update	modified	0	0	0
search-synapseml-cognitive-services.md	minor update	Document update and dependency revisions	modified	22	18	40
toc.yml	minor update	Table of contents file revision	modified	1	1	2
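
The three numeric columns appear to be lines added, lines deleted, and lines modified, with the last equal to the sum of the first two; the explanations for the Markdown and toc.yml entries describe the counts this way. As a small sanity check under that assumption, the sketch below re-enters the table rows by hand and confirms that every row satisfies the relationship. The helper names are hypothetical.

```python
# Rows copied from the summary table: (filename, added, deleted, modified).
ROWS = [
    ("cluster-green-dot.png", 0, 0, 0),
    ("create-notebook.png", 0, 0, 0),
    ("create-seven-cells.png", 0, 0, 0),
    ("install-library-from-maven.png", 0, 0, 0),
    ("install-library.png", 0, 0, 0),
    ("search-synapseml-cognitive-services.md", 22, 18, 40),
    ("toc.yml", 1, 1, 2),
]


def inconsistent_rows(rows):
    """Return filenames whose modified count differs from added + deleted."""
    return [
        name
        for name, added, deleted, modified in rows
        if added + deleted != modified
    ]


def column_totals(rows):
    """Return the (added, deleted, modified) totals across all rows."""
    added = sum(row[1] for row in rows)
    deleted = sum(row[2] for row in rows)
    modified = sum(row[3] for row in rows)
    return added, deleted, modified


print(inconsistent_rows(ROWS))  # an empty list means every row is consistent
print(column_totals(ROWS))
```

The image rows show zeros because a line-based diff does not count lines for binary files, not because the images are unchanged.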

Modified Contents

articles/search/media/search-synapseml-cognitive-services/cluster-green-dot.png

Summary

{
    "modification_type": "new feature",
    "modification_title": "Addition of a new image file"
}

Explanation

This change adds a new image file, articles/search/media/search-synapseml-cognitive-services/cluster-green-dot.png. The accompanying Markdown diff references it in the SynapseML tutorial for Azure AI Search, in the step that verifies the cluster is running. This entry consists only of the file addition. The image is available at the specified path in the repository.

articles/search/media/search-synapseml-cognitive-services/create-notebook.png

Summary

{
    "modification_type": "minor update",
    "modification_title": "Image file update"
}

Explanation

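This change modifies the image file articles/search/media/search-synapseml-cognitive-services/create-notebook.png. Because it is a binary file, the diff reports no added or deleted lines and shows only that the file was updated. The update should keep the visual content of the search documentation current. The file remains accessible from the specified link.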

articles/search/media/search-synapseml-cognitive-services/create-seven-cells.png

Summary

{
    "modification_type": "minor update",
    "modification_title": "Image file revision"
}

Explanation

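This change modifies the image file articles/search/media/search-synapseml-cognitive-services/create-seven-cells.png. The diff records no added or deleted lines, which is expected for a binary file and does not mean the image is unchanged. The reason for the update is not stated; it may be a visual improvement or a metadata update. The image file remains accessible from the specified link.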

articles/search/media/search-synapseml-cognitive-services/install-library-from-maven.png

Summary

{
    "modification_type": "minor update",
    "modification_title": "Image file update"
}

Explanation

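This change modifies the image file articles/search/media/search-synapseml-cognitive-services/install-library-from-maven.png. The diff reports no added or deleted lines and shows only that the file was updated. The reason is not stated, but the image may have been revised for visual consistency or to reflect current information. Readers can still access the image through the provided link.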

articles/search/media/search-synapseml-cognitive-services/install-library.png

Summary

{
    "modification_type": "minor update",
    "modification_title": "Image file update"
}

Explanation

This change modifies the image file articles/search/media/search-synapseml-cognitive-services/install-library.png. The diff reports no added or deleted lines, and the file status is marked as "modified". The revision may relate to visual elements, updated information, or metadata. Readers can access the revised image file through the provided link.

articles/search/search-synapseml-cognitive-services.md

Diff

@@ -10,7 +10,7 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: tutorial
-ms.date: 04/22/2024
+ms.date: 01/30/2025
 ---
 
 # Tutorial: Index large data from Apache Spark using SynapseML and Azure AI Search
@@ -24,7 +24,7 @@ In this Azure AI Search tutorial, learn how to index and query large data loaded
 > + Write the output to a search index hosted in Azure AI Search
 > + Explore and query over the content you created
 
-This tutorial takes a dependency on [SynapseML](https://www.microsoft.com/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library/), an open source library that supports massively parallel machine learning over big data. In SynapseML, search indexing and machine learning are exposed through *transformers* that perform specialized tasks. Transformers tap into a wide range of AI capabilities. In this exercise, use the **AzureSearchWriter** APIs for analysis and AI enrichment.
+This tutorial takes a dependency on [SynapseML](https://microsoft.github.io/SynapseML/), an open source library that supports massively parallel machine learning over big data. In SynapseML, search indexing and machine learning are exposed through *transformers* that perform specialized tasks. Transformers tap into a wide range of AI capabilities. In this exercise, use the **AzureSearchWriter** APIs for analysis and AI enrichment.
 
 Although Azure AI Search has native [AI enrichment](cognitive-search-concept-intro.md), this tutorial shows you how to access AI capabilities outside of Azure AI Search. By using SynapseML instead of indexers or skills, you're not subject to data limits or other constraints associated with those objects.
 
@@ -35,18 +35,18 @@ Although Azure AI Search has native [AI enrichment](cognitive-search-concept-int
 
 You need the `synapseml` library and several Azure resources. If possible, use the same subscription and region for your Azure resources and put everything into one resource group for simple cleanup later. The following links are for portal installs. The sample data is imported from a public site.
 
-+ [SynapseML package](https://microsoft.github.io/SynapseML/docs/Get%20Started/Install%20SynapseML/#python) <sup>1</sup> 
-+ [Azure AI Search](search-create-service-portal.md) (any tier) <sup>2</sup> 
-+ [Azure AI services](/azure/ai-services/multi-service-resource?pivots=azportal) (any tier) <sup>3</sup> 
-+ [Azure Databricks](/azure/databricks/scenarios/quickstart-create-databricks-workspace-portal?tabs=azure-portal) (any tier) <sup>4</sup>
++ [SynapseML package](https://microsoft.github.io/SynapseML/docs/Get%20Started/Install%20SynapseML/#python) <sup>1</sup>
++ [Azure AI Search](search-create-service-portal.md) (any tier), with an **API Kind** of `AIServices` <sup>2</sup> 
++ [Azure AI multi-service account](/azure/ai-services/multi-service-resource?pivots=azportal) (any tier) <sup>3</sup>
++ [Azure Databricks](/azure/databricks/scenarios/quickstart-create-databricks-workspace-portal?tabs=azure-portal) (any tier) with Apache Spark 3.3.0 runtime<sup>4</sup>
 
 <sup>1</sup> This link resolves to a tutorial for loading the package.
 
 <sup>2</sup> You can use the free search tier to index the sample data, but [choose a higher tier](search-sku-tier.md) if your data volumes are large. For billable tiers, provide the [search API key](search-security-api-keys.md#find-existing-keys) in the [Set up dependencies](#step-2-set-up-dependencies) step further on.
 
-<sup>3</sup> This tutorial uses Azure AI Document Intelligence and Azure AI Translator. In the instructions that follow, provide a [multi-service](/azure/ai-services/multi-service-resource?pivots=azportal) key and the region. The same key works for both services.
+<sup>3</sup> This tutorial uses Azure AI Document Intelligence and Azure AI Translator. In the instructions that follow, provide a [multi-service account](/azure/ai-services/multi-service-resource?pivots=azportal) key and the region. The same key works for both services. **It's important that you use an Azure AI multiservice account of API kind of `AIServices` for this tutorial**. You can check the API kind in the Azure portal on the Overview section of your Azure AI multiservice account page. For more information about API kind, see [Attach an Azure AI multi-service resource in Azure AI Search](cognitive-search-attach-cognitive-services.md).
 
-<sup>4</sup> In this tutorial, Azure Databricks provides the Spark computing platform. We used the [portal instructions](/azure/databricks/scenarios/quickstart-create-databricks-workspace-portal?tabs=azure-portal) to set up the workspace.
+<sup>4</sup> In this tutorial, Azure Databricks provides the Spark computing platform. We used the [portal instructions](/azure/databricks/scenarios/quickstart-create-databricks-workspace-portal?tabs=azure-portal) to set up the cluster and workspace.
 
 > [!NOTE]
 > All of the above Azure resources support security features in the Microsoft Identity platform. For simplicity, this tutorial assumes key-based authentication, using endpoints and keys copied from the Azure portal pages of each service. If you implement this workflow in a production environment, or share the solution with others, remember to replace hard-coded keys with integrated security or encrypted keys.
@@ -63,6 +63,10 @@ In this section, create a cluster, install the `synapseml` library, and create a
 
 1. Accept the default configuration. It takes several minutes to create the cluster.
 
+1. Verify the cluster is operational and running. A green dot by the cluster name confirms its status.
+
+   :::image type="content" source="media/search-synapseml-cognitive-services/cluster-green-dot.png" alt-text="Screenshot of a Data Bricks compute page with a green dot by the cluster name.":::
+
 1. Install the `synapseml` library after the cluster is created:
 
    1. Select **Libraries** from the tabs at the top of the cluster's page.
@@ -73,7 +77,7 @@ In this section, create a cluster, install the `synapseml` library, and create a
 
    1. Select **Maven**.
 
-   1. In Coordinates, enter `com.microsoft.azure:synapseml_2.12:1.0.4`
+   1. In Coordinates, search for or type `com.microsoft.azure:synapseml_2.12:1.0.9`
 
    1. Select **Install**.
 
@@ -85,15 +89,15 @@ In this section, create a cluster, install the `synapseml` library, and create a
 
 1. Give the notebook a name, select **Python** as the default language, and select the cluster that has the `synapseml` library.
 
-1. Create seven consecutive cells. Paste code into each one.
+1. Create seven consecutive cells. You use these to paste in code in the following sections.
 
    :::image type="content" source="media/search-synapseml-cognitive-services/create-seven-cells.png" alt-text="Screenshot of the notebook with placeholder cells." border="true":::
 
 ## Step 2: Set up dependencies
 
 Paste the following code into the first cell of your notebook. 
 
-Replace the placeholders with endpoints and access keys for each resource. Provide a name for a new search index. No other modifications are required, so run the code when you're ready.
+Replace the placeholders with endpoints and access keys for each resource. Provide a name for a new search index that's created for you. No other modifications are required, so run the code when you're ready.
 
 This code imports multiple packages and sets up access to the Azure resources used in this workflow.
 
@@ -103,12 +107,12 @@ from pyspark.sql.functions import udf, trim, split, explode, col, monotonically_
 from pyspark.sql.types import StringType
 from synapse.ml.core.spark import FluentAPI
 
-cognitive_services_key = "placeholder-cognitive-services-multi-service-key"
-cognitive_services_region = "placeholder-cognitive-services-region"
+cognitive_services_key = "placeholder-azure-ai-services-multi-service-key"
+cognitive_services_region = "placeholder-azure-ai-services-region"
 
 search_service = "placeholder-search-service-name"
-search_key = "placeholder-search-service-api-key"
-search_index = "placeholder-search-index-name"
+search_key = "placeholder-search-service-admin-api-key"
+search_index = "placeholder-for-new-search-index-name"
 ```
 
 ## Step 3: Load data into Spark
@@ -128,7 +132,7 @@ def blob_to_url(blob):
 
 
 df2 = (spark.read.format("binaryFile")
-    .load("wasbs://ignite2021@mmlsparkdemo.blob.core.windows.net/form_subset/*")
+    .load("wasbs://publicwasb@mmlspark.blob.core.windows.net/form_subset/*")
     .select("path")
     .limit(10)
     .select(udf(blob_to_url, StringType())("path").alias("url"))
@@ -141,10 +145,10 @@ display(df2)
 
 Paste the following code into the third cell. No modifications are required, so run the code when you're ready.
 
-This code loads the [AnalyzeInvoices transformer](https://mmlspark.blob.core.windows.net/docs/0.11.2/pyspark/synapse.ml.cognitive.form.html#module-synapse.ml.cognitive.form.AnalyzeInvoices) and passes a reference to the data frame containing the invoices. It calls the prebuilt [invoice model](/azure/ai-services/document-intelligence/concept-invoice) of Azure AI Document Intelligence to extract information from the invoices.
+This code loads the [AnalyzeInvoices transformer](https://mmlspark.blob.core.windows.net/docs/1.0.9/pyspark/synapse.ml.services.form.html#module-synapse.ml.services.form.AnalyzeInvoices) and passes a reference to the data frame containing the invoices. It calls the prebuilt [invoice model](/azure/ai-services/document-intelligence/concept-invoice) of Azure AI Document Intelligence to extract information from the invoices.
 
 ```python
-from synapse.ml.cognitive import AnalyzeInvoices
+from synapse.ml.services import AnalyzeInvoices
 
 analyzed_df = (AnalyzeInvoices()
     .setSubscriptionKey(cognitive_services_key)

Summary

{
    "modification_type": "minor update",
    "modification_title": "Document update and dependency revisions"
}

Explanation

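This change applies to the Markdown document articles/search/search-synapseml-cognitive-services.md, with 22 lines added and 18 lines deleted. It updates the date (ms.date moves from 04/22/2024 to 01/30/2025) and makes minor changes to the dependency information and setup instructions.

Specifically, the SynapseML link and version were updated, and an important note was added requiring an Azure AI multi-service account with an API kind of AIServices. The document also adds a step to verify that the cluster is running, rewords the library installation instructions, points the sample data at a new storage location, and changes the AnalyzeInvoices import from synapse.ml.cognitive to synapse.ml.services.

The update is intended to improve the quality of the document and give readers current information. Readers can view the revised document through the provided link.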
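
Notebooks written against the earlier tutorial import AnalyzeInvoices from synapse.ml.cognitive, while the updated text imports it from synapse.ml.services. One way to tolerate both during a migration is a small fallback import. The sketch below is an illustration only, not something the tutorial prescribes; `import_first_available` is a hypothetical helper built on the standard library's `importlib`, and it makes no claim about which SynapseML versions ship which namespace.

```python
import importlib


def import_first_available(module_names, attribute):
    """Return `attribute` from the first module in `module_names` that provides it.

    Modules that cannot be imported are skipped, so the order of
    `module_names` expresses preference (newest namespace first).
    """
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # not installed under this name; try the next candidate
        if hasattr(module, attribute):
            return getattr(module, attribute)
    raise ImportError(
        f"{attribute!r} not found in any of: {', '.join(module_names)}"
    )
```

On a cluster with the library installed, `AnalyzeInvoices = import_first_available(["synapse.ml.services", "synapse.ml.cognitive"], "AnalyzeInvoices")` would prefer the namespace used by the updated tutorial and fall back to the one on the removed line of the diff. For a new notebook, the plain import shown in the updated tutorial is simpler.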

articles/search/toc.yml

Diff

@@ -303,7 +303,7 @@ items:
             href: search-how-to-index-sql-server.md
         - name: OneLake files
           href: search-how-to-index-onelake-files.md
-        - name: SharePoint and OneDrive
+        - name: SharePoint Online
           href: search-howto-index-sharepoint-online.md
     - name: Skillsets
       items:

Summary

{
    "modification_type": "minor update",
    "modification_title": "Table of contents file revision"
}

Explanation

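This change applies to the table of contents file articles/search/toc.yml. One line was added and one deleted, for a total of two modified lines: the menu entry "SharePoint and OneDrive" was renamed to "SharePoint Online".

The rename appears intended to describe the linked content more accurately and to improve navigation, so that readers can find the related document more easily. The revised table of contents file can be viewed through the provided link.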
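
The line counts quoted for this file can be reproduced from the hunk itself. The following is a minimal sketch that counts added and deleted lines in a unified diff; `count_changes` is a hypothetical helper, and it assumes the input covers a single file, so that everything after the first `@@` header is hunk content.

```python
# The toc.yml hunk from the diff above; context lines start with a space.
TOC_HUNK = """\
@@ -303,7 +303,7 @@ items:
             href: search-how-to-index-sql-server.md
         - name: OneLake files
           href: search-how-to-index-onelake-files.md
-        - name: SharePoint and OneDrive
+        - name: SharePoint Online
           href: search-howto-index-sharepoint-online.md
     - name: Skillsets
       items:
"""


def count_changes(diff_text):
    """Count (added, deleted) lines in a single-file unified diff."""
    added = deleted = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True  # hunk header; content lines follow
            continue
        if not in_hunk:
            continue  # skip any file header lines before the first hunk
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            deleted += 1
    return added, deleted


added, deleted = count_changes(TOC_HUNK)
print(added, deleted, added + deleted)
```

The YAML list markers do not confuse the count because context lines in a unified diff carry a leading space, so only the removed line begins with `-` in the first column.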
