Diff Insight Report - misc

Last updated: 2024-12-21

Notes on Use

This post is a derivative work created by using generative AI to adapt and summarize Microsoft's official Azure documentation (licensed under CC BY 4.0 or MIT). The original documents are hosted at MicrosoftDocs/azure-ai-docs.

Generative AI has its limits, and this post may contain mistranslations or misinterpretations. Treat it strictly as reference material, and always consult the original documents for accurate information.

The trademarks used in this post belong to their respective owners. They appear for the purpose of technical explanation only and do not imply official approval or endorsement by the trademark holders.

View Diff on GitHub

Highlights

In this set of changes, metadata was updated and reviewer information was corrected across several documents. Specifically, dates and reviewer details were brought up to date, and several procedures and commands were fixed. The documentation for the evaluation SDK also received a substantial update.

Key New Features

  • Substantial content added to the SDK evaluation documentation, including support for evaluating images and multimodal data.
  • Corrected the command-line installation instructions.

Key Breaking Changes

  • No notable breaking changes are included.

Other Updates

  • Updated document dates and reviewer information.
  • Minor corrections to some procedures and UI instructions.
  • More accurate service naming (an entry renamed in the table of contents).

Insights

These updates keep the various Azure AI Studio documents current so that users receive accurate, trustworthy information. Dates and reviewers were refreshed across many documents, reflecting an emphasis on content freshness and a stronger review process.

The most important change is the major update to the SDK evaluation documentation. Among the added content, the focus on images and multimodal data stands out, showing an intent to serve a broader range of users by supporting data formats beyond plain text. This direction answers modern needs, where multimedia data matters more and more.

The procedural fixes for serverless connections and model deployment likewise aim at a quick, efficient user experience. These changes should make the steps easier to understand, reduce misunderstandings, and save time.

The table-of-contents fix may look trivial, but it is a deliberate effort to give users exactly the information they are looking for. Taken together, these seemingly small updates improve the user experience and, in turn, raise the value of using Azure AI Studio.

Summary Table

Filename Type Title Status Added Deleted Total
deploy-models-cohere-command.md minor update Updated date information modified 1 1 2
deploy-models-phi-3-5-vision.md minor update Update to document metadata modified 3 3 6
deploy-models-phi-3-vision.md minor update Update to the Phi-3 vision model documentation modified 3 3 6
deploy-models-phi-3.md minor update Update to the Phi-3 family chat models documentation modified 3 3 6
deploy-models-serverless-connect.md minor update Update to the serverless connection documentation modified 2 2 4
deploy-models-serverless.md minor update Update to the serverless API models documentation modified 1 1 2
deploy-models-timegen-1.md minor update Updated reviewer information in the TimeGen model documentation modified 2 2 4
evaluate-sdk.md minor update Major update to the SDK evaluation documentation modified 139 60 199
fine-tune-phi-3.md minor update Updated reviewer information in the Phi-3 fine-tuning documentation modified 3 1 4
toc.yml minor update Updated item name in the table of contents modified 1 1 2
copilot-sdk-evaluate.md minor update Command update: package installation fix modified 1 1 2

Modified Contents

articles/ai-studio/how-to/deploy-models-cohere-command.md

Diff
@@ -5,7 +5,7 @@ description: Learn how to use Cohere Command chat models with Azure AI Foundry.
 ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
-ms.date: 09/23/2024
+ms.date: 12/20/2024
 ms.reviewer: shubhiraj
 reviewer: shubhirajMsft
 ms.author: mopeakande

Summary

{
    "modification_type": "minor update",
    "modification_title": "Updated date information"
}

Explanation

This change updates the date information in the document's metadata. Specifically, the value of the ms.date field changes from 09/23/2024 to 12/20/2024. The correction reflects the latest review of the Cohere Command chat models article and gives readers an accurate date. With one line added and one line removed, the change is minimal but still a meaningful refresh.

articles/ai-studio/how-to/deploy-models-phi-3-5-vision.md

Diff
@@ -5,9 +5,9 @@ description: Learn how to use Phi-3.5 chat model with vision with Azure AI Found
 ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
-ms.date: 08/29/2024
-ms.reviewer: kritifaujdar
-reviewer: fkriti
+ms.date: 12/20/2024
+ms.reviewer: v-vkonjarla
+reviewer: VindyaKonjarla 
 ms.author: mopeakande
 author: msakande
 ms.custom: references_regions, generated

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to document metadata"
}

Explanation

This change updates the metadata of the document on using the Phi-3.5 chat model with vision. Three items changed: the ms.date field moves from 08/29/2024 to 12/20/2024 to reflect the latest review, ms.reviewer changes from kritifaujdar to v-vkonjarla, and reviewer changes from fkriti to VindyaKonjarla. This improves the accuracy and credibility of the document. In total, three lines were added and three removed, for six changed lines.

articles/ai-studio/how-to/deploy-models-phi-3-vision.md

Diff
@@ -5,9 +5,9 @@ description: Learn how to use Phi-3 chat model with vision with Azure AI Foundry
 ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
-ms.date: 09/13/2024
-ms.reviewer: kritifaujdar
-reviewer: fkriti
+ms.date: 12/20/2024
+ms.reviewer: v-vkonjarla
+reviewer: VindyaKonjarla
 ms.author: mopeakande
 author: msakande
 ms.custom: references_regions, generated

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the Phi-3 vision model documentation"
}

Explanation

This change updates the metadata of the document on the Phi-3 chat model with vision. The ms.date field moves from 09/13/2024 to 12/20/2024, ms.reviewer changes from kritifaujdar to v-vkonjarla, and reviewer changes from fkriti to VindyaKonjarla. The goal is to improve the credibility and accuracy of the document. In total, three lines were added and three removed, for six changed lines.

articles/ai-studio/how-to/deploy-models-phi-3.md

Diff
@@ -5,9 +5,9 @@ description: Learn how to use Phi-3 family chat models with Azure AI Foundry.
 ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
-ms.date: 09/18/2024
-ms.reviewer: kritifaujdar
-reviewer: fkriti
+ms.date: 12/20/2024
+ms.reviewer: v-vkonjarla 
+reviewer: VindyaKonjarla
 ms.author: mopeakande
 author: msakande
 ms.custom: references_regions, generated

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the Phi-3 family chat models documentation"
}

Explanation

This change updates the metadata of the documentation for the Phi-3 family chat models. The ms.date field moves from 09/18/2024 to 12/20/2024, ms.reviewer changes from kritifaujdar to v-vkonjarla, and reviewer changes from fkriti to VindyaKonjarla. The update improves the credibility and accuracy of the document. In total, three lines were added and three removed, for six changed lines.

articles/ai-studio/how-to/deploy-models-serverless-connect.md

Diff
@@ -5,7 +5,7 @@ description: Learn how to consume deployed serverless API endpoints from a diffe
 manager: scottpolly
 ms.service: azure-ai-studio
 ms.topic: how-to
-ms.date: 5/21/2024
+ms.date: 12/20/2024
 ms.author: mopeakande
 author: msakande
 ms.reviewer: fasantia
@@ -171,7 +171,7 @@ Follow these steps to create a connection:
 
     # [AI Foundry portal](#tab/azure-ai-studio)
 
-    1. From the left sidebar of your project in AI Foundry portal, select **Management center**.
+    1. From your project in AI Foundry portal, go to the bottom part of the left sidebar and select **Management center**.
 
     1. From the left sidebar of the management center, select **Connected resources**.
     

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the serverless connection documentation"
}

Explanation

This change updates the documentation on consuming serverless API endpoints. The ms.date field moves from 5/21/2024 to 12/20/2024 to reflect the latest review. One procedural step is also reworded: "From the left sidebar of your project in AI Foundry portal, select Management center" becomes "From your project in AI Foundry portal, go to the bottom part of the left sidebar and select Management center." The four changed lines (two added, two removed) make the instructions clearer.

articles/ai-studio/how-to/deploy-models-serverless.md

Diff
@@ -5,7 +5,7 @@ description: Learn to deploy models as serverless APIs, using Azure AI Foundry.
 manager: scottpolly
 ms.service: azure-ai-studio
 ms.topic: how-to
-ms.date: 07/18/2024
+ms.date: 12/20/2024
 ms.author: mopeakande
 author: msakande
 ms.reviewer: fasantia

Summary

{
    "modification_type": "minor update",
    "modification_title": "Update to the serverless API models documentation"
}

Explanation

This change updates part of the documentation on deploying models as serverless APIs with Azure AI Foundry. Specifically, the ms.date field moves from 07/18/2024 to 12/20/2024 to reflect the latest review date. The update amounts to one line added and one removed; refreshes like this matter because they assure readers that the content is current.

articles/ai-studio/how-to/deploy-models-timegen-1.md

Diff
@@ -6,8 +6,8 @@ manager: scottpolly
 ms.service: azure-ai-studio
 ms.topic: how-to
 ms.date: 12/16/2024
-ms.reviewer: kritifaujdar
-reviewer: fkriti
+ms.reviewer: haelhamm
+reviewer: hazemelh
 ms.author: mopeakande
 author: msakande
 ms.custom: references_regions, build-2024, ignite-2024

Summary

{
    "modification_type": "minor update",
    "modification_title": "Updated reviewer information in the TimeGen model documentation"
}

Explanation

This change updates the reviewer information in the TimeGen model documentation. Specifically, the ms.reviewer field changes from kritifaujdar to haelhamm, and the reviewer field changes from fkriti to hazemelh. The document date remains 12/16/2024; only the reviewer information changes. The update provides accurate reviewer attribution and helps maintain the document's credibility. With two lines added and two removed, the change is minor.

articles/ai-studio/how-to/develop/evaluate-sdk.md

Diff
@@ -9,7 +9,7 @@ ms.custom:
   - references_regions
   - ignite-2024
 ms.topic: how-to
-ms.date: 11/19/2024
+ms.date: 12/18/2024
 ms.reviewer: minthigpen
 ms.author: lagayhar
 author: lgayhardt
@@ -63,39 +63,39 @@ Built-in evaluators can accept *either* query and response pairs or a list of co
 
 | Evaluator       | `query`      | `response`      | `context`       | `ground_truth`  | `conversation` |
 |----------------|---------------|---------------|---------------|---------------|-----------|
-|`GroundednessEvaluator`   | Optional: String | Required: String | Required: String | N/A  | Supported |
-| `GroundednessProEvaluator`   | Required: String | Required: String | Required: String | N/A  | Supported |
-| `RetrievalEvaluator`        | Required: String | N/A | Required: String         | N/A           | Supported |
-| `RelevanceEvaluator`      | Required: String | Required: String | N/A | N/A           | Supported |
-| `CoherenceEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported |
-| `FluencyEvaluator`        | N/A  | Required: String | N/A          | N/A           |Supported |
+|`GroundednessEvaluator`   | Optional: String | Required: String | Required: String | N/A  | Supported for text |
+| `GroundednessProEvaluator`   | Required: String | Required: String | Required: String | N/A  | Supported for text |
+| `RetrievalEvaluator`        | Required: String | N/A | Required: String         | N/A           | Supported for text |
+| `RelevanceEvaluator`      | Required: String | Required: String | N/A | N/A           | Supported for text |
+| `CoherenceEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported for text |
+| `FluencyEvaluator`        | N/A  | Required: String | N/A          | N/A           |Supported for text |
 | `SimilarityEvaluator` | Required: String | Required: String | N/A           | Required: String |Not supported |
 |`F1ScoreEvaluator` | N/A  | Required: String | N/A           | Required: String |Not supported |
 | `RougeScoreEvaluator` | N/A | Required: String | N/A           | Required: String           | Not supported |
 | `GleuScoreEvaluator` | N/A | Required: String | N/A           | Required: String           |Not supported |
 | `BleuScoreEvaluator` | N/A | Required: String | N/A           | Required: String           |Not supported |
 | `MeteorScoreEvaluator` | N/A | Required: String | N/A           | Required: String           |Not supported |
-| `ViolenceEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported |
-| `SexualEvaluator`        | Required: String | Required: String | N/A           | N/A           |Supported |
-| `SelfHarmEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported |
-| `HateUnfairnessEvaluator`        | Required: String | Required: String | N/A           | N/A           |Supported |
-| `IndirectAttackEvaluator`      | Required: String | Required: String | Required: String | N/A           |Supported |
-| `ProtectedMaterialEvaluator`  | Required: String | Required: String | N/A           | N/A           |Supported |
+| `ViolenceEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported for text and image |
+| `SexualEvaluator`        | Required: String | Required: String | N/A           | N/A           |Supported for text and image |
+| `SelfHarmEvaluator`      | Required: String | Required: String | N/A           | N/A           |Supported for text and image |
+| `HateUnfairnessEvaluator`        | Required: String | Required: String | N/A           | N/A           |Supported for text and image |
+| `IndirectAttackEvaluator`      | Required: String | Required: String | Required: String | N/A           |Supported for text |
+| `ProtectedMaterialEvaluator`  | Required: String | Required: String | N/A           | N/A           |Supported for text and image |
 | `QAEvaluator`      | Required: String | Required: String | Required: String | Required: String           | Not supported |
-| `ContentSafetyEvaluator`      | Required: String | Required: String |  N/A  | N/A           | Supported |
+| `ContentSafetyEvaluator`      | Required: String | Required: String |  N/A  | N/A           | Supported for text and image |
 
 - Query: the query sent in to the generative AI application
 - Response: the response to the query generated by the generative AI application
 - Context: the source on which generated response is based (that is, the grounding documents)
 - Ground truth: the response generated by user/human as the true answer
 - Conversation: a list of messages of user and assistant turns. See more in the next section.
 
-
 > [!NOTE]
-> AI-assisted quality evaluators except for `SimilarityEvaluator` come with a reason field. They employ techniques including chain-of-thought reasoning to generate an explanation for the score. Therefore they will consume more token usage in generation as a result of improved evaluation quality. Specifically, `max_token` for evaluator generation has been set to 800 for all AI-assisted evaluators (and 1600 for `RetrievalEvaluator` to accommodate for longer inputs.) 
+> AI-assisted quality evaluators except for `SimilarityEvaluator` come with a reason field. They employ techniques including chain-of-thought reasoning to generate an explanation for the score. Therefore they will consume more token usage in generation as a result of improved evaluation quality. Specifically, `max_token` for evaluator generation has been set to 800 for all AI-assisted evaluators (and 1600 for `RetrievalEvaluator` to accommodate for longer inputs.)
+
+#### Conversation support for text
 
-#### Conversation Support
-For evaluators that support conversations, you can provide `conversation` as input, a Python dictionary with a list of `messages` (which include `content`, `role`, and optionally `context`). The following is an example of a two-turn conversation.
+For evaluators that support conversations for text, you can provide `conversation` as input, a Python dictionary with a list of `messages` (which include `content`, `role`, and optionally `context`). The following is an example of a two-turn conversation.
 
 ```json
 {"conversation":
@@ -128,19 +128,98 @@ Our evaluators understand that the first turn of the conversation provides valid
 > [!NOTE]
 > Note that in the second turn, even if `context` is `null` or a missing key, it will be interpreted as an empty string instead of erroring out, which might lead to misleading results. We strongly recommend that you validate your evaluation data to comply with the data requirements.
 
+#### Conversation support for images and multi-modal text and image
+
+For evaluators that support conversations for image and multi-modal image and text, you can pass in image URLs or base64 encoded images in `conversation`.
+
+Following are the examples of supported scenarios:
+
+- Multiple images with text input to image or text generation
+- Text only input to image generations
+- Image only inputs to text generation
+
+```python
+from pathlib import Path
+from azure.ai.evaluation import ContentSafetyEvaluator
+import base64
+
+# instantiate an evaluator with image and multi-modal support
+safety_evaluator = ContentSafetyEvaluator(credential=azure_cred, azure_ai_project=project_scope)
+
+# example of a conversation with an image URL
+conversation_image_url = {
+    "messages": [
+        {
+            "role": "system",
+            "content": [
+                {"type": "text", "text": "You are an AI assistant that understands images."}
+            ],
+        },
+        {
+            "role": "user",
+            "content": [
+                {"type": "text", "text": "Can you describe this image?"},
+                {
+                    "type": "image_url",
+                    "image_url": {
+                        "url": "https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"
+                    },
+                },
+            ],
+        },
+        {
+            "role": "assistant",
+            "content": [
+                {
+                    "type": "text",
+                    "text": "The image shows a man with short brown hair smiling, wearing a dark-colored shirt.",
+                }
+            ],
+        },
+    ]
+}
+
+# example of a conversation with base64 encoded images
+base64_image = ""
+
+with Path.open("Image1.jpg", "rb") as image_file:
+    base64_image = base64.b64encode(image_file.read()).decode("utf-8")
+
+conversation_base64 = {
+    "messages": [
+        {"content": "create an image of a branded apple", "role": "user"},
+        {
+            "content": [{"type": "image_url", "image_url": {"url": f"data:image/jpg;base64,{base64_image}"}}],
+            "role": "assistant",
+        },
+    ]
+}
+
+# run the evaluation on the conversation to output the result
+safety_score = safety_evaluator(conversation=conversation_image_url)
+```
+
+Currently the image and multi-modal evaluators support:
+
+- Single turn only (a conversation can have only 1 user message and 1 assistant message)
+- Conversation can have only 1 system message
+- Conversation payload should be less than 10MB size (including images)
+- Absolute URLs and Base64 encoded images
+- Multiple images in a single turn
+- JPG/JPEG, PNG, GIF file formats
+
 ### Performance and quality evaluators
 
-You can use our built-in AI-assisted and NLP quality evaluators to assess the performance and quality of your generative AI application. 
+You can use our built-in AI-assisted and NLP quality evaluators to assess the performance and quality of your generative AI application.
 
 #### Set up
 
 1. For AI-assisted quality evaluators except for `GroundednessProEvaluator`, you must specify a GPT model to act as a judge to score the evaluation data. Choose a deployment with either GPT-3.5, GPT-4, GPT-4o or GPT-4-mini model for your calculations and set it as your `model_config`. We support both Azure OpenAI or OpenAI model configuration schema. We recommend using GPT models that don't have the `(preview)` suffix for the best performance and parseable responses with our evaluators.
 
-> [!NOTE] 
->  Make sure the you have at least `Cognitive Services OpenAI User` role for the Azure OpenAI resource to make inference calls with API key. For more permissions, learn more about [permissioning for Azure OpenAI resource](../../../ai-services/openai/how-to/role-based-access-control.md#summary).  
-
-2. For `GroundednessProEvaluator`, instead of a GPT deployment in `model_config`, you must provide your `azure_ai_project` information. This accesses the backend evaluation service of your Azure AI project. 
+> [!NOTE]
+> Make sure the you have at least `Cognitive Services OpenAI User` role for the Azure OpenAI resource to make inference calls with API key. For more permissions, learn more about [permissioning for Azure OpenAI resource](../../../ai-services/openai/how-to/role-based-access-control.md#summary).  
 
+2. For `GroundednessProEvaluator`, instead of a GPT deployment in `model_config`, you must provide your `azure_ai_project` information. This accesses the backend evaluation service of your Azure AI project.
 
 #### Performance and quality evaluator usage
 
@@ -193,7 +272,8 @@ print(groundedness_pro_score)
 
 Here's an example of the result for a query and response pair:
 
-For 
+For
+
 ```python
 
 # Evaluation Service-based Groundedness Pro score:
@@ -209,14 +289,16 @@ For
 }
 
 ```
+
 The result of the AI-assisted quality evaluators for a query and response pair is a dictionary containing:
+
 - `{metric_name}` provides a numerical score.
 - `{metric_name}_label` provides a binary label.
 - `{metric_name}_reason` explains why a certain score or label was given for each data point.
 
-For NLP evaluators, only a score is given in the `{metric_name}` key.   
+For NLP evaluators, only a score is given in the `{metric_name}` key.
 
-Like 6 other AI-assisted evaluators, `GroundednessEvaluator` is a prompt-based evaluator that outputs a score on a 5-point scale (the higher the score, the more grounded the result is). On the other hand, `GroundednessProEvaluator` invokes our backend evaluation service powered by Azure AI Content Safety and outputs `True` if all content is grounded, or `False` if any ungrounded content is detected. 
+Like 6 other AI-assisted evaluators, `GroundednessEvaluator` is a prompt-based evaluator that outputs a score on a 5-point scale (the higher the score, the more grounded the result is). On the other hand, `GroundednessProEvaluator` invokes our backend evaluation service powered by Azure AI Content Safety and outputs `True` if all content is grounded, or `False` if any ungrounded content is detected.
 
 We open-source the prompts of our quality evaluators except for `GroundednessProEvaluator` (powered by Azure AI Content Safety) for transparency. These prompts serve as instructions for a language model to perform their evaluation task, which requires a human-friendly definition of the metric and its associated scoring rubrics (what the 5 levels of quality mean for the metric). We highly recommend that users customize the definitions and grading rubrics to their scenario specifics. See details in [Custom Evaluators](#custom-evaluators).
 
@@ -235,7 +317,6 @@ print(groundedness_conv_score)
 
 For conversation outputs, per-turn results are stored in a list and the overall conversation score `'groundedness': 4.0` is averaged over the turns:
 
-
 ```python
 {   'groundedness': 4.0,
     'gpt_groundedness': 4.0,
@@ -248,8 +329,6 @@ For conversation outputs, per-turn results are stored in a list and the overall
 > [!NOTE]
 > We strongly recommend users to migrate their code to use the key without prefixes (for example, `groundedness.groundedness`) to allow your code to support more evaluator models.
 
-
-
 ### Risk and safety evaluators
 
 When you use AI-assisted risk and safety metrics, a GPT model isn't required. Instead of `model_config`, provide your `azure_ai_project` information. This accesses the Azure AI project safety evaluations back-end service, which provisions a GPT model specific to harms evaluation that can generate content risk severity scores and reasoning to enable the safety evaluators.
@@ -260,7 +339,6 @@ Currently AI-assisted risk and safety metrics are only available in the followin
 
 | Region | Hate and unfairness, Sexual, Violent, Self-harm, Indirect attack | Protected material |
 | - | - | - |
-|UK South | Will be deprecated 12/1/24 | N/A |
 |East US 2 | Supported | Supported  |
 |Sweden Central | Supported  | N/A |
 |France Central | Supported | N/A |
@@ -317,20 +395,18 @@ The result of the content safety evaluators for a query and response pair is a d
 - `{metric_name}_score` has a range between 0 and 7 severity level that maps to a severity label given in `{metric_name}`.
 - `{metric_name}_reason` explains why a certain severity score was given for each data point.
 
-
 For conversation outputs, per-turn results are stored in a list and the overall conversation score `'violence_score': 0.0` is averaged over the turns:
 
-```
+```python
 {
     'violence_score': 0.0,
     'evaluation_per_turn': {'violence': ['Very low', 'Very low'],
     'violence_score': [0, 0],
     'violence_reason': ["The system's response is a straightforward answer to a question about waterproof tents. There is no mention of violence, harm, or any related content. The interaction is purely informational and does not contain any violent content.",
-   "The system's response does not contain any violent content. It simply provides a price in response to the human's question. There is no mention or depiction of violence, harm, or any related themes."]}
+   "The system's response does not contain any violent content. It simply provides a price in response to the human's question. There is no mention or depiction of violence, harm, or any related themes."]
 }
 ```
 
-
 #### Evaluating direct and indirect attack jailbreak vulnerability
 
 We support evaluating vulnerability towards the following types of jailbreak attacks:
@@ -372,6 +448,7 @@ class AnswerLengthEvaluator:
     def __call__(self, *, answer: str, **kwargs):
         return {"answer_length": len(answer)}
 ```
+
 Then run the evaluator on a row of data by importing a callable class:
 
 ```python
@@ -391,7 +468,6 @@ The result:
 {"answer_length":27}
 ```
 
-
 ### Prompt-based evaluators
 
 To build your own prompt-based large language model evaluator or AI-assisted annotator, you can create a custom evaluator based on a **Prompty** file. Prompty is a file with `.prompty` extension for developing prompt template. The Prompty asset is a markdown file with a modified front matter. The front matter is in YAML format that contains many metadata fields that define model configuration and expected inputs of the Prompty. Let's create a custom evaluator `FriendlinessEvaluator` to measure friendliness of a response.
@@ -496,22 +572,18 @@ Here's the result:
 
 After you spot-check your built-in or custom evaluators on a single row of data, you can combine multiple evaluators with the `evaluate()` API on an entire test dataset.
 
-
 ### Prerequisites
 
-If you want to enable logging and tracing to your Azure AI project for evaluation results, follow these steps:
+If you want to enable logging to your Azure AI project for evaluation results, follow these steps:
 
 1. Make sure you're first logged in by running `az login`.
-2. Install the following sub-package:
 
-```python
-pip install azure-ai-evaluation[remote]
-```
-3. Make sure you have the [Identity-based access](../secure-data-playground.md#prerequisites) setting for the storage account in your Azure AI hub. To find your storage, go to the Overview page of your Azure AI hub and select Storage.
+2. Make sure you have the [Identity-based access](../secure-data-playground.md#prerequisites) setting for the storage account in your Azure AI hub. To find your storage, go to the Overview page of your Azure AI hub and select Storage.
 
-4. Make sure you have `Storage Blob Data Contributor` role for the storage account.
+3. Make sure you have `Storage Blob Data Contributor` role for the storage account.
 
 ### Local evaluation on datasets
+
 In order to ensure the `evaluate()` can correctly parse the data, you must specify column mapping to map the column from the dataset to key words that are accepted by the evaluators. In this case, we specify the data mapping for `query`, `response`, and `context`.
 
 ```python
@@ -672,9 +744,9 @@ result = evaluate(
 
 After local evaluations of your generative AI applications, you may want to run evaluations in the cloud for pre-deployment testing, and [continuously evaluate](https://aka.ms/GenAIMonitoringDoc) your applications for post-deployment monitoring. Azure AI Projects SDK offers such capabilities via a Python API and supports almost all of the features available in local evaluations. Follow the steps below to submit your evaluation to the cloud on your data using built-in or custom evaluators.
 
-  
 ### Prerequisites
-- Azure AI project in the same [regions](#region-support) as risk and safety evaluators. If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one. 
+
+- Azure AI project in the same [regions](#region-support) as risk and safety evaluators. If you don't have an existing project, follow the guide [How to create Azure AI project](../create-projects.md?tabs=ai-studio) to create one.
 
 > [!NOTE]
 > Cloud evaluations do not support `ContentSafetyEvaluator`, and `QAEvaluator`.
@@ -686,17 +758,22 @@ After local evaluations of your generative AI applications, you may want to run
 ### Installation Instructions
 
 1. Create a **virtual Python environment of you choice**. To create one using conda, run the following command:
+
     ```bash
     conda create -n cloud-evaluation
     conda activate cloud-evaluation
     ```
+
 2. Install the required packages by running the following command:
+
     ```bash
    pip install azure-identity azure-ai-projects azure-ai-ml
     ```
+
     Optionally you can `pip install azure-ai-evaluation` if you want a code-first experience to fetch evaluator ID for built-in evaluators in code.
 
 Now you can define a client and a deployment which will be used to run your evaluations in the cloud:
+
 ```python
 
 import os, time
@@ -717,36 +794,43 @@ project_client = AIProjectClient.from_connection_string(
 ```
 
 ### Uploading evaluation data
-We provide two ways to register your data in Azure AI project required for evaluations in the cloud: 
-1. **From SDK**: Upload new data from your local directory to your Azure AI project in the SDK, and fetch the dataset ID as a result: 
+
+We provide two ways to register your data in Azure AI project required for evaluations in the cloud:
+
+1. **From SDK**: Upload new data from your local directory to your Azure AI project in the SDK, and fetch the dataset ID as a result:
+
 ```python
 data_id, _ = project_client.upload_file("./evaluate_test_data.jsonl")
 ```
+
 **From UI**: Alternatively, you can upload new data or update existing data versions by following the UI walkthrough under the **Data** tab of your Azure AI project.
 
-2. Given existing datasets uploaded to your Project: 
-- **From SDK**: if you already know the dataset name you created, construct the dataset ID in this format: `/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/data/<dataset-name>/versions/<version-number>`
+2. Given existing datasets uploaded to your Project:
 
-- **From UI**: If you don't know the dataset name, locate it under the **Data** tab of your Azure AI project and construct the dataset ID as in the format above. 
+- **From SDK**: if you already know the dataset name you created, construct the dataset ID in this format: `/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<project-name>/data/<dataset-name>/versions/<version-number>`
 
+- **From UI**: If you don't know the dataset name, locate it under the **Data** tab of your Azure AI project and construct the dataset ID as in the format above.
 
 ### Specifying evaluators from Evaluator library
+
 We provide a list of built-in evaluators registered in the [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under **Evaluation** tab of your Azure AI project. You can also register custom evaluators and use them for Cloud evaluation. We provide two ways to specify registered evaluators:
 
 #### Specifying built-in evaluators
+
 - **From SDK**: Use built-in evaluator `id` property supported by `azure-ai-evaluation` SDK:
+
 ```python
 from azure.ai.evaluation import F1ScoreEvaluator, RelevanceEvaluator, ViolenceEvaluator
 print("F1 Score evaluator id:", F1ScoreEvaluator.id)
 ```
 
 - **From UI**: Follows these steps to fetch evaluator ids after they're registered to your project:
-    - Select **Evaluation** tab in your Azure AI project;
-    - Select Evaluator library;
-    - Select your evaluators of choice by comparing the descriptions;
-    - Copy its "Asset ID" which will be your evaluator id, for example, `azureml://registries/azureml/models/Groundedness-Evaluator/versions/1`.
+  - Select **Evaluation** tab in your Azure AI project;
+  - Select Evaluator library;
+  - Select your evaluators of choice by comparing the descriptions;
+  - Copy its "Asset ID" which will be your evaluator id, for example, `azureml://registries/azureml/models/Groundedness-Evaluator/versions/1`.
 
-#### Specifying custom evaluators 
+#### Specifying custom evaluators
 
 - For code-based custom evaluators, register them to your Azure AI project and fetch the evaluator ids with the following:
 
@@ -793,7 +877,6 @@ After registering your custom evaluator to your Azure AI project, you can view i
 
 - For prompt-based custom evaluators, use this snippet to register them. For example, let's register our `FriendlinessEvaluator` built as described in [Prompt-based evaluators](#prompt-based-evaluators):
 
-
 ```python
 # Import your prompt-based custom evaluator
 from friendliness.friend import FriendlinessEvaluator
@@ -836,11 +919,8 @@ versioned_evaluator = ml_client.evaluators.get(evaluator_name, version=1)
 print("Versioned evaluator id:", registered_evaluator.id)
 ```
 
-
-
 After logging your custom evaluator to your Azure AI project, you can view it in your [Evaluator library](../evaluate-generative-ai-app.md#view-and-manage-the-evaluators-in-the-evaluator-library) under **Evaluation** tab of your Azure AI project.
 
-
 ### Cloud evaluation with Azure AI Projects SDK
 
 You can submit a cloud evaluation with Azure AI Projects SDK via a Python API. See the following example to submit a cloud evaluation of your dataset using an NLP evaluator (F1 score), an AI-assisted quality evaluator (Relevance), a safety evaluator (Violence) and a custom evaluator. Putting it altogether:
@@ -933,7 +1013,6 @@ evaluation = client.evaluations.create(
 )
 ```
 
-
 ## Related content
 
 - [Azure Python reference documentation](https://aka.ms/azureaieval-python-ref)

Summary

{
    "modification_type": "minor update",
    "modification_title": "Major update to the SDK evaluation documentation"
}

Explanation

This change is a substantial update to the SDK evaluation documentation for Azure AI Studio: 139 lines added and 60 removed, for 199 changed lines in total. The main changes are an updated evaluator reference that details the input formats and parameters each evaluator supports, and new coverage of evaluating images and multimodal (text and image) data, complete with concrete code examples. Developers can now include image data in their evaluations, not just text, which improves both usability and flexibility. This update matters because it ensures users can take proper advantage of the latest features and settings.
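
To make the newly documented flow concrete, here is a minimal sketch of running a content safety evaluation over an image conversation, condensed from the example added in the diff above. The credential and the azure_ai_project values (subscription, resource group, project name) are placeholders for illustration; the evaluator class and the conversation format follow the updated document.

```python
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ContentSafetyEvaluator

# Placeholder project details -- replace with your own Azure AI project.
azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

# ContentSafetyEvaluator is listed as supporting text and image conversations.
safety_evaluator = ContentSafetyEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

# Single-turn conversation with an image URL, in the format shown in the diff:
# one user message and one assistant message, each with a content list.
conversation = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Can you describe this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": "The image shows a person smiling outdoors."}],
        },
    ]
}

# Returns severity labels, scores, and reasons for each content safety metric.
result = safety_evaluator(conversation=conversation)
print(result)
```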

articles/ai-studio/how-to/fine-tune-phi-3.md

Diff
@@ -4,9 +4,11 @@ titleSuffix: Azure AI Foundry
 description: This article introduces fine-tuning Phi-3 models in Azure AI Foundry portal.
 manager: scottpolly
 ms.service: azure-ai-studio
-ms.custom:
+ms.custom: references_regions
 ms.topic: how-to
 ms.date: 12/16/2024
+ms.reviewer: v-vkonjarla
+reviewer: VindyaKonjarla
 ms.author: ssalgado
 author: ssalgadodev
 ---

Summary

{
    "modification_type": "minor update",
    "modification_title": "Updated reviewer information in the Phi-3 fine-tuning documentation"
}

Explanation

This change is a minor update to the documentation on fine-tuning Phi-3 models in Azure AI Foundry. The reviewer information is updated, with the ms.reviewer field set to v-vkonjarla and the reviewer field to VindyaKonjarla, and references_regions is added to the ms.custom field, making that metadata more explicit. Changes like these improve the document's accuracy and give users trustworthy information. The content changes are small overall, but the updated reviewer information helps maintain the document's credibility.

articles/ai-studio/toc.yml

Diff
@@ -73,7 +73,7 @@ items:
       - name: Use your image data with Azure OpenAI
         href: how-to/data-image-add.md
         displayName: vision, gpt, turbo
-    - name: Azure Speech
+    - name: Azure AI Speech
       items:
       - name: Real-time speech to text
         href: ../ai-services/speech-service/get-started-speech-to-text.md?context=/azure/ai-studio/context/context

Summary

{
    "modification_type": "minor update",
    "modification_title": "Updated item name in the table of contents"
}

Explanation

This change is a minor update to the Azure AI Studio table-of-contents file (toc.yml). Specifically, the entry named "Azure Speech" is renamed to "Azure AI Speech". The rename reflects the service's correct name and gives users clearer information, helping them avoid errors and confusion when browsing the table of contents. Only two lines change in total, but the fix plays an important role in keeping item names accurate.

articles/ai-studio/tutorials/copilot-sdk-evaluate.md

Diff
@@ -99,7 +99,7 @@ In Part 1 of this tutorial series, you created an **.env** file that specifies t
 1. Install the required package:
 
     ```bash
-    pip install azure_ai-evaluation[prompts]
+    pip install azure_ai-evaluation
     ```
 
 1. Now run the evaluation script:

Summary

{
    "modification_type": "minor update",
    "modification_title": "Command update: package installation fix"
}

Explanation

This change is a minor update to the Azure AI Studio tutorial article copilot-sdk-evaluate.md. Specifically, the Python package installation command is corrected: "pip install azure_ai-evaluation[prompts]" becomes "pip install azure_ai-evaluation", so users no longer need to specify an unnecessary extra when installing the package and the step becomes simpler and clearer. Overall, this change improves the tutorial's reliability and makes it easier to follow.
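
As a quick sanity check after running the corrected command, a sketch like the following can confirm that the package is installed. Note that pip normalizes underscores and hyphens, so azure_ai-evaluation resolves to the azure-ai-evaluation distribution on PyPI; the evaluator names imported here are taken from the SDK documentation discussed above, not from the tutorial script itself.

```python
# Minimal sanity check for the corrected install step:
#   pip install azure_ai-evaluation   (resolves to the azure-ai-evaluation distribution)
from importlib import metadata

print("azure-ai-evaluation version:", metadata.version("azure-ai-evaluation"))

# Evaluators such as these (referenced in the evaluation SDK docs above)
# import from the azure.ai.evaluation namespace once the package is installed.
from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator  # noqa: F401
```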