Supported Dataset Formats for Bring-Your-Own-Dataset (BYOD) Tasks

The Custom Scorer and LLM-as-judge evaluation types require a custom JSONL dataset file stored in Amazon S3. You must provide the file as a JSON Lines file that follows one of the supported formats below. The examples in this document are expanded across multiple lines for clarity.
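Because the examples here are expanded for readability, they need to be collapsed to one JSON object per line before they form a JSON Lines file. The `#` annotations in the examples are explanatory only and are not valid JSON, so they must not appear in the file. The following is a minimal sketch of that serialization step; the function name `to_jsonl` is hypothetical, and uploading the result to S3 is not shown.

```python
import json


def to_jsonl(records):
    """Serialize each record as compact JSON on its own line (JSON Lines)."""
    # json.dumps without `indent` never emits raw newlines, so each record
    # stays on exactly one line.
    return "".join(json.dumps(record) + "\n" for record in records)


# Two illustrative records in the flat query/response layout.
records = [
    {"query": "What is the symbol that ends the sentence as a question", "response": "?"},
    {"query": "What color is the sky?", "response": "It is blue."},
]

jsonl_text = to_jsonl(records)
```

Each line of `jsonl_text` can be parsed independently with `json.loads`, which is a quick way to confirm that the file is well formed before you upload it.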

Each format has its own nuances, but at minimum every format requires a user prompt.

Required Fields
Field	Required
User prompt	Yes
System prompt	No
Ground truth	Only for Custom Scorer
Category	No
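The requirements in the table can be checked before you submit a dataset. The sketch below assumes each record has already been reduced to a dictionary with the hypothetical keys `user_prompt` and `ground_truth`; those key names and the function `missing_required` are illustrative and are not part of any SageMaker API.

```python
def missing_required(record, custom_scorer):
    """Return the names of required fields that are absent or empty.

    `custom_scorer` is True for Custom Scorer evaluations, which also
    need a ground truth; only the user prompt is required otherwise.
    """
    required = ["user_prompt"]
    if custom_scorer:
        required.append("ground_truth")
    return [name for name in required if not record.get(name)]


record = {"user_prompt": "Hello!"}
problems_for_custom_scorer = missing_required(record, custom_scorer=True)
problems_for_llm_judge = missing_required(record, custom_scorer=False)
```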

1. OpenAI Format


{
    "messages": [
        {
            "role": "system",    # System prompt (looks for system role)
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",       # Query (looks for user role)
            "content": "Hello!"
        },
        {
            "role": "assistant",  # Ground truth (looks for assistant role)
            "content": "Hello to you!"
        }
    ]
}
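As the annotations indicate, the system prompt, query, and ground truth are located by message role. The sketch below illustrates that lookup. It assumes the first message of each role is the one used, which this page does not state, and the function name `parse_openai_record` and the output keys are hypothetical.

```python
def parse_openai_record(record):
    """Pick the system prompt, query, and ground truth out of `messages` by role."""
    found = {"system": None, "user": None, "assistant": None}
    for message in record["messages"]:
        role = message.get("role")
        # Assumption: keep only the first message seen for each role.
        if role in found and found[role] is None:
            found[role] = message.get("content")
    return {
        "system_prompt": found["system"],
        "query": found["user"],
        "ground_truth": found["assistant"],
    }


example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hello to you!"},
    ]
}
parsed = parse_openai_record(example)
```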

2. SageMaker Evaluation


{
   "system":"You are an English major with top marks in class who likes to give minimal word responses: ",
   "query":"What is the symbol that ends the sentence as a question",
   "response":"?", # Ground truth
   "category": "Grammar"
}

3. HuggingFace Prompt Completion

Both the Standard and Conversational formats are supported.


# Standard

{
    "prompt" : "What is the symbol that ends the sentence as a question", # Query
    "completion" : "?" # Ground truth
}

# Conversational
{
    "prompt": [
        {
            "role": "user", # Query (looks for user role)
            "content": "What is the symbol that ends the sentence as a question"
        }
    ],
    "completion": [
        {
            "role": "assistant", # Ground truth (looks for assistant role)
            "content": "?"
        }
    ]
}
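Since `prompt` and `completion` can each be either a string or an array of messages, a reader for this format has to handle both shapes. A minimal sketch, assuming the first message with the matching role is the one used; the helper names `first_content` and `parse_prompt_completion` are hypothetical.

```python
def first_content(value, role):
    """Return `value` if it is a string, else the first message content for `role`."""
    if isinstance(value, str):
        return value
    for message in value:
        if message.get("role") == role:
            return message.get("content")
    return None


def parse_prompt_completion(record):
    """Extract the query and optional ground truth from either variant."""
    return {
        "query": first_content(record["prompt"], "user"),
        # A record without `completion` yields no ground truth.
        "ground_truth": first_content(record.get("completion", []), "assistant"),
    }


question = "What is the symbol that ends the sentence as a question"
standard = {"prompt": question, "completion": "?"}
conversational = {
    "prompt": [{"role": "user", "content": question}],
    "completion": [{"role": "assistant", "content": "?"}],
}
```

Both `parse_prompt_completion(standard)` and `parse_prompt_completion(conversational)` produce the same query and ground truth.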

4. HuggingFace Preference

Both the standard format (strings) and the conversational format (arrays of messages) are supported.


# Standard: {"prompt": "text", "chosen": "text", "rejected": "text"}
{
     "prompt" : "The sky is", # Query
     "chosen" : "blue", # Ground truth
     "rejected" : "green"
}

# Conversational:
{
    "prompt": [
        {
            "role": "user", # Query (looks for user role)
            "content": "What color is the sky?"
        }
    ],
    "chosen": [
        {
            "role": "assistant", # Ground truth (looks for assistant role)
            "content": "It is blue."
        }
    ],
    "rejected": [
        {
            "role": "assistant",
            "content": "It is green."
        }
    ]
}
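The two variants carry the same three fields and differ only in whether each value is a plain string or a single-message array. The sketch below shows that relationship by wrapping a standard record into the conversational shape. The function name `to_conversational` is hypothetical, and converting between variants is not required, because both are accepted as-is.

```python
def to_conversational(record):
    """Wrap the string fields of a standard preference record in message arrays."""

    def wrap(text, role):
        return [{"role": role, "content": text}]

    return {
        "prompt": wrap(record["prompt"], "user"),
        "chosen": wrap(record["chosen"], "assistant"),      # ground truth
        "rejected": wrap(record["rejected"], "assistant"),
    }


standard = {"prompt": "The sky is", "chosen": "blue", "rejected": "green"}
conversational = to_conversational(standard)
```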

5. Verl Format

The Verl format (both the current and the legacy version) is supported for reinforcement learning use cases. For reference, see the Verl documentation: https://verl.readthedocs.io/en/latest/preparation/prepare_data.html

Users of the Verl format typically do not provide a ground truth response. If you want to provide one, use either the extra_info.answer field or the reward_model.ground_truth field; if both are present, extra_info.answer takes precedence.
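The precedence rule can be expressed as a short lookup. This is a sketch of the rule as described here, not the service's actual implementation, and the function name `verl_ground_truth` is hypothetical.

```python
def verl_ground_truth(record):
    """Return extra_info.answer if set, else reward_model.ground_truth, else None."""
    extra_info = record.get("extra_info") or {}
    if extra_info.get("answer") is not None:
        return extra_info["answer"]
    reward_model = record.get("reward_model") or {}
    return reward_model.get("ground_truth")


both = {
    "extra_info": {"answer": "#### 72"},
    "reward_model": {"ground_truth": "72"},
}
only_reward_model = {"reward_model": {"ground_truth": "72"}}
neither = {"prompt": "How many clips did Natalia sell altogether?"}
```

Here `verl_ground_truth(both)` returns the `extra_info` answer, `verl_ground_truth(only_reward_model)` falls back to `"72"`, and `verl_ground_truth(neither)` returns `None`.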

SageMaker preserves the following Verl-specific fields as metadata when they are present:

id
data_source
ability
reward_model
extra_info
attributes
difficulty
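To preview which parts of a record would be carried along as metadata, you can filter it against the list above. A minimal sketch; the names `VERL_METADATA_FIELDS` and `verl_metadata` are hypothetical, and how SageMaker stores the metadata internally is not described here.

```python
# Field names taken from the list above.
VERL_METADATA_FIELDS = (
    "id",
    "data_source",
    "ability",
    "reward_model",
    "extra_info",
    "attributes",
    "difficulty",
)


def verl_metadata(record):
    """Keep only the Verl-specific fields that are present in the record."""
    return {key: record[key] for key in VERL_METADATA_FIELDS if key in record}


record = {
    "data_source": "openai/gsm8k",
    "prompt": "How many clips did Natalia sell altogether in April and May?",
    "ability": "math",
    "extra_info": {"index": 0, "split": "train"},
}
metadata = verl_metadata(record)
```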


# Newest VERL format where `prompt` is an array of messages.
{
  "data_source": "openai/gsm8k",
  "prompt": [
    {
      "content": "You are a helpful math tutor who explains solutions to questions step-by-step.",
      "role": "system"
    },
    {
      "content": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? Let's think step by step and output the final answer after \"####\".",
      "role": "user"
    }
  ],
  "ability": "math",
  "extra_info": {
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72",
    "index": 0,
    "question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
    "split": "train"
  },
  "reward_model": {
    "ground_truth": "72" # Ignored in favor of extra_info.answer
  }
}

# Legacy VERL format where `prompt` is a string. Also supported.
{
  "data_source": "openai/gsm8k",
  "prompt": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? Let's think step by step and output the final answer after \"####\".",
  "extra_info": {
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72",
    "index": 0,
    "question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
    "split": "train"
  }
}
