Unterstützte Datensatzformate für Bring-Your-Own-Dataset (BYOD-) Aufgaben

Für die Typen „Benutzerdefinierter Punktezähler“ und LLM-as-judge „Auswertung“ ist eine JSONL-Datei mit benutzerdefiniertem Datensatz erforderlich, die sich in S3 befindet. AWS Sie müssen die Datei als JSON Lines-Datei bereitstellen, die einem der folgenden unterstützten Formate entspricht. Die Beispiele in diesem Dokument wurden aus Gründen der Übersichtlichkeit erweitert.

Jedes Format hat seine eigenen Nuancen, aber mindestens alle erfordern eine Benutzereingabe.

Pflichtfelder
Feld	Erforderlich
Benutzeraufforderung	Ja
System-Prompt	Nein
Grundlegende Wahrheit	Nur für Custom Scorer
Kategorie	Nein

1. OpenAI-Format


{
    "messages": [
        {
            "role": "system",    # System prompt (looks for system role)
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",       # Query (looks for user role)
            "content": "Hello!"
        },
        {
            "role": "assistant",  # Ground truth (looks for assistant role)
            "content": "Hello to you!"
        }
    ]
}

2. SageMaker Bewertung


{
   "system":"You are an English major with top marks in class who likes to give minimal word responses: ",
   "query":"What is the symbol that ends the sentence as a question",
   "response":"?", # Ground truth
   "category": "Grammar"
}

3. HuggingFace Sofortiger Abschluss

Sowohl Standard- als auch Konversationsformate werden unterstützt.


# Standard

{
    "prompt" : "What is the symbol that ends the sentence as a question", # Query
    "completion" : "?" # Ground truth
}

# Conversational
{
    "prompt": [
        {
            "role": "user", # Query (looks for user role)
            "content": "What is the symbol that ends the sentence as a question"
        }
    ],
    "completion": [
        {
            "role": "assistant", # Ground truth (looks for assistant role)
            "content": "?"
        }
    ]
}

4. HuggingFace Präferenz

Support sowohl für das Standardformat (Zeichenfolge) als auch für das Konversationsformat (Nachrichtenarray).


# Standard: {"prompt": "text", "chosen": "text", "rejected": "text"}
{
     "prompt" : "The sky is", # Query
     "chosen" : "blue", # Ground truth
     "rejected" : "green"
}

# Conversational:
{
    "prompt": [
        {
            "role": "user", # Query (looks for user role)
            "content": "What color is the sky?"
        }
    ],
    "chosen": [
        {
            "role": "assistant", # Ground truth (looks for assistant role)
            "content": "It is blue."
        }
    ],
    "rejected": [
        {
            "role": "assistant",
            "content": "It is green."
        }
    ]
}

5. Verl-Format

Das Verl-Format (sowohl aktuelle als auch ältere Formate) wird für Reinforcement-Learning-Anwendungsfälle unterstützt. Verl-Dokumente als Referenz: https://verl.readthedocs.io/en/latest/preparation/prepare_data.html

Benutzer des VERL-Formats geben in der Regel keine Ground-Truth-Antwort. Wenn Sie trotzdem eine angeben möchten, verwenden Sie eines der Felder extra_info.answer oderreward_model.ground_truth; extra_info hat Vorrang.

SageMaker behält die folgenden VERL-specific Felder als Metadaten bei, falls vorhanden:

id
data_source
ability
reward_model
extra_info
attributes
difficulty


# Newest VERL format where `prompt` is an array of messages.
{
  "data_source": "openai/gsm8k",
  "prompt": [
    {
      "content": "You are a helpful math tutor who explains solutions to questions step-by-step.",
      "role": "system"
    },
    {
      "content": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? Let's think step by step and output the final answer after \"####\".",
      "role": "user"
    }
  ],
  "ability": "math",
  "extra_info": {
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72",
    "index": 0,
    "question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
    "split": "train"
  },
  "reward_model": {
    "ground_truth": "72" # Ignored in favor of extra_info.answer
  }
}

# Legacy VERL format where `prompt` is a string. Also supported.
{
  "data_source": "openai/gsm8k",
  "prompt": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? Let's think step by step and output the final answer after \"####\".",
  "extra_info": {
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72",
    "index": 0,
    "question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?",
    "split": "train"
  }
}

Warnung JavaScript ist in Ihrem Browser nicht verfügbar oder deaktiviert.

Zur Nutzung der AWS-Dokumentation muss JavaScript aktiviert sein. Weitere Informationen finden auf den Hilfe-Seiten Ihres Browsers.

Dokumentkonventionen

Formate für Bewertungsmetriken

Evaluieren Sie mit voreingestellten und benutzerdefinierten Punktezählern