The maximum number of tokens to generate in the model response during evaluation.
The list of stop sequences. Generation halts as soon as the model produces any one of them.
The temperature value that controls randomness in the model's responses. Lower values produce more deterministic outputs; higher values produce more varied ones.
The top-p (nucleus) sampling parameter that controls the diversity of the model's responses. At each step, sampling is restricted to the smallest set of most likely tokens whose cumulative probability reaches top-p.
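
To make the interaction between these four settings concrete, here is a minimal sketch of a toy sampling loop. It is an illustration under stated assumptions, not the evaluation framework's actual implementation: the function names (`apply_temperature`, `top_p_filter`, `generate`), the `next_logits` callback, and the choice to cut the stop sequence out of the returned text are all hypothetical. Real APIs differ on details such as whether the stop sequence is included in the output.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Softmax over logits / temperature (temperature must be > 0).

    Lower temperatures sharpen the distribution toward the top token.
    """
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for tok, p in ranked:
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

def generate(next_logits, max_tokens, stop_sequences, temperature, top_p, rng):
    """Sample up to max_tokens tokens, halting early on any stop sequence.

    next_logits is a hypothetical callback: text so far -> {token: logit}.
    Assumption: the stop sequence itself is cut from the returned text.
    """
    text = ""
    for _ in range(max_tokens):
        probs = top_p_filter(apply_temperature(next_logits(text), temperature), top_p)
        tokens = list(probs)
        text += rng.choices(tokens, weights=[probs[t] for t in tokens])[0]
        for stop in stop_sequences:
            index = text.find(stop)
            if index != -1:
                return text[:index]
    return text

def toy_model(text):
    """Hypothetical stand-in for a model: always strongly prefers "ab"."""
    return {"ab": 5.0, "c": 0.0}

sample = generate(toy_model, max_tokens=3, stop_sequences=[],
                  temperature=1.0, top_p=0.5, rng=random.Random(0))
```

In this sketch the token limit and the stop sequences only decide when generation ends, while temperature and top-p reshape the distribution each token is drawn from. Temperature is applied before the top-p cutoff here, which is a common ordering but still an assumption about any particular model server.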