The maximum number of tokens to generate in the response.
The sampling temperature, which controls randomness in the generated response. Lower values make the output more deterministic; higher values make it more varied.
The top-K sampling parameter, which restricts token selection to the K most probable tokens at each step.
The top-P (nucleus) sampling parameter, which restricts token selection to the smallest set of tokens whose cumulative probability reaches P.
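
As a rough illustration of how temperature, top-K, and top-P usually interact, here is a minimal sketch in plain Python. The helper name `filtered_probs` and the order of the filters (temperature, then top-K, then top-P) are assumptions made for this example. They are not taken from this API, and a real implementation may apply the filters differently.

```python
import math


def filtered_probs(logits, temperature=1.0, top_k=None, top_p=None):
    """Hypothetical helper: return the renormalized sampling distribution
    after temperature scaling, top-K filtering, and top-P filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: divide logits before the softmax. Values below 1
    # sharpen the distribution; values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracted for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-K: keep only the K most probable tokens.
    if top_k is not None:
        order = order[:top_k]

    # Top-P: keep the shortest prefix whose cumulative probability reaches P.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    # Renormalize over the surviving tokens; all others get probability 0.
    mass = sum(probs[i] for i in order)
    result = [0.0] * len(probs)
    for i in order:
        result[i] = probs[i] / mass
    return result


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0, -1.0]
    print(filtered_probs(example_logits, temperature=0.7, top_k=3, top_p=0.9))
```

The returned list can be passed as `weights` to `random.choices` to draw a token index. The maximum-token limit is separate from this step: it caps how many times the sampling loop runs.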