RftHyperParameters

Hyperparameters for controlling the reinforcement fine-tuning (RFT) training process, including learning settings such as batch size and learning rate, and the evaluation interval.

Types

class Builder
object Companion

Properties

Number of training samples processed in each batch during RFT training. Larger batches may improve training stability.

Number of training epochs to run during reinforcement fine-tuning. Higher values may improve performance but increase training time.

Interval between evaluation runs during RFT training, measured in training steps. More frequent evaluation provides better monitoring.

Maximum number of tokens the model can generate in response to each prompt during RFT training.

Learning rate for reinforcement fine-tuning. Controls how quickly the model adapts to reward signals.

Maximum length of input prompts during RFT training, measured in tokens. Longer prompts allow more context but increase memory usage and training time.

Level of reasoning effort applied during RFT training. Higher values may improve response quality but increase training time.

Number of response samples generated per prompt during RFT training. More samples provide better reward signal estimation.
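
A minimal configuration sketch using the nested Builder, assuming the Companion exposes a builder() factory. The setter names below (batchSize, nEpochs, evalInterval, learningRate, maxCompletionTokens) are hypothetical, inferred from the property descriptions above rather than confirmed signatures of this class:

// Hypothetical builder usage; setter names are inferred, not confirmed.
val hyperParams = RftHyperParameters.builder()
    .batchSize(16L)             // samples processed per batch
    .nEpochs(4L)                // full passes over the training data
    .evalInterval(10L)          // evaluate every 10 training steps
    .learningRate(1e-5)         // how quickly the model adapts to reward signals
    .maxCompletionTokens(1024L) // cap on tokens generated per prompt
    .build()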

Functions

open operator override fun equals(other: Any?): Boolean
open override fun hashCode(): Int
open override fun toString(): String
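
Because equals and hashCode are overridden, equality is presumably value-based: two instances built from the same values should compare equal and hash identically. A brief illustration, reusing the hypothetical builder sketch above:

val a = RftHyperParameters.builder().batchSize(16L).build()
val b = RftHyperParameters.builder().batchSize(16L).build()
check(a == b)                       // value-based equality via equals
check(a.hashCode() == b.hashCode()) // consistent hashing for map keys
println(a)                          // human-readable summary via toString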