Reward function API¶
Reward files should expose a callable named reward_fn:
def reward_fn(example, completions) -> list[float]:
...
Parameters:
exampleThe source record returned by the dataset loader.
completionsThe generated completions to score.
Return value:
list[float]One reward score per completion.
Keep this API stable for task code. If the reward needs extra metadata, add it through the dataset loader record rather than through global state.