d3rlpy.metrics.scorer.evaluate_on_environment

d3rlpy.metrics.scorer.evaluate_on_environment(env, n_trials=10, epsilon=0.0, render=False)

Returns a scorer function that evaluates a policy on an environment.

The returned scorer follows the standard scikit-learn scorer function style. It runs the policy in the given environment and returns the mean episode return. Because it measures the return actually obtained in the environment, this is the ideal metric for evaluating the resulting policy.

import gym

from d3rlpy.algos import DQN
from d3rlpy.metrics.scorer import evaluate_on_environment


env = gym.make('CartPole-v0')

scorer = evaluate_on_environment(env)

dqn = DQN()

# train or load the model here before evaluating it
mean_episode_return = scorer(dqn)
Parameters
  • env (gym.core.Env) – Gym-style environment.

  • n_trials (int) – number of evaluation episodes (trials) to run.

  • epsilon (float) – probability of taking a random action under the epsilon-greedy policy.

  • render (bool) – whether to render the environment during evaluation.

Returns

scorer function.

Return type

Callable[[…], float]
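To make the behaviour described above concrete, here is a minimal sketch of how such a scorer could work. It is not the library's source code: make_env_scorer, CountdownEnv, ToyActionSpace and AlwaysOne are hypothetical names invented for illustration, and the toy environment only mimics the classic Gym interface (reset() returning an observation, step() returning a 4-tuple). Rendering is omitted.

```python
import random
from typing import Any, Callable, List


def make_env_scorer(env: Any, n_trials: int = 10,
                    epsilon: float = 0.0) -> Callable[..., float]:
    """Hypothetical sketch of an environment scorer (not d3rlpy's code)."""

    def scorer(algo: Any, *args: Any) -> float:
        episode_returns: List[float] = []
        for _ in range(n_trials):
            observation = env.reset()
            episode_return = 0.0
            done = False
            while not done:
                # epsilon-greedy: random action with probability epsilon
                if random.random() < epsilon:
                    action = env.action_space.sample()
                else:
                    action = algo.predict([observation])[0]
                observation, reward, done, _ = env.step(action)
                episode_return += reward
            episode_returns.append(episode_return)
        # the score is the mean return over all trials
        return sum(episode_returns) / len(episode_returns)

    return scorer


class ToyActionSpace:
    """Stand-in for a discrete Gym action space with two actions."""

    def sample(self) -> int:
        return random.choice([0, 1])


class CountdownEnv:
    """Toy Gym-style environment: 5 steps per episode, reward 1.0 for action 1."""

    action_space = ToyActionSpace()

    def reset(self) -> int:
        self.steps_left = 5
        return 0

    def step(self, action: int):
        self.steps_left -= 1
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, self.steps_left == 0, {}


class AlwaysOne:
    """Toy policy exposing a predict() method that always picks action 1."""

    def predict(self, observations):
        return [1 for _ in observations]


scorer = make_env_scorer(CountdownEnv(), n_trials=3)
mean_return = scorer(AlwaysOne())
print(mean_return)  # 5.0
```

With epsilon=0.0 the policy is followed greedily, so every episode of the toy environment yields a return of 5.0. Raising epsilon mixes in random actions and can only lower the score in this toy setup.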