d3rlpy.metrics.scorer.discrete_action_match_scorer

d3rlpy.metrics.scorer.discrete_action_match_scorer(algo, episodes)

Returns the percentage of steps at which the algorithm selects the same action as the dataset.

This metric indicates how closely the greedy policy matches the actions recorded in the given episodes, in a discrete action space. If the given episodes are near-optimal, a higher percentage is better.

\[\frac{1}{N} \sum_{t=1}^{N} \mathbb{1}\{a_t = \text{argmax}_a Q_\theta (s_t, a)\}\]
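
The formula can be illustrated with a small NumPy sketch. This is not the library's implementation: the helper name `action_match_fraction` and the hand-written Q-value table are assumptions made for illustration, standing in for the values a trained algorithm would produce for each state.

```python
import numpy as np


def action_match_fraction(q_values: np.ndarray, dataset_actions: np.ndarray) -> float:
    """Fraction of steps where the greedy action equals the dataset action.

    q_values: shape (N, num_actions), hypothetical Q_theta(s_t, a) values.
    dataset_actions: shape (N,), integer actions a_t taken in the episodes.
    """
    # Greedy policy: pick the action with the highest Q-value at each step.
    greedy_actions = np.argmax(q_values, axis=1)
    # The mean of the boolean matches is (1/N) * sum of the indicator terms.
    return float(np.mean(greedy_actions == dataset_actions))


# Four steps, two discrete actions (illustrative numbers only).
q_values = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.6, 0.4],
])
dataset_actions = np.array([1, 0, 0, 0])

score = action_match_fraction(q_values, dataset_actions)
print(score)  # 0.75: the greedy action matches on three of the four steps
```

In this sketch the result is a fraction between 0 and 1, which is what the averaged indicator in the formula yields.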

Parameters

algo (d3rlpy.metrics.scorer.AlgoProtocol) – algorithm.
episodes (List[d3rlpy.dataset.Episode]) – list of episodes.

Returns

percentage of identical actions, expressed as a fraction between 0 and 1 as in the formula above.

Return type

float
