d3rlpy.algos.SoftmaxTransformerActionSampler

class d3rlpy.algos.SoftmaxTransformerActionSampler(temperature=1.0)

Softmax action-sampler.

This class applies a softmax function to sample an action from a discrete probability distribution.

Parameters:

temperature (float) – Softmax temperature. Defaults to 1.0.
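
The page does not spell out the formula. Assuming the class uses the standard temperature-scaled softmax, the probability of action i can be written as below. The symbols z (the Transformer output scores), T (the temperature), and K (the number of discrete actions) are notation introduced here for illustration and do not appear in the library documentation.

```latex
p_i = \frac{\exp(z_i / T)}{\sum_{j=1}^{K} \exp(z_j / T)}, \qquad i = 1, \dots, K, \quad T > 0
```

Under this assumption, T = 1 leaves the softmax unchanged, values of T below 1 concentrate probability on the highest-scoring action, and values above 1 flatten the distribution toward uniform.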

Methods

__call__(transformer_output)

Returns an action sampled from the Transformer output.

Parameters:

transformer_output (ndarray[Any, dtype[Any]]) – Output of a Transformer algorithm.

Returns:

Sampled action.

Return type:

Union[ndarray[Any, dtype[Any]], int]
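
To illustrate what a sampler of this kind does, here is a minimal standard-library sketch. It is not d3rlpy's implementation: the class name SketchSoftmaxSampler, the seed argument, and the probabilities helper are hypothetical, and the sketch assumes the Transformer output is a one-dimensional sequence of unnormalized scores, one per discrete action.

```python
import math
import random


class SketchSoftmaxSampler:
    """Hypothetical stand-in for illustration; not d3rlpy's implementation."""

    def __init__(self, temperature=1.0, seed=None):
        if temperature <= 0:
            raise ValueError("temperature must be positive")
        self.temperature = temperature
        # The seed argument is an assumption added here for reproducibility.
        self._rng = random.Random(seed)

    def probabilities(self, scores):
        """Temperature-scaled softmax over a 1-D sequence of scores."""
        scaled = [s / self.temperature for s in scores]
        largest = max(scaled)  # subtract the max for numerical stability
        exps = [math.exp(s - largest) for s in scaled]
        total = sum(exps)
        return [e / total for e in exps]

    def __call__(self, scores):
        """Draw one action index according to the softmax probabilities."""
        probs = self.probabilities(scores)
        return self._rng.choices(range(len(probs)), weights=probs, k=1)[0]


scores = [2.0, 1.0, 0.0]  # made-up scores for three discrete actions
sampler = SketchSoftmaxSampler(temperature=1.0, seed=0)
probs = sampler.probabilities(scores)
action = sampler(scores)  # an int index in {0, 1, 2}

# A very low temperature makes the draw effectively greedy.
greedy_action = SketchSoftmaxSampler(temperature=0.01, seed=1)(scores)
```

With the real class, the equivalent call would pass the Transformer output to an instance constructed with the desired temperature, as in the signature documented above.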