d3rlpy.algos.SoftmaxTransformerActionSampler

class d3rlpy.algos.SoftmaxTransformerActionSampler(temperature=1.0)

Softmax action-sampler.

This class applies a softmax function to the Transformer output and samples an action from the resulting discrete probability distribution.

Parameters

temperature (float) – Softmax temperature. Defaults to 1.0.

Methods

__call__(transformer_output)

Returns an action sampled from the Transformer output.

Parameters

transformer_output (numpy.ndarray[Any, numpy.dtype[Any]]) – Output of Transformer algorithms.

Returns

Sampled action.

Return type

Union[numpy.ndarray[Any, numpy.dtype[Any]], int]
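
Example

The following is a minimal NumPy sketch of temperature-scaled softmax sampling, the general technique this class is described as using. It is not the d3rlpy implementation: the helper names `softmax_with_temperature` and `sample_action` are hypothetical, and the sketch assumes the Transformer output is a 1-D array of logits that are divided by the temperature before the softmax.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Convert 1-D logits to probabilities (hypothetical helper, not d3rlpy API)."""
    # Assumption: the logits are divided by the temperature before the softmax.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()


def sample_action(logits, temperature=1.0, rng=None):
    """Draw one discrete action index from the softmax distribution."""
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax_with_temperature(logits, temperature)
    # Pick an index in [0, n_actions) with probability probs[index].
    return int(rng.choice(probs.shape[0], p=probs))


logits = np.array([1.0, 2.0, 3.0])
action = sample_action(logits, temperature=0.5, rng=np.random.default_rng(0))
```

Under this assumption, a temperature below 1 concentrates probability on the highest-scoring action, while a temperature above 1 moves the distribution toward uniform.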