d3rlpy.preprocessing.ClipRewardScaler
- class d3rlpy.preprocessing.ClipRewardScaler(low=None, high=None)
Reward clipping preprocessing. Rewards are clipped elementwise to the range [low, high].
- Parameters
low (Optional[float]) – minimum value to clip to.
high (Optional[float]) – maximum value to clip to.
from d3rlpy.algos import CQL
from d3rlpy.preprocessing import ClipRewardScaler

# clip rewards within [-1.0, 1.0]
reward_scaler = ClipRewardScaler(low=-1.0, high=1.0)

cql = CQL(reward_scaler=reward_scaler)
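For a quick sense of the effect, a minimal sketch (assuming elementwise clipping as the class description indicates, using transform_numpy documented below):

import numpy as np
from d3rlpy.preprocessing import ClipRewardScaler

scaler = ClipRewardScaler(low=-1.0, high=1.0)

raw_rewards = np.array([-5.0, -0.3, 0.7, 10.0])
print(scaler.transform_numpy(raw_rewards))
# expected: [-1.  -0.3  0.7  1. ] -- values outside [-1.0, 1.0] are bounded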
Methods
- fit(episodes)
Estimates scaling parameters from dataset.
- Parameters
episodes (List[d3rlpy.dataset.Episode]) – list of episodes.
- Return type
None
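A usage sketch with the bundled CartPole demo dataset (assuming the d3rlpy.datasets.get_cartpole helper); since low and high are supplied at construction, fitting presumably has nothing to estimate for this particular scaler:

from d3rlpy.datasets import get_cartpole
from d3rlpy.preprocessing import ClipRewardScaler

dataset, _ = get_cartpole()  # assumption: bundled demo dataset helper

scaler = ClipRewardScaler(low=-1.0, high=1.0)
scaler.fit(dataset.episodes)  # low/high are user-given, so presumably left unchanged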
- fit_with_env(env)
Gets scaling parameters from environment.
Note
RewardScaler does not support fitting with environment.
- Parameters
env (gym.core.Env) – gym environment.
- Return type
None
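Since fitting with an environment is unsupported, the clipping bounds are passed directly at construction instead; a sketch (assuming a standard Gym environment, matching the gym.core.Env type above):

import gym
from d3rlpy.preprocessing import ClipRewardScaler

env = gym.make("CartPole-v1")

# bounds are given up front rather than derived from the environment
scaler = ClipRewardScaler(low=-1.0, high=1.0)

# scaler.fit_with_env(env)  # unsupported per the note above; expected to raise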
- reverse_transform(reward)
Returns rewards with the transformation reversed.
- Parameters
reward (torch.Tensor) – reward.
- Returns
reward with the transformation reversed.
- Return type
torch.Tensor
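One caveat worth illustrating: clipping discards information, so reverse_transform cannot recover values that fell outside [low, high]. A sketch (the exact reverse behavior of a lossy transform is implementation-defined):

import torch
from d3rlpy.preprocessing import ClipRewardScaler

scaler = ClipRewardScaler(low=-1.0, high=1.0)

raw = torch.tensor([0.5, -2.0, 3.0])
clipped = scaler.transform(raw)  # -> [0.5, -1.0, 1.0]
restored = scaler.reverse_transform(clipped)
# -2.0 and 3.0 cannot be recovered exactly; only values already
# inside [-1.0, 1.0] round-trip unchanged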
- transform(reward)
Returns processed rewards.
- Parameters
reward (torch.Tensor) – reward.
- Returns
processed reward.
- Return type
torch.Tensor
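Assuming transform implements elementwise clipping as described, it should agree with PyTorch's own clamping; a minimal check:

import torch
from d3rlpy.preprocessing import ClipRewardScaler

scaler = ClipRewardScaler(low=-1.0, high=1.0)
rewards = torch.tensor([-3.0, 0.2, 4.0])

# expected to match elementwise clamping to [-1.0, 1.0]
assert torch.equal(scaler.transform(rewards), torch.clamp(rewards, -1.0, 1.0))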
- transform_numpy(reward)
Returns transformed rewards as a numpy array.
- Parameters
reward (numpy.ndarray) – reward.
- Returns
transformed reward.
- Return type
numpy.ndarray
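Analogously for the numpy path, the output should agree with np.clip (again assuming plain elementwise clipping):

import numpy as np
from d3rlpy.preprocessing import ClipRewardScaler

scaler = ClipRewardScaler(low=-1.0, high=1.0)
rewards = np.random.uniform(-5.0, 5.0, size=100)

# expected to match numpy's own clipping
assert np.allclose(scaler.transform_numpy(rewards), np.clip(rewards, -1.0, 1.0))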
Attributes