Preprocessing
Observation
d3rlpy provides several observation preprocessors that are tightly integrated with its algorithms. Each preprocessor is implemented with PyTorch operations, so it is included in the model exported by the save_policy method.
import torch
from d3rlpy.algos import CQL
from d3rlpy.dataset import MDPDataset
dataset = MDPDataset(...)
# choose from ['pixel', 'min_max', 'standard'] or None
cql = CQL(scaler='standard')
# scaler is fitted from the given episodes
cql.fit(dataset.episodes)
# preprocessing is included in the exported TorchScript
cql.save_policy('policy.pt')
# no manual preprocessing is needed in production
policy = torch.jit.load('policy.pt')
action = policy(unpreprocessed_x)
You can also initialize a scaler yourself and pass it to the algorithm.
from d3rlpy.preprocessing import StandardScaler
scaler = StandardScaler(mean=..., std=...)
cql = CQL(scaler=scaler)
'pixel': Pixel normalization preprocessing.
'min_max': Min-Max normalization preprocessing.
'standard': Standardization preprocessing.
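To make the three options concrete, the following is a minimal sketch of the arithmetic each one typically performs. It is written in plain Python for illustration and is not d3rlpy's implementation: the function names are hypothetical, the [0, 1] target range for min-max scaling is an assumption, and the library may add details such as a small epsilon in the standardization denominator.

```python
from statistics import fmean, pstdev


def pixel_scale(obs):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return [v / 255.0 for v in obs]


def fit_min_max(observations):
    """Return per-dimension (minimum, maximum) over a list of observations."""
    columns = list(zip(*observations))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_scale(obs, minimum, maximum):
    """Rescale each dimension to [0, 1] (assumed range) using fitted bounds."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(obs, minimum, maximum)]


def fit_standard(observations):
    """Return per-dimension (mean, population std) over the observations."""
    columns = list(zip(*observations))
    return [fmean(c) for c in columns], [pstdev(c) for c in columns]


def standard_scale(obs, mean, std):
    """Shift to zero mean and scale to unit variance per dimension.

    A constant dimension has std == 0 and would divide by zero here.
    """
    return [(v - m) / s for v, m, s in zip(obs, mean, std)]


# Toy dataset: three 2-dimensional observations.
observations = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]
minimum, maximum = fit_min_max(observations)
mean, std = fit_standard(observations)
```

In this sketch the statistics are computed once from the dataset and then reused for every observation, which mirrors the idea that a scaler is fitted from the given episodes and then applied unchanged at inference time.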
Action
d3rlpy can also preprocess continuous actions. With this preprocessing, you don’t need to normalize actions in advance or implement normalization on the environment side.
import torch
from d3rlpy.algos import CQL
from d3rlpy.dataset import MDPDataset
dataset = MDPDataset(...)
# 'min_max' or None
cql = CQL(action_scaler='min_max')
# action scaler is fitted from the given episodes
cql.fit(dataset.episodes)
# postprocessing is included in TorchScript
cql.save_policy('policy.pt')
# no manual postprocessing is needed in production
policy = torch.jit.load('policy.pt')
action = policy(x)
You can also initialize an action scaler yourself and pass it to the algorithm.
from d3rlpy.preprocessing import MinMaxActionScaler
action_scaler = MinMaxActionScaler(minimum=..., maximum=...)
cql = CQL(action_scaler=action_scaler)
'min_max': Min-Max normalization action preprocessing.
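The relationship between action preprocessing during training and postprocessing in the exported policy can be sketched as a pair of inverse transforms. The example below is an illustration in plain Python, not d3rlpy's code: the function names are hypothetical, and the [-1, 1] target range is an assumption about how min-max action scaling is commonly defined.

```python
def fit_action_bounds(actions):
    """Return per-dimension (minimum, maximum) over a list of actions."""
    columns = list(zip(*actions))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_action(action, minimum, maximum):
    """Map an action from [minimum, maximum] to [-1, 1] (assumed range)."""
    return [
        2.0 * (a - lo) / (hi - lo) - 1.0
        for a, lo, hi in zip(action, minimum, maximum)
    ]


def unscale_action(scaled, minimum, maximum):
    """Map a value in [-1, 1] back to the original action range."""
    return [
        (s + 1.0) / 2.0 * (hi - lo) + lo
        for s, lo, hi in zip(scaled, minimum, maximum)
    ]


# Toy dataset: two 2-dimensional actions that define the bounds.
actions = [[-2.0, 0.0], [2.0, 10.0]]
minimum, maximum = fit_action_bounds(actions)

normalized = scale_action([0.0, 5.0], minimum, maximum)
restored = unscale_action(normalized, minimum, maximum)
```

Under these assumptions, training sees actions in the normalized range, while the inverse transform is what an exported policy would apply so that its outputs are already in the environment's original units.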