Wrappers

ObjectSelectorWrapper

class causal_world.wrappers.ObjectSelectorWrapper(env)[source]
__init__(env)[source]
Parameters

env – (causal_world.CausalWorld) the environment to convert.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
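
A minimal usage sketch, assuming the tutorial-style environment construction via generate_task and CausalWorld and the import path listed for this class; the exact object-selection action semantics exposed by the wrapper should be checked against its action space:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import ObjectSelectorWrapper

    task = generate_task(task_generator_id='stacking2')
    env = CausalWorld(task=task)
    env = ObjectSelectorWrapper(env)

    obs = env.reset()
    for _ in range(10):
        # sample from the wrapper's (object-selection) action space
        obs, reward, done, info = env.step(env.action_space.sample())
    env.close()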

MovingAverageActionEnvWrapper

class causal_world.wrappers.MovingAverageActionEnvWrapper(env, widow_size=8, initial_value=0)[source]
__init__(env, widow_size=8, initial_value=0)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • widow_size – (int) the window size for averaging and smoothing the actions.

  • initial_value – (float) initial value to fill the window with.

action(action)[source]

Processes the raw action and transforms it into a smoothed action.

Parameters

action – (nd.array) the raw action to be processed.

Returns

(nd.array) the smoothed action.

reverse_action(action)[source]

Reverses the processing, transforming the smoothed action back into the raw action.

Parameters

action – (nd.array) the smoothed action.

Returns

(nd.array) the raw action before processing.
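
A minimal usage sketch, assuming the tutorial-style environment construction via generate_task and CausalWorld; note that the keyword is spelled widow_size, exactly as in the signature above:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import MovingAverageActionEnvWrapper

    task = generate_task(task_generator_id='reaching')
    env = CausalWorld(task=task)
    # each action passed to step() is smoothed over a moving window of size 8
    env = MovingAverageActionEnvWrapper(env, widow_size=8, initial_value=0)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()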

DeltaActionEnvWrapper

class causal_world.wrappers.DeltaActionEnvWrapper(env)[source]
__init__(env)[source]

A delta action wrapper for the environment that interprets each action as a delta with respect to the previously executed action.

Parameters

env – (causal_world.CausalWorld) the environment to convert.

action(action)[source]

Processes the raw action and transforms it into a delta action.

Parameters

action – (nd.array) the raw action to be processed.

Returns

(nd.array) the delta action.

reverse_action(action)[source]

Reverses the processing, transforming the delta action back into the raw action.

Parameters

action – (nd.array) the delta action.

Returns

(nd.array) the raw action before processing.
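
A minimal usage sketch, assuming the tutorial-style environment construction; the comment about the zero action follows from the delta interpretation described above and is an assumption, not verified behaviour:

    import numpy as np

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import DeltaActionEnvWrapper

    task = generate_task(task_generator_id='pushing')
    env = CausalWorld(task=task)
    env = DeltaActionEnvWrapper(env)

    obs = env.reset()
    # actions are interpreted relative to the previously executed action,
    # so an all-zero action roughly holds the previous command (assumption)
    obs, reward, done, info = env.step(np.zeros(env.action_space.shape))
    env.close()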

CurriculumWrapper

class causal_world.wrappers.CurriculumWrapper(env, intervention_actors, actives)[source]
__init__(env, intervention_actors, actives)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • intervention_actors – (list) list of intervention actors

  • actives – (list of tuples) each tuple indicates (episode_start, episode_end, episode_periodicity, time_step_for_intervention)

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
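
A minimal usage sketch, assuming the tutorial-style environment construction and that GoalInterventionActorPolicy is available under causal_world.intervention_actors; the actives tuple follows the (episode_start, episode_end, episode_periodicity, time_step_for_intervention) convention documented above:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import CurriculumWrapper
    from causal_world.intervention_actors import GoalInterventionActorPolicy

    task = generate_task(task_generator_id='reaching')
    env = CausalWorld(task=task)
    # intervene on the goal at time step 0 of every episode, for episodes 0 to 1e9
    env = CurriculumWrapper(env,
                            intervention_actors=[GoalInterventionActorPolicy()],
                            actives=[(0, 1000000000, 1, 0)])

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()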

HERGoalEnvWrapper

class causal_world.wrappers.HERGoalEnvWrapper(env, activate_sparse_reward=False)[source]
__init__(env, activate_sparse_reward=False)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • activate_sparse_reward – (bool) True to activate sparse rewards.

classmethod class_name()[source]
Returns

(str) the name of the wrapper class.

close()[source]

Closes the environment in a safe manner; should be called at the end of the program.

Returns

None

compute_reward(achieved_goal, desired_goal, info)[source]

Used to calculate the reward for a hypothetical situation, as needed by hindsight experience replay algorithm variants. It can only be used in the sparse reward setting; for the dense reward setting the computation is not straightforward here.

Parameters
  • achieved_goal – (nd.array) specifies the achieved goal as bounding boxes of objects by default.

  • desired_goal – (nd.array) specifies the desired goal as bounding boxes of goal shapes by default.

  • info – (dict) not used for now.

Returns

(float) the final reward achieved given the hypothetical situation.

render(mode='human', **kwargs)[source]

Returns an RGB image taken from above the platform.

Parameters

mode – (str) not taken into account for now.

Returns

(nd.array) an RGB image taken from above the platform.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

seed(seed=None)[source]

Used to set the seed of the environment, to reproduce the same randomness.

Parameters

seed – (int) specifies the seed number

Returns

(list of int) the numpy seed that can be used further.

property spec

Returns

the spec of the underlying wrapped environment.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.

property unwrapped

Returns

the bare, completely unwrapped environment beneath all wrappers.
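
A minimal usage sketch, assuming the tutorial-style environment construction; the goal-dictionary keys ('observation', 'achieved_goal', 'desired_goal') follow the usual goal-env convention and are an assumption here:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import HERGoalEnvWrapper

    task = generate_task(task_generator_id='picking')
    env = CausalWorld(task=task)
    env = HERGoalEnvWrapper(env, activate_sparse_reward=True)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    # recompute the reward for a hypothetical (hindsight) goal; the dict keys
    # below follow the standard goal-env convention and are assumed here
    hindsight_reward = env.compute_reward(obs['achieved_goal'],
                                          obs['desired_goal'],
                                          info)
    env.close()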

ProtocolWrapper

class causal_world.wrappers.ProtocolWrapper(env, protocol)[source]
__init__(env, protocol)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • protocol – (causal_world.evaluation.ProtocolBase) protocol to evaluate.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
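
A minimal usage sketch, assuming the tutorial-style environment construction; the protocol instance is left as a hypothetical placeholder because the concrete protocol classes under causal_world.evaluation are not listed in this section:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import ProtocolWrapper

    task = generate_task(task_generator_id='pushing')
    env = CausalWorld(task=task)
    # my_protocol is a hypothetical placeholder for a concrete subclass of
    # causal_world.evaluation.ProtocolBase
    env = ProtocolWrapper(env, protocol=my_protocol)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()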