Wrappers

ObjectSelectorWrapper

class causal_world.wrappers.ObjectSelectorWrapper(env)[source]
__init__(env)[source]
Parameters

env – (causal_world.CausalWorld) the environment to convert.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
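
A minimal usage sketch, assuming the tutorial-style environment construction via generate_task and CausalWorld and the import path listed for this class; the exact object-selection action semantics exposed by the wrapper should be checked against its action space:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import ObjectSelectorWrapper

    task = generate_task(task_generator_id='stacking2')
    env = CausalWorld(task=task)
    env = ObjectSelectorWrapper(env)

    obs = env.reset()
    for _ in range(10):
        # sample from the wrapper's (object-selection) action space
        obs, reward, done, info = env.step(env.action_space.sample())
    env.close()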

MovingAverageActionEnvWrapper

class causal_world.wrappers.MovingAverageActionEnvWrapper(env, widow_size=8, initial_value=0)[source]
__init__(env, widow_size=8, initial_value=0)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • widow_size – (int) the window size for averaging and smoothing the actions.

  • initial_value – (float) initial value to fill the window with.

action(action)[source]

Processes the raw action and transforms it into a smoothed action.

Parameters

action – (nd.array) the raw action to be processed.

Returns

(nd.array) the smoothed action.

reverse_action(action)[source]

Reverses the processing, transforming the smoothed action back into the raw action.

Parameters

action – (nd.array) the smoothed action.

Returns

(nd.array) the raw action before processing.
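
A minimal usage sketch, assuming the tutorial-style environment construction via generate_task and CausalWorld; note that the keyword is spelled widow_size, exactly as in the signature above:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import MovingAverageActionEnvWrapper

    task = generate_task(task_generator_id='reaching')
    env = CausalWorld(task=task)
    # each action passed to step() is smoothed over a moving window of size 8
    env = MovingAverageActionEnvWrapper(env, widow_size=8, initial_value=0)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()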

DeltaActionEnvWrapper

class causal_world.wrappers.DeltaActionEnvWrapper(env)[source]
__init__(env)[source]

A delta action wrapper for the environment that interprets each action as a delta with respect to the previously executed action.

Parameters

env – (causal_world.CausalWorld) the environment to convert.

action(action)[source]

Processes the raw action and transforms it into a delta action.

Parameters

action – (nd.array) the raw action to be processed.

Returns

(nd.array) the delta action.

reverse_action(action)[source]

Reverses the processing, transforming the delta action back into the raw action.

Parameters

action – (nd.array) the delta action.

Returns

(nd.array) the raw action before processing.
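
A minimal usage sketch, assuming the tutorial-style environment construction; the comment about the zero action follows from the delta interpretation described above and is an assumption, not verified behaviour:

    import numpy as np

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import DeltaActionEnvWrapper

    task = generate_task(task_generator_id='pushing')
    env = CausalWorld(task=task)
    env = DeltaActionEnvWrapper(env)

    obs = env.reset()
    # actions are interpreted relative to the previously executed action,
    # so an all-zero action roughly holds the previous command (assumption)
    obs, reward, done, info = env.step(np.zeros(env.action_space.shape))
    env.close()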

CurriculumWrapper

class causal_world.wrappers.CurriculumWrapper(env, intervention_actors, actives)[source]
__init__(env, intervention_actors, actives)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • intervention_actors – (list) list of intervention actors

  • actives – (list of tuples) each tuple indicates (episode_start, episode_end, episode_periodicity, time_step_for_intervention)

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
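
A minimal usage sketch, assuming the tutorial-style environment construction and that GoalInterventionActorPolicy is available under causal_world.intervention_actors; the actives tuple follows the (episode_start, episode_end, episode_periodicity, time_step_for_intervention) convention documented above:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import CurriculumWrapper
    from causal_world.intervention_actors import GoalInterventionActorPolicy

    task = generate_task(task_generator_id='reaching')
    env = CausalWorld(task=task)
    # intervene on the goal at time step 0 of every episode, for episodes 0 to 1e9
    env = CurriculumWrapper(env,
                            intervention_actors=[GoalInterventionActorPolicy()],
                            actives=[(0, 1000000000, 1, 0)])

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()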

HERGoalEnvWrapper

class causal_world.wrappers.HERGoalEnvWrapper(env, activate_sparse_reward=False)[source]
__init__(env, activate_sparse_reward=False)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • activate_sparse_reward – (bool) True to activate sparse rewards.

classmethod class_name()[source]
Returns

(str) the name of the wrapper class.

close()[source]

Closes the environment in a safe manner; should be called at the end of the program.

Returns

None

compute_reward(achieved_goal, desired_goal, info)[source]

Used to calculate the reward for a hypothetical situation, as needed by hindsight experience replay algorithm variants. It can only be used in the sparse reward setting; for the dense reward setting the computation is not straightforward here.

Parameters
  • achieved_goal – (nd.array) specifies the achieved goal as bounding boxes of objects by default.

  • desired_goal – (nd.array) specifies the desired goal as bounding boxes of goal shapes by default.

  • info – (dict) not used for now.

Returns

(float) the final reward achieved given the hypothetical situation.

render(mode='human', **kwargs)[source]

Returns an RGB image taken from above the platform.

Parameters

mode – (str) not taken into account for now.

Returns

(nd.array) an RGB image taken from above the platform.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

seed(seed=None)[source]

Used to set the seed of the environment, to reproduce the same randomness.

Parameters

seed – (int) specifies the seed number

Returns

(list of int) the numpy seed that can be used further.

property spec

Returns

the spec of the underlying wrapped environment.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.

property unwrapped

Returns

the bare, completely unwrapped environment beneath all wrappers.
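
A minimal usage sketch, assuming the tutorial-style environment construction; the goal-dictionary keys ('observation', 'achieved_goal', 'desired_goal') follow the usual goal-env convention and are an assumption here:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import HERGoalEnvWrapper

    task = generate_task(task_generator_id='picking')
    env = CausalWorld(task=task)
    env = HERGoalEnvWrapper(env, activate_sparse_reward=True)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    # recompute the reward for a hypothetical (hindsight) goal; the dict keys
    # below follow the standard goal-env convention and are assumed here
    hindsight_reward = env.compute_reward(obs['achieved_goal'],
                                          obs['desired_goal'],
                                          info)
    env.close()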

ProtocolWrapper

class causal_world.wrappers.ProtocolWrapper(env, protocol)[source]
__init__(env, protocol)[source]
Parameters
  • env – (causal_world.CausalWorld) the environment to convert.

  • protocol – (causal_world.evaluation.ProtocolBase) protocol to evaluate.

reset()[source]

Resets the environment to its current starting state.

Returns

(nd.array) the observations returned after resetting the environment, following the specified observation_mode.

step(action)[source]

Used to step through the environment.

Parameters

action – (nd.array) the action to be executed by the robot; it should follow the specified action_mode.

Returns

(nd.array) the observations returned after stepping through the environment, following the specified observation_mode.
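
A minimal usage sketch, assuming the tutorial-style environment construction; the protocol instance is left as a hypothetical placeholder because the concrete protocol classes under causal_world.evaluation are not listed in this section:

    from causal_world.task_generators.task import generate_task
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.wrappers import ProtocolWrapper

    task = generate_task(task_generator_id='pushing')
    env = CausalWorld(task=task)
    # my_protocol is a hypothetical placeholder for a concrete subclass of
    # causal_world.evaluation.ProtocolBase
    env = ProtocolWrapper(env, protocol=my_protocol)

    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()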