Task Generators¶
Task¶
-
causal_world.task_generators.task.generate_task(task_generator_id='reaching', **kwargs)[source]¶
- Parameters
task_generator_id – (str) one of picking, pushing, reaching, pick_and_place, stacking2, stacked_blocks, towers, general or creative_stacked_blocks.
kwargs – keyword arguments specific to the chosen task generator.
- Returns
(BaseTask) the task to be used in the CausalWorld environment.
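A minimal sketch of the dispatch that generate_task performs, mapping a task_generator_id string to a generator class and forwarding the keyword arguments. The registry and the stripped-down classes below are illustrative stand-ins, not the library's implementation:

```python
# Illustrative sketch of generate_task's id-to-generator dispatch.
# The registry and the minimal classes are stand-ins; the real mapping
# lives inside causal_world.task_generators.task.

class ReachingTaskGenerator:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

class PushingTaskGenerator:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

_TASK_REGISTRY = {
    'reaching': ReachingTaskGenerator,
    'pushing': PushingTaskGenerator,
}

def generate_task(task_generator_id='reaching', **kwargs):
    """Look up the generator class for the given id and instantiate it,
    forwarding any generator-specific keyword arguments."""
    if task_generator_id not in _TASK_REGISTRY:
        raise ValueError('unknown task generator: ' + task_generator_id)
    return _TASK_REGISTRY[task_generator_id](**kwargs)
```

The returned task object would then typically be handed to the CausalWorld environment constructor.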
BaseTask¶
-
class causal_world.task_generators.BaseTask(task_name, variables_space, fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False)[source]¶
-
__init__(task_name, variables_space, fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False)[source]¶
This class represents the base task generator, which includes all the functionality common to the task generators.
- Parameters
task_name – (str) the task name.
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
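As a rough sketch of how these reward options interact, assumed from the parameter descriptions above rather than taken from the library's source, the reward could combine a weighted fractional-overlap term with the dense terms, or collapse to a sparse +1/0 signal:

```python
def compute_task_reward(fractional_overlap,
                        dense_terms,
                        fractional_reward_weight=1.0,
                        dense_reward_weights=(),
                        activate_sparse_reward=False):
    """Illustrative reward combination (assumed from the parameter docs):
    sparse mode returns +1 once the volumetric overlap exceeds 90%;
    otherwise the reward mixes the weighted overlap with the weighted
    dense reward terms."""
    if activate_sparse_reward:
        return 1.0 if fractional_overlap > 0.9 else 0.0
    reward = fractional_reward_weight * fractional_overlap
    for weight, term in zip(dense_reward_weights, dense_terms):
        reward += weight * term
    return reward
```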
-
add_ground_truth_state_to_info()[source]¶
Adds the full ground-truth state to the info dict.
- Returns
-
apply_interventions(interventions_dict, check_bounds=False)[source]¶
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
check_bounds – (bool) true to check whether the variables and their corresponding values exist in the operating space.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
compute_reward(achieved_goal, desired_goal, info)[source]¶
Used to calculate the reward given a hypothetical situation, as used in variants of hindsight experience replay algorithms. It should only be used in the sparse reward setting; for the other settings it can be tricky.
- Parameters
achieved_goal – (nd.array) specifies the achieved goal as bounding boxes of objects by default.
desired_goal – (nd.array) specifies the desired goal as bounding boxes of goal shapes by default.
info – (dict) not used for now.
- Returns
(float) the final reward achieved given the hypothetical situation.
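Since goals are expressed as bounding boxes by default, a hindsight-style sparse reward can be sketched as the fractional overlap of axis-aligned bounding boxes. This is an illustrative computation assumed from the descriptions above; the library's actual geometry handling may differ:

```python
import numpy as np

def aabb_overlap_fraction(achieved_box, desired_box):
    """Fraction of the desired box's volume covered by the achieved box.
    Boxes are (2, 3) arrays of [min_corner, max_corner]."""
    lo = np.maximum(achieved_box[0], desired_box[0])
    hi = np.minimum(achieved_box[1], desired_box[1])
    inter = np.prod(np.clip(hi - lo, 0, None))   # intersection volume
    desired_vol = np.prod(desired_box[1] - desired_box[0])
    return inter / desired_vol

def compute_reward(achieved_goal, desired_goal, info=None):
    """Sparse hindsight reward: +1 if the overlap exceeds 90%, else 0
    (assumed threshold, matching the sparse-reward description above)."""
    return 1.0 if aabb_overlap_fraction(achieved_goal, desired_goal) > 0.9 else 0.0
```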
-
divide_intervention_dict(interventions_dict)[source]¶
Divides the interventions into three dicts: the robot, stage and task-specific interventions.
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
- Returns
(tuple) robot_interventions_dict, stage_interventions_dict, task_generator_interventions_dict
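The split can be pictured as routing each variable name to the sub-dict that owns it. The variable names below are hypothetical examples chosen only to illustrate the idea, not the library's actual naming scheme:

```python
# Illustrative sketch of dividing an interventions dict; the variable
# names in the two sets are hypothetical examples.

ROBOT_VARIABLES = {'joint_positions', 'joint_velocities'}
STAGE_VARIABLES = {'stage_color', 'stage_friction'}

def divide_intervention_dict(interventions_dict):
    robot, stage, task = {}, {}, {}
    for name, value in interventions_dict.items():
        if name in ROBOT_VARIABLES:
            robot[name] = value
        elif name in STAGE_VARIABLES:
            stage[name] = value
        else:
            task[name] = value   # everything else is task-specific
    return robot, stage, task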
-
do_single_random_intervention()[source]¶
Performs a single random intervention on one of the variables in the environment.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, interventions_dict specifying the intervention performed, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
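A single random intervention can be pictured as sampling one variable from the intervention space and drawing a value within its bounds. The space below is a hypothetical stand-in for the spaces documented further down:

```python
import random

# Hypothetical intervention space: variable -> (low, high) bounds.
INTERVENTION_SPACE = {
    'goal_height': (0.08, 0.25),
    'tool_block_mass': (0.015, 0.1),
}

def do_single_random_intervention(space=INTERVENTION_SPACE, rng=random):
    """Pick one variable uniformly at random and sample a value
    uniformly within its bounds (illustrative sketch only)."""
    name = rng.choice(sorted(space))
    low, high = space[name]
    return {name: rng.uniform(low, high)}
```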
-
expose_potential_partial_solution()[source]¶
Adds the potential partial solution to the info dict, which can be used as privileged information afterwards.
- Returns
-
filter_structured_observations()[source]¶
- Returns
(np.array) returns the structured observations as set up by the corresponding task generator.
-
get_achieved_goal()[source]¶
- Returns
(nd.array) specifies the achieved goal as bounding boxes of objects by default.
-
get_desired_goal()[source]¶
- Returns
(nd.array) specifies the desired goal as bounding boxes of goal shapes by default.
-
get_intervention_space_a()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space A.
-
get_intervention_space_a_b()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space A_B.
-
get_intervention_space_b()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space B.
-
get_reward()[source]¶
Used to calculate the final reward for the last action executed in the system.
- Returns
(float) the final reward, which can be a mix of dense rewards and the sparse reward calculated by default using the fractional overlap of visual objects and rigid objects.
-
get_task_generator_variables_values()[source]¶
- Returns
(dict) specifies the variables belonging to the task itself.
-
get_task_params()[source]¶
- Returns
(dict) specifies all variables belonging to the task generator and their values.
-
get_variable_space_used()[source]¶
- Returns
(dict) returns the variables and their corresponding spaces used in the current environment.
-
init_task(robot, stage, max_episode_length, create_world_func)[source]¶
- Parameters
robot – (causal_world.envs.Robot) robot object of the environment.
stage – (causal_world.envs.Stage) stage object of the environment.
max_episode_length – (int) specifies the maximum episode length of the task.
create_world_func – (func) the function used to create the world around the robot.
- Returns
-
is_intervention_in_bounds(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
- Returns
(bool) true if the intervention values are within the operating intervention space, false otherwise.
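A bounds check of this kind can be sketched as comparing each intervened value against per-variable (low, high) bounds. The space layout below is an assumed simplification of the intervention spaces above, for illustration only:

```python
# Illustrative bounds check; the operating space layout
# (variable name -> (low, high)) is an assumed simplification.

OPERATING_SPACE = {
    'goal_height': (0.08, 0.25),
    'tool_block_mass': (0.015, 0.1),
}

def is_intervention_in_bounds(interventions_dict, space=OPERATING_SPACE):
    """Return True only if every intervened variable exists in the
    operating space and its value lies within the (low, high) bounds."""
    for name, value in interventions_dict.items():
        if name not in space:
            return False  # unknown variable: outside the operating space
        low, high = space[name]
        if not (low <= value <= high):
            return False
    return True
```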
-
reset_default_state()[source]¶
Resets the task to the default setting of the corresponding shape family that it had when first initialized, discarding any interventions performed afterwards.
- Returns
-
reset_task(interventions_dict=None, check_bounds=True)[source]¶
- Parameters
interventions_dict – (dict) an intervention dict, to be specified if an intervention is to be latched as the new starting state of the environment.
check_bounds – (bool) specified when not in train mode, if a check is needed for whether the intervention is allowed or not.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
-
restore_state(state_dict, avoid_reloading_urdf=False)[source]¶
- Parameters
state_dict – (dict) specifies all variables and their corresponding values in the environment.
avoid_reloading_urdf – (bool) true if reloading the urdf is to be avoided.
- Returns
-
sample_new_goal(level=None)[source]¶
Used to sample a new goal from the corresponding shape family.
- Parameters
level – (int) specifies the level - not used for now.
- Returns
(dict) the corresponding interventions dict that can then be applied to get the newly sampled goal.
-
Reaching Task¶
-
class causal_world.task_generators.ReachingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([100000, 0, 0, 0]), default_goal_60=array([0.0, 0.0, 0.1]), default_goal_120=array([0.0, 0.0, 0.13]), default_goal_300=array([0.0, 0.0, 0.16]), joint_positions=None, activate_sparse_reward=False)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([100000, 0, 0, 0]), default_goal_60=array([0.0, 0.0, 0.1]), default_goal_120=array([0.0, 0.0, 0.13]), default_goal_300=array([0.0, 0.0, 0.16]), joint_positions=None, activate_sparse_reward=False)[source]¶
This task generator will generate a task for reaching.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
default_goal_60 – (nd.array) the goal position for the first finger, x, y, z.
default_goal_120 – (nd.array) the goal position for the second finger, x, y, z.
default_goal_300 – (nd.array) the goal position for the third finger, x, y, z.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the mean distance from the goal is below 0.01.
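The reaching success criterion above can be sketched as a mean fingertip-to-goal distance check. This is an illustrative computation assumed from the parameter description, not the library's implementation:

```python
import numpy as np

def reaching_sparse_reward(end_effector_positions, goal_positions, threshold=0.01):
    """+1 if the mean distance between the three fingertips and their
    per-finger goals is below the threshold, else 0 (assumed criterion)."""
    ee = np.asarray(end_effector_positions).reshape(3, 3)
    goals = np.asarray(goal_positions).reshape(3, 3)
    mean_dist = np.linalg.norm(ee - goals, axis=1).mean()
    return 1.0 if mean_dist < threshold else 0.0
```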
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
get_achieved_goal()[source]¶
- Returns
(nd.array) specifies the achieved goal as concatenated end-effector positions.
-
PushingTaskGenerator¶
-
class causal_world.task_generators.PushingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750, 250, 100]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.08, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.08, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750, 250, 100]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.08, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.08, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task for pushing an object on the arena’s floor.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_block_position – (nd.array) specifies the cartesian position of the goal block, x, y, z.
goal_block_orientation – (nd.array) specifies the orientation of the goal block as a quaternion, x, y, z, w.
-
PickingTaskGenerator¶
-
class causal_world.task_generators.PickingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([250.0, 0.0, 125.0, 0.0, 750.0, 0.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, 0.0, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_height=0.15)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([250.0, 0.0, 125.0, 0.0, 750.0, 0.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, 0.0, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_height=0.15)[source]¶
This task generator generates a task for picking an object up into the air.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_height – (float) specifies the goal height that needs to be reached.
-
PickAndPlaceTaskGenerator¶
-
class causal_world.task_generators.PickAndPlaceTaskGenerator(variables_space='space_a_b', fractional_reward_weight=0, dense_reward_weights=array([750.0, 50.0, 250.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.09, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.09, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=0, dense_reward_weights=array([750.0, 50.0, 250.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.09, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.09, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task of picking and placing an object across a fixed block in the middle of the arena.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_block_position – (nd.array) specifies the cartesian position of the goal block, x, y, z.
goal_block_orientation – (nd.array) specifies the orientation of the goal block as a quaternion, x, y, z, w.
-
Stacking2TaskGenerator¶
-
class causal_world.task_generators.Stacking2TaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750.0, 250.0, 250.0, 125.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, tool_block_size=0.065, joint_positions=None, tool_block_1_position=array([0.0, 0.0, 0.0325]), tool_block_1_orientation=array([0, 0, 0, 1]), tool_block_2_position=array([0.01, 0.08, 0.0325]), tool_block_2_orientation=array([0, 0, 0, 1]), goal_position=array([-0.06, -0.06, 0.0325]), goal_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750.0, 250.0, 250.0, 125.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, tool_block_size=0.065, joint_positions=None, tool_block_1_position=array([0.0, 0.0, 0.0325]), tool_block_1_orientation=array([0, 0, 0, 1]), tool_block_2_position=array([0.01, 0.08, 0.0325]), tool_block_2_orientation=array([0, 0, 0, 1]), goal_position=array([-0.06, -0.06, 0.0325]), goal_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task for stacking 2 blocks on top of each other. Note: it belongs to the same shape family as towers; a dedicated task generator is provided only to enable reward engineering and easy reproduction of its baselines.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
tool_block_size – (float) specifies the blocks’ size/side length.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_1_position – (nd.array) specifies the cartesian position of the first tool block, x, y, z.
tool_block_1_orientation – (nd.array) specifies the orientation of the first tool block as a quaternion, x, y, z, w.
tool_block_2_position – (nd.array) specifies the cartesian position of the second tool block, x, y, z.
tool_block_2_orientation – (nd.array) specifies the orientation of the second tool block as a quaternion, x, y, z, w.
goal_position – (nd.array) specifies the cartesian position of the goal stack, x, y, z.
goal_orientation – (nd.array) specifies the orientation of the goal stack as a quaternion, x, y, z, w.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
StackedBlocksGeneratorTask¶
-
class causal_world.task_generators.StackedBlocksGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=5, max_level_width=0.25)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=5, max_level_width=0.25)[source]¶
This task generator will generate a task for stacking an arbitrary random configuration of blocks on top of each other.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
blocks_min_size – (float) specifies the minimum block size/side length for the goal shape generator.
num_of_levels – (int) specifies the number of levels to be generated.
max_level_width – (float) specifies the maximum width of the goal shape.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
CreativeStackedBlocksGeneratorTask¶
-
class causal_world.task_generators.CreativeStackedBlocksGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=8, max_level_width=0.12)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=8, max_level_width=0.12)[source]¶
This task generator generates a task in the creative stacked blocks family, which generates a random configuration of stacked blocks; however, only the first and last levels are shown explicitly, and the rest is left to the “imagination” of the agent itself.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
blocks_min_size – (float) specifies the minimum block size/side length for the goal shape generator.
num_of_levels – (int) specifies the number of levels to be generated.
max_level_width – (float) specifies the maximum width of the goal shape.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
TowersGeneratorTask¶
-
class causal_world.task_generators.TowersGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, number_of_blocks_in_tower=array([1, 1, 5]), tower_dims=array([0.035, 0.035, 0.175]), tower_center=array([0, 0]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, number_of_blocks_in_tower=array([1, 1, 5]), tower_dims=array([0.035, 0.035, 0.175]), tower_center=array([0, 0]))[source]¶
This task generator will generate a task for stacking blocks into towers.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
number_of_blocks_in_tower – (nd.array) specifies the number of blocks in the tower in each direction x, y, z.
tower_dims – (nd.array) specifies the dimensions of the tower in each direction x, y, z.
tower_center – (nd.array) specifies the cartesian position of the center of the tower, x, y.
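The tower layout implied by these parameters can be sketched by dividing the tower dimensions by the per-axis block counts to get a block size, then placing block centers on the resulting grid above the floor. This is an illustrative reconstruction, not the generator's actual code:

```python
import numpy as np

def tower_block_centers(number_of_blocks, tower_dims, tower_center):
    """Centers of the blocks in a tower whose base center sits at
    tower_center (x, y) on the floor (illustrative sketch)."""
    n = np.asarray(number_of_blocks)
    dims = np.asarray(tower_dims, dtype=float)
    block_size = dims / n                     # one block per grid cell
    origin = np.array([tower_center[0] - dims[0] / 2,
                       tower_center[1] - dims[1] / 2,
                       0.0])
    centers = []
    for ix in range(n[0]):
        for iy in range(n[1]):
            for iz in range(n[2]):
                offset = (np.array([ix, iy, iz]) + 0.5) * block_size
                centers.append(origin + offset)
    return np.array(centers)
```

With the defaults above ([1, 1, 5] blocks in a 0.035 x 0.035 x 0.175 tower), this yields five 0.035-sized blocks stacked along z.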
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
GeneralGeneratorTask¶
-
class causal_world.task_generators.GeneralGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, tool_block_size=0.05, nums_objects=5)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, tool_block_size=0.05, nums_objects=5)[source]¶
This task generator generates a general/random configuration of blocks by dropping random blocks from the air and waiting until they come to rest; the resulting configuration becomes the new shape/goal that the actor needs to achieve.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_size – (float) specifies the blocks’ size.
nums_objects – (int) specifies the number of objects to be dropped from the air.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-