Task Generators¶
Task¶
-
causal_world.task_generators.task.generate_task(task_generator_id='reaching', **kwargs)[source]¶
- Parameters
task_generator_id – (str) one of picking, pushing, reaching, pick_and_place, stacking2, stacked_blocks, towers, general or creative_stacked_blocks.
kwargs – keyword arguments specific to the chosen task generator.
- Returns
(BaseTask) the task to be used in the CausalWorld environment.
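A minimal sketch of the dispatch that generate_task performs, mapping a task_generator_id string to a generator class and forwarding the keyword arguments. The registry and the stripped-down classes below are illustrative stand-ins, not the library's implementation:

```python
# Illustrative sketch of generate_task's id-to-generator dispatch.
# The registry and the minimal classes are stand-ins; the real mapping
# lives inside causal_world.task_generators.task.

class ReachingTaskGenerator:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

class PushingTaskGenerator:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

_TASK_REGISTRY = {
    'reaching': ReachingTaskGenerator,
    'pushing': PushingTaskGenerator,
}

def generate_task(task_generator_id='reaching', **kwargs):
    """Look up the generator class for the given id and instantiate it,
    forwarding any generator-specific keyword arguments."""
    if task_generator_id not in _TASK_REGISTRY:
        raise ValueError('unknown task generator: ' + task_generator_id)
    return _TASK_REGISTRY[task_generator_id](**kwargs)
```

The returned task object would then typically be handed to the CausalWorld environment constructor.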
BaseTask¶
-
class causal_world.task_generators.BaseTask(task_name, variables_space, fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False)[source]¶
-
__init__(task_name, variables_space, fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False)[source]¶
This class represents the base task generator, which includes all the functionality common to the task generators.
- Parameters
task_name – (str) the task name.
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
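As a rough sketch of how these reward options interact, assumed from the parameter descriptions above rather than taken from the library's source, the reward could combine a weighted fractional-overlap term with the dense terms, or collapse to a sparse +1/0 signal:

```python
def compute_task_reward(fractional_overlap,
                        dense_terms,
                        fractional_reward_weight=1.0,
                        dense_reward_weights=(),
                        activate_sparse_reward=False):
    """Illustrative reward combination (assumed from the parameter docs):
    sparse mode returns +1 once the volumetric overlap exceeds 90%;
    otherwise the reward mixes the weighted overlap with the weighted
    dense reward terms."""
    if activate_sparse_reward:
        return 1.0 if fractional_overlap > 0.9 else 0.0
    reward = fractional_reward_weight * fractional_overlap
    for weight, term in zip(dense_reward_weights, dense_terms):
        reward += weight * term
    return reward
```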
-
add_ground_truth_state_to_info()[source]¶
Adds the full ground-truth state to the info dict.
- Returns
-
apply_interventions(interventions_dict, check_bounds=False)[source]¶
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
check_bounds – (bool) true to check whether the variables and their corresponding values exist in the operating space.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
compute_reward(achieved_goal, desired_goal, info)[source]¶
Used to calculate the reward given a hypothetical situation, as used in variants of hindsight experience replay algorithms. It should only be used in the sparse reward setting; for the other settings it can be tricky.
- Parameters
achieved_goal – (nd.array) specifies the achieved goal as bounding boxes of objects by default.
desired_goal – (nd.array) specifies the desired goal as bounding boxes of goal shapes by default.
info – (dict) not used for now.
- Returns
(float) the final reward achieved given the hypothetical situation.
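Since goals are expressed as bounding boxes by default, a hindsight-style sparse reward can be sketched as the fractional overlap of axis-aligned bounding boxes. This is an illustrative computation assumed from the descriptions above; the library's actual geometry handling may differ:

```python
import numpy as np

def aabb_overlap_fraction(achieved_box, desired_box):
    """Fraction of the desired box's volume covered by the achieved box.
    Boxes are (2, 3) arrays of [min_corner, max_corner]."""
    lo = np.maximum(achieved_box[0], desired_box[0])
    hi = np.minimum(achieved_box[1], desired_box[1])
    inter = np.prod(np.clip(hi - lo, 0, None))   # intersection volume
    desired_vol = np.prod(desired_box[1] - desired_box[0])
    return inter / desired_vol

def compute_reward(achieved_goal, desired_goal, info=None):
    """Sparse hindsight reward: +1 if the overlap exceeds 90%, else 0
    (assumed threshold, matching the sparse-reward description above)."""
    return 1.0 if aabb_overlap_fraction(achieved_goal, desired_goal) > 0.9 else 0.0
```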
-
divide_intervention_dict(interventions_dict)[source]¶
Divides the interventions into three dicts: the robot, stage and task-specific interventions.
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
- Returns
(tuple) robot_interventions_dict, stage_interventions_dict, task_generator_interventions_dict
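The split can be pictured as routing each variable name to the sub-dict that owns it. The variable names below are hypothetical examples chosen only to illustrate the idea, not the library's actual naming scheme:

```python
# Illustrative sketch of dividing an interventions dict; the variable
# names in the two sets are hypothetical examples.

ROBOT_VARIABLES = {'joint_positions', 'joint_velocities'}
STAGE_VARIABLES = {'stage_color', 'stage_friction'}

def divide_intervention_dict(interventions_dict):
    robot, stage, task = {}, {}, {}
    for name, value in interventions_dict.items():
        if name in ROBOT_VARIABLES:
            robot[name] = value
        elif name in STAGE_VARIABLES:
            stage[name] = value
        else:
            task[name] = value   # everything else is task-specific
    return robot, stage, task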
-
do_single_random_intervention()[source]¶
Performs a single random intervention on one of the variables in the environment.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, interventions_dict specifying the intervention performed, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
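A single random intervention can be pictured as sampling one variable from the intervention space and drawing a value within its bounds. The space below is a hypothetical stand-in for the spaces documented further down:

```python
import random

# Hypothetical intervention space: variable -> (low, high) bounds.
INTERVENTION_SPACE = {
    'goal_height': (0.08, 0.25),
    'tool_block_mass': (0.015, 0.1),
}

def do_single_random_intervention(space=INTERVENTION_SPACE, rng=random):
    """Pick one variable uniformly at random and sample a value
    uniformly within its bounds (illustrative sketch only)."""
    name = rng.choice(sorted(space))
    low, high = space[name]
    return {name: rng.uniform(low, high)}
```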
-
expose_potential_partial_solution()[source]¶
Adds the potential partial solution to the info dict, which can be used as privileged information afterwards.
- Returns
-
filter_structured_observations()[source]¶
- Returns
(np.array) returns the structured observations as set up by the corresponding task generator.
-
get_achieved_goal()[source]¶
- Returns
(nd.array) specifies the achieved goal as bounding boxes of objects by default.
-
get_desired_goal()[source]¶
- Returns
(nd.array) specifies the desired goal as bounding boxes of goal shapes by default.
-
get_intervention_space_a()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space A.
-
get_intervention_space_a_b()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space A_B.
-
get_intervention_space_b()[source]¶
- Returns
(dict) specifies the variables and their corresponding bounds in space B.
-
get_reward()[source]¶
Used to calculate the final reward for the last action executed in the system.
- Returns
(float) the final reward, which can be a mix of dense rewards and the sparse reward calculated by default using the fractional overlap of visual objects and rigid objects.
-
get_task_generator_variables_values()[source]¶
- Returns
(dict) specifies the variables belonging to the task itself.
-
get_task_params()[source]¶
- Returns
(dict) specifies all variables belonging to the task generator and their values.
-
get_variable_space_used()[source]¶
- Returns
(dict) returns the variables and their corresponding spaces used in the current environment.
-
init_task(robot, stage, max_episode_length, create_world_func)[source]¶
- Parameters
robot – (causal_world.envs.Robot) robot object of the environment.
stage – (causal_world.envs.Stage) stage object of the environment.
max_episode_length – (int) specifies the maximum episode length of the task.
create_world_func – (func) the function used to create the world around the robot.
- Returns
-
is_intervention_in_bounds(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) specifies the variable names and their corresponding values.
- Returns
(bool) true if the intervention values are within the operating intervention space, false otherwise.
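A bounds check of this kind can be sketched as comparing each intervened value against per-variable (low, high) bounds. The space layout below is an assumed simplification of the intervention spaces above, for illustration only:

```python
# Illustrative bounds check; the operating space layout
# (variable name -> (low, high)) is an assumed simplification.

OPERATING_SPACE = {
    'goal_height': (0.08, 0.25),
    'tool_block_mass': (0.015, 0.1),
}

def is_intervention_in_bounds(interventions_dict, space=OPERATING_SPACE):
    """Return True only if every intervened variable exists in the
    operating space and its value lies within the (low, high) bounds."""
    for name, value in interventions_dict.items():
        if name not in space:
            return False  # unknown variable: outside the operating space
        low, high = space[name]
        if not (low <= value <= high):
            return False
    return True
```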
-
reset_default_state()[source]¶
Resets the task to the default setting of the corresponding shape family that it had when first initialized, discarding any interventions performed afterwards.
- Returns
-
reset_task(interventions_dict=None, check_bounds=True)[source]¶
- Parameters
interventions_dict – (dict) an intervention dict, to be specified if an intervention is to be latched as the new starting state of the environment.
check_bounds – (bool) specified when not in train mode, if a check is needed for whether the intervention is allowed or not.
- Returns
(tuple): success_signal specifying whether the intervention was successful, interventions_info specifying the number of interventions and other info, and reset_observation_space_signal, a bool specifying whether the observation space needs to be changed.
-
restore_state(state_dict, avoid_reloading_urdf=False)[source]¶
- Parameters
state_dict – (dict) specifies all variables and their corresponding values in the environment.
avoid_reloading_urdf – (bool) true if reloading the urdf is to be avoided.
- Returns
-
sample_new_goal(level=None)[source]¶
Used to sample a new goal from the corresponding shape family.
- Parameters
level – (int) specifies the level - not used for now.
- Returns
(dict) the corresponding interventions dict that can then be applied to get the newly sampled goal.
-
Reaching Task¶
-
class causal_world.task_generators.ReachingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([100000, 0, 0, 0]), default_goal_60=array([0.0, 0.0, 0.1]), default_goal_120=array([0.0, 0.0, 0.13]), default_goal_300=array([0.0, 0.0, 0.16]), joint_positions=None, activate_sparse_reward=False)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([100000, 0, 0, 0]), default_goal_60=array([0.0, 0.0, 0.1]), default_goal_120=array([0.0, 0.0, 0.13]), default_goal_300=array([0.0, 0.0, 0.16]), joint_positions=None, activate_sparse_reward=False)[source]¶
This task generator will generate a task for reaching.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
default_goal_60 – (nd.array) the goal position for the first finger, x, y, z.
default_goal_120 – (nd.array) the goal position for the second finger, x, y, z.
default_goal_300 – (nd.array) the goal position for the third finger, x, y, z.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the mean distance from the goal is below 0.01.
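The reaching success criterion above can be sketched as a mean fingertip-to-goal distance check. This is an illustrative computation assumed from the parameter description, not the library's implementation:

```python
import numpy as np

def reaching_sparse_reward(end_effector_positions, goal_positions, threshold=0.01):
    """+1 if the mean distance between the three fingertips and their
    per-finger goals is below the threshold, else 0 (assumed criterion)."""
    ee = np.asarray(end_effector_positions).reshape(3, 3)
    goals = np.asarray(goal_positions).reshape(3, 3)
    mean_dist = np.linalg.norm(ee - goals, axis=1).mean()
    return 1.0 if mean_dist < threshold else 0.0
```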
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
get_achieved_goal()[source]¶
- Returns
(nd.array) specifies the achieved goal as concatenated end-effector positions.
-
PushingTaskGenerator¶
-
class causal_world.task_generators.PushingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750, 250, 100]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.08, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.08, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750, 250, 100]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.08, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.08, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task for pushing an object on the arena’s floor.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_block_position – (nd.array) specifies the cartesian position of the goal block, x, y, z.
goal_block_orientation – (nd.array) specifies the orientation of the goal block as a quaternion, x, y, z, w.
-
PickingTaskGenerator¶
-
class causal_world.task_generators.PickingTaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([250.0, 0.0, 125.0, 0.0, 750.0, 0.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, 0.0, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_height=0.15)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([250.0, 0.0, 125.0, 0.0, 750.0, 0.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, 0.0, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_height=0.15)[source]¶
This task generator generates a task for picking an object up into the air.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_height – (float) specifies the goal height that needs to be reached.
-
PickAndPlaceTaskGenerator¶
-
class causal_world.task_generators.PickAndPlaceTaskGenerator(variables_space='space_a_b', fractional_reward_weight=0, dense_reward_weights=array([750.0, 50.0, 250.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.09, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.09, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=0, dense_reward_weights=array([750.0, 50.0, 250.0, 0.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, joint_positions=None, tool_block_position=array([0.0, -0.09, 0.0325]), tool_block_orientation=array([0, 0, 0, 1]), goal_block_position=array([0.0, 0.09, 0.0325]), goal_block_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task of picking and placing an object across a fixed block in the middle of the arena.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the block’s mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_position – (nd.array) specifies the cartesian position of the tool block, x, y, z.
tool_block_orientation – (nd.array) specifies the orientation of the tool block as a quaternion, x, y, z, w.
goal_block_position – (nd.array) specifies the cartesian position of the goal block, x, y, z.
goal_block_orientation – (nd.array) specifies the orientation of the goal block as a quaternion, x, y, z, w.
-
Stacking2TaskGenerator¶
-
class causal_world.task_generators.Stacking2TaskGenerator(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750.0, 250.0, 250.0, 125.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, tool_block_size=0.065, joint_positions=None, tool_block_1_position=array([0.0, 0.0, 0.0325]), tool_block_1_orientation=array([0, 0, 0, 1]), tool_block_2_position=array([0.01, 0.08, 0.0325]), tool_block_2_orientation=array([0, 0, 0, 1]), goal_position=array([-0.06, -0.06, 0.0325]), goal_orientation=array([0, 0, 0, 1]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([750.0, 250.0, 250.0, 125.0, 0.005]), activate_sparse_reward=False, tool_block_mass=0.02, tool_block_size=0.065, joint_positions=None, tool_block_1_position=array([0.0, 0.0, 0.0325]), tool_block_1_orientation=array([0, 0, 0, 1]), tool_block_2_position=array([0.01, 0.08, 0.0325]), tool_block_2_orientation=array([0, 0, 0, 1]), goal_position=array([-0.06, -0.06, 0.0325]), goal_orientation=array([0, 0, 0, 1]))[source]¶
This task generator generates a task for stacking 2 blocks on top of each other. Note: it belongs to the same shape family as towers; a dedicated task generator is provided only to enable reward engineering and easy reproduction of its baselines.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
tool_block_size – (float) specifies the blocks’ size/side length.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_1_position – (nd.array) specifies the cartesian position of the first tool block, x, y, z.
tool_block_1_orientation – (nd.array) specifies the orientation of the first tool block as a quaternion, x, y, z, w.
tool_block_2_position – (nd.array) specifies the cartesian position of the second tool block, x, y, z.
tool_block_2_orientation – (nd.array) specifies the orientation of the second tool block as a quaternion, x, y, z, w.
goal_position – (nd.array) specifies the cartesian position of the goal stack, x, y, z.
goal_orientation – (nd.array) specifies the orientation of the goal stack as a quaternion, x, y, z, w.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
StackedBlocksGeneratorTask¶
-
class causal_world.task_generators.StackedBlocksGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=5, max_level_width=0.25)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=5, max_level_width=0.25)[source]¶
This task generator will generate a task for stacking an arbitrary random configuration of blocks on top of each other.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
blocks_min_size – (float) specifies the minimum block size/side length for the goal shape generator.
num_of_levels – (int) specifies the number of levels to be generated.
max_level_width – (float) specifies the maximum width of the goal shape.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
CreativeStackedBlocksGeneratorTask¶
-
class causal_world.task_generators.CreativeStackedBlocksGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=8, max_level_width=0.12)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, blocks_min_size=0.035, num_of_levels=8, max_level_width=0.12)[source]¶
This task generator generates a task in the creative stacked blocks family, which generates a random configuration of stacked blocks; however, only the first and last levels are shown explicitly, and the rest is left to the “imagination” of the agent itself.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
blocks_min_size – (float) specifies the minimum block size/side length for the goal shape generator.
num_of_levels – (int) specifies the number of levels to be generated.
max_level_width – (float) specifies the maximum width of the goal shape.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
TowersGeneratorTask¶
-
class causal_world.task_generators.TowersGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, number_of_blocks_in_tower=array([1, 1, 5]), tower_dims=array([0.035, 0.035, 0.175]), tower_center=array([0, 0]))[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, number_of_blocks_in_tower=array([1, 1, 5]), tower_dims=array([0.035, 0.035, 0.175]), tower_center=array([0, 0]))[source]¶
This task generator will generate a task for stacking blocks into towers.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
number_of_blocks_in_tower – (nd.array) specifies the number of blocks in the tower in each direction x, y, z.
tower_dims – (nd.array) specifies the dimensions of the tower in each direction x, y, z.
tower_center – (nd.array) specifies the cartesian position of the center of the tower, x, y.
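The tower layout implied by these parameters can be sketched by dividing the tower dimensions by the per-axis block counts to get a block size, then placing block centers on the resulting grid above the floor. This is an illustrative reconstruction, not the generator's actual code:

```python
import numpy as np

def tower_block_centers(number_of_blocks, tower_dims, tower_center):
    """Centers of the blocks in a tower whose base center sits at
    tower_center (x, y) on the floor (illustrative sketch)."""
    n = np.asarray(number_of_blocks)
    dims = np.asarray(tower_dims, dtype=float)
    block_size = dims / n                     # one block per grid cell
    origin = np.array([tower_center[0] - dims[0] / 2,
                       tower_center[1] - dims[1] / 2,
                       0.0])
    centers = []
    for ix in range(n[0]):
        for iy in range(n[1]):
            for iz in range(n[2]):
                offset = (np.array([ix, iy, iz]) + 0.5) * block_size
                centers.append(origin + offset)
    return np.array(centers)
```

With the defaults above ([1, 1, 5] blocks in a 0.035 x 0.035 x 0.175 tower), this yields five 0.035-sized blocks stacked along z.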
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-
GeneralGeneratorTask¶
-
class causal_world.task_generators.GeneralGeneratorTask(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, tool_block_size=0.05, nums_objects=5)[source]¶
-
__init__(variables_space='space_a_b', fractional_reward_weight=1, dense_reward_weights=array([], dtype=float64), activate_sparse_reward=False, tool_block_mass=0.08, joint_positions=None, tool_block_size=0.05, nums_objects=5)[source]¶
This task generator generates a general/random configuration of blocks by dropping random blocks from the air and waiting until they come to rest; the resulting configuration becomes the new shape/goal that the actor needs to achieve.
- Parameters
variables_space – (str) the space to be used: ‘space_a’, ‘space_b’ or ‘space_a_b’.
fractional_reward_weight – (float) weight multiplied by the fractional volumetric overlap in the reward.
dense_reward_weights – (list of floats) specifies the reward weights for all the other reward terms calculated in the calculate_dense_rewards function.
activate_sparse_reward – (bool) specifies whether to sparsify the reward to +1 or 0, depending on whether the volumetric fractional overlap exceeds 90%.
tool_block_mass – (float) specifies the blocks’ mass.
joint_positions – (nd.array) specifies the joint positions to start the episode with; None to use the default.
tool_block_size – (float) specifies the blocks’ size.
nums_objects – (int) specifies the number of objects to be dropped from the air.
-
apply_task_generator_interventions(interventions_dict)[source]¶
- Parameters
interventions_dict – (dict) variables and their corresponding intervention values.
- Returns
(tuple) the first position indicates whether the intervention was successful, and the second position indicates whether the observation_space needs to be reset.
-