On the left, the gripper is holding the brush and there are some objects (yellow cup, blue plastic block) in the background. On the right, the gripper is holding the yellow cup and the brush is in the background. If the left image was the desired outcome, a good reward function should “understand” that the two images above correspond to different objects. |