Our algorithm quickly adapts a legged robot’s policy to dynamics changes. In this example, the battery voltage dropped from 16.8V to 10V which reduced motor power, and a 500g mass was also placed on the robot's side, causing it to turn rather than walk straight. The policy is able to adapt in only 50 episodes (or 150s of real-world data). |