For locomotion tasks, we consider a legged robot walking in a dynamic environment with changing terrains, whose properties are only locally consistent. We pair every method with MPC for action selection and control, and report in Table 3 the average return computed over 500 episodes. Again, red numbers denote the best performance and the black ones represent the oracle performance. In all tasks, HyperDynamics outperforms all the baselines. It infers accurate system properties and generates corresponding dynamics models that match the oracle Expert Ensemble model on seen terrains, and it shows a clear advantage over that model when tested on unseen terrains.
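To make the evaluation protocol concrete, the following is a minimal sketch of MPC action selection with a learned dynamics model, followed by the averaging of returns over episodes. It assumes a random-shooting planner, which is one common MPC variant and not necessarily the one used in our experiments. The names `random_shooting_mpc`, `average_return`, `toy_dynamics`, and `toy_reward` are hypothetical, and the toy model stands in for both the learned dynamics and the simulator.

```python
import numpy as np


def toy_dynamics(states, actions):
    # Hypothetical stand-in for a learned dynamics model: s' = s + a.
    return states + actions


def toy_reward(states, actions):
    # Hypothetical reward: penalize the distance of the next state from the origin.
    return -np.sum((states + actions) ** 2, axis=1)


def random_shooting_mpc(dynamics_fn, reward_fn, state, action_dim,
                        horizon=10, num_candidates=256, rng=None):
    """Return the first action of the best sampled action sequence."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate action sequences uniformly in [-1, 1] (assumed action bounds).
    actions = rng.uniform(-1.0, 1.0, size=(num_candidates, horizon, action_dim))
    states = np.repeat(np.asarray(state, dtype=float)[None, :], num_candidates, axis=0)
    returns = np.zeros(num_candidates)
    # Roll every candidate sequence forward through the model and sum its rewards.
    for t in range(horizon):
        returns += reward_fn(states, actions[:, t])
        states = dynamics_fn(states, actions[:, t])
    return actions[np.argmax(returns), 0]


def average_return(dynamics_fn, reward_fn, initial_states, action_dim,
                   episode_len=20, **mpc_kwargs):
    """Run one episode per initial state, replanning at every step, and average the returns."""
    totals = []
    for s0 in initial_states:
        state, total = np.asarray(s0, dtype=float), 0.0
        for _ in range(episode_len):
            action = random_shooting_mpc(dynamics_fn, reward_fn, state,
                                         action_dim, **mpc_kwargs)
            total += float(reward_fn(state[None, :], action[None, :])[0])
            # Here the model doubles as the environment; a real evaluation steps the simulator.
            state = dynamics_fn(state[None, :], action[None, :])[0]
        totals.append(total)
    return float(np.mean(totals))
```

In an actual evaluation, `dynamics_fn` would be the model produced by each method, the environment step would come from the simulator, and the list of initial states would contain one entry per evaluation episode.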