Continuous goal-directed actions: advances in robot learning

  1. Santiago Morante Cendrero
Supervised by:
  1. Juan Carlos Gonzalez Victores Director
  2. Carlos Balaguer Bernaldo de Quirós Director

University of defense: Universidad Carlos III de Madrid

Date of defense: 4 March 2016

Examination committee:
  1. Vicente Matellán Olivera Chair
  2. María Dolores Blanco Rojas Secretary
  3. Antonio Barrientos Cruz Member

Type: Thesis

Abstract

Robot Programming by Demonstration (PbD) has several limitations. This thesis presents a solution to the shortcomings of PbD, inspired by goal-directed imitation applied to robots. A framework for goal imitation, called Continuous Goal-Directed Actions (CGDA), has been designed and developed. This framework provides a mechanism to encode actions as changes in the environment. CGDA learns the objective of the action rather than the movements made to perform it. With CGDA, an action such as "painting a wall" can be learned as "the wall changed its color by 50% from blue to red". Traditional robot imitation paradigms such as PbD would learn the same action as "move joint i by 30 degrees, then joint j by 43 degrees...". This thesis is innovative in providing a framework able to measure and generalize the effects of actions. It also innovates by creating metrics to compare and reproduce goal-directed actions. Reproducing actions encoded in terms of goals makes task reproduction independent of the robot configuration. This makes it possible to circumvent the correspondence problem (adapting the kinematic parameters from humans to robots). CGDA can complement current kinematic-focused paradigms, such as PbD, in robot imitation. CGDA action encoding is centered on the changes an action produces in the features of the object altered during the action. The features can be any measurable characteristic of the object, such as color, area, shape, etc. By tracking object features during human action demonstrations, a high-dimensional feature trajectory is created. This trajectory represents the object's states over time during the action, and it is the main resource for the generalization, recognition and execution of actions in CGDA. Around this framework, several components have been added to facilitate and improve the imitation.
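The idea of a feature trajectory can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how demonstrations are aligned, generalized or compared, so the sketch assumes linear resampling to a common length, point-wise averaging, and a mean Euclidean distance. The names `resample`, `generalize` and `distance` are hypothetical, and the single tracked feature (fraction of the wall painted red) is made up for the example.

```python
import math
from typing import List, Sequence

Trajectory = List[List[float]]  # one row of object-feature values per time step


def resample(traj: Sequence[Sequence[float]], n_steps: int) -> Trajectory:
    """Linearly resample a feature trajectory to n_steps rows (normalized time)."""
    if len(traj) < 2 or n_steps < 2:
        raise ValueError("need at least two samples and two output steps")
    last = len(traj) - 1
    out = []
    for k in range(n_steps):
        pos = k * last / (n_steps - 1)  # fractional index into the original rows
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        w = pos - lo
        out.append([(1.0 - w) * a + w * b for a, b in zip(traj[lo], traj[hi])])
    return out


def generalize(demos: Sequence[Sequence[Sequence[float]]], n_steps: int = 20) -> Trajectory:
    """Assumed generalization: average the demonstrations after resampling them."""
    resampled = [resample(d, n_steps) for d in demos]
    n_feat = len(resampled[0][0])
    return [
        [sum(r[t][f] for r in resampled) / len(resampled) for f in range(n_feat)]
        for t in range(n_steps)
    ]


def distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]], n_steps: int = 20) -> float:
    """Assumed metric: mean Euclidean distance between time-normalized trajectories."""
    ra, rb = resample(a, n_steps), resample(b, n_steps)
    total = 0.0
    for row_a, row_b in zip(ra, rb):
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(row_a, row_b)))
    return total / n_steps


# Two demonstrations of "painting a wall": the only feature tracked is the
# fraction of the wall that is red. No joint angles appear anywhere.
demos = [
    [[0.0], [0.25], [0.5]],
    [[0.0], [0.1], [0.3], [0.5]],
]
goal = generalize(demos, n_steps=3)
attempt = [[0.0], [0.0], [0.05]]  # an execution that barely painted anything
print(goal, distance(goal, attempt))
```

Because the encoding only mentions object features, the same `goal` trajectory could in principle be compared against executions by any robot, which is the configuration independence the abstract describes.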
Naïve implementations of robot learning frameworks usually assume that all the data from the user demonstrations has been correctly sensed and is relevant to the task. This assumption proves wrong in most human-demonstrated learning scenarios. This thesis presents an automatic demonstration and feature selection process to solve this issue. This machine learning pipeline is called Dissimilarity Mapping Filtering (DMF). DMF can filter out both irrelevant demonstrations and irrelevant features. Once an action has been generalized from a series of correct human demonstrations, the robot must be provided with a method to reproduce it. Robot joint trajectories are computed in simulation using evolutionary computation, through several proposed strategies. This computation can be improved by using human-robot interaction. Specifically, a system for robot discovery of motor primitives from random human-guided movements has been developed. These Guided Motor Primitives (GMP) are combined to reproduce goal-directed actions. To test all these developments, experiments have been performed using a humanoid robot in a simulated environment and the real full-sized humanoid robot TEO. A brief analysis of the cyber safety of current robots is additionally presented in the final appendices of this thesis.
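As a rough illustration of computing joint values in simulation with evolutionary computation, the following Python sketch runs a simple (1+1) evolution strategy. It is only a sketch under stated assumptions: the abstract does not describe the actual strategies, so the mutation scheme, the cost function and the toy `simulate` function (the tip position of a two-link planar arm, standing in for a real simulator that would report object features) are all hypothetical.

```python
import math
import random


def simulate(joints):
    """Hypothetical simulator stand-in: tip position of a 2-link planar arm
    with unit-length links. A real setup would return measured object features."""
    q1, q2 = joints
    x = math.cos(q1) + math.cos(q1 + q2)
    y = math.sin(q1) + math.sin(q1 + q2)
    return (x, y)


def cost(joints, target):
    """Distance between the simulated feature values and the goal feature values."""
    fx, fy = simulate(joints)
    return math.hypot(fx - target[0], fy - target[1])


def evolve(target, iterations=2000, sigma=0.2, seed=0):
    """(1+1) evolution strategy: mutate the parent with Gaussian noise and
    keep the child only if it is strictly closer to the goal."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best = [0.0, 0.0]
    best_cost = cost(best, target)
    for _ in range(iterations):
        child = [q + rng.gauss(0.0, sigma) for q in best]
        child_cost = cost(child, target)
        if child_cost < best_cost:
            best, best_cost = child, child_cost
    return best, best_cost


joints, final_cost = evolve(target=(1.0, 1.0))
print(joints, final_cost)
```

The search never looks at how a human moved; it only scores candidate joint values by how close the simulated outcome is to the goal, which matches the goal-directed framing of the abstract.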