Micro-Data Learning: The Other End of the Spectrum

by Jean-Baptiste Mouret (Inria)

Many fields are now snowed under with an avalanche of data, which raises considerable challenges for computer scientists. Meanwhile, robotics (among other fields) can often only use a few dozen data points, because acquiring them involves a process that is expensive or time-consuming. How can an algorithm learn with only a few data points?

Watching a child learn reveals how well humans can learn: a child may need only a few examples of a concept to "learn it". By contrast, the impressive results achieved with modern machine learning (in particular, by deep learning) are made possible largely by the use of huge datasets. For instance, the ImageNet database used in image recognition contains about 1.2 million labelled examples; DeepMind's AlphaGo used more than 38 million positions to train its algorithm to play Go; and the same company used more than 38 days of play to train a neural network to play Atari 2600 games such as Space Invaders or Breakout.

Like children, robots have to face the real world, in which trying something might take seconds, hours, or days, and seeing the consequences of that trial might take much longer. When robots share our world, they are expected to learn like humans or animals, that is, in far fewer than a million trials. Robots are not the only ones cursed by the price of data: any learning process that involves physical tests or precise simulations (e.g., computational fluid dynamics) comes up against the same issue. In short, while data might be abundant in the virtual world, it is often a scarce resource in the physical world. I refer to this challenge as "micro-data" learning (see Figure 1).

Figure 1: Modern machine learning (e.g., deep learning) is designed to work with a large amount of data. For example, the Go player AlphaGo by DeepMind used a dataset of 38 million positions, and the deep reinforcement learning experiments from the same team used the equivalent of 38 days to learn to play Atari 2600 video games. Robotics is at the opposite end of the spectrum: most of the time, it is difficult to perform more than a few dozen trials. Learning with such a small amount of data is what we term "micro-data learning".

The first precept of micro-data learning is to choose as wisely as possible what to test next (active learning). Since computation tends to become cheaper every year, it is often effective to trade data resources for computational resources, that is, to employ computationally intensive algorithms to select the next data point to acquire. Bayesian optimisation [1] is such a data-efficient algorithm, and it has recently attracted a lot of interest in the machine learning community. Using the data acquired so far, this algorithm creates a probabilistic model of the function that needs to be optimised (e.g., the walking speed of a robot or the lift generated by an airfoil); it then exploits this model to identify the most promising points of the search space (a minimal sketch is shown below). It can, for example, find good values for the gait of a quadruped robot (a Sony Aibo, with 15 parameters to learn) in just two hours of learning.
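To make the first precept concrete, here is a minimal, self-contained sketch of Bayesian optimisation (not the implementation used in the work described here). It assumes a one-dimensional, discretised search space, a Gaussian process with a squared-exponential kernel, and an upper-confidence-bound acquisition rule; the objective function is a hypothetical stand-in for an expensive physical trial, such as measuring a walking speed.

```python
# Minimal Bayesian optimisation sketch: a Gaussian-process model plus an
# upper-confidence-bound (UCB) acquisition function over a 1-D grid.
import numpy as np

def objective(x):
    # Hypothetical stand-in for an expensive trial (e.g., measured speed).
    return -(x - 0.6) ** 2 + 0.05 * np.sin(15 * x)

def rbf_kernel(a, b, length_scale=0.15):
    # Squared-exponential kernel between two sets of 1-D points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    # Standard Gaussian-process regression equations.
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_query, x_train)
    k_inv = np.linalg.inv(k)
    mu = k_star @ k_inv @ y_train
    var = 1.0 - np.sum(k_star @ k_inv * k_star, axis=1)
    return mu, np.maximum(var, 0.0)

candidates = np.linspace(0.0, 1.0, 200)  # discretised search space
x_train = np.array([0.1, 0.9])           # two initial trials
y_train = objective(x_train)

for trial in range(10):                  # micro-data budget: 10 trials
    mu, var = gp_posterior(x_train, y_train, candidates)
    ucb = mu + 2.0 * np.sqrt(var)        # optimistic acquisition score
    x_next = candidates[np.argmax(ucb)]  # most promising point to test
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, objective(x_next))

print(f"best parameter found: {x_train[np.argmax(y_train)]:.3f}")
```

Each iteration spends computation (conditioning the Gaussian process and scanning all candidates) to decide which single expensive trial to run next, which is exactly the trade of computational resources for data resources described above.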
The second precept of micro-data learning is to exploit every bit of information from each test. For instance, when a robotic arm tries to reach a point in space, the learning algorithm can perform the movement and then, at the end of the trial, measure the distance to the target. In this case, each test corresponds to a single data point. However, the algorithm can also record the position of the "hand" every 10 ms, thus getting thousands of data points from a single test. This is a very effective approach for learning control strategies in robotics; for example, the Pilco algorithm can learn to balance a non-actuated pole on an actuated moving cart in 15-20 seconds (about 3-5 trials) [2].

The third precept of micro-data learning is to use the "right" prior knowledge. Most problems are simply too hard to be learned from scratch in a few trials, even with the best algorithms: the quick learning ability of humans and animals is due largely to their prior knowledge about what could and could not work. When using priors, it is critical to make them as explicit as possible, and to make sure that the learning algorithm can question or even ignore them. In academic examples, it can also be challenging to distinguish between prior knowledge that is useful and prior knowledge that actually gives the solution to the algorithm, which leaves nothing to learn.

We focused on prior knowledge in our recent article about damage recovery in robotics [3, L1]. In this scenario, a six-legged walking robot needs to discover a new way to walk by trial and error because it is damaged. Before the mission, a novel algorithm explores a large search space with a simulation of the intact robot to identify the most promising solution of each "family". Metaphorically, this algorithm takes the needles out of a haystack to make a stack of needles. If the robot is damaged, the learning algorithm, which is a derivative of Bayesian optimisation [1], exploits this prior knowledge to choose the best trials. In our experiments, the robot discovered compensatory gaits in less than two minutes and a dozen trials, for each of the five damage conditions that we tested [3].

In this learning approach, a data-efficient learning algorithm that works with the physical, damaged robot is guided by prior knowledge based on a simulation of the intact robot; a sketch of this combination follows below. This micro-data learning algorithm makes it possible to learn a complex task in only a few trials. The subsequent challenge is to exploit more knowledge from the trials [2] and to select the next trials while taking the context into account (e.g., potential obstacles).
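The sketch below illustrates this combination in simplified form; it is not the algorithm of [3], which additionally builds a behaviour-performance map with an evolutionary algorithm before the mission. Here, a hypothetical prior_from_simulation function stands in for the prediction obtained with the intact robot, and the Gaussian process models only the residual between the measured performance of the damaged robot and this prior, so a handful of physical trials is enough to correct it.

```python
# Sketch: Bayesian optimisation seeded with a simulation-based prior.
# The GP models the *difference* between reality and the prior, so the
# search starts from the simulation's knowledge rather than from zero.
import numpy as np

def prior_from_simulation(x):
    # Hypothetical performance predicted for the *intact* robot.
    return np.sin(3 * x)

def real_performance(x):
    # Hypothetical damaged robot: shifted and degraded w.r.t. the prior.
    return 0.7 * np.sin(3 * x + 0.5)

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def posterior(x_t, y_t, x_q, noise=1e-3):
    # GP on the residual y - prior(x); the prediction adds the prior back.
    r = y_t - prior_from_simulation(x_t)
    k_inv = np.linalg.inv(rbf(x_t, x_t) + noise * np.eye(len(x_t)))
    k_s = rbf(x_q, x_t)
    mu = prior_from_simulation(x_q) + k_s @ k_inv @ r
    var = 1.0 - np.sum(k_s @ k_inv * k_s, axis=1)
    return mu, np.maximum(var, 0.0)

candidates = np.linspace(0.0, 2.0, 300)  # discretised gait-parameter space
x_t = np.array([0.2])                    # a single initial trial
y_t = real_performance(x_t)

for _ in range(8):                       # about a dozen trials, as in [3]
    mu, var = posterior(x_t, y_t, candidates)
    x_next = candidates[np.argmax(mu + 2.0 * np.sqrt(var))]
    x_t = np.append(x_t, x_next)
    y_t = np.append(y_t, real_performance(x_next))

print(f"best gait parameter: {x_t[np.argmax(y_t)]:.2f}")
```

Because the model starts from the simulation's predictions rather than from zero knowledge, the acquisition rule can head almost directly for promising gaits, which is the intuition behind the few-trial recovery times reported in [3].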
Link:
[L1] http://www.resibots.eu

References:
[1] B. Shahriari et al.: "Taking the human out of the loop: A review of Bayesian optimization", Proceedings of the IEEE, 2016.
[2] M. P. Deisenroth, D. Fox, C. E. Rasmussen: "Gaussian processes for data-efficient learning in robotics and control", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[3] A. Cully et al.: "Robots that can adapt like animals", Nature, 2015.

Please contact:
Jean-Baptiste Mouret, Inria, France
jean-baptiste.mouret@inria.fr

Making Learning Physical: Machine Intelligence and Quantum Resources

by Peter Wittek (ICFO-The Institute of Photonic Sciences and University of Borås)

It is not only machine learning that is advancing rapidly: quantum information processing has witnessed several breakthroughs in recent years. In theory, quantum protocols can offer an exponential speedup for certain learning algorithms, but even contemporary implementations show remarkable results – this new field is called quantum machine learning. The benefits work both ways: classical machine learning finds more and more applications in problems in quantum computing.

After a history spanning over five decades, artificial general intelligence still remains out of reach. Machine learning has common roots with AI research, but it focuses on more attainable goals and has achieved tremendous success in many application fields. Similarly, a universal quantum computer still lies far in the future: the criterion for such a machine is the ability to simulate an arbitrary closed quantum system. Nevertheless, uses of quantum information processing are proliferating: two notable examples are quantum key distribution systems and quantum random number generators.

Recently, there has been a surge of interest in the intersection of machine learning and quantum information processing. Combining ideas from these two fields leads to tremendous benefits for both. We are collaborating on several subjects in this domain between ICFO-The Institute of Photonic Sciences, the Autonomous University of Barcelona, and the University of the Basque Country, all in Spain, as well as the University of Calgary, Canada.

At the highest level, abstracting away from the actual algorithms and focusing on the foundations of statistical learning theory, we can ask what it means to learn with quantum data and channels, what induction and transduction mean in this setting, how we can define figures of merit to quantify performance, and eventually how to establish bounds on generalisation performance using sample and model complexity. We studied supervised learning and proved that, in the asymptotic limit and under an assumption of exchangeability, quantum entanglement does not break our traditional notion of induction [L1]. This is an important stepping stone towards understanding the generalisation properties of quantum learning protocols.

The next natural question to ask is: given a universal quantum computer, what kind of protocols can we use for

Figure 1: Overview of the interplay between quantum information processing and machine learning.