
HELSINKI UNIVERSITY OF TECHNOLOGY Electrical and Communications Engineering

Olli Haavisto

Development of a walking robot model and its data-based modeling and control

Master's thesis submitted for examination for the degree of Master of Science in Technology. Espoo, September 2, 2004.

Supervisor

Professor Heikki Hyötyniemi

HELSINKI UNIVERSITY OF TECHNOLOGY

Summary of the thesis

Author: Olli Haavisto. Title of thesis: Development of a walking robot model and its data-based modeling and control

Date: 02/09/2004

Number of pages: 72

Department: Electrical and Communications Engineering Chair: AS-74 Control Engineering

Supervisor: Prof. Heikki Hyötyniemi

The control of walking robots is a challenging problem that calls for multivariable methods. This thesis deals with the modeling and control of a two-legged walking robot. First, the dynamics equations of the robot model are derived using Lagrangian mechanics, and a simulation tool for the model is developed in the Matlab/Simulink environment. The model is made to walk with separate PD controllers that track continuously updated reference signals. Input and response data collected from the PD-controlled system are then used to model, in a data-based fashion, the mapping needed for the inverse dynamics of walking, i.e., the mapping from the system state to the control. The model structure is clustered regression, in which the overall model consists of local principal component regression models. The resulting model is used to control the robot so that the control estimate given by the regression model for the current state is fed directly to the walker as the control signal. It is shown that the clustered regression model can reproduce the PD-controlled gait almost unchanged. On the other hand, the control proves to be rather sensitive to disturbances that drive the system away from the learned region of the state space, and optimization of the gait on the basis of the data-based control could not be carried out.

Keywords: two-legged, walking robot, data-based modeling, clustered regression, principal components, Lagrangian mechanics

Helsinki University of Technology

Abstract of the Master's Thesis

Author: Olli Haavisto. Name of the thesis: Development of a walking robot model and its data-based modeling and control

Date: 02/09/2004

Number of pages: 72

Department: Electrical and Communications Engineering. Professorship: AS-74 Control Engineering

Supervisor: Prof. Heikki Hyötyniemi

The control of walking robots is a challenging multivariable problem. This thesis concerns the modeling and control of a biped robot. The first aim of the work is to derive the dynamics model of the chosen robot using Lagrangian methods and to develop a Matlab/Simulink tool for the simulation of the model. To make the model walk, separate PD controllers are used to control the biped according to continuously updated reference signals. The second aim is to model the inverse dynamics of the biped gait, that is, the mapping from the system output to the input, using input-output data collected from the PD-controlled system. The applied model structure is clustered regression, which combines several local principal component regression models. The model is then utilized to control the biped so that the control signal estimate corresponding to the current system state is used directly as an input for the system. It is shown that the clustered regression model can repeat the PD-controlled gait with quite good accuracy. The controlled system is, however, relatively sensitive to errors which drive it out of the learned state space regions, and the optimization of the control is not possible.

Keywords: biped, walking robot, data-based modeling, clustered regression, principal component regression, Lagrangian mechanics

Preface
This work was carried out at the Control Engineering Laboratory of Helsinki University of Technology as a continuation of the research on data-based modeling and control started in the summer of 2002. I thank my supervisor, Professor Heikki Hyötyniemi, for his encouraging guidance and advice, as well as for the opportunity to do this thesis. I thank the whole staff of the laboratory for creating a positive and inspiring working atmosphere. I would also like to thank my parents and my brother for the support I have received from them.

Espoo, 02.09.2004

Olli Haavisto

Contents
List of symbols

1 Introduction

2 Modeling and control of two-legged walking robots
  2.1 Modeling
  2.2 Passive dynamic walking
  2.3 Optimal trajectories
  2.4 Neural network based control
  2.5 Genetic programming

3 Walker simulation
  3.1 Model
  3.2 Model equations
  3.3 Simulink implementation
    3.3.1 Walker dynamics
    3.3.2 Ground support forces
    3.3.3 Knee angle limiters
  3.4 Simulator user interface
  3.5 Walker parameters

4 PD control
  4.1 Controllers
  4.2 Reference signals
  4.3 Parameters
  4.4 Walking motion
  4.5 Simulink implementation

5 Local learning
  5.1 Background
  5.2 Locally weighted regression
  5.3 Local models
  5.4 Examples and applications

6 Clustered regression
  6.1 Principle
  6.2 Formation of the local principal component regression models
  6.3 Calculation of the control
  6.4 Optimization
    6.4.1 Optimal control
    6.4.2 Dynamic programming
    6.4.3 Optimization of the clustered regression structure
  6.5 Earlier applications

7 Application of the clustered regression controller to walker control
  7.1 Controller implementation in Simulink
  7.2 Training data
  7.3 Data clustering
  7.4 Number of operating points and selection of the feature variables
  7.5 Teaching the internal models of the operating points
  7.6 The learned walking
  7.7 Adaptive teaching
  7.8 Repetition of the teaching

8 Conclusions

A Lagrangian mechanics
  A.1 Generalized coordinates
  A.2 Lagrange equations

B Dynamic model equations

C Multivariate regression methods
  C.1 Principal component analysis
  C.2 Principal component regression
  C.3 Regression algorithm based on Hebbian and anti-Hebbian learning (HAH)
    C.3.1 Neural network structure and operation
    C.3.2 Principal component analysis
    C.3.3 Principal component regression

List of symbols
A(q)             inertia matrix of the walker dynamics equations
b(q, q', M, F)   right-hand side of the walker dynamics equations
pC_u             location of operating point p in the state space
pC_y             location of operating point p in the control space
D                arbitrary n-by-n orthogonal matrix
f(ξ(k), v(k))    state transition function of a general system
F                vector of the ground support forces
F_qi             generalized force associated with the generalized coordinate q_i
g                acceleration of gravity
G                regression mapping
h                sample interval of the discretization
H                weighting matrix of the cost
I_n              n-by-n identity matrix
J                cost criterion
k                discrete time index
K_p              weighting factor of the control estimate of operating point p
l                dimension of the controller output vector
l0, l1, l2       lengths of the walker members
m                dimension of the controller input vector
m0, m1, m2       masses of the walker members
M                torque vector
n                number of principal components
N                number of samples in the teaching data
N_p              number of teaching samples associated with operating point p
N_cr             number of operating points
p                operating point index
p*               index of the best operating point
pP_xx            inverse covariance matrix of the feature data (at operating point p)
q                vector of the generalized coordinates
δq_r             virtual displacement along the coordinate q_r
r0, r1, r2       distances of the walker masses from the joints
pR_xu            cross-covariance of the signals x and u (at operating point p)
R(·)             dynamics model of the robot
s_L, s_R         contact sensor values of the left (L) and right (R) foot
T                kinetic energy
u(k)             zero-mean, scaled input vector of the controller
u_real(k)        system state, i.e., the unscaled controller input vector
û(k)             reconstructed controller input vector
ũ(k)             controller input vector without the contact sensor values
U                controller input data
v(k)             control of a general system
w_i              i-th principal component vector
W                principal component basis
δW               virtual work
x(k)             feature variable vector
(x0, y0)         coordinates of the center of mass of the walker upper body
X                feature variable data
y(k)             zero-mean, scaled control vector of the controller
y_real(k)        control of the system, i.e., the controller output vector
Y                control data
z(k)             whitened feature variable vector
φ                torso angle of the walker
α_L, α_R         thigh angles of the left and right leg
α_Δ              difference of the thigh angles, α_R − α_L
β_L, β_R         knee angles of the left and right leg
θ                joint angles of the robot
Θ                matrix formed by the eigenvectors
λ_i              eigenvalue
Λ                eigenvalue matrix
μ_k, μ_s         kinetic and static friction coefficients
σ²               variance of the control estimate weighting function
σ_n²             variance of the neighborhood function
τ                continuous time variable
ξ(k)             state of a general system
0_n              n-by-n zero matrix

1 Introduction

Controlling a walking robot is typically a challenging problem. The complex mechanical structure of the robot leads to difficult, nonlinear dynamics equations, which makes the mathematical treatment of the system harder. The control problem of walking requires multivariable methods, since there are usually many actuators and the cross-couplings of the different control variables are strong. Two-legged walkers alone have been studied extensively, and a large variety of control methods has been developed for them. Alongside more traditional control algorithms based on precomputed trajectories, data-based approaches using, for example, neural networks have been introduced.

Data-based modeling describes the operation of the system using only measured input and response data, so the internal structure of the device does not have to be known. The choice of the model structure strongly affects the modeling, but the data used must also contain sufficient information about the phenomena to be modeled. To describe a complex system, the problem is usually divided into smaller units, so that the structure of each part remains simpler. When a model is available for each part, the whole system can easily be analyzed by combining the models. Such modeling is scalable: the same method can handle very simple as well as more complicated systems.

This work combines two different modeling approaches. The first objective is to construct a dynamic simulation model of a two-legged walker by deriving the exact dynamics equations of the system with Lagrangian mechanics. The simulation of the model is implemented in the Matlab/Simulink environment. Using the simulation model and a simple controller connected to it, walking is simulated and input-output data are collected from the system. The second objective is to study the use of clustered regression, a piecewise linearized regression structure, in the data-based modeling of the walker dynamics, and the suitability of the resulting model for controlling the system. In addition, the possibilities of optimizing the control by updating the model are examined.

Clustered regression is used to form a model of the inverse dynamics of the system. The mapping between the system state and the required control along a given trajectory is stored directly in the structure of the model, so that the control estimate corresponding to the measured state can be used directly as the control. Because of its simple structure and data-based learning, the clustered regression controller can be compared with the operation of biological systems: the learned model acts as a memory that, on the basis of simple measurement signals, produces a control suitable for the situation.

A similar approach has been applied to the control of a simple robot arm [1], and the results showed that the method works well. Controlling a walker is, however, a considerably more challenging problem, because the dynamics of the system is more complex and varies strongly during the walking cycle.

The work is organized so that Chapter 2 presents, on the basis of the literature, various methods that have been used for the modeling and control of two-legged walking robots. Chapters 3 and 4 describe the walker model and the PD control developed in this work. The theory of the data-based modeling and its application to controlling the walker are covered in Chapters 5-7, and the results of the whole work are summarized in Chapter 8.

2 Modeling and control of two-legged walking robots

A number of different approaches can be applied to the modeling and control of two-legged walking robots. The simplest devices are passive walkers, whose operation is based on a suitable mechanical structure and which do not require external control signals. The other extreme is represented, for example, by the complex humanoids developed by Honda, which require powerful controls and whose movements try to mimic human behavior as accurately as possible.

This chapter first discusses the modeling of two-legged walkers and then reviews various methods that have been used for their control.

2.1 Modeling

In the modeling it is appropriate to concentrate on the structures and functions that are essential for walking. Quite often the treatment of two-legged walkers is simplified to a two-dimensional situation in which the advancing walking robot is examined from the side. The walking motion can be roughly divided into two parts. In the double support phase the walker has both feet on the ground and the weight moves from the rear foot onto the front foot. In the swing phase one leg supports the system while the rear leg swings forward. The whole walk can be described by repeating these phases with the feet acting alternately as the support foot. If running is also to be modeled, the situation in which neither foot touches the ground must be taken into account as well.

Walking robots are usually controlled with torques applied to the joints. Implementing the control is not straightforward, because the system almost always has more degrees of freedom than controlled variables. The situation is made even more difficult by the fact that the system dynamics varies strongly as the walker moves from one phase of the walk to another.

2.2 Passive dynamic walking

Passive dynamic walking is based solely on exploiting the dynamics of the walker. The walking requires no external energy, but passive walkers can maintain a stable, repetitive walking motion only on slightly downhill surfaces. In these devices one leg swings freely under its own weight while the other leg supports the system against the ground. Finally the weight moves onto the swung foot, and the other foot in turn swings forward.

McGeer [2, 3] showed that a suitably constructed two-legged device is able to walk down a gentle slope without any active control. Since then, a variety of passive walkers have been built and tested (for example [4, 5]). The walking of such a simple device looks surprisingly "natural", especially when the walker has knee joints. Humans, too, largely exploit the dynamic structure of the body and legs in their walking, so that the motion requires as little energy as possible.

Purely passive walkers have the essential limitation that they can only walk downhill. By adding low-power actuation to the system, the walker can be made to maintain a stable walking motion on a flat surface or on a slight uphill slope [6, 7], while the energy consumption stays close to the minimum.

2.3 Optimal trajectories

Precomputed optimal trajectories are commonly used in the control of walkers. In this approach the system is controlled in such a way that selected variables follow reference trajectories stored in advance. A problem is the disturbance of the balance caused, for example, by unevenness of the ground or by other deviations from ideal conditions. The methods of maintaining the balance can be divided into two groups: static and dynamic walkers. Static walkers remain upright because the center of mass of the device is kept, by continuous control, above the area covered by the supporting leg or legs. The method requires powerful actuators and often leads to a slow and clumsy gait. On the other hand, the walker is always in a stable state: it remains standing even if the power is switched off. Dynamic walkers balance themselves by means of the point on the ground through which the vertical component of the total support force of the system passes (the zero moment point, ZMP). The point is kept as close to the middle of the support area as possible; when it moves to the edge of the support area, the device starts to tip over. For example, in the Honda robots [8] the maintenance of the balance is based on the continuous adjustment of the desired and actual locations of this point.

2.4 Neural network based control

The strength of neural networks is their ability to model complicated nonlinear functions. They have been applied to the control of two-legged walkers in several studies, usually to perform some specific calculation inside the controller. Such a task can be, for example, the computation of the inverse kinematics [9]. Oscillating neurons or neural networks (neural oscillators) can be used to generate the periodic reference trajectories of the walker [10] or to compute the joint torques directly [11]. Neural networks are also used in adaptive systems: typically the network is taught the nonlinear dynamics of the system, which is then exploited in the operation of the controller [12].

2.5 Genetic programming

The principle of genetic algorithms mimics evolution in nature in order to produce ever better solutions to a given problem. A candidate solution, for example a set of parameter values, is first encoded as a string. A number of these solutions (individuals) are generated at random to form a population, which is the starting point of the algorithm. The goodness of each solution is evaluated with a fitness function, which gives the solution a fitness value. The higher the fitness value, the better the solution.

The best individuals of the population produce the next generation in three different ways: they are copied as such into the new population, they are copied slightly mutated, or a pair of individuals produces two offspring whose structure is a mixture of both original individuals. The individuals used for creating the new population are selected on the basis of the fitness values, but the selection also involves randomness. In this way drifting into local minima of the fitness function is avoided.

Genetic programming [13] combines automatic programming and genetic algorithms. The individuals of the evolving population are computer programs of various forms, whose fitness is determined by running them on the task to be solved. Two programs can also be crossed: random parts of the programs are exchanged with each other, which may create better programs. Since the gradually evolving control algorithms may also produce controls that are unsafe for the robot structure, a simulator is usually used first when the fitness values are computed. For example, the Sigel simulator [14] makes it possible to simulate different robot structures and to test control programs produced with genetic programming. When real robots are simulated, the control algorithms obtained with the simulation model can be transferred to the actual system, provided that the simulator describes the robot sufficiently well [15].

Developing control algorithms by simulation requires a lot of time, because the fitness of every candidate control has to be evaluated with simulations. The simulation of real physical systems is generally slow, so finding a good control may take a long time. In addition, problems are caused by the differences between the simulation model and the actual system.


3 Walker simulation

The first objective of this work was to develop a walking robot model that can be used for testing different control methods by simulation and for collecting input-output data from a walking system. The chosen model is a two-legged walker whose dynamics equations are derived with Lagrangian mechanics. The actual simulation is carried out in the Matlab/Simulink environment, and a graphical user interface was developed for examining the simulation results.

In the following, the system model and the formation of its dynamics equations are first described in detail. After this, the Simulink implementation of the model and the parameter values used in the simulations are presented. A more detailed description of the Simulink model and the graphical user interface is given in separate documentation [16].

3.1 Model

The system model used in this work describes a strongly simplified two-legged walker. To speed up the simulation and to simplify the calculations, the model is two-dimensional, so that motion in the lateral direction can be ignored. The identical legs of the model consist of rigid thigh and shank parts connected by knee joints. The upper body consists of a single rigid member, to which the legs are attached by hip joints. Figure 1(a) shows the structure of the walker and the variables used to describe the state of the system.

Figure 1: The system variables and constants (a), and the external forces and moments (b).


To describe the position and orientation of the system in the two-dimensional coordinate frame, at least seven variables are needed, i.e., the system has seven degrees of freedom. The coordinates (x0, y0) determine the position of the center of mass of the upper body, and the angle φ its deviation from the vertical (y-axis) direction. The postures of the left (L) and right (R) leg are described relative to the upper body by the hip and knee joint angles (α_L, α_R, β_L, β_R). The lengths of the torso, thighs and shanks are given by the parameters l0, l1 and l2 according to Figure 1(a). The center of mass of the upper body (mass m0) lies at the distance r0 from the hips. The center of mass of each thigh (mass m1) is assumed to lie on the straight line through the hip and knee joints, at the distance r1 from the hip joint. Similarly, the centers of mass of the shanks (mass m2) are located on the line through the knee and the tip of the foot, at the distance r2 from the knee.

To model the ground, a horizontal and a vertical external force (F_Lx, F_Ly, F_Rx, F_Ry) can act on the tips of the two legs, which makes it possible to simulate different ground materials and also an uneven walking surface in the simulations (Figure 1(b)). The external forces are produced by a spring-damper type controller after the tip of the foot hits the ground. The actual control signals of the model are the moments between the upper body and the thighs (M_L1, M_R1) and the knee joint moments (M_L2, M_R2).

The interaction of the feet with the ground is thus implemented with separate external forces, so that the same model describes the dynamics of the walker at all times. This makes it possible to simulate arbitrary motions with the same model. In addition, the dynamics model is holonomic: no external mechanical constraints restrict the motion of any part of the system. Another possible approach would be to use separate models depending on the number of feet touching the ground, and to assume that a foot touching the ground does not slip. The required models would then be simpler, but the transitions between the models would have to be calculated separately.

Two-legged robots similar to the model used in this work have been studied extensively. Structurally identical is, for example, the RABBIT robot [17], for which a simulation model has also been derived. In the modeling of that robot it is, however, assumed that when the swinging foot hits the ground, the other foot immediately leaves it. All phases of the walk can then be described with a single model, but the change of the support leg causes a discontinuous jump in the state of the system, which is calculated separately at each step. Likewise, that simulation model can only repeat alternating single-support steps; for example, the simultaneous contact of both feet with the ground cannot be simulated.


3.2 Model equations

The dynamics of the walker was modeled using the Lagrangian technique (Appendix A). The state of the system is determined by the generalized coordinates

    q = [x0, y0, φ, α_L, α_R, β_L, β_R]^T    (1)

and their time derivatives. Each coordinate is associated with a corresponding generalized force:

    F_q = [F_x0, F_y0, F_φ, F_αL, F_αR, F_βL, F_βR]^T.    (2)

Denote the Cartesian coordinates of the centers of mass of the thighs by (x_L1, y_L1) and (x_R1, y_R1), those of the shanks by (x_L2, y_L2) and (x_R2, y_R2), and the coordinates of the tips of the feet by (x_LG, y_LG) and (x_RG, y_RG). The coordinates of the left leg can be expressed in terms of the generalized coordinates as

    x_L1 = x0 − r0 sin φ − r1 sin(φ − α_L)
    y_L1 = y0 − r0 cos φ − r1 cos(φ − α_L)
    x_L2 = x0 − r0 sin φ − l1 sin(φ − α_L) − r2 sin(φ − α_L + β_L)    (3)
    y_L2 = y0 − r0 cos φ − l1 cos(φ − α_L) − r2 cos(φ − α_L + β_L)
    x_LG = x0 − r0 sin φ − l1 sin(φ − α_L) − l2 sin(φ − α_L + β_L)
    y_LG = y0 − r0 cos φ − l1 cos(φ − α_L) − l2 cos(φ − α_L + β_L).

The corresponding coordinates of the right leg are obtained by substituting the right leg angles α_R and β_R for the left leg angles α_L and β_L in equations (3). The kinetic energy of the system is easy to express as the sum of the kinetic energies of the mass points in Cartesian coordinates:

    T = (1/2) m0 (ẋ0² + ẏ0²) + (1/2) m1 (ẋ_L1² + ẏ_L1² + ẋ_R1² + ẏ_R1²) + (1/2) m2 (ẋ_L2² + ẏ_L2² + ẋ_R2² + ẏ_R2²).    (4)

The expression for the generalized force F_qr corresponding to a generalized coordinate q_r is obtained by giving the coordinate a virtual displacement δq_r while keeping the other generalized coordinates constant. The virtual work δW_qr done by all forces acting on the system in this displacement gives the desired force according to

    δW_qr = F_qr δq_r.    (5)

Written out in terms of the generalized coordinates, the generalized forces are

    F_x0 = F_Lx + F_Rx
    F_y0 = −(m0 + 2m1 + 2m2) g + F_Ly + F_Ry
    F_φ  = −(m1 ∂y_L1/∂φ + m2 ∂y_L2/∂φ + m1 ∂y_R1/∂φ + m2 ∂y_R2/∂φ) g
           + F_Ly ∂y_LG/∂φ + F_Ry ∂y_RG/∂φ + F_Lx ∂x_LG/∂φ + F_Rx ∂x_RG/∂φ    (6)
    F_αL = −(m1 ∂y_L1/∂α_L + m2 ∂y_L2/∂α_L) g + F_Ly ∂y_LG/∂α_L + F_Lx ∂x_LG/∂α_L + M_L1
    F_βL = −m2 (∂y_L2/∂β_L) g + F_Ly ∂y_LG/∂β_L + F_Lx ∂x_LG/∂β_L + M_L2.

The forces of the right leg are obtained by replacing the left-leg quantities in the expressions of F_αL and F_βL with the corresponding right-leg quantities. To solve the dynamics equations of the system, the kinetic energy (4) is expressed in terms of the generalized coordinates using the transformation equations (3). The resulting expression, together with the generalized force expressions (6), is substituted into the Lagrange equations

    d/dt (∂T/∂q̇_r) − ∂T/∂q_r = F_qr.    (7)

The resulting group of seven second-order differential equations can be written in matrix form as

    A(q) q̈ = b(q, q̇, M, F),    (8)

where

    M = [M_L1, M_R1, M_L2, M_R2]^T    (9)

contains the external torques acting on the model and

    F = [F_Lx, F_Ly, F_Rx, F_Ry]^T    (10)

the ground support forces of Figure 1(b). The vector b(q, q̇, M, F) contains at most first-order time derivatives of the generalized coordinates, whereas the inertia matrix A(q) contains no time derivatives of the generalized coordinates at all. The substitution into the Lagrange equations (7) and the mechanical manipulation into the final form (8) were carried out with the Mathematica software. The obtained expressions for the elements of the matrix A(q) and of the vector b(q, q̇, M, F) were then converted into Matlab form so that they could be used in the simulation. The element expressions of A(q) and b(q, q̇, M, F) are given in Appendix B.
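For readers who prefer a scripted derivation, the following is a minimal sketch of the same Lagrangian manipulation using the Matlab Symbolic Math Toolbox on a reduced two-link example. The thesis itself carried this step out in Mathematica for the full seven-degree-of-freedom walker; all variable names below are illustrative assumptions of this sketch.

```matlab
% Hedged sketch: forming A(q)*qdd = b(q,qd,Fq) symbolically from a kinetic
% energy T(q,qd), as in equations (7)-(8), for a toy two-link chain.
syms m0 m1 r0 r1 real
q  = sym('q',  [2 1], 'real');   % two generalized coordinates (angles)
qd = sym('qd', [2 1], 'real');   % their time derivatives
Fq = sym('Fq', [2 1], 'real');   % generalized forces (gravity, torques, ...)

% Cartesian positions of the two mass points (stand-in for equations (3))
x1 = -r0*sin(q(1));            y1 = -r0*cos(q(1));
x2 = x1 - r1*sin(q(1)-q(2));   y2 = y1 - r1*cos(q(1)-q(2));
v1 = jacobian([x1; y1], q)*qd; % Cartesian velocities via the chain rule
v2 = jacobian([x2; y2], q)*qd;
T  = 1/2*m0*(v1.'*v1) + 1/2*m1*(v2.'*v2);   % kinetic energy, cf. equation (4)

% Lagrange equations: d/dt(dT/dqd) - dT/dq = Fq, where
% d/dt(dT/dqd) = (d2T/dqd^2)*qdd + (d2T/(dqd dq))*qd
A = simplify(hessian(T, qd));                       % inertia matrix A(q)
b = simplify(Fq + jacobian(T, q).' ...
             - jacobian(jacobian(T, qd).', q)*qd);  % right-hand side b
```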


3.3 Simulink implementation

The dynamics of the walker were simulated with a Simulink model in the Matlab environment, in which the dynamics equations of the model and the calculation of the necessary support forces are combined into a single block (Figure 2).

Figure 2: The Simulink model of the walker consists of blocks for the dynamics equations, the ground support forces and the stopping moments of the knee angles.

The input signal of the Biped model block is a vector containing the torques (9) that drive the model. The output signals are the generalized coordinates (1), their first time derivatives and the values of the touch sensors of the legs, grouped as [q^T, q̇^T, s_L, s_R]^T. If a foot touches the ground, the corresponding contact signal (s_L or s_R) rises to one; for a foot in the air the signal value is zero. The dynamics of the system and the contact with the ground are simulated in continuous time. To make the block suitable for discrete-time control, its input and output signals are, however, discretized with a zero-order hold. The block also stores the discretized input and output signals in the Matlab workspace. The parameters and the initial state of the simulated walker are given in the mask dialog of the block.

3.3.1 Walker dynamics

The block that carries out the simulation of the dynamics equations (8) is shown in Figure 3. Since solving the acceleration vector q̈ in closed form is not feasible in practice, the inverse of the matrix A(q) has to be computed numerically at every integration step.

Figure 3: The dynamics block simulates the actual dynamics equations of the walker simulation model.
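A minimal sketch of this numerical step is given below, assuming that the symbolic expressions of Appendix B have been exported as Matlab functions; the names walkerA and walkerB are hypothetical placeholders for those generated functions, not the actual simulator code.

```matlab
% Hedged sketch of the continuous-time state derivative used inside the
% dynamics block. walkerA and walkerB are assumed to evaluate A(q) and
% b(q,qd,M,F) from the expressions of Appendix B.
function dxdt = bipedDerivative(x, M, F, par)
% x   : state [q; qd] with q as in equation (1) (14 elements in total)
% M   : joint torques, equation (9)
% F   : ground support forces, equation (10)
% par : walker parameters (masses, lengths, mass positions)
q   = x(1:7);
qd  = x(8:14);
A   = walkerA(q, par);            % 7x7 inertia matrix of equation (8)
b   = walkerB(q, qd, M, F, par);  % right-hand side of equation (8)
qdd = A \ b;                      % solve A(q)*qdd = b numerically
dxdt = [qd; qdd];
end
```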

3.3.2 Ground support forces

The shape of the walking surface is given to the model as a parameter in the form of a polyline. Separate PD controllers produce the support forces acting on the tips of the walker's legs when a foot touches the ground, so in practice the ground behaves like a spring-damper system. To calculate the support forces of one leg, the position and velocity of the foot tip are first projected onto the normal and tangential directions of the ground. The normal force is produced by a PD controller in such a way that the normal component F_n of the support force is limited to positive values: the foot cannot grab hold of the ground.

In the tangential direction the friction properties of the ground are taken into account. When the foot hits the ground, the deviation from the landing point is fed to a PD controller whose output gives the tangential force F_t. If, however, the required force exceeds the maximum static friction force

    F_t,max = μ_s F_n,    (11)

where μ_s is the static friction coefficient of the ground, the foot begins to slip. In that case the tangential support force is determined by the kinetic friction,

    F_t = μ_k F_n,    (12)

where μ_k is the kinetic friction coefficient. Finally, the normal and tangential forces are projected back onto the vertical and horizontal forces (10), which are fed to the dynamics model. The calculation of the forces takes place in the Ground contact block (Figure 2).
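As an illustration of the spring-damper interpretation described above, the following sketch computes the support force components of one foot on a flat, horizontal surface; the gain names and the function interface are assumptions of this example, not the actual simulator implementation.

```matlab
% Hedged sketch: support force of one foot on a flat, horizontal ground.
% pn, vn : penetration depth below the surface and velocity along the normal
% dt, vt : tangential deviation from the landing point and its velocity
% par    : ground parameters (spring/damper gains, friction coefficients)
function [Fn, Ft] = groundForce(pn, vn, dt, vt, par)
Fn = max(0, par.kGround*pn - par.cGround*vn);  % normal force, limited to >= 0
Ft = -(par.kGround*dt + par.cGround*vt);       % tangential PD force on the deviation
if abs(Ft) > par.muS*Fn                        % static friction exceeded, equation (11)
    Ft = -sign(vt)*par.muK*Fn;                 % slipping: kinetic friction, equation (12)
end
end
```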

3.3.3 Knee angle limiters

The knee angles are limited using the same principle as in the calculation of the normal component of the ground support force. When a knee angle exceeds its maximum or falls below its minimum allowed value, a PD controller for that angle is switched on. The controller adds its control signal directly to the torque value of the joint in question. The control of the minimum knee angle limiter is restricted to positive values in such a way that it never resists the bending of the knee. Correspondingly, the maximum knee angle limiter never resists the straightening of the knee. The calculation takes place in the Knee stopper sub-blocks of the walker simulation block (Figure 2).

3.4 Simulator user interface

When the walker model is used, the torques controlling it are fed to the input of the Simulink block. The state of the walker is then available at the output of the block and can be used, for example, for calculating the control torques. For running the simulations and visualizing the results, a graphical user interface (Figure 4) was developed in Matlab.

Figure 4: The simulator user interface.

All the parameters of the walker and the ground are defined in a common Matlab file, which the simulator runs before the simulation and the animation. The parameters needed by a possible controller or by other blocks added to the model can also be placed in the same file. The model can be simulated for a desired period of time, and once the simulation has been carried out, the behavior of the walker can be examined as an animation.

3.5 Walker parameters

The walker parameters used in this work are shown in Table 1. The masses of the walker members were chosen relatively small compared with the member lengths, so that implementing the PD control is easier and the magnitudes of the control signals do not grow unreasonably large. The masses were placed in the middle of each member. To prevent the legs from slipping during walking, relatively large friction coefficients were used for the ground. The ground was chosen flexible enough that the dynamics of the PD controllers producing the support forces remained sufficiently slow. With a hard surface the simulation times grow, because the step size of the integration algorithm must be reduced so that the abrupt dynamics can be solved with sufficient accuracy. (A Matlab sketch of a corresponding parameter file is given after Table 1.)

Table 1: Parameter values of the walker and the ground.

Masses (kg): m0 = 5, m1 = 2, m2 = 1
Lengths (m): l0 = 0.8, l1 = 0.5, l2 = 0.5
Positions of the masses (m): r0 = 0.4, r1 = 0.25, r2 = 0.25
Ground properties: static friction coefficient μ_s = 1.2, kinetic friction coefficient μ_k = 0.6, spring constant 1000 kg/s², damping coefficient 500 kg/s
Knee angle limiter parameters: spring constant 1000 Nm/rad, damping coefficient 100 Nms/rad
Other parameters: acceleration of gravity g = 9.81 m/s²
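A minimal sketch of the kind of common parameter file mentioned in Section 3.4, filled in with the values of Table 1, could look as follows; the struct and field names are illustrative and do not correspond to the variable names expected by the actual simulator [16].

```matlab
% Hedged sketch of a walker/ground parameter file following Table 1.
% Field names are assumptions; the simulator documentation [16] defines
% the actual variables expected by the Simulink blocks.
par.m  = [5 2 1];          % masses m0, m1, m2 (kg)
par.l  = [0.8 0.5 0.5];    % member lengths l0, l1, l2 (m)
par.r  = [0.4 0.25 0.25];  % mass positions r0, r1, r2 (m)
par.muS     = 1.2;         % static friction coefficient of the ground
par.muK     = 0.6;         % kinetic friction coefficient of the ground
par.kGround = 1000;        % ground spring constant (kg/s^2)
par.cGround = 500;         % ground damping coefficient (kg/s)
par.kKnee   = 1000;        % knee limiter spring constant (Nm/rad)
par.cKnee   = 100;         % knee limiter damping coefficient (Nms/rad)
par.g       = 9.81;        % acceleration of gravity (m/s^2)
```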

Although modeling the roughness of the walking surface would have been possible with the simulator, a flat surface was used in the simulations. This choice was also influenced by a few trials made with the gait control on uneven surfaces, which showed that even small variations in the surface height hinder the walking significantly.


4 PD control

For the data-based modeling of the gait dynamics, input and response data were collected from the system while it performed an unoptimized walking motion. This model walk was produced with separate PD controllers, and it served as the basis for teaching the clustered regression controller. This chapter describes the implementation of the control and the parameters used.

4.1 Controllers

The walking motion of the model is produced by four separate discrete-time PD controllers, which are fed with changing reference signals. Both knee angles have controllers of their own, and one controller adjusts the difference of the thigh angles, α_Δ = α_R − α_L. Keeping the torso upright can then be separated into a completely independent parallel problem, which is handled by the fourth PD controller. The knee angle controllers give their control signals directly as the torques M_L2 and M_R2. The control signal of the thigh angle difference affects the torque of the right thigh, M_R1, positively and the torque of the left thigh, M_L1, negatively. To keep the upper body upright, the control signal of the torso controller is in addition added to the thigh torque of the foot touching the ground. If both feet are on the ground, the control signal affects both thigh torques equally.

The system is controlled with discrete-time PD controllers whose control law is of the form

    u(kh) = P e(kh) + (D/h) Δe(kh),    (13)

where the time index k denotes the sample number and the constant h the sample interval. The error signal e(kh) is calculated by subtracting the value of the controlled variable from its reference value. The change of the error, Δe(kh), is obtained directly as the difference of the current and the previous signal values:

    Δe(kh) = e(kh) − e((k − 1)h).    (14)

The parameters of the controller are the gains P and D of the proportional and derivative parts.
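A minimal sketch of one such controller, directly following equations (13)-(14), is shown below; the function interface and the way the previous error value is passed between samples are assumptions of this example.

```matlab
% Hedged sketch of one discrete PD controller, equations (13)-(14).
% ref, meas : reference and measured values of the controlled variable
% ePrev     : error value of the previous sample
% P, D, h   : controller gains and the sample interval
function [u, e] = pdControl(ref, meas, ePrev, P, D, h)
e  = ref - meas;        % error signal e(kh)
de = e - ePrev;         % change of the error, equation (14)
u  = P*e + (D/h)*de;    % control signal, equation (13)
end
```

In the Simulink block of Section 4.5 the same computation is carried out in parallel for the thigh angle difference, the two knee angles and the torso angle.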

4.2 Reference signals

The system is made to walk by feeding cyclic reference signals to the PD controllers of the legs. Between samples the reference values are formed by increasing, decreasing or keeping them constant, depending on the state of the system. Figure 5 shows the reference signals, the corresponding controlled variables and the behavior of the system during one step.

Figure 5: One step of the PD-controlled walk.

A step begins with the double support phase, during which the references are at first kept constant. When the center of mass of the upper body has, due to the momentum of the system, moved sufficiently forward, the reference of the rear knee angle is increased. At the same time the thigh angle difference is decreased, so that the foot rises from the ground and the system moves to the swing phase. The swinging leg is moved forward by decreasing the thigh angle difference towards a certain constant value. So that the foot does not hit the ground too early, the knee is kept bent during the swing. When the leg has swung far enough forward, the knee is straightened before the contact with the ground. Because the knee of the support leg is straightened at the beginning of the swing, the swinging foot stays at a sufficient distance from the ground. A new double support phase begins only after the swinging leg has been in contact with the ground for a short time. This prevents a premature phase transition in a situation where the foot only brushes the ground in the middle of the swing.

At the beginning of a new step the reference values of the knee angles are exchanged and the reference of the thigh angle difference is changed to its negative. At the same time the state signals of the right and the left leg are interchanged, so that the references of the next step are formed exactly as in the previous one, even though the swinging foot and the support foot have changed roles. The reference value of the torso angle remains constant throughout the walk, and the actual angle oscillates around the reference once during every step. The motion of the upper body damps the impacts of the feet on the ground and helps to maintain the forward momentum needed in each step.

4.3 Parameters

Different PD controller parameters are used in the swing phase and the double support phase, because the forces needed depend on whether the foot in question is on the ground or in the air. For example, the gains of the knee controller of the swinging leg are weaker than those of the support leg, since the swinging leg does not need as much force. The parameters are switched on the basis of the information given by the touch sensors of the legs. When the parameters of a PD controller change, its D term is temporarily set to zero so that the step change in the parameters does not cause any harm through the derivative part. Table 2 shows the controller parameters used in the simulations of this work. The parameter values and the logic generating the reference signals were tuned by testing, since the purpose was not to optimize this control method any further. Overall, the controlled system was quite sensitive with respect to the tuning of the parameters.

Table 2: PD controller parameters used for walking (P, D).

Double support phase:
  thigh angle difference α_Δ: P = 60, D = 1
  knee angles β_L, β_R: P = 40, D = 0.5
  torso angle φ: P = 40, D = 2
Swing phase:
  thigh angle difference α_Δ: P = 70, D = 6
  support leg knee angle: P = 30, D = 2
  swing leg knee angle: P = 10, D = 0.1

The aim was that the walker would follow the formed reference signals reasonably well without severe oscillations appearing in the control. In practice very good reference tracking is not achieved (Figure 5), since the system is underactuated and the cross-couplings of the controlled variables are strong. Difficulties are also caused by the transitions from one support phase to another, since for example when a foot hits the ground the system dynamics changes abruptly.

The control sample interval was chosen to be h = 10 ms. With larger values of the sample interval it turned out to be difficult to find suitable parameters for the PD controllers; on the other hand, with clearly smaller values the implementation of the control in a real system would become difficult.

4.4 Walking motion

With the reference signals and the PD controllers the system was made to walk so that the motion remained practically stable and the same walking sequence was repeated. Although the walker was simulated on an undisturbed, flat surface, the walking motion never became exactly periodic but varied slightly at random. Figure 6 shows the step length of the system, i.e., the distance between the tips of the feet at the beginning of the double support phase, as a function of time in an undisturbed walk. The variations are clearly irregular but remain within certain limits.

Figure 6: The step length of the PD-controlled walk varied randomly.

Input and output data were collected from the PD-controlled walk to be used as teaching data for the clustered regression controller. To increase the variation in the data, normally distributed random noise was added to the input signal, i.e., to the torques (9), during the simulation. Since the walking is symmetric, the roles of the right and the left leg were exchanged in the data of every second step, so that the swinging foot is always the left one. The resulting data thus repeat, in alternation, similar double support phases (left foot behind) and swing phases (left leg swinging).


4.5 Simulink implementation

The PD control was implemented as a Simulink block that takes the state of the system as its input, updates the reference signals on the basis of the state and calculates the required torques from the error variables. Figure 7 shows the structure of the block, in which the formation of the reference signals and the actual PD controllers are separated into their own sub-blocks.

Figure 7: The PD controller block consists of the formation of the reference signals and of the PD controllers.

Figure 8 shows the contents of the Create references block. The reference signals are calculated under the assumption that the swinging foot is the left one and the support foot the right one. Therefore, during every second step the signals of the two legs must be interchanged before they are taken to the controllers.

Figure 8: The reference signals are updated in their own sub-block on the basis of the old references and the state of the system.

The Controller block first calculates the error variables and deduces from the sensor values the phase of the step in progress. On this basis the correct parameters are chosen for the PD controllers, and the controller outputs are converted into the torque signals that drive the system (Figure 9). The structure of the PD control block is presented in more detail in the documentation of the simulation tool [16].


Figure 9: The actual control signals are calculated in the Controller block.


5 Local learning

Data-based modeling of a system is based on input and response data collected from the system. This chapter introduces modeling methods based on local models, i.e., local learning, which provides tools for the data-based approximation of functions. Local learning has been applied quite successfully, for example, in robot control as part of the controller itself. The sections below outline some local learning methods whose principle is partly the same as that of the clustered regression actually used in this work. A few examples of local learning applications are also presented.

5.1 Background

In robotics, the dynamics of a robot is described by a model R(·), which connects the torques M applied to the joints to the corresponding joint positions θ, velocities θ̇ and accelerations θ̈. The inverse dynamics of the system gives the control corresponding to a state:

    M = R⁻¹(θ, θ̇, θ̈).    (15)

If the inverse dynamics model is known, it can be exploited in the control of the robot with a simple feedforward-feedback connection. For example, the computed torque controller [18] uses an estimate R̂⁻¹ of the inverse dynamics and controls the robot according to reference signals. As the reference signal, the desired positions, velocities and accelerations of the joints, (θ_d, θ̇_d, θ̈_d), are needed. The feedback part corrects the operation on the basis of the control error, so that the whole control consists of the expression

    M = R̂⁻¹(θ_d, θ̇_d, θ̈_d) + P(θ_d − θ) + D(θ̇_d − θ̇).    (16)
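As an illustration, the computed torque law (16) can be written as a short function; the inverse dynamics estimate is passed in as a function handle, and all names below are assumptions of this sketch rather than the interface of reference [18].

```matlab
% Hedged sketch of the computed torque controller, equation (16).
% invDyn            : function handle, estimate of R^-1(theta, dtheta, ddtheta)
% thd, dthd, ddthd  : desired joint positions, velocities and accelerations
% th, dth           : measured joint positions and velocities
% P, D              : feedback gain matrices
function M = computedTorque(invDyn, thd, dthd, ddthd, th, dth, P, D)
Mff = invDyn(thd, dthd, ddthd);              % feedforward part from the model
M   = Mff + P*(thd - th) + D*(dthd - dth);   % feedback correction of the error
end
```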

In data-based modeling the dynamics model is formed from input and response data collected from the system. A more detailed analysis of the internal structure of the system is thus avoided. Typically, data-based modeling produces a global model that describes the behavior of the whole system. For example, the parameters of a neural network are fitted to the whole available data set, which requires many neurons, and the structure of the network may become very complicated.

Local learning takes the opposite approach to forming the model: the aim is to approximate the desired multivariable, nonlinear function as a combination of several simple local models. Reference [19] gives a very thorough account of the background of local learning; the following describes some of the main points.

5.2 Locally weighted regression

Locally weighted regression (LWR) is based on "lazy learning", in which all collected data points are simply stored in memory. The model estimate corresponding to a new input point is calculated only when a query arrives, by fitting a local model to the stored points near the query point. The fitting is done with distance-weighted regression, so that the points closest to the query have the greatest influence on the resulting model and estimate. The form of the fitted function is not restricted, but to avoid complexity it is usually limited to a linear model.

Locally weighted regression requires a large amount of memory, and the calculation of the estimate may be slow, because a new model has to be computed separately for every new input point. In addition, as the dimension of the data grows, concentrating on the distances that are relevant becomes more difficult (the so-called curse of dimensionality).
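A minimal sketch of answering one such query, assuming a locally linear model and a Gaussian distance weighting, could look as follows; the kernel width and variable names are illustrative assumptions.

```matlab
% Hedged sketch of a locally weighted linear regression query.
% X     : N-by-m matrix of stored input points
% Y     : N-by-l matrix of the corresponding outputs
% xq    : 1-by-m query point
% sigma : width of the Gaussian distance weighting
function yq = lwrQuery(X, Y, xq, sigma)
d2 = sum((X - xq).^2, 2);       % squared distances to the query point
w  = exp(-d2/(2*sigma^2));      % Gaussian weights of the stored points
Xb = [ones(size(X,1),1) X];     % local linear model with an intercept term
W  = diag(w);
B  = (Xb'*W*Xb) \ (Xb'*W*Y);    % weighted least squares fit near the query
yq = [1 xq]*B;                  % evaluate the local model at the query point
end
```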

5.3 Local models

Instead of storing all the data points, ready-made local models can be formed at specific points of the space; the models closest to the query point then determine the final estimate. Schaal, Atkeson and Vijayakumar [20, 21] have developed local learning methods applicable to robotics (locally weighted learning, LWL), in which new models are generated continuously as new data points are collected. If none of the existing models is close enough to a new data point, a new model is created. Only these models are stored in memory, so the amount of teaching data can be large.

Common to these methods is that only a small number of data dimensions is used locally in the models. This is based on the observation that high-dimensional data collected from real systems are typically locally at most 5-8 dimensional. The compression can be carried out, for example, with principal component analysis, principal component regression (Appendix C) or the partial least squares method (PLS) [22]. The local learning methods can typically also be written in an incremental form, in which adding new data is easy. The aim in these methods is that developing and updating the model is possible in real time, while the robot is moving.

5.4 Examples and applications

Local learning has been applied especially to the modeling of high-dimensional mappings and to function estimation in robotics. Figure 10 shows a simplified example of modeling a nonlinear function [23]. From the original two-dimensional function (Figure 10(a)) 500 samples were chosen at random, and the dimension of the data was increased to 20 by adding noisy variables that depend on the original ones. A local learning algorithm based on LWPR (locally weighted projection regression), trained with these data, was able to reconstruct the shape of the original function fairly accurately (Figure 10(b)). The algorithm placed the centers of the local models evenly in the input space and adapted their regions of influence to the correct shape (Figure 10(c)).

In robotics applications of local learning, the inverse kinematics or the inverse dynamics of a robot is typically modeled. For example, the inverse kinematics of a robot arm maps the Cartesian coordinates of the end of the arm to the joint angles. The mapping is complex, and the same point can typically be reached with several different combinations of the angles. The inverse kinematics must be known so that the arm can be driven to the desired points along suitable paths.

With the LWPR algorithm the inverse kinematics can be modeled by collecting data on the joint angles and the position of the end of the arm while the arm is moving. The model is taught all the encountered solutions of the problem, and a separate cost criterion is used to choose the solution that is best at each moment. Reference [24] describes the application of the LWPR algorithm to the inverse kinematics modeling of a robot arm. The model becomes gradually better and better as the moving arm produces more teaching data and the learning proceeds.

When the inverse dynamics is modeled with local learning methods, the aim is usually to describe the dynamics of the whole system. With a feedforward controller formed from such a model, the system can then be controlled according to reference signals. Since the state space of the mapping is very large and complex, many local models are needed. For example, the inverse dynamics of a robot arm with 7 degrees of freedom was modeled with about 260 local models; the mapping from the 21-dimensional input space (angles, velocities and accelerations) was formed separately for each of the seven torques [23].


Figure 10: An example of applying local learning to function approximation. The modeled function (a) could be reconstructed with good accuracy (b). The local models adapted their regions of influence and shapes to the data (c). Subfigures from [23].


6 Clustered regression

In this work the clustered regression controller is used for the data-based modeling of the walking motion, and the resulting model is used to control the walker simulation model. The controller consists of linear models, i.e., operating points, which are used for the control in the different phases of the motion. The operating point models are formed purely on the basis of the statistical properties of the data collected from the system, and the control is calculated with the multivariable regression method presented in Appendix C, i.e., principal component regression. This chapter describes the operation of the clustered regression method and the model structure used.

6.1 Principle

The state of a controlled system changes as a function of time under the influence of the control. For mechanical systems in the continuous-time case the evolution of the state variables can be represented as a trajectory in the state space, here called the operation curve. The purpose of the control may be, for example, to keep the system on a given curve or to drive it to a desired final state. The clustered regression controller is suited to situations in which the system must be able to repeat the same operation several times, i.e., the whole operation can be described by a single operation curve.

In walking, the walker repeats the same motion cycle at every step. Since both the control signal y(k) and the system state u(k)¹ repeat cyclically in steady walking, the data collected at a constant sampling rate form a fixed operation curve in the combined control-state space. Because the right- and left-leg steps of the walker are symmetric, this curve also consists of two symmetric parts, and the analysis can concentrate on one of them. If the walking varies slightly, the data points fall randomly around the mean operation curve.

In the clustered regression controller the operation curve of the system is modeled so that, when the current state u(k) of the system is known, the controller can give an estimate ŷ(k) of the control y(k) corresponding to that state. Compared with the local learning methods presented above, the modeling now concentrates only on the operation curve instead of the whole inverse dynamics, so the model is considerably smaller. Likewise, the input consists only of the state variables of the system (and their time derivatives); the desired accelerations are not needed. The main difference between the control solutions is, however, that the control signal of the clustered regression controller is obtained directly from the learned model as
¹ The notation is chosen from the controller's point of view, so that the system state u(k) is the controller input and the control y(k) the controller response.


the control estimate. The separate feedforward controller connected to reference signals, as used in local learning, is not needed, because the desired trajectory is built into the model. This of course restricts the operation of the system to a single operation curve, which in the case of walking means, for example, only one way of realizing the walking cycle.

In the clustered regression controller the modeling is based on dividing the operation curve of the system into operating points, in the neighborhood of which the curve can be assumed to be linear. To each operating point a local principal component regression model is attached, which describes the dependence between the input and response data of the cluster of that operating point. As in certain local learning applications, the dimension of the data can thus be reduced locally, so that the essential dependencies stand out more clearly.

6.2 Formation of the local principal component regression models

The internal regression model of an operating point is formed on the basis of the data cluster assigned to that point. Let N_cr denote the number of operating points and N_p the number of data samples (u_p, y_p) assigned to operating point p. The model at point p consists of the following statistical characteristics of the teaching data (a sketch of how they can be computed from a data cluster is given after the list):

- pC_u: the mean vector of the controller input data u_p, i.e., the location of the operating point in the state space.
- pC_y: the mean vector of the controller response data y_p, i.e., the location of the operating point in the control space.
- pP_xx: the inverse covariance matrix of the feature data x_p.
- pR_xu: the cross-covariance matrix of the feature data and the input data.
- pR_yz: the cross-covariance matrix of the control data and the whitened feature data z_p.
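A minimal sketch of computing these quantities for one operating point is given below; how the feature variables x are formed from the input u is application dependent (it is discussed in Chapter 7), so a generic feature function handle is assumed here, and all names are illustrative.

```matlab
% Hedged sketch: statistics of one operating point from its data cluster.
% Up      : Np-by-m inputs of cluster p, Yp : Np-by-l responses of cluster p
% featFun : function handle mapping one input sample to a feature vector x
function mdl = opPointModel(Up, Yp, featFun)
Np = size(Up, 1);
Xp = zeros(Np, numel(featFun(Up(1,:))));
for i = 1:Np
    Xp(i,:) = featFun(Up(i,:))';       % feature variables x of each sample
end
mdl.Cu  = mean(Up, 1)';                % pCu, location in the state space
mdl.Cy  = mean(Yp, 1)';                % pCy, location in the control space
mdl.Pxx = inv(cov(Xp, 1));             % pPxx, inverse covariance of the features
Xc = Xp - mean(Xp, 1); Uc = Up - mdl.Cu'; Yc = Yp - mdl.Cy';
mdl.Rxu = (Xc'*Uc)/Np;                 % pRxu, feature-input cross-covariance
Zp      = Xc*sqrtm(mdl.Pxx);           % whitened feature data z
mdl.Ryz = (Yc'*Zp)/Np;                 % pRyz, control / whitened-feature cross-cov.
end
```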

6.3 Calculation of the control

The control at time k is calculated on the basis of the system state u(k). The calculation uses the principal component regression algorithm of Appendix C, based on Hebbian and anti-Hebbian learning, in the situation where the covariance matrices and the mean vectors of the regression structure have converged.


For every model p, the control estimate is first calculated with the principal component regression formula

    pŷ(k) = pR_yz pP_xx^(1/2) pR_xu (u(k) − pC_u) + pC_y,    (17)

which combines formulas (66) and (71). Likewise, for every operating point a cost J_p(u(k)) is calculated, describing how far the sample u(k) is from the operating point model. The cost is defined in the general case by formula (29) and in the control case by formula (30). The best operating point p* is the one that gives the lowest cost. The control estimate of the whole controller is formed as the weighted average of the estimates of all the models,

    ŷ(k) = Σ_{p=1}^{N_cr} K_p(k) pŷ(k) / Σ_{p=1}^{N_cr} K_p(k),    (18)

where the weights K_p(k) depend on the costs of the operating points as follows:

    K_p(k) = exp( −( J_p(u(k)) − J_{p*}(u(k)) )² / (2σ²) ).    (19)

As the cost of an operating point grows, its estimate therefore receives a smaller weight from the tail of the Gaussian function. If, for simplicity, the distributions inside the clusters are assumed to be normal with a variance σ² that is constant in every direction, the situation corresponds to maximum likelihood estimation. The parameter σ² determines how strongly distant operating points affect the value of the estimate. If the variance is very small, the transitions between operating points are step-like; with a larger value of the variance the estimates of several local models are combined and the transitions can be realized smoothly.
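The following sketch collects the control computation (17)-(19) into one function, assuming local models stored as in the previous sketch and assuming that the cost J_p of formula (30) is provided by a separate function; all names are illustrative, and the local estimate is assumed to take the form of equation (17) above.

```matlab
% Hedged sketch: clustered regression control estimate, equations (17)-(19).
% mdl     : array of operating point models (fields Cu, Cy, Pxx, Rxu, Ryz)
% u       : current (scaled, zero-mean) system state, column vector
% costFun : function handle giving the cost J_p(u) of operating point p
% sigma2  : variance of the weighting function in equation (19)
function y = clusteredControl(mdl, u, costFun, sigma2)
Ncr  = numel(mdl);
J    = zeros(Ncr, 1);
Yloc = zeros(numel(mdl(1).Cy), Ncr);
for p = 1:Ncr
    Yloc(:,p) = mdl(p).Cy + mdl(p).Ryz*sqrtm(mdl(p).Pxx)*mdl(p).Rxu ...
                * (u - mdl(p).Cu);        % local estimate, equation (17)
    J(p) = costFun(p, u);                 % cost of operating point p
end
K = exp(-(J - min(J)).^2 / (2*sigma2));   % weights relative to the best point p*
y = Yloc*K / sum(K);                      % weighted average, equation (18)
end
```

This function would be evaluated once per control sample (h = 10 ms in Section 4.3).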

6.4

Optimization

The clustered regression controller is taught to repeat the behavior of the model controller used in the system. Usually it is, however, desirable that the system behaves optimally in some sense, for example with respect to the energy used for control. The following summarizes some elements of optimal control theory, which are then compared with the optimization principles of the clustered regression controller.

6.4.1

Optimal control

The aim of optimal control [25] is to control the system so that it operates optimally in the desired sense.

In general, a discrete nonlinear dynamic system can be described by the equation

\xi(k+1) = f_k(\xi(k), v(k)),    (20)

where \xi(k) is the system state and v(k) the control at time k. The state transition function f_k(\xi(k), v(k)) gives the value of the new state vector on the basis of the state and control at time k. To define optimal operation, a cost criterion is typically used, most often of the form

J_t = \phi(\xi(T)) + \sum_{k=t}^{T-1} L_k(\xi(k), v(k)),    (21)

when the time interval k \in [t, T] is considered. The first term of the cost takes into account only the state at the final time instant, while the second term weights all the intermediate controls and states. The optimal control v^* drives the system through the states \xi^* in such a way that the cost criterion (21) is minimized. By choosing the form and weights of the cost criterion suitably, the desired optimal behavior can be obtained by minimizing the cost. Typical minimized quantities are the energy used for the control or the elapsed time, and terms that keep the system states close to desired values are often also included in the cost. In the general case, where no restrictions are imposed on the structure of the system, the optimal control can usually not be computed in closed form. The general solution can be outlined by treating the task as a constrained optimization problem and applying Lagrange multipliers. The cost function (21) is then minimized so that at each time index k the constraint given by the dynamics equation (20) must also be satisfied. The function to be minimized without constraints is
J'_t = \phi(\xi(T)) + \sum_{k=t}^{T-1} \left[ L_k(\xi(k), v(k)) + \lambda^T(k+1) \left( f_k(\xi(k), v(k)) - \xi(k+1) \right) \right].    (22)

The vector \lambda(k) contains the Lagrange multipliers used at time k. The final formulas of the iterative solution of the problem are presented, for example, in reference [25], but solving them succeeds only in very simple cases.

6.4.2

Dynamic programming

Dynamic programming (e.g. [26]) is based on Bellman's principle of optimality.

According to it, a characteristic property of the optimal control is that the control from time k onwards must coincide with the optimal control v^* regardless of the earlier controls and states. The controls applied before time k do not affect the way in which the current state \xi(k) has been reached, as long as the rest of the control is optimal. This leads to the dynamic programming algorithm, in which the optimization problem is examined backwards in time. The computation starts from the final state \xi(T) and proceeds step by step backwards in time, until the optimal control has been computed as a function of the state for the whole problem. The optimal cost obeys the iterative formula

J_k^*(\xi(k)) = \min_{v(k)} \left[ L_k(\xi(k), v(k)) + J_{k+1}^*(\xi(k+1)) \right],    (23)

where the initial value J_T^*(\xi(T)) is the final state cost \phi(\xi(T)). The optimal control at each step is the control v^*(k) that minimizes the cost.

By minimizing the cost locally at every step, the system as a whole is thus made to behave optimally.
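The backward recursion (23) can be illustrated with a small discretized example; the grids, the dynamics f and the stage cost L below are arbitrary illustrative choices, not part of the walker problem.

% Backward value iteration over a discretized scalar state grid (formula 23).
xs = -2:0.1:2;                       % discretized state grid
vs = -1:0.1:1;                       % discretized control grid
T  = 50;                             % horizon length
f  = @(x, v) 0.9*x + 0.5*v;          % example state transition f_k
L  = @(x, v) x.^2 + 0.1*v.^2;        % example stage cost L_k
J  = xs.^2;                          % terminal cost phi(xi(T)) on the grid
vopt = zeros(T, numel(xs));          % optimal control as a function of state
for k = T:-1:1
    Jnew = zeros(size(J));
    for i = 1:numel(xs)
        % stage cost of every control plus interpolated cost-to-go
        xnext = min(max(f(xs(i), vs), xs(1)), xs(end));
        c = L(xs(i), vs) + interp1(xs, J, xnext);
        [Jnew(i), imin] = min(c);
        vopt(k, i) = vs(imin);
    end
    J = Jnew;                        % J now equals J*_k on the grid
end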

6.4.3

Optimization of the clustered regression structure

As a purely data-based method, the operation of the clustered regression controller is based only on the data used in teaching. The idea is to start with control carried out simply by a model controller, which may be far from optimal. After this initial state has been taught to the controller, the model structure is updated with new data that represents more efficient control. In this way the controller gradually learns better control and approaches the desired optimum.

In the clustered regression controller, dynamic optimization can be considered from the point of view of the operating points: if the control of the last operating point of the trajectory is optimal, it suffices to optimize the operation of the preceding point. When this has been done, one can again move to the preceding operating point and continue the optimization. Proceeding backwards in time in this way, the whole operation becomes piecewise optimal. If, in addition, the positions of the operating points are updated, the overall optimal solution is gradually reached. In the control of a cyclic operation such as walking, no separate "last" operating points can be distinguished. The optimization then proceeds by gradually updating all operating points and their positions, whereby the taught operating curve deforms and moves in the state space to correspond to more optimal operation.

In the data-based approach, the learned models are updated slightly in the direction of the new data every time the controller receives new data from the system. For the adaptation to occur toward the desired optimum, the new training data must therefore represent "better" operation. At its simplest this can be achieved on a "trial and error" basis, which in practice amounts to random search. The method can be carried out in stages as follows:

1. The clustered regression controller is taught with teaching data collected from the non-optimized behavior.

2. The cost J_opt of the original control is computed (for cyclic operation, the average cost per cycle).

3. A random change is made in the control at a random time instant, and the controller is taught with the new data point thus obtained.

4. The updated cost J'_opt is computed and compared with the old cost. If J'_opt > J_opt, the new model is rejected and the old one is restored. If J'_opt <= J_opt, the new model is adopted and the cost is updated, J_opt <- J'_opt.

5. Return to step 3 until no clear improvement occurs any more, or until the desired number of iterations has been completed.

The method thus tests a random change in the control and accepts it if the result is better with respect to the selected cost criterion. In this way the updates are made only in the direction of more optimal operation. A weakness of random-search-based methods is their slowness, as well as the possibility of getting stuck in a local minimum of the cost function. On the other hand, the optimization algorithm is very general and simple. The cost criterion of the optimization can be chosen almost freely, as long as it can be computed on the basis of data collected from the system. The structure of the system does not need to be known or analyzed precisely, as long as the system can be simulated or controlled and the data collection succeeds. A sketch of the procedure is given below.
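The functions teach_controller, simulate_walker and cycle_cost in the following Matlab sketch are hypothetical placeholders for the teaching, simulation and cost computation routines, and the parameter values are only examples.

% Random-search update of the controller (steps 1-5 above).
maxIter  = 100;                            % example iteration count
noiseStd = 0.05;                           % example control perturbation
model  = teach_controller(U0, Y0);         % step 1: initial teaching data
[U, Y] = simulate_walker(model);
Jopt   = cycle_cost(U, Y);                 % step 2: original cost per cycle
for iter = 1:maxIter
    k = randi(size(Y, 1));                 % step 3: random time instant
    Ytest = Y;
    Ytest(k, :) = Ytest(k, :) + noiseStd * randn(1, size(Y, 2));
    modelTest = teach_controller(U, Ytest);    % teach the changed sample
    [Ut, Yt] = simulate_walker(modelTest);
    Jnew = cycle_cost(Ut, Yt);             % step 4: updated cost
    if Jnew <= Jopt                        % accept only improvements
        model = modelTest;  U = Ut;  Y = Yt;  Jopt = Jnew;
    end
end                                        % step 5: repeat until done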

6.5

Earlier applications

The clustered regression controller has earlier been applied to the trajectory control of a two-joint robot arm [1]. The methods used there differed slightly from those presented here, but the basic idea was the same.


The main difference was that the trajectory controller did not separate the state and control data, but computed the principal components in the combined data space. The results obtained were very good: the controller was able to reproduce the operation of the individually tuned PID controllers used as the model, and the optimization also succeeded. The controlled process was, however, structurally very simple compared with the walking system considered in this work.


Application of the clustered regression controller to the control of the walker

The operation of the clustered regression controller is based on modeling the mapping between the states and the control signals of the controlled system. As noted in chapter 6, the controller models the operating curve of the system: it mimics the controller first used to control the system and learns to repeat its control. A separate reference signal is thus not needed, but the operation of the system is limited to the learned operating curve.

The goal of applying the clustered regression controller was a controller that could be updated continuously. At first the controller would model the operation of the model controller and, having learned enough, control the system by itself. After that the controller would adapt to changing conditions and optimize the walking according to a selected cost criterion. In this way, for example, the ratio of the walking speed to the energy used could be gradually maximized while the operation is in progress.

In this work the use of the clustered regression controller was clearly divided into two parts. In the first stage the controller was taught an initial operating curve using data collected from the PD-controlled system. In the second stage the updating of the controller models with the recursive principal component regression algorithm based on Hebbian and anti-Hebbian learning, presented in Appendix C, was studied. The Matlab source code and the Simulink models used in teaching the controller are described in more detail in reference [16].

7.1

Simulink implementation of the controller

The Simulink implementation of the clustered regression controller is based on the solutions used in the earlier applications as well as on the model fitting block developed in the special assignment [27]. The whole controller is gathered into a block whose input is the system state vector and whose output is the control reconstructed by the learned model. Figure 11 shows the block structure, which includes the preprocessing of the input signal, the actual regression computation, and the post-processing of the control reconstruction.

The controller itself is the Clustered regression block. The taught model is given in the parameter dialog of this block as a Matlab "structure array" variable, which contains the matrices of all operating points (see section 6.2). The structure of the block is updated dynamically according to the given model structure in such a way that the block contains one PCR sub-block for each operating point (Figure 12). Each of these sub-blocks performs the principal component regression of its own operating point and the computation of its cost.


Figure 11: The Simulink model of the clustered regression controller.

Finally, a weighted average of the reconstructions of all blocks is computed according to formula (18), which gives the actual output of the model.

Figure 12: The Clustered regression block automatically contains as many regression computation blocks as there are operating points.


Since the structure of the regression block is dynamic, the controller can be used with varying numbers of operating points and principal components without manual changes to the model structure. The number of PCR blocks is updated according to the given model structure at the beginning of the simulation by the initialization commands in the mask of the Clustered regression block. The connections between the blocks are formed with Simulink Goto and From blocks, so separate connection lines are not needed.

As in the PD control, the leg signals of the clustered regression controller must be swapped with each other on every other step. This is because the block controlling the walker alternates steps of the left and the right foot, but the model has been taught only a step in which the left leg swings and the right leg is the supporting leg. The swapping is controlled by a Switch signal, which changes from zero to one or vice versa whenever a double-support phase begins. The input signals are swapped by the Switch legs block and the control signals by the Switch controls block (Figure 11).

7.2

The training data

The teaching phase was based on data collected from the PD-controlled walking. Before the actual computation of the models, the state and control data were made zero-mean and all components were scaled to unit variance. This is justified because the input data contains, for example, both joint angle and angular velocity values, whose variances and mean values can differ significantly. The touch sensor values were not processed at all and were left out of the computation of the local models.

The input at simulation time k is the state vector

u_{real}(k) = [x_0(k), y_0(k), \phi(k), \theta_L(k), \theta_R(k), \psi_L(k), \psi_R(k), \dot{x}_0(k), \dot{y}_0(k), \dot{\phi}(k), \dot{\theta}_L(k), \dot{\theta}_R(k), \dot{\psi}_L(k), \dot{\psi}_R(k), s_L(k), s_R(k)]^T.    (24)

Before use in the regression model, the x_0 coordinate is removed from the state vector, and the components are made zero-mean and scaled so that their variances become one. The state vector thus obtained is denoted u(k). Furthermore, the input vector from which also the sensor values have been removed is denoted \tilde{u}(k). The control signal, i.e., the response of the controller, in turn consists of the joint moments of the system:

y_{real}(k) = [M_{L1}(k), M_{R1}(k), M_{L2}(k), M_{R2}(k)]^T.    (25)

The zero-mean and scaled control signal is denoted y(k).
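The preprocessing described above can be sketched in Matlab as follows; the variable names are illustrative and the column ordering is assumed to follow formula (24).

% Preprocessing of the collected state and control data. Ureal is N x 16
% (state samples, formula 24) and Yreal is N x 4 (controls, formula 25).
Ureal(:, 1) = [];                              % remove the x0 coordinate
N = size(Ureal, 1);
U = (Ureal - repmat(mean(Ureal), N, 1)) ./ repmat(std(Ureal), N, 1);
Y = (Yreal - repmat(mean(Yreal), N, 1)) ./ repmat(std(Yreal), N, 1);
Utilde = U(:, 1:end-2);                        % drop the two touch sensor values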


7.3

Data Clustering

In the teaching phase the operating points were placed in the data, that is, on the operating curve of the system, in a suitable way so that the internal models of the operating points could be formed from them. The data points located closest to each operating point were attached to it. The aim was that the operating point clusters thus created spread over the whole operating curve and that the controller is able to reconstruct the learned behavior as well as possible.

At first the steps of the training data were divided into time intervals of equal length, and the mean values of the intervals were selected as the operating points. With this method every cluster contained an equal amount of training data and the operating points were located uniformly on the operating curve. The simulations showed, however, that this equal division did not produce good reconstruction results. The reason is that the operating curve was assumed linear in the region of each operating point. In reality, operating points should be placed more densely near the sharp bends of the curve, so that the linearity assumption holds with sufficient accuracy. On the other hand, in some parts of the walking cycle the operating curve has wider linear regions, where fewer operating points suffice.

A better positioning of the operating points was attempted with a competitive-learning-based teaching algorithm resembling the self-organizing map [28]. In the method the operating points are attached to a one-dimensional strip in numerical order, and the algorithm goes through the training data one sample at a time in random order. On each iteration round k the following steps are carried out:

1. The training sample (u(k), y(k)) is compared with the operating points, and the best-matching operating point p^* is selected on the basis of the cost J_p(u(k), y(k)).

2. The index k of the sample is stored in the cluster register of the best operating point p^*.

3. The forgetting coefficients are computed:

\mu_p(k) = 1 - (1 - \mu_0) h_\sigma(p, p^*), \quad p = 1, \dots, N_{cr}.    (26)

4. All operating points are updated by the formulas

C_u^p(k+1) = \mu_p(k) C_u^p(k) + (1 - \mu_p(k)) u(k)
C_y^p(k+1) = \mu_p(k) C_y^p(k) + (1 - \mu_p(k)) y(k).    (27)
After the whole iteration, each operating point is moved to the mean of the data cluster attached to it. In formula (26), the Gaussian neighborhood function

h_\sigma(p, p^*) = \exp\left( -\frac{(p - p^*)^2}{2\sigma_h^2} \right)    (28)

gets a value close to one when the operating points p and p^* are located near each other in the index space. For points located far from each other the value of the function is small. The parameter \sigma_h determines the width of the neighborhood. The operating point p^* that gave the minimum cost J_p(u(k), y(k)) is updated with the original forgetting coefficient \mu_0, but the forgetting coefficients of the other units approach one as the cost J_p(u(k), y(k)) increases. The neighborhood effect tends to draw the units of the algorithm towards each other, but on the other hand, units located in different parts of the data spread out more widely. Thus, after the operating points have converged according to the data, regions containing a lot of data have more units than sparse regions.

The cost J_p(u(k), y(k)) was chosen as the weighted sum of the squared Euclidean distances between the training sample and the considered operating point p in the input and response spaces:

J_p(u(k), y(k)) = (u(k) - C_u^p(k))^T H_1 (u(k) - C_u^p(k)) + (y(k) - C_y^p(k))^T H_2 (y(k) - C_y^p(k)).    (29)

With the weighting matrices H_1 and H_2 the significance of the different components in the cost computation can be adjusted.

The controller was taught with a training data set of 4000 samples collected from the PD-controlled walking. The data was presented to the operating point positioning algorithm 20 times in random order. The parameters used are listed in Table 3, where I_n denotes an identity matrix of size n x n and 0_n the corresponding zero matrix.

Table 3: The parameters used in the positioning of the operating points.

Parameter: value
Number of data points N: 4000
Number of data presentations: 20
Neighborhood function variance \sigma_h^2: 0.8
Original forgetting coefficient \mu_0: 0.995
Weighting matrix H_1: diag(I_6, 0_7)
Weighting matrix H_2: 5 I_4

The positioning of the operating points thus took into account most strongly the control components and the generalized coordinates. The velocity values of the coordinates were left unweighted, because they contain a lot of disturbing vibrations.

The convergence of the algorithm was monitored by computing during teaching the sum of the costs of the winner operating points over all training samples. Every time

Table 3: The parameters used in the placement of points. The parameter value Number of data points N4000 Data of performances Number 20 2Naapuruusfunktion variance The0.8 Original oblivion factor 00.995 I6 07Painomatriisi H1 I2 Painomatriisi H25I4 Action points to its insertion was thus most strongly account the controlcomponents and the generalized coordinates. Coordinates, velocity values were left used, since they contain many interfering affected by vibrations. Algorithm konvergoitumista was monitored by calculating the amount of teaching time Winner of the cost of operating points of all training samples over. Always 41

a new presentation of the data was started, the value of the sum was plotted and the sum was reset. From the graph thus obtained the progress of the convergence of the algorithm could be followed. Figure 13 shows a typical convergence curve.
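One training pass of the positioning algorithm could look as follows in Matlab. The parameter values and the assumption that H_1 weights only the generalized coordinates follow Table 3, but the variable names are illustrative.

% One pass of the operating point positioning (steps 1-4, formulas 26-29).
% U (N x m) and Y (N x 4) are the scaled training data, Cu (Ncr x m) and
% Cy (Ncr x 4) the current operating point locations.
[N, m] = size(U);  Ncr = size(Cu, 1);
sigmaH2 = 0.8;  mu0 = 0.995;                      % Table 3
H1 = blkdiag(eye(6), zeros(m - 6));  H2 = 5 * eye(size(Y, 2));
clusters = cell(Ncr, 1);
for k = randperm(N)                               % samples in random order
    du = repmat(U(k, :), Ncr, 1) - Cu;
    dy = repmat(Y(k, :), Ncr, 1) - Cy;
    J = sum((du * H1) .* du, 2) + sum((dy * H2) .* dy, 2);   % cost (29)
    [Jmin, pstar] = min(J);                       % step 1: best unit
    clusters{pstar}(end + 1) = k;                 % step 2: store the index
    h  = exp(-((1:Ncr)' - pstar).^2 / (2 * sigmaH2));        % formula (28)
    mu = 1 - (1 - mu0) * h;                       % step 3: forgetting (26)
    Cu = repmat(mu, 1, m) .* Cu ...
         + (1 - repmat(mu, 1, m)) .* repmat(U(k, :), Ncr, 1);      % (27)
    Cy = repmat(mu, 1, size(Y, 2)) .* Cy ...
         + (1 - repmat(mu, 1, size(Y, 2))) .* repmat(Y(k, :), Ncr, 1);
end
for p = 1:Ncr                                     % move points to cluster means
    if ~isempty(clusters{p})
        Cu(p, :) = mean(U(clusters{p}, :), 1);
        Cy(p, :) = mean(Y(clusters{p}, :), 1);
    end
end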

Figure 13: The convergence of the operating point positioning algorithm was examined with the sums of the winner unit costs.

Figure 14 shows the clustered training data projected onto three state variables. The cluster centers, i.e., the operating points, are drawn as circles of different shades, and the corresponding data points with dots of matching colors. Each cluster center is also connected to one system state shown in the figure. The positioning of the training data on the operating curve of the system is clearly visible in the projected space. During the fast movements of the walking cycle the state of the walker changes so much within one sampling interval that consecutive data points lie far apart. Since the sampling interval of the reference signal and the controls of the discrete walking model equals the sampling interval used in the data collection, the training data accumulates at the same phases of the walking motion. This shows in Figure 14 as parallel point sets that spread at right angles to the direction of travel of the operating curve.

The clustering algorithm has divided the operating points fairly evenly over the whole operating curve, and the curve of one step passes through all the points. Only for the last four operating points the data does not seem to be divided into clear clusters, but this is probably due to the chosen projection.


Figure 14: The clustered training data projected onto three state variables, the division into operating points, and the system states corresponding to the operating points.

7.4

Selection of the number of operating points and feature variables

The number of operating points in principle directly affects the control accuracy of the clustered regression controller. If there are more operating points, the linearity assumption holds better within them, and the taught mapping from the system states to the controls can be reproduced in more detail. On the other hand, if too many operating points are used, the training data may not suffice for forming every regression unit. The simulation time of the controller also grows as the number of operating points increases.

Similarly, the number of feature variables determines how closely the dependencies of the training data are taken into account. The aim is that the model only uses the essential information contained in the data and leaves the contribution of noise unmodeled. The more feature variables are used, that is, the more principal components are included in the regression mapping, the more accurately the training data can in the best case be reconstructed.


Figure 15 shows the relative squared reconstruction error of the mapping remaining after teaching, for different numbers of operating points and feature variables. The errors have been computed using the cluster division of the teaching by summing the squared reconstruction errors of all training data points.

Figure 15: The normalized squared reconstruction error of the control for the training data as a function of the numbers of operating points and feature variables.

The numbers of operating points and feature variables cannot, however, be selected in practice on the basis of Figure 15 alone, since in the control phase the controller cannot reach the estimation accuracy shown. This is mainly because in the control phase the operating point must be selected in the same way as in the teaching phase, but the real values of the control components are not available for the cost computation. In the control phase the cost used for selecting the operating point is therefore otherwise the same expression as in the teaching phase (29), but the control term is omitted, that is, H_2 = 0_4 is set:

J_p(u(k)) = (u(k) - C_u^p)^T H_1 (u(k) - C_u^p).    (30)

This proved to be an effective and the simplest way to compensate for the missing control term. In practice, more important than the average reconstruction accuracy is in which phases of the walking cycle the errors occur and of what type they are.

An inaccuracy occurring at a critical moment can drive the system so far in the wrong direction from the learned operating curve that the controller can no longer continue its operation. This leads to the walker falling over. The simulation results show that a suitable number of operating points is approximately 20, and that 7 to 8 principal components are needed in the regression mapping for the walking motion to be repeated.

7.5

Teaching the internal models of the operating points

The regression model of a single operating point is determined using only the data attached to that point. The data collected from the foot touch sensors was not used in forming the models, because a sensor value typically remains constant within the region of one operating point. The input data used in teaching operating point p thus consisted of the N_p vectors \tilde{u}^p(1), ..., \tilde{u}^p(N_p), which contain the generalized coordinates and their time derivatives without the x_0 coordinate. The response data was formed by the control vectors y(1), ..., y(N_p) corresponding to the inputs.

The feature data was obtained with principal component analysis as the projection of the input vectors onto the principal components. The covariance matrices of section 6.2 were computed during teaching directly from the covariance formula, for example

R_{x\tilde{u}}^p = \frac{1}{N_p} \sum_{k=1}^{N_p} x^p(k) (\tilde{u}^p(k) - C_{\tilde{u}}^p)^T,    (31)

where x^p and \tilde{u}^p are data vectors attached to operating point p, and C_{\tilde{u}}^p is the expectation vector of the input data of the operating point without the touch sensor values. The number of samples N_p was assumed to be large enough for the result to be accurate.
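The statistics of one operating point (section 6.2, formula 31) can be computed from its cluster data for example as follows; Up and Yp are the input and control samples attached to the point, and the variable names are illustrative.

% Computation of the local model statistics of one operating point.
% Up is Np x m (inputs without the sensor values), Yp is Np x 4 (controls).
Np = size(Up, 1);  n = 8;                 % number of principal components
Cu = mean(Up);  Cy = mean(Yp);
U0 = Up - repmat(Cu, Np, 1);
Y0 = Yp - repmat(Cy, Np, 1);
Ruu = U0' * U0 / Np;                      % input covariance
[V, D] = eig(Ruu);
[lam, order] = sort(diag(D), 'descend');
W = V(:, order(1:n))';                    % principal component directions
X = U0 * W';                              % feature data (latent variables)
Pxx = diag(1 ./ lam(1:n));                % inverse covariance of the features
Rxu = X' * U0 / Np;                       % cross-covariance, formula (31)
Z = X * sqrt(Pxx);                        % whitened feature data
Ryz = Y0' * Z / Np;                       % control / whitened feature covariance

The matrices Cu, Cy, Pxx, Rxu and Ryz computed in this way can then be used directly in formula (17).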

7.6

Learned walking

The taught clustered regression controller is able to estimate the control of the walker on the basis of its state signal. In fact it stores the operation of the PD controller used as the model: both map the measured state of the walker to the value of the control signal. The walker can thus be controlled by replacing the PD controller with the clustered regression controller.

The model walking could be learned with the clustered regression controller using several different parameter values. Table 4 summarizes the teaching parameter values with which walking succeeded quite well.

The selection of the best operating point also worked in the desired way. Figure 16 shows the index p^* of the best operating point as a function of time as the gait progresses.


Table 4: The parameters used in teaching the clustered regression controller.

Parameter: value
Number of operating points N_{cr}: 20
Number of feature variables n: 8
Variance \sigma^2 of the control estimate weighting function (19): 0.05
Variance of the random noise added to the controls of the model walking: 0.01

Figure 16: In general, all operating points are passed through during every step.

Operating points 4 and 5 have been arranged in the wrong order with respect to each other, that is, point 5 is visited before point 4. This does not, however, affect the operation of the controller, since at this stage the order of the points no longer matters. Figure 16 also shows that a single operating point can be skipped during one cycle. In practice these points still affect the control, because the result is computed as a weighted average of the estimates of the nearest operating points.

The learned walking differs a little from the model walking. Figure 17 shows the moments used in the PD-controlled and in the learned walking as a function of time during a few steps. The initial instant (0 s) is chosen so that both walkers are in the double-support phase. It can be seen that the walker of the model walking takes steps more quickly, and the learned walking lags behind. On the other hand, the control signals of the walking guided by the clustered regression controller are smoother. In particular, the strong vibrations of the knee joint moments (M_{L2}, M_{R2}) have been attenuated considerably.

Similarly, Figure 18 shows the changes in the generalized coordinates of both walkers as the walking progresses. The most obvious difference is again that the learned walking proceeds with a lower step frequency. The walking speed is nevertheless nearly the same, since the steps of the learned walking are slightly longer.

Figure 17: The moments used in the learned walking (KRS) and in the model walking (PD) during a few steps.


Figure 18: The changes in the coordinates of the walker in the learned walking (KRS) and in the model walking (PD).

The behavior of the y_0 coordinate of the walker's upper body differs slightly from the model walking, but otherwise the learned behavior repeats the model quite accurately. As with the control signals (Figure 17), the vibrations in the time derivatives of the generalized coordinates (Figure 19) are attenuated significantly when moving from the model walking to the control of the clustered regression controller. At least in this respect the principal component regression seems to work as intended, and unnecessary vibrations are reduced.

In the animated walking motion the difference between the controllers is hardly noticeable. In Figure 20 the model walking and the learned walking are compared by drawing the system state in both cases at 70 millisecond intervals. The states proceed in time along the horizontal axis.


Figure 19: The time derivatives of the coordinates of the walker in the learned walking (KRS) and in the model walking (PD).

Figure 20: The behavior of the walker with both control methods.


7.7

Adaptive teaching

Computing the model structure from a data set collected in advance is thus possible. The model would, however, be generated more naturally if it were learned slowly during walking, in which case the model would also update itself when the properties of the walking change. Originally the internal models of the operating points were intended to be formed with the recursive learning algorithm based on Hebbian and anti-Hebbian learning (HAH) described in Appendix C. It would have offered a method for computing the models adaptively and for updating them during walking. It turned out, however, that the algorithm cannot be taught even the models of the PD walking from the collected data with the required accuracy. The problems were caused by the fact that the variances of the less significant principal components of the data were very small, which hampered the convergence of the HAH algorithm. The algorithm namely emphasizes the operation of the largest principal components. Table 5 shows the variances of the feature vectors attached to one typical operating point in order of magnitude. These variances directly determine the significance of the corresponding principal components when the principal component regression is computed.

Table 5: The variances of the feature vectors of the data attached to one typical operating point.

Principal component: variance
1: 0.8264
2: 0.2099
3: 0.0445
4: 0.0124
5: 0.0021
6: 0.0014
7: 0.0007
8: 0.0004

Although the significance of the last principal components thus appears rather small, it became clear in the teaching phase that walking cannot succeed without them. The convergence of the HAH algorithm was examined by plotting the variance estimates of the model being taught during the course of teaching. Figure 21 shows the behavior of the algorithm for the data attached to one operating point of the training data. In the converged state the variance estimates should match the variances computed directly from the data. When only the two largest principal components were extracted (n = 2), the algorithm converged correctly (Figure 21(a)). A rather small value \lambda = 0.99 was used for the forgetting coefficient.


Convergence then occurred in about a thousand iterations.

Figure 21: The HAH algorithm converged only when the variances of the least significant principal components were not too small. The "correct" variance values, i.e., the eigenvalues computed directly from the data covariance matrix, are drawn with dashed lines.

When the number of extracted principal components was increased to three, in which case the smallest variance is 0.0445, the algorithm no longer converged. Even though the forgetting coefficient was increased to the value \lambda = 0.9999, the teaching diverged even after a significant number of iterations (Figure 21(b)). Clearly the HAH algorithm is not capable of separating all the eight principal components needed from this data.

7.8

Repeating the teaching

The preconditions for optimization were also investigated by teaching the clustered regression controller anew with data collected from the learned walking. The purpose was to find out whether the learned walking provides enough information about the operation of the system. The taught walker behaved like the PD-controlled walker in the sense that the walking motion never stabilized into repeating exactly the same cycle. On the other hand the variation was as small as in the PD walking, so Gaussian white noise was added in the simulation to the controls of the learned walking, with a variance corresponding to the noise variance used in the original teaching.

The data collected from the system controlled by the clustered regression controller proved, however, to have such properties that walking no longer succeeded. By adjusting the averaging parameter \sigma^2 of the control reconstructions of the different operating points the walker could be made to take a few steps, but a sustained walking motion was not achieved.

Apparently the data no longer represented the system dynamics comprehensively enough. On the other hand the inaccuracies of the model need not be large, since the controller is quite sensitive to even small deviations that drive it away from the learned operating curve.


Conclusions

In this work a simulation tool for a simplified walking robot model was first created in the Matlab/Simulink environment. The model describes a two-dimensional, two-legged robot, which consists of two legs with knee joints attached to a rigid one-piece upper body. The dimensions and masses of the robot model can be chosen freely, as can the shape of the ground and other properties. Even though the dynamics equations of the walker formed with Lagrangian mechanics were complex, the model could be simulated reasonably quickly. Modeling the contact between the robot legs and the ground with external support forces also turned out to work, although hard ground surfaces decreased the simulation speed.

A PD control was developed for the walker, which creates the cyclic reference signals of the joint angles during the simulation and guides the walking with separate PD controllers. The guidance of the walker, at least with the selected parameter values, proved to be sensitive to changes in the control parameters. On the other hand, in a truly multivariable system individual single-loop controllers cannot in general act in the best possible manner. Neither does the structure of the walker used in the study make maintaining balance easy even in theory: for example, static stability with only one foot on the ground is impossible to achieve, because the support surface of the walker has then shrunk to a single point.

On the basis of the input and response data collected from the PD-controlled walking, the inverse dynamics of the walking motion was modeled in a data-based manner. The model structure used was clustered regression, a piecewise linear structure composed of principal component regression models. When the covariance matrices needed by the model were computed from the teaching data in advance, the modeling succeeded with fairly good accuracy. The obtained model could be used as such for controlling the system by feeding the control estimate corresponding to the measured state directly to the walker as the control. The PD-controlled walking used as the starting point could be repeated with this simple regression structure almost unchanged.

The clustered regression controller was found to be sensitive to imperfections that drive the system off the taught movement trajectory. Clearly the controller could not act very well in regions that were not represented in the training data. Since the controller itself did not stabilize the system against disturbances, it normally did not survive very large deviations. On the other hand, white noise of the same magnitude as in the PD-controlled model walking could be added to the control signal of the clustered regression controller before the walker fell down.


Modeling the walking motion with adaptively updated regression structures did not succeed in this work. The variances of the less significant principal components required by the principal component regression were so small that the selected principal component regression algorithm no longer converged. The optimization of the walking could therefore not be tried out in practice.

Further study should focus in particular on testing the adaptive updating of the model structure. With a suitable principal component regression algorithm the difficulties caused by the small variances of the principal components could perhaps be avoided. On the other hand the degeneration of the model may interfere with the optimization: it was observed that the data collected from walking guided by the learned regression model no longer described the process as well as the original data.


References
[1] H. Hyötyniemi. Life-like control. In STeP 2002 - Intelligence, the Art of Natural and Artificial, pp. 124-139. Finnish Artificial Intelligence Society, 2002.
[2] T. McGeer. Passive dynamic walking. The International Journal of Robotics Research, 9(2):62-82, April 1990.
[3] T. McGeer. Passive walking with knees. In Proceedings of the IEEE International Conference on Robotics and Automation, pp. 1640-1645, 1990.
[4] M. Coleman and A. Ruina. An uncontrolled toy that can walk but cannot stand still. Physical Review Letters, 80(16):3658-3661, April 1998.
[5] A. Kuo. Stabilization of lateral motion in passive dynamic walking. The International Journal of Robotics Research, 18(9):917-930, September 1999.
[6] M. Ogino, K. Hosoda, and M. Asada. Acquiring passive dynamic walking based on ballistic walking. In Proceedings of the Fifth International Conference on Climbing and Walking Robots (CLAWAR 2002), pp. 139-146, 2002.
[7] H. Ohta, M. Yamakita, and K. Furuta. From passive to active dynamic walking. In Proceedings of the 38th Conference on Decision & Control, Part 4, pp. 3883-3885, December 1999.
[8] K. Hirai. Current and future perspective of Honda humanoid robot. In Proceedings of the 1997 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '97), Part 2, pp. 500-508, September 1997.
[9] S. Kitamura, Y. Kurematsu, and M. Iwata. Motion generation of a biped locomotive robot using an inverted pendulum model and neural networks. In Proceedings of the 29th IEEE Conference on Decision and Control, Part 6, pp. 3308-3312, 1990.
[10] Y. Kurematsu, T. Maeda, and S. Kitamura. Autonomous trajectory generation of a biped locomotive robot using a neuro oscillator. In IEEE International Conference on Neural Networks, Part 3, pp. 1961-1966, 1993.
[11] G. Taga, Y. Yamaguchi, and H. Shimizu. Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment. Biological Cybernetics, 65(3):147-159, 1991.
[12] J. Hu, J. Pratt, and G. Pratt. Adaptive dynamic control of a bipedal walking robot with radial basis function neural networks. In Proceedings of the 1998 IEEE/RSJ International Conference on Intelligent Robots and Systems, Part 1, pp. 400-405, 1998.


[13] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press, Cambridge, MA, 1992.
[14] SIGEL project. http://sourceforge.net/projects/sigel.
[15] J. Ziegler, J. Barnholt, J. Busch, and W. Banzhaf. Automatic evolution of control programs for a small humanoid walking robot. In Proceedings of the Fifth International Conference on Climbing and Walking Robots (CLAWAR 2002), pp. 109-116, 2002.
[16] O. Haavisto and H. Hyötyniemi. Simulation tool of a biped walking robot. Technical Report 138, Helsinki University of Technology, Control Engineering Laboratory, 2004.
[17] C. Chevallereau, G. Abba, Y. Aoustin, F. Plestan, E. R. Westervelt, C. Canudas de Wit, and J. W. Grizzle. RABBIT: A testbed for advanced control theory. IEEE Control Systems Magazine, 23(5):57-79, October 2003.
[18] Z. Zografski. Geometric and neuromorphic learning for nonlinear modeling, control and forecasting. In Proceedings of the 1993 IEEE International Symposium on Intelligent Control, pp. 158-163, 1993.
[19] C. G. Atkeson, A. W. Moore, and S. Schaal. Locally weighted learning. Artificial Intelligence Review, 11:11-73, 1997.
[20] S. Schaal, C. G. Atkeson, and S. Vijayakumar. Scalable locally weighted statistical techniques for real time robot learning. Applied Intelligence - Special Issue on Scalable Robotic Applications of Neural Networks, 17(1):49-60, 2002.
[21] S. Vijayakumar and S. Schaal. Local adaptive subspace regression. Neural Processing Letters, 7(3):139-149, 1998.
[22] S. Schaal, S. Vijayakumar, and C. G. Atkeson. Local dimensionality reduction. In Advances in Neural Information Processing Systems 10, pp. 633-639. MIT Press, Cambridge, MA, 1998.
[23] S. Vijayakumar, A. D'Souza, and S. Schaal. Incremental online learning in high dimensions. Neural Computation (submitted), 2003. http://wwwclmc.usc.edu/sethu/pub_topics.1.html.
[24] S. Vijayakumar, T. Shibata, J. Conradt, and S. Schaal. Statistical learning for humanoid robots. Autonomous Robots, 12(1):55-69, 2002.


[25] F. L. Lewis and V. L. Syrmos. Optimal Control, second edition. John Wiley & Sons, New York, 1995.
[26] S. Haykin. Neural Networks: A Comprehensive Foundation, second edition. Prentice Hall, Upper Saddle River, New Jersey, 1999.
[27] O. Haavisto. Control engineering special assignment: A model structure fitting block for the Matlab/Simulink environment. 2003.
[28] T. Kohonen. Self-Organization and Associative Memory. Springer-Verlag, 1984.
[29] D. A. Wells. Schaum's Outline of Theory and Problems of Lagrangian Dynamics with a Treatment of Euler's Equations of Motion, Hamilton's Equations and Hamilton's Principle. Schaum Publishing Co., New York, 1967.
[30] K. I. Diamantaras and S. Y. Kung. Principal Component Neural Networks: Theory and Applications. John Wiley & Sons, New York, 1996.
[31] H. Hyötyniemi. Multivariate regression: Techniques and tools. Technical Report 125, Helsinki University of Technology, 2001.
[32] H. Hyötyniemi. Hebbian and anti-Hebbian learning: A system theoretic approach. Neural Networks (submitted), 2003.


Lagrangian mechanics

Lagrangian mechanics provides a systematic method for forming and solving the dynamics equations of a system. It is based on Newton's classical mechanics and on the concept of virtual work. The following outlines the derivation of the Lagrange equations and the formation of the dynamics equations. Reference [29] was used as the source.

A.1

The generalized coordinates

The mechanical state of a system can be presented with the help of various coordinates. For example, the position of the mass of a simple planar pendulum can be expressed in Cartesian coordinates (x_1, y_1) or using the angle of the pendulum arm (\theta). The minimum number of coordinates needed is the number of degrees of freedom of the system, which must be deduced from the structure of the system. The generalized coordinates q_1, q_2, ..., q_s can be any coordinates, but they are chosen independent of each other, which means that there is exactly one generalized coordinate for each degree of freedom of the system.

For mechanical systems consisting of point masses, the positions of the masses are usually easiest to express in Cartesian coordinates (x_i, y_i, z_i). In the following, only so-called holonomic systems are considered, for which the Cartesian coordinates can be expressed in terms of the generalized coordinates as follows:

x_i = x_i(q_1, q_2, \dots, q_s)
y_i = y_i(q_1, q_2, \dots, q_s)    (32)
z_i = z_i(q_1, q_2, \dots, q_s).

A.2

Lagrange equations

Applying Newton's second law to the point mass m_i gives

F_{xi} = m_i \ddot{x}_i
F_{yi} = m_i \ddot{y}_i    (33)
F_{zi} = m_i \ddot{z}_i,

where F_i = [F_{xi}, F_{yi}, F_{zi}]^T is the force acting on the mass. Assume that the mass i is moved by an arbitrary, small virtual displacement (\delta x_i, \delta y_i, \delta z_i).


Multiplying each of the equations (33) by the corresponding coordinate displacement and summing the equations of all N_m point masses gives D'Alembert's formula:

\sum_{i=1}^{N_m} m_i (\ddot{x}_i \delta x_i + \ddot{y}_i \delta y_i + \ddot{z}_i \delta z_i) = \sum_{i=1}^{N_m} (F_{xi} \delta x_i + F_{yi} \delta y_i + F_{zi} \delta z_i).    (34)

The right-hand side of formula (34) is the total virtual work \delta W done by the forces F_i, and the left-hand side can be interpreted as the change of the kinetic energy of the system caused by the small displacement. The virtual displacements can be expressed in terms of the generalized coordinates using equations (32); for example, for the displacement \delta x_i one obtains

\delta x_i = \frac{\partial x_i}{\partial q_1} \delta q_1 + \frac{\partial x_i}{\partial q_2} \delta q_2 + \dots + \frac{\partial x_i}{\partial q_s} \delta q_s.    (35)

The virtual displacements expressed in generalized coordinates (35) are substituted into D'Alembert's formula (34), and (because the original displacement was arbitrary) only the variation in the direction of the first generalized coordinate is considered, that is,

\delta q_i = 0, \quad i > 1.    (36)

Rearranging the terms gives

\delta W_{q_1} = \sum_{i=1}^{N_m} m_i \left( \ddot{x}_i \frac{\partial x_i}{\partial q_1} + \ddot{y}_i \frac{\partial y_i}{\partial q_1} + \ddot{z}_i \frac{\partial z_i}{\partial q_1} \right) \delta q_1    (37)

= \sum_{i=1}^{N_m} \left( F_{xi} \frac{\partial x_i}{\partial q_1} + F_{yi} \frac{\partial y_i}{\partial q_1} + F_{zi} \frac{\partial z_i}{\partial q_1} \right) \delta q_1.    (38)

An auxiliary formula is introduced, which is proven for example in reference [29]:

\ddot{x}_i \frac{\partial x_i}{\partial q_1} = \frac{d}{dt} \frac{\partial (\dot{x}_i^2/2)}{\partial \dot{q}_1} - \frac{\partial (\dot{x}_i^2/2)}{\partial q_1}.    (39)

By substituting formula (39) and the corresponding formulas for y_i and z_i, the expression (37) can be written in the form

\left[ \frac{d}{dt} \frac{\partial}{\partial \dot{q}_1} \sum_{i=1}^{N_m} m_i \frac{\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2}{2} - \frac{\partial}{\partial q_1} \sum_{i=1}^{N_m} m_i \frac{\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2}{2} \right] \delta q_1.    (40)

The sum appearing twice in the expression is directly the kinetic energy T of the whole mass system. The generalized force F_{q_1} associated with coordinate q_1 is obtained from expression (38), which is of the form F_{q_1} \delta q_1.

The expression (40) applies to any generalized coordinate, and by equating (37) and (38) the Lagrange equations can be written:

\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_r} - \frac{\partial T}{\partial q_r} = F_{q_r}, \quad r = 1, 2, \dots, s.    (41)

To form the dynamics equations of a system, the expression for the kinetic energy, the transformation equations (32), and the expressions of the generalized forces are thus needed. The method used in this work to determine the expression of the generalized force F_{q_r} associated with coordinate q_r is the following:

1. All other generalized coordinates are kept constant, but coordinate q_r is given the value q_r + \delta q_r.

2. The work \delta W_{q_r} done by all the forces acting on the masses during this displacement is computed.

3. The desired force F_{q_r} is solved from the formula

\delta W_{q_r} = F_{q_r} \delta q_r.    (42)

The expressions of the kinetic energy and the generalized forces are typically easiest to form first in Cartesian coordinates and then to convert with the transformation equations (32) into functions of the generalized coordinates. By substituting into the Lagrange equations (41), the dynamics equations of the system are formed.
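As a simple illustration of the procedure (not part of the walker model), consider the planar pendulum mentioned in section A.1, with arm length l, mass m and generalized coordinate \theta:

x_1 = l \sin\theta, \quad y_1 = -l \cos\theta
\quad \Rightarrow \quad
T = \tfrac{1}{2} m (\dot{x}_1^2 + \dot{y}_1^2) = \tfrac{1}{2} m l^2 \dot{\theta}^2.

A virtual displacement \delta\theta raises the mass by l \sin\theta \, \delta\theta, so the work done by gravity is \delta W_\theta = -m g l \sin\theta \, \delta\theta and F_\theta = -m g l \sin\theta. Substituting into (41) gives the familiar pendulum equation

\frac{d}{dt} \frac{\partial T}{\partial \dot{\theta}} - \frac{\partial T}{\partial \theta} = m l^2 \ddot{\theta} = -m g l \sin\theta.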


Dynamics equations of the model

The dynamics model of the two-legged walker used in the simulations was written in the form

A(q) \ddot{q} = b(q, \dot{q}, F, M).    (43)

The structure of the walker and the parameters used are shown in Figure 1. The vector q consists of the chosen generalized coordinates, F of the support forces of the ground, and M of the joint moments. The dynamics equations were formed in terms of the seven generalized coordinates corresponding to the degrees of freedom of the system. In the following, the elements of the inertia matrix A(q) and of the right-hand side b(q, \dot{q}, F, M) of the matrix-form dynamics equation (43) are given one by one:

A (q): A11 =m0+ 2m1+ 2m2 A12 = 0 A13 = (-2m1r0-2m2r0) Cos () + (l-1m2-m1r1) Cos ( -L)-l1m2cos ( -R) -m1r1cos ( -R)-m2r2cos ( -L+L)-m2r2cos ( -R+R) A14 = (L1m2+m1r1) Cos ( -L) + m2r2cos ( -L+L) A15 = (L1m2+m1r1) Cos ( -R) + m2r2cos ( -R+R) A16 =-m2r2cos ( -L+L) A17 =-m2r2cos ( -R+R) A21 = 0 A22 =m0+ 2m1+ 2m2 A23 = (2m1r0+ 2m2r0) Sin () + (L1m2+m1r1) Sin ( -L) + l1m2sin ( -R) + m1r1sin ( -R) + m2r2sin ( -L+L) + m2r2sin ( -R+R) A24 = (L-1m2-m1r1) Sin ( -L)-m2r2sin ( -L+L) A25 = (L-1m2-m1r1) Sin ( -R)-m2r2sin ( -R+R) A26 =m2r2sin ( -L+L) A27 =m2r2sin ( -R+R) A31 = (-2m1r0-2m2r0) Cos () + (l-1m2-m1r1) Cos ( -L)-l1m2cos ( -R) -m1r1cos ( -R)-m2r2cos ( -L+L)-m2r2cos ( -R+R)

A32 = (2m1r0+ 2m2r0) Sin () + (L1m2+m1r1) Sin ( -L) + l1m2sin ( -R) +m1r1sin ( -R) + m2r2sin ( -L+L) + m2r2sin ( -R+R) A33 = 2l12m2+ 2m1r02+ 2m2r02+ 2m1r12+ 2m2r22+ (2l1m2r0 + 2m1r0r1) Cos (L) + (2l1m2r0+ 2m1r0r1) Cos (R) + 2m2r0r2cos (L-L) + 2l1m2r2cos (L) + 2m2r0r2cos (R -R) + 2l1m2r2cos (R)


A34 =-l12m2-m1r12-m2r22+ (L-1m2r0-m1r0r1) Cos (L) -m2r0r2cos (L-L)-2l1m2r2cos (L) A35 =-l12m2-m1r12-m2r22+ (L-1m2r0-m1r0r1) Cos (R) -m2r0r2cos (R-R)-2l1m2r2cos (R) A36 =m2r2(R2+r0cos (L-L) + l1cos (L)) A37 =m2r2(R2+r0cos (R-R) + l1cos (R)) A41 = (L1m2+m1r1) Cos ( -L) + m2r2cos ( -L+L) A42 = (L-1m2-m1r1) Sin ( -L)-m2r2sin ( -L+L)

A43 =-l12m2-m1r12-m2r22+r0(-L1m2-m1r1) Cos (L) -m2r0r2cos (L-L)-2l1m2r2cos (L) A44 =l12m2+m1r12+m2r22+ 2l1m2r2cos (L) A45 = 0 A46 =m2r2(R2-l1cos (L)) A47 = 0 A51 = (L1m2+m1r1) Cos ( -R) + m2r2cos ( -R+R) A52 = (L-1m2-m1r1) Sin ( -R)-m2r2sin ( -R+R)

A53 =-l12m2-m1r12-m2r22+r0(-L1m2-m1r1) Cos (R) -m2r0r2cos (R-R)-2l1m2r2cos (R) A54 = 0 A55 =l12m2+m1r12+m2r22+ 2l1m2r2cos (R) A56 A57 A61 A62 A63 A64 A65 =0 =m2r2(R2-l1cos (R)) =-m2r2cos ( -L+L) =m2r2sin ( -L+L) =m2r2(R2+r0cos (L-L) + l1cos (L)) =m2r2(R2-l1cos (L)) =0

A66 A67 A71 A72 A73 A74 A75 A76

=m2r22 =0 =-m2r2cos ( -R+R) =m2r2sin ( -R+R) =m2r2(R2+r0cos (R-R) + l1cos (R)) =0

=m2r2(R2-l1cos (R)) =0 A77 =m2r22


b (q, q ', F, M ): 2b1=-22m1r0sin () + FRx -Rm1r1sin ( -R)-2l1m2sin ( -L) + FLx -Lm2r2sin ( -L+L)-2m1r1sin ( -R)2 -2l1m2sin ( -R)-2l1m2sin ( -R)-2m2r2sin (
R R

22 -R+R)-Lm2r2sin ( -L+L)-Lm1r1sin ( -L) -2m2r2sin ( -R+R)-2m1r1sin ( -L) R 2Ll1m2sin ( -L)-2m2r2sin ( - 2m2r2sin ( -L+L) + 2Rl1m2 -

R+R) sin ( -R)

+ 2RRm2r2sin ( -R+R)-2 Rm2r2sin ( -R+R) + 2Lm2r2sin ( -L+L) + 2Lm1r1sin ( -L) + 2Rm2r2sin ( -R+R) + 2Ll1m2sin ( -L) -22m2r0sin () + 2Rm1r1sin ( -R) + 2LLm2r2sin ( -L+L)-2 Lm2r2sin ( -L+L) 22b2=-Lm2r2cos ( -L+L) + FRy -2GM2-Rm2r2cos ( -R+R) -2m2r2cos ( -R+R)-2m2r2cos ( -R+R)
R

- Lm2r2cos ( -L+L)-2m1r1cos ( -R)2 2Rm1r1cos ( -R)-2(2m1+ 2m2) R0cos () - -

2l1m2cos (

2-R)-Rl1m2cos ( -R) + FLy -2m2r2cos ( -L+L)-2GM1 + 2Rl1m2cos ( -R) + 2RRm2r2cos ( -R+R) -2 Rm2r2cos ( -R+R) + 2Rm2r2cos ( -R+R) -2 Lm2r2cos ( -L+L) + 2LLm2r2cos ( -L+L) + 2Lm2r2cos ( -L+L) + 2Rm1r1cos ( -R) -(L(2l-1m2-2m1r1) + 2(L1m2+m1r1) + 2(L1m2 L +m1r1)) Cos ( -L)-GM0


b3=Rl1m2r2sin (R) + FRy l2sin ( -R+R)-FLx l1cos ( -L)2 -FRx l1cos ( -R)-FLx l2cos ( -L+L) + FLy r0sin () + FRy r0sin () + FLy l1sin ( -L) + FLy l2sin ( -L+L)-GM2r2sin ( -R+R) -2m2r0r2sin (R-R)-2m2r0r2sin (R-R)
R R

- GM2r2sin ( -L+L) + Ll1m2r2sin (L)2 2 - Lm2r0r2sin (L-L)-Rm1r0r1sin (R)2 22 - Lm2r0r2sin (L-L)-Rl1m2r0sin (R) - 2 sin ( -R)-GM1r1sin ( Lm1r0r1sin (L)-gl1m2sin ( 2-R) + FRy l1 -R)-gl1m2sin ( -L)-GM1r1sin ( -L)-Ll1m2r0sin (L) -(CLx r0+FRx r0) Cos () -2 Lm2r0r2sin (L-L) + 2LLm2r0r2sin (L-L) + 2Lm2r0r2sin (L -L) + 2Rm1r0r1sin (R)-2GM2r0sin () + 2 Rl1m2r2sin (R) + 2Rm2r0r2sin (R-R) -2 Rm2r0r2sin (R-R) + 2 Ll1m2r2sin (L) -2LLl1m2r2sin (L)-2GM1r0sin () + 2Rl1m2r0sin (R) + 2Ll1m2r0sin (L) + 2Lm1r0r1sin (L)-2RRl1m2r2sin (R) + 2RRm2r0r2sin (R-R)-FRx l2cos ( -R+R) b4=ML1 +FLx l1cos ( -L) + FLx l2cos ( -L+L)-FLy l1sin ( -L) +gl1m2sin ( -L) + GM1r1sin ( -L)-2l1m2r0sin (L) -2m1r0r1sin (L)-2m2r0r2sin (L-L) -2 Ll1m2r2sin (L) + 2LLl1m2r2sin (L) - Ll1m2r2sin (L)-FLy l2sin ( -L+L) + GM2r2sin ( -L+L)2 b5=MR1 +FRx l1cos ( -R) + FRx l2cos ( -R+R)-FRy l1sin ( -R) +gl1m2sin ( -R) + GM1r1sin ( -R)-2l1m2r0sin (R) -2m1r0r1sin (R)-2m2r0r2sin (R-R) -2 Rl1m2r2sin (R) + 2RRl1m2r2sin (R) -Rl1m2r2sin (R)-FRy l2sin ( -R+R) + GM2r2sin ( -R+R)2 b6=ML2 -FLx l2cos ( -L+L) + 2m2r0r2sin (L-L) -2l1m2r2sin (L) + 2Ll1m2r2sin (L) 2 -Ll1m2r2sin (L) + FLy l2sin ( -L+L)-GM2r2sin ( -L+L) b7=MR2 -FRx l2cos ( -R+R) + 2m2r0r2sin (R-R) -2l1m2r2sin (R) + 2Rl1m2r2sin (R) 2 -Rl1m2r2sin (R) + FRy l2sin ( -R+R)-GM2r2sin ( -R+R)


Multivariate regression methods

Multivariate regression examines the statistical dependencies of data, on the basis of which a mapping from the input to the response is formed. This appendix goes through principal component analysis, in which the directions of maximum variance, the principal components, are sought from the input data, and principal component regression, which computes a linear mapping from the principal components to the response. In addition, an iterative algorithm based on neural computation is presented, with which both the principal component analysis and the regression can be carried out. These methods form the basis on which the data-based modeling tool used in this work and presented in chapter 6, the clustered regression, is built.

C.1

Principal component analysis

The aim of principal component analysis is to describe potentially mutually correlating random variables with a smaller number of uncorrelated random variables. The method can be used for compressing data and for visualizing its internal structure. The principal component vectors span the space onto which the data is projected. The first principal component vector points in the direction in which the variance of the random vector is largest, that is, it aims to describe as large a part of the variability of the data as possible. For computing the next principal component vector, the data is projected onto the hyperplane perpendicular to the first principal component vector and the direction of maximum variance is sought again. This can be continued until the number of principal component vectors equals the dimension of the original random vector.

Figure 22 shows a number of values of a two-dimensional random vector [u_1, u_2]^T and the directions of both principal component vectors. It can be clearly seen that the first principal component vector w_1 points in the direction of maximum variance. The second principal component vector w_2, in turn, is perpendicular to the first one and describes the remaining variability of the data.

Consider an m-dimensional data vector u = [u_1, u_2, \dots, u_m]^T whose expectation is E\{u\} = 0 and covariance matrix R_{uu} = E\{uu^T\}. The data vector is described by the feature-space vector (latent vector) formed with the principal component vectors:

x = (W W^T)^{-1} W u = W u, \quad x \in R^n.    (44)

The mapping matrix W = [w_1, w_2, \dots, w_n]^T has as its rows the normalized direction vectors of the n principal components, which form the basis of the feature space.

Figure 22: Two-dimensional data u and the principal component vectors w_1 and w_2.

Because the rows are orthonormal, the mapping simplifies, since W W^T = I. Data compression is achieved by leaving the last principal components unused (n < m), in which case the data can no longer be described without error. The reconstruction of the original vector is obtained directly by multiplying the components of the feature vector with the corresponding principal component vectors:

\hat{u} = W^T x = W^T W u.    (45)

The principal component directions are the normalized eigenvectors of the covariance matrix R_{uu}, and the associated variances are the corresponding eigenvalues. Thus, by arranging the eigenvalues of the matrix R_{uu} in descending order of magnitude, the corresponding eigenvectors give directly the desired projection directions in order of importance. It can also be shown that the matrix W minimizes the reconstruction error

J_e = E\{\|u - \hat{u}\|\}    (46)

and maximizes the variances of the principal components in the manner described above [30]. If the original random variables correlate linearly with each other, the variances of the last principal components are small. Typically the number of feature variables can then be chosen smaller than the dimension of the original data, and the data can still be described almost without error using only the most significant principal component directions.


At the same time, the number of independent variables actually present in the data can be determined.

C.2

Principal component regression

The aim of principal component regression is to form a regression mapping from the input to the response. Principal component analysis is first performed on the input data; by leaving out the last principal components, the dimension of the data can be reduced and the effect of noise present in the data potentially decreased [31]. This is based on the assumption that the noise is distributed evenly in all coordinate directions, so that concentrating on the dependencies between the actual variables reduces the noise. Finally, the feature vector data is mapped with linear least-squares multivariate regression to obtain the desired output estimate.

Suppose that training data is given, containing N m-dimensional input vectors as the matrix U = [u(1), u(2), \dots, u(N)]^T and the corresponding l-dimensional response vectors as the matrix Y = [y(1), y(2), \dots, y(N)]^T. First the n first principal component vectors of the inputs, that is, eigenvectors of the covariance matrix R_{uu}, are computed as the rows of the matrix W. After this all inputs are mapped to feature vectors, in matrix form by the formula

X = U W^T.    (47)

The least-squares linear multivariate regression mapping G from the feature vectors to the responses can then be computed with the pseudoinverse:

G = (X^T X)^{-1} X^T Y.    (48)

The response estimate for a new input vector u_{est} is now formed using formulas (44), (47) and (48):

y_{est} = G^T x_{est} = G^T W u_{est} = Y^T U W^T (W U^T U W^T)^{-1} W u_{est}.    (49)
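The computation of formulas (47)-(49) for given zero-mean training matrices U and Y can be written compactly in Matlab; the value of n is only an example.

% Principal component regression on zero-mean training data U (N x m)
% and Y (N x l), following formulas (44) and (47)-(49).
n = 3;                                    % number of retained components
[V, D] = eig(U' * U / size(U, 1));        % eigenvectors of R_uu
[lam, order] = sort(diag(D), 'descend');
W = V(:, order(1:n))';                    % mapping matrix, rows = principal components
X = U * W';                               % feature data, formula (47)
G = (X' * X) \ (X' * Y);                  % regression mapping, formula (48)
yest = @(u) G' * W * u;                   % response estimate, formula (49)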

C.3

Principal component regression algorithm based on Hebbian and anti-Hebbian learning (HAH)

Neural computation is based on a number of similar units, neurons, which are able to perform simple calculations. By connecting these units in series and in parallel, complex mappings from the input to the response can be implemented with the resulting neural networks.


The literature [30] presents several principal component algorithms that use neural computation, typically based on the Hebbian learning rule (e.g. [26]). In Hebbian learning the connection between neurons is strengthened when the activations of the neurons correlate with each other, and weakened when the correlation is negative. In anti-Hebbian learning the connections are updated in exactly the opposite way, so that the neurons compete with each other. Usually the connections between parallel neurons are updated with anti-Hebbian learning when the goal is that only one neuron activates for a given input. In the following, the principal component regression algorithm based on Hebbian and anti-Hebbian learning presented in reference [32] is described. The algorithm allows the regression structure to be updated continuously as the statistical properties of the data change with time.

C.3.1

Structure and function of the neural network

The neural structure of the algorithm consists of n parallel neurons. The m-dimensional data vectors u(k) are used as the input of the network, and the activation states of all neurons can be collected into the vector x(\tau, k) \in R^n. The index k refers to the number of the sample presented to the network. The continuous time variable \tau, in turn, describes the internal dynamics of the neuron structure: when a new sample is presented to the neurons, the activation values take some time to settle into an equilibrium state. During the presentation of sample u(k) the neuron states are thus x(\tau, k).

It is assumed that the instantaneous change of the neuron activations depends linearly on the input vector and on the neuron states, so that the dynamics of the whole neural network can be written in matrix form as

\frac{dx(\tau, k)}{d\tau} = A x(\tau, k) + B u(k).    (50)

The matrices A and B contain the weights of the connections between the neurons. Figure 23 shows a diagram of the dynamic network structure. In the behavior of the neurons the interesting part is the steady state obtained as the result of convergence,

\bar{x}(k) = \lim_{\tau \to \infty} x(\tau, k),

whose expression can be solved by setting the change term in formula (50) to zero:

\bar{x}(k) = -A^{-1} B u(k).    (51)

In the analysis one can thus move directly to the outcome of the dynamics. The stable final state of the neuron activations when the input u(k) is presented is denoted x(k).


Figure 23: The dynamic neural network structure of the principal component regression algorithm based on Hebbian and anti-Hebbian learning. The integration is performed with respect to the time variable τ.
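As a quick numerical illustration of the formulas (50) and (51), the following Matlab fragment integrates the network dynamics with explicit Euler steps for one fixed input and compares the result with the closed-form steady state. All numeric values and names are arbitrary examples, not taken from the thesis.

% Simulate eq. (50) and compare with the steady state of eq. (51).
A = -[2.0 0.3; 0.3 1.0];            % stable (negative definite) state feedback weights
B = [1.0 0.5 -0.2; 0.2 1.0 0.4];    % input weights
u = [0.7; -0.3; 1.2];               % one fixed input sample
x = zeros(2, 1);                    % initial activation state
dtau = 0.01;
for step = 1:5000
    x = x + dtau * (A * x + B * u); % explicit Euler step of eq. (50)
end
disp([x, -A \ (B * u)])             % converged state next to the closed form of eq. (51)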

Now the weight coefficient matrices are chosen as

A = -γ E{x̄ x̄^T}   and   B = γ E{x̄ u^T},    (52)

where the coefficient γ determines the convergence speed of the algorithm (it scales the dynamics but cancels from the steady state), and the expected values E{·} are taken over the sample index k. The input weights B, and thereby the corresponding term of the state change, are determined by the Hebbian rule directly from the input and state signals, i.e., from their cross-covariance matrix: if the signals correlate strongly, the corresponding weight coefficient is large, and vice versa. The state feedback through A, in turn, affects the state change negatively, so this part of the update follows the anti-Hebbian learning rule. Once the states have become decorrelated, the corresponding change terms of the iteration vanish and the system settles.

The steady state is obtained by substituting the selections (52) into the expression (51):

x̄(k) = E^{-1}{x̄ x̄^T} E{x̄ u^T} u(k) = Θ^T u(k),    (53)

where the linear mapping from the input to the neuron states is denoted by the matrix Θ^T ∈ R^{n×m}. Consider next the covariance matrix of the neuron final states, obtained by taking the expectation of the outer product of the vector x̄(k) given by formula (53):

E{x̄ x̄^T} = E^{-1}{x̄ x̄^T} E{x̄ u^T} E{u u^T} E^T{x̄ u^T} E^{-1}{x̄ x̄^T}.    (54)

The formula holds if the input data is assumed stationary. Multiplying both sides by E{x̄ x̄^T} from the left and from the right yields

E^3{x̄ x̄^T} = E{x̄ u^T} E{u u^T} E^T{x̄ u^T}.    (55)

Substituting the expression (53) of x̄(k) into this, the equation takes the form

(Θ^T E{u u^T} Θ)^3 = Θ^T E^3{u u^T} Θ.    (56)

The mapping matrix Θ is trivial when n = m, i.e., when the mapping performs no compression. If, however, n < m, the properties of the mapping must be examined in more detail. Denote by θ_i the eigenvectors and by λ_i the eigenvalues of the input covariance matrix E{u u^T}; these are the principal component vectors of the input data and the corresponding variances. With the eigenvalue decomposition the matrix can be written as

E{u u^T} = Φ Λ Φ^{-1} = Φ Λ Φ^T,    (57)

where Λ is the diagonal eigenvalue matrix and the columns of Φ are the corresponding eigenvectors. Since the eigenvectors of the symmetric covariance matrix form an orthonormal basis, the inverse can be replaced by the transpose in the calculation. Choose as the mapping matrix

Θ = Φ' D,    (58)

where D is an arbitrary orthogonal n×n matrix, for which D D^T = D^T D = I_n, and Φ' is composed of any n eigenvectors chosen from among the eigenvectors of E{u u^T}. Substituting into equation (56), the left-hand side becomes

(Θ^T E{u u^T} Θ)^3 = (D^T Φ'^T Φ Λ Φ^T Φ' D)^3 = (D^T Λ' D)^3 = D^T Λ'^3 D.    (59)

In the diagonal matrix Λ', the diagonal elements are the eigenvalues corresponding to the selected eigenvectors. On the other hand, the right-hand side of the equation evaluates to

Θ^T E^3{u u^T} Θ = D^T Φ'^T Φ Λ^3 Φ^T Φ' D = D^T Λ'^3 D.    (60)

Thus, if the columns of the mapping matrix Θ are chosen as arbitrary linear combinations of some subset of the principal component vectors of the input data, a solution to equation (56) is obtained. Since the final states of the neurons are not known in advance, the covariances cannot be computed directly. The covariance estimates can, however, be updated gradually by means of the state estimates, which in turn are obtained using the covariance matrix estimates. It turns out [32] that after convergence this double iteration actually leads to a situation in which the matrix

Θ = E^T{x̄ u^T} E^{-1}{x̄ x̄^T}    (61)

spans the same subspace as the n most important principal component vectors of the input data.

If the columns of Θ are to correspond exactly to the principal component vectors θ_i, the covariance matrix of the neuron states must remain diagonal. In that case the latent variables are mutually uncorrelated and E{x̄ x̄^T} = Λ'. This is achieved by zeroing the below-diagonal elements of the covariance estimate of E{x̄ x̄^T} when the neuron states are calculated, whereby the first neuron converges to the most important principal component. Each remaining neuron then only sees the subspace left over by the preceding ones, so that the principal components converge to the neurons in order of importance. Thus, by using the covariance matrix in upper triangular form in each iteration, the principal component vectors can be calculated, and the neuron states correspond to the latent variables of principal component analysis.

C.3.2 Principal component analysis

To ease the calculations, the iteration formulas for updating the covariance estimates needed in the principal component analysis are presented in the following, using the neuron dynamics directly in their steady state according to formula (53). At sample index k, estimates are maintained for the covariance of the latent variable data x̄(k), denoted R_xx(k) ∈ R^{n×n}, and for the cross-covariance of the latent and input data, denoted R_xu(k) ∈ R^{n×m}. The cross-covariance estimate at time k is updated using the formula

R_xu(k) = μ(k) R_xu(k-1) + (1 - μ(k)) x̄(k-1) u^T(k-1),    (62)

where the parameter μ(k) is a forgetting factor. The algorithm needs the inverse of the latent variable covariance matrix, P_xx(k), for which a direct update rule can be derived in the same way as in formula (62):

P_xx(k) = R_xx^{-1}(k) = [μ(k) R_xx(k-1) + (1 - μ(k)) x̄(k-1) x̄^T(k-1)]^{-1} = [μ(k) P_xx^{-1}(k-1) + (1 - μ(k)) x̄(k-1) x̄^T(k-1)]^{-1}.    (63)

Now, applying the matrix inversion lemma (e.g. [28]),

(A^{-1} + B C^{-1} D)^{-1} = A - A B (D A B + C)^{-1} D A,    (64)

with A^{-1} = μ(k) P_xx^{-1}(k-1), B = D^T = x̄(k-1) and C^{-1} = 1 - μ(k), an explicit update rule is obtained:

P_xx(k) = (1/μ(k)) [ P_xx(k-1) - P_xx(k-1) x̄(k-1) x̄^T(k-1) P_xx(k-1) / ( x̄^T(k-1) P_xx(k-1) x̄(k-1) + μ(k)/(1 - μ(k)) ) ].    (65)

The latent variable data estimate is calculated using the covariance estimates:

x̄(k) = triang{P_xx(k)} R_xu(k) u(k),    (66)

where the operation triang{·} zeroes the elements of its argument matrix below the diagonal. By iterating the formulas (66), (62) and (65) until the covariance matrices have converged, the principal component analysis is obtained. The principal component vectors are the columns of the mapping matrix (61), so the principal component basis matrix W is obtained by transposing:

W(k) = Θ^T(k) = P_xx(k) R_xu(k).    (67)
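One update step of this iterative calculation, combining the formulas (62), (65) and (66), could be sketched in Matlab as below. The function and variable names are illustrative assumptions; the initialization of Pxx and Rxu (for example a scaled identity and a small random matrix) and the choice of the forgetting factor mu are left open.

function [Pxx, Rxu, xbar] = pca_step(Pxx, Rxu, xprev, uprev, u, mu)
% One iteration of the covariance-based principal component analysis.
Rxu = mu * Rxu + (1 - mu) * xprev * uprev';   % cross-covariance update, eq. (62)
num = Pxx * xprev * (xprev' * Pxx);           % rank-one correction term of eq. (65)
den = xprev' * Pxx * xprev + mu / (1 - mu);
Pxx = (Pxx - num / den) / mu;                 % inverse covariance update, eq. (65)
xbar = triu(Pxx) * Rxu * u;                   % latent state estimate, eq. (66)
end

After the estimates have settled, the principal component basis of formula (67) is available as the product Pxx * Rxu.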

C.3.3 Principal component regression

The calculation of the regression part of the principal component algorithm is based on the same principles as the principal component analysis presented above. Starting from the obtained latent variable data, the data is first whitened in order to simplify the regression structure. The covariance matrix of the whitened latent variable data z(1), z(2), ..., z(N) is the identity matrix, so the final output of the resulting regression mapping is given directly by

ŷ = E{y z^T} z.    (68)

Denote further the cross-covariance of the responses and the whitened latent variable data by

E{y z^T} = R_yz ∈ R^{l×n},    (69)

where l is the dimension of the responses y(1), y(2), ..., y(N).

The cross-covariance estimate at time k is updated correspondingly to formula (62) as follows:

R_yz(k) = μ(k) R_yz(k-1) + (1 - μ(k)) y(k-1) z^T(k-1).    (70)

The signal estimates needed in the calculation of the covariance matrices are obtained, correspondingly to formula (66), using the covariance estimates:

z(k) = diag{P_xx(k)}^{1/2} x̄(k),
ŷ(k) = R_yz(k) z(k).    (71)

The operation diag{·} forces all elements of its argument matrix to zero except the diagonal elements. The upper formula whitens the data by dividing each component by its standard deviation, and the lower one computes the actual regression. The principal component regression response can thus be calculated by iterating the formulas (66), (71), (62), (65) and (70) until the covariance matrix estimates have converged. After convergence the inverse covariance matrix P_xx of the latent variable data is in diagonal form, so the operators triang and diag in the formulas (66) and (71) can be left out as redundant.
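Continuing the sketch above, the regression part of one iteration, formulas (70) and (71), might look as follows; again the names are illustrative assumptions, and pca_step and pcr_step would be called once per sample in a loop over the data until the covariance estimates converge.

function [Ryz, z, yhat] = pcr_step(Ryz, Pxx, xbar, yprev, zprev, mu)
% Regression part of one iteration of the HAH algorithm.
Ryz = mu * Ryz + (1 - mu) * yprev * zprev';   % cross-covariance update, eq. (70)
z = sqrt(diag(diag(Pxx))) * xbar;             % whitening, eq. (71); diagonal of Pxx assumed positive
yhat = Ryz * z;                               % regression estimate, eq. (71)
end

Between calls, the previous latent state, whitened state and response (xprev, zprev, yprev) are stored, and once the double iteration has converged the response estimate yhat can be used directly.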

