I'm currently stuck at 1233 after 250 generations. The last improvement was at gen 183. Walking gets part way down the first downslope, then falls. The current champions are far better than any of their mutations, so a local maximum has been reached. This is the fate of most evolutionary AI.
Once on the downslope, the controller for flat ground won't work, so now it has to jump to something substantially more complex to progress. Making such jumps was an unsolved problem with genetic algorithms 20 years ago, and I commented back then that whoever solved it would win a Nobel Prize. Still waiting.
I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.
Just to see if any further improvement was possible, I left this running for over 18,000 generations. 98% of the improvement had happened by gen 183. The last minor improvement was at gen 7313. There's no sign of a controller evolving that can handle non-flat terrain.
I used to work on legged locomotion. Once you get off the flat, traction control ("anti-slip") starts to dominate the problem. It looks like this simulation isn't going to evolve anti-slip control.
What you describe is called a local optimum, and it's well studied. There is no magic way around it; you just need to put resources into exploring a larger area. Methods include selecting really diverse solutions rather than just the best one, restarting the simulation with a new random population, or sometimes throwing away the best solution and letting the search wander around randomly for longer before converging.
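For what it's worth, restarts are the easiest of these to bolt on. A minimal sketch of random-restart hill climbing (the toy fitness function here is made up for illustration, not the walker sim):

```python
import math
import random

def hill_climb(fitness, dims, steps=200, sigma=0.1):
    """Local search: mutate the genome and keep improvements."""
    best = [random.uniform(-1, 1) for _ in range(dims)]
    best_f = fitness(best)
    for _ in range(steps):
        cand = [g + random.gauss(0, sigma) for g in best]
        f = fitness(cand)
        if f > best_f:
            best, best_f = cand, f
    return best, best_f

def hill_climb_with_restarts(fitness, dims, restarts=10):
    """Restart from fresh random genomes to escape local optima,
    then keep the best result across all runs."""
    runs = [hill_climb(fitness, dims) for _ in range(restarts)]
    return max(runs, key=lambda r: r[1])

# Toy multimodal fitness: many local maxima, global maximum 0.1 at 0.
def fitness(g):
    return -sum(x * x for x in g) + 0.1 * math.cos(10 * g[0])

genome, score = hill_climb_with_restarts(fitness, dims=3)
```

Each restart is independent, so the budget trade-off is explicit: more restarts cover more of the space, longer climbs polish each basin.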
>I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.
Neural nets have a lot of desirable properties for this kind of problem, mainly that they can be trained with reinforcement learning. Instead of randomly testing which solutions work better, you have them try to predict how good each action is.
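For a concrete picture of "predict how good each action is", here's a minimal tabular Q-learning sketch on a toy corridor world (nothing to do with the walker sim, just the idea: the agent learns a value per state-action from experience, rather than scoring whole genomes):

```python
import random

# 5-state corridor: the agent must walk right from state 0 to the
# goal at state 4. Q[(s, a)] is the learned estimate of how good
# action a is in state s.
N, GOAL = 5, 4
ACTIONS = (1, -1)
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(500):
    s = 0
    while s != GOAL:
        if random.random() < eps:
            a = random.choice(ACTIONS)                 # explore
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])  # exploit
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == GOAL else 0.0
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])      # update estimate
        s = s2
```

The key contrast with a GA: credit is assigned to individual actions within the episode, instead of to the whole genome at the end of a trial.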
From my experience, restarts are as important as mutation and recombination. The most efficient algorithm I've seen so far is a mixture of ES, hillclimbing and restarts. The whole field seems to be pretty stuck IMO, teaching the same inefficient approaches over and over again.
A bit OT, but I remember a very cool Quake2 mod that basically built a neural network so that bots could learn how to play.
The useful part was that you could save the NN to a file, so you could let Q2 run for hours (mostly I left it running at night so it wouldn't prevent me from playing :) ) and the next day I would save the progress.
After about a week of doing this every night, the bots evolved from just jumping around in their spawn points without moving, to actually running around the level and shooting the other bots on sight.
They didn't really have good aim, but sometimes they would land a rocket or a rail. I always wondered how long I would have to let the bots evolve for them to become competitive.
Unfortunately, after some time I lost my NN files and lost the project page. Some time later I found the project again, but it was no longer updated and the download files were broken.
Wish I had saved the mod :(
Now, honest question: would this kind of approach work for, say, programming a Hello World program? Or is the number of variables and possibilities too big?
It would be interesting to put several of these bots to compete against each other, but in different languages, and see which language is "easier" for the bot to grasp.
There are branches of evolutionary computation that aim to evolve programs. The best known examples are Genetic Programming (GP) and Evolutionary Programming (EP). GP is by far the most popular today, and involves evolving source code, generally in Lisp-like languages where programmatically manipulating source code is a bit easier, but there are also GP approaches that use stack-based languages like Forth or Factor. I'm not aware of anyone trying to evolve code in languages like C++ or Java, but I suppose someone out there has probably experimented with it.
EP is about evolving finite state machines, and from there, you could presumably infer a program in the conventional way, although it might not be particularly nice for humans to read. Neither are the programs learned by GP though, for that matter.
I once experimented with small math calculation programs and found that adding a heuristic to prefer the shorter of two programs, given equal correctness, helped evolve shorter, more readable programs.
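That heuristic is usually called lexicographic parsimony pressure. A sketch of the comparison (the programs and scores below are made up):

```python
def better(a, b):
    """Lexicographic parsimony pressure: compare by correctness first;
    only when correctness ties, prefer the shorter program.
    Candidates are (correctness_score, program_source) pairs."""
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    return a if len(a[1]) <= len(b[1]) else b

p1 = (10, "x*x + 2*x + 1")
p2 = (10, "(x+1)*(x+1)")  # same correctness, shorter source
p3 = (8, "x")             # shortest, but less correct
```

Here `better(p1, p2)` picks `p2` (same score, shorter), while `better(p2, p3)` keeps `p2` because correctness still wins outright.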
I was thinking, for example: how hard would it be to let an algorithm pick a random number of valid tokens from a language (say Java), with the success metric being something like the number (or score) of errors the compiler returns?
I would expect the first few iterations to produce nonsense, but maybe after a while a bot would be able to correctly write a variable assignment.
The first obvious problem I see comes from the fact that we know we can have a completely valid piece of code after gibberish and the compiler won't notice the difference.
But what if there were a compiler able to "score" the "correctness" of the program? That would solve that issue, and in theory the bot would be able to "learn" how to program, albeit slowly and, like you say, most likely producing code that isn't human readable.
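To get a feel for the "compiler as fitness function" idea, here's a rough Python sketch. Note that Python's compile() stops at the first syntax error, so the score here is just binary rather than a graded error count, and the search is pure rejection sampling rather than evolution (the token set is invented for illustration):

```python
import random

TOKENS = ['x', '=', '1', '+', '2', '\n', 'print(x)']

def compiles(source):
    """Binary fitness: 1 if the snippet parses as Python, else 0.
    (An error *count* would need a more forgiving parser, since
    compile() raises on the first syntax error it hits.)"""
    try:
        compile(source, '<candidate>', 'exec')
        return 1
    except SyntaxError:
        return 0

def random_program(length=5):
    return ' '.join(random.choice(TOKENS) for _ in range(length))

# Degenerate ancestor of the evolutionary version: sample random
# token strings until one parses (e.g. "x = 1 + 2").
prog = random_program()
for _ in range(100000):
    if compiles(prog):
        break
    prog = random_program()
```

Even this toy shows the problem discussed below: parsing successfully says almost nothing about whether the program does anything useful.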
Number of syntax errors is extremely weakly predictive of program correctness, and figuring out correctness of arbitrary source code is extremely difficult.
This is why the bread-and-butter of GP is symbolic regression. Rather than deal with programs that do literally anything, we focus on evolving mathematical expressions. Given a set of data points, can you find a regression equation of some variables that minimizes the error on the training data? There are no syntax errors to worry about, no early termination, no control flow; just a small fixed set of arithmetic operators, numbers, and variables. And by keeping the domain in numeric functions, you get a free error metric that is generally sensible.
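A toy symbolic-regression GP along those lines (the terminal set, operators, and parameters are all invented for illustration; real GP systems also use crossover, which is omitted here):

```python
import random
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}
TERMINALS = ['x', 1.0, 2.0]

def rand_tree(depth=3):
    """Random expression tree: nested (op, left, right) tuples."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return (random.choice(list(OPS)), rand_tree(depth - 1), rand_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(tree, data):
    """Squared error on the data points: the free fitness metric."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data)

def mutate(tree, depth=2):
    """Replace a random subtree with a fresh random one."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return rand_tree(depth)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left, depth), right)
    return (op, left, mutate(right, depth))

def evolve(data, pop_size=50, gens=30):
    pop = [rand_tree() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda t: error(t, data))
        survivors = pop[:pop_size // 5]  # keep the best fifth
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda t: error(t, data))

# Target: y = x*x + x, sampled on a few points.
data = [(x, x * x + x) for x in range(-3, 4)]
best = evolve(data)
```

Closed operators (no division) mean every tree evaluates without errors on every input, which is exactly the property that makes this domain so much friendlier than general program synthesis.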
There is talk in the field about the future -- can you evolve a web browser, for instance -- but this is very futuristic at the moment at least.
But then, I remember doing some work in college on Gödel numbering (https://en.wikipedia.org/wiki/G%C3%B6del_numbering). Would it be possible, for example, to use this as a form of encoding instructions?
Maybe if arbitrary programs are too broad a scope, how about specifically getting only one instruction correct? E.g. evolving the correct print(message) syntax. Do you have any references for this? It's a fascinating subject!
I don't have any references for it, but evolving correct syntax should be doable I think, at least for smallish scale problems.
Others have mentioned that people have evolved "Hello world" in Brainfuck, so it's definitely possible to do interesting things. It's just too ambitious right now to shoot for actually useful programs.
Reminds me of the 'bot armistice' where a guy let a bot-vs-bot match run for 4 years and the steady state was 'the bots on both teams are simply standing still, not doing anything. The server is running and the game isn’t frozen. The bots are simply standing there, not killing one another.'
If you look at the trillions of creatures thrown at the evolutionary game every day in real life, Mother Nature herself is not terribly efficient either.
I suspect that the best way to tackle this problem is to treat "learning to walk" as a reinforcement learning problem. This way you can evaluate actions within each episode rather than waiting until the end of each trial to adjust parameters encoding the walking strategy.
As karpathy suggests below (and he's certainly much more qualified than me), an evolutionary method such as GAs, while apparently fairly effective, could well be wasting valuable information learned in real time through interaction.
One interesting thing is that progress very much happens by fits and starts, as it waits for the "lucky mutation" that will enable evolution to continue.
There are pretty massive jumps at generation 36, 48, 73 and 99.
By the way, this is at 50% mutation probability and 25% mutation amount. It got stuck way earlier with large less frequent mutations (as one would expect, if the probability of a beneficial mutation occurring is the limiting factor).
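Assuming "mutation probability" is per-gene and "mutation amount" bounds the size of each perturbation (my guess at what the sliders mean), the operator would look something like:

```python
import random

def mutate(genome, prob=0.5, amount=0.25):
    """Per-gene mutation: with probability `prob`, perturb the gene
    by up to `amount` of its range (assumed here to be [0, 1])."""
    return [min(1.0, max(0.0, g + random.uniform(-amount, amount)))
            if random.random() < prob else g
            for g in genome]
```

High probability with small amount gives many tiny nudges per child; low probability with large amount gives rare big jumps, which matches the "waiting for the lucky mutation" behaviour described above.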
If you look at how the scores for the walkers increase as they walk, you will see that it jumps in increments of 100. I guess it is some kind of "milestone bonus".
Ah. Aiaeca Eeaeci rocking it at 1048.50 at gen 256.
EDIT: I've heavily tweaked my config throughout depending on whether I thought I was trapped at a local maximum, etc. Right now I think I've settled on the following for late-game:
With a Core i7, I wasn't expecting the simulation to run so slowly, even at the lowest settings. Maybe Firefox on RHEL 6 isn't the optimal platform for this little toy?
--edit: it runs much, much faster on Firefox with Windows 8.1. Not a Core i7, but an AMD 8-core thing. Must be Firefox on Linux's implementation, unless it's a video card/driver thing.
Runs pretty fast with Chromium on Linux, but in Firefox it's pretty slow as well.
That's with a 2-core 1.7GHz i7
It's probably the physics simulation. Though random JS benchmarks show Chrome and Firefox to be approximately equal in JS performance on my computer: Chrome beats FF slightly on SunSpider and it's the other way around with Octane.
The genome is represented by a set of parameters that control the rotation speed on each joint, 3 parameters per joint.
The rotation speed for each joint on each simulation step is given by:
x * cos(y + z * simulation_steps);
where x, y, and z are the parameters. I've experimented with more sophisticated models, but it was hard to get any kind of evolution in the attention span people give to a browser game. :D
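If I'm reading that right, a controller step over the whole genome would look like this (Python sketch; the parameter values are invented):

```python
import math

def joint_speeds(genome, step):
    """genome is a flat list of [x, y, z] triples, one per joint;
    each joint's rotation speed this step is x * cos(y + z * step)."""
    speeds = []
    for i in range(0, len(genome), 3):
        x, y, z = genome[i:i + 3]
        speeds.append(x * math.cos(y + z * step))
    return speeds

# A hypothetical 2-joint walker: amplitude, phase, frequency per joint.
genome = [1.0, 0.0, 0.1,      # joint 0
          0.5, math.pi, 0.1]  # joint 1, half a cycle out of phase
speeds = joint_speeds(genome, 0)
```

So each joint is an independent open-loop oscillator, and evolution is really just searching for amplitudes, phases, and frequencies that happen to produce a gait.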
I've worked on similar projects during my Master's degree. An important trick I learned was to make the controller a function of some kind of discrete feedback in the world (e.g. when the foot hits the ground). In your equation above for example, a simple way to incorporate this would be to reset simulation_steps to zero every time a foot hits the ground.
EDIT: and yes, Reinforcement Learning is much more effective way of attacking this type of problem (e.g. see some recent work from Sergey Levine http://www.eecs.berkeley.edu/~svlevine/), but I also agree with the author that GA are fun :)
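The foot-contact reset idea, sketched on top of the same x * cos(y + z * t) oscillator (foot_down is assumed to come from the physics engine's contact events; nothing here is from the actual game code):

```python
import math

def joint_speed(x, y, z, t):
    return x * math.cos(y + z * t)

class PhaseResetController:
    """The same oscillator, but the clock restarts on ground contact,
    so the gait re-synchronises with the terrain instead of running
    open loop."""
    def __init__(self, params):
        self.params = params  # flat [x, y, z] per joint
        self.t = 0

    def step(self, foot_down):
        if foot_down:
            self.t = 0  # reset the phase on foot strike
        speeds = [joint_speed(*self.params[i:i + 3], self.t)
                  for i in range(0, len(self.params), 3)]
        self.t += 1
        return speeds
```

That one bit of feedback turns a purely feed-forward clock into something that can recover when a step lands early or late, which is exactly what uneven terrain demands.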
Fun, but even after hundreds of generations the walkers are still pretty bad. How good would they be after thousands or millions of generations? Is it even feasible to create a decent walker using genetic algorithms?
These simulations are very sensitive to the parameters, some of which we don't even have control over (e.g. population size or the method used to select the next generation), and especially to the way the genome is represented.
It doesn't really say how it works, but it doesn't seem like a very natural way to do walking. E.g. here are evolved walkers in a more complicated 3D simulation: http://vimeo.com/79098420 They seem to get very good after just a few generations compared to this.
Incredible. It even evolves a walker resembling a variant of a kangaroo (at [01:57]). With a target speed of 1.0 m/s, the legs alternate in taking small steps, but at 2.0 m/s, both legs are used to jump.
This experiment reminds me of one of my recent experiments: http://madebyevan.com/nn-gait/. Mine ends up figuring how to walk relatively quickly. It does have a regular pulse as an input though.
I'm always confused when people ask me this question. All browsers have the ability to view the source code of a web page (View > Developer > View Source in Chrome). The source code is here: http://madebyevan.com/nn-gait/script.js. Is that what you meant?
Most people assume the in-browser code will not be the fully commented, clean and presentable code that is more likely to appear on Github, I think that's what prompts queries like this :)
I'd like to see a genetic algorithm evolutionarily figuring out what parameters give you best performing walkers in shortest time ;) Meta-genetic if you will
It's a pretty well known field of research (though not specifically for toy walking problems). Genetic algorithms are a member of the class of problems called "metaheuristics", so your "meta" prefix is sort of already there. So they went up a level and called them "hyperheuristics".
And at ~500 generations the champions are still the same 3. I have fiddled with the parameters several times to no avail. I've tried very low mutation probability (1-5%) with low mutation amount (1-10%), but also 75-100% mutation probability with 1-5% mutation amount, and all figures in between, and nothing seems to make it escape the local optimum.
Is there any way to get out of this? Or when this happens in nature, does the species simply go extinct or get eaten by everyone else?
It's quite hard to get any better than 10 or so steps, luck starts playing a major role after a while.
Some gait branches seem to reach a dead-end where any improvement would need to come from a major overhaul of the gene combination instead of small mutations.
The terrain starts to get more variation as the distance increases, so that's another piece of evil against the walkers. :D Maybe I should turn that off.
I've let it run for a couple of days. I don't think I got past 12 steps, but it looked pretty regular walking for a while. :)
So is this really a genetic algorithm, or is it hill-climbing? Are you combining genomes from different individuals? Or are you just mutating a single genome? There are no controls for crossing genomes, so it seems likely to be hill-climbing.
If not, how are you mapping the genome to the phenotype?
You seem to have some bugs with the copying of champions and/or the identical reproduction of walks at the very short simulation length. Without having adjusted simulation parameters I no longer have my highest scoring walker saved as a champion.
This looks like fun!
It seems that the copied champions are re-simulated in every run. As long as there is no random influence, you can probably skip this simulation, since simulating the champions is most likely also the most CPU-intensive part.
This is how NaturalMotion started. I remember their CEO Torsten evolving muscle controllers on the Mathengine physics engine (1999?). Get it right, and you get to build a pretty nice business out of it!