Hacker News
Genetic Algorithm Walkers (rednuht.org)
122 points by clukic on Jan 19, 2015 | 74 comments



I'm currently stuck at 1233 after 250 generations. The last improvement was at gen 183. Walking gets part way down the first downslope, then falls. The current champions are far better than any of their mutations, so a local maximum has been reached. This is the fate of most evolutionary AI.

Once on the downslope, the controller for flat ground won't work, so now it has to jump to something substantially more complex to progress. Making such jumps was an unsolved problem with genetic algorithms 20 years ago, and I commented back then that whoever solved it would win a Nobel Prize. Still waiting.

I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.


Just to see if any further improvement was possible, I left this running for over 18,000 generations. 98% of the improvement had happened by gen 183. The last minor improvement was at gen 7313. There's no sign of a controller evolving that can handle non-flat terrain.

I used to work on legged locomotion. Once you get off the flat, traction control ("anti-slip") starts to dominate the problem. It looks like this simulation isn't going to evolve anti-slip control.


What you describe is called local optima and it's well studied. There is no magic way around it, you just need to put resources into exploring a larger area. Methods include selecting really diverse solutions rather than just the best one, restarting the simulation with a new random population, or sometimes throwing away the best solution and letting it wander around randomly longer before converging.
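A minimal sketch of the restart idea in Python (the toy fitness function and parameters are made up, not anything from the site):

```python
import random

def hill_climb(fitness, genome, steps=200, sigma=0.1):
    """Local search: keep a mutation only if it improves fitness."""
    best, best_fit = genome, fitness(genome)
    for _ in range(steps):
        cand = [g + random.gauss(0, sigma) for g in best]
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best, best_fit

def search_with_restarts(fitness, genome_len, restarts=5, steps=200):
    """Escape local optima by re-running from fresh random genomes
    and keeping the best result across all runs."""
    runs = [
        hill_climb(fitness, [random.uniform(-1, 1) for _ in range(genome_len)], steps)
        for _ in range(restarts)
    ]
    return max(runs, key=lambda run: run[1])
```

Each restart pays the full cost of a fresh run, but it is often the cheapest way to sample a genuinely different basin of attraction.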

>I suspect that a solution lies in work on neural nets where one net is trained, then a second, smaller, net is trained to match the first. This is a form of "abstraction". Somewhere in that space is something very important to AI.

Neural nets have a lot of desirable properties for this kind of stuff. Mainly that they can be trained with reinforcement learning. Instead of randomly testing which solutions work better, you have them try to predict how good each action is.


From my experience, restarts are as important as mutation and recombination. The most efficient algorithm I've seen so far is a mixture of ES, hillclimbing and restarts. The whole field seems to be pretty stuck IMO, teaching the same inefficient approaches over and over again.


A bit OT, but I remember a very cool Quake2 mod that basically built a neural network so that bots could learn how to play.

The useful part was that you could save the NN to a file, so you could let Q2 run for hours (mostly I left it running at night so it wouldn't prevent me from playing :) ) and the next day I would save the progress.

After about a week of doing this every night, the bots evolved from just jumping around at their spawn points without moving to actually running around the level and shooting the other bots on sight.

They didn't really have good aim, but sometimes they would land a rocket or a rail. I always wondered how long I would have to let the bots evolve before they became competitive.

Unfortunately, after some time I lost my NN files and lost the project page. Some time later I found the project again, but it was no longer updated and the download links were broken.

Wish I had saved the mod :(

Now, honest question: would this kind of approach work for, say, programming a Hello World program? Or is the number of variables and possibilities too big?

It would be interesting to put several of these bots to compete against each other, but in different languages and see which one is "easier" to grasp by the bot.


There are branches of evolutionary computation that aim to evolve programs. The best known examples are Genetic Programming (GP) and Evolutionary Programming (EP). GP is by far the most popular today, and involves evolving source code, generally in Lisp-like languages where programmatically manipulating source code is a bit easier, but there are also GP approaches that use stack-based languages like Forth or Factor. I'm not aware of anyone trying to evolve code in languages like C++ or Java, but I suppose someone out there has probably experimented with it.

EP is about evolving finite state machines, and from there, you could presumably infer a program in the conventional way, although it might not be particularly nice for humans to read. Neither are the programs learned by GP though, for that matter.


I once experimented with small math-calculation programs and found that adding a heuristic that prefers programs of shorter length, given equal correctness, helped evolve shorter, more readable programs.
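That tie-breaking heuristic can be sketched like this (hypothetical program records, not the commenter's actual code):

```python
def select_parsimonious(population):
    """Rank by correctness first; among equally correct programs,
    prefer the shorter one (parsimony pressure)."""
    return max(population, key=lambda prog: (prog["correct"], -len(prog["code"])))

programs = [
    {"code": "(x+1)*(x+1)", "correct": 1.0},
    {"code": "x*x+2*x+1", "correct": 1.0},
    {"code": "x*x", "correct": 0.4},
]
best = select_parsimonious(programs)  # the shorter of the two correct programs
```

A gentle length penalty like this also helps against bloat, the well-known tendency of evolved programs to accumulate dead code.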


I was thinking, for example: how hard would it be to let an algorithm pick a random number of valid tokens from a language (say Java) and have the success metric be something like the number (or score) of errors that the compiler returns?

I would expect the first few iterations to produce nonsense, but maybe after a while a bot would be able to correctly write a variable assignment.

The first obvious problem I see comes from the fact that we know we can have a completely valid piece of code after gibberish and the compiler won't notice the difference.

But what if there were a compiler able to "score" the "correctness" of the program? That would solve that issue, and in theory it could "learn" how to program, albeit slowly and, like you say, most likely not in a human-readable way.


Number of syntax errors is extremely weakly predictive of program correctness, and figuring out correctness of arbitrary source code is extremely difficult.

This is why the bread-and-butter of GP is symbolic regression. Rather than dealing with programs that can do literally anything, we focus on evolving mathematical expressions. Given a set of data points, can you find a regression equation over some variables that minimizes the error on the training data? There are no syntax errors to worry about, no early termination, no control flow: just a small fixed set of arithmetic operators, numbers, and variables. And by keeping the domain to numeric functions, you get a free error metric that is generally sensible.
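A minimal symbolic-regression sketch along those lines (pure random search over tiny expression trees; division is left out to avoid divide-by-zero):

```python
import random
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def random_expr(depth=2):
    """Grow a random expression tree over x, constants, and + - *."""
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else round(random.uniform(-2, 2), 2)
    return (random.choice(list(OPS)), random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    """Recursively evaluate an expression tree at a point x."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(expr, data):
    """Squared error on the training points -- the free fitness metric."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data)

# Target: y = x^2, sampled at a few points.
data = [(x, x * x) for x in range(-3, 4)]
best = min((random_expr() for _ in range(2000)), key=lambda e: error(e, data))
```

A real GP system would evolve these trees with subtree crossover and mutation instead of pure random sampling, but the representation and fitness metric are the same.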

There is talk in the field about the future (can you evolve a web browser, for instance?), but for now that remains very futuristic.


That's very interesting.

But then, I remember doing some work on college about Gödel number (https://en.wikipedia.org/wiki/G%C3%B6del_numbering). Would it be possible for example, to use this as a form of encoding instructions?

Maybe if arbitrary programs are too broad a scope, how about getting just one instruction correct? E.g. evolving the correct print(message) syntax. Do you have any references for this? It's a fascinating subject!


I don't have any references for it, but evolving correct syntax should be doable I think, at least for smallish scale problems.

Others have mentioned that people have evolved "Hello world" in Brainfuck, so it's definitely possible to do interesting things. It's just too ambitious right now to shoot for actually useful programs.


Reminds me of the 'bot armistice' where a guy let a bot-vs-bot match run for 4 years and the steady state was 'the bots on both teams are simply standing still, not doing anything. The server is running and the game isn’t frozen. The bots are simply standing there, not killing one another.'

http://www.forbes.com/sites/erikkain/2013/07/02/quake-iii-ar...

Could be urban legend but fun story nonetheless :)


So basically, a WOPR's "the only winning move is not to play" moment?


Someone made a GA evolve a hello world program in brainfuck. It's definitely possible, but incredibly inefficient.
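The string-evolution version of that experiment is easy to reproduce. Note this is a toy: unlike real program evolution, the fitness function here knows exactly which characters are already right, so it converges quickly:

```python
import random
import string

TARGET = "Hello World"
CHARS = string.ascii_letters + " "

def fitness(candidate):
    """Number of characters already matching the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))

def evolve(pop_size=50, mut_rate=0.05, max_gens=5000):
    """(1+lambda)-style mutation-only evolution with elitism."""
    parent = "".join(random.choice(CHARS) for _ in TARGET)
    for gen in range(max_gens):
        if parent == TARGET:
            return parent, gen
        children = [
            "".join(random.choice(CHARS) if random.random() < mut_rate else c
                    for c in parent)
            for _ in range(pop_size)
        ]
        parent = max(children + [parent], key=fitness)  # elitism: never regress
    return parent, max_gens
```

Evolving a brainfuck program that *prints* the string is vastly harder, because the distance between a program's output and the target gives far less gradient than a per-character comparison of strings.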


If you look at the trillions of creatures thrown at the evolutionary game every day in real life, Mother Nature herself is not terribly efficient either.


Would that be like an infinite number of bonobos...?


Wow. That does sound inefficient.


I suspect that the best way to tackle this problem is to treat "learning to walk" as a reinforcement learning problem. This way you can evaluate actions within each episode rather than waiting until the end of each trial to adjust parameters encoding the walking strategy.

As karpathy suggests below (and he's certainly much more qualified than me), an evolutionary method such as GA's, while apparently fairly effective, could well be wasting valuable information learned in real time through interaction.
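A toy illustration of that value-learning idea, in a bandit-style setting with three hypothetical actions: instead of scoring whole strategies at the end of a trial, we update an estimate of how good each action is after every step.

```python
import random

def train_action_values(mean_rewards, steps=2000, epsilon=0.1, alpha=0.1):
    """Epsilon-greedy action-value learning: per-step feedback updates
    q[a], an estimate of each action's expected reward."""
    q = [0.0] * len(mean_rewards)
    for _ in range(steps):
        if random.random() < epsilon:
            a = random.randrange(len(mean_rewards))               # explore
        else:
            a = max(range(len(mean_rewards)), key=q.__getitem__)  # exploit
        reward = mean_rewards[a] + random.gauss(0, 0.1)           # noisy feedback
        q[a] += alpha * (reward - q[a])                           # incremental update
    return q

# Hidden mean rewards for three actions; action 2 is the best.
q = train_action_values([0.1, 0.5, 0.9])
```

A walking controller would need function approximation and credit assignment over time on top of this, but the core idea (predict action quality, then act on the prediction) is the same.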


One interesting thing is that progress very much happens by fits and starts, as it waits for the "lucky mutation" that will enable evolution to continue.

For example,

  0	Bedaaa Ceeici	6.07
  1	Bocodo Bidobo	105.93
  3	Aibebe Docoeo	107.74
  7	Diaebe Eocoeu	107.88
  10	Ciaabe Eocoeo	107.95
  25	Diaebe Facodu	108.19
  28	Biaebe Eocoeo	108.20
  30	Biaeae Eacoeo	108.47
  35	Beaebi Fucieo	109.88
  36	Biaebi Eucici	203.60
  42	Biaibi Fuceci	204.65
  45	Aiaibi Fuceeo	206.30
  47	Aiaibi Fucici	206.60
  48	Aeaebe Eoceeo	412.96
  56	Beaebe Euceeo	414.05
  59	Beaebi Fubiei	415.76
  73	Beaebi Focieu	519.01
  75	Beaebi Eocido	519.39
  96	Baaebi Gidido	521.00
  99	Baaebe Focedo	627.14
There are pretty massive jumps at generations 36, 48, 73 and 99.

By the way, this is at 50% mutation probability and 25% mutation amount. It got stuck way earlier with larger, less frequent mutations (as one would expect, if the probability of a beneficial mutation occurring is the limiting factor).
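My guess at what the two sliders mean, as code (a common convention, not necessarily the site's exact implementation): probability decides *whether* a gene mutates, amount decides *how far* it can move.

```python
import random

def mutate(genome, probability=0.5, amount=0.25):
    """Each gene mutates with `probability`; a mutated gene is scaled
    by up to +/- `amount` (a relative perturbation)."""
    return [
        g * (1 + random.uniform(-amount, amount)) if random.random() < probability else g
        for g in genome
    ]
```

With small, frequent mutations the population does a fine-grained local search; with large, rare ones most children are duds, which matches the observation that the large-mutation setting stalls earlier.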


This is called punctuated equilibrium. Evolution happens in fits and starts after periods of relative non-change.

http://en.wikipedia.org/wiki/Punctuated_equilibrium


If you look at how the scores for the walkers increase as they walk, you will see that it jumps in increments of 100. I guess it is some kind of "milestone bonus".


The page says "bonus points are given for each proper step forward".

So it appears they are optimizing for whatever their idea of a "proper step" is.


That's every time the legs switch places. Without that bonus, you'd get a lot of QWOP-ish individuals dragging one foot behind. :)


Ah. Aiaeca Eeaeci rocking it at 1048.50 at gen 256.

EDIT: I've heavily tweaked my config throughout, depending on whether I thought I was trapped at a local maximum, etc. Right now, for the late game, I think I've settled on:

10% mutation probability, 1% mutation amount, 3 champions to copy

Currently: 1056.91 @ 338.


With a Core i7, I wasn't expecting the simulation to run so slowly, even at the lowest settings. Maybe Firefox on RHEL 6 isn't the optimal platform for this little toy?

--edit: it runs much, much faster on Firefox with Windows 8.1. Not a Core i7, but an AMD 8-core thing. Must be Firefox on Linux's implementation, unless it's a video card/driver thing.


Runs pretty fast with Chromium on Linux, but in Firefox it's pretty slow as well. That's with a 2-core 1.7GHz i7

It's probably the physics simulation. Random JS benchmarks show Chrome and Firefox to be approximately equal in JS performance on my computer, with Chrome beating FF slightly on SunSpider and the other way around on Octane.


Running plenty fast for me on Chrome / i5 / Linux.


What exactly is evolving? How is the genome represented, how does it control the walking?


The genome is represented by a set of parameters that control the rotation speed on each joint, 3 parameters per joint.

The rotation speed for each joint on each simulation step is given by:

  x * cos(y + z * simulation_steps);
where x, y, and z are the parameters. I've experimented with more sophisticated models, but it was hard to get any kind of evolution in the attention span people give to a browser game. :D


I've worked on similar projects during my Master's degree. An important trick I learned was to make the controller a function of some kind of discrete feedback in the world (e.g. when the foot hits the ground). In your equation above for example, a simple way to incorporate this would be to reset simulation_steps to zero every time a foot hits the ground.
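As a sketch of that suggestion, layered on the x * cos(y + z * t) controller from above (the foot-contact flag is assumed to come from the physics engine):

```python
import math

class PhaseResetController:
    """Joint-speed controller x * cos(y + z * t), where the phase
    clock t restarts whenever a foot touches the ground, so the gait
    synchronises with the terrain instead of with absolute time."""

    def __init__(self, joint_params):
        self.joint_params = joint_params  # one (x, y, z) tuple per joint
        self.t = 0

    def step(self, foot_hit_ground):
        if foot_hit_ground:
            self.t = 0  # discrete feedback from the world resets the phase
        speeds = [x * math.cos(y + z * self.t) for x, y, z in self.joint_params]
        self.t += 1
        return speeds
```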

This is also an important feature in the SIMBICON controller, which is arguably the simplest and most robust walker system. (http://www.cs.ubc.ca/~van/papers/Simbicon.htm)

EDIT: and yes, reinforcement learning is a much more effective way of attacking this type of problem (e.g. see some recent work from Sergey Levine http://www.eecs.berkeley.edu/~svlevine/), but I also agree with the author that GAs are fun :)


Fun, but even after hundreds of generations the walkers are still pretty bad. How good would they be after thousands or millions of generations? Is it even feasible to create a decent walker using genetic algorithms?


These simulations are very sensitive to the parameters, some of which we don't even have control over (e.g. population size or the method used to select the next generation), and especially to the way the genome is represented.

It doesn't really say how it works, but it doesn't seem like a very natural way to do walking. E.g. here are evolved walkers in a more complicated 3D simulation: http://vimeo.com/79098420 They seem to get very good after just a few generations compared to this.


Incredible. It even evolves a walker resembling a variant of a kangaroo (at [01:57]). With a target speed of 1.0 m/s, the legs alternate in taking small steps, but at 2.0 m/s, both legs are used to jump.


Now that's awesome!


This experiment reminds me of one of my recent experiments: http://madebyevan.com/nn-gait/. Mine ends up figuring how to walk relatively quickly. It does have a regular pulse as an input though.


Quite interesting! Is there any source code available publicly for this?


I'm always confused when people ask me this question. All browsers have the ability to view the source code of a web page (View > Developer > View Source in Chrome). The source code is here: http://madebyevan.com/nn-gait/script.js. Is that what you meant?


Most people assume the in-browser code will not be the fully commented, clean, and presentable code that is more likely to appear on GitHub. I think that's what prompts queries like this :)


I'd like to see a genetic algorithm evolutionarily figuring out which parameters give you the best-performing walkers in the shortest time ;) Meta-genetic, if you will


It's a pretty well-known field of research (though not specifically for toy walking problems). Genetic algorithms are members of the class of methods called "metaheuristics", so your "meta" prefix is sort of already there. So researchers went up a level and called the meta-level methods "hyperheuristics".
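A toy sketch of that meta level: an outer search scores hyperparameter settings by how well an inner hill-climber does with them (all numbers here are made up):

```python
import random

def inner_search(sigma, steps=100):
    """Inner optimiser: hill-climb a toy 1-D fitness with step size sigma."""
    x = 5.0
    best = -x * x
    for _ in range(steps):
        cand = x + random.gauss(0, sigma)
        if -cand * cand > best:
            x, best = cand, -cand * cand
    return best

def meta_search(candidate_sigmas, trials=20):
    """Outer loop: keep the hyperparameter with the best average result."""
    def average(sigma):
        return sum(inner_search(sigma) for _ in range(trials)) / trials
    return max(candidate_sigmas, key=average)
```

The catch is cost: every hyperparameter evaluation is itself a full (averaged) optimisation run, which is why hyperheuristics research cares so much about cheap proxies.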


I was just about to give up at 70th generation, when the walkers suddenly evolved from 1 to 4 steps in 5 generations.

Currently at generation 120 with 7 steps, looks like it is "evolving" in bursts, with long periods of nothing.

Well, it is stuck at 9 steps at gen 400+.

  327	aibobe baeado	948.93
  548	aibobe baeado	1058.76
I wonder if the author got any further. I would say this is close to the limit.


Falling into local minima and getting rescued by mutations.


It's interesting how I got:

  0	Cicebi Bicedi	206.20
  1	Cibebi Dibodi	208.03
  4	Cobebi Bibedi	208.54
  6	Cobebi Bobedi	208.68
  8	Cocebi Boaedi	209.53
  14	Cocibi Boaedi	310.47
  20	Cocebi Boaedi	312.73
  36	Cocebi Boaedi	312.85
  38	Cocebi Biaedi	312.89
  40	Cocebi Bobedi	314.06
  58	Cocebi Bobedi	417.71
  119	Cicebi Bibodi	522.37
  331	Cicebi Bibuei	624.26

And at ~500 generations the champions are still the same three. I have fiddled with the parameters several times to no avail. I've tried a very low mutation probability (1-5%) with a low mutation amount (1-10%), but also 75-100% mutation probability with a 1-5% mutation amount, and all figures in between, and nothing seems to make it get out of the local minimum.

Is there any way to get out of this? Or when this happens in nature, does the species simply go extinct or get eaten by everything else?


Ah, it seems I finally got out of it at gen ~530 by increasing the number of champions to copy, to raise the amount of gene variation in the pool.

I guess it was kind of stuck in an inbreeding loop.


It's quite hard to get any better than 10 or so steps, luck starts playing a major role after a while.

Some gait branches seem to reach a dead-end where any improvement would need to come from a major overhaul of the gene combination instead of small mutations.

The terrain starts to get more variation as the distance increases, so that's another piece of evil against the walkers. :D Maybe I should turn that off.

I've let it run for a couple of days. I don't think I got past 12 steps, but it looked pretty regular walking for a while. :)


I was thinking about writing something similar in C. How long is the genome (bytes/chars)?


The genome is pretty simple, just a few motor parameters for each joint in the body.

3 floats for each joint, to be precise. Then those are combined to set the speed of the motor in each joint on every simulation step.

I tried more complex genomes, but it just takes way too long to get any kind of improvement. It didn't make for a fun casual browser experience. :)


Looks like you've hit an evolutionary dead-end.. exterminate!


I felt bad closing the window.


Welcome to punctuated equilibria.


So is this really a genetic algorithm, or is it hill-climbing? Are you combining genomes from different individuals? Or are you just mutating a single genome? There are no controls for crossing genomes, so it seems likely to be hill-climbing.

If not, how are you mapping the genome to the phenome?


In the source code there is a function for mating.
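For context, a typical mating operator on fixed-length genomes is uniform crossover, something like this (a generic sketch, not the site's actual function):

```python
import random

def mate(parent_a, parent_b):
    """Uniform crossover: each gene comes from one parent at random."""
    assert len(parent_a) == len(parent_b)
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]
```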


Reminds me of this really fantastic video of finding the gaits of different sized/legged 'animals' at different speeds: http://vimeo.com/79098420


I just got one that evolved kneeling to last a few more seconds: http://imgur.com/mii9N8J


Please make one for QWOP http://www.foddy.net/Athletics.html


There already is, actually. I read the paper a while back. Video: https://www.youtube.com/watch?v=eWxFI3NHtT8 Paper: http://dl.acm.org/citation.cfm?id=2598248


You made my day. Nice find!


Reminded me about http://rednuht.org/genetic_cars_2/

If you're disappointed with walkers' performance, try cars, they improve much more. The principle remains the same of course. Attention: addictive.


Yep, I remember that and http://boxcar2d.com/ ending up on HN some time ago


I remember this. I could watch it forever!


You seem to have some bugs with the copying of champions and/or the identical reproduction of walks at the very short simulation length. Without having adjusted simulation parameters I no longer have my highest scoring walker saved as a champion.


Thanks, I'll look into it!


Very cool! I did my master thesis in Genetic Algorithms and bipedal robots. You can watch the results here: https://www.youtube.com/watch?v=SYONnbvsxz0


This looks like fun! It seems that the copied champions are re-simulated in every run. As long as there is no random influence, you could probably skip that simulation, since simulating the champions is most likely also the most CPU-intensive part.

EDIT: Grammar.


I was thinking the same thing. Since the genome certainly isn't very long, the best champions could be stored until a better champion is found.


The simulation of the champions is unnecessary, but it's a nice detail. It gives a visual sense of improvement.


Good point!


It converged pretty fast to the first step by the 10th gen, then got stuck having a seizure and falling on its head "forever".

I bet it would be interesting to evolve a simpler movement mechanic, like in bacteria.


This is how NaturalMotion started. I remember their CEO Torsten evolving muscle controllers on the Mathengine physics engine (1999?). Get it right, and you get to build a pretty nice business out of it!


You could optimize the no-graphics/high speed even more by not simulating the copied champions (since their end score is pre-determined)


I always spend way too much time on these sites. Not a new idea, but the visualization is awesome.


It's not improving after gen 190. It's stuck.


How many steps / what score? Try changing the parameters.


I did. It was running at 0 fps at max speed. I even changed the mutation parameters, but it was still not improving. I'm running a dual-core machine.



