Can anyone link me to something or suggest how to create an AI endless runner that learns from its mistakes? Like, if it falls into a hole and dies, it restarts and tries a new move when it reaches that spot, or does the same thing and then tries something new the next time. I'm not looking to hard code `if cliff < 5 jump`. I want it to fall into the cliff, restart, and then figure out on its own that it needs to get over it.
Answers
Sounds like a good job for a neural network. If you're coding your own platformer too, start with that.
ty, sadly I don't understand his code. I keep running into Python libraries that do everything for you. Why is it so hard to learn this? Why can't I find something that tells me how it works and explains it? It amazes me how many things explain what it does without saying how to build one, and the rest just use libraries. I wanna build my own for games, then later maybe more.
Also:
You're not finding a simple example because, well, it's not a simple thing to do.
ok, this video is annoying, it's giving the AI stuff and not letting it learn on its own. That video is just doing what its code says. All of them say what it does, not how to make it. The second one programmed it to do what he wanted. I want code like the first one, where it learns by doing, not one where I say if these dots are lit up, do this. If the first video explained how to code what is going on I'd be happy, but no, he just says what it does.
You might also skip the whole neural network thing completely... If your platformer has a fixed pattern of platforms and pits to navigate, then a genetic algorithm might create a pattern of actions for a moving player to take that gets it close to a goal.
https://forum.processing.org/two/discussion/17728/genetic-algorithm
This would, of course, only create a pattern that is a good match for some input environment. If you were to, say, suddenly add a pit or remove a platform, it would screw with a genetically generated move pattern's solution a lot more than it would with a neural network's attempts.
I want it so I can throw in a pit and it learns to get around it. I need to understand how that Mario guy's thing learned. I was thinking of using booleans, but then that's kinda hard coding.
Well, okay. So let's start with a thing that moves and a pit for it to fall into. Forget about controlling the thing with code for now and just wire its movement to the arrow keys. Once we have code for that, then we can work on moving it via other means.
ok, sounds like a great place to start. Should I change x directly, or move him "right" like he was hitting a key? wait, how does it access the endless runner game?
Doesn't matter. Just make something that you are sure moves in a way that you can control. I'm using the arrow keys to control in my example.
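A bare-bones version might look like this sketch (every name in it is made up for illustration, not taken from any posted code):

```java
float px = 150;       // player's x position
float speed = 3;      // pixels moved per frame while a key is held
boolean moveLeft = false, moveRight = false;

void setup() {
  size(300, 300);
}

void draw() {
  background(0);
  if (moveLeft)  px -= speed;
  if (moveRight) px += speed;
  rect(px, height - 30, 20, 20); // a box standing in for the player
}

void keyPressed() {
  if (keyCode == LEFT)  moveLeft = true;
  if (keyCode == RIGHT) moveRight = true;
}

void keyReleased() {
  if (keyCode == LEFT)  moveLeft = false;
  if (keyCode == RIGHT) moveRight = false;
}
```

Tracking the held keys in booleans like this, instead of moving inside keyPressed(), gives smooth motion and makes it trivial to swap the keyboard out later for some other source of input.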
cool, ty for the help. ok, he can move around now by arrow keys. Even though I want him to access other games eventually, should I create the runner in his file?
Post the code of what you have.
main with left and right to move

```java
Player player;
Platform platform;

void setup() {
  size(300, 300);
  player = new Player();
  platform = new Platform(15, height - 10, 3);
}

void draw() {
  background(0);
  player.display();
  platform.display();
}
```
platform for him to fall off
player still needs gravity added
Now let's talk about formatting. Post your ENTIRE sketch. As a SINGLE CHUNK of code. YES - all tabs of code should be in the SAME CHUNK. Then select it and press Ctrl + o to indent it four spaces. This tells the forum that it is a chunk of code.
https://forum.processing.org/two/discussion/15473/readme-how-to-format-code-and-text#latest
Please learn how to get this right. Otherwise working with you and your code is a PAIN.
oh, you mean not split up by tabs. Like this, right?
Cool. So here's what I got:
Same as your code, we have Platforms and a Player.
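Roughly along these lines -- this is a reconstruction from the description, so the class internals and names are guesses, not the actual posted code:

```java
Player player;
Platform[] platforms;

void setup() {
  size(300, 300);
  // two platforms with a pit between them
  platforms = new Platform[] {
    new Platform(0, height - 10, 120),
    new Platform(180, height - 10, 120)
  };
  player = new Player();
}

void draw() {
  background(0);
  for (Platform p : platforms) p.display();
  player.update(platforms);
  player.display();
  if (player.y > height) player = new Player(); // fell in the pit: reset
}

void keyPressed() {
  if (keyCode == LEFT)  player.vx = -2;
  if (keyCode == RIGHT) player.vx = 2;
  if (keyCode == UP)    player.jump();
}

void keyReleased() {
  if (keyCode == LEFT || keyCode == RIGHT) player.vx = 0;
}

class Player {
  float x = 20, y = 0, vx = 0, vy = 0;
  void jump() {
    if (vy == 0) vy = -6; // only jump while standing on something
  }
  void update(Platform[] plats) {
    vy += 0.3; // gravity
    x += vx;
    y += vy;
    for (Platform p : plats) {
      // land on a platform when falling onto its top edge
      if (vy > 0 && x > p.x && x < p.x + p.w && y > p.y - 10 && y < p.y) {
        y = p.y - 10;
        vy = 0;
      }
    }
  }
  void display() {
    fill(255);
    rect(x, y, 10, 10);
  }
}

class Platform {
  float x, y, w;
  Platform(float x, float y, float w) {
    this.x = x; this.y = y; this.w = w;
  }
  void display() {
    fill(100);
    rect(x, y, w, 10);
  }
}
```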
So, what is a good measure of success in this super simple environment?
getting a high score and not falling in the holes, for the simple version. Then later, learning to avoid enemies.
So what earns the player score? How does falling in the hole affect the score?
falling just resets everything. You win points by landing on each roof. How does a game we control turn into a learning program?
First, you assign yourself a score based on your actions. Let's say your score is how far you move to the right plus how many times you jump, but zero if you fall into the pit. Given those criteria for a scoring system, what's the best strategy you can come up with as a human player?
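In code, that scoring rule can be as small as this (the parameters are invented for illustration; you'd feed in whatever your sketch tracks):

```java
// hypothetical fitness: distance moved right plus jumps, but zero on a death
float fitness(float xReached, int jumps, boolean fellInPit) {
  if (fellInPit) return 0;
  return xReached + jumps;
}
```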
Then what you do is assign the game to run some input that doesn't come from you. Perhaps this input is instructions like:
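For instance, one possible encoding is a fixed-length list of moves, one per frame:

```java
// R = move right, L = move left, J = jump, . = do nothing
char[] instructions = { 'R', 'R', 'R', 'J', 'R', 'R', '.', 'J', 'R' };
```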
When you simulate this input, the actions taken by a Player following those instructions will also kick out a score. This score tells us how good the instructions were.
If you have many sets of instructions to follow, you will get many scores. So you can see which instructions caused the best scores. Then - and this is the important bit - you get rid of the instructions that caused bad scores, and slightly change the instructions that were pretty good. Maybe you mix them together, or change them a little bit.
That leaves you with many new sets of instructions to try. And, hopefully, they will have better scores than the previous sets.
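The "change them a little bit" step could look like this sketch (the action characters match the hypothetical encoding above):

```java
// copy a good instruction set and randomly flip a few of its actions
char[] mutate(char[] parent, float rate) {
  char[] actions = { 'L', 'R', 'J', '.' };
  char[] child = parent.clone();
  for (int i = 0; i < child.length; i++) {
    if (random(1) < rate) { // with small probability, swap in a random action
      child[i] = actions[(int) random(actions.length)];
    }
  }
  return child;
}
```

Mixing two parents together works similarly: copy the first half of one parent's array and the second half of the other's into a new child.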
Maybe the scoring system is flawed - perhaps jumps shouldn't count for points. Or maybe falling in the pit is okay if you jump back out. Maybe you score a point when you get from one platform to the other. What's the best strategy in that case? Do you think you could evolve a set of instructions that would score well?
your way is you telling it what to do. I want the machine to learn not to fall in the pits and learn not to die from enemies.
There is also an AI library by quark.
Google "Quark's Place", iirc
http://www.lagers.org.uk/ai4g/index.html
ty. I guess I can't find someone to teach me this even though I've spent hours searching. I don't wanna cheat and use a library, but in the end I guess I'll have to.
If you look at the book section, there's The Nature of Code, a book that's also online. It has a chapter on neural networks, iirc.
ty, I guess my teacher was correct when he said most code these days is copied and not written. I wanna write it.
...then I recommend reading that chapter and then writing it....
I've been reading on it. They are great at saying what it does, not how to make it do it.
What is it that you need to know?
Do you not understand how to make the Player move based on some instructions?
Do you not understand how to give a moving player a score based on its fitness?
Do you not understand how to have many instructions?
Do you not understand how sets of instructions can change?
It is not easy to help you. Please be very clear about what you are having trouble with.
I need to know how to make it learn like in the Mario version. He didn't have timers or commands telling it what to do. It didn't even know the buttons until a few rounds after it was made and played. His version learned the buttons and learned from dying. BTW, I said what I wanted like 3 times, each time the same thing.
I think you are really misunderstanding the mario example and how it worked, and that is why you are not understanding the answer to your question which you have been given many times.
Learning requires a goal. You don't have to tell the learning algorithm which button is jump, or what jump means, or when to do it. You do have to:

- give it buttons it can press,
- give it some way to sense what's in its scene, and
- give it feedback on how good or bad an attempt was (a score).
If all a genetic algorithm does is blindly replay a sequence of button presses and get told "you died" at the end...
...then your algorithm might be able to memorize a fixed-layout, no-random-elements endless runner level. However, it would not be able to learn how to play an endless runner game, because you haven't given it anything to learn from -- it is blind, and you are just periodically screaming "you died!" while it mashes a button. @Tfguy44 already told you that.
For anything more complex -- or for more complex goals, like getting a high score -- you probably need more feedback to get good performance. The Mario example is giving tons of feedback to learn from: rightward progress, when a generation dies, the presence of objects in the scene, etc.
More complex is more complex. If you give your algorithm senses, give it buttons to mash, and let it know what important things / events exist in its scene and how good or bad a job it did with its mashing, then it can "learn" to mash better. But you have to do those things. No goals and no feedback? No learning. Make goals. Make feedback. You have already gotten a lot of advice on how.
I was going to tell it the buttons. I just don't want to say timer = something, then jump. BTW, what is BizHawk? I found it on the Mario site and it linked to GitHub.
It's an emulator: you can play a game in emulation -- not on the original platform it was released on -- use Lua scripting (among other things) to pass in controller input from your AI, and record / replay sessions.
If you are working in Java, you could use the Robot class to generate normal system input. https://docs.oracle.com/javase/7/docs/api/java/awt/Robot.html
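Since Processing sits on Java, Robot works straight from a sketch. A minimal example (the function name is just for illustration):

```java
import java.awt.Robot;
import java.awt.AWTException;
import java.awt.event.KeyEvent;

Robot bot;

void setup() {
  try {
    bot = new Robot(); // may throw if the platform forbids synthetic input
  } catch (AWTException e) {
    e.printStackTrace();
  }
}

// taps the right arrow key as if a human pressed it;
// whatever window has focus receives the keystroke
void tapRight() {
  bot.keyPress(KeyEvent.VK_RIGHT);
  bot.keyRelease(KeyEvent.VK_RIGHT);
}
```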
wow, that is helpful. ok, now we need the system to know a 1 is good: it gets a 1 when it's closer to whatever the game's objective is, or a 0 when it dies. Then it gets there without me saying walk so-and-so feet and then jump to avoid death and get more 1s. Here's some nonworking code, but it's what I'm looking for. As you can see it has NO TIMERS.
```java
float action;

void setup() {
  size(300, 300);
}
```
so I want it to choose a button without knowing what the buttons do. The Robot class looks like it can take the button the program chooses and press it. Then it does other stuff: when getting a higher score it gets 1s, when it dies it gets 0s.
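Stitching those pieces together might look like this sketch. Everything in it -- the key list, `value`, `getReward()` -- is invented for illustration, and `getReward()` is a stub you'd have to wire up to real feedback from the game:

```java
import java.awt.Robot;
import java.awt.AWTException;
import java.awt.event.KeyEvent;

Robot bot;
// buttons the program can press, without knowing what any of them do
int[] keys = { KeyEvent.VK_LEFT, KeyEvent.VK_RIGHT, KeyEvent.VK_UP };
// a running total of how much reward each button has earned
float[] value = new float[keys.length];

void setup() {
  size(300, 300);
  try {
    bot = new Robot();
  } catch (AWTException e) {
    e.printStackTrace();
  }
}

void draw() {
  int choice = (int) random(keys.length); // pick a button blindly
  bot.keyPress(keys[choice]);
  bot.keyRelease(keys[choice]);
  value[choice] += getReward(); // credit the button with what followed
}

// stub: wire this to the game, returning 1 for progress and 0 for dying
float getReward() {
  return 0;
}
```

The learning part would then be biasing `choice` toward the buttons with the highest `value` instead of picking uniformly at random forever.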
I'm in Processing.