2d Metaball sketch running super slow in Python mode

edited April 2018 in Python Mode

Hi guys,

I made several Python versions of 2d metaball sketches I found here on the forum or on the CodingTrain Github.

Problem: the Python version runs super slow (<1 fps!) while all the Java versions run very smoothly (60 fps).

Questions:

  • Why is there such a difference in frame rates?

  • Is it possible to improve the Python code, or am I forever doomed for having chosen Python mode over Java mode?

original JAVA code

main tab:

Blob[] blobs = new Blob[10];

void setup() {
  size(640, 360);
  colorMode(HSB);
  for (int i = 0; i < blobs.length; i++) {
    blobs[i] = new Blob(random(width), random(height));
  }
}

void draw() {
  background(51);


  println(frameRate);

  loadPixels();
  for (int x = 0; x < width; x++) {
    for (int y = 0; y < height; y++) {
      int index = x + y * width;
      float sum = 0;
      for (Blob b : blobs) {
        float d = dist(x, y, b.pos.x, b.pos.y);
        sum += 10 * b.r / d;
      }
      pixels[index] = color(sum, 255, 255);
    }
  }

  updatePixels();

  for (Blob b : blobs) {
    b.update();
    //b.show();
  }
}

blob class

class Blob {
  PVector pos;
  float r;
  PVector vel;

  Blob(float x, float y) {
    pos = new PVector(x, y);
    vel = PVector.random2D();
    vel.mult(random(2, 5));
    r = random(120, 400);
  }

  void update() {
    pos.add(vel); 
    if (pos.x > width || pos.x < 0) {
      vel.x *= -1;
    }
    if (pos.y > height || pos.y < 0) {
      vel.y *= -1;
    }
  }

  void show() {
    noFill();
    stroke(0);
    strokeWeight(4);
    ellipse(pos.x, pos.y, r*2, r*2);
  }
}

PYTHON code

liste = []

def setup():
    size(640, 360, FX2D)
    colorMode(HSB)
    [liste.append(Blob(random(width), random(height))) for e in range(10)]

def draw():
    loadPixels()
    for x in range(width):
        for y in range(height):
            index = x + y * width
            sum = 0
            for e in liste:
                d = dist(x, y, e.pos.x, e.pos.y)
                try:
                    sum += 10 * e.r / d
                except ZeroDivisionError:
                    return 1
            pixels[index] = color(sum, 255, 255)
    updatePixels()

    for e in liste:
        e.update()

class Blob(object):
    def __init__(self, x, y):
        self.pos = PVector(x, y)
        self.velocity = PVector.random2D()
        self.velocity.mult(random(2, 5))
        self.r = random(120, 400)

    def update(self):
        self.pos.add(self.velocity)

        if self.pos.x > width or self.pos.x < 0:
            self.velocity.x *= -1
        if self.pos.y > height or self.pos.y < 0:
            self.velocity.y *= -1

Answers

  • edited April 2018

    try looks slow; why not use if instead?

    In Processing's Java mode we can use set() instead of pixels[] and skip loadPixels() and updatePixels().
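
A minimal sketch of the "if instead of try" suggestion, written as plain Python so it can be run outside Processing. The function name `field_sum`, the `(x, y, r)` tuple layout, and the `cap` value returned at a blob's exact centre are assumptions for illustration, not part of the original sketch. Note that the `return 1` in the Python sketch above would leave draw() before updatePixels() is reached; a guard like this avoids that early exit.

```python
import math

def field_sum(x, y, blobs, scale=10.0, cap=255.0):
    """Sum scale * r / d over all blobs, guarding d == 0 with an if.

    blobs is a list of (bx, by, r) tuples (hypothetical layout).
    cap is an assumed stand-in for "fully saturated" at a blob centre.
    """
    total = 0.0
    for bx, by, r in blobs:
        d = math.hypot(x - bx, y - by)
        if d == 0:
            # Exactly on a blob centre: skip the division entirely.
            return cap
        total += scale * r / d
    return total
```

As the follow-up reply reports, this kind of change did not noticeably affect the frame rate in practice; it mainly removes the early return.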

  • edited April 2018

    Hi Chrisir, thanks for your answer. The 'if' statement doesn't improve the performance at all, unfortunately. What I don't get is why there is such a difference in frame rates. It's well known that Python mode runs slower than Java (around 20% slower if I recall correctly) but here we're talking about a major difference (nearly 100%)!

  • edited April 2018

    Regarding the set() function, the Processing documentation indicates:

    Setting the color of a single pixel with set(x, y) is easy, but not as fast as putting the data directly into pixels[].

    Not sure whether they're talking performance-wise...

  • Looks like a Lava lamp

  • edited April 2018

    The best I could figure out:

    • running the sketch using JAVA2D renderer
    • loading pixels within the setup() function
    • displaying a pixel every 2 or 3 pixels
    • avoiding the ZeroDivisionError exception by adding 1 to the distance (d)

    With 2 tiny metaballs on a skimmed 640*360 canvas I can get up to 23 fps.

    With 10, I'm still running at 1 fps or lower... ridiculous, really.

    liste = []
    
    def setup():
        size(640, 360, JAVA2D)  # size() should be the first call in setup()
        background(0)
        colorMode(HSB)
        [liste.append(Blob(random(width), random(height))) for e in range(2)]
        loadPixels()
    
    def draw():
        for x in range(0, width, 2):
            for y in range(0, height, 4):
                index = x + y * width
                sum = 0
                for e in liste:
                    d = dist(x, y, e.pos.x, e.pos.y)
                    sum += 20 * e.r / (d+1)
                pixels[index] = color(sum)
        updatePixels()
    
        for e in liste:
            e.update()
    
    class Blob(object):
        def __init__(self, x, y):
            self.pos = PVector(x, y)
            self.velocity = PVector.random2D().mult(random(2, 5))
            self.r = random(120, 400)
    
        def update(self):
            self.pos.add(self.velocity)
    
            if self.pos.x > width or self.pos.x < 0:
                self.velocity.x *= -1
            if self.pos.y > height or self.pos.y < 0:
                self.velocity.y *= -1
    
  • edited April 2018

    I tried set() yesterday in java instead of pixels[] and it was faster

    I also tried int instead of float but it was slower

    (I would have thought differently)

  • edited April 2018

    I tried it as you suggested, but in Python mode it runs a tiny bit slower... at least on my computer.

  • @solub --

    in Python mode it runs a tiny bit slower... at least on my computer.

    How much is "a tiny bit"? That might be expected when running on Jython, right? Earlier you said:

    It's well known that Python mode runs slower than Java (around 20% slower if I recall correctly)

  • edited April 2018

    Hi Jeremy,

    With set() the sketch is running at 19/20 fps against 21/22 fps with loadPixels(). Again this is with a skimmed 640*360 canvas and only 2 metaballs.

    As soon as I increase the pixel density (full or larger canvas) or the number of metaballs, the frame rate drops drastically (6/8 fps, then 1 fps).

    So far, the biggest improvement was to move loadPixels() into the setup() function (preventing the pixels from being loaded at each iteration). The sketch then gained 4 fps (lol).
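
To put the drop in perspective, here is a small back-of-the-envelope count of how many distance evaluations the inner loop performs per frame in the two configurations discussed in this thread. The function name `evaluations_per_frame` is made up for illustration; the numbers come straight from the loop bounds in the sketches above.

```python
def evaluations_per_frame(w, h, n_blobs, x_step=1, y_step=1):
    """Count inner-loop distance evaluations for one draw() call."""
    cols = len(range(0, w, x_step))
    rows = len(range(0, h, y_step))
    return cols * rows * n_blobs

# Original sketch: every pixel of a 640x360 canvas, 10 blobs.
full = evaluations_per_frame(640, 360, 10)
# Reduced sketch: every 2nd column, every 4th row, 2 blobs.
skimmed = evaluations_per_frame(640, 360, 2, 2, 4)
print(full, skimmed, full // skimmed)
```

The full version does about 2.3 million evaluations per frame against 57,600 for the skimmed one, a factor of 40, so any fixed per-operation overhead in Python mode is multiplied accordingly.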

  • edited April 2018 Answer ✓

    Interesting. Yes, the difference between the originals (even after dropping the "try") is extremely dramatic. Perhaps something to do with the cost of explicit int vs. implicit float math operations -- or just Jython overhead? Not sure.

    @JonathanFeinberg might have some idea why.

  • edited April 2018

    I tend to think it has to do with some Jython overhead... not sure about the explicit int vs implicit float math operations.

  • edited April 2018 Answer ✓

    There are a couple of reasons why the way Jython was implemented on JVM has ended up so slow. :(
    I know at least these 3 below: :-&

    1. Numbers are objects, and new ones are instantiated for every single math operation!
    2. All variables are volatile, meaning the JVM has to make sure each variable access is synchronized in memory.
    3. There is some overhead converting Java native types and method calls to Jython.
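
If those points hold, one thing that might help (untested here, so treat it as an assumption rather than a measured fix) is to do less work per pixel that crosses the Java/Jython boundary: copy the blob data into plain tuples once per frame and compute the distance with local arithmetic instead of calling dist() and reading PVector fields for every pixel. The function name `compute_field` and the `(x, y, r)` tuple layout are hypothetical; the `+ 1.0` in the denominator is the same zero-division workaround mentioned earlier in the thread.

```python
from math import sqrt

def compute_field(w, h, blobs, scale=10.0):
    """Return a flat, row-major list of field sums for a w x h grid.

    blobs is a list of (bx, by, r) tuples (hypothetical layout).
    """
    # Pre-multiply the radius once per frame instead of once per pixel.
    data = [(bx, by, scale * r) for (bx, by, r) in blobs]
    _sqrt = sqrt  # local alias avoids a global lookup in the hot loop
    out = [0.0] * (w * h)
    index = 0
    for y in range(h):
        for x in range(w):
            total = 0.0
            for bx, by, k in data:
                dx = x - bx
                dy = y - by
                # +1.0 keeps the denominator non-zero at a blob centre.
                total += k / (_sqrt(dx * dx + dy * dy) + 1.0)
            out[index] = total
            index += 1
    return out
```

In a sketch, the returned list could then be mapped to colors and copied into pixels[] in one pass; whether this actually gains anything under Jython would need to be measured.
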
  • I see.

    To be honest, I really love both Processing and Python. Developing a Python mode for Processing was a great idea, and the more I learn about programming and creative coding, the more I wish Processing could run native Python.

  • edited April 2018

    There are some other Processing / Python projects in the works... check out p5py from last year's GSOC 2017.

    I'm a really big fan of the Jython (Python 2.7) Processing mode. It doesn't have great performance, but you can still do amazing things with it, despite its very particular limitations and constraints.

    See also the GSOC 2018 proposal: https://forum.processing.org/two/discussion/26778/proposal-gsoc-2018-application-arihant-parsoya-native-python-and-processing

  • Thanks Jeremy!

  • edited April 2018

    So I found a way to display 10+ metaballs at 60fps on a large canvas using:

    • a "marching squares" algorithm
    • linear interpolations (to get smoother contours)

    (the stroke is supposed to be black, the yellow color is a glitch from the gif conversion)

    Unfortunately, I couldn't find a way to fill the blobs...

    n_points, c_size, w, h = 10, 20, 800, 600
    points = []
    cells = [[0 for e in range((w/c_size) + c_size)] for f in range((h/c_size) + c_size)]
    
    def setup():
        size(w, h, OPENGL)
        [points.append(Point()) for p in range(n_points)]
        strokeWeight(3)
        smooth(8)
    
    def draw():
        background(255)
    
        for p in points:
            p.update()
    
        for i, x in enumerate(range(0, width + c_size, c_size)):
            for j, y in enumerate(range(0, height + c_size, c_size)):
                cells[i][j] = 0
                for p in points:
                    cells[i][j] += p.r / dist(x, y, p.pos.x, p.pos.y)
    
        for i, x in enumerate(range(0, width, c_size)):
            for j, y in enumerate(range(0, height, c_size)):    
                # Build the 4-bit case index: bit 0 = bottom-left, 1 = bottom-right, 2 = top-right, 3 = top-left
                case = 0
                case += 1 if cells[i][j + 1] >= 1 else 0
                case += 2 if cells[i + 1][j + 1] >= 1 else 0
                case += 4 if cells[i + 1][j] >= 1 else 0
                case += 8 if cells[i][j] >= 1 else 0
    
                if case == 1 or case == 14:
                    x1 = x
                    y1 = y + c_size * ((1 - cells[i][j]) / (cells[i][j + 1] - cells[i][j]))
                    x2 = x + c_size * ((1 - cells[i][j + 1]) / (cells[i + 1][j + 1] - cells[i][j + 1]))
                    y2 = y + c_size
                    line(x1, y1, x2, y2)
    
                if case == 2 or case == 13:
                    x1 = x + c_size
                    y1 = y + c_size * ((1 - cells[i + 1][j]) / (cells[i + 1][j + 1] - cells[i + 1][j]))
                    x2 = x + c_size * ((1 - cells[i][j + 1]) / (cells[i + 1][j + 1] - cells[i][j + 1]))
                    y2 = y + c_size
                    line(x1, y1, x2, y2)
    
                if case == 3 or case == 12:
                    x1 = x
                    y1 = y + c_size * ((1 - cells[i][j]) / (cells[i][j + 1] - cells[i][j]))
                    x2 = x + c_size
                    y2 = y + c_size * ((1 - cells[i + 1][j]) / (cells[i + 1][j + 1] - cells[i + 1][j]))
                    line(x1, y1, x2, y2) 
    
                if case == 4 or case == 11:
                    x1 = x + c_size * ((1 - cells[i][j]) / (cells[i + 1][j] - cells[i][j]))
                    y1 = y
                    x2 = x + c_size
                    y2 = y + c_size * ((1 - cells[i + 1][j]) / (cells[i + 1][j + 1] - cells[i + 1][j]))
                    line(x1, y1, x2, y2)
    
                if case == 5:
                    x1 = x + c_size * ((1 - cells[i][j]) / (cells[i + 1][j] - cells[i][j]))
                    y1 = y
                    x2 = x
                    y2 = y + c_size * ((1 - cells[i][j]) / (cells[i][j + 1] - cells[i][j]))
                    line(x1, y1, x2, y2)
                    x1 = x + c_size
                    y1 = y + c_size * ((1 - cells[i + 1][j]) / (cells[i + 1][j + 1] - cells[i + 1][j]))
                    x2 = x + c_size * ((1 - cells[i][j + 1]) / (cells[i + 1][j + 1] - cells[i][j + 1]))
                    y2 = y + c_size
                    line(x1, y1, x2, y2)
    
                if case == 6 or case == 9:
                    x1 = x + c_size * ((1 - cells[i][j]) / (cells[i + 1][j] - cells[i][j]))
                    y1 = y
                    x2 = x + c_size * ((1 - cells[i][j + 1]) / (cells[i + 1][j + 1] - cells[i][j + 1]))
                    y2 = y + c_size
                    line(x1, y1, x2, y2)
    
                if case == 7 or case == 8:
                    x1 = x + c_size * ((1 - cells[i][j]) / (cells[i + 1][j] - cells[i][j]))
                    y1 = y
                    x2 = x
                    y2 = y + c_size * ((1 - cells[i][j]) / (cells[i][j + 1] - cells[i][j]))
                    line(x1, y1, x2, y2)
    
                if case == 10:
                    x1 = x + c_size * ((1 - cells[i][j]) / (cells[i + 1][j] - cells[i][j]))
                    y1 = y
                    x2 = x + c_size
                    y2 = y + c_size * ((1 - cells[i + 1][j]) / (cells[i + 1][j + 1] - cells[i + 1][j]))
                    line(x1, y1, x2, y2)
                    x1 = x
                    y1 = y + c_size * ((1 - cells[i][j]) / (cells[i][j + 1] - cells[i][j]))
                    x2 = x + c_size * ((1 - cells[i][j + 1]) / (cells[i + 1][j + 1] - cells[i][j + 1]))
                    y2 = y + c_size
                    line(x1, y1, x2, y2)
    
    class Point(object):
        def __init__(self):
            self.r = random(15, 30)
            self.pos = PVector(random(self.r, width - self.r), random(self.r, height - self.r))
            self.vel = PVector.random2D().mult(random(4, 6))
    
        def update(self):
            self.pos += self.vel
    
            # Bounce off the canvas edges (these checks belong inside update())
            if self.pos.x > width - self.r or self.pos.x < self.r:
                self.vel.x *= -1
            if self.pos.y > height - self.r or self.pos.y < self.r:
                self.vel.y *= -1
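
The two building blocks of the sketch above, the 4-bit case index and the linear interpolation along a cell edge, can be isolated as small plain-Python helpers. This is only an illustrative sketch: the names `crossing` and `case_index` are made up here, and the bit layout simply mirrors the one used in the code above (1 = bottom-left, 2 = bottom-right, 4 = top-right, 8 = top-left).

```python
def crossing(a, b, level=1.0):
    """Fraction (0..1) along an edge where the field crosses `level`.

    a and b are the field values at the two ends of the edge; they must
    differ, which holds whenever one is below `level` and the other is not.
    """
    return (level - a) / float(b - a)

def case_index(tl, tr, br, bl, level=1.0):
    """4-bit marching squares case from the four corner values."""
    case = 0
    if bl >= level:
        case += 1
    if br >= level:
        case += 2
    if tr >= level:
        case += 4
    if tl >= level:
        case += 8
    return case
```

For example, the left-edge crossing in the sketch corresponds to `y + c_size * crossing(cells[i][j], cells[i][j + 1])`, which would replace the repeated `(1 - a) / (b - a)` expressions.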
    