A small optimization breaks my douglas peucker algorithm

edited April 2017 in Python Mode

Near line 110, I have output[i] = PVector(sx, sy)

If I change that too:

   output[i].x = sx
   output[i].y = sy

Then the algorithm breaks, I can't understand why. Can someone take a look?

vecs   = []
points = []
buffer = [PVector()] * 5000

def setup():
    size(500, 500)
    print "okidoki"
    print buffer

def draw():

    if len(vecs) > 3:
        for v in vecs:
            vertex(v.x, v.y)

    if len(points) > 3:
        epsilon = map(mouseX, 0, width, 0, 100)
        result = douglas_peucker(points, epsilon, buffer)

        for v in result:
            vertex(v.x, v.y)


def mouseDragged():
    vecs.append(PVector(mouseX, mouseY))

def mouseReleased():
    global vecs
    del points[:]
    for v in vecs:
    vecs = []

def douglas_peucker(points, epsilon, buffer):

    douglas_peucker_algo(points, 0, len(points)-1, epsilon, buffer);

    result = []

    # copy from the buffer to the result
    # skipping overlapping points
    last_x = float("inf")
    last_y = float("inf")

    for i in range(len(points)):
        v = buffer[i]

        if (v.x != last_x or v.y != last_y):
          result.append(PVector(v.x, v.y))
        last_x = v.x;
        last_y = v.y;

    return result;

# returns the number of insert points made between the start and the end
def douglas_peucker_algo(points, start_index, end_index_inc, epsilon, output):

    # find the point with the max distance
    d_max = 0
    index = -1

    for i in range(start_index+1, end_index_inc):
        d = dist_to_segment_squared(points[i].x, points[i].y, points[start_index].x, points[start_index].y, points[end_index_inc].x, points[end_index_inc].y)
        if (d > d_max):
          index = i
          d_max = d

    d_max = sqrt(d_max)

    # recursively simplify
    if (d_max > epsilon):
        douglas_peucker_algo(points, start_index,     index,     epsilon, output) # front
        douglas_peucker_algo(points,       index, end_index_inc, epsilon, output) # back
        # Collapse all points to the start except the last.
        # Only overlapping points will be ignored later.
        sx = points[start_index].x
        sy = points[start_index].y
        for i in xrange(start_index, end_index_inc):
          output[i] = PVector(sx, sy)
          # why do the lines below break it?
          # output[i].x = sx
          # output[i].y = sy

        # add the last point
        output[end_index_inc] = PVector(points[end_index_inc].x, points[end_index_inc].y)

def dist_sq( x1,  y1,  x2,  y2):
    return sq(x1 - x2) + sq(y1 - y2)

def dist_to_segment( px,   py,  lx1,   ly1,  lx2,  ly2):
    return sqrt(dist_to_segment_squared(px, py, lx1, ly1, lx2, ly2))

# inspired by http://stackoverflow.com/a/1501725/1022707
def dist_to_segment_squared( px,  py,  lx1,  ly1,  lx2,  ly2):
    lineDist = dist_sq(lx1, ly1, lx2, ly2);

    if (lineDist == 0):
        return dist_sq(px, py, lx1, ly1)

    t = ((px - lx1) * (lx2 - lx1) + (py - ly1) * (ly2 - ly1)) / lineDist

    if (t < 0): return dist_sq(px, py, lx1, ly1)
    if (t > 1): return dist_sq(px, py, lx2, ly2)

    return dist_sq(px, py, lx1 + t * (lx2 - lx1), ly1 + t * (ly2 - ly1))


  • edited April 2017 Answer ✓

    The culprit is at your line #3: buffer = [PVector()] * 5000
    All of those 5000 indices are actually the very same PVector object reference! @-)

    It means if you mutate the PVector from 1 of the indices, that's mirrored for the other 4999! :O)

    At line #108: output[i] = PVector(sx, sy), you're creating a new PVector object and re-assigning it back to list output[] (which is an alias to the global buffer[] btW), replacing 1 of the PVector clones at that current index i. ;;)

    Those reassignments are thus fixing 1 by 1 the cloning bug from line #3. :P

    Therefore, if you wanna use output[i].set(sx, sy) in place of output[i] = PVector(sx, sy), in order to avoid unnecessarily instantiating more PVector objects by re-using them, create global buffer[] this way: buffer = tuple(PVector() for i in range(5000)) *-:)

  • I belive it's because of pointerlike behavior

  • Ah thanks @GoToLoop , I'm quite new to python.

    I use this now: buffer = [PVector() for i in range(5000)].

  • edited April 2017
    • That 1 would create a List instead of a Tuple. :\">
    • BtW, that feature's called list comprehension. :-B
    • In general, if we don't intend to replace any of the elements later, a Tuple assures us we don't do it inadvertently. L-)
    • For better performance, rather than replacing a PVector w/ another, use its method set() in order to "recycle" it. *-:)
Sign In or Register to comment.