Journeys.. world air and train travel

edited November 2013 in Share Your Work

Just finished a new stage of this piece.

[image: Journeys1]

(better quality capture here)

It's an exploration of tweeted journeys across the world presented in a schematic format (so a trip from the US to Japan is logically correct but not a physical "great circle"). It's based on work I did last year to pull geolocated tweets and show them on a Mercator projection. The extension I have been working on for the last month links those tweets with a departure and an arrival location: I scan the tweet for words like "airport", "terminal" etc., then check the time and distance travelled, matching user ids to store the journey. I also show the tweet body itself, located on the map. It's great fun just to watch, although I don't think it could ever work on a live website as the matching computation is expensive. I also recently added rail tracking, which is something I found even more interesting.
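To make the matching idea concrete, here is a minimal sketch in Processing of how such a keyword-and-plausibility check might look. The Tweet and Journey classes, the field names and the thresholds are all assumptions for illustration, not the actual data model or rules used in the piece.

    class Tweet {
      String userId, body;
      long time;          // epoch milliseconds
      float lat, lon;
    }

    class Journey {
      Tweet depart, arrive;
      Journey(Tweet d, Tweet a) { depart = d; arrive = a; }
    }

    String[] travelWords = { "airport", "terminal", "boarding", "departure", "arrival" };

    boolean looksLikeTravel(String body) {
      String lower = body.toLowerCase();
      for (String w : travelWords) {
        if (lower.indexOf(w) >= 0) return true;
      }
      return false;
    }

    // Great-circle distance in km, used here only as a plausibility check.
    float distanceKm(float lat1, float lon1, float lat2, float lon2) {
      float dLat = radians(lat2 - lat1);
      float dLon = radians(lon2 - lon1);
      float a = sq(sin(dLat / 2)) + cos(radians(lat1)) * cos(radians(lat2)) * sq(sin(dLon / 2));
      return 6371 * 2 * atan2(sqrt(a), sqrt(1 - a));
    }

    // Try to pair a departure tweet with a later arrival tweet from the same user.
    Journey tryMatch(Tweet depart, Tweet arrive) {
      if (!depart.userId.equals(arrive.userId)) return null;
      if (!looksLikeTravel(depart.body) || !looksLikeTravel(arrive.body)) return null;
      float hours = (arrive.time - depart.time) / 3600000.0f;
      float km = distanceKm(depart.lat, depart.lon, arrive.lat, arrive.lon);
      // far enough apart to be a trip, close enough in time to be one journey (assumed thresholds)
      if (km > 100 && hours > 0 && hours < 48) return new Journey(depart, arrive);
      return null;
    }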

I take all the journeys and, at runtime, animate across the entire stored timeline, adding to it as new journeys are identified.
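As a sketch of what that playback loop might look like, advancing a playhead through the stored timeline each frame (drawJourney() and the simple Mercator mapping below are stand-ins for the real rendering, and the playback rate is an assumed value):

    long playhead;                           // current position in the timeline (epoch millis)
    long millisPerFrame = 60 * 60 * 1000L;   // one hour of data per frame (assumed rate)
    ArrayList<Journey> journeys = new ArrayList<Journey>();   // assumed sorted by departure time

    void draw() {
      background(0);
      playhead += millisPerFrame;
      for (Journey j : journeys) {
        if (j.depart.time > playhead) break;   // everything after this hasn't departed yet
        float t = constrain((playhead - j.depart.time) / (float) (j.arrive.time - j.depart.time), 0, 1);
        drawJourney(j, t);
      }
    }

    // Straight line for brevity; the real piece draws schematic routes.
    void drawJourney(Journey j, float t) {
      float x1 = lonToX(j.depart.lon), y1 = latToY(j.depart.lat);
      float x2 = lonToX(j.arrive.lon), y2 = latToY(j.arrive.lat);
      stroke(255, 120);
      line(x1, y1, lerp(x1, x2, t), lerp(y1, y2, t));
    }

    // Simple Mercator projection onto the sketch window.
    float lonToX(float lon) { return map(lon, -180, 180, 0, width); }
    float latToY(float lat) {
      float y = log(tan(PI / 4 + radians(lat) / 2));
      return map(y, -PI, PI, height, 0);
    }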

I store all the data in very large JSON object arrays. One array holds the geolocated messages, around 130,000 of them at roughly 71 MB on disk. Another holds the journeys, around 35,000 of them at roughly 12 MB. Finding matches across 130,000 entries is expensive.
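For reference, loading one of those arrays with Processing 2.0's built-in JSON API might look roughly like this, reusing the Tweet class sketched above (the file name and field names are guesses at the schema, not the actual format):

    ArrayList<Tweet> tweets = new ArrayList<Tweet>();

    void loadTweets() {
      JSONArray raw = loadJSONArray("tweets.json");   // ~71 MB, ~130,000 entries
      for (int i = 0; i < raw.size(); i++) {
        JSONObject o = raw.getJSONObject(i);
        Tweet t = new Tweet();
        t.userId = o.getString("user");
        t.body   = o.getString("body");
        t.lat    = o.getFloat("lat");
        t.lon    = o.getFloat("lon");
        tweets.add(t);
      }
      println(tweets.size() + " tweets loaded");
    }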

I run an offline process to remove old posts without matches and re-check all journey matches. This is really expensive as I am comparing 130,000 x 130,000 posts, and it takes around 45 minutes all told. It would be great if I could "patch" this into the main running instance instead of doing it offline.
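One possible way to fold that clean-up into the running instance would be to run it on a background thread with Processing's thread() and swap the result in between frames. This is only a sketch of the idea, with a hypothetical hasMatch() standing in for the re-check pass:

    volatile boolean cleanupDone = false;
    ArrayList<Tweet> cleanedTweets;

    void startCleanup() {
      cleanupDone = false;
      thread("runCleanup");                       // runs runCleanup() off the animation thread
    }

    void runCleanup() {
      ArrayList<Tweet> snapshot = new ArrayList<Tweet>(tweets);   // work on a copy, not the live list
      ArrayList<Tweet> keep = new ArrayList<Tweet>();
      for (Tweet t : snapshot) {
        if (hasMatch(t)) keep.add(t);             // drop old posts that never matched
      }
      cleanedTweets = keep;
      cleanupDone = true;
    }

    void applyCleanupIfReady() {                  // call from draw()
      if (cleanupDone) {
        tweets = cleanedTweets;                   // one cheap reference swap on the draw thread
        cleanupDone = false;
      }
    }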

The system was running at 4 FPS until I converted all the drawing to the new PShape, which took the frame rate to 24 FPS. The issue is that upon adding a new journey I get an enormous stutter, as the matching code has to run and the PShape arrays have to be rebuilt. I'll be trying to load-balance that over a set number of frames in the near future.
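One way to spread that cost out would be to rebuild a fixed budget of shapes per frame into a PShape GROUP rather than rebuilding everything in the frame where a journey arrives. A rough sketch, with buildJourneyShape() standing in for whatever builds one arc:

    PShape journeyGroup;          // created once with createShape(GROUP) in setup()
    int rebuildIndex = 0;         // next journey whose shape still needs building
    int rebuildBudget = 500;      // shapes built per frame; tune against the stutter

    void rebuildSome() {          // call once per frame from draw()
      if (rebuildIndex >= journeys.size()) return;            // nothing pending
      int end = min(rebuildIndex + rebuildBudget, journeys.size());
      for (int i = rebuildIndex; i < end; i++) {
        journeyGroup.addChild(buildJourneyShape(journeys.get(i)));
      }
      rebuildIndex = end;
    }

Since new journeys are appended at the end of the list, rebuildIndex is already pointing at them, so they get picked up over the next few frames instead of in one expensive frame.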

Great work by the guys who rewrote PShape for 2.0... it's really fast.

Tons of work still to do... it's just an ongoing thing.


Answers

  • Answer ✓

    I am comparing 130,000 x 130,000

    Seems less than ideal. How are you doing this?

    Usually it's easy to reduce this to less than half: you don't need to match B with A if you've already tried to match A to B, and you can skip matching A to A. Basically, if these are laid out as a grid, you only need to iterate over the top-right triangle.

    That said, hashing on user id would mean only having to compare a user's tweets with that user's other tweets (see the sketch below).
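
    As a rough illustration, reusing the hypothetical Tweet / Journey / tryMatch() sketch from the post above, both ideas together could look like:

        HashMap<String, ArrayList<Tweet>> byUser = new HashMap<String, ArrayList<Tweet>>();
        for (Tweet t : tweets) {                    // bucket tweets by user id first
          ArrayList<Tweet> bucket = byUser.get(t.userId);
          if (bucket == null) {
            bucket = new ArrayList<Tweet>();
            byUser.put(t.userId, bucket);
          }
          bucket.add(t);
        }

        for (ArrayList<Tweet> bucket : byUser.values()) {
          for (int i = 0; i < bucket.size(); i++) {
            for (int j = i + 1; j < bucket.size(); j++) {   // top-right triangle only: never re-test B vs A
              Journey match = tryMatch(bucket.get(i), bucket.get(j));
              if (match != null) journeys.add(match);
            }
          }
        }

    With ~130,000 tweets spread over many users, the pair loop only runs inside each user's bucket, which is usually a tiny fraction of the full 130,000 x 130,000.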

  • Yes, I haven't even looked at optimising this part yet, but indeed I only search forward during the offline process: find A and look forward from A until I find B, so the search gets faster toward the end. Still a ton of easy options to speed it up.

  • Wow!....incredible work. :-bd

  • Answer ✓

    Impressive!
