Minecraft type engine for a data visualization

I have been doing a lot of research lately, and it appears that for my project I need what amounts to a Minecraft-style engine (voxels, geometry instancing, octree, point cloud) that will let me visualize millions of data points representing polygons in 3D space, zoom through them to inspect any or all of the polygons, and query the data points by clicking on them. The polygons will all be the same shape, but their 3D rotation and face colors will depend on the data point. I previously wrote this in straight Processing, but kept running up against memory limitations after only a few thousand points, so I am restarting from scratch. Any and all suggestions, thoughts on direction, or EXAMPLES would be greatly appreciated. One last point: I need to load these datasets as rapidly as possible, as each set is a time frame of the data collection and I would like to see how the data change visually through the time periods.

My newest point format will be similar to: Point(x, y, z) {rotationX, rotationY, rotationZ: Int (0..360); var1, var2, var3: Int}
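One way to hold this format compactly is sketched below. It is only a sketch under stated assumptions: the class name PointStore is hypothetical, and it assumes the points fill a dense n x n x n grid, so x, y and z can be derived from the array index instead of being stored. Keeping the fields in primitive arrays avoids the per-object header and reference overhead that one Java object per point would carry.

```java
/** Hypothetical struct-of-arrays store for a dense n x n x n grid of points. */
final class PointStore {
    /** Three 2-byte rotations plus three 4-byte variables. */
    static final int BYTES_PER_POINT = 3 * Short.BYTES + 3 * Integer.BYTES;

    final int n;
    final short[] rotX, rotY, rotZ; // degrees, 0..360 fits easily in a short
    final int[] var1, var2, var3;

    PointStore(int n) {
        this.n = n;
        // multiplyExact throws instead of silently overflowing for large n
        int count = Math.multiplyExact(Math.multiplyExact(n, n), n);
        rotX = new short[count];
        rotY = new short[count];
        rotZ = new short[count];
        var1 = new int[count];
        var2 = new int[count];
        var3 = new int[count];
    }

    /** x, y, z are implied by the array position, so they cost no memory. */
    int index(int x, int y, int z) {
        return (z * n + y) * n + x;
    }

    void set(int x, int y, int z, int rx, int ry, int rz, int v1, int v2, int v3) {
        int i = index(x, y, z);
        rotX[i] = (short) rx;
        rotY[i] = (short) ry;
        rotZ[i] = (short) rz;
        var1[i] = v1;
        var2[i] = v2;
        var3[i] = v3;
    }
}
```

With this layout each point costs 18 bytes of payload, so a 40x40x40 grid (64,000 points) needs 1,152,000 bytes plus a small fixed overhead for the six arrays.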

By having these few variables for each point and correlating hash tables, I should be able to determine all the data needed for my project. Attached is an image of the last straight Processing result I created, to give an idea of the data visualization I need, except that it does not show any of the cubes rotated.



  • out of interest, what is the rotation going to indicate?

    and will it always be 40x40x40? (=64000)

  • you want to be able to click on the points to query them, but you also want to load rapidly to see some changes.

    I think these two goals contradict each other, or they are two different modes of the program.

    For the latter you can just make images of each frame and make a movie from it. So you don't have to worry about speed.

    Processing won't handle millions of data points well. Do you mean millions of data points at once or summed up over time?

    For rotation you can use PeasyCam.

    Chrisir ;-)

  • edited October 2015

    Well, I've written a Minecraft-like engine in Processing... okay, with a lot of "low level GL". It currently renders around 7123 chunks consisting of 16x16x16 voxels at a time in about 2-4ms, including entity AI, collision detection, particle effects, player input, loading and unloading, you name it. The world is theoretically endless (think Minecraft but with a limitless sky).

    [Screenshots: voxel-1, voxel-2]

    An old screenshot from the AI/Pathfinding tests. The little tanks are trying to get me. And because the pathfinding worked better than expected, they got me and pushed me over the edge of the corridor. And jumped to their deaths...

    The engine will load/generate all chunks in a certain radius around the player and store them in a spatial hash table. A spatial hash table is basically just a hash table that uses 3 coordinates as its key. It's pretty fast, memory efficient and garbage collector friendly (no micro-lags because of large GC intervals).
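    A minimal sketch of such a spatial hash table, in case it helps. This is not the engine's actual code: the class name SpatialHash and the 21-bits-per-axis key packing are assumptions made for illustration. It also leans on java.util.HashMap with boxed Long keys for brevity, which is less garbage collector friendly than an open-addressing table over primitive arrays would be.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    /** Hypothetical spatial hash: a hash table keyed by three integer coordinates. */
    final class SpatialHash<T> {
        private static final int BITS = 21;              // bits kept per axis
        private static final long MASK = (1L << BITS) - 1;

        private final Map<Long, T> cells = new HashMap<>();

        /** Packs x, y, z into one long; unique for coordinates in [-2^20, 2^20 - 1]. */
        static long key(int x, int y, int z) {
            return ((x & MASK) << (2 * BITS)) | ((y & MASK) << BITS) | (z & MASK);
        }

        void put(int x, int y, int z, T value) {
            cells.put(key(x, y, z), value);
        }

        /** Returns the stored value, or null if the cell is empty. */
        T get(int x, int y, int z) {
            return cells.get(key(x, y, z));
        }

        /** Removes and returns the stored value, or null if the cell was empty. */
        T remove(int x, int y, int z) {
            return cells.remove(key(x, y, z));
        }

        int size() {
            return cells.size();
        }
    }
    ```

    Only occupied cells take memory, so unloading a chunk is a single remove(cx, cy, cz) call, and negative coordinates work without any offsetting.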

    As a spatial hash table behaves just like an array, you can simply cast a ray along the camera/mouse direction to get the selected voxel. This is pretty fast. Actually, all of my (800-1200) entities do ray casts every frame to determine whether they can see the player or another entity.
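    The ray cast can be sketched roughly as follows. This uses the grid traversal usually attributed to Amanatides and Woo, which may or may not be what the engine does. The names VoxelRay, Grid and isSolid are hypothetical, and voxels are assumed to be unit cubes aligned to integer coordinates.

    ```java
    /** Hypothetical voxel picker: steps a ray through unit voxels one cell at a time. */
    final class VoxelRay {
        /** Minimal occupancy lookup; a spatial hash or dense array could back it. */
        interface Grid {
            boolean isSolid(int x, int y, int z);
        }

        /** Returns {x, y, z} of the first solid voxel within maxDist, or null if none. */
        static int[] cast(Grid grid, double ox, double oy, double oz,
                          double dx, double dy, double dz, double maxDist) {
            double len = Math.sqrt(dx * dx + dy * dy + dz * dz);
            if (len == 0) return null;
            dx /= len; dy /= len; dz /= len;

            int x = (int) Math.floor(ox), y = (int) Math.floor(oy), z = (int) Math.floor(oz);
            int sx = dx > 0 ? 1 : -1, sy = dy > 0 ? 1 : -1, sz = dz > 0 ? 1 : -1;

            // Ray length needed to cross one whole voxel along each axis.
            double tdx = dx != 0 ? Math.abs(1 / dx) : Double.POSITIVE_INFINITY;
            double tdy = dy != 0 ? Math.abs(1 / dy) : Double.POSITIVE_INFINITY;
            double tdz = dz != 0 ? Math.abs(1 / dz) : Double.POSITIVE_INFINITY;

            // Ray length to the first voxel boundary along each axis.
            double tmx = dx > 0 ? (x + 1 - ox) * tdx : dx < 0 ? (ox - x) * tdx : Double.POSITIVE_INFINITY;
            double tmy = dy > 0 ? (y + 1 - oy) * tdy : dy < 0 ? (oy - y) * tdy : Double.POSITIVE_INFINITY;
            double tmz = dz > 0 ? (z + 1 - oz) * tdz : dz < 0 ? (oz - z) * tdz : Double.POSITIVE_INFINITY;

            double t = 0;
            while (t <= maxDist) {
                if (grid.isSolid(x, y, z)) return new int[] {x, y, z};
                // Step into whichever neighbouring voxel the ray reaches first.
                if (tmx <= tmy && tmx <= tmz) { x += sx; t = tmx; tmx += tdx; }
                else if (tmy <= tmz)          { y += sy; t = tmy; tmy += tdy; }
                else                          { z += sz; t = tmz; tmz += tdz; }
            }
            return null;
        }
    }
    ```

    The cost grows with the number of voxels the ray passes through rather than with the size of the world, which is why running it for many entities per frame stays cheap.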

    The "rendering" is done via a self-written tessellator that creates a Vertex Buffer Object for each chunk in the spatial hash table and sends it to the graphics card. Then, of all the chunks stored on the graphics card, only the visible ones (the ones that intersect the view frustum) are rendered to the screen.
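    The visibility test for a chunk can be sketched as a box-versus-planes check. This is an illustration under assumptions, not the engine's code: FrustumCull and boxVisible are hypothetical names, the frustum is assumed to be given as planes {a, b, c, d} whose normals point inward (a point is inside when a*x + b*y + c*z + d >= 0), and extracting those planes from the camera matrices is not shown.

    ```java
    /** Hypothetical frustum culling helper for axis-aligned boxes such as chunks. */
    final class FrustumCull {
        /**
         * Returns false only when the box lies completely behind one of the planes.
         * The test is conservative: a box near a frustum corner may be reported
         * visible although it is just outside.
         */
        static boolean boxVisible(double[][] planes,
                                  double minX, double minY, double minZ,
                                  double maxX, double maxY, double maxZ) {
            for (double[] p : planes) {
                // Pick the box corner that lies furthest along the plane normal.
                double px = p[0] >= 0 ? maxX : minX;
                double py = p[1] >= 0 ? maxY : minY;
                double pz = p[2] >= 0 ? maxZ : minZ;
                // If even that corner is behind the plane, the whole box is outside.
                if (p[0] * px + p[1] * py + p[2] * pz + p[3] < 0) {
                    return false;
                }
            }
            return true;
        }
    }
    ```

    For a 16x16x16 chunk at chunk coordinates (cx, cy, cz), the box would run from (16*cx, 16*cy, 16*cz) to 16 units further along each axis, so one call decides whether the chunk's buffer gets drawn.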

    Conclusion: It's not hard to write a Minecraft-like engine; you'd just have to replace almost all of Processing's rendering, math, matrix, etc. functions because they are far too slow, or rather built for a different purpose. In case you still want to use Processing as a frame for your project, read a few tutorials about OpenGL in Java. You can get a reference to the OpenGL object via the following code:

    GL2GL3 gl = ((PJOGL)beginPGL()).gl.getGL2GL3();
    // Do some low level GL stuff
  • Answer ✓

    I've never contemplated anything on this scale - or had the time to, sadly - so I can't speak from experience; but I'd imagine doing this with millions of data points is going to be very hard. In practice I'd expect you could only ever keep the data for a subset in memory at any given time...

    In practice would being able to click on any point when you're zoomed out lead to a meaningful interaction for the user? Would they be able to distinguish enough difference between polygons to understand the relationship with the data displayed when they click one? Will they be able to easily click on the same polygon twice? We don't know the data you're working with, so not sure about the first question, but the answer to the others is almost certainly no.

    So perhaps you only need to start enabling interaction at a certain zoom level, and only on those polygons within a limited depth? In this case the number of polygons you'd be dealing with would be much smaller and therefore more manageable...

  • Thank you for all of the feedback and suggestions. The size of the dataset I am trying to analyze is approximately 10K X 10K X 10K points over thousands of years. I realize starting with this dataset is ridiculous, so I am starting with a very small subset.

    The polygons (cubes at this stage) have a different color on each face for easy HUMAN visual interaction. The polygons rotate in different directions as a result of interactions with neighboring polygons. I realize that rendering the full polygon (cube or other) is not necessary for ALL of the points. I am also trying to figure out how to show the 99% of the cubes that are not close enough as simple colored points, using the color of the face turned toward the camera; as they come close enough to the view, they would be rendered as polygons.
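    A rough sketch of that decision, to make the idea concrete. Everything here is an assumption for illustration: the names CubeLod, drawAsCube and facingFace are made up, the distance threshold is arbitrary, and the rotation convention (world = Rx * Ry * Rz * local, right-handed, angles in degrees) may not match what Processing's rotateX/rotateY/rotateZ calls produce in a given sketch.

    ```java
    /** Hypothetical level-of-detail helper for rotated, face-colored cubes. */
    final class CubeLod {
        static final String[] FACES = {"+X", "-X", "+Y", "-Y", "+Z", "-Z"};

        /** Near cubes get full geometry; everything else becomes a colored point. */
        static boolean drawAsCube(double distance, double threshold) {
            return distance <= threshold;
        }

        /**
         * Index into FACES of the face turned most directly toward the camera.
         * (dx, dy, dz) is the vector from the cube center to the camera in world space.
         */
        static int facingFace(double dx, double dy, double dz,
                              int rotXDeg, int rotYDeg, int rotZDeg) {
            double a = Math.toRadians(-rotXDeg);
            double b = Math.toRadians(-rotYDeg);
            double c = Math.toRadians(-rotZDeg);

            // Undo the cube's rotation: inverse X first, then inverse Y, then inverse Z.
            double x1 = dx;
            double y1 = dy * Math.cos(a) - dz * Math.sin(a);
            double z1 = dy * Math.sin(a) + dz * Math.cos(a);

            double x2 = x1 * Math.cos(b) + z1 * Math.sin(b);
            double y2 = y1;
            double z2 = -x1 * Math.sin(b) + z1 * Math.cos(b);

            double x3 = x2 * Math.cos(c) - y2 * Math.sin(c);
            double y3 = x2 * Math.sin(c) + y2 * Math.cos(c);
            double z3 = z2;

            // In the cube's own frame the facing side is the dominant axis of the vector.
            double ax = Math.abs(x3), ay = Math.abs(y3), az = Math.abs(z3);
            if (ax >= ay && ax >= az) return x3 >= 0 ? 0 : 1;
            if (ay >= az) return y3 >= 0 ? 2 : 3;
            return z3 >= 0 ? 4 : 5;
        }
    }
    ```

    A far cube would then be drawn as a single point in the color assigned to FACES[facingFace(...)], and switched to a full cube once drawAsCube returns true.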

    The PRIMARY setup of the cloud will be reading the data in from a database/spreadsheet and instantiating the environment. As points are influenced, they will change their rotations (therefore changing their properties and color).

    The reason for the visualization is that we (the team) theorize that, similar to sound waves, as we change the orientation of cubes, their rotation may/will change neighboring cubes and cause waves of some type to ripple for given distances across the world. One of the issues is that rotating a blue-facing red-topped cube clockwise will have a different effect on its neighbors than rotating a yellow-facing red-topped cube will, etc. With this amount of data, looking at just the numbers is impossible, so some type of visual analysis is required.

    I currently use PeasyCam for zooming around/through the world and have started seeing trends, but due to the size of the world, I keep running out of memory. The manipulation of the data happens on a 64GB RAM drive independent of the visualization using a different program, so that is the easy part. When I want to render the entire world and fly through/around it, that is where the problems pop up. I am assuming that if I use an Octree system with culling and some other type of graphics area selection, I can load the section that I am actually viewing into the GPU and select elements from that point. This is all a work in progress and I appreciate all of the input.
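    A minimal point octree along those lines might look like the sketch below. It is only an illustration under assumptions: the class name Octree, the leaf capacity of 8 and the minimum node size are arbitrary choices, and uploading the selected points to the GPU is not shown.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    /** Hypothetical point octree: insert points, then fetch those inside a box. */
    final class Octree {
        private static final int CAPACITY = 8;        // points per leaf before splitting
        private static final double MIN_HALF = 1e-6;  // stop splitting below this size

        private final double cx, cy, cz, half;        // cube center and half edge length
        private final List<double[]> points = new ArrayList<>();
        private Octree[] kids;                        // null while this node is a leaf

        Octree(double cx, double cy, double cz, double half) {
            this.cx = cx; this.cy = cy; this.cz = cz; this.half = half;
        }

        /** Returns false if the point lies outside this node's cube. */
        boolean insert(double x, double y, double z) {
            if (Math.abs(x - cx) > half || Math.abs(y - cy) > half || Math.abs(z - cz) > half) {
                return false;
            }
            add(new double[] {x, y, z});
            return true;
        }

        private void add(double[] p) {
            if (kids == null) {
                if (points.size() < CAPACITY || half < MIN_HALF) { points.add(p); return; }
                split();
            }
            kids[childIndex(p)].add(p);
        }

        private int childIndex(double[] p) {
            return (p[0] >= cx ? 1 : 0) | (p[1] >= cy ? 2 : 0) | (p[2] >= cz ? 4 : 0);
        }

        private void split() {
            double h = half / 2;
            kids = new Octree[8];
            for (int i = 0; i < 8; i++) {
                kids[i] = new Octree(cx + ((i & 1) != 0 ? h : -h),
                                     cy + ((i & 2) != 0 ? h : -h),
                                     cz + ((i & 4) != 0 ? h : -h), h);
            }
            for (double[] p : points) kids[childIndex(p)].add(p);
            points.clear();
        }

        /** Appends every stored point inside the inclusive box [min, max] to out. */
        void query(double[] min, double[] max, List<double[]> out) {
            if (cx + half < min[0] || cx - half > max[0] || cy + half < min[1]
                    || cy - half > max[1] || cz + half < min[2] || cz - half > max[2]) {
                return; // this whole subtree is outside the box
            }
            for (double[] p : points) {
                if (p[0] >= min[0] && p[0] <= max[0] && p[1] >= min[1] && p[1] <= max[1]
                        && p[2] >= min[2] && p[2] <= max[2]) {
                    out.add(p);
                }
            }
            if (kids != null) for (Octree k : kids) k.query(min, max, out);
        }
    }
    ```

    A query skips whole subtrees that fall outside the requested box, so fetching the region around the camera does not require touching every point in the world.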

    Blindfish: you are absolutely correct in your analysis as well. The first step is to set up the simulation and watch for waves and trends by flying around and through the world until we see a wave/trend that we want more details on. When we see one, we would stop the time factor and click on a particular polygon of interest to get the identifier for that element, the time iteration, and the X/Y/Z rotation parameters. Knowing this, we could redo the simulation and concentrate on just that area to understand more of what is going on, including exporting smaller datasets for more detailed analysis.

    Any and all additional thoughts/comments are appreciated.

  • Answer ✓

    10K X 10K X 10K

    how big is your monitor? mine is 1366x768. at one pixel per datapoint you can only see slightly more than a millionth of the data set at any one time...

  • Koogs,

    Very good point. My monitor is a typical 3840x2160, so there is never really a time that I would need to see more than 2K X 2K X 2K, so I probably need to make that my maximum visual area. Does anybody know if there is any culling available to optimize the display of the objects?
