How can I specify (increase) the maximum row count in a table in Processing?

edited November 2017 in Questions about Code

I am using processing.org to collect a large amount of data and put it in a table. I then write this table to a CSV file. The code I have works great until I get around 1 million rows in my table. After that I get messages in the console saying:

"Note: setting maximum row count to 1,310,720 (resize took 1,654 ms)".

This makes my program take ages to run. I added a setRowCount() call to my code, but this did not remedy my issue; it gave me an empty table. Here is my code with the added line:

// stuff is a large array of values, > 1 million entries
Table t = new Table();
t.setRowCount(10000000);
t.addColumn("Stuff");
for (int i = 0; i < stuff.length; i++) {
  TableRow newRow = t.addRow();
  newRow.setFloat("Stuff", stuff[i]);
}
saveTable(t, "data.csv");

Answers

  • Answer ✓

    Hmmmm 8-|

    Can you make two files?

    Maybe tell us more about your process. Are you creating this table in memory and then saving it? How many entries do you want to create? Only 1E6, or maybe 1E9? Even if you save this data in a CSV and try to open it in Excel, Excel will scream at you. Do you plan to work with big data sets?

    I'll stick to my simple solution for now.

    Kf

  • https://github.com/processing/processing/blob/3ec89e51447c73a814db4d1186d15e40b0073ba8/core/src/processing/data/Table.java#L2284

    If the goal is performance, perhaps (untested) loop through the data and write it out to a series of files one million rows at a time (as @kfrajer suggests). Then concatenate the files on the filesystem, e.g. using exec().
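
    A rough sketch of that idea (untested), reusing the stuff array from the original post; the chunk size and file names are arbitrary:

    // Write stuff[] out as a series of CSV files, one chunk of 1,000,000 rows
    // per file, so no single Table has to resize past the warning threshold.
    int chunkSize = 1000000;
    int fileIndex = 0;
    for (int start = 0; start < stuff.length; start += chunkSize) {
      Table chunk = new Table();
      chunk.addColumn("Stuff");
      int end = min(start + chunkSize, stuff.length);
      for (int i = start; i < end; i++) {
        TableRow newRow = chunk.addRow();
        newRow.setFloat("Stuff", stuff[i]);
      }
      saveTable(chunk, "data_" + fileIndex + ".csv");
      fileIndex++;
    }

    Since saveTable() writes the column title as a header row when columns are named, those repeated headers would need to be stripped when concatenating.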

  • My process is receiving real time data from an Arduino reading a sensor over WiFi which I store in an array when I receive it. Once I have all the data I put it in a table and then save that table to a csv file.

    At the moment my application needs 1,080,000 entries; however, I would like to find a solution that leaves room for dealing with larger numbers of entries.

    I don't plan on opening it in Excel; I will open it in MATLAB and plot sections of the data.

    From what you suggest, it seems that it is not possible to increase the maximum row count of a table.

    I think the approach of writing to two files and then concatenating them should work, but I was hoping to just extend my table size.

    Is there any particular reason why the table in Processing can only deal with 1 million entries before it begins to print warnings / become slow?

    Thanks for the replies

  • Answer ✓

    You can create your own custom class, and save its content via saveStrings():
    https://Processing.org/reference/saveStrings_.html
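
    For example (untested), you could skip Table entirely and format the same stuff array as one line per value:

    // Build one String per sample and write them all in a single call.
    String[] lines = new String[stuff.length];
    for (int i = 0; i < stuff.length; i++) {
      lines[i] = str(stuff[i]);
    }
    saveStrings("data.csv", lines);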

  • Does anyone know why the method I used in my original code block does not support more than 1,000,000 rows?

  • edited November 2017 Answer ✓

    Well, I mean, it says if (newCount > 1000000) { right in the code block that I linked. Or am I not understanding your follow-up question?

    Processing's Table class is designed to warn you that trying to build large tables in memory may be RAM-expensive and slow -- and that depends not on your system RAM, but on your Java RAM allocation. It often makes more sense to do something stream-based and flush periodically to disk. For example, design assuming that you may want to deal with an unlimited number of chunks of 10,000 rows each (which are also conveniently spreadsheet-sized). In this case, you want to process 108 such chunks, but that could change...
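
    A minimal sketch of that pattern (untested); record() and newChunk() are made-up names standing in for wherever the samples actually arrive:

    int chunkSize = 10000;   // rows kept in memory before flushing to disk
    int chunkIndex = 0;      // suffix for the output file names
    Table chunk;

    void setup() {
      chunk = newChunk();
    }

    Table newChunk() {
      Table t = new Table();
      t.addColumn("Stuff");
      return t;
    }

    // Call once per received sample.
    void record(float value) {
      TableRow newRow = chunk.addRow();
      newRow.setFloat("Stuff", value);
      if (chunk.getRowCount() >= chunkSize) {
        saveTable(chunk, "data_chunk_" + chunkIndex + ".csv");
        chunkIndex++;
        chunk = newChunk();  // start an empty table for the next chunk
      }
    }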

  • Answer ✓

    You don't even need a custom class, given that you appear to just be creating a big array of floats.

    Once I have all the data I put it in a table and then save that table to a csv file.

    Or just write them as you receive them.
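
    Something like this (untested), using createWriter(); onSample() is just a placeholder for wherever the UDP library hands you a value:

    PrintWriter output;

    void setup() {
      output = createWriter("data.csv");
      output.println("Stuff");   // header row, to match the Table version
    }

    // Placeholder hook: call with each float as it arrives.
    void onSample(float value) {
      output.println(value);
    }

    // Call once the capture is finished.
    void finish() {
      output.flush();
      output.close();
    }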

  • Thank you all for the helpful replies. I will implement the solution where I use multiple files to avoid the warnings being printed and slowing things down. I'm attempting to learn by asking these follow-up questions; apologies if they are annoying.

    jeremydouglass: So keeping large amounts of data in memory like this is considered bad practice and should be avoided at all costs?

    koogs: The reason I keep it all in memory is that I am receiving a large amount of data over WiFi using UDP packets, and I want to support as many transmitting devices as possible. I thought storing the data in arrays would take less time per received packet than writing to a file, and would therefore give my code more time to receive packets, meaning that I could support more boards. Am I mistaken about the time taken to write to a file being greater than the time taken to put a float in an array?
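
    A quick way to check that assumption (untested sketch) is to time both operations with millis(); the numbers will vary by machine and disk:

    // Compare storing N floats in an array against printing them
    // through a writer created with createWriter().
    int n = 1080000;
    float[] buffer = new float[n];

    int t0 = millis();
    for (int i = 0; i < n; i++) {
      buffer[i] = i * 0.001f;
    }
    int arrayMillis = millis() - t0;

    PrintWriter out = createWriter("timing_test.csv");
    int t1 = millis();
    for (int i = 0; i < n; i++) {
      out.println(buffer[i]);
    }
    out.flush();
    out.close();
    int fileMillis = millis() - t1;

    println("array: " + arrayMillis + " ms, file: " + fileMillis + " ms");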

  • PS: my application does not allow for any interruption in the data, i.e. I have to receive a packet from each device once every 10 ms. I can accept missed packets now and again, but not a large gap in my data.

  • Re:

    large amounts of data in memory ... should be avoided at all costs?

    No. If you can fit everything in your Java memory allocation and write it out once, it isn't bad practice.

    But it doesn't scale, and you said you wanted a method that could scale in the future:

    I would however like to find a solution that leaves potential for me to deal with larger numbers of entries.

    If you want to receive an arbitrarily large amount of data, periodically writing to disk is key.
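
    In sketch form (untested), that can be as simple as buffering a fixed number of samples and flushing them between packets; onSample() is a placeholder for the actual UDP receive callback:

    PrintWriter out;
    float[] pending = new float[10000];  // small in-memory buffer
    int pendingCount = 0;

    void setup() {
      out = createWriter("data.csv");
    }

    // Called once per received sample (placeholder for the UDP callback).
    void onSample(float value) {
      pending[pendingCount++] = value;    // cheap: just an array store
      if (pendingCount == pending.length) {
        flushPending();                   // occasional disk write
      }
    }

    void flushPending() {
      for (int i = 0; i < pendingCount; i++) {
        out.println(pending[i]);
      }
      out.flush();
      pendingCount = 0;
    }

    The per-sample cost stays close to a plain array store; the disk work only happens once every 10,000 samples.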
