applicable rounding method prior to exporting Double data?

edited April 2014 in Using Processing

I've got a boids model and I'm exporting some analysis measures to a CSV file, but because I'm exporting the Double data type in order to get means with decimals, the output files are far too large to open or do anything with. E.g. I just made a CSV file that's 34GB, and neither Excel nor SPSS will open it!

I've read lots on different rounding methods, most of which looked far too complicated for my needs here and now, and as a beginner recently fallen in the deep end (biology student not CS student!), it could take me longer than I've got to resolve this apparently simple issue.

Would setRoundingMode(RoundingMode roundingMode) be applicable? Or ".2sf", if so, where do I put it?

The following snippets hold all the relevant code, I think, but my whole file is here: http://piratepad.net/2qUISlwOK7 Thanks!

What I'm aiming for is to export fewer bytes to the csv file, so that I can record long model runs without making files too big to be able to open them. 3 or 4 significant figures would be fine.

I guess that means converting the Double data to String data and applying a rounding mode in the process, but there seem to be so many options for how to do rounding up and I don't understand which one(s) is applicable.

thanks!!
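For readers with the same question, here is a minimal sketch of one way to round a double to a set number of significant figures before turning it into text. It is not taken from the original sketch: the class name RoundingSketch and the helper toSigFigs are hypothetical, and BigDecimal with a MathContext is just one of several standard-library options. RoundingMode.HALF_UP rounds to the nearest value, while RoundingMode.DOWN simply truncates.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// Hypothetical helper class, not part of the original boids sketch.
class RoundingSketch {
    // Round a finite double to sigFigs significant figures and return plain decimal text.
    // Note: new BigDecimal(double) throws NumberFormatException for NaN or Infinity,
    // which a mean can become if it is computed as 0.0 / 0.
    static String toSigFigs(double value, int sigFigs, RoundingMode mode) {
        BigDecimal rounded = new BigDecimal(value).round(new MathContext(sigFigs, mode));
        return rounded.toPlainString();
    }

    public static void main(String[] args) {
        double mean = 2.368546;
        System.out.println(toSigFigs(mean, 4, RoundingMode.HALF_UP)); // prints 2.369 (nearest)
        System.out.println(toSigFigs(mean, 4, RoundingMode.DOWN));    // prints 2.368 (truncated)
    }
}
```

For summary statistics like mean group size, HALF_UP is probably the least surprising choice; the other modes mostly matter for financial or interval arithmetic.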

      public double [] meangroupsize = new double [2];
      // declare an array of ArrayList<Double>, one list per Type (i.e. normal and parasitized), to record mean group sizes per timestep
      public ArrayList<Double>[] meanGroupSizePerType;

then...

    int [] totalgroupsizes = new int [2];
    for (int i = 0; i < boids.size(); i++) {
      Agent boid = (Agent) boids.get(i);
      if (boid.type == 1) {
        totalgroupsizes[0] += transGroupSize[i];
      }
      else if (boid.type == 2) {
        totalgroupsizes[1] += transGroupSize[i];
      }
    }

    //    double [] meangroupsize = new double [2];
    meangroupsize [0] = (double) totalgroupsizes[0] / countNormal();
    meangroupsize [1] = (double) totalgroupsizes[1] / countParasitized();

    meanGroupSizePerType[0].add(meangroupsize[0]);
    meanGroupSizePerType[1].add(meangroupsize[1]);

And then

void finish() {
    // build table of meanGroupSizePerType
    PrintWriter pw1 = null;
    try {
      File meanGroupSizePerTypeCSV = new File("meanGroupSizePerTypeCSV.csv");
      FileWriter fw1 = new FileWriter(meanGroupSizePerTypeCSV, true);
      pw1 = new PrintWriter(fw1);
      for (int i = 0; i < meanGroupSizePerType[0].size(); ++i) { // why [0]? 
        for (int j = 0; j < meanGroupSizePerType.length; ++j) {
          pw1.print(meanGroupSizePerType[j].get(i) + ",");
        }
        pw1.println("");
      }
    } 
    catch (IOException e) {
      e.printStackTrace();
    } 
    finally {
      if (pw1 != null) {
        pw1.flush();  // Writes the remaining data to the file
        pw1.close();
      }
    }
//...load of other code
exit();
}
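As a rough illustration of how the export loop above could write shorter values, the sketch below formats one timestep as a CSV row with a fixed number of decimals. The class CsvRowSketch, the helper formatRow, and the choice of 3 decimals are all assumptions made for the example, not part of the original code. Locale.ROOT is passed so that the decimal separator is always a point; with some default locales String.format would write a comma instead, which would break the CSV columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of the original boids sketch.
class CsvRowSketch {
    // Build one CSV row: the value at index `row` from every column,
    // rounded to `decimals` decimal places, separated by commas (no trailing comma).
    static String formatRow(List<List<Double>> columns, int row, int decimals) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < columns.size(); j++) {
            if (j > 0) {
                sb.append(',');
            }
            sb.append(String.format(Locale.ROOT, "%." + decimals + "f", columns.get(j).get(row)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up example data: one list per type, one entry per timestep.
        List<List<Double>> meanGroupSizePerType = new ArrayList<>();
        meanGroupSizePerType.add(Arrays.asList(2.368546, 3.0)); // normal
        meanGroupSizePerType.add(Arrays.asList(1.5, 1.25));     // parasitized

        for (int i = 0; i < meanGroupSizePerType.get(0).size(); i++) {
            System.out.println(formatRow(meanGroupSizePerType, i, 3));
        }
        // prints:
        // 2.369,1.500
        // 3.000,1.250
    }
}
```

In the real sketch the returned string would be passed to pw1.println(...) instead of System.out.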

Answers

  • How much definition do you need? In other words, how many decimal places do you care about? Is 2.3 okay? 2.36? 2.368546?

    There are several ways to round, or just truncate (drop everything after a certain point). Which approach you take depends on how exact you need to be.

  • The answer I found eventually is:

    pw1.print(String.format("%.4f,", meanGroupSizePerType[j].get(i))); 
    

    The comma inside "%.4f," is instead of

    pw1.print(String.format("%.4f", meanGroupSizePerType[j].get(i)) + ","); 
    

    i.e. I'm saving to csv file, so I need the comma between each value.

    This gives me 4 bytes per datapoint (String of digits, 1 byte per digit, I think?) instead of 18 bytes for Double data type, I think.

    The file sizes have come down, which is what I needed. :)

  • Processing's own String functions such as nf() would take care of it too: [-(
    http://processing.org/reference/nf_.html

  • edited April 2014 Answer ✓

    "This gives me 4 bytes per datapoint (String of digits, 1 byte per digit, I think?) instead of 18 bytes for Double data type, I think."

    Each char of a String takes up 2 bytes (16-bit) in memory, since it's UTF-16!
    And if you use the primitive double, rather than the wrapper Double, it takes up 8 bytes (64-bit).
    And of course, a float is half of that, 4 bytes each (32-bit).
    However, none of that matters inside the file: there the text is typically written as UTF-8 (or the platform default encoding), where each digit, decimal point and comma takes 1 byte, so the number of characters written determines the file size instead! :P

  • Thanks! I had three different errors which were making the files too huge. Now got them down from 306GB (!) to 25KB for what was supposed to be the same number of timesteps.
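To make the point about bytes in the file concrete, here is a small sketch that counts how many UTF-8 bytes a value occupies once written as text, at full precision and after formatting. The class ByteCountSketch, the helper utf8Bytes, and the example value are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of the original boids sketch.
class ByteCountSketch {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        double mean = 7.0 / 3.0; // a mean with a long decimal expansion

        String full = String.valueOf(mean) + ",";                        // default text form plus comma
        String rounded = String.format(Locale.ROOT, "%.3f,", mean);      // "2.333,"

        System.out.println("in memory as double: " + Double.BYTES + " bytes");
        System.out.println("full text:    " + full + " -> " + utf8Bytes(full) + " bytes");
        System.out.println("rounded text: " + rounded + " -> " + utf8Bytes(rounded) + " bytes");
    }
}
```

Digits, the decimal point and the comma are all ASCII, so each costs exactly one byte in UTF-8; the saving per value is simply the number of characters dropped.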
