We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
IndexProgramming Questions & HelpOther Libraries › Trouble parsing XLS
Page Index Toggle Pages: 1
Trouble parsing XLS (Read 1576 times)
Trouble parsing XLS
Feb 1st, 2010, 4:25pm
 
I was looking into some of the Haiti Relief Data on the interwebs today and found an interesting excel document that has an extensive amount of data about who is contributing to the Relief Efforts in Haiti. The document, http://www.reliefweb.int/rw/fts.nsf/doc105?OpenForm&rc=2&emid=EQ-2010-000009-HTI..., unfortunately isn't well formatted for parsing. Below is my attempt to weed out all of the blank cells, but I still come up with Blank Cells and Errors.

Anyone more familiar with the XLS Reader Library have any suggestions on being more specific about Blank Cells through the getString() method

Below the source you can download the whole .pde package with the .xls file and structure. Running on 1.0.9

Code:

import de.bezier.data.*;

XlsReader reader;
String title;

void setup ()
{
 size( 600, 400 );

 noStroke();
 fill( 0 );
 background( 255 );
 smooth();

 reader = new XlsReader( this, "ohca.xls" );
 reader.showWarnings(false);

 reader.firstRow();
 reader.firstCell();

 title = reader.getString();
 println(title);

 while (reader.hasMoreRows()) {
   reader.nextRow();

   String r = reader.getString(); // Gather the row

   if(r != null) {
     int rL = r.length();
     // println(cityLength);
     if( rL <= 0 && r == "" ) {
       // println("Rows empty");
     }
     else {
       println(r);
       while( reader.hasMoreCells() ) {
         reader.nextCell();
         String c = reader.getString();
         if(c != null) {
           int cL = c.length();
           if(cL <= 0 && c != "") {
             // println("Cells Empty");
           }
           else {
             int pos = reader.getCellNum();
             println(c + " " + pos);
           }
         }
       }
     }
   }
 }
}

You can download the source code with the data here http://jonobr1.com/code/xlsreader.zip
Re: Trouble parsing XLS
Reply #1 - Feb 1st, 2010, 9:10pm
 
You can select all blank cells http://www.mrexcel.com/archive/Edit/15733.html and fill them with something, lets say "blank" and than just catch that in processing

btw i found it much easier to convert everything in the excel file to string cells and then simple convert them to float or int in processing.
Re: Trouble parsing XLS
Reply #2 - Feb 3rd, 2010, 11:44am
 
Thanks for the tips Cedric. Those seem like great ideas and converting everything to strings in xls made the process much quicker. I think that now my problem is that I may not be understanding how the Library operates properly. Below is my understanding of how to iterate through a sheet:

Code:

import de.bezier.data.*;

XlsReader reader;
String title;

void setup () {
 size( 600, 400 );

 noStroke();
 fill( 0 );
 background( 255 );
 smooth();

 reader = new XlsReader( this, "ohca.xls" );
 reader.showWarnings( false );

 reader.firstRow();
 reader.firstCell();

 title = reader.getString();
 println( title );

 while ( reader.hasMoreRows() ) {
   reader.nextRow(); // Iterate through the rows
   while( reader.hasMoreCells() ) {
     reader.nextCell(); // Iterate through the cells
     String c = reader.getString();
     if( c != "blank" ) {
       println( c );
     }
   }
 }
}


I'm thinking of it similar to a pixel array or video file from OpenCV or Jmyron where one while statement goes to the row and the nested one goes through the columns of that row. The result (what is printed in the console), however, is only the Cells that say "blank." The new xls doc as well as the pde can be found at http://jonobr1.com/code/xlsreader2.zip
Re: Trouble parsing XLS
Reply #3 - Feb 15th, 2010, 3:52am
 
UPDATE: I figured it out. Just needed to go to the first cell before the while statement. Everything above the Horizontal Rule is the update. Code below:

Code:

import de.bezier.data.*;

XlsReader reader;
String title;
ArrayList xls;

void setup ()
{
 size( 600, 400 );

 noStroke();
 fill( 0 );
 background( 255 );
 smooth();

 xls = new ArrayList();

 reader = new XlsReader( this, "ohca.xls" );
 reader.showWarnings(false);

 reader.firstRow();
 reader.firstCell();

 title = reader.getString();
 println(title);

 while (reader.hasMoreRows()) {
   reader.nextRow();

   String r = reader.getString(); // Gather the row

   if(r != null) {
     int rL = r.length();
     if( rL > 0 && r != "" ) {
       reader.firstCell();
       String f = reader.getString();
       xls.add(f);
       while( reader.hasMoreCells() ) {
         reader.nextCell();
         String c = reader.getString();
         if(c != null) {
           int cL = c.length();
           if(cL > 0 && c != "") {
             int pos = reader.getCellNum();
             xls.add(c);
           }
         }
       }
     }
   }
 }
 for(int i = xls.size() - 1; i >= 0; i--) {
   String s = (String)xls.get(i);
   println(s);
 }
}


Enjoy!

Okay, so after some tinkering and thinking about this problem I think I was able to circumvent it sort of...Basically, an ArrayList won't be affected by these errors and null spaces. The ArrayList spits out just the content!

Code:

import de.bezier.data.*;

XlsReader reader;
String title;
ArrayList xls;

void setup ()
{
 size( 600, 400 );

 noStroke();
 fill( 0 );
 background( 255 );
 smooth();

 xls = new ArrayList();

 reader = new XlsReader( this, "ohca.xls" );
 reader.showWarnings(false);

 reader.firstRow();
 reader.firstCell();

 title = reader.getString();
 println(title);

 while (reader.hasMoreRows()) {
   reader.nextRow();

   String r = reader.getString(); // Gather the row

   if(r != null) {
     int rL = r.length();
     if( rL > 0 && r != "" ) {
       while( reader.hasMoreCells() ) {
         reader.nextCell();
         String c = reader.getString();
         if(c != null) {
           int cL = c.length();
           if(cL > 0 && c != "") {
             int pos = reader.getCellNum();
             xls.add(c);
           }
         }
       }
     }
   }
 }
 for(int i = 0; i < 10; i++) {
   String s = (String)xls.get(i);
   println(s);
 }
}


There is a new issue that has arisen however. While the method, nextCell(), in the documentation says, "Select the next (or first) cell of the selected row of current sheet," when I println out my ArrayList the first element, the name of the donor, doesn't display. Below is the println output and in red are the missing items:


Haiti Relief Contibution List
Donor
Channel
Description
Funding
USD
Uncommitted
Pledges
USD
3M
NGOs
Working with key humanitarian partners like Project HOPE and MAP International, 3M has donated numerous boxes and cases containing medical supplies such as Nexcare bandages, 3M Tegaderm transparent dressings, sterile drapes, splints, medical tapes and respiratory protection products. 3M continues to work closely with its nonprofit partners to identify other 3M products that may be needed.
1000000.0
0.0
UN Agencies, NGOs and Red Cross
In-kind: Donations of medicines and nutritional products


The items in red all happen to be in the first column of my excel document. Now at the moment I don't think it's the library, I think I must be skipping the first column somewhere, but if anyone has run in to this problem or can see what the issue is I would be much appreciative! As always you can find a zipped version of the pde with the documents here: http://jonobr1.com/code/xlsreader3.zip
Page Index Toggle Pages: 1