We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
Getting data into processing
Jun 6th, 2007, 11:15pm
 
I work with a lot of data.  How do I get it into Processing?  All the examples I've seen thus far involve user-generated data.  However, I want to use Processing to display data I already have on file.  For example, if I wanted to create a simple bivariate scatterplot in Processing, what would be the best way of entering the data?
Re: Getting data into processing
Reply #1 - Jun 7th, 2007, 12:19am
 
http://processing.org/reference/loadStrings_.html or http://processing.org/reference/loadBytes_.html
Re: Getting data into processing
Reply #2 - Jun 7th, 2007, 1:01am
 
Thanks, John.  Does anyone want to offer a practical example of how to read in data and create a simple scatterplot/histogram/bar chart?
Re: Getting data into processing
Reply #3 - Jun 7th, 2007, 7:46am
 
first we'd need to know what data you have there, how is it saved? how big are your files?

depending on what you want to do with the graphic generated from the data the approach would be different. do you just want to save an image / pdf or is it going to be interactive?

normally i'd read the data into an array or object (depends on how complicated it is) at setup() then work with it in draw().

F
Re: Getting data into processing
Reply #4 - Jun 7th, 2007, 8:38am
 
I made something like this some years ago, so the code may be a little out of date. This example loads the data from the param attribute of the applet, but it will also work if you load an XML file from the net.

http://www.eskimoblood.de/test/applet/
Re: Getting data into processing
Reply #5 - Jun 7th, 2007, 5:59pm
 
Thank you, that was most helpful. I noticed that you had to write your own percentile and average functions, not to mention an entire boxplot program.  I typically do this type of work in R, where the following code would produce a nice boxplot:

> xx<-rnorm(50)
> boxplot(xx)

If I wanted to get the average or the first quartile I could use the following:

> mean(xx)
> quantile(xx,0.25)

I'm wondering if there has been any work (or if there is even any interest) in producing standardized functions (like in R) that would facilitate manipulating and displaying data.
Re: Getting data into processing
Reply #6 - Jun 7th, 2007, 6:09pm
 
To respond to Fjen: Yes, I'd like to make something interactive that would be delivered on the web.  For static images, I can use R, an environment I am already extremely comfortable using.  However, my business clients would be better served by an interactive display that allows them to investigate multidimensional data.  Most of my data sets are huge, but I'm not even sure how to get a small array (like the one below) into Processing.  Thanks for any help you can lend.

 "SLength","SWidth","PLength","PWidth","Species"
 6.4,3.1,5.5,1.8,"virginica"
 6.5,3,5.5,1.8,"virginica"
 5.6,3,4.5,1.5,"versicolor"
 6.6,3,4.4,1.4,"versicolor"
 5.4,3.9,1.3,0.4,"setosa"
 6.4,2.8,5.6,2.2,"virginica"
 5.6,2.8,4.9,2,"virginica"
 6.7,3,5.2,2.3,"virginica"
 4.9,3,1.4,0.2,"setosa"
 5.1,2.5,3,1.1,"versicolor"
Re: Getting data into processing
Reply #7 - Jun 7th, 2007, 7:21pm
 
Getting data such as that example in is fairly straightforward:

Code:
String[] data;
//start with empty arrays so that append() has something to extend.
float[] SLength = new float[0];
float[] SWidth = new float[0];
float[] PLength = new float[0];
float[] PWidth = new float[0];
String[] Species = new String[0];

void setup()
{
  //normal setup stuff..
  data = loadStrings("data.txt");

  //start from line 1 instead of line 0, since line 0 is the header, not data.
  for (int line = 1; line < data.length; line++)
  {
    //separate out the parts...
    String[] fields = split(data[line], ',');
    if (fields.length != 5)
    {
      println("Bad data, line " + line);
    }
    else
    {
      //add each bit to the relevant array.
      SLength = append(SLength, float(fields[0]));
      SWidth = append(SWidth, float(fields[1]));
      PLength = append(PLength, float(fields[2]));
      PWidth = append(PWidth, float(fields[3]));
      //need to strip the "s off: drop the first and last characters.
      String newSpecies = fields[4].substring(1, fields[4].length() - 1);
      Species = append(Species, newSpecies);
    }
  }
}


And there you have all the info in arrays, such that (for example) index 3 in each array corresponds to the 4th entry in your data file.
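
With the columns in parallel arrays, a scatterplot mostly comes down to rescaling each value to a pixel position. The following is a minimal sketch in plain Java, assuming a 400-pixel plot area. `ScatterScale`, `rescale`, and `toPixels` are illustrative names, not part of Processing; inside a sketch, Processing's own map() does the same linear rescaling, and point() or ellipse() would draw each x/y pair.

```java
class ScatterScale {
    // Linearly rescale v from the range [inMin, inMax] to [outMin, outMax].
    static float rescale(float v, float inMin, float inMax, float outMin, float outMax) {
        return outMin + (v - inMin) * (outMax - outMin) / (inMax - inMin);
    }

    // Convert a non-empty data column to pixel positions spanning outMin..outMax.
    static float[] toPixels(float[] values, float outMin, float outMax) {
        float lo = values[0], hi = values[0];
        for (float v : values) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        float[] px = new float[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant column has no spread; centre it instead of dividing by zero.
            px[i] = (hi == lo) ? (outMin + outMax) / 2
                               : rescale(values[i], lo, hi, outMin, outMax);
        }
        return px;
    }

    public static void main(String[] args) {
        float[] sLength = {6.4f, 6.5f, 5.6f, 6.6f, 5.4f};  // first rows of the sample data
        float[] x = toPixels(sLength, 0, 400);
        for (float v : x) System.out.println(v);
    }
}
```

For the vertical axis, passing the range reversed (for example `toPixels(values, 400, 0)`) puts larger values higher on screen, since pixel y coordinates grow downwards.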

As a pre-emptive stumbling-block removal, if you come to want to find all "virginica" you need to use <string>.equals() instead of ==, e.g.:

Code:
for(int i=0;i<Species.length;i++)
{
// *NOT* if(Species[i]=="virginica")
if(Species[i].equals("virginica"))
{
//do something with all i-th entries in the arrays
}
}


As for built-in functions: no, Processing doesn't have anything like R's, since R is a dedicated graphing language and Processing is an all-round tool.
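
Since nothing is built in, helpers like the mean() and quantile() calls mentioned earlier in the thread have to be written by hand. The following is a minimal sketch in plain Java, not an existing Processing library. `Stats`, `mean`, and `quantile` are illustrative names, the input is assumed to be non-empty, and the quantile uses linear interpolation between order statistics, which is assumed here to correspond to R's default ("type 7") definition.

```java
import java.util.Arrays;

class Stats {
    // Arithmetic mean of a non-empty array.
    static double mean(double[] x) {
        double sum = 0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Quantile for 0 <= p <= 1, interpolating linearly between the two
    // nearest order statistics of a sorted copy of the data.
    static double quantile(double[] x, double p) {
        double[] sorted = x.clone();          // leave the caller's array untouched
        Arrays.sort(sorted);
        double h = (sorted.length - 1) * p;   // fractional index into the sorted data
        int lo = (int) Math.floor(h);
        int hi = Math.min(lo + 1, sorted.length - 1);
        return sorted[lo] + (h - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        // PWidth column of the sample data posted above.
        double[] pWidth = {1.8, 1.8, 1.5, 1.4, 0.4, 2.2, 2.0, 2.3, 0.2, 1.1};
        System.out.println("mean = " + mean(pWidth));
        System.out.println("first quartile = " + quantile(pWidth, 0.25));
    }
}
```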
Re: Getting data into processing
Reply #8 - Jun 7th, 2007, 7:51pm
 
Thanks for the timely response, John.  Your example should get me started.  One last question: If I try to read 1 million records each having 100 fields, am I going to have any problems with the size?  I know that some programs (like R) read all of the data into memory whereas others do not (like SAS).  Does Processing have any space limitations I should be familiar with before loading in a lot of data?
Re: Getting data into processing
Reply #9 - Jun 7th, 2007, 8:00pm
 
Processing will load them all into memory. That much data might be too much for Processing to do any sort of "interactive" work with.

You might be better off doing the statistics and analysis in something else, then writing the results out to a file (e.g. box position/extents/error bars) and just showing that in Processing, rather than trying to load the raw data and do all the analysis in Processing.
Re: Getting data into processing
Reply #10 - Jun 7th, 2007, 8:06pm
 
Agreed.  Thanks for the info.
Re: Getting data into processing
Reply #11 - Jun 12th, 2007, 9:09pm
 
A couple of comments on this.

First, even if you do have the 100+ MB of free memory you'd need to do this, if you're using JohnG's code unaltered you're going to have a bit of a wait, because each of those "append()" calls actually allocates and fills a brand new array (which I'm sure he is quite aware of; for small data sets, his method is definitely the one to use).  By the time you reach the end of your dataset, you're allocating a new 1,000,000 element array for each additional data point, so I don't know if your loading will ever finish!

For large data sets like yours, you'd want to preallocate the array - assuming that there is one title row and the rest are valid data, you'd just say, for instance, float[] SLength = new float[data.length-1]; and then fill in the array as you traverse the String array (just replace each append() call with an assignment to the appropriate index).  If you don't know that each entry is valid you need to do something more complex, making a first pass through the array to calculate the number of valid entries, allocating the float[] array, and then making a second pass to actually populate that array.  The same two-pass method applies if you need to filter data.
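
The two-pass approach described above might look like the following minimal sketch in plain Java. `Preallocate`, `isValid`, and `parseColumn` are illustrative names. It assumes row 0 is the header, that a row is valid when it has the expected number of comma-separated fields, and that the chosen column of every valid row parses as a number.

```java
class Preallocate {
    // A row counts as valid if it has exactly expectedFields comma-separated fields.
    static boolean isValid(String row, int expectedFields) {
        return row.split(",", -1).length == expectedFields;
    }

    // Parse one numeric column, skipping the header (row 0) and invalid rows.
    static float[] parseColumn(String[] rows, int column, int expectedFields) {
        // Pass 1: count the valid rows so the array can be allocated exactly once.
        int count = 0;
        for (int i = 1; i < rows.length; i++) {
            if (isValid(rows[i], expectedFields)) count++;
        }

        // Pass 2: fill the preallocated array by index instead of calling append().
        float[] out = new float[count];
        int next = 0;
        for (int i = 1; i < rows.length; i++) {
            if (isValid(rows[i], expectedFields)) {
                String field = rows[i].split(",", -1)[column].trim();
                out[next++] = Float.parseFloat(field);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {
            "\"SLength\",\"SWidth\"",
            " 6.4,3.1",
            "incomplete row",
            " 5.6,3"
        };
        float[] sLength = parseColumn(rows, 0, 2);
        System.out.println(sLength.length + " valid rows");
    }
}
```

Each valid row is split twice here to keep the sketch short; for a very large file, caching the result of the validity check in a boolean[] would avoid the repeated work.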

On another note, though, if you don't actually need to use all of the data at any one time (this would _not_ be the case if you potentially needed random access to the array; in most cases you only need to process data in sequential order, and this is what I'm talking about), you can use the Java I/O functionality directly to do things like read the next character or line of data.  I'll refer you to http://java.sun.com/docs/books/tutorial/essential/io/ for more details on how to do this, as unfortunately Java's methods are not quite as simple to use as Processing's.  The main problem with this approach is that it necessarily binds the parsing phase to the calculation phase.  But this is usually preferable to hauling around thousands of megabytes of mostly irrelevant variables, so make what you will of it...
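
As a rough illustration of the sequential approach, here is a minimal sketch in plain Java that reads comma-separated text one line at a time and keeps only a running sum and count in memory. `StreamingMean` and `columnMean` are illustrative names. It assumes the first line is a header and that the chosen column is numeric on every non-blank line; for a real file, a `java.io.FileReader` would take the place of the `StringReader` used in `main`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class StreamingMean {
    // Return the mean of one numeric column without storing the rows.
    static double columnMean(Reader source, int column) throws IOException {
        double sum = 0;
        long count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            in.readLine();                                  // skip the header row
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) continue;        // skip blank lines
                String[] fields = line.split(",", -1);
                if (fields.length <= column) continue;      // skip rows that are too short
                sum += Double.parseDouble(fields[column].trim());
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) throws IOException {
        String csv = "\"SLength\",\"SWidth\"\n6.4,3.1\n6.5,3\n5.6,3\n";
        System.out.println("mean SWidth = " + columnMean(new StringReader(csv), 1));
    }
}
```

This shows the trade-off mentioned above: the parsing and the calculation happen in the same loop, so a different statistic means another pass over the file.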
Re: Getting data into processing
Reply #12 - Jun 12th, 2007, 11:31pm
 
Thank you.  This is good information to have on hand as I embark on processing large amounts of data.  You both saved me a lot of grief.  I'll probably start with small datasets and work my way up from there.