I am not quite sure about the formal procedure (I will hopefully figure it out), but I have an idea for an extension that, in my experience, would be really helpful for many Processing users.
I am a mathematician currently continuing my studies with a bachelor's degree in media art. My current focus is data analysis and, most importantly, visualization. Creative, good-looking, interactive visualizations play a key role in understanding high-dimensional, complex data sets. Besides being an interactive and beautiful form of narration, they are a major tool for extracting meaning from information.
A big drawback of this workflow is that there is always a large abstraction gap between wrangling and analyzing the data and visualizing it in the end. This makes feedback loops in your work (generate your visualization, detect errors, switch back to the data...) really annoying and time-consuming. Also, in my view, it leaves Processing in the design / arts hemisphere, while it would be a great tool for visualization, especially if regular Processing users like artists and designers were able to work on data-related tasks easily.
So my suggestion for a contribution is a library that works similarly to pandas, or perhaps one that simply provides a framework / interface for handling data in Processing more easily.
Possible starting points could be:
Get data the easy way: it should not matter whether your data comes from a CSV file, a JSON file, a REST web service, or an SQL or NoSQL database. You should have one main function to get your data into Processing as a data frame.
Process data the easy way: a data frame gives you an easy-to-use interface for performing operations on your data (building sums, grouping the data, filling NaN entries, running SQL queries on the data frame...). Ideally it should also provide a set of standard statistics functions.
Provide output the easy way: this is where Processing plays the key role. If your processed data can easily be converted to a custom output format (native array, list, file...) that can then be easily passed to the standard Processing graphics functions, the prototyping feedback loop for good visualization would be a lot faster and more effective, and it would include a lot more designers and artists who are not so familiar with data wrangling the hard way...
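To make the three points above a bit more concrete, here is a rough sketch of what such an interface could feel like. It is written in plain Python (standard library only) just for brevity; it is not an existing Processing or pandas API. The names Frame, load, fillna, group_sum and column are all made up for illustration, and a real library would of course cover more sources (REST, SQL, NoSQL) and more operations.

```python
import csv
import io
import json
from collections import defaultdict


class Frame:
    """Hypothetical minimal data frame: just a list of row dictionaries."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]

    @classmethod
    def from_csv(cls, text):
        # Empty CSV cells become None so they can be filled later.
        reader = csv.DictReader(io.StringIO(text))
        return cls(
            {k: (v if v != "" else None) for k, v in row.items()}
            for row in reader
        )

    @classmethod
    def from_json(cls, text):
        # Assumes the JSON document is an array of flat objects.
        return cls(json.loads(text))

    def fillna(self, column, value):
        # Return a new frame with missing entries in `column` replaced.
        return Frame(
            {**r, column: value if r.get(column) is None else r[column]}
            for r in self.rows
        )

    def group_sum(self, key, column):
        # Sum `column` per distinct value of `key`.
        totals = defaultdict(float)
        for r in self.rows:
            totals[r[key]] += float(r[column])
        return dict(totals)

    def column(self, name, cast=float):
        # Export one column as a plain list, ready for drawing code.
        return [cast(r[name]) for r in self.rows]


def load(text, fmt):
    """One entry point, regardless of the source format (hypothetical)."""
    loaders = {"csv": Frame.from_csv, "json": Frame.from_json}
    return loaders[fmt](text)


# The same data arriving in two formats ends up in the same structure.
csv_text = "city,visitors\nBerlin,10\nBerlin,\nParis,7\n"
json_text = (
    '[{"city": "Berlin", "visitors": 10},'
    ' {"city": "Berlin", "visitors": null},'
    ' {"city": "Paris", "visitors": 7}]'
)

frame_a = load(csv_text, "csv").fillna("visitors", 0)
frame_b = load(json_text, "json").fillna("visitors", 0)

totals = frame_a.group_sum("city", "visitors")   # per-city sums
heights = frame_a.column("visitors")             # plain list, e.g. for bar heights
```

The point of the sketch is only the shape of the workflow: one load call for any source, a few chainable operations on the frame, and a plain list at the end that could be handed straight to drawing code.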
What do you think about such an extension?
If you don't really know what I have in mind yet, have a look at the pandas library for Python, which does exactly this. But since Python is not as well suited for visualization as Processing, merging the benefits of both would be total awesomeness! :-)
Edit: You can find the proposal here: http://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/jonaskoehler/5629499534213120