I'm just taking a look at Processing. I'm most interested in dynamic data visualization for social change. Can Processing go out on the web and scrape or mash up site data easily? I took a quick look at the libraries but didn't see much for that. I don't want to have to learn yet another language (or relearn one) just to do the data scraping. I know some Python, but I don't want to be switching back and forth.
Answers
Take a look at the Processing example "XMLYahooWeather". Maybe it will spark some interest.
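The same idea in a minimal sketch using Processing's built-in XML support (loadXML is the Processing 2+ API; the URL and tag names here are only placeholders, not from that example):

void setup() {
  // Load an XML feed and print one field from each item
  XML feed = loadXML("https://example.com/feed.xml");
  XML[] items = feed.getChildren("item");
  for (XML item : items) {
    String title = item.getChild("title").getContent();
    println(title);
  }
}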
Parsing XML is a bit different than parsing HTML (unless that's XHTML, of course).
For the latter, you can take a look at the jsoup Java library.
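Since Processing sketches are Java, jsoup can be used directly. A rough sketch, assuming the jsoup jar has been dropped into the sketch's code folder; the URL and CSS selector are placeholders:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

void setup() {
  try {
    // Fetch the page and pick out elements by CSS selector
    Document doc = Jsoup.connect("https://example.com").get();
    Elements headlines = doc.select("h2.headline");
    for (Element h : headlines) {
      println(h.text());
    }
  } catch (IOException e) {
    println("Fetch failed: " + e.getMessage());
  }
}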
I would definitely recommend Python for the scraping part, especially this module.
Just dump the data you need to JSON (or XML, for that matter) and read it with Processing.
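On the Processing side, reading a JSON dump is short with the built-in loadJSONArray (Processing 2+); the file name and keys below are just placeholders for whatever your Python script writes out:

void setup() {
  // Assumes data.json is an array of objects, e.g. [{"name": "...", "value": 42}, ...]
  JSONArray rows = loadJSONArray("data.json");
  for (int i = 0; i < rows.size(); i++) {
    JSONObject row = rows.getJSONObject(i);
    println(row.getString("name") + ": " + row.getInt("value"));
  }
}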
I've done this a couple of times and, as PhiLo mentioned, jsoup is the way to do it (or any other similar library that suits you).
Bear in mind that if you want to collect data across multiple pages of a site, you should wait a few seconds (I wait at least 5) before requesting the next one; which pages a crawler is allowed to fetch (and sometimes a Crawl-delay) is listed in each site's robots.txt file. A sketch of that crawl loop is below.
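A rough sketch of that pattern in a Processing sketch, again with jsoup; the URLs are placeholders and the 5-second pause is just the delay suggested above:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.io.IOException;

void setup() {
  String[] pages = { "https://example.com/page1", "https://example.com/page2" };
  for (String url : pages) {
    try {
      Document doc = Jsoup.connect(url).get();
      println(url + " -> " + doc.title());
    } catch (IOException e) {
      println("Skipping " + url + ": " + e.getMessage());
    }
    delay(5000);  // be polite: pause ~5 seconds between requests
  }
}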
Moreover, I think this kind of visualization calls for threads, so the scraping doesn't block the draw loop.
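Processing's built-in thread() helper runs a named sketch function in the background so draw() keeps animating. A minimal sketch of the idea (fetchData, latest, and the URL are made-up names for illustration):

String latest = "loading...";

void setup() {
  size(400, 200);
  thread("fetchData");  // run fetchData() on a background thread
}

void draw() {
  background(0);
  text(latest, 20, height/2);
}

// Runs off the animation thread; store the result in a field for draw() to show
void fetchData() {
  String[] lines = loadStrings("https://example.com/data.txt");
  if (lines != null && lines.length > 0) {
    latest = lines[0];
  }
}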