We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
Screen scraping
Aug 30th, 2007, 12:39pm
 
Have I missed something? Given that screen scraping is a pretty useful way to assemble new data sets, it's surprising that there is little (er...almost no) discussion of using it for Processing.

Can anyone recommend a library for easily scraping and parsing HTML - and not some clunky old Java library with a billion irrelevant classes and methods?

Has anyone created one for Processing? (I would, but it's beyond my skill level!) Links to examples of screen scraping in Processing would be nice too!

Cheers,
-Brock

Bonus points: sourcing data from RSS feeds
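
On the RSS point, here is a minimal sketch of one way to pull item titles out of a feed. It assumes plain Java's built-in XML parser (javax.xml.parsers) rather than any Processing library, and the class name RssTitles, the method itemTitles, and the sample feed are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RssTitles {
    // Hypothetical helper: returns the <title> text of every <item> in an RSS document.
    public static List<String> itemTitles(String rssXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // Look for <title> only inside this <item>, so the channel title is skipped.
                NodeList t = item.getElementsByTagName("title");
                if (t.getLength() > 0) {
                    titles.add(t.item(0).getTextContent().trim());
                }
            }
            return titles;
        } catch (Exception e) {
            // Wrap parser and I/O errors so callers need no checked-exception handling.
            throw new IllegalArgumentException("Could not parse RSS", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed; a real one would be downloaded first.
        String feed = "<rss><channel><title>Feed</title>"
            + "<item><title>First post</title></item>"
            + "<item><title>Second post</title></item>"
            + "</channel></rss>";
        for (String title : itemTitles(feed)) {
            System.out.println(title);
        }
    }
}
```

For a live feed, DocumentBuilder.parse also accepts a URI string, so the byte-array step could probably be replaced by passing the feed address directly.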
Re: Screen scraping
Reply #1 - Sep 2nd, 2007, 11:54am
 
Have you had a look at the Processing libraries page?
http://www.processing.org/reference/libraries/index.html

proHTML and proXML are probably just what you want.
Re: Screen scraping
Reply #2 - Sep 19th, 2007, 6:43pm
 
Yes, thanks. I've looked at the documentation for a good while, but the examples are so localized (i.e., only relevant to the specific method or class) that I can't work out how to put them to use for anything practical.

How, for example, would I use htmlList.getElements (or any other method) to find a particular string in a web page? If I am looking for "Temperature in Foo:" and I want to get the value for that temperature, how would I parse this and stuff it into a variable?

Thanks!
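
For what it's worth, here is a rough sketch of one way to do the "find a label, grab the number after it" step. It deliberately swaps in plain Java regular expressions instead of proHTML's htmlList.getElements, so it says nothing about how proHTML itself would do it. The class name TemperatureScrape, the method findValue, and the sample page are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemperatureScrape {
    // Hypothetical helper: returns the number that follows the label,
    // or null when the label (or a number after it) is not found.
    public static Float findValue(String html, String label) {
        // Replace tags with spaces so markup between label and number is ignored.
        String text = html.replaceAll("<[^>]*>", " ");
        // Label taken literally, optional whitespace, then an integer or decimal number.
        Pattern p = Pattern.compile(Pattern.quote(label) + "\\s*(-?\\d+(?:\\.\\d+)?)");
        Matcher m = p.matcher(text);
        return m.find() ? Float.valueOf(m.group(1)) : null;
    }

    public static void main(String[] args) {
        // Made-up page content; a real page would be downloaded first.
        String page = "<html><body><p>Temperature in Foo: <b>23.5</b> C</p></body></html>";
        Float temperature = findValue(page, "Temperature in Foo:");
        System.out.println(temperature); // prints 23.5
    }
}
```

Inside a Processing sketch the page text could probably be fetched with loadStrings() on the URL and glued together with join() before handing it to findValue. Stripping tags with a regex is crude and will trip on scripts or odd markup, so treat this as a starting point rather than a robust parser.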