Problem parsing a webpage
in Contributed Library Questions • 1 year ago
Hi!
I have a strange "problem" with a sketch. I am playing with a Java library, htmlparser. It's a really nice library, useful for parsing web pages, and IMHO it works much more smoothly than ProHTML, which gave me so many headaches that I gave up on it. To learn the library I am writing a little tool that parses The Big Picture (http://www.boston.com/bigpicture/) and brings one of the nice pictures posted there, picked at random, into the sketch; the concept is quite simple: it's a learning sketch.
The problem is that sometimes the sketch seems to take two links instead of one. The sketch takes its size from the dimensions of the picture, so in this case the dimensions passed to the size() function are the width and height of the first picture, but the image that finally appears in the sketch is the second one.
I can't figure out exactly what is going on here, and I'd like to know... any good ideas?
Thanks & regards · Ale
Code here (it may take a couple of attempts to make it fail the way I describe above):
import org.htmlparser.*;
import org.htmlparser.util.*;
import org.htmlparser.filters.*;
import org.htmlparser.nodes.*;

PImage base;
int sx, h, w;
ImageLoader iL;

void setup() {
  // Pick a picture from the page and size the sketch to its dimensions.
  UrlParser u = new UrlParser("http://www.boston.com/bigpicture/");
  w = u.getW();
  h = u.getH();
  sx = w;
  size(w, h, P2D);
  background(#000000);
  // Load the picked image in a separate thread.
  iL = new ImageLoader(u.getImgUrl());
  iL.start();
}

void draw() {
  // Once loading has finished, redraw with a shorter segment each frame.
  if (!iL.running() && sx > 0) {
    creaPixel();
    sx--;
  }
}

// Paint every row as horizontal segments of length sx, each coloured
// from the image pixel at the start of the segment.
void creaPixel() {
  for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x += sx) {
      color pix = iL.getImg().get(x, y);
      stroke(pix);
      line(x, y, x + sx, y);
    }
  }
}

// Parses the page and keeps the URL and dimensions of one random image.
class UrlParser {
  String route, url, tit;
  int wI, hI;

  UrlParser(String route) {
    this.route = route;
    selectImgUrl(route);
  }

  void selectImgUrl(String urlPath) {
    try {
      org.htmlparser.Parser ps = new org.htmlparser.Parser();
      ps.setURL(urlPath);
      // Collect every node with class="bpImage" and pick one at random.
      NodeList iList = ps.parse(new HasAttributeFilter("class", "bpImage"));
      int posicionUrl = int(random(iList.size()));
      TagNode imgNode = (TagNode) iList.elementAt(posicionUrl);
      url = imgNode.getAttribute("src");
      println(url);
      // Every character of the second argument is a delimiter, so only the
      // numbers in the style attribute remain: height first, then width.
      String[] attrs = splitTokens(imgNode.getAttribute("style"), "; px height: width:");
      hI = int(attrs[0]);
      println(hI);
      wI = int(attrs[1]);
      println(wI);
    }
    catch (Exception e) {
      println("Exception: " + e);
      e.printStackTrace();
    }
  }

  String getImgUrl() {
    return url;
  }

  int getW() {
    return wI;
  }

  int getH() {
    return hI;
  }

  String getTit() {
    return tit;
  }
}

// Loads the image in a background thread; running() stays true until done.
class ImageLoader extends Thread {
  String address;
  PImage base;
  boolean onoff = false;

  ImageLoader(String address) {
    this.address = address;
  }

  void start() {
    onoff = true;
    super.start();
  }

  void run() {
    if (onoff) {
      base = loadImage(address);
    }
    onoff = false;
  }

  boolean running() {
    return onoff;
  }

  PImage getImg() {
    return base;
  }
}
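
One thing that might be worth checking, offered as a hypothesis and not as a diagnosis: if setup() runs more than once, then new UrlParser(...) makes its random pick more than once, and the width and height from one pick could end up paired with the image URL from another. My assumption here (to be verified against the Processing version in use) is that older releases could restart setup() when size() asks for a non-default renderer such as P2D. The plain-Java sketch below only illustrates the "pick once, keep URL and dimensions together" idea; PickedImage and PickOnce are hypothetical names, not part of htmlparser or Processing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical holder: the URL and its dimensions travel together.
final class PickedImage {
  final String url;
  final int width;
  final int height;

  PickedImage(String url, int width, int height) {
    this.url = url;
    this.width = width;
    this.height = height;
  }
}

// Hypothetical helper: makes the random choice on the first call only
// and counts how often it was asked, which helps spot repeated calls.
final class PickOnce {
  private final List<PickedImage> candidates;
  private final Random random;
  private PickedImage cached; // null until the first pick
  private int calls;          // number of times pick() was invoked

  PickOnce(List<PickedImage> candidates, Random random) {
    if (candidates.isEmpty()) {
      throw new IllegalArgumentException("no candidates to pick from");
    }
    this.candidates = candidates;
    this.random = random;
  }

  synchronized PickedImage pick() {
    calls++;
    if (cached == null) {
      cached = candidates.get(random.nextInt(candidates.size()));
    }
    return cached;
  }

  synchronized int calls() {
    return calls;
  }
}
```

In the sketch itself, the same idea could be a sketch-level UrlParser field that is created only while it is still null, plus a println() at the top of setup() to see whether it really runs twice.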
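
A side note on the style parsing: the splitTokens() call above depends on the height appearing before the width in the style attribute, since it reads them by position. A minimal sketch of reading them by name instead, in plain Java, assuming the attribute looks something like `height:450px;width:990px;` (the exact format on the page is an assumption, and StyleDims is a hypothetical helper name):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads pixel width and height from an inline style string.
final class StyleDims {
  // Matches "width: 123px" or "height:123px", case-insensitively, but not
  // longer property names such as "max-width" or "line-height".
  private static final Pattern PROP =
      Pattern.compile("(?i)(?<![\\w-])(width|height)\\s*:\\s*(\\d{1,6})\\s*px");

  // Returns {width, height}; a property that is not found is reported as -1.
  static int[] parse(String style) {
    int width = -1;
    int height = -1;
    if (style != null) {
      Matcher m = PROP.matcher(style);
      while (m.find()) {
        int value = Integer.parseInt(m.group(2));
        if (m.group(1).equalsIgnoreCase("width")) {
          width = value;
        } else {
          height = value;
        }
      }
    }
    return new int[] { width, height };
  }
}
```

With this, the order of the two properties in the attribute no longer matters, and a missing property shows up as -1 instead of an array index error.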