Problem parsing a webpage

Ale_

in Contributed Library Questions • 1 year ago

Hi!
I've run into a strange "problem" with a sketch. I am playing with a Java library, htmlparser. It's a really nice library, useful for parsing web pages, and IMHO it works much more smoothly than ProHTML, which gave me so many headaches that I gave up on it. To learn the library I am writing a little tool that parses The Big Picture (http://www.boston.com/bigpicture/) and brings one of the nice pictures posted there, picked at random, into the sketch. The concept is quite simple: it's a learning sketch.
The problem is that sometimes the sketch seems to pick up two links instead of one. The sketch takes its size from the dimensions of the picture, so in this case the dimensions passed to size() are the width and height of the first picture, but the image that finally appears in the sketch is the second one.
I can't figure out exactly what is going on here, and I'd like to know... any good ideas?
thanks&&regards · Ale

Code here (it may take a couple of attempts to make it fail the way I describe above):

import org.htmlparser.*;
import org.htmlparser.util.*;
import org.htmlparser.filters.*;
import org.htmlparser.nodes.*;

PImage base;
int sx, h, w;
ImageLoader iL;

void setup() {
  UrlParser u = new UrlParser("http://www.boston.com/bigpicture/");
  w = u.getW();
  h = u.getH();
  sx = w;
  size(w, h, P2D);
  background(#000000);
  iL = new ImageLoader(u.getImgUrl());
  iL.start();
}

void draw() {
  if (!iL.running() && sx > 0) {
    creaPixel();
    sx--;
  }
}

void creaPixel() {
  for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x += sx) {
      color pix = iL.getImg().get(x, y);
      stroke(pix);
      line(x, y, x + sx, y);
    }
  }
}

class UrlParser {
  String route, url, tit;
  int wI, hI;

  UrlParser(String route) {
    this.route = route;
    selectImgUrl(route);
  }

  void selectImgUrl(String urlPath) {
    try {
      org.htmlparser.Parser ps = new org.htmlparser.Parser();
      ps.setURL(urlPath);
      NodeList iList = ps.parse(new HasAttributeFilter("class", "bpImage"));
      int posicionUrl = int(random(iList.size()));
      TagNode imgNode = (TagNode) iList.elementAt(posicionUrl);
      url = imgNode.getAttribute("src");
      println(url);
      String[] attrs = splitTokens(imgNode.getAttribute("style"), "; px height: width:");
      hI = int(attrs[0]);
      println(hI);
      wI = int(attrs[1]);
      println(wI);
    }
    catch (Exception e) {
      println("Exception: " + e);
      e.printStackTrace();
    }
  }

  String getImgUrl() {
    return url;
  }

  int getW() {
    return wI;
  }

  int getH() {
    return hI;
  }

  String getTit() {
    return tit;
  }
}

class ImageLoader extends Thread {
  String address;
  PImage base;
  boolean onoff = false;

  ImageLoader(String address) {
    this.address = address;
  }

  void start() {
    onoff = true;
    super.start();
  }

  void run() {
    if (onoff) {
      base = loadImage(address);
    }
    onoff = false;
  }

  boolean running() {
    return onoff;
  }

  PImage getImg() {
    return base;
  }
}

Replies(10)

hellochar

Re: Problem parsing a webpage

1 year ago

When I visit "TheBigPicture.com", I get one of those parked domains that are full of advertisements. Is this what you're looking for?

Chrisir

Re: Problem parsing a webpage

1 year ago

that's his url:
http://www.boston.com/bigpicture/

see code

the pictures are amazing...

Ale_

Re: Problem parsing a webpage

1 year ago

Thanks, a silly slip on my part... ;-) I'm fixing it now...
It's one of the best picture galleries for seeing what's going on in the world, indeed... :-) That's the reason I chose it...
Regards! · Ale

Ale_

Re: Problem parsing a webpage

1 year ago

No ideas there? :-(
Such a bad first question! :-D

Ale · 60rpm.tv/i

PhiLho

Re: Problem parsing a webpage

1 year ago

Well, I don't know htmlparser, and your code seems OK. The problem is in the "sometime", I suppose.
Perhaps you have some threading issue?

Ale_

Re: Problem parsing a webpage

1 year ago

Hi phi.lho, I don't know exactly what you mean by "in the 'sometime'"... :-(
The thread is a late addition to the code, to avoid having the program stuck while the image was loading... but I think the problem existed before I added the thread... I'll check and post what I find...
Thanks for the answer... :-)

Ale · 60rpm.tv/i

Chrisir

Re: Problem parsing a webpage

1 year ago

the width of the picture is 990 (nearly) throughout

so why not put size to 990 and crop bigger images?
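
A minimal sketch of that idea, written in plain Java rather than Processing so it runs on its own. The class name CropDemo, the method cropToWindow and the 660 height are assumptions made up for the example; only the 990 width comes from the observation above.

```java
import java.awt.image.BufferedImage;

public class CropDemo {
    static final int MAX_W = 990; // width observed on the gallery pictures
    static final int MAX_H = 660; // assumed window height, not from the thread

    /** Returns the top-left part of src that fits inside a MAX_W x MAX_H window. */
    static BufferedImage cropToWindow(BufferedImage src) {
        int w = Math.min(src.getWidth(), MAX_W);
        int h = Math.min(src.getHeight(), MAX_H);
        return src.getSubimage(0, 0, w, h);
    }

    public static void main(String[] args) {
        // Blank stand-in for a downloaded picture that is larger than the window.
        BufferedImage big = new BufferedImage(1200, 800, BufferedImage.TYPE_INT_RGB);
        BufferedImage cropped = cropToWindow(big);
        System.out.println(cropped.getWidth() + "x" + cropped.getHeight());
    }
}
```

With a fixed window size, size() no longer depends on anything parsed from the page, so it can be the first call in setup().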

PhiLho

Re: Problem parsing a webpage

1 year ago

I meant that the problem is that sometimes it works and sometimes it doesn't.

Ale_

Re: Problem parsing a webpage

1 year ago

Phi.lho: indeed... ;-)
Chrisir: that's a smart, practical solution, but this is a purely didactic sketch... I mean, I'm less interested in working around this than in understanding what's happening, because it doesn't make any sense to me... :-)

Ale · 60rpm.tv/i

Ale_

Re: Problem parsing a webpage

1 year ago

Hi all.

I think I found it.

Processing reference:

"Again, the size() method must be the first line of the code (or first item inside setup). Any code that appears before the size() command may run more than once, which can lead to confusing results."

I'm sure it's this... ;-)
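
To illustrate what I think is happening, here is a minimal sketch in plain Java (not Processing, so it runs on its own). It assumes that the code before size() ran twice and that each pass picked a different random image. The class SetupTwiceDemo, the methods pickRandom and pickOnce, and the image list are all made up for the example. Remembering the first choice keeps the window size and the picture in agreement even if the code runs again:

```java
import java.util.Random;

public class SetupTwiceDemo {
    /** Hypothetical stand-in for one parsed image: its URL and declared size. */
    static final class Pick {
        final String url;
        final int w, h;

        Pick(String url, int w, int h) {
            this.url = url;
            this.w = w;
            this.h = h;
        }
    }

    // Made-up candidates; the real sketch reads these from the page.
    static final Pick[] CANDIDATES = {
        new Pick("a.jpg", 990, 660),
        new Pick("b.jpg", 990, 742),
        new Pick("c.jpg", 990, 558)
    };

    // Remembers the first choice across repeated passes through setup().
    static Pick cached;

    /** Unguarded: every call may return a different image. */
    static Pick pickRandom(Random rng) {
        return CANDIDATES[rng.nextInt(CANDIDATES.length)];
    }

    /** Guarded: only the first call chooses; later calls reuse that choice. */
    static Pick pickOnce(Random rng) {
        if (cached == null) {
            cached = pickRandom(rng);
        }
        return cached;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        Pick first = pickOnce(rng);  // first pass: this size would go to size()
        Pick second = pickOnce(rng); // second pass: same image, so size and picture agree
        System.out.println(first.w + "x" + first.h + " " + second.url);
    }
}
```

Calling pickRandom twice instead would reproduce the mismatch whenever the two picks differ. The reference's own advice is simpler, though: keep size() as the first call in setup().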

Ale · http://60rpm.tv/i
