This is the archive Discourse for the Processing (ALPHA) software.
Please visit the new Processing forum for current information.

   Processing 1.0 _ALPHA_
   Programming Questions & Help
   Syntax
(Moderators: fry, REAS)
   proHTML : dead links
   Author  Topic: proHTML : dead links  (Read 512 times)
nay

proHTML : dead links
« on: Feb 16th, 2005, 3:25pm »

hi all
 
i am hoping to use the proHTML library to list dead links. unfortunately, if i try to parse a link to find out whether it is dead, and it is, the program aborts
 
any advice on how i can do this? i would assume that Christian Riekoff's piece "Tree" (http://www.texone.org/tree/index.html), which uses the library, has something to stop it crashing when it hits a dead link
 
cheers
rene.
 
amoeba

Re: proHTML : dead links
« Reply #1 on: Feb 16th, 2005, 6:55pm »

Do you get any errors when it aborts, NullPointerExceptions etc?
 
If proHTML uses the URL.openConnection method, there is no way to set a timeout on the connection in Java 1.4 and earlier, which can be very bad (i.e. eternal wait...). I just finished a web crawler project where I ended up using the Apache Jakarta Commons HttpClient library.
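For readers on Java 5 or later, the standard library does let you set timeouts on a connection. The following is only a minimal sketch under that assumption, and it does not use proHTML or HttpClient at all. The class name `LinkCheck`, the method `isDead`, and the rule "status 400 or above means dead" are all made up for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class LinkCheck {

    /**
     * Hypothetical helper: returns true if the address cannot be parsed,
     * cannot be reached within the timeout, or answers with an HTTP
     * status of 400 or above.
     */
    static boolean isDead(String address, int timeoutMillis) {
        try {
            // Throws MalformedURLException (an IOException) for unparsable input.
            URL url = new URL(address);
            URLConnection raw = url.openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                // Assumption of this sketch: only HTTP(S) links can be checked.
                return true;
            }
            HttpURLConnection conn = (HttpURLConnection) raw;
            conn.setConnectTimeout(timeoutMillis); // available since Java 5
            conn.setReadTimeout(timeoutMillis);    // available since Java 5
            conn.setRequestMethod("HEAD");         // ask for headers only
            int status = conn.getResponseCode();
            conn.disconnect();
            return status >= 400;
        } catch (IOException e) {
            // Covers malformed URLs, unknown hosts, refused connections and timeouts.
            return true;
        }
    }
}
```

Calling `LinkCheck.isDead("http://www.texone.org/tree/index.html", 5000)` would wait at most about five seconds for the connection and again for the response, instead of hanging indefinitely. Some servers reject HEAD requests, so a real checker might need to fall back to GET.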
 
My spider (aka the useless Universal Digest Machine) is currently happily surfing and printing out little receipts for every page it visits...
 

marius watz // amoeba
http://processing.unlekker.net/
nay

Re: proHTML : dead links
« Reply #2 on: Feb 17th, 2005, 12:21am »

hi amoeba - nice piece!
 
it looks like the error is from within the proHTML library to me but i could be wrong (and often am!):
 
prohtml.InvalidUrlException: This is not a parsable URL
at prohtml.HtmlTree.<init>(HtmlTree.java:9
at Temporary_6243_3356.setup(Temporary_6243_3356.java:9)
 
it spits out the error and quits. would be great if whatever caused the error would print an error and return a bool value or something instead of quitting.
 
am hoping to get this to work with this library as I have no java experience but will look into it if i have to...
 
JohnG

Re: proHTML : dead links
« Reply #3 on: Feb 17th, 2005, 3:26pm »

Just guessing at the syntax, since I'm not 100% sure of Java exceptions, but if you change the call to:
 
Code:

try
{
  <command that causes program to quit>
}
catch(prohtml.InvalidUrlException e)
{
  println("Bad Url");
}

 
it may work. Like I said, this is just a complete guess, since I've not tried to use exceptions in Java before.
 
amoeba

Re: proHTML : dead links
« Reply #4 on: Feb 17th, 2005, 3:55pm »

nay: Glad you liked the piece.
 
John is correct about the exception handling. You could make it even more general by saying the following:
 
Code:
try {
  <command block>
} catch(Exception e) {
  println("Exception: "+e);
}

 
Exception handling is usually needed when you don't know what data you'll be dealing with. Just be aware that once an exception has been thrown, you can't assume that any of the commands inside the try block have been carried out. The best practice is to discard the result and start whatever you're doing again from the beginning.
 
Or, as they say: "Fail gracefully."
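To tie this back to the original question of listing dead links, here is a rough sketch of the pattern: try each URL, and if anything is thrown, throw away the partial result, note the URL as dead, and carry on with the next one. It does not call proHTML. The `PageParser` interface is a hypothetical stand-in for whatever call throws (in this thread, the HtmlTree constructor), and `DeadLinkLister` and `findDeadLinks` are invented names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeadLinkLister {

    /** Hypothetical stand-in for a parsing call that throws on a bad URL. */
    interface PageParser {
        void parse(String url) throws Exception;
    }

    /** Tries every URL; any URL whose parse attempt throws is recorded as dead. */
    static List<String> findDeadLinks(List<String> urls, PageParser parser) {
        List<String> dead = new ArrayList<String>();
        for (String url : urls) {
            try {
                parser.parse(url);
            } catch (Exception e) {
                // Nothing from the failed attempt is kept; just record the URL
                // and move on instead of letting the program quit.
                dead.add(url);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        // Fake parser for the demo: pretends any URL containing "dead" is unparsable.
        PageParser fake = url -> {
            if (url.contains("dead")) {
                throw new IllegalArgumentException("This is not a parsable URL");
            }
        };
        List<String> urls = Arrays.asList(
            "http://example.org/ok",
            "http://example.org/dead");
        System.out.println("Dead links: " + findDeadLinks(urls, fake));
        // prints: Dead links: [http://example.org/dead]
    }
}
```

Catching the general `Exception` keeps the loop running whatever goes wrong, at the cost of also hiding programming mistakes. Catching only the specific exception, as in John's version, is stricter.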
« Last Edit: Feb 18th, 2005, 1:36am by amoeba »  

marius watz // amoeba
http://processing.unlekker.net/
nay

Re: proHTML : dead links
« Reply #5 on: Feb 18th, 2005, 7:17pm »

excellent! thanks guys - both snippets work!
 
will read up on exception handling, not a term i was familiar with
 
am on my way...
 