We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
Problems to read HTML with java.net (Read 926 times)
Dec 4th, 2009, 8:44am
 
I just want to get the HTML code of a web page as a String.
I tried it with Java's URLConnection + InputStream.
The first few times it worked (the HTML was printed to the console), but afterwards it didn't do anything and it slowed down the computer...
What happened? I don't really know the java.net classes or what happens behind the scenes... Maybe someone knows and could help?


Here is the code:

import java.net.*;
import java.io.*;

boolean alreadyconnected = false;
String content = "";

void draw() {
  // Skip the download once it has succeeded.
  // Until then, draw() retries on every frame.
  if (alreadyconnected == false) {
    try {
      // I can't post a link here. Please enter any URL.
      URL url = new URL("url");
      // openStream() opens the connection itself, so no separate
      // openConnection() call is needed.
      InputStream urlStream = url.openStream();

      byte[] b = new byte[1000];
      int numRead;
      // read() returns -1 at the end of the stream, possibly on the first call.
      while ((numRead = urlStream.read(b)) != -1) {
        content += new String(b, 0, numRead);
      }

      println(content);
      urlStream.close();
      alreadyconnected = true;
    }
    catch (IOException e) {
      println("doesn't work");
    }
  }
}
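
A minimal plain-Java sketch of the read loop in the code above, under two assumptions that are not in the original post: the page is UTF-8 encoded, and the names StreamText and readAll are made up for illustration. It collects the raw bytes first and decodes them once at the end, so an empty stream (where the first read() already returns -1) simply yields an empty String, and a multi-byte character split across two chunks is not garbled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Processing or java.net.
class StreamText {
    // Reads the stream until read() returns -1, then decodes all bytes at once.
    // The caller stays responsible for closing the stream.
    static String readAll(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1000];
        int numRead;
        try {
            while ((numRead = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, numRead);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        // Assumption: the page is UTF-8 encoded.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

It could be called as StreamText.readAll(new URL(address).openStream()), where address is a placeholder for a real page URL.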
Re: Problems to read HTML with java.net ?
Reply #1 - Dec 4th, 2009, 8:47am
 
Strange that you are presumably using Processing but then delving into java.net.

If all you want to do is load some HTML as strings, then you might look at loadStrings().

http://processing.org/reference/loadStrings_.html
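
For readers coming from the java.net side: loadStrings() returns the lines of a file or URL as a String array. A rough plain-Java analogue of that line-by-line reading might look like the sketch below. This is not Processing's actual implementation, and the names LineLoader and readLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class that mimics the shape of loadStrings() output.
class LineLoader {
    // Reads every line from the source and returns them as an array.
    // The source is closed when reading finishes.
    static String[] readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null at the end of the input.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return lines.toArray(new String[0]);
    }
}
```

In a Processing sketch none of this is needed: String[] lines = loadStrings(address); followed by join(lines, "\n") gives the page as a single String, where address is a placeholder for the page URL.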
Re: Problems to read HTML with java.net
Reply #2 - Dec 4th, 2009, 9:15am
 
Flancke wrote on Dec 4th, 2009, 8:44am:
The first few times it worked (the HTML was printed to the console), but afterwards it didn't do anything

Not surprising given the code you show...
The alreadyconnected boolean is there to ensure the HTML is downloaded only once.

Or do you mean the first runs of the sketch?
Re: Problems to read HTML with java.net
Reply #3 - Dec 4th, 2009, 9:29am
 
I used this boolean because I thought multiple .openConnection() and .openStream() calls could "overload the system".
I tried it without the boolean and the system got confused; it was difficult to close the program.

I thought I had to put the .openConnection() call into draw() because making a connection would not work the first time.
But once the connection was established, I didn't want the program to keep calling openConnection().
Same for the .openStream() method.
I searched for a test (whether the connection is already established), like the boolean connected, with the aim of saying something like: if (alreadyconnected), don't try to connect anymore...
But the connected field was not visible...

I am absolutely new to Java code...
And I found it hard to translate Java code into the Processing language,
to fit it into the setup() and draw() scheme...
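
The "don't try to connect anymore" idea described above can be sketched in plain Java as a guard that runs a loader at most once, even when the attempt fails, so a method called on every frame never reopens the connection. The class name OnceLoader and its methods are hypothetical and only illustrate the pattern.

```java
import java.util.function.Supplier;

// Hypothetical guard: runs the loader on the first get() only.
class OnceLoader {
    private final Supplier<String> loader;
    private boolean attempted = false;
    private String content = null;

    OnceLoader(Supplier<String> loader) {
        this.loader = loader;
    }

    // Safe to call every frame. Returns the loaded text,
    // or null if the single attempt failed.
    String get() {
        if (!attempted) {
            // Set the flag before loading so a failure is not retried.
            attempted = true;
            try {
                content = loader.get();
            } catch (RuntimeException e) {
                content = null;
            }
        }
        return content;
    }
}
```

In a Processing sketch the simpler route is usually to do one-time work such as a download in setup(), which runs once, and leave draw() for per-frame work.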



Re: Problems to read HTML with java.net
Reply #4 - Dec 4th, 2009, 12:39pm
 
Well, the point of Processing is to insulate you from Java as much as possible for the most common operations.
So perhaps you can proceed as Quark suggested and use loadStrings().
If it is not suited to your needs, you should explain why and how the Java code you use brings some advantages.
Re: Problems to read HTML with java.net
Reply #5 - Dec 6th, 2009, 7:31am
 
Yes: the loadStrings(url) function was exactly what I was looking for...
So simple!
Thank you!