We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
HTMLParser-Lib. -> Proxy-Use?
Feb 15th, 2007, 3:29pm
 
Hello to all from a Processing newbie!

My first question is about the HTMLParser library in general and, specifically, about fetching the HTML code of a site through a proxy.

Because I couldn't find an example, I tried the following:
Code:

void getDataFromClient()
{
 org.htmlparser.Parser ps;
 ConnectionManager cm;

 // get the shared connection manager from the Parser class
 cm = Parser.getConnectionManager();

 // set the proxy data using the setters I found in the library's Javadocs
 cm.setProxyHost("[the proxy of my company]");
 cm.setProxyPort([the proxy-port]);
 cm.setProxyUser("[my user-id]");
 cm.setProxyPassword("[my password]");

 ps = new org.htmlparser.Parser ("[url to get the html-code from]");
   
 OrFilter orf = new OrFilter();

 NodeFilter[] nfls = new NodeFilter[1];

 nfls[0] = new TagNameFilter("html");

 orf.setPredicates(nfls);

 NodeList nList = ps.parse(orf);
 Node     node  = nList.elementAt (0);

 this.parseTree(node);
}


When running this code from within Processing, I get the following error message:
Code:

org.htmlparser.util.ParserException: Unexpected end of file from server;
java.net.SocketException: Unexpected end of file from server

at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)

This happens no matter which URL I feed to the program.

The error comes up right after this line:
Code:
 ps = new org.htmlparser.Parser ("[url to get the html-code from]");


Can anyone give me a hint where the error is?
Where's my mistake, and what is missing from the code?
Does anyone know if there is documentation for HTMLParser?

(The code is part of htmlgrapher, which can be found at http://www.aharef.info/static/htmlgraph; I used it as a starting point in Processing and want to get it running from my company's intranet.)
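
One way to narrow this down might be to try the same proxy settings with plain JDK classes and no HTMLParser at all. If that request fails too, the problem is probably the proxy data or the network rather than the library. The following is only a minimal sketch under some assumptions: it needs Java 8 or later, it assumes the proxy accepts HTTP Basic credentials in a Proxy-Authorization header for plain http:// URLs, and the class name ProxyCheck and its helper methods are made up for this illustration (they are not part of HTMLParser or htmlgrapher).

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class for this illustration only.
class ProxyCheck {

    // Describe an HTTP proxy; createUnresolved avoids a DNS lookup at this point.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Build the value of a Proxy-Authorization header for HTTP Basic credentials.
    // Assumption: the company proxy uses Basic authentication.
    static String basicAuth(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Request the URL through the proxy and return the HTTP status code.
    static int fetchStatus(String url, Proxy proxy, String authHeader) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
        con.setRequestProperty("Proxy-Authorization", authHeader);
        con.setConnectTimeout(5000); // fail after 5 seconds instead of hanging
        con.setReadTimeout(5000);
        try {
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 5) {
            System.out.println("usage: ProxyCheck <proxyHost> <proxyPort> <user> <password> <url>");
            return;
        }
        Proxy proxy = httpProxy(args[0], Integer.parseInt(args[1]));
        String auth = basicAuth(args[2], args[3]);
        System.out.println("HTTP status: " + fetchStatus(args[4], proxy, auth));
    }
}
```

Run with the same host, port, user and password as in the sketch above. A 200 status would suggest the proxy data is fine and the problem lies in how the library is configured; a 407 would point at the credentials or the authentication scheme.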