We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
xml import of utf-8 encoded file
Jun 5th, 2007, 2:18pm
 
I've got a UTF-8 encoded XML file containing Scandinavian characters. I've been attempting to import it with the core XML importer, but I get the following error:

processing.xml.XMLParseException: XML Parse Exception during parsing of the XML definition at line 1: Expected: <

The import works fine when the XML file is encoded in Mac Roman, and the characters display correctly, but of course only on Macs.
I'm a bit confused, because a Java tutorial states that Unicode is native in Java, so one would think that importing a Unicode-encoded file would be no problem.
Has anybody got any insight into this issue, or managed to parse UTF-8 encoded XML files?
Re: xml import of utf-8 encoded file
Reply #1 - Jun 6th, 2007, 3:08pm
 
you just need to specify the encoding explicitly when reading the file. it's something that i probably need to include throughout the api, since this is messier than necessary:

Code:
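// open the file as a raw byte stream
InputStream input = openStream("yourfile.xml");
// decode the bytes as UTF-8 instead of the platform default encoding
InputStreamReader reader = new InputStreamReader(input, "UTF-8");
// parse the XML from the decoded characters
XMLElement xml = new XMLElement(reader);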
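
On the "Unicode is native in Java" confusion: Java strings are Unicode internally, but the bytes in a file still have to be decoded with some charset, and a reader that isn't told which one falls back to the platform default. Here is a rough sketch in plain Java (outside Processing) of what decoding the same bytes with the right and the wrong charset does. The class name `DecodeDemo`, the sample word and the use of ISO-8859-1 as the "wrong" charset are only assumptions for illustration.

Code:
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Processing.
public class DecodeDemo {

  // Decode raw bytes into a String with the given charset,
  // going through an InputStreamReader as the snippet above does.
  static String decode(byte[] bytes, Charset charset) throws IOException {
    Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset);
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = reader.read()) != -1) {
      sb.append((char) c);
    }
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    // The word "blåbær" as it would be stored in a UTF-8 file:
    // å and æ each take two bytes, so 6 characters become 8 bytes.
    byte[] utf8Bytes = "bl\u00e5b\u00e6r".getBytes(StandardCharsets.UTF_8);

    // Right charset: the original 6 characters come back.
    System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

    // Wrong charset (stand-in for a platform default): every byte
    // becomes its own character, so å and æ turn into two junk characters each.
    System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1));
  }
}
```

Passing "UTF-8" to the InputStreamReader, as in the snippet above, is what takes the platform default out of the picture.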
Re: xml import of utf-8 encoded file
Reply #2 - Jun 6th, 2007, 4:56pm
 
Thanks! It works like a charm now.

Just to clarify for anybody who stumbles upon the same problem in the future: the XML files I've used were saved as UTF-8 in the Mac's TextEdit, and in BBEdit as "UTF-8, no BOM". Any other flavour fails.
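
A possible reason the BOM flavour fails, though this is a guess rather than something confirmed in this thread: a UTF-8 BOM is the three bytes EF BB BF at the very start of the file, and Java's UTF-8 decoder passes it through as the character U+FEFF, so the parser would see that before the first "<". Below is a minimal sketch of a workaround in plain Java that skips those bytes if they are present. `BomSkipper` and `skipUtf8Bom` are made-up names, not part of Processing.

Code:
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of Processing.
public class BomSkipper {

  // Returns a stream positioned just after a leading UTF-8 BOM (EF BB BF).
  // If there is no BOM, the bytes that were peeked at are pushed back.
  static InputStream skipUtf8Bom(InputStream in) throws IOException {
    PushbackInputStream pushback = new PushbackInputStream(in, 3);
    byte[] head = new byte[3];

    // read() may return fewer bytes than asked for, so loop until we
    // have three bytes or hit the end of the stream.
    int n = 0;
    while (n < 3) {
      int r = pushback.read(head, n, 3 - n);
      if (r == -1) break;
      n += r;
    }

    boolean hasBom = n == 3
        && (head[0] & 0xFF) == 0xEF
        && (head[1] & 0xFF) == 0xBB
        && (head[2] & 0xFF) == 0xBF;

    if (!hasBom && n > 0) {
      pushback.unread(head, 0, n); // not a BOM: put the bytes back
    }
    return pushback;
  }

  public static void main(String[] args) throws IOException {
    // A tiny XML document with a UTF-8 BOM in front of it.
    byte[] withBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>' };
    InputStream in = skipUtf8Bom(new ByteArrayInputStream(withBom));
    System.out.println((char) in.read()); // prints "<"
  }
}
```

In a sketch, the stream returned by openStream("yourfile.xml") could be passed through skipUtf8Bom() before being wrapped in the InputStreamReader, assuming the BOM really is what trips up the parser.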