We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
problems with "umlaut"
Jan 11th, 2010, 10:19am
 
Hi everybody,

I somehow have a problem with umlauts (äöü etc.). Even though I have the latest Processing version (1.0.9), println() doesn't print umlauts:

Here is some simple code I tried:
Code:

String[] umlaut = loadStrings("/umlaut.txt");
println(umlaut);


umlaut.txt contains: "ööäääüüüßßßßß"
What println() prints is "?????????????"

Does anybody have an idea how to change that?

Thanks
Re: problems with "umlaut"
Reply #1 - Jan 11th, 2010, 10:21am
 
First thing to do is check that the text file is encoded in UTF8:

"Starting with Processing release 0134, all files loaded and saved by the Processing API use UTF-8 encoding."
Re: problems with "umlaut"
Reply #2 - Jan 11th, 2010, 10:26am
 
I did see that Processing should actually be able to handle it... but somehow it doesn't.

If I do something like this, it's a similar problem:

Code:

import processing.net.*;

Client c;
String data;

void setup() {
  size(200, 200);
  background(50);
  fill(200);
  c = new Client(this, "www.wollle.com", 80); // Connect to server on port 80
  c.write("GET / HTTP/1.1\n"); // Use the HTTP "GET" command to ask for a Web page
  c.write("Host: www.wollle.com\n\n"); // Be polite and say who we are
}

void draw() {
  if (c.available() > 0) { // If there's incoming data from the client...
    data = c.readString(); // ...then grab it and print it
    println(data);
  }
}
Re: problems with "umlaut"
Reply #3 - Jan 11th, 2010, 10:35am
 
OK, for the .txt problem it was the encoding, because when I do this, it works properly:

Code:
String umlauts[] = new String[1];
umlauts[0] = "äääääööööüüüüßßßßß";
saveStrings("/umlaut.txt", umlauts);
delay(1000);
String[] loaded_umlauts = loadStrings("/umlaut.txt");
println(loaded_umlauts);


But how about the web encoding? Can I somehow force Processing to read the umlauts properly? Or do I need to make a different request?

cheers
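For the umlaut.txt case, one way to confirm that a file really is UTF-8 is to decode its bytes strictly and see whether the decoder complains. The following is only a minimal sketch in plain Java, not part of the Processing API; the class name Utf8Check and the method isValidUtf8 are made up for illustration, and the sample bytes are built in memory instead of being read from a file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Hypothetical helper: returns true if the bytes decode cleanly as UTF-8.
    // Strict mode reports malformed sequences instead of replacing them.
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // o-umlaut, a-umlaut, u-umlaut, sharp s, written as Unicode escapes
        String sample = "\u00f6\u00e4\u00fc\u00df";
        // Same text saved in two different encodings
        System.out.println(isValidUtf8(sample.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(sample.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Note that a file containing only plain ASCII also passes this check, so it only says something useful once the file actually contains umlauts.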
Re: problems with "umlaut"
Reply #4 - Jan 11th, 2010, 10:43am
 
See readString():

Quote:
Returns the all the data from the buffer as a String. This method assumes the incoming characters are ASCII. If you want to transfer Unicode data, first convert the String to a byte stream in the representation of your choice (i.e. UTF8 or two-byte Unicode data), and send it as a byte array.


Must admit I haven't done anything like this, so I'm not sure I can help much more than that... Mind you, these kinds of questions crop up often enough - did you search the site?
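The quoted advice boils down to working with bytes and choosing the charset explicitly, rather than letting readString() assume ASCII. Below is a rough sketch of the decoding half in plain Java. The class DecodeBytes and the method decode are hypothetical names; in a sketch the byte array would presumably come from the Client's readBytes() instead of being hard-coded as it is here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeBytes {
    // Hypothetical helper: turns raw bytes into text using an explicit charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // The German word for "largest" as ISO-8859-1 bytes:
        // 0xF6 is o-umlaut, 0xDF is sharp s
        byte[] latin1 = { 'g', 'r', (byte) 0xF6, (byte) 0xDF, 't', 'e', 'r' };

        // Decoding with the charset the bytes were written in keeps the umlauts
        String ok = decode(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(ok.equals("gr\u00f6\u00dfter"));

        // Decoding as ASCII cannot represent bytes above 0x7F, so they are lost
        String lossy = decode(latin1, StandardCharsets.US_ASCII);
        System.out.println(lossy.equals(ok));
    }
}
```

One caveat when doing this with network data: a multi-byte UTF-8 character can be split across two reads, so it is safer to collect the whole response before decoding it.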
Re: problems with "umlaut"
Reply #5 - Jan 11th, 2010, 11:01am
 
Yep, I did... but somehow I didn't find the simple answer I wished to have...
Re: problems with "umlaut"
Reply #6 - Jan 11th, 2010, 11:37am
 
I have now tried something like this:

Code:
import processing.net.*;

Client c;
String data;

void setup() {
  size(200, 200);
  background(50);
  fill(200);
  c = new Client(this, "www.wollle.com", 80); // Connect to server on port 80
  c.write("GET / HTTP/1.1\n"); // Use the HTTP "GET" command to ask for a Web page
  c.write("Host: www.wollle.com\n\n"); // Be polite and say who we are
}

void draw() {
  if (c.available() > 0) { // If there's incoming data from the client...
    data = c.readString(); // ...then grab it and print it

    try {
      byte[] b = data.getBytes("UTF-8");
      String a = new String(b);
      println(a);
    }
    catch (Exception e) {
      println("error");
    }
  }
}


I was just trying to do the opposite of what is shown here:
http://processing.org/discourse/yabb2/num_1210675667.html
but it still doesn't work. Does somebody know what is wrong?
Re: problems with "umlaut"
Reply #7 - Jan 12th, 2010, 12:56am
 
timm wrote on Jan 11th, 2010, 11:37am:
...but it still doesn't work. Does somebody know what is wrong?


You need to be a little more specific: What doesn't work? Do you get anything out of getBytes()? If so, what? I suspect the next step will be to convert the bytes back to something human-readable...
Re: problems with "umlaut"
Reply #8 - Jan 12th, 2010, 1:36am
 
Have you tried to skip the first step (or any other step), so that you know that at least one part is working? Let's say you skip the GET part, and hard code the type of string you expect to receive, i.e. your 'data' is filled with data. What will that data look like? If you know what it will look like, hard code it, and see if it works from there. When you got that part working, go back to the GET step, and see if you can get that to work.
Re: problems with "umlaut"
Reply #9 - Jan 12th, 2010, 4:21am
 
Hmm, yeah, well, I tried several things. In draw() I'm trying to convert the incoming String 'data' to UTF-8 (see above).
Somehow it doesn't change anything. E.g. a line that looked like this before: " Europas größter Autobauer will sich für die..." still looks the same (instead of: 'Europas größter ... will sich für ...')

As blindfish already mentioned, readString (http://processing.org/reference/libraries/net/Client_readString_.html ) seems to expect ASCII, so I can't say whether the error is in the GET command I'm sending to the server or in the readString result, which I'm somehow converting to UTF-8 the wrong way.

How would I tell a server that I want UTF-8? I tried sending something like "charset UTF-8", but without any success so far.
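On the HTTP side, the usual place to state a charset preference is an Accept-Charset request header, and the charset actually used comes back in the response's Content-Type header. Here is a minimal sketch in plain Java under those assumptions. The class HttpCharset and both method names are invented for illustration, and a server is free to ignore the preference, so reading the response header is the more reliable half.

```java
import java.util.Locale;

class HttpCharset {
    // Hypothetical helper: builds a GET request that states a charset preference.
    // HTTP header lines end with CRLF, and a blank line ends the header block.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Accept-Charset: utf-8\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Hypothetical helper: pulls the charset out of a Content-Type header value,
    // e.g. "text/html; charset=ISO-8859-1". Returns the fallback if none is given.
    static String charsetOf(String contentType, String fallback) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.print(buildRequest("www.example.com", "/"));
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", "UTF-8"));
    }
}
```

The charset name found this way is what the response body should then be decoded with.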

Thanks
Re: problems with "umlaut"
Reply #10 - Jan 12th, 2010, 4:29am
 
@blindfish:
Code:
byte[] b = data.getBytes("UTF-8");
String a = new String(b);
println(a);

When I print 'a', I do get the whole String that 'data' contains. It just looks exactly the same as if I'd printed 'data'... nothing seems to change.
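A likely reason nothing changes: encoding a String to bytes and decoding those bytes with the same charset is a round trip, so whatever characters were in the String (right or wrong) come back unchanged. new String(b) without a charset argument uses the platform default, which behaves this way whenever that default is also UTF-8. The sketch below makes the charset explicit to show the effect; RoundTrip and roundTrip are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTrip {
    // Hypothetical helper: encodes and immediately decodes with the same charset.
    static String roundTrip(String s, Charset charset) {
        byte[] encoded = s.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        // A correct string (o-umlaut and sharp s as escapes)
        String good = "gr\u00f6\u00dfter";
        // The same word after UTF-8 bytes were misread as ISO-8859-1
        String garbled = new String(good.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);

        // Both come back exactly as they went in; the garbled one stays garbled
        System.out.println(roundTrip(good, StandardCharsets.UTF_8).equals(good));
        System.out.println(roundTrip(garbled, StandardCharsets.UTF_8).equals(garbled));
    }
}
```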
Re: problems with "umlaut"
Reply #11 - Jan 12th, 2010, 6:08am
 
Should be:
byte[] b = data.getBytes("ISO-8859-1");
getBytes "Encodes this String into a sequence of bytes using the named charset, storing the result into a new byte array."
The header of the page indicates it is ISO-8859-1.
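The idea behind this suggestion can be sketched as follows: if text was decoded with the wrong charset, encoding it again with that same (wrong) charset recovers the original bytes, which can then be decoded with the right one. This is only an illustration in plain Java with hard-coded data; Redecode and redecode are made-up names, and the approach is only lossless when the wrongly assumed charset maps every byte value to a character, as ISO-8859-1 does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Redecode {
    // Hypothetical helper: undoes a decode made with 'assumed' and
    // decodes the recovered bytes with 'actual' instead.
    static String redecode(String wrong, Charset assumed, Charset actual) {
        byte[] originalBytes = wrong.getBytes(assumed);
        return new String(originalBytes, actual);
    }

    public static void main(String[] args) {
        // Expected text, with o-umlaut and sharp s as escapes
        String expected = "gr\u00f6\u00dfter";

        // Simulate the mistake: UTF-8 bytes read as if they were ISO-8859-1
        String garbled = new String(expected.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);

        String repaired = redecode(garbled,
                                   StandardCharsets.ISO_8859_1,
                                   StandardCharsets.UTF_8);
        System.out.println(repaired.equals(expected));
    }
}
```

Which charset goes in which position depends on what the server really sent and what the reading code assumed, so it is worth checking the response header before picking them.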
Re: problems with "umlaut"
Reply #12 - Jan 12th, 2010, 8:40am
 
After lots of testing, this one did it:

Code:
import processing.net.*;

Client c;
String data;

void setup() {
  size(200, 200);
  background(50);
  fill(200);
  c = new Client(this, "www.ct-w.com", 80);
  c.write("GET /timm/ HTTP/1.1\n"); // here's a simple test file
  c.write("Host: www.ct-w.com\n\n");
}

void draw() {
  if (c.available() > 0) {
    data = c.readString();

    try {
      byte[] e = data.getBytes();
      String v = new String(e, "utf-8");
      byte[] f = v.getBytes("iso-8859-2");
      String w = new String(f);
      println(w);
    }
    catch (Exception e) {
      println("error");
    }
  }
}