I'm working on a paint program that will be able to handle larger images (4800 x 6000 pixels). In order to mix pixels properly, I've found I need to use either a floating point array or an array with the integer r, g, b values of the pixels shifted 16. Using just integers, a little is lost to clipping each time, and the losses start adding up into weird hue shifts. Through tests, I've found no difference in performance between using the floating point array and the integer shifted array, at least on the Java machine. So I use the float because it's clearer.
But I have found a major difference in performance between using a multidimensional array and 3 one dimension arrays.
I had been allocating the array as such:
float[][] floatArray = new float[4800 * 6000][3];
That little line of code took about 5-10 seconds to allocate on my Windows 7, i7 processor with 16gb ram.
But writing it as this takes only milliseconds to allocate:
float[] floatArrayR = new float[4800 * 6000];
float[] floatArrayG = new float[4800 * 6000];
float[] floatArrayB = new float[4800 * 6000];
I remember reading many years ago that different OSes handle multidimensional arrays differently and one should not assume how the members are ordered. I can't imagine how the Java virtual machine handles multidimensional arrays to make them so much slower than one dimension arrays. Allocation time from 5-10 seconds to milliseconds? What's going on there? The multidimensional array also slowed garbage collection so much that it was a performance issue. Not so with the one dimension array.
Answers
Use byte[] or byte[][] instead, b/c each primary color fits in 1 byte.
But using bytes isn't going to work because they'd overflow when adding values together, which is why he's using floats in the first place.
They have to be treated individually with something that can hold the little partial values that would normally be clipped. If you do the integer math (255 + 2) / 2 (an example of mixing pixels: say one red component is 255 and the other is 2), the .5 that is clipped would be hard to notice. But if it happens many times it adds up into weird hue shifts. What I do is store the actual values of the math in an array, and the only time they are clipped is when I move them to the display image:
img.pixels[i] = 0xFF000000 | ((int) floatArrayR[i] << 16) | ((int) floatArrayG[i] << 8) | (int) floatArrayB[i];
I never lose the clipped portion and I get no weird hue shifts. The clipped portions don't accumulate.
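To make the accumulation effect concrete, here is a minimal sketch. The class and method names (TruncationDrift, blendInt, blendFloat) and the chosen values are only illustrative assumptions, not code from the paint program. It blends one channel toward a target with a small weight, once truncating to an int after every step and once keeping the fraction in a float:

```java
class TruncationDrift {
    // Blend 'current' toward 'target' by 'alpha', truncating to int after every step.
    static int blendInt(int current, int target, float alpha) {
        return (int) (current + (target - current) * alpha);
    }

    // Same blend, but the fractional part survives between steps.
    static float blendFloat(float current, float target, float alpha) {
        return current + (target - current) * alpha;
    }

    public static void main(String[] args) {
        int asInt = 100;
        float asFloat = 100f;
        // Many light strokes of a brush with value 200 over a pixel with value 100.
        for (int i = 0; i < 1000; i++) {
            asInt = blendInt(asInt, 200, 0.004f);
            asFloat = blendFloat(asFloat, 200f, 0.004f);
        }
        System.out.println("int channel:   " + asInt);
        System.out.println("float channel: " + (int) asFloat);
    }
}
```

With these values each step adds less than 1, so the truncated int channel never moves while the float channel creeps toward the target. When the three channels lose different amounts this way, the result is a hue shift.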
"A flatten array is always gonna be faster than a multi-dimensional array though."
But by that much? From 5-10 seconds to like 40ms allocation time?
If you need to access each color component separately, for best access performance, each 1 of them should be a separate array.
If you need to store fractional values, and that fraction is merely 1 or few digits, you can still use a whole datatype.
For 1 digit, when saving the value, multiply by 10, when reading, divide by 10.
This way you can use something like short[], which is 16bit, which is more pipeline cache friendly for the CPU than a 32bit float[].
"This way you can use something like short[], which is 16bit, which is more pipeline cache friendly for the CPU than a 32bit float[]."
I like that idea. It would cut memory usage in half and that precision should be enough. Is a short unsigned in Java? Even if not, it seems I could multiply it by 100. 255 * 100 = 25,500, which fits under the 32,767 maximum of a signed 16 bit short.
Thank you, I'm going to try this :)
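A minimal sketch of the multiply-on-write, divide-on-read idea with a scale of 100. The names ScaledShortChannel, encode and decode are hypothetical, and it assumes channel values stay in 0..255 so that the scaled value fits in a signed short:

```java
class ScaledShortChannel {
    // Two decimal digits of fraction; 255 * 100 = 25,500 still fits in a short.
    static final int SCALE = 100;

    // Store a channel value (assumed to be in 0..255) as a scaled short.
    static short encode(float value) {
        return (short) Math.round(value * SCALE);
    }

    // Recover the fractional channel value from the stored short.
    static float decode(short stored) {
        return stored / (float) SCALE;
    }

    public static void main(String[] args) {
        short[] red = new short[4];
        red[0] = encode(128.5f);   // the .5 is kept as part of the scaled integer
        System.out.println(red[0] + " -> " + decode(red[0]));
        System.out.println("max stored value: " + encode(255f));
    }
}
```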
Yes it is. The only primitive datatype which it isn't is char, which is also 16bit. If you don't need negative values, you can use char in place of short, in order to reach 65535 = 2**16 - 1.
Well I might just go with bit shifting it << 8. It would only be a small increase in precision from multiplying by 100, but it may be faster. Thanks for that.
Looking into it, a short is signed in Java. I can still shift it by 7; rereading your post, I think that's what you meant. A char is the only unsigned datatype.
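As a sketch of the shift-by-8 variant stored in a char: the names FixedPointChar, encode, decode and toDisplay are hypothetical, and values are assumed to stay in 0..255, since anything that rounds above 65535 would wrap around.

```java
class FixedPointChar {
    // 8 integer bits + 8 fraction bits; 255 << 8 = 65280 fits in an unsigned 16 bit char.
    static final int FRACTION_BITS = 8;

    // Store a channel value (assumed to be in 0..255) with 8 bits of fraction.
    static char encode(float value) {
        return (char) Math.round(value * (1 << FRACTION_BITS));
    }

    // Recover the fractional channel value.
    static float decode(char stored) {
        return stored / (float) (1 << FRACTION_BITS);
    }

    // Drop the fraction only when writing to the display image.
    static int toDisplay(char stored) {
        return stored >> FRACTION_BITS;
    }

    public static void main(String[] args) {
        char c = encode(128.5f);
        System.out.println((int) c + " -> " + decode(c) + " -> " + toDisplay(c));
        // The same bit pattern does not fit in a signed short, hence the shift by 7 there.
        System.out.println("255 << 8 as short: " + (short) (255 << 8));
    }
}
```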
Have not seen your code but maybe you could save a buffer of the original image that you re-process in larger steps rather than small incremental steps to get rid of the hue changes?
On topic:
You're comparing allocating millions of 3-element arrays against 3 millions-element arrays.
float[][] floatArray = new float[3][4800 * 6000];
Will take about the same amount of time as three [4800 * 6000] arrays.
Note that rearranging the array that way only makes sense if you're processing the image one color channel at a time rather than pixel by pixel.
"Will take about the same amount of time as three [4800 * 6000] arrays."
No, not in practice on a Windows 7 machine. Three one dimensional [4800 * 6000] arrays allocate thousands of times faster than one [4800 * 6000][3] multidimensional array. If you have the memory, test it. Three one dimensional arrays take about 35-40 ms to allocate. The multidimensional array takes between 5 and 10 seconds to allocate. I've tested this. That's why I posted my surprise. I can't figure out how Java handles multidimensional arrays to make them thousands of times slower. But try it. Write a short program with:
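The test program posted at this point did not survive in the thread. A minimal sketch of such a timing test might look like the following. The class name AllocationTiming, the method names and the default pixel count are assumptions, not the original code. For the full 4800 * 6000 size, the pixel-major array needs well over a gigabyte of heap, so the JVM may need a larger -Xmx setting.

```java
class AllocationTiming {
    // Holds a reference to the last allocation so it cannot be optimized away.
    static Object sink;

    // Pixel-major layout: one outer array plus 'pixels' separate 3-element arrays.
    static long timePixelMajor(int pixels) {
        long start = System.nanoTime();
        sink = new float[pixels][3];
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Channel-major layout: one outer array plus 3 long arrays.
    static long timeChannelMajor(int pixels) {
        long start = System.nanoTime();
        sink = new float[3][pixels];
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Three independent flat arrays.
    static long timeThreeFlat(int pixels) {
        long start = System.nanoTime();
        float[] r = new float[pixels];
        float[] g = new float[pixels];
        float[] b = new float[pixels];
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        sink = new Object[] { r, g, b };
        return elapsed;
    }

    public static void main(String[] args) {
        // Pass 28800000 (4800 * 6000) as the argument to match the thread.
        int pixels = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        System.out.println("float[pixels][3]: " + timePixelMajor(pixels) + " ms");
        System.out.println("float[3][pixels]: " + timeChannelMajor(pixels) + " ms");
        System.out.println("three float[]:    " + timeThreeFlat(pixels) + " ms");
    }
}
```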
On my computer the difference is a thousand fold or almost.
My print out of the above is 6342 (6.3 seconds) and 32 (32 milliseconds).
So not a thousand times slower - but 200 times slower.
The problem with clipping when mixing pixels is that you can be mixing two slightly off-gray pixels and wind up with very saturated pixels when the clips accumulate. The hues will also go astray. One has to keep track of the clipped portions to have accurate mixing.
...shawnlau, notice that the [3] is not in the same place in your test example compared to what prince_polka was suggesting...!
Oh, I see. Ordered that way there is only a minor difference in allocation on my machine. 62 ms for the multidimensional array, and 47 ms for the 3 flat arrays. I guess the order was my problem :)
And on reruns, sometimes the multidimensional array is allocated faster - sometimes both equal. So prince_polka was exactly right.
Thanks for the help! (Everyone)
"Note that rearranging the array that way only makes sense if you're processing the image one color channel at a time rather than pixel by pixel"
The float[4800 * 6000][3] makes for clearer code, but the performance is terrible. As you said, I've created millions of arrays instead of the three I would get with float[3][4800 * 6000].
I am working pixel by pixel, but I can do that with the 3 arrays.
I do think the 3 flat arrays named properly makes for code that is a little more clear.
float red = floatArray[0][index];
vs
I know what both mean, but if I come back to work on it in 3 years I think the latter way will be more recognizable.
Thanks for the help!
I like the idea of the class because it makes it abundantly clear what is going on and can encapsulate much of the pixel fiddling (methods). But I think I'm going to stick to floats. The shorts cut the array size in half, but the code looks convoluted. In this program, pen pressure is used and that arrives in the form of a float. Alpha is also used and mixed and that is best calculated with a float. The program only uses one float array so for the sake of 4800 * 6000 *2 bytes, I think I'll stick with the floats. In my tests, I've seen no performance differences between floats and shorts.
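The class suggestion this reply refers to is not included above. As a rough sketch only, a wrapper around three flat float arrays might look like this. The name FloatImage and its methods are hypothetical, and toArgb simply mirrors the packing expression shown earlier in the thread:

```java
class FloatImage {
    // One flat array per channel; fractions are kept until display time.
    final float[] r, g, b;

    FloatImage(int pixelCount) {
        r = new float[pixelCount];
        g = new float[pixelCount];
        b = new float[pixelCount];
    }

    void set(int i, float red, float green, float blue) {
        r[i] = red;
        g[i] = green;
        b[i] = blue;
    }

    // Blend a color into pixel i; alpha is assumed to be in 0..1 (e.g. pen pressure).
    void mix(int i, float red, float green, float blue, float alpha) {
        r[i] += (red - r[i]) * alpha;
        g[i] += (green - g[i]) * alpha;
        b[i] += (blue - b[i]) * alpha;
    }

    // Clip to integers only when producing the opaque ARGB display pixel.
    int toArgb(int i) {
        return 0xFF000000 | ((int) r[i] << 16) | ((int) g[i] << 8) | (int) b[i];
    }

    public static void main(String[] args) {
        FloatImage img = new FloatImage(1);
        img.set(0, 255f, 2f, 0f);
        img.mix(0, 2f, 255f, 0f, 0.5f);   // both channels become 128.5 internally
        System.out.println(Integer.toHexString(img.toArgb(0)));
    }
}
```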