Fast blurring

jobleonard

in Share your Work • 1 year ago

Hey all,

TL;DR: I've been working on faster variants of filter(BLUR), filter(DILATE) and filter(ERODE). Also plays nice with code that tends to play around with the pixel array a lot. Up to fourteen times faster than normal blur! About two times faster than Mario Klingemann's Super Fast Blur, but note that this is only because I specifically optimised for a 3x3 kernel. His version is faster for bigger kernels. See code in action here, with source:

http://jobleonard.nl/fastFilters/

Added downside/upside: wraps around (suited my needs)

I've been trying out some glow/dissolve effects in my code. Combining filter(BLUR), filter(DILATE) and filter(ERODE) gives results that look nice, but the problem is that filter(WHATEVER) is very slow, especially BLUR.

Note: I'm mostly coding on a netbook, without OpenGL. Otherwise I'd use the GLGraphics library and apply shaders. Also, I tend to use per-pixel manipulation a lot in my code, and we all know loadPixels()/updatePixels() and OpenGL don't really like one another (again, can be fixed with textures and shaders, but not on my netbook). Also it felt like a complete waste of time to use

updatePixels();
filter(SOMETHING);
loadPixels();

... every time I wanted to apply a filter to the pixels I was messing around with.

So, I've tried to build faster filters that manipulate the pixel array, then leave it alone. I actually ended up making a second buffer, but hey, it suits my needs and doesn't make the code that much more complex to use.

I made these assumptions:

- multiplication is faster than division

- shifting is faster than either of those

- reads are faster than writes

- numeral constants are faster than variables

- bitwise AND is probably quite cheap as well

- last but not least, the JVM is smart enough to optimise something as crazy as:

for (int i = 1; i < (width-1); ++i){
for (int j = 1; j < (height-1); ++j){
t[i + j*width] = (((((s[i + j*width] & 0xFF) << 2 ) +
(s[i + 1 + j*width] & 0xFF) +
(s[i - 1 + j*width] & 0xFF) +
(s[i + (j+1)*width] & 0xFF) +
(s[i + (j-1)*width] & 0xFF)) >> 3) & 0xFF) +
(((((s[i + j*width] & 0xFF00) << 2 ) +
(s[i + 1 + j*width] & 0xFF00) +
(s[i - 1 + j*width] & 0xFF00) +
(s[i + (j+1)*width] & 0xFF00) +
(s[i + (j-1)*width] & 0xFF00)) >> 3) & 0xFF00) +
(((((s[i + j*width] & 0xFF0000) << 2 ) +
(s[i + 1 + j*width] & 0xFF0000) +
(s[i - 1 + j*width] & 0xFF0000) +
(s[i + (j+1)*width] & 0xFF0000) +
(s[i + (j-1)*width] & 0xFF0000)) >> 3) & 0xFF0000) +
0xFF000000;
}
}
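To make the masking trick in the loop above easier to follow, here is a minimal standalone sketch in plain Java. It is not the code from the applet: the class and method names are made up for illustration. It blurs a single pixel with the 0 1 0 / 1 4 1 / 0 1 0 kernel, once directly on the packed ints and once with every channel unpacked, so the two results can be compared.

```java
// Illustrative sketch only: names are hypothetical, not from the original applet.
class PackedBlurDemo {

  // Blur one pixel directly on packed 0xAARRGGBB ints.
  // c is the centre pixel; l, r, u, d are its four neighbours.
  static int blurPacked(int c, int l, int r, int u, int d) {
    int blue = (((c & 0xFF) << 2) + (l & 0xFF) + (r & 0xFF)
        + (u & 0xFF) + (d & 0xFF)) >> 3;
    int green = (((c & 0xFF00) << 2) + (l & 0xFF00) + (r & 0xFF00)
        + (u & 0xFF00) + (d & 0xFF00)) >> 3;
    int red = (((c & 0xFF0000) << 2) + (l & 0xFF0000) + (r & 0xFF0000)
        + (u & 0xFF0000) + (d & 0xFF0000)) >> 3;
    // The final masks throw away the fractional bits that the shift
    // pushed into the channel below; alpha is forced to opaque.
    return (blue & 0xFF) + (green & 0xFF00) + (red & 0xFF0000) + 0xFF000000;
  }

  // Same kernel, but each channel is unpacked, averaged and repacked.
  static int blurUnpacked(int c, int l, int r, int u, int d) {
    int out = 0xFF000000;
    for (int shift = 0; shift <= 16; shift += 8) {
      int sum = 4 * ((c >> shift) & 0xFF) + ((l >> shift) & 0xFF)
          + ((r >> shift) & 0xFF) + ((u >> shift) & 0xFF)
          + ((d >> shift) & 0xFF);
      out |= (sum / 8) << shift;
    }
    return out;
  }

  public static void main(String[] args) {
    int a = blurPacked(0xFF204060, 0xFF000000, 0xFFFFFFFF, 0xFF808080, 0xFF102030);
    int b = blurUnpacked(0xFF204060, 0xFF000000, 0xFFFFFFFF, 0xFF808080, 0xFF102030);
    System.out.println(Integer.toHexString(a) + " " + Integer.toHexString(b));
  }
}
```

The packed version works because the largest per-channel sum (8 times 255) still fits below the sign bit even for the red channel, so nothing overflows before the shift.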

Here are the results of my own sloppy, not that trustworthy benchmark:

http://jobleonard.nl/fastFilters/filterbenchmarks.txt

filter(BLUR) 532467 - 26931 = 505536 factor: 1.00

divBlur 130583 - 26931 = 103652 factor: 4.87

shiftBlur 84658 - 26931 = 57727 factor: 8.75

shiftBlur3 82710 - 26931 = 55779 factor: 9.06

shiftBlur2 76136 - 26931 = 49205 factor: 10.2

shiftBlur1 73402 - 26931 = 46471 factor: 10.8

I only discovered Klingemann's filter after benchmarking, when I checked whether other people had done stuff like this. I made a quick addition: it's slightly faster than divBlur, but still slower than the shiftBlurs.

Obviously, these blurs have a different behavior from the filter(BLUR). Here are the kernels:

divBlur:

0 1 0

1 1 1

0 1 0

shiftBlur: any combination of A and B you want:

0 A 0

A B A

0 A 0

Also, you define how much it shifts. Note that by shifting so that the divisor is more/less than actual sum of the kernel, you can blur and fade to black or white at the same time! The linked example uses the shiftBlur3 kernel.
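As a rough illustration of the fade effect described above, the sketch below applies one blur step to a pixel whose neighbours all have the same value. The class and method names are hypothetical, and it works on a single channel value rather than on packed pixels. When the kernel weights add up to exactly 2^shift the value is preserved; when they add up to less it fades towards black, and when they add up to more it brightens.

```java
// Illustrative sketch only: names are hypothetical.
class ShiftFadeDemo {

  // One blur step for a pixel whose four neighbours all have the same value v.
  // a: weight of each neighbour, b: weight of the centre, divisor is 2^shift.
  // Note: results above 255 are not handled here.
  static int uniformStep(int v, int a, int b, int shift) {
    return (b * v + 4 * a * v) >> shift;
  }

  public static void main(String[] args) {
    // Kernel sum 8, divisor 8: value is preserved.
    System.out.println(uniformStep(200, 1, 4, 3));
    // Kernel sum 256, divisor 256: value is preserved.
    System.out.println(uniformStep(200, 51, 52, 8));
    // Kernel sum 15, divisor 16: value fades towards black.
    System.out.println(uniformStep(200, 3, 3, 4));
    // Kernel sum 9, divisor 8: value brightens.
    System.out.println(uniformStep(100, 1, 5, 3));
  }
}
```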

shiftBlur3:

00 51 00

51 52 51

00 51 00

shiftBlur2:

0 3 0

3 4 3

0 3 0

shiftBlur1:

0 1 0

1 4 1

0 1 0

Any ideas for further improvements are welcome. I'll probably update this in the coming days to accept any PImage or PGraphics, and versions that don't wrap.

Again, applet version with ridiculously long and ugly source code can be found here:

http://jobleonard.nl/fastFilters/

EDIT: Updated with Klingemann's optimisation. Now between ten and fourteen times faster than regular blur on my netbook!

Replies(19)

jbum

Re: Fast blurring

1 year ago

These are really nice. I just tested your shiftBlur1-3 on an old sketch of mine that was pulling about 10 fps on the desktop. Now it gets 60 fps, and looks the same. Sweet!

To get one pass of shiftBlur1, I used

loadPixels();

shiftBlur1(pixels, screenBuf);

arrayCopy(screenBuf, pixels);

In place of filter(BLUR)

I also tried the incorrect

shiftBlur1(pixels, pixels);

Which pulls about 75 fps on my sketch. The artifacts from using the same array are actually kind of interesting, and might be acceptable in some cases if speed is a huge issue.
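Where the artifacts come from can be seen in a small sketch. The code below is a simplified stand-in, not shiftBlur1 itself: it uses single-channel values, hypothetical names, and modulo arithmetic for the wrap-around. With separate source and target arrays every output is computed from untouched input; with the same array passed twice, pixels processed later read neighbours that have already been blurred.

```java
// Illustrative sketch only: simplified single-channel stand-in, names are hypothetical.
class InPlaceBlurDemo {

  // 0 1 0 / 1 4 1 / 0 1 0 kernel, shift by 3, edges wrap around.
  // Pixels are processed row by row, left to right.
  static void blur(int[] s, int[] t, int w, int h) {
    for (int y = 0; y < h; y++) {
      int up = ((y + h - 1) % h) * w;
      int row = y * w;
      int down = ((y + 1) % h) * w;
      for (int x = 0; x < w; x++) {
        int left = (x + w - 1) % w;
        int right = (x + 1) % w;
        t[row + x] = ((s[row + x] << 2) + s[row + left] + s[row + right]
            + s[up + x] + s[down + x]) >> 3;
      }
    }
  }

  public static void main(String[] args) {
    int w = 4, h = 4;
    int[] src = new int[w * h];
    src[5] = 248; // a single bright pixel at (1, 1)

    int[] twoBuffers = new int[w * h];
    blur(src, twoBuffers, w, h); // correct: source is never modified

    int[] inPlace = src.clone();
    blur(inPlace, inPlace, w, h); // same array as source and target

    System.out.println(java.util.Arrays.toString(twoBuffers));
    System.out.println(java.util.Arrays.toString(inPlace));
  }
}
```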

I think shiftBlur3 looks the nicest.

jbum

Re: Fast blurring

1 year ago

On some sketches, I'm noticing a cyan shift on the left/right borders, as compared to the built-in blur. Try this one, for example:

int[] screenBuf;
int numPixels;
void setup()
{
size(500,500);
numPixels = width*height;
screenBuf = new int[numPixels];
loadPixels();
}
void draw()
{
shiftBlur3(pixels, screenBuf); // 60 fps
arrayCopy(screenBuf, pixels);
for (int i = 0; i < 1000; ++i) {
int n = int(random(numPixels));
pixels[n] ^= int(random(0x1000000));
}
updatePixels();
}

jobleonard

Re: Re: Fast blurring

1 year ago

Hah, I have a mild form of red-green colourblindness, so I didn't notice! :D

Anyway, this had me stumped quite a bit. I had a bug similar to this before. It was a result of bad masking (mask with 0xFF00 instead of 0xFF000000 or something like that). So I checked again, didn't see it. So I modified your code a bit so that even I could notice, and the console output shows what is happening. The readout goes from 0xFFFFFFFF to 0xFF00FFFF in exactly 256 frames. Short sample:

frame 158

([0-10], 0): FF17FFFF FF1CFFFF FF23FFFF FF2AFFFF FF31FFFF FF37FFFF

([0-10], 1): FF15FFFF FF1CFFFF FF23FFFF FF2AFFFF FF30FFFF FF37FFFF

(0, [0-10]): FF17FFFF FF15FFFF FF15FFFF FF15FFFF FF15FFFF FF15FFFF

frame 159

([0-10], 0): FF17FFFF FF1CFFFF FF23FFFF FF29FFFF FF30FFFF FF36FFFF

([0-10], 1): FF15FFFF FF1CFFFF FF22FFFF FF29FFFF FF2FFFFF FF36FFFF

(0, [0-10]): FF17FFFF FF15FFFF FF15FFFF FF15FFFF FF15FFFF FF15FFFF

frame 160

([0-10], 0): FF17FFFF FF1CFFFF FF22FFFF FF28FFFF FF2FFFFF FF35FFFF

([0-10], 1): FF15FFFF FF1BFFFF FF21FFFF FF28FFFF FF2EFFFF FF35FFFF

(0, [0-10]): FF17FFFF FF15FFFF FF14FFFF FF14FFFF FF14FFFF FF14FFFF

It can't be the lookup logic, because if EVERYTHING is white, it doesn't matter from where the pixel comes. The top line shows that the part of the code responsible for the corners also isn't the cause - the only logical culprit is the code responsible for the sides.

So I had another look. Turned out it was bad masking after all... and this is why writing code like this sucks :P.

This is the code that is responsible for the left/right edges, with the bug in bold:

// left and right edge (minus corner pixels)
for (int j = 1; j < (height-1); ++j){
t[j*width] = (((52 * (s[j*width] & 0xFF) +
((s[j*width + 1] & 0xFF) +
(s[(j+1)*width - 1] & 0xFF) +
(s[(j+1)*width] & 0xFF) +
(s[(j-1)*width] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((52 * (s[j*width] & 0xFF00) +
((s[j*width + 1] & 0xFF00) +
(s[(j+1)*width - 1] & 0xFF00) +
(s[(j+1)*width] & 0xFF00) +
(s[(j-1)*width] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((52 * (s[j*width] & 0xFF0000) +
((s[j*width + 1] & 0xFF0000) +
(s[(j+1)*width - 1] & 0xFF0000) +
(s[(j+1)*width] & 0xFF0000) +
(s[(j-1)*width] & 0xFF0000) ) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
t[(j+1)*width - 1] = (((52 * (s[(j+1)*width - 1] & 0xFF) +
((s[j*width] & 0xFF) +
(s[(j+1)*width - 2] & 0xFF) +
(s[(j+2)*width - 1] & 0xFF) +
(s[j*width - 1] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((52 * (s[(j+1)*width - 1] & 0xFF00) +
((s[j*width] & 0xFF00) +
(s[(j+1)*width - 2] & 0xFF00) +
(s[(j+2)*width - 1] & 0xFF00) +
(s[j*width - 1] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((52 * (s[(j+1)*width - 1] & 0xFF0000) +
((s[j*width] & 0xFF0000) +
(s[(j+1)*width - 2] & 0xFF0000) +
(s[(j+2)*width - 1] & 0xFF0000) +
(s[j*width - 1] & 0xFF)) * 51) >>> 8) & 0xFF0000) + // <-- the bug: this mask should be 0xFF0000
0xFF000000;
}

AAAAARGH!!! (sorry for the long post, but I had to share this... it's worthy of one of those "bad code" blogs..)
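For readers who want to see the effect of that one wrong mask in isolation, here is a minimal sketch with hypothetical names. It computes only the red channel of one pixel with the shiftBlur3 weights (52 for the centre, 51 for each neighbour, shifted by 8), and lets the mask of the last neighbour be passed in. It only illustrates a single step on an all-white input and does not try to reproduce the frame-by-frame readout above.

```java
// Illustrative sketch only: names are hypothetical.
class MaskBugDemo {

  // Red channel of one pixel with shiftBlur3 weights.
  // lastMask is the mask applied to the fourth neighbour:
  // 0xFF0000 is the correct red mask, 0xFF is the buggy blue mask.
  static int redChannel(int c, int n1, int n2, int n3, int n4, int lastMask) {
    // The intermediate sum can exceed the signed int range, which is
    // why the unsigned shift >>> is needed here.
    return ((52 * (c & 0xFF0000)
        + ((n1 & 0xFF0000) + (n2 & 0xFF0000) + (n3 & 0xFF0000)
        + (n4 & lastMask)) * 51) >>> 8) & 0xFF0000;
  }

  public static void main(String[] args) {
    int white = 0xFFFFFFFF;
    int good = redChannel(white, white, white, white, white, 0xFF0000);
    int bad = redChannel(white, white, white, white, white, 0xFF);
    System.out.println(Integer.toHexString(good) + " vs " + Integer.toHexString(bad));
  }
}
```

With the wrong mask, the fourth neighbour contributes its blue byte where a red byte was expected, so the red sum comes up short and only red loses intensity, which would look like a cyan tint.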

jeff_g

Re: Fast blurring

1 year ago

Nice, you might want to consider making a small library. I've been looking at trying to do some FXAA AntiAliasing based on this site myself. http://copypastaresearch.tumblr.com/tagged/anti-aliasing

jbum

Re: Fast blurring

1 year ago

Yes, that fixes it. Replacing line 34 above, with

(s[j*width - 1] & 0xFF0000 )) * 51) >>> 8) & 0xFF0000) +

Quasimondo

Re: Fast blurring

1 year ago

Since you say that your blur is 2x faster than mine maybe you should mention in the small print that this is only the case for a 3x3 kernel for which you have nicely optimized yours. My blur on the other hand was optimized for big radii which as a consequence has tradeoffs when it comes to small kernels.

jobleonard

Re: Re: Fast blurring

1 year ago

My apologies! I simply hadn't thought of that (it was late when I posted this). I'll update the opening post immediately.

EDIT: Also, I'm thinking of turning this into a small library that also works with PImage and PGraphics. Can I include your code under the name of KlingemannBlur, with proper attribution?

Quasimondo

Re: Fast blurring

1 year ago

I'm curious: does the JVM really optimize all those j*width multiplications better than if you would use an accumulating index together with an addition and a subtraction - something like this:

int yoffset = width;
for (int j = 1; j < (height-1); ++j){
t[yoffset] = (((52 * (s[yoffset] & 0xFF) +
((s[yoffset + 1] & 0xFF) +
(s[yoffset + width - 1] & 0xFF) +
(s[yoffset + width] & 0xFF) +
(s[yoffset-width] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((52 * (s[yoffset] & 0xFF00) +
((s[yoffset + 1] & 0xFF00) +
(s[yoffset+width - 1] & 0xFF00) +
(s[yoffset+width] & 0xFF00) +
(s[yoffset-width] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((52 * (s[yoffset] & 0xFF0000) +
((s[yoffset + 1] & 0xFF0000) +
(s[yoffset+width - 1] & 0xFF00
[...]
yoffset += width;
}
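The two indexing styles being compared can be put side by side in a small sketch. This is a simplified illustration with hypothetical names, working on single-channel values for the left edge only; it shows that the running offset visits the same elements as j*width, not which one is faster.

```java
// Illustrative sketch only: single-channel values, names are hypothetical.
class RowOffsetDemo {

  // Left-edge column (wrapping to the right edge), index recomputed with j*width.
  static void leftEdgeMultiply(int[] s, int[] t, int width, int height) {
    for (int j = 1; j < height - 1; ++j) {
      t[j * width] = (52 * s[j * width]
          + (s[j * width + 1] + s[(j + 1) * width - 1]
          + s[(j + 1) * width] + s[(j - 1) * width]) * 51) >>> 8;
    }
  }

  // Same computation with an accumulating row offset instead of j*width.
  static void leftEdgeRunning(int[] s, int[] t, int width, int height) {
    int yoffset = width; // start of row 1
    for (int j = 1; j < height - 1; ++j) {
      t[yoffset] = (52 * s[yoffset]
          + (s[yoffset + 1] + s[yoffset + width - 1]
          + s[yoffset + width] + s[yoffset - width]) * 51) >>> 8;
      yoffset += width; // advance to the next row
    }
  }

  public static void main(String[] args) {
    int w = 5, h = 6;
    int[] s = new int[w * h];
    for (int i = 0; i < s.length; i++) s[i] = (i * 37) % 256;
    int[] a = new int[w * h];
    int[] b = new int[w * h];
    leftEdgeMultiply(s, a, w, h);
    leftEdgeRunning(s, b, w, h);
    System.out.println(java.util.Arrays.equals(a, b));
  }
}
```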

jobleonard

Re: Re: Fast blurring

1 year ago

The assumption of JVM optimisation is still untested, together with the binary AND. All the other ones have been shown to be true by the different blur implementations. In other words: only one way to find out!

http://jobleonard.nl/fastFilters/optimisationTest/

In order of testing: no filter, shiftBlur1, shiftBlur1 Klingemann Edition

time: 8946 benchmarked time: 8595

time: 56499 benchmarked time: 47543

time: 101512 benchmarked time: 44965

Done benchmarking.

time: 109900 benchmarked time: 8336

time: 156670 benchmarked time: 46762

time: 200304 benchmarked time: 43586

Done benchmarking.

time: 208690 benchmarked time: 8335

time: 256158 benchmarked time: 47460

time: 298915 benchmarked time: 42709

Done benchmarking.

8595+8336+8335 = 25266

47543+46762+47460 = 141765

44965+43586+42709 = 131260

141765 - 25266 = 116499 factor: 1.00

131260 - 25266 = 105994 factor: 1.10

jobleonard

Re: Fast blurring

1 year ago

Small update: made a variant of shiftBlur that accepts custom kernels per channel. See demo here:

http://jobleonard.nl/fastFilters/shiftBlurChannelDemo/

Above is the result of:

int AR = 1;

int BR = 4;

int CR = 3;

int AG = 3;

int BG = 3; // fade out Green, 15/16

int CG = 4;

int AB = 52;

int BB = 52;

int CB = 8; // overflow blue, 257/256

Quasimondo

Re: Fast blurring

1 year ago

As long as you include a link back to http://incubator.quasimondo.com/processing/superfast_blur.php feel free to repackage it.

jobleonard

Updated with new optimisation

1 year ago

13-12-2011

Implemented Klingemann's optimisation. New benchmark.

Asus 1005-HA, Intel Atom N280

Processing 1.5.1, 500 by 500 pixels,

P2D renderer, 10000 frames per filter.

Blur filters, in order of testing:

no filter, shiftBlur1, shiftBlur2, shiftBlur3,

shiftBlur, shiftBlurChannel, divBlur,

klingemannBlur, filter(BLUR)

filter             time      benchmarked time   fps     factor
no filter          84212     83673              119.5
shiftBlur1         507580    423358             23.62   14.0
shiftBlur2         956199    448575             22.29   13.0
shiftBlur3         1466263   510018             19.60   11.1
shiftBlur          2052991   586675             17.04   9.46
shiftBlurChannel   2639369   586318             17.05   9.47
divBlur            3625350   985921             10.14   5.27
klingemannBlur     4644192   1018742            9.816   5.09
filter(BLUR)       9491191   4846888            2.063   1.00

Dilate/Erode filters, in order of testing:

no filter, erode1, filter(ERODE),

dilate1, filter(DILATE)

filter           time       benchmarked time   fps     factor
no filter        9575089    83381              119.9
erode1           10044912   469815             21.28   1.35
filter(ERODE)    10650963   606003             16.50   1.00
dilate1          11119273   468246             21.35   1.35
filter(DILATE)   11723330   604008             16.55   1.00

davbol

Re: Fast blurring

1 year ago

nice work, a couple of comments/suggestions tho..

-- you should also code a "slow but known-to-be-correct" version of your filter (pick a kernel, say the [0,1,0,1,4,1,0,1,0]/8 version) and run it against same input data, and verify that your "fast" version is bit-by-bit perfect before measuring its performance. (i'm suspecting a problem on the last row of some?/all? blurs, though i haven't attempted to debug it)

-- second, the "contents" of the pixel array shouldn't matter at all, so get rid of all that drawing, and just do:

startTimer();

for (int i=n;i-->0;) someBlur(s,t); // and n should be relatively big, say 1000 or more

endTimer();

-- call noLoop(), do ALL of your benchmarking in one pass through draw(), then exit

-- System.nanoTime() is far better than millis() for this sort of stuff. example:

long startNanos = System.nanoTime();

// do stuff here

long elapsedNanos = System.nanoTime() - startNanos;

double elapsedSeconds = (double)(elapsedNanos) / 1e9;

-- you should first time a do-nothing loop, then subtract out that "invariant" or "overhead" time from subsequent tests before calculating any "n times faster" than something else

void nullBlur(int [] s, int [] t) {}

long startNanos = System.nanoTime();

for (int i=n; i-->0;) nullBlur(s,t);

long overheadNanos = System.nanoTime() - startNanos;

-- ymmv with JIT compilers, but here's another way to reduce indexing calcs (and LOC) when edges wrap:

void bollingerBlur(int[] s, int[] t) {
int ym1ofs = (height-2) * width;
int yofs = (height-1) * width;
int yp1ofs = 0;
for (int y=height; y-->0;) {
    int xm1ofs = width-2;
    int xofs = width-1;
    int xp1ofs = 0;
    for (int x=width; x-->0;) {
      t[yofs+xofs] =
        (((((s[yofs+xofs] & 0xFF) << 2) +
            (s[yofs+xp1ofs] & 0xFF) +
            (s[yofs+xm1ofs] & 0xFF) +
            (s[ym1ofs+xofs] & 0xFF) +
            (s[yp1ofs+xofs] & 0xFF)) >> 3) & 0xFF) +
        (((((s[yofs+xofs] & 0xFF00) << 2) +
            (s[yofs+xp1ofs] & 0xFF00) +
            (s[yofs+xm1ofs] & 0xFF00) +
            (s[ym1ofs+xofs] & 0xFF00) +
            (s[yp1ofs+xofs] & 0xFF00)) >> 3) & 0xFF00) +
        (((((s[yofs+xofs] & 0xFF0000) << 2) +
            (s[yofs+xp1ofs] & 0xFF0000) +
            (s[yofs+xm1ofs] & 0xFF0000) +
            (s[ym1ofs+xofs] & 0xFF0000) +
            (s[yp1ofs+xofs] & 0xFF0000)) >> 3) & 0xFF0000) +
        0xFF000000;
      xm1ofs = xofs;
      xofs = xp1ofs;
      xp1ofs++;
    }
    ym1ofs = yofs;
    yofs = yp1ofs;
    yp1ofs += width;
}
}
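The first suggestion in the list above (check a fast blur against a slow but obviously correct one) could look roughly like the sketch below. All names are hypothetical. The fast version is adapted from bollingerBlur above, with width and height passed in as parameters; the reference version uses modulo for the wrap-around and handles one channel at a time.

```java
// Illustrative sketch only: names are hypothetical; fastBlur is adapted
// from bollingerBlur above with explicit w and h (both must be at least 2).
class BlurCheckDemo {

  // Slow reference: kernel [0,1,0,1,4,1,0,1,0]/8, wrapping edges.
  static void referenceBlur(int[] s, int[] t, int w, int h) {
    for (int y = 0; y < h; y++) {
      for (int x = 0; x < w; x++) {
        int c = s[y * w + x];
        int l = s[y * w + (x + w - 1) % w];
        int r = s[y * w + (x + 1) % w];
        int u = s[((y + h - 1) % h) * w + x];
        int d = s[((y + 1) % h) * w + x];
        int out = 0xFF000000;
        for (int shift = 0; shift <= 16; shift += 8) {
          int sum = 4 * ((c >> shift) & 0xFF) + ((l >> shift) & 0xFF)
              + ((r >> shift) & 0xFF) + ((u >> shift) & 0xFF)
              + ((d >> shift) & 0xFF);
          out |= (sum / 8) << shift;
        }
        t[y * w + x] = out;
      }
    }
  }

  // One channel of the packed blur; mask selects the channel.
  static int channel(int c, int l, int r, int u, int d, int mask) {
    return ((((c & mask) << 2) + (l & mask) + (r & mask)
        + (u & mask) + (d & mask)) >> 3) & mask;
  }

  // Fast version under test: running offsets, packed channels.
  static void fastBlur(int[] s, int[] t, int w, int h) {
    int ym1 = (h - 2) * w, y0 = (h - 1) * w, yp1 = 0;
    for (int y = h; y-- > 0;) {
      int xm1 = w - 2, x0 = w - 1, xp1 = 0;
      for (int x = w; x-- > 0;) {
        int c = s[y0 + x0], l = s[y0 + xm1], r = s[y0 + xp1];
        int u = s[ym1 + x0], d = s[yp1 + x0];
        t[y0 + x0] = channel(c, l, r, u, d, 0xFF)
            + channel(c, l, r, u, d, 0xFF00)
            + channel(c, l, r, u, d, 0xFF0000)
            + 0xFF000000;
        xm1 = x0; x0 = xp1; xp1++;
      }
      ym1 = y0; y0 = yp1; yp1 += w;
    }
  }

  // Index of the first differing pixel, or -1 if the arrays match.
  static int firstMismatch(int[] a, int[] b) {
    for (int i = 0; i < a.length; i++) {
      if (a[i] != b[i]) return i;
    }
    return -1;
  }

  public static void main(String[] args) {
    int w = 7, h = 5;
    int[] s = new int[w * h];
    int v = 12345;
    for (int i = 0; i < s.length; i++) {
      v = v * 1103515245 + 12345; // simple deterministic pseudo-random fill
      s[i] = 0xFF000000 | (v >>> 8);
    }
    int[] fast = new int[w * h];
    int[] ref = new int[w * h];
    fastBlur(s, fast, w, h);
    referenceBlur(s, ref, w, h);
    System.out.println("first mismatch: " + firstMismatch(fast, ref));
  }
}
```

Running the comparison on noisy input rather than on a flat colour matters, because a lookup that reads the wrong neighbour goes unnoticed when every pixel is the same.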

hth

jobleonard

Re: Re: Fast blurring

1 year ago

Thanks for the feedback.

EDIT: I removed the response, being a big wall of text. It was about why I thought my benchmark wasn't completely wrong. And I was wrong about that.

jobleonard

Re: Re: Fast blurring

1 year ago

As for your bollingerBlur suggestion, I'll try it out. It's a lot less code to write and update every time a new optimisation is suggested, that's for sure ;).

jobleonard

Re: Fast blurring

1 year ago

println("testing shiftBlur1()");
t0 = System.nanoTime();
for (int i = 0; i++ < 10000;) {
shiftBlur1(pixels, pixBuf);
shiftBlur1(pixBuf, pixels);
}
t1 = System.nanoTime();
println("time: " + (double)t1 / 1e9 + " \tbenchmarked time: " + (double)(t1 - t0) / 1e9);
println("testing bollingerBlur()");
t0 = System.nanoTime();
for (int i = 0; i++ < 10000;) {
bollingerBlur(pixels, pixBuf);
bollingerBlur(pixBuf, pixels);
}
t1 = System.nanoTime();
println("time: " + (double)t1 / 1e9 + " \tbenchmarked time: " + (double)(t1 - t0) / 1e9);
exit();

testing shiftBlur1()
time: 29778.501162505    benchmarked time: 578.760302973
testing bollingerBlur()
time: 30212.235767514    benchmarked time: 433.732702389

Here we go again...

(of course, the old code is still useful as a template for the non-wrapping versions)

jobleonard

Re: Fast blurring

1 year ago

Dammit... I just spent half an hour writing up an explanation here on the new filters, what is happening in the demo, etc. Then ZOHO 404's on me as I press publish...

Sigh...

Well, a new demo, with both updated and new filters here:

Pixel Blur filter demos

You see the same image filtered in six ways:

1 2 3

4 5 6

1: blur filters that round down

2: blur filters that round up if fraction bigger than half

3: show difference between 1 and 2

4: marks active pixels in 1

5: marks active pixels in 2

6: marks active pixels in 3 (so effectively the difference between 4 and 5)

You can change the size/amount of circles with arrow keys, and switch between blurs with 1 to 7.

Most important changes:

- dilate/erode faster, and a new box-shaped version (instead of standard diamond)

- shiftBlur4:

25 31 25

31 32 31

25 31 25

- versions that round fractions up when they're bigger than one half - different fading behaviour

- early implementation of HDR pixel manipulation, only shiftBlur3 supports it for now. Benefit is that the blurring behaviour has more conservation of total pixel value, and eventually results in one pixel value for all pixels.

- showDifference, showInvertedDifference and markDifference filters were made for "debugging" purposes. When comparing the current frame to the last, they show which pixels changed before and after applying a filter. Also useful for showing the difference between blur filters that round up and that don't. See demo.
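The rounding variants in the list above are not shown in the thread, so the following is only a guess at how "round up when the fraction is bigger than one half" could be done with shifts. The names are hypothetical, and the 15/16 fade used to compare the two behaviours is just an example kernel sum. Adding one less than half the divisor before shifting rounds up only when the discarded fraction is strictly above one half.

```java
// Illustrative sketch only: an assumed implementation, names are hypothetical.
class RoundingDemo {

  // Plain shift: always rounds down.
  static int roundDown(int sum, int shift) {
    return sum >> shift;
  }

  // Rounds up only when the discarded fraction is strictly bigger than one half.
  static int roundUpAboveHalf(int sum, int shift) {
    return (sum + (1 << (shift - 1)) - 1) >> shift;
  }

  // Repeatedly fade a value by 15/16 and return where it ends up.
  static int settle(int v, boolean roundUp) {
    for (int i = 0; i < 1000; i++) {
      int sum = 15 * v;
      v = roundUp ? roundUpAboveHalf(sum, 4) : roundDown(sum, 4);
    }
    return v;
  }

  public static void main(String[] args) {
    System.out.println(settle(255, false));
    System.out.println(settle(255, true));
  }
}
```

In this sketch a value fading at 15/16 reaches 0 when rounding down but stalls at 7 with the round-up variant, which is one way the two families of filters can end up fading differently.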

Again, please comment, give feedback.

(Also, I just realised that in my source I didn't mention that Klingemann and Bollinger helped with optimisations. Mental note: add proper attribution.)

jobleonard

Re: Fast blurring

1 year ago

I forgot to mention: I'm halfway through updating this to be more "generic". Part of this update is that the filters now expect a width and height parameter along with the pixel arrays. And no benchmark this time - I was more focused on the visual aspects.

BTW, here are some things you can learn about what happens in the filters by paying attention to the bottom images.

First: switch from a low-precision filter (shiftBlur1, for example) to a higher-precision one (shiftBlur4). What you see pop up are all the pixels that were in a state of equilibrium, but only because of the lack of precision. I thought that was neat :).

Second: try running the sketch for a while with a low number of circles. Then (epilepsy warning!) keep 7 pressed. See that? Whenever you switch to HDR blur, it extracts the pixel information from the 8-bit pixel array. To render it to screen, the HDR array is converted back to 8 bits. So what you see is the result of effectively converting back and forth between regular 8-bit precision and this "28" bit precision HDR thingie I'm working on. Also, it effectively reduces the filter to an 8-bit shiftBlur3 that rounds down.

maujabur

Re: Fast blurring

1 year ago

Hi, I was looking for a fast blur, thanks!

As it turned out I ended up making it generic, here's the example for blur3 :

// 00 51 00
// 51 52 51
// 00 51 00
// 256 in total, >>8
void shiftBlur3x(PImage source){
int yOffset;
int sWidth = source.width;
int sHeight = source.height;
int[] s,t;
source.loadPixels();
s = source.pixels;
t = new int[sWidth* sHeight];
for (int i = 1; i < (sWidth-1); ++i){
yOffset = sWidth*(sHeight-1);
// top edge (minus corner pixels)
t[i] = (((((s[i] & 0xFF) * 52) +
((s[i+1] & 0xFF) +
(s[i-1] & 0xFF) +
(s[i + sWidth] & 0xFF) +
(s[i + yOffset] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((((s[i] & 0xFF00) * 52) +
((s[i+1] & 0xFF00) +
(s[i-1] & 0xFF00) +
(s[i + sWidth] & 0xFF00) +
(s[i + yOffset] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((((s[i] & 0xFF0000) * 52) +
((s[i+1] & 0xFF0000) +
(s[i-1] & 0xFF0000) +
(s[i + sWidth] & 0xFF0000) +
(s[i + yOffset] & 0xFF0000)) * 51) >>> 8) & 0xFF0000) +
0xFF000000; //ignores transparency
// bottom edge (minus corner pixels)
t[i + yOffset] = (((((s[i + yOffset] & 0xFF) * 52) +
((s[i - 1 + yOffset] & 0xFF) +
(s[i + 1 + yOffset] & 0xFF) +
(s[i + yOffset - sWidth] & 0xFF) +
(s[i] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((((s[i + yOffset] & 0xFF00) * 52) +
((s[i - 1 + yOffset] & 0xFF00) +
(s[i + 1 + yOffset] & 0xFF00) +
(s[i + yOffset - sWidth] & 0xFF00) +
(s[i] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((((s[i + yOffset] & 0xFF0000) * 52) +
((s[i - 1 + yOffset] & 0xFF0000) +
(s[i + 1 + yOffset] & 0xFF0000) +
(s[i + yOffset - sWidth] & 0xFF0000) +
(s[i] & 0xFF0000)) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
// central square
for (int j = 1; j < (sHeight-1); ++j){
yOffset = j*sWidth;
t[i + yOffset] = (((((s[i + yOffset] & 0xFF) * 52) +
((s[i + 1 + yOffset] & 0xFF) +
(s[i - 1 + yOffset] & 0xFF) +
(s[i + yOffset + sWidth] & 0xFF) +
(s[i + yOffset - sWidth] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((((s[i + yOffset] & 0xFF00) * 52) +
((s[i + 1 + yOffset] & 0xFF00) +
(s[i - 1 + yOffset] & 0xFF00) +
(s[i + yOffset + sWidth] & 0xFF00) +
(s[i + yOffset - sWidth] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((((s[i + yOffset] & 0xFF0000) * 52) +
((s[i + 1 + yOffset] & 0xFF0000) +
(s[i - 1 + yOffset] & 0xFF0000) +
(s[i + yOffset + sWidth] & 0xFF0000) +
(s[i + yOffset - sWidth] & 0xFF0000)) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
}
}
// left and right edge (minus corner pixels)
for (int j = 1; j < (sHeight-1); ++j){
yOffset = j*sWidth;
t[yOffset] = (((((s[yOffset] & 0xFF) * 52) +
((s[yOffset + 1] & 0xFF) +
(s[yOffset + sWidth - 1] & 0xFF) +
(s[yOffset + sWidth] & 0xFF) +
(s[yOffset - sWidth] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((((s[yOffset] & 0xFF00) * 52) +
((s[yOffset + 1] & 0xFF00) +
(s[yOffset + sWidth - 1] & 0xFF00) +
(s[yOffset + sWidth] & 0xFF00) +
(s[yOffset - sWidth] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((((s[yOffset] & 0xFF0000) * 52) +
((s[yOffset + 1] & 0xFF0000) +
(s[yOffset + sWidth - 1] & 0xFF0000) +
(s[yOffset + sWidth] & 0xFF0000) +
(s[yOffset - sWidth] & 0xFF0000) ) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
t[yOffset + sWidth - 1] = (((((s[yOffset + sWidth - 1] & 0xFF) * 52) +
((s[yOffset] & 0xFF) +
(s[yOffset + sWidth - 2] & 0xFF) +
(s[yOffset + (sWidth<<1) - 1] & 0xFF) +
(s[yOffset - 1] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((((s[yOffset + sWidth - 1] & 0xFF00) * 52) +
((s[yOffset] & 0xFF00) +
(s[yOffset + sWidth - 2] & 0xFF00) +
(s[yOffset + (sWidth<<1) - 1] & 0xFF00) +
(s[yOffset - 1] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((((s[yOffset + sWidth - 1] & 0xFF0000) * 52) +
((s[yOffset] & 0xFF0000) +
(s[yOffset + sWidth - 2] & 0xFF0000) +
(s[yOffset + (sWidth<<1) - 1] & 0xFF0000) +
(s[yOffset - 1] & 0xFF0000)) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
}
// corner pixels
t[0] = (((((s[0] & 0xFF) * 52) +
((s[1] & 0xFF) +
(s[sWidth-1] & 0xFF) +
(s[sWidth] & 0xFF) +
(s[sWidth*(sHeight-1)] & 0xFF)) * 51) >>> 8) & 0xFF) +
(((((s[0] & 0xFF00) * 52) +
((s[1] & 0xFF00) +
(s[sWidth-1] & 0xFF00) +
(s[sWidth] & 0xFF00) +
(s[sWidth*(sHeight-1)] & 0xFF00)) * 51) >>> 8) & 0xFF00) +
(((((s[0] & 0xFF0000) * 52) +
((s[1] & 0xFF0000) +
(s[sWidth-1] & 0xFF0000) +
(s[sWidth] & 0xFF0000) +
(s[sWidth*(sHeight-1)] & 0xFF0000)) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
t[sWidth - 1 ] = (((((s[sWidth-1] & 0xFF) * 52) +
((s[sWidth-2] & 0xFF) +
(s[0] & 0xFF) +
(s[(sWidth<<1) - 1] & 0xFF) +
(s[sWidth*sHeight-1] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((((s[sWidth-1] & 0xFF00) * 52) +
((s[sWidth-2] & 0xFF00) +
(s[0] & 0xFF00) +
(s[(sWidth<<1) - 1] & 0xFF00) +
(s[sWidth*sHeight-1] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((((s[sWidth-1] & 0xFF0000) * 52) +
((s[sWidth-2] & 0xFF0000) +
(s[0] & 0xFF0000) +
(s[(sWidth<<1) - 1] & 0xFF0000) +
(s[sWidth*sHeight-1] & 0xFF0000) ) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
t[sWidth * sHeight - 1] = (((((s[sWidth*sHeight-1] & 0xFF) * 52) +
((s[sWidth-1] & 0xFF) +
(s[sWidth*(sHeight-1)-1] & 0xFF) +
(s[sWidth*sHeight-2] & 0xFF) +
(s[sWidth*(sHeight-1)] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((((s[sWidth*sHeight-1] & 0xFF00) * 52) +
((s[sWidth-1] & 0xFF00) +
(s[sWidth*(sHeight-1)-1] & 0xFF00) +
(s[sWidth*sHeight-2] & 0xFF00) +
(s[sWidth*(sHeight-1)] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((((s[sWidth*sHeight-1] & 0xFF0000) * 52) +
((s[sWidth-1] & 0xFF0000) +
(s[sWidth*(sHeight-1)-1] & 0xFF0000) +
(s[sWidth*sHeight-2] & 0xFF0000) +
(s[sWidth*(sHeight-1)] & 0xFF0000) ) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
t[sWidth *(sHeight-1)] = (((((s[sWidth*(sHeight-1)] & 0xFF) * 52) +
((s[sWidth*(sHeight-1) + 1] & 0xFF) +
(s[sWidth*sHeight-1] & 0xFF) +
(s[sWidth*(sHeight-2)] & 0xFF) +
(s[0] & 0xFF) ) * 51) >>> 8) & 0xFF) +
(((((s[sWidth*(sHeight-1)] & 0xFF00) * 52) +
((s[sWidth*(sHeight-1) + 1] & 0xFF00) +
(s[sWidth*sHeight-1] & 0xFF00) +
(s[sWidth*(sHeight-2)] & 0xFF00) +
(s[0] & 0xFF00) ) * 51) >>> 8) & 0xFF00) +
(((((s[sWidth*(sHeight-1)] & 0xFF0000) * 52) +
((s[sWidth*(sHeight-1) + 1] & 0xFF0000) +
(s[sWidth*sHeight-1] & 0xFF0000) +
(s[sWidth*(sHeight-2)] & 0xFF0000) +
(s[0] & 0xFF0000) ) * 51) >>> 8) & 0xFF0000) +
0xFF000000;
source.pixels = t;
source.updatePixels();
}
