Variable casting speed

Moxl · January 2017

I just ran a little test and this

void setup() {
}

void draw() {
  for (int i=0; i<1000000000; i++) {
    float var = i;
  }
}

runs about 3 times faster than this

float var;

void setup() {
}

void draw() {
  for (int i=0; i<1000000000; i++) {
    var = i;
  }
}

which is (at least to an ignorant like me) counter-intuitive, because in the first example, variable var is cast a billion times, whereas in the second one, it's merely changed in value. How comes?

GoToLoop · January 2017

In both cases, an int value is auto-upcasted to float type before each assignment to variable var.

However, for the 2nd example, var is an instance field of class PApplet, stored in memory's heap region.

And the other is a local variable scoped to a for ( ; ; )'s curly block, stored in memory's stack region.

Stack region got faster access than heap's.
And all local variables (including parameters) are faster than object fields. \m/

Unless a field represents a compile-time known constant value.
In this particular case, it is as fast as a literal!!! $-)

Moxl · January 2017

Thanks. Stack and heap are new lingo to me. So, if my googling serves me right, I understand that Processing decides for me which region to use for which variable in my program. I get it that in C, for instance, you have to specify the region to use. Is that right?

GoToLoop · January 2017

Not Processing but Java. And even in C, the rule is very similar.
Variables go to stack and allocations go to the heap.

GoToLoop · January 2017

https://en.Wikipedia.org/wiki/Stack-based_memory_allocation
https://en.Wikipedia.org/wiki/Memory_management

Moxl · February 2017

Thanks. Does the position of a variable down the stack influence how fast it can be accessed?

Lord_of_the_Galaxy · February 2017

I believe that as long as it is in the same stack (and the stack isn't too large), the position of the variable in the stack shouldn't influence the time it takes (it may, but only a bit, and then also depending on the stack size) but I may be wrong.

I also believe that stacks tend to fit inside the L1 cache (or atleast L2 cache) entirely and are hence faster than the heap, which (most of the times) fits only in the main RAM, or at best the L2/L3 cache. Am I right?

neilcsmith_net · February 2017

The other answers are missing the fact that the Java compiler and the Java VM JIT compiler can do dead code elimination. Because your first example assigns a local variable that is never used, the VM may remove that code. In fact, as the for loop then doesn't do anything, it may remove that too!

http://www.oracle.com/technetwork/articles/java/architect-benchmarking-2266277.html

GoToLoop · February 2017

@Lord_of_the_Galaxy, I just know the general idea.
But I guess it be should be something like you described. ;))

Just wanna let you know that Java, since version 6, got something called escape analysis:
https://en.Wikipedia.org/wiki/Escape_analysis

It allows for Java's JVM to decide if an object can be created in the stack rather than its regular place in the heap!

However, I dunno whether a stack object is faster than a heap 1.
What I know for sure is that once the stack is destroyed, it doesn't involve the Garbage Collector at all.

So at least, it spares the GC from the hard work to deal w/ a stack-allocated object. \m/

Lord_of_the_Galaxy · February 2017

@GoToLoop AFAIK a stack object should be faster than a heap one, because heap ones will cause additional delays during creation also (stack allocation is easier, I guess).

@neilcsmith_net Are you sure that it is happening in this case? Because if dead code elimination was occurring, then both the versions should run at same speed (good dead code elimination will remove the variable and the for loop as the only place the for loop is being used is to set the value of var, which itself is dead). Maybe it is partially disabled in whatever compiler the Processing IDE uses?

GoToLoop · February 2017

@neilcsmith_net is right for the 1st example.
The variable var never escapes the for ( ; ; ) {} loop block!

An actual valid test would be something like this:

float var = random(-1000); // comment this out if var was declared globally!
for (int i = 0; i < 1000000000; ++i)  var += i;
println(var);

neilcsmith_net · February 2017

@Lord_of_the_Galaxy in the second case the field could potentially be accessed from outside, even by reflection, so it's harder / impossible to prove it's really dead code. The Java compiler definitely couldn't prove it, the VM JIT compiler might be able to (though would probably be against the spec)

Lord_of_the_Galaxy · February 2017

And if the Java compiler proves it, then that (dead code elimination) is the reason for faster execution in first example

GoToLoop · February 2017

That doesn't disprove local variables and parameters are faster than object field members! [-X

Moxl · February 2017

If dead code there is, it can only be the local variable declaration, not the loop, because execution still takes about 500 ms.

Lord_of_the_Galaxy · February 2017

I didn't say it disproves that, the statement was conditional - if complete dead code elimination is happening then that becomes the primary reason for the reduction in speed and other reasons are secondary.
But since the OP says it still takes a fair bit of time, I'd say that no dead code elimination is occurring here.

Moxl · February 2017

I've learned totally new things in the thread. Thanks all.

GoToLoop · February 2017

Just did a "Benchmark for Member Vs. Local Vs. Parameter Variables". :bz
Run it and pick your own conclusions: O:-)

P.S.: New version 1.1, which now includes an extra test for static member variable. :D

/**
 * Benchmark: Static Vs Member Vs Local Vs Param Variables (v1.1.6)
 * GoToLoop (2017-Feb-02)
 * forum.Processing.org/two/discussion/20582/variable-casting-speed#Item_17
 */

static final int WARM = 5_000_000, ITERATIONS = MAX_INT;

static double staticMemberVar;
double memberVar;

void setup() {
  long timer;

  staticMemberVar = random(MIN_INT);
  sumStaticGlobal(WARM); // Static Global Warm-up
  println("Static warm-up:", staticMemberVar);

  memberVar = random(MIN_INT);
  sumGlobal(WARM); // Global Warm-up
  println("Global warm-up:", memberVar);

  memberVar = random(MIN_INT);
  sumLocal(WARM); // Local Warm-up
  println("Local  warm-up:", memberVar);

  memberVar = sumParam(WARM, random(MIN_INT)); // Param Warm-up
  println("Param  warm-up:", memberVar, ENTER);

  /** Global Static Member Variable Benchmark */
  staticMemberVar = random(MIN_INT);
  timer = System.currentTimeMillis();
  sumStaticGlobal(ITERATIONS);
  timer = System.currentTimeMillis() - timer;
  println("Static:", timer, staticMemberVar);

  /** Global Member Variable Benchmark */
  memberVar = random(MIN_INT);
  timer = System.currentTimeMillis();
  sumGlobal(ITERATIONS);
  timer = System.currentTimeMillis() - timer;
  println("Global:", timer, memberVar);

  /** Local Variable Benchmark */
  memberVar = random(MIN_INT);
  timer = System.currentTimeMillis();
  sumLocal(ITERATIONS);
  timer = System.currentTimeMillis() - timer;
  println("Local: ", timer, memberVar);

  /** Local Parameter Benchmark */
  memberVar = random(MIN_INT);
  timer = System.currentTimeMillis();
  memberVar = sumParam(ITERATIONS, memberVar);
  timer = System.currentTimeMillis() - timer;
  println("Param: ", timer, memberVar);

  exit();
}

static final void sumStaticGlobal(final int num) {
  for (int i = 0; i < num; staticMemberVar += i++);
}

final void sumGlobal(final int num) {
  for (int i = 0; i < num; memberVar += i++);
}

final void sumLocal(final int num) {
  double localVar = memberVar; // cache global var locally
  for (int i = 0; i < num; localVar += i++);
  memberVar = localVar; // update global var w/ local var's result
}

static final double sumParam(final int num, double localParam) {
  for (int i = 0; i < num; localParam += i++);
  return localParam;
}

Moxl · February 2017

Wow cool! Will test this tonight. Thanks!

How would an open poll with result times vs system specs be relevant? :)

static final void sumStaticGlobal(final int num) {
  for (int i = 0; i < num; staticMemberVar += i++);
}

You can assign a new value to a variable inside a for( ; ; ) statement? Mind = blown. :-O

Moxl · February 2017

Console output (retabbed):

Global static warm-up: 1.249894340912E13
Global warm-up:        1.2499903337568E13
Local  warm-up:        1.2498816642144E13
Param  warm-up:        1.24996111192E13
Static Global:   3234  2.30584297191809638E18
Global:          3061  2.30584297099809741E18
Local:           3543  2.30584297265220454E18
Param:           3067  2.30584297218409216E18

Lord_of_the_Galaxy · February 2017

The third part of the for statement (for(..;..;this part)) can be any single statement :)

Moxl · February 2017

Sorry for the digression... but if i++ is the equivalent to i = i+1 and var1 += var2 the equivalent to var1 = var1 + var2 , then I'm surprised to see that something like var1 = var1 + (i = i+1) can evaluate without error (two equal signs here). What am I not getting right?

Lord_of_the_Galaxy · February 2017

Exactly, it does evaluate without an error. Try it.

Lord_of_the_Galaxy · February 2017

(Try y = y + (x = x + 1);)

Moxl · February 2017

Hmm. So you can change the value of a variable WHILE it's being used in an expression. I mean, not only does your expression use variable x added of 1, it also changes the stored value of variable x. That's new too. ;;)

EDIT: understanding this might look trivial to many, but believe me I've really had to wrap my head around it for a fair 5 minutes before it sunk in...

quark · February 2017

The result of i=i+1 is i and this value is passed through so can be used by adding to var1

In your particular expression (i=i+1) is equivalent to the pre increment operator ++i

You might try var1 = var1 + ++i; should compile but can't try it myself at the moment. Also var1 = var1 + (i++) ; should produce the same result.

Moxl · February 2017

Yes, both compile error-free. My single-equal-sign-per-evaluation world is crumbling.

Lord_of_the_Galaxy · February 2017

@quark

You might try var1 = var1 + ++i; should compile but can't try it myself at the moment. Also var1 = var1 + (i++) ; should produce the same result.

Are you trying to say that var1 = var1 + ++i; and var1 = var1 + (i++) produce the same result? If I remember right, ++i evaluates to (i + 1) and i++ always evaluates to i. Or am I wrong?

Moxl · February 2017

Okay well, I hadn't bothered to check values, I just saw it ran. But you have a point: if var1 and i are equal to 0, var1 = var1 + (i++) evaluates to 0, whereas var1 = var1 + ++i evaluates to 1!

Lord_of_the_Galaxy · February 2017

Aha! I was right then. I guess he meant that both should compile, and not both should evaluate to same value?

GoToLoop · February 2017

My AMD-CPU laptop here is slow. This is my output for version 1.1.4: 3:-O

Static warm-up: 1.2499591927776E13
Global warm-up: 1.2498802534624E13
Local  warm-up: 1.2498785221344E13
Param  warm-up: 1.2499873005664E13 

Static: 4315 2.30584300407844506E18
Global: 4648 2.30584300406604595E18
Local:  4294 2.30584300349099981E18
Param:  4297 2.30584300422547046E18

I don't get why @Moxl's Local was the slowest, while his Global was the fastest? 8-}
Mine's just the opposite. 8-X

Moxl · February 2017

Might be my uninformed method for testing... don't have my sketch here but will post when I get back to my "coding computer".

EDIT: oh you meant the results of your test. Yeah. Weird. Who am I to tell, though...

GoToLoop · February 2017

You can assign a new value to a variable inside a for( ; ; ) statement? Mind = blown.

You din't see nothing yet! We can use commas , in order to add more than 1 expression there! =P~
Of course, those expressions can be simply function calls too! :)>-

quark · February 2017

@Lord_of_the_Galaxy

Yes i++ evaluates to i but (i++) evaluates to i+1 because the parentheses have higher precedence than ++ so forces the increment before i is used, effectively making it the same as ++i

GoToLoop · February 2017

In their internal implementation, both pre & post increment/decrement unary operators modify its variable immediately.

The difference is that the post version returns the original previous value instead. :-B

Lord_of_the_Galaxy · February 2017

@quark I tested too, what I said is right as far as Java is concerned. And @GoToLoop explains it more correctly now.

Lord_of_the_Galaxy · February 2017

@GoToLoop Cool! I didn't know that. Though I would still prefer adding curly brackets to make it look more readable, it will help me in making code much shorter (in length, and not talking of compiled code) if and when required.

Lord_of_the_Galaxy · February 2017

Try this code yourself-

int a = 0;
int b = 1;
b = b + (a++);

print(b);

quark · February 2017

I wasn't able to test my code when I posted earlier and it seems I was incorrect regarding (a++) being equivalent to ++a.

Howdy, Stranger!

Categories

In this Discussion

Variable casting speed

Best Answers

Answers