Variable casting speed

edited February 2017 in Questions about Code

I just ran a little test and this

void setup() {
}

void draw() {
  for (int i=0; i<1000000000; i++) {
    float var = i;
  }
}

runs about 3 times faster than this

float var;

void setup() {
}

void draw() {
  for (int i=0; i<1000000000; i++) {
    var = i;
  }
}

which is (at least to an ignoramus like me) counter-intuitive, because in the first example the variable var is cast a billion times, whereas in the second one it merely changes value. How come?


  • edited January 2017 Answer ✓

    In both cases, the int value is implicitly widened to float type before each assignment to variable var.

    However, for the 2nd example, var is an instance field of class PApplet, stored in memory's heap region.

    And the other is a local variable scoped to a for ( ; ; )'s curly block, stored in memory's stack region.

    The stack region has faster access than the heap.
    And all local variables (including parameters) are faster than object fields. \m/

    Unless a field represents a compile-time known constant value.
    In this particular case, it is as fast as a literal!!! $-)
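
    The implicit int-to-float conversion mentioned at the top of this answer can be seen in isolation in a minimal plain-Java sketch (not a Processing sketch; the class name WideningDemo and the method widen are made up for illustration). It also shows one consequence that is easy to miss: the conversion needs no cast, but large int values may be rounded.

```java
// Plain-Java sketch; WideningDemo and widen() are hypothetical names.
class WideningDemo {
  // Assigning an int to a float needs no cast: the compiler inserts the conversion.
  static float widen(int i) {
    float f = i; // implicit int -> float widening conversion
    return f;
  }

  public static void main(String[] args) {
    // Small ints convert exactly.
    System.out.println(widen(42));
    // float has a 24-bit significand, so an int above 2^24 may be rounded.
    System.out.println((int) widen(16_777_217)); // prints 16777216
  }
}
```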

  • Thanks. Stack and heap are new lingo to me. So, if my googling serves me right, I understand that Processing decides for me which region to use for which variable in my program. I get it that in C, for instance, you have to specify the region to use. Is that right?

  • Answer ✓

    Not Processing but Java. And even in C, the rule is very similar.
    Variables go to stack and allocations go to the heap.
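
    As a rough illustration of that rule in Java (a conceptual sketch only; the class name StackHeapDemo is made up, and the JIT compiler is free to optimize the actual placement): local variables hold either a primitive or a reference, while whatever is created with new is what the reference points to.

```java
// Plain-Java sketch; StackHeapDemo is a hypothetical name.
class StackHeapDemo {
  static int demo() {
    int count = 3;               // local primitive: conceptually part of the method's stack frame
    int[] data = new int[count]; // 'data' is a local reference; the array itself is allocated with new
    int[] alias = data;          // copies the reference, not the array
    alias[0] = 7;                // writes to the one shared array
    return data[0];              // the write is visible through the other reference
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints 7
  }
}
```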

  • Thanks. Does the position of a variable down the stack influence how fast it can be accessed?

  • edited February 2017

    I believe that as long as it is in the same stack (and the stack isn't too large), the position of the variable in the stack shouldn't influence the time it takes (it may, but only a bit, and then also depending on the stack size) but I may be wrong.

    I also believe that stacks tend to fit entirely inside the L1 cache (or at least the L2 cache) and are hence faster than the heap, which (most of the time) fits only in main RAM, or at best in the L2/L3 cache. Am I right?

  • The other answers are missing the fact that the Java compiler and the Java VM JIT compiler can do dead code elimination. Because your first example assigns a local variable that is never used, the VM may remove that code. In fact, as the for loop then doesn't do anything, it may remove that too!
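
    A minimal plain-Java sketch of the difference (the class and method names are made up, and whether a given VM actually removes the first loop is not guaranteed): in unobserved() nothing ever reads the result, so the JIT compiler may treat the work as dead code, while observed() returns its result, which makes the work much harder to discard.

```java
// Plain-Java sketch; DeadCodeDemo, unobserved() and observed() are hypothetical names.
class DeadCodeDemo {
  // Nothing reads 'var': a JIT compiler is free to drop the loop body (it may or may not).
  static void unobserved(int n) {
    for (int i = 0; i < n; i++) {
      float var = i;
    }
  }

  // The sum is returned to the caller, so the loop's work is observable.
  static double observed(int n) {
    double sum = 0;
    for (int i = 0; i < n; i++) {
      sum += i;
    }
    return sum;
  }

  public static void main(String[] args) {
    int n = 100_000_000;
    long t0 = System.nanoTime();
    unobserved(n);
    long t1 = System.nanoTime();
    double sum = observed(n);
    long t2 = System.nanoTime();
    System.out.println("unobserved: " + (t1 - t0) / 1_000_000 + " ms");
    System.out.println("observed:   " + (t2 - t1) / 1_000_000 + " ms, sum = " + sum);
  }
}
```

    Timings from a single run like this are only indicative; the first call also includes JIT warm-up.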

  • edited February 2017

    @Lord_of_the_Galaxy, I just know the general idea.
    But I guess it should be something like you described. ;))

    Just wanna let you know that Java, since version 6, got something called escape analysis:

    It allows for Java's JVM to decide if an object can be created in the stack rather than its regular place in the heap!

    However, I dunno whether a stack object is faster than a heap one.
    What I know for sure is that once the stack is destroyed, it doesn't involve the Garbage Collector at all.

    So at least, it spares the GC from the hard work to deal w/ a stack-allocated object. \m/
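
    A sketch of the kind of code escape analysis is about (plain Java; the names EscapeDemo, Vec, length, escape and keep are made up, and whether a particular JVM applies the optimization here is entirely up to its JIT compiler): in length() the temporary object never leaves the method, whereas in escape() it is stored in a field and so must outlive the call.

```java
// Plain-Java sketch; all names here are hypothetical.
class EscapeDemo {
  static final class Vec {
    final double x, y;
    Vec(double x, double y) { this.x = x; this.y = y; }
  }

  // 'v' is never stored or returned, so it does not escape this method:
  // a candidate for the optimization, not a guarantee.
  static double length(double x, double y) {
    Vec v = new Vec(x, y);
    return Math.sqrt(v.x * v.x + v.y * v.y);
  }

  static Vec keep;

  // Here the object is stored in a field, so it escapes and has to live on the heap.
  static void escape(double x, double y) {
    keep = new Vec(x, y);
  }

  public static void main(String[] args) {
    System.out.println(length(3, 4)); // prints 5.0
    escape(1, 2);
    System.out.println(keep.x + ", " + keep.y);
  }
}
```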

  • @GoToLoop AFAIK a stack object should be faster than a heap one, because heap ones will cause additional delays during creation also (stack allocation is easier, I guess).

    @neilcsmith_net Are you sure that it is happening in this case? Because if dead code elimination were occurring, both versions should run at the same speed (good dead code elimination would remove the variable and the for loop, as the loop's only purpose is to set the value of var, which itself is dead). Maybe it is partially disabled in whatever compiler the Processing IDE uses?

  • edited February 2017

    @neilcsmith_net is right for the 1st example.
    The variable var never escapes the for ( ; ; ) {} loop block!

    An actual valid test would be something like this:

    float var = random(-1000); // comment this out if var was declared globally!
    for (int i = 0; i < 1000000000; ++i)  var += i;

  • @Lord_of_the_Galaxy in the second case the field could potentially be accessed from outside, even by reflection, so it's harder / impossible to prove it's really dead code. The Java compiler definitely couldn't prove it, the VM JIT compiler might be able to (though would probably be against the spec)

  • And if the Java compiler proves it, then that (dead code elimination) is the reason for faster execution in first example

  • edited February 2017

    That doesn't disprove local variables and parameters are faster than object field members! [-X

  • If dead code there is, it can only be the local variable declaration, not the loop, because execution still takes about 500 ms.

  • I didn't say it disproves that; the statement was conditional: if complete dead code elimination is happening, then that becomes the primary reason for the reduction in run time, and other reasons are secondary.
    But since the OP says it still takes a fair bit of time, I'd say that no dead code elimination is occurring here.

  • edited February 2017

    I've learned totally new things in the thread. Thanks all.

  • edited February 2017

    Just did a "Benchmark for Member Vs. Local Vs. Parameter Variables". :bz
    Run it and pick your own conclusions: O:-)

    P.S.: New version 1.1, which now includes an extra test for static member variable. :D

    /**
     * Benchmark: Static Vs Member Vs Local Vs Param Variables (v1.1.6)
     * GoToLoop (2017-Feb-02)
     */

    static final int WARM = 5_000_000, ITERATIONS = MAX_INT;
    static double staticMemberVar;
    double memberVar;

    void setup() {
      long timer;

      staticMemberVar = random(MIN_INT);
      sumStaticGlobal(WARM); // Static Global Warm-up
      println("Static warm-up:", staticMemberVar);

      memberVar = random(MIN_INT);
      sumGlobal(WARM); // Global Warm-up
      println("Global warm-up:", memberVar);

      memberVar = random(MIN_INT);
      sumLocal(WARM); // Local Warm-up
      println("Local  warm-up:", memberVar);

      memberVar = sumParam(WARM, random(MIN_INT)); // Param Warm-up
      println("Param  warm-up:", memberVar, ENTER);

      /** Global Static Member Variable Benchmark */
      staticMemberVar = random(MIN_INT);
      timer = System.currentTimeMillis();
      sumStaticGlobal(ITERATIONS);
      timer = System.currentTimeMillis() - timer;
      println("Static:", timer, staticMemberVar);

      /** Global Member Variable Benchmark */
      memberVar = random(MIN_INT);
      timer = System.currentTimeMillis();
      sumGlobal(ITERATIONS);
      timer = System.currentTimeMillis() - timer;
      println("Global:", timer, memberVar);

      /** Local Variable Benchmark */
      memberVar = random(MIN_INT);
      timer = System.currentTimeMillis();
      sumLocal(ITERATIONS);
      timer = System.currentTimeMillis() - timer;
      println("Local: ", timer, memberVar);

      /** Local Parameter Benchmark */
      memberVar = random(MIN_INT);
      timer = System.currentTimeMillis();
      memberVar = sumParam(ITERATIONS, memberVar);
      timer = System.currentTimeMillis() - timer;
      println("Param: ", timer, memberVar);
    }

    static final void sumStaticGlobal(final int num) {
      for (int i = 0; i < num; staticMemberVar += i++);
    }

    final void sumGlobal(final int num) {
      for (int i = 0; i < num; memberVar += i++);
    }

    final void sumLocal(final int num) {
      double localVar = memberVar; // cache global var locally
      for (int i = 0; i < num; localVar += i++);
      memberVar = localVar; // update global var w/ local var's result
    }

    static final double sumParam(final int num, double localParam) {
      for (int i = 0; i < num; localParam += i++);
      return localParam;
    }

  • edited February 2017

    Wow cool! Will test this tonight. Thanks!

    How would an open poll with result times vs system specs be relevant? :)

    static final void sumStaticGlobal(final int num) {
      for (int i = 0; i < num; staticMemberVar += i++);
    }

    You can assign a new value to a variable inside a for( ; ; ) statement? Mind = blown. :-O

  • Console output (retabbed):

    Global static warm-up: 1.249894340912E13
    Global warm-up:        1.2499903337568E13
    Local  warm-up:        1.2498816642144E13
    Param  warm-up:        1.24996111192E13
    Static Global:   3234  2.30584297191809638E18
    Global:          3061  2.30584297099809741E18
    Local:           3543  2.30584297265220454E18
    Param:           3067  2.30584297218409216E18

  • The third part of the for statement (for(..;..;this part)) can be any expression statement, such as an assignment, an increment or a method call :)

  • edited February 2017

    Sorry for the digression... but if i++ is the equivalent to i = i+1 and var1 += var2 the equivalent to var1 = var1 + var2 , then I'm surprised to see that something like var1 = var1 + (i = i+1) can evaluate without error (two equal signs here). What am I not getting right?

  • Exactly, it does evaluate without an error. Try it.

  • (Try y = y + (x = x + 1);)
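
    Spelled out as a small plain-Java sketch (the class name AssignExprDemo and the method run are made up for illustration): an assignment is itself an expression whose value is the value just assigned, so it can sit inside a larger expression.

```java
// Plain-Java sketch; AssignExprDemo and run() are hypothetical names.
class AssignExprDemo {
  // Returns {x, y} after evaluating y = y + (x = x + 1).
  static int[] run(int x, int y) {
    // The left operand y is read first, then x is updated,
    // and the assigned value of x is what gets added.
    y = y + (x = x + 1);
    return new int[] { x, y };
  }

  public static void main(String[] args) {
    int[] r = run(5, 10);
    System.out.println("x = " + r[0] + ", y = " + r[1]); // prints x = 6, y = 16
  }
}
```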

  • edited February 2017

    Hmm. So you can change the value of a variable WHILE it's being used in an expression. I mean, not only does your expression use the value of x plus 1, it also changes the stored value of x. That's new too. ;;)

    EDIT: understanding this might look trivial to many, but believe me, I really had to wrap my head around it for a good 5 minutes before it sank in...

  • edited February 2017

    The result of i=i+1 is the new value of i, and that value is passed through, so it can be added to var1

    In your particular expression (i=i+1) is equivalent to the pre increment operator ++i

    You might try var1 = var1 + ++i; should compile but can't try it myself at the moment. Also var1 = var1 + (i++) ; should produce the same result.

  • edited February 2017

    Yes, both compile error-free. My single-equal-sign-per-evaluation world is crumbling.

  • @quark

    You might try var1 = var1 + ++i; should compile but can't try it myself at the moment. Also var1 = var1 + (i++) ; should produce the same result.

    Are you trying to say that var1 = var1 + ++i; and var1 = var1 + (i++) produce the same result? If I remember right, ++i evaluates to (i + 1) and i++ always evaluates to i. Or am I wrong?

  • edited February 2017

    Okay well, I hadn't bothered to check values, I just saw it ran. But you have a point: if var1 and i are equal to 0, var1 = var1 + (i++) evaluates to 0, whereas var1 = var1 + ++i evaluates to 1!

  • Aha! I was right then. I guess he meant that both should compile, and not both should evaluate to same value?

  • edited February 2017

    My AMD-CPU laptop here is slow. This is my output for version 1.1.4: 3:-O

    Static warm-up: 1.2499591927776E13
    Global warm-up: 1.2498802534624E13
    Local  warm-up: 1.2498785221344E13
    Param  warm-up: 1.2499873005664E13 
    Static: 4315 2.30584300407844506E18
    Global: 4648 2.30584300406604595E18
    Local:  4294 2.30584300349099981E18
    Param:  4297 2.30584300422547046E18

    I don't get why @Moxl's Local was the slowest, while his Global was the fastest? 8-}
    Mine's just the opposite. 8-X

  • edited February 2017

    Might be my uninformed method for testing... don't have my sketch here but will post when I get back to my "coding computer".

    EDIT: oh you meant the results of your test. Yeah. Weird. Who am I to tell, though...

  • You can assign a new value to a variable inside a for( ; ; ) statement? Mind = blown.

    You ain't seen nothing yet! We can use commas (,) to add more than one expression there! =P~
    Of course, those expressions can be simply function calls too! :)>-
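
    A small plain-Java sketch of that (the names ForCommaDemo, tick, calls and run are made up for illustration): the init part declares two variables, and the update part runs two increments and a function call, separated by commas.

```java
// Plain-Java sketch; all names here are hypothetical.
class ForCommaDemo {
  static int calls;

  static void tick() {
    calls++;
  }

  static String run() {
    calls = 0;
    StringBuilder out = new StringBuilder();
    // Two variables in the init part; three comma-separated expressions in the update part.
    for (int lo = 0, hi = 5; lo < hi; lo++, hi--, tick()) {
      out.append(lo).append('-').append(hi).append(' ');
    }
    return out.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(run());                 // prints 0-5 1-4 2-3
    System.out.println("tick() calls: " + calls); // prints tick() calls: 3
  }
}
```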

  • @Lord_of_the_Galaxy

    Yes i++ evaluates to i but (i++) evaluates to i+1 because the parentheses have higher precedence than ++ so forces the increment before i is used, effectively making it the same as ++i

  • edited February 2017

    Internally, both the pre and post increment/decrement unary operators modify their variable immediately.

    The difference is that the post version returns the original previous value instead. :-B
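
    A plain-Java sketch of that behaviour (the class name IncrementDemo and the method run are made up for illustration). It also includes a parenthesized (a++), which yields the same old value as a bare a++.

```java
// Plain-Java sketch; IncrementDemo and run() are hypothetical names.
class IncrementDemo {
  // Returns {post, paren, pre, a} after the three increments below.
  static int[] run() {
    int a = 0;
    int post = a++;    // post == 0, a is now 1
    int paren = (a++); // parentheses change nothing: paren == 1, a is now 2
    int pre = ++a;     // pre == 3, a is now 3
    return new int[] { post, paren, pre, a };
  }

  public static void main(String[] args) {
    int[] r = run();
    System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // prints 0 1 3 3
  }
}
```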

  • @quark I tested too, what I said is right as far as Java is concerned. And @GoToLoop explains it more correctly now.

  • @GoToLoop Cool! I didn't know that. Though I would still prefer adding curly brackets to make it look more readable, it will help me in making code much shorter (in length, and not talking of compiled code) if and when required.

  • edited February 2017

    Try this code yourself-

    int a = 0;
    int b = 1;
    b = b + (a++);
    println(a, b); // a is 1, b is still 1: (a++) yields the old value of a

  • I wasn't able to test my code when I posted earlier and it seems I was incorrect regarding (a++) being equivalent to ++a.
