We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
Discussion: Idiom: 2-element array instead of x, y
Oct 30th, 2009, 6:03am
 
The following is copied from comments on a sketch posted on the OpenProcessing site - OSC Media Controller by Julio Terra - for the purposes of expanded discussion here.

Quote:
subpixel:

Is there some wisdom floating around in the Processing (or Java) community that says it is a good idea to use a 2-element array to store (x,y) coordinates, where element 0 holds the x value and element 1 holds the y value?

I have seen this scheme numerous times in OpenProcessing sketches, and I really don't understand where it has come from. In this sketch "x" and "y" are "used up" as constants to represent which array index the coordinates are stored in. Whilst I appreciate that this makes the code slightly more readable..

eg
buttonLoc[x] = tXloc;
buttonLoc[y] = tYloc;
vs
buttonLoc[0] = tXloc;
buttonLoc[1] = tYloc;

..it is going about things the completely wrong way. To my mind, it is questionable as to whether this is an improvement over or inferior to even

buttonLocX = tXloc;
buttonLocY = tYloc;


What I rather expect to see is something more like
buttonLoc.x = tXloc;
buttonLoc.y = tYloc;

or, going a step further,
buttonLoc = loc;
where loc is a location type that has x and y components.

That might need to be adjusted to buttonLoc = loc.clone() or similar to get a copy of the location object, but I think you can see what I'm getting at.

Julio, please don't take this as an attack on what you have done, since you are hardly the first person to do this. I see that what you're building here is not beginner level stuff, so I'm writing this to encourage you to go back to wherever you picked up this idiom and question both where it came from and why it is being used.

I don't have a lot of Java experience, so can't give you the quick answer on how best to structure this sort of data in Java objects, but it seems to me that using an extra dimension on an array for x,y,z (or whatever) components harks back to olden days versions of BASIC and similar that did not offer the opportunity for named data members.

-spxl
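
For illustration, here is a minimal sketch of the "location type" idea described in the post above, written as plain Java so it runs outside Processing. The names Loc, copy() and LocButton are hypothetical and do not come from Julio's sketch; in a real Processing sketch PVector could presumably play the same role.

```java
// Hypothetical location type with named components (not from the original sketch).
final class Loc {
    float x;
    float y;

    Loc(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Returns an independent copy, so later changes to one do not affect the other.
    Loc copy() {
        return new Loc(x, y);
    }
}

// Hypothetical stand-in for the Button class, storing its position as a Loc.
final class LocButton {
    final Loc buttonLoc;

    LocButton(Loc loc) {
        // Store a copy so the caller can keep reusing its own Loc object.
        this.buttonLoc = loc.copy();
    }
}
```

With this shape, buttonLoc.x and buttonLoc.y read naturally, and the copy in the constructor is the "buttonLoc = loc.clone() or similar" step mentioned above.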


Quote:
Sinan Aşçıoğlu:

As far as I remember from cs classes, the reason lies beneath the performance of memory:

buttonLoc[0] = tXloc;
buttonLoc[1] = tYloc;
is the fastest since it is an array.

buttonLoc[x] = tXloc;
buttonLoc[y] = tYloc;
is slower, since it is a dictionary. (also needs to keep the 'names' in memory, hence indexing is a little slower).

buttonLocX = tXloc;
buttonLocY = tYloc;
is even slower, since two variables mean two separate memory allocations.

buttonLoc.x = tXloc;
buttonLoc.y = tYloc;
is the slowest, since it includes both an object and the 2 variables associated with it.

This is one of the big challenges of performance. Object orientation is always really good practice for programming; however, because it is slower than primitive approaches (such as arrays), its performance in some cases might be way slower than primitive ones.

So, it might not matter most of the time, but it has become a general practice not to create objects if you can do it with a dictionary, or even with arrays, if you are concerned about performance.
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #1 - Oct 30th, 2009, 7:10am
 
I think Sinan's response highlights a number of issues and misunderstandings about Java and/or general programming.

Before getting bogged down in the details, I should state my view that optimisation (for speed, code size, etc) should be one of the last things to consider when writing code, especially in a learning-to-program sort of environment. I'm not (yet) a great Object-Oriented coder, but I do lament seeing a lot of decidedly non-OO, awkward and inelegant code, especially when it seems to be through a lack of understanding and/or misguided instruction.

Sinan is the creator of the OpenProcessing site, and I take that as experience enough for him not to be considered a newbie! However, there are some vague and also some incorrect notions here about what is going on.

The OSC Media Controller code example defines a Button class, starting as follows:
Code:
class Button { 

 // general variables used across class
 final int x = 0; final int y = 1;                                      // variables to use with matrix size array
 // ...
 // matrix and cell related variables
 float [] buttonSize = new float [2];                                   // width and height of each button        
 float [] buttonLoc = new float [2];                                    // location of the button
 float [] relativeMouseLoc = new float [2];                             // holds relative location of mouse on object (0,0 being top left hand corner)
 // ...
// define the button class constructor
Button(int tXloc, int tYloc, int twidth, int theight) {
 buttonLoc[x] = tXloc;
 buttonLoc[y] = tYloc;
 buttonSize[x] = twidth;
 buttonSize[y] = theight;
}  

My immediate reaction upon seeing "final int x = 0; final int y = 1;" was to think that was a bit strange. x and y are universally understood in many contexts to refer to some sort of position or vector, but here they are used as constant array indices.

Sinan suggests that "buttonLoc[0]" is fastest, and that "buttonLoc[x]" is "slower, since it is a dictionary". Firstly, since x is a constant ("final int"), I expect that it should be, or has the potential to be, exactly the same as using the literal constant 0 (by that I mean a compiler could produce the exact same executable code regardless of whether "x" or "0" is used, excusing cases where different variable types cause confusion). If x were not constant (that is, not "final"), then I agree there is another lookup required to find the current value of x each time, but I think Sinan is confusing the issue with something like buttonLoc["x"] in PHP, where the "x" is not an integer variable but rather a dictionary key to look up in an associative array (which is, of course, far more compute intensive).

I also disagree that using buttonLoc[0] and buttonLoc[1] should be faster than using separate fields buttonLocX and buttonLocY; indeed I expect the latter to have faster access.

Reasoning:

buttonLoc[0] requires the location of the field (the buttonLoc array) and an array index, which is not only a number, but (probably) a number that needs to be multiplied by another number (the size of an individual array element) to get the actual memory offset.

buttonLocX requires only the location of the field.

I should mention that, in both cases, the "location of the field" is itself an offset from the object's base location. How memory addresses are determined in Java is somewhat unknown to me; my description here relates to the way things usually work for C or C++ structs and arrays.

I have tweaked someone else's Processing sketch that used an extra size-2 dimension on arrays for storing x and y values to instead use a separate x-values array and y-values array and noted a significant speed increase, so am fairly confident in saying that an unnecessary array indexing is not something you want to have when aiming for performance.

In C, structs inside structs aren't a problem at all. A sub-struct is as easy to access and use as a "top level" struct. I don't know if it can be done in Java (a quick look just now for "Java structs" yields the impression of some heated debate about allowing struct-like behaviour in Java), but I have a feeling that people are using (array[0],array[1]) instead of (field.x, field.y) because they saw someone else do it, without understanding why, and perhaps believing something that is the opposite of the intention.

-spxl
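
To make the point about constant indices concrete, here is a small sketch in plain Java. It assumes (as far as I know, correctly) that a final int initialised with a literal is a compile-time constant in Java; the class and method names are hypothetical. The switch statement is there because case labels are only accepted when they are compile-time constants, which is a cheap way to check that X really is one.

```java
// Hypothetical demo class; the names are illustrative, not from the original sketch.
final class IndexConstants {
    // final ints initialised with literals are compile-time constants.
    static final int X = 0;
    static final int Y = 1;

    // Writes through the named constants.
    static float[] viaNames(float tx, float ty) {
        float[] loc = new float[2];
        loc[X] = tx;
        loc[Y] = ty;
        return loc;
    }

    // Writes through literal indices.
    static float[] viaLiterals(float tx, float ty) {
        float[] loc = new float[2];
        loc[0] = tx;
        loc[1] = ty;
        return loc;
    }

    // Case labels must be compile-time constants, so this only compiles
    // because X and Y are treated exactly like the literals 0 and 1.
    static String axisName(int index) {
        switch (index) {
            case X: return "x";
            case Y: return "y";
            default: return "?";
        }
    }
}
```

Both methods index a plain float array; nothing here behaves like a dictionary lookup.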
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #2 - Oct 30th, 2009, 7:15am
 
Quote:
Is there some wisdom floating around in the Processing (or Java) community that says it is a good idea to use a 2-element array to store (x,y) coordinates, where element 0 holds the x value and element 1 holds the y value?

Yes and it predates Processing and Java!

Consider that we are remembering a single coordinate in 2D space. We are not so much remembering X and Y but (X,Y). Using a two-element array means that there is a single identifier (the array name) to hold a 2D coordinate, and it can easily be passed as a single parameter to other functions.

So why an array? In the early days of computing the only data structure was the array.

So why still use an array?
  • it is better than having 2 separate variables
  • slightly faster than using classes and objects

Despite the slight speed difference I believe that using object orientation should be the preferred option.

Java supports object orientation, so if I wanted a very simple class to hold a 2D coordinate then I could use something like

Code:

public class Coord2D {
   public float x;
   public float y;
}

Using it
Code:

Coord2D c = new Coord2D();
c.x = 23.5803;
c.y  = -1.296;


Re: Discussion: Idiom: 2-element array instead of x, y
Reply #3 - Oct 30th, 2009, 7:33am
 
We have PVector to hold a pair of coordinates...
I agree with Quark, using a structure (array or class) is better than using two variables. A big advantage is that you can return such structure, while you can return only one primitive value!
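
A minimal sketch of that advantage, in plain Java: a method can hand back both coordinates at once when they live in one structure. Point2 and ReturnPairDemo are hypothetical names used only for this illustration; in a Processing sketch, PVector would be the obvious choice.

```java
// Hypothetical pair type; in a Processing sketch PVector would serve the same purpose.
final class Point2 {
    final float x;
    final float y;

    Point2(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

final class ReturnPairDemo {
    // Returns both coordinates of the midpoint in a single object.
    static Point2 midpoint(Point2 a, Point2 b) {
        return new Point2((a.x + b.x) / 2f, (a.y + b.y) / 2f);
    }

    // The array idiom can also return both values at once.
    static float[] midpoint(float[] a, float[] b) {
        return new float[] { (a[0] + b[0]) / 2f, (a[1] + b[1]) / 2f };
    }
}
```

With two separate float variables, the same calculation would need two methods (one for x, one for y) or some other workaround.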

I think the "optimizations" discussed above are totally pointless! :)
Some concepts are indeed drawn from other languages, be they PHP, JavaScript, C or something else...
Frankly, unless you know how the Java bytecode is generated and interpreted, don't bother with such optimizations based on suppositions about how the JVM works...
Even more, the JIT (just-in-time compiler) does further optimizations, depending on the vendor of the JVM, its version, etc.
Note that the line defining final x and y is a half-baked optimization: making them static would save memory (shared by all instances) in addition to the speed-up (the compiler knows it can treat them as constants).

So, unless you can prove (by appropriate benchmark) that a version is faster (or uses less memory), just stick to the most readable, easiest to understand, thus to maintain, version.

Note: in the current version of Java, creating too many objects can slow down an application a bit.
I have seen in the JBox2D forum that programmers gained much performance by filling fields of pre-allocated objects (mostly coordinates!) instead of creating an object on the fly and returning it.
That's because the library generates a lot of such short-lived objects, so it accumulates piles of garbage objects, making a lot of work for the garbage collector. The reasoning might not apply to your sketch...
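
The two styles described above can be sketched as follows. This is only an illustration of the general pattern, not JBox2D's actual API; Vec2, VecOps, add and addInto are hypothetical names.

```java
// Hypothetical mutable 2D vector, used only for this illustration.
final class Vec2 {
    float x;
    float y;

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

final class VecOps {
    // Allocates a new result object on every call.
    static Vec2 add(Vec2 a, Vec2 b) {
        return new Vec2(a.x + b.x, a.y + b.y);
    }

    // Writes the result into a caller-supplied object; nothing is allocated.
    static void addInto(Vec2 a, Vec2 b, Vec2 out) {
        out.x = a.x + b.x;
        out.y = a.y + b.y;
    }

    // Sums many vectors while reusing a single accumulator object.
    static Vec2 sum(Vec2[] items) {
        Vec2 total = new Vec2(0f, 0f);
        for (Vec2 v : items) {
            addInto(total, v, total); // safe: each field is read before it is written
        }
        return total;
    }
}
```

Whether the second style is actually faster depends on the JVM and the workload, so it is worth measuring before adopting it.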

Note you will find articles (like Java theory and practice: Urban performance legends, revisited) stating you should not make such optimizations because (not so) recent Java versions have escape analysis, which destroys the short-lived objects on the spot (IIRC).
Except that you have to activate this optimization yourself!
I have seen that Java 7 should enable it by default... :)
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #4 - Oct 30th, 2009, 8:25am
 
I'm definitely not in a position to add to this debate from a technical perspective (though it makes for interesting reading); but there's no doubt that using an object-based approach is by far the most readable.  As such, and if Processing is aimed at making programming languages more accessible, I think it's definitely the approach that should be encouraged, even if it isn't optimal.

When beginners start trying to store coordinates into 2d arrays they get themselves into all kinds of trouble...
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #5 - Oct 30th, 2009, 9:14am
 
Quark wrote on Oct 30th, 2009, 7:15am:
Quote:
Is there some wisdom floating around in the Processing (or Java) community that says it is a good idea to use a 2-element array to store (x,y) coordinates, where element 0 holds the x value and element 1 holds the y value?

Yes and it predates Processing and Java!

I meant: has the Java community at large decided that objects suck and devolving to the use of (simple) arrays (when there are other options available) is just a better way to do things?

I take it that the answer (in general, non hardcore optimisation cases) is 'no'.

On that basis, I suggest a bit of an education campaign, advocating the use of objects (such as PVector, Coord2D or whatever). As I mentioned, I have seen this use of 2-element arrays (or adding an extra dimension to some other array structure) a number of times, and if people (especially people new to programming and/or OO programming) keep seeing it they will keep using it.

Warning about use of Coord2D (as suggested by Quark)
Code:

Coord2D c = new Coord2D();
c.x = 23.5803;
c.y  = -1.296;

Coord2D c2 = c;

// c2 is _not_ a copy

c2.x = 99;

// c.x has apparently also changed to 99, as there is only one object

-spxl
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #6 - Oct 30th, 2009, 11:00am
 
Quote:
I meant: has the Java community at large decided that objects suck and devolving to the use of (simple) arrays (when there are other options available) is just a better way to do things?

I take it that the answer (in general, non hardcore optimisation cases) is 'no'.


Sorry about my misunderstanding!

Looking at the replies on this thread I think that the consensus is that for a 2D coordinate the order of preference is:
1) OO
2) 2 element Array
3) 2 simple variables

The problem for OO is that there is quite a steep learning curve. It is fairly simple to use other people's classes; it is quite another thing to design and create your own.
One of the things I like about Processing is that quite complex sketches are being done by people who have not come from a programming background but are having a go, and because of this they are using non-OO techniques.
Although Processing is based on Java and Java is an OO language, the Processing IDE was designed to allow the creation of sketches without a knowledge of OO.

Quote:
I suggest a bit of an education campaign, advocating the use of objects (such as PVector, Coord2D or whatever).


I heartily agree. :D

So on that note I make a couple of comments.

I am not advocating the use of Coord2D; it was a simple example to show how OO could be used for this problem. Anyone wanting to use XY coordinate values should use PVector, as it has many methods for manipulating coordinates. (Note PVector has over 550 lines of code if you also include the comments.)

Quote:
Warning about use of Coord2D (as suggested by Quark)


This applies to all Java classes; for instance,
Code:

PVector c = new PVector();
c.x = 23.5803;
c.y  = -1.296;

PVector c2 = c;

// c2 is _not_ a copy

c2.x = 99;

// c.x has apparently also changed to 99, as there is only one object

does exactly the same thing.

Using objects is not the same as using primitive types like int and float

In the above example c and c2 are 'object references' not the actual object. If we imagine a saucepan and in the bowl (i.e. the actual object) we have the values for x & y then c is the saucepan handle. The statement c2 = c then copies the handle (not the object) so our saucepan now has 2 handles. We can use either handle to change the contents of the bowl!

Useful knowledge for anyone experimenting with OO.
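
To put the saucepan picture into code, here is a small plain-Java sketch that contrasts copying the handle with copying the contents. Coord and AliasDemo are hypothetical names; Coord mirrors the Coord2D example but adds a copy constructor, which is an assumption of this sketch and not part of the original class.

```java
// Hypothetical coordinate class mirroring Coord2D, plus a copy constructor.
final class Coord {
    float x;
    float y;

    Coord(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Copy constructor: builds a second, independent object.
    Coord(Coord other) {
        this(other.x, other.y);
    }
}

final class AliasDemo {
    // Returns c.x after modifying c through a second reference to it.
    static float changeThroughAlias() {
        Coord c = new Coord(23.5803f, -1.296f);
        Coord c2 = c;             // a second handle on the same saucepan
        c2.x = 99f;
        return c.x;               // 99: c and c2 are the same object
    }

    // Returns c.x after modifying a real copy of c.
    static float changeThroughCopy() {
        Coord c = new Coord(23.5803f, -1.296f);
        Coord c2 = new Coord(c);  // a second saucepan with the same contents
        c2.x = 99f;
        return c.x;               // unchanged: the original is untouched
    }
}
```

PVector users get the same effect from a copying method rather than plain assignment.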

:)

Re: Discussion: Idiom: 2-element array instead of x, y
Reply #7 - Oct 31st, 2009, 12:27am
 
Quote:
Processing IDE was designed to allow the creation of sketches without a knowledge of OO

Exactly! As you point out, procedural coding is simpler and less intimidating.
Using arrays is simpler than using ArrayLists, hence the tricks mentioned above, which are quite common (not in the Java world, but in the Processing world, in the newbies category...). :)
Somehow, Processing even facilitates the usage of simple arrays with its functions for manipulating them (extending, truncating, etc.), slow but convenient for beginners.

The "assignment isn't a copy" is a good example of why OO can be perceived as difficult. Also the "I declared an array but it doesn't take my values" (no new), or "I made new Stuff[N] but I can't assign values to my objects" (no new Stuff() N times), etc. When users have a little background in other programming languages, it can be even more confusing!
But fortunately, motivated people learn quite quickly...
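
The second pitfall above (an array of objects created with new Stuff[N] but never filled) can be sketched like this in plain Java. Stuff is the placeholder name from the post; the value field and the ArrayInitDemo class are hypothetical additions for the example.

```java
// Hypothetical element type, standing in for the "Stuff" mentioned above.
final class Stuff {
    int value;
}

final class ArrayInitDemo {
    // new Stuff[n] creates n empty slots (all null), not n objects.
    static Stuff[] slotsOnly(int n) {
        return new Stuff[n];
    }

    // Each slot must be given its own new Stuff() before it can be used.
    static Stuff[] filled(int n) {
        Stuff[] items = new Stuff[n];
        for (int i = 0; i < items.length; i++) {
            items[i] = new Stuff();
            items[i].value = i;
        }
        return items;
    }
}
```

Accessing slotsOnly(3)[0].value would throw a NullPointerException, which is usually how beginners first meet this problem.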

Note: I started to program, in Basic, on computers with 1MHz CPUs (yes, MHz, not GHz!). I coded a lot in C, quite a bit in assembly language, etc.
So my mind is used to micro-optimizations, which are quite important at such "speeds" and in these languages, interpreted or poorly optimized.
Today we have complex languages with optimizing compilers and VMs, etc., so we have to forget most of our prejudices on languages.
In C, it was stupid to write:
for (i = 0; i < strlen(str); i++)
because strlen computed the length of the string by iterating on it!
So it was better to pre-compute this value.
In JavaScript, you still had a gain by writing:
for (i = 0, l = str.length; i < l; i++)
for slightly different reasons. Even that might depend on the JS engine (different in Chrome, Firefox, IE...).
In Java, it is OK to put str.length() in the test part. The apparent call is probably optimized by the JIT compiler.

If you want to improve the performance of your sketch, first look at the tight loops, doing millions of iterations on each frame.
I often recommend moving stuff like smooth() from draw() to setup() if invariant, but that is not for speed, more for readability/logic.
And to optimize, measure times! Several times, since computers are multitasking today. Find out whether an optimization really gains speed. Don't assume things.
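
As a rough idea of what "measure times, several times" could look like, here is a plain-Java sketch that times the array idiom against two separate fields. Everything in it (TimingSketch, bestNanos, the iteration counts) is hypothetical, and a naive loop like this is easily distorted by the JIT, so treat the numbers as a hint at best; a proper benchmark harness would be more trustworthy.

```java
// Rough timing harness; a hypothetical sketch, not a rigorous benchmark.
final class TimingSketch {
    static float[] locArray = { 1f, 2f };
    static float locX = 1f;
    static float locY = 2f;

    // Written by the timed tasks so the JIT cannot simply discard the loops.
    static volatile double sink;

    // Sums x and y read from a 2-element array.
    static double sumArray(int iterations) {
        double total = 0;
        for (int i = 0; i < iterations; i++) {
            total += locArray[0] + locArray[1];
        }
        return total;
    }

    // Sums x and y read from two separate fields.
    static double sumFields(int iterations) {
        double total = 0;
        for (int i = 0; i < iterations; i++) {
            total += locX + locY;
        }
        return total;
    }

    // Runs the task several times and returns the best time in nanoseconds.
    static long bestNanos(Runnable task, int repeats) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < repeats; r++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        final int n = 10_000_000;
        long arrayTime = bestNanos(() -> sink = sumArray(n), 5);
        long fieldTime = bestNanos(() -> sink = sumFields(n), 5);
        System.out.println("array:  " + arrayTime + " ns");
        System.out.println("fields: " + fieldTime + " ns");
    }
}
```

Taking the best of several runs is one simple way to reduce noise from other tasks running on the machine.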
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #8 - Nov 1st, 2009, 12:05am
 
My main point here is that if someone new to programming is going to do something with coordinates (2D or 3D) in Processing, they should be discouraged from learning the 2- or 3-element array (or worse yet, extra dimension on some other array) method.

Encourage them to learn the basics of using a PVector, which in some respects is simpler than learning to use arrays, is easier to read, etc.

Seeing a new programmer resorting to creating constants "x" and "y" is a good enough example for me to see that this idiom is bad news for new programmers. If people don't see it everywhere, they won't copy it over and over (and hence convince other people to do the same).

-spxl
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #9 - Nov 1st, 2009, 12:13am
 
Aside: see spxlOrigamiButterfly for an example of replacing three-dimensional arrays with simple arrays and the associated performance improvements (timed), listed in the program comments. Note that the optimisations (and timings) were performed in that order, so it is not clear what effect the last array manipulation would have had if it had been done first.

-spxl
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #10 - Nov 1st, 2009, 12:51am
 
subpixel wrote on Nov 1st, 2009, 12:05am:
Encourage them to learn the basics of using a PVector, which in some respects is simpler than learning to use arrays, is easier to read, etc.

I agree, somehow it is like using String objects, they are here to be used without even knowing OO concepts.

The idiom might have taken hold before PVector existed: it is relatively recent, and before it the options were either "use a couple of arrays" or "make your own version" of a vector, which meant going back to knowing the basics of OO (not much, though) and multiplying incompatible versions.
And since there are already so many examples with the arrays, the idiom is still prone to be taken as an example.
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #11 - Nov 1st, 2009, 4:28am
 
With regard to comparing arrays with objects consider this sketch
Code:
void setup(){
 float[] p = {2.878, 3.142};
 
 println(p[0] + "\t" + p[1]);
 
 float[] p2 = p;
 p2[1] = 9.999;
 
 println(p[0] + "\t" + p[1]);
}


The output is
2.878      3.142
2.878      9.999
in other words p and p2 refer to the same array!

In fact Java treats arrays as objects so
Code:

 float[] p = {2.878, 3.142};

// is same as

 float[] p = new float[2];
 p[0] = 2.878;
 p[1] = 3.142;


In Java the new operator instantiates/creates an object.

In your butterfly program you were using multidimensional arrays. In Java a 2D array is a 1D array of arrays, and a 3D array is a 1D array of 1D arrays of arrays. This might seem confusing, but effectively we have a lot of arrays, each one being treated as an object and subject to the same garbage collection rules as any other object.

In your program you started with 3D arrays, e.g. folds[][][]. When you need to access a particular element, e.g. folds[5][0][1], Java has to do a lot of work: first it finds the location of the array folds[5][][], then it uses the array at this location to find the array folds[5][0][], and finally it finds the element at folds[5][0][1]. This is why you got improved performance when you used smaller-dimension arrays: you are reducing the number of calculations.
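
A small plain-Java sketch of the same idea: the nested lookup folds[5][0][1] follows two array references before reaching the element, whereas a flat array needs one index calculation. The dimensions (6 x 2 x 2) and the names FlatArrayDemo, flatIndex and flatten are hypothetical and are not taken from the butterfly sketch.

```java
// Hypothetical illustration; the dimensions are made up for the example.
final class FlatArrayDemo {
    static final int A = 6;
    static final int B = 2;
    static final int C = 2;

    // Maps a 3D index (i, j, k) to a position in a flat array of length A * B * C.
    static int flatIndex(int i, int j, int k) {
        return (i * B + j) * C + k;
    }

    // Copies a rectangular A x B x C nested array into one flat array.
    static float[] flatten(float[][][] nested) {
        float[] flat = new float[A * B * C];
        for (int i = 0; i < A; i++) {
            for (int j = 0; j < B; j++) {
                for (int k = 0; k < C; k++) {
                    flat[flatIndex(i, j, k)] = nested[i][j][k];
                }
            }
        }
        return flat;
    }
}
```

Named fields or separate x and y arrays, as suggested earlier in the thread, remove the last index altogether.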

subpixel, I love the origami butterfly, it looks great. :D
Re: Discussion: Idiom: 2-element array instead of x, y
Reply #12 - Nov 2nd, 2009, 1:44pm
 
Quark wrote on Nov 1st, 2009, 4:28am:
In your program you started with 3D arrays, e.g. folds[][][]. When you need to access a particular element, e.g. folds[5][0][1], Java has to do a lot of work: first it finds the location of the array folds[5][][], then it uses the array at this location to find the array folds[5][0][], and finally it finds the element at folds[5][0][1]. This is why you got improved performance when you used smaller-dimension arrays: you are reducing the number of calculations.

It wasn't my program, nor my 3D array... it took me a while to figure out what the f*k this program was doing because of these unnamed array dimensions/elements! I just untangled the logic a bit and "optimised it" because I wanted to see it animated instead of a random static image every couple of seconds. I still don't fully understand the algorithm, but I at least understood the data structure and how it was accessed.

With regard to your reasoning for the improved performance: refer to my original post (reducing array dimensions or removing use of arrays altogether -> improved performance).

-spxl