Sequential elimination of rows in 2d array / Retention of rows that differ on all elements
Hi everyone,
My data is stored in a two-dimensional array whose rows correspond to different products and whose row elements correspond to product components. Product values are stored in a separate one-dimensional array.
I would like to retain only the products with highest value that consist of distinct and unique components, i.e. identify and retain the rows of the 2d array that differ on all elements. Here is an example:
// 2d array with product IDs
int [][] duplicate =
{{0, 1, 2, 3}, // retain
{0, 5, 6, 11}, // remove
{4, 1, 6, 3}, // remove
{8, 9, 10, 11}}; // retain
// 1d array with product values
int [] value = {10,4,9,15};
int rows =4;// number of components (a constant equal to 4)
// My objective is to filter out distinct products:
int [][] distProd =
{{0, 1, 2, 3},
{8, 9, 10, 11}};
My idea was (1) to identify the product with the highest value, (2) check if other products overlap with the best one; (3) if products do not have overlapping components with the best one, they are stored in a new array. Here is the code:
int [][] duplicate =
{{0, 1, 2, 3}, // retain
{0, 5, 6, 11}, // remove
{4, 1, 6, 3}, // remove
{8, 9, 10, 11}}; // retain
// 1d array with product values
int [] value = {10,4,9,15};
int rows =4;// number of components (a constant equal to 4)
int [] maxval = new int [rows]; // product id with the max value of the fixed length
int max = max(value); // 15
for (int i=0;i<duplicate.length; i++){
if (value[i]==max) {
maxval = duplicate[i];
}
}
println(maxval); // 8, 9, 10, 11
// check if there are duplicates by subtracting each array from the array with the max value:
// if products consist of distinct cells, none of the cells in the test array should be equal to zero
int [][] dup_test = new int [duplicate.length][rows];
for (int i=0;i<duplicate.length; i++){
for (int j=0; j<duplicate[i].length;j++){
dup_test[i][j]=maxval[j]-duplicate[i][j];
}
}
for (int i=0;i<duplicate.length; i++){
println(i);
println(dup_test[i]);
}
//initialise boolean array
boolean [][] isZero = new boolean [duplicate.length][rows];
for (int i = 0; i<duplicate.length; i++){
for (int j = 0; j<duplicate[i].length; j++){
isZero[i][j] = false;}
}
// check which positions are zeros
for (int i = 0; i<duplicate.length; i++){
for (int j = 0; j<duplicate[i].length; j++){
if (dup_test[i][j]==0) {isZero[i][j] = true;}
}
}
for (int i = 0; i<duplicate.length; i++){
println(i);
println(isZero[i]);
}
// count how many zeros in each product
int [] zeros = new int [isZero.length];
int count_null =0;
for (int i = 0; i<isZero.length; i++){
for (int j = 0; j<isZero[i].length;j++){
if (isZero[i][j]) {count_null++;}
}
zeros [i] = count_null; count_null =0; // store the count, then reset it for the next row
}
for (int i = 0; i<zeros.length; i++){
println(zeros[i]);
}
// filter out distinct products (zeros [i]=0) and the product with max value (zeros[i]=4)
int [][] distinct = new int [duplicate.length][rows]; // temp array with excessive length
int count_dist =0;
for (int i = 0; i<duplicate.length; i++){
if (zeros[i] ==rows || zeros [i] == 0) {distinct[count_dist]=duplicate[i]; count_dist++;}
}
for (int i = 0; i<zeros.length; i++){
println(i);
println(distinct[i]);
}
// count how many distinct products (to input as a length of an array)
int count_cutoff =0;
for (int i =0; i<zeros.length; i++){
if (zeros[i] ==rows || zeros [i] == 0) {count_cutoff++;}
}
int [][] distProd = Arrays.copyOf(distinct,count_cutoff);
// distProd now contains: {{0, 1, 2, 3},
// {4, 1, 6, 3},
// {8, 9, 10, 11}};
Question 1: I was wondering if there is a shorter and more elegant way to arrive at the desired solution. Moreover, the current version uses Arrays.copyOf which is no longer available in more recent versions of Processing.
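For the trimming step specifically, here is a minimal sketch of one way to shorten the temporary array without Arrays.copyOf, using System.arraycopy from core Java. The class name TrimRows and the method name firstRows are made up for illustration. Since Arrays lives in java.util, adding "import java.util.Arrays;" at the top of the sketch may also be enough, but I have not verified that for every Processing version.

```java
// Hypothetical helper (names are illustrative, not from any library).
class TrimRows {

  // Returns a new array holding the first `count` rows of `source`.
  // Only the row references are copied, as with Arrays.copyOf on a 2d array.
  static int[][] firstRows(int[][] source, int count) {
    int[][] trimmed = new int[count][];
    System.arraycopy(source, 0, trimmed, 0, count);
    return trimmed;
  }
}
```

With this, the last step above would read: int [][] distProd = TrimRows.firstRows(distinct, count_cutoff); Inside a Processing sketch the method could probably be pasted in directly, without the class wrapper and the static keyword.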
Question 2: The code only partially fulfills its purpose: although the retained products have no overlapping cells with the max value product (8, 9, 10, 11), they can still share components among themselves (as in the case of 0, 1, 2, 3 and 4, 1, 6, 3). I could repeat the same algorithm, comparing elements to the second best product in distProd, but maybe there is an easier way.
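To make the "repeat for the second best product" idea concrete, here is a rough sketch of a greedy version in plain Java: visit the products from highest to lowest value and keep a product only if none of its components has already been used by a kept product. The names DistinctProducts and selectDistinct are made up for illustration. Note one assumption that differs from my code above: the sketch treats two products as overlapping if they share a component ID in any position, not only in the same column (in my example data the two coincide). Ties in value are broken arbitrarily.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class and method names, for illustration only.
class DistinctProducts {

  // Greedy selection: highest value first, keep a product only if it shares
  // no component with a product that has already been kept.
  static int[][] selectDistinct(int[][] products, int[] values) {
    int n = products.length;

    // Row indices ordered by descending value (simple selection sort).
    int[] order = new int[n];
    for (int i = 0; i < n; i++) order[i] = i;
    for (int i = 0; i < n - 1; i++) {
      int best = i;
      for (int j = i + 1; j < n; j++) {
        if (values[order[j]] > values[order[best]]) best = j;
      }
      int tmp = order[i]; order[i] = order[best]; order[best] = tmp;
    }

    Set<Integer> used = new HashSet<Integer>(); // components already taken
    boolean[] keep = new boolean[n];
    for (int k = 0; k < n; k++) {
      int[] row = products[order[k]];
      boolean overlaps = false;
      for (int c : row) {
        if (used.contains(c)) { overlaps = true; break; }
      }
      if (!overlaps) {
        for (int c : row) used.add(c);
        keep[order[k]] = true;
      }
    }

    // Collect the kept rows in their original order.
    List<int[]> kept = new ArrayList<int[]>();
    for (int i = 0; i < n; i++) {
      if (keep[i]) kept.add(products[i]);
    }
    return kept.toArray(new int[kept.size()][]);
  }

  public static void main(String[] args) {
    int[][] duplicate = {{0, 1, 2, 3}, {0, 5, 6, 11}, {4, 1, 6, 3}, {8, 9, 10, 11}};
    int[] value = {10, 4, 9, 15};
    // prints [[0, 1, 2, 3], [8, 9, 10, 11]]
    System.out.println(Arrays.deepToString(selectDistinct(duplicate, value)));
  }
}
```

Because the kept rows are collected in a list, no oversized temporary array or Arrays.copyOf call is needed. A greedy pass like this keeps the best product and then the best remaining non-overlapping one, which matches what I described, but it is not guaranteed to maximise the total value of the retained products.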
Thank you very much for your consideration!