Clarification on: "Great care must be exercised if mutable objects are used as set elements..."

_vk_vk
edited October 2014 in Programming Questions

This is from Set javaDoc

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.

Does this means that if I change a field in an obj in the set (using iterator()) I need to remove and reinsert it to guarantee that the Set will be kept with only unique elements? That sounds weird...

My case is like

HashSet hs.

Obj1 obToAdd

if obToAdd exists (it equality is based in one field only, so they might be different in the deep, that's desired)

Add to existing obj the entries in the obToAdd that are different.

else just add.

So I guess in this case I'm good, because I'm not changing the field that is used to hashCode and equals(). Is that right?

But if I change that field, what should I do?

thanks.

Answers

  • edited October 2014 Answer ✓

    Well, take a look at hashCode() from the class example of your previous thread:
    http://forum.processing.org/two/discussion/7691/generating-hashcode-questions-

    @ Override int hashCode() { // 4-byte unique hash:
      return myByte<<030 | (int)(myFloat*100)<<020 | myShort;
    }
    

    Obviously, if any of those 3 fields change, another hashCode() will result from it!
    And even more seriously, previous equals() will be false when compared to its updated state!

    @ Override boolean equals(Object o) {
      return o instanceof DataHolder
        ? o.hashCode() == hashCode() : false;
    }
    

    But that is only relevant if the objects of this class are used as keys for Map data structures. Not as values: <):)
    http://docs.oracle.com/javase/8/docs/api/java/util/Map.html

    But for Set data structures there's no escape:
    http://docs.oracle.com/javase/8/docs/api/java/util/Set.html

    I believe it should be removed before changing any of its internal fields. And after the change, re-inserted! :-SS

  • _vk_vk
    edited October 2014

    Thanks once more. Perhaps I'm asking the wrong question...

    I choose set for 3 reasons:

    • I don't care about the order.

    • I can't have duplicates.

    • I enjoy trying stuff I've never used before... try to further learn

    But if I will have to remove/reinsert perhaps I could have used a List and have a transient Set in a noDupes function...

    What do you think?

  • edited October 2014 Answer ✓

    Set is related to equals(). What makes an instance of an object equal to another 1? What are the parameters?

    Let's say you've got a class w/ 5 fields: x, y, w, h, c. And let's say that you got 2 objects of it.
    Both of them got the same values for w, h & c. But their x & y are different.
    Should they be considered diff. objects b/c their coordinate pair position isn't the same? 8-X
    Or should x & y be irrelevant when determining uniqueness between them? :-?

    If you decide for the latter, you've gotta remove x & y from both hashCode() & equals().
    And when you throw them into your HashSet, latest 1 put() gonna replace the older!
    B/c they're "duplicate" in relation to the equality rules inside equals()! :-B

    Another consideration:
    Any field or data structure involved gotta be kept immutable as long they're in a Set or it's a key to a Map! (~~)

    For example, if fields x & y aren't part of neither hashCode() nor equals(), they can be changed freely, b/c they wouldn't break the object uniqueness. ^#(^

  • _vk_vk
    edited October 2014

    Thanks I think I got all that.

  • Assume you have a class that has the attributes a, b, c, d, e & f

    Further assume that the hashCode and equals methods have been overridden and use attributes a, b & c in their implementation.

    Therefore we can change the values d, e and f anytime without affecting the objects hashCode value or change its equality value.

    Now the tricky part, what happens if we want to change a, b & c?

    The answer is it depends.

    If the object is stored in a collection that uses the hashCode or equals method for deciding on where the objects are stored (e.g. HashMap, HashSet) then the values a, b, c MUST NOT BE CHANGED because the object will not be found afterwards. The following code demonstrates this

    import java.util.*;
    
    HashSet<MyClass> set = new HashSet<MyClass>();
    
    void setup() {
      MyClass c0, c1;
      c0 = new MyClass();
      c1 = new MyClass();
      set.add(c0);
      set.add(c1);
      println(set.contains(c1)); // finds c1
      c1.a = 999;
      println(set.contains(c1)); // cannot find c1
    }
    
    class MyClass {
      int a = 1;
      int b = 2;
      int c = 3;
      int d = 4;
      int e = 5;
      int f = 6;
    
      public boolean equals(Object obj) {
        MyClass other = (MyClass) obj;
        return a == other.a && b == other.b && c == other.c;
      }
    
      public int hashCode() {
        int prime = 31;
        int result = 1;
        result = prime * result + a;
        result = prime * result + b;
        result = prime * result + c;
        return result;
      }
    }
    

    The output for this is true then false

    If you have to change a, b or c then the object should be removed, changed and then re-added.

    Generally when you examine the problem domain you will find attributes that can be used as immutable keys. For instance passport number, driving license ID, social security number all make good keys because the are immutable, a person might change their address or even their name but they keep these key values don't change. These immutable keys can then be used in hashCode and equals.

    But if I will have to remove/reinsert perhaps I could have used a List and have a transient Set in a noDupes function...

    But you still end up with a set. It is much faster to remove, modify and re-add an object to a HashSet that it is find the object and modify the object in a list.

    For instance to remove, modify and re-add an object to a HashSet takes constant time O(1) no matter how many objects are in it. The time it takes to find an object in an unordered list is proportional to the number of objects in the list O(n)

  • Not a question about code, so I moved the topic.

    Good question, great answers, nothing to add...

  • _vk_vk
    edited October 2014

    Hey there @quark. I really appreciated all this information you kindly gave me. Thanks a lot! It definitely helped. Right now I'm stuck with a bug that I can't find, so I'm not being able to really advance in the code and try all this. So I'm going back some steps to rebuild things. It's a nasty bug that shows only some times... When I test each module they seems to work, but not when integrated... What makes it worse. Even for asking for help. Also, these days I'm very busy, so not much time to code after hours :)

    Just to check that I'm doing the right thing with my HashSet, is this the proper way to interact with them? I mean the damn thing don't even have a get()!!!

    Do I always need a temp instance so I can change it? I'm probably missing something... Here is what I'm thinking using your classes: edit: I'm not previously changing c1.a to 999 for this example.

     Iterator <MyClass> i = set.iterator();
    
      // this a local var to hold "extracted" instance
      MyClass gotFromSet=null;
    
      while (i.hasNext ()) {
    
        // this is necessary, besides convenient
       //cause if I call i.next() twice in one iteration
       // I'll got the next before wanted, right?
        MyClass m = i.next();
    
        if (m.equals(c1)) {
          gotFromSet = m; 
          set.remove(m);
        }
      }
    
      //make changes
      gotFromSet.a =999;
      println("gfs.a = "+gotFromSet.a);
    
      // kind of re-assign modified c1 to c1      
      c1 = gotFromSet;
      println("c1.a = " + c1.a);
    
     // reinsert
      set.add(c1);
    
     //re adding here just to check no dupes
      set.add(gotFromSet);
    
      println("c1 is equal to gfS:" + c1.equals(gotFromSet));
      println("set still with only one obj as intented..."+set.size());
    

    @PhiLho thanks for moving to proper category, my bad :)

  • It's a nasty bug that shows only some times...

    These are always the hardest bugs to find, but the fact that it is intermittent is a clue in itself. In this situation you have to think what is different between a good run and a bad run. I suspect that there is no significant difference in the program logic so it must be with the data, in other words with the set.

    Now adding data to set does not cause a run-time error even if the data being added is null. So the problem is most likely when data is being removed from the set and that that is the problem here.

    When using an iterator to iterate over a set, the ONLY safe way to remove an object from the set is using the iterator-remove method. In line 15 of your code you are using the set-remove method, this will invalidate the iterator. Replace line 15 with

    i.remove();

    I also suggest that you don't use i as an identifier for an iterator. For decades programmers have used i, j and k as integers and most programmers looking at some code and seeing i would assume it is an integer. Personally I always use iter and if I need more than one iter1, iter2 etc. Of course it is only a suggestion and you are more than welcome to ignore it.

    Anyway I have taken the liberty of extending the code I posted earlier to demonstrate how you might safely change the attribute values of an object in a set, when those attributes are used for testing equality.

    HTH

    import java.util.*;
    
    HashSet<MyClass> set = new HashSet<MyClass>();
    Iterator<MyClass> iter;
    
    void setup() {
      MyClass temp;
      temp = new MyClass(1, 2, 3); set.add(temp);
      temp = new MyClass(4, 5, 6); set.add(temp);
      temp = new MyClass(7, 8, 9); set.add(temp);
      displaySet();
      // Find the object {4, 5, 6} and change it by
      // multipling the a,b,c by 5
      MyClass toFind = new MyClass(4, 5, 6);
      if (set.contains(toFind))
        println(toFind + " is in the set\n");
      else
        println(toFind + " is NOT in the set\n");
    
      // OK so lets do it
      MyClass gotFromSet = null;
      iter = set.iterator();
      while (iter.hasNext ()) {
        MyClass m = iter.next();
        if (m.equals(toFind)) {
          gotFromSet = m; // only do this if we find it
          iter.remove();
        }
      }
      // If we found it it was remove and is now
      // reference by gotFromSet
      if (gotFromSet != null) {
        gotFromSet.a *= 5;
        gotFromSet.b *= 5;
        gotFromSet.c *= 5;
        set.add(gotFromSet);
      }
      displaySet();
      // Find the object {4, 5, 6} and change it by
      // multipling the a,b,c by 5
      if (set.contains(toFind))
        println(toFind + " is in the set\n");
      else
        println(toFind + " is NOT in the set\n");
    }
    
    void displaySet() {
      println("~~~~~~~~~~~~~~~~~~");
      for (MyClass mine : set)
        println(mine);
      println("~~~~~~~~~~~~~~~~~~");
    }
    
    class MyClass {
      int a = 1;
      int b = 2;
      int c = 3;
      int d = 4;
      int e = 5;
      int f = 6;
    
      public  MyClass(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
      }
    
      public boolean equals(Object obj) {
        MyClass other = (MyClass) obj;
        return a == other.a && b == other.b && c == other.c;
      }
    
      public int hashCode() {
        int prime = 31;
        int result = 1;
        result = prime * result + a;
        result = prime * result + b;
        result = prime * result + c;
        return result;
      }
    
      public String toString() {
        return " {"+a+", "+b+", "+c+"}";
      }
    }
    
  • BTW I concentrated on the bug and forget to answer the question is the is the correct way to interact with them

    Basically everything is fine except for removing the object from the set. If you compare my code with yours you will see no difference in the logic used. :)

  • @quark, that's a good catch! I totally was missing this very important detail, hopefully it is the source of the bug!! I'll try that ASAP.

    Thanks for all the code examples and explanations and hints. Great help! You will never see again an iterator named i in my code :D

    I'll try all that.

    cheers

    :-bd

Sign In or Register to comment.