Friday, October 2, 2009

Java Puzzler: enforcing whether a class overrides equals/hashCode


The Problem

Let's say I'm writing a Java class that requires users of my class override equals and hashCode on their objects. I'd be in good company: lots of classes do this, not least the java.util.Collection classes like List and Set.

This important requirement is generally only recorded in the JavaDoc, with a stern warning that Bad Things Will Happen if you forget. But the requirement is not enforced at runtime, much less compile time.

The Question

What if I wanted to enforce it? What if I wanted to write a class like HashMap that made sure anything you put in it overrode equals and hashCode? What if I'm prepared to sacrifice a little performance and/or code complexity to do this? What are my choices? Can I do it at runtime? Better yet, can I do it at compile-time?

The Answer?

I don't know a great answer. I don't know of any libraries that do. I'd love some feedback on helping eliminate this important category of subtle bugs.

To get the ball rolling, this is what I'm thinking of doing in Metawidget:

Class classToTest = objectToCache.getClass();
Object dummy1 = classToTest.newInstance();
Object dummy2 = classToTest.newInstance();

if ( !dummy1.equals( dummy2 ))
throw new Exception( classToTest + " does not override .equals(), so cannot be reliably cached" );

if ( dummy1.hashCode() != dummy2.hashCode() )
throw new Exception( classToTest + " does not override .hashCode(), so cannot be reliably cached" );

This approach takes advantage of the fact that, given your class has internal state, two brand new instances should always have equivalent state, but the default Object.equals (which uses ==) will return false. It's not perfect. It'll work if you write a POJO and forget to override equals. But it won't work if your superclass overrides equals but your subclass, which adds some more internal state, forgets to. But it's a start.

Note that, annoyingly, this doesn't seem to work:

class.getDeclaredMethod( "hashCode" )

Public methods like hashCode and equals are always considered 'declared methods', even if the class doesn't actually declare them. Similarly:

class.getDeclaredMethod( "hashCode" ).getDeclaringClass()

Always returns the subclass name, even if the subclass doesn't override hashCode.

Suggestions welcome!

Update

Thanks to everyone for the really helpful comments. The first thing to note is that...

class.getDeclaredMethod( "hashCode" ).getDeclaringClass()

...does actually work. I don't know why it didn't seem to when I tried it originally. Evidentally I am an idiot!

Anyway, here's what's currently going into Metawidget, based on everyone's feedback:

Class<?> configClass = configToStoreUnder.getClass();

// Hard error

// equals

Class<?> equalsDeclaringClass = configClass.getMethod( "equals", Object.class ).getDeclaringClass();

if ( Object.class.equals( equalsDeclaringClass ) )
throw new Exception( configClass + " does not override .equals(), so cannot cache reliably" );

// hashCode

Class<?> hashCodeDeclaringClass = configClass.getMethod( "hashCode" ).getDeclaringClass();

if ( Object.class.equals( hashCodeDeclaringClass ) )
throw new Exception( configClass + " does not override .hashCode(), so cannot cache reliably" );

if ( !equalsDeclaringClass.equals( hashCodeDeclaringClass ) )
throw new Exception( equalsDeclaringClass + " implements .equals(), but .hashCode() is implemented by " + hashCodeDeclaringClass + ", so cannot cache reliably" );

if ( !configClass.equals( equalsDeclaringClass ) )
{
// Soft warning
//
// Note: only show this if the configClass appears to have its own 'state'.
// Base this assumption on whether it declares any methods. We don't want to
// use .getDeclaredFields because that requires a security manager
// check of checkMemberAccess(Member.DECLARED), whereas we may only have
// checkMemberAccess(Member.PUBLIC) permission

for ( Method declaredMethod : configClass.getMethods() )
{
if ( configClass.equals( declaredMethod.getDeclaringClass() ) )
{
LOG.warn( configClass + " does not override .equals() (only its super" + equalsDeclaringClass + " does), so may not be cached reliably" );
break;
}
}

// Note: not necessary to do !configClass.equals( hashCodeDeclaringClass ),
// as will already have thrown an Exception from
// !equalsDeclaringClass.equals( hashCodeDeclaringClass ) if that's the case
}

Improvements welcome!

7 comments:

Odd said...

The getDeclaredMethods() method of Class returns an array that only includes methods defined or redefined (overridden) in the specific class. So one way to check whether the class overrides equals/hashCode is to check if the array returned from getDeclaredMethods() includes the corresponding methods or not.

Greetings
Odd Möller

Richard said...

Odd,

Gah! I swear this didn't work when I tried it earlier (hence why I wrote this blog entry), but after your comment I tried it again and now it seems okay.

This means I can do...

if ( !classToTest.equals( classToTest.getMethod( "equals", Object.class ).getDeclaringClass() )

...which is exactly what I want. Thanks!

Richard.

ppow said...

You should also remember that the user classes can also be extended.

Se the followin: Your class is A, user wrote an abstract class B where he wrote equals/hashCode. Then created class C extending B in a way that does not need to change equals/hashcode (strange, but still possible). In that case your snippet would not work. You need the following:

if ( ! classToTest instanceof classToTest.getMethod( "equals", Object.class ).getDeclaringClass() ) {
// report error
}

What is more you probably want to check that equals and hashcode are overriden at the same place / class:

if ( classToTest.getMethod( "equals", Object.class ).getDeclaringClass() != classToTest.getMethod( "hashCode", Object.class ).getDeclaringClass()) {
// report error
}

Colin said...

Also, note that hashCode defined in Object specify : "As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)".
Your solution is (possibly) inefficient in some JVMs.

Richard said...

Thanks everyone for your helpful comments. I have updated the blog entry based on your feedback.

Further feedback most welcome!

Javin said...

Nice article but I believe its important to understand the consequences of not following this contract as well and for that its important to understand application of hashcode in collection classes e.g. How HashMap works in Java and how hashcode() of key is used to insert and retrieve object from hashMap.

Javin

Anonymous said...

I would mark the equals and hashCode methods abstract in your class and then enforce that type for your map. And maybe even mark the methods with http://code.google.com/p/jsr-305/source/browse/trunk/ri/src/main/java/javax/annotation/OverridingMethodsMustInvokeSuper.java allowing static code analysis tools to discover any classes that override equals in a child without calling super. Tada, it's now checked at compile time.