Ruby is one of the few programming languages that get equality right. I often play around with other languages, but keep coming back to Ruby. This is largely because Ruby’s implementation of equality is so nice.
Nonetheless, equality in Ruby isn't straightforward. There are #==, #eql?, #equal?, #===, and more. Even if you’re familiar with how to use them, implementing them can be a whole other story.
Let's walk through all forms of equality in Ruby and how to implement them.
Why Properly Implementing Equality Matters
We check whether objects are equal all the time. Sometimes we do this explicitly, sometimes implicitly. Here are some examples:
- Do these two Employees work in the same Team? Or, in code: denis.team == someone.team.
- Is the given DiscountCode valid for this particular Product?
- Who are the (distinct) managers for this given group of employees?
A good implementation of equality is predictable; it aligns with our understanding of equality.
An incorrect implementation of equality, on the other hand, conflicts with what we commonly assume to be true. Here is an example of what happens with such an incorrect implementation:
The geb and geb_also objects should definitely be equal. The fact that the code says they’re not is bound to cause bugs down the line. Luckily, we can implement equality ourselves and avoid this class of bugs.
No one-size-fits-all solution exists for an equality implementation. However, there are two kinds of objects where we do have a general pattern for implementing equality: entities and value objects. These two terms come from domain-driven design (DDD), but they’re relevant even if you’re not using DDD. Let’s take a closer look.
Entities are objects that have an explicit identity attribute. Often, entities are stored in some database and have a unique id attribute corresponding to a unique id table column. The following Employee example class is such an entity:
Two entities are equal when their IDs are equal. All other attributes are ignored. After all, an employee’s name might change, but that does not change their identity. Imagine getting married, changing your name, and not getting paid anymore because HR has no clue who you are anymore!
ActiveRecord, the ORM that is part of Ruby on Rails, calls entities "models" instead, but they’re the same concept. These model objects automatically have an ID. In fact, ActiveRecord models already implement equality correctly out of the box!
Value objects are objects without an explicit identity. Instead, their value as a whole constitutes identity. Consider this Point class:
Points will be equal if their x and y values are equal. The x and y values constitute the identity of the point.
In Ruby, the basic value object types are numbers (both integers and floating-point numbers), characters, booleans, and nil. For these basic types, equality works out of the box:
Arrays of value objects are in themselves also value objects. Equality for arrays of value objects works out of the box; for example, [17, true] == [17, true]. This might seem obvious, but this isn’t true in all programming languages.
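As a small sketch of the out-of-the-box behavior described above (the variable names are only illustrative), basic values and arrays of them compare by content:

```ruby
# Basic value types compare by value, not by identity.
numbers_equal  = 17 == 17
booleans_equal = true == true
nils_equal     = nil == nil

# Arrays compare element by element, so order matters.
arrays_equal   = [17, true] == [17, true]
arrays_differ  = [17, true] == [true, 17]

puts numbers_equal, booleans_equal, nils_equal # true, true, true
puts arrays_equal                              # true
puts arrays_differ                             # false
```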
Other examples of value objects are timestamps, date ranges, time intervals, colors, 3D coordinates, and money objects. These are built from other value objects; for example, a money object consists of a fixed-decimal number and a currency code string.
Basic Equality (Double Equals)
Ruby has the == and != operators for checking whether two objects are equal or not:
Ruby’s built-in types all have a sensible implementation of ==. Some frameworks and libraries provide custom types, which will have a sensible implementation of ==, too. Here is an example with ActiveRecord:
For custom classes, the == operator returns true if and only if the two objects are the same instance. Ruby does this by checking whether the internal object IDs are equal. These internal object IDs are accessible using #__id__. Conceptually, gizmo == thing is the same as gizmo.__id__ == thing.__id__.
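To make the default concrete, here is a minimal sketch. Gizmo is a hypothetical class made up for this illustration; it defines no #== of its own, so it inherits the identity-based default:

```ruby
# Gizmo is a hypothetical class that does not define #==,
# so it falls back to the identity-based default.
class Gizmo
  def initialize(name)
    @name = name
  end
end

gizmo = Gizmo.new("sprocket")
thing = Gizmo.new("sprocket")

same_instance  = gizmo == gizmo               # same object
other_instance = gizmo == thing               # same data, different object
ids_match      = gizmo.__id__ == thing.__id__ # internal IDs differ as well

puts same_instance  # true
puts other_instance # false
puts ids_match      # false
```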
This behavior is often not a good default, however. To illustrate this, consider the Point class from earlier:
The == operator will return true only when calling it on itself:
This default behavior is often undesirable in custom classes. After all, two points are equal if (and only if) their x and y values are equal. This behavior is undesirable for value objects (such as Point) and entities (such as the Employee class mentioned earlier).
The desired behavior for value objects and entities is as follows:
- For value objects (a), we’d like to check whether all attributes are equal.
- For entities (b), we’d like to check whether the explicit ID attributes are equal.
- By default (c), Ruby checks whether the internal object IDs are equal.
Instances of Point are value objects. With the above in mind, a good implementation of == for Point would look as follows:
This implementation checks all attributes and the class of both objects. Because it checks the class, comparing a Point instance with something of a different class returns false rather than raising an exception.
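A minimal sketch consistent with that description follows. The constructor signature and attribute readers are assumptions for illustration; the part that matters is the class check followed by the attribute checks:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Equal when the other object is also a Point and both coordinates match.
  # The class check short-circuits, so other.x is never called on a non-Point.
  def ==(other)
    self.class == other.class &&
      @x == other.x &&
      @y == other.y
  end
end

a = Point.new(3, 7)
b = Point.new(3, 7)
c = Point.new(4, 7)

puts(a == b)        # true
puts(a == c)        # false
puts(a != c)        # true, Ruby derives != from ==
puts(a == "(3, 7)") # false, and no exception
```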
Checking equality on Point objects now works as intended:
The != operator works too:
A correct implementation of equality has three properties: reflexivity, symmetry, and transitivity.
Reflexivity (a): An object is equal to itself: a == a
Symmetry (b): If a == b, then b == a
Transitivity (c): If a == b and b == c, then a == c
These properties embody a common understanding of what equality means. Ruby won’t check these properties for you, so you’ll have to be vigilant to ensure you don’t break these properties when implementing equality yourself.
IEEE 754 and violations of reflexivity
It seems natural that something would be equal to itself, but there is an exception. IEEE 754 defines NaN (Not a Number) as a value resulting from an undefined floating-point operation, such as dividing 0 by 0. NaN, by definition, is not equal to itself. You can see this for yourself:
This means that == in Ruby is not universally reflexive. Luckily, exceptions to reflexivity are exceedingly rare; this is the only exception I am aware of.
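A short sketch of the NaN behavior, using floating-point division of zero by zero to produce the value:

```ruby
nan = 0.0 / 0.0 # floating-point 0/0 yields NaN rather than raising

is_nan          = nan.nan?
equal_to_itself = nan == nan

puts is_nan          # true
puts equal_to_itself # false
```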
Basic Equality for Value Objects
The Point class is an example of a value object. The identity of a value object, and thereby equality, is based on all its attributes. That is exactly what the earlier example does:
Basic Equality for Entities
Entities are objects with an explicit identity attribute, commonly @id. Unlike value objects, an entity is equal to another entity if and only if their explicit identities are equal.
Entities are uniquely identifiable objects. Typically, any database record with an id column corresponds to an entity. Consider the following Employee entity class:
Other forms of ID are possible too. For example, books have an ISBN, and recordings have an ISRC. But if you have a library with multiple copies of the same book, then ISBN won’t uniquely identify your books anymore.
For entities, the == operator is more involved to implement than for value objects:
This code does the following:
- The super call invokes the default implementation of equality: Object#==. On Object, the #== method returns true if and only if the two objects are the same instance. This super call, therefore, ensures that the reflexivity property always holds.
- As with Point, the implementation of Employee#== checks the class. This way, an Employee instance can be checked for equality against objects of other classes, and this will always return false.
- If @id is nil, the entity is considered not equal to any other entity. This is useful for newly-created entities which have not been persisted yet.
- Lastly, this implementation checks whether the ID is the same as the ID of the other entity. If so, the two entities are equal.
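The four steps above can be sketched as follows. The constructor and the name attribute are assumptions for illustration; the structure of #== follows the list:

```ruby
class Employee
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  def ==(other)
    # 1. super is the identity check, which keeps #== reflexive.
    # 2. Objects of other classes are never equal.
    # 3. An entity without an ID equals nothing but itself.
    # 4. Otherwise, compare the explicit IDs.
    super || (
      other.class == self.class &&
      !@id.nil? &&
      other.id == @id
    )
  end
end

denis     = Employee.new(1, "Denis")
renamed   = Employee.new(1, "Denis D.")
colleague = Employee.new(2, "Denis")
unsaved_a = Employee.new(nil, "New hire")
unsaved_b = Employee.new(nil, "New hire")

puts(denis == renamed)       # true, same ID despite a different name
puts(denis == colleague)     # false, different ID
puts(unsaved_a == unsaved_b) # false, neither has an ID yet
puts(unsaved_a == unsaved_a) # true, same instance
```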
Checking equality on entities now works as intended:
Blog post of Theseus
Implementing equality on entity objects isn’t always straightforward. An object might have an id attribute that doesn’t quite align with the object’s conceptual identity.
Consider a BlogPost class, for example, with id, title, and body attributes. Imagine creating a BlogPost, then halfway through writing the body for it, scratching everything and starting over with a new title and a new body. The id of that BlogPost will still be the same, but is it still the same blog post?
If I follow a Twitter account that later gets hacked and turned into a cryptocurrency spambot, is it still the same Twitter account?
These questions don’t have a proper answer. That’s not surprising, as this is essentially the Ship of Theseus thought experiment. Luckily, in the world of computers, the generally accepted answer seems to be yes: if two entities have the same id, then the entities are equal as well.
Basic Equality with Type Coercion
Typically, an object is not equal to an object of a different class. However, this isn’t always the case. Consider integers and floating-point numbers:
Here, float_two is an instance of Float, and integer_two is an instance of Integer. They are equal: float_two == integer_two is true, despite different classes. Instances of Integer and Float are interchangeable when it comes to equality.
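A small sketch of the comparison just described, with the values 2.0 and 2 assumed from the variable names:

```ruby
float_two   = 2.0
integer_two = 2

puts float_two.class   # Float
puts integer_two.class # Integer

float_vs_integer = float_two == integer_two
integer_vs_float = integer_two == float_two

puts float_vs_integer # true
puts integer_vs_float # true
```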
As a second example, consider this Path class. The Path class provides an API for creating paths:
The Path class is a value object, and implementing #== could be done just as with other value objects:
However, the Path class is special because it represents a value that could be considered a string. The == operator will return false when checking equality with anything that isn’t a Path:
It can be beneficial for path == "/usr/bin/ruby" to be true rather than false. To make this happen, the == operator needs to be implemented differently:
This implementation of == coerces both objects to Strings, and then checks whether they are equal. Checking equality of a Path now works:
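Here is a minimal sketch of such a coercing #==. The constructor (a list of path components) and the string helper are assumptions made for this illustration; the coercion through #to_str is the technique the text describes:

```ruby
class Path
  attr_reader :components

  # Assumed API: a path is built from its components.
  def initialize(*components)
    @components = components
  end

  def string
    "/" + @components.join("/")
  end

  # By convention, #to_str marks a type as interchangeable with String.
  def to_str
    string
  end

  # Coerce both sides to strings, then compare the strings.
  def ==(other)
    other.respond_to?(:to_str) && to_str == other.to_str
  end
end

path = Path.new("usr", "bin", "ruby")

puts(path == "/usr/bin/ruby")               # true
puts(path == Path.new("usr", "bin", "ruby")) # true
puts("/usr/bin/ruby" == path)               # true, String#== defers to Path#==
puts(path == 42)                            # false, 42 has no #to_str
```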
This class implements #to_str, rather than #to_s. These methods both return strings, but by convention, the to_str method is only implemented on types that are interchangeable with strings.
The Path class is such a type. By implementing Path#to_str, the implementation states that this class behaves like a String. For example, it’s now possible to pass a Path (rather than a String) to File.open, and it will work because File.open accepts anything that responds to #to_str.
String#== also uses the to_str method. Because of this, the == operator is symmetric: "/usr/bin/ruby" == path gives the same result as path == "/usr/bin/ruby".
Ruby provides #equal? to check whether two objects are the same instance:
Here, we end up with two String instances with the same content. Because they are distinct instances, #equal? returns false, and because their content is the same, #== returns true.
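A quick sketch of that difference. The string content is arbitrary, and String.new is used so that each variable is guaranteed to hold its own instance:

```ruby
first  = String.new("orange")
second = String.new("orange")

same_instance = first.equal?(second)
same_content  = first == second

puts same_instance       # false, two distinct objects
puts same_content        # true, identical content
puts first.equal?(first) # true, an object is always the same instance as itself
```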
Do not implement #equal? in your own classes. It isn’t meant to be overridden. It’ll all end in tears.
Earlier in this post, I mentioned that #== has the property of reflexivity: an object is always equal to itself. Here is a related property for #equal?:
Property: Given objects a and b. If a.equal?(b), then a == b.
Ruby won't automatically validate this property for your code. It’s up to you to ensure that this property holds when you implement the equality methods.
For example, recall the implementation of Employee#== from earlier in this article:
The call to super on the first line makes this implementation of #== reflexive. This super invokes the default implementation of #==, which delegates to #equal?. Therefore, I could have used #equal? rather than super. I prefer using super, though this is likely a matter of taste.
In Ruby, any object can be used as a key in a Hash. Strings, symbols, and numbers are commonly used as Hash keys, but instances of your own classes can function as Hash keys too, provided that you implement both #eql? and #hash.
The #eql? Method
The #eql? method behaves similarly to #==. However, #eql?, unlike #==, does not perform type coercion:
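For instance, with an Integer and a Float (a small sketch using built-in types only):

```ruby
coerced_equal  = 2 == 2.0     # #== coerces between numeric types
strict_equal   = 2.eql?(2.0)  # #eql? does not coerce
same_type      = 2.eql?(2)
strings_strict = "ruby".eql?("ruby")

puts coerced_equal  # true
puts strict_equal   # false
puts same_type      # true
puts strings_strict # true
```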
If #== doesn’t perform type coercion, the implementations of #eql? and #== will be identical. Rather than copy-pasting, however, we’ll put the implementation in #eql?, and let #== delegate to #eql?:
I made the deliberate decision to put the implementation in #eql? and let #== delegate to it, rather than the other way around. If we were to let #eql? delegate to #==, there’s an increased risk that someone will update #== and inadvertently break the properties of #eql? (mentioned below) in the process.
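A minimal sketch of that delegation for a value object, assuming the same Point shape (x and y attributes) used throughout this post:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # The real comparison lives in #eql?.
  def eql?(other)
    self.class == other.class &&
      @x == other.x &&
      @y == other.y
  end

  # #== performs no type coercion here, so it simply delegates.
  def ==(other)
    eql?(other)
  end
end

a = Point.new(3, 7)
b = Point.new(3, 7)

puts a.eql?(b)  # true
puts(a == b)    # true
puts a.eql?(:a) # false
```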
For the Path value object, whose #== method does perform type coercion, the implementation of #eql? will differ from the implementation of #==. Here, #== does not delegate to #eql?, nor the other way around.
A correct implementation of #eql? has the following two properties:
Property: Given objects a and b. If a.eql?(b), then a == b.
Property: Given objects a and b. If a.equal?(b), then a.eql?(b).
These two properties are not explicitly called out in the Ruby documentation. However, to the best of my knowledge, all implementations of #eql? and #== respect these properties.
Ruby will not automatically validate that these properties hold in your code. It’s up to you to ensure that these properties aren’t violated.
The #hash Method
For an object to be usable as a key in a Hash, it needs to implement not only #eql?, but also #hash. The #hash method will return an integer, the hash code, that respects the following property:
Property: Given objects a and b. If a.eql?(b), then a.hash == b.hash.
Typically, the implementation of #hash creates an array of all attributes that constitute identity and returns the hash of that array. This is how Point#hash works, and the implementation of #hash for Path will look similar. For the Employee class, which is an entity rather than a value object, the implementation of #hash will use the class and the @id.
If two objects are not equal, the hash code should ideally be different, too. This isn’t mandatory, however. It’s okay for two non-equal objects to have the same hash code. Ruby will use #eql? to tell objects with identical hash codes apart.
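The array-based approach can be sketched like this for a value object and an entity. The class bodies are trimmed to what hashing needs, and including the class in the array is an assumption taken from the Employee description above:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def eql?(other)
    self.class == other.class && @x == other.x && @y == other.y
  end

  # Hash everything that constitutes identity: the class and both coordinates.
  def hash
    [self.class, @x, @y].hash
  end
end

class Employee
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def eql?(other)
    super || (other.class == self.class && !@id.nil? && other.id == @id)
  end

  # For an entity, only the class and the explicit ID matter.
  def hash
    [self.class, @id].hash
  end
end

labels   = { Point.new(3, 7) => "home" }
salaries = { Employee.new(1) => 1_000 }

puts labels[Point.new(3, 7)].inspect # "home", an equal key finds the entry
puts labels[Point.new(7, 3)].inspect # nil
puts salaries[Employee.new(1)]       # 1000
```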
Avoid XOR for Calculating Hash Codes
A popular but problematic approach for implementing #hash uses XOR (the ^ operator). Such an implementation would calculate the hash codes of each individual attribute, and combine these hash codes with XOR. For example:
With such an implementation, the chance of a hash code collision, which means that multiple objects have the same hash code, is higher than with an implementation that delegates to Array#hash. Hash code collisions will degrade performance and could potentially pose a denial-of-service security risk.
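To see why XOR collides so easily, consider this sketch. The xor_hash helper is hypothetical and stands in for an XOR-based Point#hash:

```ruby
# Hypothetical helper that mimics an XOR-based Point#hash.
def xor_hash(x, y)
  x.hash ^ y.hash
end

swapped_collide  = xor_hash(1, 2) == xor_hash(2, 1) # XOR ignores order
diagonal_is_zero = xor_hash(3, 3) == 0              # anything XOR itself is 0
diagonal_collide = xor_hash(3, 3) == xor_hash(8, 8)

# Array#hash takes element order into account, so swapping matters.
array_collide = [1, 2].hash == [2, 1].hash

puts swapped_collide  # true
puts diagonal_is_zero # true
puts diagonal_collide # true
puts array_collide    # false in practice
```

Every point on the diagonal (x equal to y) ends up with the same hash code under XOR, and so does every pair of mirrored points.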
A better way, though still flawed, is to multiply the components of the hash code by unique prime numbers before combining them:
Such an implementation has additional performance overhead due to the new multiplication. It also requires mental effort to ensure the implementation is and remains correct.
An even better way of implementing #hash is the one I’ve laid out before: making use of Array#hash.
An implementation that uses Array#hash is simple, performs quite well, and produces hash codes with the lowest chance of collisions. It’s the best approach to implementing #hash.
Putting it Together
With both #eql? and #hash in place, the Point, Path, and Employee objects can be used as hash keys:
Here, we use a Hash instance to keep track of a collection of Points. We can also use a Set for this, which uses a Hash under the hood, but provides a nicer API:
Objects used in Sets need to have an implementation of both #eql? and #hash, just like objects used as hash keys.
Objects that perform type coercion, such as Path, can also be used as hash keys, and thus also in sets:
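As a sketch of how that might look for Path, the class below keeps its coercing #== and adds a strict #eql? and an array-based #hash. The component-based constructor is an assumption carried over from the earlier illustration:

```ruby
require "set"

class Path
  attr_reader :components

  def initialize(*components)
    @components = components
  end

  def to_str
    "/" + @components.join("/")
  end

  # Coercing comparison: a Path can equal a String.
  def ==(other)
    other.respond_to?(:to_str) && to_str == other.to_str
  end

  # Strict comparison: only another Path with the same components.
  def eql?(other)
    self.class == other.class && @components == other.components
  end

  def hash
    [self.class, @components].hash
  end
end

paths = Set.new
paths << Path.new("usr", "bin", "ruby")
paths << Path.new("usr", "bin", "ruby") # duplicate, ignored by the set
paths << Path.new("etc", "hosts")

puts paths.size                                # 2
puts paths.include?(Path.new("etc", "hosts"))  # true
puts paths.include?("/etc/hosts")              # false, lookups use #eql?, not #==
```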
We now have an implementation of equality that works for all kinds of objects.
Mutability, Nemesis of Equality
So far, the examples for value objects have assumed that these value objects are immutable. This is with good reason because mutable value objects are far harder to deal with.
To illustrate this, consider a Point instance used as a hash key:
The problem arises when changing attributes of this point:
Because the hash code is based on the attributes, and an attribute has changed, the hash code is no longer the same. As a result, collection no longer seems to contain the point. Uh oh!
There are no good ways to solve this problem except for making value objects immutable.
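The problem can be reproduced with a small sketch. MutablePoint is a hypothetical variant of Point with attribute writers, introduced only to demonstrate the failure:

```ruby
# Hypothetical mutable variant of Point, for demonstration only.
class MutablePoint
  attr_accessor :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def eql?(other)
    self.class == other.class && @x == other.x && @y == other.y
  end

  def hash
    [self.class, @x, @y].hash
  end
end

point = MutablePoint.new(3, 7)
collection = { point => 42 }
hash_before = point.hash

puts collection[point] # 42

point.x = 10
hash_after = point.hash

# The entry was filed under the old hash code, so the hash code no longer
# matches and a lookup with the very same object will typically miss.
puts hash_before == hash_after # false in practice
puts collection[point].inspect
puts collection.size           # 1, the entry still exists but is hard to reach
```

One common way to rule this out is to expose only attr_reader and call freeze at the end of initialize, so any later mutation raises an error instead of silently corrupting the collection.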
This isn’t a problem with entities. This is because the #eql? and #hash methods of an entity are solely based on its explicit identity, not its attributes.
So far, we’ve covered #==, #eql?, and #hash. These three methods are sufficient for a correct implementation of equality. However, we can go further to improve that sweet Ruby developer experience and implement #===.
Case Equality (Triple Equals)
The #=== operator, also called the case equality operator, isn’t really an equality operator at all. Rather, it’s better to think of it as a membership testing operator. Consider the following:
Range#=== checks whether a range covers a certain element. It’s also common to use case expressions to achieve the same:
This is also where case equality gets its name. Triple-equals is called case equality, because case expressions use it.
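A short sketch of both forms. The ranges, labels, and method names are made up for illustration:

```ruby
in_range     = (10..15) === 14
out_of_range = (10..15) === 20

puts in_range     # true
puts out_of_range # false

# A case expression calls #=== on each `when` value behind the scenes.
def size_label(number)
  case number
  when 0..9   then "small"
  when 10..99 then "medium"
  else             "large"
  end
end

# The same logic spelled out with if and ===.
def size_label_with_if(number)
  if (0..9) === number
    "small"
  elsif (10..99) === number
    "medium"
  else
    "large"
  end
end

puts size_label(7)           # small
puts size_label(42)          # medium
puts size_label_with_if(420) # large
```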
You never need to use case. It’s possible to rewrite a case expression using if and ===. In general, case expressions tend to look cleaner. Compare:
The examples above all use Range#=== to check whether the range covers a certain number. Another commonly used implementation is Class#===, which checks whether an object is an instance of a class:
I’m rather fond of the #grep method, which uses #=== to select matching elements from an array. It can be shorter and sweeter than using #select:
Regular expressions also implement #===. You can use it to check whether a string matches a regular expression:
It helps to think of a regular expression as the (infinite) collection of all strings that can be produced by it. The set of all strings produced by /[a-z]/ includes the example string "+491573abcde". Similarly, you can think of a Class as the (infinite) collection of all its instances, and a Range as the collection of all elements in that range. This way of thinking clarifies that #=== really is a membership testing operator.
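The three built-in implementations mentioned above can be tried out with a short sketch. The sample array is made up; the phone-number-like string is the one from the text:

```ruby
# Class#===: is the object an instance of the class?
integer_is_integer = Integer === 3
integer_is_string  = String === 3

# #grep selects the elements for which pattern === element is true.
mixed = [3, "three", 4.5, :four, 7]
integers_via_grep   = mixed.grep(Integer)
integers_via_select = mixed.select { |element| Integer === element }

# Regexp#===: does the string match the pattern?
has_letters = /[a-z]/ === "+491573abcde"
no_letters  = /[a-z]/ === "+491573"

puts integer_is_integer          # true
puts integer_is_string           # false
puts integers_via_grep.inspect   # [3, 7]
puts integers_via_select.inspect # [3, 7]
puts has_letters                 # true
puts no_letters                  # false
```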
An example of a class that could implement #=== is a PathPattern class:
An example instance is PathPattern.new("/bin/*"), which matches anything directly under the /bin directory, such as /bin/ruby, but not paths outside /bin.
The implementation of PathPattern#=== uses Ruby’s built-in File.fnmatch to check whether the pattern string matches. Here is an example of it in use:
Worth noting is that File.fnmatch calls #to_str on its arguments. This way, PathPattern#=== automatically works on other string-like objects as well, such as Path instances:
The PathPattern class implements #===, and therefore PathPattern instances work with case/when, too:
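A minimal sketch of such a class follows. Passing File::FNM_PATHNAME is an assumption on my part, chosen so that * does not cross directory separators and the pattern matches only entries directly under the directory. The location helper is likewise hypothetical:

```ruby
class PathPattern
  def initialize(string)
    @string = string
  end

  # Membership test: does the given path match the glob pattern?
  # FNM_PATHNAME (an assumption here) keeps * from matching "/".
  def ===(other)
    File.fnmatch(@string, other, File::FNM_PATHNAME)
  end
end

pattern = PathPattern.new("/bin/*")

matches_direct_child = pattern === "/bin/ruby"
matches_elsewhere    = pattern === "/var/log"
matches_nested       = pattern === "/bin/tools/ruby"

puts matches_direct_child # true
puts matches_elsewhere    # false
puts matches_nested       # false

# Hypothetical helper showing PathPattern in a case expression.
def location(path)
  case path
  when PathPattern.new("/bin/*")     then "system binary"
  when PathPattern.new("/usr/bin/*") then "user binary"
  else                                    "elsewhere"
  end
end

puts location("/usr/bin/ruby") # user binary
```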
For some objects, it’s useful not only to check whether two objects are the same, but how they are ordered. Are they larger? Smaller? Consider this Score class, which models the scoring system of my university in Ghent, Belgium.
(I was a terrible student. I’m not sure if this was really how the scoring even worked — but as an example, it will do just fine.)
In any case, we benefit from having such a Score class. We can encode relevant logic there, such as determining the grade and checking whether or not a score is passing. For example, it might be useful to get the lowest and highest score out of a list:
However, as it stands right now, the expressions scores.min and scores.max will result in an error: comparison of Score with Score failed (ArgumentError). We haven’t told Ruby how to compare two Score objects. We can do so by implementing Score#<=>:
An implementation of #<=> returns four possible values:
- It returns 0 when the two objects are equal.
- It returns -1 when self is less than other.
- It returns 1 when self is greater than other.
- It returns nil when the two objects cannot be compared.
The #<=> and #== operators are connected:
Property: Given objects a and b. If (a <=> b) == 0, then a == b.
Property: Given objects a and b. If (a <=> b) != 0, then a != b.
As before, it’s up to you to ensure that these properties hold when implementing #== and #<=>. Ruby won’t check this for you.
For simplicity, I’ve left out the implementation of Score#== in the Score example above. It’d certainly be good to have that, though.
In the case of Score#<=>, we bail out if other is not a score, and otherwise, we call #<=> on the two values. We can check that this works: the expression Score.new(6) <=> Score.new(12) evaluates to -1, which is correct because a score of 6 is lower than a score of 12. (Did you know that the Belgian high school system used to have a scoring system where 1 was the highest and 10 was the lowest? Imagine the confusion!)
With Score#<=> in place, scores.max now returns the maximum score. Other methods such as #sort work as well.
However, we can’t yet use operators like <. The expression scores[0] < scores[1], for example, will raise an undefined method error: undefined method `<' for #<Score:0x00112233 @value=6>. We can solve that by including the Comparable mixin:
By including Comparable, the Score class automatically gains the <, <=, >, and >= operators, which all call <=> internally. The expression scores[0] < scores[1] now evaluates to a boolean, as expected.
The Comparable mixin also provides other useful methods such as #between? and #clamp.
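Putting the ordering pieces together, a sketch of a comparable Score might look like this. Only the comparison logic is shown; the grading logic mentioned earlier is left out, and the sample values are arbitrary:

```ruby
class Score
  include Comparable

  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Bail out with nil when other is not a Score; otherwise compare the values.
  def <=>(other)
    return nil unless other.class == self.class

    @value <=> other.value
  end
end

scores = [Score.new(6), Score.new(17), Score.new(14)]

puts scores.min.value                 # 6
puts scores.max.value                 # 17
puts scores.sort.map(&:value).inspect # [6, 14, 17]

puts(Score.new(6) <=> Score.new(12))  # -1
puts((Score.new(6) <=> 6).inspect)    # nil, not comparable
puts(scores[0] < scores[1])           # true, provided by Comparable

puts Score.new(12).between?(Score.new(10), Score.new(20))    # true
puts Score.new(25).clamp(Score.new(0), Score.new(20)).value  # 20
```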
We talked about the following topics:
- the #== operator, used for basic equality, with optional type coercion
- #equal?, which checks whether two objects are the same instance
- #eql? and #hash, which are used for testing whether an object is a key in a hash
- #===, which isn’t quite an equality operator, but rather a “is kind of” or “is member of” operator
- #<=> for ordered comparison, along with the Comparable module, which provides operators such as < and >=
You now know all you need to know about implementing equality in Ruby. The Ruby documentation is a good place to find out more about equality.
Denis is a Senior Software Engineer at Shopify. He has made it a habit of thanking ATMs when they give him money, thereby singlehandedly staving off the inevitable robot uprising.