Functional Equality (rewrite)

When 2 + 2 Does Not Equal 4.0

Introduction: What Do You Mean by Equal?

What does it mean for two values to be equal? In designing programming languages or defining types, we have to consider this question. And the wrong answer can create problems. It can lead to counter-intuitive surprises like in JavaScript where "" and [0] both are equal to 0 but not to each other. It can confuse programmers who expect two equal values to be the same, when they are actually quite different.

In this essay, I will compare functional equality with semantic equality and survey equality testing semantics across different programming languages.

Leibniz’s Law

Philosophers have long debated the question of what makes two things identical, and have come up with a criterion…

When 2 + 2 Does Not Equal 4.0

Introduction: What Do You Mean by Equal?

In this essay, I will compare functional equality with semantic equality and survey equality testing semantics across different programming languages.

Leibniz’s Law

Philosophers have long debated the question of what makes two things identical, and have come up with a criterion called Leibniz’s Law, or “the identity of indiscernibles”. This law says that two things are identical if they have all their properties in common. In other words, two things are the same if there is no way to tell them apart.

The formal expression of Leibniz’s Law is:

$$ x = y ⟺ ∀f (f(x) = f(y)) $$

Functional Equality and Substitutability

This translates nicely to a definition of equality in a pure functional programming language: two values are indiscernible if they have all function results in common (i.e., no function produces different results when applied to them). Two such indiscernible values are functionally equal.

We can extend this definition of functional equality to languages that aren’t purely functional by defining “function” to include all operators or procedures that can be applied to a value. Then, we’ll say that f(x) = f(y) also means that any effects of f(x) are equal to those of f(y). By this definition, two values are functionally equal if we can substitute one for the other in our program without changing program behavior.

Semantic vs. Functional Equality

The criteria for functional equality is very strict. It generally means that two values can be functionally equal only if they have the exact same type – otherwise for example typeOf(x) != typeOf(y). In fact, if any function can inspect the binary representation of a value, then two values can be functionally equal only if they are identical down to the bit.

So in many programming languages, the == operator does not test for strict functional equality! For example, the integer 4 and the float 4.0 are often ==, but not functionally equal, because for example integer division works differently than float division, or they have different string forms.

For example, in Python:

>>> 2+2 == 4.0
True
>>> str(2+2) == str(4.0)
False

So functional equality is different from numeric equality.

When comparing two numbers, it is usually numeric equality we are interested in. But not always. In certain scenarios such as unit tests, we truly care about substitutability. If a float32 and float64 behave differently in any way, then you can’t always substitute one for the other without changing the program output.

Numeric equality is a type of semantic equality. In this essay, I’ll define and explain the idea of semantic equality. But first, I’ll address some possible points of confusion in the definition of functional equality. Armed with an understanding of these two concepts, I’ll compare example scenarios where either functional or semantic equality tests are needed. Finally I’ll survey how the different types of equality tests are made in a variety of programming languages.

Functional Equality

Functional Equality vs. Identity

Functionally equal values don’t have to be identical in the sense of Leibniz’s Law, because they may have different internal implementations as long as those representations are not exposed in the type’s public API.

For example, a set type might be internally represented using a binary tree. Two binary trees holding the exact same values could have different structure (e.g. if the values were inserted in different order). But if all public functions/methods (such as toString) exposed set elements in sorted order, and no public functions exposed the tree structure, then two sets internally represented by different trees could be functionally equal.

So two values are functionally equal if they are substitutable. They don’t have to be absolutely identical.

Functional Equality and Functional Purity

Substitutability is an important concept in the theory of functional programming. A functionally pure language must respect the principle of referential transparency, which requires that you can substitute an expression with its value without changing program behavior.

This almost looks identical to the definition of functional equality. But there is a subtle difference. Let’s compare the criteria:

| | Referential Transparency | Functional Equality | | | ———————— | —————–– | | criteria | you can substitute an expression with its value without changing program behavior | you can substitute an expression with an equal value without changing program behavior | | example | if f(x) is 2, and the value of x is 1.0, then you can replace f(x) with f(1.0) and nothing will change | if f(x) is 2, and x is equal to 1.0, then you can replace f(x) with f(1.0) and nothing will change |

The difference is that referential transparency does not require comparing two values for equality. But in practice, a functional equality operator is critical for some functional programming techniques, such as caching/memoization, as I’ll demonstrate in the cache example later in this essay.

Equal Values vs. Equal Variables

One may be tempted to say that two values cannot be functionally equal if they have different memory addresses. But this confuses values with variables. Variables have addresses, but values have no address – they just are. When a programmer writes a == b, they mean to test whether the value referred to by a is equal to the value referred to by b, not whether a and b are the same variable.

This means that when comparing two pointers or references to mutable objects, functional equality requires that they point to the exact same address. Even if the values stored at two different addresses are equal, mutating one will not have the same effect on program behavior as mutating the other, so they are not substitutable. Even pointers to immutable copies of identical values are not functionally equal if the address of the pointer itself can be inspected (e.g. via toString).

Semantic Equality

When comparing any two values, we usually don’t care if they are 100% functionally equal. We just care if they are, well…the same, for all intents and purposes.

But what are those intents and purposes? I argue that our intent is always to test whether two different values are representations of the same underlying semantic value.

For numbers, the semantic value is its numeric value (with minor exceptions for IEEE floats, like NaNs or signed zeros). But there are many other types of semantic values that could have more than one representation.

The same instant in time in different time zones
Unicode strings with the same NFC form (e.g. “ñ” (U+00F1) and “ñ” (U+006E U+0303))
Sets represented by different lists but with the same elements (e.g. {1,2} and {2,1})
The same length represented in different units (e.g. 1in and 2.54cm)
Fractions with the same normal form (e.g. 2/4 and 1/2)

In all these cases, there are two distinct entities: the semantic value, and its representation. These have a 1-to-many relationship.

Diagram showing one semantic value with multiple representations

Programmers can’t work with a value without representing it somehow. We need to choose a format/precision for numbers. We need to store the elements of a set in some order. So the actual values of variables programmers work with are always representations of some semantic value.

We can formally define semantic equality as:

$$ x = y ⟺ SemanticValue(x) = SemanticValue(y) $$

The Relevant “Semantics”

The meaning of SemanticValue depends on your particular problem domain. For example, an application that processes timestamps in event logs probably should use timestamps semantics: two timestamps are equal if they represent the same instant in time, regardless of the location where the event occurred.

On the other hand, a calendar application may use local time semantics, where the actual time zone of an event is often relevant (a calendar event that starts at 9am New York Time behaves differently from a calendar event that starts at 3pm Madrid Time).

So there are therefore multiple possible ways that two times could be semantically equal. In such cases, it’s important for the programmer to understand which semantics are relevant in order to perform the right kind of equality test.

Canonical Representations

When there are multiple ways to represent the same semantic value, we can always designate one as the canonical representation. Examples of canonical representations:

Instants in Time: Unix timestamps (seconds since the epoch, UTC)
Unicode Strings: the NFC form
Sets: an ordered list
Length: length in meters
Rational numbers: normal form (divide by GCD)

Whatever the semantics, programmers can perform semantic equality tests by performing a functional equality tests on the canonical representation.

Examples (Javascript)

// compare two timestamps using instant-in-time semantics
const a = Temporal.ZonedDateTime.from('2023-01-01T00:00:00[UTC]');
const b = Temporal.ZonedDateTime.from('2022-12-31T19:00:00[America/New_York]');
a.equals(b);  // False
a.toInstant().equals(b.toInstant());  // True

// compare two Unicode strings using NFC semantics
const a = 'ñ';  // U+00F1
const b = 'n\u0303';  // U+006E U+0303
a === b;  // False
a.normalize('NFC') === b.normalize('NFC');  // True

Examples (Python)

# compare two lists using set semantics
a = [1, 2]
b = [2, 1]
a == b  # False
set(a) == set(b)  # True

import astropy.units as u
from astropy.units import imperial
u.add_enabled_units(imperial)

# compare two measurements using physical length semantics
a = 1 * imperial.inch
b = 2.54 * u.cm
a == b  # False
a.to(u.m) == b.to(u.m)  # True

Canonical Types

This naturally suggests the question: why do we even have non-canonical representations? If you always represent timestamps in event logs using epoch seconds, you never need to convert before comparing. If your rationals are always in normal form, or your set elements are always sorted, or your distances are always in meters, there is no possible confusion.

Many languages facilitate creation of canonical types: types that only permit canonical representations.

For example, a canonical Fraction type only allows fractions represented in normal form. Python’s fractions.Fraction is an example: the expression Fraction(2, 4) is automatically converted to normal form Fraction(1, 2). Ruby’s Fractional, Haskell’s Data.Ratio, Julia’s Rational all do the same.

With canonical types, there is a 1:1 relationship between the values of that type, and the semantic values they are supposed to represent. As a result, for these types functional equality is equivalent to semantic equality.

When Functional Equality Matters

Example: Caching

When implementing a cache, two cache keys should probably not be considered equal unless they are functionally equal. Otherwise the cache can return a value for a functionally different key. This would violate the basic principle of a cache, which should only affect program performance, never behavior.

For example, a polymorphic function f(x) might accept either a float32 or a float64 argument. It could be that the float32 version of f and the float64 version of f return numerically different values for numerically equal inputs – for example the float32 might overflow and return +Inf where the float64 version doesn’t.

If we put f(x) behind a generic cache, and that cache used numeric equality to check whether there was a cached result for some value of x, then f(b) might start returning +Inf for float64 values that should not overflow. This would change program behavior and could introduce hard-to-find bugs.

Example: Testing

Assertions in test code should generally use functional equality tests. First because it’s generally better for an automated test to be very specific about what the expected results are. But also because mistakenly using semantic equality checks in test code can cause incorrect behavior.

Consider a unit test that looks like this (in no particular language).

result = foo()
assert(result == 1)           // this passes
// assert(f(result) == true)  // this fails for some reason
assert(f(1) == true)          // but this passes

The test originally failed on the assertion f(result) == true. But when the programmer commented that out the failed assertion and replaced it with the assertion f(1) == true, the test passed! But this is confusing, because the previous line of code just confirmed that result == 1.

But of course, the problem is that the programmer used the == operator, which is checking for semantic equality, not functional equality. result might, for example, be a float (1.0), while 1 is an int, and it may be that f(1) != f(1.0).

Example: Programmer Intuition

Here’s another example where a programmer’s intuition might expect that equality implies substitutability:

if result == Success {
// return Success     // This breaks for some reason
return result
}

If result == Success, why should it matter whether I return result or return Success?

But if == is not a functional equality test, then it can matter (e.g. the result might have more information than just success/fail status code). If the programmer doesn’t understand that, they can easily make the wrong choice.

Common Approaches to Equality Testing

There are a few common ways that programming languages enable both functional and semantic equality tests.

By far, == is the most common choice for the equality operator in programming languages. And in most languages, == defaults to a functional equality test when comparing two values of the same type.

However, some languages allow customizing or overloading == so that it performs a semantic equality check instead.

Further, in some languages == can be used to compare two values of different numeric types. Some other languages also support comparing “truthy” values of different types (e.g. 1 == True) or comparing various types of values to strings (e.g. 13 == "13"). This is generally done using implicit conversion or widening (e.g. convert both values to a float or a string before comparing).

Languages such as Javascript/Typescript have a more complex Abstract Equality Comparison algorithm that even allows semantic equality between strings, numbers, and even arrays and objects (e.g. [] == 0).

For languages where == is not strict functional equality, === is often used as the strict functional equality operator.

Example: Go

In Go, == is a strict functional equality operator. The == operator can’t be overloaded, and the compiler won’t even allow values of different types to be compared.

To test for numeric equality across types, programmers must explicitly convert values to a common type:

var a int32 = 2+2
var b float64 = 4.0

// compile error
// a == b

// true
float64(a) == b

Example: Python

Python on the other hand allows equality comparison across values of any type. Numeric values are implicitly converted to a common type:

>>> a = 2+2
>>> type(a)
<class 'int'>
>>> b = 4.0
>>> type(b)
<class 'float'>
>>> a == b
True

“Truthy” values of different types can also be equal:

>>> 1 == True
True
>>> 1 == "Blue"
False

Normalization on Initialization

Many languages facilitate creation of canonical types via normalization of values on initialization. For example, in Scala you can define a custom apply function:

case class Fraction(numerator: Int, denominator: Int)

object Fraction {
def apply(n: Int, d: Int): Fraction = {
// Assume normalization logic here (compute GCD, reduce, handle signs)
val reducedN = /* normalized numerator */
val reducedD = /* normalized denominator */
Fraction(reducedN, reducedD)
}
}

// Usage
val a = Fraction(2, 4)
val b = Fraction(1, 2)
println(a == b)  // True, as both normalize to the same values (e.g., 1/2)

By enforcing that fractions are always in normal form, two functionally equal Fraction values will also be numerically equal, and vice versa.

Canonical Number Types

Few languages have a single canonical number type. This is hard to achieve without sacrificing performance and constraining the range of numeric values that can be represented.

JavaScript, TypeScript, and pre-5.3 Lua have a single number type (double-precision floats). Perl represents numbers using a unified scalar interface with internal optimizations using different representations (ints, floats, and decimal strings). These languages often trade simplicity for performance limitations, and constrain the range of numbers that can be represented exactly (e.g., separate types or pragmas are required for bigints or exact rationals).

Scheme, Common Lisp, Julia, and several others unify numbers under a “numeric tower” or hierarchy that includes ints, floats, and exact rationals. These approximate canonicity through promotion rules but the different types still have differing arithmetic semantics and runtime checks that reveal the type.

Number types differ not only in the range or precision of values that can be represented, but also in the actual semantics of arithmetic operations. A single canonical number type would need to somehow decouple representations from arithmetic, requiring programmers to specify the semantics of their arithmetic operations (integer, float, decimal float, or exact rational), along with precision and overflow rules, instead of letting these be determined by the type. This may be unrealistic.

Comparison Across Languages

This table summarizes the rules for several popular programming languages

Language	Cross-Type `==` Comparison	Functional Equality Test	Equality Operator Overloading	Normalization on Initialization	Canonical Numbers
Go	No	==	No	No	No
JavaScript	Abstract Equality Comparison algorithm	===	No	No	Yes (Number as doubles)
TypeScript	Abstract Equality Comparison algorithm	===	No	No	Yes (Number as doubles)
Python	Numbers and Truthy Values	type(x) == type(y) and x == y	Yes (eq)	Yes (init for normalization)	No
Haskell	Yes (typeclass)	== (User-Defined via Eq)	Yes (typeclass)	Yes (newtypes for wrapping)	Partial (numeric tower)
Swift	No	==	Yes (Equatable protocol)	Yes (init for normalization)	No
Kotlin	Numbers	===	Yes (override equals())	Yes (init for normalization)	No
Julia	Numbers	===	Yes (methods)	Yes (constructors)	Partial (numeric tower)
Java	No	==	No	Yes (constructors)	No
Scala	Numbers	x.getClass == y.getClass && x == y	Yes (override equals)	Yes (apply in companion object)	No
Rust	No	== (via PartialEq/Eq)	Yes (impl trait)	Yes (newtypes/structs)	No
C++	No	operator== (User-Defined)	Yes (operator==)	Yes (constructors)	No
C#	No	==	Yes (operator ==)	Yes (constructors)	No
Ruby	Numbers	self.class == other.class && self == other	Yes (override ==)	Yes (initialize for normalization)	No
Perl	Numbers	$x == $y (numeric context)	No (== is built-in)	No	Yes (unified scalars)
Scheme	Numbers (via =)	eqv?	No	Yes (rationals only)	Partial (numeric tower)
Lua (pre-5.3)	N/A (single number type)	==	Yes (__eq metamethod)	No	Yes (uniform doubles)

Summary

If you think about it, any two things are the same in some ways, but different in other ways. So if you want to know whether two values are “equal,” you must first ask yourself what do you mean by equal?

Two values are functionally equal iff substituting one for the other won’t change program behavior—meaning it’s never true that $f(x) \neq f(y)$ for any $f$.

Semantic equality is looser: the integer 4 and float 4.0 represent the same numeric value, but if toString(4) differs from toString(4.0), then 2 + 2 isn’t functionally equal to 4.0.

A single semantic value can have multiple representations. Functional equality compares representations, semantic equality compares meaning. Choose a domain-tailored canonical representation—like sorted lists for sets or epoch seconds for times—and semantic equality tests reduce to functional equality tests on canonical representations. Make == a strict functional equality test to force programmers to be explicit about their intent, increasing clarity and killing bugs at the expense of convenience. Embrace canonical types to merge functional and semantic equality completely.

Introduction: What Do You Mean by Equal?

Leibniz’s Law

Introduction: What Do You Mean by Equal?

Leibniz’s Law

Functional Equality and Substitutability

Semantic vs. Functional Equality

Functional Equality

Functional Equality vs. Identity

Functional Equality and Functional Purity

Equal Values vs. Equal Variables

Semantic Equality

The Relevant “Semantics”

Canonical Representations

Canonical Types

When Functional Equality Matters

Example: Caching

Example: Testing

Example: Programmer Intuition

Common Approaches to Equality Testing

Example: Go

Example: Python

Normalization on Initialization

Canonical Number Types

Comparison Across Languages

Summary

Similar Posts