Semantic versioning is great…when used properly
You may have been in a meeting like this before: your team has created a new open source project or internal library that you’d like to distribute and the question then comes up about how to assign meaningful version numbers to its future releases.
“Well, obviously we’ll use semantic versioning,” someone will say and everyone agrees. Not only does it seem like the clear choice, it can seem like the only choice. But is it really?
Semantic versioning is an excellent scheme and one that I’ve taken advantage of numerous times, but I also believe it to be a common tool that is widely misused and misunderstood (just like how that drawer under your oven is not actually meant for storing your cookie sheets and baking pans).
Before we discuss these issues, though, let’s first dive a little deeper into what semantic versioning actually is.
One Scheme To Rule Them All
Created around 2011 by GitHub cofounder Tom Preston-Werner, semantic versioning is a clearly defined and well-documented scheme that is meant to standardize the way external dependencies are versioned in order to communicate at a glance the kinds of changes present in each release. It’s an admirable goal and, if nothing else, it has had a great influence over how dependency versions typically look.
Details of the 2.0.0 version of the scheme — because yes, semantic versioning is itself semantically versioned — can be found in its online specification and include the following high-level summary:
Given a version number MAJOR.MINOR.PATCH, increment the:
1. MAJOR version when you make incompatible API changes
2. MINOR version when you add functionality in a backward compatible manner
3. PATCH version when you make backward compatible bug fixes
Seems easy enough, right? If the MAJOR version does not change, you should be good to pull in any updates you want in order to get new features and bug fixes, without fear of introducing backwards-incompatible changes to your application that may require significant time of your own to address.
The problem is that this is not always how it works out. To see why, we need to dig a little deeper into how the first number of this triplet actually works.
Let’s Talk About Breaking Changes
When you look further into the specification for how the MAJOR version behaves, you will find the following description:
Major version X (X.y.z | X > 0) MUST be incremented if any backward incompatible changes are introduced to the public API. It MAY also include minor and patch level changes. Patch and minor versions MUST be reset to 0 when major version is incremented.
Note the emphasis on the word “MUST” there when describing backward incompatible changes, also typically known as “breaking changes”. It is not optional to increment the MAJOR version when it includes these kinds of changes.
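The increment-and-reset rules quoted above can be sketched in a few lines of Kotlin. The `SemVer` type and its function names are hypothetical, made up for illustration only. The sketch ignores pre-release and build metadata.

```kotlin
// Hypothetical helper type for illustration; not part of any real library.
data class SemVer(val major: Int, val minor: Int, val patch: Int) {
    init {
        require(major >= 0 && minor >= 0 && patch >= 0) { "Version parts must be non-negative" }
    }

    // Backward incompatible change: bump MAJOR, reset MINOR and PATCH to 0.
    fun nextMajor() = SemVer(major + 1, 0, 0)

    // Backward compatible feature: bump MINOR, reset PATCH to 0.
    fun nextMinor() = SemVer(major, minor + 1, 0)

    // Backward compatible bug fix: bump PATCH only.
    fun nextPatch() = SemVer(major, minor, patch + 1)

    override fun toString() = "$major.$minor.$patch"
}

fun main() {
    val current = SemVer(1, 2, 3)
    println(current.nextPatch()) // 1.2.4
    println(current.nextMinor()) // 1.3.0
    println(current.nextMajor()) // 2.0.0
}
```

The hard part is not the arithmetic but deciding which of the three functions a given change calls for.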
What exactly defines these incompatible, breaking changes, though? Unfortunately, the specification does not explicitly categorize these. While some libraries tie these to concepts like binary-/source-code compatibility, let’s use a definition here that most folks would intuitively agree with:
“Breaking changes” are any changes made to the exposed, intended surface of a library that COULD require changes by SOME consumers of it in order for it to continue working correctly after updating.
The maintainers of the semantic versioning scheme will readily admit that a key purpose of it is to discourage these kinds of changes and to motivate developers to find ways to introduce changes in a backwards-compatible way. In many cases, though, the actual effect has been to discourage folks from incrementing the MAJOR version, despite the changes included. Let’s see why.
The Obvious Breaking Changes
First, let’s look at some changes that anyone would unambiguously recognize as breaking.
Consider the following function declaration in the Kotlin language.

    suspend fun submitPayment(
        paymentAmountDollars: Long
    ) {
        // ...
    }
Kotlin is a null-safe language and this function therefore has a single strictly-required parameter, paymentAmountDollars. What would happen if all of a sudden we needed an additional required parameter, paymentDescription? This would result in the following:
    suspend fun submitPayment(
        paymentAmountDollars: Long,
        paymentDescription: String
    ) {
        // ...
    }
This, of course, is a classic breaking change, as any existing consumers would need to update all of their calls to include a value for this new parameter.
The maintainer of such a library might try to avoid these kinds of breaking changes by making the new parameter optional in some way. Here, for example, we could make paymentDescription nullable and provide a default value of null:
    suspend fun submitPayment(
        paymentAmountDollars: Long,
        paymentDescription: String? = null
    ) {
        // ...
    }

Assuming a null value is possible within the internal constraints of the feature, this nicely adds new functionality without *requiring* immediate changes by consumers.
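A minimal sketch of what this looks like from the consumer's side. It uses a simplified, non-suspending stand-in for `submitPayment` so that it runs without a coroutine library. The `String` return value and the message format are assumptions added purely for illustration.

```kotlin
// Simplified stand-in for the article's function (assumption: no coroutines, returns a String).
fun submitPayment(
    paymentAmountDollars: Long,
    paymentDescription: String? = null
): String {
    val description = paymentDescription ?: "(no description)"
    return "Submitted $paymentAmountDollars USD: $description"
}

fun main() {
    // A call written against the old single-parameter signature still compiles.
    println(submitPayment(25L))
    // Newer consumers can opt in to the added parameter.
    println(submitPayment(25L, paymentDescription = "Coffee beans"))
}
```

One caveat, drawn from general Kotlin/JVM behavior rather than from anything specific to this library: this preserves source compatibility. Consumers that are not recompiled may still break, because adding a defaulted parameter changes the compiled method signature.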
These are the kinds of changes that semantic versioning handles well and which library maintainers tend to do a great job of correctly identifying — in this case by incrementing the MINOR version (going from, say, 1.2.3 to 1.3.0). But let’s look at a class of changes that are not so obvious.
The Non-Obvious Breaking Changes
To see where more subtle changes come in, let’s consider adding a return type to our function from above:
    suspend fun submitPayment(
        paymentAmountDollars: Long,
        paymentDescription: String? = null
    ): PaymentReceipt {
        // ...
    }

where our PaymentReceipt model includes all of the original data passed in plus an identifier, paymentId:

    data class PaymentReceipt(
        val paymentId: String,
        val paymentAmountDollars: Long,
        val paymentDescription: String?
    )
Suppose this PaymentReceipt model is only ever used from the library as a return type. What happens if we want to add a new required property to our model, createdAtMillis:
    data class PaymentReceipt(
        val paymentId: String,
        val paymentAmountDollars: Long,
        val paymentDescription: String?,
        val createdAtMillis: Long
    )
This would constitute a breaking change. Even though you are only adding data to a response model and consumers are not required to use that data in any way, the model is part of the API contract and this change is backwards incompatible.
To see why, consider any time a consumer might be manually creating that object. Any previous calls like
    val receipt = PaymentReceipt(
        paymentId = paymentId,
        paymentAmountDollars = paymentAmountDollars,
        paymentDescription = paymentDescription
    )
would no longer compile with the latest updates. This may seem contrived for a “response” model but this could be done in order to create mock data for tests or elsewhere, or could even be the result of deserializing this data from a database.
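As a sketch of that situation, here is the kind of test fixture a consumer might write against the original three-property model. The `fakeReceipt` helper and its values are hypothetical.

```kotlin
// The model as published before createdAtMillis was added.
data class PaymentReceipt(
    val paymentId: String,
    val paymentAmountDollars: Long,
    val paymentDescription: String?
)

// Hypothetical consumer-side test fixture; the name and values are made up.
fun fakeReceipt(paymentId: String = "test-payment-1") = PaymentReceipt(
    paymentId = paymentId,
    paymentAmountDollars = 10L,
    paymentDescription = null
)

fun main() {
    println(fakeReceipt())
    // Once the library adds a required `createdAtMillis: Long` property,
    // the constructor call in fakeReceipt() stops compiling until a value is supplied.
}
```

Nothing in this fixture touches the new data, yet it is exactly the code that fails to build after the update.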
You also would not want to use the “trick” of providing this new property as an optional value just for the sake of backwards compatibility, like so:
    data class PaymentReceipt(
        val paymentId: String,
        val paymentAmountDollars: Long,
        val paymentDescription: String?,
        val createdAtMillis: Long? = null // Not what you want!
    )
This is because this would no longer be communicating the intent of the model change, which is to say that you will always be given the createdAtMillis data in your response.
Similar problems arise when adding new cases to an enum, for example. If you begin with the following:

    enum class PaymentType {
        Amex,
        Visa
    }

and then add one more case, MasterCard, in a future release:

    enum class PaymentType {
        Amex,
        Visa,
        MasterCard
    }
you have now broken any consumers that might be exhaustively checking for all cases of PaymentType.
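A minimal sketch of such a consumer, assuming a hypothetical `displayName` function written against the original two-case enum:

```kotlin
enum class PaymentType { Amex, Visa }

// Hypothetical consumer code: `when` used as an expression must cover every case.
fun displayName(type: PaymentType): String = when (type) {
    PaymentType.Amex -> "American Express"
    PaymentType.Visa -> "Visa"
    // No `else` branch: adding MasterCard to the enum makes this `when`
    // non-exhaustive, and the consumer's build fails until a branch is added.
}

fun main() {
    PaymentType.values().forEach { println("$it -> ${displayName(it)}") }
}
```

Adding an `else` branch would keep the build green, but at the cost of silently mishandling any case the consumer has never seen, which is why many codebases prefer the exhaustive form.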
These are just a few of the many subtle ways in which small changes that don’t look like they would affect consumers end up introducing incompatibilities.
Major Versions, Major Problems
The question is then: how should we increment the version number for these kinds of changes? They are clearly of the breaking variety, so we should update the MAJOR version. But sometimes these and other kinds of seemingly minor changes either escape the notice of the developers or they just don’t feel right: to many people, “major” version updates should include major changes. So what happens is that breaking changes just…slip into minor versions.
A study from several years ago found that roughly a **third of libraries** hosted on Maven that seemingly adhere to semantic versioning incorrectly include breaking changes in a minor version update. This phenomenon is not limited to small libraries maintained by a single developer in their spare time, either: developers have complained about changes to tools like Python, Apollo iOS, and Storybook.
So What Can We Do About It?
Library versions should be used with intention to communicate to consumers what to expect, but as we’ve just seen sometimes that can go awry. Here are a few things to keep in mind in order to be successful.
Document Your Versioning Scheme
As the maintainer of a library, the first thing to note is that you should always document what kind of versioning scheme you are using, whatever it may be. When confronted with a version of the format MAJOR.MINOR.PATCH, many developers will simply assume this means semantic versioning is being followed, even though that’s not always the case. Describe your chosen scheme and its rules directly in the README of your project or include a link there to where that definition can be found.
Be Honest
Next, to quote one of the world’s most famous former soccer players: Be honest. Once you’ve defined your scheme, make sure you are actually sticking to it.
In the case of semantic versioning, this means (among other things) either being extremely careful not to include breaking changes in minor versions or just accepting that MAJOR versions “are not sacred”, to quote Tom Preston-Werner. Large and frequently changing MAJOR version numbers are just fine and don’t need to signify a “major release” in the traditional sense.
This is particularly important when providing your dependency via package managers like npm and Rust’s Cargo, which rely on semantic versioning in order to safely manage transitive dependencies.
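To see why this matters to those tools, consider the caret requirement, which is what npm writes by default (`^1.2.3`) and how Cargo interprets a bare `"1.2.3"`: any release at or above the stated version is accepted, up to but excluding the next bump of the left-most non-zero part. The sketch below models that rule with hypothetical `Version`, `caretUpperBound`, and `satisfiesCaret` names. It ignores pre-release tags and is not either tool's actual implementation.

```kotlin
data class Version(val major: Int, val minor: Int, val patch: Int) : Comparable<Version> {
    override fun compareTo(other: Version): Int =
        compareValuesBy(this, other, Version::major, Version::minor, Version::patch)
}

// Exclusive upper bound of a caret range: bump the left-most non-zero part.
fun caretUpperBound(base: Version): Version = when {
    base.major > 0 -> Version(base.major + 1, 0, 0)
    base.minor > 0 -> Version(0, base.minor + 1, 0)
    else -> Version(0, 0, base.patch + 1)
}

// True when `candidate` may be selected automatically for a requirement of ^base.
fun satisfiesCaret(base: Version, candidate: Version): Boolean =
    candidate >= base && candidate < caretUpperBound(base)

fun main() {
    val base = Version(1, 2, 3)
    println(satisfiesCaret(base, Version(1, 9, 0))) // true: same MAJOR, assumed compatible
    println(satisfiesCaret(base, Version(2, 0, 0))) // false: new MAJOR is never picked up automatically
    println(satisfiesCaret(Version(0, 2, 3), Version(0, 3, 0))) // false: in 0.x, MINOR acts as the breaking boundary
}
```

A breaking change shipped as 1.9.0 therefore reaches consumers who never asked for it, often through a transitive dependency they do not control.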
Use What Works Best For You
Most important, though, is to use what works best for you and your team. And, where possible, that may mean not choosing semantic versioning.
For example, the Rust language defines its own scheme that is very similar to semantic versioning but declares that new public items do not require a MAJOR version bump. Android’s Jetpack libraries declare that the rules of semantic versioning are used to indicate *binary compatibility only* and that minor releases may include breaking changes as far as source compatibility goes. And TypeScript is versioned with a scheme that bumps the MINOR version on a roughly quarterly basis (no matter what changes are included) and increments the MAJOR version simply to avoid the MINOR value going above 9 (¯\\\_(ツ)\_/¯).
In my own experience of creating libraries, I’ve found semantic versioning to be very useful when applied to more focused libraries with small, stable API surfaces, but I have leaned more on custom schemes for larger, more dynamic projects, particularly when they involve the modeling of a lot of “data”.
In one case, we were building a networking library that wrapped hundreds of API calls. Each weekly release would inevitably include additional properties added to some of the response models, so by the terms of semantic versioning we would only ever be incrementing the MAJOR version week-by-week. We decided instead on a scheme of the MAJOR.MINOR.PATCH format, but where:
- The MAJOR version was only incremented to indicate fundamental overall changes in the library (e.g., moving from RxJava 1 to RxJava 2 as a means of handling asynchronous functionality).
- The MINOR version was incremented with each weekly release, and the corresponding CHANGELOG entries clearly indicated which changes fell into each of the following buckets: breaking and non-breaking API changes, behavior changes, and bug fixes.
- The PATCH version was only incremented when releasing a “hotfix”, which was a build based directly off of a previous weekly release. This allowed for fixes to be included for some consumers before their own deployments without forcing them to pull in all the other changes from subsequent weekly releases.
In each case, our schemes were chosen to best convey the relevant and necessary information to our consumers.
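The weekly scheme from the networking library can be modeled roughly as follows. All of the names are hypothetical, the four CHANGELOG buckets are one reading of the description above, and resetting the lower parts to 0 on each bump is an assumption rather than something the scheme is known to have specified.

```kotlin
// Hypothetical model of the weekly-release scheme; not the original project's code.
data class LibraryVersion(val major: Int, val minor: Int, val patch: Int) {
    // Fundamental, library-wide change (such as adopting a new async framework).
    fun fundamentalChange() = LibraryVersion(major + 1, 0, 0)

    // Regular weekly release, whatever kinds of changes it contains.
    fun weeklyRelease() = LibraryVersion(major, minor + 1, 0)

    // Hotfix built directly off this weekly release.
    fun hotfix() = LibraryVersion(major, minor, patch + 1)

    override fun toString() = "$major.$minor.$patch"
}

// Buckets used to label each CHANGELOG entry in a weekly release.
enum class ChangeKind { BreakingApiChange, NonBreakingApiChange, BehaviorChange, BugFix }

data class ChangelogEntry(val kind: ChangeKind, val summary: String)

fun main() {
    val lastWeek = LibraryVersion(3, 12, 0)
    val thisWeek = lastWeek.weeklyRelease()
    val hotfix = lastWeek.hotfix()
    println("$lastWeek -> weekly $thisWeek, hotfix $hotfix")

    val entries = listOf(
        ChangelogEntry(ChangeKind.BreakingApiChange, "Added required property to a response model"),
        ChangelogEntry(ChangeKind.BugFix, "Fixed retry handling")
    )
    // Consumers can see at a glance whether a weekly release needs code changes.
    println(entries.any { it.kind == ChangeKind.BreakingApiChange })
}
```

Here the version number only says when a release shipped and whether it is a hotfix. The information about compatibility lives in the CHANGELOG buckets instead.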
And Speaking Of Consumers
Finally, as a consumer of library dependencies, always “read the manual”: don’t make assumptions about how releases work for any given project. Check to see what versioning scheme is used and always read the CHANGELOG to see what you are actually pulling in. Among other things, you may find that there are more breaking changes than you would have initially thought and you will have to adjust your plans accordingly.
Final Thoughts
With all things in software engineering, communication is paramount. Semantic versioning is a great tool, but it may not always be the **best** one to effectively deliver information from one team to another. Rather than defaulting to a seemingly ubiquitous standard, make sure to find the right fit for you and what you are building.
Brian works at Livefront, where he fixes bugs, adds new functionality, and makes the occasional breaking change or two.