The Seven Deadly Sins of Versioning (Part 3): Versions in the Code

Fred Simon

Versions vs. Versioning in the Code

Versioning has existed ever since software began. And for a half century, the process of creating binaries has required the existence of version information. For most of that time, software building was on the right, evolutionary track. Systems would ask for a version declaration at the time of the build, but this would not be included in the code base.

Around the turn of the millennium, with the fast adoption of version control systems (VCS), a bad practice crept into development: build systems became solely reliant on information held in the code. As updates, patches, bug fixes, and software upgrades started coming faster and in ever greater numbers, the habit of placing version text files inside the code continued without much industry reflection on just how dreadful an idea this really is.

Package managers have grappled with this issue. Some have simply reinforced the problem, others have been more solution-oriented, and still others have made things a bit of a muddle. Early package managers, such as RPM and Debian, didn’t actively contribute to this problem because they offered no option to place version information in the code. However, from 2005 forward, newer package managers, such as Maven, RubyGems, and npm, forced developers into this practice by automating its execution. In more recent years, as developers work with expanded build environments and semantic versioning (SemVer) has risen, tools including Gradle, Docker, and Go have taken strides to address this issue: Gradle still allows versions in the code as an option, while the latter two make it impossible.

And the difference between having versions in the code and not is dramatic.

Simply put, having versions in one’s code is bad. It requires the existence of a file within a code base that identifies a particular version number. Anytime one needs to change a version number, it necessitates a change in the code base. On the other hand, versioning as a process is substantively different. Here, we have a tool inside of one’s code that’s capable of appropriately updating the version number with each change made to the software. However, in this environment, the version identification information is not stored in the code itself. What is in the code is information regarding the process by which versioning will be executed. In other words, what’s being coded is the type of versioning that’s desired (i.e., the versioning process), but not the version number.
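The distinction can be sketched in a few lines of Python. This is only an illustration, not the method of any particular build tool: the VersionPolicy class and the BUILD_NUMBER and COMMIT_SHA variables are hypothetical names chosen for the example. The code base holds the policy (which major.minor line is being built and how the full version is assembled), while the numbers that change with every build arrive from the build environment.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class VersionPolicy:
    """What lives in the code base: the versioning process, not a version number."""
    major: int
    minor: int


def compute_version(policy: VersionPolicy, env: Mapping[str, str]) -> str:
    """Derive the full version at build time from values the build system provides."""
    build_number = int(env["BUILD_NUMBER"])  # hypothetical variable set by the CI system
    commit = env["COMMIT_SHA"][:7]           # hypothetical variable set by the CI system
    # SemVer-style result: the patch comes from the build, the hash goes in build metadata.
    return f"{policy.major}.{policy.minor}.{build_number}+{commit}"


if __name__ == "__main__":
    policy = VersionPolicy(major=2, minor=1)
    build_env = {"BUILD_NUMBER": "5", "COMMIT_SHA": "9fceb02d0ae598e9"}
    print(compute_version(policy, build_env))
```

Two builds of the same commit with different build numbers produce two versions, and nothing in the VCS has to change for that to happen.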

When versions are not a part of one’s code base, one can identify any point in that code base and create a branch from it. Otherwise, when the version is inside a given code base, then each version generates a code change, as one code base would have to have been altered merely for the sake of creating a new software binary version. The problem with this type of new binary is that it isn’t immediately evident whether the code change that was made to create the new version is a real change; all that can be known is that this version represents a change.

Whether we are dealing with major upgrades or minor patches, we always want to create new versions. What’s fundamentally at issue is where this version information is to be stored.

How It Has Been Done and Ways of Doing It Better

Many developers, while continuing to place versions in the code, have implemented workarounds that make the version information that’s embedded in the code dynamic. In this manner, the file describing a particular package that contains the version information still exists, but the version part is dynamic. A script kicks in when building out from this version, and it automatically changes the parameters related to the build process.

However, this merely automates what used to be a manual change to the code base in order to create a new version number. While the script is changing the code for the purpose of versioning, it isn’t placing the new version in the VCS. Rather, the script opens a gap between the VCS and the build system. So, at the end of this process, we still don’t have the code base that created the system.

Gradle represents a great advancement in versioning. It allows one to add things as part of a normal build process without having to change the code to generate a new version number. At its core, Gradle separates static information from the dynamic process of creating a binary. The inputs to the calculation of a version become part of the declaration of the module and the environment. Gradle is a non-opinionated build automation system: it permits making any parameter dynamic or static, as desired. And if a developer is still inclined to have their version in the code, Gradle can do this.

For all its innovativeness, though, Gradle is so flexible that it offers developers all the rope they need to hang themselves.

Docker is also quite flexible, although it’s opinionated about the versions of the images that one creates. Docker differs from Gradle in that the version is not part of the Dockerfile at all. This allows use of the same file and the same code base to create multiple versions without the need to change anything in the VCS. A Docker build creates versions only from the dynamic parameters that are given to it through the build process itself. This still leaves build developers to find a versioning methodology to be used as part of their continuous build and integration processes. In Part 1 and Part 2 of our Seven Deadly Sins of Versioning blog series, we discussed best practices for SemVer, patch numbering, and hash versioning, which we believe will help to reinforce good versioning methodologies.
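To make the Docker case concrete, here is a hedged Python sketch of a build step that assembles a docker build command line. The --tag and --build-arg flags are real docker build options; the function name, the image name, and the APP_VERSION build argument are assumptions made for this example (the Dockerfile would need a matching ARG line for the argument to be consumed). The sketch only builds the argument list and does not run Docker.

```python
import re
from typing import List

# Docker tags may contain letters, digits, underscores, periods, and dashes,
# must not start with a period or dash, and are limited to 128 characters.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def docker_build_command(image: str, version: str, context: str = ".") -> List[str]:
    """Assemble a `docker build` invocation whose version comes only from parameters."""
    if _TAG_PATTERN.fullmatch(version) is None:
        raise ValueError(f"not usable as a Docker tag: {version!r}")
    return [
        "docker", "build",
        "--tag", f"{image}:{version}",             # the version is attached at build time
        "--build-arg", f"APP_VERSION={version}",   # APP_VERSION is a hypothetical ARG name
        context,
    ]


if __name__ == "__main__":
    # The same Dockerfile and code base can yield any number of versions.
    print(" ".join(docker_build_command("myapp", "2.1.5")))
    print(" ".join(docker_build_command("myapp", "2.1.6")))
```

Note that a SemVer string with build metadata, such as 2.1.5+9fceb02, is rejected here because the plus sign is not allowed in a Docker tag, so a pipeline that uses hash versioning has to pick a tag-safe separator.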

Go modules (since Go 1.11) add more constraints and enforce more good behaviors related to versioning, branching, the relationship between versions, Git, Git branching, APIs, major version changes, and so on. A notable innovation is that Go requires compilation of source code into a full executable. The design of Go modules is such that go.mod files (which are part of the source code) do not contain the module’s own version. Versions are created from one’s build command and build environment. Go is an opinionated and solid solution for versioning. Nevertheless, it has created some real-world conflicts because it just doesn’t take into account that there are a variety of “religions” to which developers adhere when it comes to naming their versions and creating their Git tags and branches.
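One of these constraints, the link between Git tags and major version changes, can be sketched as follows. Go modules expect release tags of the form vMAJOR.MINOR.PATCH, and from major version 2 onward the module path carries a /vN suffix. The Python function below is a simplified illustration of that rule, not a reimplementation of the go command: it ignores pre-release tags and special cases such as gopkg.in paths, and the module path example.com/mylib is a made-up name.

```python
import re

# Simplified: release tags only, no pre-release or build metadata.
_RELEASE_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def module_path_for_tag(base_path: str, tag: str) -> str:
    """Return the module path expected for a release tag such as v2.3.0."""
    match = _RELEASE_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major = int(match.group(1))
    # Majors 0 and 1 use the bare path; from v2 on, the path ends in /vN.
    return base_path if major < 2 else f"{base_path}/v{major}"


if __name__ == "__main__":
    for tag in ("v1.4.0", "v2.0.1"):
        print(tag, "->", module_path_for_tag("example.com/mylib", tag))
```

A team whose habit is to tag releases as 2.0.1 or release-2.0.1 runs straight into the ValueError branch, which is the kind of clash between conventions described above.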

The Garbageman Cometh

It’s a fact that the vast majority of all the binaries that are created are never released and simply accumulate in software junk piles. In fact, at JFrog, we like to think of ourselves, in a good way, as software garbage collectors. As developers create more and more software, there will necessarily be more and more versions. And with all of this there will be more and more rubbish. It’s just the nature of our trade. Someone has to collect unusable/failed versions, tag them, and know how to throw them away so they don’t clutter software/binary creation systems.

In years past, this work was handled by human beings. Nowadays, JFrog’s binary repository manager, Artifactory, automates garbage collection, sorting, and disposal efficiently, transparently, and, most importantly, without pausing the development process.

Garbage collection also raises the issue of sequential gaps in versioning and why they are to be encouraged in our modern world of development. If it’s a given that most binaries will never go into production or distribution, but they will have version numbers, then it’s logical to have “version holes.” In other words, we might have a perfectly useful piece of software, v2.1.2, whose next useful version that’s ready for prime time will be v2.1.5, and the next 2.1.8. The gaps are an explicit admission that software development takes time and, yes, mistakes along the way are just part of the process. Most companies don’t like sequential gaps because they feel like it’s publicly exposing what they perceive as a weakness. However, most users already intuit, if they don’t outright know (through experience, if nothing else) that not all software versions are perfect. More to the point, most users don’t even pay any attention to software version numbers, so why companies should be so uptight about version holes is a bit of a mystery.
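The version holes in the example above fall out naturally once every build gets a number and only some builds are released. The short Python sketch below illustrates the idea under stated assumptions: the list of (version, released) pairs is invented data matching the v2.1.2, v2.1.5, v2.1.8 example, and the function is not a description of how Artifactory or any other repository manager works.

```python
from typing import Iterable, List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn 'v2.1.5' into (2, 1, 5) so versions sort numerically."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def split_builds(builds: Iterable[Tuple[str, bool]]) -> Tuple[List[str], List[str]]:
    """Split (version, released) pairs into released versions and cleanup candidates."""
    ordered = sorted(builds, key=lambda build: parse(build[0]))
    released = [version for version, ok in ordered if ok]
    discard = [version for version, ok in ordered if not ok]
    return released, discard


if __name__ == "__main__":
    builds = [
        ("v2.1.2", True), ("v2.1.3", False), ("v2.1.4", False),
        ("v2.1.5", True), ("v2.1.6", False), ("v2.1.7", False),
        ("v2.1.8", True),
    ]
    released, discard = split_builds(builds)
    print("released:", released)
    print("cleanup candidates:", discard)
```

The released list has gaps, and the builds that fill those gaps are exactly the ones a garbage collector can clear away.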

The Biggest Problem

There have been great improvements in version management. But it may well be that what we’re waiting for isn’t the next great innovation to come down from on high, so much as the need for open source developers to be, well…less lazy. Many treat version management as an afterthought, at best. Yet it’s in their own best interest to focus more attention on the issue, as advances in this arena will make their professional lives better and more productive. Simply stated, it doesn’t need to be any more complicated than remembering that the goal of a good versioning system should always be to clearly identify versions and to keep the generation of new versions flowing. Using SemVer and hash versioning correctly, as described in Part 1 and Part 2 of this series, will help those involved in liquid software pipeline programming and production to quickly and efficiently clear away bad versions, and continuously create full stack applications that are stable and reliable.

Changes today are mostly being driven by passive exposure to new tools and procedures, as opposed to pro-active lobbying for particular modernized standards. Today’s developers are, out of necessity, acquainted with a variety of package managers. This is because software components are created by a variety of developers with each using his or her favorite tool. As developers interact in the course of working on particular pieces of software, they must learn about tools they may never have used before.

Now they must consider this knowledge beyond tomorrow’s workload. They must see a bigger picture for themselves and the industry. We know that when developers start to demand changes, their voices are heard and actions are taken. Versioning can get infinitely better than it is. And you can make it happen. It’s your move, developers.
