Writing High-Quality, Well-Scoped Commits

In modern software projects, a clean Git history is more than just an aesthetic choice – it is a cornerstone of maintainable, collaborative development. Practitioners across the industry (from Linux kernel maintainers to Google’s Angular team) advocate for atomic commits: each commit being a focused, self-contained change that can stand on its own.

This article outlines widely accepted conventions and best practices for writing such commits in a language-agnostic context, drawing on official guidelines and community consensus. We discuss:

  • Keeping commits small
  • Ensuring each commit builds and passes tests
  • Making commits easy to revert or reorder
  • Including related updates (like documentation and tests) in the same commit

Relevant examples and scenarios are provided to illustrate these principles.

Keep Commits Focused

Make each commit an atomic unit of change. Commits should encapsulate a single logical change or fix. As the official Git documentation advises, you should “split your changes into small logical steps, and commit each of them”1. Each commit ought to be justifiable on its own merits and easily understood in isolation2. In practice, this means you should avoid bundling unrelated modifications into one commit. For example, if you are fixing two separate bugs or adding two independent features, handle them in separate commits (or even separate branches) rather than a single combined commit3. Small, focused commits make it easier for others to review your work and to trace through the project history later (e.g. using git blame or git log)1.

Commit related changes together, and nothing more. A commit should be a wrapper for related changes only3. This is a common theme in many guides, from the Linux kernel’s Submitting Patches guide to internal team wikis. The Linux kernel documentation explicitly states: “Separate each logical change into a separate patch.” For instance, if a change includes a bug fix and a performance improvement, split those into two patches (commits)2. Conversely, if one logical change requires edits across multiple files, those can be grouped in one commit2. The rule of thumb is that each commit addresses one idea or issue. This helps reviewers verify the change and ensures the commit is self-explanatory.

Example: If you are refactoring a module and adding a new feature in that module, do it in two commits. The first commit might be titled “refactor(auth): simplify token validation logic” containing only the cleanup/refactoring, and the second commit “feat(auth): add support for multi-factor login” containing the new feature implementation. This separation makes it clear what each change does and allows one to be reverted or adjusted without affecting the other.

Commit often to keep changes small. Embracing a workflow of frequent commits helps achieve this granularity. Rather than coding an entire feature over several days and committing at the end, break the work into smaller sub-tasks or milestones that can be committed independently. Commit each meaningful step – this might be every few hours or even more frequently. Regular commits prevent huge “omnibus” changes and reduce the chance of merge conflicts4. Git’s interactive rebase (git rebase -i) makes it straightforward to combine or edit commits after the fact, whereas splitting one large, tangled commit is much harder. The Git project’s own workflow notes underscore this: “It is always easier to squash a few commits together than to split one big commit into several”, so don’t fear making too many small commits; you can polish the series with rebase before publishing1.

Avoid mixing concerns in one commit. Any changes that are not directly related to each other should live in separate commits. For instance, do not bundle cosmetic changes (like reformatting code or fixing typos) with functional changes. If you happen to re-indent a code block or rename variables while fixing a bug, consider committing the refactoring first (or separately) and then the bug fix. This way, the bug fix commit’s diff is focused only on the substantive change, unclouded by whitespace or renaming noise. In large projects, maintainers appreciate this separation. The Linux kernel guidelines even give a specific example: if you need to move code from one file to another, do not modify that code in the same commit as the move – perform a pure move in one commit, and then make changes in a subsequent commit 2.
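One way to gain confidence that a formatting-only commit really carries no functional change is to compare the parsed structure of a file before and after the edit. The sketch below assumes the files are Python source and uses the standard library's ast module; the helper name is_formatting_only and the sample snippets are hypothetical, and other languages would need their own parser or formatter check.

```python
import ast


def is_formatting_only(before: str, after: str) -> bool:
    """Return True if two versions of a Python file parse to the same AST.

    Whitespace, line breaks, and comments are not part of the AST, so a pure
    reformat leaves it unchanged, while an edit to the logic does not.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


original = "def total(items):\n    return sum(i.price for i in items)\n"
reformatted = (
    "def total( items ):\n"
    "\n"
    "    # sum of prices\n"
    "    return sum(\n"
    "        i.price for i in items\n"
    "    )\n"
)
bug_fix = "def total(items):\n    return sum(i.price * i.qty for i in items)\n"

print(is_formatting_only(original, reformatted))  # True
print(is_formatting_only(original, bug_fix))      # False
```

Note that a rename changes the AST, so this check only covers whitespace and comment edits; renames would still need a commit of their own and a human review.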

Ensure Commits Pass Tests

A hallmark of a well-scoped commit is that the project remains in a working, buildable state after that commit. In other words, if someone checks out any given commit from the history, the code should compile (or run) and ideally all tests should pass at that point. Keeping the repository in a buildable state at each commit is crucial for tools like git bisect, which relies on testing a range of commits to pinpoint regressions 2. If some intermediate commit breaks the build or test suite, it can frustrate developers trying to bisect, not to mention teammates who might check out that commit. The Linux kernel process documentation explicitly urges developers: “take special care to ensure that the kernel builds and runs properly after each patch in the series”, because someone may split your series at any point in a bisection2. Similarly, the official Git workflows guide says each commit should “pass the test suite, etc.” 1, underlining that each step in your commit history should maintain project integrity.

Don’t commit half-done work. You should only commit code when a logical chunk of functionality is completed and integrated. If your work is still in progress (e.g. a feature is only partially implemented, or tests are still failing), avoid committing that to the main branch. Instead, use a feature branch or even Git’s stash to save your work until it’s ready4. An oft-cited guideline is: “Don’t commit broken code”. Committing something that doesn’t even compile or that causes major parts of the application to fail will “make no sense” to others and may impede colleagues working on the project5. In a collaborative repository, pushing a commit that breaks the build or fails tests can block integration pipelines and annoy team members. Before committing, run the test suite. Ensure that all tests pass (or at least those not intentionally expected to fail). This discipline might require running a quick unit test command or a full CI pipeline locally. As one set of best practices puts it: “Resist the temptation to commit something that you think is completed – test it thoroughly to make sure it really is completed and has no side effects.” 4. By testing your code before you commit, you validate that the code in that commit is in a good state.

Maintain bisect-friendly history. If every commit builds and tests green, git bisect can be a powerful ally for tracking down issues. When you introduce a bug down the line, git bisect performs a binary search over the history, repeatedly checking out a commit between the last known good and the first known bad one, to find where things went wrong. If your commits are small and each is stable, bisect will cleanly pinpoint the first bad commit. However, if some commits are unstable (say, an intermediate commit that “half-implements” a feature and causes a test to fail), git bisect could be led astray. Unintentional test failures act like false signals during bisection6. For example, a commit that only adds a failing test (for a bug not yet fixed) introduces a red test in history; later the test passes when the fix is committed. From the perspective of bisect, the test going red in that commit might be mistaken for a regression, even though it was intentional. Because bisect assumes at most one transition from “good” to “bad”, a deliberately introduced failure can confuse it6. The safest practice is to avoid committing new failing tests or broken code in the main history altogether.
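The effect of a stray red commit can be illustrated with a simplified model of bisection. This is only a sketch: it treats the history as a plain linear list and always probes the midpoint, which is an assumption and not how Git itself selects commits on a real commit graph. The function first_bad and the commit names c0 to c7 are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history flips from good to bad exactly once in between.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


history = [f"c{i}" for i in range(8)]  # c0 (oldest) .. c7 (newest)

# Clean history: the regression lands in c5 and every later commit is red.
clean_red = {"c5", "c6", "c7"}
print(first_bad(history, lambda c: c in clean_red))  # c5

# Same regression, but c3 also went red on purpose (a failing test fixed in c4).
noisy_red = {"c3", "c5", "c6", "c7"}
print(first_bad(history, lambda c: c in noisy_red))  # c3
```

In the second run the search lands on the intentionally red commit and never looks at c5, because the single-transition assumption no longer holds.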

Intentional breaking commits (rarely) and how to handle them. In general, every commit to a shared branch should keep the build green. There are rare cases, often in test-driven development (TDD) or complex feature rollouts, where one might commit a known failing test or temporarily break something with the intention to fix it in the next commit. For example, when fixing a bug, a developer might first add a test that exposes the bug (which fails), and then in the next commit implement the fix so the test passes. This two-step approach can make the reasoning clear: the first commit shows the problem, the second shows the solution. Important: Such patterns should be used with care. Many experts advise not to push the failing test commit to the main branch until it’s fixed5. In a team setting or on the main integration branch, it’s better to combine the test and fix in one atomic commit, or at least ensure the failing test is flagged (skipped) so it doesn’t break the build. If you do use a separate commit for a failing test (on a topic branch, for example), ensure it’s clearly intentional. Some teams mark the commit message with a tag like “[WIP]” or use a convention to indicate that the commit is part of an incomplete sequence. In summary, breaking the build or tests should never be an accident – only do it when absolutely necessary, and even then, communicate that intent clearly (e.g. via commit message or branch strategy).

Independent and Revertible Commits

Another key characteristic of a well-scoped commit is that it can be cleanly reverted or cherry-picked without entangling unrelated code. This comes naturally when commits are atomic (one change at a time) and each commit stands on its own. An independent commit means it does not secretly rely on subsequent commits to function. In other words, it works in the context of the codebase as of that commit. This independence also implies a degree of reorderability: if you have two or three separate atomic commits, the project state should not fundamentally break if their order is swapped or if one is omitted – assuming they don’t have direct dependencies on each other. (Of course, some sequences do build on each other; the goal is to minimize unnecessary coupling.)

Ensure commits can be undone singly. A good test of a commit’s isolation is to ask: if this commit were reverted later, would the codebase still make sense and build correctly? An atomic commit “should be able to be reverted (via git revert) and not cause any side effects or conflicts in other parts of the system”7. When a commit contains only one self-contained change, reverting it will cleanly remove that change. On the other hand, if a commit mixes multiple concerns or partial work, a revert might remove some needed pieces or conflict with later changes. Community wisdom emphasizes this: “An 'atomic' commit is easier to handle in case you want to revert, cherry-pick or merge it. The changes of the commit are clear and understandable.”8. For example, consider a commit that both renames a database column and changes the logic that uses that column. If a problem is later found with the logic change, you cannot revert that commit without also undoing the rename – they’re entangled. Two independent commits (one for the rename, one for the logic change) would have been revertible in isolation. Keeping changes orthogonal in this way provides a safety net: any single commit can be undone with minimal impact on the rest of the code.

Cherry-pick and reorder with confidence. Independent commits also allow you to reuse changes easily in other contexts. For instance, if you develop a feature on a branch that consists of five small commits, and later you realize one of those commits is actually a useful bugfix that should go into the release branch, you can cherry-pick that one commit onto the release branch. This will be painless if that commit doesn’t depend on the other four commits from the feature. Teams often find themselves needing to rearrange the order of commits (during rebase) or drop a commit from a series; atomic commits make this process straightforward. The Git workflows documentation notes that commits should work independently of later commits 1. That implies that if you stop the history at any given commit, everything up to that point works – and likewise, you could potentially reorder some commits without breaking things. While not every sequence of commits is reorderable (sometimes A must come before B), following the practice of one-per-change and keeping them buildable maximizes flexibility. In patch-driven projects like the Linux kernel, it’s acceptable for one patch to depend on another, but it should be noted in the description and each patch still must be justifiable on its own2. The takeaway is to minimize hidden interdependencies between commits. If commit X is meant to prepare for commit Y, ensure X by itself doesn’t put the system in a faulty state; it should lay groundwork harmlessly, and then Y builds on it. This way, maintainers could reorder or drop commits during integration if needed (for example, if one commit in a series isn’t ready, they might defer it and take the others).

Small commits simplify merges. When everyone commits in small, self-contained chunks, integrating changes via merges becomes easier as well. If a merge conflict occurs, it’s often easier to resolve when the commits involved are narrow in scope. Moreover, when each commit is focused, two developers are less likely to touch the same lines or files in overlapping changes. In case of conflict, understanding what each side was doing (from the commit messages and diffs) is clearer with atomic commits. Finally, debugging issues is easier: if a bug is introduced, you can often pinpoint it to a single small commit. If needed, you can revert that one commit to fix the bug, rather than backing out a large grab-bag commit that also contained unrelated changes.

Tested and Documented Commits

A commit doesn’t just represent a code change; it represents a change in the state of the project. Therefore, any ancillary updates required by that change should be included in the same commit whenever practical. This ensures that the repository remains consistent and that anyone looking at that commit sees the full context of the change.

Update tests along with code. If your change is supposed to be covered by tests – for example, you fix a bug or add a new feature – consider adding or updating the tests in the same commit. An oft-quoted rule for atomic commits is that all necessary pieces go together. “If you add a new feature, the same commit should ideally also add automatic tests for the feature to ensure it won’t regress”9. The logic here is straightforward: the feature and its tests are one logical unit of work. They either both go in, or neither does. Shipping code without its tests in the same commit could mean that, for a brief period in history, the code is untested (or incorrectly tested), or that tests in a later commit might be testing behavior introduced in an earlier commit (making that earlier commit not fully verifiable on its own). By including tests in the commit that introduces the functionality, you guarantee that anyone checking out that commit can run the test suite and see tests passing, including the new ones. It also documents the intended behavior of the code at the moment it’s introduced. For example, if you fix a bug in function parseData(), add a test case in that commit demonstrating the bug is fixed – so future readers of the history can clearly see the before/after effect via that test.

Keep documentation in sync. Documentation and configuration files should likewise be kept up-to-date with the code changes. If your commit changes how an API works, update the README, user guide, or code comments in that same commit to reflect the new behavior. If you’re adding a new feature that users should know about, it might warrant an entry in the CHANGELOG or release notes – don’t postpone that to some future commit. By updating docs as part of the change, you reduce the chance of forgetting to do it later. In the spirit of atomic commits: everything related to that change goes together. One common practice in many projects (including Angular and others) is to have a commit type for documentation changes (often docs:). This makes it clear that the commit affects documentation. For instance, a commit message might start with docs: update README with new configuration options. But importantly, if the documentation update is tied to a code change (e.g., adding a new config option), it can be part of the same commit that introduces that option, or a directly subsequent commit labeled as docs. In either case, it should be done before the change is considered “complete.” A good guideline is: if someone checked out your commit, would they have all they need to understand and use the change? If not, consider what’s missing (tests, docs, example config, etc.) and include it.

Ancillary files and metadata. Don’t overlook other repository files that might need updates. For example, if you add a new contributor in a project that tracks authors in a CONTRIBUTORS file, update that in the commit where appropriate. If you make a change that affects the build or deployment, update any configuration or build scripts as needed. In a documentation-driven project, if you change a function signature, update the relevant docs or even generated docs if they are version-controlled. All these practices ensure consistency. The project should not be in a state where the code says one thing but the README says another (at least not within a single commit’s snapshot of the repo). By keeping commits self-contained in this way, you again make it easier to revert or cherry-pick changes: you won’t accidentally revert code and leave stale documentation behind, or cherry-pick a feature without its necessary docs.

Example: Suppose you remove a deprecated command-line option from a CLI tool in a commit. A well-scoped commit for this would delete the code handling that option, update the help text or README to no longer mention it, and perhaps adjust any reference config files or tests that used it – all in one commit titled something like feat(cli)!: remove deprecated "verbose" option, with a BREAKING CHANGE: footer explaining the impact. This commit fully encapsulates the removal of the feature. If later someone needs to revert this removal, the revert commit will bring back not just the code but also the relevant documentation and tests, keeping everything consistent.

Include CHANGELOG entries when relevant. Some repositories maintain a CHANGELOG.md manually. If your project does this, consider adding an entry in the changelog as part of the commit that introduces a notable change (especially for user-facing features or fixes). This way, you tie the changelog update to the change itself atomically. However, many modern workflows use automated generation of changelogs from commit messages (e.g., Conventional Commits with tools that aggregate feat and fix commits). If that’s the case, ensuring your commit message is descriptive and follows the convention is how you “update” the changelog. For example, Angular’s contributing guide and the Conventional Commits specification define commit message types like feat (feature), fix, docs, etc., and a special marker for breaking changes 10. Following such a convention helps downstream tools pick up your commit for release notes. For instance, if your commit introduces a breaking API change, including BREAKING CHANGE: in the commit message footer is an established practice10 – it signals to maintainers and automated tools that this commit is meant to introduce a deliberate breaking change in functionality.
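To make the link between commit messages and generated changelogs concrete, here is a minimal sketch of how a tool might group commit subject lines into sections. It is not the implementation of any particular changelog generator: the function build_changelog, the SECTIONS mapping, and the section titles are assumptions for illustration, and the sketch only reads the subject line, so a breaking change flagged solely in a footer would be missed.

```python
import re
from collections import defaultdict

# Header shape: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<bang>!)?: (?P<desc>.+)$"
)

# Which types get a changelog section is a project choice (assumed here).
SECTIONS = {"feat": "Features", "fix": "Bug Fixes"}


def build_changelog(subjects: list[str]) -> dict[str, list[str]]:
    """Group commit subject lines into changelog sections by type prefix."""
    changelog: dict[str, list[str]] = defaultdict(list)
    for subject in subjects:
        match = HEADER.match(subject)
        if match is None:
            continue  # not a conventional header; leave it out
        if match["bang"]:
            changelog["Breaking Changes"].append(match["desc"])
        section = SECTIONS.get(match["type"])
        if section:
            changelog[section].append(match["desc"])
    return dict(changelog)


subjects = [
    "feat(auth): add support for multi-factor login",
    "refactor(auth): simplify token validation logic",
    "fix: handle empty input in parseData",  # illustrative subject
    "feat(api)!: remove deprecated endpoints",
    "Merge branch 'main'",
]
for section, entries in build_changelog(subjects).items():
    print(section, entries)
```

The refactor commit and the merge commit are skipped, which is why a descriptive, correctly typed subject line is effectively the changelog entry in such a workflow.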

Clear Commit Messages

(While this article focuses on commit content, it’s worth noting that good commit practices go hand-in-hand with good commit messages. A few guidelines are mentioned here for completeness.)

Always accompany your well-scoped commit with a clear, descriptive commit message. Many organizations have style guides for this. For example, the Angular project’s commit message format (which influenced the Conventional Commits standard) requires a structured message like <type>(<scope>): <short summary> 11. Even if you don’t follow a specific template, follow general best practices for messages: use a short summary line (50 characters is a common recommendation) describing what the commit does, and a longer body if needed to explain why and how. Use the imperative mood (“Fix bug” not “Fixed bug”), as if giving an order to the codebase3. This is consistent with messages generated by Git for merges or reverts and is widely adopted. For example, instead of writing “I added a check for null inputs”, write “Add check for null inputs”.

If your commit is one in a series, it can be helpful to mention relationships (e.g., “Part 1 of 3” or “Prerequisite for X feature”) in the body. And if the commit introduces a breaking change or deprecates something, explicitly call that out – some conventions use a BREAKING CHANGE: footer at the end of the message to flag this10. This practice ensures that when scanning history, such commits stand out. It also ties back to the idea of intentional breaks in the test or API: by noting it in the message, you signal to everyone that the breakage is acknowledged, not accidental.

Example of a well-formatted commit message:

feat(api)!: remove deprecated endpoints

Remove the deprecated v1 API endpoints `GET /users` and `POST /submit`.
This commit deletes the code handling these endpoints and updates the
API documentation and tests accordingly.

BREAKING CHANGE: Clients using the removed endpoints will receive HTTP
404 errors. They must migrate to the v2 endpoints introduced in version
2.0.

In this example, the commit title follows a convention (feat type with a ! to indicate a breaking change). It clearly states what the commit does. The body explains details – what was changed and why. It also notes that documentation and tests were updated (showing the commit is comprehensive), and explicitly calls out the breaking change for downstream users. Anyone reviewing the history can immediately grasp the impact of this commit.
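The structure of such a message is regular enough that a script can read it. The following sketch parses the header and reports whether the commit is breaking, based on either the ! marker or a BREAKING CHANGE: footer. The function name parse_commit_message and the returned dictionary shape are hypothetical, and the regular expression is a simplification of the Conventional Commits grammar, not a full implementation of the specification.

```python
import re

HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<summary>.+)$"
)
# The specification treats BREAKING-CHANGE as a synonym in footers.
BREAKING_FOOTERS = ("BREAKING CHANGE: ", "BREAKING-CHANGE: ")


def parse_commit_message(message: str) -> dict:
    """Split the header into its fields and report whether the commit is breaking."""
    lines = message.strip().splitlines()
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("header does not match <type>(<scope>): <summary>")
    has_footer = any(line.startswith(BREAKING_FOOTERS) for line in lines[1:])
    return {
        "type": match["type"],
        "scope": match["scope"],
        "summary": match["summary"],
        "breaking": bool(match["breaking"]) or has_footer,
    }


message = """feat(api)!: remove deprecated endpoints

Remove the deprecated v1 API endpoints `GET /users` and `POST /submit`.

BREAKING CHANGE: Clients using the removed endpoints will receive HTTP
404 errors.
"""
print(parse_commit_message(message))
```

A release tool built on this idea could, for example, decide that any commit with breaking set to true requires a major version bump.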

Conclusion

Writing small, self-contained commits is a discipline that pays dividends in team productivity, code quality, and project longevity. By adhering to the conventions outlined above – one logical change per commit, always leaving the code in a working state, and including all relevant updates – you create a Git history that tells a coherent story. Such a history is easy to navigate, making debugging and code archaeology far less painful. It also facilitates smoother code reviews and collaboration, since each commit is focused and can be discussed in isolation.

These best practices are reflected in the guidelines of some of the most respected software communities. The Linux kernel, for instance, requires patches to be logically separated and bisect-friendly2, and projects like Angular enforce structured commit messages for clarity and automated tooling11. Tools and specs (Git itself, Conventional Commits, etc.) have evolved to encourage this style of work because it leads to more maintainable software. As a developer, making a habit of crafting atomic commits with clear messages is a mark of professionalism and care for your craft.

In summary, commit often, commit intentionally, and commit completely. Each commit should be an island of functionality: small but whole. By following these conventions, you ensure that any commit in your project’s history can be understood, built, tested, and if necessary, reverted or reused with confidence. This fosters a robust development workflow where changes are tracked and communicated effectively through version control. As the proverb goes, “take care of the pennies and the pounds will take care of themselves” – in Git terms, take care of the commits, and the codebase will take care of itself.

TL;DR

  • Make incremental changes in each commit, keeping them small: Write the smallest amount of code necessary to implement a single change or fix. This approach makes each update easy to test and roll back without affecting unrelated functionality12. Small commits also reduce the likelihood of merge conflicts by minimizing overlap with others’ work 12.
  • Limit each commit to one logical change or issue: Do not bundle unrelated fixes or features in the same commit. For example, if you fix a bug and improve performance, use separate commits for each; a commit should serve as a wrapper for related changes only 2,4. Keeping commits focused ensures they are easier to understand, review, and revert if something goes wrong4.
  • Keep commits atomic and self-contained: Ensure every commit is a single, self-sufficient unit of work (one task or fix) that can be applied or reverted in isolation without side effects12. An atomic commit encapsulates a complete thought – for instance, a code refactoring and a new feature should be in separate commits – which makes code reviews faster and revert operations safer12. Each commit should be justifiable on its own merits and not require later commits to be understandable2.
  • Avoid committing half-done work: Only commit when a piece of functionality or bug fix is fully implemented and tested. Incomplete code (such as a partial feature that doesn’t yet work) does not belong in the main commit history 4. If you need to checkpoint work or switch context, use a draft/WIP branch or Git’s stash feature rather than committing unfinished changes4. This ensures that every commit in the shared history represents a coherent, finished change.
  • Commit early and commit often: Frequent commits prevent individual commits from growing too large. By breaking development into quick, logical chunks, you make it easier for teammates to integrate changes regularly and avoid large merges 4. Short-lived branches with regular small commits minimize divergence from the main branch, reducing integration problems and easing collaboration12.
  • Ensure each commit builds and passes tests: Treat the repository as if it should be in a working state at every commit. Build the code and run the project’s test suite before you create each commit to verify nothing is broken2. This guarantees that any commit can be checked out on its own and will function correctly, which is crucial for tools like git bisect and for safely reverting commits in isolation2.
  • Test and review before committing: Resist the urge to commit code you think is complete—run the code and tests to be sure. Verifying the change in isolation (including checking for side effects) is important for quality 4. Additionally, self-review the diff before you commit to confirm that only the intended changes are included and that you haven’t introduced debugging statements or unrelated edits. This extra scrutiny helps maintain focus and catch mistakes early13.
  • Use interactive staging to craft focused commits: Leverage Git’s staging tools (for example, git add -p or git add --interactive) to stage changes selectively. This allows you to split a mixed set of edits into separate commits for each concern13. By carefully choosing hunks of code to include, you ensure each commit contains only relevant changes, which reinforces a clean separation of ideas and makes the history easier to understand.
  • Separate formatting or whitespace changes from functional changes: If you need to reformat code, fix indentation, or make other non-functional style tweaks, do so in a dedicated commit without mixing in code logic changes14. Isolating purely cosmetic changes prevents noise in the diff and lets reviewers concentrate on the substantive modifications. In practice, such a commit should contain no semantic code changes, making it clear that its purpose is only to improve readability or adhere to style guidelines14.
  • Isolate large-scale code movements or renames in their own commits: When moving code between files or renaming classes/functions, perform the move in one commit and make no other changes in that same commit2. This clear separation means that the move/rename commit purely relocates code. The subsequent commit can then modify that code if needed. Keeping moves separate greatly aids reviewers (who can verify no logic changed during the move) and preserves file history for tools like git blame2.
  • Use topic branches to organize development: Develop new features or fixes on separate branches rather than on the main branch. Branching isolates your work until it’s ready, which avoids polluting the main history with WIP commits and allows for code review before integration12. Once the work is complete and each commit is polished, merge the branch back (or rebase onto main) so that the main line of development remains stable and linear. This practice also makes it easier to revert or cherry-pick a set of changes if they were developed in an isolated branch.
  • Establish a consistent team workflow and commit policy: As a team, agree on how you use Git – for example, whether you squash commits on merge, if you prefer rebase over merge commits, how you name branches, and the expected commit message format. A defined workflow (such as GitHub Flow or Git Flow) and shared conventions ensure that everyone works in a compatible way4. Consistency in commit practices across the team promotes clarity and makes the project history more predictable and maintainable.
  • Adopt a standard commit message format: Use an established convention for commit messages to provide structure and meaning. For instance, many projects follow the Conventional Commits or Angular format, where each message begins with a type like feat:, fix:, docs:, etc., optionally followed by a scope, and a brief description10. Having a consistent format makes the history easy to parse and can enable automated changelog generation or semantic versioning tools.
  • Begin each commit message with a concise, descriptive summary: The first line of the commit message (the “subject”) should quickly convey what the commit does. Write it in imperative mood (as if giving an order, e.g. “Add user login audit log” not “Added” or “Adding”) and in the present tense12. Keep this summary line short (around 50 characters or less) and capitalize the first word. A good summary allows others scanning the commit log to immediately grasp the intent of the change.
  • Provide explanatory detail in the commit message body when necessary: If the reason or context for the change isn’t obvious from the code diff and summary alone, include a body in the commit message after a blank line below the summary. In this body, explain why the change was made and any background info or implications, not just what was done4. Wrap the text at roughly 72 characters per line for readability4. A well-written body can describe the problem being solved or the decision behind the implementation, helping future maintainers understand the commit without needing external references12.
  • Reference relevant issues, tickets, or external links in the footer: When a commit relates to a discussion, bug report, or feature request, mention that in the commit message. For example, include “Fixes #123” or “Refs: issue-456” in the footer to automatically link the commit to issue trackers10. This practice provides traceability – anyone reading the commit can discover the broader context or see that an issue was resolved by that commit. It also helps project management by closing issues when commits are merged, if the repository host supports it.
  • Maintain consistency and clarity in commit messages: Ensure each commit message adheres to the agreed style and contains enough information to stand on its own. Describe your changes in a way that other developers (or your future self) can understand the history without guesswork. For example, rather than just saying “Update code” or having an empty message, always supply meaningful detail. Consistent, well-structured messages across the project make the version history a valuable narrative of the project’s evolution 12.
  • Leverage tooling to enforce commit quality: Use automated checks and hooks to uphold your commit standards. A commit-msg hook can verify that commit messages meet your formatting rules (for example, rejecting messages that don’t follow the Conventional Commits pattern), and a pre-commit hook can run linters or tests to prevent code that fails checks from being committed. Many teams integrate these into their workflow so that commits that don’t meet the project’s guidelines are flagged or rejected before they reach the main repository. Automation in this way helps sustain high quality and consistency in a collaborative environment.
  • Strive for a clean, bisectable history: Each commit in the history should be meaningful and not merely a “fixup” for the previous one. Try to correct small mistakes (typos, minor bugs introduced in the last commit, etc.) by amending the prior commit or via an interactive rebase before pushing, rather than adding a new “fix typo” commit. The goal is a tidy commit log where any commit can be understood on its own and, if needed, reverted without having to also revert subsequent “patch” commits. This discipline makes tools like git bisect more effective and the project history more professional and maintainable.
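
The message rules above (short capitalized subject, imperative mood, blank line before the body, body wrapped at about 72 characters) can be sketched as a small checker. This is a minimal illustration, not a standard tool: the function name `check_commit_message`, the constants, and the imperative-mood test (a crude suffix heuristic that will misjudge some words) are all assumptions made for the example.

```python
# Hypothetical limits mirroring the guidelines in the list above.
SUBJECT_MAX = 50
BODY_WRAP = 72


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found; an empty list means the message passes."""
    lines = message.rstrip("\n").split("\n")
    subject = lines[0]
    if not subject.strip():
        return ["subject line is empty"]

    problems = []
    if len(subject) > SUBJECT_MAX:
        problems.append(f"subject is {len(subject)} characters (limit {SUBJECT_MAX})")
    if not subject[0].isupper():
        problems.append("subject should start with a capital letter")

    # Crude heuristic (an assumption of this sketch): past-tense and gerund
    # forms usually end in "ed" or "ing". Words like "Embed" are false positives.
    first_word = subject.split()[0].lower()
    if first_word.endswith(("ed", "ing")):
        problems.append(
            "subject should use the imperative mood (e.g. 'Add', not 'Added' or 'Adding')"
        )

    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank to separate subject from body")

    for number, line in enumerate(lines[2:], start=3):
        if line.startswith("#"):
            continue  # lines Git treats as comments in the message template
        if len(line) > BODY_WRAP:
            problems.append(f"body line {number} exceeds {BODY_WRAP} characters")
    return problems
```

To use something like this as a commit-msg hook, a wrapper script would read the message file whose path Git passes as the hook's first argument, print any problems, and exit with a non-zero status to reject the commit.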

References


  1.
  2.
  3.
  4.
  5.
  6. Hacker News – community discussion about using failing tests for git bisect (When fixing a bug, add a failing test first, as a separate commit | Hacker News)
  7.
  8. Stack Overflow – discussion on why small, atomic Git commits are easier to understand, revert, and cherry-pick (Why I need small Git commit every time? - Stack Overflow)
  9. Optimized by Otto Blog – “Five requirements for a good git commit” (atomic, with tests and docs included) (How to make a good git commit)
  10. Conventional Commits v1.0.0 – specification for structured commit messages (types like feat/fix/docs, and marking breaking changes) (Conventional Commits)
  11. Angular Project – Contributing Guidelines (commit message formatting rules that inspired Conventional Commits) (angular/CONTRIBUTING.md at main · angular/angular · GitHub)
  12. GitLab – “What are Git version control best practices?” (advice on making incremental, small changes) (GitLab Blog)
  13. DEV Community – “Make Small Commits” by oculus42 (tips on using IDE tools to manage focused commits) (Make Small Commits - DEV Community)
  14. Stack Overflow – Committing when changing source formatting? (discussion on separating cosmetic from functional changes) (version control - Committing when changing source formatting? - Stack Overflow)

W3C submission of Clever Semantic Versioning

Clever Semantic Versioning: A New W3C Submission Bringing Clarity and Consistency to Versioning

On April 03, 2025, the World Wide Web Consortium (W3C) officially published CleverThis's Clever Semantic Versioning as a W3C Member Submission. We at CleverThis are proud to announce this milestone. The Clever Semantic Versioning specification extends the familiar principles of semantic versioning to meet the needs of modern development, improving clarity, consistency, and automation in how developers version their APIs, software libraries, and distributed systems. In this post, we’ll explore what Clever Semantic Versioning is, how it differs from traditional semantic versioning, and why it’s a game-changer for developers. We’ll also highlight CleverThis’s role in driving this innovation and what it means for the broader tech community.

What is Clever Semantic Versioning?

Clever Semantic Versioning is a formal specification for versioning software and related artifacts in a predictable, meaningful way. If you’re familiar with traditional Semantic Versioning (SemVer) from SemVer.org, you already know the basics: version numbers are typically expressed as MAJOR.MINOR.PATCH, and they convey the scope of changes (breaking changes, new features, bug fixes, etc.). Clever Semantic Versioning builds on this foundation but goes further, addressing gaps and ambiguities in SemVer to make it suited for a wider range of use cases and modern workflows.

Under the hood, Clever Semantic Versioning uses a version format of MAJOR.MINOR.PATCH-EXTRA+META. Let’s break that down:

  • MAJOR, MINOR, PATCH – These work as in traditional SemVer, indicating incompatible API changes, backwards-compatible additions, and backwards-compatible fixes respectively. Clever Semantic Versioning retains these core rules, so it feels familiar to developers.
  • EXTRA – This is an extension field (following a hyphen) for pre-releases or other additional version qualifiers. In SemVer, a hyphenated suffix denotes pre-release versions (like -beta or -rc1). In Clever Semantic Versioning, the EXTRA field generalizes that idea: it can denote pre-release identifiers or embed secondary version information for advanced cases. For example, the EXTRA field is used to encode dependent artifact versions (more on that shortly).
  • META – This is a metadata field (preceded by a +) for build or contextual metadata. Like SemVer’s build metadata, the META part in Clever Semantic Versioning doesn’t affect version ordering; it’s there for informational purposes (e.g. build IDs, timestamps).
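
As a rough illustration of the MAJOR.MINOR.PATCH-EXTRA+META layout described above, the following sketch splits a version string into its five fields. The regular expression is a simplified pattern written for this example and is not the grammar from the specification; the names `Version` and `parse_version` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    extra: Optional[str]  # text after "-", if any
    meta: Optional[str]   # text after "+", if any


# Simplified pattern (an assumption of this sketch, not the spec's grammar):
# three numeric fields, then an optional "-EXTRA" and an optional "+META".
_VERSION_RE = re.compile(
    r"^(?P<major>0|[1-9][0-9]*)\.(?P<minor>0|[1-9][0-9]*)\.(?P<patch>0|[1-9][0-9]*)"
    r"(?:-(?P<extra>[0-9A-Za-z.-]+))?"
    r"(?:\+(?P<meta>[0-9A-Za-z.-]+))?$"
)


def parse_version(text: str) -> Version:
    """Split a version string into its fields, or raise ValueError."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognised version string: {text!r}")
    return Version(
        major=int(match.group("major")),
        minor=int(match.group("minor")),
        patch=int(match.group("patch")),
        extra=match.group("extra"),
        meta=match.group("meta"),
    )
```

For example, `parse_version("1.2.3-beta+build.42")` yields the numeric core `1`, `2`, `3`, the EXTRA field `"beta"`, and the META field `"build.42"`, while a plain `"1.2.3"` leaves both optional fields as `None`.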

Importantly, Clever Semantic Versioning is backward-compatible with SemVer 2.0. Any version string that is valid under SemVer 2.0 is also valid under Clever Semantic Versioning, so developers can adopt the new spec without breaking existing versioning schemes. This compatibility means you can gradually transition to the Clever model and still interoperate with tools (package managers, etc.) that expect SemVer format.

Beyond SemVer: Key Enhancements for Modern Development

Traditional semantic versioning was designed primarily for software libraries and code – it’s “explicitly and deliberately targeting software only”. In practice, however, developers version all kinds of artifacts: web APIs, databases and datasets, configuration schemas, user interfaces, machine learning models, and more. Clever Semantic Versioning bridges that gap by extending the semantic versioning concept to these other artifact types. Here are some of the key enhancements that set Clever Semantic Versioning apart from the classic SemVer model:

  • Broader Scope of Artifacts – The new spec isn’t limited to code APIs. It defines versioning rules for multiple version classes including APIs, user interfaces (UI), data sets, and schemas. This means whether you’re versioning a backend API, a front-end application UI, a dataset, or a database schema, you have a clear, semantic way to label and increment versions. Each class comes with explicit guidance on what constitutes a PATCH, MINOR, or MAJOR change in that context, removing the guesswork for developers. For example, in UI versioning a MINOR change could be adding new UI elements in a backwards-compatible way, whereas a MAJOR change might be a redesign that breaks backward compatibility for users. By expressing the rules in a generic way and instantiating them for several categories of artifacts (like UIs and datasets), the submission “extends the scope of semantic versioning to other kinds of artifacts”, bringing semantic rigor to domains that previously lacked a standard versioning scheme.

  • Explicit Versioning Guidelines – One of the challenges with vanilla SemVer is interpreting the rules consistently across different scenarios. Clever Semantic Versioning directly addresses this by providing explicit guidelines for each versioning class. As the spec notes, the general idea of major/minor/patch is the same everywhere, but “the specific interpretation is not always clear depending on the class of resources… an API, schema, or application would all interpret when to increment a version a bit differently”. Clever Semantic Versioning eliminates ambiguity by spelling out when to bump each part of the version for each artifact type. This clarity ensures that two different teams (or projects) following the spec will make consistent versioning decisions. For instance, the rules stipulate that in an API, deprecating a publicly available function (without removing it) must trigger a MINOR version increment, whereas in a dataset, adding new records (with no schema change) counts as a PATCH bump. By defining these rules, the spec creates a common language for version increments across various domains.

  • Hybrid and Multi-Component Versioning – Modern software often isn’t a single monolithic artifact; it’s a distributed system or a product suite comprising multiple components (e.g. a service with an API and a UI, or a dataset with an associated schema). Traditionally, teams might either give each component its own version or try to synchronize them in less formal ways. Clever Semantic Versioning introduces a Hybrid Versioning approach for such cases. If an artifact exposes multiple public components of different types, you can version each component separately and maintain an overall version for the artifact as a whole. The hybrid versioning rules ensure the overall version only moves in response to the most significant changes among the components. For example, consider a software release that includes both an API and a UI: if the API has a major breaking change (requiring a MAJOR bump) and the UI only a minor update, the overall product version’s MAJOR would increment (reflecting the breaking change) while the combined version could encode that the UI part had a minor change in the metadata or extra field. This way, consumers of the combined product see a single version (for simplicity) but that version was calculated with a principled consideration of both subcomponents. Hybrid versioning provides a structured way to keep multi-part systems in sync and communicate changes clearly.

  • Dependent Artifact Versioning – Another innovative feature is how the spec handles extensions or plugins that depend on a base artifact. Often, extensions need to indicate not only their own version but also which version of the base system they are compatible with. Clever Semantic Versioning offers a solution: it allows embedding one version number inside another. In practice, the “EXTRA” part of the version string can carry the dependent component’s version, treating the base project’s version as more significant. The specification describes a format like MAJOR_BASE.MINOR_BASE.PATCH_BASE-MAJOR_DEP.MINOR_DEP.PATCH_DEP+META_BASE for these cases. For example, 1.2.3-4.5.6 could denote version 4.5.6 of an add-on for base version 1.2.3. In version comparisons, the base part is considered first, so this ensures that no matter how high the extension’s own version goes, 1.2.3-something will always be recognized as tied to base 1.2.3 and will sort lower than 1.3.0 of the base, for instance. This dependent versioning scheme is a boon for plugin architectures, allowing developers to convey compatibility in the version string itself – a level of automation-friendly detail that traditional SemVer didn’t cover.

  • Formal Grammar and Rigor – The Clever Semantic Versioning spec doesn’t just provide guidelines; it also defines a precise syntax (using Backus–Naur Form) for what constitutes a valid version string. This formal definition means tooling can reliably parse and validate version numbers against the spec. By providing an exact grammar, Clever Semantic Versioning eliminates edge-case ambiguities (for example, what characters are allowed in pre-release tags, how numeric sequences are handled, etc., all of which are clearly specified). This level of detail is essential for automation and tooling support, allowing developers to integrate version checks into continuous integration (CI) pipelines or package management systems with confidence.
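
The base-first ordering described for dependent artifact versions can be sketched as a sort key that compares the base triple before the dependent triple. This is a minimal sketch under stated assumptions: the helper names are hypothetical, only purely numeric BASE-DEP strings are handled, and the choice to sort a bare base version before any dependent version with the same base is an arbitrary assumption of the example rather than something stated above.

```python
def _triple(text: str) -> tuple[int, int, int]:
    """Parse 'X.Y.Z' into a tuple of three integers."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    return (int(parts[0]), int(parts[1]), int(parts[2]))


def dependent_sort_key(version: str):
    """Sort key that compares the base version first, then the dependent one."""
    core = version.split("+", 1)[0]  # META does not take part in ordering
    base_text, _, dep_text = core.partition("-")
    base = _triple(base_text)
    # Assumption of this sketch: a missing dependent part sorts first.
    dep = _triple(dep_text) if dep_text else ()
    return (base, dep)


versions = ["1.3.0-0.1.0", "1.2.3-99.0.0", "1.2.3-4.5.6"]
ordered = sorted(versions, key=dependent_sort_key)
```

With this key, `ordered` places both add-on versions for base 1.2.3 ahead of anything tied to base 1.3.0, however large the add-on's own version number becomes.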

In summary, Clever Semantic Versioning takes the proven semantic versioning pattern and supercharges it with greater scope and precision. It remains familiar – if you know how MAJOR, MINOR, PATCH work, you’re already most of the way there – but it adds the semantic nuance and structure needed for today’s diverse software ecosystems. The W3C’s summary of our submission captures it well: this new scheme “aims to bridge [the] gap by extending the scope of semantic versioning to other kinds of artifacts”, expressing rules in a generic way and tailoring them to multiple categories.

Why Developers Should Care: Clarity, Consistency, and Automation

One of the main goals of Clever Semantic Versioning is to make developers’ lives easier when it comes to managing versions. Version numbers are more than just labels – they are contracts and signals to anyone who uses your code or data. Here’s how the Clever Semantic Versioning model delivers concrete benefits in clarity, consistency, and automation for developers:

  • Clarity: With clearly defined version classes and rules for each, developers and teams gain a shared understanding of what a given version bump signifies. No more debating “is this change a minor or a major?” – the spec provides objective criteria. This clarity is especially helpful when working in large teams or open-source projects where consistent versioning practices are critical. By following Clever Semantic Versioning, your users can immediately tell how significant an update is. For instance, if you release version 2.0.0 of an API, they’ll know it involves breaking changes to the public API surface by definition. If you update a dataset to 1.1.0, they’ll understand that new information was added in a backwards-compatible way (perhaps new data fields or records). Clear version numbers act as an instant communication tool between maintainers and consumers of software. Moreover, the spec’s guidance to “explicitly state the versioning class being used” means everyone is on the same page about what aspect of the project the version refers to (e.g. “API version” vs “dataset version”), eliminating confusion in multi-faceted projects.

  • Consistency: Consistency in versioning leads to predictability. Because Clever Semantic Versioning is an open specification, it encourages a unified approach across the industry. As more libraries, services, and datasets adopt it, developers will start to notice a consistent pattern in version numbers everywhere. This uniformity can reduce integration friction in distributed systems – microservices and client applications can coordinate version upgrades more smoothly if all components adhere to the same semantic versioning principles. For example, a microservice architecture might involve dozens of services; with each service using Clever Semantic Versioning, it's straightforward to automate checks that no service introduces a breaking change (MAJOR bump) without others being aware. Additionally, consistency is maintained with the past: since the specification is backward compatible with SemVer 2.0, projects already using semantic versioning won’t see any inconsistency or disruption. You can layer on the CleverThis extensions (like EXTRA fields or additional artifact classes) as needed, while still outputting version strings that look and sort the way developers expect. The result is a more consistent ecosystem where version numbers mean the same thing across different projects and artifact types, thanks to a common specification.

  • Automation: Perhaps one of the most exciting advantages is how this model enables better automation and tooling around version management. By codifying the rules and even providing a formal grammar, Clever Semantic Versioning makes it easier to build tools that can automatically bump versions, enforce version policies, or orchestrate releases. For example, consider continuous integration pipelines: a CI tool could be configured to analyze the changes in a code repository and, based on the nature of changes (detected via commit messages, code analysis, or tests), decide whether to increment patch, minor, or major automatically according to the spec’s rules. In a multi-repository or multi-component system, the hybrid versioning approach can be automated to calculate an aggregate product version from individual module versions – saving release engineers from manual coordination. Dependency management also benefits: package managers (like npm, Cargo, or Go modules) already rely on semantic versioning for resolving compatible versions. With Clever Semantic Versioning’s richer semantics, such tools could become even smarter – for instance, a future package manager could understand that a 1.0.0-beta (Extra field used for pre-release) is an unstable preview and treat it differently, or that 2.0.0-1.0.0 (Extra field used for dependency version, with the base version written first) should only be pulled in if the base 2.0.0 is present. In distributed systems, orchestration tools might use version metadata to ensure that microservices only talk to compatible versions of each other. The bottom line is that consistent, semantic version strings are machine-friendly. By improving the semantic richness of version numbers, Clever Semantic Versioning unlocks more opportunities for automation, reducing manual effort and human error in the release process.
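
The automation idea in the last bullet can be sketched in a few lines: derive a bump level from commit messages, take the most significant bump across components for the overall product, and apply it. This is only an illustration under assumptions. It relies on the Conventional Commits convention (fix maps to PATCH, feat to MINOR, a `!` marker or `BREAKING CHANGE:` footer to MAJOR), and the names `Bump`, `bump_for_commit`, `aggregate_bump`, and `apply_bump` are hypothetical; the specification's own per-class rules would replace this mapping in a real tool.

```python
import re
from enum import IntEnum


class Bump(IntEnum):
    NONE = 0
    PATCH = 1
    MINOR = 2
    MAJOR = 3


# Conventional Commits style header: type, optional (scope), optional "!", then ": ".
_HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?(?P<breaking>!)?: ")


def bump_for_commit(message: str) -> Bump:
    """Map one commit message to the bump level it implies."""
    header = message.split("\n", 1)[0]
    match = _HEADER_RE.match(header)
    if "BREAKING CHANGE:" in message or (match and match.group("breaking")):
        return Bump.MAJOR
    if match and match.group("type") == "feat":
        return Bump.MINOR
    if match and match.group("type") == "fix":
        return Bump.PATCH
    return Bump.NONE


def aggregate_bump(component_bumps: dict[str, Bump]) -> Bump:
    """Overall bump is the most significant bump among the components."""
    return max(component_bumps.values(), default=Bump.NONE)


def apply_bump(version: tuple[int, int, int], bump: Bump) -> tuple[int, int, int]:
    """Increment the chosen field and reset the less significant ones."""
    major, minor, patch = version
    if bump is Bump.MAJOR:
        return (major + 1, 0, 0)
    if bump is Bump.MINOR:
        return (major, minor + 1, 0)
    if bump is Bump.PATCH:
        return (major, minor, patch + 1)
    return version
```

For a release where the API component received `feat(api)!: drop v1 endpoints` and the UI only a `feat`, `aggregate_bump` returns `Bump.MAJOR`, so a product at `(1, 4, 2)` would move to `(2, 0, 0)`.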

In essence, developers who adopt Clever Semantic Versioning can deliver more trustworthy versioning. Your version numbers become honest indicators of what’s changed, and you can leverage that honesty to automate release workflows and compatibility checks. This means less time wrestling with versioning dilemmas and more time building features, with confidence that your versioning strategy will scale as your project grows.

CleverThis’s Leadership in Open Standards

The release of Clever Semantic Versioning as a W3C Member Submission also highlights the broader innovation and leadership of CleverThis. We identified a crucial gap in how the industry approaches versioning and took the initiative to create a solution – not just for our own use, but for the benefit of the entire developer community. By crafting this specification and submitting it to the W3C, CleverThis is demonstrating a commitment to open standards and collaboration.

Creating a W3C Member Submission is no small feat. It involved rigorous drafting (the submission document itself spans a detailed overview, formal definitions, and multiple examples), internal review by our team of experts, and an official review by W3C’s team. The W3C has acknowledged our submission, noting that “Clever Semantic Versioning provides a more suited solution with its generic scheme” compared to using basic SemVer for varied purposes. In the W3C Staff Comment, they even pointed out that our approach could be useful for communities managing things like dataset versions (e.g. the W3C DCAT vocabulary for dataset metadata) or ontology/schema versions in Semantic Web standards. It’s rewarding to see that our work is being recognized as a valuable resource for the web and software community at large.

For CleverThis, contributing this specification is part of our philosophy of giving back and driving progress. We believe that by pioneering best practices and sharing them openly, we can elevate the state of the art for everyone. This effort aligns with our commitment to open source and open standards – much like how we contribute to other projects, we’ve now contributed a potential future standard that anyone can adopt. It’s a form of thought leadership: we didn’t just adapt to how versioning was done; we actively shaped how versioning could be done better. We’re excited that Clever Semantic Versioning is now published on W3C’s platform, where it can serve as a stable reference for developers and organizations around the world.

Getting Started and Looking Ahead

So, how can developers start taking advantage of Clever Semantic Versioning? The first step is to read the official Clever Semantic Versioning specification (hosted on W3C’s website) to understand the detailed rules. The spec includes numerous examples and the precise definitions needed to implement the scheme. Since it’s backwards-compatible with traditional SemVer, you can adopt it incrementally: for instance, you might continue using MAJOR.MINOR.PATCH as you always have, but begin applying the additional guidelines for what constitutes a major or minor change in your specific domain (be it an API, UI, etc.). Over time, you can introduce the EXTRA field for pre-releases or special versioning scenarios, and use META for build metadata as appropriate. If you maintain libraries or services, consider explicitly documenting which versioning class you adhere to (e.g. “This project uses Clever Semantic Versioning – API Versioning class”), so your contributors and users know what to expect.

Tooling support will evolve as the community gains awareness, but you likely can already integrate parts of the scheme with existing tools. Many build and release pipelines have hooks or plugins for SemVer – these can often be configured to handle a fourth "extra" segment or to enforce certain version bump rules. The open-source community may soon develop dedicated libraries to validate and compare Clever Semantic Versioning strings (similar to the many SemVer packages available). In fact, because the spec is open and published, we invite developers to contribute to tooling and to share best practices for adoption. (CleverThis has made the text of the specification available under a Creative Commons license, and our own implementation notes are open for discussion on our GitLab repository.)

Looking ahead, we are optimistic about the influence Clever Semantic Versioning can have. Our hope is that it becomes a go-to standard for versioning in scenarios that SemVer didn’t fully cover. By using it, organizations can reduce miscommunication about version changes and rely more on automation for release management. As more projects adopt the model, it could even pave the way for future standardization – perhaps informing a future W3C Working Group or industry-wide best practice for versioning across different types of digital assets.

In conclusion, the W3C Member Submission of Clever Semantic Versioning marks an exciting development in the world of software versioning. It brings a new level of precision and universality to something every developer deals with: version numbers. We encourage you to explore the specification, think about how its principles could improve your development workflow, and join us in embracing a more clever way to version. CleverThis is honored to lead this charge, and we look forward to seeing the community benefit from clearer, more consistent, and more automated versioning practices in the years to come.

Learn more: Check out the full Clever Semantic Versioning specification on the W3C website, and feel free to reach out to us with your experiences or questions as you adopt this new model. Together, let's make versioning a solved problem, so we can all focus on building great software with confidence in our compatibility and change management!

Ethical Challenges in AI
