This article describes a technique that helps you become productive in a large code base quickly. When mentoring other engineers, I found this to be a very effective and teachable approach. The primary audience is software engineers and engineering managers.
When I joined larger organisations, the on-boarding usually involved some sort of starter task: add a new metric to the dashboard, allow people to add a website URL to their profile, … Such tasks are a great idea, but new engineers often struggle to make progress, paralysed by the complexity of the code base. Most start working on a small subset (e.g. do front-end changes first) before understanding the big picture. This applies not only to junior engineers; the same is true for software veterans switching to a new company and auditors parachuting into a short-term project.
I call this approach “stack diving”, as we aim for quick, narrow journeys all the way through the stack, similar to how a cliff diver accelerates through the air and then repeats the exercise multiple times.
Create a playground
We want to move fast in the next steps, but not break things. Therefore, it’s key to have a playground where we can experiment confidently. Any serious tech infrastructure should have a way to spin up separate environments for doing this.
I found that this is the step where senior software engineers create a lot of value. Setting up the entire build and deploy properly once will prevent many ad-hoc questions later, which would otherwise interrupt flow and block the new joiner from making progress on their own.
It’s of course crucial to understand where the playground stops and the danger zone begins.
Sometimes even dev environments connect to the main data backend. A `drop table;` might make one more well-known than one hoped for.
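One way to make the boundary explicit is a small guard that checks where a connection points before anything destructive runs. The following is only a minimal sketch: the host names in `PLAYGROUND_HOSTS`, the example URLs and the helper names are assumptions for illustration, not part of any real infrastructure.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts that are known to belong to the playground.
PLAYGROUND_HOSTS = {"localhost", "127.0.0.1", "db.playground.internal"}


def is_playground(database_url: str) -> bool:
    """Return True only if the URL points at a known playground host."""
    host = urlparse(database_url).hostname
    return host in PLAYGROUND_HOSTS


def run_destructive(statement: str, database_url: str) -> str:
    """Refuse to run a destructive statement outside the playground."""
    if not is_playground(database_url):
        host = urlparse(database_url).hostname
        raise RuntimeError(f"refusing to run {statement!r} against {host}")
    # A real implementation would execute the statement here.
    return f"would run {statement!r}"
```

An allowlist is used on purpose: an unknown host is treated as the danger zone until someone confirms otherwise.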
Go wild
First item on the agenda: break things. Let’s do all of the following, starting from a clean checkout (or environment) each time:
- Delete classes / methods and observe how they break the build and/or service
- Change configuration parameters
- Delete entire folders or move them around
- …
These exercises allow us to answer and verify a lot of important questions:
- What parts are compile-time checked?
- What parts are captured by tests (and are they actually blocking deployment)?
- What kind of code generation is happening as part of the build?
- Is there any code that appears to be unused?
- How well do error messages point to our change, or do they lead us in a different direction?
Also, we can verify that our changes are actually picked up and deployed. Few things are more frustrating than assuming that a change is safe because our tests went smoothly, only to see later that the change was never visible in our local environment.
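The break-things loop can be scripted so that every experiment really does start from a clean copy. This is a rough sketch under assumptions: `run_experiment`, `mutate` and `check` are hypothetical names, and the check command stands in for whatever builds or tests the project.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Sequence


def run_experiment(
    project: Path,
    mutate: Callable[[Path], None],
    check: Sequence[str],
) -> bool:
    """Copy `project`, apply `mutate` to the copy, and run `check` in it.

    Returns True if the check still passes despite the mutation.
    """
    with tempfile.TemporaryDirectory() as scratch:
        copy = Path(scratch) / "checkout"
        shutil.copytree(project, copy)  # fresh copy for every experiment
        mutate(copy)                    # e.g. delete a file or edit a config
        result = subprocess.run(check, cwd=copy, capture_output=True, text=True)
        return result.returncode == 0


# Example call (hypothetical paths and command):
# run_experiment(Path("."), lambda root: (root / "config.yaml").unlink(), ["make", "test"])
```

A mutation that returns True deserves a second look: either the deleted piece is unused, or nothing in the build and tests covers it.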
Follow the white rabbit
Back to our starter task. Now it’s time to identify the top of the cliff. We navigate in our playground to the frontend component or API endpoint that we want to change.
Within our safe playground we now try to do the dive by following the calls through the stack.
Along the way we add log statements at every opportunity.
Where logging is difficult to add (it should not be!), make a drastic change (e.g. throw an exception) to verify, and then add a comment.
All of these are easy ways to verify we are actually going down the right rabbit hole.
Also, we can later retrace our steps using `git diff HEAD`.
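In a Python code base, the log statements could be added with a small decorator instead of being hand-written in every function. The sketch below is an illustration only: the `STACKDIVE` marker, the `trace` decorator and the `update_profile` handler are made-up names, not taken from any particular project.

```python
import functools
import logging

MARKER = "STACKDIVE"  # hypothetical prefix that makes discovery logs easy to find
log = logging.getLogger("stackdive")


def trace(func):
    """Log every call to `func` with its arguments and return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.warning("%s enter %s args=%r kwargs=%r", MARKER, func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.warning("%s exit %s -> %r", MARKER, func.__name__, result)
        return result
    return wrapper


@trace
def update_profile(user_id, website):
    # Hypothetical handler standing in for one layer of the real stack.
    return {"user_id": user_id, "website": website}
```

The messages use the warning level so they show up without any logging configuration, and the fixed marker makes every discovery line easy to spot in the logs and in the diff.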
The main goal is to identify the right places to make changes.
We want to prevent spending two days adding a new attribute to `EntityAccount` when we’re actually interacting with `EntityUser`.
While this might sound obvious, in large codebases it is nearly impossible to list all possible candidates.
At this point it’s a good idea to bug the busy engineer next to us to double-check.
Now that we know our trajectory through the air, it’s time to do our changes properly.
We commit all our discovery code changes and call that branch `discovery_task_xx`.
The actual work now happens on top of that in a branch called `task_xx`.
This way we don’t lose our path markers, and our weird logging stays outside our real commits.
When finished, simply rebase the commits of `task_xx` onto `main`, leaving the discovery commits behind, and we have a clean PR ready for review.
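Because `task_xx` sits on top of `discovery_task_xx`, a plain rebase would carry the discovery commits along; `git rebase --onto` replays only the commits made after the discovery branch. The script below is a sketch that plays this through in a throwaway repository. The file names, commit messages and the `git`/`commit`/`demo` helpers are assumptions; it needs the `git` command line tool to be installed.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-c", "user.name=demo", "-c", "user.email=demo@example.com", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def commit(repo: Path, filename: str, text: str, message: str) -> None:
    """Append `text` to `filename` and commit the change."""
    with open(repo / filename, "a") as handle:
        handle.write(text + "\n")
    git(repo, "add", filename)
    git(repo, "commit", "-m", message)


def demo() -> list:
    """Build the three branches, rebase, and list the commits left for the PR."""
    repo = Path(tempfile.mkdtemp())
    git(repo, "init")
    git(repo, "symbolic-ref", "HEAD", "refs/heads/main")  # name the first branch main
    commit(repo, "app.txt", "existing code", "initial state")

    git(repo, "checkout", "-b", "discovery_task_xx")
    commit(repo, "notes.txt", "path markers and logging", "discovery: path markers")

    git(repo, "checkout", "-b", "task_xx")
    commit(repo, "app.txt", "the real change", "task: real change")

    # Replay only the commits made after discovery_task_xx on top of main.
    git(repo, "rebase", "--onto", "main", "discovery_task_xx", "task_xx")
    return git(repo, "log", "--format=%s", "main..task_xx").splitlines()
```

In this toy setup the discovery and task commits touch different files, so the rebase applies cleanly. In a real code base the logging often lives in the same files as the real change, and the rebase may stop for conflicts that have to be resolved by hand.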
Credits: cover photo by Sophie Sollmann on Unsplash.