Building invisible tooling for internal users

Holding your core - with concrete, ropes and lots of duct tape

Jul 03, 2023

Building internal tooling - reusable components, pieces of infrastructure or APIs - is much more difficult and unforgiving than building the customer-facing stuff.

You can’t see, smell or touch it.

Take a recommendations API component: it feeds and supports the others who build the client stuff the company can actually sell. You don’t get much kudos for what you build, but…

If it goes south, months may pass before the failure becomes apparent. By then the data is corrupted, lots of wrong decisions have been made, and the whole structure other teams are building on needs a serious pivot that costs everyone money.

Some of these failures are irreversible, and some of them make maintenance a financial nightmare long after you leave the company.

Some scrum evangelists frown upon component teams, but there is often no way around them. You can’t make every team build its own piece of the backend as part of its feature or product. You will end up with a huge list of inconsistencies, dependencies, incompatible versions and many other nasty things those evangelists have no idea about.

The code will be tripled in production. There will be duplicates and copies of copies, with lots of weird redundant code no one wants to touch. (I remember, back in the day, some folks refused to install Magento on their Macs because they hated touching the PHP codebase and complained that their Macs couldn’t allocate that much memory locally. Yeah, I had to beg on my knees.)

Therefore some sort of committee must arise: those sullen platform people must be there to keep an eye on every product decision ever made.

While I am pro-autonomy, over the years I have realized that allowing anarchy on the backend will leave you unable to build products fast enough for them to matter, at least not without a considerable time investment in rebuilding your dated backend.

Paradoxically, while maintaining the support structure discussed above, you want to make your part as invisible as possible - fast, reliable as a hammer, lightweight and as faceless as you can get it.

The magic of best-in-class hardware/software is to remain invisible and reliable as a Swiss watch (or an old Volvo 240 GL for that matter).

I have a fair share of stories about bad decisions, made before my tenure or even on my watch. So I keep a list of personal rules I try to follow as a PO/PM (it keeps growing):

Find and engage architects on the team (good engineers will do). Their expertise will save you lots of money, but they have to understand what you are trying to build. Be as vocal as possible, and use UML and drawing boards.
For every piece of new functionality, think about collisions, edge cases, worst-case scenarios, telemetry and the future you are heading towards. What’s the actual end goal of the change?
Get yourself a map of the dependencies of the services you might touch while developing X and Y, and always keep the respective folks in the loop.
Your service will have a shield - an intermediary between you and the end user (a client, an actual end user). Keep those folks in the company in sync with your plans and timelines. They are intermediaries you want on your side.
Consider selling early. At times, the team building on top of your components isn’t fully aware of the capabilities of the machine you are in charge of. At times, the customer-facing scrum teams get such customer-focused tunnel vision that they lose sight of your shore. Reminding them about the engines and assembly lines standing ready is your duty. It is a pushing job, not a pulling one. Management rarely goes down into the mines to talk to the company dwarfs.
- People are shallow.
- Users are shallow.
- Reliable products are miles deep and wide.
Telemetry is your friend. Your Grafana dashboards become an entangled network of spies and snitches (not literally). Track every single thing you can; you will thank me later. Along with events, encourage tracing with tools like Jaeger. This stuff will save you time and money.
Ask your internal user base about the worst-case scenarios they expect. What is a big no-no in the features they sell to the end users? An empty list won’t do. Is a delay of 5 minutes a show-stopper? The more you know, the safer the machine your team is building. Don’t bother them until you do.
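
The dependency map from the rules above doesn't have to be fancy. Here is a minimal sketch in Python of how I might keep one as plain data and ask it "who do I need to talk to before I touch this?". The service names, team names and helper functions are all made up for illustration; your real map will probably live in a service catalog or a wiki page.

```python
from collections import deque

# Hypothetical map: each service lists the services it depends on.
DEPENDS_ON = {
    "storefront": ["recommendations-api", "catalog"],
    "email-campaigns": ["recommendations-api"],
    "recommendations-api": ["events-pipeline"],
    "catalog": [],
    "events-pipeline": [],
}

# Hypothetical owners, i.e. the folks to keep in the loop.
OWNERS = {
    "storefront": "web team",
    "email-campaigns": "CRM team",
    "recommendations-api": "platform team",
    "catalog": "catalog team",
    "events-pipeline": "data team",
}


def affected_by(changed, depends_on):
    """Return every service that depends on `changed`, directly or transitively."""
    # Invert the map: service -> services that consume it.
    consumers = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            consumers.setdefault(dep, []).append(service)

    # Breadth-first walk up the chain of consumers.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


def teams_to_notify(changed, depends_on, owners):
    """Return the sorted list of teams owning services affected by a change."""
    affected = affected_by(changed, depends_on)
    return sorted({owners[s] for s in affected if s in owners})


if __name__ == "__main__":
    print(teams_to_notify("events-pipeline", DEPENDS_ON, OWNERS))
```

In this toy map, a change to the events pipeline reaches the recommendations API and, through it, both the storefront and the email campaigns, so three teams need a heads-up even though only one of them consumes the pipeline directly.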

Reliability is silent and faceless.

God bless real engineers!

Product Zine by Gene Ishchuk
