As I get to know my new coworkers I’m starting to assign them nicknames in this blog. The papal conclave that elected a new Pope today inspired me, for a few reasons, to nickname my boss, previously referred to as CTO. He is henceforth “Pope”.
Pope and I have spent the last two days bringing me up to speed, and I have surely been “drinking from the fire hose.” I’m mentally exhausted.
Today became a different kind of trial, however, as a critical issue with the current production servers (directly impacting company revenue) consumed much of our time.
I found myself doing something I really never like to do but have many times found myself needing to do: dive into a complicated, poorly-documented set of running processes and try to figure out what the problem is. Pope and I tag-teamed it, with his understanding of processes and my understanding of code, with occasional calls to business users to figure things out. Eventually we figured out that this was a problem with our network, which had recently been reconfigured due to the opening of our new branch office (the one from which I’m working). Essentially, the internal firewall got misconfigured and blocked the processes from communicating with outside processes.
Which tells me the current version of our platform does something very badly: It doesn’t report when things aren’t working right. It “fails silently.”
TL’s Third Rule of Development: Never architect nor write your code under the assumption that everything will work as planned. Always develop with the exception cases in mind.
That rule goes double for any code that talks to something external to itself (a database, an API, a file import or export). Because those always seem to break at the most surprising times.
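To make the rule concrete, here is a minimal sketch in Python of the opposite of failing silently. It is not code from our platform; `call_external`, `ExternalServiceError`, and the logger name are hypothetical names I made up for illustration. The idea is simply that every call to something external goes through a wrapper that logs the failure and raises it, rather than swallowing it.

```python
import logging

# Hypothetical logger name; a real setup would route this to alerts or email.
logger = logging.getLogger("external_calls")


class ExternalServiceError(RuntimeError):
    """Raised when a call to something outside our own process fails."""


def call_external(name, func, *args, **kwargs):
    """Run func(*args, **kwargs); on any failure, log it and re-raise.

    `name` is a human-readable label for the external dependency
    (a database, an API, a file import) used in the log message.
    """
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        # Report loudly instead of hiding the problem.
        logger.error("External call %r failed: %s", name, exc)
        raise ExternalServiceError(f"{name} failed: {exc}") from exc
```

With something like this in place, a firewall that blocks an outbound connection shows up as a logged error and a raised exception at the moment it happens, not as a revenue problem someone notices days later.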
Fortunately most of this code is going to be retired when the new platform is released. Unfortunately, that’s no guarantee that the new code handles the exception cases appropriately, and a review is indicated.
Other fun things I discovered: a cron job runs every hour, on the hour, and tries to execute two scripts that don’t exist. As far as I can tell, they haven’t existed for over a year, because there are web server logs that show the same errors every hour going back that far.
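A check like the following could catch this kind of thing. It is a rough sketch in Python, not anything we actually run, and `missing_cron_scripts` is a hypothetical helper name. It assumes a plain user crontab (five time fields followed by the command) and only looks at the first word of each command when that word is an absolute path.

```python
import os
import shlex


def missing_cron_scripts(crontab_text):
    """Return absolute program paths named in a crontab that do not exist.

    Assumes the user-crontab layout: five schedule fields, then the command.
    Lines with fewer fields (comments, blank lines, VAR=value settings,
    @hourly-style shortcuts) are skipped.
    """
    missing = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        # The first word of the command is the program or script to run.
        program = shlex.split(fields[5])[0]
        if os.path.isabs(program) and not os.path.exists(program):
            missing.append(program)
    return missing
```

Feeding it the output of `crontab -l` would list any scripts the schedule still points at after they were deleted or moved.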
This concerns me greatly because it means there has been little to no quality assurance on the processes or the code to date. And I’m worried that the new development has taken place under the same (lack of) constraints.
My solution is to lay the groundwork for a near-term plan to do a full review of the new code base. I know that CEO’s top priority is getting our new platform out the door as quickly as possible, as the E-commerce guy has all kinds of things he wants to do that are dependent on the new platform being live. So that’s first priority. But once that’s done, I want to do that full review and identify potential problem areas. I introduced CEO to the concept of ‘technical debt’ and told him what my post-release goals were in the context of ‘paying down’ the technical debt.
Mostly, I don’t want to be trying, yet again, to figure out how to build stable code on a brittle foundation.
The other item of note today was that CEO was eager to plan to build my team. I told him I had several people in mind, but that I needed a couple weeks to really get to understand the company’s needs before I made any recommendations. Pope was quick to back me up on that, saying he wanted my focus right now to be getting acquainted with the new applications.
Still, his eagerness is a good sign for my desire to build myself and my company a great development team.