Closing the loop on a monitoring alert is traditionally something that happens implicitly: the dashboard returns to its idyllic green state, the text message delivers a well-deserved “Service: OK” or, in more extreme cases, the incident review is over and actions have been assigned. This, however, assumes the alert is working well and that the operator understands why it woke them up and the value their involvement brings. In more fluid environments alerts can be incorrect, can flag issues that do not require immediate attention and, in the worst case, can be ghost calls that mysteriously correct themselves just after you’ve woken up enough to find your MFA device. Read on →

Testing your shell scripts can be complicated enough already, but when you start to incorporate time-based test scenarios you can quickly find yourself in the land of fragile tests and intermittent failures. By adding a small command line utility called faketime to your toolbox you can make your chronology-related tests more reliable and reproducible. To start we’ll install faketime. This focused little binary allows you to explicitly set the system time for whichever command you pass to it. Read on →
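
As a rough illustration of the idea, here is a minimal Python sketch that wraps a command in faketime. It assumes the `faketime` binary from libfaketime is on your `PATH` and is invoked as `faketime '<timestamp>' <command>`; the helper names `faketime_argv` and `run_at` are hypothetical and exist only for this example.

```python
import shutil
import subprocess


def faketime_argv(timestamp, command):
    """Build the argument list: faketime <timestamp> <command and its args>."""
    return ["faketime", timestamp, *command]


def run_at(timestamp, command):
    """Run `command` as if the clock read `timestamp`.

    Returns the command's stdout, or None when faketime is not installed
    (an assumption of this sketch, so callers can skip the test instead).
    """
    if shutil.which("faketime") is None:
        return None
    result = subprocess.run(
        faketime_argv(timestamp, command),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With faketime installed, a call such as `run_at("2008-12-24 08:15:42", ["date", "+%Y"])` should report the faked year rather than the real one, which is what lets a time-dependent test pin its clock.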

I recently read SLO Adoption and Usage in SRE, a free book of two halves. The first provides a brief introduction to SLIs, SLOs and Error Budgets that could be given to an impatient but interested co-worker. The second part is an analysis of the responses from the ‘SLO Adoption and Usage in SRE’ survey. If you like the DORA State of DevOps Reports you’ll also enjoy this. Summary: “SRE is an emerging IT Service Management framework” and should be treated in the same way as ITIL, distrusted but pillaged for the good bits. Read on →
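
For anyone who has not met the term, an error budget is simply the unreliability an SLO permits over a window. The sketch below is not taken from the book; it assumes a plain availability SLO expressed as a percentage, and the function name `error_budget_minutes` is hypothetical.

```python
def error_budget_minutes(slo_percent, window_days):
    """Minutes of allowed unavailability for an availability SLO over a window."""
    window_minutes = window_days * 24 * 60
    # The budget is whatever fraction of the window the SLO does not cover.
    return window_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves roughly 43 minutes of budget.
    print(round(error_budget_minutes(99.9, 30), 1))
```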

I have a tab that normally lives somewhere near the middle of my web browser’s tab bar. Over the course of my day it faces constant pressure on each side: from ad hoc work tabs being opened by the pinned email and Slack tabs on its left, and from proactive work-based tabs on its right. I’ve learned I can tell how my week is going by where it is on the bar. Read on →

I’m not a massive fan of Mind Mapping, a “hierarchical way to diagram and visually organise information that shows relationships among pieces of the whole”, but before Christmas I found myself blocked and unable to gain traction on a small research-based side project. After idling for an uncomfortable duration I tried a few alternative approaches, and Mind Mapping appeared to be a decent fit, so I decided to install a tool and break my block with it. Read on →

Just before Christmas I had to do some work on new business continuity plans (BCP) and disaster recovery (DR) documents. To help warm up and get myself in the right frame of mind I posted a few easy opening scenarios to Twitter for comment, and I’ve decided to collect them back up and post them here, in my external memory, for posterity. Each of these ideas should be considered the most generic, low-hanging fruit of your plans. Read on →

I’ve spotted a small but recurring pattern among some of my friends, their on-call responsibilities, and the gradual erosion of their work-life balance. It all starts innocently enough with the ceremonial signing off of the new work phone. The keen new employee goes to the Corporate IT team and gets given a locked-down mobile phone that is 2 or 3 generations old. It normally comes with the trinity of features: a slightly shonky battery, a larger than expected physical presence and a decent bandwidth package you never have to pay for. Read on →

Next in my unhurried investigation of hosted build systems for my small collection of Free Software is GitHub Actions, a fully hosted task runner that can build, test, and deploy your code right from GitHub. As someone not exclusively using GitHub to manage all their source code, the idea of being completely tied into a single provider isn’t a great one, but the technology looks interesting enough to justify running a few simple experiments. Read on →

I recently enabled Dependabot to help track updates to my dependencies and keep them current. The user experience has been a pleasant one, with simple configuration and timely pull requests, but I’ve quickly come to dread one specific thing - the Updating Of The Rubocop. I have quite a lot of Ruby-based repositories and I like to use rubocop as a basic safety net and second set of eyes, so it’s in heavy use, which is great until the version changes. Read on →

I’ve had a half-written draft of this post sitting in a folder for the last six months and I’ve not been able to shake the root cause, so I’m going to publish it and see what the feedback teaches me. But first the heresy - Service Level Objectives make me uncomfortable. I have no issue with the idea that you need some form of measurement and tracking to ensure you’re maintaining an acceptable level of service. But when reading posts on SLOs, or watching recorded conference sessions, the concept seems to imply some rigour and background process for determining the numbers to work towards, and that process feels decoupled from any hard details. It often comes across as either a guesstimate or just a Current Representation of Actual Percentages. Read on →