A project to modernise an old Terraform code base came across my desk recently and, while investigating the more recent developments in testing tools and workflows, I stumbled onto conftest, a utility to help you write tests against structured configuration data. I was interested in trying the technology out but I didn’t want to put something I have this little experience with on the main flow of work, so I decided to do a few tests with it against a smaller, more self-contained use case. Read on →

After seeing DNSTwist mentioned in a Twitter thread recently I’ve been having far more fun than appropriate using it to investigate domain name typo squatting. Typo squatting is when someone registers a domain very similar to a popular one in order to capture the traffic from people who mistype the domain name or URL, and often do unpleasant things with it. A benign example of this is GutHib, a common typo for GitHub that just helps people along with a subtle indication of the error. Read on →

Back in the mists of time, when I first registered this domain name, the main purpose of the site was to be a place where I could link to all my little side projects. Over the years, as I’ve been fortunate to be found by readers, I’ve grown more and more picky about what I post and, as a side effect, about which side projects I’d invest a few hours into. Read on →

GitHub recently announced the Super Linter, a Docker container that can be run via GitHub Actions and comes complete with a lot of built-in linting tools to help you detect less than ideal code. As someone who uses linters in different contexts, for example shellcheck for bash, rubocop for Ruby and flake8 for Python, I like the idea of having someone else package these up for easier use in my own GitHub repositories. Read on →

For most companies Incident Commander or Incident Manager is not a specific job; it’s a role you may take on when something has gone, often horribly, wrong and you need to quickly unite an ad hoc group into a team to resolve it. The incident commander should be the point of contact, and source of truth, about your incident and to do that successfully they’ll need to be updated and kept informed about what’s happening. Read on →

After my recent foray into Monitoring alerts and customer satisfaction surveys I was fortunate enough to have a couple of conversations with other SREs interested in trying out similar approaches and, while discussing the specifics of our individual forms, the lack of an easy way to share the details became an annoyance. Storing the wording in a shared doc was OK at first, but recreating the forms so we could each learn from and tweak each other’s wording and design quickly became a distraction, with the Google Forms web UI taking up most of the time we had to chat. Read on →

Closing the loop on a monitoring alert is traditionally something that implicitly happens when the dashboard returns to its idyllic green state, the text message returns a well deserved “Service: OK” or, in more extreme cases, the incident review is over and actions have been assigned. This however assumes the alert is working well and the operator understands why it woke them up and the value their involvement brings. In more fluid environments alerts can be incorrect, can flag issues that do not require immediate attention and, in the worst case, can be ghost calls that mysteriously correct themselves just after you’ve woken up enough to find your MFA device. Read on →

Testing your shell scripts can be complicated enough already, but when you start to incorporate time-based test scenarios you can quickly find yourself in the land of fragile tests and intermittent failures. By adding a small command line utility, called faketime, to your toolbox you can make your chronology-related tests more reliable and reproducible. To start we’ll install faketime. This focused little binary allows you to explicitly set the system time for whichever command you pass to it. Read on →

I recently read SLO Adoption and Usage in SRE, a free book of two halves. The first provides a brief introduction to SLIs, SLOs and Error Budgets that could be given to an impatient but interested co-worker. The second part is an analysis of the responses from the ‘SLO Adoption and Usage in SRE’ survey. If you like the DORA State of DevOps Reports you’ll also enjoy this. Summary: “SRE is an emerging IT Service Management framework” and should be treated in the same way as ITIL, distrusted but pillaged for the good bits. Read on →

I have a tab that normally lives somewhere near the middle of my web browser’s tab bar. Over the course of my day it faces constant pressure from each side: from ad hoc work tabs being opened by the pinned email and Slack tabs on its left, and from proactive work-based tabs on its right. I’ve learned I can tell how my week is going by where it is on the bar. Read on →