Matt Daines

Doing Things The Hard Way

To be a well-rounded infrastructure engineer you need exposure and experience to various technologies. For example, networking, hardware, operating systems and applications. On the soft skills side, troubleshooting, problem solving and communication.

To be candid, I think a lot of what I’ve done recently professionally has been quite easy - this sounds like a brag, it’s not meant to be. It’s an acknowledgement of how the public cloud makes consuming cloud services very easy relative to on-premises data centres. In Azure, there’s no VLANs, no physical networking (the devices that enable ExpressRoute and site-to-site VPNs could be an exception but I’d argue they aren’t hosted in Azure) and for a lot of Azure services the availability of the service is taken care of for you.

As a passionate technologist (and an Azure cloud consultant professionally) I love the challenge of picking a technology I’ve not used before and trying to figure out how it works and where I can use it. Those opportunities have become increasingly harder to find thanks to the complexity abstraction cloud services offer.

I have found that colleagues that are new to the industry that start their careers in Azure and grow from there are less likely to be able to troubleshoot issues later in their careers. Why is that? What’s the difference between the start of my career and some of my newer colleagues?

I put it down to a lack of experience and training on the fundamentals of computing. For context I’m not talking about different types of gates within a processor like you might find on computer science courses at university. I’m sure that’s very important for some professions but it’s not important for what I do professionally. I’m talking networking (OSI model, CIDR, subnets, routes, etc), Windows Server (Active Directory Domain Services, IIS, Hyper-V), and an understanding of public key infrastructure (PKI) and so much more.

I can’t, and don’t blame anybody for not knowing what I consider to be the fundamentals. On a day-to-day basis I don’t need to know all of this. In Azure, if I need to get my virtual machine to go via a firewall before it goes to the Internet then I just tell Azure’s software defined networking via a user defined route to go to the frontend IP of an Azure firewall. Concepts like the default gateway are abstracted away.

A positive argument to this abstraction is that it makes cloud engineers more productive and enables more people to get involved with Azure, and I’m sure other cloud providers. You no longer need to be an expert to setup a website and that’s great! But when things go wrong, how do people troubleshoot?

Large language models (LLMs) like ChatGPT, Gemini (formerly Bard), Llama are a fascinating technology! But I’m not sure people are using them correctly when it comes to problem solving and troubleshooting. From what I’ve seen it’s quite common for people to only find the solution to their problem and not have the LLM help them to understand the problem and why the solution works. This is a problem because when the same problem is presented in a different way, the person is likely to face the same challenge and still not know how to move forward. So if you’re using LLMs to help you solve problems, make sure that you’re also asking the LLM to explain the solution to you in a way that you can understand it.

So, what am I doing about my lack of challenge? Well, I’m moving my blog from GitHub pages to a Kubernetes cluster hosted on Raspberry Pis. Why? There’s a lot of technology that I need to understand that I have either not worked with for a long time or not at all!

By doing this I’ll need to learn:

And I’m sure a lot more! I hope to blog about the challenges I face as I go through this process. See you on the other side! And I’d encourage you to think about where you can do things the hard way to become a more resilient person.