As a principal solutions architect, I always start my cloud architecture discussions with the “triangle of truth,” or “holy trinity,” of architecture principles for cloud: stability, agility, and security. You must have all three. The rub is that hitting one vertex of the triangle is trivial, and hitting any two is pretty straightforward and within grasp for most businesses, but hitting all three is quite difficult. If you’re hitting all three simultaneously, you have a mature architecture and a mature approach to cloud, which is infrastructure as code.
To help you identify your next steps, I’ll break down the big moving pieces and provide you with a day 1 checklist to build the stable, agile, and secure environment your business requires.
Stability
Stability requires good connectivity, and by this I mean hybrid connectivity. It is imperative for implementing a standard cloud architecture. Connectivity could be an SD-WAN, an edge play, or private connectivity. Typically with big enterprises (the Fortune 500 and the blue chips) we see both edge and private connections.
How big and involved your connectivity needs to be depends on your business model. Connectivity must account for a multi-region or multi-cloud strategy, which is why having a transit network in the cloud becomes important. But regardless of how fat the pipe is, you need a reliable internet strategy to get your workloads to the cloud. This is because the majority of enterprise cloud deployments are hybrid. Applications and the associated workloads will be spread out across your distributed team, rather than living just on-prem or just in the cloud.
So, the first pillar is getting your connectivity dialed in and your latency within scope. Your applications will stretch like taffy across these pipes, so having robust, low-latency hybrid connectivity forms the foundation on which you’ll build everything else.
Agility
Agility is your ability to do infrastructure as code and execute on the promise of cloud, which is to automate everything. Agile deployment, continuous development through a CI/CD pipeline, and good use of a tool like Terraform all roll into agility. Good cloud ops practices involve more than merely deploying in the cloud; planning for day 2 operations is also critical.
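To make the infrastructure-as-code idea concrete, here is a minimal sketch of the "plan" step that declarative tools perform: compare the state you declared in code with what actually exists, and work out what to create, update, or delete. This is an illustration only, not how Terraform is implemented; the `plan` function, the resource names, and the CIDR values are all hypothetical.

```python
def plan(desired, actual):
    """Diff desired state (declared in code) against actual state.

    Both arguments map a resource name to a dict of settings.
    Returns the resource names to create, update, and delete.
    """
    create = sorted(name for name in desired if name not in actual)
    delete = sorted(name for name in actual if name not in desired)
    update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return {"create": create, "update": update, "delete": delete}


# Hypothetical desired state, as it might be declared in a repository.
desired_state = {
    "transit-vpc": {"cidr": "10.0.0.0/16", "region": "region-a"},
    "app-vpc": {"cidr": "10.1.0.0/16", "region": "region-a"},
}

# Hypothetical actual state, as it might be read back from the cloud.
actual_state = {
    "transit-vpc": {"cidr": "10.0.0.0/16", "region": "region-a"},
    "app-vpc": {"cidr": "10.9.0.0/16", "region": "region-a"},
    "legacy-vpc": {"cidr": "192.168.0.0/24", "region": "region-a"},
}

changes = plan(desired_state, actual_state)
print(changes)
```

Because the diff is computed rather than hand-written, the same code can run in a pipeline on every commit, which is what makes day 2 drift visible instead of silent.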
Security
Adequate security in the cloud requires fresh thinking. Network security tends to be challenging. Many cloud security stacks and capabilities are difficult to automate at scale. There might be 100 people worldwide capable of enterprise-level multi-cloud security architecture.
Many companies try to address this skills gap by training their current employees. But, like most of us, many security and IT professionals ultimately fall back on what they know. So they bring in virtual offerings like virtual firewalls or virtual routers, tools that are based on technologies they’re familiar with.
Unfortunately, this virtualization is self-defeating. Much of this stuff does not align with a cloud operational model. It’s hard to orchestrate virtual firewalls as code.
Recently, we’ve seen these firewalls being touted as services. The downside is that they become black boxes like any as-a-service offering. You lose your visibility and control because it’s being done by a service provider. If you need a data dump or need to troubleshoot and see what’s happening on an interface, it becomes a ticketing process. Or a pick-up-the-phone process. Both are time-consuming and slow, which makes your day 2 operations very difficult.
The scales are tipping towards these security devices, which bring back a lot of the control, visibility, and application-level insight that security demands without having to pay top dollar for talent; everything is code-based.
Still, there’s risk. These firewalls are expensive and complicated to scale. If they fail, you lose everything because you pulled all your security into these firewalls; if one goes supernova, it’s not pretty.
I suggest you use the cloud-native network security stacks, where possible, inside the VNet or VPC, and make the VNet or VPC a meaningful boundary. Use it. Don’t abuse it. Don’t make hundreds of them. What goes in there should be homogeneous: a type of application, perhaps, or a line of business.
VPCs tend to be overused and abused. Use good orchestration. Where you can, use cloud-native security stacks inside your VPCs between the segments, and reserve your firewalls for the big motions between the VPCs to try to keep the firewall less busy.
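The split described above can be sketched as a simple decision: traffic that stays inside one VPC is left to cloud-native controls, and traffic that crosses a VPC boundary goes to the firewall. The VPC names and CIDR ranges below are hypothetical, and sending traffic with an unknown endpoint to the firewall is an assumption made for this sketch, not a rule from any provider.

```python
import ipaddress

# Hypothetical layout: one VPC per line of business.
VPCS = {
    "payments": ipaddress.ip_network("10.1.0.0/16"),
    "analytics": ipaddress.ip_network("10.2.0.0/16"),
}


def vpc_of(ip):
    """Return the name of the VPC containing this address, or None."""
    addr = ipaddress.ip_address(ip)
    for name, network in VPCS.items():
        if addr in network:
            return name
    return None


def enforcement_point(src_ip, dst_ip):
    """Decide which control should inspect a flow.

    Same VPC on both ends -> cloud-native controls (e.g. security groups).
    Different VPCs, or an endpoint outside every known VPC -> firewall
    (the second case is an assumption of this sketch).
    """
    src, dst = vpc_of(src_ip), vpc_of(dst_ip)
    if src is not None and src == dst:
        return "cloud-native"
    return "firewall"


print(enforcement_point("10.1.1.5", "10.1.2.9"))  # both in payments
print(enforcement_point("10.1.1.5", "10.2.0.7"))  # payments to analytics
```

Keeping east-west traffic within a VPC off the firewall is what keeps the firewall less busy; only the big motions between VPCs reach it.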
Checklist
Let’s summarize our checklist.
1. Stability
- Start with a good edge strategy, good connectivity, and an understanding of whether the internet, private pipes, or both are effective for your business model.
- Think about what’s on the other side of the connection. Preferably, it is a good virtual data center architecture that is scalable and “stampable.”
- Confirm you have good transit between your regions to ensure you’re prepared for your business model to grow. Region-to-region is critical, not only to reach global customers and scale your business, but also for stability.
2. Agility
- Expect regional outages and events.
- Plan for strong control and visibility in day 2 operations. Both have been hindered somewhat by traditional security models moving into the cloud, where they are an anti-pattern. They don’t scale, and they cannot keep up with the speed and growth of the cloud.
- Be aware of all your data privacy regulations.
3. Security
- Ensure encryption everywhere, or in as many places as possible.
- Establish a conservative security policy.
- Use micro-segmentation.
- Make sure your VNets and VPCs are meaningful and what’s going on inside of those VNets and VPCs is safeguarded.
- Even if it’s within a shared business unit, use cloud capabilities.
- If you can, create security groups to curtail access.
- Always go with least-privilege access.
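The security items above lend themselves to automated checks. As a minimal sketch, the audit below flags security-group-style ingress rules that are open to any source address or that allow every port. The rule format (`name`, `source`, `from_port`, `to_port`) is a hypothetical simplification and does not match any specific provider's API.

```python
import ipaddress


def is_open_to_world(cidr):
    """True for 0.0.0.0/0 or ::/0, i.e. a prefix length of zero."""
    return ipaddress.ip_network(cidr).prefixlen == 0


def audit(rules):
    """Return (rule name, reason) pairs for rules that look too broad."""
    findings = []
    for rule in rules:
        if is_open_to_world(rule["source"]):
            findings.append((rule["name"], "source is open to any address"))
        if rule["from_port"] == 0 and rule["to_port"] == 65535:
            findings.append((rule["name"], "all ports allowed"))
    return findings


# Hypothetical ingress rules for illustration.
rules = [
    {"name": "https-from-lb", "source": "10.1.0.0/24", "from_port": 443, "to_port": 443},
    {"name": "ssh-from-anywhere", "source": "0.0.0.0/0", "from_port": 22, "to_port": 22},
    {"name": "any-port-internal", "source": "10.1.0.0/16", "from_port": 0, "to_port": 65535},
]

findings = audit(rules)
for name, reason in findings:
    print(f"{name}: {reason}")
```

A check like this can run in the same pipeline that deploys the rules, so an overly broad rule is caught before it reaches production.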
It can be daunting for many enterprises to develop or hire the expertise needed to operationalize such a checklist. But there are services that abstract away much of this complexity and simplify the management of migrating to the cloud.