Click to learn more about author Matt Yonkovit.
A forest fire is a powerful force of nature. It is capable of massive destruction, but it also has the potential to bring forth new life and facilitate positive growth.
Cloud database-as-a-service offerings have a similar duality.
The power of the cloud has transformed our technology infrastructure. Nowhere is this more evident than in the rise of cloud database-as-a-service offerings.
These powerful offerings (such as Amazon Aurora, Azure SQL, Google Cloud SQL, and MongoDB Atlas) have rapidly become the most popular way for people to run their databases in the cloud. However, they also bring challenges and potential problems if they are not used or deployed correctly. In its latest Magic Quadrant, Gartner made the strategic assumption that 75% of all databases will be deployed or migrated to a cloud platform, with only 5% ever considered for repatriation to on-premises. It also assumed that, by 2023, the preference for Data Management in the cloud will reduce the number of vendors, while multi-cloud will make Data Governance and integration more complicated.
Developers have always had a tepid relationship (at best) with their databases. Databases are/were considered a necessary evil when it came to building applications. Friction between the DBA and the developer often resulted from each group having different goals and outcomes. The DBA has to consider long-term sustainability, security, availability, and scalability. The developer, on the other hand, thinks about features, release schedules, and users.
Database-as-a-service (DBaaS) promises to automate away concerns around sustainability, security, and availability. It puts the power to choose and operate databases into a developer’s hands, allowing them to move fast and reap the rewards.
But, while database-as-a-service solutions remove much of the mundane work of configuring and operating databases, they tend to focus on the lowest-hanging fruit. This leaves the more complex administrative or troubleshooting database activities to the developers and/or DBAs.
For example, you may have the tools to perform backups or encrypt your data, but are you setting them up correctly or avoiding common mistakes?
Over the last few years, many companies have made headlines because their data has ended up in the hands of hackers. However, contrary to the stereotype of a computer whiz-kid hacker frantically tapping code into a laptop, these breaches are usually a result of companies not following best practices, or not understanding the limitations of the technology they deploy. One of the most common sources of leaks is unprotected S3 buckets containing backup files. Just because you can click the backup button doesn’t mean your backup is properly encrypted or protected or, in some cases, even restorable. This is why you still need committed and knowledgeable people available to advise on potential risks.
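To make the backup point concrete, here is a minimal sketch of the kind of checklist a team might run against each backup location. Everything in it is an assumption made for illustration: the BackupTarget fields, the audit_backup function, and the 90-day restore-test threshold are hypothetical and do not come from any provider's API. In practice the values would be filled in from your provider's own configuration and audit tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupTarget:
    """Hypothetical description of where a database backup is stored."""
    name: str
    publicly_readable: bool
    encrypted_at_rest: bool
    # Days since someone last restored this backup successfully (None = never).
    days_since_restore_test: Optional[int]


def audit_backup(target: BackupTarget, max_restore_age_days: int = 90) -> List[str]:
    """Return a list of human-readable findings; an empty list means no findings."""
    findings = []
    if target.publicly_readable:
        findings.append("storage is publicly readable")
    if not target.encrypted_at_rest:
        findings.append("backup is not encrypted at rest")
    if target.days_since_restore_test is None:
        findings.append("restore has never been tested")
    elif target.days_since_restore_test > max_restore_age_days:
        findings.append("restore test is out of date")
    return findings


if __name__ == "__main__":
    risky = BackupTarget("nightly-dump", True, False, None)
    for finding in audit_backup(risky):
        print(f"{risky.name}: {finding}")
```

The point of the sketch is that "a backup exists" is only one of several questions. Exposure, encryption, and a tested restore are separate checks, and someone has to own each of them.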
As these problems are becoming more common, database vendors and cloud providers are adding features to help mitigate or address them. However, many of the deepest, most impactful issues are still based on an application’s architecture and how the database is set up for a particular application. This is where the concept of “Shared Responsibility” comes into play.
Cloud providers give you the foundational database components to build a secure and scalable application. That doesn’t mean your application is scalable and secure. Often we find that the “set it and forget it” mentality towards databases (or rather the “it’s my provider’s responsibility” mentality) leaves users with higher costs and a lot more work in the longer term.
A recent report from Andreessen Horowitz highlighted the potentially higher costs of the cloud. In my experience, database costs are often a higher proportion of the total spend than you might expect. These increased costs are often a result of companies solving database problems via scaling by credit card (moving to the next instance size, increasing storage space, etc.). This is a false economy and can quickly spiral. By instead implementing proper database design, management, and support, costs can be reduced by up to 50%.
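As a rough illustration of how scaling by credit card compounds, the sketch below assumes that each step up in instance size doubles the monthly price. That doubling factor and the 500.0 starting price are hypothetical numbers chosen for the example, not figures from the report or from any provider's price list.

```python
def monthly_cost(base_cost: float, size_steps: int, growth_per_step: float = 2.0) -> float:
    """Monthly cost after moving up `size_steps` instance sizes.

    Assumes (hypothetically) that each size step multiplies the price
    by `growth_per_step`.
    """
    return base_cost * growth_per_step ** size_steps


def savings_fraction(tuned_cost: float, scaled_cost: float) -> float:
    """Fraction of spend avoided by running at `tuned_cost` instead of `scaled_cost`."""
    return 1.0 - tuned_cost / scaled_cost


if __name__ == "__main__":
    base = 500.0  # hypothetical starting price per month
    for steps in range(4):
        print(f"{steps} size step(s) up: {monthly_cost(base, steps):.2f} per month")
    # Fixing the workload so it fits one size lower than where it ended up:
    print(savings_fraction(monthly_cost(base, 1), monthly_cost(base, 2)))
```

Under that doubling assumption, every size step you avoid through better schema, query, or index design halves the instance bill, which is one way a reduction on the order of 50% can come about.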
Cost aside for a second, there are some important points you should bear in mind when considering database-as-a-service. Database-as-a-service is good for developers and users, and even better for database and cloud vendors (especially in the open-source space).
It is important to realize, though, that you often have to trade control and portability for ease of setup and use.
You have probably heard the term “lock-in” mentioned in the open-source community. Vendor lock-in refers to the situation you find yourself in when you start using a vendor and then end up stuck with them unless you go through a long and costly process to redevelop portions of your application (think about the song “Hotel California”: “you can check out any time you like, but you can never leave”). Oracle in the 1990s had a bad reputation for locking people in and then playing games with their licensing and costs. In fact, that negative experience was a driver for people in the enterprise to start exploring open-source databases as an alternative. If we are not careful, DBaaS could result in the highest levels of lock-in in over 20 years.
Open source is one of the most powerful tools users have against lock-in and shenanigans in the technology industry. This is particularly well represented in the database space.
For instance, if you want to run MySQL or PostgreSQL, you have options. You can choose how you run them, where you run them, and what sort of stack you run them with. You can deploy MySQL on Azure Compute today, and next week you could deploy the same MySQL on AWS EC2 and have a similar experience.
This level of portability is especially beneficial for those new applications designed as “cloud-native,” with the applications themselves gaining more portability through design. The rise of cloud-native applications running on containers via Kubernetes decouples the application from the infrastructure provider. Portability and the ability to run anywhere are becoming an expectation of cloud-native application design.
However, some of the value of this portability provided by both open source and application design can be offset or obscured by the growth in database-as-a-service offerings. Many cloud providers have DBaaS offerings that are similar to the open-source databases they are based on, but not exactly the same, which limits portability. Amazon Aurora, for example, has feature differences and compatibility issues with different versions of MySQL. They are close, but not 100% the same … hence Aurora’s claim to be “Open Source Compatible.” Each step further into a vendor’s MySQL or PostgreSQL DBaaS makes it a little harder to move to another vendor or even to your own managed version. This may or may not be an issue, depending on the level of portability your application requires. If you are comfortable committing to run on a single cloud provider, many of these DBaaS offerings can provide good value by speeding up management and deployment.
In addition to the technical differences, many database vendors are also now protecting their “As a Service” revenue streams by moving to non-open-source database licenses under the guise of preventing cloud providers from stealing business and monetizing their original work.
The Server Side Public License (SSPL) is a clear example of this. MongoDB created this license to exclude others from creating “as a service” offerings that compete with MongoDB (specifically their Atlas product). And MongoDB is not the only one. Elastic recently changed their licensing for the same reason. These moves limit user choices. The end result is that instead of open-source software, we now have single vendors providing automated managed databases for a fee.
The same database vendors are enhancing their DBaaS with exclusive features and options. Atlas, for instance, announced a new service, Atlas Online Archive, that allows less-used data to be stored in S3 or slower storage. This can provide a nice reduction in cost and help speed up performance by decluttering systems of unused data. Because it is a built-in feature, people should take advantage of it: it provides an easy, consistent, and seemingly seamless way to handle a longtime problem.
As mentioned, this is an example of trading “easy” for future portability and control.
Companies can do something similar by building an archiving routine or process into their applications, but there is currently no open-source equivalent to this feature. So, if you decide to stop paying for MongoDB Atlas, you lose access and will need to rewrite portions of your application to handle the now missing functionality. This is similar to other proprietary software (again, think Oracle in the ’90s).
A huge benefit of open source has been that if we stop seeing value in paying for a vendor or subscription, we have options to take our business elsewhere. If you are unhappy with your PostgreSQL provider, there are a dozen more to choose from.
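For a sense of what a home-grown archiving routine involves, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so the example is self-contained; it is not MongoDB, and it does not reproduce how Atlas Online Archive works. The events and events_archive tables, their columns, and the archive_old_rows function are all hypothetical names invented for this illustration.

```python
import sqlite3
from typing import Tuple


def archive_old_rows(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows created before `cutoff` (ISO date string) into the archive table.

    The copy and the delete run in a single transaction, so a failure
    part-way through leaves the live table untouched.
    Returns the number of rows removed from the live table.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO events_archive (id, created_at, payload) "
            "SELECT id, created_at, payload FROM events WHERE created_at < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM events WHERE created_at < ?", (cutoff,))
        return cur.rowcount


def demo() -> Tuple[int, int, int]:
    """Build a tiny in-memory database and archive everything before 2021."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
    conn.execute("CREATE TABLE events_archive (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
    conn.executemany(
        "INSERT INTO events (id, created_at, payload) VALUES (?, ?, ?)",
        [(1, "2019-03-01", "old"), (2, "2020-06-15", "old"), (3, "2021-09-30", "recent")],
    )
    conn.commit()
    moved = archive_old_rows(conn, "2021-01-01")
    live = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    archived = conn.execute("SELECT COUNT(*) FROM events_archive").fetchone()[0]
    conn.close()
    return moved, live, archived


if __name__ == "__main__":
    print(demo())
```

Even this small version shows where the ongoing work sits: the application now has to know about two locations for its data, schedule the routine, and decide how archived rows are queried. A managed feature hides exactly those concerns, which is why walking away from it later means rewriting code.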
As open-source (or formerly open-source) vendors move down the DBaaS route, more and more of their features, tools, and software will be available only via their DBaaS offering. This means that those users who want to deploy their own versions will gradually be left with older, less capable, and less feature-rich versions. There is also the risk of a slowdown in innovation as community contributions to DBaaS software are not really possible (you don’t have access to the full code).
Is DBaaS bad?
No! Not at all. DBaaS provides great benefits to users and vendors alike.
Any database created today will treat cloud-native and DBaaS offerings as its go-to deployment method. Users can get started easily, focus on innovation for their business, and get a consistent experience. These are all massive wins.
However, these wins do not negate the need to have expertise and knowledge available to advise on the best way to design and build the interface between the application and the database.
Ignoring these critical knowledge gaps brings risk (such as higher costs and potential outages and leaks). It is also important to understand that you might no longer have the portability you have become accustomed to.
In this new world, you are not only committing to a technology, but also a vendor. How the community evolves and the direction DBaaS takes will dictate which vendors are considered trustworthy, and which will be the next generation’s cautionary lock-in tale.