Most businesses rely on relational database management systems (RDBMS) for business insight, including continuous intelligence. Cloud relational databases bring more computing power to the table, so they can handle larger amounts of data. However, relational databases, even ones in the cloud, face two issues. First, they have a harder time with unstructured big data and its enormous memory demands. Second, their fixed-schema architecture makes it difficult to serve a high proportion of continuous intelligence workloads.
Gartner predicts that, by 2022, more than “half of major new business systems will incorporate continuous intelligence.” Continuous intelligence requires transforming big data into real-time analytics that business operations can use to prescribe actions. Yet many companies struggle to find a general-purpose database solution that responds quickly, handles vast amounts of data of all types, scales out across multiple computing instances, performs well, and remains consistent and governable.
The desire for a quicker, better-performing, and more flexible architecture led to the development of non-relational, or NoSQL, databases. NoSQL databases have fewer storage needs, handle big data better, and turn around ingested data fast. However, as a DATAVERSITY® Trends in Data Management 2021 report mentions, many businesses feel overwhelmed when trying to understand how to get business insight from a NoSQL database and its architecture.
Recently, DATAVERSITY spoke with Jai Karve, a Solutions Architect at MongoDB, to better understand non-relational database technology and how it is primed for continuous intelligence while closing gaps with RDBMS advantages to become more general-purpose.
Moving Data Quicker and Across More Machines
NoSQL technology came from a drive to “move data fast” and “scale-out well horizontally,” said Karve. Go back to 2008, when streaming media applications like Twitter and YouTube were growing more popular. Those applications amassed continuous data faster and faster, and the RDBMS suffered performance issues when trying to handle it all.
“So, some NoSQL developers looked at how to distribute humongous data sets across multiple machines. They wanted to accommodate rich JavaScript Object Notation (JSON) data structures designed to speed up requests and responses between computers while scaling out to include many networked computer instances. The result: NoSQL technology performing as a ready-made big data platform as a service, allowing developers to build out data applications.”
Eager to get feedback about this new kind of architecture, its creators released the non-relational database code as open source, so developers could try it out, modify it, submit issues, and suggest enhancements. As Karve observed, though, businesses continued to see the non-relational database as an anomaly.
“Companies continued to use their RDBMS for a system of record and look to a NoSQL database, like MongoDB, to build Application Programming Interfaces (APIs) and utilize JSON capabilities for performance and speed. So, the marketplace saw the NoSQL database as a solution for a niche use case, a caching layer to serve up stored data faster.”
While NoSQL databases aspired to more mainstream use, they lacked vital characteristics, including data validation and ACID-compliant transactions. ACID stands for atomic, consistent, isolated, and durable, a set of database properties ideal for workloads such as payments. A database system architected to meet ACID attributes preserves the integrity of every transaction, treating it as one set of operations and values that succeeds or fails as a whole. In contrast, a NoSQL database could let data change after the time of input and become consistent only eventually, without necessarily providing strong consistency guarantees.
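To make the "succeeds or fails as a whole" idea concrete, here is a minimal in-memory sketch in Python. It is not MongoDB code: the `transfer` function, the `InsufficientFunds` exception, and the account names are hypothetical, and the sketch illustrates only atomicity and a simple consistency rule (no negative balances), not isolation or durability.

```python
class InsufficientFunds(Exception):
    """Raised when a debit would take an account below zero."""


def transfer(accounts, source, target, amount):
    """Move `amount` from source to target as one all-or-nothing unit."""
    # Work on a private copy so a half-finished change is never visible.
    working = dict(accounts)
    working[source] -= amount
    if working[source] < 0:
        # Abort: the caller's data is left exactly as it was.
        raise InsufficientFunds(f"{source} cannot cover {amount}")
    working[target] += amount
    # Commit: publish the debit and the credit together.
    accounts.update(working)


accounts = {"alice": 100, "bob": 20}
transfer(accounts, "alice", "bob", 30)        # both sides change
try:
    transfer(accounts, "bob", "alice", 500)   # violates the rule, so nothing changes
except InsufficientFunds as error:
    print("rolled back:", error)
print(accounts)
```

The failed transfer leaves both balances untouched, which is the guarantee a payment system depends on.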
Some NoSQL technologies tried to achieve this consistency by locking data at the database level. “But then updating or writing to the database becomes cumbersome,” as Karve said, “discouraging NoSQL database use.” The NoSQL challenge became, “…providing the high availability and horizontal scalability strengths in NoSQL but closing the gap with desirable RDBMS features, like ACID transactions.”
A General-Purpose NoSQL Database with ACID Properties
In 2017, NoSQL database technologies evolved, retaining flexibility, speed, and performance while embedding ACID database properties. Karve cited one solution, the document database. Each document contains keys and values, customized to user specifications. The contents, number, and array of documents in a document database have few limits, making it ideal for big data.
Karve explained that MongoDB added ACID properties to this document Data Architecture. First, algorithms validate contents written to JSON documents. Think of this code as a way of maintaining Data Quality: it checks whether document contents meet the business rules and requirements, then keeps and locks the ones that do. This programming makes transactions atomic and consistent.
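The validation step described here can be pictured with a short Python sketch. The rules, field names, and the `validate_order` and `insert_if_valid` helpers are assumptions made up for illustration; MongoDB's own validation is configured through the database rather than written this way.

```python
def validate_order(doc):
    """Return a list of rule violations; an empty list means the document passes."""
    errors = []
    customer_id = doc.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        errors.append("customer_id must be a non-empty string")
    total = doc.get("total")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)) or total < 0:
        errors.append("total must be a non-negative number")
    if doc.get("status") not in {"new", "paid", "shipped"}:
        errors.append("status must be one of new, paid, shipped")
    return errors


def insert_if_valid(collection, doc):
    """Store the document only if it meets every business rule."""
    errors = validate_order(doc)
    if errors:
        raise ValueError("; ".join(errors))
    collection.append(doc)


orders = []
insert_if_valid(orders, {"customer_id": "c-17", "total": 42.5, "status": "paid"})
try:
    insert_if_valid(orders, {"customer_id": "", "total": -1, "status": "lost"})
except ValueError as error:
    print("rejected:", error)
print(len(orders))
```

Only the document that satisfies all three rules reaches the collection; the rejected one never becomes visible to readers.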
NoSQL database durability comes from a replica set. Each data cluster contains a primary node, which accepts database writes, and secondary nodes, which replicate those writes. When the primary node fails, one of the secondary nodes becomes primary. Data therefore endures server or network outages, because a new node steps up to be primary.
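The failover behavior can be sketched as a toy simulation in Python. The `Node` and `ReplicaSet` classes and the node names are hypothetical, replication is shown as immediate for simplicity, and the election rule (promote the healthy node with the most replicated writes) is an assumption standing in for the voting a real replica set performs.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.log = []       # writes applied on this node, in order
        self.alive = True


class ReplicaSet:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.primary = self.nodes[0]

    def elect_primary(self):
        """Promote the healthy node holding the most replicated writes."""
        candidates = [node for node in self.nodes if node.alive]
        if not candidates:
            raise RuntimeError("no healthy node available")
        self.primary = max(candidates, key=lambda node: len(node.log))

    def write(self, value):
        """Accept a write on the primary, then copy it to healthy secondaries."""
        if not self.primary.alive:
            self.elect_primary()
        self.primary.log.append(value)
        for node in self.nodes:
            if node is not self.primary and node.alive:
                node.log.append(value)


replica_set = ReplicaSet(["node-a", "node-b", "node-c"])
replica_set.write("order-1")
replica_set.primary.alive = False     # simulate an outage on the primary
replica_set.write("order-2")          # triggers failover, then succeeds
print(replica_set.primary.name, replica_set.primary.log)
```

The write made before the outage is still present on the new primary, which is the durability property the replica set is meant to provide.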
MongoDB architects an ACID database as a data cluster with a set of JSON documents, configured and checked by code. The business can then scale out to as many data clusters as needed across multiple locations, or flexibly decide how to do so.
A Continuous Intelligence Database Architecture
Understanding how to get continuous intelligence from the document database NoSQL architecture poses a challenge. Karve remarked:
“People get stuck with relational database baggage while trying to map their database model to a document database. They try to achieve normalization, organizing data to meet a schema based on relationships. Customers then have a terrible experience. To start, businesses benefit from making a paradigm shift. They need to think about data storage and access when modeling data. Keep like data to be accessed together. Think less about boxes and more about details of each data cluster.”
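The shift Karve describes, from normalized tables to keeping data that is accessed together in one place, can be shown with a small Python sketch. The customer, order, and item records are invented for illustration, and plain lists and dictionaries stand in for tables and documents.

```python
# Relational style: one order is split across three normalized tables.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [{"order_id": 10, "customer_id": 1}]
order_items = [
    {"order_id": 10, "sku": "A-100", "qty": 2},
    {"order_id": 10, "sku": "B-200", "qty": 1},
]


def load_order_relational(order_id):
    """Reassemble one order by joining the three tables."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = next(
        c for c in customers if c["customer_id"] == order["customer_id"]
    )
    items = [
        {"sku": item["sku"], "qty": item["qty"]}
        for item in order_items
        if item["order_id"] == order_id
    ]
    return {"order_id": order_id, "customer": {"name": customer["name"]}, "items": items}


# Document style: data that is read together is stored together.
order_documents = {
    10: {
        "order_id": 10,
        "customer": {"name": "Ada"},
        "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
    }
}


def load_order_document(order_id):
    """Fetch the whole order with a single lookup, no joins required."""
    return order_documents[order_id]


# Both models describe the same order; only the access path differs.
assert load_order_relational(10) == load_order_document(10)
```

The document version trades some duplication (the customer name is copied into each order) for a read path that needs one lookup instead of three.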
Recognizing that businesses can get stuck in an RDBMS perspective, MongoDB created The Modernization Toolkit with some partners. “This graphical interface helps business analysts map from RDBMS to data clusters, better understanding NoSQL continuous intelligence data modeling.”
For those who wish to keep their relational schema, vendors like Confluent integrate NoSQL and SQL technologies, connecting enterprise tools. The result is a “powerful platform allowing business analysts easy access to real-time event screening while transforming that continuous intelligence with SQL queries.” Examples also include Tableau and Power BI, which provide real-time interactive dashboards and reports that can be queried with SQL.
Governing Data Clusters
Real-time interactive dashboards provide intelligence only as good as the Data Quality within the underlying database systems. However, what happens when business requirements change? Karve commented:
“Some employees use an open source document database to get things done quickly. But then a multitude of different practices get inherited over time. The business does not know how to manage or administer that database once employees leave.”
He explained how MongoDB handles this kind of Data Governance problem through its cloud database service, known as Atlas. Think of Atlas as a control center for data clusters set up in the cloud. MongoDB does the “heavy-lifting of the NoSQL database structure” while the business has “the levers, knobs, and dials to set the data cluster’s parameters.” From there, an enterprise can fine-tune data performance and locations.
How does this apply to Data Governance? Knowing the data location means knowing which regulations apply to the data stored there. As governments enact different privacy standards, it becomes easier to update the data configurations to comply with new laws, or to move the data to another location where the regulation does not apply.
Combine this capability under an enterprise-wide Data Governance umbrella and get a powerful way to fine-tune and flexibly manage data policies and procedures.
Making Streaming Data More User Friendly with Extended Scalability
The future of NoSQL technology looks bright for continuous intelligence. First, NoSQL databases promise to be more user friendly through autonomous database functions. Karve explained, “MongoDB will detect user behavior and provide suggestions on database modeling and index creating.” Indexing retrieves search results faster.
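Why an index speeds up retrieval can be seen in a brief Python sketch. The sample documents and the `find_by_scan`, `build_index`, and `find_by_index` helpers are hypothetical, and a dictionary is used as the index for simplicity; production databases typically rely on tree-based structures instead.

```python
from collections import defaultdict

documents = [
    {"_id": 1, "city": "Austin", "name": "Ada"},
    {"_id": 2, "city": "Boston", "name": "Grace"},
    {"_id": 3, "city": "Austin", "name": "Linus"},
]


def find_by_scan(docs, field, value):
    """Without an index: inspect every document in the collection."""
    return [doc for doc in docs if doc.get(field) == value]


def build_index(docs, field):
    """Map each field value to the positions of the documents that hold it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc.get(field)].append(position)
    return index


def find_by_index(docs, index, value):
    """With an index: jump straight to the matching documents."""
    return [docs[position] for position in index.get(value, [])]


city_index = build_index(documents, "city")
# Both paths return the same documents; the indexed one skips the full scan.
assert find_by_scan(documents, "city", "Austin") == find_by_index(
    documents, city_index, "Austin"
)
```

The cost is that the index must be built and kept up to date on every write, which is why suggestions about which indexes to create are valuable.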
Second, MongoDB’s document database will extend its capability to handle more mobile devices and the Internet of Things (IoT). Karve said, “It will make the data clusters at the edge of the network better able to sync with the server.” The NoSQL technology will leverage 5G, a technology with better connection speeds, allowing more streaming data inputs.
Businesses need NoSQL technology to leverage continuous intelligence because of its flexibility, performance, and reliability. Furthermore, following the recent COVID-19 pandemic, businesses see embracing multi-cloud possibilities as a more resilient and elastic way to store data and prevent downtime. NoSQL databases scale out with performance across many clouds, extending how much continuous data you can capture and where you can put it. An RDBMS alone does not have the architecture to handle multi-cloud data and continuous intelligence.