It’s time for the enterprise to seize the opportunity to build new applications and processes based on a synthesis of meaningful data brought together from diverse systems. Doing so requires some key things, though, starting with a database platform that supports seamlessly integrating the data and ensuring that it is understandable at the conceptual level. Enterprise NoSQL database provider MarkLogic has been pushing down that path by helping organizations integrate their data – structured and unstructured – in one place with its schema-agnostic data model for some time now. It’s possible to load data as is into the system and use a universal index to get at it.
“There’s been so much talk over the past years about Big Data. But in the enterprise space there’s often not even the opportunity to have Big Data, because data is broken into so many different systems, applications, and silos that they can’t bring it together,” says Joe Pasqua, MarkLogic EVP of Products. “We want to bring it together not just for data warehouse or analytics but to operate on.”
The MarkLogic database also has long had in place a semantic foundation. It acts not only as a document store for storing JSON and XML, but offers an integrated triple store for storing RDF triples that can be linked together to describe facts and relationships.
Now, MarkLogic is preparing to take things to the next level with MarkLogic 9, a platform for building next-generation applications that was previewed in early May 2016 and is due to ship by year’s end. The latest version, says Pasqua, is going to further drive the possibilities for a database to do smart things to help build next-gen applications, rather than just serve as a dumb repository of data. “If you’re not going to tell the database anything about the data, then there’s only a limited set of things the database has the opportunity to do,” he says.
Building up Database Smarts
MarkLogic 9 changes the equation, building on the semantic foundation it has in place to provide new capabilities such as Entity Services, which let developers give their data consistent meaning using a semantic model of the key concepts and the relationships between them.
It’s a way to provide a high-level concept of business entities vs. a detailed low-level description at the physical level, and to let databases do something for developers that they didn’t have a chance to do before. “We store that information, version it, and make it available to apps in a consistent way, but also to the database to get smart about things,” he says. This can lead to automatically creating REST APIs for sharing customer entities, product entities, supplier entities, and so on. “It’s important because of the world of increasing micro-services,” he says, in which complex applications are composed of small, independent processes communicating with each other via APIs. “You need an architecture that directly and natively supports that.”
Another new feature is the Optic API query mechanism. With a document-oriented NoSQL database, it’s natural to query information as documents, Pasqua says, but sometimes developers want to see tabular data. “This lets you look through a tabular lens and see data in tabular form and do rollups and aggregates on it,” he says. Alternatively, it lets users see data through a semantic lens or even see semantic data through a tabular lens. “It lets you have the most natural way of looking at data depending on what you are trying to achieve,” he says. Hidden from the developer are the underlying technologies: a new index and distributed execution across a cluster for fast and effective performance.
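To make the “tabular lens” idea concrete, here is a minimal Python sketch. It does not use the Optic API or any MarkLogic call; the order documents, field names, and helper functions are all hypothetical, and the point is only to show nested documents projected onto flat rows and then rolled up.

```python
from collections import defaultdict

# Hypothetical order documents, shaped the way a document store might hold them.
orders = [
    {"id": 1, "customer": {"name": "Acme", "region": "West"}, "total": 120.0},
    {"id": 2, "customer": {"name": "Globex", "region": "East"}, "total": 80.0},
    {"id": 3, "customer": {"name": "Acme", "region": "West"}, "total": 50.0},
]

def to_rows(docs):
    """Project each nested document onto a flat (region, customer, total) row."""
    return [
        (d["customer"]["region"], d["customer"]["name"], d["total"])
        for d in docs
    ]

def rollup_by_region(rows):
    """Aggregate over the row view: sum of order totals per region."""
    totals = defaultdict(float)
    for region, _customer, total in rows:
        totals[region] += total
    return dict(totals)

rows = to_rows(orders)
region_totals = rollup_by_region(rows)
print(region_totals)
```

In a real deployment the projection and aggregation would presumably run inside the database, across the cluster, rather than in application code as they do in this illustration.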
The SQL capabilities present in this enterprise NoSQL database are enhanced too, for integrating data from MarkLogic with existing SQL tools. “Folks have tools like Tableau that use ODBC [Open Database Connectivity] to get at data and we must provide a bridge so customers can use the tools they depend on and to get value out of MarkLogic at the same time,” he says.
Enterprises are in a transition period, he notes, and there’s got to be a connection so that people can get their jobs done today but also move forward. “Our big challenge to ourselves is how to give them better tools to deal with what they have got, but also to allow them to move onto the next generation,” says Pasqua.
Continuing Focus on Security
The last thing a CIO wants to see on the front pages of newspapers and websites is a headline screaming that his or her company’s data has been breached. That’s a big reason MarkLogic has always taken security seriously, Pasqua says, as is its use in sensitive government applications. In the security realm, it’s been distinguished as a Common Criteria-certified NoSQL database, for instance.
Today, the market generally is looking to focus more tightly on security for the enterprise. For MarkLogic 9, that means a few things, starting with adding advanced encryption capabilities to deal with outsider and insider threats.
Encryption technologies will reside in the core of the database so that, for instance, even system administrators with root access to the system won’t be able to see encrypted data.
“Advanced key management capabilities to keep things safe, along with fine-grained controls over what even administrators can do with the database from a security perspective, are very important to customers,” he says.
Redaction features are part of the picture, too. “The idea is that part of the goal of bringing data from different systems is to make that data valuable to more people, so you want to give them access but you may need to redact certain elements of the data depending on who uses it,” he says.
For example, a healthcare organization may want researchers to get their hands on data that can be highly valuable for researching disease treatments, but certainly they don’t want those researchers to have access to patients’ personally identifiable information (PII). With MarkLogic 9, the PII can either be removed or randomized. There are plenty of other scenarios where that capability has additional value in large enterprise environments, too. For instance, the best testing for QA environments happens when the data used reflects what’s really going on in the production system. But of course businesses don’t want real, sensitive data just floating around in those environments. “Redaction lets you take the data out of production, redact it and put it right back into the QA environment,” he says.
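The two redaction strategies mentioned here, removing a field or randomizing it, can be sketched in a few lines of Python. This is an illustration only, not MarkLogic’s redaction feature: the patient record, the field lists, and the `redact` helper are all assumptions made for the example.

```python
import copy
import random
import string

# Hypothetical policy: which fields to drop and which to replace with random values.
PII_REMOVE = {"ssn"}
PII_RANDOMIZE = {"name"}

def random_token(rng, length=8):
    """Build a random uppercase token to stand in for a real value."""
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))

def redact(record, rng):
    """Return a redacted copy of the record; the original is left untouched."""
    redacted = copy.deepcopy(record)
    for field in PII_REMOVE:
        redacted.pop(field, None)
    for field in PII_RANDOMIZE:
        if field in redacted:
            redacted[field] = random_token(rng)
    return redacted

patient = {"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis": "asthma"}
research_copy = redact(patient, random.Random(0))
print(sorted(research_copy))  # field names that survive redaction
```

The non-sensitive `diagnosis` field passes through unchanged, which is what makes the redacted copy still useful to a researcher or a QA team.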
MarkLogic 9 also is adding to the role-based security it has incorporated at the document level by adding the same at the element level. So, an individual document can have elements in it that are top secret, for example, and others that are merely secret. “Depending on the clearance level of the people querying information, they will only see the information they are allowed to see in a single document,” he says.
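A rough Python sketch of element-level filtering follows. It is not MarkLogic’s security model or API; the clearance ranking, the per-element `level` labels, and the `visible_view` function are hypothetical, chosen only to show one document yielding different views for different readers.

```python
# Hypothetical ordering of clearance levels, lowest to highest.
CLEARANCE_RANK = {"unclassified": 0, "secret": 1, "top-secret": 2}

# One document whose elements carry different classification levels.
document = {
    "title": {"value": "Field report", "level": "unclassified"},
    "summary": {"value": "Routine patrol", "level": "secret"},
    "sources": {"value": "Informant A", "level": "top-secret"},
}

def visible_view(doc, clearance):
    """Return only the elements at or below the reader's clearance level."""
    rank = CLEARANCE_RANK[clearance]
    return {
        name: element["value"]
        for name, element in doc.items()
        if CLEARANCE_RANK[element["level"]] <= rank
    }

print(visible_view(document, "secret"))
print(visible_view(document, "top-secret"))
```

A reader cleared for secret material sees the title and summary but not the sources, while a top-secret reader sees all three elements of the same document.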
MarkLogic has an advantage in that it doesn’t have to bolt on security to its solution as some other enterprise NoSQL products might, since the company always has had an enterprise focus. Generally speaking, Pasqua says, NoSQL started out being used in places and for tasks where security was not a paramount issue.
“The challenge is that once you build a system, it’s hard to go back and get security into it, versus building the fundamentals of it into it from the beginning,” he says. “Obviously it’s of paramount importance, though, and doing it right is the challenge.”
Manageability Matters and So Does the Cloud
Pasqua also points to manageability as a critical issue as more data comes together, systems get bigger, and replication expands across geographies. So MarkLogic has created an Ops Director single pane of glass to view an organization’s entire MarkLogic infrastructure and manage it uniformly. A Rolling Upgrade feature was created in the service of non-disruptive operations: a new version can be installed on one machine in a cluster while the application stays running elsewhere, and the installation then rolls through the cluster so that customers don’t have to experience downtime.
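The rolling pattern described above can be simulated in Python. This is a toy model under stated assumptions, not MarkLogic’s upgrade procedure: the three-node cluster, the version strings, and the `rolling_upgrade` function are invented for illustration, and the “install” is a single assignment.

```python
def rolling_upgrade(cluster, new_version):
    """Upgrade nodes one at a time; return the fewest nodes serving at any moment."""
    fewest_serving = len(cluster)
    for node in cluster:
        node["serving"] = False        # take only this node out of rotation
        serving_now = sum(n["serving"] for n in cluster)
        fewest_serving = min(fewest_serving, serving_now)
        node["version"] = new_version  # stand-in for the real install step
        node["serving"] = True         # return it to rotation before moving on
    return fewest_serving

# Hypothetical three-node cluster, all nodes on the old version and serving.
cluster = [{"name": f"node-{i}", "version": "8", "serving": True} for i in range(3)]
fewest = rolling_upgrade(cluster, "9")
print(fewest, [n["version"] for n in cluster])
```

Because only one node is ever out of rotation, the other two keep serving throughout, which is the property that lets the application stay up during the upgrade.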
The company also is mindful of the growing prominence of the Cloud in the enterprise: MarkLogic already is in the Amazon marketplace and runs on Microsoft Azure and Google Cloud. While some features in the latest release aren’t Cloud-specific, they are Cloud-relevant, he notes. For example, its encryption enhancements could help ease the concerns of customers about taking their data to the Cloud and having external service provider administrators supporting those systems, he says. Enhancements to MarkLogic 9’s tiered storage usage capabilities also “make it smarter about the way it uses storage tiers and how they are queried, and that makes it more effective for enterprises to use the cloud cost effectively,” he says.
MarkLogic already has begun an early access program for MarkLogic 9, which it will be expanding. “We like to do that because it lets customers give us feedback while in the development process,” says Pasqua, “and it’s good for them because they can start building next-generation apps with new features now.”