Ludwig Mies van der Rohe said, “Architecture starts when you carefully put two bricks together,” and Data Architecture begins when two or more characters are created, stored, and put together, whether as sets of records, emails, pictures, audio, or video. This echoes early thinking about Data Architecture as composed of things, the functionality of those things, and how those things interrelate. However, starting with “things” has limitations.
In the 1990s, Data Architecture required a technical expert, like a person in IT and/or an SQL guru, to create a new database and maintain or fix existing ones. Many enterprises experienced inefficiencies in this approach, including higher volumes of stored data, an increase in database variety, difficulties integrating newer technologies, and siloed data languishing in departments.
Companies recognized that Data Architecture was needed to leverage and manage data for a strategic advantage. Many organizations had collections of platforms, technologies, and tools strung together. This could be a good one-off for a specific integration project. But as companies grew, changed, and implemented new services, “data stringing” became a headache. Technological implementation was blocked or delayed by missing pieces of information needed to connect data.
Data Architecture looked like spaghetti, tangled and disconnected. While this kept many quality assurance staff and software testers employed untangling data and technical problems, it hurt companies and caused delays.
To demystify Data Architecture, start here: a successful Data Architecture concept needs the right technology for the right job, especially where organizations see data as a core asset that fuels digital business transformation and leads to better analysis and understanding.
As Donna Burbank of Global Data Strategy states, “The more complex things get, the simpler you need to make them.” Mies van der Rohe believed “Less is more.” This mantra also applies to Data Architecture.
Data Architecture Comes from a Data Strategy
Data Architecture explains how and what to create to meet defined outcomes—not how to use data to drive better business. A Data Strategy does this. According to Donna Burbank:
“It’s the opportunity to take your existing product line and market it better, develop it better, use it to improve customer service, or to get a 360-degree understanding of your customer. Data Strategy is driven by your organization’s overall business strategy and business model.”
Think about it: would it make sense to design or modify a building for a new business without knowing what services the organization provides, to whom it provides or markets them, how it is to be organized and managed, and its financial targets? No, it would be more efficient to hash out a business plan first.
Likewise, a company needs a high-level view of how to create and maintain data infrastructures to get the desired business impacts. This Data Strategy ideally results in a family of solutions, including Data Architecture, that address performance across all business functions and align with the business strategy. It is all about building the infrastructure to create those business impacts that are identified in the Data Strategy, says Anthony Algmin of Algmin Data Leadership. In turn, Data Architecture provides information that helps develop a good Data Strategy.
Data Architecture Requires Data Governance
While a Data Strategy aligns business needs and planning to guide what Data Architecture to create and how to create it, Data Strategy can be interpreted in different ways, depending on viewpoint. Each person could create their own Data Architecture variation to meet the Data Strategy and be correct, just as five people can see an elephant from their own unique, but limited, viewpoint. Data Governance needs to join data practices and processes together, formally, to ensure consistent, legal, and valuable data usage. Data Architecture guides only one component, the data technical infrastructure.
Data Architecture is as much a business decision as it is a technical one, since new business models and entirely new ways of working are driven by data and information. It must bridge with operational technologies, processes, people, and organizational culture. These concepts are interrelated. However, a data architect’s technical specifications capture these components only in a limited way. Data Governance provides the broader, overarching framework, including compliance, security, and roles and responsibilities.
Think of the difference this way: data architects create a lookup table. If workers enter customer data incorrectly upfront, this impacts downstream processes. Data Governance identifies this issue and looks for a solution. Alternatively, the data architect can propose a lookup table, created out of a governance process, that supports referential integrity by making it easier to enter customer data correctly. As the business grows and changes, this conversation continues between Data Architecture (as represented by the dotted arrow in the figure), people, and processes.
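The lookup-table idea can be made concrete with a small sketch. It assumes an in-memory SQLite database and hypothetical `country` and `customer` tables (neither comes from the article); the point is only that a foreign key to a lookup table rejects a mistyped value at entry time instead of letting it flow downstream.

```python
import sqlite3


def build_customer_db() -> sqlite3.Connection:
    """Create a hypothetical schema: a country lookup table plus customers."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """
        CREATE TABLE customer (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            country_code TEXT NOT NULL REFERENCES country(code)
        )
        """
    )
    # Illustrative lookup values only.
    conn.executemany(
        "INSERT INTO country (code, name) VALUES (?, ?)",
        [("US", "United States"), ("DE", "Germany")],
    )
    return conn


def add_customer(conn: sqlite3.Connection, name: str, country_code: str) -> bool:
    """Insert a customer; return False if the country code is not in the lookup."""
    try:
        conn.execute(
            "INSERT INTO customer (name, country_code) VALUES (?, ?)",
            (name, country_code),
        )
        return True
    except sqlite3.IntegrityError:
        # The foreign key rejected a value missing from the lookup table.
        return False


if __name__ == "__main__":
    db = build_customer_db()
    print(add_customer(db, "Ada", "US"))             # accepted
    print(add_customer(db, "Bob", "Untied States"))  # rejected
```

Which values belong in the lookup table, and who may change them, is the kind of decision a governance process would own; the table itself is only the technical mechanism.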
Data Architecture Connects with Data Modeling
Data Architecture, in its broadest sense, asks, “What are we trying to do as a business?” And then, from all the diverse technologies, “What’s the best fit for that purpose and how do they work together?”
It results in outcomes: models, definitions, and data flows on various levels, usually referred to as Data Architecture artifacts. It encompasses technology and infrastructure design, and financial decisions such as choosing whether to buy or build a data system. It also includes behaviors such as collaborations, mindsets, and skills among the various roles that affect the enterprise’s Data Architecture. Specifically, this means operating and developing a Data Architecture and understanding the data ecosystem; architecture staff rely on data flow diagrams, data models, and process models to do this.
Data Architecture bridges business strategy and technical execution by including specifications used to describe the existing state, define data requirements, guide data integration, and control data assets as put forth in a Data Strategy. It does not, however, focus on data relationships specifically. Data Modeling does this. It documents, defines, organizes, and shows how the data structures within a given database, architecture, application, or platform are connected, stored, accessed, and processed within the given system and between other systems.
Data models enable an organization to understand its data assets through core building blocks such as entities, relationships, and attributes. Data modelers map this information and share this with data architects, represented by the dotted line in the diagram. That way, the architects can make better decisions about placing the right mechanisms to support business outcomes. This could be anything from data systems and data warehouses to visualization tools.
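As a rough illustration of those building blocks, the sketch below represents entities, attributes, and relationships as plain Python dataclasses. All class names, the `Customer` and `Order` entities, and the `related_to` helper are hypothetical and chosen only for illustration; real data models are usually kept in dedicated modeling tools rather than in code like this.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    cardinality: str


@dataclass
class DataModel:
    entities: Dict[str, Entity] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.name] = entity

    def relate(self, parent: str, child: str, cardinality: str = "one-to-many") -> None:
        # Only allow relationships between entities the model already knows.
        for name in (parent, child):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relationships.append(Relationship(parent, child, cardinality))

    def related_to(self, name: str) -> List[str]:
        """List the entities directly connected to the named entity."""
        linked = []
        for rel in self.relationships:
            if rel.parent == name:
                linked.append(rel.child)
            elif rel.child == name:
                linked.append(rel.parent)
        return linked


if __name__ == "__main__":
    model = DataModel()
    model.add_entity(Entity("Customer", [Attribute("customer_id", "integer")]))
    model.add_entity(Entity("Order", [Attribute("order_id", "integer")]))
    model.relate("Customer", "Order")
    print(model.related_to("Customer"))
```

A structure like this is the sort of information a modeler could hand to an architect, who then decides which systems should store and serve each entity.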
Benefits Coming from Good Data Architecture
- Quicker Integration: Good Data Architecture provides almost seamless data integration, bringing scattered information together in a unified format so that stakeholders can get better business insights. Integration includes migration (moving data from one location to another); conversion (changing data into another form, state, or product); and connecting different systems into a converged architecture (combining all or parts of your application patterns into one to arrive at a single version of the truth, fostering creativity and spurring innovation).
- Better Use of Robust Tools and Platforms: Data architects have a wide array of technologies for delivering good data quickly and effectively, e.g., Cloud vs. Edge Computing. Good Data Architecture can sift through these, considering robustness of SQL, built-in optimization, on-the-fly elasticity, dynamic environment adaptations, trade-offs between computing and storage, and support for diverse data. The work put into creating and understanding Data Architecture helps make sense of all these.
- Easier Adaptation of Emerging Technologies: Data Architecture strategically prepares organizations to quickly evolve their products, services, and data to make efficient use of emerging technologies. Pressure is on for businesses to improve the performance of their applications and services. Architects already consider and work with newish technologies like the cloud, the Internet of Things (IoT), and NoSQL databases. Upcoming hot technologies include GraphQL (a query language for APIs that promises quicker data queries and richer customization by the user), machine learning, and artificial intelligence. But this new technology can’t be leveraged without a Data Architecture plan, directed by a Data Strategy and aligned with Data Governance.
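To illustrate the integration benefit listed above, here is a minimal sketch of conversion into a unified format. It assumes two hypothetical sources, a CRM export and a billing export, whose field names are invented for this example; records are renamed to shared field names and merged on a normalized email address.

```python
from typing import Dict, Iterable

# Hypothetical mappings from each source's field names to the unified names.
CRM_FIELDS = {"FullName": "name", "EmailAddress": "email"}
BILLING_FIELDS = {"customer_email": "email", "balance_due": "balance"}


def convert(record: Dict[str, object], mapping: Dict[str, str]) -> Dict[str, object]:
    """Rename a source record's fields to the unified format."""
    unified = {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }
    email = unified.get("email")
    if isinstance(email, str):
        # Normalize the key used to match records across systems.
        unified["email"] = email.strip().lower()
    return unified


def integrate(
    crm: Iterable[Dict[str, object]],
    billing: Iterable[Dict[str, object]],
) -> Dict[str, Dict[str, object]]:
    """Merge both sources into one record per customer, keyed by email."""
    merged: Dict[str, Dict[str, object]] = {}
    for records, mapping in ((crm, CRM_FIELDS), (billing, BILLING_FIELDS)):
        for record in records:
            row = convert(record, mapping)
            email = row.get("email")
            if email is None:
                continue  # Without a key, the record cannot be matched.
            merged.setdefault(email, {}).update(row)
    return merged


if __name__ == "__main__":
    crm_rows = [{"FullName": "Ada Lovelace", "EmailAddress": " Ada@Example.com "}]
    billing_rows = [{"customer_email": "ada@example.com", "balance_due": 12.5}]
    print(integrate(crm_rows, billing_rows))
```

Matching on email is a simplifying assumption; choosing the real matching key and resolving conflicts between sources are decisions the architecture and governance work would need to settle.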
Conclusion
William McKnight, the President of McKnight Consulting Group, said that Information Architecture plays a key role in establishing order in the continuous evolution of emerging data technologies. Already, in Preparing and Architecting for Machine Learning, Gartner noted the challenges of preparing data for ML pipelines when end-to-end Data and Analytic Architectures are not refined to interoperate with underlying analytic platforms.
In other words, complex, moving, unrelated pieces of Data Architecture, without a coherent means to leverage business data, will make it very difficult for companies to consider machine learning. Get the Data Architecture down to a solid base, where less is more. As Donna Burbank said, “people want to get to the new stuff, and they realize they can’t get there without building the foundation first.”
Image used under license from Shutterstock.com