In the buzzing world of data architectures, one term seems to unite some previously contending buzzy paradigms. That term is “knowledge graphs.” In this post, we will dive into the scope of knowledge graphs, which is maturing as we speak. First, let us look back. “Knowledge graph” is not a new term; see for yourself […]
Connecting the Three Spheres of Data Management to Unlock Value
Many organizations have mapped out the systems and applications of their data landscape. Many have documented their most critical business processes. Many have modeled their data domains and key attributes. But only very few have succeeded in connecting the knowledge of these three efforts. The remainder of this point of view will explain why connecting […]
2023: Mitigating Data Debt by Knowing or by Guessing?
One of the newer data buzzwords is “data debt.” Actually, it is approximately 10 years old, and it became popular once agile people realized that postponing things creates not only technical debt but also data debt. Will we, in 2023, be better at not creating so much data debt, and will it be […]
It’s All About Relations!
The new ISO 39075 Graph Query Language Standard is to hit the data streets in late 2023 (?). Then what? If graph databases are standardized pretty soon, what will happen to SQL? SQL databases will very likely stay around for a long time. Not simply because legacy SQL has tremendous inertia, but because relational database paradigms […]
The Data Engineer’s Roadmap
Data engineering is a fascinating and fulfilling career – you are at the helm of every business operation that requires data, and as long as users generate data, businesses will always need data engineers. In other words, job security is guaranteed. But, with such great power comes great responsibility. The journey to becoming a successful data engineer […]
A Primer to Optimizing Your Apache Cassandra Compaction Strategy
When setting up an Apache Cassandra table schema and anticipating how you’ll use the table, it’s a best practice to simultaneously formulate a thoughtful compaction strategy. While a Cassandra table’s compaction strategy can be adjusted after its creation, doing so invites costly cluster performance penalties because Cassandra will need to rewrite all of that table’s data. Taking […]
Say Hello to Graph Normal Form (GNF)
You thought you knew all normal forms? (And possibly also some abnormal …) Well, think again: There is also “graph normal form (GNF).” The diagram below is a fifth normal form graph concept model, which is just a few steps from GNF, so hang on. Where GNF comes from: GNF is based on serious mathematics, […]
Data Modeling Techniques and Best Practices
Data models play an integral role in the development of effective data architecture for modern businesses. They are key to the conceptualization, planning, and building of an integrated data repository that drives advanced analytics and BI. In this blog post, we’ll provide you with an overview of the most popular data modeling techniques and best practices to […]
The Rise of the Semantic Layer
Cloud giants like Google and Snowflake, unicorns like dbt Labs, and a host of venture-backed startups are now talking about a critical new layer in the data and analytics stack. Some call it a “metrics layer,” or a “metrics hub” or “headless BI,” but most call it a “semantic layer.” I prefer to call it a “semantic layer” because it best describes a business-friendly interface […]
A Beginner’s Guide to Data Modeling and Analytics
As more and more companies start to use data-related applications to manage their huge assets of data, the concepts of data modeling and analytics are becoming increasingly important. While they typically rely on each other, they are two very distinct concepts. Companies use data analysis to clean, transform, and model their sets of data, whereas they […]