Temporal data are often treated lightly, and sometimes not at all, during data resource design and development. However, temporal data can be extremely important to an organization and must be considered in a well-designed data resource. The three aspects of temporal data that must be considered are briefly described below.*
Temporal means of or relating to time; of or relating to the sequence of time or to a particular time. Chronological means of, relating to, or arranged in or according to the order of time. Temporal data are any data that represent time in some form, and allow other data to be placed in a chronological sequence, or to be analyzed chronologically. The term temporal is used when referring to the chronological component of data resource design. The term time is used when referring to time as hours, minutes, seconds, and fractions of seconds.
Granularity is the coarseness or fineness of something. It’s the extent to which something is broken down into smaller parts. Granularity applies to temporal data ranging from astronomical time to chronons. The coarsest granularity of time is astronomical time, which is measured in millions or billions of years. It’s the time in our universe that started with the Big Bang some 13.7 billion years ago. It’s the time used by astronomers and theoretical physicists in their study of the birth and expansion of the universe, the birth and death of galaxies, and the life cycle of solar systems.
Geologic time is the next coarsest granularity of time and is used to measure time with respect to the Earth. The Geologic Time Scale consists of Geologic Eras, Geologic Periods, Geologic Epochs, and Geologic Series. It’s the time used by geologists studying the evolution of the Earth.
Calendar time is the next most granular form of time. It is typically based on the Gregorian Calendar, which is based on the equinoctial year. However, other calendars exist, such as the Julian and Chinese calendars, and other years exist, such as sidereal and anomalistic years. Calendar time has four components: century, year, month, and day. Most organizations use calendar time for their business activities.
The most granular form of time is clock time, which is typically based on the Gregorian Calendar and a 24-hour day. Clock time has four components: hours, minutes, seconds, and chronons. A chronon is a clock tick to the precision that is relevant to the organization. Normal business activities may not use clock ticks, but particle physics is interested in very small fractions of a second. For example, the Large Hadron Collider at CERN collides bunches of protons roughly every 25 billionths of a second and the resulting particles last for about a trillionth of a second.
Temporal granularity is the degree of granularity of time that ranges from astronomical time to chronons. Temporal relevance is the smallest unit of temporal granularity that is acceptable or relevant to an organization. Geological studies have a temporal relevance of geologic time, business has a temporal relevance of calendar time, and particle physics has a temporal relevance of trillionths of a second. Each organization determines its own temporal relevance and uses that temporal relevance in its data resource design.
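As a minimal sketch of temporal relevance, the Python snippet below drops every time component finer than the granularity an organization has chosen. The function name truncate_to_relevance and the list of relevance levels are hypothetical illustrations, not terms from the source, and the sketch only covers calendar time and clock time.

```python
from datetime import datetime

# Hypothetical relevance levels, ordered from coarse to fine.
RELEVANCE_LEVELS = ("year", "month", "day", "hour", "minute", "second", "microsecond")

# Smallest legal value for each component, used for the discarded ones.
FLOOR = {"year": 1, "month": 1, "day": 1,
         "hour": 0, "minute": 0, "second": 0, "microsecond": 0}

def truncate_to_relevance(moment: datetime, relevance: str) -> datetime:
    """Drop every time component finer than the chosen temporal relevance."""
    if relevance not in RELEVANCE_LEVELS:
        raise ValueError(f"unknown temporal relevance: {relevance}")
    keep = RELEVANCE_LEVELS.index(relevance)
    parts = {
        name: (getattr(moment, name) if position <= keep else FLOOR[name])
        for position, name in enumerate(RELEVANCE_LEVELS)
    }
    return datetime(**parts, tzinfo=moment.tzinfo)
```

An organization whose temporal relevance is the calendar day would call truncate_to_relevance(moment, "day"), while one that needs clock time to the second would pass "second".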
The first aspect of temporal data is tracking the states of a business event from its happening to its availability in the organization’s data resource. It is commonly referred to as bi-temporal data. However, up to five different states could exist in the pathway from the event’s happening to its availability in the data resource.
The business change state is the point in time that the business change actually happened in the business world, such as a vehicle collision. The organization notification state is the point in time that the business change was reported, such as a driver sending a vehicle collision report to their insurance company. The organization receipt state is the point in time that the change was first received by the organization, such as the receipt of the vehicle collision report. The change entry state is the point in time that the change was entered into the organization’s data resource. The change availability state is the point in time that the change entered into the data resource was actually available to applications and queries.
The term bi-temporal data is often used for tracking the business change state and the change entry state. From a database perspective, these appear to be the only two states that are important. However, from a business perspective, tracking only the business change state and change entry state could have serious legal and financial implications for the organization. Therefore, all five states must be considered when designing a data resource.
The terms tri-temporal data, quadri-temporal data, and quinti-temporal data are useful for many organizations, based on their business needs. Tri-temporal data track the business change state, the organization receipt state, and the change entry state. Quadri-temporal data track the business change state, the organization receipt state, the change entry state, and the change availability state. Quinti-temporal data track all five states. Each organization must choose the states it needs to track in its data resource.
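The five states can be sketched as a simple record, shown below in Python. The class name ChangeStates, its field names, and its two helper methods are hypothetical. The ordering check assumes that the tracked states occur in pathway order, from the business change to availability, which the source implies but does not state as a rule.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Optional

@dataclass
class ChangeStates:
    """One point in time per state; only the business change is required here."""
    business_change: datetime
    organization_notification: Optional[datetime] = None
    organization_receipt: Optional[datetime] = None
    change_entry: Optional[datetime] = None
    change_availability: Optional[datetime] = None

    def tracked(self) -> list:
        """Return the recorded states in pathway order, skipping untracked ones."""
        values = (getattr(self, f.name) for f in fields(self))
        return [value for value in values if value is not None]

    def is_in_order(self) -> bool:
        """Assumption: each tracked state is no earlier than the one before it."""
        states = self.tracked()
        return all(a <= b for a, b in zip(states, states[1:]))
```

A record holding the business change, organization receipt, and change entry states corresponds to tri-temporal data as described above; filling all five fields corresponds to quinti-temporal data.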
The second aspect of temporal data is the tracking of entities through their life cycle, which is commonly referred to as multi-temporal data. However, the term multi-temporal data is often confused with bi-temporal data, tri-temporal data, quadri-temporal data, and quinti-temporal data. A clearer term is longitudinal data. Longitudinal means running lengthwise; dealing with the growth and change of an individual or group over a period of years. Longitudinal data are data that track business objects and business events over time according to an organization’s temporal relevance.
For example, geologists track continental drift on the Earth. A person’s growth and health are tracked over their lifetime. Students are tracked longitudinally during their time in the public school system. Particles are tracked over billionths or trillionths of a second. Longitudinal data have a wide variety of uses as organizations move into analytics and business intelligence.
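A minimal Python sketch of longitudinal data follows. It groups observations by the business object they describe and orders each history chronologically. The function name build_longitudinal_record and the (entity_id, observed_on, value) tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date

def build_longitudinal_record(observations):
    """Group (entity_id, observed_on, value) tuples by entity, oldest first."""
    record = defaultdict(list)
    for entity_id, observed_on, value in observations:
        record[entity_id].append((observed_on, value))
    for history in record.values():
        # Order each entity's observations chronologically.
        history.sort(key=lambda item: item[0])
    return dict(record)
```

For students tracked through a school system, each entity_id would identify one student and each value might hold the grade level observed on that date.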
The third aspect of temporal data is navigation through the data resource based on time components, which is commonly referred to as time relational data. Time relational data are any data entities that are connected by time. Time relational data use navigation based on time ranges, rather than navigation based on primary key and foreign key values. Referential integrity is not enforced for time relational data, because the connection between a subordinate data occurrence and a parent data occurrence is based on a specific time or a time range, not on the values in primary keys and foreign keys.
Referential integrity is the situation where the value of a foreign key in a subordinate data entity must have a matching value in a primary key in a parent data entity. A data occurrence cannot be added in a subordinate data entity without a corresponding parent data occurrence in the parent data entity. Similarly, a data occurrence in a parent data entity cannot be deleted while subordinate data occurrences still exist in a subordinate data entity. Referential integrity ensures that data relations remain viable.
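The two referential integrity rules above can be sketched with Python's built-in sqlite3 module. The person and address tables are hypothetical, and the sketch assumes the local SQLite build supports foreign key enforcement, which must be switched on with PRAGMA foreign_keys.

```python
import sqlite3

def demo_referential_integrity():
    """Try an orphan insert and a parent delete; report whether each is rejected."""
    # isolation_level=None puts the connection in autocommit mode.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE address ("
        " address_id INTEGER PRIMARY KEY,"
        " person_id INTEGER NOT NULL REFERENCES person(person_id))"
    )
    outcomes = {}
    try:
        # Subordinate occurrence with no matching parent occurrence.
        conn.execute("INSERT INTO address VALUES (1, 99)")
        outcomes["orphan_insert"] = "accepted"
    except sqlite3.IntegrityError:
        outcomes["orphan_insert"] = "rejected"
    conn.execute("INSERT INTO person VALUES (1)")
    conn.execute("INSERT INTO address VALUES (1, 1)")
    try:
        # Parent occurrence that still has a subordinate occurrence.
        conn.execute("DELETE FROM person WHERE person_id = 1")
        outcomes["parent_delete"] = "accepted"
    except sqlite3.IntegrityError:
        outcomes["parent_delete"] = "rejected"
    conn.close()
    return outcomes
```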
Temporal integrity is the situation where temporal data attributes must exist in a parent data entity and a subordinate data entity to allow temporal navigation, although those data attributes may not exist in primary or foreign keys. Referential integrity is an existence dependency based on values in primary keys and foreign keys, and temporal integrity is a temporal dependency based on temporal ranges.
Temporal navigation is the technique for navigating between data entities based on the temporal data values. A specific parent data occurrence is not known for time relational data. However, temporal data attributes must be present to support the navigation. A temporal relation is an association between data occurrences in different data entities based on time ranges. It provides the capability for temporal navigation between data entities. It is different from a data relation because it does not depend on fixed values in primary keys and foreign keys. Temporal normalization is the technique that ensures the existence of temporal data characteristics in parent and subordinate data subjects to support temporal navigation. It ensures temporal dependency.
For example, a question might arise whether a person voted in the proper precinct based on their address. Since no direct link exists between the date of the election and the effective dates of the addresses, a data relation doesn’t work. A search needs to be made of the effective date ranges for the addresses based on a temporal relation to determine which address was active at the time of the election.
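The precinct question can be sketched in Python as a search over effective date ranges. The names AddressPeriod, address_in_effect, and voted_in_proper_precinct are hypothetical, and the sketch assumes that both ends of a range are inclusive and that an open-ended range (no end date) is still in effect.

```python
from datetime import date
from typing import NamedTuple, Optional

class AddressPeriod(NamedTuple):
    precinct: str
    effective_from: date
    effective_to: Optional[date]  # None means the address is still in effect

def address_in_effect(periods, on_date):
    """Temporal navigation: find the address whose range covers on_date."""
    for period in periods:
        started = period.effective_from <= on_date
        not_ended = period.effective_to is None or on_date <= period.effective_to
        if started and not_ended:
            return period
    return None  # no address was in effect on that date

def voted_in_proper_precinct(periods, election_date, precinct_voted):
    """Compare the precinct voted in with the precinct of the address in effect."""
    period = address_in_effect(periods, election_date)
    return period is not None and period.precinct == precinct_voted
```

No key value links the election to an address here. The match comes entirely from comparing the election date with each address's effective date range, which is what distinguishes a temporal relation from a data relation.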
Any data management professional who is designing or developing an organization’s data resource must be aware of the three aspects of temporal data, and use temporal data appropriately. They must look at temporal data from a business perspective and not just a database perspective. They must use proper terms and seek to resolve the lexical challenge in data resource management.
_______
Adapted from: Brackett, Michael. Data Resource Design: Reality Beyond Illusion. Technics Publications. 2012.