Historical data shards divorced from the methods, post object-oriented strategy

The advantage of a schema-on-read strategy is that it separates the process of collecting data from the process of applying a schema to interpret the results. We can learn more easily that our prior knowledge was wrong when we keep that prior knowledge out of the data store.
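As a rough illustration of separating collection from interpretation, the sketch below stores records exactly as collected and applies an interpretation only when reading. The sensor records, field names, and both reader functions are hypothetical, invented only for this example. Because the assumption lives in the reader rather than in the stored data, a wrong assumption can be replaced without re-collecting anything.

```python
import json

# Raw records stored exactly as collected; no schema is applied at write time.
# (Hypothetical data, for illustration only.)
raw_store = [
    '{"sensor": "a1", "reading": "21.5", "unit": "C"}',
    '{"sensor": "a1", "reading": "70.7", "unit": "F"}',
]

def read_naive(line):
    """Earlier assumption: every reading is already in Celsius."""
    rec = json.loads(line)
    return float(rec["reading"])

def read_revised(line):
    """Revised assumption: Fahrenheit readings must be converted to Celsius."""
    rec = json.loads(line)
    value = float(rec["reading"])
    if rec["unit"] == "F":
        value = (value - 32.0) * 5.0 / 9.0
    return value

# The same stored records, interpreted under two different schemas.
naive = [read_naive(line) for line in raw_store]
revised = [read_revised(line) for line in raw_store]
```

Had the Celsius assumption been applied when the data was written, the unit field might never have been stored, and the mistake would be much harder to detect later.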

Exploiting sharding capabilities in cloud databases for better preservation of history

For the project of knowledge or hypothesis discovery, this sharding of history is more valuable than attempting a historical report from the operational database. The sharded history retains the context of the data. For a business example, assume a report for the previous period involves some action by an employee who has since been promoted to a different position. Using the operational database for this historical information will return the erroneous result that the new position was responsible for the prior action, when in fact the employee took that action in the capacity of the former position.
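The promoted-employee example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular database: the `Action` record, the `current_position` table, and both report functions are hypothetical names made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    employee_id: int
    period: str
    description: str
    position_at_time: str  # context captured when the action was recorded

# Operational table (hypothetical): holds only each employee's current position.
current_position = {101: "Regional Manager"}

# Historical shard (hypothetical): each record carries the position held then.
history_shard = [
    Action(101, "previous period", "approved discount", "Sales Associate"),
]

def report_from_operational(actions):
    """Looks up the position in today's operational table, losing context."""
    return [(a.description, current_position[a.employee_id]) for a in actions]

def report_from_history(actions):
    """Uses the position stored alongside the action itself."""
    return [(a.description, a.position_at_time) for a in actions]

operational_view = report_from_operational(history_shard)
historical_view = report_from_history(history_shard)
```

Here `operational_view` attributes the discount approval to the Regional Manager position, while `historical_view` attributes it to the Sales Associate position the employee actually held at the time.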

Perspective of real time analytics

The potential return from exploiting operational data will not justify the investment. This return is limited by the short time available to take advantage of an opportunity. The window is short because new operational data will keep presenting new opportunities that distract from the current one. Competitors and customers are also employing their own operational data intelligence, so they will quickly close any advantage gap. Unfortunately, this investment distracts the organization from historical data, which offers more durable knowledge discovery.

Deterioration of Trust in Data

In earlier posts I proposed a division of human inquiry in terms of the age of the data involved. My point was to emphasize a fundamental change in our approach to data when the data changes from present-tense operational data to historical data. I suggested the approach taken with present-tense operational data is best exemplified…