Big data tall tales told by human story tellers

Lately there have been many articles enthusiastically promoting the positive possibilities of large investments in big data systems with associated analytics, predictions, and visualizations. Many of these articles discuss the labor shortage of data scientists. In some of these articles about what makes data scientists different from other disciplines, I…

Crowd data: big data free of business rules

In my last post, I described a difference in big data approaches between data warehouses (including those implemented with NoSQL databases) and forensic data tools such as SIEM (for IT systems). Data warehouses invest heavily in imposing business rules and constraints on data before admitting the data into the data store. The…

When data is the algorithm

In several recent posts, I attempted to present my concern that people, for their private gain at the expense of others, can game the machine-learning or predictive analysis algorithms currently hyped as part of the big data benefits. The mechanism for manipulating the algorithms operates outside the security perimeter of the IT supporting big data and that includes things…

Extraordinary Popular Delusion and madness of Crowd (data)

In earlier posts I suggested a better term for Big Data would be Crowd Data. The characteristics that distinguish big data from other data are similar to what distinguishes crowds from other aggregations of people. The challenges and risks are similar as well. Crowds can get unruly. In a recent post, I…

Three Vs is Crowd Data: fleeting information

I considered my last project to involve a lot of data, especially in the context of the volume of data processed at the time (over 10 years ago). Only recently did I hear of the popularity of the term Big Data as an emerging concept. My first reaction was to think that…

Deterioration of Trust in Data

In earlier posts I proposed a division of human inquiry in terms of the age of the data involved. My point was to emphasize a fundamental change in our approach to data when the data changes from present-tense operational data to historical data. I suggested the approach taken with present-tense operational data is best exemplified…

Bright Data make trains run on time

This video from a conference presents a demonstration of extensive instrumentation of an entire public infrastructure project (in this case trains). This data gives central controllers access to real-time information for virtually every aspect of the operation, from the mechanical operation of individual trains to the number of people waiting at stations. The…

Multiple versions of truth

Data warehouses provide a single source to share the same data throughout various parts of the organization. One of the promoted advantages of such a system is that it offers a single version of truth that all organizations can use for their analysis. This single source of truth makes it more likely to…