What really makes legacy news fake is the tyrannical influence of past narratives over which future observations we accept. Fake news is the need to keep old narratives relevant when such a narrative never would have emerged if we started from scratch with the data available at the current moment.
These examples set a new standard for rapid access to contextual information accompanying breaking news. In the case of street maps and aerial/street views, this information required extensive investment long before the event occurred. In the case of more recent information (street congestion, weather radar imagery, landslide risk assessments), prior investment was needed in the models and technologies that provide this information on a timely basis. These investments were made on a global scale, where the vast majority of this readily available capability may never be needed to match a breaking news story. But when a breaking news story does occur, we welcome the ready access to information specific to the broader context of the story.
Oral storytelling was the original big data. The various oral stories were saved in persistent memory and captured a large volume and variety. The invention and adoption of written works displaced the oral tradition, and that brought an end to that earlier big data. In this sense, our current excitement about big data may be a rediscovery of a capability available to our ancient ancestors. Big data and the oral storytelling tradition both offer inexpensive and durable means to manage a large number of distinct and very individualized stories. In the modern era, we are rediscovering the need to collect individual stories and thus granting them the ability to circulate as they did in preliterate societies of oral storytellers.
This week, NBC News anchor Brian Williams apologized for incorrectly recounting his experience on a helicopter flight during the early part of the war in Iraq. Controversy continues despite the apology, and even more controversy surrounds the actual wording of the apology. Perhaps not surprisingly for an event like…
My taxonomy of types of data (as summarized in this post) leaves out any mention of dirty data. The different types of data have different levels of trustworthiness: bright data is highly trusted, while model-generated data is a less trusted witness of reality. My descriptions of the dim data or model-forbidden data is not…
In modern data science projects with automated data collection and analytics, hypothesis discovery occurs at the beginning of the process. The modern decision maker participates at this early stage to select discovered hypotheses that are self-evidently persuasive. The subsequent data collection and analysis supporting the chosen hypothesis will lead to a simple decision that does not require any last-minute invention of a story to earn the decision maker's approval. After the decision, additional invented stories will serve only to illuminate the underlying non-fiction of the data and analysis.
As with computer-interpolated images that simulate a faster frame rate, the reader sees this manufactured information as part of the same story. The story becomes hyper-real. In movies, the faster frame rate gives the impression of a cheaper production more frequently associated with daily soap operas. In journalism, the injection of the author's opinions leaves the reader with the impression of reading a cheap novel. Neo-NeoCon sums this up nicely: "When I read Erdely's piece, it seemed to me that its style resembled a romance novel gone bad."