I previously wrote some of my thoughts on a data-science perspective of Rolling Stone’s controversial publication of a rape charge. I concluded that although the story turned out to fail to meet traditional journalistic standards, the story itself was a valuable contribution to a big data store. The value of the story is that it tells us something about the genesis of rumors and gossip that can have a substantial influence on broader social trends and politics.
The traditional view of journalism values practices that check facts with attempts to falsify accounts of a story. This tradition grew out of an earlier era when publications were far more expensive and durable. In the age when print media constrained journalism, an article needed to be of very high value to justify the costs and risks of publication. Once distributed, a printed story is inaccessible to any type of update, even the simple addition of a flag to indicate future published corrections or contradictory reporting. In such an age the rigorous journalistic practices were essential to protect the reputation value of the publication. Someone can always retrieve an older published issue and highlight the errors in such a way as to tarnish the publication’s reputation. In defense, a publication can point to more recent corrections, but such corrections will not reach all readers of the original publication.
With modern communications channels through the Internet, there is less of a reputation cost for publishing poorly researched articles. Electronic publication through web-sites allows the publisher to retain access to the authoritative publication, the version that resides on the web-site. On that web-site, the publisher can update a story with new information or at least add hyperlinks to other stories (local or remote) that expand upon the original story. Internet communications break down the earlier cost and durability barriers of print media. Today’s journalism can afford to be less diligent in getting a story out.
In addition, today’s environment places a higher premium on getting stories out the fastest. The professional journalist is competing with amateur journalists to obtain page views on a story. A well researched story will reach only a very small audience once the basic sense of the story is already old news. The journalist needs to act quickly.
Another prized skill in traditional journalism is the ability to write engaging prose. People may be drawn to a story by basic curiosity, but they are unlikely to finish the story (and absorb all the nuanced details from research) unless the prose is well written. The long-form journalism prizes a style of writing that borrows heavily from 19th century fiction, as I discussed in my post that compared modern non-fiction with older forms of fiction. The modern long-form journalism takes fictional liberties to make the story more interesting. Our ancestors would have called it fiction, but they expected their fiction to be grounded in researched facts. What they called fiction became prototypes of what we today call non-fiction.
The point of engaging prose is to provide entertainment to the audience to encourage them to return to buy more content in the future. This was an explicit economy in print-media involving purchases or subscriptions of paper-copies. Today that economy has transformed into advertisement revenue proportional to page views. It is not clear that the modern economic model can sustain journalism. Many journalistic outlets’ web-sites are attempting some way of restricting content to paid subscribers.
Modern journalism distorts the perception of reality into a perception of hyper-realism. For example, the Rolling Stone article on rape at the University of Virginia evokes an element of the super-natural in the exaggerated circumstances of the rape.
After other journalists discredited the original premise, the continued defense of the article is a defense of the hyper-realism or the super-natural that emerges from the article. In other words, modern journalism stories begin to appear biblical in nature. Some of the defenses of the continued truth of the article appear to come from a following inspired by a deep and sincere belief. Consumers of modern journalism prize the engaging prose that exposes a deeper truth through the artful use of fables. The modern journalism market welcomes new books to some modern bible.
This tendency toward supernatural or hyper-realistic journalism is a consequence of the clash of traditional journalism with modern competition from amateur journalists including everyone with a camera-equipped smart phone. The journalists lost their monopolies on fact collection and on publication media. As a result, they cling to their last remaining advantage of skilled prose writing. The result is more hyper-realism, more supernatural, more fictionally-decorated journalism.
In my opinion, modern society is squandering its journalistic talents on trying to write the newer testament. I argued that we should instead take advantage of their investigative skills to collect more first person stories. Instead of arranging and editing their research into marketable published articles, we should pay the journalists directly for their raw research notes from their hard work of investigation. We should pay them for their raw notes as-is and insert them into our big data stores. Over time, the big data stores will accumulate stories from the relevant population including the majority who are naturally reluctant to volunteer their stories directly.
Unfortunately, there is not much evidence that this is happening. All of the current investment into big data involves technologies to exploit the preexisting volunteered data. Volunteered data is very biased. Our failure to invest in active investigations to uncover reluctantly-shared data is going to result in an eventual catastrophe.
The current state of journalism appears to be an adulterated version of traditional journalism that is trying to survive in the modern economy. This journalism is finding profit in selling hyper-realistic or super-natural tales inspired by gossip, rumors, or sources who cannot be named. The journalist fills in the missing data with plausible fiction to make it into something that can be published.
In my mind, I place journalism along a line of human fact finding. At one end of the line is some mystery and at the other end is the proof beyond a shadow of doubt test for criminal justice. Moving from mystery, first we encounter rumors or gossip, next we encounter journalism that ranges from simple reporting (something happened) to in-depth investigative reporting with fact checking.
Although there is more appeal to journalism when it uncovers a villain or criminal, journalism is valuable when it finds a good answer or a non-culpable actor. Moving beyond journalism, we encounter the justice system but only when there is a prosecution of an accused. We don’t employ the preponderance of evidence or beyond a shadow of doubt test for investigations of findings that don’t involve a culpable act. Most journalism does not lead to engagement of the justice system with civil or criminal courts. The story still stands as a finished journalism product that has market value for readers.
Journalism is a form of history. Most of the subject matter of journalism involves uncovering what happened in the past. However, the past is recent enough that there are living people to interview and otherwise volatile evidence still available to uncover. The journalist’s task is to obtain this information to provide relevant information about current events or controversies, usually for the purpose of informing an audience who will want to use that information for making up their own minds about the current situation.
Returning to the now retracted Rolling Stone Rape Story, I still disagree that the story should have been retracted. I agree fully that the story should not have been published in the accusatory and authoritative tone it used. But once it was published, it should remain available and traceable to the original source: posted on the Rolling Stone website. I strongly disapprove of the idea that something can be unpublished. The article had a direct and significant impact on culture. The publication of the story became a news story. What happened since publication makes less sense when we don’t have access to the original published article.
While I agree the published version was inappropriate, I think the same field notes could have been published following a different narrative that emphasized the intended bigger story of a college student’s experiences on a campus. The story could have emphasized the limited supporting evidence for the claims so as to avoid the unfortunate and unjust consequences that resulted from the actual story. A talented journalistic narrator should have been able to construct at least a mildly entertaining non-accusatory gossip story.
Even after later investigations fully discredited most of the claims in the original article, there were many who claimed that the story still spoke to a broader truth. These defenders say that the story still has value in illustrating the problems they feel are prevalent on campuses. I agree they have a good point. The story is a fable for how many people perceive what they call the rape culture. The fable describes both the male obedient cooperation necessary for a gang rape and the indifference of the administration.
The story is a fable. It does not fairly describe the specifics of anything that happened on that particular campus. A fable may use real places and real people as long as the narrator is artful enough to avoid making damaging and malicious accusations. I would not attempt to write a fable around a factual place and time, but I can imagine that a talented writer may get away with it. The reporter’s research presented in a fable form could serve the purpose of illustrating the larger story of the extent and severity of the rape-culture people fear is present on campuses.
Even if understood only as a documentation of people’s impressions, the information could be very valuable in continued public discussions, debates, and policy-making. The fable story could have provided the contribution of a shared story we can use to discuss the various dimensions of the problem called rape culture. It is for this reason I regret its retraction.
On this blog, I try to see things from a data science point of view. For any current discussion or debate, my first reaction is to ask where we can obtain even more data. There is never enough data. In recent posts (example), I described my fear of missing data. In general, I suspect a large number of modern problems and failed policies are a result of failing to recognize the extent of our ignorance. I argued that really good leaders are the ones who are most alert to justified fears and doubts. In contrast to ignorant fears and doubts, a great leader’s fears and doubts are based on the experience of recognizing the unseen missing data that, if known, would change the data-driven recommendations or conclusions.
Counter-intuitively, the missing data problem is more dangerous today than in earlier times. The damage is occurring because we are overwhelmed with the volume, velocity, and variety of big data. The volume and velocity of data are forcing us to make decisions at faster rates and giving us a confidence that we have enough data to make even bolder decisions. The problem is that despite the volume, the velocity, and the variety, most of the relevant data is still missing. If we are honest with ourselves, we will admit that most of our decisions are based on dark data: our filling in missing data with our presumptions of how the world ought to behave given what little data we do know.
For example, last year there was much controversy about NSA’s large-scale collection of communication information spanning the domestic population. In that controversy, there was little attention paid to the fact that despite the size of their program they were only capturing a small subset of all possible communications. Even within the intended scope of data collection for their project, most of the data was missing. If I assume they could achieve their goal of comprehensive data collection, I am sure that data would be insufficient to meet the goals of their mission. For the foreseeable future, there is too much data that will remain inherently missing.
It is bad enough when missing data is the result of insufficient sensor technologies or insufficient deployments. It is worse to deliberately discard information, such as what happened with the Rolling Stone Rape article. I’m just stating my opinion. In my last job, I described my attitude toward data as like the saying “beggars can’t be choosers”: I eagerly hoarded anything that arrived. In my validation or data-cleansing actions, I would flag data as untrustworthy, but I would not throw it away. I later found a use for the untrustworthy data as examples of what the dismissed data was like.
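The flag-but-keep habit described above can be sketched in a few lines of Python. This is only a minimal illustration under assumed details: the `Record` type, the range check, and the sensor names are all hypothetical, invented here and not taken from any real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """One arriving observation (hypothetical shape, for illustration only)."""
    source: str
    value: float
    flags: List[str] = field(default_factory=list)


def validate(records: List[Record], low: float, high: float) -> List[Record]:
    """Flag out-of-range values as untrustworthy. Nothing is ever removed."""
    for record in records:
        if not (low <= record.value <= high):
            record.flags.append("untrustworthy:out_of_range")
    return records


def trusted(records: List[Record]) -> List[Record]:
    """View of the records that passed validation."""
    return [r for r in records if not r.flags]


def dismissed(records: List[Record]) -> List[Record]:
    """View of the flagged records, kept as examples of what was set aside."""
    return [r for r in records if r.flags]


# Assumed sample data: one reading falls outside the plausible range.
records = validate(
    [Record("sensor-a", 21.5), Record("sensor-b", 940.0), Record("sensor-c", 19.0)],
    low=-40.0,
    high=60.0,
)
```

The design choice is that validation only annotates. Analyses that want clean data ask for `trusted(records)`, while `dismissed(records)` stays available for anyone who later wants to see what the rejected data looked like.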
I will never be satisfied we have enough information. I’ve learned through personal experience to fear the missing data.
For this particular post, I wanted to tie the above discussion about fabulist journalism as a source of data with a way that I interpret the stories in the protestant christian bible consisting of the old and new testament in the tradition of the King James version. In particular, I want to focus on the stories of specific peoples and events. These stories have a fable dimension (like the Rolling Stone Article) and involve supernatural or hyper-realistic details such as interaction with a divine voice or apparition. In the old testament we get single accounts of many different stories. In the new testament, at least for the story of the later period of Jesus’ earthly life, we have multiple accounts of this story.
The point I wanted to make here is that one way I read the stories of the bible is to read them as very primitive forms of journalism. I tend to believe that most of the stories were inspired by some historical event. The initial goal of the story tellers was to pass these stories down through the generations so that later generations would not forget what had happened before. Their basic intention was a prototype of modern journalism. The basic goal was to capture and preserve what happened at a particular time and place. The journalistic skills were obviously under-developed at the time, but the intention was journalism.
I imagine that in the earlier time, the concept of journalism was a major innovation. My imagining of the context is of peoples who were used to thinking of the world as static where lives played out according to arbitrary present-day whims of various gods. The emergence of a record of a very different past, and of a sequence where later events were very much dependent on previous events, provided a new way to look at the world. The revelation would have been a journalistic one: that the present day conditions were largely a result of what happened in the past instead of being the result of the present mood of one of the many possible gods. The people of the time found this view of explaining their conditions based on past events useful. Instead of calling this view history or journalism, they called it God.
I take the view that God is a synonym for history. God is all present, all knowing, and all powerful because He accounts for everything that ever happened. That’s history. Part of history is journalism, the discovery of new stories based on evidence and understanding available at the time.
The complaint about fables is that belief in the supernatural or hyper-realistic aspects of the fables can get us in trouble. Certainly, this is a common atheist or agnostic complaint about biblical fables. However, I think we see the same trouble resulting from modern journalism as we’ve seen recently with the Rolling Stone Rape article and the Ferguson police-shooting story. Modern journalism is just as prone to fables as ancient biblical stories, and just as dangerous.
As I argued above, I still welcome fables as data. The fables represent data that can fill in the missing data. In my above discussion, I defended the Rolling Stone Rape story as illustrating the perception many have of the rape culture problem. For people like myself who haven’t been on campus for decades, this is a useful insight. Whether or not the rape culture is true, it is clear that many perceive the problem as consistent with what was illustrated in that story.
The problem with fables is not that they exist. The problem is that we don’t have enough of them. In the case of the modern journalism fables, we required a considerable effort to uncover the suppressed stories that eventually discredited the initial stories. I argue that a better approach than excluding stories with unsupported claims is to include as many such stories as possible. The stories, or field notes, would supply a data store instead of a published periodical. In traditional journalism, the journalist will collect many stories in his private notes and then assemble the competing accounts into a (hopefully) well-researched article. The only information available to the public is the final article: the private notes will remain private and may eventually be lost forever. My preference is that the journalist would sell his field notes for direct recording into large data stores instead of selling narratives. This postpones the narrative building and makes possible competing narratives that incorporate the same notes, where the differences come from the different backgrounds of the authors or from including a different mix of notes to support the story.
The concept I have for the future of journalism is to create a market for capturing the uncorroborated stories from individual reluctant story tellers and then postpone the story telling phase indefinitely. Narrators at different times may assemble these field notes (including validation through cross-checks) into different stories to meet different needs. This avoids the problems we face when we have to accept as authoritative the one published narrative.
Present-day journalists have the luxury of documenting more of the fables circulating around recent events.
With past events, we lost that ability to get other stories that may be equally fabulous: contradictory, or perhaps confirming. I think this is what sets the new testament apart from the old. In the new testament we have multiple accounts of the same story that corroborate many points of the story but also offer unique or conflicting descriptions. This gives the story of Jesus more credibility than the old testament stories of a single account.
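As a rough sketch of that idea, the Python below keeps raw field notes in a store and defers narrative building to query time. Everything in it is an assumption made for illustration: the `FieldNote` fields, the `NoteStore` class, and the rule that a claim counts as cross-checked when enough distinct tellers repeat it are hypothetical, and a real market for field notes would need far richer validation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class FieldNote:
    """One uncorroborated account from one teller (hypothetical shape)."""
    teller: str
    event: str
    claim: str


class NoteStore:
    """Keeps every note as received; narratives are assembled later."""

    def __init__(self) -> None:
        self._notes: List[FieldNote] = []

    def add(self, note: FieldNote) -> None:
        self._notes.append(note)

    def corroboration(self, event: str) -> Dict[str, Set[str]]:
        """Map each claim about an event to the set of tellers who made it."""
        tellers_by_claim: Dict[str, Set[str]] = defaultdict(set)
        for note in self._notes:
            if note.event == event:
                tellers_by_claim[note.claim].add(note.teller)
        return dict(tellers_by_claim)

    def narrative(self, event: str, min_tellers: int = 1) -> List[str]:
        """Claims repeated by at least min_tellers distinct tellers, sorted."""
        return sorted(
            claim
            for claim, tellers in self.corroboration(event).items()
            if len(tellers) >= min_tellers
        )


# Assumed sample notes about a single hypothetical event.
store = NoteStore()
store.add(FieldNote("teller-1", "event-x", "the meeting ran late"))
store.add(FieldNote("teller-2", "event-x", "the meeting ran late"))
store.add(FieldNote("teller-3", "event-x", "no meeting took place"))

cautious = store.narrative("event-x", min_tellers=2)   # corroborated claims only
inclusive = store.narrative("event-x", min_tellers=1)  # every claim, conflicts included
```

Two narrators working from the same notes can arrive at different stories simply by choosing a different threshold or a different mix of notes, and the conflicting single-teller claim remains in the store for whoever wants it later.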
Even with Jesus’ biography captured in the synoptic gospels in the new testament, we are aware of other accounts that have since been lost. The stories that survive were the ones the church felt were most worthy of preservation (for its purposes). Today, scholars appreciate the loss of those other accounts that could provide a better understanding of what actually happened in Jesus’ time, not just about the specifics of his life but also about the prevalent culture at the time. The lost stories would have given us a better context to interpret the surviving texts.
With the old testament stories, only one report survives for each story. I suspect that the preserved stories were a tiny sample of a larger body of oral history told within a great number of families, tribes, or villages living at different times, sometimes coincident with the surviving stories, other times telling of other events that may have followed a different succession of events. These stories may have recounted similar scenarios where one of their revered ancestors overcame some hardship with some attribution to supernatural intervention. The nature of that intervention probably would have been very different from story to story. For example, there may be different explanations about the protagonist’s personality that attracted supernatural assistance or punishment. If all of those stories survived, even with their fabulist components, we would have the opportunity to have a better understanding of the history and of the culture’s concept of its place in history.
We lost nearly all of those stories.
One explanation of the loss of the stories is that an authoritarian rule suppressed competing stories. That authority may have come from the house of David for the old testament and from the early Christian church for the new testament. I don’t doubt the stories were actively suppressed, but I doubt that suppression would have been too successful. The story of Joseph survived to Moses’ time despite a long period of intervening hardship if not overt attempts at cultural suppression.
I believe the bigger reason for loss of the vast majority of other stories was the loss of the very durable oral story-telling tradition when it was replaced with the far less durable and more expensive literary counterparts. The study of Homeric epics and subsequent findings of evidence of Troy give hints at the durability of oral history. Explicit references (such as boar toothed helmets) in the epics were confirmed in modern archaeology but that physical evidence would have been unavailable at Homer’s time.
Oral story telling was the original big data. The circulating oral stories captured a large volume and variety in a persistent memory. The invention and adoption of written works displaced the oral tradition, and that brought an end to oral history and thus the end of that earlier big data.
Our current excitement about big data may be a rediscovery of a capability available to our ancient ancestors. Big data and the oral story telling tradition both offer inexpensive and durable means to manage a large number of distinct and very individualized stories. In the modern era, we are rediscovering the need to collect individual stories and thus granting them the ability to circulate, much as happened in the preliterate society of oral story tellers.
It is unfortunate that it has taken us so long to invent big data technologies. It should have occurred before we invented writing. We may have avoided the embarrassing errors that resulted from the loss of parallel stories to the ones that survive in holy texts.
With access to cheap and durable memory, we have access to a broader diversity of perspectives of certain events or times. The data volume and variety give us more confidence in what happened by observing corroborating accounts. Most importantly, the diversity of information can alert us to be more cautious because of equally plausible counter claims.