Pessimistic skepticism as a virtue for data science

A recent article published on LinkedIn’s Pulse platform (probably requires an account to read) plotted out different personality types in terms of two axes. One axis had the extremes of pessimism and optimism and the other axis had the extremes of skepticism and credulity. The article itself was typical for LinkedIn: a monologue (like my posts here) expressing an opinion. I would also say it is typical of most featured articles in its exaltation of the virtues of optimism and credulity. The article made several points but I focus only on the chart, which placed at the far corner of pessimism and skepticism the personality type the author labeled “Jerks”.

Much of my writing on this blog has been to express both skepticism and pessimism about data. Data can mislead us, and if given a chance, it will. If I don’t come across as extremely pessimistic and skeptical, it is because of a lack of boldness on my part. Although I enjoy working with data, what I enjoy most is finding the flaws and finding ways to correct them.

During my project, I introduced many ideas that were innovative for the time. I suggested that something might work and most of the time I implemented that suggestion to prove I was right. Sometimes it required a lot of effort, but the initial suggestion was a good one. Although this kind of innovation involves some degree of credulity (I thought it could work) and optimism (I thought I could make it work), I don’t think either label was appropriate.

Most of my best innovations came when we ran out of the credulous or optimistic options. These were suggestions made out of desperation. We had to do something when the usual options seemed unworkable. These were often suggestions with both skepticism (I’m not sure it will work) and pessimism (I’m not sure I have the resources to make it work). The innovation proceeded because I decided to give it my best effort to make it work.

Most of the time, LinkedIn Pulse articles present motivation in optimistic and credulous terms. The motivation comes from the encouragement to believe in ourselves and in the bounty that lies just ahead. These are great themes for motivation and, not coincidentally, they probably also help to drive up readership. Most people prefer to read uplifting messages, especially if they are currently in a frustrating position.

I am not one of those people. Generally, I cringe when I read something that is overly optimistic or credulous. I cringe even more when these traits are presented as high virtues. To me, skepticism and pessimism are the better virtues.

I work in data science with a focus on data for decision making. Skepticism and pessimism provide the passion for what I do. I am motivated by the suspicion that data may be wrong about the objective reality it attempts to present. My skeptical and pessimistic approaches to evaluating data often reward me with the discovery of something new about how data (or our theories) can mislead us.

I guess I could get into an argument over definitions. I am successful at what I do because I do have some optimism and credulity about my ability to figure things out. Someone with pure pessimism and skepticism would be unable to motivate himself to do anything and thus would live in utter despair and be a barrier for those who are trying to do something. I am not that kind of person. I strive to master skepticism and pessimism in order to put them to good use. That striving and sense of mastery may appear optimistic and credulous, but that characterization distracts attention from the key objective of striving for deliberate skepticism and pessimism.

When it comes to running a business or promoting some new concept, skepticism and pessimism are best hidden out of view.  Selling and marketing rely heavily on promoting benefits and countering objections with positive responses.

In recent posts, I have been complaining about the overselling of big data technologies because all we hear about are the successes. The only times we hear of faults are when someone is promoting some solution that overcomes the failing. For example, we learn of the difficulties of ingesting data because there are products specifically promoted as eliminating these difficulties. We also learn indirectly that success is elusive through the complaints that truly qualified data scientists are rare. In that latter example, we are led to believe that a failure is the result of not having sufficiently talented data scientists, rather than a sign that the project itself may be fundamentally flawed.

I am not convinced of the wisdom of widespread adoption of big data technologies across all aspects of business and government.  My posts on this site express the opposite sentiment of fear and doubt.  My recent posts on accountability and my earlier posts on the necessity of super-majority consent for peaceful government reflect that kind of objection to counter the appeal to big data to solve our problems.

I frequently refer to the medical profession, which seeks to exploit data analytics to optimize treatment management and resource allocation. There is a lot of promise for finding previously elusive strategies for addressing specific cases or reducing overall costs. We can save patients and save money. The problem can come when the algorithms and data are too complex for human accountability. Without accountability, people may protest their poorer outcomes (because of their lower priority) and others may withhold their support or consent (because they have no reason to defend the decisions). This could lead to a breakdown where people boycott the medical services, seek alternative care, or even engage in violence that causes more injuries and costs than what was saved.

In the recent posts, I suggested that a decision maker gains competence at accountability by balancing evidence (big data is all about evidence) with doubt (non-evidence). Raising and answering doubts is essential to the thinking prior to a decision. When we demand accountability for a decision, we want more than just an accounting of the evidence; we also need proof that the relevant doubts were identified and addressed before the decision was made. The adverse effects of a decision are new evidence that could have been suspected in the first place. A more satisfying response to a complaint about an adverse effect is the persuasive argument that we considered this possibility, even though it was not previously recorded as evidence, and we decided its risks were outweighed by the potential benefits. In contrast, the evidence-based decision making approach is to appeal to ignorance by answering that we had no way to know this could happen because it was never recorded before. The evidence-based response will not satisfy the aggrieved or convince the bystander groups.

The point of this post is that pessimism and skepticism are valuable tools, especially when it comes to making decisions. I object to the notion that these should be discouraged as bad traits. The argument for discouraging pessimism and skepticism is that they impede progress. It is frustrating to have a good idea dismissed due to arguments of doubt. But the virtue of pessimism and skepticism is that they insert the human decision maker in the loop. This impedes progress because it demands that certain individuals be persuaded to accept the project. That persuasion requires overcoming all of their doubts.

I have experienced the frustration of not earning approval for my great ideas.   In these instances, I attempted to prepare a strong case in favor of my proposal.   I still think I had a good case to make.   I failed to convince the necessary decision makers.   In hindsight, I recognize my own error in failing to anticipate the decision maker’s doubts.   I may even have considered the possibility of these doubts but they seemed irrelevant to me.   I addressed only the doubts directly related to the project’s prospects of success but that was not sufficient.  

In one case (if not others), I failed to appreciate the depth of the decision maker’s doubts about his own ability to sell the concept. I think he may have been convinced that the project would be beneficial but it was too unusual for him to easily describe to others. He needed a simpler sales pitch and I failed to provide it. I could dismiss that as being overly pessimistic, but that doesn’t change the fact that I failed to convince him of my progressive idea.

A good decision maker is one who challenges the people asking for a decision. I recall a recent example of a manager who demanded that the people making a pitch to him be his direct employees or contractors. He required his own staff to make the persuasive arguments because he knew he had questions for which they would need to go back and find answers. These questions could seem irrelevant and certainly would be treated as such by a non-employee sales team (although they would probably try to come up with some answer). The questions were ultimately necessary and worth the effort of finding answers because the decision maker was practicing due diligence in identifying and addressing all possible doubts.

Another example from my experience is one where I ultimately got an approval. I had a solution for a very challenging and urgent problem. I had already successfully demonstrated that it was practical and that it would satisfy the need. Given the circumstances, I expected a quick decision in my favor but it took much longer because of how I tried to describe it. My solution was to follow an industry trend to adopt a data warehouse strategy, so I described my solution as a data warehouse solution. The unexpected problem was that there was already an initiative elsewhere to build a data warehouse for the entire organization, while the one I had in mind was only for one department that happened to involve a far larger volume of data than the other project. In short, I had a winning idea with a bad name. The decision process took multiple iterations where the focus was on finding an appropriate name so as not to confuse or conflict with other initiatives. All of the popular industry buzzwords that were synonymous with my project had already been taken by other proactive thinkers with smaller and longer-term solutions. We finally settled on a name that described the user interface rather than the back end data architecture. It worked to win approval and the project was a success. But the seemingly irrelevant complaint about how to name the project was an important consideration from an organizational perspective: there were other projects that needed to be distinguished in order to continue their funding to support objectives that would pay off later.

As I mentioned, this example was a success story that not only won approval but exceeded its promises. Unfortunately, when it came time to re-engineer the system, there was tremendous confusion about the nature of the system. The name described only the user interface while the bulk of the effort was in the back end data handling. So ingrained was the notion of the user-interface definition that it was difficult to convince the people involved that this was a basic multi-dimensional analytic database of the kind that is very well supported by COTS tools that essentially provide user interfaces automatically. They were focused instead on a more costly project of rebuilding the user interface.

Perhaps this is an example of the unintended consequence of the earlier decision of choosing a non-data name for the project.   That decision failed to anticipate this type of confusion when it came time to upgrade the system to the latest technologies.

The above personal examples were not very disruptive. In earlier posts, I expressed strong worries about catastrophic consequences such as the collapse of whole governments due to a lack of adequate accountability for decision making. In the above examples, the consequences were minor. I used them to illustrate the fact that the due diligence of a decision maker requires him to consider all possible doubts, including those that seem irrelevant.

The best decision makers anticipate and address all possible doubts before making the decision so they can defend the decision if someone objects about the consequences.   Even something as trivial as the name of a project can and will be challenged with a demand for a persuasive defense that may be a losing battle.

To support the decision maker, the data scientist (the student of data itself) needs to anticipate the doubts of the decision maker. The data scientist needs to proactively challenge the data itself over possible doubts about its authenticity, accuracy, and relevance.
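As a rough illustration of what challenging the data over these three doubts could look like in practice, here is a minimal Python sketch. Everything in it is hypothetical and invented for the example: the `Reading` record, the list of trusted sources, the plausible value range, and the cutoff date are assumptions, not taken from any real project. The point is only that each doubt is raised explicitly and recorded, rather than silently assuming the data is good.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Reading:
    """Hypothetical record: where a value came from, the value, and when it was observed."""
    source: str
    value: float
    observed: date


# All three thresholds below are illustrative assumptions.
TRUSTED_SOURCES = {"sensor_a", "sensor_b"}   # authenticity: sources we can vouch for
PLAUSIBLE_RANGE = (0.0, 100.0)               # accuracy: values outside this are suspect
RELEVANT_SINCE = date(2014, 1, 1)            # relevance: older observations may not apply


def doubts(reading: Reading) -> List[str]:
    """Return every doubt we can raise about a single reading (empty list if none)."""
    found = []
    if reading.source not in TRUSTED_SOURCES:
        found.append("authenticity: unknown source")
    low, high = PLAUSIBLE_RANGE
    if not (low <= reading.value <= high):
        found.append("accuracy: value outside plausible range")
    if reading.observed < RELEVANT_SINCE:
        found.append("relevance: observation too old")
    return found


def review(readings: List[Reading]) -> Dict[int, List[str]]:
    """Map the index of each doubtful reading to the doubts raised against it."""
    report = {}
    for index, reading in enumerate(readings):
        raised = doubts(reading)
        if raised:
            report[index] = raised
    return report
```

The report is deliberately a record of doubts rather than a filter that drops rows. Deciding what to do about each doubt remains with a person, who can later show that the doubt was identified and addressed.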

Entertaining doubts is indistinguishable from skepticism and pessimism.  This is a virtue for data science.

