Masters of Ignorance: effective data analysis

The masters of science degree is valuable for demonstrating advanced mastery of a particular discipline, including both a more extensive understanding of the field and an aptitude for doing original research. Data science (and its variants) has masters of science degrees that focus on the particular technologies and techniques for working with data. Because the data science field is so new and still maturing, we recognize that a masters degree is not quite enough. One answer is even more extensive study resulting in a contribution to the body of knowledge: a PhD in some discipline related to data science. An alternative answer may lie in the other direction: a mastery of ignorance. Often, to get value from analyzing data, there is a benefit to approaching the data with deliberate ignorance.

A data analyst frequently confronts a particular type of task: finding an answer to a problem that shouldn’t be occurring. Based on everything we can measure, all of our understanding of causal relations fails to explain why something is happening. We cannot dismiss the fact that the problem is occurring, but we can’t find the causal link between the observed preconditions and the observed problem.

Clearly, such a scenario demonstrates the incompleteness of our knowledge of causal relationships in the world of the problem.   This is an ignorance about the world being observed instead of ignorance of some yet to be discovered data practice.

A competency of data analysis ironically involves what can be considered deliberate ignorance. This is possible in a community where we can count on the specialists to defend their knowledge of the causal models of their world. To make progress in this environment, there is a role for someone to feign ignorance of established laws of the world in question. In effect, such a deliberately ignorant person demands to be convinced by the current observations. Given the current scenario where there is a problem that defies a causal explanation, the data cannot convince the deliberately ignorant person of something that the domain experts know to be fact.

Scientific practice differentiates the experimental sciences from the theoretical sciences. The experimental scientists I respect the most are the ones who insist their job is not to explain the results of the experiments. Instead, they take for granted that the established explanations are true and they build new experiments to further the confirmation of that truth. When they encounter results that contradict or undermine the theory, they merely present their results, placing emphasis on the quality of the experiment and their confidence in the relevance of the results. It is up to the theoretical science to explain the experimental findings: finding flaws in the experiment, or fixing flaws in the theory.

Deliberate ignorance is different from experimental science, because experimental science takes for granted that some theory is true. Deliberate ignorance demands to be convinced of the truth of some theory. In modern times, the experimental scientists are still testing the basic laws of gravity. The deliberately ignorant would demand to be convinced of the truth of gravity based on currently available data. Given that some condition currently exists that cannot be explained, if gravity is involved in the failed explanation, then it is possible that our understanding of gravity is wrong.

In reality, the conditions I am talking about are very complex in that they involve a large number of theories acting concurrently on the observed outcome.   The point is that we can’t explain what we are observing.   Until a solution is found, the deliberately ignorant effectively suspects each explanation as being potentially wrong.

As mentioned above, such deliberate ignorance works well in a broader team consisting of domain experts who can reliably defend their knowledge. The deliberately ignorant poses no real risk of overturning any knowledge. The potential value of the deliberately ignorant is to confront the domain experts with data that challenge their explanations. As the domain experts explain some aspect of the results, the deliberately ignorant accepts those explanations, revises his analysis to incorporate them, and then presents new results that need an explanation.

It is not hard to see how such a contrarian becomes tediously annoying if he keeps coming up with easily dismissed examples. In such a scenario, the person would be completely dismissed as uselessly ignorant of the topic. In a work environment, such a person would be reassigned to other work where he is more competent. I would suggest that the competence he lacks is the competence of being ignorant.

The willfully ignorant can become a valued team member if he is competently ignorant. He does understand the relevant domain knowledge, or at least he readily absorbs the explanations that are consistent with current observations. His role is to take into account everything that can be measured and every knowledgeable explanation that can be derived from observations. The distinction is that he will still seek out contrary observations that cannot be explained, and when he does so he ignores any proposed explanation that cannot be demonstrated by current observations.

An example is in building a dashboard or some other visualization. Such a tool combines current observations with the projected results based on accepted explanations. The deliberately ignorant would design the dashboard to accentuate any results that the accepted theories did not expect. The goal is to highlight anything that should not be happening, with emphasis on the word “should” from the perspective of the theory. Inherent in this approach is a deliberate choice to distrust the theory.
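As a rough sketch of what such a dashboard might compute behind the scenes, the following compares each observation with the value the accepted theory projected and ranks the points that deviate the most. Everything here is an assumption made for illustration: the function name `rank_surprises`, the choice of a simple z-score on the residuals, and the threshold of 2.0 are mine, not a prescribed method. A z-score is also sensitive to the very outliers it is looking for, so a real tool might prefer a more robust measure.

```python
from statistics import mean, stdev


def rank_surprises(observed, expected, threshold=2.0):
    """Rank the observations that the accepted theory failed to anticipate.

    observed:  measured values
    expected:  values projected by the accepted explanation (same order)
    threshold: hypothetical cutoff, in sample standard deviations of the
               residuals, above which a point counts as "should not happen"

    Returns a list of (index, observed, expected, z) tuples, with the
    largest absolute deviation first.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if len(observed) < 2:
        raise ValueError("need at least two points to estimate spread")

    # Residual: how far reality landed from what the theory projected.
    residuals = [o - e for o, e in zip(observed, expected)]
    center = mean(residuals)
    spread = stdev(residuals)
    if spread == 0:
        # Every point deviates by the same amount; nothing stands out.
        return []

    flagged = []
    for i, r in enumerate(residuals):
        z = (r - center) / spread
        if abs(z) > threshold:
            flagged.append((i, observed[i], expected[i], z))

    # Put the most surprising points at the top of the dashboard.
    flagged.sort(key=lambda row: abs(row[3]), reverse=True)
    return flagged


if __name__ == "__main__":
    # Made-up numbers: the theory projects a steady 10.0 throughout.
    observed = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 15.0]
    expected = [10.0] * 10
    for index, obs, exp, z in rank_surprises(observed, expected):
        print(f"point {index}: observed {obs}, theory projected {exp}, z = {z:.2f}")
```

The design choice worth noting is that the sort order puts the theory's failures first, where a conventional dashboard would tend to lead with the points that confirm it.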

There is a competency to being deliberately ignorant. As noted, it is very easy to become tedious to the point of being ejected from the team. Yet even when it is done correctly, there will likely be some petition to eject such a person from the team. The competency is in convincing the rest of the team to keep the deliberately ignorant on the team.

I propose that there can be a different kind of training that produces productive members who can complement the people who master some domain.  The latter are masters of the science of some discipline.   The former are masters of ignorance of some discipline.

In some sense, data science is a combination of both: the mastery of the technologies of data as well as the mastery of challenging other fields to verify their validity based on current observations. In this age of teamwork among specializations, I propose a distinct discipline of someone who is a master of no science (not even data science) but instead a master of ignorance. This is a person who can demand that the current observations be fully explained by the science, but who does so with optimal relevance that frequently results in solving new problems. The skill is in creative ignorance: selecting the right area to feign ignorance for the purpose of getting the team to find a new solution.

Unlike masters of science, I don’t think that masters of ignorance could be a university discipline. I can’t even think of how it can be taught. Perhaps there is some relationship to fields of philosophy such as skepticism, nihilism, epistemology, etc., but I don’t think these would be very practical.

A better way to think of deliberate ignorance is as a tactic to promote the value of data beyond the obvious application of using data to feed models. The alternative use of data is to challenge the models, especially when we dislike the current outcomes. A person trained in the use of data may be very good at optimizing the use of data in a way that confirms models, whether those are established or newly proposed. A different kind of skill is to get data to discover something new, something no one has proposed theoretically before. This may be more of a motivation than a skill: the motivation to sell data as something that offers value that is distinct from science.

At the extreme, this is a proposal that data can by itself add to human knowledge something that science may never be able to explain.   There are many problems we face currently that we are frustrated at being unable to solve.   We look to expand data to include measures that we assume are missing, or we gather more data to get better statistics needed to illuminate the answer.   The deliberately ignorant approach is to take advantage of available data, including seemingly unrelated data, to challenge our understanding.   After all, we acknowledge that we have situations where we desire better solutions.

The deliberately ignorant offers new alternatives.  Maybe our theories are wrong.   More importantly, in some cases maybe we would be better served without considering these theories at all.   There may be cases where correlation in the absence of causation may actually be more workable.

This post is another variation of the same theme I’ve discussed previously on this site. Earlier I described how data and science are mutually antagonistic. In my distinction between what I called bright data and dark data, I proposed that data and science are mutually exclusive. The point is the same. We should study observations separately from derivations from theories. The deliberately ignorant takes the position that data is superior to science. There is a valid place for the deliberately ignorant when included in teams with domain experts representing each of the relevant scientific disciplines. In order to work, the deliberately ignorant needs to be skilled at his craft of being ignorant in the right way to propel the team towards a new solution without annoying everyone to the point of being expelled.
