Bright Data, Hidden World

This morning I had a very strange dream that begs to be interpreted.   I came up with some personal interpretation that will remain private, but the dream was strong enough to support a number of interpretations.   I’m going to try to make an objective interpretation in this post.

In my last post, I invented a word to describe what I see coming for our future: government based on big data. I tried to come up with a Greek root for data and apparently mashed it up with democracy, so it came out as something different from my intention of a pure government by data alone. However, the rest of the post seemed to reinforce my mistake by suggesting a cross between democracy and big data, or the democratizing of government data. I imagined a future where everyone can run ad hoc queries on the same data set, build their own models, and come up with their own conclusions.

Realistically, that is not going to happen. More likely, we will end up with some form of oligarchy of approved analysts who can run the queries that determine policies, and we’ll submit to those policies by the power of data. It is already happening in some policies where we accept the power of data that we individually have no access to see for ourselves. We trust the government data analysts and we’ll trust them more in the future. We will increasingly accept more intrusive decisions that we will be assured are supported by the data, whatever that data is.

Now, about that dream. It was one of those particularly long dreams that keeps dragging on as if to prove it could not be a dream. In such dreams I recall asserting quite strongly that this was a dream, and yet the dream would continue on as if to challenge me to prove it. I get such dreams only a couple of times a year, and they tend to set me up for a moody day.

In this dream, I was working again.   I was working in a typical cubicle nestled among a floor packed with cubicles.   The office was well lit and my job was going well.   I assume it was similar to my last job involving working with data, making sense of data and relating it to the real world.    I tend to get absorbed in that kind of work.

But in this dream, the coworkers were telling me that this was just an introductory position and that the real work was performed at a different location.   The nature of the work was the same, but the location was different.     They arranged to take me to that location but somehow I missed the ride so I had to go there by myself.   Apparently I had some idea where it was.

It was in a completely different city. I found that the roads were all empty. The city had buildings but all of the lights were out and the buildings didn’t have any address or identifying markings. There was no one outside. I knew the job was in the city but I didn’t know which building. I approached the first building and found the door locked (my badge didn’t open it) and there was a stern-looking guard nearby. He didn’t answer my query about where my building was. I was left wandering the street alone, wondering which building was the right one and wondering what I would find, because they all seemed to be vacant.

The dream switched suddenly to the inside of some underground network of hallways that connected the buildings. I suppose I found some entrance that accepted my badge. The hallways were lit but not brightly so. Unlike the street level, there were people in the hallways going in and out of the various buildings. They seemed to have business with each other but they had no interest in giving me directions. I became aware that in this city, it is expected that everyone already knows where they should go, and that it was not permitted to give directions or even address strangers at all.

A person in the hallway belonged somewhere in the maze so there was no reason to throw me out.  But on the other hand, it was not permitted to exchange any information about what they were doing or where I should be going.    If there was information sharing it had to occur in the assigned buildings.   I only had to figure out where I was assigned.

The realistic part of the dream was the extent and detail of that hallway. The halls connected a full city with restaurants, stores, gyms, and everything you can think of. I kept wandering around, hitting dead ends, and then having to backtrack past what I had seen before. It was a very convincing dream. But even the commercial businesses seemed to be following some code where you had to belong there in order to enter. Apparently I didn’t belong anywhere.

Eventually I got back on the surface and wandered around at the street level again. The image that emerged in my mind was that this was a completely secret city. I’m not sure whether that image came in the dream or in the waking moments of thinking about the dream. If you searched the coordinates on the Internet, it would show nothing but wilderness; even the satellite view would show nothing but trees. You had to go there physically to see that there was a city. Everyone in the city was sworn to some kind of code of secrecy. Even the supporting staff in janitorial positions were devoted to keeping the city secret. Information stayed in the city. No information left it.

You had to personally experience it to even know it existed. It left no trace that could be found on the Internet. I presume that meant it left no trace in any database anywhere.

Here I’m writing this at the end of a full day and the dream is still vivid in my mind.

I came up with an interpretation that tied in some of my recent thinking about data science. Data science is all about recorded historical data. The Internet itself, and the search engines in particular, are a good analogy. With few exceptions, most of what is on the Internet is historical information of various degrees of accuracy, or fiction. There is a lot to be learned from exploring data.

Today you can explore a lot of the world just by looking up data on the Internet. Of course it is not the same thing as being there, but there is hardly a place that exists that doesn’t have some data available on the Internet. Most places have a lot of data. We even have a lot of data about places that we can’t possibly visit personally, such as the far side of Saturn.

The dream suggested something different. The dream suggested an entire city that exists but is completely unavailable in data. The data even suggests that such a place doesn’t exist at all. And yet it does exist, in some fashion that resists recorded observation.

Earlier I had suggested dividing human activities into three types of intelligent capabilities:  present-tense science, past-tense science, and persuasive arts.

Most of my posts focus on the past-tense science, which is where I place data science: the tricky work of making sense of recorded data. With all of its shortcomings, recorded data is all we have about the past. We can’t go back and get new data from past events. Instead we have to make the best of what we have. That project is never ending.

In my last post, I suggested that this historical data is increasingly used to create policy, moving us toward government by data instead of government by people. I wrote that government-by-data post this morning after the dream but without really thinking about the dream. That earlier post suggested that government by data is inevitable and we should prepare for it.

Now, I’m thinking about that dream. I still think that our future will see ever more encroaching policies based on historical data rather than human political debate. The dream suggests we may be fooling ourselves into thinking that data will tell us everything about the real world. There are realities that the data suggests don’t exist, or at least realities that leave no record we can query.

There is a fundamental limit to what data can tell us. No matter how much data we collect, it will always lack critical information found in human experience. That experience cannot be captured in data; it is accessible only through conversations, dialogs, and debates.

For example, I spend much of my days practicing piano. There is a lot of information available about what to do and how to do it. There are plenty of recordings, including videos with close-ups of hand movements from expert pianists during performance. All of that is data available to me. And yet, I still don’t know how to play piano. If I want to hear great piano music, I need to find a great pianist to play it. Something like that is going on in all of life.

There is something more to experience than what can be captured in data. It seems very naive, if not dangerous, to suggest that we can make policies based on data alone. Data can present very compelling patterns that distinguish between choices and make one look like the best option. Despite that data, there is still room for us to object that it doesn’t match our experience. Our experiences access something that will never be recorded in the data.

I can’t convey the richness of the hidden city. This is unfortunate because I think the dream really captured the sense that the hidden city was very real and yet very inaccessible except by direct experience. Even when I was there, I could only experience what was happening in the hallways, not inside the buildings. The city was full of people who had their own experiences. The dream suggested they were forbidden from sharing those experiences, but it might have been simply impossible to share them.

I’m reminded of someone trying to describe his experience of jumping out of a plane and being in free fall for some time before opening a parachute. He shared a lot of information I could understand, but there was no way to know everything he was talking about without having that experience myself. Even if I did repeat his action, there is a sense in which I would still not know what he felt when he did it. This important information is fundamentally inaccessible.

In my recent post about BMI, I related its utility to a doctor during a visit with his patient. He uses BMI as a tool to discuss weight concerns with a patient. But he probably found motivation to discuss those concerns before even seeing the BMI measurement. He saw some things that would not be easy to talk about without using the objective BMI. But he also saw some things he probably couldn’t put into words even if he wanted to. His experience informed him that the concerns were worth discussing.

In earlier posts, I described a type of data I called forbidden data. These are actual observations that we do not allow in our data set because of our suspicion that the data is wrong. The data has to fit within some reasonable bounds of our expectations. One example from my actual experience was observing measurements of a volume that exceeded the capacity of the container. Such volumes would occasionally be recorded, but either that data would end up in a different data store for exceptions (an error log) or the data would be revised to perfectly fit the container (plus logging the anomaly). What was forbidden was the option of putting this unmodified observed data point into the official data store. Something caused it to be recorded, but that something will be lost to an analyst studying only what is available in the data store.
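
The two ways of handling such a measurement can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `VolumeStore`, the `clamp` flag, and the sample numbers are hypothetical and are not taken from any actual system.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeStore:
    """Hypothetical store that refuses volumes larger than the container."""
    capacity: float
    clamp: bool = False  # False: only log the anomaly; True: also revise to fit
    official: list = field(default_factory=list)    # what the analyst queries
    exceptions: list = field(default_factory=list)  # the error log

    def record(self, observed: float) -> None:
        if 0.0 <= observed <= self.capacity:
            self.official.append(observed)
            return
        # The out-of-bounds observation is always logged as an anomaly.
        self.exceptions.append(observed)
        if self.clamp:
            # The revised value fits the container; the raw value never
            # reaches the official store.
            self.official.append(min(max(observed, 0.0), self.capacity))


rejecting = VolumeStore(capacity=100.0)
revising = VolumeStore(capacity=100.0, clamp=True)
for volume in (40.0, 120.0, 75.0):
    rejecting.record(volume)
    revising.record(volume)
```

In both variants the raw reading of 120.0 survives only in the exception log. An analyst who queries `official` alone sees either a gap or a container that happened to be exactly full, and has no way to ask what caused the original reading.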

I return to my imagining of using data-mining for medical decision making.   The data mining will use available data to determine whether a particular patient’s set of conditions map to groups with good (cost effective) or poor (ineffective) procedures.   The key term is available data.   There will be plenty of data missing.   Some data such as ethnic or religious background may be forbidden from being available in the evaluation.   Some data such as the experienced judgement of the doctor may have no way to be recorded in the data in a way to make meaningful comparisons.

Data can be helpful, but it will never find the hidden city.   That city is what we intend to govern.

Update 4/22: Coincidentally, an article discusses roughly the same challenges described here (via Instapundit, again).

