Darkening data to please the government

Following up on my fantasy government game, it is now over a month later. We now know what our existing governments will do for this period. There is a variety of refinements across the country.

Locally, they are continuing the original policies of restricting commerce, imposing on a broad range of activities strict constraints that were not present a few months ago.   Significantly, this is part of the country with a very high density of highly educated people and many are employed serving the government in influential positions.   These are people we should trust to be the most rational in their response, and they largely support the continuation of caution against this situation.

One major reason for their lack of urgency is that the restrictions are not affecting the more educated very much.   Many of them have government jobs.   I can’t think of a single government job that is identified as non-essential.   There may be some layoffs or reduced hours for some jobs such as building maintenance and security, but most government workers are able to be fully employed at full pay while working from home.

Meanwhile, many of these educated people have spacious well-equipped homes where they can replicate many of the services within their own homes.   They can host guests, have home-delivery of gourmet meals and supplies of beverages.   They can go to the stores and get supplies with the minimal inconvenience of having to wear masks.

Most of these people never really have the sidewalk experience of small shops that are no longer permitted to serve their customers. In fact, they probably appreciate the lighter traffic through town, and especially not having to deal with the inconvenience of having someone in front of them try to parallel park on a busy street.

They may regret the loss of some pleasures, but overall from their perspective, it is tolerable.

The local area has a substantial population of people who are not as well off and are not able to work from home. They are so far deferring to their governing superiors to look out for their best interests, while they languish at home hoping for the next unemployment check to arrive before the next bill is due. I just received the semi-annual real estate tax bill due in about 30 days, and it is larger than ever before. I have no doubt that many receiving this bill realize that there is no way they can pay it. Yet, they so far accept that this is absolutely necessary.

Other areas of the country are opening up and allowing businesses to reopen with very few restrictions compared with earlier. One state that opened up early was Georgia, despite warnings that doing so would unleash a surge of new cases. So far those dire predictions have not come true in these locations.

Whether the state or locale has chosen to open up or remain locked down, the prognosticators have the same dire prediction that there will be a second wave worse than the one that occurred earlier.    They are just not precise about when to expect the second wave, thus they can claim vindication whenever the second wave does occur, and are free of accountability if the second wave hasn’t occurred yet.

I recall an aphorism I heard when growing up, one that almost discouraged me from pursuing higher education. It goes something like: the more education a person has, the less common sense he has. Of course there are a lot of ways to interpret it so that it is nearly a tautology. For example, education may be a necessary antidote to the ignorance and dangers of common sense. I have always interpreted it to mean that when the two conflict, common sense is most likely right.

Lately, I have been elaborating on distinguishing two types of science.   One type of science is the science of learning what has already been tested and documented.   This is the science from education imparted by credentialed scientists (of various fields).

The other type of science, which I assert is closer to the original intention behind government being guided by science, is the notion that everyone can participate in science. They only need to be educated in the methods and rationality of scientific examination of evidence. This sense of science is a type of common sense because nearly everyone has it, especially now that nearly everyone has an elementary education that includes basic training in scientific and rational methods.

Everyone can observe the world from their own experiences. They can come up with their own conclusions, as they always have throughout the history of mankind. In modern history, we can trust that those conclusions are guided in part by scientific inquiry, seeking a rational explanation for the observations while providing compelling counterarguments to objections.

A decade ago, there was even a realization of the benefit of crowd wisdom.   People participating freely in an open Internet allowed for a convergence to very reliable explanations without the need for salaried and credentialed scientists.    I recall specifically the discussion that this was often superior to what came from the salaried experts.    I continue to agree with that assessment.   If allowed free and unfettered discussion and debate, crowds of volunteers will come to actionable conclusions with validity matching the conclusions of government or academically funded experts.   I trust the crowd more than the experts because the latter can be biased by self-interest of self promotion or justification of their continued salary.

Regrettably, in recent years, the masters of the Internet have dismissed the prior discovery by censoring alternative hypotheses and banning those who regularly offer compelling arguments in support of the alternative.   The recent trend has been to restrict the notion of knowledge to those who have credentials, and in particular defend an established position even against fresh contradictory evidence.

The crowd wisdom phenomenon was short lived.   It was not possible before the Internet technology matured while still being open to all.   It is no longer possible now that the Internet has become controlled by just a few entities who cooperate with each other under a single perspective conforming to a single established set of knowledge.

In this analogy, the crowd wisdom of individually arguing from their own experiences and their application of scientific methods is very similar to what I describe as algorithms working on bright data.

Big data analytics of actual measured data can process far more information than any individual.   In this sense, big data analytics operates similarly to crowds.   I think if we allowed crowds to expand to major fractions of the world’s population and we allow them complete freedom to contribute to the discussion, the result could be competitive with big data analytics.   Conversely, restricting this kind of crowd interaction reduces the competition to big data analytics.

Here I am not comparing the underlying mechanics of the processes, but instead I am comparing the outputs.   I think crowds free to assemble and discuss their observations can come up with conclusions and recommendations that are as competent and reliable as those from computers.

I am just resigned to the fact that we will never allow ourselves to operate this way. As demonstrated with the current crisis, when crises do occur the vast majority of people want to be led by a single authority no matter how ill-advised that authority is. Very few people are willing to trust being led by a crowd discussion.

There is a good reason for this unwillingness. The very nature of the crowd discussion is that it always considers the latest information, and addresses the latest decisions. Unlike the stability from an identified authoritative figure, crowd decisions are necessarily tentative and subject to radical changes in very short intervals. Often the interval between new recommendations is shorter than the time required for the first recommendation to take effect.

Our current culture takes great comfort in the hope that nature is guided by an unchanging and knowable truth. As a result, we expect our leader to have access to this truth. The leader’s confidence in sticking to a particular conclusion is taken as evidence that he has indeed acquired that knowledge of the unchanging truth.

I tend to believe that it is unrealistic to expect that anyone can know the universal and unchanging truth, especially for conditions involving life, and especially now that we accept that life is continuously evolving at all levels. To me it is reasonable to expect that our past knowledge is always tentative and incomplete. In addition, I expect that even knowledge proved to be valid in the past is not necessarily valid in the present.

I prefer to approach each new demand for a new decision as a request to discover an appropriate science that best explains the current observations. I tend to suspect that old theories may not be applicable, and I am open to new theories that may be more relevant even if we can show that the new theories would not have helped in the past.

When confronted with a need to make a new decision, what matters most is what are the current facts and objectives.    The past facts and past objectives are irrelevant.    The past knowledge may remain as relevant now as it was in the past, but I don’t trust this is a certainty.    Sometimes it is better to choose a different approach.   The data is different.   The objectives are different.

One area that I think illustrates this is the notion of solving an epidemic with a vaccine. While there remains some debate (with compelling arguments to the contrary), we generally conclude that vaccines have solved past epidemics of smallpox and polio. One option for the current crisis is to conclude that a vaccine is likewise necessary to solve it.

I am annoyed that we have concluded that this is the only option. The only option is global immunity. In the near term, while the vaccine is unavailable, the immunity must come from isolating all individuals so that they cannot contract the disease from others. In the longer term, everyone will be vaccinated (probably twice).

We censor and ban discussions of any and all alternative approaches that do not follow these two strategies of isolation followed by mass vaccination. Personally, I have the following objections. One is that we have never before had a vaccine for coronaviruses, and vaccines such as those for the flu have been around for decades without ever eradicating the flu. The second objection is that we now know that the additives and other materials included in the vaccine injection pose risks that may not become apparent until years later. Those risks are higher the younger the recipient is, because that person has more years of life in which to develop the bad effects. I am particularly concerned about the effects of these materials on children and infants, who will inevitably be included in the new vaccinations.

As mentioned above, the reason why people dislike crowd decisions is that they prefer confidence and stability in leadership through a crisis. Decisions that are tentative and rapidly discarded display uncertainty and ineptitude that are unworthy of following. This may be an unfortunate vestigial behavior from our early evolution in small tribes that relied on a single reliable elder for guidance.

In many other areas of life, we accept that some of our evolved traits are inappropriate in the modern world. I suggest we should include in that list the trait of demanding unwavering leadership from a person presumed to possess the knowledge of some unchanging truth.

We now have evidence to the contrary. As I already mentioned, in the last decade we learned to respect the wisdom of the crowds when allowed to discuss their individual perspectives freely. Although we recently shut that down, so that even sites like Wikipedia are authoritatively restricted to credentialed voices, we have the contrary evidence of machine learning used to make rapid decisions based only on what it learns from recent observations. We don’t ban or censor such machine intelligence because we are literally incapable of even recognizing what theories of the world it has come up with.

We are in this very strange world now where we suspect all humans of having wrong theories unless they conform to some established knowledge, while we allow machines complete freedom to act on their own conclusions based on machine-learned knowledge that we cannot even understand, let alone challenge.

The concept of governing by data and urgency is ultimately a work-around to permit what we already learned about the wisdom of crowds. We can reclaim that newfound source of wisdom by allowing the population to collect and check data, to select appropriate measures of merit to optimize for, and to decide when urgency is sufficient to activate the algorithms. All three elements are essential: data, algorithm, and urgency, and all three ultimately are under democratic control. I suggest that this may ultimately be superior to human-led government even though this necessarily results in authoritative rule by machines: once we decide to activate the algorithm we are bound to obey its dictates with no recourse for veto until the rule expires. Because we recognize the tentativeness of ruling based only on recent data, the rules always must expire. We accept the temporariness of these rules because they are likely not the best solution for the long term and the algorithm needs freshly collected, unbiased data to support the best decision for the next urgency.
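As a rough illustration of how the three elements might fit together, here is a minimal Python sketch. The names (`Rule`, `urgency_reached`, `decide`), the strict-majority threshold, and the integer time steps are all assumptions made for illustration, not a specification of the scheme.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Rule:
    """A machine-issued rule that always carries an expiration (hypothetical)."""
    policy: str
    issued_at: int   # time step when the rule took effect
    lifetime: int    # number of time steps before the rule expires

    def is_active(self, now: int) -> bool:
        # Binding from issue until expiry; no veto in between.
        return self.issued_at <= now < self.issued_at + self.lifetime


def urgency_reached(votes: Sequence[bool]) -> bool:
    """Assumed trigger: a strict majority of voters declares urgency."""
    return sum(votes) * 2 > len(votes)


def decide(
    votes: Sequence[bool],
    recent_data: Sequence[float],
    algorithm: Callable[[Sequence[float]], str],
    now: int,
    lifetime: int,
) -> Optional[Rule]:
    """Run the algorithm on recent data only when urgency has been declared."""
    if not urgency_reached(votes):
        return None
    return Rule(policy=algorithm(recent_data), issued_at=now, lifetime=lifetime)
```

In this sketch the population controls all three inputs: the `votes`, the `recent_data`, and the choice of `algorithm`. The only thing it cannot do is revoke a `Rule` before its `lifetime` runs out.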

In my discussions of data, I keep returning to the conflict between what I call dark data and what I call bright data.

Bright data consists of direct observations from reliable sensors. Because of this, bright data can be very current. This is the data we provide to automation algorithms ranging from mechanical control systems to stock-market trades and circuit breakers.

Dark data comes from algorithms that incorporate prior scientific understanding. These algorithms use bright data inputs in order to interpolate and extrapolate to fill in for information the sensors cannot provide. While the algorithms use recent data to calculate the missing data, that recent data is delayed due to the need for that additional processing, and the processing itself imparts prior understanding onto the outputs. Dark data tends to defeat the benefits of the more current bright data by forcing the algorithm to consider old information.
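To make the distinction concrete, here is a small Python sketch that fills gaps in a series of sensor readings and labels each value by its origin. The function name `fill_gaps` and the choice of linear interpolation are assumptions for illustration; the interpolation rule stands in for whatever prior understanding a real model would impose.

```python
from typing import List, Optional, Tuple


def fill_gaps(readings: List[Optional[float]]) -> List[Tuple[float, str]]:
    """Label measured values "bright" and model-filled values "dark".

    A missing reading (None) is filled by linear interpolation between the
    nearest measured neighbors, or copied from the nearest one at the edges.
    """
    known = [i for i, v in enumerate(readings) if v is not None]
    if not known:
        raise ValueError("at least one measured reading is required")

    labeled: List[Tuple[float, str]] = []
    for i, value in enumerate(readings):
        if value is not None:
            labeled.append((value, "bright"))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled = readings[right]      # extrapolate from the first measurement
        elif right is None:
            filled = readings[left]       # extrapolate from the last measurement
        else:
            frac = (i - left) / (right - left)
            filled = readings[left] + frac * (readings[right] - readings[left])
        labeled.append((filled, "dark"))
    return labeled
```

For example, `fill_gaps([1.0, None, 3.0])` yields a middle value of 2.0 labeled "dark". Without the label, that value would look exactly like a measurement, even though it only expresses the assumption that the world changes linearly between sensor readings.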

In my mind, I equate dark data to computer simulations where much work is spent on implementing code that reliably calculates some scientific discovery. The testing of the simulation includes verifying that it reliably calculates the correct results for a specified input. In this sense, the simulation is forced to replicate our prior knowledge. The simulation becomes relevant through the input of recent observations.

In my experience, I worked with statistical simulations involving random number generation shaped by specific distributions.   As a result, the distributions and their parameters captured prior knowledge the same way that the code captured prior scientific discoveries.    In some cases, the statistical simulation will run multiple times with the same inputs but using random numbers to produce a distribution of outputs.  Those outputs are an added layer of darkness because they no longer operate directly on bright data observations.   Instead, we fit the bright data to some distribution and then use the distribution for the simulation.
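The extra layer described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the normal distribution, the function names `fit_normal` and `simulate`, and the fixed seed are hypothetical choices, not the simulations I actually worked with.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def fit_normal(observations: Sequence[float]) -> Tuple[float, float]:
    """Reduce bright observations to two parameters (mean, sample std dev).

    Choosing a normal distribution is itself prior knowledge imposed on the data.
    """
    return statistics.mean(observations), statistics.stdev(observations)


def simulate(observations: Sequence[float], runs: int, seed: int = 0) -> List[float]:
    """Produce simulated values drawn from the fitted distribution.

    The outputs never touch the original observations directly; they are
    generated from the fitted parameters plus random numbers.
    """
    mu, sigma = fit_normal(observations)
    rng = random.Random(seed)  # seeded so that repeated runs are reproducible
    return [rng.gauss(mu, sigma) for _ in range(runs)]
```

Three observations can become a thousand simulated values this way, all of which inherit the assumed shape of the distribution rather than anything the sensors reported.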

Another test for the validity of the simulation is that the results appear reasonable. There may be some statistical tests to verify the reasonableness of the simulation outputs, but inevitably there is a human check. Some human, hopefully one who is trained in the relevant sciences, will assess whether the simulation outputs are understandable. Usually, this means that the simulation must confirm his intuitions about what the results should be. If the results do not confirm his intuitions, then he recommends design changes so that future results converge more closely to his intuitions.

In other words, the nature of modeling is to transfer the already existing intuitions of a trained scientist into a data form that appears indistinguishable from an actual sensor observation.

The dark data observations are actually assertions that the present world obeys expectations for how the world operates, according to the understanding of a very small number of people (often just one). This presents the opportunity for those same individuals to force the simulated world into the form that best conforms to their wishes. No one else will know whether the tests are for actual science or for some agenda.

I would describe this as darkening dark data.   The darkening is to boost the influence of the credentialed scientist’s agenda.   Because of our current culture of trusting scientists over trusting our own common sense, we automatically trust the results of their simulations.   The simulations came from highly respected scientists who are associated with highly respected academic institutions.   Our current culture demands that we trust their darkened dark data over bright data because that added tint comes from scientific knowledge of a scientist.

The current COVID-19 crisis presents an example of this, where most of the early government reactions of draconian restrictions (stay-at-home orders, shut-down of nearly all small businesses and non-essential socially-oriented businesses, and police enforcement of social distancing and face masking) came as a result of frightening predictions from a simulation out of Imperial College London. The simulation included considerations of how people interact in public, estimations of infection risks for each encounter, and estimations of fatality risks of the infections. All of these properties relied on old information, with the most recent information being the infection rates and fatality rates estimated from very early reports about this disease.

By the time the results were used to bolster recommendations for policy actions, most of these assumptions were obsolete. I expect that the models of human social behaviors may be even a decade out of date, and even then they may have been immaturely developed. Most of the governments over the entire globe chose to follow the results from this model instead of considering recent observations that suggested that the infection had already started to level off, and that the death rates had decreased in part because of improvements in health care practices. Many of the early deaths were the results of incorrect medical practices, such as the excessive use of intubation and ventilators, instead of the disease itself.

Given what I know now, I would revise my earlier estimation that a dedomenocracy would have come to the same conclusion of immediate lock-down. A dedomenocracy comes to a recommendation only after being triggered by a democratic majority recognition of urgency. The recommendation would be computed using the available data at the time of that trigger, and by that time it was likely apparent that the problem was not that bad. In particular, this calculation would either ignore or assign a very low weight to any simulation model outputs, especially if it is known that the model’s assumptions are very out of date.

As I mentioned at the time, it might be possible that the algorithm using recent data would still have imposed a temporary lock-down, but only because that would be the most effective way to inform the entire population of the danger. The lock-down would expire quickly because its objectives would be met quickly.

Now that it is clear that decision makers relied so heavily on models using obsolete assumptions, I expect that maybe the dedomenocracy algorithm would not have recommended any lock-down at all. The presently available data shows that the feared danger is exaggerated. According to the rules of this government, the algorithm must generate some policy in response to a declaration of urgency. The benefit of this type of government is that the algorithm is not constrained as to what policy it chooses. It could choose a completely unrelated policy that addresses a completely unrelated problem the public is currently not paying attention to.

It could have come up with a policy to clean up government and academia of corrupted or no longer qualified practitioners. It has much more bright data to support that policy than it does to worry about this virus.

