I suspect the decision to send Dr Salia to Nebraska was mostly informed by our prejudices about a failed medical system in West Africa. We may also have been motivated by our charitable intentions of sharing our medical facilities for the fight against Ebola. I doubt that there was much medical justification for this decision.
Fears and doubts among decision makers are not only reasonable, but they should be expected. There is a lot that can surprise us.
With that sense in mind, I want to express some observations I have from listening to the data about the recent Ebola cases in Dallas, Texas. The data I’m listening to are news reports instead of database records. But I’m assuming the news reports are summaries of valid data about what is happening. The following are some things I hear when I listen to this data.
We need the opportunity to discover a hypothesis that practices informed by our best theories may actually be making matters worse. We can never learn that lesson if we keep substituting theory-based explanations for otherwise mysterious transmissions. The theory itself may be the problem.
My earlier Ebola post … implied that [data science] participation is optional. With this post, I think the employment of big data predictive analytics is not optional. This disease will spread to affluent areas where people will learn of their degree of contact separation from the infected individual. We urgently need predictive analytics to inform these people of quantitatively verified risk of contracting the disease given that degree of contact separation.
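As a rough illustration of what a "quantitatively verified risk" by degree of contact separation could look like, here is a minimal sketch that tabulates the observed infection rate at each degree and attaches a Wilson score interval to show how uncertain that rate is. The function name `risk_by_degree`, the record layout, and the sample records are all assumptions made for illustration. The records are synthetic placeholders, not outbreak data, and a real analysis would need far more care about how contacts are traced and counted.

```python
from collections import defaultdict
from math import sqrt


def risk_by_degree(records, z=1.96):
    """Estimate infection risk per degree of contact separation.

    records: iterable of (degree, infected) pairs, where degree is an int
    (1 = direct contact, 2 = contact of a contact, ...) and infected is a bool.
    Returns {degree: (observed_rate, lower, upper)} with a Wilson score
    interval (z=1.96 corresponds to roughly 95% confidence).
    """
    totals = defaultdict(int)
    cases = defaultdict(int)
    for degree, infected in records:
        totals[degree] += 1
        if infected:
            cases[degree] += 1

    result = {}
    for degree in sorted(totals):
        n = totals[degree]
        p = cases[degree] / n
        denom = 1 + z * z / n
        centre = (p + z * z / (2 * n)) / denom
        half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
        result[degree] = (p, max(0.0, centre - half), min(1.0, centre + half))
    return result


# Purely synthetic placeholder records (hypothetical, not real outbreak data).
synthetic = [(1, True)] * 2 + [(1, False)] * 18 + [(2, False)] * 50

for degree, (rate, low, high) in risk_by_degree(synthetic).items():
    print(f"degree {degree}: observed rate {rate:.2f}, interval [{low:.2f}, {high:.2f}]")
```

The interval matters as much as the rate: with few traced contacts at a given degree, the honest answer to a worried person is a wide range, and the range narrows only as more observations arrive.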
We have the capacity to supply the necessary technologies to this region to begin collecting social media data. We just need the incentive that this data will provide valuable contributions to the fight against the Ebola epidemic and that the data science community is available to devote its efforts and put its reputation on the line to come up with big-data recommendations that really make a difference for an urgent and life-critical problem. First of all, we need data scientists to present solid proposals for how big data can help the management of the populations with the goal of improving our prospects for controlling the epidemic.
When we look to data technology to solve problems, we should permit the technologies to identify the problems that can be solved with the current capabilities instead of demanding that the technologies evolve to solve the hard problems we have been working on. There are many opportunities to make progress even if we don’t touch the hard problems. Allowing technology to solve what it can solve now may transform the hard problems to be narrower, or possibly even less visible. For example, there are other ways we can improve overall life expectancy without curing any cancers, perhaps with investments in areas unrelated to health care. It is our nature to focus on objectives that catch our attention. This focus can blind us to immediate opportunities that are realistic given our current situation.
Data deception is a concern for automated decision making based on data analytics (such as in my hypothetical dedomenocracy). I think it is already a concern with our current democracy. I fear the current enthusiasm for data technologies because I do not see much in the way of appreciation for the possibility of deception. There is a huge confidence in the combined power of large amounts of data and sophisticated statistical tools (such as machine learning). Missing from our consideration is how well the data actually captures the real world. The data is not necessarily an honest representation of what is happening in the real world. It is very possible that the data may include deliberate deception.
Dedomenocracy is a scaled-up version of modern data science practice using big data predictive analytics to automate decision making. As a data science project, there is a need to evaluate the data in terms of how closely it represents a fresh, unambiguous observation of the real world at a specific time instead of a reproduction of a past observation through model-generated dark data. Darker data involves some level of contamination with historic observations or with our interpretation of past observations. The problem with darker data is that its reliance on old and potentially outdated observations can discount more recent observations that can tell us something new and unexpected about the current circumstances of the world.
Government by data and urgency will operate very differently from the present governments. The focus shifts to immediate issues that can be informed by recent data. Unlike the present government with its accumulating perpetual laws, this new form of government exclusively enacts short-lived rules that get updated when new data becomes available or get retired when priorities change. Similarly, the government views the population in terms of future possibilities instead of past performance.