Automation needs data analyst monitors

In earlier posts, I divided knowledge or science into 3 categories: past, present, and future.

Past science or history is the study of remnant data from past events to better understand the relationships of the prevailing conditions with the observed consequences. Past science has no access to fresh observations, so it must concentrate on recovering the data of contemporaneous observations and then make the best use of those observations. I generally strive to work in this area, though the past in question could be as recent as a few minutes ago. The point is to understand what has already happened, instead of trying to understand (or solve) what is currently happening.

Present science is the applied science on current operations of some system. Operators responsible for managing current conditions generally have training in the engineering or science of their jobs, but are expected to keep operations within certain parameters. Operators are expected to solve problems but only as long as those problems affect current operations. Often problems resolve themselves. In many other cases, the operator acts in ways that solve the problem, but either he doesn’t know why or he acted in a way contrary to his training, relying instead on some instinct or intuition. In contrast to the historical scientist, the operational scientist is not burdened with the task of explaining why the past problem occurred or how it was resolved. Instead, the operational scientist’s attention remains on the immediate task of managing the current state of his system.

An illustration of a present science occurs when driving into traffic where you need to suddenly brake or swerve to avoid an unexpected encounter (with a road hazard, another car, a pedestrian, or an animal).    The operator succeeds when he successfully evades the hazard, and in so doing will continue to drive ahead remaining alert to oncoming potential for new hazards.   He leaves behind some unanswered questions of what he could have done to have avoided the evasive maneuver in the first place, and why his evasive maneuver worked.   On the latter question it could have been a direct application of the science of his training, or it may have been by luck, or by some intuition that was contrary to his training.   My point is that the operator is not burdened to explain his past accomplishments because his job has moved on to navigate through the currently presented potential of hazards.

The same person may later revert to a past scientist at the end of his operational duty as he reflects on his experience through a process of self improvement or a more formal documentation of the events for others to study.   When I divide science into 3 time categories, I’m referring to the activity itself, not the person.   People could be performing all three sciences at the same time, but when they do, they are multitasking at least 3 different topics.    Generally speaking, the operator is not performing past science on his recent experience especially if the condition is fully resolved.   Doing so could be a distraction that can lead to confusing the past with the present.   We don’t want the operator responding to some current mitigation effort while simultaneously trying to understand some now irrelevant condition.

The third type of science concerns the future.  Note that I consider prediction or extrapolation to be aspects of present science: using currently available observations to act consistently with the science of what will happen next.   Instead, future science involves dealing with the unknowable future observations.   In earlier posts, I identified this type of science as the arts of rhetoric and logic (both formal and informal) to persuade a coalition of decision makers to buy into some change in operations.   This is the hardest and least understood science of the three.

My work experience involves all three of these activities. I primarily focus on the historical science of explaining what happened in the past. Typically, the tasks involve the very recent past: what happened at most a few weeks ago, or as recently as a few minutes ago. I distinguish this from operational science because either the problem is already resolved or I have at best an advisory role for those who are more operationally involved. Through my conclusions, I am also involved with the persuasion science but, as with operations, my role is advising managers who are tasked with persuading decision makers.

I work with data artifacts of what happened in the past.   Despite the best of efforts, the remnant data of some event is a tiny fraction of the data that was available (and acted upon) during the time of the event.   When data is available, it is a summary statistic of a large number of observations that often I wish I could still access.

Often, there are measurements that I know existed and were known at the time, but are now irrecoverable, leaving me to conjecture what those could be. I described such conjectured data points as dark data: manufactured data points based on mathematically modeled interpolations of other data points. While better than no data at all, dark data is a poor substitute for actual observations, especially if operators considered those observations in their actions.
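
To make the distinction concrete, here is a minimal sketch of how an analyst might keep dark data from blending into bright data. It assumes simple linear interpolation between the nearest surviving observations, which is only one of many possible models, and the names (Point, fill_gaps) and the sample values are hypothetical, invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    t: float        # time of the reading
    value: float    # measured or modeled value
    observed: bool  # True = bright (actual) data, False = dark (modeled) data


def fill_gaps(observations, times):
    """Return a Point for each requested time.

    Times with a surviving observation are passed through as bright data.
    Times between two observations are linearly interpolated and labeled dark.
    Times outside the observed range are skipped: this sketch does not extrapolate.
    """
    known_times = sorted(observations)
    series = []
    for t in times:
        if t in observations:
            series.append(Point(t, observations[t], True))
            continue
        earlier = [k for k in known_times if k < t]
        later = [k for k in known_times if k > t]
        if not earlier or not later:
            continue  # leave the gap rather than invent a value
        t0, t1 = earlier[-1], later[0]
        v0, v1 = observations[t0], observations[t1]
        modeled = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        series.append(Point(t, modeled, False))
    return series


# Hypothetical readings: the measurement at t=1 was never retained.
series = fill_gaps({0: 10.0, 2: 14.0, 3: 15.0}, [0, 1, 2, 3, 4])
for p in series:
    print(p.t, p.value, "bright" if p.observed else "dark")
```

The design choice that matters is the observed flag: the interpolated value at t=1 stays usable, but nothing downstream can mistake it for something an operator actually saw.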

This is the problem of historians piecing together some historical event based on letters, proclamations, or physical evidence on the ground. They do the best they can, but they will never be able to reconstruct the moment-by-moment experiences of the key actors at the time, where otherwise inexplicable actions may be fully explained by experiences that never got recorded in any recoverable form. The problem is that we don’t know if such an observation existed, so we can never distinguish momentary incompetence from extreme competence acting on bad information.

Until recently, human history was a record of the consequences of human operators. As noted, we may never be able to reconstruct all of the information presented to those operators, but we often have good confidence about the information they had available. This confidence is the consequence of the operators being humans: their communications would either involve records or eyewitnesses.

However, recent history has introduced automated operators, machines doing the work we previously trusted only to humans. Machinery has always automated human operation, but until recently this automation was of a consistently repetitive nature such as weaving a cloth: the decision of each weave is automated but with very low risk of mistake. Even then there could be the problem of a misshapen thread or a skipped weave, and I can imagine this is something that merits a historical review, but it is hard to justify the expense given the goal of productivity.

Modern society benefits from productivity measured as increased product output for a decreased amount of labor. In prior posts, I often advocated for extensive collection and retention of actual observations (bright data) so as to minimize modeled (dark) data. When tasked to study some topic, I prefer observations of what actually happened over what science claims must have happened based on other data. Unfortunately for me, the demand for productivity necessitates never funding the analysis of data from highly reliable processes, and as a result there is no justification for collecting and preserving the underlying observations.

If I were involved in the textile industry, I would dream of some future of having a record of each state involved in every weave of every cloth. Even if that were possible, who is going to pay for the analyst to study that data for some insight that no one is asking for? The process is acceptably productive and reliable.

In my work, I often pursue some question that was only briefly suggested and then dismissed as unnecessary to pursue. My discomfort in not being able to answer the question motivates me to pursue an explanation.

This is seeking an explanation at the edge of what is considered relevant for human understanding.   As our automation progresses, that edge of relevance to human awareness keeps creeping away from the physical world.   As that happens, we lose interest in collecting data that goes into the automated operation, including the abandonment of previous efforts to do so.   Even when such data collection remains affordable, it is not affordable to invest in the effort of analysis.   To avoid the presumably unproductive attempt at analysis, we halt the collection of the data to analyze.

The trend to ever higher productivity through automation has the side effect of making humans more ignorant about the conditions of the actual world that the automation must navigate.   The automation reliably does its job, and part of its job is to avoid hazards or to effectively resolve the consequences of encountering them.   If sufficiently reliable, there is no need to record that the hazard ever existed in the first place, or how the remediation operated.

An example is my water heater. It has to operate in the basement through all the seasons by applying additional heat depending on the temperature of the tank. It works so reliably that there is no attempt to record each time the heat is turned on or off and what the conditions were inside and outside of the tank. As I write this, thinking about the specifics of my house and its environment, this would be interesting data to study, especially over a span of many years. I suspect I would learn something new if I were to study that data, maybe not about the water heater itself, but maybe about the house or about my activity within it. I don’t bother setting up such a measurement collection and retention system. Most people would not even think about this, but I’m sure there are people who actually do this. (I dare not search on this topic, because doing so would probably reveal actual products that specifically do this!) I bring it up as an illustration of the isolation of human attention from the physical world: attention is needed to provide for the comfort of the human, but this attention is entirely handled by a thermostatic mechanism (not to mention the infrastructures providing steady sources of water and fuel).

The automation needs to pay attention to the physical world.   The success of that automation depends on spending that attention to actual observations of the physical world.   When sufficiently successful, the automation frees the human from having to pay any attention to those observations.

Humans evolved to negotiate survival in the physical world through efficient collection and interpretation of observations about that same world.    Adopting automation gradually frees us of this burden, but it has the consequence of making us ignorant of the aspects of the physical world that the automation must deal with.

In the place of having to navigate through the hazards of the physical world, humans have to adapt to navigate through the hazards of the automated world.   There is a qualitative difference between the two.

For the most part, the hazards of the physical world have no inherent bias toward human comfort or misery. Our ancestors navigated through an uncaring natural world. While that world did not weep at our misfortune, it also offered no resistance to our successes. We have exploited the latter to great fortune, and will continue to exploit it for the foreseeable future.

The future exploitation will increasingly be done through proxy by the automation we create.   The automation will exploit the physical world’s indifference to our plans, isolating the humans from the experience of the details of that exploitation.

Automation takes the place of the indifferent hazards of the physical world, but automation is a human creation, and as such it is not indifferent to the specific joys and sufferings of humans. Humans must now adapt to an automated world that specifically cares, one way or the other, about the consequences to humans.

Given the narrative of Darwinian evolution, it is questionable whether human behavior will successfully transition to this new type of world. We evolved to survive and thrive in a physical world that is indifferent to our needs. We are not evolved to do likewise in an automated world that specifically acts in full consideration of our needs and weaknesses. That automated world acts on the physical world on our behalf, but it will use information denied to us in order to make decisions for some objective that will benefit some and harm others.

There will be at least some people who will have to survive the automation’s threat of harm, a threat that arises as a consequence of optimizing some human-defined objective given that automation’s hidden access to information about the physical world. Those people in the risk zone of automation will need to negotiate directly with the automated world instead of the physical world.

Do people have this capacity?  Darwinian evolution does not give me much confidence.   Maybe we’ll luck out in that a few people have the necessary mutations to defend themselves against automation, but that will benefit only their descendants.   The rest of us will need to take our chances, hoping that we never encounter the threat personally.

The present automated economy places a priority on the operations of systems (the present tense science) at the expense of the historical sciences, except for the specific cases of some dramatic catastrophe that will inform some future persuasion. The need for ever-increasing productivity of humans necessarily devalues any study of the history of things that did not go wrong.

In the past, we had to experience the world for ourselves so that at a minimum our personal experiences directly contributed to a historical record we could contemplate at our leisure.   This gave us the freedom to learn from our experiences and most of those experiences were inconsequential or benign.   I frequently recall experiences as trivial as the smell of seashore air on a particular day several years ago when the wind and atmospheric conditions were just right for that to happen.   This was not recorded anywhere else but in the brains of people who were there at the time and who actually recognized it as unexpected.   There are many more consequential observations in my memory, such as remembering balancing myself on a bicycle after hitting an unexpected obstacle.

Increasingly, we delegate those consequential experiences to our automation, never to experience them ourselves, and never bothering to even record them.   Consequently, we will not have the luxury of reviewing those direct experiences, especially when there were no adverse consequences.    The loss to us is the recognition that hazards do exist even though they are effectively automated around.   That recognition could remind us that there may be a future hazard that the automation would fail to overcome either due to its magnitude or due to faulty data it had to work with.    We as humans will be especially ignorant of the presence of faulty data before a catastrophe occurs.

As I mentioned above, I like to limit my work to historical data, data that is no longer operationally relevant.   I maintain my employment because I occasionally find something from the historical data that is useful either for the people more involved in the operations or those involved in future planning.

In this work, I am eager to look at all available data. Even when tasked with solving a particular failure condition, I will seek out as much data as I can get so that I can compare the failed condition with the otherwise ignored successful condition. Sometimes, I can propose future data collections to support future analysis, but most of the time I have to work with existing data. Existing available data is often frustrating in its lack of the observations that can directly explain the specific event. If data is available, it is often a statistical summary, leaving me to estimate the probability that the summarized population contained an observation that could explain the failure. More often, the observation known to have been available at the time was never recorded in any form.
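
As a rough illustration of that kind of estimate, suppose only a mean and a standard deviation survive for some measurement. The sketch below gives two answers to "how likely was a reading at or above some threshold?": one assuming the readings were normally distributed (a strong assumption that real data often violates), and one using Cantelli's one-sided inequality, which assumes nothing about the distribution and is correspondingly loose. The function names and all numbers are hypothetical.

```python
import math


def normal_tail(mean, std, threshold):
    """P(X >= threshold), ASSUMING X is normally distributed."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))


def cantelli_bound(mean, std, threshold):
    """Distribution-free upper bound on P(X >= threshold) (Cantelli's inequality)."""
    k = (threshold - mean) / std
    if k <= 0:
        return 1.0  # the bound is uninformative at or below the mean
    return 1.0 / (1.0 + k * k)


def chance_of_at_least_one(p, n):
    """Chance that at least one of n readings met the condition, ASSUMING independence."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical summary: 1000 readings, mean 50, standard deviation 5.
# Question: could a reading of 65 or more have been among them?
p_normal = normal_tail(50.0, 5.0, 65.0)
p_bound = cantelli_bound(50.0, 5.0, 65.0)
p_any = chance_of_at_least_one(p_normal, 1000)
print(f"normal model: {p_normal:.5f}, Cantelli bound: {p_bound:.2f}, any of 1000: {p_any:.2f}")
```

The gap between the two per-reading figures is itself the point: once the raw observations are gone, the answer depends heavily on which model the analyst is willing to assume.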

When data is available, sometimes I will have two independent measurements but from different sensors.   Often there are attempts to consolidate these into a single measurement of a single version of truth, but I fight against that consolidation.   I want to work with multiple versions of the truth.   Even if one measurement is obviously wrong, it is not obviously irrelevant.   The existence of the measurement implies that something used that information, or at least could have used it.
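
A minimal sketch of what resisting that consolidation could look like in practice: every reading is stored under its own sensor, and disagreement is reported rather than averaged away. The class name, sensor names, and values are hypothetical.

```python
from collections import defaultdict


class ReadingLog:
    """Keeps every sensor's reading per timestamp instead of one merged value."""

    def __init__(self):
        # timestamp -> {sensor_id: value}
        self._readings = defaultdict(dict)

    def record(self, timestamp, sensor_id, value):
        self._readings[timestamp][sensor_id] = value

    def versions(self, timestamp):
        """All versions of the truth recorded at this timestamp."""
        return dict(self._readings.get(timestamp, {}))

    def disagreements(self, tolerance):
        """Timestamps where the recorded readings differ by more than tolerance."""
        flagged = []
        for ts in sorted(self._readings):
            values = self._readings[ts]
            if len(values) > 1:
                spread = max(values.values()) - min(values.values())
                if spread > tolerance:
                    flagged.append((ts, dict(values)))
        return flagged


# Hypothetical readings from two sensors measuring the same quantity.
log = ReadingLog()
log.record(1, "sensor_a", 20.0)
log.record(1, "sensor_b", 20.3)
log.record(2, "sensor_a", 20.1)
log.record(2, "sensor_b", 74.5)  # obviously wrong, but not obviously irrelevant

conflicts = log.disagreements(tolerance=1.0)
print(conflicts)
```

Nothing here decides which sensor is right. The implausible reading stays in the record so that a later analyst can ask what, if anything, acted on it.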

My desire to know this erroneous or conflicting observation goes beyond an objective search for scientific proof. I desire the recreation of experiencing the event for the first time, as if I were there at the time. I know that if I were present at the time, I very well could have been confronted with the bad measurement with no access to the good one. Deep down, I want to experience that, and I want to experience my attempts to react to that information.

I suppose the desire to live in the data preserved in our records marks the true aptitude of a historian.  Historians seek to understand the rationality of the person who actually made the decision.   This is fine for the history of men.

There is now a new field of historians who seek to understand the rationality of automation that made some decision, or that may make some future decision.   This characterizes the same task presented to modern men in general.   Darwinian survival of the fittest will select those of us who can evade the hazards of automation that seeks to benefit someone else while hiding the physical world observations it alone has access to.

The new breed of men needs to effectively live in the past instead of the future. We need to demand an indelible record of observations we cannot experience directly. We need to be able to relive that record to understand what led up (or could lead up) to the automation’s threat to our individual person. We need to then act directly upon the automation (instead of nature) to defuse that threat.

Threats in the natural world are largely indifferent to our joys or pains, so it is sufficient to dodge the hazards or grab the opportunities. Threats in the automated world have a direct interest in human well-being that just may happen to threaten us individually. Automation won’t be so easily distracted by our flee, fight, or freeze reactions because it will have some direct animus toward us individually. We need to be able to neutralize that animus.

Assuming that the automation is intended to be beneficial to us, the most likely cause of that animus is faulty data. As a result, to protect ourselves, we need to invest in the labor to seek out and remove faulty data.

This is what historical sciences do. We encounter data points, perhaps of completely different types of observations, that cannot simultaneously be true. Sometimes, we can recognize the ridiculousness of one value so we can confidently discard it. Other times, we can’t, so we either discard all of the conflicting observations or at least quarantine them as unreliable.

This is one of the things I do in my job. The people responsible for operations will react appropriately to some observation, but that appropriate reaction can have detrimental consequences. If I am able to find proof that the observation is incorrect or that the condition being observed is contrary to what is expected, I can provide this information to operations and they will revise their actions, avoiding the unnecessary action that could have costly consequences.

By evidence of my continued welcome on the team, I must be providing a valuable service.   But in some sense, I am an excess expense in terms of operations.   Operations could proceed without my assistance.   The team would be more productive (minus my labor) even though occasionally there will be some casualties.

Based on actual experience, these casualties can be manageable through the persuasive arts of management.   There is no way to undo the damage, the actions were justified by the observations, and the overall process is over the long term highly productive and cost effective.

Built into the modern system is a tolerance for someone getting hurt once in a while.  Sure, there may be some kind of compensation, but rarely if ever will that compensation undo the damage.   Consequently, individuals have to develop a new form of self-defense, when they have little at their disposal to defend against remote automation dead-set on operating in the human’s stated objectives given the available information.

The alternative is the role I have: someone who can access the observations available to operations and can inform operations of errors in the data or of new ways to interpret or react to the data. As noted, this is an added burden that subtracts from productivity, but in return it increases protection against adverse actions that could harm some population. That population may otherwise be small enough that its unintentional losses could be defended after the fact as both within acceptable cost for the productivity return and fully justified by the best knowledge at the time of implementation or the best data at the time of the event.

This discussion relates to thoughts I was having about the Boeing 737 Max crashes that appear to have been caused by the MCAS unit.   In the cockpit there were two human pilots and multiple automated pilots where MCAS was the latest addition.   The two human pilots were fully employed in their task of operating the aircraft.   They were assisted by invisible pilots in the automation.

At some point, it appears, the human pilots were fighting with the automation.   Both sets were operating under the best of intentions but they were working with different data inputs.

In particular, it appears that the MCAS was operating on sensor information that was not even available to the pilot: the Angle of Attack (AoA) sensor. This sensor is optional to display to the pilot and generally irrelevant to human piloting, since pilots can adequately control the aircraft with the other information they have, implicitly acting consistently with AoA requirements without the need for the direct numeric value.

There was a good reason to not present the AoA measurement to the pilots. The numeric values are hard to interpret and do not really add any value to their effectiveness as pilots. The problem is that the sensor measurements do exist, and some automation is using that information. Clearly it is a burden for the pilots to monitor sensors that don’t add value to their operation of the aircraft. My complaint is that if sensory information is provided to some system in the aircraft, someone needs to pay attention to the current trustworthiness and credibility of the measurement. In one of the crashes, there were, in retrospect, easily recognizable reasons to doubt the credibility of the AoA measurement. A simple action to deactivate the sensor itself would have blinded the MCAS to the information it was incorrectly responding to.

There could be a third seat in the cockpit, one devoted to monitoring the credibility of all of the sensors and tasked to initiate procedures to quarantine a suspect sensor and mitigate its loss. There used to be a third seat in the cockpit, one devoted to navigation but effectively at the time devoted to monitoring the data feeds to the pilots. This seat was removed from most flights, possibly because improvements made navigation something the two pilots can manage themselves. I believe this decision was wrong. That third seat is needed for more than just path navigation. There is a need for data stewardship not distracted by operational concerns but paying attention to the health and trustworthiness of the sensors. I believe this became an absolute requirement at the point when sensors like the AoA sensor were introduced for the sole use of automation and completely unavailable to the pilot.

Automation is needed for operational productivity, but it adds new labor burdens on humans, whose incentive of self-protection from automation drives them to pay attention to the credibility of the sensors in terms of correspondence to what they are supposedly measuring. The problem might be solved by live streaming of sensor observations of the physical world back to some operations control center where a small staff can monitor all currently operating systems. This would require a data network to handle the data traffic in a timely manner, but I believe this is the right solution if it is feasible. Such a center would validate each sensor’s information against all information about the aircraft’s environment, and it would have the means to directly initiate the process to remove the bad sensor and mitigate its subsequent absence. Such a system also would allow the periodic review of more detailed sensor data collected from all operations to find anomalies that would identify the misbehaving automation, flagging it for engineering or certification review.
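
To give a sense of the simplest check such a center might run, here is a sketch that compares redundant readings of the same quantity and names the sensors a human should look at. It is an assumption-laden illustration, not a description of how any aircraft or avionics system works: the function name, sensor names, readings, and tolerances are all hypothetical.

```python
from statistics import median


def suspect_sensors(readings, tolerance):
    """Return the names of sensors whose readings deserve human scrutiny.

    readings: dict mapping sensor name -> value, all measuring the same
    quantity at the same moment.
    With three or more sensors, any reading farther than tolerance from
    the median is suspect. With exactly two, a disagreement cannot say
    which one is wrong, so both are flagged. A single sensor has nothing
    to be compared against, so nothing is flagged.
    """
    if len(readings) < 2:
        return set()
    values = list(readings.values())
    if len(readings) == 2:
        a, b = values
        return set(readings) if abs(a - b) > tolerance else set()
    mid = median(values)
    return {name for name, value in readings.items() if abs(value - mid) > tolerance}


# Hypothetical readings of one quantity from redundant sensors.
two_way = suspect_sensors({"left": 4.8, "right": 22.0}, tolerance=5.0)
three_way = suspect_sensors({"a": 5.0, "b": 5.4, "c": 21.0}, tolerance=2.0)
print(two_way, three_way)
```

The single-sensor case returning nothing is the uncomfortable one: automation fed by one unmonitored sensor gives this kind of check nothing to work with, which is where comparison against other information about the environment would have to come in.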

