In an earlier post, I presented my idea of a taxonomy of data from the perspective of the data science labor required to scrutinize that data. I defined bright data to be observed data that is well documented and well controlled. Bright data is a gold standard against which to judge other kinds of data. I described radar data as an example of bright data and then showed that it is not immune from criticism.
This post considers another type of data that seems to meet a test for being well documented and well controlled: blood pressure measurements.
Based on my own experience, I have virtually no confidence at all in any blood pressure measurement. Conceptually, it is well documented and well controlled. There is a pressure cuff with a calibrated pressure gauge. There is a stethoscope picking up an unambiguous signal when the artery opens and closes. The process is well documented: measure the two numbers and enter them as good solid data.
Several years ago, my doctor recommended I keep track of my blood pressure on a daily basis. I did that and dutifully recorded the results. For a while, I even plotted it. It was so smooth. I was trying to lower it, and the plot did show a gradual decline of the mean. Great stuff.
During my next visit, I shared my little list of numbers and the doctor offered his thoughts. But then he measured it himself and the number was much higher. He said not to worry; it was probably the white coat effect. I pointed out that the nurse had measured a lower number earlier, and he suggested that was also not unusual. Ok.
I had been using the stethoscope for a while so I could experience the same thing the doctor experiences. He expressed some doubt that I really knew what I was doing, so he recommended an electronic version. I decided to give it a try. It was very convenient, needing just one tube instead of three. It uses the same sensor to measure both the pressure and the pressure pulses. It presented nice digital numbers that I could copy down. It even saved the recent measurements so I could scroll through them. What could be better?
After using this new meter for a while, my blood pressure started declining more rapidly. I recalled what the doctor had said about stress and imagined that the stress of setting up the stethoscope version had probably artificially inflated the pressure. But the values kept going down. It started measuring blood pressures that suggested I should be close to fainting. I switched back to my stethoscope version and it said I should still be concerned about a stroke. This was ridiculous.
I took the electronic version into the doctor’s office and we played a game where we’d try to measure at the same time, my toy on one arm and his on the other. He agreed my toy was measuring too low. I discarded it.
I went back to the stethoscope. He reassured me that the important thing was that I measure it the same way every time. That alone should be interesting. It is an admission that blood pressure is not a repeatable measurement. Different trained medical people can use the same mercury-column version with top quality stethoscopes and come up with different readings. I could come up with different readings on my own depending on what method I chose to use.
How could this be considered data at all? I may have to add a new entry to my taxonomy to cover this kind of data: biological data, organic data, squishy data.
It doesn’t take much exposure to medicine to realize how important blood pressure measurements are. The act of checking blood pressure almost takes the place of a handshake when greeting a health care professional. There are clear relationships of blood pressure with certain conditions and risks. And yet the measurements do not impress me as being in any way reliable.
Maybe I’d be happy if we’d just dismiss the numbers entirely. It would be safe to give it some qualitative designation such as low, normal, high, very high, and scary high. These would be the actual markings on the mercury tube: instead of numbers just have different bands of colors. Such a scale would be more honest about the precision of this particular measurement.
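The color-band idea can be sketched in a few lines of code. This is a minimal illustration of reporting a qualitative category instead of a falsely precise number; the band boundaries below are illustrative assumptions I made up for the sketch, not clinical thresholds.

```python
# Map a systolic/diastolic pair to a qualitative band instead of
# pretending the raw numbers carry more precision than they do.
# The cutoffs are illustrative assumptions, not medical guidance.
def pressure_band(systolic, diastolic):
    """Return one of the qualitative bands for a reading."""
    if systolic < 90 or diastolic < 60:
        return "low"
    if systolic < 120 and diastolic < 80:
        return "normal"
    if systolic < 140 and diastolic < 90:
        return "high"
    if systolic < 180 and diastolic < 120:
        return "very high"
    return "scary high"

print(pressure_band(118, 76))  # normal
print(pressure_band(150, 95))  # very high
```

A mercury tube marked with these colored bands would report exactly what this function reports, and nothing more.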
I suspect there is some psychological value to identifying and stating the numbers. Verbally stating 120 over 80 seems more authoritative and informative than “normal”, but all we can really say is that it is in the normal range.
Blood pressure numbers seem to be worthless in the context of a data store. I imagine my own collection of blood pressure measurements over several years (assuming I actually kept them all), including the various different meters I used plus the measurements from the doctor’s office. I’d put them into a kind of diary that included some observations such as what exercise I did, what I ate, how much sleep I got, etc. I would be tempted to use this information to see if there is some kind of pattern among these different observations. I can’t imagine that being anything but a complete waste of time.
I am not in the health care profession, but I am pretty certain that these measurements are used in data systems both to monitor the progress of individual patients and to compare outcomes across a population of patients. It is one of the vital signs. We expect it to tell us something, and probably something more than just whether we are dead or alive.
I imagine also the trends of using big data analysis to optimize health care delivery. Blood pressure will be a dimension included in most data queries. We are pretending that all of these measurements are directly equatable. That 120/80 means 120/80 when in fact it may mean 90/60 or 140/100 depending on who makes the measurement, what instrument they use, or whatever else is going on.
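The comparability problem can be made concrete with a small sketch. Here the same "true" pressure is read through instruments with different fixed biases; the bias values are hypothetical, chosen only to mirror the 90/60 and 140/100 readings mentioned above.

```python
# A hypothetical "true" reading, and hypothetical per-instrument
# biases (systolic, diastolic) in mmHg. The point: one underlying
# pressure yields three different recorded values, yet a data store
# would treat all three as directly equatable numbers.
TRUE_READING = (120, 80)

instruments = {
    "office mercury column": (0, 0),
    "worn-out home monitor": (-30, -20),
    "stressful clinic visit": (20, 20),
}

for name, (bias_sys, bias_dia) in instruments.items():
    sys_read = TRUE_READING[0] + bias_sys
    dia_read = TRUE_READING[1] + bias_dia
    print(f"{name}: {sys_read}/{dia_read}")
```

A query that treats 90/60, 120/80, and 140/100 as three different patients is really seeing one patient through three different instruments.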
We’re going to base medical decisions on that? We have always included blood pressure in making medical decisions, but those decisions were about an individual patient. A big data query over a large population of patients assumed to share the same readings seems dubious.
Perhaps I’m used to working with much more repeatable and consistent measurements, because it almost makes no sense at all to consider blood pressure as any kind of dimension for data. And yet it must be.
I pick on blood pressure because I had the opportunity to collect a lot of measurements of it over time. Blood pressure readings are virtually free. I think about more expensive tests such as blood tests. These occur far less frequently, so whatever numbers they report are single values that are considered valid for long periods of time. I would not be surprised if those values are just as prone to variation as blood pressure measurements. Blood test results at least are presented more honestly by supplementing the numbers with the range of normal values. The numbers are psychologically pleasing, but the real information is simply normal, high, or low.
In earlier posts, I expressed concern about using big data in health care, and in particular to use big data query results to make decisions of permitting or denying health care. My concerns are elevated when I consider how imprecise biological measurements can be. If I were to become a candidate for some expensive procedure that must be validated by some big data query analysis, how precise will the query be about comparing my measurements with other patients? Could it be possible that combined with all of the other measurements a blood pressure of 120/80 would come out to be approved but 128/86 be denied?
It seems pretty obvious that someone will recognize that such a difference is not significant. But what if they compared my fasting blood sugar numbers instead? Or my platelet counts? It seems there is a big risk that the dimensions for some or many measurements will be too narrow and will make distinctions when there really aren’t any. Even when we use broad categories (such as normal, high, and low), how well do we understand what is normal? How about combining multiple measurements: do we understand all of their normal ranges equally well? I’m skeptical.
The more I think about it, the less I trust any suggestion that we can leverage big data to optimize the allocation of health care resources based on biological measurements. Perhaps the technology is really a way to defend selective allocation of resources without any pretense of optimization. If we have to choose who gets care and who does not, a lottery disguised as a big data query might be preferable to human decisions that could be unduly and subjectively influenced.
I am not a professional in the field and even if I were, I probably would have to be specifically approved to have access to the details of these big data dimension designs and queries. Although I am not a health care professional, I am able to pose the question. I will probably never be allowed to learn the answer.
For today’s post, I primarily wanted to present blood pressure measurement as an example of dim data: an observation with problems in its documentation and control such that I’m not certain exactly what it is measuring. Stated that way, it seems wrong. We know what blood pressure measures; the process is very well documented and the measurement is controlled. The problem is that the measurements are so inconsistent. That inconsistency implies that something is not quite fully understood about the measurements, so I call it dim data.
Another source of dimness is the fact that it depends a lot on the particular patient and the particular provider. Consistency is lacking within a single combination of provider and patient, and also between different combinations of patients and providers.
Another source of dimness is the difference in technologies for the actual measurements. The doctor’s office uses high quality stethoscopes and real mercury columns. The hospital bed may use some kind of automated digital version. The patient uses some retail variant of either. The stethoscope versions use sound waves from collapsing and opening of blood vessels. The digital versions use pressure waves instead of sound. The technologies are not exactly comparable and more importantly the technology is not identified as part of the measurement. For example, instead of a reading of simply 120/80, I would rather see 120/80 measured digitally using pressure-wave method. (I shudder when imagining how messy this will get with all of the other biological measurements with their various available approaches).
My observation is that biological measurements are fundamentally different from mechanical measurements.
Biological measurements are fundamentally more fuzzy than physical measurements. I can almost visualize how I could tolerate a single dimension with some fuzziness. For example I could allow for some neighboring categories in my consideration. I haven’t even begun to try to imagine multiple fuzzy dimensions.
I can guess that big data analysis of biological measurements must make extensive use of some kind of fuzzy queries. At least I hope so even if I can’t quite imagine what they would look like.
Allow me to assume that medical big data uses fuzzy queries that are fundamentally different from the sharply defined queries used in other big data systems. If that is the case, then it seems unfair to expect that the successes of sharply defined non-biological queries will be enjoyed by fuzzy queries of biological data. I would expect fuzzy answers from fuzzy queries.
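What a fuzzy query might look like can be sketched with membership functions. Instead of a hard cutoff, each reading gets a degree of membership in each category; the breakpoints below are illustrative assumptions, not clinical thresholds.

```python
# Trapezoidal membership: 0 at or below a, ramping up to 1 on [b, c],
# ramping back to 0 at or above d.
def trapezoid(x, a, b, c, d):
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Degrees of membership of a systolic reading in two overlapping,
# made-up categories. A sharp query at a 120 cutoff would force
# every reading into exactly one bucket; a fuzzy query does not.
def memberships(systolic):
    return {
        "normal": trapezoid(systolic, 85, 95, 115, 130),
        "high": trapezoid(systolic, 115, 130, 155, 170),
    }

print(memberships(122))  # partly "normal", partly "high"
```

A reading of 122 comes out partly "normal" and partly "high", which is a more honest summary of an inconsistent measurement than a single hard category.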
Fuzzy is a cute word and probably not accurate. But I think it is important to separate biological data systems from physical data systems. There are biological queries and there are non-biological queries. They may be fundamentally incomparable data sciences.