This is a continuation of contrasting my experience with mechanical data with what it must be like to work with medical data in a big data context. My last post convinced me that medical, or more precisely biological, measurements are imprecise and inconsistent. They might be useful when categorized into broad ranges such as normal, low, or high, but even then I suspect the boundaries of the categories are not precise. There appears to be a lot of fuzziness in the data. I have a hard time believing that we can get any usable value out of big data approaches when all of the information involved is so vague.
This post looks at the body mass index (BMI) as another measurement that is an indicator of health, although a controversial one.
BMI is the ratio of mass to the square of height. It is very well defined and it is very well controlled. Unlike the earlier examples of medical measurements, it is remarkably precise and consistent. This measure comes out exactly the same no matter who measures it. An individual’s BMI is remarkably stable over time, or if it changes we are easily convinced we know why. This is great data to add a dimension to the multidimensional characterization of an individual. I trust this data. This data is so good numerically that there is little need to lump the values into categories, especially after rounding to the nearest integer value. There is a limited range of integers that covers nearly all individuals, and the integer measurements are highly repeatable.
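As a minimal sketch of the calculation, assuming the standard definition of BMI as mass in kilograms divided by the square of height in metres: the function names `bmi` and `bmi_integer` are illustrative and not taken from any library.

```python
def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index: mass in kilograms divided by height in metres, squared."""
    if mass_kg <= 0 or height_m <= 0:
        raise ValueError("mass and height must be positive")
    return mass_kg / height_m ** 2


def bmi_integer(mass_kg: float, height_m: float) -> int:
    """BMI rounded to the nearest integer, the form used here as a query dimension.

    Python's round() sends exact halves to the nearest even integer.
    """
    return round(bmi(mass_kg, height_m))


# Example: 70 kg at 1.75 m
print(f"{bmi(70.0, 1.75):.2f}", bmi_integer(70.0, 1.75))  # 22.86 23
```

Two people measuring the same mass and height get the same integer, which is the repeatability that makes the number attractive as a dimension.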
The first controversy with BMI is the assignment into categories. From a data query perspective, I don’t see why we can’t just work with the integer values instead of broader categories. Different queries using different combinations of dimensions or measures may suggest differing assessments of what category the BMI belongs to. Instead we made the choice up front that the number immediately be assigned a category. The broad category is more important than the precise number.
In my opinion, the controversy comes with the naming of the categories. In my last post, I suggested the imprecise measurements be replaced with broad categories of normal, above normal, below normal, etc. This does not work with BMI because normal is hard to define. Cultural and economic factors influence the distribution of BMI in a population so that a common value may actually be unhealthy. We choose instead to impose a value judgement on the naming of categories as under-weight, normal-weight, over-weight, obese, etc. It is possible that normal weight may actually be relatively rare and yet it gets its name from what we presume should be normal. The category effectively presents an immediate diagnosis of the patient instead of being a merely descriptive dimension for an individual.
The problem is made worse by the choice of boundaries between the different categories. We impose the boundaries based on prior theories about what is healthy or unhealthy. In my data science taxonomy, dark data is data that is derived from a model instead of measurement. Although the BMI number is a clean observation that I would call bright data, the replacement of the BMI number with the BMI category turns bright data into dark data. The boundaries between BMI categories are based on models of what we consider to be healthy or not. We end up feeding model-generated data into the data store.
The model data biases the value of the data for doing hypothesis discovery. We essentially have a dimension that includes categories predetermined to be named “unhealthy” or “healthy”. In my imagined future health care world where big data is used to decide whether to allow or deny expensive procedures, the query will work with dimensions that already announce the patient as unhealthy. While it may be true that all patients who are in the overweight category may be at a disadvantage in a procedure, that disadvantage may not exist for people with specific BMI values in that range (for example, a BMI of 27, near the mid-point of the overweight category, may not be as unhealthy as 26 or 28).
When working with data for doing multidimensional analysis I prefer high quality observed data (BMI numbers qualify) to model-derived data that predetermine an assessment (such as BMI categories). I’m almost certain that there are conditions and procedures where a set of BMI values may be beneficial while that same set may be harmful for others. Being underweight may offer some advantages in terms of avoiding certain types of diseases, but it could be a disadvantage when trying to recover from an existing disease. Although this example uses the BMI category names, I suspect there are finer distinctions within each category based on the precise number.
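To make the bright-to-dark conversion concrete, here is a small sketch. It assumes the widely used adult cutoffs of 18.5, 25, and 30, which this post does not itself state; the names `CUTOFFS` and `bmi_category` are hypothetical.

```python
# Assumed cutoffs (commonly cited adult ranges); they are a model choice, not a measurement.
CUTOFFS = [
    (18.5, "under-weight"),
    (25.0, "normal-weight"),
    (30.0, "over-weight"),
]


def bmi_category(bmi_value: float) -> str:
    """Replace the observed number with a model-derived label."""
    for upper_bound, name in CUTOFFS:
        if bmi_value < upper_bound:
            return name
    return "obese"


for value in (24, 25, 26, 27, 28, 29):
    print(value, bmi_category(value))
```

Every value from 25 through 29 comes back with the same label, so any difference between, say, 26 and 28 is no longer available to a later query. Moving a cutoff changes the stored data without any new observation being made.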
BMI presents a precise number over limited ranges of integers. I would prefer to just use the number instead of giving it a diagnostic name.
I like BMI as a data point for use in queries because it is easy to measure and the measure is so easily repeated no matter who performs the measurement or when it is performed. However, I fully recognize that it alone may not have very much diagnostic value. I am aware that mass can come in different forms: water, lean muscle, or fat. BMI cannot distinguish between these. The health concerns appear to center on fat, but some forms and locations of fat are more benign than others. BMI as a bulk measurement alone offers little diagnostic value.
BMI does become useful during a doctor’s visit. The doctor can use the BMI chart to point out where the patient’s ideal weight should be. However, the doctor initiates this talk about BMI after adding his own observations of the patient. A glance at the patient should clarify the type and placement of the body mass in question. Combining that visual observation with BMI results in a very useful assessment of health risk. I concede that there is room for debate whether that visual information is enough, but my point is that BMI as one of many dimensions can be useful for determining risks. The doctor can talk objectively about the recommendation using a chart instead of bringing up a subjective observation about the patient’s appearance.
Allow me to distinguish two uses of BMI. One is to apply it to a health assessment of a specific individual, as in the examples I’ve been suggesting. The other is to apply BMI to groups of individuals for aggregate analysis that suggests general trends for groups instead of for individuals.
Large scale decision making involves working with aggregates of data: summary information about groups of individuals. The categories allow us to place the individuals into one group or another so that the measures (health risks or outcomes) can be compared. There are going to be multiple dimensions where each has its own set of categories. If BMI numbers are a dimension, it will only be one among many other unrelated dimensions.
Performing multidimensional analysis on large aggregates of large populations can suggest statistical improvements for the group as a whole. For example, a particular group of various health conditions but with a particular BMI may have fewer risks or better outcomes if the group as a whole reduces the average BMI. This does not mean that a particular individual in that group will benefit from this change. But the only way to get the group’s BMI score to decrease is to get the individuals to decrease their own BMI.
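The mechanics of grouping by a dimension and comparing a measure can be sketched as follows. The records are invented purely to show how the query works and say nothing about real outcomes; the helper `rate_by` is hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (integer BMI, category label, outcome flag).
# Invented for illustration only; not real data.
records = [
    (26, "over-weight", 1),
    (26, "over-weight", 1),
    (27, "over-weight", 0),
    (27, "over-weight", 0),
    (28, "over-weight", 1),
    (28, "over-weight", 0),
]


def rate_by(rows, key_index):
    """Average the outcome flag within each group defined by one dimension."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of flags, row count]
    for row in rows:
        bucket = totals[row[key_index]]
        bucket[0] += row[2]
        bucket[1] += 1
    return {key: flag_sum / count for key, (flag_sum, count) in totals.items()}


print(rate_by(records, 1))  # grouped by category label
print(rate_by(records, 0))  # grouped by integer BMI
```

With these made-up rows the category query returns a single rate of 0.5, while the integer query returns 1.0, 0.0, and 0.5 for 26, 27, and 28. The aggregate is the same data either way; the choice of dimension decides which distinctions the query is able to see.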
It is like telling a patient that they should reduce their BMI so that their group has better health outcomes even if it may not benefit the patient. This is very similar to the arguments we made about tobacco use. Some individuals can be heavy tobacco users their entire lives and die at an old age from causes having nothing to do with tobacco. However, getting everyone to reduce or stop tobacco use will reduce incidences or severity of several diseases that may confront the group as a whole. We encourage people to stop tobacco for the aggregate good, not the individual one. BMI may work the same way. Changing one patient’s BMI may not have any benefit for the patient, but it will bring down the group’s BMI and that is expected to provide an advantage to the group.
I need to put an asterisk on that last point. Data science is the science of historical data that suggests new hypotheses. The hypotheses may be wrong. In this case the hypothesis is that reducing BMI for a particular group may have a beneficial outcome. That hypothesis needs to be tested by actually doing that reduction. That test may fail. In effect, we are asking the patients to participate in an experiment to test hypotheses suggested by big data queries. It doesn’t assure that those hypotheses will actually deliver their promises.
Back to my point about using a dimension for aggregates. The motivation for querying aggregates is to inform broad policy making. An example may be whether a particular medical procedure is effective for a certain broad group of individuals. Some of the thinking behind health care reform seems to be that we can find cost savings by fine tuning access to procedures at the aggregate level instead of the individual level.
Part of the problem of increasing health care costs is that there is no practical way to deny access to certain procedures at an individual level. Having an aggregate-level analysis show the advisability of denying care based on combinations of conditions that place a patient in a certain group is just like the doctor pointing at the BMI chart to tell the patient his optimal weight. The benefit is that it provides a seemingly objective alternative to the otherwise subjective observation (the patient appears too fat).
I think it is an open question whether this big-data approach will actually work in practice. Cost savings in health care will eventually require denying access to certain procedures to certain patients. When that denial occurs it will affect a particular patient who is likely to be aware of the procedure’s relevance to his condition. The patient will know his care is denied. The patient will contest the decision.
Our experience with BMI should tell us that much. There are examples of people at all BMI levels who contest the diagnostic element of BMI. People with high BMI may feel (often with some justification) that they are healthy. People with normal BMI may feel they are unhealthy (again with some justification). The entire debate is being aired publicly. The popular opinion tends to treat BMI as only a rough guide and not very useful for a health policy.
In many ways, we came up with the different categories of overweight, obese, or underweight through a type of big data analysis of aggregates where BMI is a dimension. We saw trends that certain problems were more prevalent in some ranges. This is exactly the same project we are considering for all of health care. We are going to make decisions that impact individual patients based on aggregate statistics.
We should expect that the patients will contest those decisions. We should expect those debates to be aired publicly. We should expect the public’s sentiment eventually to consider any big data result as a mere suggestion and that individual decisions should still be made individually. Ultimately, we should expect the entire project to fail to provide any justification for denying care. We may not be able to leverage big data solutions to provide cost saving policies.
In terms of BMI, a common argument is that someone in a particular BMI group isn’t like the other people in that group. Someone with a high BMI may have fat distributed in safe locations or may be physically active. From a data science point of view, the individual is objecting to his membership in the group. He demands to be placed in a different group. As we add dimensions to define groups of individuals, it is more likely that an individual will find more reasons to distinguish himself from the other members of the group. These are likely to be debated publicly and even in courtrooms. The patient contesting denied access to care will have the advantage by being able to show all of the ways he is different from the others he was grouped with.
BMI is an interesting example to explore because it is so ideal as a dimension to use for big data analysis. BMI is easy to measure precisely and every measurement is highly repeatable. For doing historical data analysis, I much prefer to work with a dimension like BMI numbers than an imprecise and unrepeatable one like blood pressure. I disagree with replacing the very reliable direct numeric BMI with the categories that prejudice the information with old ideas of healthy and unhealthy. This substitution replaces a solid observation with a model-based value: it replaces bright data with dark data. That’s not good, in my mind.
Despite my comfort with using the measure as a dimension in data analysis, the measure is controversial. People have good arguments why the measure is not useful. I would counter those arguments by pointing out that the analysis could be valid when combining BMI with other dimensions that can make the necessary distinctions. My argument won’t go very far unless people have more experience working with data. My experience informs me that for analysis of data aggregates, precise repeatable measurements like BMI numbers are more useful than imprecise unrepeatable values such as a person’s athletic achievements. But my experience also tells me I would not fare well in arguing my case against an accomplished athlete with a high BMI.
BMI provides an easy to observe example of what we are getting into when we attempt to use big data to optimize health care delivery. It is a project that is likely to fail for the same reasons that BMI struggles to justify a policy of recommending normal-weight for individuals.