Cosmology has a little problem with its math. The amount of known mass falls short of what is needed to explain the motions in galaxies. The amount of known energy falls short of what is needed to explain the accelerating expansion of the universe.
The answer is dark matter and dark energy. Dark simply means it is unknown. I would rather they be more honest: their theories have some problems. Giving these unknowns names as specific as dark matter and dark energy not only directs attention to finding these things, but also implies that they are very specific things. The missing matter is a specific type of matter that is dark. The missing energy is a specific type of energy that is dark in its own way.
Maybe there are so many different types within each that any particular type offers a negligible contribution. This is like a standard pie chart showing contributions, where some items are abundant enough to have their own slice and then there is a final slice labeled “all other” because the individuals in that group are too tiny to show as separate slices. “Dark” may be the same as the “All Other” in the pie chart.
But even if they rename these as all-other matter and all-other energy, I would still object.
In data science, I’m very alert to the “this can’t be right” type of data. When something can’t be right, it usually means something is wrong with the assumptions about the data: something wrong with the theory of the data.
In the area of astrophysics, I understand a large number of scientists have carefully scrutinized both the theory and the observations to reinforce their confidence that these are very sound. I’m not a member of that field, so I defer to their confidence.
But even if I were in that community and I agreed with the consensus that there must be something out there that we can’t observe, I would still be scrutinizing the observation data and the theory to see if there is something I’m missing.
I don’t work in astronomy. I work with data that supports decision makers. This data doesn’t involve a lot of complex mathematics or computations. A single data item may be pretty simple, but decisions are made with aggregations of data from multiple sources and where each source may have multiple technologies for measurements. A lot can go wrong with data.
Unexpected results from data analysis are often the most valuable type of data for decision makers because they can reveal new opportunities or help avoid new types of losses. But unexpected data is more likely to be the result of some error in the pipeline between the measurement and the derived result.
The problem could be that the measurement device was not designed properly for something new or different about the item being measured. Imagine a tool for counting the number of dogs walking in the neighborhood by counting the number of footsteps in a stride and dividing by 4, but at some point one of the dogs becomes lame and limps with one foot in the air. The number of dogs has not changed, but the measurement didn’t anticipate this.
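The dog-counting example can be put into a few lines of Python. This is only a sketch of the thought experiment: the function names and the sample numbers are made up for illustration, and no real measurement tool is implied.

```python
def footsteps_per_stride(feet_on_ground):
    """Total footsteps heard in one stride, given how many feet each dog puts down."""
    return sum(feet_on_ground)


def estimate_dog_count(total_footsteps, feet_per_dog=4):
    """The tool's built-in assumption: every dog contributes exactly four footsteps."""
    return total_footsteps / feet_per_dog


# Hypothetical neighborhood of five dogs (illustrative numbers only).
healthy_pack = [4, 4, 4, 4, 4]
one_lame_dog = [4, 4, 3, 4, 4]  # same five dogs, one now holds a foot in the air

healthy_estimate = estimate_dog_count(footsteps_per_stride(healthy_pack))
lame_estimate = estimate_dog_count(footsteps_per_stride(one_lame_dog))

print(healthy_estimate, lame_estimate)
```

The second estimate comes out as a fraction of a dog. That is a “this can’t be right” result, and what it exposes is the divide-by-4 assumption, not a change in the number of dogs.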
The problem could be a faulty measurement device. Perhaps there are hundreds of devices but one of these occasionally resets itself, resulting in a brief period when it is blind to any new observations. Perhaps one of the devices is less sensitive than the others so it misses data. Perhaps one of the devices is too sensitive and measures things that should be excluded.
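One way to look for the self-resetting device is to check each device’s readings for silent periods. The sketch below assumes every device is supposed to report on a regular schedule, which is my assumption for the example; the function name, the tolerance factor, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (earlier, later) pairs where a device stayed silent too long.

    Assumes the device should report roughly every `expected_interval`;
    `tolerance` is an arbitrary slack factor chosen for this sketch.
    """
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > expected_interval * tolerance:
            gaps.append((earlier, later))
    return gaps


# Hypothetical device that should report once a minute but misses 12:03 and 12:04.
start = datetime(2024, 1, 1, 12, 0)
readings = [start + timedelta(minutes=m) for m in (0, 1, 2, 5, 6)]

gaps = find_gaps(readings, expected_interval=timedelta(minutes=1))
for earlier, later in gaps:
    print("no data between", earlier, "and", later)
```

A check like this only finds the blind periods. It says nothing about the devices that are too sensitive or not sensitive enough; those need a comparison against the other devices.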
The problem could be in the transmission of the data to storage. The data can be corrupted or lost. Often the transmission may involve multiple intermediate collection points and these can introduce problems by forwarding old data instead of fresh data, or providing the wrong timestamps.
The problem could be in the preparation of the data for the report. The processing of the data may no longer be appropriate for the data or the underlying observations.
If I fail to find fault with anything leading to the results I come up with, then I have to come up with an explanation for the unexpected results. It is not enough to say that something is indicated by the data. There has to be some realistic explanation for why it is.
I am never confident that the data or its interpretation represents reality. The best I can do is to say I haven’t found out what could have gone wrong.
Usually, only a very small team looks for problems in the data I work with. In contrast, there are a large number of scientists studying every aspect of the data that is telling them there is dark matter or dark energy. They are talking about the entire universe and I’m just trying to support some mid-level decision-maker.
There is probably more of practical consequence at stake with my work because it is used for making real-world decisions.
If the astronomers are wrong about their interpretation of the data, the galaxies will continue to spin and the universe will continue to expand. But there is something practical at stake. They are setting an example.
The example they are setting is that it is ok to presume the existence of some completely mysterious substance after you have convinced yourself there is nothing wrong with your data or your theory. I say no. All you know is that the data is telling you “this can’t be right”. Science would do the public a better service by saying that much and not injecting into our educational system the concept that science is compatible with mysterious explanations.
Hypothesis discovery is the act of coming up with a new theory of the world that can then be tested through experiments. Experimentation may be difficult but it is something that someone can be trained to do. It is very hard to train for discovering new hypotheses. Or at least it is difficult to find people who can do it when confronted with “this can’t be right” data.
It is easier to say, “I give up, it must really be right”.