This post follows up on my earlier thoughts about personality typing from methodologies such as MBTI or Big-5 traits.
These formal processes set out to define nearly independent traits that together divide a population into clusters, with the expectation that these clusters are very effective at predicting future outcomes in various life situations. I assume there was some original research that identified the basic traits as
- Intuition/Sensing (Openness),
- Thinking/Feeling (Agreeableness),
- Judging/Perceiving (Conscientiousness),
At some point, though, we decided that these were sufficient to describe innate personality types, so that further research could focus on how best to measure these traits through better questions or better analysis of the answers to those questions.
The 5 traits above define 32 different clusters, and the clusters are named by their relative scores on each of the 5 dimensions (or, as in MBTI, 16 clusters for 4 dimensions).
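To make the counting concrete: if each dimension is treated as a simple two-way split, the cluster names are just every combination of one letter per dimension. The sketch below uses the standard MBTI letter pairs (E/I, S/N, T/F, J/P). The function name `cluster_codes` is my own, and the fifth pair uses placeholder letters that exist only to show the count of 32.

```python
from itertools import product

def cluster_codes(dimensions):
    """Return every combination of one letter per two-way dimension."""
    return ["".join(combo) for combo in product(*dimensions)]

# The four MBTI letter pairs, one string per dimension.
mbti_dimensions = ["EI", "SN", "TF", "JP"]
mbti_codes = cluster_codes(mbti_dimensions)
print(len(mbti_codes))       # 16
print("INFJ" in mbti_codes)  # True

# A hypothetical fifth two-way dimension ("A"/"B" are placeholder letters).
five_codes = cluster_codes(mbti_dimensions + ["AB"])
print(len(five_codes))       # 32
```

The names carry the trait structure with them, which is exactly the property I question in the rest of this post.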
I imagined what would happen if we gave the task to a clustering algorithm with no information other than the instruction to divide a large population into 32 clusters based on answers to several dozen questions. Let me assume that the algorithm comes up with 32 clusters with relative population sizes that match the combinations of the pre-defined traits. Given this clustering without a preconceived trait naming, any names we assign to these clusters could be arbitrary and carry no implied meaning. We may choose to name the categories with randomized codes so as not to imply any meaning at all.
There is a good reason to adopt this approach of meaningless naming of categories. This is what machine learning algorithms would do. We often fear that machine learning can become more intelligent than humans. One of the advantages of machine learning is that humans handicap themselves with rigorous respect for prior knowledge. Humans would insist on naming clusters based on the predetermined trait components, while machines do not need such information. I suspect human intelligence can be more competitive with machine intelligence by simply adopting the same strategy of disconnecting the interpretation of clusters from any meaning based on component traits.
I think about the different narratives about typical personalities that have some combination of traits. These descriptions concern how successful or unsuccessful such people will be in various life circumstances. Often these descriptions are not obviously tied to the specified traits. For example, an MBTI assessment of INFJ for a particular person will produce a description that includes conditions that are not obviously related to any of the component parts.
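The thought experiment above can be sketched in a few lines. This is only an illustration under my own assumptions: the answers are random synthetic data on a 1-to-5 scale, the algorithm is plain k-means (the post does not commit to any particular clustering method), and the names `synthetic_answers`, `kmeans`, and `opaque_names` are hypothetical helpers written for this example.

```python
import random

def synthetic_answers(n_people, n_questions, seed=0):
    """Made-up survey data: each answer is an integer from 1 to 5."""
    rng = random.Random(seed)
    return [[rng.randint(1, 5) for _ in range(n_questions)]
            for _ in range(n_people)]

def kmeans(points, k, iterations=10, seed=0):
    """Plain k-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each respondent to the nearest centroid (squared distance).
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])),
            )
        # Move each centroid to the mean of its members; keep it if empty.
        for c in range(k):
            members = [points[i] for i in range(len(points))
                       if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment

def opaque_names(k, seed=0):
    """Random codes that deliberately say nothing about any trait."""
    rng = random.Random(seed)
    alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
    names = set()
    while len(names) < k:
        names.add("C-" + "".join(rng.choice(alphabet) for _ in range(5)))
    return sorted(names)

answers = synthetic_answers(n_people=320, n_questions=24)
assignment = kmeans(answers, k=32)
names = opaque_names(32)
labels = [names[c] for c in assignment]
print(labels[:3])  # three opaque codes, one per respondent
```

Nothing in the output hints at introversion, intuition, or any other trait. Whatever predictive value the clusters have would need to be established from outcomes, not from the names.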
As a thought experiment, assume the accuracy of the descriptions for each of the 16 clusters in MBTI methodology. Given these descriptions, I can imagine different trait names, for example
- Ease of being entertained by others
- Ease of entertaining others
- Reliability to follow instructions (includes the ability to understand those instructions)
- Reliability of obtaining good outcomes when deliberately disobeying instructions
The above scheme is very workable despite not having a single-word description for each trait, where each word can have a unique abbreviation letter. I suppose with some cleverness, I can come up with some single word that captures the essence of each description, but I would argue that such a goal is evidence of humans being too intelligent for their own good. There is nothing to gain by clever or poetic shortening of the concepts and a lot to lose. Inevitably, the chosen poetic terms would excite our imagination in ways that have nothing to do with the data itself.
One advantage of machine intelligence over human intelligence is that machines have no imperative to convert discoveries into poetic language, and no need to appreciate any such poetic language. The data speaks for itself.
I mention the above examples as another schema that may convert the observed 16 clusters into 4 near-independent dimensions, but I suspect there are unlimited other ways to describe the presumed 4 dimensions that constitute the 16 clusters.
Again, the mere act of recognizing that 16 is the 4th power of two is a form of poetry that appeals to humans. When given a set of 16 clusters, a machine intelligence can be satisfied with the 16 as sufficiently informative.
One of the advantages of machine intelligence over human intelligence is that machines are not driven toward poetry. To me, poetry captures the scientific appreciation for the simplest explanations with the fewest terms. Humans are poets by nature, and even the objectivity of science cannot escape the human delight in well-crafted poetry, or the human disdain for inelegance in descriptions.
Machine intelligence is not constrained by elegance.
While I believe it is a part of human nature to appreciate elegant descriptions of observed clustering, I think humans are very capable of inelegant descriptions. In fact, we can be very successful in using concepts that lack any meaningful descriptions at all.
Humans were expert at getting projectiles to hit their targets long before we worked out the theory of gravity and the mathematics of calculus that elegantly describe the trajectory. Even now that we have such elegant discussions of the principles, baseball pitchers learn their expertise at throwing balls from practice and training that may never mention the elegant poetry of the science and mathematics that describe the motion. I would go so far as to say that the best athletes are able to break records in part because they find inspiration that contradicts the prior elegant explanations. Of course, once they discover some innovation, science will rush in with a replacement explanation that approaches the past elegance. The innovation preceded the elegant explanation.
In modern times, we constrain humans with a need to conform to the poetry of prior discoveries. We allow innovation only if it is preceded by an explaining theory that is acceptably close to the poetry of the prior theory. Without a convincingly poetic defense of a new proposal, we will often decline approval for that innovation. The project would be shelved until someone can come up with an acceptably elegant explanation for why it would succeed.
Machine intelligence is not constrained by this need for poetry.
We allow the machines to learn on their own. Given the tasks we are striving to solve, we don’t have the luxury of coaching them through the learning process, and certainly not the time to work out elegant answers for the machines.
In this blog, I have often described the option of a new form of government that I called a dedomenocracy, or more poetically a government by data and urgency. The machines running such a government have the very difficult task of taking into account all of the most recent information and applying it to the higher priorities of their human subjects.
Part of the information that such a governmental system would require is the categorization of humans into clusters in order to predict how best to engage the available population to accomplish some goal.
I imagine such a system would continuously run a clustering algorithm that considers the most recent information about the population. The clustering algorithm may choose to divide humanity into 16 clusters, but there is no expectation that clusters at different times will contain the same members.
In contrast, the human approach to personality assessment inherently values the elegance of a lifelong consistency of peers within a cluster. This is an artificial constraint on realizing the potential of a population. An example is the notion that individuals have a near-permanent IQ that will remain consistent for a lifetime: the value may change, but we expect that high IQ later in life is preceded by high IQ earlier in life. This may be true, but it may be irrelevant to the immediate task of addressing some urgency.
The example I have in mind is when I have a job opening and I have little choice but to hire someone based on training and experience, even though I know I have succeeded at the same task without those benefits, and in fact that success came from a series of low-IQ decisions (mistakes). I fully agree that as a manager I need to pick the best person for the job, but demanding relevant experience and training may result in failure by disregarding a better candidate who just happened to lack that background.
I imagine a micro-dedomenocracy, a machine intelligence in charge of filling some vacancy, would be more successful at the task than I could be, because that machine intelligence has no need for “poetic justice” in terms of rewarding someone for past accomplishments. The unjust result of hiring the least qualified may in fact be the better choice. Humans are not allowed to do this, but a machine-intelligence manager would be.
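One way to put a number on how much membership shifts between two runs is to ask what fraction of pairs who shared a cluster in the earlier run still share one in the later run. The sketch below is my own illustration with made-up assignments. The function name `co_membership_retained` is hypothetical, and the measure is just one simple choice among many.

```python
from itertools import combinations

def co_membership_retained(before, after):
    """Fraction of pairs together in `before` that are still together in `after`."""
    together_before = 0
    still_together = 0
    for i, j in combinations(range(len(before)), 2):
        if before[i] == before[j]:
            together_before += 1
            if after[i] == after[j]:
                still_together += 1
    # With no co-members in the earlier run there is nothing to lose.
    return still_together / together_before if together_before else 1.0

# Made-up cluster indices for six people at two different times.
earlier = [0, 0, 1, 1, 2, 2]
later = [0, 1, 1, 0, 2, 2]
print(co_membership_retained(earlier, later))  # one of three pairs survives

# Renaming the clusters changes nothing, since only co-membership counts.
print(co_membership_retained([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

Because the measure looks only at who is grouped with whom, it works even when the cluster names are arbitrary codes that change from one run to the next.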
The computing power and large-data handling of computers present a formidable challenge for humans to compete against. In this modern environment, it is counterproductive to further handicap humans with the need for poetic elegance and poetic justice.