Archaic personality assessment in age of data

I recall in the 1990s taking some personality assessments of various methodologies.   All of these involved my answering a long list of questions with some multiple choice type answer (including true/false questions).   The system then processed these answers to evaluate what level I had in various type categories.   I received the results in terms of a set of numeric scores for each category accompanied by a long narrative about how similarly scored people behave.

At the time, I gave the results some level of credibility in terms of better understanding myself backed by something that felt like it was scientific.   However, even at the time I compared the narrative descriptions with similar descriptions for zodiacal signs based on only one question: what was my birth date.   In both cases, I could find statements that I agreed were very accurate descriptions about who I am or how I appear to others.   I preferred the personality assessment tests because they gave me the opportunity to answer questions, many of which I was happy to have the opportunity to answer.

Now, looking back from a dedomenology point of view, I see the results differently.   These assessments obtain precise answers to questions numbering in the dozens or even hundreds.   In return, they give back to me something that resembles a one-way hash.   The hash code is in the form of a series of numbers (representing percentiles or levels) for a small set of traits.   Many people may share the same hash code, but there would be one code that best represents me.

In the case of the old Myers-Briggs (MBTI) system, there are only 16 different hash values to apply to the entire world’s population.   Many people may answer questions differently and yet get assigned to the same type.   Consequently, it is impossible to identify all of the individual question answers from the concluded personality type.
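
The one-way-hash analogy can be made concrete with a toy sketch.   The scoring rule below is a hypothetical stand-in and not the actual MBTI scoring method: it assumes true/false answers, assigns questions to the four dichotomies in round-robin order, and picks each letter by simple majority.   The names toy_type and count_collisions are invented for illustration.

```python
from itertools import product

# Letter pairs for the four dichotomies. The scoring rule that uses them is
# an illustrative assumption, not the real MBTI instrument.
DICHOTOMIES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def toy_type(answers):
    """Reduce a sequence of True/False answers to a four-letter type.

    Question i is assigned to dichotomy i % 4. A pair resolves to its first
    letter when more than half of that dichotomy's answers are True.
    """
    letters = []
    for d, (first, second) in enumerate(DICHOTOMIES):
        votes = answers[d::4]
        letters.append(first if sum(votes) * 2 > len(votes) else second)
    return "".join(letters)


def count_collisions(num_questions):
    """Count how many distinct answer sheets land on each type."""
    counts = {}
    for sheet in product([False, True], repeat=num_questions):
        label = toy_type(sheet)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    counts = count_collisions(12)  # 2**12 possible answer sheets
    print(len(counts), "types for", sum(counts.values()), "answer sheets")
```

Even with only 12 questions, thousands of distinct answer sheets collapse into 16 labels, so the label alone cannot be inverted back into the answers that produced it.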

In the small-data world of the 20th century and earlier, it was necessary to reduce voluminous data into a small set of categories or statistics that can be stored persistently and retrieved quickly.

I recall the time when I was first introduced to the mathematics of statistics.  There was just the briefest mention of the pragmatic need for data reduction to accommodate existing data technologies.   The bulk of the instruction was to interpret meaning from the statistics.   In particular, we were taught that we can understand reality based on data-reduction into distributions and their parameters such as means, standard deviations, or modes.

I was not a particularly good student of statistics.  I didn’t trust the reliability of the mathematical models to represent anything.   I also didn’t distrust it.  I accepted scientific discoveries including those that reached their conclusions from statistics that somehow dismissed outlier results.

I did not find statistics to be very interesting when I encountered data.   I trace that back to the very introduction of the need for statistics to reduce data in order to fit in limited data stores and slow retrieval processes.  I wonder if we would ever have invented statistics if mankind had begun with the big data technologies we have today: technologies that can store every raw observation and that can retrieve results directly from that store.

Imagine if a new species came into being today, a species distinct from humans but equally smart.  Such a species would have access to the human-developed data technologies of the current era.  These technologies include the automation of statistical computation and inference, so it is effortless to engage these tools.   However, the technologies also include the ad-hoc query capabilities to retrieve raw observations or to explore patterns in data without any preconception of a need to fit a statistical model.   I propose that such a species would not make much use of statistics, because there is so much more to learn from the raw data itself.

Some have suggested that machine learning has developed into something similar to a new sentient being distinct from humans.  There have been many recent announcements of capabilities of machine learning matching or exceeding the capabilities of humans who specialize in the same activity.   The advancements have extended to the point where we are close to accepting fully autonomous vehicles such as cars, freight trucks, or container ships.  Humans are close to trusting autonomous vehicles with the responsibility for assuring the safety of cargo of great value to humans.   Machines are taking over human jobs very much like a different species would.  It is comparable to earlier humans using animals to perform hard labor tasks.

We have imagined such automation for over a century.  Earlier results relied almost entirely on encoding data-reduced models into the machines.   The machines needed to make decisions based on already established statistical models.   When these machines failed, we replaced them with ones containing more statistical models or improvements on the old models, but in all cases the models were data reductions within the established rules of statistics.   As a result, progress was slow, and there was little confidence that fully autonomous operation would occur any time soon.   In fact, I recall expert predictions that it may take several human generations before full autonomy would be acceptable.

That all changed when technology improved to make machine learning affordable.  Machine learning does not need humans’ statistical models to reduce data.   Instead, we permit machines to learn from raw data used in training.   In the case of neural networks, there is still data reduction of many training observations to a finite set of weights on the discrete connections between nodes.   The distinction is that the machine can come up with any model, including models that humans would never use.  Simple learning networks that learn how to recognize hand-written letters will come up with a very different concept of what qualifies as a particular letter but still be as accurate as humans in recognizing the letter.

Despite the difference in approaches that machine learning uses to modeling the real world, we are close to accepting the turning over of responsibility for high value cargo to autonomous vehicles.   This success comes from our allowing machines to work with the abundance of raw observational data instead of being constrained by human-accepted rules of statistics to reduce data.

Machines may be coming up with their own laws of statistics.  They are probably coming up with different laws for different circumstances.   In effect, the machines are constructing alternative realities, or parallel universes, in terms of understanding the world within the specific goals they are trained for.

Society can never afford the same luxury to humans: we demand that psychologists come up with treatments that conform to the notion that humans evolved instead of being created.   A machine-learning psychologist would not be constrained to accept either proposition.

My point here is that to take full advantage of modern data analytics, we prefer access to raw observation data over human-determined data reduction such as to statistical models or data categorizations.

On the topic of personality assessments, a modern approach would focus on the collection of the raw data of actual answers to specific questions.   Given the capabilities of large data, the modern assessment would use orders of magnitude more questions.  A modern assessment would combine answers about the same individual from the individual himself, from a trained therapist assessing answers over the course of multiple sessions, and from acquaintances.

Gathering this raw data requires motivation.   One such motivation is to provide some type of abstract assessment that entertains the person with some kind of understanding about himself, his client, or his acquaintance.   That assessment itself is only for entertainment purposes that motivate people to submit answers to more questions.

I would imagine a modern personality assessment to have access to tens of thousands of questions, drawn at random and presented a few at a time.   Over a long course of time, the same person may be asked the same question multiple times, but each would be a unique answer due to the passage of time and due to the results of contemplating answers to the intervening questions.   Similarly, the same question may be answered by colleagues or therapists.

The goal of such an assessment is the persistent storage of all of these questions and answers, including the time stamp and sequence number of each answer.   Ideally each person would have a distinct key in the database so that we can model individuals.   Such a key may be anonymous and still be useful, but there is more value in having each individual identified.
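
As a minimal sketch of what such a store might look like, the following uses Python's built-in sqlite3 module.   The table layout and the names (answers, subject_key, rater_key, record_answer, history) are my own assumptions for illustration; any real site would have its own design.   The one essential property is that the store is append-only, so a repeated question adds a new row instead of overwriting the earlier answer.

```python
import sqlite3
import time

# Hypothetical schema: one row per raw answer, never updated or deleted.
SCHEMA = """
CREATE TABLE IF NOT EXISTS answers (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,  -- sequence number
    subject_key TEXT NOT NULL,   -- person being described (may be anonymous)
    rater_key   TEXT NOT NULL,   -- who answered: self, therapist, colleague
    question_id INTEGER NOT NULL,
    answer      TEXT NOT NULL,
    answered_at REAL NOT NULL    -- Unix time stamp
)
"""


def open_store(path=":memory:"):
    """Open (or create) the answer store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_answer(conn, subject_key, rater_key, question_id, answer,
                  answered_at=None):
    """Append one raw answer and return its sequence number."""
    if answered_at is None:
        answered_at = time.time()
    cur = conn.execute(
        "INSERT INTO answers "
        "(subject_key, rater_key, question_id, answer, answered_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (subject_key, rater_key, question_id, answer, answered_at),
    )
    conn.commit()
    return cur.lastrowid


def history(conn, subject_key, question_id):
    """Return every answer given about one person to one question, in order."""
    rows = conn.execute(
        "SELECT seq, rater_key, answer, answered_at FROM answers "
        "WHERE subject_key = ? AND question_id = ? ORDER BY seq",
        (subject_key, question_id),
    )
    return rows.fetchall()
```

Calling history for a given person and question returns the full trail of answers over time, including those supplied by other raters, which is exactly the information a fixed set of category scores throws away.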

Consider the case of a big-data store that was able to store all of the individual answers keyed with sequence numbers, time stamps, and specific individual identification.  I don’t think anyone would voluntarily discard that data in exchange for anonymized data consisting of just a few categories.   The value of data reduction into categories is for people who don’t have access to big data.   Those people are the consumers who wish to have an external assessment of what kind of person they are, allowing them a shortcut to introducing themselves, similar to the 1960s approach of introducing oneself as a zodiacal sign.

The book title “Please Understand Me” described this incentive explicitly.   The proposed solution is to replace a zodiacal sign with an MBTI tag.   Instead of introducing myself as a Libra, I could say I am an INFJ.   In both cases, I can find the published narrative descriptions of example personality traits that I agree closely resemble myself.   I could have some hope that someone would get a good idea about who I am by accepting the label I provided.

Lately, there is the new site “understand myself” that uses the big 5 personality traits of CANOE (my preferred acronym).   Like MBTI, a list of answers to a series of questions determines the traits.  I can replace INFJ with my CANOE scores of High, High, High, Low, Low (respectively).  Tellingly, the identification is not as easily communicated as MBTI.  I don’t think I could fit all that into an elevator speech, let alone a quick identification.  But that is consistent with the title of the site, where the goal is understanding myself instead of having others understand me.

I’m being facetious.  Both systems have the same goals, and both have a high degree of correspondence between individual traits, with the exception of neuroticism, which is unique to the CANOE.   From my perspective, I don’t see much difference among the variant approaches of trait assessment from answering a series of questions.   The need for multiple measures is analogous to needing to learn different languages: you have to change the language depending on who you are talking to.   The MBTI is more widely recognized among the general public, while the CANOE score is recognized by scientists, and there is always the zodiacal sign as a last resort.

All of these are archaic in the sense of presupposing learning based on Occam’s razor simplified models with parameters that reduce raw data.   These are still useful for casual conversations, but we now have technologies that permit us to access the raw data of all of the individual answers.   While I can see the privacy concerns of others having that kind of detailed information, I view the raw answers to individual questions to be a closer measure of who I am than the percentile scores of a small set of categories that themselves are open to interpretation after the fact.

I was surprised by the fact that a new web-based personality assessment site that is clearly using modern website design would revert to an archaic approach of asking from a fixed set of questions and generating a score that is included with generic narratives about what kind of people have similar scores.   Instead, I would have expected a site to have access to tens of thousands of questions where something like 100 of them would be asked at a time.  After 100 answers were available, it may score the results, but in a KPI (key performance indicator) approach with some kind of symbolism like thermometers, stop-light indicators, or other graphic.   The site would not stop there.  It would allow the person to continue to answer more questions and then allow new scores to be based on all the answers or perhaps a sampling of the past answers (such as the most recent 100 answers).   To make this workable, the system must persistently store the answers and identify them to a particular person.
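
The re-scoring idea described above can be sketched in a few lines of Python.   Everything here is an assumption made for illustration: answers are coded as numbers on a 1-to-5 agreement scale, a trait score is a plain mean, and the stop-light thresholds are arbitrary.   The class name RollingTraitScore and the function stoplight are hypothetical.

```python
from collections import deque


class RollingTraitScore:
    """Keep every answer, but allow scoring on only the most recent ones."""

    def __init__(self, window=100):
        self.all_answers = []               # full raw history, never discarded
        self.recent = deque(maxlen=window)  # sliding window used for scoring

    def add(self, value):
        """Record one answer coded as a number (assumed 1-5 scale)."""
        self.all_answers.append(value)
        self.recent.append(value)

    def score_recent(self):
        """Mean of the most recent answers, or None before any are given."""
        if not self.recent:
            return None
        return sum(self.recent) / len(self.recent)

    def score_all(self):
        """Mean of every answer ever given, or None before any are given."""
        if not self.all_answers:
            return None
        return sum(self.all_answers) / len(self.all_answers)


def stoplight(score, low=2.5, high=3.5):
    """Map a 1-5 score onto a stop-light indicator; thresholds are arbitrary."""
    if score is None:
        return "grey"
    if score < low:
        return "red"
    if score < high:
        return "yellow"
    return "green"
```

Because the full history is kept alongside the window, the indicator can be recomputed at any time, on any subset of answers, without asking the person to start over.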

Zodiacal signs are appropriate for communicating within 1960s cultures.  MBTI types are appropriate for communicating within 1990s cultures.   CANOE scores are appropriate for communicating within 2000s social science cultures.  All of these facilitate self-identification using legacy human language communication.

The raw data of all answers to a large number of questions are appropriate for communicating within big data cultures.  Big data cultures find little value in the abstraction of a parameterized model.   It is far more informative to have access to all the raw answers along with the exact wording of the questions, and the times and sequence-numbers for the questions asked.   With this kind of data, a data scientist can study patterns between the answers that do not fit the nice little compartments of labeled personality traits.   We can discover new traits.   More specifically, machine learning can find new traits that it can use without ever elaborating definitions for the discovered traits.   When it comes to making decisions, ranging from simple questions such as whether someone is worth getting to know better, to more critical questions about who is best to fill some role, the data scientist would prefer the raw data.   In fact, once accustomed to large data elsewhere, the data scientist would treat the simplified models with their single-valued parameters as being nearly as worthless as the zodiacal sign.

I have not taken the self-assessment in the new site.  I have taken plenty of these before and always with the same disappointment of an assessment that seemed superficially accurate.   I looked at the description, and I couldn’t believe that despite the modern appearance, the approach appears to be a throwback to the personality tests I took back in the 1990s: a finite set of questions, and a list of traits that get scored in terms of some single-value level of matching the definition.   It could have been interesting if the questions were randomized from a larger set of questions, with no upper limit on questions answered, and with an ability to reevaluate the traits at any time.  Alternatively, the assessment algorithm could improve over time, allowing for different assessments of old answers at a later time.

But I look again at the description of the process in the site.  The narrative appears to be written by a social scientist, but the site was created by a data scientist.   The answers are identified: because the site requires purchasing a test, there is a need to match the user with a payment method.   The site undoubtedly uses some form of data store in the back-end with an analytics algorithm that can process the stored results.   Also, the site explicitly mentions that there are future plans to include questions for couples.  If the site becomes successful, it is easy to imagine future features such as allowing colleagues to answer questions about the person, or even trained therapists.   Each such purchase would reward the individual customer with some version of a CANOE score, and this will be entertaining in interpersonal assessment.   Meanwhile, the data accumulates behind the scenes, ignored by the social scientists who came up with the original idea, but the data scientist will know it is there, and he knows what he can do with it.

