Teaching Data Investigation in Schools

This article reminds me that we have for many years promoted the introduction of computer technologies to primary and secondary classrooms with the primary goal of teaching software programming.  I object to the notion that the motivation should be to prepare students for jobs as computer programmers.   Programming languages take about a week to learn and a few weeks to master.   Compared with the time required to learn a foreign language, software coding could be an after-school type of activity, as it has been for a long time.

I do support computers in school, but for a completely different reason. In several of my past posts, I have described the revolutionary changes we have been experiencing in employment and government. These changes come from the reality of big data collection and analysis, which supports optimizations that increasingly impact our lives. In contrast to the above article, which suggests that computer training can lead to computer-related jobs, I argue that data science skills will be essential for all human activities. From negotiating a job as an on-floor retail associate to having a voice in a democratic government, people will need to be able to do their own independent data science.

Data science to me is all about investigating data. I place data science within our millennia-old experience of the historical sciences. These sciences are characterized by investigating available data from historical observations. This activity involves selecting relevant data, scrutinizing that data for flaws or irrelevance, and interpreting the data to support an argument or counterargument.

In the past, primary and secondary education was constrained by lack of access to actual observation data. As a result, coursework in human or natural history focused on learning the conclusions of these studies. For example, learning about the US Civil War involves learning events already determined by historians to be key. It was not practical to expose the class to the practice of historians reading letters, news articles, government correspondence, etc., to come to their own conclusions. The source material remains largely inaccessible today, and even where it is accessible it uses language made foreign by the passage of time.

There will always be a need for this type of education to learn the current understanding of key aspects of human and natural history.  From my experience being schooled a half century ago, it was sufficient to know history and to rely on historians to continue to keep me informed.    This is no longer sufficient.

We need to be training current students in the very art of the historical sciences using relevant modern data. Unlike in earlier times, this is feasible because of ready access to databases of recent and relevant information that students can understand. We have the opportunity to teach students how to do historical science, which I rename data science to make it more relevant. In addition, their future will be increasingly controlled by data, so it is essential that they learn how to work with data.

The past sentiment behind getting computers into classrooms was to prepare students for STEM careers by teaching the programming arts. I don’t think that has been very successful. Computer programming is something that people can pick up as adults after they learn the motivations for doing the programming. The proper role for computers in schools is to provide access to databases and the data tools used to query and analyze recorded observations.

Instead of teaching students Python, we should be teaching them SQL. Instead of teaching programming to get computers to carry out some assigned task, we should be teaching how to collect relevant data, how to scrutinize that data for relevance, and how to present a case based on that data. We should be teaching the actual practice of being historians.
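The three steps named here (collecting data, scrutinizing it, and presenting a case) can be sketched with a couple of SQL queries. This is only a minimal illustration, not a curriculum. The table `readings`, its columns, the plausible temperature range, and every value in it are invented for the example. The SQL does the actual work; Python's built-in `sqlite3` module serves only as a convenient host for running the queries without installing a database.

```python
import sqlite3

# Hypothetical classroom data set: daily high temperatures (Celsius) logged by students.
# Table name, column names, and all values are invented for illustration.
ROWS = [
    ("2015-03-02", "north_yard", 11.5),
    ("2015-03-02", "south_yard", 12.0),
    ("2015-03-03", "north_yard", 13.0),
    ("2015-03-03", "south_yard", None),   # missing reading
    ("2015-03-04", "north_yard", 95.0),   # implausible value for this data set
    ("2015-03-04", "south_yard", 14.5),
]


def build_db():
    """Collect: load the observations into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (day TEXT, site TEXT, high_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", ROWS)
    return conn


def suspect_rows(conn):
    """Scrutinize: list rows that are missing or outside an assumed plausible range."""
    return conn.execute(
        "SELECT day, site, high_c FROM readings "
        "WHERE high_c IS NULL OR high_c NOT BETWEEN -40 AND 50 "
        "ORDER BY day, site"
    ).fetchall()


def average_by_site(conn):
    """Present: summarize only the rows that survived scrutiny."""
    return conn.execute(
        "SELECT site, AVG(high_c) FROM readings "
        "WHERE high_c BETWEEN -40 AND 50 "
        "GROUP BY site ORDER BY site"
    ).fetchall()


if __name__ == "__main__":
    conn = build_db()
    print("Rows to question:", suspect_rows(conn))
    print("Average high by site:", average_by_site(conn))
```

The point of the sketch is the order of the questions: a student looks for the rows that deserve suspicion before trusting any average computed from them.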

Allow me to illustrate what I mean with an example from my prior employment. I developed web applications that strove to present high-level results to analysts. These top-level reports reduced voluminous underlying data to a summary form that could support decision making. This is analogous to a history book telling students what events happened in the past or what caused those events. The analysts, like the students, could take that given knowledge and proceed to prepare their own assignments.

However, unlike the history books, my reports provided rich hyperlinks to every data point in the result, inviting the analyst to drill into any value of his own choosing to investigate the justification for that data point. The hyperlinks would jump to new reports that provided more detailed information that, when combined, would produce the higher-level summarized data point. These intermediate reports followed the same design by providing a similarly rich collection of hyperlinks to yet other reports to explain the data (drill down) or to explore related data (drill across).
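The drill-down idea can be sketched in a few lines. This is not the design of my actual reports, only a toy under stated assumptions: the records, the region and store names, and the helper functions `summarize` and `drill_down` are all hypothetical. What it shows is the one property that matters: the detail report behind a summary number adds back up to that number.

```python
from collections import defaultdict

# Hypothetical detail records: (region, store, sales). All names and numbers are invented.
DETAIL = [
    ("east", "store_a", 120),
    ("east", "store_b", 80),
    ("west", "store_c", 200),
    ("west", "store_d", 50),
]


def summarize(records, level):
    """Roll detail records up to one level: 0 groups by region, 1 groups by store."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[level]] += rec[-1]
    return dict(totals)


def drill_down(records, region):
    """Return the store-level breakdown behind one region's summary number."""
    return summarize([r for r in records if r[0] == region], level=1)


if __name__ == "__main__":
    top = summarize(DETAIL, level=0)      # the top-level report
    east = drill_down(DETAIL, "east")     # what a hyperlink on the "east" value would open
    print(top, east)
    # The detail report justifies the summary: its values sum to the linked number.
    assert sum(east.values()) == top["east"]
```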

Over time I developed a philosophy that every piece of information needs a hyperlink to a report that explains more about that data.   The hyperlink concept was initially conceived as a construct to use sparingly such as how I write these blog posts.   The hyperlink is often emphasized by colors and underlines to draw attention to its exceptional nature.    My approach was to treat the hyperlink as the rule.   Every concept would have a hyperlink.   Coloring and underlining the hyperlink is both unnecessary and distracting: everything is an invitation for investigation.

I imagine writing an essay where every word and group of words would hyperlink to more data.  I am pleased with modern browsers that provide context-menus for highlighted text that provide the option to search the Internet for that same text.   That is how I think.   Anything and everything should be a hyperlink.  The link should go to something that explains that context.   In written prose, that hyperlink may go to a dictionary or encyclopedia definition of the text.  Frequently, I prefer that the hyperlink go to another author’s point of view on that topic.   I use this browser-tool extensively, making my own hyperlinks when I need them.

My observation is that this habit is not common. People generally are trained to read a work as the finished and final word on a topic. We learn this in school. We know some historical event from the Civil War because it is spelled out in some book. The book didn’t have hyperlinks. At best it had references, but those references generally were hard to obtain, so we learned to take the author’s word for it and be satisfied with the appearance of an authoritative-looking reference.

I made this observation in my experience with my software project described above.  Despite the extensive availability of hyperlinks, no one ever used them.    This was not because they didn’t think it was necessary.   Often, they would request my assistance to explain some report.  I would talk with these users directly and explain the results by using those same hyperlinks always available to them.   Even when they appreciated the insight provided by the hyperlinked-reports, they would not use those links on their own.

I acknowledge that they offered a good explanation for not following the links. Their explanation was that they didn’t know what to expect when following the links. I explained that the linked report would support the number that was hyperlinked: for example, the report would show a breakdown of numbers that, when added, would produce the number with the hyperlink. This explanation did not answer their concerns.

Their concerns about investigating data go deeper than understanding the data. I think their concerns ultimately are the result of years of conditioning that tells students to trust the author’s interpretation of the source material. This is how they were taught in school. Part of that training discourages us from investigating source data due to the risk of coming to a contrary opinion that could result in a lower grade in the class.

In some cases this conditioning is deliberate. For many topics, the source material is beyond the ability of the student to understand. For example, historical material from ancient Rome requires a deep understanding of the Latin language and Roman culture that would be beyond the student’s ability at the time of learning that there was a rise and fall of a Mediterranean empire.

In most cases, this conditioning was accidental. Schools still teach from a standardized published textbook. The lessons to be learned are in that text. Tests will be based on material in that textbook. Even if there are references to external reading materials, those materials are largely inaccessible and optional. For the most part, the published text represents the entirety of what is needed to pass the class. This conditioning teaches us that the strategy for passing a class is the strategy for passing through life, or at least for passing through a knowledge-worker type of job. The job is to read a text, not to investigate what is behind the text.

I mentioned my own current habit of highlighting any text that raises my suspicions and then right-clicking to jump to a search to see what other resources are available for that particular set of words. This habit is what I feel needs to be taught throughout school. Notice that I call it a habit instead of a skill. A habit is something conditioned. In my case, I conditioned myself by being rewarded with useful new knowledge or with more confidence in my understanding of the presented material.

It would have been better if this conditioning had come from repeated practice throughout school. The conditioning teaches the student to recognize an assertion and to recognize his ability to test that assertion for himself. This is better because the investigation often comes up empty, providing nothing interesting beyond the original assertion. It is important to get into the habit of investigating even though it may be disappointing. Frequently, the investigation is rewarding. From my own experience, I estimate that about one out of every four investigations delivers a result that I’m grateful to discover. That’s why it needs to be a habit. Even though the effort fails to produce anything interesting three out of four times, we should always investigate.
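A little arithmetic shows why the habit pays off even at a one-in-four success rate. The sketch below assumes, purely for illustration, that each investigation is independent and has the same 25% chance of being rewarding; my estimate is a personal impression, not a measured rate, and the function names are my own.

```python
def chance_of_at_least_one_find(n, hit_rate=0.25):
    """Probability that at least one of n investigations is rewarding,
    assuming each is independent with the same hit rate."""
    return 1 - (1 - hit_rate) ** n


def expected_finds(n, hit_rate=0.25):
    """Average number of rewarding investigations out of n attempts."""
    return n * hit_rate


if __name__ == "__main__":
    for n in (1, 4, 10):
        print(n, round(chance_of_at_least_one_find(n), 2), expected_finds(n))
```

Under those assumptions, a single investigation pays off a quarter of the time, four investigations give roughly a 68% chance of at least one useful find, and ten give roughly 94%. Someone who stops after the first empty result never reaches those odds.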

As an aside, web advertisement marketing provides some statistics about people’s readiness to investigate novel content. Web advertisements go out of their way to encourage people to investigate. Despite this effort, the ratio of people clicking into an ad that they see is on the order of 0.1% for a successful ad. This is a surprisingly small number. Although many ads may be repeated frequently so that subsequent appearances may no longer be new to the reader, many successful ad campaigns frequently refresh content and can target users who would most likely be interested in investigating the offer. The low click-through rate may partly be explained by our conditioned tendency to accept the top-level presentation as sufficient without needing further explanation.

More troubling to me is the fact that the conditioning I’m arguing should be done as part of schooling is currently being done by web advertisers. Advertisements teach us that investigation leads to some kind of targeted persuasive argument. If this is the only conditioning available, then we will expect investigation to lead to some type of sales pitch instead of leading to source data that lets us make up our own minds. This conditioning may discourage us from starting an investigation.

Returning to the original article, we should approach data science (selecting, scrutinizing, and interpreting data) as comparable to the way we approach reading, writing, and arithmetic. Data science does not mean computer programming (although that may be part of it). Data science involves using computers to access databases of current and relevant data to support students’ studies throughout their primary and secondary educational experience. Data science is applicable to all subjects of learning. Data science skills are increasingly essential for people to participate in the workforce and in government. The example in my last post illustrates how data science is needed even for jobs like retail sales associates whose work-hours are made unpredictable because of inaccessible data analysis queries. Part of that inaccessibility is the lack of training to investigate data.

