This past week, while working through a task assigned to me, something occurred to me that is hard to express in a way that can be taken seriously.
Initially, the task seemed to be a straightforward use of a database I had already developed for a different purpose. I was eager to show that I had created something useful for multiple problems. To prove my point, I avoided modifying the original database. Instead, I manually ran multiple queries and combined them in a spreadsheet to produce a summary report. This made sense because the question presented to me was a one-time question, unlikely to be asked again, so developing something from scratch seemed unwarranted.
The first iteration did the job of answering the question. I presented the results, and during the review the reviewers pointed out that they had made some changes to the plans. At this point, I was disheartened. The whole reason I did it the way I did the first time was that I was convinced I wouldn’t have to repeat it. Worse, the new assumptions would complicate what was already a tedious process. I could have redone the task the same way as before, and they granted me sufficient time to do so.
There are many reasons why I rejected the notion of repeating the operations. First, I hate doing the same job twice. Even though the assumptions had changed, the basic operations were mostly identical to the last time. Given that it takes a while to get done, it seemed ridiculous to manually do something that should be automated. In addition, the added complexity made the entire process less accessible for someone else to take over. I was creating a process that only I could perform.
One aspect of my personality is that I don’t want anyone, not even an employer, to be dependent on me. I’m the kind of person who is repulsed by the idea of job security, at least in the sense of being the only person who can do something of value. All throughout my career I have worked with one foot outside the door, ready to walk away when it made the most sense for me to do so. Yet I’m conscientious enough not to leave an employer with a slot that would be too hard to fill. Thus, if I do something, I want it to be something that is reasonable to expect someone else to do. In particular, it should be something a current coworker can do.
I cringe at the notion that my leaving would result in the employer having to find a replacement just like me. On the other hand, if I do a task, I like to do it as thoroughly as possible. When given a complex task, I want to do the complex work to complete it. I strive to compartmentalize the complexity so that the manual effort of completing the task consists of simple steps that do not require confronting the complexity again. This is the opposite of my original approach. For a one-off task, it is easier to confront complexity manually, one step at a time. I know the concepts well enough to complete the task without making the extra effort to automate it.
When presented with the need to come up with a quick answer, there doesn’t seem to be much allowance for the extra effort of making the task repeatable and transferable to others. I’ve been confronting this kind of problem for decades now, and every time I end up replacing the initially manual task with a semi-automated one through the introduction of new scripts or algorithms. What I never learn is not to bother trying the shortcut manual approach in the first place. I guess I need reminders as to why that isn’t satisfying.
Getting on with my little story: once I started to design a new solution, I found ways I could improve upon my original analysis. While the omission of these improvements was acceptable given the guidance for a quick rough estimate, the fact that I was writing a new process made them easy to incorporate into the analysis. Rewriting the process, in turn, created new opportunities for presenting the new results.
The end result was more satisfying than what I presented earlier. I could present the results with more confidence because they came from a more rigorous process, and because I had eliminated the possibility of operator error inherent in the tedious manual approach. I also feel more comfortable that I can transfer this task to someone else if there is another request. Even if I end up doing a follow-up analysis myself, it is comforting to know that someone else would have the option of doing it.
Subtracting out the innovative additions in the above story, the process of replacing a prototype with new code is sometimes called code refactoring. Refactoring can seem like a nuisance because, in the ideal sense, it replaces existing code with better-designed code that reproduces the exact same functionality. Refactoring doesn’t improve the functionality and doesn’t have to improve performance (although it often does). The point of refactoring is to make the code more understandable: to clean out the vestigial components and to clarify the names and structures to best match the concepts behind the solution that was found to work.
In some sense, the goal of refactoring is to make the code reproducible. Another team can read the code and understand how it works. That other team may be the same team but at a different time when most have forgotten the lessons learned. Refactoring allows the code to be reproduced for other projects by permitting the new users to evaluate the fitness of the code to meet the new requirements, or to reuse components in different ways.
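To make the distinction concrete, here is a minimal sketch of my own (not from any real project; the function names and data are invented for illustration): a quick prototype and a refactored version that reproduces the exact same behavior, but whose names and structure match the underlying concepts.

```python
from collections import defaultdict

# Prototype: works, but the intent is buried in terse, unexplained code.
def summarize(rows):
    d = {}
    for r in rows:
        d[r[0]] = d.get(r[0], 0) + r[1]
    return sorted(d.items())

# Refactored: identical functionality, but the names now state
# the concepts (categories, amounts, totals) the prototype hid.
def total_by_category(records):
    """Sum the amount for each category; return pairs in category order."""
    totals = defaultdict(int)
    for category, amount in records:
        totals[category] += amount
    return sorted(totals.items())

rows = [("a", 1), ("b", 2), ("a", 3)]
# Refactoring changed the form, not the behavior.
assert summarize(rows) == total_by_category(rows) == [("a", 4), ("b", 2)]
```

The final assertion is the essence of refactoring: the reader-facing form improves while the observable behavior is provably unchanged.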
The thought that occurred to me is the distinction between the design process that went into the original prototype and the reproduction process that went into the refactoring. This seemed applicable to biological evolution, where lately there has been much discussion of whether the inference of design is evidence of a designer or evidence of how natural processes create the illusion of design. Design gives us the prototype, but it doesn’t give us the reproduction. The emphasis on design misses the most fundamental aspect of biology: reproduction. The key step in evolution is not the design, but the reproduction of a design.
The key step in evolution is refactoring rather than designing. There is much difficulty in explaining how design can emerge in evolution. I have my own ideas, based on reversing how time works. In any case, I don’t know of any conclusive explanation for how design emerges. The source of design is a hard problem, but it is not as hard as explaining how the design is refactored so that it can reproduce. For design theorists, the refactoring (reproduction) comes after the design. For naturalists, the refactoring occurs before the design. In either case, the refactoring involves the codification of information about what will work in future copies.
The problem for both sides is how information gets introduced into living beings composed of physical matter. The cosmic designer requires an immaterial supernatural essence that can interact with the material world, something we never observe unambiguously. Natural determinism requires the material world to introduce new information through physical processes, where this information is complex, involving multiple simultaneous changes in many interdependent parts.
My thought is that this is what happens in refactoring. It makes me wonder whether modern software development, and data science in particular, could provide new models for how evolution could occur. Initially, we replaced religious doctrine as a basis for imagining a cosmic designer with a natural deterministic model based on the successes of the physical sciences. Now that we are enjoying a new level of benefits from software development and data science, we may look to how these successes occur as models for how evolution may occur.
Evolution involves prototypes: attempts to use existing forms for new problems. Similar to my story above, the initial prototype is a behavioral change in an existing creature. A creature best suited for eating vegetation may begin eating insects, for example. The prototype may show that the new behavior is at least feasible, but it is difficult to reproduce in succeeding generations. For evolution to occur, there needs to be a way to refactor this prototype into something that can be reproduced.
As an aside, this video seems relevant. Even the short video quickly leaps from seeking a new food source, to using prey as an insect repellent, to using it for getting high. In any case, the behavior would not persist unless there is a way to transfer the information to peers or to offspring. Maybe for lemurs this can be cultural, transferred similarly to how humans transfer tradition. Even in that case, there needs to be some kind of refactoring of the initial discovery into something that others can safely yet effectively replicate.
I was trained in engineering back in the early 1980s, when the emphasis was on mathematical formulas documenting theories or laws. The determinism inherent in such formulaic descriptions of the world is consistent with why Darwinian Evolution was more acceptable than Lamarckian Evolution.
Despite my initial engineering education, I have spent most of my career working with data. I started with simulation models that involved implementing algorithms to compute the mathematical formulas explaining some process. However, I ended up with more of a focus on data, to the point where I currently prioritize observational data over model-generated data.
In my categorization of types of data, I described well-documented and well-controlled observations as bright data. I used the term bright to contrast with dark data, a term I modeled on dark matter and dark energy: data created by models instead of by observation. Only just today did I realize that the term bright has a meaning within atheist or skeptic communities, referring to a rejection of mystical explanations for any aspect of the material world. Conveniently, I find the terms to have compatible meanings. My bright data is observational data completely uncontaminated by mystical explanations. However, I’m not one of the bright ones, because I treat all human models as suspect sources of data.
Observational data are inherently adversarial to human theories (scientific or mystical). The reason I prioritize bright data over dark data is that I want to give data the opportunity to challenge science and to make room for non-material explanations. Prioritizing bright data over other types of data gives us the most opportunity to discover new hypotheses.
The story that started this post suggests a hypothesis: modern software development and data science practices may offer a new way to look at the physical world. The world works similarly to how software works. Production software is repeatable. Modern software development incorporates continuous testing, where repeatable good results demonstrate that the software is working correctly. If the scientific method were applied to software products, it would come up with theories of operation that can be reliably repeated by many investigators, and those theories would match the design of the software (assuming the software had no bugs).
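As a minimal illustration of that kind of repeatable check (the function and its expected values here are hypothetical, invented for the sketch), continuous testing reruns assertions like these on every change to the code:

```python
def tax(amount, rate=0.05):
    """Hypothetical production function under test."""
    return round(amount * rate, 2)

def test_tax():
    # Repeatable expectations: identical inputs must always
    # yield identical outputs, checked automatically on every change.
    assert tax(100) == 5.0
    assert tax(200) == 10.0
    assert tax(0) == 0.0

test_tax()  # in practice, a test runner collects and runs this on each commit
```

The point is not the arithmetic but the discipline: any investigator rerunning the suite gets the same good results, which is exactly the repeatability the scientific method demands.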
Meanwhile, software emerges from a multi-tier development process that starts with design, then refactoring, then testing, until it is reproduced in multiple copies for use in production. I can imagine a similar process occurring in the natural world to produce what appears to be designed, even though I can’t imagine what plays the role of the software developers or data scientists. I’m not one of the community of brights, because I refuse to dismiss the possibility of the supernatural.
Coincidentally, ever since I was introduced to Darwinian Evolution in school, I have preferred the Lamarckian approach, where succeeding generations acquire attributes that were desired by their ancestors. I admit it is ridiculous to assume that even human-like consciousness can translate conscious desires into the inheritable code of genetics or epigenetics.
However, I start with the observation that I have consciousness and creative capabilities. From this I conclude that nature is capable of creating consciousness and creative intelligence. It seems reasonable to me that nature could produce consciousness in other ways as well. I think it is a reasonable presumption that everything is conscious unless there is good proof that it is not. Given the philosophical arguments about zombies, whereby we can’t really prove that our fellow humans have the same consciousness we perceive in ourselves, I think it is very difficult to disprove consciousness elsewhere in the world.
The bright thinkers describe consciousness itself as an illusion, even the consciousness we think we have. In their view, there are only deterministic natural processes, so there can not be any free will, and the consciousness we perceive is an emergent consequence of systems of sufficient complexity, such as the human mind. Even this approach suggests that the illusion of consciousness is rare, with humans being one of the few creatures to have it. I can’t accept this, because observations suggest consciousness in all life forms. When the burden of proof is on me to prove that bacteria are conscious, I can’t do it. But when the burden of proof is on science to prove they are not conscious or sentient, science can’t either. We can’t even be sure our fellow humans are not unconscious zombies reflexively acting to give the illusion of consciousness.
Consciousness is mystical and may be incompatible with material determinism. I suggest that only one can be true: either all is conscious, or all is material determinism. My presumption is that consciousness can’t arise from material determinism. However, as in my software-developer analogy, consciousness can produce results that are consistent with the scientific method. The very nature of production software is software that reliably and repeatably passes its tests for functionality.
Taking the accepted notion of the big bang that occurred about 14 billion years ago: in the very instant after the mystical instant of its start, one of two different concepts had to be present and firmly in control of what would follow. Before the emergence of the first forces, there had to be either consciousness or causality. Without causality or consciousness, the consequences of the big bang could not have formed the universe we see. Any emergence of fundamental forces or fields would be ineffectual without causality.
I don’t know of any scientific explanation for why causality must exist. We take it for granted, and it is very intuitive to do so. We can test causal relationships with scientific tests, so we know causality is a useful if not essential concept of reasoning. However, given the magical singularity that started with the big bang, the infant universe could have started without causality. Also, given what we know of quantum mechanics at the dimensions of the initial event, it would be very improbable for causality to emerge at all.
Causality and consciousness are equally mystical. One of them had to be present at the beginning of time in order for history to play out the way it did. Of the two, causality is the more difficult to accept, because getting from the initial big bang to an earth teeming with diverse life forms requires many miracles of spontaneous creation (including the spontaneous starting and stopping of inflation), spontaneous life, spontaneous ascent, and spontaneous consciousness. Each of these steps is extremely improbable. In contrast, an initial consciousness could, through design, refactoring, and replication, build up the universe without any further miracles beyond the initial consciousness in the first place.
My ideas are emerging as I contemplate what I’m experiencing in my work, trying to answer some new question based on observed data. While this work doesn’t involve neural networks, my understanding of neural networks is that they can be trained to perform tasks that require human-like intelligence. As we see with autonomous image recognition or terrain navigation, these intelligent machines can be trained to do tasks with competencies that rival humans. But when we observe how they are doing this, we find they are building models that are not causal in the sense of matching the material world. The causality is only in the sense of picking the answers most likely to result in reward or least likely to result in penalties.
In my classroom experience with a neural network problem for recognizing handwritten numbers, I observed the learned models for number recognition by making a 2D plot of the weights for each element. The learning came up with different ways of recognizing the numbers from how I was taught. I was first taught the ideal concept of the shape of each number and then learned how to produce those shapes in my handwriting or to recognize them in the handwriting of others. In contrast, the machines need no such ideal concepts to reliably recognize handwritten numbers. This can be an illustration that consciousness is not required to do something we think of as intelligent. I think it can also be an illustration that the world can be understood without resorting to causal explanations.
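That classroom exercise can be evoked with a toy sketch (my own illustration, using made-up 3x3 patterns rather than real handwriting): a single linear classifier learns to separate two patterns, and its learned weights, reshaped into a 2D grid, form a template the machine discovered on its own, with no ideal concept of the shapes supplied to it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical "digit-like" patterns: a vertical bar vs. a horizontal bar.
vertical = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=float)
horizontal = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=float)

# Noisy training examples of each class, flattened to 9-element vectors.
X = np.vstack([(p + 0.1 * rng.standard_normal((3, 3))).ravel()
               for p in (vertical, horizontal) for _ in range(50)])
y = np.array([0] * 50 + [1] * 50)  # 0 = vertical, 1 = horizontal

# Train a logistic classifier by plain gradient descent.
w, b = np.zeros(9), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))      # predicted P(horizontal)
    w -= 0.1 * X.T @ (p - y) / len(y)
    b -= 0.1 * np.mean(p - y)

# The learned weights, plotted as a 3x3 grid, are the machine's template:
# positive where horizontal bars have ink, negative where vertical bars do.
print(w.reshape(3, 3).round(2))
```

The weight grid is the analogue of the 2D weight plots from the exercise: the machine's "understanding" is just a reward-shaped template, not an ideal concept of a bar, which is the point of the paragraph above.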
Causality, and consequently materialistic theories, are similar to supernatural explanations in that both are human stories we use to describe the real world. In that sense, I treat data derived from scientific theories and from mystical explanations similarly. I label both sources of information as dark data to separate them from bright observational data. I prefer to model the world the way neural networks model their tasks: learning based only on examples from bright observations. I want to find new hypotheses.