Machine Understanding

I’m at the opera house watching The Nutcracker. Toward the end of Act II, Scene 1, one of the lead ballerinas stumbles, nearly falling over. The audience falls silent, but before anyone can grasp what’s happening, she leaps into her role again. Thunderous applause follows the curtain’s fall, despite the less-than-perfect rendition. In the next scene, a robot replaces Ms. Akhmatova, and the robo-ballerina executes an immaculate interpretation of the “Dance of the Sugarplum Fairy.”

As I wait in the long line to the men’s restroom in this fictional opera house, I ask myself which of the ballerinas, Ms. Akhmatova or robo-ballerina, has a better grasp of the ballet. If the essence of a ballet lies in its execution, does the robot, with its flawless performance, “understand” the ballet more completely than Ms. Akhmatova, whose occasional missteps fail to escape the seasoned observer?

In “Recursive Neural Networks Can Learn Logical Semantics”, Samuel Bowman, Christopher Potts, and Christopher Manning successfully trained recursive neural networks (RNNs) to apply logical inference to natural language. As with many other pivotal scientific works, the significance of this phenomenal paper won’t be fully appreciated except in retrospect. Machine learning (ML) researchers have been applying neural networks (NNs) to a variety of problems, from image recognition to signal processing, but for me, as a student of natural language processing, this work renewed my faith in neural networks’ capacity to live up to the term “deep learning” and uncover profundity in data.

There is a tendency among non-technical admirers of ML to regard these methods as beyond their creators: independent entities that will one day, given refined enough algorithms and enough energy, out-comprehend their human creators and overwhelm humanity with their artificial consciousnesses. The term “neural networks” is itself a misnomer that doesn’t at all reflect the elaborate complexity of how human neurons represent and acquire information; it’s simply a term for nonlinear classification algorithms that began catching on once the computing power to run them emerged.

The question of whether Samuel Bowman’s NN, or the robo-ballerina in the opening scenario, is capable of “understanding” is largely a theoretical concern for the ML practitioner, who spends the bulk of his or her time on the hard work of curating manually labeled data and fine-tuning a neural classifier with methods (or hacks) such as dropout, stochastic gradient descent, convolution and recursion to raise its accuracy by a few fractions of a percentage point. Ten or twenty years from now, I imagine we’ll be dealing with a novel set of ML tools that will have evolved with the rise of quantum computing (the term “machine learning” will probably be ancient history, too), but the essence of these methods will probably remain: to train a mathematical model to perform task X while generalizing its performance to the real world.

I don’t mean to detract from the brilliance of Sam Bowman’s work. I don’t remember the last time a scientific paper excited me so much (in contrast to the medical literature, with its mantra of randomized controlled trials and cohort studies), and I can’t help but let my imagination wander at the thought that an RNN can actually learn logical inference. As exciting as I find Bowman et al.’s paper, it also led me to grapple with a hairy question: What is understanding, and what is mimicry? Trying to answer this question (without using the word “consciousness”) led to a great deal of mental turmoil that culminated in the writing of this essay.

Professor Timothy Winters, a philosopher from Oxford University, praised the ability to name as humanity’s greatest gift. Implicit in this statement, I think, is the human ability to conceptualize. When I call the energy illuminating my desk lamp “electricity,” I’m not just associating a phonetic time series with my halogen bulb’s white glow; I’m also instantiating an abstract class of natural phenomena and associating with it a body of hypotheses (for instance, Ohm’s Law and Kirchhoff’s circuit laws). Had I called this “light” instead of “electricity,” I would have been operating under a different set of hypotheses using different mental schemata.

So what is understanding? To understand is to admit that one doesn’t comprehend anything at all. To understand is to use our uniquely human ability to create mental schemata of the world, models for how things and people interrelate, and to systematically test and revise these hypotheses. These models might be inspired by a combination of personal experience, bodies of scientific thought, religion or spirituality, but they remain models nevertheless, subject to change, and we ought to accept them as such, lest we one day discover our worlds to be as brittle as the models themselves.

My understanding of people as inherently good, my understanding of myself as a member of society with a moral duty to serve others, and my belief in human reason are all models subject to change based on my own experiences and the experiences of those who influence me. The word “understand” is itself utopian, an attempt at an ultimately impossible feat.