scientists are doing science, especially in such data-intensive sciences as sociology and
epidemiology, for which causal models have become a second language. These
disciplines view their linguistic transformation as the Causal Revolution. As Harvard
social scientist Gary King puts it, “More has been learned about causal inference in the
last few decades than the sum total of everything that had been learned about it in all
prior recorded history.”
As I contemplate the success of machine learning and try to extrapolate it to the
future of AI, I ask myself, “Are we aware of the basic limitations that were discovered in
the causal-inference arena? Are we prepared to circumvent the theoretical impediments
that prevent us from going from one level of the hierarchy to another level?”
I view machine learning as a tool to get us from data to probabilities. But then we
still have to make two extra steps to go from probabilities into real understanding,
two big steps. One is to predict the effect of actions, and the second is counterfactual
imagination. We cannot claim to understand reality unless we make the last two steps.
In his insightful book Foresight and Understanding (1961), the philosopher
Stephen Toulmin identified the transparency-versus-opacity contrast as the key to
understanding the ancient rivalry between Greek and Babylonian sciences. According to
Toulmin, the Babylonian astronomers were masters of black-box predictions, far
surpassing their Greek rivals in accuracy and consistency of celestial observations. Yet
science favored the creative-speculative strategy of the Greek astronomers, which was
wild with metaphorical imagery: circular tubes full of fire, small holes through which
celestial fire was visible as stars, and hemispherical Earth riding on turtleback. It was
this wild modeling strategy, not Babylonian extrapolation, that jolted Eratosthenes (276-
194 BC) to perform one of the most creative experiments in the ancient world and
calculate the circumference of the Earth. Such an experiment would never have occurred
to a Babylonian data-fitter.
Model-blind approaches impose intrinsic limitations on the cognitive tasks that
Strong AI can perform. My general conclusion is that human-level AI cannot emerge
solely from model-blind learning machines; it requires the symbiotic collaboration of
data and models.
Data science is a science only to the extent that it facilitates the interpretation of
data—a two-body problem, connecting data to reality. Data alone are hardly a science,
no matter how “big” they get and how skillfully they are manipulated. Opaque learning
systems may get us to Babylon, but not to Athens.