March 25, 2015
The Quest to Reverse Engineer Our Algorithmic World
Arvind Narayanan
Assistant Professor, Department of Computer Science, Princeton University
Minutes of the 24th Meeting of the 73rd Year
The meeting was called to order at 10:15 AM by President Owen Leach. It was the 24th meeting of the 73rd year of the Old Guard. Charles Clark led the invocation.
The minutes of the March 18th meeting were read by John Schmidt.
95 members were present. Bill Burkes introduced his guest, Frank McDougal.
Jock McFarlane introduced our speaker, Arvind Narayanan, who came to Princeton in 2012 as an assistant professor of computer science. He received his Ph.D. from the University of Texas at Austin and was a postdoctoral researcher at Stanford. Numerous references on the Web describe our speaker's demonstrations that our presumption of anonymity when we use the Internet is not justified: in innumerable ways we reveal our personal identity even when we think we are simply "exploring" with our browser.
Professor Narayanan began with a review of the quest for artificial intelligence. The modern quest is dated to the 1956 Dartmouth College conference at which the field was named. The speaker illustrated the cycles of optimism and disillusion with which we have greeted the successes and failures of that quest. Many of us remember the sense of accomplishment when a computer beat chess grandmaster Garry Kasparov in 1997. With a cartoon of a robot moving chess pieces, Professor Narayanan showed how we had underestimated the complexity of the task: choosing a good chess move from among all the legal ones turned out to be less daunting than physically manipulating the pieces on the board. We have often equated chess skill with high intellect, yet we have still not developed a computer that can fold laundry.
Until this talk I had confused expert systems with the machine learning behind modern artificial intelligence. Expert systems are computer programs that apply rules human experts have developed in order to make decisions. Machine learning instead lets the computer derive its own rules from data: to predict the probable success of a small business loan, for example, a computer can process large quantities of application data and correlate them with measured outcomes. In a three-dimensional model of what he was describing, the speaker showed a space of data points representing successful and defaulted loans. Correlation does not prove cause and effect, but success in predicting a loan's outcome would be a valid reason to include a particular input criterion when deciding whether to grant the loan. As the system processes more data, its accuracy improves with each new example. The mathematical transformations involved, however, may be complex enough to obscure from the programmer just which criteria the system has learned to favor. In one disturbing illustration, a learned criterion might in fact be a surrogate for race or ethnicity. A minimal sketch of the kind of classifier the speaker was describing appears below.
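The sketch below is not from the talk; the feature names, the tiny invented data set, and the choice of the scikit-learn library are my own assumptions, intended only to make the idea of learning loan-outcome criteria from data concrete.

```python
# A minimal sketch (not from the talk): fit a classifier on hypothetical
# loan data, predict a new application, and inspect the learned weights.
# Feature names and all numbers are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [years_in_business, credit_score, debt_to_income, zip_code_income]
X = np.array([
    [8, 720, 0.25, 65.0],
    [1, 580, 0.60, 32.0],
    [5, 690, 0.35, 54.0],
    [2, 610, 0.55, 30.0],
    [10, 750, 0.20, 70.0],
    [3, 600, 0.50, 28.0],
])
# Measured outcomes: 1 = repaid, 0 = defaulted
y = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a new application the way a lender's system might.
new_applicant = np.array([[4, 640, 0.40, 35.0]])
print("predicted repayment probability:",
      model.predict_proba(new_applicant)[0, 1])

# The learned weights show which criteria the model favors. A feature such
# as zip_code_income can quietly act as a surrogate for race or ethnicity.
names = ["years_in_business", "credit_score", "debt_to_income", "zip_code_income"]
for name, w in zip(names, model.coef_[0]):
    print(f"{name}: {w:+.3f}")
```

With a real lender's model the transformations are far more complex than a single weight per feature, which is exactly why the decisive criteria can be obscure even to the programmer.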
I had heard that Professor Narayanan's talk would be about the end of anonymous data and what to do about it. The association of our personal identity with our finances or health records comes to mind, but that is only one part of the concern. One of Professor Narayanan's research areas has been the deanonymization of data. As he pointed out in his later answers to questions, removing or blocking cookies has only limited success in covering our browsing trail. Some Web sites can compute a "browser fingerprint" from the characteristics of your browser each time you visit, and that fingerprint lets the site recognize you when you return. The speed and immense size of the Web make it possible to build a database of previous appearances of your fingerprint and of your behavior on that site. An advertiser might use it to target ads at you, and the targeting could be commercial or political; as it becomes more sophisticated, it can be used to influence voting. Most of us will never be aware of being fingerprinted or selectively targeted, whether for advertising or political purposes. A sketch of the fingerprinting idea follows.
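The following is a minimal sketch, not drawn from the talk, of how a site might derive a stable identifier from attributes it can observe on every visit. The attribute list is my own assumption; real fingerprinting draws on many more signals (installed fonts, canvas rendering, plugins, time zone, and so on).

```python
# A minimal sketch of browser fingerprinting: hash the observable
# characteristics of a visitor's browser into one stable identifier.
# The attributes and values below are invented for illustration.
import hashlib

def browser_fingerprint(attributes: dict) -> str:
    """Combine the observed attributes into a single stable identifier."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

visit = {
    "user_agent": "Mozilla/5.0 (Windows NT 6.1) Firefox/36.0",
    "language": "en-US",
    "screen": "1366x768x24",
    "timezone_offset": "-300",
    "installed_fonts": "Arial,Calibri,Times New Roman",
}

# The same browser yields the same value on every visit, so the site can
# link visits together without storing a cookie on the user's machine.
print(browser_fingerprint(visit))
```

Because nothing needs to be saved on the visitor's computer, deleting cookies does not erase this trail; the records live in the site's own database of past fingerprints.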
An important lesson is that AI will not be just the result of clever program code and ever more sophisticated algorithms. It will be the product of massive data collection, and this enormous hunger for data will be fed by the speed and capacity of the World Wide Web to accumulate and process data.
I came away somewhat awed by how much the World Wide Web has grown. I am grateful that Professor Narayanan and his associates are concerned about the ethics of data collection and its uses. Professor Narayanan leads the Web Transparency and Accountability Project at Princeton University, which uses large-scale, automated web measurement to uncover how companies are collecting and using our personal information. He studies information privacy and security and, as he says, moonlights in technology policy.
Respectfully submitted,
Henry J. Powsner