Instead of having a speaker for this meeting, we chose to have a journal club type discussion of a paper.
Title: “The Dark Secret at the Heart of AI”
Abstract: We will discuss a paper by Will Knight that can be found here.
Everyone is invited to read the paper before the lunch, but Michael Solomon will present a synopsis of its contents for the benefit of those who might not have time to do so.
Present: Ed Turner, Andreas Losch, Erik Persson, Michael Solomon
We met outdoors near the dining hall at IAS.
We had all read the paper, so Michael opened the discussion by summarizing the issue raised in the article: while the benefits of Deep Learning in AI are undeniable, the fact that the basis on which the machine reaches any conclusion is unknown, and in fact unknowable, is problematic. Michael expanded, saying that with Big Data, even with programmed algorithms, results may be biased because of patterns in the data caused by human biases and by assumptions in the program. We can correct for biases we can identify. For example, a well-documented problem occurred with a program intended to help employers hire workers who would remain on the job. The applicant’s zip code turned out to be the most predictive factor, and this was initially attributed to longer commutes. However, it soon became clear that zip codes were surrogates for socio-economic and ethnic factors, and the results actually reflected pre-existing biases in hiring and firing that ought not to be preserved. The more worrisome problem is that we cannot correct for biases we cannot identify. For Deep Learning in AI we cannot even determine how the machine reached a given conclusion, let alone whether that process is biased. The article pointed out the risks of using Deep Learning in the Military, where errors can be devastating and where many false-positive threats are identified. For Medicine, Deep Patient, described in the article, appeared superior to physicians for diagnostic purposes, but with significant risks regarding confidentiality, false positives, and the effects on patients of knowing risks that they may not be able to modify.
Erik asked, “Is the machine explanation for how decisions are made any worse than human explanations for their decisions?”
Andreas wondered whether machines could use memory to explain how they chose. He noted that humans often use intuition for choices and only in retrospect justify those choices.
Ed offered that consumers may not need to know how a conclusion was reached if it works, any more than drivers need to know how their car runs. In the case of a malfunction the court may need to know, though.
Ed thought the article was a bit loose in distinguishing different system architectures. Some AI systems do allow you to see what is going on, especially through techniques based on “dimensionality reduction” such as principal component analysis. Machine Intelligence in general may be more understandable than the subset using Deep Learning.
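To make the mention of principal component analysis concrete, here is a minimal sketch (ours, not from the article or the lunch discussion) of reducing a small data set to its single most informative direction. The function names and sample points are hypothetical, and power iteration is only one simple way to find the leading component.

```python
import math

def first_principal_component(points, iterations=200):
    """Return (mean, unit direction) of the first principal component.

    points: list of equal-length lists of floats, one row per sample.
    Assumes at least two samples that are not all identical.
    """
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - mean[j] for j in range(dim)] for p in points]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1)
            for b in range(dim)] for a in range(dim)]
    # Power iteration: repeatedly applying the covariance matrix pulls the
    # vector toward the eigenvector with the largest eigenvalue (this assumes
    # the starting vector is not exactly orthogonal to that eigenvector).
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(points, mean, direction):
    """Reduce each point to one number: its coordinate along `direction`."""
    return [sum((p[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for p in points]

# Hypothetical data lying on the line y = 2x: one direction explains it all.
samples = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, direction = first_principal_component(samples)
scores = project(samples, mean, direction)
```

The appeal for interpretability is that `direction` is a short list of weights a person can read, which is the sense in which such methods let you see what is going on.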
Andreas asked whether neural networks were hardware or software.
Ed answered that at present most are software running on general-purpose hardware, but they could be set up using specialized hardware. Ed referred to a common computation in astronomy that uses gravitational forces to calculate the movement and positions of large numbers of objects. He said that Piet and some of his colleagues in Japan developed hardware that plugs into your laptop to calculate such trajectories. These devices were competitive but never replaced general-purpose hardware using advanced hardware and software techniques.
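In its simplest form, the computation Ed described is a direct sum of pairwise gravitational attractions followed by a time step. The sketch below is an illustrative assumption about that kind of calculation, not the code or hardware design Ed referred to; the function names, the unit choice G = 1, and the leapfrog integrator are ours.

```python
def accelerations(positions, masses, g=1.0):
    """Gravitational acceleration on each body by direct summation.

    positions: list of [x, y, z]; masses: list of floats.
    Cost grows as N^2, which is why specialized hardware was attractive.
    Assumes no two bodies occupy exactly the same position.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            inv_r3 = r2 ** -1.5
            for k in range(3):
                # Newton: a_i += G * m_j * (r_j - r_i) / |r_j - r_i|^3
                acc[i][k] += g * masses[j] * d[k] * inv_r3
    return acc

def leapfrog_step(positions, velocities, masses, dt):
    """Advance all bodies by one time step (kick, drift, kick)."""
    acc = accelerations(positions, masses)
    half_v = [[v[k] + 0.5 * dt * a[k] for k in range(3)]
              for v, a in zip(velocities, acc)]
    new_pos = [[p[k] + dt * hv[k] for k in range(3)]
               for p, hv in zip(positions, half_v)]
    new_acc = accelerations(new_pos, masses)
    new_vel = [[hv[k] + 0.5 * dt * a[k] for k in range(3)]
               for hv, a in zip(half_v, new_acc)]
    return new_pos, new_vel

# Two hypothetical unit masses at rest, one unit apart, start falling together.
pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pos, vel = leapfrog_step(pos, vel, masses=[1.0, 1.0], dt=0.1)
```

The inner double loop is the part that dominates the running time for large numbers of objects.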
Erik reminded us that one reason we use Deep Learning is that we don’t know how to program the calculation ourselves.
Andreas returned to Deep Patient saying it was too good not to use, but Ethics requires that we understand it better.
Michael thought that people often tend to trust machines more than they would their doctors. The amount of money spent on Alternative Healthcare and on supplements is an example of the lack of trust in Medicine. Whether people know how a conclusion is reached or not, they might put more faith in the machine’s advice than their doctor’s.
Ed agreed that people believe what they want to believe. We often discount facts that contradict our beliefs and emphasize those that agree. If people can’t even agree on the facts, how can they agree on the answer?
Michael noted the paradox of the Information Age. It was thought that availability of diverse opinions and access to more data would improve sharing ideas and consensus. Instead, we get our information only from those who agree with us and polarize discourse even more.
Erik thought that scientists were less likely to come down strongly on one side or the other of controversial topics on which there is disagreement even about the facts.
Ed agreed that may be true for Science, but not so true for scientists. He cited work by a neuroscientist colleague at Princeton who found that when test subjects read a set of stories that reinforce their strongly held views, the pleasure centers are activated on fMRI. But reading opposite view stories results in activating fear and anger centers. Furthermore, the subjects remembered those stories that supported their view one year later much better than those that contradicted their view.
Andreas referred to Robin Lovin’s talk at CTI on May 2nd in which he discussed how to get opposing political camps to work together. It seems hard to get people with opposing views to listen to each other, let alone to change their views.
Erik told of a colleague who had asserted on a public radio show that Homo sapiens is the only animal capable of planning for the future. A caller told him about a chimp at the zoo who liked to throw things at the visitors, and who used the hours when the zoo was closed to stockpile stones to throw later. Erik’s friend went to the zoo, studied this phenomenon, and wrote a paper based on it that won several prizes.
Ed also had a good friend who had been vocal on one side of a debate in astronomy. By chance, his friend was visiting when Ed and a graduate student obtained a measurement that proved his friend was wrong. His friend looked at the data, said, “Well, we were wrong,” and never looked back.
Michael referred to the Consciousness Club discussion in NYC last night, in which someone in the audience correctly pointed out that when we perform experiments, the choice of which parameters to include often determines certainty. One parameter may have wide variance but good specificity while another may have narrow variance but poor specificity. Which you include determines the accuracy and precision of your experimental results. Identifying the truth remains elusive at best. Interpreting statistical data is often counter-intuitive, and sometimes may be just wrong.
Ed focused on differential techniques to discover how the machines using Deep Learning come to conclusions. You could feed the Deep Learning machine two different data sets during the curriculum phase and use that difference to probe how the data affects a conclusion.
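As a toy illustration of this differential idea (our own sketch, with a tiny logistic model standing in for a Deep Learning system), one can train the same model on two versions of a data set that differ in one controlled way and compare the answers on the same probe input. The data, the feature names, and the functions below are hypothetical; the second feature plays the role of the zip code in the hiring example Michael described.

```python
import math

def predict(w, b, x):
    """Probability the model assigns to the positive outcome."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit a logistic model by plain full-batch gradient descent."""
    dim = len(rows[0])
    w = [0.0] * dim
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        gw = [0.0] * dim
        gb = 0.0
        for x, y in zip(rows, labels):
            err = predict(w, b, x) - y
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Each row is [skill, zip_flag]; the historical labels track zip_flag only.
labels = [1, 0, 1, 0]
with_zip = [[1, 1], [1, 0], [0, 1], [0, 0]]
# Second training set: identical except the zip_flag column is blanked out.
without_zip = [[1, 0], [1, 0], [0, 0], [0, 0]]

model_a = train_logistic(with_zip, labels)
model_b = train_logistic(without_zip, labels)

# Same probe for both models: low skill, favored zip code.
probe = [0, 1]
p_a = predict(*model_a, probe)
p_b = predict(*model_b, probe)
difference = p_a - p_b  # a large gap means the conclusion leaned on zip_flag
```

A large `difference` does not explain the model's internal reasoning, but it does show which part of the training data a given conclusion depended on.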
Michael referred to the use of Sparse Data mentioned in the talk last night. Tracing all or most of the neural connections in processing results in an uninterpretable mess, but limiting the tracing to a small number of pathways or nodes yields useful data. This is analogous to using Key Indicators in business models as predictors.
Returning to the issue of allowing machines to determine outcomes that have major ethical implications without our knowing how the machine reached a conclusion, Erik suggested we might be able to teach a machine Ethics.
Michael applied the well-known trolley problem in ethics to self-driving cars. Should the car swerve to avoid hitting five people if that requires hitting one person? How do sins of omission relate to sins of commission? If you are the agent causing the harm (by diverting the trolley in the thought experiment) how does that alter your choice? Since there is no single universally accepted basis for Ethics, teaching the machine would be extremely problematic.
Erik referred to the alternative version of the trolley problem. Would it be acceptable to take the organs of a perfectly healthy living donor to save the lives of five dying organ transplant recipients? While many accept sacrificing one person instead of five to the trolley, almost none accept killing the healthy donor for his organs.
Ed asked, “Should the car put a higher value on its passengers and their families than on pedestrians?” Those who buy the cars would certainly prefer to protect themselves and their families. Those manufacturing the cars might have a difficult time justifying such a priority in court when they are blamed for injuries to the pedestrian.
Erik further offered that if self-driving cars are in fact safer and do save lives, then by giving priority to the passengers, and therefore encouraging consumers to buy more self-driving cars, society would benefit by saving more lives.
Although we strayed quite a bit from the issues raised in the article, it does appear that if we are to accept the benefits of the Deep Learning technology, at least for the present we may have to accept some loss of ethical oversight. The degree to which that loss is less than with human oversight remains unclear. How to minimize that loss and preserve a determinative role for ethics in a world increasingly dominated by technology (of which AI is only one part) remains challenging.
At the end of our discussion we questioned whether we should use the journal club format in the future. There are still many potential speakers we could enlist.
We agreed that we would not have a lunch meeting on May 11th.
Erik agreed to speak on May 25th on the topic he has been working on – A Definition of Life.
We will not meet on June 1st as both Ed and Michael are unavailable that day.
Michael Solomon, MD