Abner Shimony

Interviewed by

Joan Bromberg

Interview dates

September 9 and 10, 2002

Location

Shimony's home, Wellesley, Massachusetts

See catalog record for this interview.

Usage Information and Disclaimer

This transcript may not be quoted, reproduced or redistributed in whole or in part by any means except with the written permission of the American Institute of Physics.

This transcript is based on a tape-recorded interview deposited at the Center for History of Physics of the American Institute of Physics. The AIP's interviews have generally been transcribed from tape, edited by the interviewer for clarity, and then further edited by the interviewee. If this interview is important to you, you should consult earlier versions of the transcript or listen to the original tape. For many interviews, the AIP retains substantial files with further information about the interviewee and the interview itself. Please contact us for information about accessing these materials.

Please bear in mind that: 1) This material is a transcript of the spoken word rather than a literary product; 2) An interview must be read with the awareness that different people's memories about an event will often differ, and that memories can change with time for many reasons including subsequent experiences, interactions with others, and one's feelings about an event. Disclaimer: This transcript was scanned from a typescript, introducing occasional spelling errors. The original typescript is available.

Preferred citation

In footnotes or endnotes please cite AIP interviews like this:

Interview of Abner Shimony by Joan Bromberg on 2002 September 9 and 10, Niels Bohr Library & Archives, American Institute of Physics, College Park, MD USA, www.aip.org/history-programs/niels-bohr-library/oral-histories/25643

For multiple citations, "AIP" is the preferred abbreviation for the location.

Abstract

In the interview Shimony discusses his undergraduate years at Yale in mathematics and philosophy; influence of C. S. Peirce, A. N. Whitehead; reactions to Hume; studying under Robert Calhoun and Paul Weiss; the bases of Shimony's physical realism; Army service at Ft. Monmouth, 1953-55; physics Ph.D. at Princeton; reading EPR; interaction with Eugene Wigner; teaching and doing research on the philosophy of quantum mechanics at MIT in the 1960s; first reactions to Bell's 1965 paper; collaboration with J. F. Clauser, M. A. Horne, and R. Holt on tests of Bell's inequality; the 1970 Varenna summer school on Foundations of Quantum Mechanics; the researches on hidden variable theory and on quantum mechanics of von Neumann, G. Mackey, J. P. Vigier, C. Piron, J. M. Jauch, E. Specker, and S. Kochen; the metaphysical implications of quantum mechanics: potentiality and nonlocality; the search for non-linear modifications of quantum mechanics; neutron interferometry; interactions with C. Shull, A. Zeilinger, and D. Greenberger; devising measures of entanglement; plans for future research.

Transcript

Bromberg:

I think we might want to start with the 1950s. Does that sound right with you?

Shimony:

I will inevitably go back quite a bit earlier than the 1950s.

Bromberg:

Well then let's start where you might go back to.

Shimony:

Yes, I will try to do everything chronologically. I'll answer your questions and then there'll be some references to earlier times. But look, here question one is about Max Born's Natural Philosophy of Cause and Chance and whether it first turns you to philosophy of physics. That's not true. It did influence me very much, and it sort of triggered my decision to go back to school and get a doctorate in physics. But it certainly didn't turn me to philosophy of physics. What happened was in the winter of 1952–53, I was working on my doctoral thesis in philosophy, which was on probability. Now, it was on probability in the sense of inductive logic, reasonable degree of belief.

I was an eager student and I read literature on probability, including things that went beyond what I really needed for the thesis, and I did some reading on probability in physics. For instance, I read about ergodic theory, which I didn't use in the thesis, but I was interested in it. Ergodic theory is one of the possible foundations of probability. I read Born's book. I don't think I made any use of it for the thesis, but it was fascinating. Born is a wonderful expositor, and I became very interested in both classical statistical mechanics and quantum mechanics. Not that I hadn't been interested in quantum mechanics before, but my interest was revived.

I had a very exciting time as an undergraduate, but you see why I was drawn to major in philosophy? Because it enabled you to do everything. Of course, later on I did learn that if you know everything and don't know anything very well, there can be some superficiality in what you do. I kept being nagged in the back of my mind by the rigor of mathematics and the magnificent achievement of physics. I did continue to take a few physics courses. I took a course in classical mechanics and a graduate course in quantum electrodynamics, the first course in which I had an exposition of special relativity theory. So I continued to be interested in physics.

I never took Margenau's course in philosophy of physics, which was a mistake. I'm sorry I missed something. I did come to know, I think in about senior year, Adolf Grünbaum. I may have met him at Weiss' soirée, but I did meet him and I was very impressed by the way in which he put together his knowledge, particularly of relativity theory, with philosophical questions. So I saw that it could be done. And I remember it was always in the back of my mind that I would someday do some philosophy of physics, even though the work I did for a while was more in philosophy of mathematics. With Fitch I did logic, and then Fitch taught philosophy of mathematics of a kind of platonic sort. Closer I guess to Gödel than to anyone else. I think I talked in my preface about Gödel's influence on my thinking.

I must have read them around 1996 to 1998, after Wang died. There was a memorial for Wang, and I was asked to talk about Wang's philosophical ideas, because most of the people in the memorial service were going to talk about his contribution to logic. So I read three books of Wang's: one called Beyond Analytic Philosophy, one called Reflections on Gödel, and one called A Logical Journey, from Gödel to Philosophy. He has a real polemic against both the British analytic philosophy and logical positivism for somehow detaching semantic analysis from the content of mathematics and the content of the natural sciences. Now, to be sure, the Vienna Circle is very interested in the natural sciences. Carnap had a degree in physics, and Carnap knew a lot of classical physics. One of the purposes of the reforms the Vienna Circle envisaged was to clear away nonsense from our language in order to have a language suitable for the natural sciences, and a language in which one can reason with rigor in the natural sciences, and present the discoveries of the natural sciences without a metaphysical overlay.

[Henry] Margenau did. I know that, and I heard little bits of this from [Leigh] Page. Page was the theoretical physicist at Yale. I heard about these things, even though I did not have a course in philosophy of physics as an undergraduate. It was a great gap. Anyway, a little bit more on Peirce. Peirce had a philosophy which was grounded in science. In volume one of the collected papers he said something wonderful about why he has devoted so much of his life to the study of history of science.

He says, "So that I may not neglect any pathway to the truth"; that is, if I study what was historically done, and what was historically done was successful, that must be a pathway to the truth. I cited that passage in my essay on Kuhn. Kuhn in my opinion studies history of science for the wrong reason: in order to bypass, in order to emphasize the disagreements instead of to emphasize the systematic approach to the truth, which is I think the signature of the progress of science. Anyway, let me go on a little further. What else was impressive in Peirce? He was a fallibilist. He said no proposition can be asserted with absolute confidence. Does that make him a constructivist or deconstructivist or relativist? Not one bit. At the same time that he was a fallibilist, he believed very strongly in reasonable degree of belief, and the use of probability theory.

He had many essays on inductive method. Now, lots of them are his own redoing of the hypothetico-deductive method. So as I say, from way back, from high school days, I must already have dimly known, as anybody who studies the sciences dimly knows, about hypothetico-deductive method. Everyone knows you don't deduce the laws of nature from phenomena, but by taking those laws as conjectures and deriving consequences and comparing the consequences with the data, one can confirm or disconfirm. The whole machinery of probability theory is to turn that very qualitative framework which I just sketched into something quantitative. So I was very impressed by this, and I thought that what Peirce had was one of the answers to Hume; a useful answer if you can't answer Hume by dismissing Hume's statement that any impression is logically independent of any other impression. That may or may not be true.

As I said, Whitehead really had a different analysis of Hume. His different analysis was that one experience can be ingredient in another. But what Peirce is saying in his essays on scientific method is quite apart from ultimate ontology. You have an answer to Hume by use of probability theory. Now, as far as I know, Peirce doesn't mention Bayes, but he must have known it; Peirce knew everything. He certainly knew Laplace, and Laplace makes a lot out of Bayes' theorem. So it's possible that Peirce did. All I know is that when I learned about Bayes' theorem at the University of Chicago (I read Nagel's little book on probability theory and then I took a course in probability with Carnap), it seemed to me that the Bayesian formulation of probability is the natural framework within which you can do the hypothetico-deductive method.
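The Bayesian reading of the hypothetico-deductive method described here can be sketched in a few lines of Python. This is only an illustration, not something taken from the interview: the two rival conjectures `H1` and `H2`, their prior probabilities, and the likelihoods are hypothetical numbers chosen to make the arithmetic easy to follow.

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Return P(H | E) for each hypothesis H, given P(H) and P(E | H)."""
    # Total probability of the evidence, summed over the rival hypotheses.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

# Hypothetical numbers: two conjectures start out equally credible.
priors = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
# H1 makes the observed consequence likely, H2 makes it unlikely.
likelihoods = {"H1": Fraction(9, 10), "H2": Fraction(3, 10)}

posterior = bayes_update(priors, likelihoods)
print(posterior["H1"], posterior["H2"])  # prints: 3/4 1/4
```

Observing a consequence that H1 predicted well raises its probability from 1/2 to 3/4 without ever reaching certainty, which is the quantitative version of "confirm or disconfirm" in the qualitative scheme.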

I was ripe for incorporating the Bayesian point of view into my work. Now I would have to jump ahead to my doctoral thesis, where I have some modifications of Bayesianism. Anyway, reading Peirce certainly prepared me for the point of view that probability is absolutely essential in our epistemology, and that judgments of very high probability in favor of one conjecture and against another are quite compatible with his overall fallibilism. So I thought he had the makings of a balanced epistemology. Balanced between dogmatism on the one hand (and you want to avoid dogmatism, which of course was Hume's greatest enemy), and excessive skepticism on the other hand. I really felt that Peirce came closer to striking the balance between these two polar errors than any philosopher that I had ever read before. Now let me go on to a few more things in Peirce. He also really anticipated so much of the epistemology of the latter half of the twentieth century.

Through much of Peirce's career, he defended the idea that the primary sense of probability is the frequency interpretation of probability. So probability really refers to an ensemble of similarly prepared systems. Then somewhere around 1890 or so he wrote a series of papers saying, "I was all wrong. I was immature, unripe." The frequency theory, the frequency interpretation is derivative; it doesn't apply to the individual case. You may have an ensemble of one entity and no replicas are made. You may destroy the die and, nevertheless, when the die was in one piece it had a certain probability of turning up one, two, three, four, five or six.

Destroy it; you have no ensemble anymore. Therefore, the fundamental notion of probability has got to be a would-be that applies to an individual case. So Peirce had the idea that a statistical interpretation is simply derivative. When one has the same would-bes, or a large number of entities similarly prepared, then one has a statistical interpretation of probability. But the would-be is well defined in an individual case, even though you can't test it without having the ensemble. It's meaningful in the individual case. That's exactly what Popper said in his papers on propensities in the 1950s. Now, my feeling is that Peirce was the first to articulate the propensity interpretation, though it's clear to me that Laplace had the idea but he didn't articulate it so well. Now, where does a propensity interpretation of probability fit in a classical world view? That's a real problem for Laplace, because Laplace was a strict determinist when all the data are given, all the boundary conditions, all the conditions of the whole universe. That's why for Laplace it's not entirely clear that he had a well-formulated propensity interpretation.

But Poincaré did. He didn't call it propensity interpretation; he called it chance. Poincaré was the man who first developed the ideas of chaos. The idea is that even if the world is deterministic (we don't know for sure, but even if it is), that doesn't mean that if you have only approximate knowledge of the initial conditions on a system, you will be able to say approximately what the system will be doing ten seconds from now or a year from now. It is characteristic of chaotic systems that if you're off by an ever so little bit from the exact initial conditions, you will be off by ever so much in the final state. We now have several bodies of mathematical theory, ergodic theory and chaos theory, which investigate such things. But what Poincaré said was that a roulette wheel can be described objectively by probability because, unless you know exactly what torque the croupier is imparting to the wheel, and exactly what resistance is imposed on the bearings, and exactly what resistance the air will make, you don't know how many times the wheel will go around. Since any little error allows for a large range of possibilities in the final state of the roulette wheel, where it ends up, you have an objective determination of a probability distribution of outcomes. I'm mixing up terminology.
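The contrast between stable and unstable deterministic systems can be made concrete with a small numerical sketch. The logistic map x → r·x·(1 − x) used below is not mentioned in the interview; it is a standard textbook example, assumed here as a stand-in for the roulette wheel, and the parameter values, starting point, and initial error are arbitrary choices.

```python
def max_separation(r, x0, delta, steps):
    """Largest gap between two logistic-map orbits started delta apart."""
    x, y = x0, x0 + delta
    largest = abs(x - y)
    for _ in range(steps):
        # Both orbits follow exactly the same deterministic rule.
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
        largest = max(largest, abs(x - y))
    return largest

# r = 2.5: orbits settle onto a fixed point, so a tiny error stays tiny.
stable_gap = max_separation(r=2.5, x0=0.2, delta=1e-10, steps=100)
# r = 4.0: the map is chaotic, so the same tiny error grows to order one.
chaotic_gap = max_separation(r=4.0, x0=0.2, delta=1e-10, steps=100)
print(stable_gap, chaotic_gap)
```

Both runs use the same rule and the same initial error of one part in ten billion; only in the chaotic case does that error eventually spread the two orbits across a large part of the available range.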

That's how chance arises in a deterministic physical system. It is deterministic but yet unstable. In a stable system, if you're off by a little bit in initial conditions, you're off only by a little bit in the prediction. In an unstable system, chaotic system or chance system, if you're off by a little bit in your initial condition, you're off by ever so much in your final condition. Now, Peirce knew all this. I don't remember that he cites Poincaré, but he understood all this, and I think he talks something like this when he talks about would-bes. Peirce believed in addition that the world is indeterministic. He said, first of all, we have no evidence; no evidence for strict determinism.

All evidence is evidence concerning bouncing balls or solar systems, and that evidence is fitted perfectly well by laws which are off by a little bit. Secondly, even if you granted that the laws were exact, where do the ultimate initial conditions in the universe come from? They can't be themselves matters of law; that would just give you an infinite regress. Therefore, brute chance must enter somewhere in the world. The Judeo-Christian creation legend makes the brute chance be the will of God in starting the whole mechanism going. The 17th and 18th century mechanists kept some of the religious point of view.

That is, God set the whole mechanism going, but they added certain things, such as the deists said that once it was set going, there was no more interference with the machinery. The theists said it is possible for God to intervene in that which He set in motion in the first place. What Peirce is insisting on is that the various mechanists and determinists of the past still have an opening wedge for something nondeterministic; namely, the ultimate initial conditions. If that is so, if initial conditions are not matters of law, then why not take a cosmology in which chance is suffused through the world throughout its career? Instead of localizing chance at the moment of creation, let us suppose that as time goes by there always is absolute chance.

Now all of this, remember, is before quantum mechanics. He did know about statistical physics. That is, he knew about kinetic theory of gases. But there were formulations of kinetic theory of gases entirely within deterministic mechanics. Boltzmann's own formulation was within a Newtonian dynamics. Now, this is what I'm building up to.

He had a point of view which was sitting there waiting for quantum mechanics to be discovered. When quantum mechanics was discovered, there was evidence that the world really does work the way he conjectured in his alternative to classical cosmology. That is, he had the point of view that stochasticity is suffused throughout nature at all times. Then along comes a theory representing particles by waves, by wave functions. When measurements are made or something else happens that requires a definite outcome, the outcome is chance.

Bromberg:

What about [tape volume cuts out; few inaudible words] between optics and classical optics. Was that any part of what you were looking at then? I mean, I'm now thinking about Emil Wolf and that kind of thing where they—

Shimony:

Rayleigh and Jeans had statistical optics. But what does that mean? It means that you don't know exactly what radiation comes from the sources, therefore you have to treat the radiation by a probability distribution.

Bromberg:

Which sounds exactly like classical kinetic...

Shimony:

"Read the paper by Einstein, Podolsky and Rosen on an argument for hidden variables, and find out what's wrong with the argument." So that was my first reading of the EPR paper, and I didn't think anything was wrong with the argument. It seemed to be a very good argument. I never saw anything wrong with it. Later on I realized it had premises. Well, I knew then that it had premises, and one thing that could be wrong is that one of the premises is false. In fact, the argument is based on relativistic locality, and that may be the trouble. That is, that the quantum correlations may not be compatible with relativistic locality.

In quantum mechanics, the wave function of whatever is the whole system (which may include apparatus, or it may include apparatus and environment), will just go on evolving deterministically under the Schrödinger equation. Where does an event occur? Events do occur, undeniably there are events. Therefore, there must be some breaking of the deterministic evolution of the state. We don't know now where that break is. One possibility is that when you have microscopic systems, the Schrödinger equation has to be revised. In fact, you'll see in my paper that I list a number of possible loci for modification for quantum mechanics. There are people who believe, like Penrose, that the proper locus for modification of the Schrödinger equation is when the space/time metric becomes involved.

I don't remember Wigner ever mentioning that as one of the possibilities, but he certainly did at least once write down a non-deterministic equation governing the evolution of the density matrix, i.e., the statistical operator. Now, one possible locus of the breakdown that he does talk about is when life comes into play. Not mind yet, but life. He has a very interesting paper arguing that the phenomenon of life with reproduction is incompatible with quantum mechanics. It's in his Collected Papers. The reason is that in order to have reproduction, some parameters of the new generation must be identical, or close to being identical, with the corresponding parameters of the preceding generation. He does some counting of the number of parameters that have to be determined, and then he says, "There are not enough constraints in the quantum dynamics to fix those parameters."

If that's so, then reproduction, similarity of the offspring to the parents, cannot be... I think it's tangential, his argument that reproduction cannot be explained physically. But he is unequivocally against a physicalistic treatment of mentality. And he says it is possible that the locus of the breakdown of validity of the Schrödinger equation is when systems endowed with mentality are involved, like Wigner's friend. That is, the paradox of Wigner's friend still being suspended between having seen a red light or a green light, would be resolved if the Schrödinger equation doesn't govern the mentality of the friend. So stochastically, one or the other of these possible visions is picked out.

He admits that as a possibility, and if this possibility is true, then two things. One is the integration of physics with psychology, which he believes is inevitable if science will continue. That's the great frontier, of putting them together. That when that integration occurs, it will necessarily involve some modifications of things that are now precious within physics, like the literal truth of the Schrödinger equation. So that's one consequence of making mentality the locus of the breakdown of the Schrödinger equation. The other consequence is there will be some similarity to Bohr's point of view. It's not that he started with Bohr's point of view, but this idea that the Schrödinger equation is not valid when mentality enters would lead to some common ground with Bohr. And he says there still will be subtle differences between the orthodox point of view and myself, and I think he's absolutely right on the subtle differences.

He says the trouble with the orthodox point of view is that it makes the fixed points of physics to be sharp, clear observations made on experimental apparatus, like what number you read on a scaler, or whether a bell rings or does not ring. Whereas in a real integration of physics and psychology, you must take into account the whole range of psychic phenomena, including sleep, the unconscious, peripheral vision, many things that are not sharp.

His brother was a physicist, and got him interested in superconductivity. London did some beautiful theoretical work on superconductivity, and his brother said, "If you turn all that in, you'll get a doctorate in physics." And he did, and he became the greatest expert on the theory of superconductivity of his generation.

As a student of Husserl, there were some residues of phenomenology in the little booklet of London and Bauer. Without giving you the details, in the first quantum mechanics paper I wrote, the one called "The Role of the Observer in Quantum Mechanics", I have a long passage on London and Bauer. That came from reading the book to teach the course at MIT. That course really got me into the general literature.

Bromberg:

Therefore, in a way, he got you into the measurement problem in a little bit more detail at least.

Shimony:

Yes. The first conference that I went to entirely dedicated to foundations of quantum mechanics was in 1963 in Cincinnati^[1]. It was organized by Boris Podolsky (of Einstein, Podolsky, and Rosen), who was Professor of Physics at Xavier University. That was a really great conference, with great people attending. Wigner was there, Dirac was there, Bohm, Aharonov, Wendell Furry, and some more will come to my mind. Wigner asked them to let me make a talk, and I did give a talk. I gave what was essentially the content of "The Role of the Observer in Quantum Theory."

Dirac asked me a question and scared the hell out of me. I thought he was setting me up by a preliminary question, and then would say something devastating. He said, "What is the meaning of solipsism?" I said, "Solipsism is the theory that nothing exists but the knowing subject." He said, "Oh." He was content with that. I thought something terrible would happen after that. Anyway, I think from that meeting, it became known that I was interested in foundations of quantum mechanics. I did send copies of the reprints of "The Role of the Observer" around to lots of people. I suspect that's why I got Bell's paper in the mail.

Bromberg:

Who sent it to you?

Shimony:

Well, I don't really know. Bell was on a year's leave from [CERN] in 1964.

Bromberg:

He was out at Stanford, wasn't he?

Shimony:

He was one of the most rigorously honest men ever, and I never met anything like it, myself. He was awesome. He didn't think that a theory, for instance, that said once you have a diagonal density matrix, the problem is solved because you could never do an experiment that would show superpositions of terms with different positions of the needle. He says that's true, and he introduced the acronym FAPP, "for all practical purposes". Yes, that's true for all practical purposes, but you haven't said how the selection is made. Furthermore, what is measurement anyway? Measurement is an anthropocentric thing. What we are doing in physics is describing the world, and measurement is just one small incidental part of the interaction of human beings, a very small part of the world, with the great physical world.

So it's not enough to have an interpretation of quantum mechanics that simply saves all of the appearances in measurements, because that isn't the world. I have another paper in which I've given an argument. I don't think you've seen this paper because it only came out a couple of weeks ago. It is called "Reminiscences and Reflections on Bell." One of the questions that I ask, and the main part of the paper, is why did Bell discover Bell's Theorem and nobody else did? Not a hard theorem to prove. Then you could say, well, the hard part is thinking up the question, not giving the proof. But there were other people who were interested in the question, too. I traced a series of steps in which Bell, because of his rigorous honesty, was discontented with partial answers.

I got that. So let me summarize what I knew as of 1964. I had probably taught foundations of quantum mechanics twice at MIT at that time. So I knew some literature. I knew the Einstein-Podolsky-Rosen paper, I knew Bohr's answer to Einstein-Podolsky-Rosen. I knew von Neumann's no hidden variables proof in his book. I knew Bohm's model, and I think Bohm also criticized von Neumann's additivity of expectation values. I don't remember whether I knew Jauch and Piron at that time. I'm not sure. But since I knew Kochen and Specker, I knew their criticism of von Neumann, which was essentially the same as Jauch's and Piron's. And I knew of Gleason's theorem, I knew what he asserted, though I had never studied the proof.

That probably was the extent of my knowledge as of 1964. Then what happened? The paper by Bell came through the mail. That paper said two things. We now know that there are no hidden variable theories in the standard sense, of assigning definite truth values to every projection, if the dimensionality is greater than [or equal to] three. But he presented a model for hidden variable theories for a spin one-half system. Locality doesn't enter it quite yet. We know that hidden variable models are possible with two dimensional systems. Then, this is so peculiar, he sketches very briefly a result which he gives in detail in his 1966 paper, which he wrote before the 1964 paper, but the paper was lost in the Reviews of Modern Physics office. Therefore, it came out chronologically out of order.

In that paper, what Bell says is that none of the no-go theorems about hidden variables pay attention to another possibility, namely, that the value which a state assigns to a proposition doesn't just depend on the state and the proposition, but on what other things are measured along with that proposition. It later came to be called the "context". I think when I wrote up something on Bell's contribution, I made the distinction between contextualistic and non-contextualistic hidden variable theories. That is bad English. Later, there was a book by Beltrametti and Cassinelli, who took my terms, and left off two syllables each. They had "contextual" and "non-contextual".

Why didn't I think of that, instead of leaving it to non-English speakers to simplify the words? Anyway, a non-contextual hidden variable theory is the kind that von Neumann was talking about. It was an assignment of a definite truth value to each proposition, given the state, regardless of what else is being measured along with it. A contextual one says that if you have two different contexts in which a certain proposition is measured, the value assigned to that proposition by the state may be different. For instance, you may be interested in the total angular momentum of an atom.

The operator representing the total angular momentum, J², commutes with Jz and it commutes with Jy, even though Jz and Jy don't commute with each other. So if you say you are going to have a hidden variable theory which assigns a definite value to J², independently of whether I measure J² along with Jy or Jz, that would be a non-contextual hidden variable theory. Bell says that's not physically very plausible because the procedure I would use, the orientation of the Stern-Gerlach apparatus that I would use for measuring J² along with Jy is different from the apparatus I would use for measuring J² along with Jz. And if what we want to do is pay attention to the influence of physics on our theories, we should enlarge our view of hidden variable theories to permit contextual hidden variable theories as well as non-contextual. So what Bell did in his 1966 paper, which wasn't available when the 1964 paper came out, was to say first of all, "I have blocked a loophole in von Neumann's original no-go proof.

On the other hand, I've opened a new loophole by pointing to the physical plausibility of contextual hidden variable theories." So, there's a see-saw. It's down for hidden variable theories with the no-go theorem, and then up for hidden variable theories when contextual hidden variable theories are acknowledged as being physically plausible. Great. Now, it turns out if you look at the end of the 1966 paper, written before the 1964 paper, with its great Bell's Theorem, you'll see that Bell already raised the question, "What if you impose some extra reasonable physical conditions? Can you still supply contextual hidden variable theories for a general quantum system?" He already in the 1966 paper says, "You may not be able to do that," for reasons which were dimly adumbrated by David Bohm. Namely, the price of doing this may be to introduce nonlocality.
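The angular momentum example given a little earlier (the total angular momentum squared commutes with both Jy and Jz, which do not commute with each other) can be checked numerically. The sketch below is an illustration only, under assumptions that are not in the interview: it takes the system to be two spin one-half particles, works in units where ħ = 1, and uses small hand-written matrix helpers (`matmul`, `kron`, and so on) rather than any library.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def is_zero(m, tol=1e-12):
    return all(abs(x) < tol for row in m for x in row)

# Spin one-half operators (Pauli matrices divided by two).
I2 = [[1, 0], [0, 1]]
sx = [[0, 0.5], [0.5, 0]]
sy = [[0, -0.5j], [0.5j, 0]]
sz = [[0.5, 0], [0, -0.5]]

def total(s):
    """Total component for two particles: s acting on each in turn."""
    return add(kron(s, I2), kron(I2, s))

Jx, Jy, Jz = total(sx), total(sy), total(sz)
J2 = add(add(matmul(Jx, Jx), matmul(Jy, Jy)), matmul(Jz, Jz))

print(is_zero(commutator(J2, Jy)))  # J² commutes with Jy
print(is_zero(commutator(J2, Jz)))  # J² commutes with Jz
print(is_zero(commutator(Jy, Jz)))  # Jy and Jz do not commute
```

In this four-dimensional example J² is not a multiple of the identity, so the fact that it commutes with two mutually incompatible observables is not trivial; those are exactly the two "contexts" in which a contextual theory may assign it different values.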

Now, we can go to the 1964 paper, which says: look, if you wish, at a contextual hidden variable theory, and do it for a pair of particles, both with spin one-half. What's the Hilbert space? Four dimensional. You are not going to be able to have a non-contextual hidden variable theory for that system consisting of the two spin one-half systems. That's already excluded by Kochen and Specker, by Bell's earlier work, and by Gleason. What about a contextual hidden variable theory? Then Bell says, "Let's try to construct one." That is, let the values of a pair of... Well, he actually does it by taking three different observables, A, B, and C. He does A and C for Particle One, and B and C for Particle Two. He has a duplicate: the same quantity C, spin in the x direction, might be used both for Particle One and for Particle Two, and two other directions of spin for A and B. Clauser, Horne, and Holt and I changed that when we redid Bell's Theorem. Anyway, he takes two particles, and he tacitly assumes that you are using contextual hidden variable theories, and let's see if it will go through.

Then the proof is that any theory in which the probability of getting a and b, say, that result, can be written as the product of the probability of getting a for Particle One, and b for Particle Two. That would mean the outcome over here is independent of the outcome over there, and also independent of what variable over there you measure, what quantity you measure. Again, this wasn't so explicit in the original paper. Later on, it was clear that what Bell was assuming was two kinds of locality. One is what came to be called "outcome independence" and [the other is] "parameter independence". That is, the probability of an outcome here is independent of what variable you measure over there, and it is also independent of what outcome it has, modulo fixing the hidden variable.

That is, once the hidden variable is fixed, that sort of screens out anything else about what is done over on the other side. So Bell's locality condition in his 1964 paper is tacitly the conjunction of two locality conditions. He later on made that explicit himself, and Jon Jarrett independently made it explicit. Both of them showed that you really have the conjunction of two different locality conditions. The net result of the theorem of the 1964 paper is to show the indispensability of a feature of Bohm's model from 1952, the feature being that nonlocality is used in order to assign definite outcomes to all quantities.

They did not realize that the question of whether entanglement persisted when particles separate had been asked by Schrödinger in 1935. When they raised this question at the beginning of their paper, they call it Furry's hypothesis. Now, it's not Furry's hypothesis for two reasons. One is, Furry didn't believe it. Furry said that the loss of entanglement, if entanglement is indeed lost, is in disagreement with quantum mechanics. That was the main thing that Furry pointed out.

And he didn't even make the statement that we don't know whether entanglement is lost or maintained in such situations. Whereas Schrödinger, who I thought was more reflective about it, said, "We are at the verge of our experimental knowledge. Nobody has ever done an experiment which looked for entanglement when the particles are well separated." Entanglement is well known to hold safely for the two electrons in a hydrogen molecule or a helium atom, but those are right on top of each other. So anyway, even though they got their history wrong, as almost everybody does, we know by now, they still said, "We now have an opportunity to test whether entanglement persists when the particles are well separated." So they did a calculation whether any mixture of products of one particle wave functions for the two photons would agree with the data of Wu and Shaknov. That is, every term in the mixture is a product of the quantum state of one times the quantum state of two. Now, you take a mixture of those. Will any such mixture yield the experimental scattering data that Wu and Shaknov [got]?

They said no. They showed that it is not so. I don't know whether Mike Horne used the wonderful phrase at that point, but he did at one point say that this using of an old experiment, digging up the data, and using that data for settling some other question, is "quantum archaeology". I love that expression. He should be remembered for that expression. What Bohm and Aharonov did was quantum archaeology on the Wu-Shaknov experiment. I knew about the Bohm-Aharonov paper, and it immediately occurred to me that the Wu-Shaknov experiment also concerns pairs of particles. They are not pairs of spin one-half particles, as in the Bell 1964 paper, but pairs of photons. And things you can say for spin one-half particles, you can translate into things you can say for photon polarization. The polarization space is also a two dimensional space.

So I thought, "You know, I think we can fill a great gap in the literature." I don't know how long I thought to myself, by using quantum archaeology. At first, I thought you could do that. But then when I looked more carefully at the Wu-Shaknov experiment, I saw that the directions in which they were measuring polarization were only parallel to each other or perpendicular to each other. I knew from Bell's 1964 paper that if you just took parallel and perpendicular directions, the predictions of quantum mechanics will not violate Bell's Inequality. You need to have other angles besides zero and ninety degrees. In fact, I think Bell uses zero, 45 degrees, and 90 degrees. I'll come to what Mike, Clauser, Holt, and I did later on.
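
[Editorial note: the point about angles can be checked numerically. The sketch below is an illustration under stated assumptions, not a reconstruction of the original calculation: it uses the later four-setting (CHSH) form of the inequality rather than Bell's 1964 three-setting form, and it assumes an idealized photon pair whose polarization correlation is E = cos 2(a − b). For annihilation photons the sign of the correlation is reversed, which does not change the conclusion. The function names are hypothetical.]

```python
import math
from itertools import product

def correlation(a_deg, b_deg):
    """Idealized quantum polarization correlation for analyzer angles in degrees.

    Assumption: the correlation depends only on the relative angle,
    E = cos(2(a - b)).
    """
    return math.cos(2.0 * math.radians(a_deg - b_deg))

def chsh(a, a2, b, b2):
    """CHSH combination S; local hidden variable theories require |S| <= 2."""
    return (correlation(a, b) - correlation(a, b2)
            + correlation(a2, b) + correlation(a2, b2))

# Every choice of the four settings drawn only from parallel/perpendicular axes.
restricted = max(abs(chsh(*s)) for s in product((0.0, 90.0), repeat=4))

# Settings that include intermediate angles.
intermediate = abs(chsh(0.0, 45.0, 22.5, 67.5))

print(f"max |S| using only 0 and 90 degrees: {restricted:.3f}")
print(f"|S| using 22.5-degree steps:         {intermediate:.3f}")
```

[With only parallel and perpendicular settings the quantum prediction stays at the bound of 2, so no violation can show up; with intermediate angles it reaches 2√2.]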

Anyway, at first I thought that we simply could dig out the data they had, and we could already see whether quantum mechanics or hidden variable theories is supported by the actual data. Two things happened. As I mentioned to you a minute ago, one is I very soon saw that no, it's not going to suffice to use their experiment. You really have to redo the experiment and have at least three different angles between the axes along which polarization is measured for particle one and for particle two. The second thing is I talked to Aharonov. He may have been at Brandeis at that time.

He knew about Bell's paper by then, and I said, "You know, wouldn't it be worthwhile to do an experiment testing Bell's Inequality? Just because you have a discrepancy between quantum mechanics and local hidden variable theories, that doesn't mean that the local hidden variable theories are ipso facto wrong. This may be just the place where quantum mechanics is limited." He said, dismissing me, "It's already been done. That's what Bohm and I did in our 1957 [or 1959] paper."

Aharonov is a very fast thinker and a very fast talker, and I was in awe of him, and thought, "He's right. Maybe he's right. But maybe he isn't right." The more I thought of it, the less convinced I was. What I think he did, and in fact, I'm quite sure this is what he did, he was saying that no hidden variable theories of a particular kind are going to be able to recover the data of the Wu-Shaknov experiment. What is that particular kind? It is a hidden variable theory of composite systems, like a two photon system, which assigns a pure quantum state to this one and a pure state to that one. That is, indeed, a hidden variable theory. But it's not the only kind of hidden variable theory. You could have non-quantum mechanical hidden variable theories.

That is, ones in which the treatment of the individual particles is not a quantum mechanical treatment. That simply hadn't been looked at yet. So it didn't take me very long, but I don't know how long, before I thought, "Aharonov hasn't settled the question. I believe there is still something to be done." I think Howard Stein was living in the Boston area at that time. He was living in Newton. He had read Bell's paper. I think I gave it to him and said, "This is a wonderful paper." I said, "Let's try to devise an experiment to test it. And let's see if we can test it by a variant of the Wu-Shaknov experiment, the variant being using more than two angles between the polarization axes." Howard Stein is the most meticulous reader and reasoner that I know. He was interested in the problem, but he wanted to know the whole background.

In the background of the Wu-Shaknov experiment was a paper by Goudsmit and somebody else, I think. This paper looked at the probabilities of joint scattering of— No, I think just looking at probabilities of scattering of individual photons by electrons. I think that was it. So what they were doing is looking for some generalization of Compton scattering. Now I am afraid that my memory is slipping. It may have been that they actually did look at probabilities of joint scattering in different directions. The paper is very general and very difficult. Howard read, and read, and read, and he found it very difficult, but he wouldn't give up reading it. I said, "Let's accept that this paper is true, and let's go on from there. Let's just do our calculation, modulo the assumption that this is true. If we get something interesting, we can always go back and check more." I'm afraid I'm not quite as rigorous a thinker, and maybe I don't have the same sort of moral scruples that he had. His qualities are absolutely admirable, but they drove me crazy. In the end, we simply abandoned the problem.

He wouldn't give up reading the background literature, and I said, "I don't understand the background literature. I just believe that it's correct. I want to go on from there." And we were at an impasse. That's how things were. It must have been 1966 or 1967. So I put the whole thing on ice because I really wanted a companion to do this. I thought it was a big, ambitious project, and I wanted to do it with somebody. And I left it alone. I don't think I talked to anybody else until the summer of 1968. Then, I had accepted a job at Boston University, because I wanted to be in physics as well as in philosophy. My friend Charles Willis at Boston University said, "We have a man who just took his qualifying exams, and he needs a thesis advisor. He's interested in statistical mechanics. He doesn't have anybody to work with." I think Willis said, "I have too many students. I can't take him on now. Would you take him on?" I remember saying that I had just begun to teach physics, after this interval of just doing philosophy.

"Can't you give me a little time to get habituated?" Willis is a shrewd man and a good friend, too. He said, "Talk to him, and see what you think." So I talked to Mike, and I said, AI just don't have a good statistical mechanics problem. That's what Willis said you were interested in, but I've been brooding about this other problem. It's been bothering me for a long time." I explained to him what the problem was, and I showed him Bell's 1964 paper, and he understood it. He was very enthusiastic. He said, "Yes, I'd like to do that." What I set him to do was to see if we could do a modification of the Wu-Shaknov experiment. He worked at it for a couple of months, and he finally said, AI don't think it's going to work." I don't know who articulated the reason. He did, or I did, or both together.

Ask him, because his memory may be better about this than mine. What turned out to be the trouble is that the photons that come out from positronium annihilation are about half an MeV each. Those are very energetic, hard x-rays. How do you measure the polarization of these things? You're not going to do it with a piece of Polaroid. They will go right through. And you are not going to do it with a calcite prism either. They go right through. They are not going to choose between an ordinary ray and an extraordinary ray. You need something else. In fact, you need what Wu and Shaknov did, Compton polarimeters. The Compton scattering of the photons is sensitive to the polarization. If you have the photon coming in this way, into the region of the polarimeter, and the polarization is say, horizontal, you'd have a whole ensemble of such photons. Well, you'll have a whole ensemble of scattered photons, and that will not have spherical symmetry. It will be distorted in a certain way.

If you have photons coming in vertically, a whole ensemble of them, again, you will have a scattering sphere distorted in another way. If you look at the two scattering spheres, they are distinguishable. They are not very different, but they are different. So you can tell for whole ensembles of photons, if you know that each ensemble was polarized in the same way, either all vertically or all horizontally, and you look at the scattering results from the Compton polarizer, you will be able to say with virtual certainty, "Yes, they were all polarized this way, or all polarized horizontally." Suppose you want to know photon by photon if it was polarized horizontally or vertically. You get practically no information. You get one hundredth of a bit of information in information theoretical terms.

That's not enough to do a check of Bell's Inequality. You need to know photon by photon the polarization. So there we were with our idea broken down, but not completely broken down. This must have been by November of 1968. Who said what to whom, I don't know, because we talked to each other about it. We realized we had to have low energy photons for which you can do easier tests of polarization. Either tests with Polaroids or tests with calcite prisms. We didn't know about the pile of plates method of measuring polarization, which Clauser and Freedman later used. But calcite prisms would have been good enough for us. Then started our enterprise of scholarship, which took the form of asking people questions. "Where can we get correlated pairs of photons?" The polarizations are correlated, and they need to be low energy photons. I asked my colleagues who did some atomic and photonic physics, and nobody could give me a good answer.

I can remember giving a talk in Case Western Reserve, and I asked people there, and nobody could give me a good answer there. Then, I knew that Costas Papaliolios at Harvard and the Smithsonian Observatory had done an interesting experiment testing the Bohm-Bub theory, and I thought, "Wow, he's a real experimental expert. He may know of this." So I wrote him, "I'd like to ask you a question about a proposed experiment..." He was in the middle of an experiment and put off a meeting.

Bromberg:

We were actually sort of talking about how you got into the measurement problem, partly through Wigner, and partly through this very stimulating conference that Podolsky sponsored in Cincinnati, and how you first got Bell's paper, we've gone through that.

Shimony:

We haven't talked about that. As I said, there was some activity on the measurement problem, but it looks to me as if most of the activity was trying to make the problem go away by removing idealizations from the measurement process. There were some people who wanted more radical changes, either hidden variables— And the main exponents of those were David Bohm and Louis de Broglie. De Broglie's history was very interesting. Around 1927, at one of the Solvay meetings, de Broglie had given what he called the Guiding Wave Interpretation. Then he was dissuaded from it by others. I don't know who did the dissuading. It might have been Max Born, but I'm not sure. For years, he became a sort of advocate of the Copenhagen Interpretation.

When Bohm revived the hidden variable theory, he revived it much along the lines of de Broglie's 1927 note; de Broglie got interested in it again, and thought there were some possibilities for it. De Broglie and Bohm weren't identical. De Broglie developed something called a double solution, in which the wave equation governs the normal quantum wave, which he thought was a kind of guiding equation for the particle. The other solution was a singular solution with delta functions. Those delta functions would give you the exact positions of particles. Now, Bohm doesn't have that second solution. Instead of a second solution, he has some extra ontology.

He has particles which have positions, but the particles are really fairly classical. In fact, they obey Newton's Second Law of Motion with one modification namely, an extra force depending on what he calls the quantum potential. So even though both de Broglie and Bohm give similar roles to the wave equation itself to guide the particles, they treat the particles differently.

Bromberg:

When did you read de Broglie's post-Bohm work? While you were at Princeton?

Shimony:

No. I think a lot of this I found when I was teaching the course at MIT. I had to read the literature, and I found that de Broglie had written new things after the 1952 paper of Bohm. I am trying to recall my exact reactions to Bohm's version. I didn't study de Broglie's papers very carefully. I really studied Bohm's much more carefully. My feeling was that this is too special. The Einstein-Podolsky-Rosen paper was very general. It only argued that because there are quantum correlations, and these quantum correlations cannot be explained without action at a distance unless there are hidden variables, therefore there must be some supplementation of the quantum description. But they made no commitment what that description would be.

When Bohm's paper came out, Einstein was quoted— I don't know when I learned this quotation. I think I didn't learn it until later on from Jammer's book on the philosophy of quantum mechanics. Einstein wrote a letter saying, "I don't like these hidden variables." Now, Jammer made, it seems to me, a rather heavy handed use of that statement of Einstein, saying that he doesn't like hidden variables at all. I don't believe that was Einstein's intention because any theory that says the quantum mechanical description is incomplete and has to be supplemented, is ipso facto a commitment to some kind of hidden variables, if you mean by hidden variables, that which supplements the physical description given by quantum mechanics. What I think Einstein wrote, rather carelessly, was, "I don't like Bohm's own type of hidden variables," which to him must have sounded too much like classical mechanics.

We've left classical mechanics behind, and he was not a reactionary. He may criticize quantum mechanics, but that doesn't mean he wanted to go back to pre-quantum days. So at one point or another, I did hear Einstein's objection to Bohm, and I thought that sounded right, that the EPR paper had a power which is not completely caught by Bohm's model. Bohm's model is one of many. But of course, what Bell did was amazing, again, part of his character and part of his intellect. Namely, he took one feature of Bohm's model, which probably Einstein didn't like, namely its nonlocality, and he asked whether that would be a necessary feature of any hidden variable theory that gave the same predictions statistically as quantum mechanics.

There are two books by Mackey. Well, there are more than two, but there are two that I know. One was The Mathematical Foundations of Quantum Mechanics, the same title as von Neumann's book, and the other is something like Induced Representations and Their Applications to Quantum Mechanics. Mackey was trying to do a kind of axiomatization of quantum mechanics, a minimum axiomatization from which one would recover the whole structure of quantum mechanics. Now, I think the antecedents of Mackey's work were Birkhoff and von Neumann. Birkhoff and von Neumann did a lattice theoretical formulation of physical theories, and then a specialization of them to quantum mechanics.

Mackey also uses this lattice theoretical formulation, and then he gets to a certain point where it looks to me as if he just has to postulate the thing he most wants. I want to be fair to him because he's a great man and what he did was magnificent. He postulated something that you just can't get by logical or a priori considerations. That is that the lattice of propositions of the quantum system is isomorphic to the lattice of closed linear sub-spaces of a Hilbert space. Now, that's a big postulate. That's very different from postulating, say distributivity, or a weakened form of distributivity— He has a name for it, which I should remember, but I don't right now.

It's postulating an awful lot of the full structure of quantum mechanics, when you bring in Hilbert space that way. The later work of [Constantine] Piron was in some ways more ambitious. Piron wanted to give axioms that have only a logical or experimental justification, and then derive from them the Hilbert space structure of the propositions of quantum mechanics. So that's a more ambitious program. Now, what else did Mackey do? Once he had the structure of the propositions— These are the yes/no propositions. Does the particle pass through this filter or not pass through? Does it go into the extraordinary ray or the ordinary ray when it passes through a prism? But the projections are binary, they have yes/no answers, or one/zero answers.

So going into the ordinary ray would be plus one, and going into the extraordinary ray would be zero. Suppose the photon goes into the ordinary ray if it's polarized along the x-axis, and into the extraordinary ray if it's polarized along the y-axis. Then the mathematical representation of a device which does this, a binary device of this sort, for any photon that comes in normal to the front surface of the prism would be a projection operator. Now, what does Mackey do? Given the structure of the projection operators, he is able to recover the structure of all observables. Those are, in his terms, "projection valued measures". Given any Borel subset of numbers on the real line, you can ask whether the value of this observable falls in that set, or falls in the complement of the set. Again, that's a yes/no. It's yes if it falls within the set, and no if it doesn't.

That's a projection operator. But if you have a complete set of projection operators for every Borel subset of the line, that complete set tells you what the observable is. Suppose the value of the position observable is seven on a certain scale. That would mean for any Borel set of real numbers that contains the number seven, the answer would be yes, and for any Borel subset that doesn't contain it, the answer would be no. So that means the observable is fully defined once you say what projection operator corresponds to a given Borel subset of the real line. Now that wasn't new.

That wasn't Mackey's discovery, that was von Neumann's discovery. What von Neumann showed is that every self-adjoint operator, which are the operators used in ordinary, non-relativistic quantum mechanics, they are the ones that are assumed usually to represent observables. They are the ones that have real eigenvalues. So von Neumann already showed this relationship between the self-adjoint operators and the projection-valued measures. That is von Neumann's famous spectral theorem.

So Mackey simply recovers that in his book. Now, he goes further. He says, "Now that we know the relationship between observables and projection operators, what about states?" Traditionally, at least until one gets to the further reaches of quantum field theory, the states were supposed to be— Let's go back to von Neumann's version of it, the states were represented by vectors in a complete Hilbert space. The completeness property is a convergence property. That is if you have a sequence of vectors and the difference between the two of them in norm goes to zero, then the set of vectors converges to some vector in the Hilbert space. In von Neumann's formulation, those were the pure states. In addition to the pure state, there are the impure states, or mixed states, which are represented by statistical operators or density matrices. Mackey asked the question, Suppose we define a state in a more general way.

Suppose we don't assume that states are somehow mapped onto vectors of the Hilbert space, but let's suppose a state is defined in terms of the observables. I said, No, let's do it in terms of the projection operators. That is, given a state, given any projection operator, the state will assign a certain probability that that projection operator will get the answer yes when that projection operator is measured. That is not good enough. You also want, when the projection operator is the unit operator, the operator representing the proposition that is always true no matter what is done, then the state must assign a value of one to that projection. Finally, suppose you have two projections which are orthogonal to each other, which means they both can't be true at the same time. Then, you form sort of the generalized disjunction of the two.

What does that mean? Well, the easiest way to do it, is if you say one projection corresponds to one subspace of the Hilbert space, and the other to another subspace of the Hilbert space, and these are non-overlapping, then the disjunction of the two propositions will correspond to the closure of the union of those two subspaces of the Hilbert space. Anyway, you try to go as far as you can in keeping the structure of classical logic. Now, you then want to say that the state has an additivity property. That is, if the state assigns P1 to this projection, and P2 to this one, and these two are orthogonal projections, then it should assign P1 + P2 to the generalized disjunction of those two projections. Fine. That would be like ordinary classical probability theory.

One more step, suppose you had a denumerable number of these projections, and they all are mutually orthogonal. Then you want the state to assign to the generalized disjunction of all of them, a probability which is the infinite sum of the probabilities assigned to each.
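
[Editorial note: the three conditions just described (a probability for every projection, the value one for the unit operator, and additivity over orthogonal projections) can be checked on a small example. The sketch below is only an illustration: it uses the ordinary quantum-mechanical rule, probability = <state|P|state>, which is one instance of the more general definition, and the three-dimensional real vector and the basis are hypothetical choices made for this note.]

```python
import math

def projector(vectors):
    """Projection matrix onto the span of mutually orthonormal real vectors."""
    n = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) for j in range(n)]
            for i in range(n)]

def probability(state, proj):
    """Probability <state|P|state> that the yes/no test P answers yes."""
    n = len(state)
    return sum(state[i] * proj[i][j] * state[j]
               for i in range(n) for j in range(n))

# Hypothetical unit vector in a three-dimensional real Hilbert space.
state = [1 / math.sqrt(2), 1 / 2, 1 / 2]

e1, e2, e3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
P1, P2 = projector([e1]), projector([e2])
P12 = projector([e1, e2])            # generalized disjunction of P1 and P2
identity = projector([e1, e2, e3])   # the proposition that is always true

p1, p2 = probability(state, P1), probability(state, P2)
p12 = probability(state, P12)
p_all = probability(state, identity)

print("P1:", p1, "P2:", p2, "P1 or P2:", p12, "unit operator:", p_all)
```

[Here the probability assigned to the disjunction equals the sum of the two separate probabilities, and the unit operator gets probability one, up to floating-point rounding.]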

Bromberg:

I want to interrupt for a moment because I'm a little bit lost from the point of view of history, or do you want to finish something first?

Shimony:

I'm almost done. All of these concepts were defined by Birkhoff and von Neumann in the early 1930s. Mackey, however, posed a problem. I want to get to Gleason. Mackey posed a problem, suppose one accepts that the projection operators are isomorphic to closed linear subspaces of the Hilbert space. And suppose you accept the definition of a state that I just gave you. What states are there? Are there any states other than the pure quantum mechanical states, which are represented by vectors in the Hilbert space, and the mixed ones which are represented by density operators? He was not able to answer that question.

He suspected that the answer was no because he suspected that if there had been any other states possible, then somebody would have turned up with a construction. But he couldn't solve the problem. He talked to Andrew Gleason about the problem, and as far as I know, Gleason had not expressed any interest in foundations of quantum mechanics before, but he just took this as a problem in pure mathematics, and he proved that Mackey's conjecture was true. That the only states in the sense of probability measures which have these desirable probability properties are those which are recognized by quantum mechanics. They are the usual pure states, and the mixtures thereof.

Bromberg:

Does that rule out hidden variables?

Shimony:

Almost. I haven't gotten there quite yet. Almost. I have omitted one clause in Gleason's theorem. That clause is provided that the Hilbert space has a dimension of three or more. If it has dimension two, this theorem doesn't apply.

Bromberg:

Is that something you were already familiar with in the early 1960s?

Shimony:

He did that in 1957, I think. So what does that say about hidden variables? Well, what does a hidden variable theory do, or what does the usual one do? Let's not talk about all observables. Let's talk about projection operators because they are the simplest observables. They are the ones that have yes/no answers. They have only two possible values. A hidden variable theory would assign a definite value, yes or no, to every projection operator. It would do so in a way that is consistent with those probability conditions that I just told you, so that if you have two projections which are orthogonal, they can't both be true, then remember the probability of the disjunction is equal to the sum of the two.

If more than one of those, if both of them had values of one, then the sum of the two would be two, and that would violate the condition that the probability is always some number between zero and one. It no longer would permit a probability interpretation. So that means a hidden variable theory must assign to every projection operator a definite value, zero or unity, and do it in such a way that if two projections are mutually orthogonal, it assigns unity only to one of them.

And if the two are mutually exclusive and exhaustive, then it must assign unity to one of them and zero to the other. Gleason's theorem, in the case of three dimensions, implies that there can be no hidden variable theory which satisfies those conditions. That is, it's a corollary of Gleason's theorem that you cannot assign, in this systematic way, compatible with probability theory, definite answers yes and no to every projection operator, and therefore to every proposition about the system. That's not the full strength of Gleason's theorem, but it's the part of Gleason's theorem that applies to the longstanding question about hidden variables, the question raised by von Neumann.

It doesn't quite say all that von Neumann's theorem is purported to say. Von Neumann's proof, which we've already been through, is not a correct proof because it assumes a physically implausible premise. But if it were a correct proof, it would state something stronger than this corollary, because it would even cover the case of a two dimensional Hilbert space, like the Hilbert space associated with a spin one-half particle. So Gleason's theorem recovers most of what von Neumann was trying to prove, but not quite all of it.

Gleason's theorem is a very intricate theorem, and several other people were aware of the theorem and said, "There must be a way of proving the corollary to Gleason's theorem," namely the non-existence of the hidden variable theory, "even if we can't prove the theorem in its full strength," namely there are no other quantum states besides the standard ones, the pure states and statistical operators. That is what, independently, three groups set themselves to do.

One was Bell, one was Jauch and Piron, and one was Kochen and Specker. They all were interested in proving that there are no hidden variables that will assign definite values to all the projections. In the case of Bell and Kochen and Specker, they were aware that their theorems only go through if you assume that the dimensionality of the Hilbert space is three or more. In the case of Jauch and Piron, I am pretty sure they thought they had a proof that goes through even when the dimensionality is two.

Bromberg:

Had you already read these papers before you picked up Bell?

Shimony:

The only one that I knew before Bell's paper came through the mail was the Kochen and Specker paper, even though that paper wasn't published until about 1967, after I received Bell's paper. But one of my friends gave me a copy of it, and said, "This is a really interesting paper because it blocks a loophole in von Neumann's proof." The Kochen and Specker proof is a correct proof. There are no holes in it. It is a rather intricate argument, more intricate than necessary, but it's a good proof. It didn't come out until 1967, I think. Bell's paper on this topic came out in 1966 in Reviews of Modern Physics, which is after the great paper with Bell's Theorem. I will go into the time inversion. Bell's proof is also correct except for some minor slips that are rather obvious to correct, and it's a simpler proof than that of Kochen and Specker.

It's a very nice piece of mathematical and logical reasoning. It's a beautiful proof. Both of them restrict their attention to the dimensions three or more. The paper by Jauch and Piron is a little problematic. At first, it looks better than either of those two. I don't remember when it came out. It came out in the 1960s also. In fact, I think it came out a little before Bell's paper, because he refers to the Jauch and Piron paper. It purports to apply, even to systems which have Hilbert spaces of dimension two, but it has an extra premise, which looks absolutely reasonable at first. Bell, in his 1966 paper, says, "It looks reasonable, but an ace has been pulled out of the sleeve."

I'm not going to try and go through that. You can look at Bell's paper if you want, and you can see where he's critical of Jauch and Piron. In fact, Bell's paper begins with criticisms of two earlier attempts to forbid hidden variable theories. Von Neumann is one of them, and Jauch and Piron is the other. He says, "Both of them slipped something in." In fact, what they slipped in is assuming that the value assigned to a— I better not go into it. It's a little complicated. So Bell gives his own proof, and he is correct. It is definitely an improvement over Jauch and Piron.

It definitely blocks the loophole in von Neumann's proof. I like Bell's proof because it's very simple, but it's essentially no stronger than Kochen and Specker. And certain things that Kochen and Specker do are very beautiful and go a little beyond Bell. They talk about something called partial Boolean algebras. A partial Boolean algebra is a subset of all the projections, but within that subset, you can have distributivity. That is, A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). So it's a little bit hard to draw it in air.

It's a slight generalization of the standard distributivity in Boolean logic. So the distributivity holds within this subset, and then you have many different subsets, and you can assign different hidden variables to each subset. But the hidden variables in one subspace will not agree with the hidden variables in another one of these partial Boolean algebras. It was a nice contribution to quantum logic.

Shimony:

I had found out something interesting at Case Western Reserve. A friend there said, "Well, I don't know the answer to your question, but I have an experimentalist friend at Harvard named Joe Snyder, and he may know." Snyder I remembered from Princeton. He was a graduate student with me, and the only thing I remembered of him from Princeton was, I think, he was the champion tennis player in the Physics Department. I went to see him, and we dimly remembered each other from Princeton. We ran in different circles. I told him what I wanted, and he said, "You know, I think I've seen an experiment recently which worked with polarization correlation at low energies. Let's go up to the library and look."

Pritchard. David Pritchard said to Kocher, "Wouldn't your apparatus be the sort of thing Clauser needs?" And Kocher thought about it and said, "Yes, I think so." And Clauser, I'm not sure he knew about that apparatus, but he looked up the Kocher-Commins paper, and that's what he wanted. So I think he found out about it about a month before we did. Because he had time to write up a little note to put in the bulletin for the [American Physical Society] Washington meeting for 1969. Mike and I were too late for that. I remember telling Mike, "It doesn't matter.

Nobody is working on this sort of thing. This is so far out. We'll write up a full paper with all of our calculations, and that will be much better than one paragraph in a bulletin." Then the bulletin came out, and there was our work, and we felt pretty low. We really felt pretty sick. So I asked some of my colleagues. I remember asking Wolf Franzen, who said, "A note in the Bulletin isn't a publication. You go ahead and publish your own paper." Well, I didn't really like that idea. I didn't want to pretend that I hadn't seen the note in the Bulletin; I would certainly want to acknowledge it. So I called Wigner, and I called him at home in the middle of the evening. I said, "This is the best work I've ever done, what should we do about it?" Wigner said, "It's happened to me before. I and another person independently discovered the same thing."

He mentioned Bargmann by name, and he said, "We decided to join forces and write up the paper together, and it was very good. The joint paper was better than what we would have done separately." I think he said something like that. "Why don't you do the same? You call Clauser and suggest that." Well, I called Clauser, and at first, he didn't like the idea. He felt he had gotten there first with his note in the bulletin. But, we had our secret weapon after my meeting with Papaliolios, when he told me, "We have apparatus like the Kocher-Commins apparatus here," Papaliolios arranged a meeting between me, and I think Mike also. I think so, but I don't know.

Pipkin was also there, and his student Holt, and I think Nussbaum was there. Nussbaum was nearly done with his thesis, and that's why he dropped out of the picture, because he wasn't going to delay getting his degree another year or two to do another experiment. I think he had already accepted a job at Bell Labs, and he wanted to be done. He thought, "Well, maybe when I get to Bell Labs, I'll do the experiment there," but he wasn't going to delay getting his doctorate in order to do the experiment. But Holt was just beginning.

Bromberg:

According to this letter that you sent, you say Nussbaum was now at Bell Labs and told Papaliolios that he has equipment to do the experiment, and is eager to do it.

Shimony:

It's more than that. Pipkin didn't understand what it was all about. Pipkin was a very good physicist, but he just didn't catch on as fast as his student did. Holt caught on immediately, and Holt was explaining things to Pipkin. Finally it got through to Pipkin that this is an interesting experiment, and he would allow Holt to do it. In fact, I think Holt did nothing else for his doctoral thesis than the test of Bell's inequality. Well, in the course of it, he found out some information about the lifetime of the intermediate state of mercury, which he was using, not calcium. But Holt was the one who saw it. So anyway, when I called Clauser, I don't think I talked to Clauser until after this meeting.

It may be that I didn't even know about the Bulletin at the time. I think I didn't know about the bulletin at the time that I met with Pipkin, Holt, and Papaliolios. I think not, because that certainly would have complicated our discussion. All I knew was that we had found the Kocher-Commins paper with the help of Joe Snyder, and that we knew exactly what to do. Well, not exactly. I'll tell you one complication, we'll get to it in a little bit. We knew almost exactly what to do. Then I saw the bulletin, and by that time, it was quite clear that Holt was going to do the experiment. Therefore, when I talked to Clauser on the telephone, I was able to tell him that the experiment is underway.

Fortunately for Mike and me, Clauser very much wanted his hands on the first experiment, the first test of Bell's Inequality, because he was absolutely convinced that the experiment was going to come out for the local hidden variable theory and against quantum mechanics, and it was going to be an epoch-making experiment, and he wanted to have his hands on it. So he agreed he would cooperate with us if he could do the experiment with Holt. I think Holt was willing. I don't remember what Pipkin said about it, but Holt was willing to do it. So over the telephone, it was tentative.

We agreed to meet at the Washington meeting, and then we went over the arrangements. I certainly was happy that Clauser would work with Holt, and I was very happy that we just joined forces. I thought it would be a better paper, and certainly, it was the civilized way to handle the priority question. As Franzen said, the Bulletin did not constitute a publication, so it wasn't that he had already published before we did. But that's a borderline case because it was something in print, and it did announce what the intention of the experiment was, and that it would be done with correlated photons from a calcium cascade. So it would have been unpleasant, and suppose if we would have gone on and published independently, there might have been bad feelings.

I'm so happy that that didn't happen, that the net result was that we became friends, which is a nice story in the history of science. We really did become friends. If I can succeed in getting Clauser nominated for the Nobel Prize for his work in the design and the performance, I will have proved my friendship to him. I really want that to happen. He wants it to happen, too. I said we almost knew what to do. Now, why almost? The reason is that in calcium, the initial state, that is the state to which the atom was pumped before the cascade began was a total J equals zero state. The final state was J equals zero, so the cascade was zero, one, zero.

Two photons that come out, have total angular momentum zero. Now, total angular momentum is the sum of spin angular momentum, which is polarization for photons, and orbital angular momentum. Now, if the photons came out like this, at 180 degrees to each other, the orbital angular momentum would be zero. But if they came out that way, the only way to make sure that they came out this way, is to have minuscule little detectors here, and minuscule lenses to catch the photons coming in this direction, and the ones going 180 degrees backwards to be caught again with a minuscule little lens. What kind of counting rate are you going to have if you have infinitesimal lenses? It will take forever.

The rate of production of these pairs is rather small to begin with, and if you throw away all but one millionth of them, you're going to have to wait for a long time to get enough data, and then you are not sure that the production rate is going to be uniform. There can be all sorts of secular changes in production rate. The tube can coat up with calcium so that the rate of photons coming out from the tube changes over time. It's a mess. So what you have to do is say, "All right. We'll put in fairly big lenses, and we'll catch lots of photons." Then what will be correlated is the total angular momentum, that is J of this one, and the J of this one, are anti-parallel. So Jz is plus one for this one, it's minus one for that one. Fine. Good enough. But, what is measured with your calcite prism? Not total J, not the orbital angular momentum. It's just polarization.

So what we had to do is integrate out the contribution from the orbital angular momentum, and see what kind of correlation is left in the polarization angular momentum. Is it a strong correlation or is it a weak correlation? Does this opening up of the lenses really spoil badly the correlation of the polarizations? Mike and I didn't really know how to do that calculation, but Dick Holt had done calculations of this sort.

Of course, we were in contact with him because he was doing the experiment. So he didn't do the calculation for us, but he set up the kind of equations one would need, and from then on, we just sort of followed out the rules of the game and integrated as indicated. We were able to get the probabilities of joint polarizations with various arbitrary choices of polarization axes. It turned out that you could have large lenses, lenses with a half opening of about 30 degrees, which is a big lens, it's a nice lens.

You have plenty of photons. It turns out that the depolarization effect of this large opening is negligible. That is, instead of having one correlated with zero— I think for us was one correlated with one/zero, with zero. Almost all the time, it was something like 0.995 of the time. The correlation was still very strong. So the depolarization was not such as to ruin the discrepancy between the hidden variable prediction, which is Bell's inequality, and the quantum mechanical prediction. It's because Holt set us up with that calculation that we made him a co-author in the paper. That is, there would have been a hole in our exposition without that. People don't know it.

Bromberg:

Clauser wasn't able to do that calculation?

Shimony:

I don't know whether he could have or would have. He was very busy. He was trying to finish his doctoral thesis at Columbia at the same time.

Bromberg:

Having to sail to Berkeley. Oh, he wasn't going to Berkeley yet.

Shimony:

No. He hadn't gotten his appointment. I think he was involved in several things. He was finishing up his thesis. He had to do some writing for his thesis. He was still doing some work on the test of Bell's inequality. He was getting a crew for his yacht. He lived on a yacht in the East River, and it was docked there, and he would drive to Columbia every day to work. But he intended to celebrate finishing his doctorate by taking his yacht down the Atlantic Coast. When he got the job at Berkeley, he decided he would take it to Galveston, Texas, and then ship it overland to California, to San Francisco. I'm not sure he's allowed to take a yacht through the Panama Canal, probably not. And he surely was not going to go through Cape Horn! So he had a complicated life at that time. Whether he could have done the calculation, or knew how to set it up or not, it never arose.

Mike and I were in contact with Holt, Holt showed us how to do it, Mike really worked out the details, and so then it was done. Later on, the next year, I did the whole problem in another way so one didn't have to do it by the approximate method that Holt set up. I was very happy. The exact computation that I did is in the paper— You have my bibliography. Anyway, it's the one in d'Espagnat's volume, Foundations of Quantum Mechanics, the Varenna volume. That's it. That must have been 1971. "Experimental test of local hidden-variable theories." That was in the Varenna volume^[3].

There, I did it exactly, and it agrees exactly with the approximation. So it's one of these lucky cases where the first order in approximation was the true answer. Anyway, but by that time, we had published the four-man paper, and the first person who showed us how to handle this extracting from total angular momentum correlation the polarization correlation, which was the experimentally easy quantity to determine, was Holt. That's how he joined us in the publication. I was very pleased. I remember asking Mike, "Would you object to having a fourth man in the collaboration?" and Mike, being one of the best-natured men in the world, said, "I have no objection."

I think we did. Yes, we did send one to Bohm. That may be all, but there may be one or two more. Now, we didn't know everybody who was interested in it, but these were the people we knew. I said, "If we are going to write to de Broglie, we must have a cover letter, and it must be in French." So I volunteered to write the letter, and I had one of my friends on the Wellesley faculty go over and correct the grammar, and it was meticulous. It was perfect. I got back a letter from de Broglie, in such beautiful old script, it looked like the Declaration of Independence. I answered that.

Yes. We are friends. And he invited me to France— This was the Summer of 1970. He invited me to spend a year at Orsay, at the Université de Paris. I spent a year there. I mainly worked on the measurement problem when I was there. I met both Jauch and Piron at the Varenna meeting, and I got interested in their approach to quantum mechanics via quantum logic. I did some work, probably as a result of studying their papers and Piron's book. So that was influential. Then it's hard to say. There were questions that came up, that I must have continued to think about. That is, particularly, how do you reconcile quantum mechanics with relativity theory. I was interested in the question of whether one can send messages via quantum nonlocality.

Furthermore, you may say, "All right, local hidden variable theories are out. They won't work. Then there are nonlocal ones." You can say, quantum mechanics itself is a nonlocal hidden variable theory, especially if you say, "We're not asking for a hidden variable theory which assigns definite values to every observable. All it need do is assign probabilities, and then the experimental probabilities are really an integral of these probabilities using the distribution over the hidden variables." So in that sense, quantum mechanics itself is a hidden variable theory, but definitely a nonlocal hidden variable theory. You're weakening or broadening the idea of a hidden variable theory.

In fact, I'll say a little bit more than that. Bell was such a gracious man. Here's what happened. When we first met at Varenna, he said, "Have you looked into the question of whether you can derive the hidden variable inequality." He didn't call it Bell's inequality, but that's what we do call it. "Derive it without the assumption that the hidden variables assign sharp values to each one, but instead assign only probability distributions." I said, "Well, I've thought about that problem, and I would like to solve it, and I looked at it a bit, but I haven't solved it." He said, "Well, if you had solved it, I would have let you present it. I want you to have the credit for it, but since you didn't, I will present what I have." If you'll look in the proceedings of the Varenna volume, you'll see that his presentation of Bell's theorem is much more general than his presentation in the 1964 paper. There are two changes.

One is he doesn't just use three directions as in the 1964 paper. He uses two directions, or two values of the parameter for Particle One, and two values of the parameter for Particle Two. That means in all four combinations, A with B, A with B1, B with A1, A1 with B1. So it's much neater when you do it that way, and it's much easier to make a comparison with experiment when you do it that way. That was an advantage of the variant of the inequality that Clauser, Horne, Holt and I derived, by working with four directions instead of three. But the main improvement was that the proof that Bell gave in 1970 at the meeting was what we've come to call a stochastic hidden variable theory.

The hidden variables only assign probability distributions to the various observables. That makes it more powerful. That means even if you banned the determinism, you obtain a conflict between locality and quantum mechanics. I remember there was a time when the literature got filled with rather slovenly remarks. There was a time in which I saw often that one must choose between determinism and locality.
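The four-setting inequality and its stochastic generalization can be illustrated numerically. The following is a minimal sketch and not a calculation from the interview: it assumes the ideal polarization correlation E(a, b) = cos 2(a − b) for an entangled photon pair (no lens-aperture or detector-efficiency corrections), and the function names and the analyzer angles 0°, 45°, 22.5°, 67.5° are illustrative choices. It compares the quantum value of the four-setting combination with the bound of 2 obeyed by deterministic local assignments and by local models that assign only mean values between −1 and +1.

```python
import itertools
import math
import random


def quantum_correlation(angle_a_deg, angle_b_deg):
    """Ideal polarization correlation cos 2(a - b) (assumed, idealized form)."""
    return math.cos(2.0 * math.radians(angle_a_deg - angle_b_deg))


def chsh(e_ab, e_ab1, e_a1b, e_a1b1):
    """Four-setting combination E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab1 + e_a1b + e_a1b1


def quantum_chsh(a=0.0, a1=45.0, b=22.5, b1=67.5):
    """Quantum value of the combination for the chosen analyzer angles."""
    return chsh(
        quantum_correlation(a, b),
        quantum_correlation(a, b1),
        quantum_correlation(a1, b),
        quantum_correlation(a1, b1),
    )


def max_deterministic_chsh():
    """Largest |S| when each setting has a sharp local outcome of +1 or -1."""
    best = 0
    for A, A1, B, B1 in itertools.product((+1, -1), repeat=4):
        best = max(best, abs(chsh(A * B, A * B1, A1 * B, A1 * B1)))
    return best


def max_stochastic_chsh(samples=10000, seed=0):
    """Largest |S| found when hidden variables fix only local mean values.

    For each sampled hidden state, the means lie in [-1, 1] and the joint
    expectation factorizes as a product of the two local means.
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(samples):
        A, A1, B, B1 = (rng.uniform(-1.0, 1.0) for _ in range(4))
        best = max(best, abs(chsh(A * B, A * B1, A1 * B, A1 * B1)))
    return best


if __name__ == "__main__":
    print("quantum S:", quantum_chsh())
    print("deterministic local bound:", max_deterministic_chsh())
    print("largest sampled stochastic local |S|:", max_stochastic_chsh())
```

Under these assumptions the quantum value is 2√2, while both kinds of local model stay at or below 2, which is the sense in which dropping determinism does not remove the conflict.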

Here it is. "An Analysis of the Proposal of Garuccio and Selleri for Super-Luminal Signaling^[5]." That was it. And what I did was to show that— Now I remember how it goes. I remembered that there was a flaw in their signaling method. That is they assumed that you knew a total L² or a total J², and my argument was that you would not be able to determine the total J² unless you had data from both sides. It wouldn't— There is no experiment that you could do looking just at the left-hand side to enable you to infer what was done on the right-hand side.

Therefore, the super luminal signaling doesn't take place. Now, that was in a very special case. I only did it for angular momentum. Later on, three different people independently showed what I think is a very important and not difficult result. That is that you do this analysis of Bell locality into a conjunction of two different kinds of locality. One is parameter independence, and one is outcome independence. Then you look and you see that if parameter independence were violated, which is just what I was talking about now, that is the average over here depended on which choice of the parameter you made on the right-hand side, even though you average over all possible results, then you would be able to signal superluminally.

If the result over here depended on the outcome over here, that is it depended on whether a photon passed through a piece of Polaroid, or didn't pass through, there would be no way of using that dependence for signaling. Why? Because you don't control. You can't send an SOS or any Morse Code by making use of processes which are stochastic to begin with. If it's a matter of chance whether the photon passes through the Polaroid or not, you can't influence that in order to send a message over here.

Therefore, even though quantum mechanics is nonlocal in the sense that the outcome over here depends on an outcome over here, that is the violation of locality that is innocuous because it doesn't enable you to communicate superluminally. I said there were three people independently that proved that quantum mechanics does not permit the breakdown of parameter independence.
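The distinction between the two kinds of locality can be checked on a simple example. This is a hedged sketch, not taken from the interview: it assumes the ideal joint probabilities for a polarization-entangled photon pair, ½ cos²(a − b) for matching outcomes and ½ sin²(a − b) for opposite outcomes, treats the quantum state as the complete description, and uses illustrative function names. The left-hand marginal does not change with the right-hand analyzer angle, so there is nothing to signal with, while the left-hand probability conditioned on the right-hand outcome does change.

```python
import math


def joint_probability(x, y, angle_a_deg, angle_b_deg):
    """Assumed ideal joint probability of outcomes x, y in {+1, -1}.

    +1 means the photon passes its polarizer, -1 means it does not.
    """
    delta = math.radians(angle_a_deg - angle_b_deg)
    if x == y:
        return 0.5 * math.cos(delta) ** 2
    return 0.5 * math.sin(delta) ** 2


def left_marginal(x, angle_a_deg, angle_b_deg):
    """Probability of the left outcome, averaged over right-hand outcomes."""
    return sum(joint_probability(x, y, angle_a_deg, angle_b_deg) for y in (+1, -1))


def left_conditional(x, y, angle_a_deg, angle_b_deg):
    """Probability of the left outcome given the right-hand outcome y."""
    right_marginal = sum(
        joint_probability(xp, y, angle_a_deg, angle_b_deg) for xp in (+1, -1)
    )
    return joint_probability(x, y, angle_a_deg, angle_b_deg) / right_marginal


if __name__ == "__main__":
    # Parameter independence: the left marginal is 1/2 for every right angle.
    for angle_b in (0.0, 22.5, 45.0, 90.0):
        print("right angle", angle_b, "left marginal", left_marginal(+1, 0.0, angle_b))
    # Outcome dependence: conditioning on the right-hand result changes it.
    print("given right passes:", left_conditional(+1, +1, 0.0, 0.0))
    print("given right blocked:", left_conditional(+1, -1, 0.0, 0.0))
```

Because the right-hand outcome is a matter of chance and cannot be chosen, the dependence shown in the last two lines cannot carry a message.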

It was Khrushchev's term. Khrushchev made an amicable speech at the United Nations, I think, saying that yes, there is tension between the capitalist system and the Soviet system, but they can live together; they need not have a war. That's what he called peaceful coexistence. So I just adapted that phrase for the relation between quantum mechanics and relativity theory.

^[1]See "The Foundations of Quantum Mechanics, A conference report by F.G. Werner", Physics Today 17 No.1 (January 1964), 53–60.

^[2]Michael A. Horne to John F. Clauser, April 18, 1969 and Abner Shimony to John F. Clauser, April 20, 1969, from the papers of J.F. Clauser, to be deposited at the Bancroft Library, U.C. Berkeley.

^[3]B. d'Espagnat, editor, Proceedings of the International School of Physics "Enrico Fermi", Course IL, Foundations of Quantum Mechanics (Academic Press, 1971).

^[4]Epistemological Letters of the Institut de la Méthode.

^[5]This conversation refers to the "Bibliography of Abner Shimony", pp. 247–253, of Robert S. Cohen, Michael Horne and John Stachel, eds., Experimental Metaphysics & Quantum Mechanical Studies for Abner Shimony Vol.1 (Kluwer Academic Publishers, 1997). The paper Shimony mentions is item 56.


Topics discussed in this interview

Subjects:

Bell's theorem, Lasers, Philosophy, Quantum theory

Additional Persons:

Peirce, Charles S. (Charles Sanders), 1839-1914, Shimony, Abner, Weiss, Paul Storch, Whitehead, Alfred North, 1861-1947, Wigner, Eugene Paul, 1902-1995