Axiomatization of quantum mechanics and physics in general?

Alien8 · Sep 21, 2014

billschnieder said:

In the proof above, the RHS is simply an expansion of the LHS. The CHSH is not just ##|S|##, it is the inequality ##|S| \le 2##. In the proof they are trying to calculate the upper bound for ##|S|##. In the QM calculation we are simply calculating the value of ##|S|##. In experiments they are simply measuring ##|S|##. Once you have S from all those places, you can then compare the value you get with the upper bound to see if there is agreement or not.

Exactly.

What I'm showing above is that the proof which culminates in ##|S| \le 2## uses the assumption that the terms are calculated from the same realization.

Yes. And the assumption you are talking about, what makes them belong to the same "realization", is the triangle inequality. Is there any particular reason you're hesitant to consider this? The CHSH derivation on the Bell's theorem Wikipedia page starts with what only comes at the end of the actual derivation:

Factorization is of little consequence if your question is what those four terms are doing together in the first place. Look at the main article for the CHSH inequality and note that there is no 'less than or equal' symbol until the triangle inequality is applied. There would be no inequality at all without the triangle inequality, so if you can explain how and why it applies to the CHSH experimental setup, you will have answered the question of how and why those four terms are supposed to be part of the same system. Can you explain?
http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

wle · Sep 22, 2014

Alien8 said:

What is "optimal measurement" you are referring to?

The (qubit) measurements that result in the maximal quantum violation of the CHSH inequality. In a suitable basis you can take these to be $$\begin{eqnarray}
A &=& \frac{1}{\sqrt{2}} \bigl( \sigma_{z} + \sigma_{x} \bigr) \,, \\
A' &=& \frac{1}{\sqrt{2}} \bigl( \sigma_{z} - \sigma_{x} \bigr)
\end{eqnarray}$$ and $$\begin{eqnarray}
B &=& \sigma_{z} \,, \\
B' &=& \sigma_{x} \,.
\end{eqnarray}$$ For the CHSH Bell operator, this works out to $$\begin{eqnarray}
\mathcal{S} &=& A \otimes B + A \otimes B' + A' \otimes B - A' \otimes B' \\
&=& (A + A') \otimes B + (A - A') \otimes B' \\
&=& \sqrt{2} \, \sigma_{z} \otimes \sigma_{z} + \sqrt{2} \, \sigma_{x} \otimes \sigma_{x} \,.
\end{eqnarray}$$
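
The algebra above can be checked numerically. The following is a minimal sketch in plain Python (the helper names `kron` and `lincomb` are made up for illustration). It builds the Bell operator from the four measurements, compares it with the simplified form, and evaluates it on the state ##(|00\rangle + |11\rangle)/\sqrt{2}##. That choice of state is an assumption on my part, since the post does not name one.

```python
import math

def kron(p, q):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(p), len(q)
    return [[p[i // m][j // m] * q[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def lincomb(terms):
    """Sum of coef * matrix over a list of (coef, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

SZ = [[1, 0], [0, -1]]   # sigma_z
SX = [[0, 1], [1, 0]]    # sigma_x
r = 1 / math.sqrt(2)

A = lincomb([(r, SZ), (r, SX)])     # A  = (sigma_z + sigma_x) / sqrt(2)
A2 = lincomb([(r, SZ), (-r, SX)])   # A' = (sigma_z - sigma_x) / sqrt(2)
B, B2 = SZ, SX                      # B = sigma_z, B' = sigma_x

# Bell operator written out term by term, and its simplified form.
S_full = lincomb([(1, kron(A, B)), (1, kron(A, B2)),
                  (1, kron(A2, B)), (-1, kron(A2, B2))])
S_short = lincomb([(math.sqrt(2), kron(SZ, SZ)),
                   (math.sqrt(2), kron(SX, SX))])

# Assumed state (not named in the post): (|00> + |11>) / sqrt(2).
phi_plus = [r, 0, 0, r]
S_phi = [sum(S_full[i][j] * phi_plus[j] for j in range(4)) for i in range(4)]
expectation = sum(phi_plus[i] * S_phi[i] for i in range(4))

print(expectation, 2 * math.sqrt(2))
```

Under that assumption the expectation value comes out at ##2\sqrt{2}##, the maximal quantum value.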

kith · Sep 22, 2014

billschnieder said:

##AB+AB'+A'B-A'B'= A(B+B')+A'(B-B') ##
That factorization can not be done for different realizations of the same ensemble.

How would you show this mathematically? The question here clearly is how to assign the correct mathematical symbols and their dependencies in the beginning. Terms like "realization" or "ensemble" are only relevant for this initial assignment. Afterwards, we are talking about numbers, functions and their algebra, and your notions get insignificant for the correctness of a statement.

If there's no disagreement regarding the initial assignment of symbols and you agree with the first line but disagree with the quoted line, this can only mean that the shorthand notation doesn't capture the subtlety you are after. So please show where you think that things go wrong mathematically by using the full notation.

Abc2020ro · Sep 22, 2014

You cannot do that. Mathematics only deals with quantity. By axiomatizing the natural world, you are failing to take into account the qualitative part of it. And one huge example is consciousness. Not to mention related phenomena, such as free will.

billschnieder · Sep 22, 2014

Alien8 said:

Can you name the principle of mathematics you are referring to and what is the definition of "realization" and "ensemble"?

I'm using those words because after a lengthy discussion with atyy (a few pages back), he used them to describe what I meant. I would not normally use those words to describe it. To me, if you assign each individual photon pair a unique identifier, say ##i##, then when I say a realization of the experiment, I mean that you have one set, say ##p##, of N particle pairs ##i = 1..N##. If I now have a different realization, I mean you now have a completely different set, say ##q##, of M particle pairs ##i=N+1..N+M##, etc. None of the ##i##'s in ##p## exist in ##q##, even though the system producing the particle pairs may be generating them such that the probability distributions of hidden variables in ##p## and in ##q## are the same.

An inequality derived entirely within ##p## is not the same thing as an inequality derived from one part of ##p## and a different part of ##q##, etc. Just as the ##AB = -1## condition when angles are the same does not apply to particles from two separate pairs.

billschnieder · Sep 22, 2014

Alien8 said:

They are not independent sets. It's not (a-b), (c-d), (e-f), (g-h), it's (a-c), (a-d), (b-c), and (b-d). Whether or not they can independently attain -1 or +1 depends on expectation value function E(x,y) = ?. For E(x,y) = cos2(y-x), E(a-c), E(a-d), E(b-c) and E(b-d) can not independently attain -1 or +1, so instead of 4 the boundary for E(x,y) = cos2(y-x) is 2.83.

Well, let x = (a-c), y = (a-d), z = (b-c), and then (b-d) = y+z-x. So you are right, it is not completely independent, and that will affect the upper bound, so you would have an expression like E(x) - E(y) + E(z) + E(y+z-x). However, you can still evaluate this expression in two ways. You could calculate E(x) from one set of particles, E(y) from a different set of particles, E(z) from yet a different set of particles and E(y+z-x) from yet another set, with no particle pair in any set belonging to any other set. In this sense, the sets are independent, even though the results are not entirely independent owing to the E(y+z-x) term. However, you could also take a single set of particles and evaluate all 4 expressions on the exact same set, every particle pair contributing to every term. This is the sense in which I'm referring to "dependence" and "independence". There is "more independence", so to speak, for 4 separate sets compared to the same set.

Each should have a different inequality. Besides, isn't the realism assumption that the same set of particles has all those properties simultaneously? It won't be a realism assumption if we say different sets each have one property simultaneously.

You are also right that it may be easier to make the point starting from the beginning of the full CHSH derivation.
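
The 2.83 figure quoted above can be reproduced with a coarse grid search. This is only a sketch: it assumes the correlation function E(x,y) = cos 2(y-x) from the quoted post, fixes a = 0 because only angle differences enter, and uses hypothetical function names `E` and `chsh`.

```python
import math

def E(x, y):
    """Assumed correlation function from the quoted post: cos 2(y - x)."""
    return math.cos(2 * (y - x))

def chsh(a, b, c, d):
    """E(a,c) - E(a,d) + E(b,c) + E(b,d) for settings a, b (one side), c, d (other side)."""
    return E(a, c) - E(a, d) + E(b, c) + E(b, d)

# Grid of angles in steps of pi/16 over [0, pi); a is fixed at 0.
grid = [k * math.pi / 16 for k in range(16)]
best = max(chsh(0.0, b, c, d) for b in grid for c in grid for d in grid)

print(best, 2 * math.sqrt(2))
```

On this grid the largest value is ##2\sqrt{2} \approx 2.83##, reached for example at a = 0, b = π/4, c = π/8, d = 3π/8.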

DrChinese · Sep 22, 2014

billschnieder said:

...Besides, isn't the realism assumption that the same set of particles have all those properties simultaneously?

Yes, this is the EPR realism assumption: the properties do not need to be simultaneously predictable as long as each one could be predicted with certainty individually. (Of course, I would also say that it would also be CONSISTENT with "a realism assumption if we say different sets each have one property simultaneously.")

bhobba · Sep 22, 2014

Abc2020ro said:

You cannot do that. Mathematics only deals with quantity. By axiomatizing the natural world, you are failing to take into account the qualitative part of it. And one huge example is consciousness. Not to mention related phenomena, such as free will.

Consciousness or free will has nothing to do with QM in nearly every interpretation - garbled half truths from some popularisations notwithstanding.

QM is perfectly axiomatisable - and with a breathtaking elegance in the Geometrical approach - although mathematically very non-trivial - translation - it's hard.

Thanks
Bill

Abc2020ro · Sep 22, 2014

bhobba said:

Consciousness or free will has nothing to do with QM in nearly every interpretation - garbled half truths from some popularisations notwithstanding.

QM is perfectly axiomatisable - and with a breathtaking elegance in the Geometrical approach - although mathematically very non-trivial - translation - it's hard.

Thanks
Bill

It cannot be, since it is not the final theory. LOL

bhobba · Sep 22, 2014

Abc2020ro said:

It cannot be, since it is not the final theory. LOL

What has that got to do with anything? Classical mechanics is not the final theory, yet it's perfectly axiomatisable.

But leaving that aside - how do you know the final theory will not be a quantum theory? The current most likely candidate is string theory, and it's a quantum theory.

Thanks
Bill

atyy · Sep 22, 2014

billschnieder said:

OK, according to the derivation (http://en.wikipedia.org/wiki/Bell's_theorem#Derivation_of_CHSH_inequality)
##A=A(a, \lambda), A'=A(a', \lambda), B=B(b, \lambda), B'=B(b', \lambda)##
##\begin{align}
\rho(a,b) + \rho(a,b') + \rho(a',b) - \rho(a',b')&= \int_\Lambda AB\rho +\int_\Lambda AB'\rho +\int_\Lambda A'B\rho -\int_\Lambda A'B'\rho \\
&= \int_\Lambda (AB+AB'+A'B-A'B')\rho\\
&= \int_\Lambda (A(B+B') + A'(B-B')) \rho\\
&\leq 2
\end{align}##
The heart of the derivation is the 4th line above:

##AB+AB'+A'B-A'B'= A(B+B')+A'(B-B') \le 2.##
That factorization can not be done for different realizations of the same ensemble

Alternatively, for a single set ##q## of N particle pairs, with N sufficiently large,
##S_q = \frac{1}{N} \sum_{i=1}^{N} A(a,\lambda_i)B(b,\lambda_i) - \frac{1}{N} \sum_{i=1}^{N} A(a,\lambda_i)B(b',\lambda_i) + \frac{1}{N} \sum_{i=1}^{N} A(a',\lambda_i)B(b,\lambda_i) + \frac{1}{N} \sum_{i=1}^{N} A(a',\lambda_i)B(b',\lambda_i)##
which I can easily factorize as
##S_q = \frac{1}{N} \sum_{i=1}^{N} (A(a,\lambda_i)[B(b,\lambda_i) - B(b',\lambda_i)] + A(a',\lambda_i)[B(b,\lambda_i) + B(b',\lambda_i)])##
##A, B## can only take values ##\pm 1##, therefore whenever ##B(b,\lambda_i) - B(b',\lambda_i)## is ##\pm 2##, ##B(b,\lambda_i) + B(b',\lambda_i)## must be 0, and vice versa. The possible values of each summand are therefore -2 and +2. Therefore ##|S_q| \le 2##

For 4 different sets ##r,s,t,u## of M, N, O, P particle pairs respectively, you instead have:
##S_{rstu} = \frac{1}{M} \sum_{i=1}^{M} A(a,\lambda_i)B(b,\lambda_i) - \frac{1}{N} \sum_{j=1}^{N} A(a,\lambda_j)B(b',\lambda_j) + \frac{1}{O} \sum_{k=1}^{O} A(a',\lambda_k)B(b,\lambda_k) + \frac{1}{P} \sum_{l=1}^{P} A(a',\lambda_l)B(b',\lambda_l)##

which we can't factorize any further. In addition, each of the terms can independently attain the extrema of [-1, +1]. Therefore ##|S_{rstu}| \le 4##.

OK, let's see if I can try to clarify this in a different way. Here we are not talking about quantum mechanics, just classical probability. So we can just talk about flipping a coin. Let's consider a coin with heads (##H=1##) or tails (##H=0##), and let the same coins also have each side coloured either red (##R=1##) or blue (##R=0##). Also, let heads always be red, and tails always be blue.

Let
##P(H=1)=0.5, P(H=0)=0.5##
##P(R=1)=0.5, P(R=0)=0.5##.

Then we define
##E(H)= \sum_{H}HP(H) = (1 \times 0.5) + (0 \times 0.5) = 0.5##
##E(R)= \sum_{R}RP(R)= (1 \times 0.5) + (0 \times 0.5) = 0.5##
##Y = E(H) - E(R) = 0##

How do we get ##E(H)## and ##E(R)## in experiments? We assume we have a large number of coins ##N_{T}##. To get an experimental estimate of ##E(H)## we randomly draw a large subset of ##M## coins, toss each one, measure whether it lands heads or tails, and form the sum ##\hat{E}(H) = \frac{1}{M} \sum_{i=1}^{M} H(i)##. To get an experimental estimate of ##E(R)## we randomly draw a different large subset of ##N## coins, toss each one, measure whether it lands red or blue and form the sum ##\hat{E}(R) = \frac{1}{N} \sum_{j=1}^{N} R(j)##.

Because the number of trials for each measurement is finite, say ##M = 100, N=99##, I could get the result ##\hat{E}(H)=0.44, \hat{E}(R)=0.49, \hat{Y}=-0.05##, which is different from the predicted value of ##Y=0##.

To get closer to the predicted value, what I need to do is increase the number of trials say ##M = 100000, N=99999##. I could get the result ##\hat{E}(H)=0.5005, \hat{E}(R)=0.5003, \hat{Y}=0.0002##, which is different from the predicted value of ##Y=0##, but much closer.

It is true that in the classical case, we can imagine measuring heads and colour at the same time, but there is no need to. If we were to measure heads and colour at the same time, we would for this example in fact get ##\hat{Y}=0##, the exact predicted Y value even for a finite number of trials. However there is no need to measure heads and colour on the same subset, since by increasing the number of trials, we can get closer and closer to the predicted Y value.

One could object that with the measurement on the same subset, we always get for this example exactly the predicted value, whereas by measuring on different subsets we don't get exactly the predicted value. By analogy, could one say that the Bell tests are consistent with local reality, but because we measured on different subsets, and because we have been very unlucky, what we consider a large number of trials simply isn't large enough? Yes. In fact, the general issue is the number of trials, not whether they are measured on the same or different subsets. The criterion one chooses to accept or reject a hypothesis is arbitrary. In some of these Bell tests, the deviation from any local deterministic theory is more than 20 standard deviations. But because the cut-off criterion is subjective, one could reject even 20 standard deviations as insufficient.
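
The coin example can be simulated in a few lines. This is a rough sketch, not a model of any real experiment: the function name `simulate`, the fixed seed and the population size are all assumptions, and each coin's toss result is drawn once up front, which is statistically the same as tossing it when it is measured.

```python
import random

def simulate(total_coins, m, n, seed=0):
    """Estimate Y = E(H) - E(R) from disjoint subsets and from the same subset."""
    rng = random.Random(seed)
    # One toss per coin: heads (H=1) is always red (R=1), tails (H=0) always blue (R=0).
    tosses = [rng.randint(0, 1) for _ in range(total_coins)]
    indices = list(range(total_coins))
    rng.shuffle(indices)
    heads_subset = indices[:m]
    colour_subset = indices[m:m + n]          # disjoint from heads_subset
    e_h = sum(tosses[i] for i in heads_subset) / m
    e_r = sum(tosses[i] for i in colour_subset) / n   # R equals H for every coin
    y_disjoint = e_h - e_r
    # Measuring colour on the same coins as heads gives identical sums.
    y_same = e_h - sum(tosses[i] for i in heads_subset) / m
    return y_disjoint, y_same

y_small, _ = simulate(1_000, 100, 99)
y_large, y_same = simulate(300_000, 100_000, 99_999)
print(y_small, y_large, y_same)
```

The same-subset estimate is exactly 0, while the disjoint-subset estimate only approaches 0 as M and N grow, which is the point being made above.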

Alien8 · Sep 22, 2014

billschnieder said:

Well, let x = (a-c), y = (a-d), z = (b-c), and then (b-d) = y+z-x. So you are right, it is not completely independent, and that will affect the upper bound, so you would have an expression like E(x) - E(y) + E(z) + E(y+z-x). However, you can still evaluate this expression in two ways. You could calculate E(x) from one set of particles, E(y) from a different set of particles, E(z) from yet a different set of particles and E(y+z-x) from yet another set, with no particle pair in any set belonging to any other set. In this sense, the sets are independent, even though the results are not entirely independent owing to the E(y+z-x) term.

Yes. If we independently measure two distances X and Y, and the only thing they have in common is a maximum length of 1, then all we can say is X+Y <= 2. But if they are two sides of the same triangle XYZ, then we can also say Z <= X+Y.

Basically you are asking what E1(a,b), E2(a,b'), E3(a',b) and E4(a',b') have in common besides the {-1,+1} limit. If each E were independent and that limit were the only common rule they had to follow, then the bound for E1−E2+E3+E4 would be 4, so there must be something else, some other common rule they obey or system they belong to.

They share the same E(x,y) function, and the angles are related by |a-b| - |a'-b| = |a-b'| - |a'-b'|. But that's only a relation between input parameters; it doesn't explain which common system those input variables are supposed to belong to, or in other words, why the choice is (a-b), (a-b'), (a'-b) and (a'-b') instead of (a-a'), (a-b'), (b-b') and (b-a'), for example.

So what is it? The only other common rule or system I see applied in the derivation is the triangle inequality, so for it to apply to the CHSH setup those angles have to somehow correspond to some triangles. I don't know how this relation between angles, triangles and probabilities works, but besides the {-1,+1} limit that's the only thing they have in common, so that must be where the answer to all our questions is.

You are also right that it may be easier to make the point starting from the beginning of the full CHSH derivation.

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

where A and B are the average values of the outcomes. Since the possible values of A and B are −1, 0 and +1, it follows that:

Then, if a, a′, b and b′ are alternative settings for the detectors,

At the beginning there is only a and b, then there is suddenly b' in the first line of step (6), and then in the second line a' materializes out of thin air as well. I think the question begins with step (6): by what logic, physics, or mathematical principle is it justified?

atyy · Sep 23, 2014

Here is another proof of CHSH by Richard Gill http://arxiv.org/abs/1207.5103 (see section 2). In this case, it is made clear that the 4 terms correspond to *disjoint* subsets of size N/4 each, with a total size of N.

For finite N, there is some probability that a local deterministic theory will violate the Bell inequalities. Taking this into account, the Bell inequalities are not hard bounds for a local deterministic theory, but rather only something that a local deterministic theory is likely to satisfy with a probability given by Eq (3). As N approaches infinity, the traditional Bell inequality as a hard bound is recovered as given in Eq (4) .

Alien8 · Sep 23, 2014

atyy said:

Here is another proof of CHSH by Richard Gill http://arxiv.org/abs/1207.5103 (see section 2). In this case, it is made clear that the 4 terms correspond to *disjoint* subsets of size N/4 each, with a total size of N.

1.) AB + AB' + A'B - A'B' <= 2

2.) E(a, b) − E(a, b′) + E(a′, b) + E(a′, b′) <= 2.

They are very different inequalities: one deals with binary outcomes -1 or +1, the other with expectation values ranging continuously from -1.0 to +1.0. The paper says the first one is the CHSH inequality, but Wikipedia says it's the second one, which is what makes sense. The binary-outcome inequality, the first one, is general and completely undefined relative to experimental settings; it cannot be violated by anything as long as 1+1+1-1 = 2, so I see no reason to even mention it.

I believe this is how the proper derivation goes, as the main CHSH Wikipedia article says:

where A and B are the average values of the outcomes. Since the possible values of A and B are −1, 0 and +1, it follows that:

Then, if a, a′, b and b′ are alternative settings for the detectors,

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

Step (6) basically starts with E1 - E2 = E1 - E2, which is a completely pointless observation, but then it goes on to conclude something like E1 - E2 = E3 - E4, out of the blue. How do you start with E1 in step (4) and then end up with E1, E2, E3 and E4 all together in step (6)? Being alternative settings for the detectors doesn't imply or explain anything. So what physics, logic, or mathematics can justify placing those four expectation values together in such a relationship?

Fredrik · Sep 23, 2014

Abc2020ro said:

It cannot be, since it is not the final theory. LOL

Perhaps this comment is influenced by the popular belief that axioms are "self-evidently true" or "obviously true" statements. The modern view of axioms is very different from this. Axioms are not obvious truths, or even objective truths. A list of axioms simply defines a branch of mathematics. That's all. The axioms are true in that branch, because the branch is by definition the part of mathematics where the axioms are true. Every axiom is false in some other branch of mathematics.

However, a theory of physics isn't defined by axioms in this sense. It's defined by a set of assumptions that I used to call "axioms" until a few years ago. A. Neumaier had a strong negative reaction to how I used that word in a discussion here. I decided that he was right about that. There's no reason to call them "axioms". So I call them "correspondence rules" now. I think almost everyone is OK with that term. The purpose of a set of correspondence rules is to tell us how to interpret some piece of mathematics as predictions about results of experiments.

We can certainly define a branch of mathematics using axioms, and a theory of physics using correspondence rules, without having any idea what the final theory might be, or if there even is one.

atyy · Sep 23, 2014

Alien8 said:

They are very different inequalities: one deals with binary outcomes -1 or +1, the other with expectation values ranging continuously from -1.0 to +1.0. The paper says the first one is the CHSH inequality, but Wikipedia says it's the second one, which is what makes sense. The binary-outcome inequality, the first one, is general and completely undefined relative to experimental settings; it cannot be violated by anything as long as 1+1+1-1 = 2, so I see no reason to even mention it.

Yes. In the derivation I linked to, it is assumed that given the hidden variable and measurement setting, there is no variability, ie. the outcome is either +1 or -1 with certainty. This is the assumption of "local determinism". However, one can certainly imagine that given the hidden variable and measurement setting, there is variability, ie. the outcome is sometimes +1 and sometimes -1 with probability ##p(A,B|a,b,\lambda) = p(A|a,\lambda)p(B|b,\lambda)##. However, it turns out that this second, and more general case of "local random or deterministic variables" can be rewritten as a "local deterministic" model by introducing additional hidden variables. For this reason, the two different proofs of CHSH are equivalent. You can find a description of the equivalence in http://arxiv.org/abs/1303.3081 (Proposition 2.1 in section 2.2.2).

wle · Sep 23, 2014

atyy said:

Here is another proof of CHSH by Richard Gill http://arxiv.org/abs/1207.5103 (see section 2). In this case, it is made clear that the 4 terms correspond to *disjoint* subsets of size N/4 each, with a total size of N.

For finite N, there is some probability that a local deterministic theory will violate the Bell inequalities. Taking this into account, the Bell inequalities are not hard bounds for a local deterministic theory, but rather only something that a local deterministic theory is likely to satisfy with a probability given by Eq (3). As N approaches infinity, the traditional Bell inequality as a hard bound is recovered as given in Eq (4) .

There are really two related but different theorems here. Bell's theorem (at least, originally) is a mathematical demonstration that the predictions of local theories and quantum physics are different. The objects being compared are the sets of joint probabilities predicted by quantum physics (the "quantum set" ##\mathcal{Q}##) and by local causal theories (the "local set" or "local polytope" ##\mathcal{L}##). These can be defined by $$\boldsymbol{P} \in \mathcal{Q} \Leftrightarrow P(ab \mid xy) = \mathrm{Tr}\bigl[\bigl(M^{(x)}_{a} \otimes N^{(y)}_{b}\bigr) \rho \bigr] \,,$$ in which ##\rho## is a density operator and ##M^{(x)}_{a}## and ##N^{(y)}_{b}## are POVM elements, and $$\boldsymbol{P} \in \mathcal{L} \Leftrightarrow P(ab \mid xy) = \int \mathrm{d}\lambda \rho(\lambda) P(a \mid x; \lambda) P(b \mid y; \lambda) \,.$$ One simple way of showing that these are different sets is by comparing the maximal possible values of the CHSH correlator $$S = \boldsymbol{I} \cdot \boldsymbol{P} = \sum_{abxy} (-1)^{a + b + xy} P(ab \mid xy) \,.$$ The well known result is that $$\max_{\boldsymbol{P} \in \mathcal{L}} \boldsymbol{I} \cdot \boldsymbol{P} = 2 \,,$$ compared with $$\max_{\boldsymbol{P} \in \mathcal{Q}} \boldsymbol{I} \cdot \boldsymbol{P} = 2 \sqrt{2} \,,$$ which is only possible if there are probability distributions in the quantum set ##\mathcal{Q}## that aren't in the local set ##\mathcal{L}##. In this case, the CHSH correlator is defined as a function of a joint probability distribution, so it's for just one realisation (e.g. one entangled particle pair).

In an actual Bell experiment, Bell's definition of locality is being tested against reality. If you want to do this rigorously (though in practice, nobody seems to bother), this means recasting Bell's theorem in the form of a hypothesis test and doing some additional statistical analysis. Part of this is defining what the "experimental Bell correlator" that is going to be measured is, since the mathematical correlator defined for a single realisation isn't a directly measurable quantity. Gill describes one (but by no means the only possible) way of doing that which is close to what's done in most Bell experiments.

Alien8 · Sep 24, 2014

wle said:

Gill describes one (but by no means the only possible) way of doing that which is close to what's done in most Bell experiments.

Gill says for any four numbers A, A', B, B' each equal to either -1 or + 1, then: AB + AB' + A'B - A'B' = -2 or +2

There are no relative angles or detector settings here, no locality or non-locality assumptions, no probabilities, no theories involved in this inequality whatsoever. It's a statement about numbers, like 1 + 1 = 2; it's not relevant to any physics or reality. It's about arbitrary combinations of the four variables each having a value of -1 or +1, where any possible combination plugged into that equation will always yield either -2 or +2. That's all there is to it. QM cannot violate that inequality any more than it can make 1 + 1 = 3. We cannot compare QM and other kinds of predictions with that inequality, so what's the point of it?
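
The purely arithmetic statement is easy to confirm by brute force over all 16 sign assignments. A minimal sketch (the variable names `a2` and `b2` stand for A' and B'):

```python
from itertools import product

# Every assignment of -1 or +1 to A, A', B, B'.
values = {a * b + a * b2 + a2 * b - a2 * b2
          for a, a2, b, b2 in product((-1, 1), repeat=4)}

print(sorted(values))
```

The set of attainable values contains only -2 and +2.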

atyy · Sep 24, 2014

Alien8 said:

Gill says for any four numbers A, A', B, B' each equal to either -1 or + 1, then: AB + AB' + A'B - A'B' = -2 or +2

There are no relative angles or detector settings here, no locality or non-locality assumptions, no probabilities, no theories involved in this inequality whatsoever. It's a statement about numbers, like 1 + 1 = 2; it's not relevant to any physics or reality. It's about arbitrary combinations of the four variables each having a value of -1 or +1, where any possible combination plugged into that equation will always yield either -2 or +2. That's all there is to it. QM cannot violate that inequality any more than it can make 1 + 1 = 3. We cannot compare QM and other kinds of predictions with that inequality, so what's the point of it?

Gill presents two equations he calls "CHSH". The one you are referring to is Eq 2. The one with measurement settings is Eq 4.

billschnieder · Sep 24, 2014

atyy said:

In this case, it is made clear that the 4 terms correspond to *disjoint* subsets of size N/4 each, with a total size of N.

For finite N, there is some probability that a local deterministic theory will violate the Bell inequalities.

I don't understand how you could calculate a probability that any local deterministic theory will violate Bell inequalities, without clearly defining the space of "local deterministic theories". For a given theory (and Gill gives one), yes I can imagine one easily checking the probability that it will violate the inequalities but how do you do that for "any local deterministic theory". It seems to me our interest is in the latter probability not the former one.

In the above paper, he says:

When N is large one would expect <AB>obs to be close to <AB>, and the same for the other three averages of observed products.
Hence, equation (2) should remain approximately true when we replace the averages of the four products over all N rows with the averages of the four products in each of four disjoint subsamples of expected size N/4 each.

N is the size of the spreadsheet with 4 columns. And he is saying that we have a certain distribution of numbers {+1,-1} in the spreadsheet, and if we divide that spreadsheet up into 4 disjoint parts, we will still have approximately the same averages?! Isn't he making a certain assumption about how the numbers are distributed in the spreadsheet to begin with?

We could start from the experimental situation, in which we have not one Nx4 spreadsheet but 4 different Nx2 spreadsheets. Let us try to derive the inequality from this scenario, and make all the necessary assumptions we could want to make about local determinism and realism to end up with 2 on the RHS. Instead of 4 numbers ##A, B, A', B'##, we now have 8 numbers ##A_1, B_1, A_2, B'_2, A'_3, B_3, A'_4, B'_4##, so that we instead have

##A_1B_1 + A_2B'_2 + A'_3B_3 - A'_4B'_4 \le 4##

What assumptions do we have to apply to this in order to end up with 2 on the RHS? I can think of one. We could say ##A_1 = A_2, A'_3 = A'_4, B'_2 = B'_4, B_1 = B_3##, which, translating from the numbers to spreadsheets of numbers, means the corresponding columns are identical: not just that they have the same ratios of {+1, -1}, but that the pattern of changing back and forth is identical, or can be made identical by rearranging. This is a condition that will allow us to factorize the terms from 4 disjoint sets. For that to be the case, the source will have to know what set each pair will end up in, or the distributions will have to be so uniform at all angle settings that a single set will not be able to reproduce the experimentally observed expectation value for one angle pair.
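
For single rows of numbers, the two bounds can be compared by enumeration. This sketch (the function name `s8` is made up) only covers the arithmetic for one set of eight values in {-1, +1}. It says nothing about how averages over long spreadsheets behave.

```python
from itertools import product

def s8(a1, b1, a2, b2p, a3p, b3, a4p, b4p):
    """A1*B1 + A2*B'2 + A'3*B3 - A'4*B'4 for eight independent +/-1 values."""
    return a1 * b1 + a2 * b2p + a3p * b3 - a4p * b4p

# All eight numbers free: each product can be chosen independently.
unconstrained_max = max(s8(*v) for v in product((-1, 1), repeat=8))

# With A1 = A2, A'3 = A'4, B1 = B3, B'2 = B'4 imposed.
constrained_max = max(s8(a, b, a, bp, ap, b, ap, bp)
                      for a, ap, b, bp in product((-1, 1), repeat=4))

print(unconstrained_max, constrained_max)
```

The unconstrained maximum is 4, and imposing the four equalities brings it down to 2.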

wle · Sep 24, 2014

billschnieder said:

I don't understand how you could calculate a probability that any local deterministic theory will violate Bell inequalities, without clearly defining the space of "local deterministic theories". For a given theory (and Gill gives one), yes I can imagine one easily checking the probability that it will violate the inequalities but how do you do that for "any local deterministic theory". It seems to me our interest is in the latter probability not the former one.

I haven't looked at the details of Gill's method, but I know a simple way of doing this that could be done in an experiment. (It's described in appendix A.2 of this paper; I don't know if it was proposed earlier.) The idea is based around defining an estimator ##S_{k}## for the Bell correlator on the ##k##th realisation (i.e., the ##k##th particle pair, if you want to assume the things being measured are actually particles). I'll describe how this works for the CHSH correlator, though the method can just as well be used for any Bell correlator that can be defined as a linear function of the probability distribution for a single realisation. The procedure, for the ##k##th realisation, is:

Alice and Bob pick random measurements ##x_{k}, y_{k} \in \{0, 1\}## with probabilities ##P(x_{k}) = P(y_{k}) = 1/2##, such that ##P(x_{k} y_{k}) = P(x_{k}) P(y_{k}) = 1/4##.
They record the outcomes ##a_{k}, b_{k} \in \{0, 1\}## from the results of their measurements.
The estimator ##S_{k}##, which Alice and Bob will be able to compute later when they compare their results, is defined in terms of these by $$S_{k} = 4 (-1)^{a_{k} + b_{k} + x_{k} y_{k}} \,.$$

Defined this way, ##S_{k}## is a random variable that can take only the values +4 and -4. If you write down its expectation value, that works out to $$\begin{eqnarray}
\langle S_{k} \rangle &=& \sum_{abxy} 4 (-1)^{a + b + xy} P (abxy) \\
&=& \sum_{abxy} 4 (-1)^{a + b + xy} P(ab \mid xy) P(xy) \\
&=& \sum_{abxy} (-1)^{a + b + xy} P(ab \mid xy) \,.
\end{eqnarray}$$ The last line is exactly what's considered in most derivations of the CHSH expectation value, so the same results hold. In particular, ##-2 \leq \langle S_{k} \rangle \leq 2## according to any locally causal model and ##- 2 \sqrt{2} \leq \langle S_{k} \rangle \leq 2 \sqrt{2}## according to quantum physics.

If Alice and Bob repeat this ##N## times, the CHSH estimator for the whole experiment can just be defined as the average for each of the realisations in the obvious way: $$S = \frac{1}{N} \sum_{k = 1}^{N} S_{k} \,.$$ This is adding a list of random variables of values +4 or -4, but since their expectation values are all bounded by 2 for any local causal model, the probability with which a local causal model can predict a significant violation becomes very low for a large number ##N## of realisations. (If you need an upper bound on the probability with which that can happen, the paper I linked to above explains how to do that using the Azuma-Hoeffding inequality.)
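
As a rough illustration of the procedure, here is a sketch that computes ##\langle S_{k} \rangle## exactly for every deterministic local strategy and then simulates the averaged estimator for one of them. The strategy encoding (a pair of outcomes indexed by the setting), the function names, the seed and the number of runs are all assumptions made for this example. It does not reproduce the Azuma-Hoeffding bound from the paper.

```python
import random
from itertools import product

def exact_mean(a_of_x, b_of_y):
    """Exact <S_k> for a deterministic local strategy with uniformly random settings.

    a_of_x[x] is Alice's outcome for setting x, b_of_y[y] is Bob's for setting y.
    The factor 4 in S_k cancels against P(xy) = 1/4.
    """
    return sum((-1) ** (a_of_x[x] + b_of_y[y] + x * y)
               for x, y in product((0, 1), repeat=2))

# Largest <S_k> over all 16 deterministic local strategies.
local_max = max(exact_mean(a, b)
                for a in product((0, 1), repeat=2)
                for b in product((0, 1), repeat=2))

def run_experiment(n, a_of_x, b_of_y, seed=0):
    """Average of S_k = 4 * (-1)**(a + b + x*y) over n simulated realisations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        x, y = rng.randint(0, 1), rng.randint(0, 1)
        total += 4 * (-1) ** (a_of_x[x] + b_of_y[y] + x * y)
    return total / n

# Strategy "always output 0" on both sides, which attains the local maximum.
s_hat = run_experiment(200_000, (0, 0), (0, 0))
print(local_max, s_hat)
```

Each individual ##S_{k}## is +4 or -4, yet the average settles near 2 for this strategy, in line with the bound for local models described above.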

Alien8 · Sep 24, 2014

billschnieder said:

We could start from the experimental situation, in which we have not one Nx4 spreadsheet but 4 different Nx2 spreadsheets. Let us try to derive the inequality from this scenario, and make all the necessary assumptions we could want to make about local determinism and realism to end up with 2 on the RHS. Instead of 4 numbers ##A, B, A', B'##, we now have 8 numbers ##A_1, B_1, A_2, B'_2, A'_3, B_3, A'_4, B'_4##, so that we instead have

##A_1B_1 + A_2B'_2 + A'_3B_3 - A'_4B'_4 \le 4##

What assumptions do we have to apply to this in order to end up with 2 on the RHS?

AB + AB' + A'B - A'B' = -2 or +2 has nothing to do with locality, determinism or realism. There are no assumptions behind that equation; it is entirely defined by its purely mathematical premise, which is that for every possible combination of four variables A, B, C, D, where each can arbitrarily be either -1 or +1, the expression AC + AD + BC - BD always yields either -2 or +2. That's all: numbers and mathematics, nothing else.

It can not be AB + CD + EF - GH; it has to be AC + AD + BC - BD, because that's the particular combination which produces the -2 or +2 result. It's not an assumption, it's a mathematical truth, just a matter of choice. But that is not the inequality used in experiments, and it has no bearing on locality or determinism. We should be talking about the proper CHSH inequality and relative angles, E(a,c) − E(a,d) + E(b,c) + E(b,d), and then ask why it is not E(a,b) − E(c,d) + E(e,f) + E(g,h).
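The purely arithmetic part of this claim can be checked by brute force. The sketch below (variable names are just illustrative) enumerates every assignment of ±1 to four shared variables and, for comparison, to eight independent variables.

```python
from itertools import product

# Four shared variables: AC + AD + BC - BD.
shared_values = set()
for A, B, C, D in product((-1, 1), repeat=4):
    shared_values.add(A * C + A * D + B * C - B * D)

# Eight independent variables: AB + CD + EF - GH.
independent_values = set()
for A, B, C, D, E, F, G, H in product((-1, 1), repeat=8):
    independent_values.add(A * B + C * D + E * F - G * H)

if __name__ == "__main__":
    print(sorted(shared_values))       # only -2 and 2 occur
    print(sorted(independent_values))  # every even value from -4 to 4 occurs
```

With four shared variables the expression is always -2 or +2, while with eight unrelated variables it can reach -4 and +4. The enumeration says nothing about which case describes an experiment.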

billschnieder · Sep 24, 2014

wle said:

The procedure, for the ##k##th realisation, is:

Alice and Bob pick random measurements ##x_{k}, y_{k} \in \{0, 1\}## with probabilities ##P(x_{k}) = P(y_{k}) = 1/2##, such that ##P(x_{k} y_{k}) = P(x_{k}) P(y_{k}) = 1/4##.

They record the outcomes ##a_{k}, b_{k} \in \{0, 1\}## from the results of their measurements.

The estimator ##S_{k}##, which Alice and Bob will be able to compute later when they compare their results, is defined in terms of these by $$S_{k} = 4 (-1)^{a_{k} + b_{k} + x_{k} y_{k}} \,.$$

Defined this way, ##S_{k}## is a random variable that can take only the values +4 and -4.

Yes, I would expect ##-4 \le S_{k} \le +4##. How you get from this to ##-2 \le \langle S_{k} \rangle \le +2## is what the problem is.

If Alice and Bob repeat this ##N## times, the CHSH estimator for the whole experiment can just be defined as the average over the realisations in the obvious way: $$S = \frac{1}{N} \sum_{k = 1}^{N} S_{k} \,.$$ This is adding a list of random variables of values +4 or -4, but since their expectation values are all bounded by 2 for any local causal model

I don't follow. S is already a result of 4 different realizations, but then you appear to be averaging more than one S. The inequality is about what you can say for any ##S##, not what you can say for averages ##\langle S \rangle##, no? What is proved in the CHSH is ##-2 \le S_{k} \le +2##, not ##-2 \le \langle S_{k} \rangle \le +2##; the former is a sufficient but not necessary condition for the latter. I do not see how even proving the latter implies the former.

billschnieder · Sep 24, 2014

Alien8 said:

It can not be AB + CD + EF - GH; it has to be AC + AD + BC - BD, because that's the particular combination which produces the -2 or +2 result. It's not an assumption, it's a mathematical truth, just a matter of choice. But that is not the inequality used in experiments, and it has no bearing on locality or determinism. We should be talking about the proper CHSH inequality and relative angles, E(a,c) − E(a,d) + E(b,c) + E(b,d), and then ask why it is not E(a,b) − E(c,d) + E(e,f) + E(g,h).

While I agree with you that it is a mathematical truth, you have to remember what is actually measured in experiments. There is no such thing as E(a,b) in an experiment. All you have are 8 lists of numbers in 4 pairs. For each pair of lists we multiply the corresponding members, add up the products and average; we call that E(a,b), but it's actually ##\langle AB\rangle## for angles ##a,b##. We do the same thing for the remaining 3 pairs. At the end we combine the 4 expressions we obtained, calculate ##S##, and then compare that with an inequality.

The issue is what is the correct inequality to use for this kind of result. Should we use an inequality we derived by assuming we had just 4 lists (A, B, C, D) which we recombined into 4 pairs (AC, AD, BC, BD), or should we use an inequality we derived by assuming we had 8 lists in 4 pairs (AB, CD, EF, GH)? You are saying we can of course assume that since the 8 lists were obtained from just 4 angles, we have just 4 lists. But we can't just assume that; there is more producing the outcomes than just angles. Having the same angles doesn't make the two systems have the same degrees of freedom.

We could of course conduct an experiment in which we measure just 4 lists, with no need to pair them at all: just measure one single list at A, and others at B, C, D, then recombine them to make the pairs. Why don't we do that? If we can combine 4 separate lists of pairs, why shouldn't we be able to combine 4 separate lists of singles? I suspect it is the same reason. If I throw a coin, knowing that it landed heads tells me clearly that it did not land tails. But if I throw two identical coins, knowing that one coin landed heads tells me absolutely nothing about what the other coin did or did not do.
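The bookkeeping described above (eight lists in four pairs, averaged products, then combined into ##S##) can be sketched as follows. The function names and the data are made up purely to illustrate the arithmetic; the sign convention follows the expression ##A_1B_1 + A_2B'_2 + A'_3B_3 - A'_4B'_4## quoted earlier in the thread.

```python
def correlation(left, right):
    """Average of the pairwise products of two equal-length lists of +1/-1 outcomes."""
    if len(left) != len(right) or not left:
        raise ValueError("lists must be non-empty and of equal length")
    return sum(l * r for l, r in zip(left, right)) / len(left)

def chsh_from_runs(run_ab, run_ab2, run_a2b, run_a2b2):
    """Combine four separately recorded runs into S.

    Each argument is a pair (alice_list, bob_list) recorded at one setting pair.
    """
    return (correlation(*run_ab) + correlation(*run_ab2)
            + correlation(*run_a2b) - correlation(*run_a2b2))

# Made-up data, not experimental results.
same = ([1, -1, 1, -1], [1, -1, 1, -1])       # perfectly correlated run
opposite = ([1, -1, 1, -1], [-1, 1, -1, 1])   # perfectly anticorrelated run
s_value = chsh_from_runs(same, same, same, opposite)

if __name__ == "__main__":
    print(s_value)
```

With eight lists that are not constrained to share columns, nothing in the arithmetic alone prevents ##S## from reaching 4, which is the bound quoted earlier for the eight-number expression. Which inequality is appropriate for real data is the question under discussion, and this sketch does not settle it.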

atyy · Sep 24, 2014

billschnieder said:

I don't understand how you could calculate a probability that any local deterministic theory will violate Bell inequalities, without clearly defining the space of "local deterministic theories". For a given theory (and Gill gives one), yes, I can imagine one easily checking the probability that it will violate the inequalities, but how do you do that for "any local deterministic theory"? It seems to me our interest is in the latter probability, not the former one.

billschnieder said:

What assumptions do we have to apply to this in order to end up with 2 on the RHS? I can think of one. We could say ##A_1 = A_2, A'_3 = A'_4, B'_2 = B'_4, B_1 = B_3##, which, translating from numbers to spreadsheets of numbers, means the corresponding columns are identical: not just that they have the same ratios of {+1, -1}, but that the pattern of changing back and forth is identical, or can be made identical by rearranging. This is a condition that will allow us to factorize the terms from 4 disjoint sets. For that to be the case, the source will have to know what set each pair will end up in, or the distributions will have to be so uniform at all angle settings that a single set will not be able to reproduce the experimentally observed expectation value for one angle pair.

That seems notionally right, and basically corresponds to the condition that everything is independent and identically distributed, and that the measurement settings and the hidden variables are independent. Gill does discuss the possibility of weaker conditions, but this is the typical assumption. See also wle's post #197 and the paper he linked to, where apparently a bound is derived in which the i.i.d. assumption is only needed on the measurement settings, but not the N samples on which the measurements are made.

Edit: In fact, the Pironio paper http://arxiv.org/abs/0911.3427 that wle linked to cites an earlier paper by Gill http://arxiv.org/abs/quant-ph/0301059 for a bound in which the i.i.d. assumption on the N samples is removed. Interestingly, Gill does comment that the 30 standard deviations given in Weihs et al is under the assumption of i.i.d and that probabilities were equal to observed frequencies, and that the bound under weaker conditions cannot be as strong.

Alien8 · Sep 24, 2014

billschnieder said:

The issue is what is the correct inequality to use for this kind of result.

You first assume that you know what they assumed the kind of result it is supposed to be, and then you question if the inequality is proper for it. I'm saying we first need to find out what kind of result they assumed it is supposed to be, and then decide whether the inequality or the assumption is proper.

Should we use an inequality we derived...

First we should look at the actual CHSH inequality derivation and make sure we understand each step, especially step (6).

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

That's the true origin of how ab, ab', a'b and a'b' got together. The similarity with the AB + AB' + A'B - A'B' inequality is more of a coincidence, because they both share the same {-1,+1} limits. But they are completely different and based on very different premises; they have nothing in common. One deals with binary units and the other with a decimal range; it's like apples and elephants. AB + AB' + A'B - A'B' = -2 or +2 can not be violated by QM or any other theory, because it is not subject to any theory; it's absolutely general and purely mathematical.

We know exactly why AB + AB' + A'B - A'B' is what it is. It's not the result of any derivation; it's a starting premise, a purely mathematical premise completely unrelated to anything but abstract numbers by themselves. But we do not know what premise the combination of ab, ab', a'b and a'b' is based on. You're asking the right question, just talking about the wrong inequality. I wish we would focus on the actual CHSH derivation and try to understand that first.

atyy · Sep 24, 2014

Alien8 said:

First we should look at the actual CHSH inequality derivation and make sure we understand each step, especially step (6).

http://en.wikipedia.org/wiki/CHSH_inequality#Derivation_of_the_CHSH_inequality

That's the true origin of how ab, ab', a'b and a'b' got together. The similarity with the AB + AB' + A'B - A'B' inequality is more of a coincidence, because they both share the same {-1,+1} limits. But they are completely different and based on very different premises; they have nothing in common. One deals with binary units and the other with a decimal range; it's like apples and elephants. AB + AB' + A'B - A'B' = -2 or +2 can not be violated by QM or any other theory, because it is not subject to any theory; it's absolutely general and purely mathematical.

We know exactly why AB + AB' + A'B - A'B' is what it is. It's not the result of any derivation; it's a starting premise, a purely mathematical premise completely unrelated to anything but abstract numbers by themselves. But we do not know what premise the combination of ab, ab', a'b and a'b' is based on. You're asking the right question, just talking about the wrong inequality. I wish we would focus on the actual CHSH derivation and try to understand that first.

In that step, the idea they are using is that ##A = B + C## can also be written as ##A = B + C + D - D##. Basically we can add any term that is of the form ##D - D## since ##D - D = 0##.

wle · Sep 24, 2014

billschnieder said:

I don't follow. S is already a result of 4 different realizations, but then you appear to be averaging more than one S.

By "realisation" I mean what you might call "measurement on one particle pair". (Though I don't like that terminology much since Bell's theorem is about locally causal theories, which may or may not be theories about particles.) Alice and Bob each pick a measurement to do. They measure their systems. They each get a result which they record. That's one realisation. This would normally be repeated thousands of times in a Bell experiment to get a good statistical estimate of the Bell correlator.

Yes, I would expect ##-4 \le S_{k} \le +4##. How you get from this to ##-2 \le \langle S_{k} \rangle \le +2## is what the problem is.

I explained that in the subsequent part of my post. The point is that the estimator is defined in such a way that its expectation value, including the average taken over the choice of measurements (which is random), is exactly what's bounded in most derivations of the CHSH inequality. If you want me to do that explicitly, then start with the last line I wrote down: $$\begin{eqnarray}
\langle S_{k} \rangle &=& \sum_{abxy} (-1)^{a + b + xy} P(ab \mid xy) \\
&=& \sum_{xy} (-1)^{xy} \sum_{ab} (-1)^{a} (-1)^{b} P(ab \mid xy) \\
&=& E(00) + E(01) + E(10) - E(11) \,,
\end{eqnarray}$$ with $$\begin{eqnarray}
E(xy) &=& \sum_{ab} (-1)^{a} (-1)^{b} P(ab \mid xy) \\
&=& P(00 \mid xy) - P(01 \mid xy) - P(10 \mid xy) + P(11 \mid xy)
\end{eqnarray}$$ defined for convenience. For a Bell-local model, the probability distribution should have the form $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda) \,,$$ so the quantities ##E(xy)## can be written as $$E(xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, E(xy; \lambda)$$ with $$\begin{eqnarray}
E(xy; \lambda) &=& \sum_{ab} (-1)^{a} (-1)^{b} \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda) \\
&=& \bigl( P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \bigr) \bigl( P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \bigr) \\
&=& E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda) \,.
\end{eqnarray}$$ In the last line, I set $$\begin{eqnarray}
E_{\mathrm{A}}(x; \lambda) &=& P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,, \\
E_{\mathrm{B}}(y; \lambda) &=& P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \,,
\end{eqnarray}$$ which are bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1## and ##-1 \leq E_{\mathrm{B}}(y; \lambda) \leq 1##. For any given ##\lambda##, $$\begin{eqnarray}
E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda)
&=& E_{\mathrm{A}}(0; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \bigr) \\
&&+\> E_{\mathrm{A}}(1; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \bigr) \\
&\leq& \lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert \\
&\leq& 2 \,,
\end{eqnarray}$$ so for the CHSH estimator expectation value, you get $$\begin{eqnarray}
\langle S_{k} \rangle &=& \int \mathrm{d}\lambda \, \rho(\lambda) \, \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& \max_{\lambda} \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& 2 \,.
\end{eqnarray}$$
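The derivation above can be spot-checked numerically. The sketch below builds random local models with a finite hidden variable (an assumption made so that the integral over ##\lambda## becomes a sum), computes ##P(ab \mid xy)## from the factorisation condition, and evaluates ##\sum_{abxy} (-1)^{a + b + xy} P(ab \mid xy)##. The helper names are hypothetical, and a finite random search illustrates the bound rather than proving it.

```python
import random

def random_local_model(n_lambda, rng):
    """Random weights rho(lambda) and local response probabilities.

    p_a[l][x] is P_A(a = 0 | x; lambda_l); p_b[l][y] is P_B(b = 0 | y; lambda_l).
    """
    weights = [rng.random() for _ in range(n_lambda)]
    total = sum(weights)
    rho = [w / total for w in weights]
    p_a = [[rng.random() for _ in range(2)] for _ in range(n_lambda)]
    p_b = [[rng.random() for _ in range(2)] for _ in range(n_lambda)]
    return rho, p_a, p_b

def joint_probability(a, b, x, y, model):
    """P(ab | xy) = sum over lambda of rho * P_A(a | x; lambda) * P_B(b | y; lambda)."""
    rho, p_a, p_b = model
    total = 0.0
    for r, pa, pb in zip(rho, p_a, p_b):
        prob_a = pa[x] if a == 0 else 1 - pa[x]
        prob_b = pb[y] if b == 0 else 1 - pb[y]
        total += r * prob_a * prob_b
    return total

def chsh_value(model):
    """E(00) + E(01) + E(10) - E(11), written as a single sum over a, b, x, y."""
    s = 0.0
    for x in (0, 1):
        for y in (0, 1):
            for a in (0, 1):
                for b in (0, 1):
                    s += (-1) ** (a + b + x * y) * joint_probability(a, b, x, y, model)
    return s

if __name__ == "__main__":
    rng = random.Random(1)
    largest = max(abs(chsh_value(random_local_model(5, rng))) for _ in range(1000))
    print("largest |<S_k>| found:", largest)
```

A deterministic model where both parties always output 0 saturates the bound at exactly 2, which is a convenient sanity check on the sign conventions.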

billschnieder · Sep 24, 2014

wle said:

In the last line, I set $$\begin{eqnarray}
E_{\mathrm{A}}(x; \lambda) &=& P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,, \\
E_{\mathrm{B}}(y; \lambda) &=& P_{\mathrm{B}}(0 \mid y; \lambda) - P_{\mathrm{B}}(1 \mid y; \lambda) \,,
\end{eqnarray}$$ which are bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1## and ##-1 \leq E_{\mathrm{B}}(y; \lambda) \leq 1##. For any given ##\lambda##, $$\begin{eqnarray}
E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda)
&=& E_{\mathrm{A}}(0; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \bigr) \\
&&+\> E_{\mathrm{A}}(1; \lambda) \bigl( E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \bigr) \\
&\leq& \lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert \\
&\leq& 2 \,,
\end{eqnarray}$$ so for the CHSH estimator expectation value, you get $$\begin{eqnarray}
\langle S_{k} \rangle &=& \int \mathrm{d}\lambda \, \rho(\lambda) \, \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& \max_{\lambda} \bigl( E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \bigr) \\
&\leq& 2 \,.
\end{eqnarray}$$

So, let us focus on the part where you are doing the factorization, as I keep coming back to the factorization (it is the crucial part of every such proof). You are doing algebra with the functions ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)##, factorizing them like on the 4th line above. One may ask, if you can factorize them out of their respective pairs, and you have just 4 functions, why can't you just measure each one individually in the experiment and use that to verify your inequality? For example, you have a very interesting inequality there, this one:

$$\begin{eqnarray}
\lvert E_{\mathrm{B}}(0; \lambda) + E_{\mathrm{B}}(1; \lambda) \rvert + \lvert E_{\mathrm{B}}(0; \lambda) - E_{\mathrm{B}}(1; \lambda) \rvert
&\leq& 2
\end{eqnarray}$$

It involves just single-sided results, which are actually quite easy to measure, and for which QM has predictions. If QM does not violate this inequality, there is no chance it will violate ##E(00; \lambda) + E(01; \lambda) + E(10; \lambda) - E(11; \lambda) \le 2##, is there? Do you know what the QM predictions for ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)## are for Bell states?

Alien8 · Sep 24, 2014

atyy said:

In that step, the idea they are using is that ##A = B + C## can also be written as ##A = B + C + D - D##. Basically we can add any term that is of the form ##D - D## since ##D - D = 0##.

I think we went far away from what the original topic was supposed to be. I started a new thread specifically about CHSH derivation:
https://www.physicsforums.com/threads/derivation-of-the-chsh-inequality.772844/

wle · Sep 25, 2014

billschnieder said:

So, let us focus on the part where you are doing the factorization, as I keep coming back to the factorization (it is the crucial part of every such proof). You are doing algebra with the functions ##E_{\mathrm{A}}(0; \lambda), E_{\mathrm{A}}(1; \lambda), E_{\mathrm{B}}(0; \lambda), E_{\mathrm{B}}(1; \lambda)##, factorizing them like on the 4th line above. One may ask, if you can factorize them out of their respective pairs, and you have just 4 functions, why can't you just measure each one individually in the experiment and use that to verify your inequality?

Because they depend on a variable ##\lambda## that a local hidden variable theory would supply, and which may not be measurable or even exist. If it does exist, then according to a local hidden variable theory you should have the factorisation ##E(xy; \lambda) = E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda)##, but for the terms ##E(xy)## all this lets you say is that they can be expressed in the form $$E(xy) = \int \mathrm{d}\lambda \, \rho(\lambda) E_{\mathrm{A}}(x; \lambda) \, E_{\mathrm{B}}(y; \lambda) \,,$$ which doesn't necessarily factorise into something like ##E(xy) = E_{\mathrm{A}}(x) \, E_{\mathrm{B}}(y)##.
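A toy example may make the last point concrete. Assume a hypothetical hidden variable with just two equally likely values, for which both parties' results are +1 at the first value and -1 at the second, whatever the setting. The tables and helper names below are invented for this illustration only.

```python
# Hypothetical two-valued hidden variable, each value with weight 1/2.
rho = [0.5, 0.5]

# E_A(x; lambda) and E_B(y; lambda), indexed as table[lambda][setting].
e_a = [[+1, +1], [-1, -1]]
e_b = [[+1, +1], [-1, -1]]

def joint(x, y):
    """E(xy) = sum over lambda of rho * E_A(x; lambda) * E_B(y; lambda)."""
    return sum(r * ea[x] * eb[y] for r, ea, eb in zip(rho, e_a, e_b))

def marginal(table, setting):
    """E_A(x) or E_B(y): the lambda-average of one party's local expectation."""
    return sum(r * row[setting] for r, row in zip(rho, table))

e_00 = joint(0, 0)
product_00 = marginal(e_a, 0) * marginal(e_b, 0)

if __name__ == "__main__":
    print("E(00) =", e_00)
    print("E_A(0) * E_B(0) =", product_00)
```

Here ##E(xy; \lambda)## factorises for each ##\lambda##, yet ##E(00) = 1## while ##E_{\mathrm{A}}(0) \, E_{\mathrm{B}}(0) = 0##, so the averaged correlator is not the product of the averaged single-side terms.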

billschnieder · Sep 25, 2014

wle, your choice of notation is very confusing. What the heck is ##E_{\mathrm{A}}(x; \lambda)## supposed to mean that is different from ##E_{\mathrm{A}}(x)##? Why not just use the standard notation ##A(x; \lambda)##?

wle · Sep 25, 2014

billschnieder said:

wle, your choice of notation is very confusing. What the heck is ##E_{\mathrm{A}}(x; \lambda)## supposed to mean that is different from ##E_{\mathrm{A}}(x)##? Why not just use the standard notation ##A(x; \lambda)##?

What notation looks "standard" depends on where you learned Bell's theorem from. I explained how to derive the CHSH inequality starting from the factorisation condition $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda)$$ for a probability distribution, which is how Bell defined a local model in some of his later essays. This is a more general definition than what's used in many derivations of the Bell or CHSH inequality because it doesn't require the local model to be deterministic (though as atyy pointed out in an earlier post, it's always possible to turn a local stochastic model into a local deterministic model by adding more hidden variables, so it doesn't make any difference). I also personally find the definition given in terms of probabilities a lot clearer and less prone to misconceptions.

Most of the terms in post #204 are simply defined in terms of the elements appearing in the factorisation above. For instance, ##E_{\mathrm{A}}(x; \lambda)## was an intermediate variable defined as $$E_{\mathrm{A}}(x; \lambda) = P_{\mathrm{A}}(0 \mid x; \lambda) - P_{\mathrm{A}}(1 \mid x; \lambda) \,,$$ which I introduced just because it was convenient. If you insist on giving this an interpretation, then it's the expectation value of Alice's result for a given ##\lambda## if the results are called ##A = +1## or ##A = -1## instead of ##a = 0## or ##a = 1##. In general, this is a real number bounded by ##-1 \leq E_{\mathrm{A}}(x; \lambda) \leq 1##. For a deterministic local model, ##E_{\mathrm{A}}(x; \lambda)## can only be either ##+1## or ##-1## and it's the same thing that many derivations of the CHSH inequality would call ##A(x; \lambda)## or something similar.

I didn't explicitly define what ##E_{\mathrm{A}}(x)## was because I never needed such a term, but if I did I'd define it as $$E_{\mathrm{A}}(x) = \int \mathrm{d}\lambda \, \rho(\lambda) \, E_{\mathrm{A}}(x; \lambda) \,.$$ In the notation for deterministic local models that you're more familiar with, that would be the same thing as $$\langle A(x) \rangle = \int \mathrm{d}\lambda \, \rho(\lambda) \, A(x; \lambda) \,,$$ though this particular expectation value is never used in derivations of the CHSH inequality.

atyy · Sep 25, 2014

wle said:

What notation looks "standard" depends on where you learned Bell's theorem from. I explained how to derive the CHSH inequality starting from the factorisation condition $$P(ab \mid xy) = \int \mathrm{d}\lambda \, \rho(\lambda) \, P_{\mathrm{A}}(a \mid x; \lambda) \, P_{\mathrm{B}}(b \mid y; \lambda)$$ for a probability distribution, which is how Bell defined a local model in some of his later essays. This is a more general definition than what's used in many derivations of the Bell or CHSH inequality because it doesn't require the local model to be deterministic (though as atyy pointed out in an earlier post, it's always possible to turn a local stochastic model into a local deterministic model by adding more hidden variables, so it doesn't make any difference). I also personally find the definition given in terms of probabilities a lot clearer and less prone to misconceptions.

Where did you learn to derive CHSH? I like your proof. I'm a biologist, so probabilities and directed graphical nonsense are much more my cup of tea too.
