Quantum Mechanics, 2nd term 2002
Martin Plenio, Imperial College
Version January 28, 2002
Office hours: Tuesdays 11am-12noon and 5pm-6pm!
Office: Blackett 622
Available at:
Introduction

This lecture will introduce quantum mechanics from a more abstract point of view than the first quantum mechanics course that you took in your second year. What I would like to achieve with this course is for you to gain a deeper understanding of the structure of quantum mechanics and of some of its key points. As the structure is inevitably mathematical, I will need to talk about mathematics. I will not do this just for the sake of mathematics, but always with the aim of understanding physics. At the end of the course I would like you not only to be able to understand the basic structure of quantum mechanics, but also to be able to solve (calculate) quantum mechanical problems. In fact, I believe that the ability to calculate (finding the quantitative solution to a problem, or the correct proof of a theorem) is absolutely essential for reaching a real understanding of physics (although physical intuition is equally important). I would like to go so far as to state:

If you can't write it down, then you do not understand it!

By 'writing it down' I mean expressing your statement mathematically or being able to calculate the solution of a scheme that you proposed. This does not sound like a very profound truth, but you would be surprised to see how many people actually believe that it is completely sufficient just to be able to talk about physics and that calculations are a menial task that only mediocre physicists undertake. Well, I can assure you that even the greatest physicists don't just sit down and await inspiration. Ideas only come after many wrong tries, and whether a try is right or wrong can only be found out by checking it, i.e. by doing some sort of calculation or a proof. The ability to do calculations is not something that one has or hasn't, but (except for some exceptional cases) has to be acquired by practice.
This is one of the reasons why these lectures will be accompanied by problem sheets (Rapid Feedback System), and I really recommend that you try to solve them. It is quite clear that solving the problem sheets is one of the best ways to prepare for the exam. Sometimes I will add some extra problems to the problem sheets which are more tricky than usual. They are usually intended
to illuminate an advanced or tricky point for which I had no time in the lectures.

The first part of these lectures will not be too unusual. The first chapter will be devoted to the mathematical description of the quantum mechanical state space, the Hilbert space, and to the description of physical observables. The measurement process will be investigated in the next chapter, and some of its implications will be discussed. In this chapter you will also learn the essential tools for studying entanglement, the stuff such weird things as quantum teleportation, quantum cryptography and quantum computation are made of. The third chapter will present the dynamics of quantum mechanical systems and highlight the importance of the concept of symmetry in physics, and particularly in quantum mechanics. It will be shown how the momentum and angular momentum operators can be obtained as generators of the symmetry groups of translation and rotation. I will also introduce a different kind of symmetry, called gauge symmetry. It allows us to 'derive' the existence of classical electrodynamics from a simple invariance principle. This idea was pushed much further in the 1960s, when people applied it to the theories of elementary particles, and were quite successful with it. In fact, 't Hooft and Veltman received the Nobel prize in 1999 for their work in this area. Time dependent problems, even more than time-independent problems, are difficult to solve exactly, and therefore perturbation theoretical methods are of great importance. They will be explained in chapter 5 and examples will be given.

Most of the ideas that you are going to learn in the first five chapters of these lectures have been known since about 1930, which is quite some time ago. The second part of these lectures, however, I will devote to topics which are currently the object of intense research (they are also my main area of research).
In this last chapter I will discuss topics such as entanglement, Bell inequalities, quantum state teleportation, quantum computation and quantum cryptography. How much of these I can cover depends on the amount of time that is left, but I will certainly talk about some of them. While most physicists (hopefully) know the basics of quantum mechanics (the first five chapters of these lectures), many of them will not be familiar with the content of the other chapters. So, after these lectures you can be sure to know about something that quite a few professors do not know themselves! I hope that this motivates
you to stay with me until the end of the lectures.

Before I begin, I would like to thank Vincenzo Vitelli, John Papadimitrou and William Irvine, who took this course previously, spotted errors and suggested improvements in the lecture notes and the course. These errors are fixed now, but I expect that there are more. If you find errors, please let me know (ideally via email, so that the corrections do not get lost again) so I can get rid of them.

Last but not least, I would like to encourage you both to ask questions during the lectures and to make use of my office hours. Questions are essential in the learning process, so they are good for you, but they also tell me what you have not understood so well and help me to improve my lectures. Finally, it is more fun to lecture when there is some feedback from the audience.
Chapter 1
Mathematical Foundations

Before I begin to introduce some basics of complex vector spaces and discuss the mathematical foundations of quantum mechanics, I would like to present a simple (seemingly classical) experiment from which we can derive quite a few quantum rules.

1.1 The quantum mechanical state space

When we talk about physics, we attempt to find a mathematical description of the world. Of course, such a description cannot be justified from mathematical consistency alone, but has to agree with experimental evidence. The mathematical concepts that are introduced are usually motivated by our experience of nature. Concepts such as position and momentum or the state of a system are usually taken for granted in classical physics. However, many of these have to be subjected to a careful re-examination when we try to carry them over to quantum physics. One of the basic notions for the description of a physical system is that of its 'state'. The 'state' of a physical system can then be defined, roughly, as the description of all the known (in fact one should say knowable) properties of that system, and it therefore represents your knowledge about this system. The set of all states forms what we usually call the state space. In classical mechanics, for example, this is the phase space (the variables are then position and momentum), which is a real vector space. For a classical point-particle moving in
one dimension, this space is two dimensional: one dimension for position, one dimension for momentum. We expect, in fact you probably know this from your second year lectures, that the quantum mechanical state space differs from that of classical mechanics. One reason for this can be found in the ability of quantum systems to exist in coherent superpositions of states with complex amplitudes; other differences relate to the description of multi-particle systems. This suggests that a good choice for the quantum mechanical state space may be a complex vector space.

Before I begin to investigate the mathematical foundations of quantum mechanics, I would like to present a simple example (including some live experiments) which motivates the choice of complex vector spaces as state spaces a bit more. Together with the hypothesis of the existence of photons it will also allow us to 'derive', or better, to make an educated guess for the projection postulate and the rules for the computation of measurement outcomes. It will also remind you of some of the features of quantum mechanics which you have already encountered in your second year course.

1.2 The quantum mechanical state space

In the next subsection I will briefly motivate that the quantum mechanical state space should be a complex vector space, and also motivate some of the other postulates of quantum mechanics.

1.2.1 From Polarized Light to Quantum Theory

Let us consider plane waves of light propagating along the z-axis. This light is described by the electric field vector E, orthogonal to the direction of propagation. The electric field vector determines the state of light because in the cgs-system (which I use for convenience in this example so that I have as few ε0's and µ0's as possible) the magnetic field is given by B = ez × E. Given the electric and magnetic field, Maxwell's equations determine the further time evolution of these fields.
In the absence of charges, we know that E(r, t) cannot have a z-component,
so that we can write

    E(r, t) = E_x(r, t) e_x + E_y(r, t) e_y = \begin{pmatrix} E_x(r, t) \\ E_y(r, t) \end{pmatrix} .    (1.1)

The electric field is a real valued quantity and the general solution of the free wave equation is given by

    E_x(r, t) = E_x^0 \cos(kz - \omega t + \alpha_x)
    E_y(r, t) = E_y^0 \cos(kz - \omega t + \alpha_y) .

Here k = 2\pi/\lambda is the wave-number, \omega = 2\pi\nu the frequency, \alpha_x and \alpha_y are the real phases and E_x^0 and E_y^0 the real valued amplitudes of the field components. The energy density of the field is given by

    \epsilon(r, t) = \frac{1}{8\pi} \left( E^2(r, t) + B^2(r, t) \right)
                   = \frac{1}{4\pi} \left[ (E_x^0)^2 \cos^2(kz - \omega t + \alpha_x) + (E_y^0)^2 \cos^2(kz - \omega t + \alpha_y) \right] .

For a fixed position r we are generally only really interested in the time-averaged energy density which, when multiplied with the speed of light, determines the rate at which energy flows in the z-direction. Averaging over one period of the light we obtain the averaged energy density \bar{\epsilon}(r) with

    \bar{\epsilon}(r) = \frac{1}{8\pi} \left[ (E_x^0)^2 + (E_y^0)^2 \right] .    (1.2)

For practical purposes it is useful to introduce the complex field components

    E_x(r, t) = \mathrm{Re}(E_x e^{i(kz - \omega t)})    E_y(r, t) = \mathrm{Re}(E_y e^{i(kz - \omega t)}) ,    (1.3)

with E_x = E_x^0 e^{i\alpha_x} and E_y = E_y^0 e^{i\alpha_y}. Comparing with Eq. (1.2) we find that the averaged energy density is given by

    \bar{\epsilon}(r) = \frac{1}{8\pi} \left[ |E_x|^2 + |E_y|^2 \right] .    (1.4)

Usually one works with the complex field

    E(r, t) = (E_x e_x + E_y e_y) e^{i(kz - \omega t)} = \begin{pmatrix} E_x \\ E_y \end{pmatrix} e^{i(kz - \omega t)} .    (1.5)
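The time-averaging step leading to Eq. (1.2), and its equivalence to the complex-amplitude form Eq. (1.4), can be spot-checked numerically. This is a quick sketch (not part of the original notes); the amplitudes and phases are arbitrary test values:

```python
import numpy as np

# Average the energy density over one optical period at z = 0 and compare
# with (1/8pi) * ((Ex0)^2 + (Ey0)^2), Eq. (1.2).
Ex0, Ey0 = 1.3, 0.7           # real amplitudes (arbitrary test values)
ax, ay = 0.4, 1.9             # real phases alpha_x, alpha_y
omega = 2 * np.pi             # choose the frequency so the period is T = 1

t = np.linspace(0.0, 1.0, 100001)   # one period, fine grid
u = (Ex0 * np.cos(-omega * t + ax))**2 + (Ey0 * np.cos(-omega * t + ay))**2
avg_density = np.mean(u) / (4 * np.pi)       # time average of epsilon(r, t)

predicted = (Ex0**2 + Ey0**2) / (8 * np.pi)  # Eq. (1.2)

# Same result from the complex amplitudes of Eq. (1.3):
Ex, Ey = Ex0 * np.exp(1j * ax), Ey0 * np.exp(1j * ay)
from_complex = (abs(Ex)**2 + abs(Ey)**2) / (8 * np.pi)   # Eq. (1.4)
```

The agreement simply reflects the fact that the time average of cos² over a full period is 1/2, which is where the factor 1/(8π) in Eq. (1.2) comes from.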
This means that we are now characterizing the state of light by a vector with complex components.

The polarization of a light wave is described by E_x and E_y. In the general case of complex E_x and E_y we will have elliptically polarized light. There are a number of important special cases (see Figure 1.1 for illustration):

1. E_y = 0: linear polarization along the x-axis.
2. E_x = 0: linear polarization along the y-axis.
3. E_x = E_y: linear polarization along the 45°-axis.
4. E_y = iE_x: right circularly polarized light.
5. E_y = -iE_x: left circularly polarized light.

Figure 1.1: Left figure: Some possible linear polarizations of light: horizontal, vertical and 45 degrees. Right figure: Left- and right-circularly polarized light. The light is assumed to propagate away from you.

In the following I would like to consider some simple experiments for which I will compute the outcomes using classical electrodynamics. Then I will go further and use the hypothesis of the existence of photons to derive a number of quantum mechanical rules from these experiments.

Experiment I: Let us first consider a plane light wave propagating in the z-direction that is falling onto an x-polarizer which allows x-polarized
light to pass through (but not y-polarized light). This is shown in Figure 1.2. After passing the polarizer the light is x-polarized, and from

Figure 1.2: Light of arbitrary polarization hitting an x-polarizer.

the expression for the energy density, Eq. (1.4), we find that the ratio between incoming intensity I_in (energy density times speed of light) and outgoing intensity I_out is given by

    \frac{I_{out}}{I_{in}} = \frac{|E_x|^2}{|E_x|^2 + |E_y|^2} .    (1.6)

So far this looks like an experiment in classical electrodynamics or optics.

Quantum Interpretation: Let us change the way of looking at this problem and thereby turn it into a quantum mechanical experiment. You have heard at various points in your physics course that light comes in little quanta known as photons. The first time this assumption was made was by Planck in 1900, 'as an act of desperation', to be able to derive the blackbody radiation spectrum. Indeed, you can also observe in direct experiments that the photon hypothesis makes sense. When you reduce the intensity of light that falls onto a photodetector, you will observe that the detector responds with individual clicks, each triggered by the impact of a single photon (if the detector is sensitive enough). The photo-electric effect and various other experiments also confirm the existence of photons. So, in the low-intensity limit we have to consider light as consisting of indivisible units called photons. It is a fundamental property of photons that they cannot be split – there is no such thing as half a photon going through a polarizer, for example. In this photon picture we have to conclude that sometimes a photon will be absorbed in the polarizer and sometimes it passes through. If the photon passes the polarizer, we have gained one piece of information, namely that the photon was able to pass the polarizer and that therefore it has to be polarized in the x-direction. The probability p for
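Eq. (1.6) is easy to play with numerically. A minimal sketch (the function name is ours, chosen for illustration):

```python
def x_polarizer_transmission(Ex, Ey):
    """Intensity ratio I_out / I_in for an ideal x-polarizer, Eq. (1.6)."""
    return abs(Ex)**2 / (abs(Ex)**2 + abs(Ey)**2)

# Special cases: x-polarized light passes completely, y-polarized light is
# fully absorbed, and circularly polarized light (Ey = i Ex) is half-transmitted.
full = x_polarizer_transmission(1, 0)     # 1.0
none = x_polarizer_transmission(0, 1)     # 0.0
half = x_polarizer_transmission(1, 1j)    # 0.5
```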
the photon to pass through the polarizer is obviously the ratio between transmitted and incoming intensities, which is given by

    p = \frac{|E_x|^2}{|E_x|^2 + |E_y|^2} .    (1.7)

If we write the state of the light with normalized intensity,

    E_N = \frac{E_x}{\sqrt{|E_x|^2 + |E_y|^2}} e_x + \frac{E_y}{\sqrt{|E_x|^2 + |E_y|^2}} e_y ,    (1.8)

then in fact we find that the probability for the photon to pass the x-polarizer is just the modulus square of the amplitude in front of the basis vector e_x! This is just one of the quantum mechanical rules that you have learned in your second year course.

Furthermore we see that the state of the photon after it has passed the x-polarizer is given by

    E_N = e_x ,    (1.9)

i.e. the state has changed from \begin{pmatrix} E_x \\ E_y \end{pmatrix} to \begin{pmatrix} E_x \\ 0 \end{pmatrix}. This transformation of the state can be described by a matrix acting on vectors, i.e.

    \begin{pmatrix} E_x \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} E_x \\ E_y \end{pmatrix} .    (1.10)

The matrix that I have written here has eigenvalues 0 and 1 and is therefore a projection operator, which you have heard about in the second year course; in fact this is strongly reminiscent of the projection postulate in quantum mechanics.

Experiment II: Now let us make a somewhat more complicated experiment by placing a second polarizer behind the first x-polarizer. The second polarizer allows photons polarized in the x' direction to pass through. If I slowly rotate the polarizer from the x direction to the y direction, we observe that the intensity of the light that passes through the polarizer decreases and vanishes when the directions of the two polarizers are orthogonal. I would like to describe this experiment mathematically. How do we compute the intensity after the polarizer
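The normalization of Eq. (1.8), the probability rule Eq. (1.7) and the projector of Eq. (1.10) can be sketched together as follows (an illustrative snippet, not part of the notes; the field components are arbitrary test values):

```python
import numpy as np

# Photon picture of Experiment I: normalize the field vector as in Eq. (1.8);
# the transmission probability Eq. (1.7) is then the squared modulus of the
# amplitude in front of e_x, and the polarizer acts as the projector Eq. (1.10).
E = np.array([1.0 + 1.0j, 2.0 - 1.0j])     # arbitrary (Ex, Ey)

E_N = E / np.linalg.norm(E)                # normalized state, Eq. (1.8)
p = abs(E_N[0])**2                         # probability, Eq. (1.7): here 2/7

P_x = np.array([[1, 0], [0, 0]])           # projector of Eq. (1.10)
E_after = P_x @ E                          # (Ex, 0): the y-component is removed
```

Note that P_x squared equals P_x, the defining property of a projection operator.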
now? To this end we need to see how we can express vectors in the basis chosen by the direction x' in terms of the old basis vectors e_x, e_y.

The new rotated basis e'_x, e'_y (see Fig. 1.3) can be expressed in the old basis by

    e'_x = \cos\phi \, e_x + \sin\phi \, e_y
    e'_y = -\sin\phi \, e_x + \cos\phi \, e_y    (1.11)

and vice versa

    e_x = \cos\phi \, e'_x - \sin\phi \, e'_y
    e_y = \sin\phi \, e'_x + \cos\phi \, e'_y .    (1.12)

Note that \cos\phi = e'_x \cdot e_x and \sin\phi = e'_x \cdot e_y, where I have used the real scalar product between vectors.

Figure 1.3: The x'-basis is rotated by an angle φ with respect to the original x-basis.

The state of the x-polarized light after the first polarizer can be rewritten in the new basis of the x'-polarizer. We find

    E = E_x e_x = E_x \cos\phi \, e'_x - E_x \sin\phi \, e'_y
      = E_x (e'_x \cdot e_x) e'_x + E_x (e'_y \cdot e_x) e'_y .

Now we can easily compute the ratio between the intensity before and after the x'-polarizer. We find that it is

    \frac{I_{after}}{I_{before}} = |e'_x \cdot e_x|^2 = \cos^2\phi ,    (1.13)

or, if we describe the light in terms of states with normalized intensity as in equation (1.8), then we find that

    \frac{I_{after}}{I_{before}} = |e'_x \cdot E_N|^2 = \frac{|e'_x \cdot E_N|^2}{|e'_x \cdot E_N|^2 + |e'_y \cdot E_N|^2} ,    (1.14)
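A short numerical check of Malus' law, Eq. (1.13), using the rotated basis of Eq. (1.11) (the angle is an arbitrary test value):

```python
import numpy as np

# Express x-polarized light in the rotated basis of Eq. (1.11) and recover
# I_after / I_before = |e'_x . e_x|^2 = cos^2(phi), Eq. (1.13).
phi = 0.7                                    # arbitrary rotation angle

e_x = np.array([1.0, 0.0])
ex_p = np.array([np.cos(phi), np.sin(phi)])  # e'_x, Eq. (1.11)
ey_p = np.array([-np.sin(phi), np.cos(phi)]) # e'_y

# The component of e_x along e'_x is cos(phi); the transmitted intensity
# fraction is its square.
ratio = np.dot(ex_p, e_x)**2
```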
where E_N is the normalized intensity state of the light after the x-polarizer. This demonstrates that the scalar product between vectors plays an important role in the calculation of the intensities (and therefore of the probabilities in the photon picture).

Varying the angle φ between the two bases, we can see that the ratio of the incoming and outgoing intensities decreases with increasing angle between the two axes until the angle reaches 90 degrees.

Interpretation: Viewed in the photon picture this is a rather surprising result, as we would have thought that after passing the x-polarizer the photon is 'objectively' in the x-polarized state. However, upon probing it with an x'-polarizer we find that it also has a quality of an x'-polarized state. In the next experiment we will see an even more worrying result. For the moment we note that the state of a photon can be written in different ways, and this freedom corresponds to the fact that in quantum mechanics we can write the quantum state in many different ways as a quantum superposition of basis vectors.

Let us push this idea a bit further by using three polarizers in a row.

Experiment III: If, after passing the x-polarizer, the light falls onto a y-polarizer (see Fig. 1.4), then no light will go through the polarizer, because the two directions are perpendicular to each other. This simple

Figure 1.4: Light of arbitrary polarization hitting an x-polarizer and subsequently a y-polarizer. No light goes through both polarizers.

experimental result changes when we place an additional polarizer between the x- and the y-polarizer. Assume that we place an x'-polarizer between the two polarizers. Then we will observe light after the y-polarizer (see Fig. 1.5), depending on the orientation of x'. The light after the last polarizer is described by Ẽ_y e_y. The amplitude Ẽ_y is calculated analogously as in Experiment II. Now let us describe the (x-x'-y) experiment mathematically.
The complex electric field (without the time dependence) is given by
Figure 1.5: An x'-polarizer is placed in between an x-polarizer and a y-polarizer. Now we observe light passing through the y-polarizer.

before the x-polarizer:
    E_1 = E_x e_x + E_y e_y .
after the x-polarizer:
    E_2 = (E_1 \cdot e_x) e_x = E_x e_x = E_x \cos\phi \, e'_x - E_x \sin\phi \, e'_y .
after the x'-polarizer:
    E_3 = (E_2 \cdot e'_x) e'_x = E_x \cos\phi \, e'_x = E_x \cos^2\phi \, e_x + E_x \cos\phi \sin\phi \, e_y .
after the y-polarizer:
    E_4 = (E_3 \cdot e_y) e_y = E_x \cos\phi \sin\phi \, e_y = \tilde{E}_y e_y .

Therefore the ratio between the intensity after the y-polarizer and before the x'-polarizer is given by

    \frac{I_{after}}{I_{before}} = \cos^2\phi \sin^2\phi .    (1.15)

Interpretation: Again, if we interpret this result in the photon picture, then we arrive at the conclusion that the probability for the photon to pass through both the x'- and the y-polarizer is given by cos²φ sin²φ. This experiment further highlights the fact that light of one polarization may be interpreted as a superposition of light of other polarizations. This superposition is represented by adding vectors with complex coefficients. If we consider this situation in the photon picture, we have to accept that a photon of a particular polarization can also be interpreted as a superposition of different polarization states.
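The whole x–x'–y chain can be simulated by iterating the projection step; this sketch (with our own helper name `project`, introduced for illustration) reproduces Eq. (1.15):

```python
import numpy as np

# Experiment III: chain the projections x -> x' -> y and check
# I_after / I_before = cos^2(phi) sin^2(phi), Eq. (1.15), relative to the
# intensity after the first x-polarizer.
phi = 1.1
ey = np.array([0.0, 1.0])
ex_p = np.array([np.cos(phi), np.sin(phi)])   # x'-axis

def project(E, axis):
    """Ideal polarizer along 'axis': E -> (E . axis) axis."""
    return np.dot(E, axis) * axis

E2 = np.array([0.8, 0.0])        # x-polarized light after the first polarizer
E3 = project(E2, ex_p)           # after the x'-polarizer
E4 = project(E3, ey)             # after the y-polarizer

ratio = np.sum(E4**2) / np.sum(E2**2)   # cos^2(phi) sin^2(phi)

# Without the middle polarizer the light is blocked completely:
blocked = project(E2, ey)        # the zero vector
```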
Conclusion: All these observations suggest that complex vectors, their amplitudes, scalar products and linear transformations between complex vectors are the basic ingredients in the mathematical structure of quantum mechanics, as opposed to the real vector space of classical mechanics. Therefore the rest of this chapter will be devoted to a more detailed introduction to the structure of complex vector spaces and their properties.

Suggestions for further reading:
G. Baym, Lectures on Quantum Mechanics, W.A. Benjamin 1969.
P.A.M. Dirac, The Principles of Quantum Mechanics, Oxford University Press 1958.

End of 1st lecture

1.2.2 Complex vector spaces

I will now give you a formal definition of a complex vector space and will then present some of its properties. Before I come to this definition, I introduce a standard notation that I will use in this chapter. Given some set V we define

Notation:
1. ∀|x⟩ ∈ V means: For all |x⟩ that lie in V.
2. ∃|x⟩ ∈ V means: There exists an element |x⟩ that lies in V.

Note that I have used a somewhat unusual notation for vectors. I have replaced the vector arrow on top of the letter by a sort of bracket around the letter. I will use this notation when I talk about complex vectors, in particular when I talk about state vectors.

Now I can state the definition of the complex vector space. It will look a bit abstract at the beginning, but you will soon get used to it, especially when you solve some problems with it.

Definition 1 Given a quadruple (V, C, +, ·), where V is a set of objects (usually called vectors), C denotes the set of complex numbers,
+ denotes the group operation of addition and · denotes the multiplication of a vector with a complex number. (V, C, +, ·) is called a complex vector space if the following properties are satisfied:

1. (V, +) is an Abelian group, which means that
   (a) ∀|a⟩, |b⟩ ∈ V ⇒ |a⟩ + |b⟩ ∈ V. (closure)
   (b) ∀|a⟩, |b⟩, |c⟩ ∈ V ⇒ |a⟩ + (|b⟩ + |c⟩) = (|a⟩ + |b⟩) + |c⟩. (associativity)
   (c) ∃|O⟩ ∈ V so that ∀|a⟩ ∈ V ⇒ |a⟩ + |O⟩ = |a⟩. (zero)
   (d) ∀|a⟩ ∈ V: ∃(−|a⟩) ∈ V so that |a⟩ + (−|a⟩) = |O⟩. (inverse)
   (e) ∀|a⟩, |b⟩ ∈ V ⇒ |a⟩ + |b⟩ = |b⟩ + |a⟩. (Abelian)

2. The scalar multiplication satisfies
   (a) ∀α ∈ C, |x⟩ ∈ V ⇒ α|x⟩ ∈ V (closure)
   (b) ∀|x⟩ ∈ V ⇒ 1 · |x⟩ = |x⟩ (unit)
   (c) ∀c, d ∈ C, |x⟩ ∈ V ⇒ (c · d) · |x⟩ = c · (d · |x⟩) (associativity)
   (d) ∀c, d ∈ C, |x⟩, |y⟩ ∈ V ⇒ c · (|x⟩ + |y⟩) = c · |x⟩ + c · |y⟩ and (c + d) · |x⟩ = c · |x⟩ + d · |x⟩. (distributivity)

This definition looks quite abstract, but a few examples will make it clearer.

Example:

1. A simple proof
   I would like to show how to prove the statement 0 · |x⟩ = |O⟩. This might look trivial, but nevertheless we need to prove it, as it has not been stated as an axiom. From the axioms given in Def. 1 we conclude:

   (1d)  |O⟩ = −|x⟩ + |x⟩
   (2b)      = −|x⟩ + 1 · |x⟩
             = −|x⟩ + (1 + 0) · |x⟩
   (2d)      = −|x⟩ + 1 · |x⟩ + 0 · |x⟩
   (2b)      = −|x⟩ + |x⟩ + 0 · |x⟩
   (1d)      = |O⟩ + 0 · |x⟩
   (1c)      = 0 · |x⟩ .

2. The space C²
   This is the set of two-component vectors of the form

       |a⟩ = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} ,    (1.16)

   where the a_i are complex numbers. Addition and scalar multiplication are defined as

       \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} := \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \end{pmatrix}    (1.17)

       c \cdot \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} := \begin{pmatrix} c \cdot a_1 \\ c \cdot a_2 \end{pmatrix}    (1.18)

   It is now easy to check that V = C² together with the addition and scalar multiplication defined above satisfies the definition of a complex vector space. (You should check this yourself to get some practice with the axioms of a vector space.) The vector space C² is the one that is used for the description of spin-1/2 particles such as electrons.

3. The set of complex-valued functions of one variable, f: R → C
   The group operations are defined as

       (f_1 + f_2)(x) := f_1(x) + f_2(x)
       (c · f)(x) := c · f(x) .

   Again it is easy to check that all the properties of a complex vector space are satisfied.

4. Complex n × n matrices
   The elements of the vector space are

       M = \begin{pmatrix} m_{11} & \ldots & m_{1n} \\ \vdots & \ddots & \vdots \\ m_{n1} & \ldots & m_{nn} \end{pmatrix} ,    (1.19)
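If you want to convince yourself of the axioms of Definition 1 for C² with the operations (1.17), (1.18) without doing the algebra by hand, a numerical spot-check is easy (random test vectors; this is an illustration, not a proof):

```python
import numpy as np

# Spot-check a few of the vector-space axioms of Definition 1 for C^2.
rng = np.random.default_rng(0)
a = rng.normal(size=2) + 1j * rng.normal(size=2)
b = rng.normal(size=2) + 1j * rng.normal(size=2)
c, d = 2.0 - 1.0j, 0.5 + 3.0j

commutative = np.allclose(a + b, b + a)                       # axiom 1(e)
zero = np.allclose(a + np.zeros(2), a)                        # axiom 1(c)
associative = np.allclose((c * d) * a, c * (d * a))           # axiom 2(c)
distributive = np.allclose((c + d) * a, c * a + d * a)        # axiom 2(d)
```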
where the m_{ij} are arbitrary complex numbers. Addition and scalar multiplication are defined entrywise,

    \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix} + \begin{pmatrix} b_{11} & \ldots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{n1} & \ldots & b_{nn} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & \ldots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} + b_{n1} & \ldots & a_{nn} + b_{nn} \end{pmatrix} ,

    c \cdot \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix} = \begin{pmatrix} c \cdot a_{11} & \ldots & c \cdot a_{1n} \\ \vdots & \ddots & \vdots \\ c \cdot a_{n1} & \ldots & c \cdot a_{nn} \end{pmatrix} .

Again it is easy to confirm that the set of complex n × n matrices with the rules that we have defined here forms a vector space. Note that we are used to considering matrices as objects acting on vectors, but as we can see here we can also consider them as elements (vectors) of a vector space themselves.

Why did I make such an abstract definition of a vector space? Well, it may seem a bit tedious, but it has a real advantage. Once we have introduced the abstract notion of a complex vector space, anything we can prove directly from these abstract laws in Definition 1 will hold true for any vector space, irrespective of how complicated it looks superficially. What we have done is to isolate the basic structure of vector spaces without referring to any particular representation of the elements of the vector space. This is very useful, because we do not need to prove the same property again every time we investigate some new objects. All we need to do is to prove that our new objects have an addition and a scalar multiplication that satisfy the conditions stated in Definition 1.

In the following subsections we will continue our exploration of the idea of complex vector spaces and we will learn a few useful properties that will be helpful for the future.

1.2.3 Basis and Dimension

Some of the most basic concepts of vector spaces are those of linear independence, dimension and basis. They will help us to express vectors in terms of other vectors and are useful when we want to define operators on vector spaces which will describe observable quantities.

Quite obviously some vectors can be expressed as linear combinations of others. For example

    \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2 \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} .    (1.20)

It is natural to consider a given set of vectors {|x_1⟩, ..., |x_k⟩} and to ask the question whether a vector in this set can be expressed as a linear combination of the others. Instead of answering this question directly, we will first consider a slightly different question. Given a set of vectors {|x_1⟩, ..., |x_k⟩}, can the null vector |O⟩ be expressed as a linear combination of these vectors? This means that we are looking for a linear combination of the form

    λ_1 |x_1⟩ + ... + λ_k |x_k⟩ = |O⟩ .    (1.21)

Clearly Eq. (1.21) can be satisfied when all the λ_i vanish. But this case is trivial and we would like to exclude it. Now there are two possible cases left:

a) There is no combination of λ_i's, not all of which are zero, that satisfies Eq. (1.21).

b) There are combinations of λ_i's, not all of which are zero, that satisfy Eq. (1.21).

These two situations will get different names and are worth the

Definition 2 A set of vectors {|x_1⟩, ..., |x_k⟩} is called linearly independent if the equation

    λ_1 |x_1⟩ + ... + λ_k |x_k⟩ = |O⟩    (1.22)

has only the trivial solution λ_1 = ... = λ_k = 0.
If there is a nontrivial solution to Eq. (1.22), i.e. at least one λ_i ≠ 0, then we call the vectors {|x_1⟩, ..., |x_k⟩} linearly dependent.

Now we are coming back to our original question as to whether there are vectors in {|x_1⟩, ..., |x_k⟩} that can be expressed by all the other vectors in that set. As a result of this definition we can see the following
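Definition 2 translates directly into a computational test: the vectors are linearly independent iff the matrix with the |x_i⟩ as columns has rank k, i.e. iff Eq. (1.22) has only the trivial solution. A sketch (the function name is ours, chosen for illustration):

```python
import numpy as np

def linearly_independent(*vectors):
    """Vectors are linearly independent iff the matrix with them as columns
    has full column rank, so that Eq. (1.22) has only the trivial solution."""
    X = np.column_stack([np.asarray(v, dtype=complex) for v in vectors])
    return bool(np.linalg.matrix_rank(X) == len(vectors))

# (1, 2)^T, (1, 0)^T and (0, 1)^T are linearly dependent -- Eq. (1.20)
# exhibits the nontrivial relation explicitly.
dep = linearly_independent([1, 2], [1, 0], [0, 1])
ind = linearly_independent([1, 0], [0, 1])
```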
Lemma 3 For a set of linearly independent vectors {|x_1⟩, ..., |x_k⟩}, no |x_i⟩ can be expressed as a linear combination of the other vectors, i.e. one cannot find λ_j that satisfy the equation

    λ_1 |x_1⟩ + ... + λ_{i-1} |x_{i-1}⟩ + λ_{i+1} |x_{i+1}⟩ + ... + λ_k |x_k⟩ = |x_i⟩ .    (1.23)

In a set of linearly dependent vectors {|x_1⟩, ..., |x_k⟩} there is at least one |x_i⟩ that can be expressed as a linear combination of all the other |x_j⟩.

Proof: Exercise! □

Example: The set {|O⟩}, consisting of the null vector only, is linearly dependent.

In a sense that will become clearer when we really talk about quantum mechanics, in a set of linearly independent vectors each vector has some quality that none of the other vectors have.

After we have introduced the notion of linear dependence, we can now proceed to define the dimension of a vector space. I am sure that you have a clear intuitive picture of the notion of dimension. Evidently a plane surface is 2-dimensional and space is 3-dimensional. Why do we say this? Consider a plane, for example. Clearly, every vector in the plane can be expressed as a linear combination of any two linearly independent vectors |e_1⟩, |e_2⟩. As a result you will not be able to find a set of three linearly independent vectors in a plane, while two linearly independent vectors can be found. This is the reason to call a plane a two-dimensional space. Let us formalize this observation in

Definition 4 The dimension of a vector space V is the largest number of linearly independent vectors in V that one can find.

Now we introduce the notion of a basis of a vector space.

Definition 5 A set of vectors {|x_1⟩, ..., |x_k⟩} is called a basis of a vector space V if
a) |x_1⟩, ..., |x_k⟩ are linearly independent.

b) ∀|x⟩ ∈ V: ∃λ_i ∈ C such that |x⟩ = \sum_{i=1}^{k} λ_i |x_i⟩.

Condition b) states that it is possible to write every vector as a linear combination of the basis vectors. The first condition makes sure that the set {|x_1⟩, ..., |x_k⟩} is the smallest possible set to allow for condition b) to be satisfied. It turns out that any basis of an N-dimensional vector space V contains exactly N vectors. Let us illustrate the notion of basis.

Example:

1. Consider the space C² of vectors with two components. Then the two vectors

       |x_1⟩ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}    |x_2⟩ = \begin{pmatrix} 0 \\ 1 \end{pmatrix}    (1.24)

   form a basis of C². A basis for C^N can easily be constructed in the same way.

2. An example of an infinite dimensional vector space is the space of complex polynomials, i.e. the set

       V = {c_0 + c_1 z + ... + c_k z^k | arbitrary k and ∀c_i ∈ C} .    (1.25)

   Two polynomials are equal when they give the same values for all z ∈ C. Addition and scalar multiplication are defined coefficient-wise. It is easy to see that the set {1, z, z², ...} is linearly independent and that it contains infinitely many elements. Together with other examples you will prove (in the problem sheets) that Eq. (1.25) indeed describes a vector space.

End of 2nd lecture

1.2.4 Scalar Products and Norms on Vector Spaces

In the preceding section we have learnt about the concept of a basis. Any set of N linearly independent vectors of an N-dimensional vector
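Condition b) of Definition 5 is constructive: the coefficients λ_i are found by solving a linear system whose columns are the basis vectors. A small illustration in C² (here with real entries for readability; the basis and vector are arbitrary test values):

```python
import numpy as np

# Expand |x> in the basis {|x_1>, |x_2>} by solving B @ lam = x, where the
# columns of B are the basis vectors (condition b) of Definition 5).
x1 = np.array([1.0, 1.0])
x2 = np.array([1.0, -1.0])
x = np.array([3.0, 1.0])

B = np.column_stack([x1, x2])
lam = np.linalg.solve(B, x)               # the coefficients lambda_1, lambda_2

reconstructed = lam[0] * x1 + lam[1] * x2 # recovers |x> exactly
```

Here λ_1 = 2 and λ_2 = 1, since 2·(1, 1) + 1·(1, −1) = (3, 1). The system is solvable for every |x⟩ precisely because the basis vectors are linearly independent.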
space V forms a basis. But not all such choices are equally convenient. To find useful ways to choose a basis, and to find a systematic method for finding the linear combinations of basis vectors that give any arbitrary vector |x⟩ ∈ V, we will now introduce the concept of the scalar product between two vectors. This is not to be confused with scalar multiplication, which deals with a complex number and a vector. The concept of the scalar product then allows us to formulate what we mean by orthogonality. Subsequently we will define the norm of a vector, which is the abstract formulation of what we normally call a length. This will then allow us to introduce orthonormal bases, which are particularly handy.

The scalar product will play an extremely important role in quantum mechanics, as it will in a sense quantify how similar two vectors (quantum states) are. In Fig. 1.6 you can easily see qualitatively that the pairs of vectors become more and more different from left to right. The scalar product puts this into a quantitative form. This is the reason why it can then be used in quantum mechanics to quantify how likely it is for two quantum states to exhibit the same behaviour in an experiment.

Figure 1.6: The two vectors on the left are equal, the next pair is 'almost' equal and the final pair is quite different. The notions of equal and different will be quantified by the scalar product.

To introduce the scalar product we begin with an abstract formulation of the properties that we would like any scalar product to have. Then we will have a look at examples which are of interest for quantum mechanics.

Definition 6 A complex scalar product on a vector space assigns to any two vectors |x⟩, |y⟩ ∈ V a complex number (|x⟩, |y⟩) ∈ C satisfying the following rules:
28 CHAPTER 1. MATHEMATICAL FOUNDATIONS 1. ∀|x , |y , |z ∈ V, αi ∈ C : (|x , α1 |y + α2 |z ) = α1 (|x , |y ) + α2 (|x , |z ) (linearity) 2. ∀|x , |y ∈ V : (|x , |y ) = (|y , |x )∗ (symmetry) 3. ∀|x ∈ V : (|x , |x ) ≥ 0 (positivity) 4. ∀|x ∈ V : (|x , |x ) = 0 ⇔ |x = |O These properties are very much like the ones that you know fromthe ordinary dot product for real vectors, except for property 2 whichwe had to introduce in order to deal with the fact that we are nowusing complex numbers. In fact, you should show as an exercise thatthe ordinary condition (|x , y) = (|y , |x ) would lead to a contradictionwith the other axioms if we have complex vector spaces. Note that weonly deﬁned linearity in the second argument. This is in fact all weneed to do. As an exercise you may prove that any scalar product isanti-linear in the ﬁrst component, i.e.∀|x , |y , |z ∈ V, α ∈ C : (α|x + β|y , |z ) = α∗ (|x , |z ) + β ∗ (|y , |z ) . (1.26)Note that vector spaces on which we have deﬁned a scalar product arealso called unitary vector spaces. To make you more comfortablewith the scalar product, I will now present some examples that playsigniﬁcant roles in quantum mechanics.Examples: 1. The scalar product in Cn . Given two complex vectors |x , |y ∈ Cn with components xi and yi we deﬁne the scalar product n (|x , |y ) = x∗ yi i (1.27) i=1 where ∗ denotes the complex conjugation. It is easy to convince yourself that Eq. (1.27) indeed deﬁnes a scalar product. (Do it!).
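As a quick illustration, Eq. (1.27) is straightforward to implement. The short sketch below is my own (the function name `scalar` is an assumption, not from the notes); it also checks the symmetry axiom and the anti-linearity in the first argument, Eq. (1.26):

```python
# Scalar product on C^n, Eq. (1.27): (x, y) = sum_i x_i^* y_i.
# Note the complex conjugation on the FIRST argument.

def scalar(x, y):
    """Complex scalar product of two equal-length vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

x = [1 + 1j, 2j]
y = [3 + 0j, 1 - 1j]

# Property 2 (symmetry): (x, y) = (y, x)^*
assert scalar(x, y) == scalar(y, x).conjugate()

# Property 3 (positivity): (x, x) is real and non-negative
assert scalar(x, x).imag == 0 and scalar(x, x).real >= 0

# Anti-linearity in the first argument, Eq. (1.26): (a x, y) = a^* (x, y)
assert scalar([2j * xi for xi in x], y) == (-2j) * scalar(x, y)
```

Since all components are Gaussian integers, the assertions hold exactly, without floating-point tolerances.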
1.2. THE QUANTUM MECHANICAL STATE SPACE 29 2. Scalar product on continuous square integrable functions A square integrable function ψ ∈ L2 (R) is one that satisﬁes ∞ |ψ(x)|2 dx < ∞ (1.28) −∞ Eq. (1.28) already implies how to deﬁne the scalar product for these square integrable functions. For any two functions ψ, φ ∈ L2 (R) we deﬁne ∞ (ψ, φ) = ψ(x)∗ φ(x)dx . (1.29) −∞ Again you can check that deﬁnition Eq. (1.29) satisﬁes all prop- erties of a scalar product. (Do it!) We can even deﬁne the scalar product for discontinuous square integrable functions, but then we need to be careful when we are trying to prove property 4 for scalar products. One reason is that there are functions which are nonzero only in isolated points (such functions are discontinuous) and for which Eq. (1.28) vanishes. An example is the function 1 for x = 0 f (x) = 0 anywhere else The solution to this problem lies in a redeﬁnition of the elements of our set. If we identify all functions that diﬀer from each other only in countably many points then we have to say that they are in fact the same element of the set. If we use this redeﬁnition then we can see that also condition 4 of a scalar product is satisﬁed. An extremely important property of the scalar product is the Schwarzinequality which is used in many proofs. In particular I will used itto prove the triangular inequality for the length of a vector and in theproof of the uncertainty principle for arbitrary observables.Theorem 7 (The Schwarz inequality) For any |x , |y ∈ V we have |(|x , |y )|2 ≤ (|x , |x )(|y , |y ) . (1.30)
30 CHAPTER 1. MATHEMATICAL FOUNDATIONSProof: For any complex number α we have0 ≤ (|x + α|y , |x + α|y ) = (|x , |x ) + α(|x , |y ) + α∗ (|y , |x ) + |α|2 (|y , |y ) = (|x , |x ) + 2vRe(|x , |y ) − 2wIm(|x , |y ) + (v 2 + w2 )(|y , |y ) =: f (v, w) (1.31)In the deﬁnition of f (v, w) in the last row we have assumed α = v +iw. To obtain the sharpest possible bound in Eq. (1.30), we need tominimize the right hand side of Eq. (1.31). To this end we calculate ∂f 0 = (v, w) = 2Re(|x , |y ) + 2v(|y , |y ) (1.32) ∂v ∂f 0 = (v, w) = −2Im(|x , |y ) + 2w(|y , |y ) . (1.33) ∂wSolving these equations, we ﬁnd Re(|x , |y ) − iIm(|x , |y ) (|y , |x ) αmin = vmin + iwmin = − =− . (|y , |y ) (|y , |y ) (1.34)Because all the matrix of second derivatives is positive deﬁnite, wereally have a minimum. If we insert this value into Eq. (1.31) weobtain (|y , |x )(|x , |y ) 0 ≤ (|x , |x ) − (1.35) (|y , |y )This implies then Eq. (1.30). Note that we have equality exactly if thetwo vectors are linearly dependent, i.e. if |x = γ|y 2 . Quite a few proofs in books are a little bit shorter than this onebecause they just use Eq. (1.34) and do not justify its origin as it wasdone here. Having deﬁned the scalar product, we are now in a position to deﬁnewhat we mean by orthogonal vectors.Deﬁnition 8 Two vectors |x , |y ∈ V are called orthogonal if (|x , |y ) = 0 . (1.36)We denote with |x ⊥ a vector that is orthogonal to |x .
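The Schwarz inequality is easy to probe numerically. The following sketch is my own illustration (using the C^n scalar product of Eq. (1.27)); it checks Eq. (1.30) for a random pair of vectors and confirms that equality holds for linearly dependent vectors, as noted after the proof:

```python
import random

def scalar(x, y):
    # Scalar product on C^n as in Eq. (1.27).
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

random.seed(0)
x = [complex(random.random(), random.random()) for _ in range(4)]
y = [complex(random.random(), random.random()) for _ in range(4)]

# Eq. (1.30): |(x, y)|^2 <= (x, x)(y, y)
lhs = abs(scalar(x, y)) ** 2
rhs = (scalar(x, x) * scalar(y, y)).real  # both factors are real and >= 0
assert lhs <= rhs

# Equality holds exactly when the vectors are linearly dependent, e.g. z = gamma * x:
z = [(2 - 1j) * xi for xi in x]
assert abs(abs(scalar(x, z)) ** 2 - (scalar(x, x) * scalar(z, z)).real) < 1e-9
```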
1.2. THE QUANTUM MECHANICAL STATE SPACE 31 Now we can deﬁne the concept of an orthogonal basis which will bevery useful in ﬁnding the linear combination of vectors that give |x .Deﬁnition 9 An orthogonal basis of an N dimensional vector spaceV is a set of N linearly independent vectors such that each pair ofvectors are orthogonal to each other.Example: In C the three vectors 0 0 1 0 , 3 , 0 , (1.37) 2 0 0form an orthogonal basis. Planned end of 3rd lecture Now let us chose an orthogonal basis {|x 1 , . . . , |x N } of an N di-mensional vector space. For any arbitrary vector |x ∈ V we would liketo ﬁnd the coeﬃcients λ1 , . . . , λN such that N λi |x i = |x . (1.38) i=1Of course we can obtain the λi by trial and error, but we would like toﬁnd an eﬃcient way to determine the coeﬃcients λi . To see this, letus consider the scalar product between |x and one of the basis vectors|x i . Because of the orthogonality of the basis vectors, we ﬁnd (|x i , |x ) = λi (|x i , |x i ) . (1.39)Note that this result holds true only because we have used an orthogonalbasis. Using Eq. (1.39) in Eq. (1.38), we ﬁnd that for an orthogonalbasis any vector |x can be represented as N (|x i , |x ) |x = |x i . (1.40) i=1 (|x i , |x i )In Eq. (1.40) we have the denominator (|x i , |x i ) which makes theformula a little bit clumsy. This quantity is the square of what we
32 CHAPTER 1. MATHEMATICAL FOUNDATIONSFigure 1.7: A vector |x = a in R2 . From plane geometry we know b √that its length is a2 + b2 which is just the square root of the scalarproduct of the vector with itself.usually call the length of a vector. This idea is illustrated in Fig. 1.7which shows a vector |x = a in the two-dimensional real vector b 2 √space R . Clearly its length is a2 + b2 . What other properties doesthe length in this intuitively clear picture has? If I multiply the vectorby a number α then we have the vector α|x which evidently has the √ √length α2 a2 + α2 b2 = |α| a2 + b2 . Finally we know that we have atriangular inequality. This means that given two vectors |x 1 = a1 b 1and |x 2 = a2 the length of the |x 1 + |x 2 is smaller than the sum b 2of the lengths of |x 1 and |x 2 . This is illustrated in Fig. 1.8. Inthe following I formalize the concept of a length and we will arrive atthe deﬁnition of the norm of a vector |x i . The concept of a norm isimportant if we want to deﬁne what we mean by two vectors being closeto one another. In particular, norms are necessary for the deﬁnition ofconvergence in vector spaces, a concept that I will introduce in thenext subsection. In the following I specify what properties a norm of avector should satisfy.Deﬁnition 10 A norm on a vector space V associates with every |x ∈V a real number |||x ||, with the properties. 1. ∀|x ∈ V : |||x || ≥ 0 and |||x || = 0 ⇔ |x = |O . (positivity) 2. ∀|x ∈ V, α ∈ C : ||α|x || = |α| · |||x ||. (linearity)
Figure 1.8: The triangular inequality illustrated for two vectors |x⟩ and |y⟩. The length of |x⟩ + |y⟩ is smaller than the sum of the lengths of |x⟩ and |y⟩.

 3. ∀|x⟩, |y⟩ ∈ V : |||x⟩ + |y⟩|| ≤ |||x⟩|| + |||y⟩||. (triangular inequality)

 A vector space with a norm defined on it is also called a normed vector space. The three properties in Definition 10 are those that you would intuitively expect to be satisfied by any decent measure of length. As expected, norms and scalar products are closely related. In fact, there is an easy way of generating a norm when you already have a scalar product.

Lemma 11 Given a scalar product on a complex vector space, we can define the norm of a vector |x⟩ by

 |||x⟩|| = √(|x⟩, |x⟩) . (1.41)

Proof:
 • Properties 1 and 2 of the norm follow almost trivially from the four basic conditions of the scalar product. 2
34 CHAPTER 1. MATHEMATICAL FOUNDATIONS • The proof of the triangular inequality uses the Schwarz inequality. |||x + |y ||2 = |(|x + |y , |x + |y )| = |(|x , |x + |y ) + (|y , |x + |y )| ≤ |(|x , |x + |y )| + |(|y , |x + |y )| ≤ |||x || · |||x + |y || + |||y || · |||x + |y || . (1.42) Dividing both sides by |||x + |y || yields the inequality. This assumes that the sum |x + |y = |O . If we have |x + |y = |O then the Schwarz inequality is trivially satisﬁed.2 Lemma 11 shows that any unitary vector space can canonically(this means that there is basically one natural choice) turned into anormed vector space. The converse is, however, not true. Not everynorm gives automatically rise to a scalar product (examples will begiven in the exercises). Using the concept of the norm we can now deﬁne an orthonormalbasis for which Eq. (1.40) can then be simpliﬁed.Deﬁnition 12 An orthonormal basis of an N dimensional vectorspace is a set of N pairwise orthogonal linearly independent vectors{|x 1 , . . . , |x N } where each vector satisﬁes |||x i ||2 = (|x i , |x i ) = 1,i.e. they are unit vectors. For an orthonormal basis and any vector |xwe have N N |x = (|x i , |x )|x i = αi |x i , (1.43) i=1 i=1where the components of |x with respect to the basis {|x 1 , . . . , |x N}are the αi = (|x i , |x ).Remark: Note that in Deﬁnition 12 it was not really necessary to de-mand the linear independence of the vectors {|x 1 , . . . , |x N } becausethis follows from the fact that they are normalized and orthogonal. Tryto prove this as an exercise. Now I have deﬁned what an orthonormal basis is, but you still donot know how to construct it. There are quite a few diﬀerent methods
1.2. THE QUANTUM MECHANICAL STATE SPACE 35to do so. I will present the probably most well-known procedure whichhas the name Gram-Schmidt procedure. This will not be presented in the lecture. You should study this at home.There will be an exercise on this topic in the Rapid Feedback class. The starting point of the Gram-Schmidt orthogonalization proce-dure is a set of linearly independent vectors S = {|x 1 , . . . , |x n }. Nowwe would like to construct from them an orthonormal set of vectors{|e 1 , . . . , |e n }. The procedure goes as followsFirst step We chose |f 1 = |x 1 and then construct from it the nor- malized vector |e 1 = |x 1 /|||x 1 ||. Comment: We can normalize the vector |x 1 because the set S is linearly independent and therefore |x 1 = 0.Second step We now construct |f 2 = |x 2 − (|e 1 , |x 2 )|e 1 and from this the normalized vector |e 2 = |f 2 /|||f 2 ||. Comment: 1) |f 2 = |O because |x 1 and |x 2 are linearly inde- pendent. 2) By taking the scalar product (|e 2 , |e 1 ) we ﬁnd straight away that the two vectors are orthogonal....k-th step We construct the vector k−1 |f k = |x k − (|e i , |x k )|e i . i=1 Because of linear independence of S we have |f k = |O . The normalized vector is then given by |f k |e k = . |||f k || It is easy to check that the vector |e k is orthogonal to all |e i with i < k.n-th step With this step the procedure ﬁnishes. We end up with a set of vectors {|e 1 , . . . , |e n } that are pairwise orthogonal and normalized.
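The steps of the Gram-Schmidt procedure above translate almost line by line into code. Below is a minimal sketch of mine (pure Python, real vectors for simplicity; all names are assumptions, not from the notes):

```python
import math

def scalar(x, y):
    # Real scalar product (dot product).
    return sum(xi * yi for xi, yi in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent real vectors."""
    es = []
    for x in vectors:
        # k-th step: f_k = x_k - sum_i (e_i, x_k) e_i
        f = list(x)
        for e in es:
            c = scalar(e, x)
            f = [fi - c * ei for fi, ei in zip(f, e)]
        # f != 0 is guaranteed by the linear independence of the input.
        norm = math.sqrt(scalar(f, f))
        es.append([fi / norm for fi in f])
    return es

e1, e2 = gram_schmidt([[1.0, 1.0], [1.0, 0.0]])
assert abs(scalar(e1, e2)) < 1e-12          # pairwise orthogonal
assert abs(scalar(e1, e1) - 1.0) < 1e-12    # normalized
```

For complex vectors one would only have to conjugate the first argument in `scalar`, matching Eq. (1.27).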
36 CHAPTER 1. MATHEMATICAL FOUNDATIONS1.2.5 Completeness and Hilbert spacesIn the preceding sections we have encountered a number of basic ideasabout vector spaces. We have introduced scalar products, norms, basesand the idea of dimension. In order to be able to deﬁne a Hilbert space,the state space of quantum mechanics, we require one other concept,that of completeness, which I will introduce in this section. What do we mean by complete? To see this, let us consider se-quences of elements of a vector space (or in fact any set, but we areonly interested in vector spaces). I will write sequences in two diﬀerentways {|x i }i=0,...,∞ ≡ (|x 0 , |x 1 , |x 2 . . .) . (1.44)To deﬁne what we mean by a convergent sequences, we use normsbecause we need to be able to specify when two vectors are close toeach other.Deﬁnition 13 A sequence {|x i }i=0,...,∞ of elements from a normedvector space V converges towards a vector |x ∈ V if for all > 0there is an n0 such that for all n > n0 we have |||x − |x n || ≤ . (1.45) But sometimes you do not know the limiting element, so you wouldlike to ﬁnd some other criterion for convergence without referring tothe limiting element. This idea led to the followingDeﬁnition 14 A sequence {|x i }i=0,...,∞ of elements from a normedvector space V is called a Cauchy sequence if for all > 0 there isan n0 such that for all m, n > n0 we have |||x m − |x n || ≤ . (1.46) Planned end of 4th lecture Now you can wonder whether every Cauchy sequence converges.Well, it sort of does. But unfortunately sometimes the limiting ele-ment does not lie in the set from which you draw the elements of your
1.2. THE QUANTUM MECHANICAL STATE SPACE 37sequence. How can that be? To illustrate this I will present a vectorspace that is not complete! Consider the set V = {|x : only ﬁnitely many components of |x are non-zero} .An example for an element of V is |x = (1, 2, 3, 4, 5, 0, . . .). It is nowquite easy to check that V is a vector-space when you deﬁne additionof two vectors via |x + |y = (x1 + y1 , x2 + y2 , . . .)and the multiplication by a scalar via c|x = (cx1 , cx2 , . . .) .Now I deﬁne a scalar product from which I will then obtain a norm viathe construction of Lemma 11. We deﬁne the scalar product as ∞ (|x , |y ) = x∗ yk . k k=1Now let us consider the series of vectors |x 1 = (1, 0, 0, 0, . . .) 1 |x 2 = (1, , 0, . . .) 2 1 1 |x 3 = (1, , , 0, . . .) 2 4 1 1 1 |x 4 = (1, , , , . . .) 2 4 8 . . . 1 1 |x k = (1, , . . . , k−1 , 0, . . .) 2 2For any n0 we ﬁnd that for m > n > n0 we have 1 1 1 |||x m − |x n || = ||(0, . . . , 0, n , . . . , m−1 , 0, . . .)|| ≤ n−1 . 2 2 2Therefore it is clear that the sequence {|x k }k=1,...∞ is a Cauchy se-quence. However, the limiting vector is not a vector from the vector
38 CHAPTER 1. MATHEMATICAL FOUNDATIONSspace V , because the limiting vector contains inﬁnitely many nonzeroelements. Considering this example let us deﬁne what we mean by a completevector space.Deﬁnition 15 A vector space V is called complete if every Cauchysequence of elements from the vector space V converges towards anelement of V . Now we come to the deﬁnition of Hilbert spaces.Deﬁnition 16 A vector space H is a Hilbert space if it satisﬁes thefollowing two conditions 1. H is a unitary vector space. 2. H is complete. Following our discussions of the vectors spaces, we are now in theposition to formulate the ﬁrst postulate of quantum mechanics.Postulate 1 The state of a quantum system is described by a vectorin a Hilbert space H. Why did we postulate that the quantum mechanical state space isa Hilbert space? Is there a reason for this choice? Let us argue physically. We know that we need to be able to rep-resent superpositions, i.e. we need to have a vector space. From thesuperposition principle we can see that there will be states that are notorthogonal to each other. That means that to some extent one quan-tum state can be ’present’ in another non-orthogonal quantum state –they ’overlap’. The extent to which the states overlap can be quantiﬁedby the scalar product between two vectors. In the ﬁrst section we havealso seen, that the scalar product is useful to compute probabilitiesof measurement outcomes. You know already from your second year
1.2. THE QUANTUM MECHANICAL STATE SPACE 39course that we need to normalize quantum states. This requires thatwe have a norm which can be derived from a scalar product. Becauseof the obvious usefulness of the scalar product, we require that thestate space of quantum mechanics is a vector space equipped with ascalar product. The reason why we demand completeness, can be seenfrom a physical argument which could run as follows. Consider anysequence of physical states that is a Cauchy sequence. Quite obviouslywe would expect this sequence to converge to a physical state. It wouldbe extremely strange if by means of such a sequence we could arriveat an unphysical state. Imagine for example that we change a state bysmaller and smaller amounts and then suddenly we would arrive at anunphysical state. That makes no sense! Therefore it seems reasonableto demand that the physical state space is complete. What we have basically done is to distill the essential features ofquantum mechanics and to ﬁnd a mathematical object that representsthese essential features without any reference to a special physical sys-tem. In the next sections we will continue this programme to formulatemore principles of quantum mechanics.1.2.6 Dirac notationIn the following I will introduce a useful way of writing vectors. Thisnotation, the Dirac notation, applies to any vector space and is veryuseful, in particular it makes life a lot easier in calculations. As mostquantum mechanics books are written in this notation it is quite im-portant that you really learn how to use this way of writing vectors. Ifit appears a bit weird to you in the ﬁrst place you should just practiseits use until you feel conﬁdent with it. A good exercise, for example, isto rewrite in Dirac notation all the results that I have presented so far. So far we have always written a vector in the form |x . The scalarproduct between two vectors has then been written as (|x , |y ). Let usnow make the following identiﬁcation |x ↔ |x . 
(1.47)
We call |x⟩ a ket. So far this is all fine and well; it is just a new notation for a vector. Now we would like to see how to rewrite the
40 CHAPTER 1. MATHEMATICAL FOUNDATIONSscalar product of two vectors. To understand this best, we need to talka bit about linear functions of vectors.Deﬁnition 17 A function f : V → C from a vector space into thecomplex numbers is called linear if for any |ψ , |φ ∈ V and any α, β ∈C we have f (α|ψ + β|φ ) = αf (|ψ ) + βf (|φ ) (1.48) With two linear function f1 , f2 also the linear combination µf1 +νf2is a linear function. Therefore the linear functions themselves form avector space and it is even possible to deﬁne a scalar product betweenlinear functions. The space of the linear function on a vector space Vis called the dual space V ∗ . Now I would like to show you an example of a linear function which Ideﬁne by using the scalar product between vectors. I deﬁne the functionf|φ : V → C, where |φ ∈ V is a ﬁxed vector so that for all |ψ ∈ V f|φ (|ψ ) := (|φ , |ψ ) . (1.49)Now I would like to introduce a new notation for f|φ . From now on Iwill identify f|φ ↔ φ| (1.50)and use this to rewrite the scalar product between two vectors |φ , |ψas φ|ψ := φ|(|ψ ) ≡ (|φ , |ψ ) . (1.51)The object φ| is called bra and the Dirac notation is therefore some-times called braket notation. Note that while the ket is a vector inthe vector space V , the bra is an element of the dual space V ∗ . At the moment you will not really be able to see the usefulnessof this notation. But in the next section when I will introduce linearoperators, you will realize that the Dirac notation makes quite a fewnotations and calculations a lot easier.1.3 Linear OperatorsSo far we have only dealt with the elements (vectors) of vector spaces.Now we need to learn how to transform these vectors, that means how
1.3. LINEAR OPERATORS 41to transform one set of vectors into a diﬀerent set. Again as quantummechanics is a linear theory we will concentrate on the description oflinear operators.1.3.1 Deﬁnition in Dirac notation ˆDeﬁnition 18 An linear operator A : H → H associates to every ˆvector |ψ ∈ H a vector A|ψ ∈ H such that ˆ ˆ ˆ A(λ|ψ + µ|φ ) = λA|ψ + µA|φ (1.52)for all |ψ , |φ ∈ H and λ, µ ∈ C. Planned end of 5th lecture ˆ A linear operator A : H → H can be speciﬁed completely by de-scribing its action on a basis set of H. To see this let us chose anorthonormal basis {|ei |i = 1, . . . , N }. Then we can calculate the ac- ˆtion of A on this basis. We ﬁnd that the basis {|ei |i = 1, . . . , N } ismapped into a new set of vectors {|fi |i = 1, . . . , N } following ˆ |fi := A|ei . (1.53)Of course every vector |fi can be represented as a linear combinationof the basis vectors {|ei |i = 1, . . . , N }, i.e. |fi = Aki |ek . (1.54) kCombining Eqs. (1.53) and (1.54) and taking the scalar product with|ej we ﬁnd Aji = ej | ˆ (Aki |ek ) (1.55) k = ej |fi ≡ ˆ ej |A|ei . (1.56) ˆThe Aji are called the matrix elements of the linear operator A withrespect to the orthonormal basis {|ei |i = 1, . . . , N }.
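The relation A_ji = ⟨e_j|Â|e_i⟩ of Eq. (1.56) can be checked directly with numpy. In the sketch below (the matrix and basis are my own examples) the matrix elements taken in a non-canonical orthonormal basis reproduce the operator's matrix transformed into that basis:

```python
import numpy as np

# Some linear operator on C^2 (an arbitrary example matrix).
A = np.array([[1, 2j], [0, 3]], dtype=complex)

# An orthonormal basis of C^2 that is NOT the canonical one;
# the basis vectors are the columns of U.
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
basis = [U[:, 0], U[:, 1]]

# Matrix elements A_ji = <e_j| A |e_i>, Eq. (1.56).
# np.vdot conjugates its first argument, matching the scalar product.
M = np.array([[np.vdot(ej, A @ ei) for ei in basis] for ej in basis])

# This is exactly the matrix of A expressed in the new basis.
assert np.allclose(M, U.conj().T @ A @ U)
```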
42 CHAPTER 1. MATHEMATICAL FOUNDATIONS I will now go ahead and express linear operators in the Dirac nota-tion. First I will present a particularly simple operator, namely the unit ˆoperator 1, which maps every vector into itself. Surprisingly enoughthis operator, expressed in the Dirac notation will prove to be veryuseful in calculations. To ﬁnd its representation in the Dirac notation,we consider an arbitrary vector |f and express it in an orthonormalbasis {|ei |i = 1, . . . , N }. We ﬁnd N N |f = fj |ej = |ej ej |f . (1.57) j=1 j=1To check that this is correct you just form the scalar product between|f and any of the basis vectors |ei . Now let us rewrite Eq. (1.57) alittle bit, thereby deﬁning the Dirac notation of operators. N N |f = |ej ej |f =: ( |ej ej |)|f (1.58) j=1 j=1Note that the right hand side is deﬁned in terms of the left hand side.The object in the brackets is quite obviously the identity operator be-cause it maps any vector |f into the same vector |f . Therefore it istotally justiﬁed to say that N 1≡ |ej ej | . (1.59) j=1This was quite easy. We just moved some brackets around and wefound a way to represent the unit operator using the Dirac notation.Now you can already guess how the general operator will look like, butI will carefully derive it using the identity operator. Clearly we havethe following identity ˆ ˆ ˆˆ A = 1 A1 . (1.60)Now let us use the Dirac notation of the identity operator in Eq. (1.59)and insert it into Eq. (1.60). We then ﬁnd N N ˆ A = ( ˆ |ej ej |)A( |ek ek |) j=1 k=1
1.3. LINEAR OPERATORS 43 N = ˆ |ej ( ej |A|ek ) ek | jk N = ˆ ( ej |A|ek )|ej ek | jk N = Ajk |ej ek | . (1.61) jkTherefore you can express any linear operator in the Dirac notation,once you know its matrix elements in an orthonormal basis. Matrix elements are quite useful when we want to write down lin-ear operator in matrix form. Given an orthonormal basis {|ei |i =1, . . . , N } we can write every vector as a column of numbers g1 . . |g = gi |ei = . . . (1.62) i gNThen we can write our linear operator in the same basis as a matrix A11 . . . A1N . . ˆ= . ... . . . A . . (1.63) AN 1 . . . AN NTo convince you that the two notation that I have introduced give the ˆsame results in calculations, apply the operator A to any of the basisvectors and then repeat the same operation in the matrix formulation.1.3.2 Adjoint and Hermitean OperatorsOperators that appear in quantum mechanics are linear. But not alllinear operators correspond to quantities that can be measured. Onlya special subclass of operators describe physical observables. In thissubsection I will describe these operators. In the following subsection Iwill then discuss some of their properties which then explain why theseoperators describe measurable quantities. In the previous section we have considered the Dirac notation andin particular we have seen how to write the scalar product and matrix
44 CHAPTER 1. MATHEMATICAL FOUNDATIONSelements of an operator in this notation. Let us reconsider the matrix ˆelements of an operator A in an orthonormal basis {|ei |i = 1, . . . , N }.We have ˆ ˆ ei |(A|ej ) = ( ei |A)|ej (1.64)where we have written the scalar product in two ways. While the left ˆhand side is clear there is now the question, what the bra ei |A on theright hand side means, or better, to which ket it corresponds to. To seethis we need to make the ˆDeﬁnition 19 The adjoint operator A† corresponding to the linear ˆ is the operator such that for all |x , |y we haveoperator A ˆ ˆ (A† |x , |y ) := (|x , A|y ) , (1.65)or using the complex conjugation in Eq. (1.65) we have ˆ ˆ y|A† |x := x|A|y ∗ . (1.66) ˆIn matrix notation, we obtain the matrix representing A† by transposi- ˆtion and complex conjugation of the matrix representing A. (Convinceyourself of this).Example: ˆ 1 2i A= (1.67) i 2and ˆ 1 −i A† = (1.68) −2i 2 The following property of adjoint operators is often used. ˆ ˆLemma 20 For operators A and B we ﬁnd ˆˆ ˆ ˆ (AB)† = B † A† . (1.69)Proof: Eq. (1.69) is proven by ˆˆ (|x , (AB)† |y = ˆˆ ((AB)|x , |y ) = ˆ ˆ (A(B|x ), |y ) Now use Def. 19. = ˆ ˆ (B|x ), A† |y ) Use Def. 19 again. = ˆ ˆ (|x ), B † A† |y )
1.3. LINEAR OPERATORS 45 ˆˆAs this is true for any two vectors |x and |y the two operators (AB)†and Bˆ † A† are equal 2 . ˆ It is quite obvious (see the above example) that in general an op- ˆ ˆerator A and its adjoint operator A† are diﬀerent. However, there areexceptions and these exceptions are very important. ˆDeﬁnition 21 An operator A is called Hermitean or self-adjoint ifit is equal to its adjoint operator, i.e. if for all states |x , |y we have ˆ ˆ y|A|x = x|A|y ∗ . (1.70) In the ﬁnite dimensional case a self-adjoint operator is the same asa Hermitean operator. In the inﬁnite-dimensional case Hermitean andself-adjoint are not equivalent. The diﬀerence between self-adjoint and Hermitean is related to the ˆ ˆdomain of deﬁnition of the operators A and A† which need not be thesame in the inﬁnite dimensional case. In the following I will basicallyalways deal with ﬁnite-dimensional systems and I will therefore usuallyuse the term Hermitean for an operator that is self-adjoint.1.3.3 Eigenvectors, Eigenvalues and the Spectral TheoremHermitean operators have a lot of nice properties and I will explore someof these properties in the following sections. Mostly these propertiesare concerned with the eigenvalues and eigenvectors of the operators.We start with the deﬁnition of eigenvalues and eigenvectors of a linearoperator. ˆDeﬁnition 22 A linear operator A on an N -dimensional Hilbert spaceis said to have an eigenvector |λ with corresponding eigenvalue λ if ˆ A|λ = λ|λ , (1.71)or equivalently ˆ (A − λ1)|λ = 0 . (1.72)
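Definition 19 and Lemma 20 are easy to verify numerically for small matrices. The sketch below is my own illustration, using the example of Eqs. (1.67)-(1.68):

```python
import numpy as np

def adjoint(M):
    # Adjoint = transposition plus complex conjugation (Definition 19).
    return M.conj().T

A = np.array([[1, 2j], [1j, 2]])    # the example of Eq. (1.67)
assert np.allclose(adjoint(A), np.array([[1, -1j], [-2j, 2]]))  # Eq. (1.68)

# Lemma 20: (AB)^dagger = B^dagger A^dagger
B = np.array([[0, 1], [1j, 3]])
assert np.allclose(adjoint(A @ B), adjoint(B) @ adjoint(A))

# A itself is not Hermitean, but A + A^dagger always is (Definition 21).
H = A + adjoint(A)
assert np.allclose(H, adjoint(H))
```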
46 CHAPTER 1. MATHEMATICAL FOUNDATIONS This deﬁnition of eigenvectors and eigenvalues immediately showsus how to determine the eigenvectors of an operator. Because Eq. ˆ(1.72) implies that the N columns of the operator A − λ1 are linearlydependent we need to have that ˆ det(A − λ1)) = 0 . (1.73)This immediately gives us a complex polynomial of degree N . As weknow from analysis, every polynomial of degree N has exactly N solu-tions if one includes the multiplicities of the eigenvalues in the counting.In general eigenvalues and eigenvectors do not have many restriction for ˆan arbitrary A. However, for Hermitean and unitary (will be deﬁnedsoon) operators there are a number of nice results concerning the eigen-values and eigenvectors. (If you do not feel familiar with eigenvectorsand eigenvalues anymore, then a good book is for example: K.F. Riley,M.P. Robinson, and S.J. Bence, Mathematical Methods for Physics andEngineering. We begin with an analysis of Hermitean operators. We ﬁnd ˆLemma 23 For any Hermitean operator A we have ˆ 1. All eigenvalues of A are real. 2. Eigenvectors to diﬀerent eigenvalues are orthogonal.Proof: 1.) Given an eigenvalue λ and the corresponding eigenvector ˆ|λ of a Hermitean operator A. Then we have using the hermiticity ofAˆ ˆ ˆ ˆ λ∗ = λ|A|λ ∗ = λ|A† |λ = λ|A|λ = λ (1.74)which directly implies that λ is real 2 . 2.) Given two eigenvectors |λ , |µ for diﬀerent eigenvalues λ andµ. Then we have ˆ ˆ λ λ|µ = (λ µ|λ )∗ = ( µ|A|λ )∗ = λ|A|µ = µ λ|µ (1.75)As λ and µ are diﬀerent this implies λ|µ = 0. This ﬁnishes the proof 2 .
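Both statements of Lemma 23 can be confirmed for a concrete Hermitean matrix. A small numpy sketch of mine (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2, 1 - 1j], [1 + 1j, 3]])   # Hermitean: A = A^dagger
assert np.allclose(A, A.conj().T)

evals, evecs = np.linalg.eigh(A)           # eigensolver for Hermitean matrices

# 1. All eigenvalues are real (eigh returns them as a real array).
assert np.all(np.isreal(evals))

# 2. Eigenvectors to different eigenvalues are orthogonal.
assert abs(np.vdot(evecs[:, 0], evecs[:, 1])) < 1e-12
```

Here the eigenvalues come out as 1 and 4, both real, and the two eigenvectors are orthogonal as the lemma demands.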
1.3. LINEAR OPERATORS 47 Lemma 23 allows us to formulate the second postulate of quantummechanics. To motivate the second postulate a little bit imagine thatyou try to determine the position of a particle. You would put downa coordinate system and specify the position of the particle as a set ofreal numbers. This is just one example and in fact in any experimentin which we measure a quantum mechanical system, we will always ob-tain a real number as a measurement result. Corresponding to eachoutcome we have a state of the system (the particle sitting in a partic-ular position), and therefore we have a set of real numbers specifyingall possible measurement outcomes and a corresponding set of states.While the representation of the states may depend on the chosen basis,the physical position doesn’t. Therefore we are looking for an object,that gives real numbers independent of the chosen basis. Well, theeigenvalues of a matrix are independent of the chosen basis. Thereforeif we want to describe a physical observable its a good guess to use anoperator and identify the eigenvalues with the measurement outcomes.Of course we need to demand that the operator has only real eigenval-ues. Thus is guaranteed only by Hermitean operators. Therefore weare led to the following postulate.Postulate 2 Observable quantum mechanical quantities are described ˆby Hermitean operators A on the Hilbert space H. The eigenvalues aiof the Hermitean operator are the possible measurement results. Planned end of 6th lecture Now I would like to formulate the spectral theorem for ﬁnite di-mensional Hermitean operators. Any Hermitean operator on an N -dimensional Hilbert space is represented by an N × N matrix. Thecharacteristic polynomial Eq. (1.73) of such a matrix is of N -th orderand therefore possesses N eigenvalues, denoted by λi and correspondingeigenvectors |λi . For the moment, let us assume that the eigenvaluesare all diﬀerent. 
Then we know from Lemma 23 that the corresponding eigenvectors are orthogonal. Therefore we have a set of N pairwise
48 CHAPTER 1. MATHEMATICAL FOUNDATIONSorthogonal vectors in an N -dimensional Hilbert space. Therefore thesevectors form an orthonormal basis. From this we can ﬁnally concludethe following important Completeness theorem: For any Hermitean operator A on a ˆHilbert space H the set of all eigenvectors form an orthonormal basisof the Hilbert space H, i.e. given the eigenvalues λi and the eigenvectors|λi we ﬁnd ˆ A= λi |λi λi | (1.76) iand for any vector |x ∈ H we ﬁnd coeﬃcients xi such that |x = xi |λi . (1.77) i Now let us brieﬂy consider the case for degenerate eigenvalues. Thisis the case, when the characteristic polynomial has multiple zero’s. Inother words, an eigenvalue is degenerate if there is more than oneeigenvector corresponding to it. An eigenvalue λ is said to be d-fold degenerate if there is a set of d linearly independent eigenvec-tors {|λ1 , . . . , |λd } all having the same eigenvalue λ. Quite obviouslythe space of all linear combinations of the vectors {|λ1 , . . . , |λd } isa d-dimensional vector space. Therefore we can ﬁnd an orthonormalbasis of this vector space. This implies that any Hermitean operatorhas eigenvalues λ1 , . . . , λk with degeneracies d(λ1 ), . . . , d(λk ). To eacheigenvectorλi I can ﬁnd an orthonormal basis of d(λi ) vectors. There-fore the above completeness theorem remains true also for Hermiteanoperators with degenerate eigenvalues. Now you might wonder whether every linear operator A on an N ˆdimensional Hilbert space has N linearly independent eigenvectors? Itturns out that this is not true. An example is the 2 × 2 matrix 0 1 0 0which has only one eigenvalue λ = 0. Therefore any eigenvector to thiseigenvalue has to satisfy 0 1 a 0 = . 0 0 b 0
1.3. LINEAR OPERATORS 49which implies that b = 0. But then the only normalized eigenvector is 1 and therefore the set of eigenvectors do not form an orthonormal 0basis. ˆ We have seen that any Hermitean operator A can be expanded in itseigenvectors and eigenvalues. The procedure of ﬁnding this particularrepresentation of an operator is called diagonalization. Often we donot want to work in the basis of the eigenvectors but for example inthe canonical basis of the vectors 0 . . . 1 0 0 . 0 . , . . . , eN = . e1 = . . , . . . , ei = 1 . (1.78) . 0 0 0 . . 1 . 0 ˆIf we want to rewrite the operator A in that basis we need to ﬁnda map between the canonical basis and the orthogonal basis of the ˆeigenvectors of A. If we write the eigenvectors |λi = N αji |ej then j=1this map is given by the unitary operator (we will deﬁne unitaryoperators shortly) N α11 . . . α1N ˆ . . .. . U= |λi ei | = . . . . . (1.79) i=1 αN 1 . . . αN Nwhich obviously maps a vector |ei into the eigenvector corresponding ˆto the eigenvalue |λi . Using this operator U we ﬁnd ˆ ˆˆ U † AU = ˆ ˆ λi U † |λi λi |U = λi |ei ei | . (1.80) i i The operator in Eq. (1.79) maps orthogonal vectors into orthogonalvectors. In fact, it preserves the scalar product between any two vectors.Let us use this as the deﬁning property of a unitary transformation.
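The completeness theorem, Eq. (1.76), and the diagonalization of Eq. (1.80) can be verified together in a few lines. A numpy sketch of mine:

```python
import numpy as np

A = np.array([[2, 1 - 1j], [1 + 1j, 3]])   # a Hermitean example matrix
evals, U = np.linalg.eigh(A)   # columns of U are the orthonormal eigenvectors

# Completeness theorem, Eq. (1.76): A = sum_i lambda_i |lambda_i><lambda_i|
A_rebuilt = sum(l * np.outer(v, v.conj()) for l, v in zip(evals, U.T))
assert np.allclose(A_rebuilt, A)

# Eq. (1.80): U^dagger A U is diagonal with the eigenvalues on the diagonal.
assert np.allclose(U.conj().T @ A @ U, np.diag(evals))
```

Note that `np.outer(v, v.conj())` is precisely the projector |λ_i⟩⟨λ_i| written as a matrix.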
Now, as promised, the accurate definition of a unitary operator. Avoiding the subtleties of infinite-dimensional spaces (which are again problems of the domain of definition of an operator) for the moment, we have the following

Definition 24 A linear operator Û on a Hilbert space H is called unitary if it is defined for all vectors |x⟩, |y⟩ in H, maps the whole Hilbert space into the whole Hilbert space and satisfies

    ⟨x|Û†Û|y⟩ = ⟨x|y⟩ .                                     (1.81)

In fact we can replace the last condition by demanding that the operator satisfies

    Û†Û = 1  and  ÛÛ† = 1 .                                 (1.82)

Eq. (1.81) implies that a unitary operator preserves the scalar product and therefore in particular the norm of vectors as well as the angle between any two vectors.

Now let us briefly investigate the properties of the eigenvalues and eigenvectors of unitary operators. The eigenvalues of a unitary operator are in general not real, but they are not completely arbitrary. We have

Theorem 25 Any unitary operator Û on an N-dimensional Hilbert space H has a complete basis of eigenvectors, and all the eigenvalues are of the form e^{iφ} with real φ.

Proof: I will not give a proof that the eigenvectors of Û form a basis in H; for this you should have a look at a textbook. What I will prove is that the eigenvalues of Û are of the form e^{iφ} with real φ. To see this, we use Eq. (1.82). Let |λ⟩ be an eigenvector of Û to the eigenvalue λ; then

    λÛ†|λ⟩ = Û†Û|λ⟩ = |λ⟩ .                                 (1.83)

This implies that λ ≠ 0, because otherwise the right-hand side would be the null vector, which is never an eigenvector. From Eq. (1.83) we find

    1/λ = ⟨λ|Û†|λ⟩ = ⟨λ|Û|λ⟩* = λ* .                        (1.84)
This results in

    |λ|² = 1  ⇔  λ = e^{iφ} .                               (1.85)  □

1.3.4 Functions of Operators

In the previous sections I have discussed linear operators and special subclasses of these operators such as Hermitean and unitary operators. When we write down the Hamilton operator of a quantum mechanical system we will often encounter functions of operators, such as the one-dimensional potential V(x̂) in which a particle is moving. Therefore it is important to know the definition and some properties of functions of operators. There are two ways of defining functions of an operator: one works particularly well for operators with a complete set of eigenvectors (Definition 26), while the other one works best for functions that can be expanded into power series (Definition 27).

Definition 26 Given an operator Â with eigenvalues a_i and a complete set of eigenvectors |a_i⟩, and further a function f: C → C that maps complex numbers into complex numbers, we define

    f(Â) := Σ_{i=1}^N f(a_i)|a_i⟩⟨a_i| .                    (1.86)

Definition 27 Given a function f: C → C that can be expanded into a power series

    f(z) = Σ_{i=0}^∞ f_i z^i ,                              (1.87)

then we define

    f(Â) = Σ_{i=0}^∞ f_i Â^i .                              (1.88)

Definition 28 The derivative of an operator function f(Â) is defined via g(z) = df/dz(z) as

    df(Â)/dÂ = g(Â) .                                       (1.89)
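As a quick numerical illustration that Definitions 26 and 27 produce the same operator for a Hermitean Â, here is a sketch computing f(Â) = e^Â once via the spectral decomposition and once via a truncated power series (the matrix and the truncation order are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3))
A = (M + M.T) / 2                      # real symmetric, hence Hermitean

# Definition 26: f(A) = sum_i f(a_i)|a_i><a_i|, here for f(z) = exp(z)
evals, evecs = np.linalg.eigh(A)
f_spec = (evecs * np.exp(evals)) @ evecs.T

# Definition 27: f(A) = sum_k f_k A^k, truncated power series of exp
f_series = np.zeros_like(A)
term = np.eye(3)                       # term holds A^(k-1)/(k-1)! in iteration k
for k in range(1, 40):
    f_series += term
    term = term @ A / k

assert np.allclose(f_spec, f_series)   # the two definitions agree
```

Forty terms of the series are far more than enough here, since the factorial in the denominator eventually dominates any fixed power of the matrix norm.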
Let us see whether the two definitions Def. 26 and Def. 27 coincide for operators with a complete set of eigenvectors and functions that can be expanded into the power series given in Eq. (1.87):

    f(Â) = Σ_{k=0}^∞ f_k Â^k
         = Σ_{k=0}^∞ f_k ( Σ_{j=1}^N a_j |a_j⟩⟨a_j| )^k
         = Σ_{k=0}^∞ Σ_{j=1}^N f_k a_j^k |a_j⟩⟨a_j|
         = Σ_{j=1}^N ( Σ_{k=0}^∞ f_k a_j^k ) |a_j⟩⟨a_j|
         = Σ_{j=1}^N f(a_j)|a_j⟩⟨a_j| .                     (1.90)

For operators that do not have a complete orthonormal basis of eigenvectors it is not possible to use Definition 26 and we have to go back to Definition 27. In practice this is not really a problem in quantum mechanics, because we will always encounter operators that have a complete set of eigenvectors.

As an example consider a Hermitean operator Â with eigenvalues a_k and eigenvectors |a_k⟩ and compute Û = e^{iÂ}. We find

    Û = e^{iÂ} = Σ_{k=1}^N e^{ia_k} |a_k⟩⟨a_k| .            (1.91)

This is an operator which has eigenvalues of the form e^{ia_k} with real a_k. Therefore it is a unitary operator, which you can also check directly from the requirement ÛÛ† = 1 = Û†Û. In fact it is possible to show that every unitary operator can be written in the form of Eq. (1.91). This is captured in the

Lemma 29 To any unitary operator Û there is a Hermitean operator Ĥ such that

    Û = e^{iĤ} .                                            (1.92)
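Eq. (1.91) and Theorem 25 can be checked numerically: exponentiating i times a Hermitean matrix gives a unitary matrix, and all its eigenvalues lie on the unit circle. A sketch (the matrix is a random choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (M + M.conj().T) / 2                    # Hermitean

# U = e^{iA} via the spectral decomposition, Eq. (1.91)
a, V = np.linalg.eigh(A)
U = (V * np.exp(1j * a)) @ V.conj().T

# U is unitary, Eq. (1.82) ...
assert np.allclose(U.conj().T @ U, np.eye(3))
assert np.allclose(U @ U.conj().T, np.eye(3))

# ... and its eigenvalues are of the form e^{i a_k}, i.e. have modulus 1
assert np.allclose(np.abs(np.linalg.eigvals(U)), 1.0)
```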
Exercise:
1) Show that for any unitary operator Û we have f(Û†ÂÛ) = Û†f(Â)Û.

Proof: We use the fact that ÛÛ† = 1 to find

    f(Û†ÂÛ) = Σ_{k=0}^∞ f_k (Û†ÂÛ)^k
             = Σ_{k=0}^∞ f_k Û†Â^kÛ
             = Û† ( Σ_{k=0}^∞ f_k Â^k ) Û
             = Û†f(Â)Û .

If you have functions, then you will also expect to encounter derivatives of functions. Therefore we have to consider how to take derivatives of matrices. To take a derivative we need not just one operator, but a family of operators that is parametrized by a real parameter s. An example is the set of operators of the form

    Â(s) = ( 1+s    i·s )
           ( −i·s   1−s ) .                                 (1.93)

Another example which is familiar to you is the time evolution operator e^{−iĤs}. Now we can define the derivative with respect to s in complete analogy to the derivative of a scalar function by

    dÂ/ds (s) := lim_{Δs→0} [ Â(s+Δs) − Â(s) ] / Δs .       (1.94)

This means that we have defined the derivative of an operator component-wise.

Now let us explore the properties of the derivative of operators. First let us see what the derivative of the product of two operators is. We find

Property: For any two linear operators Â(s) and B̂(s) we have

    d(ÂB̂)/ds (s) = (dÂ/ds)(s) B̂(s) + Â(s) (dB̂/ds)(s) .     (1.95)
This looks quite a lot like the product rule for ordinary functions, except that now the order of the operators plays a crucial role.

Planned end of 7th lecture

You can also have functions of operators that depend on more than one variable. A very important example is the commutator of two operators.

Definition 30 For two operators Â and B̂ the commutator [Â, B̂] is defined as

    [Â, B̂] = ÂB̂ − B̂Â .                                     (1.96)

While the commutator between numbers (1×1 matrices) is always zero, this is not the case for general matrices. In fact, you already know a number of examples from your second-year lecture in quantum mechanics. For example the operators corresponding to momentum and position do not commute, i.e. their commutator is nonzero. Other examples are the Pauli spin operators

    σ_0 = ( 1  0 )     σ_1 = ( 0  1 )
          ( 0  1 )           ( 1  0 )

    σ_2 = ( 0 −i )     σ_3 = ( 1  0 )                       (1.97)
          ( i  0 )           ( 0 −1 )

For i, j = 1, 2, 3 they have the commutation relations

    [σ_i, σ_j] = 2i Σ_k ε_{ijk} σ_k ,                       (1.98)

where ε_{ijk} is the completely antisymmetric tensor. It is defined by ε_{123} = 1 and changes sign when two indices are interchanged, for example ε_{ijk} = −ε_{jik}.

There are some commutator relations that are quite useful to know.

Lemma 31 For arbitrary linear operators Â, B̂, Ĉ on the same Hilbert space we have

    [ÂB̂, Ĉ] = Â[B̂, Ĉ] + [Â, Ĉ]B̂                            (1.99)
    0 = [Â, [B̂, Ĉ]] + [B̂, [Ĉ, Â]] + [Ĉ, [Â, B̂]] .          (1.100)
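Before turning to the proof, the Pauli relations (1.98) and the Jacobi identity (1.100) can be spot-checked numerically; a minimal sketch:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(a, b):
    return a @ b - b @ a

# [sigma_1, sigma_2] = 2i sigma_3 and cyclic permutations, Eq. (1.98)
assert np.allclose(comm(s1, s2), 2j * s3)
assert np.allclose(comm(s2, s3), 2j * s1)
assert np.allclose(comm(s3, s1), 2j * s2)

# Jacobi identity, Eq. (1.100), for a random triple of matrices
rng = np.random.default_rng(4)
A, B, C = (rng.normal(size=(2, 2)) for _ in range(3))
J = comm(A, comm(B, C)) + comm(B, comm(C, A)) + comm(C, comm(A, B))
assert np.allclose(J, 0)
```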
Proof: By direct inspection of the equations. □

Commuting observables have many useful and important properties. Of particular significance for quantum physics is the following Lemma 32, because it guarantees that two commuting observables can be measured simultaneously with no uncertainty.

Lemma 32 Two commuting observables Â and B̂ have the same eigenvectors, i.e. they can be diagonalized simultaneously.

Proof: For simplicity we assume that both observables have only non-degenerate eigenvalues. Now choose a basis of eigenvectors {|a_i⟩} that diagonalizes Â, and let us see whether the |a_i⟩ are also eigenvectors of B̂. Using [Â, B̂] = 0 we have

    Â(B̂|a_i⟩) = B̂Â|a_i⟩ = a_i(B̂|a_i⟩) .                    (1.101)

This implies that B̂|a_i⟩ is an eigenvector of Â with eigenvalue a_i. As the eigenvalue is non-degenerate we must have

    B̂|a_i⟩ = b_i|a_i⟩                                       (1.102)

for some b_i. Therefore |a_i⟩ is an eigenvector of B̂. □

There is a very nice relation between the commutator and the derivative of a function. First I define the derivative of an operator function with respect to the operator.

Lemma 33 Given two linear operators Â and B̂ which have the commutator [B̂, Â] = 1, then for the derivative of an operator function f(Â) we find

    [B̂, f(Â)] = df/dÂ (Â) .                                 (1.103)

Proof: Remember that a function of an operator is defined via its expansion into a power series, see Eq. (1.88). Therefore we find

    [B̂, f(Â)] = [B̂, Σ_k f_k Â^k] = Σ_k f_k [B̂, Â^k] .
Now we need to evaluate the expression [B̂, Â^k]. We prove by induction that [B̂, Â^n] = nÂ^{n−1}. For n = 1 this is true. Assume that the assumption is true for n; now start with n+1 and reduce it to the case for n. Using Eq. (1.99) we find

    [B̂, Â^{n+1}] ≡ [B̂, Â^n Â] = [B̂, Â^n]Â + Â^n[B̂, Â] .    (1.104)

Now using [B̂, Â] = 1 and the induction assumption we find

    [B̂, Â^{n+1}] = nÂ^{n−1}Â + Â^n = (n+1)Â^n .             (1.105)

Now we can conclude

    [B̂, f(Â)] = Σ_k f_k [B̂, Â^k] = Σ_k f_k k Â^{k−1} = df/dÂ (Â) .

This finishes the proof. □

A very useful property is

Lemma 34 (Baker–Campbell–Hausdorff) For general operators Â and B̂ we have

    e^{B̂} Â e^{−B̂} = Â + [B̂, Â] + (1/2!)[B̂, [B̂, Â]] + ... .   (1.106)

For operators such that [B̂, [B̂, Â]] = 0 we have the simpler version

    e^{B̂} Â e^{−B̂} = Â + [B̂, Â] .                          (1.107)

Proof: Define the function of one real parameter α

    f(α) = e^{αB̂} Â e^{−αB̂} .                              (1.108)

We can expand this function around α = 0 into a Taylor series f(α) = Σ_{n=0}^∞ (α^n/n!) (d^n f/dα^n)(α)|_{α=0}, and therefore we need to determine the derivatives
of the function f(α). We find

    df/dα (α)|_{α=0} = [B̂, Â]
    d²f/dα² (α)|_{α=0} = [B̂, [B̂, Â]]
    ...

The rest of the proof follows by induction. The proof of Eq. (1.107) follows directly from Eq. (1.106). □

1.4 Operators with continuous spectrum

In the preceding sections I have explained the basic ideas of Hilbert spaces and linear operators. All the objects I have been dealing with so far have been finite dimensional, an assumption that greatly simplified the analysis and made the new ideas more transparent. However, in quantum mechanics many systems are actually described by an infinite-dimensional state space, and the corresponding linear operators are infinite dimensional too. Most results from the preceding sections hold true also for the infinite-dimensional case, so that we do not need to learn too many new things. Nevertheless, there are properties that require some discussion.

1.4.1 The position operator

The most natural place where an infinite-dimensional state space appears is in the analysis of a particle in free space. Therefore let us briefly reconsider some aspects of wave mechanics as you have learnt them in the second-year course. The state of a particle in free space (maybe moving in a potential) is described by the square-integrable wave-function ψ(x). The question is now how we connect the wave-function notation with the Dirac notation which we have used to develop our theory of Hilbert spaces and linear operators.

Let us remember what the Dirac notation for finite-dimensional systems means mathematically. Given the ket vector |φ⟩ for the state
of a finite-dimensional system, we can find the components of this ket vector with respect to a certain basis {|e_i⟩}. The i-th component is given by the complex number ⟨e_i|φ⟩. Therefore in a particular basis it makes sense to write the state |φ⟩ as a column vector

          ( ⟨e_1|φ⟩ )
    |φ⟩ ↔ (   ...   )                                       (1.109)
          ( ⟨e_n|φ⟩ )

Let us try to transfer this idea to the wave-function of a particle in free space. What we will do is to interpret ψ(x) as the component of a vector with infinitely many components. Informally written this means

          (  ...  )
    |ψ⟩ ↔ ( ψ(x) ) ,                                        (1.110)
          (  ...  )

where we have given the column vector a name, |ψ⟩. Obviously the set of vectors defined in this way forms a vector space, as you can easily check. Of course we would like to have a scalar product on this vector space. This is introduced in complete analogy to the finite-dimensional case. There we had

    (|φ⟩, |ψ⟩) = ⟨φ|ψ⟩ = Σ_i ⟨φ|e_i⟩⟨e_i|ψ⟩ .               (1.111)

We just replace the summation over products of components of the two vectors by an integration. We have (see also Eq. (1.29))

    (|φ⟩, |ψ⟩) := ∫_{−∞}^{∞} dx φ*(x)ψ(x) .                 (1.112)

Now we have a space of vectors (or square-integrable wave functions) H equipped with a scalar product. Indeed it turns out to be a complete space (without proof), so that we have a Hilbert space.

Now that we have introduced ket vectors we also need to define bra vectors. In the finite-dimensional case we obtained the bra vector via the idea of linear functionals on the space of state vectors. Let us
repeat this procedure now for the case of infinite dimensions. We define a linear functional ⟨φ| by

    ⟨φ|(|ψ⟩) ≡ ⟨φ|ψ⟩ = (|φ⟩, |ψ⟩) .                          (1.113)

Now I would like to investigate a particular bra vector (linear functional) that will allow us to define a position state. We define a linear functional ⟨x_0| by

    ⟨x_0|(|ψ⟩) ≡ ⟨x_0|ψ⟩ := ψ(x_0) .                        (1.114)

We are already writing this functional very suggestively as a bra vector. That's perfectly OK, and we just have to check that the functional so defined is indeed linear. Of course we would like to interpret the left-hand side of Eq. (1.114) as a scalar product between two kets, i.e.

    (|x_0⟩, |ψ⟩) := ⟨x_0|ψ⟩ = ψ(x_0) .                      (1.115)

What does the ket |x_0⟩ corresponding to the bra ⟨x_0| mean? Which wave-function δ*_{x_0}(x) does it correspond to? Using the scalar product Eq. (1.112), we have

    ∫_{−∞}^{∞} dx δ*_{x_0}(x)ψ(x) = (|x_0⟩, |ψ⟩) = ⟨x_0|ψ⟩ = ψ(x_0) .   (1.116)

This means that the function δ*_{x_0}(x) has to act like a delta-function! The wave-function corresponding to the bra ⟨x_0| is a delta-function. A delta-function, however, is not square-integrable! Therefore it cannot be an element of the Hilbert space of square-integrable functions. However, as we have seen, it would be quite convenient to use these wave-functions or states. Therefore we just add them to our Hilbert space, although we will often call them improper states or wave-functions. In fact we can use basically all the rules that we have learned about finite-dimensional Hilbert spaces also for these improper states. All we need to demand is the following rule for the scalar product

    ⟨ψ|x_0⟩ := (⟨x_0|ψ⟩)* = ψ*(x_0) .                       (1.117)

Now I can write for arbitrary kets |φ⟩, |ψ⟩ ∈ H

    ⟨φ|ψ⟩ = ∫ φ*(x)ψ(x) dx = ∫ ⟨φ|x⟩⟨x|ψ⟩ dx = ⟨φ| ( ∫ |x⟩⟨x| dx ) |ψ⟩ .   (1.118)
Then we can conclude

    ∫ |x⟩⟨x| dx = 1 .                                       (1.119)

Inserting this identity operator in ⟨x|ψ⟩, we obtain the orthogonality relation between position kets:

    ∫ δ(x − x′)ψ(x′) dx′ = ψ(x) = ⟨x|ψ⟩ = ∫ ⟨x|x′⟩⟨x′|ψ⟩ dx′ = ∫ ⟨x|x′⟩ψ(x′) dx′ .

Therefore we have

    ⟨x|x′⟩ = δ(x − x′) .                                    (1.120)

Now we can derive the form of the position operator from our knowledge of the definition of the position expectation value

    ⟨ψ|x̂|ψ⟩ := ∫ x|ψ(x)|² dx = ∫ ⟨ψ|x⟩ x ⟨x|ψ⟩ dx = ⟨ψ| ( ∫ x|x⟩⟨x| dx ) |ψ⟩ ,   (1.121)

where we defined the position operator

    x̂ = ∫ x|x⟩⟨x| dx = x̂† .                                 (1.122)

Now you see why the improper position kets are so useful. In this basis the position operator is automatically diagonal. The improper position kets |x_0⟩ are eigenvectors of the position operator,

    x̂|x_0⟩ = x_0|x_0⟩ .                                     (1.123)

This makes sense, as the position kets describe a particle that is perfectly localized at position x_0. Therefore a position measurement should always give the result x_0.

So far we have dealt with one-dimensional systems. All of the above considerations can be generalized to the d-dimensional case by setting

    x̂ = x̂_1 e_1 + ... + x̂_d e_d .                           (1.124)

The different components of the position operator are assumed to commute.

Planned end of 8th lecture
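A crude finite-grid caricature of this section can be written down in a few lines: on a grid the improper kets |x_i⟩ become ordinary basis vectors, the integral in Eq. (1.112) becomes a Riemann sum, and the position operator of Eq. (1.122) is a diagonal matrix. Grid size, box length and the Gaussian wave-function below are all illustrative assumptions:

```python
import numpy as np

L, N = 10.0, 200
x, dx = np.linspace(-L / 2, L / 2, N, retstep=True)

# A normalized Gaussian wave-function psi(x); the Riemann sum stands in
# for the integral of the scalar product, Eq. (1.112)
psi = (2 / np.pi) ** 0.25 * np.exp(-x ** 2)
assert np.isclose(np.sum(np.abs(psi) ** 2) * dx, 1.0, atol=1e-6)

# The position operator is diagonal in this basis, cf. Eq. (1.122)
X = np.diag(x)

# <psi|x|psi>, Eq. (1.121): zero for this even wave-function on a symmetric grid
expect_x = (psi.conj() @ X @ psi) * dx
assert abs(expect_x) < 1e-10
```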
1.4.2 The momentum operator

Now we are going to introduce the momentum operator and momentum eigenstates using the idea of linear functionals, in a similar fashion to the way in which we introduced the position operator. Let us introduce the linear functional ⟨p| defined by

    ⟨p|ψ⟩ := (1/√(2πħ)) ∫ e^{−ipx/ħ} ψ(x) dx .              (1.125)

Now we define the corresponding ket by

    ⟨p|ψ⟩* = ψ̃*(p) =: ⟨ψ|p⟩ .                               (1.126)

Combining Eq. (1.125) with the identity operator as represented in Eq. (1.119) we find

    (1/√(2πħ)) ∫ e^{−ipx/ħ} ψ(x) dx = ⟨p|ψ⟩ = ∫ ⟨p|x⟩⟨x|ψ⟩ dx .   (1.127)

Therefore we find that the state vector |p⟩ represents a plane wave, because

    ⟨x|p⟩ = (1/√(2πħ)) e^{ipx/ħ} .                          (1.128)

As Eq. (1.128) represents a plane wave with momentum p, it makes sense to call |p⟩ a momentum state and to expect that it is an eigenvector of the momentum operator p̂ to the eigenvalue p. Before we define the momentum operator, let us find the decomposition of the identity operator using momentum eigenstates. To see this we need to remember from the theory of delta-functions that

    (1/2πħ) ∫ e^{ip(x−y)/ħ} dp = δ(x − y) .                 (1.129)

Then we have for arbitrary |x⟩ and |y⟩

    ⟨x|y⟩ = δ(x − y) = (1/2πħ) ∫ dp e^{ip(x−y)/ħ} = ⟨x| ( ∫ |p⟩⟨p| dp ) |y⟩ ,   (1.130)

and therefore

    ∫ |p⟩⟨p| dp = 1 .                                       (1.131)
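Eq. (1.125) is just a Fourier transform, and the completeness relation (1.131) guarantees that no norm is lost in passing to the momentum representation. A discretized sketch using an FFT (ħ = 1, and all grid parameters are illustrative assumptions):

```python
import numpy as np

L, N = 40.0, 1024
x, dx = np.linspace(-L / 2, L / 2, N, endpoint=False, retstep=True)

psi_x = (2 / np.pi) ** 0.25 * np.exp(-(x - 1.0) ** 2)   # a normalized Gaussian
assert np.isclose(np.sum(np.abs(psi_x) ** 2) * dx, 1.0)

# Discrete stand-in for <p|psi> of Eq. (1.125)
psi_p = np.fft.fft(psi_x, norm="ortho")

# Completeness of the momentum kets: the norm is the same in both representations
assert np.isclose(np.sum(np.abs(psi_p) ** 2) * dx, 1.0)
```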
The orthogonality relation between different momentum kets can be found by using Eq. (1.131) in Eq. (1.125):

    ⟨p|ψ⟩ = ⟨p|1|ψ⟩ = ∫ ⟨p|p′⟩⟨p′|ψ⟩ dp′ ,                  (1.132)

so that

    ⟨p|p′⟩ = δ(p − p′) .                                    (1.133)

The momentum operator p̂ is the operator that has as its eigenvectors the momentum eigenstates |p⟩ with the corresponding eigenvalue p. This makes sense, because |p⟩ describes a plane wave which has a perfectly defined momentum. Therefore we know the spectral decomposition, which is

    p̂ = ∫ p|p⟩⟨p| dp .                                      (1.134)

Clearly we have

    p̂|p_0⟩ = p_0|p_0⟩ .                                     (1.135)

Analogously to the position operator we can extend the momentum operator to the d-dimensional space by

    p̂ = p̂_1 e_1 + ... + p̂_d e_d .                           (1.136)

The different components of the momentum operator are assumed to commute.

1.4.3 The position representation of the momentum operator and the commutator between position and momentum

We have seen how to express the position operator in the basis of the improper position kets and the momentum operator in the basis of the improper momentum kets. Now I would like to see how the momentum operator looks in the position basis. To see this, differentiate Eq. (1.128) with respect to x, which gives

    (ħ/i)(∂/∂x)⟨x|p⟩ = (ħ/i)(∂/∂x) (1/√(2πħ)) e^{ipx/ħ} = p ⟨x|p⟩ .   (1.137)
Therefore we find

    ⟨x|p̂|ψ⟩ = ∫ ⟨x|p⟩ p ⟨p|ψ⟩ dp
            = ∫ (ħ/i)(∂/∂x)⟨x|p⟩ ⟨p|ψ⟩ dp
            = (ħ/i)(∂/∂x) ⟨x| ( ∫ |p⟩⟨p| dp ) |ψ⟩
            = (ħ/i)(∂/∂x) ⟨x|ψ⟩ .                           (1.138)

In the position representation the momentum operator acts like the differential operator, i.e.

    p̂ ⟷ (ħ/i)(∂/∂x) .                                      (1.139)

Knowing this we are now able to derive the commutation relation between the momentum and position operators. We find

    ⟨x|[x̂, p̂]|ψ⟩ = ⟨x|(x̂p̂ − p̂x̂)|ψ⟩
                 = (ħ/i) [ x (∂/∂x)⟨x|ψ⟩ − (∂/∂x)(x⟨x|ψ⟩) ]
                 = iħ ⟨x|ψ⟩
                 = ⟨x|iħ1|ψ⟩ .

Therefore we have the Heisenberg commutation relation

    [x̂, p̂] = iħ1 .                                          (1.140)
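No finite-dimensional pair of matrices can satisfy Eq. (1.140) exactly (the trace of any commutator vanishes, while tr(iħ1) ≠ 0), but the familiar harmonic-oscillator ladder matrices reproduce it on all but the last basis state. A sketch with ħ = 1; the dimension and the ladder-operator construction are illustrative assumptions, not part of the notes:

```python
import numpy as np

N = 8                                   # truncation dimension (hbar = 1)
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # annihilation operator
x = (a + a.T) / np.sqrt(2)              # x = (a + a†)/sqrt(2)
p = (a - a.T) / (1j * np.sqrt(2))       # p = (a - a†)/(i sqrt(2))

C = x @ p - p @ x
# [x, p] = i on every basis state except the last one, where the truncation
# cuts the ladder off (the vanishing trace of C forces this defect).
assert np.allclose(C[:-1, :-1], 1j * np.eye(N - 1))
assert np.isclose(C[-1, -1], -1j * (N - 1))
```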
Chapter 2

Quantum Measurements

So far we have formulated two postulates of quantum mechanics. The second of these states that the measurement results of an observable are the eigenvalues of the corresponding Hermitean operator. However, we do not yet know how to determine the probability with which a given measurement result is obtained, nor have we discussed the state of the system after the measurement. This is the subject of this section.

2.1 The projection postulate

We have learned that the possible outcomes in a measurement of a physical observable are the eigenvalues of the corresponding Hermitean operator. Now I would like to discuss what the state of the system after such a measurement is. Let us be guided by common sense. Certainly we would expect that if we repeat the measurement of the same observable, we should find the same result. This is in contrast to the situation before the first measurement. There we certainly did not expect that a particular measurement outcome would appear with certainty. Therefore we expect that the state of the system after the first measurement has changed as compared to the state before the first measurement. What is the most natural state that corresponds to the eigenvalue a_i of a Hermitean operator Â? Clearly this is the corresponding eigenvector, in which the observable has the definite value a_i! Therefore it is quite natural to make the following postulate for
observables with non-degenerate eigenvalues.

Postulate 3 (a) The state of a quantum mechanical system after the measurement of observable Â, with the result being the non-degenerate eigenvalue a_i, is given by the corresponding eigenvector |a_i⟩.

For observables with degenerate eigenvalues we have a problem. Which of the eigenvectors should we choose? Does it make sense to choose one particular eigenvector? To see what the right answer is, we have to make clear that we are really only measuring the observable Â. We do not obtain any other information about the measured quantum mechanical system. Therefore it certainly does not make sense to prefer one eigenvector over another. In fact, if I assumed that we have to pick out one particular eigenvector, then I would effectively assume that some more information is available. Somehow we would have to be able to decide which eigenvector has to be chosen. Such information can only come from the measurement of another observable, a measurement we haven't actually performed. This is therefore not an option. On the other hand we could say that we choose one of the eigenvectors at random. Again this is not really an option, as it would amount to saying that someone chooses an eigenvector but does not reveal to us which one he has chosen. It is quite important to realize that (a) not having some information and (b) having information but then choosing to forget it are two quite different situations.

All these problems imply that we have to keep all the eigenvectors present. If we want to solve this problem, we need to introduce a new type of operator: the projection operator.

Definition 35 An operator P̂ is called a projection operator if it satisfies

    1. P̂ = P̂† ,
    2. P̂ = P̂² .
Some examples of projection operators are:

    1. P̂ = |ψ⟩⟨ψ| for a normalized state |ψ⟩.

    2. If P̂ is a projection operator, then 1 − P̂ is also a projection operator.

    3. P̂ = 1.

Exercise: Prove that the three examples above are projection operators!

Lemma 36 The eigenvalues of a projection operator can only take the values 0 or 1.

Proof: For any eigenvector |λ⟩ of P̂ with eigenvalue λ we have

    λ|λ⟩ = P̂|λ⟩ = P̂²|λ⟩ = λ²|λ⟩ .                           (2.1)

From this we immediately obtain λ = 0 or λ = 1. □

For a set of orthonormal vectors {|ψ_i⟩}_{i=1,...,N} the projection operator P̂ = Σ_{i=1}^{k} |ψ_i⟩⟨ψ_i| projects a state |ψ⟩ onto the subspace spanned by the vectors {|ψ_i⟩}_{i=1,...,k}. In mathematical terms: if we expand |ψ⟩ = Σ_{i=1}^{N} α_i|ψ_i⟩, then

    P̂|ψ⟩ = Σ_{i=1}^{k} α_i|ψ_i⟩ .                           (2.2)

This is exactly what we need for a more general formulation of the third postulate of quantum mechanics. We formulate the third postulate again, but now for observables with degenerate eigenvalues.
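A minimal numerical sketch of Definition 35 and Lemma 36, using a rank-2 projector in C³ built from two orthonormal vectors (an arbitrary choice for illustration):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)

P = np.outer(v1, v1) + np.outer(v2, v2)

assert np.allclose(P, P.conj().T)        # P = P†  (Definition 35, condition 1)
assert np.allclose(P, P @ P)             # P = P²  (condition 2)

# Eigenvalues are 0 and 1 only (Lemma 36)
assert np.allclose(np.linalg.eigvalsh(P), [0.0, 1.0, 1.0])

# 1 - P is again a projector (example 2 above)
Q = np.eye(3) - P
assert np.allclose(Q, Q @ Q)
```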
Postulate 3 (b) The state of a quantum mechanical system after the measurement of a general observable Â, with the result being the possibly degenerate eigenvalue a_i, is given by

    P̂_i|ψ⟩ ,                                               (2.3)

where P̂_i is the projection operator onto the subspace of H spanned by all the eigenvectors of Â with eigenvalue a_i, i.e.

    P̂_i = Σ_j |ψ_j⟩⟨ψ_j|                                    (2.4)

for an orthonormal basis {|ψ_j⟩} of that eigenspace.

Experiments have shown that this postulate describes the results of measurements extremely well, and we therefore have to accept it at least as a good working assumption. I say this because there are quite a few people around who do not like this postulate very much. They ask the obvious question of how and when the reduction of the quantum state exactly takes place. In fact, we have not answered any of that in this discussion, and I will not do so, for a simple reason: nobody really knows the answer to these questions, and they have been a subject of debate for a long time. People have come up with new interpretations of quantum mechanics (the most prominent one being the many-worlds theory) or even with changes to the Schrödinger equation of quantum mechanics itself. But nobody has solved the problem satisfactorily. This is a bit worrying, because it means that we do not understand an important aspect of quantum mechanics. However, there is a really good reason for using the projection postulate: it works superbly well when you want to calculate things.

Therefore let us now continue with our exploration of the postulates of quantum mechanics. We now know what the measurement outcomes are and what the state of our system after the measurement is, but we still do not know what the probability for a specific measurement outcome is. To make any sensible guess about this, we need to consider
the properties of probabilities, to see which of these we would like to be satisfied in quantum mechanics. Once we know what the sensible requirements are, we can go ahead and make a new postulate.

As you know, probabilities p(A_i) to obtain an outcome A_i (which is usually a set of possible results) are positive quantities that are not larger than unity, i.e. 0 ≤ p(A_i) ≤ 1. In addition it is quite trivial to demand that the probability for an outcome corresponding to an empty set vanishes, i.e. p(∅) = 0, while the probability for the set of all elements is unity, i.e. p(1) = 1.

These are almost trivial requirements. Really important is the behaviour of probabilities for joint sets. What we definitely would like to have is that the probabilities for mutually exclusive events add up, i.e. if we are given disjoint sets A_1 and A_2 and we form the union A_1 ∪ A_2 of the two sets, then we would like to have

    p(A_1 ∪ A_2) = p(A_1) + p(A_2) .                        (2.5)

In the following postulate I will present a definition that satisfies all of the properties mentioned above.

Postulate 4 The probability of obtaining the eigenvalue a_i in a measurement of the observable Â is given by

    p_i = ||P̂_i|ψ⟩||² = ⟨ψ|P̂_i|ψ⟩ .                         (2.6)

For a non-degenerate eigenvalue with eigenvector |ψ_i⟩ this probability reduces to the well-known expression

    p_i = |⟨ψ_i|ψ⟩|² .                                      (2.7)

It is easy to check that Postulate 4 indeed satisfies all the criteria that we demanded of a probability. The fascinating thing is that it is essentially the only way of doing so. This very important theorem was first proved by Gleason in 1957. This came as quite a surprise. Only using the Hilbert space structure of quantum mechanics together with
the reasonable properties that we demanded from the probabilities for quantum mechanical measurement outcomes, we cannot do anything else than use Eq. (2.6)! The proof of this theorem is too complicated to be presented here; I just wanted to justify Postulate 4 a bit more by telling you that we cannot really postulate anything else.

2.2 Expectation value and variance

Now that we know the measurement outcomes as well as the probabilities for their occurrence, we can use quantum mechanics to calculate the mean value of an observable as well as the variance about this mean value. What do we do experimentally to determine the mean value of an observable? First, of course, we set up our apparatus such that it measures exactly the observable Â in question. Then we build a machine that creates a quantum system in a given quantum state over and over again. Ideally we would like to have infinitely many such identically prepared quantum states (such a collection will usually be called an ensemble). Now we perform our experiment. Our state preparer produces the first particle in a particular state |ψ⟩ and the particle enters our measurement apparatus. We obtain a particular measurement outcome, i.e. one of the eigenvalues a_i of the observable Â. We repeat the experiment and we find a different eigenvalue. Every individual outcome of the experiment is completely random, and after N repetitions of the experiment we will obtain the eigenvalue a_i as a measurement outcome in N_i of the experiments, where Σ_i N_i = N. After finishing our experiment, we can then determine the average value of the measurement outcomes, for which we find

    ⟨Â⟩_ψ(N) = Σ_i (N_i/N) a_i .                            (2.8)

For large numbers of experiments, i.e. large N, we know that the ratio N_i/N approaches the probability p_i for the outcome a_i. This probability can be calculated from Postulate 4 and we find

    ⟨Â⟩_ψ = lim_{N→∞} Σ_i (N_i/N) a_i = Σ_i p_i a_i = Σ_i a_i ⟨ψ|P̂_i|ψ⟩ = ⟨ψ|Â|ψ⟩ .   (2.9)
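The convergence in Eq. (2.9) can be simulated: draw measurement outcomes with the probabilities of Postulate 4 and compare the sample mean with ⟨ψ|Â|ψ⟩. A sketch in which the observable, the state and the sample size are all made-up choices:

```python
import numpy as np

rng = np.random.default_rng(5)
evals = np.array([-1.0, 0.0, 2.0])       # measurement outcomes a_i of some observable
psi = np.array([0.5, 0.5, np.sqrt(0.5)]) # state written in the eigenbasis of A

probs = np.abs(psi) ** 2                 # p_i = |<a_i|psi>|^2, Eq. (2.7)
outcomes = rng.choice(evals, size=200_000, p=probs)

sample_mean = outcomes.mean()            # the experimental quantity, Eq. (2.8)
quantum_prediction = probs @ evals       # <psi|A|psi> evaluated in the eigenbasis

assert abs(sample_mean - quantum_prediction) < 0.02
```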
You should realize that the left-hand side of Eq. (2.9) is conceptually quite a different thing from the right-hand side. The left-hand side is an experimentally determined quantity, while the right-hand side is a quantum mechanical prediction. This is a testable prediction of Postulate 3 and has been verified in countless experiments. Now we are also able to derive the quantum mechanical expression for the variance of the measurement results around the mean value. Given the probabilities p_i and the eigenvalues a_i of the observable Â we find

    (ΔA)² := Σ_i p_i (a_i − ⟨Â⟩)² = Σ_i p_i a_i² − ( Σ_i p_i a_i )² = ⟨Â²⟩ − ⟨Â⟩² .   (2.10)

After these definitions let us now analyse a very important property of measurement results in quantum mechanics.

2.3 Uncertainty Relations

In classical mechanics we are able to measure any physical quantity to arbitrary precision. In fact, this is also true in quantum mechanics! Nothing prevents us from preparing a quantum system in a very well defined position state and subsequently making a measurement that will tell us with exceeding precision where the particle is. The real difference from classical mechanics arises when we try to determine the values of two different observables simultaneously. Again, in classical mechanics we have no law that prevents us from measuring any two quantities with arbitrary precision. Quantum mechanics, however, is different. Only very special pairs of observables can be measured simultaneously to arbitrary precision; such observables are called compatible or commuting observables. In general the uncertainties in the measurements of two arbitrary observables will obey a relation which makes sure that their product has a lower bound which is in general unequal to zero. This relation is called the uncertainty relation.

Theorem 37 For any two observables Â and B̂ we find for their uncertainties ΔX = √(⟨X̂²⟩ − ⟨X̂⟩²) the uncertainty relation

    ΔA ΔB ≥ |⟨[Â, B̂]⟩| / 2 .                                (2.11)
Proof: Let us define the two kets

    |φ_A⟩ = (Â − ⟨Â⟩)|ψ⟩                                    (2.12)
    |φ_B⟩ = (B̂ − ⟨B̂⟩)|ψ⟩ .                                  (2.13)

Using these vectors we find that

    √⟨φ_A|φ_A⟩ · √⟨φ_B|φ_B⟩ = ΔA ΔB .                       (2.14)

Now we can use the Schwarz inequality to find a lower bound on the product of the two uncertainties. We find that

    |⟨φ_A|φ_B⟩| ≤ √⟨φ_A|φ_A⟩ · √⟨φ_B|φ_B⟩ = ΔA ΔB .         (2.15)

For the real and imaginary parts of ⟨φ_A|φ_B⟩ we find

    Re ⟨φ_A|φ_B⟩ = ½ ⟨ÂB̂ + B̂Â⟩ − ⟨Â⟩⟨B̂⟩                    (2.16)
    Im ⟨φ_A|φ_B⟩ = (1/2i) ⟨[Â, B̂]⟩ .                        (2.17)

We can use Eqs. (2.16)–(2.17) in Eq. (2.15) to find

    ΔA ΔB ≥ √( (Re ⟨φ_A|φ_B⟩)² + (Im ⟨φ_A|φ_B⟩)² ) ≥ ½ |⟨[Â, B̂]⟩| .   (2.18)

This finishes the proof. □

From Eq. (2.11) we can see that a necessary condition for two observables to be measurable simultaneously with no uncertainty is that the two observables commute. In fact this condition is also a sufficient one. We have

Theorem 38 Two observables Â and B̂ can be measured precisely, i.e. ΔA = ΔB = 0, exactly if they commute.

Proof: From Eq. (2.11) we immediately see that ΔA = ΔB = 0 implies that ⟨[Â, B̂]⟩ = 0. For the other direction of the statement we need to remember that two commuting observables have the same eigenvectors. If we assume that our quantum mechanical system is in one of
these eigenstates |ψ⟩, then we see from Eqs. (2.12)–(2.13) that |φ_A⟩ and |φ_B⟩ are proportional to |ψ⟩ and therefore proportional to each other. Then we only need to remember that in the case of proportional |φ_A⟩ and |φ_B⟩ we have equality in the Schwarz inequality, which implies ΔA ΔB = ½ |⟨[Â, B̂]⟩| = 0. □

Now let us make clear what the uncertainty relation means, as there is quite some confusion about that. Imagine that we have an ensemble of infinitely many identically prepared quantum systems, each of which is in the state |ψ⟩. We would like to measure two observables Â and B̂. We split the ensemble into two halves. On the first half we measure exclusively the observable Â, while on the second half we measure the observable B̂. The measurements of the two observables will lead to average values ⟨Â⟩ and ⟨B̂⟩ which have uncertainties ΔA and ΔB. From this consideration it should be clear that the uncertainties in the measurements of the two observables are not caused by a perturbation of the quantum mechanical system by the measurement, as we are measuring the two observables on different systems which are in the same state. These uncertainties are an intrinsic property of any quantum state. Of course perturbations due to the measurement itself may increase the uncertainty, but they are not the reason for the existence of the uncertainty relations.

In this context I would also like to point out that you will sometimes find the uncertainty relation stated in the form

    ΔA ΔB ≥ |⟨[Â, B̂]⟩| ,                                    (2.19)

that is, where the right-hand side is twice as large as in Eq. (2.11). The reason for this extra factor of 2 is that the uncertainty relation Eq. (2.19) describes a different situation from the one stated in Theorem 37. Eq. (2.19) really applies to the simultaneous measurement of two non-commuting observables on one quantum system.
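Here is a numerical spot-check of the bound of Theorem 37 for the Pauli observables σ₁ and σ₂ in an arbitrarily chosen qubit state:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

psi = np.array([np.cos(0.3), np.exp(0.7j) * np.sin(0.3)])  # some normalized state

def expval(op):
    return (psi.conj() @ op @ psi).real

def uncertainty(op):
    return np.sqrt(expval(op @ op) - expval(op) ** 2)

comm = sx @ sy - sy @ sx                 # [sigma_1, sigma_2] = 2i sigma_3
rhs = abs(psi.conj() @ comm @ psi) / 2   # |<[A, B]>| / 2

# The bound of Eq. (2.11), up to floating-point slack
assert uncertainty(sx) * uncertainty(sy) >= rhs - 1e-12
```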
This means that we perform a measurement of the observable $\hat A$ on every system of the ensemble and subsequently we perform a measurement of the observable $\hat B$ on every system of the ensemble. The measurement of the observable $\hat A$ will disturb the system, and a subsequent measurement of $\hat B$ could therefore be more uncertain. Imagine, for example, that you want to determine the position of an electron and simultaneously its momentum. We could do the position measurement by scattering light off the electron. If we want to determine the position of the electron very precisely, then we need to use light of very short wavelength. However, photons with a short wavelength carry a large momentum. When they collide with the electron, the momentum of the electron may be changed. Therefore the uncertainty in the momentum of the electron will be larger. This is an example of the problems that arise when you want to measure two non-commuting observables simultaneously.

If you would like to learn more about the uncertainty relation for the joint measurement of non-commuting observables, you may have a look at the pedagogical article by M.G. Raymer, Am. J. Phys. 62, 986 (1994), which you can find in the library.

2.3.1 The trace of an operator

In the fourth postulate I have defined the rule for the calculation of the probability that a measurement result is one of the eigenvalues of the corresponding operator. Now I would like to generalize this idea to situations in which we have some form of lack of knowledge about the measurement outcome. An example would be that we are not quite sure which measurement outcome we have (the display of the apparatus may have a defect, for example). To give this law a simple formulation, I need to introduce a new mathematical operation.

Definition 39 The trace of an operator $\hat A$ on an $N$-dimensional Hilbert space is defined as
$$ \mathrm{tr}(\hat A) = \sum_{i=1}^{N} \langle\psi_i|\hat A|\psi_i\rangle \qquad (2.20) $$
for any orthonormal set of basis vectors $\{|\psi_i\rangle\}$.

In particular, if the $|\psi_i\rangle$ are chosen as the eigenvectors of $\hat A$, we see that the trace is just the sum of all the eigenvalues of $\hat A$. The trace will play a very important role in quantum mechanics, as we will see in our first discussion of the measurement process in quantum mechanics.
The trace has quite a few properties that are helpful in calculations. Note that the trace is only well defined if it is independent of the choice of the orthonormal basis $\{|\psi_i\rangle\}$. If we choose a different basis $\{|\phi_i\rangle\}$, then we know that there is a unitary operator $\hat U$ such that $|\psi_i\rangle = \hat U|\phi_i\rangle$, and we want the property $\mathrm{tr}\{\hat A\} = \mathrm{tr}\{\hat U^\dagger \hat A \hat U\}$. That this is indeed true is shown by

Theorem 40 For any two operators $\hat A$ and $\hat B$ on a Hilbert space $H$ and any unitary operator $\hat U$ we have
$$ \mathrm{tr}\{\hat A\hat B\} = \mathrm{tr}\{\hat B\hat A\} \qquad (2.21) $$
$$ \mathrm{tr}\{\hat A\} = \mathrm{tr}\{\hat U\hat A\hat U^\dagger\}\; . \qquad (2.22) $$

Proof: I prove only the first statement, because the second one follows directly from $\hat U^\dagger\hat U = \mathbf 1$ and Eq. (2.21). The proof runs as follows:
$$ \mathrm{tr}\{\hat A\hat B\} = \sum_{n=1}^{N}\langle\psi_n|\hat A\hat B|\psi_n\rangle = \sum_{n=1}^{N}\langle\psi_n|\hat A\,\mathbf 1\,\hat B|\psi_n\rangle = \sum_{n=1}^{N}\sum_{k=1}^{N}\langle\psi_n|\hat A|\psi_k\rangle\langle\psi_k|\hat B|\psi_n\rangle $$
$$ = \sum_{k=1}^{N}\sum_{n=1}^{N}\langle\psi_k|\hat B|\psi_n\rangle\langle\psi_n|\hat A|\psi_k\rangle = \sum_{k=1}^{N}\langle\psi_k|\hat B\,\mathbf 1\,\hat A|\psi_k\rangle = \sum_{k=1}^{N}\langle\psi_k|\hat B\hat A|\psi_k\rangle = \mathrm{tr}\{\hat B\hat A\}\; . \;\square $$
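Both statements of Theorem 40 are easy to confirm numerically. The following sketch (my own illustration, not from the notes) checks the cyclicity of the trace and its invariance under a unitary change of basis, with a random unitary obtained from a QR decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# two random complex operators on an N-dimensional space
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
B = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

# a random unitary from the QR decomposition of a random complex matrix
U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))

# Eq. (2.21): tr{AB} = tr{BA}
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# Eq. (2.22): tr{A} = tr{U A U^dagger}, i.e. basis independence
assert np.isclose(np.trace(A), np.trace(U @ A @ U.conj().T))

# and the trace equals the sum of the eigenvalues
assert np.isclose(np.trace(A), np.linalg.eigvals(A).sum())
```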
Comment: If you write an operator in matrix form, then the trace is just the sum of the diagonal elements of the matrix.

I introduced the trace in order to write some of the expressions from the previous chapters in a different form which then allows their generalization to situations involving classical uncertainty.

In Postulate 4 the probability to find the eigenvalue $a_i$ in a measurement of the observable $\hat A$ has been given for a system in the quantum state $|\psi\rangle$. Using the trace it can now be written as
$$ p_i = \langle\psi|\hat P_i|\psi\rangle = \mathrm{tr}\{\hat P_i\,|\psi\rangle\langle\psi|\}\; . $$
Likewise, the expectation value of an observable $\hat A$ measured on a system in the quantum state $|\psi\rangle$ can be written as
$$ \langle\hat A\rangle_{|\psi\rangle} = \langle\psi|\hat A|\psi\rangle = \mathrm{tr}\{\hat A\,|\psi\rangle\langle\psi|\}\; . $$

2.4 The density operator

When discussing measurement theory it is quite natural to introduce a new concept for the description of the state of a quantum system. So far we have always described a quantum system by a state vector, i.e. by a pure state, which may be a coherent superposition of many other state vectors, e.g.
$$ |\psi\rangle = \alpha_1|\psi_1\rangle + \alpha_2|\psi_2\rangle\; . \qquad (2.23) $$
A quantum mechanical system in such a state is ideal for exhibiting quantum mechanical interference. However, it has to be said that such a state is an idealization compared to the true state of a quantum system that you can prepare in an experiment. In an experiment you will always have some statistical uncertainty due to imperfections in the preparation procedure of the quantum state or in the measurement. For example, due to an occasional error, happening at a random time and unknown to you, the wrong quantum state may be prepared. Then we do not only have the quantum mechanical uncertainty that is described by the state vector, but we also have a classical statistical uncertainty. This raises the question as to how to describe such an
experimental situation in the most elegant way. Certainly it cannot be dealt with by just adding the state vectors together, i.e. by forming a coherent superposition.

To understand this better and to clarify the definition of the density operator, I will present an example of an experimental situation where the description using pure quantum states fails, or better, where it is rather clumsy.

Figure 2.1: An oven emits atomic two-level systems. The internal state of the system is randomly distributed. With probability $p_i$ the system is in the pure state $|\psi_i\rangle$. A person oblivious to this random distribution measures the observable $\hat A$. What is the mean value that he obtains?

Consider the situation presented in Fig. 2.1. An oven is filled with atomic two-level systems. We assume that each of these two-level systems is in a random pure state. To be more precise, let us assume that with probability $p_i$ a two-level system is in the state $|\psi_i\rangle$. Imagine now that the oven has a little hole in it through which the two-level atoms can escape in the form of an atomic beam. This atomic beam travels into a measurement apparatus built by an experimentalist who would like to measure the observable $\hat A$. The task of a theoretical physicist is to predict what the experimentalist will measure in his experiment.

We realize that each atom in the atomic beam is in a pure state $|\psi_i\rangle$ with probability $p_i$. For each individual atom it is unknown to the experimentalist which particular state it is in. He only knows the probability distribution of the possible states. If the experimentalist makes measurements on $N$ of the atoms in the beam, then he will perform the measurement $N_i \approx N p_i$ times on an atom in state $|\psi_i\rangle$. For each of these pure states $|\psi_i\rangle$ we know how to calculate the expectation value of the observable $\hat A$ that the experimentalist is measuring. It is simply $\langle\psi_i|\hat A|\psi_i\rangle = \mathrm{tr}\{\hat A\,|\psi_i\rangle\langle\psi_i|\}$. What average value will the experimentalist see in $N$ measurements?
For a large $N$ the relative frequency of occurrence of the state $|\psi_i\rangle$ is $N_i/N = p_i$. Therefore the mean value observed by the experimentalist is
$$ \langle\hat A\rangle = \sum_i p_i\,\mathrm{tr}\{\hat A\,|\psi_i\rangle\langle\psi_i|\}\; . \qquad (2.24) $$
This equation is perfectly correct, and we are able to calculate the expectation value of any observable $\hat A$ for any set of states $\{|\psi_i\rangle\}$ and probabilities $\{p_i\}$. However, when the number of possible states $|\psi_i\rangle$ is really large, then we have a lot of calculations to do. For each state $|\psi_i\rangle$ we have to calculate the expectation value $\langle\psi_i|\hat A|\psi_i\rangle = \mathrm{tr}\{\hat A\,|\psi_i\rangle\langle\psi_i|\}$ and then sum up all the expectation values weighted with their probabilities. If we then want to measure a different observable $\hat B$, we need to do the same lengthy calculation again. That's not efficient at all! As a theoretical physicist I am of course searching for a better way to do this calculation. Therefore let us reconsider Eq. (2.24), transform it a bit and use it to define the density operator:
$$ \langle\hat A\rangle = \sum_i p_i\,\mathrm{tr}\{\hat A\,|\psi_i\rangle\langle\psi_i|\} = \mathrm{tr}\Big\{\hat A \sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big\} =: \mathrm{tr}\{\hat A\hat\rho\}\; . \qquad (2.25) $$
The last equality is really the definition of the density operator $\hat\rho$. Quite obviously, if I know the density operator $\hat\rho$ for the specific situation (here the oven generating the atomic beam), then I can calculate quite easily the expectation value that the experimentalist will measure. If the experimentalist changes his apparatus and now measures the observable $\hat B$, then this is not such a big deal anymore. We just need to re-evaluate Eq. (2.25) with $\hat A$ replaced by $\hat B$, and not the potentially huge number of expectation values $\mathrm{tr}\{\hat A\,|\psi_i\rangle\langle\psi_i|\}$.

Exercise: Check that in general we have $\langle f(\hat A)\rangle = \mathrm{tr}\{f(\hat A)\,\hat\rho\}$.

Before I present some useful properties of a density operator, let me stress again the difference between a coherent superposition of quantum states and a statistical mixture of quantum states. In a coherent superposition state of a quantum mechanical system, each
representative of the ensemble is in the same pure state of the form
$$ |\psi\rangle = \sum_i \alpha_i|\psi_i\rangle\; , \qquad (2.26) $$
which can be written as the density operator
$$ \hat\rho = |\psi\rangle\langle\psi| = \sum_{i,j} \alpha_i\alpha_j^*\,|\psi_i\rangle\langle\psi_j|\; . \qquad (2.27) $$
This is completely different from a statistical mixture, where a representative is with probability $|\alpha_i|^2$ in the pure state $|\psi_i\rangle$! The corresponding density operator is
$$ \hat\rho = \sum_i |\alpha_i|^2\,|\psi_i\rangle\langle\psi_i|\; , \qquad (2.28) $$
and is also called a mixed state. You should convince yourself that the two states Eqs. (2.26)-(2.28) are indeed different, i.e.
$$ \sum_i |\alpha_i|^2\,|\psi_i\rangle\langle\psi_i| \ne |\psi\rangle\langle\psi|\; . \qquad (2.29) $$

Now let us consider some properties of the density operator, which can in fact be used as an alternative definition of the density operator.

Theorem 41 Any density operator satisfies
1. $\hat\rho$ is a Hermitean operator.
2. $\hat\rho$ is a positive semidefinite operator, i.e. $\forall\,|\psi\rangle:\ \langle\psi|\hat\rho|\psi\rangle \ge 0$.
3. $\mathrm{tr}\{\hat\rho\} = 1$.

Proof: As we have defined the density operator already, we need to prove the above theorem. From our definition of the density operator we find
$$ \hat\rho^\dagger = \Big(\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big)^\dagger = \sum_i p_i\,|\psi_i\rangle\langle\psi_i| = \hat\rho\; . \qquad (2.30) $$
This follows because each $|\psi_i\rangle\langle\psi_i|$ is a projection operator. This proves part 1). Part 2) follows from
$$ \langle\psi|\hat\rho|\psi\rangle = \langle\psi|\Big(\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big)|\psi\rangle = \sum_i p_i\,\langle\psi|\psi_i\rangle\langle\psi_i|\psi\rangle = \sum_i p_i\,|\langle\psi|\psi_i\rangle|^2 \ge 0\; . \qquad (2.31) $$
The last property follows from the fact that the probabilities are normalized, i.e. $\sum_i p_i = 1$. Then we see that
$$ \mathrm{tr}\{\hat\rho\} = \mathrm{tr}\Big\{\sum_i p_i\,|\psi_i\rangle\langle\psi_i|\Big\} = \sum_i p_i\,\mathrm{tr}\{|\psi_i\rangle\langle\psi_i|\} = \sum_i p_i = 1\; . \qquad (2.32) $$
This finishes the proof. □

Some simple but useful properties that can be derived from Theorem 41 are
1. $\hat\rho^2 = \hat\rho \iff \hat\rho$ is a pure state.
2. All eigenvalues of $\hat\rho$ lie in the interval $[0,1]$.
Proof: Exercise!

A natural question that arises when one considers the density operator is that of its decomposition into pure states, i.e. $\hat\rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|$. Is this decomposition unique, or are there many possible ways to obtain a given density operator as a statistical mixture of pure states? The answer can be found by looking at a particularly simple example. Let us consider the density operator representing the 'completely mixed' state of a two-level system, i.e.
$$ \hat\rho = \tfrac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big)\; . \qquad (2.33) $$
Looking at Eq. (2.33) we readily conclude that this density operator can be generated by an oven sending out atoms in the states $|0\rangle$ or $|1\rangle$ with a probability of 50% each, see part a) of Fig. 2.2. That conclusion is perfectly correct; however, we could also imagine an oven that generates atoms in the states $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$ with a probability of 50% each. Let us check that this indeed gives rise to the same density operator:
$$ \hat\rho = \tfrac{1}{2}\big(|+\rangle\langle +| + |-\rangle\langle -|\big) = \tfrac{1}{4}\big(|0\rangle\langle 0| + |1\rangle\langle 0| + |0\rangle\langle 1| + |1\rangle\langle 1|\big) + \tfrac{1}{4}\big(|0\rangle\langle 0| - |1\rangle\langle 0| - |0\rangle\langle 1| + |1\rangle\langle 1|\big) = \tfrac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big)\; . \qquad (2.34) $$
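A short numerical check of Eq. (2.34) (my own sketch, not part of the notes): build the density operator from both ensembles and compare.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

rho_a = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket1, ket1)    # mixture of |0>, |1>
rho_b = 0.5 * np.outer(plus, plus) + 0.5 * np.outer(minus, minus)  # mixture of |+>, |->

assert np.allclose(rho_a, rho_b)          # the same density operator, Eq. (2.34)
assert np.allclose(rho_a, np.eye(2) / 2)  # namely the completely mixed state
```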
Figure 2.2: a) An oven generates particles in the states $|0\rangle$ or $|1\rangle$ with probability 50% each. An experimentalist measures the observable $\hat A$ and finds the mean value $\mathrm{tr}\{\hat A\rho\}$. b) The oven generates particles in the states $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$ with probability 50% each. The experimentalist will find the same mean value $\mathrm{tr}\{\hat A\rho\}$.

Therefore the same density operator can be obtained in different ways. In fact, we can usually find infinitely many ways of generating the same density operator. What does this mean? Does this make the density operator an ill-defined object, as it cannot distinguish between different experimental realisations? The answer to that is a clear NO! The point is that if two realisations give rise to the same density operator, then, as we have seen above, we will also find exactly the same expectation values for any observable that we may choose to measure. If a quantum system in a mixed state gives rise to exactly the same predictions for all possible measurements, then we are forced to say that the two states are indeed the same! Of course this is something that needs to be confirmed experimentally, and it turns out to be true. Even more interestingly, it turns out that if I could distinguish two situations that are described by the same density operator, then I would be able to transmit signals faster than light: a clear impossibility, as it contradicts special relativity. This is what I will explain to you in the next section, using the concept of entanglement, which you are going to encounter there for the first time.
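Since the two ovens of Fig. 2.2 produce the same density operator, Eq. (2.25) guarantees that every observable has the same mean value in both realisations. The sketch below (my own illustration) checks this for a randomly drawn Hermitean observable:

```python
import numpy as np

rng = np.random.default_rng(1)

# a random Hermitean observable on the two-level system
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
A = (M + M.conj().T) / 2

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus, minus = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)

# ensemble averages, Eq. (2.24), for the two preparations
mean_a = 0.5 * (ket0 @ A @ ket0) + 0.5 * (ket1 @ A @ ket1)
mean_b = 0.5 * (plus @ A @ plus) + 0.5 * (minus @ A @ minus)

assert np.isclose(mean_a, mean_b)  # identical predictions for both ovens

# both equal tr{A rho} with rho the completely mixed state, Eq. (2.25)
assert np.isclose(mean_a, np.trace(A @ np.eye(2) / 2))
```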
2.5 Mixed states, Entanglement and the speed of light

In the previous section I introduced the density operator to describe situations in which we have a lack of knowledge due to an imperfect preparation of a quantum state. However, this is not the only situation in which we have to use a density operator. In this section I will show you that the density operator description may also become necessary when the system that we are investigating is only the accessible part of some larger total system. Even if this larger system is in a pure quantum state, we will see that the smaller system behaves in every respect like a mixed state and has to be described by a density operator. This idea will then lead us to the insight that two different preparations which are described by the same density operator cannot be distinguished experimentally, because otherwise we would be able to send messages faster than the speed of light. This would clearly violate the special theory of relativity.

To understand this new way of looking at density operators, I first need to explain how to describe quantum mechanical systems that consist of more than one particle. This is the purpose of the next subsection.

2.5.1 Quantum mechanics for many particles

In this section I will show you what you have to do when you want to describe a system of two particles quantum mechanically. The generalization to arbitrary numbers of particles will then be obvious.

If we have two particles, then the state of each particle is an element of a Hilbert space; for particle A we have the Hilbert space $H_A$, which is spanned by a set of basis states $\{|\phi_i\rangle_A\}_{i=1,\dots,N}$, while particle B has the Hilbert space $H_B$, spanned by the set of basis states $\{|\psi_j\rangle_B\}_{j=1,\dots,M}$. The two Hilbert spaces are not necessarily equal and may describe totally different quantities or particles.

Now imagine that system A is in the state $|\phi_i\rangle_A$ and system B is in the state $|\psi_j\rangle_B$.
Then we write the total state of both systems in the tensor product form
$$ |\Psi_{tot}\rangle = |\phi_i\rangle \otimes |\psi_j\rangle\; . \qquad (2.35) $$
The symbol $\otimes$ denotes the tensor product, which is not to be confused with ordinary products. Clearly there are $N \cdot M$ such combinations of basis vectors. These vectors can be thought of as spanning a larger Hilbert space $H_{AB} = H_A \otimes H_B$ which describes all possible states that the two particles can be in.

Of course we have to be able to add different states in the new Hilbert space $H_{AB}$. We have to define this addition such that it is compatible with physics. Imagine that system A is in the superposition state $|\Phi\rangle_A = a_1|\phi_1\rangle + a_2|\phi_2\rangle$ and system B is in the basis state $|\Psi\rangle = |\psi_1\rangle$. Then
$$ |\Phi\rangle_A \otimes |\Psi\rangle = (a_1|\phi_1\rangle_A + a_2|\phi_2\rangle_A) \otimes |\psi_1\rangle \qquad (2.36) $$
is the joint state of the two particles. Now surely it makes sense to say that
$$ |\Phi\rangle_A \otimes |\Psi\rangle = a_1|\phi_1\rangle_A \otimes |\psi_1\rangle + a_2|\phi_2\rangle_A \otimes |\psi_1\rangle\; . \qquad (2.37) $$
This is so because the existence of an additional particle (maybe at the end of the universe) cannot change the linearity of quantum mechanics for system A. As the same argument applies to system B, we should have in general
$$ \Big(\sum_i a_i|\phi_i\rangle\Big) \otimes \Big(\sum_j b_j|\psi_j\rangle\Big) = \sum_{ij} a_i b_j\,|\phi_i\rangle \otimes |\psi_j\rangle\; . \qquad (2.38) $$
The set of states $\{|\phi_i\rangle_A \otimes |\psi_j\rangle_B\}$ forms a basis of the state space of the two particles.

Now let us see how the scalar product between two states $|\psi_1\rangle_A \otimes |\psi_2\rangle_B$ and $|\phi_1\rangle_A \otimes |\phi_2\rangle_B$ has to be defined. Clearly, if $|\psi_2\rangle = |\phi_2\rangle$ then we should have
$$ \big(|\psi_1\rangle_A \otimes |\psi_2\rangle_B\,,\ |\phi_1\rangle_A \otimes |\phi_2\rangle_B\big) = \big(|\psi_1\rangle_A\,,\ |\phi_1\rangle_A\big)\; , \qquad (2.39) $$
again because the existence of an unrelated particle somewhere in the world should not change the scalar product of the first state.
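In the explicit column-vector representation introduced further below, the tensor product becomes the Kronecker product, and the bilinearity of Eq. (2.38) together with the factorization of the scalar product (Eq. (2.40) below) can be verified directly. A small sketch of mine, using numpy's `kron`:

```python
import numpy as np

rng = np.random.default_rng(2)

# random state vectors for particles A (dim N) and B (dim M)
N, M = 3, 2
phi1, phi2 = rng.normal(size=N), rng.normal(size=N)
psi = rng.normal(size=M)
a1, a2 = 0.3, -1.2

# Eq. (2.37): the tensor product is linear in each factor
lhs = np.kron(a1 * phi1 + a2 * phi2, psi)
rhs = a1 * np.kron(phi1, psi) + a2 * np.kron(phi2, psi)
assert np.allclose(lhs, rhs)

# Eq. (2.40): the scalar product factorizes on product states
chi, eta = rng.normal(size=N), rng.normal(size=M)
lhs = np.vdot(np.kron(phi1, psi), np.kron(chi, eta))
rhs = np.vdot(phi1, chi) * np.vdot(psi, eta)
assert np.allclose(lhs, rhs)
```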
Eq. (2.39) can only be true if in general
$$ \big(|\psi_1\rangle_A \otimes |\psi_2\rangle_B\,,\ |\phi_1\rangle_A \otimes |\phi_2\rangle_B\big) = \big(|\psi_1\rangle_A\,,\ |\phi_1\rangle_A\big)\,\big(|\psi_2\rangle_B\,,\ |\phi_2\rangle_B\big)\; . \qquad (2.40) $$
The tensor product of linear operators is defined by the action of the operator on all possible product states. We define
$$ \hat A \otimes \hat B\,\big(|\phi\rangle_A \otimes |\psi\rangle_B\big) := \big(\hat A|\phi\rangle_A\big) \otimes \big(\hat B|\psi\rangle_B\big)\; . \qquad (2.41) $$

In all these definitions I have only used states of the form $|\phi\rangle \otimes |\psi\rangle$, which are called product states. Are all states of this form? The answer is evidently no, because we can form linear superpositions of different product states. An example of such a linear superposition is the state
$$ |\Psi\rangle = \tfrac{1}{\sqrt 2}\,(|00\rangle + |11\rangle)\; . \qquad (2.42) $$
Note that I now omit the $\otimes$ and abbreviate $|0\rangle\otimes|0\rangle$ by $|00\rangle$. If you try to write the state Eq. (2.42) as a product, you obtain
$$ \tfrac{1}{\sqrt 2}\big(|00\rangle + |11\rangle\big) = \big(\alpha|0\rangle + \beta|1\rangle\big)\otimes\big(\gamma|0\rangle + \delta|1\rangle\big) = \alpha\gamma|00\rangle + \alpha\delta|01\rangle + \beta\gamma|10\rangle + \beta\delta|11\rangle\; . $$
This implies $\alpha\delta = 0 = \beta\gamma$. If we choose $\alpha = 0$, then also $\alpha\gamma = 0$ and the two sides cannot be equal. Likewise, if we choose $\delta = 0$, then $\beta\delta = 0$ and the two sides are different. As either $\alpha = 0$ or $\delta = 0$, we arrive at a contradiction. States which cannot be written in product form are called entangled states. In the last part of these lectures we will learn a bit more about the weird properties of entangled states. They allow such things as quantum state teleportation, quantum cryptography and quantum computation.

Now let us see what the tensor product looks like when you write states as column vectors. The way you write them is really arbitrary, but of course there are clumsy ways and there are smart ways, and in the literature you will find only one way. This particular notation for the tensor product of column vectors and matrices ensures that the relation $\hat A\otimes\hat B\,|\phi\rangle\otimes|\psi\rangle = \hat A|\phi\rangle \otimes \hat B|\psi\rangle$ remains true when you do all the manipulations
using matrices and column vectors. The tensor product between an $m$-dimensional vector and an $n$-dimensional vector is written as
$$ \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix} \otimes \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} a_1 \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \\ \vdots \\ a_m \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \end{pmatrix} = \begin{pmatrix} a_1 b_1 \\ \vdots \\ a_1 b_n \\ \vdots \\ a_m b_1 \\ \vdots \\ a_m b_n \end{pmatrix}\; . \qquad (2.43) $$
The tensor product of a linear operator $\hat A$ on an $m$-dimensional space and a linear operator $\hat B$ on an $n$-dimensional space is given in matrix notation by
$$ \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{m1} & \dots & a_{mm} \end{pmatrix} \otimes \begin{pmatrix} b_{11} & \dots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{n1} & \dots & b_{nn} \end{pmatrix} = \begin{pmatrix} a_{11} B & \dots & a_{1m} B \\ \vdots & \ddots & \vdots \\ a_{m1} B & \dots & a_{mm} B \end{pmatrix}\; , \qquad (2.44) $$
where each block $a_{ij} B$ stands for the $n \times n$ matrix $B$ multiplied by the number $a_{ij}$. The simplest explicit examples that I can give are those for two 2-dimensional Hilbert spaces. There we have
$$ \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \otimes \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} a_1 b_1 \\ a_1 b_2 \\ a_2 b_1 \\ a_2 b_2 \end{pmatrix} \qquad (2.45) $$
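These conventions are exactly those of the Kronecker product (`np.kron` in numpy), so the defining property Eq. (2.41) can be checked mechanically. A short sketch of mine:

```python
import numpy as np

rng = np.random.default_rng(3)

# random operators and states on a 2-dim (A) and 3-dim (B) space
A = rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3))
phi = rng.normal(size=2)
psi = rng.normal(size=3)

# Eq. (2.41): (A (x) B)(|phi> (x) |psi>) = (A|phi>) (x) (B|psi>)
lhs = np.kron(A, B) @ np.kron(phi, psi)
rhs = np.kron(A @ phi, B @ psi)
assert np.allclose(lhs, rhs)

# the block structure of Eq. (2.44): the (i,j) block of A (x) B is a_ij * B
AB = np.kron(A, B)
assert np.allclose(AB[0:3, 3:6], A[0, 1] * B)
```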
and
$$ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \otimes \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\ a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\ a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\ a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22} \end{pmatrix}\; . \qquad (2.46) $$

2.5.2 How to describe a subsystem of some large system?

Let us now imagine the following situation. The total system that we consider consists of two particles, each of which is described by an $N$- ($M$-) dimensional Hilbert space $H_A$ ($H_B$) with basis vectors $\{|\phi_i\rangle_A\}_{i=1,\dots,N}$ ($\{|\psi_i\rangle_B\}_{i=1,\dots,M}$). Let us also assume that the two particles together are in an arbitrary, possibly mixed, state $\rho_{AB}$. One person, Alice, is holding particle A, while particle B is held by a different person (let's call him Bob) who, for whatever reason, refuses Alice access to his particle. This situation is schematically represented in Fig. 2.3. Now imagine that Alice makes a measurement on her system alone (she cannot access Bob's system). This means that she is measuring an operator of the form $\hat A \otimes \mathbf 1$. The operator on Bob's system must be the identity operator, because no measurement is performed there. I would now like to calculate the expectation value of this measurement, i.e.
$$ \langle\hat A\rangle = \sum_{ij} {}_A\langle\phi_i|\,{}_B\langle\psi_j|\;\hat A\otimes\mathbf 1\;\rho_{AB}\;|\phi_i\rangle_A|\psi_j\rangle_B = \mathrm{tr}_{AB}\{\hat A\otimes\mathbf 1\;\rho_{AB}\}\; , $$
where $\mathrm{tr}_{AB}$ means the trace over both systems, that of Alice and that of Bob. This description is perfectly ok and always yields the correct result. However, if we only ever want to know about outcomes of measurements on Alice's system alone, then it should be sufficient to have a quantity that describes the state of her system alone, without any reference to Bob's system. I will now derive such a quantity, which is called the reduced density operator.

Our task is the definition of a density operator $\rho_A$ for Alice's system which satisfies
$$ \langle\hat A\rangle = \mathrm{tr}_{AB}\{\hat A\otimes\mathbf 1\;\rho_{AB}\} = \mathrm{tr}_A\{\hat A\,\rho_A\} \qquad (2.47) $$
Figure 2.3: Alice and Bob hold a joint system, here composed of two particles. However, Alice does not have access to Bob's particle and vice versa.

for all observables $\hat A$. How can we construct this operator? Let us rewrite the left-hand side of Eq. (2.47):
$$ \langle\hat A\rangle = \sum_{i=1}^{N}\sum_{j=1}^{M} {}_A\langle\phi_i|\,{}_B\langle\psi_j|\;\hat A\otimes\mathbf 1\;\rho_{AB}\;|\phi_i\rangle_A|\psi_j\rangle_B $$
$$ = \sum_{i=1}^{N}\sum_{j=1}^{M} {}_A\langle\phi_i|\,\hat A\;{}_B\langle\psi_j|\rho_{AB}|\psi_j\rangle_B\;|\phi_i\rangle_A $$
$$ = \mathrm{tr}_A\Big\{\hat A \sum_{j=1}^{M} {}_B\langle\psi_j|\rho_{AB}|\psi_j\rangle_B\Big\}\; . $$
In the last step I have effectively split the trace operation into two parts. This split makes it clear that the state of Alice's system is described by
the reduced density operator
$$ \rho_A := \sum_{j=1}^{M} {}_B\langle\psi_j|\rho_{AB}|\psi_j\rangle_B = \mathrm{tr}_B\{\rho_{AB}\}\; , \qquad (2.48) $$
where the last identity defines the operation of taking the partial trace. The point is now that the reduced density operator allows Alice to describe the outcome of measurements on her particle without any reference to Bob's system, e.g. $\langle\hat A\rangle = \mathrm{tr}_A\{\hat A\,\rho_A\}$.

Given a density operator $\rho_{AB}$, you will ask how to actually calculate the reduced density operator explicitly. Let me first consider the case where $\rho_{AB}$ is a pure product state, i.e. $\rho_{AB} = |\phi_1\rangle_A|\psi_1\rangle_B\;{}_A\langle\phi_1|\,{}_B\langle\psi_1|$. Following Eq. (2.48), the partial trace is
$$ \mathrm{tr}_B\{|\phi_1\psi_1\rangle\langle\phi_1\psi_1|\} = \sum_{i=1}^{M} {}_B\langle\psi_i|\,\big(|\phi_1\psi_1\rangle\langle\phi_1\psi_1|\big)\,|\psi_i\rangle_B = |\phi_1\rangle_A\,{}_A\langle\phi_1|\;\sum_{i=1}^{M} {}_B\langle\psi_i|\psi_1\rangle_B\;{}_B\langle\psi_1|\psi_i\rangle_B = |\phi_1\rangle_A\,{}_A\langle\phi_1|\; . \qquad (2.49) $$
Now it is clear how to proceed for an arbitrary density operator of the two systems. Just write down the density operator in a product basis,
$$ \rho_{AB} = \sum_{i,k=1}^{N}\sum_{j,l=1}^{M} \alpha_{ij,kl}\;|\phi_i\rangle_A{}_A\langle\phi_k| \otimes |\psi_j\rangle_B{}_B\langle\psi_l|\; , \qquad (2.50) $$
and then use Eq. (2.49) to find
$$ \mathrm{tr}_B\{\rho_{AB}\} = \sum_{c=1}^{M} {}_B\langle\psi_c|\Big(\sum_{i,k=1}^{N}\sum_{j,l=1}^{M} \alpha_{ij,kl}\;|\phi_i\rangle_A{}_A\langle\phi_k| \otimes |\psi_j\rangle_B{}_B\langle\psi_l|\Big)|\psi_c\rangle_B $$
$$ = \sum_{c=1}^{M}\sum_{i,k=1}^{N}\sum_{j,l=1}^{M} \alpha_{ij,kl}\;|\phi_i\rangle_A{}_A\langle\phi_k|\;\big({}_B\langle\psi_c|\psi_j\rangle_B\big)\big({}_B\langle\psi_l|\psi_c\rangle_B\big) $$
$$ = \sum_{c=1}^{M}\sum_{i,k=1}^{N}\sum_{j,l=1}^{M} \alpha_{ij,kl}\;|\phi_i\rangle_A{}_A\langle\phi_k|\;\delta_{cj}\delta_{lc} $$
$$ = \sum_{i,k=1}^{N}\Big(\sum_{c=1}^{M}\alpha_{ic,kc}\Big)\,|\phi_i\rangle_A{}_A\langle\phi_k|\; . \qquad (2.51) $$

Let me illustrate this with a concrete example.
Example: Consider two systems A and B that have two energy levels each. I denote the basis states in both Hilbert spaces by $|1\rangle$ and $|2\rangle$. The basis states in the tensor product of both spaces are then $\{|11\rangle, |12\rangle, |21\rangle, |22\rangle\}$. Using this basis I write a possible state of the two systems as
$$ \rho_{AB} = \tfrac{1}{3}\big(|11\rangle\langle 11| + |11\rangle\langle 12| + |11\rangle\langle 21| + |12\rangle\langle 11| + |12\rangle\langle 12| + |12\rangle\langle 21| + |21\rangle\langle 11| + |21\rangle\langle 12| + |21\rangle\langle 21|\big)\; . \qquad (2.52) $$
Now let us take the partial trace over the second system B. This means that we collect those terms which contain states of the form $|i\rangle_B{}_B\langle i|$. Then we find
$$ \rho_A = \tfrac{2}{3}|1\rangle\langle 1| + \tfrac{1}{3}|1\rangle\langle 2| + \tfrac{1}{3}|2\rangle\langle 1| + \tfrac{1}{3}|2\rangle\langle 2|\; . \qquad (2.53) $$
This is not a pure state (check that indeed $\det\rho_A \ne 0$, whereas a pure state of a two-level system has vanishing determinant). It is not an accident that I have chosen this example, because it shows that the state of Alice's system may be a mixed state (incoherent superposition) although the total system that Alice and Bob are holding is in a pure state. To see this, just check that
$$ \rho_{AB} = |\Psi\rangle_{AB}\;{}_{AB}\langle\Psi|\; , \qquad (2.54) $$
where $|\Psi\rangle_{AB} = \tfrac{1}{\sqrt 3}\big(|11\rangle + |12\rangle + |21\rangle\big)$. This is evidently a pure state! Nevertheless, the reduced density operator Eq. (2.53) of Alice's system alone describes a mixed state!

In fact this is a very general behaviour. Whenever Alice and Bob hold a system that is in a joint pure state but that cannot be written
as a product state $|\phi\rangle\otimes|\psi\rangle$ (check this for Eq. (2.54)), the reduced density operator describing one of the subsystems represents a mixed state. Such states are called entangled states, and they have quite a lot of weird properties, some of which you will encounter in this lecture and in the exercises.

Now let us see how these ideas help us to reveal a connection between special relativity and mixed states.

2.5.3 The speed of light and mixed states

In the previous subsection you have seen that a mixed state of your system can also arise because your system is part of some larger inaccessible system. Usually the subsystem that you are holding is then in a mixed state described by a density operator. When I introduced the density operator, I told you that a particular mixed state can be realized in different ways. An example was
$$ \rho = \tfrac{1}{2}|1\rangle\langle 1| + \tfrac{1}{2}|2\rangle\langle 2| = \tfrac{1}{2}|+\rangle\langle +| + \tfrac{1}{2}|-\rangle\langle -| $$
with $|\pm\rangle = \tfrac{1}{\sqrt 2}(|1\rangle \pm |2\rangle)$. However, I pointed out that these two realizations are physically the same because you are unable to distinguish them. I did not prove this statement to you at that point. I will now correct this omission. I will show that if you were able to distinguish two realizations of the same density operator, then you could send signals faster than the speed of light, i.e. this would violate special relativity. How would that work?

Assume two persons, Alice and Bob, hold pairs of particles, where each particle is described by a two-dimensional Hilbert space. Let us assume that Alice and Bob are sitting close together and are able to access each other's particle, such that they are able to prepare the state
$$ |\psi\rangle = \tfrac{1}{\sqrt 2}\,(|11\rangle + |22\rangle)\; . \qquad (2.55) $$
Now Alice and Bob walk away from each other until they are separated by, say, one light year. Now Alice discovers the answer to a very tricky
question, which is either 'yes' or 'no', and would like to send this answer to Bob. To do this she performs one of the following measurements on her particle.

'yes': Alice measures the observable $\hat A_{yes} = |1\rangle\langle 1| + 2|2\rangle\langle 2|$. The probability to find the state $|1\rangle$ is $p_1 = \tfrac{1}{2}$, and to find $|2\rangle$ it is $p_2 = \tfrac{1}{2}$. This follows from Postulate 4, because $p_i = \mathrm{tr}\{|i\rangle\langle i|\,\rho_A\} = \tfrac{1}{2}$, where $\rho_A = \tfrac{1}{2}(|1\rangle\langle 1| + |2\rangle\langle 2|)$ is the reduced density operator describing Alice's system. After the result $|1\rangle$ the state of the total system is $|1\rangle\langle 1|\otimes\mathbf 1\,|\psi\rangle \sim |11\rangle$; after the result $|2\rangle$ the state of the total system is $|2\rangle\langle 2|\otimes\mathbf 1\,|\psi\rangle \sim |22\rangle$.

'no': Alice measures the observable $\hat A_{no} = |+\rangle\langle +| + 2|-\rangle\langle -| \ne \hat A_{yes}$. The probability to find the state $|+\rangle$ is $p_+ = \tfrac{1}{2}$, and to find $|-\rangle$ it is $p_- = \tfrac{1}{2}$. This follows from Postulate 4, because $p_\pm = \mathrm{tr}\{|\pm\rangle\langle\pm|\,\rho_A\} = \tfrac{1}{2}$, where $\rho_A = \tfrac{1}{2}(|+\rangle\langle +| + |-\rangle\langle -|)$ is the reduced density operator describing Alice's system. After the result $|+\rangle$ the state of the total system is $|+\rangle\langle +|\otimes\mathbf 1\,|\psi\rangle \sim |{+}{+}\rangle$; after the result $|-\rangle$ the state of the total system is $|-\rangle\langle -|\otimes\mathbf 1\,|\psi\rangle \sim |{-}{-}\rangle$.

What does this imply for Bob's system? As the measurement results occur with probability 50% each, Bob holds an equal mixture of the two states corresponding to these outcomes. In fact, when Alice wanted to send 'yes', then the state of Bob's particle after Alice's measurement is given by
$$ \rho_{yes} = \tfrac{1}{2}|1\rangle\langle 1| + \tfrac{1}{2}|2\rangle\langle 2| \qquad (2.56) $$
and if Alice wanted to send 'no', then the state of Bob's particle is given by
$$ \rho_{no} = \tfrac{1}{2}|+\rangle\langle +| + \tfrac{1}{2}|-\rangle\langle -|\; . \qquad (2.57) $$
You can check quite easily that $\rho_{no} = \rho_{yes}$. If Bob were able to distinguish the two realizations of the density operator, then he could find out which of the two measurements Alice has carried out. In that case he would be able to infer what Alice's answer is. He can do that immediately after Alice has carried out her measurements, which in principle requires negligible time. This would imply that Alice could
send information to Bob over a distance of a light year in virtually no time at all. This clearly violates the theory of special relativity. As we know that special relativity is an extremely well established and confirmed theory, this shows that Bob is unable to distinguish the two density operators Eqs. (2.56)-(2.57).

2.6 Generalized measurements
Chapter 3

Dynamics and Symmetries

So far we have only been dealing with stationary systems, i.e. we did not have any prescription for the time evolution of quantum mechanical systems. We were only preparing systems in particular quantum mechanical states and then subjecting them to measurements. In reality, of course, any system evolves in time, and we need to see what the quantum mechanical rules for time evolution are. This is the subject of this chapter.

3.1 The Schrödinger Equation

Most of you will know that the prescription that determines the quantum mechanical time evolution is given by the Schrödinger equation. Before I state this as the next postulate, let us consider what any decent quantum mechanical time evolution operator should satisfy, and then we will convince ourselves that the Schrödinger equation satisfies these criteria. In fact, once we have accepted these properties, we will not have too much freedom of choice for the form of the Schrödinger equation.

Definition 42 The time evolution operator $\hat U(t_2,t_1)$ maps the quantum state $|\psi(t_1)\rangle$ to the state $|\psi(t_2)\rangle$, obeying the properties
1. $\hat U(t_2,t_1)$ is unitary.
2. We have $\hat U(t_2,t_1)\,\hat U(t_1,t_0) = \hat U(t_2,t_0)$ (semi-group property).
3. $\hat U(t,t_1)$ is differentiable in $t$.

What is the reason for these assumptions? First of all, the time evolution operator has to map physical states onto physical states. That means in particular that a normalized state $|\psi(t_0)\rangle$ is mapped onto a normalized state $|\psi(t_1)\rangle$, i.e. for any $|\psi(t_0)\rangle$
$$ \langle\psi(t_0)|\psi(t_0)\rangle = \langle\psi(t_1)|\psi(t_1)\rangle = \langle\psi(t_0)|\hat U^\dagger(t_1,t_0)\,\hat U(t_1,t_0)|\psi(t_0)\rangle\; . \qquad (3.1) $$
Therefore it seems a fair guess that $\hat U(t_1,t_0)$ should be a unitary operator.

The second property of the time evolution operator demands that it makes no difference whether we first evolve the system from $t_0$ to $t_1$ and then from $t_1$ to $t_2$, or whether we evolve it directly from time $t_0$ to $t_2$. This is a very reasonable assumption. Note, however, that this does not imply that we may measure the system at the intermediate time $t_1$. In fact we must not interact with the system.

The third condition is one of mathematical convenience and of physical experience. Every system evolves continuously in time. This is an observation that is confirmed in experiments. Indeed, dynamics in physics is usually described by differential equations, which already implies that observable quantities and physical states are differentiable in time. You may then wonder what all the fuss about quantum jumps is about. The point is that we are talking about the time evolution of a closed quantum mechanical system. This means that we, e.g. a person outside the system, do not interact with the system, and in particular this implies that we are not measuring the system. We have seen that in a measurement the state of a system can indeed change discontinuously, at least according to our experimentally well tested Postulate 4. In summary, the quantum state of a closed system changes smoothly unless the system is subjected to a measurement, i.e. an interaction with an outside observer.
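A concrete family of operators with all three properties of Definition 42 is $\hat U(t_2,t_1) = e^{-i\hat H(t_2-t_1)/\hbar}$ for a Hermitean operator $\hat H$ (anticipating the Hamilton operator introduced below; the example itself is my own illustration, not from the notes). The sketch builds the matrix exponential from the eigendecomposition of a random Hermitean $H$ and checks unitarity, the semi-group property and norm conservation:

```python
import numpy as np

rng = np.random.default_rng(4)
hbar = 1.0

# a random Hermitean "Hamiltonian" on a 3-dimensional Hilbert space
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (M + M.conj().T) / 2

def U(dt):
    """Time evolution operator exp(-i H dt / hbar), built in the eigenbasis of H."""
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * dt / hbar)) @ V.conj().T

# property 1: unitarity
U1 = U(0.7)
assert np.allclose(U1.conj().T @ U1, np.eye(3))

# property 2: semi-group property U(t2,t1) U(t1,t0) = U(t2,t0)
assert np.allclose(U(0.5) @ U(0.2), U(0.7))

# norm conservation, Eq. (3.1)
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)
assert np.isclose(np.linalg.norm(U1 @ psi), 1.0)
```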
What further conclusions can we draw from the properties of the time evolution operator? Let us consider the time evolution operator for very short time differences between $t$ and $t_0$ and use the fact that it is differentiable in time (property 3 of Definition 42). Then we find
$$ \hat U(t,t_0) = \mathbf 1 - \frac{i}{\hbar}\,\hat H(t_0)\,(t - t_0) + \dots\; . \qquad (3.2) $$
Here we have used the fact that the time evolution of a closed quantum mechanical system is smooth. The operator $\hat H(t_0)$ that appears on the right-hand side of Eq. (3.2) is called the Hamilton operator of the system. Let us apply this to an initial state $|\psi(t_0)\rangle$ and take the time derivative with respect to $t$ on both sides. Then we find
$$ i\hbar\,\partial_t|\psi(t)\rangle \equiv i\hbar\,\partial_t\big(\hat U(t,t_0)|\psi(t_0)\rangle\big) \approx \hat H(t_0)|\psi(t_0)\rangle\; . \qquad (3.3) $$
If we now carry out the limit $t \to t_0$, then we finally find
$$ i\hbar\,\partial_t|\psi(t)\rangle = \hat H(t)|\psi(t)\rangle\; . \qquad (3.4) $$
This is the Schrödinger equation in the Dirac notation, in a coordinate-independent form. To make the connection to the Schrödinger equation as you know it from last year's quantum mechanics course, let us consider an

Example: Use the Hamilton operator for a particle moving in a one-dimensional potential $V(x,t)$, and write the Schrödinger equation in the position representation. This gives
$$ i\hbar\,\partial_t\langle x|\psi(t)\rangle = \langle x|\,i\hbar\,\partial_t\,|\psi(t)\rangle = \langle x|\hat H|\psi(t)\rangle = \Big(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x,t)\Big)\langle x|\psi(t)\rangle\; . $$
This is the Schrödinger equation in the position representation as you know it from the second-year course.

Therefore, from the assumptions on the properties of the time evolution operator that we made above, we were led to a differential equation for the time evolution of a state vector. Again, of course, this result has to be tested in experiments, and a (sometimes) difficult task is to find the correct Hamilton operator $\hat H$ that governs the time evolution of the system. Now let us formulate the result of our considerations in
the following

Postulate 5 The time evolution of the quantum state of an isolated quantum mechanical system is determined by the Schrödinger equation

$$i\hbar\,\partial_t|\psi(t)\rangle = \hat{H}(t)|\psi(t)\rangle , \quad (3.5)$$

where $\hat{H}$ is the Hamilton operator of the system.

As you may remember from your second year course, the Hamilton operator is the observable that determines the energy eigenvalues of the quantum mechanical system. In the present course I cannot justify this in more detail, as we want to learn a lot more things about quantum mechanics. The following discussion will, however, shed some more light on the time evolution of a quantum mechanical system, and in particular it will pave our way towards the investigation of quantum mechanical symmetries and conserved quantities.

3.1.1 The Heisenberg picture

In the preceding section we have considered the time evolution of the quantum mechanical state vector. In fact, we have put all the time evolution into the state and have left the observables time independent (unless they have some intrinsic time dependence). This way of looking at the time evolution is called the Schrödinger picture. It is not the only way to describe quantum dynamics. We can also go to the other extreme, namely leave the states time invariant and evolve the observables in time. How can we find out what the time evolution of a quantum mechanical observable is? We need some guidance, some property that must be the same in both pictures of quantum mechanics. Such a property must be experimentally observable. What I will be using here is the fact that the expectation value of any observable has to be the same in both pictures. Let us start by considering the expectation value of an operator $\hat{A}_S$ in the Schrödinger picture at time
$t$, given an initial state $|\psi(t_0)\rangle$. We find

$$\langle\hat{A}\rangle = \langle\psi(t)|\hat{A}_S|\psi(t)\rangle = \langle\psi(t_0)|\hat{U}^\dagger(t,t_0)\hat{A}_S\hat{U}(t,t_0)|\psi(t_0)\rangle . \quad (3.6)$$

Looking at Eq. (3.6) we see that we can interpret the right hand side also as the expectation value of a time-dependent operator

$$\hat{A}_H(t) = \hat{U}^\dagger(t,t_0)\hat{A}_S\hat{U}(t,t_0) \quad (3.7)$$

in the initial state $|\psi(t_0)\rangle$. That is,

$$\langle\psi(t)|\hat{A}_S|\psi(t)\rangle = \langle\hat{A}\rangle = \langle\psi(t_0)|\hat{A}_H(t)|\psi(t_0)\rangle . \quad (3.8)$$

Viewing the state of the system as time-independent, and the observables as evolving in time, is called the Heisenberg picture of quantum mechanics. As we have seen in Eq. (3.8), both the Schrödinger picture and the Heisenberg picture give exactly the same predictions for physical observables. But, depending on the problem, it can be advantageous to choose one or the other picture.

Like the states in the Schrödinger picture, the Heisenberg operators obey a differential equation. This can easily be obtained by taking the time derivative of Eq. (3.7). To see this we first need to know the time derivative of the time evolution operator. Using $|\psi(t)\rangle = \hat{U}(t,t_0)|\psi(t_0)\rangle$ in the Schrödinger equation Eq. (3.4) we find for any $|\psi(t_0)\rangle$

$$i\hbar\,\partial_t\hat{U}(t,t_0)|\psi(t_0)\rangle = \hat{H}(t)\hat{U}(t,t_0)|\psi(t_0)\rangle \quad (3.9)$$

and therefore

$$i\hbar\,\partial_t\hat{U}(t,t_0) = \hat{H}(t)\hat{U}(t,t_0) . \quad (3.10)$$

Assuming that the operator in the Schrödinger picture has no explicit time dependence, we find

$$\frac{d}{dt}\hat{A}_H(t) = \frac{d}{dt}\left(\hat{U}^\dagger(t,t_0)\hat{A}_S\hat{U}(t,t_0)\right) = \frac{\partial\hat{U}^\dagger(t,t_0)}{\partial t}\hat{A}_S\hat{U}(t,t_0) + \hat{U}^\dagger(t,t_0)\hat{A}_S\frac{\partial\hat{U}(t,t_0)}{\partial t} = \frac{i}{\hbar}\hat{U}^\dagger(t,t_0)\hat{H}(t)\hat{A}_S\hat{U}(t,t_0) - \frac{i}{\hbar}\hat{U}^\dagger(t,t_0)\hat{A}_S\hat{H}(t)\hat{U}(t,t_0)$$
$$= \frac{i}{\hbar}\hat{U}^\dagger(t,t_0)[\hat{H}(t),\hat{A}_S]\hat{U}(t,t_0) = \frac{i}{\hbar}\left([\hat{H}(t),\hat{A}_S]\right)_H \quad (3.11)$$

$$= \frac{i}{\hbar}[\hat{H}_H(t),\hat{A}_H] . \quad (3.12)$$

It is easy to check that for an operator that has an explicit time dependence in the Schrödinger picture, we find the Heisenberg equation

$$\frac{d}{dt}\hat{A}_H(t) = \frac{i}{\hbar}[\hat{H}_H(t),\hat{A}_H(t)] + \left(\frac{\partial\hat{A}_S}{\partial t}\right)_H(t) . \quad (3.13)$$

One of the advantages of the Heisenberg equation is that it has a direct analogue in classical mechanics, namely the Poisson-bracket form of Hamilton's equations of motion. In fact, this analogy can be viewed as a justification for identifying the Hamiltonian $\hat{H}$ of the system with the energy of the system. I will not go into details here; those of you who would like to know more should consult the book: H. Goldstein, Classical Mechanics, Addison-Wesley (1980).

3.2 Symmetries and Conservation Laws

After this introduction to the Heisenberg picture it is now time to discuss the concept of symmetries and their relation to conservation laws.

3.2.1 The concept of symmetry

If we are able to look at a system in different ways and it appears the same to us, then we say that the system has a symmetry. This simple statement will be put on solid ground in this section. Why are we interested in symmetries? The answer is simple: it makes our life easier. If we have a symmetry in our problem, then the solution will also have such a symmetry. This symmetry of the solution expresses itself in the existence of a quantity which is conserved for any solution to the problem. Of course we are always interested to know quantities that are conserved, and the natural question is how these conserved quantities are related to the symmetry of the problem. There are two
ways to formulate symmetries: in an active way and in a passive way. If we transform the system $S$ into a new system $S'$, then we speak of an active transformation, i.e. we actively change the system. If we change the coordinate system in which we consider the same system, then we speak of a passive transformation. In these lectures I will adopt the active point of view, because it appears to be the more natural way of speaking. An extreme example is time reversal symmetry: we can certainly prepare a system in which all momenta of the particles are reversed (the active transformation), while we will not be able to actually reverse time, which would amount to the corresponding passive transformation.

Now let us formalize what we mean by a symmetry. In the active viewpoint we mean that the systems $S$ and $S' = TS$ look the same. We have to make sure what we mean by '$S$ and $S'$ look the same'. What does this mean in quantum mechanics? The only quantities that are accessible in quantum mechanics are expectation values or, even more basic, transition probabilities. Two systems are the same in quantum mechanics if the transition probabilities between corresponding states are the same. To be more precise let us adopt the

Definition 43 A transformation $\hat{T}$ is called a symmetry transformation when for all vectors $|\phi\rangle$ and $|\psi\rangle$ in system $S$ and the corresponding vectors $|\phi'\rangle = \hat{T}|\phi\rangle$ and $|\psi'\rangle = \hat{T}|\psi\rangle$ in system $S'$ we find

$$|\langle\psi|\phi\rangle| = |\langle\psi'|\phi'\rangle| . \quad (3.14)$$

In this definition we have seen that quantum states are transformed as $|\phi'\rangle = \hat{T}|\phi\rangle$. This implies that observables, which are of the form $\hat{A} = \sum_i a_i|a_i\rangle\langle a_i|$, will be transformed according to the prescription $\hat{A}' = \hat{T}\hat{A}\hat{T}^\dagger$.

Which transformations $\hat{T}$ are possible symmetry transformations? As all transition probabilities have to be preserved, one would expect that symmetry transformations are automatically unitary transformations.
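Condition (3.14) can be illustrated numerically. The sketch below (not part of the lecture; the states and the unitary are random, made up for illustration) checks that both a unitary map and an anti-unitary map (complex conjugation followed by a unitary) preserve the modulus of the scalar product.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(n):
    """A random normalized state vector, for illustration only."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

# A random unitary from the QR decomposition of a complex Gaussian matrix
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(A)

phi, psi = random_state(3), random_state(3)

# Unitary map: preserves the scalar product itself, hence its modulus (Eq. 3.14)
assert np.isclose(abs(np.vdot(psi, phi)), abs(np.vdot(U @ psi, U @ phi)))

# Anti-unitary map: complex conjugation followed by a unitary.
# The scalar product is complex conjugated, but its modulus is preserved.
K = lambda v: U @ v.conj()
assert np.isclose(abs(np.vdot(psi, phi)), abs(np.vdot(K(psi), K(phi))))
```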
But this conclusion would be premature, for two reasons. Firstly, we did not demand that the scalar product itself is preserved, but only the absolute value of the scalar product; and secondly, we did not even demand linearity of the transformation $\hat{T}$. Fortunately it turns
out that a symmetry transformation $\hat{T}$ cannot be very much more general than a unitary transformation. This was proven in a remarkable theorem by Wigner, which states

Theorem 44 Any symmetry transformation can be represented either by a unitary transformation or by an anti-unitary transformation.

For a proof (which is rather lengthy) of this theorem you should have a look at books such as: K. Gottfried, Quantum Mechanics I, Benjamin (1966). An anti-unitary transformation has the property

$$\hat{U}(\alpha|\phi\rangle + \beta|\psi\rangle) = \alpha^*\hat{U}|\phi\rangle + \beta^*\hat{U}|\psi\rangle \quad (3.15)$$

and preserves all transition probabilities. An example of an anti-unitary symmetry is time reversal, but this will be presented later.

Of course a system will usually be symmetric under more than just one particular transformation. In fact it will be useful to consider symmetry groups. Such groups may be discrete (you can number their elements with whole numbers) or continuous (they are parametrized by a real number).

Definition 45 A continuous symmetry group is a set of symmetry transformations that can be parametrized by a real parameter $\epsilon$ such that the symmetry transformations can be differentiated with respect to this parameter.

Example: Any Hermitean operator $\hat{A}$ gives rise to a continuous symmetry group using the definition

$$\hat{U}(\epsilon) := e^{i\hat{A}\epsilon/\hbar} . \quad (3.16)$$

Obviously the operator $\hat{U}(\epsilon)$ can be differentiated with respect to the parameter $\epsilon$.

So far we have learned which transformations can be symmetry transformations: a symmetry transformation of a quantum mechanical system is either a unitary or an anti-unitary transformation. However,
for a given quantum system not every symmetry transformation is a symmetry of that quantum system. The reason for this is that it is not sufficient to demand that the configuration of the system is symmetric at a given instant in time; the symmetry must also be preserved under the dynamics of the system! Otherwise we would actually be able to distinguish two 'symmetric' states by just waiting and letting the system evolve. The invariance of the configurational symmetry under the time evolution will lead to observables that are conserved in time. As we are now talking about the time evolution of a quantum system, we realize that the Hamilton operator plays a major role in the theory of symmetries of quantum mechanical systems. In fact, this is not surprising, because the initial state (which might possess some symmetry) and the Hamilton operator determine the future of a system completely.

Assume that we have a symmetry that is preserved in time. At the initial time the symmetry transformation takes a state $|\psi\rangle$ into $|\psi'\rangle = \hat{T}|\psi\rangle$. The time evolution of the original system $S$ is given by the Hamilton operator $\hat{H}$, while that of the transformed system $S'$ is given by $\hat{H}' = \hat{T}\hat{H}\hat{T}^\dagger$. This leads to

$$|\psi(t)\rangle = e^{-i\hat{H}t/\hbar}|\psi(0)\rangle \quad (3.17)$$

for the time evolution in $S$. The time evolution of the transformed system $S'$ is governed by the Hamilton operator $\hat{H}'$, so that we find

$$|\psi'(t)\rangle = e^{-i\hat{H}'t/\hbar}|\psi'(0)\rangle . \quad (3.18)$$

However, the state $|\psi'\rangle$ could also be viewed as a quantum state of the original system $S$! This means that the time evolution of system $S$ given by the Hamilton operator $\hat{H}$ could be applied, and we would find

$$|\tilde{\psi}(t)\rangle = e^{-i\hat{H}t/\hbar}|\psi'(0)\rangle . \quad (3.19)$$

If the symmetry of the system is preserved for all times, then the two states $|\psi'(t)\rangle$ and $|\tilde{\psi}(t)\rangle$ cannot differ by more than a phase factor, i.e.

$$|\tilde{\psi}(t)\rangle = e^{i\phi}|\psi'(t)\rangle , \quad (3.20)$$

because we need to have

$$|\langle\psi'|\psi\rangle| = |\langle\psi'(t)|\psi(t)\rangle| . \quad (3.21)$$
As Eq. (3.21) has to be true for all state vectors, the Hamilton operators $\hat{H}$ and $\hat{H}'$ can differ by only a constant, which is physically unobservable (it gives rise to the same global phase $e^{i\phi}$ in all quantum states), and therefore the two Hamilton operators are essentially equal. Therefore we can conclude with

Lemma 46 A symmetry of a system $S$ that holds for all times needs to be a unitary or anti-unitary transformation $\hat{T}$ that leaves the Hamilton operator $\hat{H}$ invariant, i.e.

$$\hat{H} = \hat{H}' = \hat{T}\hat{H}\hat{T}^\dagger . \quad (3.22)$$

Given a symmetry group which can be parametrized by a parameter $\epsilon$ such that the symmetry transformations are differentiable with respect to this parameter (see the Example in this section above), we can determine the conserved observable $\hat{G}$ of the system quite easily from the relation

$$\hat{G} = \lim_{\epsilon\to 0} i\,\frac{\hat{U}(\epsilon) - \mathbf{1}}{\epsilon} . \quad (3.23)$$

The quantity $\hat{G}$ is also called the generator of the symmetry group. Why is $\hat{G}$ conserved? From Eq. (3.23) we can see that for small $\epsilon$ we can write

$$\hat{U}(\epsilon) = \mathbf{1} - i\hat{G}\epsilon + O(\epsilon^2) , \quad (3.24)$$

where $\hat{G}$ is a Hermitean operator and $O(\epsilon^2)$ means that all remaining terms contain at least a factor of $\epsilon^2$ and are therefore very small (see the first chapter for the proof). Now we know that the Hamilton operator is invariant under the symmetry transformation, i.e.

$$\hat{H} = \hat{H}' = \hat{U}(\epsilon)\hat{H}\hat{U}^\dagger(\epsilon) . \quad (3.25)$$

For small values of $\epsilon$ we find

$$\hat{H} = (\mathbf{1} - i\hat{G}\epsilon)\hat{H}(\mathbf{1} + i\hat{G}\epsilon) = \hat{H} - i[\hat{G},\hat{H}]\epsilon + O(\epsilon^2) . \quad (3.26)$$

As the equality has to be true for arbitrary but small $\epsilon$, this implies that

$$[\hat{G},\hat{H}] = 0 . \quad (3.27)$$
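The conclusion that a generator commuting with the Hamiltonian is conserved can be illustrated numerically. In the sketch below (not part of the lecture; the Hamiltonian is a random made-up Hermitean matrix, $\hbar = 1$) the observable $G$ is chosen as a polynomial in $H$, so $[G,H]=0$ holds by construction, and its expectation value stays constant under the time evolution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary Hermitean toy Hamiltonian (entries made up for illustration)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

# Any polynomial in H is a Hermitean operator commuting with H: [G, H] = 0
G = H @ H + 2 * H
assert np.allclose(G @ H - H @ G, 0)

# Time evolution |psi(t)> = exp(-i H t) |psi(0)>, via diagonalization (hbar = 1)
E, V = np.linalg.eigh(H)
def evolve(psi, t):
    return V @ (np.exp(-1j * E * t) * (V.conj().T @ psi))

psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

# <psi(t)| G |psi(t)> is constant in time, as Eq. (3.29) below asserts
expG = [np.vdot(evolve(psi0, t), G @ evolve(psi0, t)).real for t in (0.0, 0.7, 3.0)]
assert np.allclose(expG, expG[0])
```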
Now we remember the Heisenberg equation Eq. (3.13) and realize that the rate of change of the observable $\hat{G}$ in the Heisenberg picture is proportional to the commutator Eq. (3.27),

$$\frac{d\hat{G}_H}{dt} = \frac{i}{\hbar}\left([\hat{G},\hat{H}]\right)_H = 0 , \quad (3.28)$$

i.e. it vanishes. Therefore the expectation value of the observable $\hat{G}$ is constant, which amounts to saying that $\hat{G}$ is a conserved quantity, i.e.

$$\frac{d\langle\psi(t)|\hat{G}|\psi(t)\rangle}{dt} = \frac{d\langle\psi(0)|\hat{G}_H(t)|\psi(0)\rangle}{dt} = 0 . \quad (3.29)$$

Therefore we have the important

Theorem 47 The generator of a continuous symmetry group of a quantum system is a conserved quantity under the time evolution of that quantum system.

After these abstract considerations let us now consider some examples of symmetry groups.

3.2.2 Translation Symmetry and momentum conservation

Let us now explore translations of quantum mechanical systems. Again we adopt the active viewpoint of the transformations. First we need to define what we mean by translating the system by a distance $a$ to the right. Such a transformation $\hat{T}_a$ has to map the state of the system $|\psi\rangle$ into $|\psi_a\rangle = \hat{T}_a|\psi\rangle$ such that

$$\langle x|\psi_a\rangle = \langle x-a|\psi\rangle . \quad (3.30)$$

In Fig. 3.1 you can see that this is indeed a shift of the wavefunction (i.e. the system) by $a$ to the right. Which operator represents the translation operator $\hat{T}_a$? To see this, we begin with the definition Eq. (3.30). Then we use the representation of the identity operator $\mathbf{1} = \int dp\,|p\rangle\langle p|$ (see
Figure 3.1: The original wave function $\psi(x)$ (solid line) and the shifted wave function $\psi_a(x) = \psi(x-a)$. $|\psi_a\rangle$ represents a system that has been shifted to the right by $a$.

Eq. (1.131)) and Eq. (1.128). We find

$$\langle x|\hat{T}_a|\psi\rangle = \langle x-a|\psi\rangle = \int dp\,\langle x-a|p\rangle\langle p|\psi\rangle = \int dp\,\frac{1}{\sqrt{2\pi\hbar}}\,e^{ip(x-a)/\hbar}\langle p|\psi\rangle = \int dp\,\frac{1}{\sqrt{2\pi\hbar}}\,e^{-ipa/\hbar}e^{ipx/\hbar}\langle p|\psi\rangle = \int dp\,e^{-ipa/\hbar}\langle x|p\rangle\langle p|\psi\rangle = \langle x|\left(\int dp\,e^{-ipa/\hbar}|p\rangle\langle p|\right)|\psi\rangle = \langle x|e^{-i\hat{p}a/\hbar}|\psi\rangle . \quad (3.31)$$

As this is true for any $|\psi\rangle$, we see that the translation operator has the form

$$\hat{T}_a = e^{-i\hat{p}a/\hbar} . \quad (3.32)$$

This means that the momentum operator is the generator of translations. From Eq. (3.23) it follows then that in a translation invariant
system momentum is conserved, i.e.

$$\frac{d\langle\psi(t)|\hat{p}|\psi(t)\rangle}{dt} = 0 .$$

Consider the Hamilton operator of a particle in a potential,

$$\hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x}) . \quad (3.33)$$

The kinetic term $\hat{p}^2/2m$ is obviously translation invariant, but a translation invariant potential has to satisfy $V(x) = V(x+a)$ for all $a$, which means that it is a constant.

3.2.3 Rotation Symmetry and angular momentum conservation

Another very important symmetry is rotation symmetry. To consider rotation symmetry it is necessary to go to three-dimensional systems. Let us first consider rotations $\hat{R}_z(\alpha)$ by an angle $\alpha$ around the z-axis. This is defined as

$$\psi_\alpha(r,\phi,\theta) = \psi(r,\phi-\alpha,\theta) . \quad (3.34)$$

Here we have written the wavefunction in polar coordinates

$$x_1 = r\cos\phi\sin\theta , \quad (3.35)$$
$$x_2 = r\sin\phi\sin\theta , \quad (3.36)$$
$$x_3 = r\cos\theta . \quad (3.37)$$

In Fig. 3.2 we can see that the transformation Eq. (3.34) indeed rotates a wave-function by an angle $\alpha$ counter-clockwise. As translations are generated by the momentum operator, we expect rotations to be generated by the angular momentum. Let us confirm our suspicion. The angular momentum of a particle is given by

$$\hat{\vec{l}} = \hat{\vec{x}}\times\hat{\vec{p}} = (\hat{x}_2\hat{p}_3 - \hat{x}_3\hat{p}_2)\vec{e}_1 + (\hat{x}_3\hat{p}_1 - \hat{x}_1\hat{p}_3)\vec{e}_2 + (\hat{x}_1\hat{p}_2 - \hat{x}_2\hat{p}_1)\vec{e}_3 , \quad (3.38, 3.39)$$
Figure 3.2: The original wave function $\psi(r,\phi,\theta)$ (solid line) and the rotated wave function $\psi_\alpha(r,\phi,\theta) = \psi(r,\phi-\alpha,\theta)$. $|\psi_\alpha\rangle$ represents a system that has been rotated counter-clockwise around the $x_3$-axis.

where $\times$ is the ordinary vector product, $\hat{\vec{p}} = \hat{p}_1\vec{e}_1 + \hat{p}_2\vec{e}_2 + \hat{p}_3\vec{e}_3$ and $\hat{\vec{x}} = \hat{x}_1\vec{e}_1 + \hat{x}_2\vec{e}_2 + \hat{x}_3\vec{e}_3$, with $[\hat{x}_i,\hat{p}_j] = i\hbar\,\delta_{ij}$.

As we are considering rotations around the z-axis, i.e. $\vec{e}_3$, we will primarily be interested in the $\hat{l}_3$ component of the angular momentum operator. In the position representation we find

$$\hat{l}_3 = \frac{\hbar}{i}\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right) . \quad (3.40)$$

This can be converted to polar coordinates by using the chain rule,

$$\frac{\partial}{\partial\phi} = \frac{\partial x_1}{\partial\phi}\frac{\partial}{\partial x_1} + \frac{\partial x_2}{\partial\phi}\frac{\partial}{\partial x_2} = \frac{\partial(r\cos\phi\sin\theta)}{\partial\phi}\frac{\partial}{\partial x_1} + \frac{\partial(r\sin\phi\sin\theta)}{\partial\phi}\frac{\partial}{\partial x_2} = -r\sin\phi\sin\theta\,\frac{\partial}{\partial x_1} + r\cos\phi\sin\theta\,\frac{\partial}{\partial x_2} = -x_2\frac{\partial}{\partial x_1} + x_1\frac{\partial}{\partial x_2}$$
$$= \frac{i}{\hbar}\hat{l}_3 . \quad (3.41)$$

Now we can proceed to find the operator that generates rotations by deriving a differential equation for it:

$$\frac{\partial}{\partial\alpha}\langle x|\hat{R}_z(\alpha)|\psi\rangle = \frac{\partial}{\partial\alpha}\langle r,\phi-\alpha,\theta|\psi\rangle = -\frac{\partial}{\partial\phi}\langle r,\phi-\alpha,\theta|\psi\rangle = -\frac{\partial}{\partial\phi}\langle x|\hat{R}_z(\alpha)|\psi\rangle = -\frac{i}{\hbar}\langle x|\hat{l}_3\hat{R}_z(\alpha)|\psi\rangle . \quad (3.42)$$

As this is true for all wave functions $|\psi\rangle$, we have the differential equation for the rotation operator

$$\frac{\partial\hat{R}_z(\alpha)}{\partial\alpha} = -\frac{i}{\hbar}\hat{l}_3\hat{R}_z(\alpha) . \quad (3.43)$$

With the initial condition $\hat{R}_z(0) = \mathbf{1}$, the solution to this differential equation is given by

$$\hat{R}_z(\alpha) = e^{-i\hat{l}_3\alpha/\hbar} = e^{-i\hat{\vec{l}}\cdot\vec{e}_3\,\alpha/\hbar} . \quad (3.44)$$

We obtain a rotation around an arbitrary axis $\vec{n}$ by

$$\hat{R}_{\vec{n}}(\alpha) = e^{-i\hat{\vec{l}}\cdot\vec{n}\,\alpha/\hbar} . \quad (3.45)$$

Any system that is invariant under arbitrary rotations around an axis $\vec{n}$ preserves the component of the angular momentum in that direction, i.e. it preserves $\hat{\vec{l}}\cdot\vec{n}$. As the kinetic energy of a particle is invariant under any rotation (check!), the angular momentum component $\hat{\vec{l}}\cdot\vec{n}$ is preserved if the particle is in a potential that is symmetric under rotations around the axis $\vec{n}$. The electron in a hydrogen atom has the Hamilton operator

$$\hat{H} = \frac{\hat{\vec{p}}^2}{2m} + V(|\hat{\vec{x}}|) . \quad (3.46)$$
Rotations leave the length of a vector invariant, and therefore $|\hat{\vec{x}}|$ and with it $V(|\hat{\vec{x}}|)$ are invariant under rotations around any axis. This means that every component of the angular momentum operator is preserved, i.e. the total angular momentum is preserved.

3.3 General properties of angular momenta

In the preceding section we have investigated some symmetry transformations, the generators of these symmetry transformations and some of their properties. Symmetries are important, and perhaps the most important of all is the concept of rotation symmetry and its generator, the angular momentum. In the previous section we considered the specific example of the orbital angular momentum, which we could develop from the classical angular momentum using the correspondence principle. However, there are manifestations of angular momentum in quantum mechanics that have no classical counterpart, the spin of the electron being the most important example. In the following I would like to develop the theory of the quantum mechanical angular momentum in general, introducing the notion of group representations.

3.3.1 Rotations

Whenever we aim to generalize a classical quantity to the quantum domain, we first need to investigate it carefully to find out those properties that are most characteristic of it. Then we use these basic properties as a definition which will guide us in our quest to find the correct quantum mechanical operator. Correctness, of course, needs to be tested experimentally, because nice mathematics does not necessarily describe nature, even if the mathematics itself is consistent.

In the case of rotations I will not look at rotations about arbitrary angles, but rather at rotations about very small, infinitesimal angles. This will be most convenient for establishing some of the basic properties of rotations. As we have done in all our discussion of symmetries, we adopt the viewpoint of active rotations, i.e. the system is rotated while the coordinate system remains unchanged. A rotation around an axis $\vec{n}$
by a positive angle $\phi$ is one that follows the right hand rule (thumb in the direction of the axis of rotation).

In three-dimensional space rotations are given by real $3\times 3$ matrices. A rotation around the x-axis by an angle $\phi$ is given by

$$R_x(\phi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix} . \quad (3.47)$$

The rotation around the y-axis is given by

$$R_y(\phi) = \begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix} \quad (3.48)$$

and the rotation about the z-axis is given by

$$R_z(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} . \quad (3.49)$$

Using the expansions of $\sin$ and $\cos$ for small angles,

$$\sin\epsilon = \epsilon + O(\epsilon^3) , \qquad \cos\epsilon = 1 - \frac{\epsilon^2}{2} + O(\epsilon^4) , \quad (3.50)$$

we may now expand the rotation matrices Eqs. (3.47-3.49) up to second order in the small angle $\epsilon$. We find

$$R_x(\epsilon) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1-\frac{\epsilon^2}{2} & -\epsilon \\ 0 & \epsilon & 1-\frac{\epsilon^2}{2} \end{pmatrix} , \quad (3.51)$$

$$R_y(\epsilon) = \begin{pmatrix} 1-\frac{\epsilon^2}{2} & 0 & \epsilon \\ 0 & 1 & 0 \\ -\epsilon & 0 & 1-\frac{\epsilon^2}{2} \end{pmatrix} , \quad (3.52)$$

$$R_z(\epsilon) = \begin{pmatrix} 1-\frac{\epsilon^2}{2} & -\epsilon & 0 \\ \epsilon & 1-\frac{\epsilon^2}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} . \quad (3.53)$$
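The rotation matrices Eqs. (3.47-3.49) and the infinitesimal combination derived below (Eq. (3.55)) can be checked numerically; the sketch below is an illustration, not part of the lecture.

```python
import numpy as np

def Rx(p):
    c, s = np.cos(p), np.sin(p)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])   # Eq. (3.47)

def Ry(p):
    c, s = np.cos(p), np.sin(p)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])   # Eq. (3.48)

def Rz(p):
    c, s = np.cos(p), np.sin(p)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])   # Eq. (3.49)

# Rotations are orthogonal matrices: R^T R = 1
assert np.allclose(Rx(0.4).T @ Rx(0.4), np.eye(3))
# Rotations about the same axis commute ...
assert np.allclose(Rz(0.3) @ Rz(0.5), Rz(0.5) @ Rz(0.3))
# ... but about different axes they do not; to third order in eps,
# Rx(-eps) Ry(-eps) Rx(eps) Ry(eps) = Rz(eps^2)  (Eq. 3.55)
eps = 1e-4
lhs = Rx(-eps) @ Ry(-eps) @ Rx(eps) @ Ry(eps)
assert np.allclose(lhs, Rz(eps**2), atol=1e-10)
```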
We know already from experience that rotations around the same axis commute, while rotations around different axes generally do not commute. This simple fact will allow us to derive the canonical commutation relations between the angular momentum operators.

To see the effect of the non-commutativity of rotations about different axes, we compute a special combination of rotations. First rotate the system by an infinitesimal angle $\epsilon$ about the y-axis and then by the same angle about the x-axis:

$$R_x(\epsilon)R_y(\epsilon) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1-\frac{\epsilon^2}{2} & -\epsilon \\ 0 & \epsilon & 1-\frac{\epsilon^2}{2} \end{pmatrix}\begin{pmatrix} 1-\frac{\epsilon^2}{2} & 0 & \epsilon \\ 0 & 1 & 0 \\ -\epsilon & 0 & 1-\frac{\epsilon^2}{2} \end{pmatrix} = \begin{pmatrix} 1-\frac{\epsilon^2}{2} & 0 & \epsilon \\ \epsilon^2 & 1-\frac{\epsilon^2}{2} & -\epsilon \\ -\epsilon & \epsilon & 1-\epsilon^2 \end{pmatrix} . \quad (3.54)$$

Now we calculate

$$R_x(-\epsilon)R_y(-\epsilon)R_x(\epsilon)R_y(\epsilon) = \begin{pmatrix} 1 & -\epsilon^2 & 0 \\ \epsilon^2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} + O(\epsilon^3) = R_z(\epsilon^2) + O(\epsilon^3) . \quad (3.55)$$

This is the relation that we will use to derive the commutation relations between the angular momentum operators.

3.3.2 Group representations and angular momentum commutation relations

We already know that there are many different angular momenta in quantum mechanics; orbital angular momentum and spin are just two
examples. They cannot all be represented by $3\times 3$ matrices as the rotations in the previous section were. Nevertheless, all these angular momenta have some common properties that justify their classification as angular momenta. This will lead us to a discussion of groups and their representations.

We have already seen in the first section on vector spaces what a group is. Let us repeat the properties of a group $G$ with elements $x, y, z$:

1. $\forall x,y\in G:\ x\cdot y = z\in G$
2. $\exists\,1\in G:\ \forall x\in G:\ x\cdot 1 = x$
3. $\forall x\in G:\ \exists\,x^{-1}\in G:\ x^{-1}\cdot x = 1$ and $x\cdot x^{-1} = 1$
4. $\forall x,y,z\in G:\ x\cdot(y\cdot z) = (x\cdot y)\cdot z$

Note that we did not demand that the group elements commute under the group operation. The reason is that we intend to investigate rotations, which are surely not commutative. The fundamental group that we are concerned with is the group of rotations in three-dimensional space, i.e. the group of real orthogonal $3\times 3$ matrices with determinant one. It represents all our intuitive knowledge about rotations. Every other group of operators on wavefunctions that we would like to call a group of rotations will have to share the basic properties of the group of three-dimensional rotations. This idea is captured in the notion of a group representation.

Definition 48 A representation of a group $G$ with elements $x$ is a group $S$ of unitary operators $D(x)$ such that there is a map $T: G\to S$, associating with every $x\in G$ an operator $D(x)\in S$, such that the group operation is preserved, i.e. for all $x,y\in G$ with $x\cdot y = z$ we have

$$D(x)D(y) = D(z) . \quad (3.56)$$

This means that both groups $G$ and $S$ are essentially equal, because their elements share the same relations with each other. If we now have a representation of the rotation group, then it is natural to say that the operators that make up this representation are rotation operators.
In quantum mechanics a rotation has to be represented by a unitary operator acting on a state vector. For rotations around a given axis $\vec{n}$ these unitary operators have to form a set of operators $\hat{U}(\phi)$ that is parametrized by a single parameter, the angle $\phi$. As we know that the operators are unitary, they have to be of the form

$$\hat{U}(\phi) = e^{-i\hat{J}_{\vec{n}}\phi/\hbar}$$

with a Hermitean operator $\hat{J}_{\vec{n}}$, where we have introduced $\hbar$ for convenience. The operator $\hat{J}_{\vec{n}}$ is the angular momentum operator along the direction $\vec{n}$. In general we can define the angular momentum operator by $\hat{\vec{J}} = \hat{J}_x\vec{e}_x + \hat{J}_y\vec{e}_y + \hat{J}_z\vec{e}_z$, which yields $\hat{J}_{\vec{n}} = \hat{\vec{J}}\cdot\vec{n}$. Now let us find out what the commutation relations between the different components of the angular momentum operator are. We use the fact that we have a representation of the rotation group, i.e. Eq. (3.56) is satisfied. Eq. (3.55) then implies that for an infinitesimal rotation we have the operator equation

$$e^{i\hat{J}_x\epsilon/\hbar}\,e^{i\hat{J}_y\epsilon/\hbar}\,e^{-i\hat{J}_x\epsilon/\hbar}\,e^{-i\hat{J}_y\epsilon/\hbar} = e^{-i\hat{J}_z\epsilon^2/\hbar} .$$

Expanding both sides to second order in $\epsilon$, we find

$$\mathbf{1} + [\hat{J}_y,\hat{J}_x]\,\epsilon^2/\hbar^2 = \mathbf{1} - i\hat{J}_z\,\epsilon^2/\hbar ,$$

which gives rise to the commutation relation

$$[\hat{J}_x,\hat{J}_y] = i\hbar\hat{J}_z . \quad (3.57)$$

In the same way one can obtain the commutation relations between any other components of the angular momentum operator, which results in

$$[\hat{J}_i,\hat{J}_j] = i\hbar\,\epsilon_{ijk}\hat{J}_k . \quad (3.58)$$

This fundamental commutation relation is perhaps the most important feature of the quantum mechanical angular momentum and can be used as its definition.
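The commutation relation (3.58) is easy to verify in the spin-1/2 representation, where $\hat{J}_i = \frac{\hbar}{2}\sigma_i$ with the Pauli matrices $\sigma_i$. A quick numerical check (not part of the lecture, $\hbar = 1$):

```python
import numpy as np

hbar = 1.0
# Pauli matrices and the spin-1/2 angular momentum operators J_i = (hbar/2) sigma_i
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Jx, Jy, Jz = (hbar / 2) * sx, (hbar / 2) * sy, (hbar / 2) * sz

def comm(A, B):
    return A @ B - B @ A

# Eq. (3.57)/(3.58): [Jx, Jy] = i hbar Jz, and cyclic permutations
assert np.allclose(comm(Jx, Jy), 1j * hbar * Jz)
assert np.allclose(comm(Jy, Jz), 1j * hbar * Jx)
assert np.allclose(comm(Jz, Jx), 1j * hbar * Jy)
```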
3.3.3 Angular momentum eigenstates

Using the basic commutation relations of the angular momentum operators, Eq. (3.58), we will now derive the structure of the eigenvectors of the angular momentum operators. Eq. (3.58) already shows that we will not be able to find simultaneous eigenvectors of all three components of the angular momentum. However, the square of the magnitude of the angular momentum, $\hat{J}^2 = \hat{J}_x^2 + \hat{J}_y^2 + \hat{J}_z^2$, commutes with each of its components. In the following we will concentrate on the z-component of the angular momentum, for which we find

$$[\hat{J}^2,\hat{J}_z] = 0 . \quad (3.59)$$

As the two operators commute, we can find simultaneous eigenvectors for them, which we denote by $|j,m\rangle$. We know that $\hat{J}^2$ is a positive operator, and for convenience we choose

$$\hat{J}^2|j,m\rangle = \hbar^2 j(j+1)|j,m\rangle . \quad (3.60)$$

By varying $j$ from zero to infinity the expression $j(j+1)$ can take any positive value from zero to infinity. The slightly peculiar form for the eigenvalues of $\hat{J}^2$ has been chosen to make all the following expressions as transparent as possible. $\hat{J}_z$ is not known to be positive (in fact it is not), and we have the eigenvalue equation

$$\hat{J}_z|j,m\rangle = \hbar m|j,m\rangle . \quad (3.61)$$

Given one eigenvector of the angular momentum, we would like to have a way to generate another one. This is done by the ladder operators for the angular momentum. They are defined as

$$\hat{J}_\pm = \hat{J}_x \pm i\hat{J}_y . \quad (3.62)$$

To understand why they are called ladder operators, we first derive their commutation relations with $\hat{J}_z$. They are

$$[\hat{J}_z,\hat{J}_\pm] = \pm\hbar\hat{J}_\pm . \quad (3.63)$$
Using these commutation relations, we find

$$\hat{J}_z(\hat{J}_+|j,m\rangle) = (\hat{J}_+\hat{J}_z + \hbar\hat{J}_+)|j,m\rangle = (\hat{J}_+\hbar m + \hbar\hat{J}_+)|j,m\rangle = \hbar(m+1)\,\hat{J}_+|j,m\rangle . \quad (3.64)$$

This implies that

$$\hat{J}_+|j,m\rangle = \alpha_+(j,m)|j,m+1\rangle , \quad (3.65)$$

i.e. $\hat{J}_+$ generates a new eigenvector of $\hat{J}_z$ with an eigenvalue increased by one unit, where $\alpha_+(j,m)$ is a complex number. Likewise we find that

$$\hat{J}_-|j,m\rangle = \alpha_-(j,m)|j,m-1\rangle , \quad (3.66)$$

i.e. $\hat{J}_-$ generates a new eigenvector of $\hat{J}_z$ with an eigenvalue decreased by one unit, where $\alpha_-(j,m)$ is a complex number. Now we have to establish the limits within which $m$ can range. Can $m$ take any value, or are there bounds given by $j$? To answer this we need to consider the positive operator

$$\hat{J}^2 - \hat{J}_z^2 = \hat{J}_x^2 + \hat{J}_y^2 . \quad (3.67)$$

Applying it to an eigenvector, we obtain

$$\langle j,m|(\hat{J}^2 - \hat{J}_z^2)|j,m\rangle = \hbar^2\left(j(j+1) - m^2\right) \geq 0 . \quad (3.68)$$

We conclude that the value of $m$ is bounded by

$$|m| \leq \sqrt{j(j+1)} . \quad (3.69)$$

While from Eqs. (3.65-3.66) we can see that the ladder operators allow us to increase and decrease the value of $m$ by one unit, Eq. (3.69) implies that there has to be a limit to this increase and decrease. In fact, this implies that for the largest value $m_{max}$ we have

$$\hat{J}_+|j,m_{max}\rangle = 0 , \quad (3.70)$$

while for the smallest value $m_{min}$ we have

$$\hat{J}_-|j,m_{min}\rangle = 0 . \quad (3.71)$$
This implies that we have

$$\hat{J}_-\hat{J}_+|j,m_{max}\rangle = 0 , \quad (3.72)$$
$$\hat{J}_+\hat{J}_-|j,m_{min}\rangle = 0 . \quad (3.73)$$

To determine $m_{max}$ and $m_{min}$ we need to express $\hat{J}_-\hat{J}_+$ and $\hat{J}_+\hat{J}_-$ in terms of the operators $\hat{J}^2$ and $\hat{J}_z$. This can easily be done by direct calculation, which gives

$$\hat{J}_+\hat{J}_- = (\hat{J}_x+i\hat{J}_y)(\hat{J}_x-i\hat{J}_y) = \hat{J}_x^2+\hat{J}_y^2-i[\hat{J}_x,\hat{J}_y] = \hat{J}^2-\hat{J}_z^2+\hbar\hat{J}_z \quad (3.75)$$

and

$$\hat{J}_-\hat{J}_+ = (\hat{J}_x-i\hat{J}_y)(\hat{J}_x+i\hat{J}_y) = \hat{J}_x^2+\hat{J}_y^2+i[\hat{J}_x,\hat{J}_y] = \hat{J}^2-\hat{J}_z^2-\hbar\hat{J}_z . \quad (3.76)$$

Using these expressions we find for $m_{max}$

$$0 = \hat{J}_-\hat{J}_+|j,m_{max}\rangle = (\hat{J}^2-\hat{J}_z^2-\hbar\hat{J}_z)|j,m_{max}\rangle = \hbar^2\left(j(j+1)-m_{max}(m_{max}+1)\right)|j,m_{max}\rangle . \quad (3.77)$$

This implies that

$$m_{max} = j . \quad (3.78)$$

Likewise we find

$$0 = \hat{J}_+\hat{J}_-|j,m_{min}\rangle = (\hat{J}^2-\hat{J}_z^2+\hbar\hat{J}_z)|j,m_{min}\rangle = \hbar^2\left(j(j+1)-m_{min}(m_{min}-1)\right)|j,m_{min}\rangle . \quad (3.79)$$

This implies that

$$m_{min} = -j . \quad (3.80)$$

We have to be able to go in steps of one unit from the maximal value $m_{max}$ down to $m_{min}$. This implies that $2j$ is a whole number, and therefore we find that

$$j = \frac{n}{2} \quad \text{with } n\in\mathbb{N} . \quad (3.81)$$

Every real quantum mechanical particle can be placed in one of two classes. Either it has integral angular momentum, $0, 1, \ldots$ (an example
is the orbital angular momentum discussed in the previous section) and is called a boson, or it has half-integral angular momentum, $\frac{1}{2}, \frac{3}{2}, \ldots$ (the electron spin is an example for such a particle) and is called a fermion.

Finally we would like to determine the constants $\alpha_\pm(j,m)$ in

$$\hat{J}_\pm|j,m\rangle = \alpha_\pm(j,m)|j,m\pm 1\rangle . \quad (3.82)$$

This is easily done by calculating the norm of both sides of Eq. (3.82):

$$|\alpha_+(j,m)|^2 = \|\hat{J}_+|j,m\rangle\|^2 = \langle j,m|(\hat{J}^2-\hat{J}_z^2-\hbar\hat{J}_z)|j,m\rangle = \hbar^2\left(j(j+1)-m(m+1)\right) . \quad (3.83)$$

Analogously we find

$$|\alpha_-(j,m)|^2 = \|\hat{J}_-|j,m\rangle\|^2 = \langle j,m|(\hat{J}^2-\hat{J}_z^2+\hbar\hat{J}_z)|j,m\rangle = \hbar^2\left(j(j+1)-m(m-1)\right) . \quad (3.84)$$

We choose the phases of the states $|j,m\rangle$ in such a way that $\alpha_\pm(j,m)$ is always positive, so that we obtain

$$\hat{J}_+|j,m\rangle = \hbar\sqrt{j(j+1)-m(m+1)}\,|j,m+1\rangle , \quad (3.85)$$
$$\hat{J}_-|j,m\rangle = \hbar\sqrt{j(j+1)-m(m-1)}\,|j,m-1\rangle . \quad (3.86)$$

3.4 Addition of Angular Momenta

In the previous section I have reviewed the properties of the angular momentum of a single particle. Often, however, you are actually dealing with a quantum system consisting of more than one particle, e.g. a hydrogen atom, and you may face a situation where more than one of those particles possesses angular momentum. Even a single particle may have more than one angular momentum. An electron in a central potential, for example, has an orbital angular momentum as well as an internal angular momentum, namely the spin. Why do we need to add these angular momenta? To see this, consider the example of an electron in a central potential.
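Before moving on, the whole construction of the previous subsection can be checked numerically: from Eqs. (3.61), (3.85) and (3.86) alone one can build explicit matrices for any spin $j$ and verify that they reproduce the commutation relation (3.58) and the termination of the ladder. The sketch below (not part of the lecture, $\hbar = 1$) does this for $j = 3/2$.

```python
import numpy as np

hbar = 1.0

def angular_momentum(j):
    """Matrices for Jz, J+, J- in the basis |j, m>, m = j, j-1, ..., -j,
    built directly from Eqs. (3.61), (3.85) and (3.86)."""
    m = np.arange(j, -j - 1, -1)          # m = j, ..., -j
    Jz = hbar * np.diag(m)
    # <j, m+1| J+ |j, m> = hbar * sqrt(j(j+1) - m(m+1))
    off = hbar * np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1))
    Jp = np.diag(off, k=1)
    return Jz, Jp, Jp.T.conj()

j = 3 / 2
Jz, Jp, Jm = angular_momentum(j)
Jx, Jy = (Jp + Jm) / 2, (Jp - Jm) / (2 * 1j)

# The matrices reproduce the fundamental commutation relation (3.58)
assert np.allclose(Jx @ Jy - Jy @ Jx, 1j * hbar * Jz)
# J^2 = hbar^2 j(j+1) * identity on the spin-j multiplet (Eq. 3.60)
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
assert np.allclose(J2, hbar**2 * j * (j + 1) * np.eye(int(2 * j + 1)))
# The ladder terminates: J+ |j, j> = 0 and J- |j, -j> = 0 (Eqs. 3.70, 3.71)
top = np.zeros(int(2 * j + 1)); top[0] = 1
bottom = np.zeros(int(2 * j + 1)); bottom[-1] = 1
assert np.allclose(Jp @ top, 0)
assert np.allclose(Jm @ bottom, 0)
```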
The Hamilton operator of a particle in a central potential is given by

$$\hat{H}_0 = \frac{\hat{\vec{p}}^2}{2m} + V(|\hat{\vec{x}}|) , \quad (3.87)$$

where $\hat{\vec{p}}$ is the linear momentum of the particle, $m$ its mass and $V(|\hat{\vec{x}}|)$ is the operator describing the central potential. From subsection 3.2.3 we know that under this Hamilton operator the orbital angular momentum is preserved, because the Hamilton operator is invariant under rotations around an arbitrary axis. However, if you do experiments, then you will quickly realize that the Hamilton operator $\hat{H}_0$ is only a first approximation to the correct Hamilton operator of an electron in a central potential (you may have heard about that in the 2nd year atomic physics course). The reason is that the electron possesses an internal angular momentum, the spin, and therefore there have to be additional terms in the Hamilton operator. These additional terms can be derived from a relativistic theory of the electron (the Dirac equation), but here I will just give a heuristic argument for one of them, because the full treatment of the Dirac equation would take far too long. Intuitively, an electron in a central potential rotates around the origin and therefore creates a circular current. Such a circular current gives rise to a magnetic field. The spin of the electron, on the other hand, gives rise to a magnetic moment whose orientation depends on the orientation of the spin, which is either up or down. This means that the electron has different energies in a magnetic field $\vec{B}$ depending on the orientation of its spin. This energy is given by $-\mu_0\hat{\vec{S}}\cdot\vec{B}$, where $-\mu_0\hat{\vec{S}}$ is the magnetic moment of the electron and $\hat{\vec{S}} = \frac{\hbar}{2}(\hat\sigma_1\vec{e}_1 + \hat\sigma_2\vec{e}_2 + \hat\sigma_3\vec{e}_3)$ is the electron spin operator.
The magnetic field created by the rotating electron is proportional to the orbital angular momentum (the higher the angular momentum, the higher the current induced by the rotating electron and therefore the higher the magnetic field), so we find that the additional part of the Hamilton operator Eq. (3.87) is given by

$$\hat{H}_1 = -\xi\,\hat{S}\cdot\hat{L} = -\xi\left(\hat{S}_1\otimes\hat{L}_1 + \hat{S}_2\otimes\hat{L}_2 + \hat{S}_3\otimes\hat{L}_3\right) , \qquad (3.88)$$

where $\xi$ is a constant. Note that the operators $\hat{S}$ and $\hat{L}$ act on two
different Hilbert spaces: $\hat{S}$ acts on the 2-dimensional Hilbert space of the electron spin and $\hat{L}$ on the Hilbert space describing the motion of the electron.

Under the Hamilton operator Eq. (3.87) the spin was a constant of motion, as the spin operator $\hat{S}$ evidently commutes with the Hamiltonian Eq. (3.87), i.e. $[\hat{H}_0, \hat{S}_i] = 0$. For the total Hamilton operator $\hat{H} = \hat{H}_0 + \hat{H}_1$, however, neither the orbital angular momentum $\hat{L}$ nor the spin angular momentum $\hat{S}$ is a constant of motion anymore. This can easily be seen by checking that the commutators $[\hat{L}_i, \hat{H}]$ and $[\hat{S}_i, \hat{H}]$ are now non-vanishing. See for example that

$$[\hat{L}_1, \hat{H}] = [\hat{L}_1, \hat{H}_1] = [\hat{L}_1, -\xi\,\hat{S}\cdot\hat{L}] = -\xi\,[\hat{L}_1, \sum_{i=1}^3 \hat{S}_i \hat{L}_i] = -i\hbar\xi\,\hat{S}_2\hat{L}_3 + i\hbar\xi\,\hat{S}_3\hat{L}_2 \qquad (3.89)$$

and

$$[\hat{S}_1, \hat{H}] = [\hat{S}_1, \hat{H}_1] = -\xi\,[\hat{S}_1, \sum_{i=1}^3 \hat{S}_i \hat{L}_i] = -i\hbar\xi\,\hat{S}_3\hat{L}_2 + i\hbar\xi\,\hat{S}_2\hat{L}_3 ,$$

both of which are evidently non-zero. However, the sum of the two angular momentum operators, $\hat{J} = \hat{L} + \hat{S}$, is a conserved quantity, because all its components commute with the Hamiltonian, i.e.

$$[\hat{L}_i + \hat{S}_i, \hat{H}] = 0 . \qquad (3.90)$$

You may check this easily for the first component $\hat{J}_1$, using the two commutators above: their sum vanishes.

We know a basis of eigenstates of the orbital angular momentum and also one for the spin angular momentum. However, as the angular momenta are separately no longer conserved, such a choice
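With $\hbar = 1$, $\xi = 1$, $l = 1$ and $s = \frac{1}{2}$ these commutators are easy to check numerically. The following sketch (assuming NumPy; I put the orbital factor first in the tensor products, which is just a convention) verifies that $\hat{L}_1$ alone is not conserved while every component of $\hat{J} = \hat{L} + \hat{S}$ commutes with $\hat{H}_1$:

```python
import numpy as np

# Spin-1/2 operators S_i = (1/2) * Pauli matrices (hbar = 1)
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
S = [sx, sy, sz]

# Orbital l = 1 operators in the basis |1,1>, |1,0>, |1,-1>
Lz = np.diag([1.0, 0.0, -1.0])
Lp = np.diag([np.sqrt(2), np.sqrt(2)], k=1)      # ladder operator L+
Lx = (Lp + Lp.T) / 2
Ly = (Lp - Lp.T) / (2 * 1j)
L = [Lx, Ly, Lz]

I2, I3 = np.eye(2), np.eye(3)
# Spin-orbit coupling H1 = -xi * S.L on the 3*2 = 6 dimensional space (xi = 1)
H1 = -sum(np.kron(Li, Si) for Li, Si in zip(L, S))

# L1 and S1 on their own do not commute with H1 ...
L1tot = np.kron(Lx, I2)
assert np.linalg.norm(L1tot @ H1 - H1 @ L1tot) > 1e-6

# ... but every component of J = L + S does
for Li, Si in zip(L, S):
    Ji = np.kron(Li, I2) + np.kron(I3, Si)
    assert np.allclose(Ji @ H1 - H1 @ Ji, 0)
```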
of eigenstates is quite inconvenient, because eigenstates of the angular momenta $\hat{L}$ and $\hat{S}$ are not eigenstates of the total Hamilton operator $\hat{H} = \hat{H}_0 + \hat{H}_1$ anymore. Of course it would be much more convenient to find a basis that is composed of simultaneous eigenvectors of the total angular momentum and the Hamilton operator. This is the aim of the following subsection, which describes a general procedure for constructing the eigenvectors of the total angular momentum from the eigenstates of orbital angular momentum and spin. As different components of the total angular momentum do not commute, we can of course only find joint eigenstates of the total angular momentum operator $\hat{J}^2$, its z-component $\hat{J}_z$ and the total Hamilton operator $\hat{H}_0 + \hat{H}_1$.

3.4.1 Two angular momenta

First I will present the general idea behind the addition of two angular momenta, represented by the operators $\hat{j}^{(1)}$ for the first particle and $\hat{j}^{(2)}$ for the second particle. Then I will give the simplest possible explicit example. Note that I am writing $\hat{j}^{(1)}$ instead of the more precise $\hat{j}^{(1)}\otimes 1$ to shorten the equations a little bit.

Now I would like to consider the total angular momentum

$$\hat{J} := \hat{j}^{(1)} + \hat{j}^{(2)} . \qquad (3.91)$$

Here the upper bracketed index indicates on which particle the operator is acting.

As in the previous section, $\hat{J}^2$ commutes with each of its components (check e.g. that $[\hat{J}^2, \hat{J}_z] = [\hat{J}^2, \hat{j}_z^{(1)} + \hat{j}_z^{(2)}] = 0$). Therefore I would like to find a set of joint eigenvectors of $\hat{J}^2$ and the z-component of the total angular momentum $\hat{J}_z$. These eigenvectors, written as $|J,M\rangle$, have to satisfy

$$\hat{J}^2 |J,M\rangle = \hbar^2 J(J+1)\,|J,M\rangle$$
$$\hat{J}_z |J,M\rangle = \hbar M\,|J,M\rangle .$$
As the eigenstates $|j^{(1)}, m^{(1)}; j^{(2)}, m^{(2)}\rangle$ of the separate angular momenta $\hat{j}^{(1)}$ and $\hat{j}^{(2)}$ form a basis, we have to be able to write $|J,M\rangle$ as linear combinations of the eigenvectors of the individual angular momenta $|j^{(1)}, m^{(1)}; j^{(2)}, m^{(2)}\rangle$, of which there are $(2j^{(1)}+1)(2j^{(2)}+1)$ states. It might seem difficult to find these linear combinations, but luckily there is a general recipe, which I am going to explain in the following.

In the previous section we constructed all the possible angular momentum eigenstates of a single particle by starting with the state with highest m-quantum number and then working our way down by applying the angular momentum ladder operator $\hat{j}_- = \hat{j}_x - i\hat{j}_y$. We will apply an analogous strategy for the total angular momentum $\hat{J}$. Firstly we need to identify the ladder operators for the total angular momentum. They are

$$\hat{J}_- = \hat{j}_-^{(1)} + \hat{j}_-^{(2)} \qquad (3.92)$$
$$\hat{J}_+ = \hat{j}_+^{(1)} + \hat{j}_+^{(2)} . \qquad (3.93)$$

Let us check whether these operators satisfy commutation relations analogous to Eq. (3.63):

$$[\hat{J}_z, \hat{J}_\pm] = [\hat{j}_z^{(1)} + \hat{j}_z^{(2)},\, \hat{j}_\pm^{(1)} + \hat{j}_\pm^{(2)}] = [\hat{j}_z^{(1)}, \hat{j}_\pm^{(1)}] + [\hat{j}_z^{(2)}, \hat{j}_\pm^{(2)}] = \pm\hbar\,\hat{j}_\pm^{(1)} \pm \hbar\,\hat{j}_\pm^{(2)} = \pm\hbar\,\hat{J}_\pm . \qquad (3.94)$$

The commutation relation Eq. (3.94) allows us, just as in the case of a single angular momentum, to obtain the state $|J,M-1\rangle$ from the state $|J,M\rangle$ via

$$\hat{J}_-|J,M\rangle = \hbar\sqrt{J(J+1) - M(M-1)}\;|J,M-1\rangle . \qquad (3.95)$$

This relation can be derived in exactly the same way as relation Eq. (3.86). The procedure for generating all the states $|J,M\rangle$ has three steps and starts with

Step 1: Identify the state with maximal total angular momentum and maximal value of the M-quantum number. If we have two angular
3.4. ADDITION OF ANGULAR MOMENTA 121momenta j (1) and j (2) then their sum can not exceed Jmax = j (1) + j (2) .Which state could have total angular momentum Jmax = j (1) + j (2) andthe M = Jmax ? Clearly this must be a combination of two angularmomenta that are both parallel, i.e. we guess they should be in thestate |J = j (1) + j (2) , M = j (1) + j (2) = |j (1) , j (1) ⊗ |j (2) , j (2) . (3.96)To check whether this assertion is true we need to verify that ˆ2 J |J, M = h2 J(J + 1)|J, M ¯ ˆ Jz |J, M = hM |J, M , ¯with J = M = j (1) + j (2) . I will not present this proof here, but will letyou do it in the problem sheets. Now followsStep 2: Apply the ladder operator deﬁned in Eq. (3.92) using Eq.(3.95) to obtain all the states |J, M . Repeat this step until you reachedstate |J, M = −J . However, these are only 2Jmax + 1 states which is less than the to-tal of (2j1 + 1)(2j2 + 1) states of the joint state space of both angularmomenta. To obtain the other states we applyStep 3: Identify the state that is of the form |J − 1, M = J − 1 andthen go to step 2. The procedure stops when we have arrived at thesmallest possible value of J, which is obtained when the two angularmomenta are oriented in opposite direction so that we need to subtractthem. This implies that Jmin = |j (1) −j (2) |. The absolute value needs tobe taken, as the angular momentum is by deﬁnition a positive quantity. Following this strategy we obtain a new basis of states consisting of Jmax Jmax Jmin −1 i=Jmin (2i + 1) = i=0 2i + 1 − i=0 2i + 1 = (Jmax + 1)2 − Jmin = 2(j + j + 1) − (j (1) − j (2) )2 = (2j (1) + 1)(2j (2) + 1) states. (1) (2) 2 As this is a rather formal description I will now give the simplestpossible example for the addition of two angular momenta.Example:Now I am going to illustrate this general idea by solving the explicitexample of adding two spin- 1 . 2
The angular momentum operators for a spin-$\frac{1}{2}$ particle are

$$\hat{j}_x = \frac{\hbar}{2}\hat{\sigma}_x \qquad \hat{j}_y = \frac{\hbar}{2}\hat{\sigma}_y \qquad \hat{j}_z = \frac{\hbar}{2}\hat{\sigma}_z \qquad (3.97)$$

with the Pauli spin operators $\hat{\sigma}_i$. It is easy to check that this definition satisfies the commutation relations for an angular momentum. The angular momentum operator of an individual particle is therefore given by $\hat{j} = \frac{\hbar}{2}(\hat{\sigma}_x e_x + \hat{\sigma}_y e_y + \hat{\sigma}_z e_z)$, and the total angular momentum operator is given by

$$\hat{J} = \hat{j}^{(1)} + \hat{j}^{(2)} . \qquad (3.98)$$

The angular momentum ladder operators are

$$\hat{J}_- = \hat{j}_-^{(1)} + \hat{j}_-^{(2)}$$
$$\hat{J}_+ = \hat{j}_+^{(1)} + \hat{j}_+^{(2)} .$$

The maximal value of the total angular momentum is $J = 1$. The state corresponding to this value is given by

$$|J=1, M=1\rangle = |j^{(1)}=\tfrac{1}{2}, m^{(1)}=\tfrac{1}{2}\rangle \otimes |j^{(2)}=\tfrac{1}{2}, m^{(2)}=\tfrac{1}{2}\rangle , \qquad (3.99)$$

or, using the shorthand

$$|\uparrow\rangle = |j=\tfrac{1}{2}, m=\tfrac{1}{2}\rangle \qquad (3.100)$$
$$|\downarrow\rangle = |j=\tfrac{1}{2}, m=-\tfrac{1}{2}\rangle , \qquad (3.101)$$

we have $|J=1, M=1\rangle = |\uparrow\rangle\otimes|\uparrow\rangle$. Now we want to find the representation of the state $|J=1, M=0\rangle$ using the operator $\hat{J}_-$. We find

$$|J=1, M=0\rangle = \frac{\hat{J}_-\,|J=1,M=1\rangle}{\sqrt{2}\,\hbar} = \frac{(\hat{j}_-^{(1)} + \hat{j}_-^{(2)})\,|\uparrow\rangle\otimes|\uparrow\rangle}{\sqrt{2}\,\hbar} = \frac{|\downarrow\rangle\otimes|\uparrow\rangle}{\sqrt{2}} + \frac{|\uparrow\rangle\otimes|\downarrow\rangle}{\sqrt{2}} . \qquad (3.102)$$
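Eq. (3.102) can be reproduced directly with matrices. A short sketch (assuming NumPy; $\hbar = 1$, with the convention $|\uparrow\rangle = (1,0)^T$, $|\downarrow\rangle = (0,1)^T$):

```python
import numpy as np

# Single spin-1/2: j- |up> = |down>, j- |down> = 0 (hbar = 1)
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
jm = np.array([[0.0, 0.0], [1.0, 0.0]])

I2 = np.eye(2)
Jm = np.kron(jm, I2) + np.kron(I2, jm)       # total ladder operator J- = j-(1) + j-(2)

# |J=1,M=1> = |up> x |up>; Eq. (3.95) gives J- |1,1> = sqrt(2) hbar |1,0>
state = Jm @ np.kron(up, up) / np.sqrt(2)
expected = (np.kron(down, up) + np.kron(up, down)) / np.sqrt(2)
assert np.allclose(state, expected)          # reproduces Eq. (3.102)

# applying J- once more yields |1,-1> = |down> x |down>, and then annihilates it
state2 = Jm @ state / np.sqrt(2)
assert np.allclose(state2, np.kron(down, down))
assert np.allclose(Jm @ np.kron(down, down), 0)
```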
To find the state $|J=1, M=-1\rangle$ we apply $\hat{J}_-$ again and find

$$|J=1, M=-1\rangle = \frac{\hat{J}_-\,|J=1, M=0\rangle}{\sqrt{2}\,\hbar} = \frac{(\hat{j}_-^{(1)} + \hat{j}_-^{(2)})\,|J=1, M=0\rangle}{\sqrt{2}\,\hbar} = |\downarrow\rangle\otimes|\downarrow\rangle . \qquad (3.103)$$

Now we have almost finished our construction of the eigenstates of the total angular momentum operator. What is still missing is the state with total angular momentum $J = 0$. Because $J = 0$, this state must also have $M = 0$. The state $|J=0, M=0\rangle$ must be orthogonal to the three states $|J=1, M=1\rangle$, $|J=1, M=0\rangle$ and $|J=1, M=-1\rangle$. Therefore the state must have the form

$$|J=0, M=0\rangle = \frac{|\downarrow\rangle\otimes|\uparrow\rangle}{\sqrt{2}} - \frac{|\uparrow\rangle\otimes|\downarrow\rangle}{\sqrt{2}} . \qquad (3.104)$$

To check that this is the correct state, just verify that it is orthogonal to $|J=1, M=1\rangle$, $|J=1, M=0\rangle$ and $|J=1, M=-1\rangle$. This concludes the construction of the eigenvectors of the total angular momentum.

3.5 Local Gauge Symmetries and Electrodynamics
Chapter 4

Approximation Methods

Most problems in quantum mechanics (as in all areas of physics) cannot be solved analytically, and we have to resort to approximation methods. In this chapter I will introduce you to some of these methods. However, there is a large number of approximation methods in physics, and I can only present very few of them in this lecture.

Often a particular problem will be of a form that would be exactly solvable were it not for a small additional term (a perturbation) in the Hamilton operator. This perturbation may be either time-independent (e.g. a static electric field) or time-dependent (a laser shining on an atom). It is our task to compute the effect that such a small perturbation has. Historically these methods originated in classical mechanics, when in the 18th century physicists such as Lagrange tried to calculate the mechanical properties of the solar system. In particular, he and his colleagues were very interested in the stability of the Earth's orbit around the sun. This is a prime example of a problem where we can solve part of the dynamics exactly (the path of a single planet around the sun), while we are unable to solve the problem exactly once we take into account the small gravitational effects due to the presence of all the other planets. There are quite a few such problems in quantum mechanics. For example, we are able to solve the Schrödinger equation for the hydrogen atom, but if we place two hydrogen atoms perhaps 100 Ångström away from each other, then the problem has no exact solution anymore. The atoms will start to perturb each other, and the effect will be a weak perturbation of their dynamics. While not being strong enough to form a molecule, this effect leads to an attractive force, the van der Waals force. This force, although quite weak, is responsible for the sometimes remarkable stability of foams.

4.1 Time-independent Perturbation Theory

In the second year quantum mechanics course you have seen a method for dealing with time-independent perturbation problems. I will rederive these results in a more general, shorter and more elegant form. Then I will apply them to derive the van der Waals force between two neutral atoms.

4.1.1 Non-degenerate perturbation theory

Imagine that you have a system with a Hamilton operator $\hat{H}_0$, eigenvectors $|\phi_i\rangle$ and corresponding energies $\epsilon_i$. We will now assume that we can solve this problem, i.e. we know the expressions for the eigenstates $|\phi_i\rangle$ and the energies $\epsilon_i$ already. Now imagine that a small perturbation in the form of the operator $\lambda\hat{V}$ is added to the Hamilton operator. The real parameter $\lambda$ can be used to count the order to which we perform perturbation theory. Now we have the new total Hamilton operator $\hat{H} = \hat{H}_0 + \lambda\hat{V}$, with the new eigenvectors $|\psi_i(\lambda)\rangle$ and energies $E_i(\lambda)$. For $\lambda = 0$ we recover the old unperturbed system. For the new system, the time-independent Schrödinger equation now reads

$$(\hat{H}_0 + \lambda\hat{V})\,|\psi_i(\lambda)\rangle = E_i(\lambda)\,|\psi_i(\lambda)\rangle . \qquad (4.1)$$

For the following it is useful to introduce the slightly unusual 'normalization'

$$\langle\phi_i|\psi_i(\lambda)\rangle = 1 \qquad (4.2)$$

for the perturbed eigenstates $|\psi_i(\lambda)\rangle$, i.e. we do not assume that $\langle\psi_i(\lambda)|\psi_i(\lambda)\rangle = 1$!

Now we can multiply the Schrödinger equation (4.1) from the left with $\langle\phi_i|$ and find

$$\langle\phi_i|(\hat{H}_0 + \lambda\hat{V})|\psi_i(\lambda)\rangle = \langle\phi_i|E_i(\lambda)|\psi_i(\lambda)\rangle . \qquad (4.3)$$
Using the 'normalization' relation Eq. (4.2) we then obtain

$$E_i(\lambda) = \epsilon_i + \lambda\,\langle\phi_i|\hat{V}|\psi_i(\lambda)\rangle . \qquad (4.4)$$

To obtain the energy eigenvalues of the perturbed Hamilton operator we therefore need to compute the corresponding eigenvector $|\psi_i(\lambda)\rangle$. In the following I will do this for the state $|\phi_0\rangle$ to obtain the perturbed energy $E_0(\lambda)$. (You can easily generalize this to an arbitrary eigenvalue, just by replacing in all expressions the 0's by the corresponding value.) For the moment I will assume that the eigenvector $|\phi_0\rangle$ of the unperturbed Hamilton operator $\hat{H}_0$ is non-degenerate. You will see soon why I had to make this assumption.

In order to do the perturbation theory as systematically as possible, I introduce the two projection operators

$$\hat{P}_0 = |\phi_0\rangle\langle\phi_0| \quad\text{and}\quad \hat{Q}_0 = 1 - \hat{P}_0 . \qquad (4.5)$$

The second projector projects onto the subspace that supports the corrections to the unperturbed wave function, i.e.

$$\hat{Q}_0|\psi_0(\lambda)\rangle = |\psi_0(\lambda)\rangle - |\phi_0\rangle . \qquad (4.6)$$

Now let us rewrite the Schrödinger equation, introducing an energy $\epsilon$ which will be specified later. We write

$$(\epsilon - E_0(\lambda) + \lambda\hat{V})\,|\psi_0(\lambda)\rangle = (\epsilon - \hat{H}_0)\,|\psi_0(\lambda)\rangle . \qquad (4.7)$$

This can be written as

$$|\psi_0(\lambda)\rangle = (\epsilon - \hat{H}_0)^{-1}(\epsilon - E_0(\lambda) + \lambda\hat{V})\,|\psi_0(\lambda)\rangle . \qquad (4.8)$$

Multiplying this with the projector $\hat{Q}_0$ and using Eq. (4.6), we find

$$|\psi_0(\lambda)\rangle = |\phi_0\rangle + \hat{Q}_0(\epsilon - \hat{H}_0)^{-1}(\epsilon - E_0(\lambda) + \lambda\hat{V})\,|\psi_0(\lambda)\rangle . \qquad (4.9)$$

Now we can iterate this equation and find

$$|\psi_0(\lambda)\rangle = \sum_{n=0}^\infty \left[\hat{Q}_0(\epsilon - \hat{H}_0)^{-1}(\epsilon - E_0(\lambda) + \lambda\hat{V})\right]^n |\phi_0\rangle . \qquad (4.10)$$
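The familiar first and second order terms that follow from this expansion, Eq. (4.14), are easy to test against exact diagonalization on a small matrix. A sketch (assuming NumPy; the specific energies and the random seed are of course arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)

# Unperturbed H0: diagonal, non-degenerate, ground state at index 0
eps = np.array([0.0, 1.3, 2.1, 3.7, 5.2, 8.0])
H0 = np.diag(eps)
A = rng.normal(size=(6, 6))
V = (A + A.T) / 2                       # Hermitian perturbation
lam = 1e-4

# First plus second order of the Rayleigh expansion, Eq. (4.14)
E0_pt = eps[0] + lam * V[0, 0] \
      + lam**2 * sum(V[0, n]**2 / (eps[0] - eps[n]) for n in range(1, 6))

# Exact ground-state energy of H0 + lam*V
E0_exact = np.linalg.eigvalsh(H0 + lam * V)[0]
assert abs(E0_pt - E0_exact) < 1e-9     # agreement up to O(lam^3)
```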
Now we can plug this into Eq. (4.4) and find the expression for the perturbed energies:

$$E_0(\lambda) = \epsilon_0 + \sum_{n=0}^\infty \langle\phi_0|\,\lambda\hat{V}\left[\hat{Q}_0(\epsilon - \hat{H}_0)^{-1}(\epsilon - E_0(\lambda) + \lambda\hat{V})\right]^n|\phi_0\rangle . \qquad (4.11)$$

Eqs. (4.9) and (4.11) give the perturbation expansion of the energies and corresponding eigenvectors to all orders.

Remark: The choice of the constant $\epsilon$ is not completely arbitrary. If it is equal to the energy of one of the excited states of the unperturbed Hamilton operator, then the perturbation expansion will not converge, as it will contain infinite terms. This still leaves a lot of freedom. However, we would also like to be able to say that by taking into account the first $k$ terms of the sum Eq. (4.11) we have all the terms up to order $\lambda^k$. This is ensured most easily for two choices of the value of $\epsilon$. One such choice is evidently $\epsilon = E_0(\lambda)$. In that case we have

$$E_0(\lambda) = \epsilon_0 + \sum_{n=0}^\infty \langle\phi_0|\,\lambda\hat{V}\left[\hat{Q}_0(E_0(\lambda) - \hat{H}_0)^{-1}\,\lambda\hat{V}\right]^n|\phi_0\rangle . \qquad (4.12)$$

This choice is called Brillouin-Wigner perturbation theory. The other choice, which is often easier to handle, is $\epsilon = \epsilon_0$. In that case we obtain

$$E_0(\lambda) = \epsilon_0 + \sum_{n=0}^\infty \langle\phi_0|\,\lambda\hat{V}\left[\hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0(\lambda) + \lambda\hat{V})\right]^n|\phi_0\rangle . \qquad (4.13)$$

This is Rayleigh perturbation theory. For this choice $(\epsilon_0 - E_0(\lambda) + \lambda\hat{V})$ is proportional to $\lambda$, and therefore we obtain higher order corrections in $\lambda$ by keeping more terms in the sum.

Now let us look at the first and second order expressions, so that you can compare them with the results that you learned in the second year course. To do this we now specify the value of the arbitrary constant $\epsilon$ to be the unperturbed energy $\epsilon_0$. For the energy of the perturbed Hamilton operator we then obtain

$$E_0(\lambda) = \epsilon_0 + \langle\phi_0|\lambda\hat{V}|\phi_0\rangle + \langle\phi_0|\lambda\hat{V}\hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0(\lambda) + \lambda\hat{V})|\phi_0\rangle + \ldots$$
$$\qquad = \epsilon_0 + \lambda\,\langle\phi_0|\hat{V}|\phi_0\rangle + \sum_{n\neq 0} \langle\phi_0|\lambda\hat{V}|\phi_n\rangle\,\frac{1}{\epsilon_0 - \epsilon_n}\,\langle\phi_n|(\epsilon_0 - E_0(\lambda) + \lambda\hat{V})|\phi_0\rangle + \ldots$$
$$\qquad = \epsilon_0 + \lambda\,\langle\phi_0|\hat{V}|\phi_0\rangle + \lambda^2 \sum_{n\neq 0} \frac{|\langle\phi_0|\hat{V}|\phi_n\rangle|^2}{\epsilon_0 - \epsilon_n} + \ldots \qquad (4.14)$$

Remark: If $|\phi_0\rangle$ is the ground state of the unperturbed Hamilton operator, then the second order contribution in the perturbation expansion is always negative. This has a good reason. The ground state of the total Hamilton operator $\hat{H} = \hat{H}_0 + \lambda\hat{V}$ is $|\psi_0\rangle$, and since the ground state minimizes the energy we find

$$E_0 = \langle\psi_0|\hat{H}_0 + \lambda\hat{V}|\psi_0\rangle \le \langle\phi_0|\hat{H}_0 + \lambda\hat{V}|\phi_0\rangle = \epsilon_0 + \lambda\,\langle\phi_0|\hat{V}|\phi_0\rangle .$$

Comparison with Eq. (4.14) shows that the second order perturbation theory term must be negative, as is also evident directly from Eq. (4.14), since every term in the second order sum has $\epsilon_0 - \epsilon_n < 0$.

The new eigenstate corresponding to the eigenvalue Eq. (4.14) can be calculated too, and the first contributions are given by

$$|\psi_0\rangle = |\phi_0\rangle + \hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0 + \lambda\hat{V})|\phi_0\rangle + \ldots$$
$$\qquad = |\phi_0\rangle + \sum_{n\neq 0} |\phi_n\rangle\,\frac{1}{\epsilon_0 - \epsilon_n}\,\langle\phi_n|(\epsilon_0 - E_0 + \lambda\hat{V})|\phi_0\rangle + \ldots$$
$$\qquad = |\phi_0\rangle + \lambda\sum_{n\neq 0} |\phi_n\rangle\,\frac{\langle\phi_n|\hat{V}|\phi_0\rangle}{\epsilon_0 - \epsilon_n} + \ldots$$

Looking at these two expressions it is clear why I had to demand that the eigenvalue $\epsilon_0$ of the unperturbed Hamilton operator $\hat{H}_0$ be non-degenerate. In order to obtain a meaningful perturbation expansion, the individual terms should be finite, and therefore the factors $\frac{1}{\epsilon_0 - \epsilon_n}$ need to be finite for $n \neq 0$.

4.1.2 Degenerate perturbation theory

This part has not been presented in the lecture.

What do we do in the case of degenerate eigenvalues? We can follow a similar strategy, but as you can expect, things get a little bit more complicated.
Let us assume that we have a Hamilton operator $\hat{H}_0$, eigenvectors $|\phi_i^\nu\rangle$ and corresponding energies $\epsilon_i$, where the upper index $\nu$ enumerates an orthogonal set of eigenvectors of the degenerate eigenvalue $\epsilon_i$. Again let us deal with the eigenvalue $\epsilon_0$. Now we write down the projectors

$$\hat{P}_0 = \sum_\nu |\phi_0^\nu\rangle\langle\phi_0^\nu| \quad\text{and}\quad \hat{Q}_0 = 1 - \hat{P}_0 . \qquad (4.15)$$

Then we find

$$\hat{Q}_0|\psi_0^\mu(\lambda)\rangle = |\psi_0^\mu(\lambda)\rangle - \hat{P}_0|\psi_0^\mu(\lambda)\rangle , \qquad (4.16)$$

where the eigenvector $|\psi_0^\mu(\lambda)\rangle$ of the perturbed Hamilton operator originates from the unperturbed eigenvector $|\phi_0^\mu\rangle$.

To obtain the analogue of Eq. (4.3) we multiply the Schrödinger equation for the state $|\psi_0^\mu(\lambda)\rangle$ from the left with $\langle\psi_0^\nu(\lambda)|\hat{P}_0$ and obtain

$$\langle\psi_0^\nu(\lambda)|\hat{P}_0(\hat{H}_0 + \lambda\hat{V} - E_0^\mu(\lambda))|\psi_0^\mu(\lambda)\rangle = 0 . \qquad (4.17)$$

Note that I have proceeded slightly differently from the non-degenerate case, where I multiplied from the left by $\langle\phi^\nu|$. I could have done that here too, but then I would have been forced to introduce a 'normalization' condition of the form $\langle\phi^\nu|\psi^\mu(\lambda)\rangle = \delta_{\mu\nu}$, which is quite hard to compute.

Eq. (4.17) will eventually help us determine the new energy eigenvalues. However, this time this will involve the diagonalization of a matrix, the reason of course being that we have a degeneracy in the original unperturbed eigenvalue.

Now we proceed very much along the lines of the non-degenerate perturbation theory. Using Eq. (4.16) we rewrite Eq. (4.9) and obtain

$$|\psi_0^\mu(\lambda)\rangle = \hat{P}_0|\psi_0^\mu(\lambda)\rangle + \hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0^\mu(\lambda) + \lambda\hat{V})|\psi_0^\mu(\lambda)\rangle . \qquad (4.18)$$

Iterating this equation yields

$$|\psi_0^\mu(\lambda)\rangle = \sum_{n=0}^\infty \left[\hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0^\mu(\lambda) + \lambda\hat{V})\right]^n \hat{P}_0|\psi_0^\mu(\lambda)\rangle . \qquad (4.19)$$
This implies that we can calculate the whole eigenvector $|\psi_0^\mu(\lambda)\rangle$ from the eigenvalues $E_0^\mu(\lambda)$ and the components $\hat{P}_0|\psi_0^\mu(\lambda)\rangle$ of the eigenvector in the eigenspace of the eigenvalue $\epsilon_0$ of the Hamilton operator $\hat{H}_0$.

This may now be inserted in Eq. (4.17), so that we find, with the shorthand notation $|\psi_0^{\nu,P}\rangle = \hat{P}_0|\psi_0^\nu\rangle$,

$$\langle\psi_0^{\nu,P}|(\hat{H}_0 + \lambda\hat{V} - E_0^\mu(\lambda))\sum_{n=0}^\infty \left[\hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}(\epsilon_0 - E_0^\mu(\lambda) + \lambda\hat{V})\right]^n |\psi_0^{\mu,P}\rangle = 0 . \qquad (4.20)$$

Now we need to determine the energies $E_0^\mu$ from Eq. (4.20). Let us first determine the lowest order approximation, which means that we set $n = 0$. We find for any $\mu$ and $\nu$ that

$$\langle\psi_0^{\nu,P}|\,\epsilon_0 + \lambda\hat{V} - E_0^\mu(\lambda)\,|\psi_0^{\mu,P}\rangle = 0 . \qquad (4.21)$$

Therefore the new energy eigenvalues are obtained by diagonalizing the operator $\hat{V}$ in the eigenspace of the eigenvalue $\epsilon_0$ of the Hamilton operator $\hat{H}_0$, spanned by the $|\phi_0^\nu\rangle$. We find

$$E_0^\mu(\lambda) = \epsilon_0 + \lambda V_{\mu\mu} , \qquad (4.22)$$

where the $V_{\mu\mu}$ are the eigenvalues of $\hat{V}$ restricted to this eigenspace.

The second order contribution is then obtained by using the eigenvectors obtained from Eq. (4.21) in the next order expression, i.e. for $n = 1$. The second order expression for the perturbation theory is therefore obtained from taking the expectation values

$$\langle\psi_0^{\nu,P}|\,(\epsilon_0 + \lambda V_{\nu\nu} - E_0^\nu(\lambda)) + \lambda\hat{V}\hat{Q}_0(\epsilon_0 - \hat{H}_0)^{-1}\lambda\hat{V}\,|\psi_0^{\nu,P}\rangle = 0 .$$

This gives

$$E_0^\nu = \epsilon_0 + \lambda V_{\nu\nu} + \lambda^2 \sum_{m\neq 0} \frac{|\langle\psi_0^{\nu,P}|\hat{V}|\phi_m\rangle|^2}{\epsilon_0 - \epsilon_m} . \qquad (4.23)$$

In the following section I will work out an example for the use of perturbation theory which gives a non-trivial result.
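The lowest order recipe, diagonalize $\hat{V}$ inside the degenerate eigenspace, can be checked on a small example. A sketch (assuming NumPy; the specific matrix entries are arbitrary, chosen so that the unperturbed ground level is doubly degenerate):

```python
import numpy as np

# H0 with a doubly degenerate ground energy eps0 = 0
H0 = np.diag([0.0, 0.0, 3.0])
V = np.array([[0.2, 0.7, 0.1],
              [0.7, -0.4, 0.3],
              [0.1, 0.3, 0.5]])
lam = 1e-3

# First order, Eq. (4.22): eigenvalues of V restricted to the degenerate subspace
Vsub = V[:2, :2]
E_pt = 0.0 + lam * np.sort(np.linalg.eigvalsh(Vsub))

# Compare with the two lowest exact eigenvalues of H0 + lam*V
E_exact = np.sort(np.linalg.eigvalsh(H0 + lam * V))[:2]
assert np.allclose(E_pt, E_exact, atol=1e-6)   # residual is the O(lam^2) term
```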
4.1.3 The van der Waals force

In this section I am going to derive the force that two neutral atoms exert on each other. This force, the van der Waals force, can play a significant role in the physics (also the chemistry and mechanics) of neutral systems. In the derivation of the van der Waals force I will use time-independent perturbation theory. The calculations will be simplified by the fact that the system of two atoms exhibits symmetries which have the effect that the first order contribution to perturbation theory vanishes.

The basic experimental situation is presented in Fig. 4.1. We discuss the van der Waals force between two hydrogen atoms.

[Figure 4.1: Two hydrogen atoms are placed near each other, with a distance R much larger than the Bohr radius a_0.]

The two atomic nuclei (protons) are positioned at a fixed distance from each other. We assume that the protons are not moving. The proton of atom A is placed at the origin, while the proton of atom B is located at the position $R$. The electron of atom A is displaced from the origin by $r_A$, while the electron of atom B is displaced from nucleus B by $r_B$. To make sure that we can still speak of separate hydrogen atoms, we need to demand that the separation between the two protons is much larger than the radius of each of the atoms, i.e.

$$|R| \gg a_0 , \qquad (4.24)$$

where $a_0$ is the Bohr radius.

Having described the physical situation, we now have to determine the Hamilton operator of the total system. The Hamilton
operator has the form

$$\hat{H} = \hat{H}_0 + \hat{V} , \qquad (4.25)$$

where $\hat{V}$ describes the small perturbation that gives rise to the van der Waals force. The unperturbed Hamilton operator $\hat{H}_0$ is that of two hydrogen atoms, i.e.

$$\hat{H}_0 = \frac{\hat{p}_A^2}{2m} + \frac{\hat{p}_B^2}{2m} - \frac{e^2}{4\pi\epsilon_0 |\hat{r}_A|} - \frac{e^2}{4\pi\epsilon_0 |\hat{r}_B|} , \qquad (4.26)$$

which can be solved exactly. We have energy eigenfunctions of the form $|\phi^A_{n,l,m}\rangle|\phi^B_{n',l',m'}\rangle$ with energies $E_n + E_{n'}$. The perturbation to this Hamilton operator is due to the electrostatic forces between electrons and protons belonging to different atoms. From Fig. 4.1 we can easily see that this gives

$$\hat{V} = \frac{e^2}{4\pi\epsilon_0 R} + \frac{e^2}{4\pi\epsilon_0 |R + \hat{r}_B - \hat{r}_A|} - \frac{e^2}{4\pi\epsilon_0 |R + \hat{r}_B|} - \frac{e^2}{4\pi\epsilon_0 |R - \hat{r}_A|} . \qquad (4.27)$$

As we have assumed that the Bohr radius is much smaller than the separation between the two hydrogen atoms, we can conclude that $|r_A|, |r_B| \ll R$. Therefore we are able to expand the perturbation Hamilton operator, and after a lengthy but not too difficult calculation we obtain

$$\hat{V}_{dd} = \frac{e^2}{4\pi\epsilon_0 R^3}\left(\hat{r}_A\cdot\hat{r}_B - 3\,(\hat{r}_A\cdot R_u)(\hat{r}_B\cdot R_u)\right) , \qquad (4.28)$$

where $R_u$ is the unit vector in the direction of $R$. Now we are in a position to determine the first and second order contributions of the perturbation theory. Because we are dealing with the ground state, we have no degeneracy and we can apply non-degenerate perturbation theory.

First order correction: The first order contribution to perturbation theory is given by

$$\Delta E_1 = \langle\phi^A_{0,0,0}|\langle\phi^B_{0,0,0}|\,\hat{V}_{dd}\,|\phi^A_{0,0,0}\rangle|\phi^B_{0,0,0}\rangle , \qquad (4.29)$$
where $|\phi^{A/B}_{0,0,0}\rangle$ are the unperturbed ground states of atoms A and B. Inserting Eq. (4.28) into Eq. (4.29) gives

$$\Delta E_1 = \frac{e^2}{4\pi\epsilon_0 R^3}\left(\langle\phi^A_{0,0,0}|\hat{r}_A|\phi^A_{0,0,0}\rangle\cdot\langle\phi^B_{0,0,0}|\hat{r}_B|\phi^B_{0,0,0}\rangle - 3\,\langle\phi^A_{0,0,0}|\hat{r}_A\cdot R_u|\phi^A_{0,0,0}\rangle\,\langle\phi^B_{0,0,0}|\hat{r}_B\cdot R_u|\phi^B_{0,0,0}\rangle\right) . \qquad (4.30)$$

Now we can use a symmetry argument to show that this whole, rather lengthy expression must be zero. Why is that so? What appear in Eq. (4.30) are expectation values of components of the position operator in the unperturbed ground state of the atom. However, the unperturbed Hamilton operator of the hydrogen atom possesses some symmetries. The relevant symmetry here is parity, i.e. the Hamilton operator $\hat{H}_0$ commutes with the operator $\hat{P}$ defined by $\hat{P}|x\rangle = |-x\rangle$. This implies that both operators $\hat{H}_0$ and $\hat{P}$ can be diagonalized simultaneously, i.e. they have the same eigenvectors. This is the reason why all eigenvectors of the unperturbed Hamilton operator $\hat{H}_0$ are also eigenvectors of the parity operator. Therefore we have $\hat{P}|\phi^{A/B}_{0,0,0}\rangle = \pm|\phi^{A/B}_{0,0,0}\rangle$. This implies that

$$\langle\phi_{0,0,0}|\hat{x}|\phi_{0,0,0}\rangle = \langle\phi_{0,0,0}|\hat{P}\hat{x}\hat{P}|\phi_{0,0,0}\rangle = \langle\phi_{0,0,0}|-\hat{x}|\phi_{0,0,0}\rangle .$$

This equality implies that $\langle\phi_{0,0,0}|\hat{x}|\phi_{0,0,0}\rangle = 0$, and therefore $\Delta E_1 = 0$! This quick argument illustrates the usefulness of symmetry arguments.

Second order correction: The second order contribution of the perturbation theory is given by

$$\Delta E_2 = {\sum_{(n',l',m'),(n,l,m)}}' \frac{\left|\langle\phi^A_{n,l,m}|\langle\phi^B_{n',l',m'}|\,\hat{V}_{dd}\,|\phi^A_{0,0,0}\rangle|\phi^B_{0,0,0}\rangle\right|^2}{2E_0 - E_n - E_{n'}} , \qquad (4.31)$$

where the prime on the sum indicates that the state $|\phi^A_{0,0,0}\rangle|\phi^B_{0,0,0}\rangle$ is excluded from the summation. The first conclusion that we can draw from expression Eq.
(4.31) is that the second order correction $\Delta E_2$ is negative, so that we have

$$\Delta E_2 = -\frac{C}{R^6} \qquad (4.32)$$

with a positive constant $C$. The force exerted on one of the atoms is therefore given by

$$F = -\frac{dE}{dR} \approx -\frac{6C}{R^7} , \qquad (4.33)$$

i.e. the force acts to reduce the distance between the atoms, and we conclude that the van der Waals force is attractive. This is also the classical result that was known before the advent of quantum mechanics.

The physical interpretation of the van der Waals force is the following. A hydrogen atom should not be viewed as a stationary charge distribution, but rather as a randomly fluctuating electric dipole due to the motion of the electron. Assume that in the first atom a dipole moment appears. This generates an electric field which scales like $\frac{1}{R^3}$ with distance. This electric field will induce an electric dipole moment in the other atom. An electric dipole will always tend to move towards areas of increasing field strength, and therefore the second atom will be attracted to the first one. The induced dipole moment will be proportional to the electric field. Therefore the net force will scale as $-\frac{d}{dR}\frac{1}{R^6}$. This semi-classical argument gives you an idea of the origin of the correct R-dependence of the van der Waals force.

However, this semi-classical explanation fails to account for some of the more intricate features of the van der Waals force. One example is that of two hydrogen atoms, one in the ground state and one in an excited state. In that case the van der Waals force scales as $\frac{C}{R^3}$, and the constant $C$ can be either positive or negative, i.e. we can have a repulsive as well as an attractive van der Waals force. The computation of this case is slightly more complicated, because excited states of the hydrogen atom are degenerate, which requires the use of degenerate perturbation theory. On the other hand, it only involves first order terms.
4.1.4 The Helium atom

Now let us try to apply perturbation theory to the problem of the Helium atom. In fact it is not at all clear that we can do that, because the two electrons in the Helium atom are very close together, and their mutually exerted force is almost as strong as that between the nucleus and the individual electrons. However, we may just have a look at how far we can push perturbation theory. Again I am interested in the ground state of the Helium atom. The Hamilton operator is given by

$$\hat{H} = \frac{\hat{p}_A^2}{2m} + \frac{\hat{p}_B^2}{2m} - \frac{2e^2}{4\pi\epsilon_0|\hat{r}_A|} - \frac{2e^2}{4\pi\epsilon_0|\hat{r}_B|} + \frac{e^2}{4\pi\epsilon_0|\hat{r}_A - \hat{r}_B|} . \qquad (4.34)$$

The first four terms, $\hat{H}_0$, describe two non-interacting electrons in a central potential, and for this part we can find an exact solution, given by the energies and wave functions of a He$^+$ ion. Now we perform perturbation theory up to first order. The zeroth order wavefunction is given by

$$|\psi_{He}\rangle = |\psi_{He^+}\rangle \otimes |\psi_{He^+}\rangle , \qquad (4.35)$$

where $|\psi_{He^+}\rangle$ is the ground state of an electron in a Helium ion, i.e. a system consisting of a nucleus of charge $2e$ and one electron. In position space this wavefunction is given by

$$\psi_{He}(r_A, r_B) = \frac{Z^3}{\pi a_0^3}\, e^{-Z(r_A + r_B)/a_0} , \qquad (4.36)$$

where $a_0 = 0.529\cdot 10^{-10}\,$m is the Bohr radius and $Z = 2$ is the number of protons in the nucleus. The energy of this state for the unperturbed system is given by $E = -108.8\,$eV. The first order correction to the energy of the ground state is now given by

$$\Delta E = \int d^3r_A\, d^3r_B\, |\psi_{He}(r_A, r_B)|^2\, \frac{e^2}{4\pi\epsilon_0 |r_A - r_B|} = 34\,\text{eV} . \qquad (4.37)$$

Therefore our estimate for the ground state energy of the Helium atom is $E_1 = -108.8\,\text{eV} + 34\,\text{eV} = -74.8\,\text{eV}$. This compares quite well with the measured value of $E_1 = -78.9\,\text{eV}$. In the next section, however, we will see how we can actually improve this value using a different approximation technique.
4.2 Adiabatic Transformations and Geometric Phases

4.3 Variational Principle

After this section on perturbation methods I am now moving on to a different way of obtaining approximate solutions to quantum mechanical problems. Previously we investigated problems which were 'almost' exactly solvable, i.e. the exactly solvable Hamilton operator has a small additional term. Now I am going to deal with problems which cannot necessarily be cast into such a form. An example would be the Helium atom (which cannot be solved exactly) as compared to a He$^+$ ion (which is basically like a hydrogen atom and therefore exactly solvable). Because the two electrons in the Helium atom are very close, it is not obvious that a perturbation expansion gives a good result, simply because the electrostatic forces between the electrons are almost as large as those between electrons and nucleus. Here variational principles can be very useful. The example of the Helium atom already indicates that variational methods are of paramount importance, e.g. in atomic physics, but also in quantum chemistry and solid state physics.

4.3.1 The Rayleigh-Ritz Method

Here I will explain a simple variational method which nevertheless exhibits the general idea of the method. Variational methods are particularly well suited to determining the ground state energy of a quantum mechanical system, and this is what I will present first. The whole method is based on the following theorem concerning the expectation value of the Hamilton operator.

Theorem 49: Given a Hamilton operator $\hat{H}$ with ground state $|\phi_0\rangle$ and ground state energy $E_0$, for any normalized state $|\psi\rangle$ we have

$$\langle\phi_0|\hat{H}|\phi_0\rangle \le \langle\psi|\hat{H}|\psi\rangle . \qquad (4.38)$$
Proof: For the proof we need to remember that any state vector $|\psi\rangle$ can be expanded in terms of the eigenvectors $|\phi_i\rangle$ of the Hermitean operator $\hat{H}$. This means that we can find coefficients $\alpha_i$ such that

$$|\psi\rangle = \sum_i \alpha_i |\phi_i\rangle \qquad\text{with}\quad \sum_i |\alpha_i|^2 = 1 . \qquad (4.39)$$

I will use this in the proof of Theorem 49:

$$\langle\psi|\hat{H}|\psi\rangle = \sum_{i,j} \alpha_i^*\alpha_j\,\langle\phi_i|\hat{H}|\phi_j\rangle = \sum_{i,j} \alpha_i^*\alpha_j E_j\,\langle\phi_i|\phi_j\rangle = \sum_i |\alpha_i|^2 E_i .$$

Now we can see that

$$\langle\psi|\hat{H}|\psi\rangle - E_0 = \sum_i |\alpha_i|^2 E_i - E_0 = \sum_i |\alpha_i|^2 (E_i - E_0) \ge 0 ,$$

because $E_i \ge E_0$. If the lowest energy eigenvalue $E_0$ is non-degenerate, then we have equality exactly if $|\psi\rangle = |\phi_0\rangle$. $\Box$

How does this help us in finding the energy of the ground state of a quantum mechanical system that is described by the Hamilton operator $\hat{H}$? The general recipe proceeds in two steps.

Step 1: Choose a 'trial' wave function $|\psi(\alpha_1,\ldots,\alpha_N)\rangle$ that is parametrized by parameters $\alpha_1,\ldots,\alpha_N$. This choice of the trial wave function will often be governed by the symmetries of the problem.

Step 2: Minimize the expression

$$\bar{E} = \langle\psi(\alpha_1,\ldots,\alpha_N)|\hat{H}|\psi(\alpha_1,\ldots,\alpha_N)\rangle \qquad (4.40)$$

with respect to the $\alpha_i$. The value $E_{min}$ that you obtain this way is your estimate of the ground state energy.
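Theorem 49 is easy to see in action numerically: every normalized trial state gives an upper bound on the ground-state energy. A sketch (assuming NumPy; the random Hermitian matrix merely stands in for some Hamilton operator):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random Hermitian "Hamiltonian" on an 8-dimensional space
A = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
H = (A + A.conj().T) / 2
E0 = np.linalg.eigvalsh(H)[0]              # exact ground-state energy

# Every normalized trial state bounds E0 from above (Theorem 49)
for _ in range(1000):
    psi = rng.normal(size=8) + 1j * rng.normal(size=8)
    psi /= np.linalg.norm(psi)
    assert np.vdot(psi, H @ psi).real >= E0 - 1e-12
```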
The Helium atom: Now let us come back to the Helium atom, to see whether we can make use of the variational principle to obtain a better estimate of the ground state energy.

In our perturbation theoretical calculation we used the wavefunction Eq. (4.36) with $Z = 2$. Now let us introduce the adjustable parameter $\sigma$ and use the trial wavefunction

$$\psi_{He}(r_A, r_B) = \frac{Z_{eff}^3}{\pi a_0^3}\, e^{-Z_{eff}(r_A + r_B)/a_0} , \qquad (4.41)$$

where $Z_{eff} = Z - \sigma$. This is a physically reasonable ansatz, because each electron effectively sees a reduced nuclear charge due to the shielding effect of the other electron. Now we can use this new trial wavefunction to calculate the energy expectation value of the total Hamilton operator. We find after some computations

$$E(\sigma) = -2R_H\left(Z^2 - \frac{5}{8}Z + \frac{5}{8}\sigma - \sigma^2\right) , \qquad (4.42)$$

where $R_H = \frac{m e^4}{32\pi^2\epsilon_0^2\hbar^2} \approx 13.6\,$eV is the Rydberg energy. Now we need to compute the value of $\sigma$ for which Eq. (4.42) assumes its minimal value. The value that we obtain is $\sigma = \frac{5}{16}$, independently of the value of $Z$. Inserting this into Eq. (4.42) we find

$$E_{min} = -2R_H\left(Z - \frac{5}{16}\right)^2 . \qquad (4.43)$$

For the Helium atom ($Z = 2$) we therefore obtain

$$E_{min} = -77.4\,\text{eV} \qquad (4.44)$$

for the ground state energy. This is quite close to the true value of $-78.9\,$eV and represents a substantial improvement compared to the value obtained via perturbation theory. A slightly more sophisticated choice of the trial wavefunction (basically using one more parameter) would lead to substantially improved values. This shows that the variational method can indeed be quite useful.

However, the next example will illustrate the shortcomings of the variational method and show that this method has to be applied with some care.
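Minimizing Eq. (4.42) is a one-line exercise numerically. A sketch (assuming NumPy; $R_H \approx 13.6\,$eV) that confirms $\sigma = 5/16$ and an optimal energy close to the $-77.4\,$eV quoted above:

```python
import numpy as np

RH = 13.6                 # Rydberg energy in eV
Z = 2                     # Helium

def E(sigma):
    # Eq. (4.42) with Z_eff = Z - sigma
    return -2 * RH * (Z**2 - 5 / 8 * Z + 5 / 8 * sigma - sigma**2)

# Minimize on a grid: the optimal screening is sigma = 5/16, independent of Z
sigmas = np.linspace(0, 1, 100001)
s_min = sigmas[np.argmin(E(sigmas))]
assert abs(s_min - 5 / 16) < 1e-6

# Consistent with the closed form E_min = -2 R_H (Z - 5/16)^2, Eq. (4.43)
assert abs(E(5 / 16) - (-2 * RH * (Z - 5 / 16) ** 2)) < 1e-9
assert -78 < E(5 / 16) < -77
```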
The harmonic oscillator. The Hamilton operator for the harmonic oscillator is given by Ĥ = p̂²/2m + (1/2)mω²x̂². Now, assume that we do not know the ground state energy and the ground state wavefunction of the harmonic oscillator. This Hamilton operator commutes with the parity operator, and therefore the eigenstates should also be chosen as eigenfunctions of the parity operator. Therefore let us choose the normalized symmetric wave-function

    ψ_a(x) = √(2a^{3/2}/π) · 1/(x² + a) .    (4.45)

Of course I have chosen this slightly peculiar function in order to make the calculations as simple as possible. Omitting the detailed calculations we find that the mean value of the energy of the harmonic oscillator in this state is given by

    ⟨ψ_a|Ĥ|ψ_a⟩ = ∫ ψ_a*(x) [ −(ħ²/2m) d²/dx² + (1/2)mω²x² ] ψ_a(x) dx    (4.46)
               = ħ²/(4ma) + (1/2)mω²a .    (4.47)

Now we need to find the value of a for which this expression becomes minimal. We obtain

    a_min = (1/√2) ħ/(mω) ,    (4.48)

and for the energy

    E_min = ħω/√2 .    (4.49)

Therefore our estimate gives a relative error of

    (E_min − E_0)/E_0 = √2 − 1 ≈ 0.4 .    (4.50)

This is not too good, but would have been a lot better for other trial functions.

This example shows the limitations of the variational principle. It very much depends on a good choice for the trial wave function. It should also be noted that a good approximation to the ground state
energy does not imply that the chosen trial wave function will give good results for other physical observables. This can be seen from the above example of the harmonic oscillator. If we compute the expectation value of the operator x̂² then we find

    ⟨ψ_{a_min}|x̂²|ψ_{a_min}⟩ = (1/√2) ħ/(mω) ,

which is quite close to the true value of ħ/(2mω). On the other hand, the expectation value of the operator x̂⁴ diverges if we calculate it with the trial wave function, while we obtain a finite result for the true ground-state wave function.

The variational principle for excited states. The variational method shown here can be extended to the calculation of the energies of excited states. The basic idea that we will be using is that eigenvectors to different eigenvalues of the Hamilton operator are necessarily orthogonal. If we know the ground state of a system, or at least a good approximation to it, then we can use the variational method. The only thing we need to make sure is that our trial wavefunction is always orthogonal to the ground state wave function, or the best approximation we have for it. If we have ensured this with the choice of our trial function, then we can proceed analogously to the variational principle for the ground state.

4.4 Time-dependent Perturbation Theory

In the previous sections I have explained some of the possible methods of stationary perturbation theory. Using these methods we are able, in principle, to approximate the energy eigenvalues as well as the eigenvectors of the Hamilton operator of a quantum mechanical system that has a small perturbation. We are then able to approximate the spectral decomposition of the Hamilton operator, which is given by

    Ĥ = Σ_i E_i |ψ_i⟩⟨ψ_i| ,    (4.51)

where the E_i are the energy eigenvalues of the Hamilton operator and the |ψ_i⟩ the corresponding eigenvectors. Equation (4.51) is sufficient to
compute the time evolution of any possible initial state |φ(t_0)⟩ using the solution of the Schrödinger equation

    e^{−iĤ(t−t_0)/ħ}|φ(t_0)⟩ = Σ_i e^{−iE_i(t−t_0)/ħ} |ψ_i⟩⟨ψ_i|φ(t_0)⟩ .    (4.52)

Using stationary perturbation theory we are then able to obtain the approximate time evolution of the system.

However, there are reasons why this approach is not necessarily the best. First of all, we might have a situation where the Hamilton operator of the system is time-dependent. In that case the solution of the Schrödinger equation is generally not of the form e^{−(i/ħ)∫Ĥ(t′)dt′} anymore, simply because Hamilton operators for different times do not necessarily commute with each other. For time-dependent Hamilton operators we have to proceed in a different way in order to obtain approximations to the time evolution.

There are two situations in which we may encounter time-dependent Hamilton operators. While the full Hamilton operator of a closed system is always time-independent, this is not the case anymore if we have a system that is interacting with its environment, i.e. an open system. An example is an atom that is being irradiated by a laser beam. Clearly the electro-magnetic field of the laser is time-dependent and therefore we find a time-dependent Hamilton operator. A time-dependent Hamilton operator may also appear in another situation. If we have given the Hamilton operator of a quantum system which is composed of two parts, the solvable part Ĥ_0 and the perturbation V̂, both of which may be time independent, then it can be of advantage to go to an 'interaction' picture that is 'rotating' with frequencies determined by the unperturbed Hamilton operator Ĥ_0. In this case the remaining Hamilton operator will be a time-dependent version of V̂.

The rest of this section will be devoted to the explanation of a method that allows us to approximate the time evolution operator of a time-dependent Hamilton operator.
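For a time-independent Hamilton operator, Eq. (4.52) translates directly into code. A minimal sketch with an illustrative (randomly chosen) 3-level Hamiltonian, in units with ħ = 1:

```python
import numpy as np

# Propagation via the spectral decomposition, Eq. (4.52); the 3-level
# Hamiltonian below is arbitrary and purely illustrative (ħ = 1).
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2                 # Hermitean Hamilton operator

E, V = np.linalg.eigh(H)                 # H = Σ_i E_i |ψ_i⟩⟨ψ_i|

def evolve(phi0, t):
    """|φ(t)⟩ = Σ_i e^{-i E_i t} |ψ_i⟩ ⟨ψ_i|φ(0)⟩."""
    return V @ (np.exp(-1j * E * t) * (V.conj().T @ phi0))

phi0 = np.array([1.0, 0.0, 0.0], dtype=complex)
phi_t = evolve(phi0, t=2.7)

print(np.abs(np.vdot(phi_t, phi_t)))     # norm is conserved: 1.0
```

The eigendecomposition is computed once; evolving to any later time is then just a phase rotation of the expansion coefficients, exactly as in Eq. (4.52).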
The convergence of this method is very much enhanced if we go into an interaction picture which eliminates the dynamics of the system due to the unperturbed Hamilton operator. The precise definition of the 'interaction' picture will be the subject of the first part of this section.
4.4.1 Interaction picture

Often the total Hamilton operator can be split into two parts, Ĥ = Ĥ_0 + V̂: the exactly solvable Hamilton operator Ĥ_0 and the perturbation V̂. If we have a Hamilton operator, time-dependent or not, and we want to perform perturbation theory, it will always be useful to make use of the exactly solvable part Ĥ_0 of the Hamilton operator. In the case of time-independent perturbation theory we have used the eigenvalues and eigenvectors of the unperturbed Hamilton operator and calculated corrections to them in terms of the unperturbed eigenvectors and eigenvalues. Now we are going to do the analogous step for the time evolution operator. We use the known dynamics due to the unperturbed Hamilton operator Ĥ_0 and then determine corrections to this time evolution. In fact, the analogy goes so far that it is possible in principle to rederive time-independent perturbation theory as a consequence of time-dependent perturbation theory. I leave it to the slightly more ambitious student to work this out in detail after I have explained the ideas of time-dependent perturbation theory.

The Schrödinger equation reads

    iħ (d/dt)|ψ(t)⟩ = (Ĥ_0 + V̂)(t)|ψ(t)⟩ ,    (4.53)

with a potentially time dependent total Hamilton operator Ĥ(t). The solution of the Schrödinger equation can be written formally as

    |ψ(t)⟩ = Û(t, t′)|ψ(t′)⟩ ,    (4.54)

where the time evolution operator Û(t, t′) obeys the differential equation

    iħ (d/dt)Û(t, t′) = Ĥ(t)Û(t, t′) .    (4.55)

This can easily be checked by inserting Eq. (4.54) into the Schrödinger equation. Now let us assume that we can solve the time evolution that is generated by the unperturbed Hamilton operator Ĥ_0(t). The corresponding time evolution operator Û_0(t, t′) gives the solution |ψ_0(t)⟩ = Û_0(t, t′)|ψ_0(t′)⟩ of the Schrödinger equation

    iħ (d/dt)|ψ_0(t)⟩ = Ĥ_0(t)|ψ_0(t)⟩ .    (4.56)
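Eq. (4.55) can also be integrated directly on a computer when Ĥ(t) is time-dependent, which is a useful reference point for the perturbative methods below. A sketch for a two-level system with an invented Hamiltonian H(t) = σ_z + cos(t)σ_x (ħ = 1), stepped with a simple midpoint rule:

```python
import numpy as np

# Numerical integration of Eq. (4.55), iħ dU/dt = H(t) U, for an
# illustrative time-dependent two-level Hamiltonian (ħ = 1).
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)

def H(t):
    return sz + np.cos(t) * sx

def rhs(t, U):
    return -1j * H(t) @ U

U = np.eye(2, dtype=complex)
t, dt = 0.0, 1e-4
for _ in range(10000):                   # integrate up to t = 1
    k1 = rhs(t, U)
    k2 = rhs(t + dt / 2, U + dt / 2 * k1)  # midpoint (2nd order) step
    U = U + dt * k2
    t += dt

print(np.linalg.norm(U.conj().T @ U - np.eye(2)))  # ≈ 0: U stays unitary
```

Because H(t) at different times does not commute with itself, U is not simply the exponential of ∫H(t′)dt′; the stepwise integration effectively builds the time-ordered product.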
What we are really interested in is the time evolution according to the full Hamilton operator Ĥ. As we already know the time evolution operator Û_0(t, t′), the aim is now the calculation of the deviation from this unperturbed time evolution. Let us therefore derive a Schrödinger equation that describes this deviation. We obtain this Schrödinger equation by going over to an interaction picture with respect to the time t_0, which is defined by choosing the state vector

    |ψ_I(t)⟩ = Û_0†(t, t_0)|ψ(t)⟩ .    (4.57)

Clearly the state in the interaction picture has been obtained by 'undoing' the part of the time evolution that is due to the unperturbed Hamilton operator in the state |ψ(t)⟩, and therefore describes that part of the time evolution that is due to the perturbation V̂(t). Now we have to derive the Schrödinger equation for the interaction picture wavefunction, and in particular we have to derive the Hamilton operator in the interaction picture. This is achieved by inserting Eq. (4.57) into the Schrödinger equation Eq. (4.53). We find

    iħ (d/dt)[Û_0(t, t_0)|ψ_I(t)⟩] = (Ĥ_0 + V̂)(t)Û_0(t, t_0)|ψ_I(t)⟩    (4.58)

and then

    iħ [(d/dt)Û_0(t, t_0)]|ψ_I(t)⟩ + iħ Û_0(t, t_0)(d/dt)|ψ_I(t)⟩ = (Ĥ_0 + V̂)(t)Û_0(t, t_0)|ψ_I(t)⟩ .    (4.59)

Now we use the differential equation

    iħ (d/dt)Û_0(t, t_0) = Ĥ_0(t)Û_0(t, t_0) .    (4.60)

Inserting this into Eq. (4.59) gives

    Ĥ_0 Û_0(t, t_0)|ψ_I(t)⟩ + iħ Û_0(t, t_0)(d/dt)|ψ_I(t)⟩ = (Ĥ_0 + V̂)(t)Û_0(t, t_0)|ψ_I(t)⟩

and finally

    iħ (d/dt)|ψ_I(t)⟩ = Û_0†(t, t_0)V̂(t)Û_0(t, t_0)|ψ_I(t)⟩ .
Using the definition of the Hamilton operator in the interaction picture,

    Ĥ_I(t) = Û_0†(t, t_0)(Ĥ − Ĥ_0)Û_0(t, t_0) ,    (4.61)

we find the interaction picture Schrödinger equation

    iħ (d/dt)|ψ_I(t)⟩ = Ĥ_I(t)|ψ_I(t)⟩ .    (4.62)

The formal solution of the Schrödinger equation in the interaction picture can be written as

    |ψ_I(t)⟩ = Û_I(t, t′)|ψ_I(t′)⟩ .    (4.63)

From Eq. (4.63) we can then obtain the solution of the Schrödinger equation Eq. (4.53) as

    |ψ(t)⟩ = Û_0(t, t_0)|ψ_I(t)⟩
           = Û_0(t, t_0)Û_I(t, t′)|ψ_I(t′)⟩
           = Û_0(t, t_0)Û_I(t, t′)Û_0†(t′, t_0)|ψ(t′)⟩ .    (4.64)

This result shows that even in a system with a time independent Hamilton operator Ĥ we may obtain a time dependent Hamilton operator by going over to an interaction picture. As time dependent Hamilton operators are actually quite commonplace, it is important that we find out how the time evolution for a time-dependent Hamilton operator can be approximated.

4.4.2 Dyson Series

Given the Schrödinger equation for a time dependent Hamilton operator, we need to find a systematic way of obtaining approximations to the time evolution operator. This is achieved by time-dependent perturbation theory, which is based on the Dyson series. Here I will deal with the interaction picture Schrödinger equation Eq. (4.62). To find the Dyson series, we first integrate the Schrödinger equation Eq. (4.62) formally. This gives

    |ψ_I(t)⟩ = |ψ_I(t′)⟩ − (i/ħ) ∫_{t′}^{t} dt_1 Ĥ_I(t_1)|ψ_I(t_1)⟩ .    (4.65)
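Eq. (4.61) is easy to evaluate explicitly for a two-level system. A sketch with the illustrative choice H_0 = (ω/2)σ_z and V = g σ_x (ħ = 1, t_0 = 0; all parameter values made up):

```python
import numpy as np

# Interaction-picture Hamiltonian Ĥ_I(t) = U0†(t)(H − H0)U0(t), Eq. (4.61),
# for H0 = (ω/2)σ_z and V = g σ_x (ħ = 1, t0 = 0; values illustrative).
omega, g = 1.0, 0.1
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
H0, V = 0.5 * omega * sz, g * sx

def U0(t):
    # e^{-i H0 t} is diagonal because H0 is diagonal
    return np.diag(np.exp(-1j * np.diag(H0) * t))

def HI(t):
    return U0(t).conj().T @ V @ U0(t)

# At t = 0 the pictures coincide; later, the off-diagonal elements
# rotate with the unperturbed transition frequency ω.
print(np.allclose(HI(0.0), V))   # True
print(HI(2.0)[0, 1])             # g e^{iωt} with ω t = 2
```

This makes the phrase 'rotating with frequencies determined by Ĥ_0' concrete: Ĥ_I(t) stays Hermitean and has constant eigenvalues ±g, but its matrix elements oscillate at the unperturbed frequency.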
This equation can now be iterated. We then find the Dyson series

    |ψ_I(t)⟩ = |ψ_I(t′)⟩ − (i/ħ) ∫_{t′}^{t} dt_1 Ĥ_I(t_1)|ψ_I(t′)⟩
               − (1/ħ²) ∫_{t′}^{t} dt_1 ∫_{t′}^{t_1} dt_2 Ĥ_I(t_1)Ĥ_I(t_2)|ψ_I(t′)⟩ + … .    (4.66)

From this we immediately observe that the time evolution operator in the interaction picture is given by

    Û_I(t, t′) = 1 − (i/ħ) ∫_{t′}^{t} dt_1 Ĥ_I(t_1) − (1/ħ²) ∫_{t′}^{t} dt_1 ∫_{t′}^{t_1} dt_2 Ĥ_I(t_1)Ĥ_I(t_2) + … .    (4.67)

Equations (4.66) and (4.67) are the basis of time-dependent perturbation theory.

Remark: It should be pointed out that these expressions have some limitations. Quite obviously every integral in the series will generally have the tendency to grow with increasing time differences |t − t′|. For sufficiently large |t − t′| the Dyson series will then fail to converge. This problem can be circumvented by first splitting the time evolution operator into sufficiently small pieces, i.e.

    Û_I(t, 0) = Û_I(t, t_n)Û_I(t_n, t_{n−1}) … Û_I(t_1, 0) ,    (4.68)

such that each time interval t_i − t_{i−1} is very small. Then each of the time evolution operators is calculated using perturbation theory, and finally they are multiplied together.

4.4.3 Transition probabilities

For the following let us assume that the unperturbed Hamilton operator is time independent and that we take an interaction picture with respect to the time t_0 = 0. In this case we can write

    Û_0(t, 0) = e^{−iĤ_0 t/ħ} .    (4.69)

Under the time evolution due to the unperturbed Hamilton operator Ĥ_0, the eigenstates |φ_n⟩ to the energy E_n of that Hamilton operator only acquire phase factors in the course of the time evolution, i.e.

    e^{−iĤ_0 t/ħ}|φ_n⟩ = e^{−iE_n t/ħ}|φ_n⟩ = e^{−iω_n t}|φ_n⟩ ,    (4.70)
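The truncated series Eq. (4.67) can be checked numerically. In the sketch below (ħ = 1, with an invented constant Ĥ_I so that the exact propagator e^{−iHt} is available for comparison) the nested time-ordered integral ∫dt_1∫^{t_1}dt_2 is evaluated by naive Riemann sums and indeed reproduces t²/2, so the second-order series matches the Taylor expansion of the exponential:

```python
import numpy as np

# Second-order Dyson series, Eq. (4.67), with the time-ordered nested
# integrals done by Riemann sums; Ĥ_I constant and illustrative (ħ = 1).
H = np.array([[0.0, 0.3], [0.3, 0.0]], dtype=complex)
t, n = 0.4, 400
dt = t / n
grid = (np.arange(n) + 0.5) * dt         # midpoints of the time slices

U = np.eye(2, dtype=complex)
U = U - 1j * dt * sum(H for t1 in grid)  # first order: -i H t
U = U - dt**2 * sum(H @ H * sum(1 for t2 in grid if t2 < t1)
                    for t1 in grid)      # second order, ordered t2 < t1

# exact propagator via eigendecomposition
E, V = np.linalg.eigh(H)
U_exact = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T

print(np.abs(U - U_exact).max())         # small: O(t³) truncation error
```

The counting sum over pairs t_2 < t_1 is exactly the discrete version of the time ordering; it gives n(n−1)/2 ≈ n²/2 terms, i.e. the factor t²/2 in front of H².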
where ω_n = E_n/ħ. Under the time evolution due to the total Hamilton operator Ĥ this is not the case anymore. Now the time evolution will generally take an eigenstate of the unperturbed Hamilton operator Ĥ_0 to a superposition of eigenstates of the unperturbed Hamilton operator Ĥ_0, i.e.

    e^{−iĤ(t−t′)/ħ}|φ_n⟩ = Σ_k α_{n→k}(t)|φ_k⟩    (4.71)

with nonzero coefficients α_{n→k}(t). If we make a measurement at the later time t, we will have a non-zero probability |α_{n→k}|² to find eigenstates |φ_k⟩ with k ≠ n. We then say that the system has a transition probability p_{n→k}(t) for going from state |φ_n⟩ to state |φ_k⟩. These transitions are induced by the perturbation V̂, which may for example be due to a laser field. Let us calculate the transition probability in lowest order in the perturbation V̂. This is a valid approximation as long as the total effect of the perturbation is small, i.e. the probability for finding the system in the original state is close to 1. To obtain the best convergence of the perturbative expansion, we go over to the interaction picture with respect to the unperturbed Hamilton operator Ĥ_0 and then break off the Dyson series after the term linear in Ĥ_I(t_1) = e^{iĤ_0 t_1/ħ} V̂(t_1) e^{−iĤ_0 t_1/ħ}. If the initial state at time t′ is |φ_n⟩, then the probability amplitude for finding the system in state |φ_m⟩ at the later time t is given by

    α_{n→m}(t) = ⟨φ_m|ψ_I(t)⟩
               ≈ ⟨φ_m| [ |φ_n⟩ − (i/ħ) ∫_{t′}^{t} dt_1 e^{iĤ_0 t_1/ħ} V̂(t_1) e^{−iĤ_0 t_1/ħ} |φ_n⟩ ]
               = δ_{mn} − (i/ħ) ∫_{t′}^{t} dt_1 e^{i(ω_m−ω_n)t_1} ⟨φ_m|V̂(t_1)|φ_n⟩ .    (4.72)

For m ≠ n we find

    α_{n→m}(t) ≈ −(i/ħ) ∫_{t′}^{t} dt_1 e^{i(ω_m−ω_n)t_1} ⟨φ_m|V̂(t_1)|φ_n⟩ ,    (4.73)

and the transition probability is then given by

    p_{n→m}(t) = |α_{n→m}(t)|² .    (4.74)
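For a weak constant perturbation the first-order amplitude Eq. (4.73) can be compared directly with the exact result. A sketch for a two-level system with the illustrative choice H_0 = diag(0, ω) and V = g σ_x, g ≪ ω (ħ = 1, t′ = 0):

```python
import numpy as np

# First-order amplitude, Eq. (4.73), versus the exact amplitude for a
# two-level system; H0 = diag(0, ω), V = g σ_x, all values illustrative.
omega, g, t = 1.0, 0.02, 3.0
H0 = np.diag([0.0, omega]).astype(complex)
V = np.array([[0, g], [g, 0]], dtype=complex)

# Eq. (4.73): α_{1→2}(t) = -i ∫_0^t dt1 e^{iω t1} ⟨φ_2|V|φ_1⟩, done analytically
alpha_pt = -1j * g * (np.exp(1j * omega * t) - 1) / (1j * omega)

# Exact interaction-picture amplitude ⟨φ_2| U0†(t) e^{-iHt} |φ_1⟩
E, W = np.linalg.eigh(H0 + V)
Ufull = W @ np.diag(np.exp(-1j * E * t)) @ W.conj().T
alpha_exact = np.exp(1j * omega * t) * Ufull[1, 0]

print(abs(alpha_pt), abs(alpha_exact))   # agree up to O(g²) corrections
```

At this coupling the two magnitudes differ only in the fourth decimal place, consistent with the neglected higher-order Dyson terms.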
Periodic perturbation

Let us consider the special case in which we have a periodic perturbation

    V̂(t) = V̂_0 cos ωt = (1/2)V̂_0 (e^{iωt} + e^{−iωt}) .    (4.75)

Inserting this into Eq. (4.73) we find, using ω_{mn} := ω_m − ω_n,

    α_{n→m}(t) ≈ −(i/ħ) ∫_{0}^{t} dt_1 e^{iω_{mn}t_1} ⟨φ_m|V̂_0|φ_n⟩ (1/2)(e^{iωt_1} + e^{−iωt_1})
               = −(i/2ħ) ⟨φ_m|V̂_0|φ_n⟩ [ (e^{−i(ω−ω_{mn})t} − 1)/(−i(ω − ω_{mn})) + (e^{i(ω+ω_{mn})t} − 1)/(i(ω + ω_{mn})) ] .    (4.76)

Static perturbation. For a time independent perturbation we have ω = 0. This leads to

    α_{n→m}(t) = −(i/ħ) ⟨φ_m|V̂_0|φ_n⟩ (e^{iω_{mn}t} − 1)/(iω_{mn})    (4.77)

and then to

    p_{n→m}(t) = (1/ħ²) |⟨φ_m|V̂_0|φ_n⟩|² sin²(ω_{mn}t/2)/(ω_{mn}/2)² .    (4.78)

For sufficiently large times t this is a very sharply peaked function of the frequency ω_{mn}. In fact, in the limit t → ∞ this function tends towards a delta function,

    lim_{t→∞} sin²(ω_{mn}t/2)/(ω_{mn}/2)² = 2πt δ(ω_{mn}) .    (4.79)

For sufficiently large times (t ≫ ω_{mn}⁻¹) we therefore find that the transition probability grows linearly in time. We find Fermi's golden rule for time independent perturbations,

    p^{ω=0}_{n→m}(t) = (2π/ħ²) t |⟨φ_m|V̂_0|φ_n⟩|² δ(ω_n − ω_m) .    (4.80)

Obviously this cannot be correct for arbitrarily large times t, because the transition probabilities are bounded by unity.
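The normalization in Eq. (4.79) can be checked numerically: the peaked function integrates to 2πt over ω, which is exactly the weight of the delta function. A sketch (the integration range and grid are arbitrary choices):

```python
import numpy as np

# Check of Eq. (4.79): ∫ dω sin²(ωt/2)/(ω/2)² = 2πt, so for large t the
# function acts like 2πt·δ(ω). Range and resolution chosen ad hoc.
t = 1.0
w = np.linspace(-2000.0, 2000.0, 2_000_001)
dw = w[1] - w[0]

# sin(ωt/2)/(ω/2) = t·sinc(ωt/2π) with numpy's normalized sinc,
# which also handles ω = 0 correctly.
f = t**2 * np.sinc(w * t / (2 * np.pi))**2

integral = f.sum() * dw
print(integral, 2 * np.pi * t)   # both ≈ 6.28
```

The small residual difference comes from truncating the slowly decaying 1/ω² tails at |ω| = 2000; widening the range shrinks it further.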
High frequency perturbation. If the frequency of the perturbation is unequal to zero, we find two contributions to the transition amplitude, one with the denominator ω − ω_{mn} and the other with the denominator ω + ω_{mn}. As the frequency ω is always positive, only the first denominator can become zero, in which case we have a resonance and the first term in Eq. (4.76) dominates. We find

    p_{n→m}(t) = (1/4ħ²) |⟨φ_m|V̂_0|φ_n⟩|² [ sin²((ω−ω_{mn})t/2)/((ω−ω_{mn})/2)² + sin²((ω+ω_{mn})t/2)/((ω+ω_{mn})/2)² ] .    (4.81)

Again we find that for large times the transition probability grows linearly in time, which is formulated as Fermi's golden rule for time dependent perturbations,

    p^{ω≠0}_{n→m}(t) = (π/2ħ²) t |⟨φ_m|V̂_0|φ_n⟩|² [ δ(ω_n − ω_m − ω) + δ(ω_n − ω_m + ω) ] .    (4.82)

The Zeno effect

Fermi's golden rule is not only limited to times that are not too large, but also finds its limitations for small times. In fact, for times t ≪ ω_{mn}⁻¹ the transition probability grows quadratically in time. This is not only a feature of our particular calculation, but is a general feature of the quantum mechanical time evolution that is governed by the Schrödinger equation. This can be shown quite easily and is summarized in the following

Theorem 50 Given a Hamilton operator Ĥ and an initial state |φ(0)⟩, the transition probability to any orthogonal state |φ_⊥⟩ grows quadratically for small times, i.e.

    lim_{t→0} (d/dt) |⟨φ_⊥|φ(t)⟩|² = 0 .    (4.83)

Proof: Let us first calculate the derivative of the transition probability p(t) = |α(t)|², with α(t) = ⟨φ_⊥|φ(t)⟩, for arbitrary times t. We find

    (dp/dt)(t) = (dα/dt)(t) α*(t) + α(t) (dα*/dt)(t) .    (4.84)
Obviously α(t = 0) = ⟨φ_⊥|φ(0)⟩ = 0 (each term in Eq. (4.84) contains a factor α or α*, while dα/dt remains finite), so that we find

    lim_{t→0} (dp/dt)(t) = 0 .    (4.85)

This finishes the proof. □

The transition probability for short times is therefore

    p(t) = p(0) + p′(0)t + p″(0)t²/2 + … = Ct²/2 + higher orders in t ,    (4.86)

where C is a positive constant, because p(0) = p′(0) = 0.

Theorem 50 has a weird consequence which you have encountered (in disguise) in the first problem sheet. Assume that we have a two-state system with the orthonormal basis states |0⟩ and |1⟩. Imagine that we start at time t = 0 in the state |0⟩. The time evolution of the system will be governed by a Hamilton operator Ĥ, and after some time T the system will be in the state |1⟩.

Now let the system evolve for a time T, but, as I am very curious how things are going, I decide that I will look in which state the system is after the times T/n, 2T/n, …, T. The time evolution takes the initial state after a time T/n into

    |φ(T/n)⟩ = e^{−iĤT/(nħ)} |0⟩ .    (4.87)

Now I make a measurement to determine in which of the two states |0⟩ or |1⟩ the system is. The probability for finding the system in state |0⟩ is p_1 = 1 − C(T/n)², with some nonzero constant C. If I find the system in the state |0⟩, then we wait until time 2T/n and perform the same measurement again. The probability that in all of these n measurements I will find the system in the state |0⟩ is given by

    p = (1 − C T²/n²)ⁿ ≈ 1 − C T²/n .    (4.88)

In the limit of infinitely many measurements, this probability tends to 1. This result can be summarized as

    A continuously observed system does not evolve in time.
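The repeated-measurement argument can be simulated exactly for a two-level system. A sketch with the illustrative choice Ĥ = (Ω/2)σ_x and Ω = π/T, so that without measurements the system flips completely to |1⟩ at time T (ħ = 1): the survival amplitude over one interval T/n is cos(ΩT/2n), hence the survival probability after n measurements is cos^{2n}(ΩT/2n).

```python
import numpy as np

# Quantum Zeno effect: H = (Ω/2)σ_x with Ω = π and T = 1, so that
# e^{-iHT}|0⟩ = -i|1⟩ (a full flip). Projecting onto |0⟩ after each of
# n intervals gives the survival probability p(n) = cos^{2n}(ΩT/2n).
Omega, T = np.pi, 1.0

def survival(n):
    theta = Omega * T / (2 * n)    # rotation angle per interval
    return np.cos(theta) ** (2 * n)

for n in (1, 10, 100, 1000):
    print(n, survival(n))          # 0 for n = 1, approaching 1 as n grows
```

With a single look at time T the system is certainly in |1⟩; with frequent looks the survival probability behaves as 1 − π²/4n, which tends to 1 exactly as in Eq. (4.88).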
This phenomenon has the name 'Quantum Zeno effect' and has indeed been observed in experiments about ten years ago.

With this slightly weird effect I finish this part of the lecture and now move on to explain some features of quantum information theory, a field that has developed only in the last few years.
Chapter 5

Quantum Information Theory

In 1948 Claude Shannon formalised the notion of information and created what is now known as classical information theory. Until about five years ago this field was actually known simply as information theory, but now the additional word 'classical' has become necessary. The reason for this is the realization that there is a quantum version of the theory which differs in quite a few aspects from the classical version. In these last lectures of this course I intend to give you a flavour of this new theory that is currently emerging. What I am going to show you in the next few lectures is more or less at the cutting edge of physics, and many of the ideas that I am going to present are not older than 5 years. In fact the material is modern enough that many of your physics professors will actually not know these things very well. So, after these lectures you will know more than some professors of physics, and that's not easy to achieve as a third year student. Now you might be worried that we will require plenty of difficult mathematics, as you would expect this for a lecture that presents the cutting edge of physical science. This, however, is not the case. In this lecture I have taught you many of the essential techniques that are necessary for you to understand quantum information theory. This is again quite different from other branches of physics that are at the cutting edge of research, e.g. superstring theory, where you need to study for quite a while until you reach a level that allows you to carry out research. The interesting
point about quantum information theory is the fact that it unifies two apparently different fields of science, namely physics and information theory. Your questions will now be: 'What is the connection?' and 'Why did people invent the whole thing?'

The motivation for scientists to begin to think about quantum information came from the rapid development of microprocessor technology. As many of you know, computers are becoming more and more powerful every year, and if you bought a computer 5 years ago then it will be hopelessly inferior to the latest model that is on the market. Even more amazing is the fact that computers do get more powerful, but they do not get much more expensive. This qualitative description of the development of micro-electronics can actually be quantified quite well. Many years ago, in the 1960s, Gordon Moore, one of the founders of Intel, observed that the number of transistors per unit area of a micro-chip doubles roughly every 18 months. He then predicted that this development will continue in the future. As it turns out, he was right with his prediction and we are still observing the same growth rate. This development can be seen in Fig. 5.1.

[Figure 5.1: The number of transistors per unit area of a micro-chip doubles roughly every 18 months.]

Of course sooner or later this development has to stop, because if it would continue like this for another 20 years then the transistors would be so small that they would be fully charged by just one electron, and they would shrink to the size of an atom. While so far micro-electronics has worked on principles that can be understood to a large extent using classical physics, it is clear that transistors that are of the size of one atom must see plenty of quantum mechanical effects. As a matter of fact we would expect that quantum mechanical effects will play a significant role even earlier.
This implies that we should really start to think about information processing and computation on the quantum mechanical level. These thoughts led to the birth of quantum information theory. Nevertheless, it is not yet clear why there should be an essential difference between information at the classical level and information at the quantum level; as we will see, however, there is one.

A very important insight that people had is the observation that information should not be regarded as an isolated, purely mathematical concept! Why is that so? You may try to define information as an abstract concept, but you should never forget that information needs to
be represented and transmitted. Both of these processes are physical. An example for the storage of information are my printed lecture notes, which use ink on paper. Even in your brain information is stored not in an immaterial form, but rather in the form of synaptic connections between your brain cells. A computer, finally, stores information in transistors that are either charged or uncharged. Likewise information transmission requires physical objects. Talking to you means that I am using sound waves (described by classical physics) to send information to you. Television signals going through a cable represent information transfer using a physical system. In a computer, finally, the information that is stored in the transistors is transported by small currents.

So, clearly information and physics are not two separate concepts but should be considered simultaneously. Our everyday experience is of course mainly dominated by classical physics, and therefore it is not surprising that our normal perception of information is governed by classical physics. However, the question remains what will happen when we try to unify information theory with quantum physics. What new effects can appear? Can we do useful things with this new theory? These are all important questions, but above all I think that it is good fun to learn something new about nature.
5.1 What is information? Bits and all that.

5.2 From classical information to quantum information.

5.3 Distinguishing quantum states and the no-cloning theorem.

5.4 Quantum entanglement: From qubits to ebits.

5.5 Quantum state teleportation.

5.6 Quantum dense coding.

5.7 Local manipulation of quantum states.

5.8 Quantum cryptography

5.9 Quantum computation

Quantum Bits

Before I really start, I will have to introduce a very basic notion. The first one is the generalization of the classical bit. In classical information theory information is usually represented in the form of classical bits, i.e. 0 and 1. These two values may for example be represented as an uncharged transistor ('0') and a fully charged transistor ('1'), see Fig. 5.2. Note that a charged transistor easily holds 10⁸ electrons. Therefore it doesn't make much difference whether a few thousand electrons are
missing. A transistor charged with 10⁸ electrons and one with 0.999·10⁸ electrons both represent the logical value 1.

[Figure 5.2: A transistor can represent two distinct logical values. An uncharged transistor represents '0' and a fully charged transistor represents '1'. The amount of charge may vary a little bit without causing an error.]

This situation changes when we consider either very small transistors or, in the extreme case, atoms. Imagine an atom which stores the numbers '0' and '1' in its internal states. For example, an electron in the ground state represents the value '0', while an electron in the excited state represents the value '1' (see Fig. 5.3). In the following we will disregard all the other energy levels and idealize the system as a two level system. Such a quantum mechanical two level system will from now on be called a quantum bit or, shortly, a qubit. So far this is just the same situation as in the classical case of a transistor. However, there are two differences. Firstly, the atomic system will be much more sensitive to perturbations, because now it makes a big difference whether there is one electron or no electron. Secondly, and probably more importantly, in quantum mechanics we have the superposition principle. Therefore we do not only have the possibilities 0 and 1, represented by the two quantum states |0⟩ and |1⟩, but we may also have coherent superpositions between different values (Fig. 5.3), i.e.

    |ψ⟩ = a|0⟩ + b|1⟩ .    (5.1)

Therefore we have the ability to represent simultaneously two values in a single quantum bit. But it gets even better. Imagine that you are holding four qubits. Then they can be in a state that is a coherent
superposition of 16 different states, each representing a binary string:

    |ψ⟩ = (1/4)(|0000⟩ + |0001⟩ + |0010⟩ + |0011⟩
              + |0100⟩ + |0101⟩ + |0110⟩ + |0111⟩
              + |1000⟩ + |1001⟩ + |1010⟩ + |1011⟩
              + |1100⟩ + |1101⟩ + |1110⟩ + |1111⟩) .    (5.2)

[Figure 5.3]

Evidently a collection of n qubits can be in a state that is a coherent superposition of 2ⁿ different quantum states, each of which represents a number in binary notation. If we apply a unitary transformation to such a state, we therefore manipulate 2ⁿ binary numbers simultaneously! This represents a massive parallelism in our computation, which is responsible for the fact that a quantum mechanical system can solve certain problems exponentially faster than any classical system can. However, it is very difficult to build such a quantum system in a way that we can control it very precisely and that it is insensitive to noise at the same time. This is quite difficult, and there are no real quantum computers around at present.

If I have enough time, I am going to explain the idea of a quantum computer in more detail, but first I would like to present some effects and applications of quantum entanglement which have actually been realized in experiments.

5.10 Entanglement and Bell inequalities

In the previous section we have seen that the superposition principle is one source of the new effects that can arise in quantum information theory. However, the superposition principle alone exists also in classical physics, e.g. in sound waves or electro-magnetic waves. What
is completely absent from classical physics is the notion of entanglement (from the German word Verschränkung), of which you have heard earlier in this lecture. Entanglement was first investigated by Schrödinger in his Gedanken experiments using cats. Later Einstein, Podolsky and Rosen proposed a famous Gedanken experiment which they used to criticise quantum mechanics. The notion of entanglement arises from the superposition principle together with the tensor product structure of the quantum mechanical Hilbert space. States of the form

    |ψ⟩ = (1/2)(|↓⟩ + |↑⟩) ⊗ (|↓⟩ + |↑⟩)    (5.3)

are called product states. Such states are called disentangled, as the outcome of a measurement on the first system is independent of the outcome of a measurement on the other particle. On the other hand, states of the form

    |ψ⟩ = (1/√2)(|↓⟩ ⊗ |↓⟩ + |↑⟩ ⊗ |↑⟩)    (5.4)

are very strongly correlated. If a measurement on the first system shows that the system is in state |↓⟩ (|↑⟩), then we immediately know that the second system must be in state |↓⟩ (|↑⟩) too. If this were all that can be said about the correlations in the state Eq. (5.4), then it would not be justified to call such states entangled, simply because we can produce the same correlations also in a purely classical setting. Imagine for example that we have two coins and that they are always prepared in a way such that either both of them show heads or both of them show tails. Then by looking at only one of them, we know whether the other one shows heads or tails. The correlations in the state Eq. (5.4), however, have much more complicated properties. These new properties are due to the fact that in quantum mechanics we can make measurements in bases other than the {|↓⟩, |↑⟩} basis. Any basis of the form {a|↓⟩ + b|↑⟩, b*|↓⟩ − a*|↑⟩} can also be used. This makes the structure of the quantum mechanical state Eq. (5.4) much richer than that of the example of the two coins.
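The distinction between the product state Eq. (5.3) and the entangled state Eq. (5.4) can be made quantitative: writing the four amplitudes of a two-spin state as a 2×2 matrix, the state is a product state exactly if that matrix has rank 1 (a single nonzero Schmidt coefficient). A small numpy sketch:

```python
import numpy as np

# Product state Eq. (5.3) vs entangled state Eq. (5.4): a two-spin state
# is a product state iff its 2×2 amplitude matrix has rank 1.
down, up = np.array([1.0, 0.0]), np.array([0.0, 1.0])

product = 0.5 * np.kron(down + up, down + up)                     # Eq. (5.3)
entangled = (np.kron(down, down) + np.kron(up, up)) / np.sqrt(2)  # Eq. (5.4)

for psi in (product, entangled):
    schmidt = np.linalg.svd(psi.reshape(2, 2), compute_uv=False)
    print(np.sum(schmidt > 1e-12))    # 1 for the product state, 2 for Eq. (5.4)
```

The singular values of the reshaped amplitude matrix are the Schmidt coefficients; any state with more than one of them nonzero cannot be written in the form of Eq. (5.3) in any local basis.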
A famous example in which these new correlations manifest themselves is that of the Bell inequalities. In the rest of this section I am going to explain to you some of the ideas behind the Bell inequalities and
their significance. When Bell started to think about the foundations of quantum mechanics, the work that later led to the Bell inequalities, he was interested in one particular problem. Since the discovery of quantum mechanics, physicists have been worried about the fact that quantum mechanical measurements lead to random measurement outcomes. All that quantum mechanics predicts are the probabilities for the different possible measurement outcomes. This is quite substantially different from everything that we know from classical physics, where the measurement outcomes are not random if we have complete knowledge of all system variables. Only incomplete knowledge of the system can lead to random measurement outcomes. Does that mean that quantum mechanics is an incomplete description of nature? Are there hidden variables, which we cannot observe directly, but which determine the outcomes of our measurements? In a book on the mathematical foundations of quantum mechanics, John von Neumann had actually presented a proof that such theories cannot exist. Unfortunately the proof is wrong, as von Neumann had made a wrong assumption. The question whether there are hidden variable theories that reproduce all quantum mechanical predictions was therefore still open. John Bell re-investigated the experiment that had originally been proposed by Einstein, Podolsky and Rosen, and he finally succeeded in showing that there is a physically observable quantity for which local hidden variable theories and quantum mechanics give different predictions. This was an immense step forward, because now it had become possible to test experimentally whether quantum mechanics with all its randomness is correct, or whether there are hidden variable theories that explain the randomness in the measurement results by our incomplete knowledge. In the following I am going to show you a derivation of Bell's inequalities.
To do this I now need to define quantitatively what I mean by correlations. The state Eq. (5.4) describes the state of two spin-1/2 particles. Assume that I measure the orientation of the first spin along direction a and that of the second spin along direction b. Both measurements can only have one of two outcomes: either the spin is parallel or anti-parallel to the direction along which it has been measured. If it is parallel, then I assign the value a = 1 (b = 1) to the measurement outcome; if it is anti-parallel, then I assign the value a = −1 (b = −1)
to the measurement outcome. If we repeat the measurement N times, each time preparing the original state Eq. (5.4) and then performing the measurement, then the correlation between the two measurements is defined as

C(a, b) = lim_{N→∞} (1/N) Σ_{n=1}^{N} a_n b_n .    (5.5)

Now I want to show that correlations of the form Eq. (5.5) satisfy Bell's inequalities, provided two assumptions are made:

1. We impose locality, i.e. a measurement on the first particle has no effect on the second side when the measurements take place at space-like separated locations (no signal can travel, after the measurement on the first particle, from there to the second particle before the measurement on the second particle has been carried out). This leads to probabilities that are just products of the probability for a measurement outcome on the first side and the probability for a measurement outcome on the second side.

2. There are hidden variables that we cannot access directly, but which influence the probabilities that are observed. This implies that all probabilities are of the form P_A(a, λ) and P_B(b, λ), where λ describes the hidden variables.

Under these assumptions I want to prove that for measurements along the four directions a, a', b and b' we find the Bell inequality

|C(a, b) + C(a, b') + C(a', b) − C(a', b')| ≤ 2 .    (5.6)

Proof: To see this we use the fact that for all a_n, a'_n, b_n, b'_n ∈ [−1, 1]

|a_n (b_n + b'_n) + a'_n (b_n − b'_n)| ≤ 2 .    (5.7)

Now we find

C(a, b) = ∫ dλ ρ(λ) [P_A(+, λ)P_B(+, λ) + P_A(−, λ)P_B(−, λ) − P_A(+, λ)P_B(−, λ) − P_A(−, λ)P_B(+, λ)]
        = ∫ dλ ρ(λ) [P_A(+, λ) − P_A(−, λ)] [P_B(+, λ) − P_B(−, λ)]
        ≡ ∫ Q_A(a, λ) Q_B(b, λ) ρ(λ) dλ    (5.8)
and therefore

|C(a, b) + C(a, b') + C(a', b) − C(a', b')|
  ≤ ∫ |Q_A(a, λ)Q_B(b, λ) + Q_A(a, λ)Q_B(b', λ) + Q_A(a', λ)Q_B(b, λ) − Q_A(a', λ)Q_B(b', λ)| ρ(λ) dλ
  ≤ 2 .

This finishes the proof. Therefore, if we are able to explain quantum mechanics by a local hidden variable theory, then Eq. (5.6) has to be satisfied.

Now let us calculate what quantum mechanics tells us about the left hand side of Eq. (5.6). Quantum mechanically the correlation is given by

C(a, b) = ⟨ψ|(â·σ) ⊗ (b̂·σ)|ψ⟩ ,    (5.9)

where σ = σ_x e_x + σ_y e_y + σ_z e_z with the Pauli operators σ_i. Now we can express the correlation in terms of the angle θ_ab between the vectors â and b̂. We find (this is an exercise for you)

C(a, b) = − cos θ_ab .    (5.10)

Now we make a particular choice for the four vectors a, a', b and b'. We choose a and b parallel, and a' and b' such that all four vectors lie in one plane. Finally we choose the angles θ_ab' = θ_a'b = φ. All this is shown in Fig. 5.4. Inserting this choice into the left hand side of Eq. (5.6), we find

|1 + 2 cos φ − cos 2φ| ≤ 2 .    (5.11)

Plotting the function on the left hand side of Eq. (5.11) in Fig. 5.5, we see that the inequality is actually violated for quite a wide range of values of φ. The maximum value of the left hand side of Eq. (5.11) is 2.5 and is attained for φ = π/3.

Of course, now the big question is whether Bell's inequalities are violated or not. Experiments testing the validity of the Bell inequalities have been carried out, e.g. in Paris in 1982. The idea was very simple, but of course it was quite difficult to actually do the experiment. A central source produces photons which are in the singlet state

|ψ⁻⟩ = (|↑⟩|↓⟩ − |↓⟩|↑⟩)/√2 .    (5.12)
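Both halves of the argument can be checked numerically. The following short Python sketch (my own illustration, not part of the notes; the function names `chsh` and `f` are arbitrary) first verifies that every mixture of deterministic local hidden-variable strategies satisfies Eq. (5.6), and then evaluates the quantum mechanical left hand side of Eq. (5.11) to locate its maximum.

```python
# Sketch: local hidden-variable bound vs. quantum prediction.
import itertools
import math
import random

def chsh(strategies, weights):
    """|C(a,b) + C(a,b') + C(a',b) - C(a',b')| for a mixture of
    deterministic strategies (a, a', b, b'), each outcome +/-1."""
    c = lambda i, j: sum(w * s[i] * s[j] for s, w in zip(strategies, weights))
    return abs(c(0, 2) + c(0, 3) + c(1, 2) - c(1, 3))

# All 16 deterministic assignments of the four outcomes.
strategies = list(itertools.product([-1, 1], repeat=4))

# Random distributions over the hidden variable lambda.
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in strategies]
    weights = [x / sum(w) for x in w]
    assert chsh(strategies, weights) <= 2 + 1e-12

# Quantum mechanics: with C(a,b) = -cos(theta_ab) and the planar choice
# theta_ab = 0, theta_ab' = theta_a'b = phi, theta_a'b' = 2*phi, the
# left hand side of Eq. (5.6) becomes |1 + 2*cos(phi) - cos(2*phi)|.
def f(phi):
    return abs(1 + 2 * math.cos(phi) - math.cos(2 * phi))

best = max((k * math.pi / 100000 for k in range(100001)), key=f)
assert math.isclose(f(math.pi / 3), 2.5, rel_tol=1e-9)
print("local models stay <= 2; quantum maximum %.3f at phi = %.4f" % (f(best), best))
```

Each deterministic strategy contributes a_n(b_n + b'_n) + a'_n(b_n − b'_n) = ±2, which is why no mixture can exceed 2, while the quantum expression reaches 2.5 at φ = π/3.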
Figure 5.4: The relative orientation of the four directions a, a', b and b' along which the spins are measured.

One can derive identical Bell inequalities for such a state, and this state was chosen because it can be produced fairly easily. The way it is done is via the decay of an atom in such a way that it always emits two photons and the total change in angular momentum in the atom is zero. The two photons are then necessarily in a spin-0 state, i.e. a singlet state. The two photons fly away in opposite directions towards two measurement apparatuses. These apparatuses measure the polarization state of the two photons along four possible directions that have been chosen to give maximal violation of the Bell inequalities. After each decay, for each apparatus a direction was chosen randomly and independently of the other side. Finally the measurement result was noted down. This experiment found a value for the correlations of about 2.7, which is reasonably close to the quantum mechanically predicted value of 2.82. More recently, more precise experiments have been carried out, and the violation of Bell's inequalities has been clearly demonstrated.
Figure 5.5: The left hand side of Eq. (5.11) is plotted. You can clearly see that it can exceed the value of 2 and achieves a maximum of 2.5.

5.11 Quantum State Teleportation

The procedure we will analyse is called quantum teleportation and can be understood as follows. The naive idea of teleportation involves a protocol whereby an object positioned at a place A and time t first "dematerializes" and then reappears at a distant place B at some later time t + T. Quantum teleportation implies that we wish to apply this procedure to a quantum object. However, genuine quantum teleportation differs from this idea, because we are not teleporting the whole object but just its state from particle A to particle B. As quantum particles are indistinguishable anyway, this amounts to 'real' teleportation. One way of performing teleportation (and certainly the way portrayed in various science fiction movies, e.g. The Fly) is first to
learn all the properties of that object (thereby possibly destroying it). We then send this information as a classical string of data to B, where another object with the same properties is re-created. One problem with this picture is that, if we have a single quantum system in an unknown state, we cannot determine its state completely because of the uncertainty principle. More precisely, we need an infinite ensemble of identically prepared quantum systems to be able to determine its quantum state completely. So it would seem that the laws of quantum mechanics prohibit teleportation of single quantum systems. However, the very feature of quantum mechanics that leads to the uncertainty principle (the superposition principle) also allows the existence of entangled states. These entangled states will provide a form of quantum channel to conduct a teleportation protocol. It will turn out that there is no need to learn the state of the system in order to teleport it. On the other hand, there is a need to send some classical information from A to B, but part of the information also travels down an entangled channel. This then provides a way of distinguishing quantum and classical correlations, which we said was at the heart of quantifying entanglement. After the teleportation is completed, the original state of the particle at A is destroyed (although the particle itself remains intact) and so is the entanglement in the quantum channel. These two features are direct consequences of fundamental laws in information processing. I cannot explain these here as I do not have enough time, but if you are interested you should have a look at the article M.B. Plenio and V. Vedral, Contemp.
Physics 39, 431 (1998), which has been written for final year students and first year PhD students.

5.12 A basic description of teleportation

Let us begin by describing quantum teleportation in the form originally proposed by Bennett, Brassard, Crepeau, Jozsa, Peres, and Wootters in 1993. Suppose that Alice and Bob, who are distant from each other, wish to implement a teleportation procedure. Initially they need to share a maximally entangled pair of quantum mechanical two-level systems. Unlike the classical bit, a qubit can be in a superposition of its basis states, like |Ψ⟩ = a|0⟩ + b|1⟩. This means that if Alice and
Bob both have one qubit each, then their joint state may for example be

|Ψ⟩_AB = (|0⟩_A |0⟩_B + |1⟩_A |1⟩_B)/√2 ,    (5.13)

where the first ket (with subscript A) belongs to Alice and the second (with subscript B) to Bob. This state is entangled, meaning that it cannot be written as a product of the individual states (like e.g. |00⟩). Note that this state is different from a statistical mixture (|00⟩⟨00| + |11⟩⟨11|)/2, which is the most correlated state allowed by classical physics.

Now suppose that Alice receives a qubit in a state which is unknown to her (let us label it |Φ⟩ = a|0⟩ + b|1⟩) and she has to teleport it to Bob. The state has to be unknown to her because otherwise she can just phone Bob up and tell him all the details of the state, and he can then recreate it on a particle that he possesses. If Alice does not know the state, then she cannot measure it to obtain all the necessary information to specify it. Therefore she has to resort to using the state |Ψ⟩_AB that she shares with Bob. To see what she has to do, we write out the total state of all three qubits

|Φ⟩_AB := |Φ⟩|Ψ⟩_AB = (a|0⟩ + b|1⟩)(|00⟩ + |11⟩)/√2 .    (5.14)

However, the above state can be written in the following convenient way (here we are only rewriting the above expression in a different basis; there is no physical process taking place in between):

|Φ⟩_AB = (a|000⟩ + a|011⟩ + b|100⟩ + b|111⟩)/√2
       = (1/2) [ |Φ⁺⟩(a|0⟩ + b|1⟩) + |Φ⁻⟩(a|0⟩ − b|1⟩) + |Ψ⁺⟩(a|1⟩ + b|0⟩) + |Ψ⁻⟩(a|1⟩ − b|0⟩) ] ,    (5.15)

where

|Φ⁺⟩ = (|00⟩ + |11⟩)/√2    (5.16)
|Φ⁻⟩ = (|00⟩ − |11⟩)/√2    (5.17)
|Ψ⁺⟩ = (|01⟩ + |10⟩)/√2    (5.18)
|Ψ⁻⟩ = (|01⟩ − |10⟩)/√2    (5.19)

form an orthonormal basis of Alice's two qubits (remember that the first two qubits belong to Alice and the last qubit belongs to Bob).
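The rewriting in Eq. (5.15) is a pure basis change, and you can verify it numerically. Here is a short Python sketch (my own check, not code from the notes; the amplitude ordering |q1 q2 q3⟩ and the helper `kron` are choices I made for illustration):

```python
# Verify that expanding the four Bell-basis terms of Eq. (5.15)
# reproduces (a|0> + b|1>)(|00> + |11>)/sqrt(2), amplitude by amplitude.
import math

s = 1 / math.sqrt(2)

def kron(u, v):
    """Tensor product of two amplitude vectors."""
    return [ui * vj for ui in u for vj in v]

a, b = 0.6, 0.8  # an arbitrary normalised input state a|0> + b|1>
lhs = kron([a, b], [s, 0, 0, s])  # 8 amplitudes, ordering |q1 q2 q3>

# Bell states of Alice's two qubits, Eqs. (5.16)-(5.19).
phi_p, phi_m = [s, 0, 0, s], [s, 0, 0, -s]
psi_p, psi_m = [0, s, s, 0], [0, s, -s, 0]

# Bob's conditional states read off from Eq. (5.15).
rhs = [0.0] * 8
for bell, bob in [(phi_p, [a, b]), (phi_m, [a, -b]),
                  (psi_p, [b, a]), (psi_m, [-b, a])]:
    term = kron(bell, bob)
    rhs = [r + 0.5 * t for r, t in zip(rhs, term)]

assert all(math.isclose(l, r, abs_tol=1e-12) for l, r in zip(lhs, rhs))
print("Eq. (5.15) verified: both sides agree amplitude by amplitude")
```

Note that the prefactor 1/2 in Eq. (5.15) means each Bell outcome occurs with probability 1/4, exactly as claimed in step 1 of the protocol below.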
The above basis is frequently called the Bell basis. This is a very useful way of writing the state of Alice's two qubits and Bob's single qubit, because it displays a high degree of correlation between Alice's and Bob's parts: to every state of Alice's two qubits (i.e. |Φ⁺⟩, |Φ⁻⟩, |Ψ⁺⟩, |Ψ⁻⟩) there corresponds a state of Bob's qubit. In addition, the state of Bob's qubit in all four cases looks very much like the original qubit that Alice has to teleport to Bob. It is now straightforward to see how to proceed with the teleportation protocol:

1. Upon receiving the unknown qubit in state |Φ⟩, Alice performs a projective measurement on her two qubits in the Bell basis. This means that she will obtain one of the four Bell states randomly, and with equal probability.

2. Suppose Alice obtains the state |Ψ⁺⟩. Then the state of all three qubits (Alice + Bob) collapses to the state

|Ψ⁺⟩(a|1⟩ + b|0⟩)    (5.20)

(the last qubit belongs to Bob as usual). Alice now has to communicate the result of her measurement to Bob (over the phone, for example). The point of this communication is to inform Bob how the state of his qubit now differs from the state of the qubit Alice was holding previously.

3. Now Bob knows exactly what to do in order to complete the teleportation. He has to apply a unitary transformation on his qubit which implements a logical NOT operation: |0⟩ → |1⟩ and |1⟩ → |0⟩. He thereby transforms the state of his qubit into the state a|0⟩ + b|1⟩, which is precisely the state that Alice had to teleport to him initially. This completes the protocol.

It is easy to see that if Alice obtained some other Bell state, then Bob would have to apply some other simple operation to complete the teleportation. We leave it to the reader to work out the other two operations (note that if Alice obtained |Φ⁺⟩, Bob would not have to do anything).
If |0⟩ and |1⟩ are written in their vector form, then the operations that Bob has to perform can be represented by the Pauli spin matrices, as depicted in Fig. 5.6.
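As a sketch of the complete protocol (my own illustration, not code from the notes; the dictionary of conditional states is read off from Eq. (5.15)), the following Python script applies the appropriate Pauli correction for each of the four possible measurement outcomes and checks that Bob always ends up holding a|0⟩ + b|1⟩:

```python
# Simulate the correction step of teleportation for all four outcomes.
import math

# Bob's conditional states after Alice's Bell measurement, from Eq. (5.15),
# for an input state a|0> + b|1>.
def conditional_states(a, b):
    return {
        "Phi+": [a, b],    # correction: identity
        "Phi-": [a, -b],   # correction: sigma_z
        "Psi+": [b, a],    # correction: sigma_x
        "Psi-": [-b, a],   # correction: sigma_z after sigma_x
    }

def apply(matrix, v):
    """Apply a 2x2 matrix to a 2-component amplitude vector."""
    return [matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1]]

I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]       # sigma_x: the logical NOT of step 3
Z = [[1, 0], [0, -1]]      # sigma_z: phase flip
ZX = [[0, 1], [-1, 0]]     # sigma_z applied after sigma_x

corrections = {"Phi+": I, "Phi-": Z, "Psi+": X, "Psi-": ZX}

a, b = 0.6, 0.8
for outcome, state in conditional_states(a, b).items():
    fixed = apply(corrections[outcome], state)
    assert all(math.isclose(x, y) for x, y in zip(fixed, [a, b]))
print("all four outcomes recover a|0> + b|1> after the Pauli correction")
```

Note that the correction Bob applies depends only on Alice's two classical bits of measurement result, never on a or b themselves, which is exactly the point emphasised below.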
Figure 5.6: The basic steps of quantum state teleportation. Alice and Bob are spatially separated, Alice on the left of the dashed line, Bob on the right. (a) Alice and Bob share a maximally entangled pair of particles in the state (|00⟩ + |11⟩)/√2. Alice wants to teleport the unknown state |ψ⟩ to Bob. (b) The total state of the three particles that Alice and Bob are holding is rewritten in the Bell basis Eqs. (5.16-5.19) for the two particles Alice is holding. Alice performs a measurement that projects the state of her two particles onto one of the four Bell states. (c) She transmits the result, encoded in the numbers 0, 1, 2, 3, to Bob, who performs a unitary transformation 1, σz, σx, σzσx that depends only on the measurement result that Alice obtained but not on the state |ψ⟩! (d) After Bob has applied the appropriate unitary operation on his particle, he can be sure that he is now holding the state that Alice was holding in (a).
An important fact to observe in the above protocol is that all the operations (Alice's measurements and Bob's unitary transformations) are local in nature. This means that there is never any need to perform a (global) transformation or measurement on all three qubits simultaneously, which is what allows us to call the above protocol a genuine teleportation. It is also important that the operations that Bob performs are independent of the state that Alice tries to teleport to Bob. Note also that the classical communication from Alice to Bob in step 2 above is crucial, because otherwise the protocol would be impossible to execute (there is a deeper reason for this: if we could perform teleportation without classical communication, then Alice could send messages to Bob faster than the speed of light; remember that I explained this in a previous lecture).

It is also important to observe that the initial state to be teleported is destroyed at the end, i.e. it becomes maximally mixed, of the form (|0⟩⟨0| + |1⟩⟨1|)/2. This has to happen, since otherwise we would end up with two qubits in the same state at the end of the teleportation (one with Alice and the other one with Bob). So, effectively, we would clone an unknown quantum state, which is impossible by the laws of quantum mechanics (this is the no-cloning theorem of Wootters and Zurek). I will explain this in more detail later.

For those of you who are interested in quantum information theory, here are some references for further reading:

L. Hardy, Contemp. Physics 39, 415 (1998)
M.B. Plenio and V. Vedral, Contemp. Physics 39, 431 (1998)
J. Preskill, http://www.theory.caltech.edu/people/preskill/ph229/