Mind reading computer report


This report describes the design of a machine that can help read the mind.


INTRODUCTION

People express their mental states, including emotions, thoughts, and desires, all the time through facial expressions, vocal nuances and gestures. This is true even when they are interacting with machines. Our mental states shape the decisions that we make, govern how we communicate with others, and affect our performance. The ability to attribute mental states to others from their behavior, and to use that knowledge to guide our own actions and predict those of others, is known as theory of mind, or mind-reading.

Existing human-computer interfaces are mind-blind, oblivious to the user's mental states and intentions. A computer may wait indefinitely for input from a user who is no longer there, or decide to do irrelevant tasks while a user is frantically working towards an imminent deadline. As a result, existing computer technologies often frustrate the user, have little persuasive power, and cannot initiate interactions with the user. Even when they do take the initiative, like the now-retired Microsoft Paperclip, they are often misguided and irrelevant, and simply frustrate the user. With the increasing complexity of computer technologies and the ubiquity of mobile and wearable devices, there is a need for machines that are aware of the user's mental state and that adaptively respond to it.
WHAT IS MIND READING?

A computational model of mind-reading

Drawing inspiration from psychology, computer vision and machine learning, the team in the Computer Laboratory at the University of Cambridge has developed mind-reading machines: computers that implement a computational model of mind-reading to infer the mental states of people from their facial signals. The goal is to enhance human-computer interaction through empathic responses, to improve the productivity of the user, and to enable applications to initiate interactions with and on behalf of the user, without waiting for explicit input from that user. There are difficult challenges:

Fig: Processing stages in the mind-reading system

Using a digital video camera, the mind-reading computer system analyzes a person's facial expressions in real time and infers that person's underlying mental state, such as whether he or she is agreeing or disagreeing, interested or bored, thinking or confused.
Prior knowledge of how particular mental states are expressed in the face is combined with analysis of facial expressions and head gestures occurring in real time. The model represents these at different granularities, starting with face and head movements and building them up in time and in space to form a clearer model of what mental state is being represented. Software from Nevenvision identifies 24 feature points on the face and tracks them in real time.

Movement, shape and colour are then analyzed to identify gestures like a smile or raised eyebrows. Combinations of these occurring over time indicate mental states. For example, a combination of a head nod with a smile and raised eyebrows might mean interest. The relationship between observable head and facial displays and the corresponding hidden mental states over time is modeled using Dynamic Bayesian Networks.
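As a rough illustration of this idea, the sketch below runs a minimal forward filter over a sequence of observed displays: a hidden mental state that persists over time is re-weighted by each incoming display. The states, displays and all probabilities here are invented example numbers, and a single-chain filter like this is only the simplest special case of the Dynamic Bayesian Networks the Cambridge system actually uses.

```python
# Illustrative sketch (not the Cambridge system): inferring a hidden mental
# state from observed head/facial displays with a forward filter, the core
# update that a Dynamic Bayesian Network generalizes. All numbers are made up.

STATES = ["interested", "bored"]

# P(state_t | state_{t-1}): mental states tend to persist across frames.
TRANSITION = {
    "interested": {"interested": 0.9, "bored": 0.1},
    "bored":      {"interested": 0.1, "bored": 0.9},
}

# P(observed display | state): a head nod with a smile is far more likely
# when the person is interested than when they are bored.
EMISSION = {
    "interested": {"nod+smile": 0.6, "neutral": 0.3, "gaze_away": 0.1},
    "bored":      {"nod+smile": 0.05, "neutral": 0.45, "gaze_away": 0.5},
}

def infer(displays, prior=None):
    """Return P(state | displays so far) after each observation."""
    belief = prior or {s: 1.0 / len(STATES) for s in STATES}
    history = []
    for obs in displays:
        # Predict step: propagate the belief through the transition model.
        predicted = {
            s: sum(belief[p] * TRANSITION[p][s] for p in STATES) for s in STATES
        }
        # Update step: weight by how well each state explains the display.
        unnorm = {s: predicted[s] * EMISSION[s][obs] for s in STATES}
        z = sum(unnorm.values())
        belief = {s: v / z for s, v in unnorm.items()}
        history.append(dict(belief))
    return history

# Belief in "interested" rises over the first two displays, then drops
# when the person looks away.
beliefs = infer(["nod+smile", "nod+smile", "gaze_away"])
```

A real DBN adds more hidden nodes, learned parameters, and structure search, but the predict-then-update loop above is the temporal backbone.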
WHY MIND READING?

Current projects in Cambridge are considering further inputs, such as body posture and gestures, to improve the inference. We can then use the same models to control the animation of cartoon avatars. We are also looking at the use of mind-reading to support online shopping and learning systems.
Fig: Monitoring a car driver

The mind-reading computer system presents information about your mental state as easily as a keyboard and mouse present text and commands. Imagine a future where we are surrounded by mobile phones, cars and online services that can read our minds and react to our moods. How would that change our use of technology and our lives? We are working with a major car manufacturer to implement this system in cars, to detect driver mental states such as drowsiness, distraction and anger.

The mind-reading computer system may also be used to monitor and suggest improvements in human-human interaction. The Affective Computing Group at the MIT
Media Laboratory is developing an emotional-social intelligence prosthesis that explores new technologies to augment and improve people's social interactions and communication skills.

HOW DOES IT WORK?

Fig: Futuristic headband

The mind reading actually involves measuring the volume and oxygen level of the blood around the subject's brain, using a technology called functional near-infrared spectroscopy (fNIRS). The user wears a sort of futuristic headband that sends light in that spectrum into the tissues of the head, where it is absorbed by active, blood-filled tissues. The headband then measures how much light was not absorbed, letting the computer gauge the metabolic demands that the brain is making. The results are often compared to an MRI, but can be gathered with lightweight, non-invasive equipment.
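The physics behind this measurement can be sketched with the modified Beer-Lambert law, which fNIRS analysis commonly uses to convert light attenuation at two wavelengths into changes in oxygenated and deoxygenated hemoglobin. The extinction coefficients, path length and differential pathlength factor below are placeholder values for illustration, not calibrated instrument constants.

```python
# Sketch of fNIRS arithmetic: the modified Beer-Lambert law relates the
# fraction of near-infrared light absorbed by tissue to hemoglobin
# concentration changes. All coefficients here are illustrative placeholders.
import math

def optical_density_change(baseline_intensity, measured_intensity):
    """Delta OD = log10(I_baseline / I_measured); more absorption -> higher OD."""
    return math.log10(baseline_intensity / measured_intensity)

def hemoglobin_changes(d_od_760, d_od_850, path_cm=3.0, dpf=6.0):
    """Solve a 2x2 system: two wavelengths, two unknowns (dHbO2, dHb)."""
    # Placeholder extinction coefficients [cm^-1 per (mmol/L)] at 760/850 nm.
    e = {760: {"hbo2": 0.6, "hb": 1.5}, 850: {"hbo2": 1.1, "hb": 0.8}}
    L = path_cm * dpf  # effective optical path length through the head
    a, b = e[760]["hbo2"] * L, e[760]["hb"] * L
    c, d = e[850]["hbo2"] * L, e[850]["hb"] * L
    det = a * d - b * c
    d_hbo2 = (d_od_760 * d - d_od_850 * b) / det
    d_hb = (d_od_850 * a - d_od_760 * c) / det
    return d_hbo2, d_hb
```

Active brain regions recruit oxygenated blood, so a rise in the recovered dHbO2 is the signal the headband's software would treat as increased metabolic demand.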
Wearing the fNIRS sensor, experimental subjects were asked to count the number of squares on a rotating onscreen cube and to perform other tasks. The subjects were then asked to rate the difficulty of the tasks, and their ratings agreed with the work intensity detected by the fNIRS system up to 83 percent of the time.

"We don't know how specific we can be about identifying users' different emotional states," cautioned Sergio Fantini, a biomedical engineering professor at Tufts. "However, the particular area of the brain where the blood-flow change occurs should provide indications of the brain's metabolic changes and, by extension, workload, which could be a proxy for emotions like frustration."

"Measuring mental workload, frustration and distraction is typically limited to qualitatively observing computer users or to administering surveys after completion of a task, potentially missing valuable insight into the user's changing experiences."

A computer program which can read silently spoken words by analyzing nerve signals in our mouths and throats has been developed by NASA. Preliminary results show that using button-sized sensors, which attach under the chin and on the side of the Adam's apple, it is possible to pick up and recognize nerve signals and patterns from the tongue and vocal cords that correspond to specific words.

"Biological signals arise when reading or speaking to oneself with or without actual lip or facial movement," says Chuck Jorgensen.
HEAD AND FACIAL ACTION UNIT ANALYSIS

Twenty-four facial landmarks are detected using a face template in the initial frame, and their positions are tracked across the video. The system builds on Facestation [1], a feature point tracker that supports both real-time and offline tracking of facial features on a live or recorded video stream. The tracker represents faces as face bunch graphs [23], or stack-like structures which efficiently combine graphs of individual faces that vary in factors such as pose, glasses, or physiognomy. The tracker outputs the positions of twenty-four feature points, which we then use for head pose estimation and facial feature extraction.

EXTRACTING HEAD ACTION UNITS

Natural human head motion typically ranges between 70-90° of downward pitch, 55° of upward pitch, 70° of yaw (turn), and 55° of roll (tilt), and usually occurs as a combination of all three rotations [16]. The output positions of the localized feature points are sufficiently accurate to permit the use of efficient, image-based head pose estimation. Expression-invariant points such as the nose tip, nose root, nostrils, and inner and outer eye corners are used to estimate the pose. Head yaw is given by the ratio of left to right eye widths. Head roll is given by the orientation angle of the line joining the two inner eye corners. The computation of both head yaw and roll is invariant to the scale variations that arise from moving toward or away from the camera. Head pitch is determined from the vertical displacement of the nose tip, normalized against the distance between the two eye corners to account for scale variations. The system supports up to 50°, 30° and 50° of yaw, roll and pitch respectively. Pose estimates across consecutive frames are then used to identify head action units.
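The pose geometry described above can be sketched as follows, assuming 2D feature-point coordinates from the tracker. The constants that map ratios and displacements to degrees are invented calibration values, not the system's actual ones.

```python
# Sketch of image-based head pose estimation from 2D feature points.
# The degree-scaling constants are made-up calibration placeholders.
import math

def head_roll(inner_left, inner_right):
    """Roll = orientation angle of the line joining the two inner eye corners."""
    dx = inner_right[0] - inner_left[0]
    dy = inner_right[1] - inner_left[1]
    return math.degrees(math.atan2(dy, dx))

def head_yaw(left_eye_width, right_eye_width):
    """Yaw from the left/right eye width ratio: equal widths -> frontal (0 deg).
    Turning the head foreshortens one eye, skewing the ratio."""
    ratio = left_eye_width / right_eye_width
    return 50.0 * (ratio - 1.0)  # placeholder linear mapping to degrees

def head_pitch(nose_tip_y, eye_line_y, inter_eye_dist):
    """Pitch from the nose tip's vertical displacement below the eye line,
    normalized by inter-eye distance so the estimate is scale-invariant."""
    normalized = (nose_tip_y - eye_line_y) / inter_eye_dist
    return 60.0 * (normalized - 0.6)  # placeholder neutral baseline and gain
```

Because yaw and roll use ratios and angles, and pitch divides by the inter-eye distance, moving toward or away from the camera does not change the estimates, which is the scale-invariance the text describes.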
For example, a pitch of 20° at time t followed by 15° at time t+1 indicates a downward head action, which is AU54 in the FACS coding.

EXTRACTING FACIAL ACTION UNITS

Facial actions are identified from component-based facial features (e.g. the mouth), comprising motion, shape and color descriptors. Motion- and shape-based analyses are particularly suitable for a real-time video system, in which motion is inherent and places a strict upper bound on the computational complexity of the methods used in order to meet time
constraints. Color-based analysis is computationally efficient, and is invariant to the scale or viewpoint of the face, especially when combined with feature localization (i.e. limited to regions already defined by feature point tracking). The shape descriptors are first stabilized against rigid head motion. For that, we imagine that the initial frame in the sequence is a reference frame attached to the head of the user. On that frame, let (Xp, Yp) be an "anchor" point: the 2D projection of the approximate real point around which the head rotates in 3D space. The anchor point is initially defined as the midpoint between the two mouth corners when the mouth is at rest, and lies at a distance d from the line l joining the two inner eye corners. In subsequent frames the point is measured at distance d from l, after accounting for head turns.

Fig 2: Polar distance in determining a lip corner pull and lip pucker

On each frame, the polar distance between each of the two mouth corners and the anchor point is computed. The average percentage change in polar distance, calculated with respect to an initial frame, is used to discern mouth displays. An increase or decrease of 10% or more, determined empirically, depicts a lip pull or a lip pucker respectively (Figure 2). In addition, depending on the sign of the change, we can tell whether the display is in its onset, apex, or offset. The advantages of using polar distances over geometric mouth width and height (which is what is used in Tian et al. [20]) are support for head motion and resilience to inaccurate feature point tracking, especially with respect to lower lip points.

Fig 3: Plot of aperture (red) and teeth (green) in luminance-saturation space
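The 10% polar-distance rule can be sketched directly. The coordinates and the choice of threshold default below simply restate the empirically determined value from the text; everything else is a minimal illustration.

```python
# Sketch of the mouth-display rule: compare each mouth corner's polar
# distance from the anchor point against the initial frame; an average
# change of +10% or more suggests a lip pull, -10% or more a lip pucker.
import math

def polar_distance(point, anchor):
    return math.hypot(point[0] - anchor[0], point[1] - anchor[1])

def classify_mouth(initial_corners, current_corners, anchor, threshold=0.10):
    """Corners are [(x, y), (x, y)] for the left and right mouth corners."""
    changes = []
    for p0, p1 in zip(initial_corners, current_corners):
        d0 = polar_distance(p0, anchor)
        changes.append((polar_distance(p1, anchor) - d0) / d0)
    avg = sum(changes) / len(changes)
    if avg >= threshold:
        return "lip pull"
    if avg <= -threshold:
        return "lip pucker"
    return "neutral"
```

The sign of the running change also tells whether the display is in its onset (moving away from neutral), apex (stable), or offset (returning), as the text notes.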
The mouth has two color regions that are of interest: aperture and teeth. The extent of aperture present inside the mouth depicts whether the mouth is closed, lips parted, or jaw dropped, while the presence of teeth indicates a mouth stretch. Figure 3 shows a plot of teeth and aperture samples in luminance-saturation space. Luminance, given by the relative lightness or darkness of the color, acts as a good discriminator for the two types of mouth regions. A sample of n = 125,000 pixels was used to learn the probability distribution functions of aperture and teeth. A lookup table defining the probability of a pixel being aperture given its luminance is computed for the range of possible luminance values (0% for black to 100% for white). A similar lookup table is computed for teeth. Online classification into mouth actions proceeds as follows: for every frame in the sequence, we compute the luminance value of each pixel in the mouth polygon. The luminance value is then looked up to determine the probability of the pixel being aperture or teeth. Depending on empirically determined thresholds, the pixel is classified as aperture, teeth, or neither. Finally, the total numbers of teeth and aperture pixels are used to classify the mouth region into closed (or lips parted), jaw drop, or mouth stretch. Figure 4 shows classification results of 1312 frames into closed, jaw drop and mouth stretch.

Fig 4: Classifying 1312 mouth regions into closed, jaw drop or stretch
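The lookup-table pipeline above can be sketched as follows. The toy probability tables and the fraction thresholds are invented stand-ins for the distributions learned from the 125,000-pixel sample and the empirically determined thresholds.

```python
# Sketch of the color-based mouth classifier: per-pixel luminance is mapped
# to P(aperture) and P(teeth) via lookup tables, thresholded, and the pixel
# counts decide the mouth action. All tables/thresholds are illustrative.

def build_lut(dark_peak):
    """Toy probability table over luminance 0..100 in steps of 10.
    Aperture (the dark mouth cavity) peaks at low luminance, teeth at high."""
    if dark_peak:
        return {lum: max(0.0, 1.0 - lum / 50.0) for lum in range(0, 101, 10)}
    return {lum: max(0.0, (lum - 50.0) / 50.0) for lum in range(0, 101, 10)}

APERTURE_LUT = build_lut(dark_peak=True)
TEETH_LUT = build_lut(dark_peak=False)

def classify_mouth_region(luminances, p_thresh=0.5,
                          aperture_frac=0.3, teeth_frac=0.2):
    """luminances: quantized luminance of each pixel in the mouth polygon."""
    aperture = sum(1 for l in luminances if APERTURE_LUT[l] > p_thresh)
    teeth = sum(1 for l in luminances if TEETH_LUT[l] > p_thresh)
    n = len(luminances)
    if teeth / n >= teeth_frac:          # visible teeth -> mouth stretch
        return "mouth stretch"
    if aperture / n >= aperture_frac:    # large dark cavity -> jaw drop
        return "jaw drop"
    return "closed"
```

Because the lookup tables depend only on luminance, the classification is cheap per pixel and, as the text argues, insensitive to the scale and viewpoint of the face.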
COGNITIVE MENTAL STATE INFERENCE

The HMM level outputs a likelihood for each of the facial expressions and head displays. However, on their own, each display is a weak classifier that does not entirely capture an underlying cognitive mental state. Bayesian networks have successfully been used as an ensemble of classifiers, where the combined classifier performs much better than any individual one in the set [15].

In such probabilistic graphical models, hidden states (the cognitive mental states in our case) influence a number of observation nodes, which describe the observed facial and head displays. In dynamic Bayesian networks (DBNs), temporal dependency across previous states is also encoded. Training the DBN model entails determining the parameters and structure of the model from data. Maximum likelihood estimation is used to learn the parameters, while sequential backward elimination picks the (locally) optimal network structure for each mental state model. More details on how the parameters and structure are learnt can be found in [13].

EXPERIMENTAL EVALUATION

For our experimental evaluation we use the Mind Reading dataset (MR) [3]. MR is a computer-based guide to emotions, primarily collected to help individuals diagnosed with autism recognize facial expressions of emotion. A total of 117 videos, recorded at 30 fps with durations varying between 5 and 8 seconds, were picked for testing. The videos conveyed the following cognitive mental states: agreement, concentrating, disagreement, thinking, unsure and interested. There are no restrictions on the head or body movement of actors in the video. The labeling process involved a panel of 10 judges who were asked, "could this be the emotion [name]?" When 8 out of 10 agreed, a statistically significant majority, the video was included in MR. To our knowledge, MR is the only available, labeled
resource with such a rich collection of mental states and emotions, even if they are posed. We first evaluate the classification rate of the display recognition layer, and then the overall classification ability of the system.

DISPLAY RECOGNITION

We evaluate the classification rate of the display recognition component of the system on the following 6 displays: 4 head displays (head nod, head shake, tilt display, turn display) and 2 facial displays (lip pull, lip pucker). The classification results for each of the displays are shown using Receiver Operator Characteristic (ROC) curves (Figure 5).

Fig 5: ROC curves for head and facial displays

ROC curves depict the relationship between the rate of correct classifications and the number of false positives (FP). The classification rate of display d is computed as the ratio of correct detections to that of all occurrences of d in the sampled videos. The FP rate for d is given by the ratio of samples falsely classified as d to that of all non-d occurrences. Table 2 shows the classification rate that the system achieves, and the respective FP rate, for each display. A non-neutral initial frame is the main reason behind undetected and falsely detected displays. To illustrate this, consider a sequence that starts as a lip pucker. If the lip pucker
persists (i.e. no change in polar distance), the pucker display will pass undetected. If, on the other hand, the pucker returns to neutral (i.e. an increase in polar distance), it will be falsely classified as a lip pull display. This problem could be solved by using the polar angle and color analysis to approximate the initial mouth state. The other reason accounting for misclassified mouth displays is inconsistent illumination. Possible solutions for dealing with illumination changes include extending the color-based analysis to account for overall brightness changes, or having different models for each possible lighting condition.

MENTAL STATE RECOGNITION

We then evaluate the overall system by testing the inference of cognitive mental states, using leave-5-out cross-validation. Figure 6 shows the results of the various stages of the mind-reading system for a video portraying the mental state choosing, which belongs to the mental state group thinking. The mental state with the maximum likelihood over the entire video (in this case thinking) is taken as the classification of the system.

87.4% of the videos were correctly classified. The recognition rate of a mental class m is given by the total number of videos of that class whose most likely class (summed over the entire video) matched the label of the class m. The false positive rate for class m (given by the percentage of files misclassified as m) was highest for agreement (5.4%) and lowest for thinking (0%). Table 2 summarizes the recognition and false positive rates for 6 mental states.

A closer look at the results reveals a number of interesting points. First, onset frames of a video occasionally portray a different mental state than that of the peak.
For example, the onset of disapproving videos was misclassified as unsure. Although this incorrectly biased the overall classification to unsure, one could argue that this result is not entirely incorrect, and that the videos do indeed start off with the person being unsure. Second, subclasses that do not clearly exhibit the class signature are easily misclassified. For example, the assertive and decided videos in the agreement group were misclassified as concentrating, as they exhibit no smiles and only very weak head nods. Finally, we found that some mental states were "closer" to each other and could co-occur. For example, a majority of the unsure files scored high for thinking too.
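The per-class recognition and false positive rates used in this evaluation can be computed from (true label, predicted label) pairs as below. The label pairs are invented example data, not results from the MR experiments.

```python
# Sketch of the evaluation metrics described above: per-class recognition
# rate (correct detections / all true occurrences of the class) and false
# positive rate (misclassified-as-class / all non-class samples).

def rates(pairs, cls):
    """pairs: iterable of (true_label, predicted_label)."""
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    actual = sum(1 for t, _ in pairs if t == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    non = sum(1 for t, _ in pairs if t != cls)
    return tp / actual, fp / non

# Example data only: six videos with their most-likely predicted classes.
results = [
    ("thinking", "thinking"), ("thinking", "unsure"),
    ("agreement", "agreement"), ("agreement", "agreement"),
    ("unsure", "thinking"), ("unsure", "unsure"),
]
recognition, false_pos = rates(results, "thinking")
```

Sweeping a decision threshold and plotting correct-classification rate against FP rate for each display is exactly what the ROC curves in Figure 5 summarize.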
WEB SEARCH

For the first test of the sensors, scientists trained the software program to recognize six words - including "go", "left" and "right" - and 10 numbers. Participants hooked up to the sensors silently said the words to themselves, and the software correctly picked up the signals 92 per cent of the time.

Then researchers put the letters of the alphabet into a matrix, with each column and row labeled with a single-digit number. In that way, each letter was represented by a unique pair of number coordinates. These were used to silently spell "NASA" into a web search engine using the program.

"This proved we could browse the web without touching a keyboard."
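The coordinate scheme can be sketched as below. The 6x5 grid layout is an assumption made for illustration; the report does not specify the dimensions NASA actually used, only that rows and columns carry single-digit labels.

```python
# Sketch of the alphabet matrix: letters arranged in a grid whose rows and
# columns are labeled with single digits, so each letter becomes a unique
# (row, column) pair that can be "spelled" silently digit by digit.
# The 6x5 layout is an assumed example, not NASA's documented matrix.
import string

COLS = 5
MATRIX = {}
for i, letter in enumerate(string.ascii_uppercase):  # 26 letters fit in 6x5
    MATRIX[letter] = (i // COLS + 1, i % COLS + 1)   # 1-based (row, col)

LOOKUP = {coords: letter for letter, coords in MATRIX.items()}

def encode(word):
    """Word -> the sequence of digit pairs a user would silently produce."""
    return [MATRIX[ch] for ch in word.upper()]

def decode(pairs):
    """Digit pairs -> the spelled word, as the search engine would see it."""
    return "".join(LOOKUP[p] for p in pairs)

pairs = encode("NASA")
```

Since the subvocal recognizer only needed the 10 digits, this encoding lets a small, reliably recognized vocabulary spell arbitrary words.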
MIND-READING COMPUTERS TURN HEADS AT HIGH-TECH FAIR

Devices allowing people to write letters or play pinball using just the power of their brains have become a major draw at the world's biggest high-tech fair. Huge crowds at the CeBIT fair gathered round a man sitting at a pinball table, wearing a cap covered in electrodes attached to his head, who controlled the flippers with great proficiency without using his hands. "He thinks 'left hand' or 'right hand', and the electrodes monitor the brain waves associated with that thought and send the information to a computer, which then moves the flippers," said Michael Tangermann, from the Berlin Brain Computer Interface. But the technology is much more than a fun gadget; it could one day save your life.

Scientists are researching ways to monitor motorists' brain waves to improve reaction times in a crash. In an emergency stop situation, brain activity kicks in on average around 200 milliseconds before even an alert driver can hit the brake. There is no question of braking automatically for the driver -- "we would never take away that kind of control,"
"However, there are various things the car can do in that crucial time: tighten the seat belt, for example," he added. Using this brain-wave monitoring technology, a car can also tell whether the driver is drowsy or not, potentially warning him or her to take a break.

At the g.tec stall, visitors watched a man with a similar "electrode cap" sit in front of a screen with a large keyboard, the letters flashing in an ordered sequence. The user concentrates hard when the chosen letter flashes; the brain waves stimulated at that exact moment are registered by the computer, and the letter appears on the screen. The technology takes a long time at present -- it took the man around four minutes to write a five-letter word -- but researchers hope to speed it up in the near future. Another device allows users to control robots by brain power, using a small box with lights flashing at different frequencies.
ADVANTAGES AND USES

Mind-Controlled Wheelchair

1. This prototype mind-controlled wheelchair, developed at the University of Electro-Communications in Japan, lets you feel like half Professor X and half Stephen Hawking -- except with the theoretical physics skills of the former and the telekinetic skills of the latter.

2. A little different from the brain-computer typing machine, this device works by mapping brain waves when you think about moving left, right, forward or back, and then assigning that to a wheelchair command of actually moving left, right, forward or back.

3. The result is that you can move the wheelchair solely with the power of your mind. This device doesn't give you MIND BULLETS (apologies to Tenacious D), but it does allow people who can't use other wheelchairs to get around more easily.

4. The sensors have already been used to do simple web searches, and may one day help space-walking astronauts and people who cannot talk. The system could send commands to rovers on other planets, help injured astronauts control machines, or aid disabled people.

5. In everyday life, they could even be used to communicate on the sly - people could use them on crowded buses without being overheard.

6. The finding raises issues about the application of such tools for screening suspected terrorists -- as well as for predicting future dangerousness more generally. We are closer than ever to the crime-prediction technology of Minority Report.

7. The day may come when computers will be able to recognize the smallest units in the English language - the 40-odd basic sounds (or phonemes) out of which all words or verbalized thoughts can be constructed. Such skills could be put to many practical
uses. The pilot of a high-speed plane or spacecraft, for instance, could simply order by thought alone some vital flight information for an all-purpose cockpit display.

DISADVANTAGES AND PROBLEMS

Tapping Brains for Future Crimes

1. Researchers from the Max Planck Institute for Human Cognitive and Brain Sciences, along with scientists from London and Tokyo, asked subjects to secretly decide in advance whether to add or subtract two numbers they would later be shown. Using computer algorithms and functional magnetic resonance imaging, or fMRI, the scientists were able to determine with 70 percent accuracy what the participants' intentions were, even before they were shown the numbers. The popular press tends to over-dramatize scientific advances in mind reading. fMRI results have to account for heart rate, respiration, motion and a number of other factors that might all cause variance in the signal. Also, individual brains differ, so scientists need to study a subject's patterns before they can train a computer to identify those patterns or make predictions.

2. While the details of this particular study are not yet published, the subjects' limited options of either adding or subtracting the numbers mean the computer already had a 50/50 chance of guessing correctly even without fMRI readings. The researchers indisputably made physiological findings that are significant for future experiments, but we are still a long way from mind reading.

3. Still, the more we learn about how the brain operates, the more predictable human beings seem to become. In the Dec. 19, 2006, issue of The Economist, an article questioned the scientific validity of the notion of free will: individuals with particular congenital genetic characteristics are predisposed, if not predestined, to violence.
4. Studies have shown that genes and organic factors like frontal lobe impairments, low serotonin levels and dopamine receptors are highly correlated with criminal behavior. Studies of twins show that heredity is a major factor in criminal conduct. While no one gene may make you a criminal, a mixture of biological factors, exacerbated by environmental conditions, may well do so.

5. Looking at scientific advances like these, legal scholars are beginning to question the foundational principles of our criminal justice system.

6. For example, University of Florida law professor Christopher Slobogin, who is visiting at Stanford this year, has set forth a compelling case for putting prevention before retribution in criminal justice.

7. It's a tempting thought. If there is no such thing as free will, then a system that punishes transgressive behavior as a matter of moral condemnation does not make a lot of sense. It's compelling to contemplate a system that manages and reduces the risk of criminal behavior in the first place.

8. But as the Max Planck Institute study suggests, neuroscience and bioscience are not at a point where we can reliably predict human behavior. To me, that's the most powerful objection to a preventative justice system -- if we aren't particularly good at predicting future behavior, we risk criminalizing the innocent.

9. We aren't particularly good at rehabilitation, either, so even if we were sufficiently accurate in identifying future offenders, we wouldn't really know what to do with
them.

10. Nor is society ready to deal with the ethical and practical problems posed by a system that classifies and categorizes people based on oxygen flow, genetics and environmental factors that are correlated as much with poverty as with future criminality.

11. In time, neuroscience may produce reliable behavior predictions. But until then, we should take the lessons of science fiction to heart when deciding how to use new predictive techniques.

12. The preliminary tests may have been successful because of the short lengths of the words; it is suggested that the tests be repeated on many different people to verify that the sensors work for everyone.

13. The initial success "doesn't mean it will scale up", he told New Scientist. "Small-vocabulary, isolated word recognition is a quite different problem than conversational speech, not just in scale but in kind."
CONCLUSION

Tufts University researchers have begun a three-year research project which, if successful, will allow computers to respond to the brain activity of the computer's user. Users wear futuristic-looking headbands that shine light on their foreheads, and then perform a series of increasingly difficult tasks while the device reads which parts of the brain are absorbing the
light. That information is then transferred to the computer, and from there the computer can adjust its interface and functions to each individual.

One professor used the following example of a real-world use: "If it knew which air traffic controllers were overloaded, the next incoming plane could be assigned to another controller."

Hence, if we achieve 100% accuracy, these computers may find applications in many fields of electronics where we have very little time to react.

BIBLIOGRAPHY

www.eurescom.de/message/default_Dec2004.asp
blog.marcelotoledo.org/2007/10
www.newscientist.com/article/dn4795-nasa-develops-mindreading-system
http://blogs.vnunet.com/app/trackback/954092