Smart blind stick book

    • Mansoura University
      Faculty of Engineering
      Dept. of Electronics and Communication Engineering

      Smart Blind Stick
      A B.Sc. Project in Electronics and Communications Engineering

      Supervised by
      Assist. Prof. Mohamed Abdel-Azim
      Eng. Ahmed Shabaan, Eng. Mohamed Gamal, Eng. Eman Ashraf

      Department of Electronics and Communications Engineering
      Faculty of Engineering-Mansoura University
      2011-2012
    • Team Work
      No.  Name  Contact Information
      1    Ahmed Helmy Abd-Ghaffar
      2    Nesma Zein El-Abdeen Mohammed
      3    Aya Gamal Osman El-Mansy
      4    Fatma Ayman Mohammed
      5    Ahmed Moawad Abo-Elenin Awad
    • Acknowledgement
      We would like to express our gratitude to our advisor and supervisor, Dr. Mohammed Abd ElAzim, for guiding this work with interest. We would also like to thank Eng. Ahmed Shaaban, Eng. Mohammed Gamal, and Eng. Eman Ashraf, the teaching assistants, for the countless hours they spent in the labs. We are grateful to them for setting high standards and giving us the freedom to explore. We would like to thank our colleagues for their assistance and constant support.
      Our Team
    • Abstract
      According to the World Health Organization, approximately 36.9 million people in the world were blind in 2002. The majority of them use a conventional white cane to aid in navigation. The limitation of the white cane is that information is gained only by touching objects with the tip of the cane. The traditional length of a white cane depends on the height of the user; it extends from the floor to the person's sternum. We therefore design an ultrasound sensor system that detects barriers regardless of their shape or height and warns the user by vibration.
      Blind people also face great problems in moving from one place to another in town. The only aid available to them is a guide dog, which can cost about $20,000 and is useful for only about 5 to 6 years. We therefore design a GPS-based navigation aid that guides the user from one place to another in town with voice directions. The user specifies the destination by voice only and does not need to type anything.
      To help the user move indoors, or in enclosed places visited daily, we also design an indoor navigation system that works offline and guides the user by voice from one location to another in specific places such as homes, malls, and libraries.
      The user may also have great difficulty controlling electric devices. We design a fully wireless control system that lets the user control all electric devices by voice. It is connected to a security system that warns the user, whether indoors or out, if anything goes wrong and helps resolve the problem.
    • Contents
      Chapter-01: Introduction
        1.1 Problem Definition
        1.2 Problem Solution
        1.3 Business Model
        1.4 Block Diagram
        1.5 Detailed Technical Description
        1.6 Pre-Project Planning
        1.7 Time Planning
      Chapter-02: Speech Recognition
        2.1 Introduction
        2.2 Literature review
          2.2.1 Pattern recognition
          2.2.2 Generation of voice
          2.2.3 Voice as biometric
          2.2.4 Speech recognition
          2.2.5 Speaker recognition
          2.2.6 Speech/speaker modeling
        2.3 Implementation details
          2.3.1 Pre-processing and feature extraction
        2.4 Artificial neural network
          2.4.1 Introduction
          2.4.2 Models
          2.4.3 Network function
          2.4.4 ANN dependency graph
          2.4.5 Learning
          2.4.6 Choosing a cost function
          2.4.7 Learning paradigms
          2.4.8 Supervised learning
          2.4.9 Unsupervised learning
          2.4.10 Reinforcement learning
          2.4.11 Learning algorithms
          2.4.12 Employing artificial neural network
          2.4.13 Application
          2.4.14 Types of models
          2.4.15 Neural network software
          2.4.16 Types of artificial neural network
          2.4.17 Confidence analysis of neural network
      Chapter-03: Image Processing
        3.1 Introduction
          3.1.1 What is digital image processing?
          3.1.2 Motivating problems
        3.2 Color vision
          3.2.1 Fundamentals
          3.2.2 Image formats supported by MATLAB
          3.2.3 Working formats in MATLAB
        3.3 Aspects of image processing
        3.4 Image types
          3.4.1 Intensity image
          3.4.2 Binary image
          3.4.3 Indexed image
          3.4.4 RGB image
          3.4.5 Multi frame image
        3.5 How to
          3.5.1 How to convert between different formats
          3.5.2 How to read file
          3.5.3 Loading and saving variables in MATLAB
          3.5.4 How to display an image in MATLAB
        3.6 Some important definitions
          3.6.1 Imread function
          3.6.2 Rotation
          3.6.3 Scaling
          3.6.4 Interpolation
        3.7 Edge detection
          3.7.1 Canny edge detection
          3.7.2 Edge tracing
        3.8 Mapping
          3.8.1 Mapping image onto surface overview
          3.8.2 Mapping an image onto elevation data
          3.8.3 Initializing the IDL display objects
          3.8.4 Displaying image and geometric surface object
          3.8.5 Mapping an image onto sphere
        3.9 Mapping offline
      Chapter-04: GPS Navigation
        4.1 Introduction
          4.1.1 What is GPS?
          4.1.2 How does it work?
        4.2 Basic concepts of GPS
        4.3 Position calculation
        4.4 Communication
        4.5 Message format
        4.6 Satellite frequencies
        4.7 Navigation equations
        4.8 Bancroft's method
        4.9 Trilateration
        4.10 Multidimensional Newton-Raphson calculation
        4.11 Additional method for more than four satellites
        4.12 Error sources and analysis
        4.13 Accuracy enhancement and surveying
          4.13.1 Augmentation
          4.13.2 Precise monitoring
        4.14 Time keeping
          4.14.1 Time keeping and leap seconds
          4.14.2 Time keeping accuracy
          4.14.3 Time keeping format
          4.14.4 Carrier phase tracking
        4.15 GPS navigation
      Chapter-05: Ultrasound
        5.1 Introduction
          5.1.1 History
        5.2 Wave motion
        5.3 Wave characteristics
        5.4 Ultrasound intensity
        5.5 Ultrasound velocity
        5.6 Attenuation of ultrasound
        5.7 Reflection
        5.8 Refraction
        5.9 Absorption
        5.10 Hardware part
          5.10.1 Introduction
          5.10.2 Calculating the distance
          5.10.3 Changing beam pattern and beam width
          5.10.4 The development of the sensor
      Chapter-06: Microcontroller
        6.1 Introduction
          6.1.1 History of microcontroller
          6.1.2 Embedded design
          6.1.3 Interrupt
          6.1.4 Programs
          6.1.5 Other microcontroller feature
          6.1.6 Higher integration
          6.1.7 Programming environment
        6.2 Types of micro controller
          6.2.1 Interrupt latency
        6.3 Microcontroller embedded memory technology
          6.3.1 Data
          6.3.2 Firmware
        6.4 PIC microcontroller
          6.4.1 Family core architecture
        6.5 PIC component
          6.5.1 Logic circuit
          6.5.2 Power supply
        6.6 Development tools
          6.6.1 Device programs
          6.6.2 Debugging
        6.7 LCD display
          6.7.1 LCD display pins
          6.7.2 LCD screen
          6.7.3 LCD memory
          6.7.4 LCD basic command
          6.7.5 LCD connecting
          6.7.6 LCD initialization
      Chapter-07: System Implementation
        7.1 Introduction
        7.2 Survey
        7.3 Searches
          7.3.1 Ultra sound sensor
          7.3.2 Indoor navigation systems
          7.3.3 Outdoor navigation
        7.4 Sponsors
        7.5 Pre-design
          7.5.1 List of matrices
          7.5.2 Competitive Benchmarking Information
          7.5.3 Ideal and marginally acceptable target values
          7.5.4 Time plan diagram
        7.6 Design
          7.6.1 Speech recognition
          7.6.2 Ultra sensors
          7.6.3 Outdoor navigation
        7.7 Product architecture
          7.7.1 Product schematic
          7.7.2 Rough geometric layout
          7.7.3 Incidental interactions
        7.8 Defining secondary system
        7.9 Detailed interface specification
        7.10 Establishing the architecture of the chunks
      Chapter-08: Conclusion
        8.1 Introduction
        8.2 Overview
          8.2.1 Outdoor navigation
            Outdoor navigation online
            Outdoor navigation offline
          8.2.2 Ultrasound sensor
          8.2.3 Object identifier
        8.3 Features
    • CHAPTER 1 Introduction
    • Chapter 1 | Introduction
      1.1 | PROBLEM DEFINITION
      According to the World Health Organization, approximately 36.9 million people in the world were blind in 2002. The majority of them use a conventional white cane to aid in navigation. The limitation of the white cane is that information is gained only by touching objects with the tip of the cane. The traditional length of a white cane depends on the height of the user; it extends from the floor to the person's sternum.
      Blind people also face great problems in moving from one place to another in town. The only aid available to them is a guide dog, which can cost about $20,000 and is useful for only about 5 to 6 years. They also have great difficulty identifying objects they frequently use at home, such as kitchen tools and clothes. In addition, they may have difficulty controlling electric devices, or face a security problem that they cannot handle alone.
      1.2 | PROBLEM SOLUTION
      We try to solve all of the problems above. To help the user move easily indoors and outdoors, we use an ultrasound sensor to detect barriers in the way and alert the user in two ways: a vibration motor whose speed increases as the distance decreases, and a voice alert that tells the user the distance to the barrier.
      To solve the problem of moving from one place to another outside the home, we design software for smart phones that guides the user with voice directions without any external help. The user just says the destination, and the phone then gives voice directions until the user arrives.
      To help the user identify objects, we use RFID. Every important object carries a tag (ID); when the reader reads the ID, it tells the user by voice what the object is.
      Inside the home, we design a system to control all electronic devices by voice commands, together with a security system designed especially for blind users. Its most important part is the fire alarm: when it detects a fire, it alerts the user by a call to his mobile phone and places another call to nearby friends for help. The security system also warns the user if he forgets to close the door.
      After finishing these applications, we plan to add further features after graduation, using new technologies to make it easier for the user to move in the street, cross roads, and read books. The products available for blind people in the Egyptian market do not cover any of these needs.
    • A blind person needs to move, control devices, and do daily tasks by himself, without help from anybody. What is available today is just a white stick without any technology or features. So, finally, we install a sensor and RFID on the white stick; the other part is software on the mobile phone that performs the navigation and automation tasks.
      1.3 | BUSINESS MODEL
      Our customers are blind and visually impaired people; almost 1 million people in Egypt have one of these problems. Our product covers some of our customers' needs: it helps them avoid barriers in their way and guides them by voice toward the direction they must take to avoid them. It also helps them move freely, without any external help, in different countries through an Android application on the mobile phone, designed especially for them, that guides them by voice through roads and tells them the direction to take to reach their destination.
      To reach our goal, we met with different customers to learn exactly what they need and to form a vision of a final product that is comfortable to use. Our sponsors also guided us technically toward the best way to cover all these needs. In our market, the available products do not cover any of these needs; we found only a white stick without any technology to help the user.
      1.4 | BLOCK DIAGRAMS
      Fig. (1.1): General Project Block Diagram
    • 1.5 | DETAILED TECHNICAL DESCRIPTION
      Our project is built on the simplest available technologies that reach our goal in a way that is comfortable for the user. We divided the project into two parts: software and hardware. The hardware part consists of a PIC MCU, an MP3 module, a camera module, and an ultrasound sensor module. The software part is an Android application that can be installed on the mobile phone.
      The hardware part has two modes: indoor and outdoor. In indoor mode, only one sensor measures range. When the user comes within 2 cm of an object, the camera module takes a photo of it to detect the code placed on the object and sends it to the MCU. The MCU processes the photo, identifies the code number, and looks up the object name in the database. It then addresses the MP3 module (WT588D) with the address of the MP3 file that contains the object's name, which is played through the speaker.
      In outdoor mode, three HC-SR04 sensors are activated in three directions to determine the best way, the one with no barriers on it. They send the measured data to the MCU, which determines the best way and sends the address of the MP3 file that contains the required direction; this is the output.
      For outdoor navigation, we design an Android application using Google Maps. The user specifies the destination by voice, the application determines the current position using GPS, and the digital compass determines the viewing angle. The application then guides the user using the GPS and compass data.
      Fig. (1.2): Button Configuration (figure labels: Choose Mode; Left Button, Right Button; Outdoor, Indoor)
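The outdoor decision (pick the direction with no barrier) and the vibration alert from Section 1.2 (faster motor as the distance shrinks) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the 2 cm to 400 cm working range, the linear mapping from distance to motor duty cycle, and the names `vibration_duty` and `best_direction` are hypothetical and are not taken from the project firmware, which runs on a PIC MCU.

```python
# Hypothetical sketch: ranges, names and the linear mapping are assumptions.

MAX_RANGE_CM = 400.0  # assumed farthest usable reading of the ultrasonic sensor
MIN_RANGE_CM = 2.0    # assumed closest usable reading


def vibration_duty(distance_cm):
    """Map a distance to a motor duty cycle in [0, 1]: closer obstacle, faster motor."""
    clamped = max(MIN_RANGE_CM, min(MAX_RANGE_CM, distance_cm))
    return (MAX_RANGE_CM - clamped) / (MAX_RANGE_CM - MIN_RANGE_CM)


def best_direction(readings_cm):
    """Return the direction whose sensor reports the largest free distance."""
    return max(readings_cm, key=readings_cm.get)


# Example readings from three sensors pointing left, front and right.
readings = {"left": 80.0, "front": 35.0, "right": 150.0}
print("announce direction:", best_direction(readings))
print("motor duty for obstacle ahead:", round(vibration_duty(readings["front"]), 2))
```

On the real device the selected direction would be translated into the address of the matching MP3 file rather than printed.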
    • Fig. (1.3): Indoor & Outdoor Processes Block Diagram
      1.6 | PRE-PROJECT PLANNING
      We started by searching for a problem that no one takes care of. We found that the problems of blind people receive little attention and that products for them are not available in Egypt. We therefore considered it a good field to start in: an opportunity to solve a real problem and to enter a new market segment with few competitors.
      1.7 | TIME PLANNING
      Project Timing: The three main parts are executed independently of each other, but each part has many branches that are executed in series.
      Timing of Product Introductions: The timing of the product launch depends on marketing and on a renewed market study of the products, which must have low cost and high quality.
    • Technology Readiness: Technology is one of the fundamental components of the product, because Android and ultrasonic technology are gaining importance among Egyptian customers.
      Market Readiness: The market is always ready for a new product; products compete in the market so that customers can get the one that suits them best.
      The Product Plan: This plan makes the project easier to implement, because anything that is arranged and planned gives better results.
    • CHAPTER 2 Speech Recognition
    • Chapter 2 | Speech Recognition
      2.1 | INTRODUCTION
      Biometrics is, in the simplest definition, something you are. It is a physical characteristic unique to each individual, such as a fingerprint, retina, iris, or speech. Biometrics has a very useful application in security: it can be used to authenticate a person's identity and control access to a restricted area, based on the premise that the set of these physical characteristics can be used to uniquely identify individuals.
      A speech signal conveys two important types of information: primarily the speech content and, on a secondary level, the speaker identity. Speech recognizers aim to extract the lexical information from the speech signal independently of the speaker by reducing the inter-speaker variability. Speaker recognition, on the other hand, is concerned with extracting the identity of the person speaking the utterance. Both speech recognition and speaker recognition are therefore possible from the same voice input.
      In our project we use speech recognition, because the stick must recognize the spoken word and act on it. Mel Frequency Cepstral Coefficients (MFCC) are used as features for both speech and speaker recognition. We also combined energy features with the delta and delta-delta features of energy and MFCC. After the features are calculated, neural networks are used to model the speech. Based on the speech model, the system decides whether or not the uttered speech matches what the user was prompted to utter.
      2.2 | LITERATURE REVIEW
      2.2.1 | Pattern Recognition
      Pattern recognition, a branch of artificial intelligence and a sub-section of machine learning, is the study of how machines can observe the environment, learn to distinguish patterns of interest from their background, and make sound and reasonable decisions about the categories of the patterns.
      A pattern can be a fingerprint image, a handwritten cursive word, a human face, a speech signal, a sales pattern, etc. The applications of pattern recognition include data mining, document classification, financial forecasting, organization and retrieval of multimedia databases, and biometrics (personal identification based on various physical attributes such as face, retina, speech, ear, and fingerprints). The essential steps of pattern recognition are: Data Acquisition, Preprocessing, Feature Extraction, Training, and Classification.
      Features are used to denote the descriptor. Features must be selected so that they are discriminative and invariant. They can be represented as a vector, matrix, tree, graph, or string. Ideally they are similar for objects in the same class and very different for objects in different classes. A pattern class is a family of patterns that share some common properties. Pattern recognition by machine involves techniques for assigning patterns to their respective classes automatically and with as little human intervention as possible.
      Learning and classification usually use one of the following approaches:
      Statistical Pattern Recognition is based on statistical characterizations of patterns, assuming that the patterns are generated by a probabilistic system.
      Syntactical (or Structural) Pattern Recognition is based on the structural interrelationships of features.
      Given a pattern, its recognition/classification may consist of one of the following two tasks, according to the type of learning procedure:
      1) Supervised Classification (e.g., Discriminant Analysis), in which the input pattern is identified as a member of a predefined class.
      2) Unsupervised Classification (e.g., clustering), in which the pattern is assigned to a previously unknown class.
      Fig. (2.1): General block diagram of pattern recognition system
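As a small illustration of the training and supervised classification steps listed above, the following Python sketch implements a nearest-mean classifier. It is not the classifier used in this project, which relies on neural networks; the function names and the toy two-dimensional feature vectors are assumptions made for the example.

```python
import math
from collections import defaultdict


def train(samples):
    """Training: average the feature vectors of each labeled class."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(model, features):
    """Classification: pick the predefined class whose mean vector is closest."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Toy labeled feature vectors (hypothetical values) for two classes.
training_set = [
    ([1.0, 1.0], "a"),
    ([1.2, 0.8], "a"),
    ([5.0, 5.0], "b"),
    ([4.8, 5.2], "b"),
]
model = train(training_set)
print(classify(model, [1.1, 0.9]))
print(classify(model, [4.5, 4.9]))
```

The features here are discriminative in the sense described above: vectors of the same class lie close together, and the two classes lie far apart.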
    • 2.2.2 | Generation of Voice
      Speech begins with the generation of an airstream, usually by the lungs and diaphragm, a process called initiation. This air then passes through the larynx, where it is modulated by the glottis (vocal cords). This step is called phonation or voicing, and is responsible for the generation of pitch and tone. Finally, the modulated air is filtered by the mouth, nose, and throat, a process called articulation, and the resultant pressure wave excites the air.
      Fig. (2.2): Vocal Schematic
      Depending upon the positions of the various articulators, different sounds are produced. The position of the articulators can be modeled by a linear time-invariant system whose frequency response is characterized by several peaks called formants. The change in frequency of the formants characterizes the phoneme being articulated.
      As a consequence of this physiology, we can notice several characteristics of the frequency-domain spectrum of speech. First of all, the oscillation of the glottis results in an underlying fundamental frequency and a series of harmonics at multiples of this fundamental. This is shown in the figures below, which plot a brief audio waveform for the phoneme /i:/ and its magnitude spectrum. The fundamental frequency (180 Hz) and its harmonics appear as spikes in the spectrum. The location of the fundamental frequency is speaker dependent, and is a function of the dimensions and tension of the vocal cords. For adults it usually falls between 100 Hz and 250 Hz, and the average for females is significantly higher than that for males.
      Fig. (2.3): Audio Sample for /i:/ phoneme showing stationary property of phonemes for a short period
      Speech comes out as phonemes, which are the building blocks of speech. Each phoneme resonates at a fundamental frequency and its harmonics, and thus has high energy at those frequencies; in other words, different phonemes have different formants. This is the feature that enables the identification of each phoneme at the recognition stage.
      Fig. (2.4): Audio Magnitude Spectrum for /i:/ phoneme showing fundamental frequency and its harmonics
      The variations in inter-speaker features of the speech signal during the utterance of a word are modeled in word training in speech recognition. For speaker recognition, the intra-speaker variations in features over long speech content are modeled. Besides the configuration of the articulators, the acoustic manifestation of a phoneme is affected by:
      - Physiology and emotional state of the speaker.
      - Phonetic context.
      - Accent.
      2.2.3 | Voice as Biometric
      The underlying premise for voice authentication is that each person's voice differs in pitch, tone, and volume enough to make it uniquely distinguishable. Several factors contribute to this uniqueness: the size and shape of the mouth, throat, nose, and teeth (articulators), and the size, shape, and tension of the vocal cords. The chance that all of these are exactly the same in any two people is very low.
      Voice biometrics has the following advantages over other forms of biometrics:
      - It is a natural signal to produce.
      - Its implementation cost is low, since it does not require a specialized input device.
      - It is acceptable to users.
      - It is easily combined with other forms of authentication for multifactor authentication.
      - It is the only biometric that allows users to authenticate remotely.
      2.2.4 | Speech Recognition
      Speech is the dominant means of communication between humans, and promises to be important for communication between humans and machines, if it can just be made a little more reliable. Speech recognition is the process of converting an acoustic signal to a set of words. The applications include voice command and control, data entry, voice user interfaces, and automating the telephone operator's job in telephony. The recognized words can also serve as the input to natural language processing.
      There are two variants of speech recognition, based on the duration of the speech signal: isolated word recognition, in which each word is surrounded by some sort of pause, and continuous speech recognition, in which words run into each other and have to be segmented. The former is much easier than the latter.
      Speech recognition is a difficult task because of the many sources of variability associated with the signal. First, the acoustic realizations of phonemes, the smallest sound units of which words are composed, are highly dependent on the context. Second, acoustic variability can result from changes in the environment as well as in the position and characteristics of the transducer. Third, within-speaker variability can result from changes in the speaker's physical and emotional state, speaking rate, or voice quality. Finally, differences in sociolinguistic background, dialect, and vocal tract size and shape can contribute to cross-speaker variability. Such variability is modeled in various ways. At the level of signal representation, a representation that emphasizes speaker-independent features is developed.
      2.2.5 | Speaker Recognition
      Speaker recognition is the process of automatically recognizing who is speaking on the basis of the individual information included in speech waves. Speaker recognition can be classified into identification and verification. It has been applied most often as a means of biometric authentication.
      Types of Speaker Recognition
      Speaker Identification
      Speaker identification is the process of determining which registered speaker provides a given utterance. In a Speaker Identification (SID) system, no identity claim is provided; the test utterance is scored against a set of known (registered) references, one for each potential speaker, and the speaker whose model best matches the test utterance is selected.
      There are two types of speaker identification task: closed-set and open-set. In closed-set identification, the test utterance belongs to one of the registered speakers. During testing, a matching score is estimated for each registered speaker, and the speaker corresponding to the model with the best matching score is selected. This requires N comparisons for a population of N speakers.
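Closed-set identification as described above amounts to scoring the test utterance against each of the N registered models and keeping the best one. The Python sketch below assumes, purely for illustration, that each speaker is modeled by a single diagonal Gaussian over the feature vectors and that the matching score is the average log-likelihood; the speaker names, model parameters, and feature values are hypothetical.

```python
import math


def log_likelihood(model, frames):
    """Average log-likelihood of the feature frames under a diagonal Gaussian."""
    means, variances = model
    total = 0.0
    for frame in frames:
        for x, mu, var in zip(frame, means, variances):
            total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total / len(frames)


def identify(models, frames):
    """Closed-set identification: score all N models, return the best speaker."""
    scores = {name: log_likelihood(model, frames) for name, model in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical registered speakers: (per-feature means, per-feature variances).
models = {
    "speaker_a": ([0.0, 1.0], [1.0, 1.0]),
    "speaker_b": ([3.0, -1.0], [1.0, 1.0]),
}
test_frames = [[2.8, -0.9], [3.1, -1.2]]  # features of the test utterance

best, scores = identify(models, test_frames)
print("identified:", best)
print("comparisons made:", len(scores))
```

One score is computed per registered speaker, which is the N comparisons mentioned in the text.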
      In open-set identification, any speaker can access the system; those who are not registered should be rejected. This requires another model, referred to as a garbage model, imposter model, or background model, which is trained with data provided by speakers other than the registered ones. During testing, the matching score of the best speaker model is compared with the matching score estimated using the garbage model in order to accept or reject the speaker, making the total number of comparisons equal to N + 1. Speaker identification performance tends to decrease as the population size increases.
      Speaker Verification
      Speaker verification, on the other hand, is the process of accepting or rejecting the identity claim of a speaker. That is, the goal is to automatically accept or reject an identity that is claimed by the speaker. During testing, a verification score is estimated using the claimed speaker model and the anti-speaker model. This verification score is then compared to a threshold. If the score is higher than the threshold, the speaker is accepted; otherwise, the speaker is rejected. Thus, speaker verification involves a hypothesis test requiring a simple binary decision: accept or reject the claimed identity, regardless of the population size. Hence, the performance is quite independent of the population size, but it depends on the number of test utterances used to evaluate the performance of the system.
      2.2.6 | Speaker/Speech Modeling
      There are various pattern modeling/matching techniques. They include Dynamic Time Warping (DTW), Gaussian Mixture Model (GMM), Hidden Markov Modeling (HMM), Artificial Neural Network (ANN), and Vector Quantization (VQ). These are used interchangeably for speech and speaker modeling. The best approaches are statistical learning methods. GMM is used for speaker recognition; it models the variations in a speaker's features over a long sequence of utterances. Another statistical method, widely used for speech recognition, is HMM. An HMM models the Markovian nature of the speech signal, where each phoneme represents a state and a sequence of such phonemes represents a word. Sequences of features of such phonemes from different speakers are modeled by the HMM.
      2.3 | IMPLEMENTATION DETAILS
      The implementation of the system includes a common pre-processing and feature extraction module, and speaker-independent speech modeling and classification by ANNs.
      2.3.1 | Pre-Processing and Feature Extraction
Starting from the capture of the audio signal, feature extraction consists of the following steps, as shown in the block diagram below:

[Block diagram: Speech signal, Silence removal, Pre-emphasis, Framing, Windowing, DFT, Mel filter bank, Log, IDFT, CMS, Delta, giving 12 MFCC, 12 ΔMFCC and 12 ΔΔMFCC; in parallel, Energy, Delta, giving 1 energy, 1 Δ energy and 1 ΔΔ energy]
Fig. (2.5): Pre-Processing and Feature Extraction

| Capture
The first step in processing speech is to convert the analog representation (first air pressure, and then analog electric signals in a microphone) into a digital signal x[n], where n is an index over time. Analysis of the audio spectrum shows that nearly all energy resides in the band between DC and 4 kHz, and beyond 10 kHz there is virtually no energy whatsoever.
Sound format used: 22050 Hz, 16-bit, signed, little-endian, mono channel, uncompressed PCM.

| End point detection and Silence removal
The captured audio signal may contain silence at different positions, such as the beginning of the signal, between the words of a sentence, or the end of the signal. If silent frames are included, modeling resources are spent on parts of the signal which do not contribute to the identification. The silence must therefore be removed before further processing. There are several ways of doing this; the most popular are Short Time Energy and Zero Crossing Rate, but both are limited by the need to set thresholds on an ad hoc basis. The algorithm we used relies on the statistical properties of background noise as well as the physiological aspects of speech production, and does not assume any ad hoc threshold. It assumes that the background noise present in the utterances is Gaussian in nature. Usually the first 200 ms or more of a speech recording (we used 4410 samples at the sampling rate of 22050 samples/sec) corresponds to silence (or background noise), because the speaker takes some time to start reading when recording starts.

Endpoint Detection Algorithm:
Step 1: Calculate the mean (μ) and standard deviation (σ) of the samples in the first 200 ms of the given utterance. The background noise is characterized by this μ and σ.
Step 2: Go from the first sample to the last sample of the speech recording. For each sample x, check whether the one-dimensional Mahalanobis distance |x−μ|/σ is greater than 3. If it is, the sample is treated as a voiced sample; otherwise it is treated as unvoiced/silence. Since P[|x−μ| ≤ 3σ] = 0.997 in a Gaussian distribution, this threshold rejects about 99.7% of the noise samples, thus accepting only the voiced samples.
Step 3: Mark each voiced sample as 1 and each unvoiced sample as 0, so that the complete speech signal is represented only by zeros and ones. Divide the whole signal into 10 ms non-overlapping windows.
Step 4: Suppose there are M zeros and N ones in a window. If M ≥ N, convert each of the ones to zero; otherwise convert each of the zeros to one. This rule reflects the fact that a speech production system consisting of vocal cords, tongue, vocal tract, etc. cannot change abruptly within a short period of time, taken here as 10 ms.
Step 5: Collect only the voiced part, according to the samples labeled 1 in the windowed array, and copy it into a new array. This retrieves the voiced part of the original speech signal.
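The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual implementation: the function name `remove_silence`, its parameter names, and the guard against a zero standard deviation are assumptions introduced here.

```python
import math

def remove_silence(samples, sample_rate=22050, noise_ms=200, window_ms=10, k=3.0):
    """Keep only the samples labeled as voiced by the 5-step procedure."""
    # Step 1: mean and standard deviation of the leading noise segment
    n_noise = int(sample_rate * noise_ms / 1000)   # 4410 samples at 22050 Hz
    noise = samples[:n_noise]
    mu = sum(noise) / len(noise)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in noise) / len(noise))
    sigma = sigma or 1e-12                         # assumed guard: avoid division by zero

    # Steps 2-3: label each sample, 1 = voiced, 0 = unvoiced/silence
    labels = [1 if abs(x - mu) / sigma > k else 0 for x in samples]

    # Step 4: majority vote inside each non-overlapping window
    win = max(1, int(sample_rate * window_ms / 1000))
    for start in range(0, len(labels), win):
        block = labels[start:start + win]
        ones = sum(block)
        zeros = len(block) - ones
        value = 0 if zeros >= ones else 1
        labels[start:start + win] = [value] * len(block)

    # Step 5: collect the samples labeled 1
    return [x for x, lab in zip(samples, labels) if lab == 1]
```

For example, a recording that starts with low-level noise, contains one loud burst, and ends with noise again should come back as the burst alone.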
Fig. (2.6): Input signal to End-point detection system
Fig. (2.7): Output signal from End-point detection system

| PCM Normalization
The extracted pulse code modulated amplitude values are normalized to compensate for amplitude variation during capture.

| Pre-emphasis
The speech signal is usually pre-emphasized before any further processing. If we look at the spectrum of voiced segments such as vowels, there is more energy at lower frequencies than at higher frequencies. This drop in energy across frequencies is caused by the nature of the glottal pulse. Boosting the high-frequency energy makes information from the higher formants more available to the acoustic model and improves phone detection accuracy. The pre-emphasis filter is a first-order high-pass filter. In the time domain, with input x[n] and 0.9 ≤ α ≤ 1.0, the filter equation is:
y[n] = x[n] − α x[n−1]
We used α = 0.95.
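A minimal sketch of the pre-emphasis filter y[n] = x[n] − α x[n−1] in Python follows. The function name `pre_emphasis` is hypothetical, and treating the sample before the first one as zero is an assumption, since the text does not say how the boundary is handled.

```python
def pre_emphasis(x, alpha=0.95):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not x:
        return []
    y = [x[0]]                        # assumption: x[-1] is taken as 0
    for n in range(1, len(x)):
        y.append(x[n] - alpha * x[n - 1])
    return y
```

A constant (zero-frequency) input is attenuated to 1 − α = 0.05 of its level after the first sample, while a rapidly alternating input is amplified, which is the intended high-frequency boost.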
Fig. (2.8): Signal before Pre-Emphasis
Fig. (2.9): Signal after Pre-Emphasis

| Framing and windowing
Speech is a non-stationary signal, meaning that its statistical properties are not constant across time. We therefore extract spectral features from a small window of speech that characterizes a particular subphone and for which we can make the (rough) assumption that the signal is stationary (i.e., its statistical properties are constant within this region). We used frame blocks of 23.22 ms with 50% overlap, i.e., 512 samples per frame at 22050 samples/sec.
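The frame blocking described above (512 samples per frame, 50% overlap) can be sketched as follows. The function name `frame_signal` is hypothetical, and dropping a trailing partial frame is an assumption; zero-padding the last frame would be an equally plausible choice.

```python
def frame_signal(x, frame_len=512, overlap=0.5):
    """Split x into overlapping frames of frame_len samples."""
    hop = int(frame_len * (1 - overlap))      # 256 samples for 50% overlap
    frames = []
    # assumption: a trailing partial frame is discarded
    for start in range(0, len(x) - frame_len + 1, hop):
        frames.append(x[start:start + frame_len])
    return frames

# Duration of one frame at the sampling rate used in the text, in ms
frame_ms = 512 / 22050 * 1000
```

Here `frame_ms` evaluates to roughly 23.22, matching the frame duration quoted in the text.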
Fig. (2.10): Frame Blocking of the Signal

The rectangular window (i.e., no window) can cause problems when we do Fourier analysis, because it abruptly cuts off the signal at its boundaries. A good window function has a narrow main lobe and low side-lobe levels in its transfer function, and it shrinks the values of the signal toward zero at the window boundaries, avoiding discontinuities. The most commonly used window function in speech processing is the Hamming window, defined as follows:
Sw[n] = 0.54 − 0.46 cos(2πn / (N − 1)), 0 ≤ n ≤ N − 1
where N is the number of samples in the frame.

Fig. (2.11): Hamming window

The windowed signal is obtained by multiplying the value of the signal at time n, Sframe[n], by the value of the window at time n, Sw[n]:
Y[n] = Sw[n] × Sframe[n]
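As an illustration of the windowing step, the sketch below builds a Hamming window using the standard textbook definition w[n] = 0.54 − 0.46 cos(2πn/(N−1)) and multiplies it into a frame sample by sample. The function names are hypothetical, and the coefficients are the conventional ones rather than values confirmed by the project.

```python
import math

def hamming(N):
    """Standard Hamming window of length N."""
    if N < 2:
        return [1.0] * N              # degenerate case, avoids division by zero
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def apply_window(frame):
    """Y[n] = Sw[n] * Sframe[n] for one frame."""
    w = hamming(len(frame))
    return [wn * s for wn, s in zip(w, frame)]
```

The window equals 0.08 at both ends and approaches 1 in the middle, so the frame edges are strongly attenuated without being forced to exactly zero.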
Fig. (2.12): A single frame before and after windowing

| Discrete Fourier Transform
The Discrete Fourier Transform (DFT) of the windowed signal is used to extract the frequency content (the spectrum) of the current frame, i.e., how much energy the sampled signal contains in discrete frequency bands. The input to the DFT is a windowed signal x[n]...x[m], and the output, for each of N discrete frequency bands, is a complex number X[k] representing the magnitude and phase of that frequency component in the original signal:
|X[k]| = | Σ_{n=0}^{N−1} x[n] e^(−j2πkn/N) |, k = 0, 1, …, N−1
The commonly used algorithm for computing the DFT is the Fast Fourier Transform (FFT).

| Mel Filter
For calculating the MFCC, the frequency axis is first transformed to the Mel scale according to the following formula:
Mel(x) = 2595 log10[1 + x/700]
where x is the linear frequency in Hz. Then, a filter bank is applied to the amplitude of the Mel-scaled spectrum. The Mel frequency warping is most conveniently done by utilizing a filter bank with filters centered according to Mel frequencies. The width of the triangular filters varies according to the Mel scale, so that the log total energy in a critical band around the center frequency is included. The centers of the filters are uniformly spaced on the Mel scale.

Fig. (2.13): Equally spaced Mel values

The result of the Mel filter bank is information about the distribution of energy in each Mel-scale band. Each filter yields one output value per frame, and 12 coefficients are later extracted from this vector of outputs.

Fig. (2.13): Triangular filter bank in frequency scale

We have used 30 filters in the filter bank.
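To make the Mel warping concrete, the sketch below converts between Hz and Mel with the common formula Mel(x) = 2595 log10(1 + x/700) and places 30 filter centers uniformly on the Mel scale. The function names are hypothetical, and the band limits (0 Hz up to the 11025 Hz Nyquist frequency of a 22050 Hz recording) are assumptions, since the text does not state the range covered by the filter bank.

```python
import math

def hz_to_mel(f):
    """Linear frequency in Hz to Mel."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_centers_hz(n_filters=30, f_low=0.0, f_high=11025.0):
    """Center frequencies (Hz) of filters equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)       # assumed: band edges are not centers
    return [mel_to_hz(lo + step * i) for i in range(1, n_filters + 1)]
```

Because the spacing is uniform in Mel, the centers are packed closely at low frequencies and spread out at high frequencies when viewed in Hz, which is why the triangular filters widen toward the top of the band.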
| Cepstrum by Inverse Discrete Fourier Transform
The cepstrum transform is applied to the filter outputs in order to obtain the MFCC features of each frame. The triangular filter outputs Y(i), i = 1, 2, …, M, are compressed using a logarithm, and the discrete cosine transform (DCT) is applied. Here, M is equal to the number of filters in the filter bank, i.e., 30.
C[n] = Σ_{i=1}^{M} log(Y(i)) cos[ (πn/M)(i − 1/2) ]
where C[n] is the MFCC vector for each frame. The resulting vector is called the Mel-frequency cepstrum (MFC), and the individual components are the Mel-frequency cepstral coefficients (MFCCs). We extracted 12 features from each speech frame.

| Post Processing
Cepstral Mean Subtraction (CMS)
A speech signal may be subjected to some channel noise when recorded, also referred to as the channel effect. A problem arises if the channel effect when recording training data for a given person is different from the channel effect in later recordings when the person uses the system: a false distance between the training data and the newly recorded data is introduced by the different channel effects. The channel effect is eliminated by subtracting the mean Mel-cepstrum coefficients, taken over all T frames of the recording, from the Mel-cepstrum coefficients of each frame:
ĉ_t(n) = c_t(n) − (1/T) Σ_{τ=1}^{T} c_τ(n)

The energy feature
The energy in a frame is the sum over time of the power of the samples in the frame; thus, for a signal x in a window from time sample t1 to time sample t2, the energy is:
Energy = Σ_{t=t1}^{t2} x²[t]

Delta feature
Another interesting fact about the speech signal is that it is not constant from frame to frame. Co-articulation (influence of a speech sound during another adjacent or nearby speech sound) can provide a useful cue for phone identity. It can be preserved by using delta features. Velocity (delta) and acceleration (delta-delta) coefficients are usually obtained from the static window-based information. These delta and delta-delta coefficients model the speed and acceleration of the variation of the cepstral feature vectors across adjacent windows. A simple way to compute deltas would be just to compute the difference between frames; thus the delta value d(t) for a particular cepstral value c(t) at time t can be estimated as:
d(t) = (c[t+1] − c[t−1]) / 2
The differencing method is simple, but since it acts as a high-pass filtering operation in the parameter domain, it tends to amplify noise. The solution to this is linear regression, i.e., fitting a first-order polynomial; the least-squares solution is easily shown to be of the following form:
Δc[t] = Σ_{k=−M}^{M} k c[t+k] / Σ_{k=−M}^{M} k²
where M is the regression window size. We used M = 4.

Composition of Feature Vector
We calculated 39 features from each frame:
• 12 MFCC features
• 12 delta MFCC features
• 12 delta-delta MFCC features
• 1 energy feature
• 1 delta energy feature
• 1 delta-delta energy feature

2.4 | ARTIFICIAL NEURAL NETWORKS
2.4.1 | Introduction
We used ANNs to model our system: the network is trained on voice samples and then tested on classifying utterances into word categories, each of which triggers an action. This section gives an overview of artificial neural networks.
The original inspiration for the term Artificial Neural Network came from examination of central nervous systems and their neurons, axons, dendrites, and synapses, which constitute the processing elements of biological neural networks investigated by neuroscience. In an artificial neural network, simple artificial nodes, variously called "neurons", "neurodes", "processing elements" (PEs) or
"units", are connected together to form a network of nodes mimicking the biological neural networks, hence the term "artificial neural network".
Because neuroscience is still full of unanswered questions, and since there are many levels of abstraction and therefore many ways to take inspiration from the brain, there is no single formal definition of what an artificial neural network is. Generally, it involves a network of simple processing elements that exhibit complex global behavior determined by the connections between processing elements and by the element parameters. While an artificial neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow. These networks are also similar to biological neural networks in the sense that functions are performed collectively and in parallel by the units, rather than there being a clear delineation of subtasks to which various units are assigned (see also connectionism).
Currently, the term Artificial Neural Network (ANN) tends to refer mostly to neural network models employed in statistics, cognitive psychology and artificial intelligence. Neural network models designed with emulation of the central nervous system (CNS) in mind are a subject of theoretical neuroscience and computational neuroscience. In modern software implementations of artificial neural networks, the approach inspired by biology has been largely abandoned for a more practical approach based on statistics and signal processing. In some of these systems, neural networks or parts of neural networks (such as artificial neurons) are used as components in larger systems that combine both adaptive and non-adaptive elements.
While the more general approach of such adaptive systems is more suitable for real-world problem solving, it has far less to do with the traditional artificial intelligence connectionist models. What they do have in common, however, is the principle of non-linear, distributed, parallel and local processing and adaptation. Historically, the use of neural network models marked a paradigm shift in the late eighties from high-level (symbolic) artificial intelligence, characterized by expert systems with knowledge embodied in if-then rules, to low-level (sub-symbolic) machine learning, characterized by knowledge embodied in the parameters of a dynamical system.

2.4.2 | Models
Neural network models in artificial intelligence are usually referred to as artificial neural networks (ANNs); these are essentially simple mathematical models defining a function f: X → Y or a distribution over X or both X and Y, but sometimes models are also intimately associated with a particular learning algorithm or learning rule. A common use of the phrase ANN model really means the definition of a class of such functions (where members of the class are obtained by varying parameters, connection weights, or specifics of the architecture such as the number of neurons or their connectivity).

2.4.3 | Network Function
The word network in the term 'artificial neural network' refers to the interconnections between the neurons in the different layers of each system. An example system has three layers. The first layer has input neurons, which send data via synapses to the second layer of neurons, and then via more synapses to the third layer of output neurons. More complex systems will have more layers of neurons, some with additional layers of input and output neurons. The synapses store parameters called "weights" that manipulate the data in the calculations.
An ANN is typically defined by three types of parameters:
• The interconnection pattern between different layers of neurons
• The learning process for updating the weights of the interconnections
• The activation function that converts a neuron's weighted input to its output activation
Mathematically, a neuron's network function f(x) is defined as a composition of other functions g_i(x), which can further be defined as a composition of other functions. This can be conveniently represented as a network structure, with arrows depicting the dependencies between variables. A widely used type of composition is the nonlinear weighted sum, f(x) = K(Σ_i w_i g_i(x)), where K (commonly referred to as the activation function) is some predefined function, such as the hyperbolic tangent.
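The nonlinear weighted sum can be illustrated with a short Python sketch in which each neuron applies the hyperbolic tangent to a weighted sum of its inputs. The function names `neuron` and `layer` and the optional bias term are assumptions added for illustration; the bias is not mentioned in the text above.

```python
import math

def neuron(inputs, weights, bias=0.0, activation=math.tanh):
    """Nonlinear weighted sum: activation(sum_i w_i * g_i + bias)."""
    return activation(sum(w * g for w, g in zip(weights, inputs)) + bias)

def layer(inputs, weight_rows, biases):
    """One layer: every neuron sees the same inputs with its own weights."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# A three-layer example: 2 inputs -> 3 hidden neurons -> 1 output neuron
x = [0.5, -1.0]
hidden = layer(x, [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]], [0.0, 0.0, 0.0])
output = neuron(hidden, [1.0, -1.0, 0.5])
```

Each neuron within a layer depends only on the previous layer's outputs, so the list comprehension in `layer` could be evaluated in parallel.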
It will be convenient for the following to refer to a collection of functions g_i as simply a vector g = (g_1, g_2, …, g_n).

2.4.4 | ANN dependency graph
This figure depicts such a decomposition of f, with dependencies between variables indicated by arrows. These can be interpreted in two ways.
The first view is the functional view: the input x is transformed into a 3-dimensional vector h, which is then transformed into a 2-dimensional vector g, which is finally transformed into f. This view is most commonly encountered in the context of optimization.
The second view is the probabilistic view: the random variable F = f(G) depends upon the random variable G = g(H), which depends upon H = h(X), which depends upon the random variable X. This view is most commonly encountered in the context of graphical models.
The two views are largely equivalent. In either case, for this particular network architecture, the components of individual layers are independent of each other (e.g., the components of g are independent of each other given their input h). This naturally enables a degree of parallelism in the implementation.

Two separate depictions of the recurrent ANN dependency graph.

Networks such as the previous one are commonly called feed-forward, because their graph is a directed acyclic graph. Networks with cycles are commonly called recurrent. Such networks are commonly depicted in the manner shown at the top of the figure, where f is shown as being dependent upon itself. However, an implied temporal dependence is not shown.

2.4.5 | Learning
What has attracted the most interest in neural networks is the possibility of learning. Given a specific task to solve and a class of functions F, learning means using a set of observations to find the function f* in F which solves the task in some optimal sense.
This entails defining a cost function C: F → ℝ such that, for the optimal solution f*, C(f*) ≤ C(f) for all f in F, i.e., no solution has a cost less than the cost of the optimal solution (see Mathematical optimization).
The cost function is an important concept in learning, as it is a measure of how far away a particular solution is from an optimal solution to the problem to be solved. Learning algorithms search through the solution space to find a function that has the smallest possible cost.
For applications where the solution is dependent on some data, the cost must necessarily be a function of the observations; otherwise we would not be modeling anything related to the data. It is frequently defined as a statistic to which only approximations can be made.
As a simple example, consider the problem of finding the model f which minimizes C = E[(f(x) − y)²] for data pairs (x, y) drawn from some distribution D. In practical situations we would only have N samples from D and thus, for the above example, we would only minimize Ĉ = (1/N) Σ_{i=1}^{N} (f(x_i) − y_i)². Thus, the cost is minimized over a sample of the data rather than the entire data set.
When N → ∞, some form of online machine learning must be used, where the cost is partially minimized as each new example is seen. While online machine learning is often used when D is fixed, it is most useful in the case where the distribution changes slowly over time. In neural network methods, some form of online machine learning is frequently used for finite datasets.

2.4.6 | Choosing a cost function
While it is possible to define some arbitrary, ad hoc cost function, frequently a particular cost will be used, either because it has desirable properties (such as convexity) or because it arises naturally from a particular formulation of the problem (e.g., in a probabilistic formulation the posterior probability of the model can be used as an inverse cost). Ultimately, the cost function will depend on the desired task. An overview of the three main categories of learning tasks is provided below.

2.4.7 | Learning paradigms
There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning.

2.4.8 | Supervised learning
In supervised learning, we are given a set of example pairs (x, y), with x in X and y in Y, and the aim is to find a function f: X → Y in the allowed class of functions that matches the examples. In other words, we wish to infer the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data, and it implicitly contains prior knowledge about the problem domain.
A commonly used cost is the mean-squared error, which tries to minimize the average squared error between the network's output, f(x), and the target value y over all the example pairs. When one tries to minimize this cost using gradient descent for the class of neural networks called multilayer perceptrons, one obtains the common and well-known back-propagation algorithm for training neural networks.
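Minimizing a mean-squared-error cost by gradient descent can be shown on the smallest possible model. The sketch below fits a single linear unit f(x) = w·x + b; it is deliberately not a multilayer perceptron and does not implement back-propagation, it only illustrates the cost and the gradient step. The function names, learning rate, epoch count and toy data are all assumptions chosen for illustration.

```python
def mse(w, b, data):
    """Mean-squared error of f(x) = w*x + b over (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(data, lr=0.05, epochs=2000):
    """Plain batch gradient descent on the MSE cost."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Partial derivatives of the MSE with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical example)
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w_fit, b_fit = train(pairs)
```

On this noise-free data the fitted parameters approach w = 2 and b = 1 and the cost approaches zero; with a multilayer network the same update rule is applied to every weight, with the gradients obtained by back-propagation.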
Tasks that fall within the paradigm of supervised learning are pattern recognition (also known as classification) and regression (also known as function approximation). The supervised learning paradigm is also applicable to sequential
data (e.g., for speech and gesture recognition). This can be thought of as learning with a "teacher," in the form of a function that provides continuous feedback on the quality of solutions obtained thus far.

2.4.9 | Unsupervised learning
In unsupervised learning, some data x is given, together with a cost function to be minimized, which can be any function of the data x and the network's output f.
The cost function is dependent on the task (what we are trying to model) and our a priori assumptions (the implicit properties of our model, its parameters and the observed variables).
As a trivial example, consider the model f(x) = a, where a is a constant, and the cost C = E[(x − f(x))²]. Minimizing this cost will give us a value of a that is equal to the mean of the data. The cost function can be much more complicated. Its form depends on the application: for example, in compression it could be related to the mutual information between x and f(x), whereas in statistical modeling it could be related to the posterior probability of the model given the data. (Note that in both of those examples those quantities would be maximized rather than minimized.)
Tasks that fall within the paradigm of unsupervised learning are in general estimation problems; the applications include clustering, the estimation of statistical distributions, compression and filtering.

2.4.10 | Reinforcement learning
In reinforcement learning, data x are usually not given, but generated by an agent's interactions with the environment. At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics. The aim is to discover a policy for selecting actions that minimizes some measure of a long-term cost, i.e., the expected cumulative cost. The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.
More formally, the environment is modeled as a Markov decision process (MDP) with states s_1, …, s_n in S and actions a_1, …, a_m in A, with the following probability distributions: the instantaneous cost distribution P(c_t | s_t), the observation distribution P(x_t | s_t) and the transition P(s_{t+1} | s_t, a_t), while a policy is defined as the conditional distribution over actions given the observations. Taken together, the two define a Markov chain (MC). The aim is to
discover the policy that minimizes the cost, i.e., the MC for which the cost is minimal.
ANNs are frequently used in reinforcement learning as part of the overall algorithm. Dynamic programming has been coupled with ANNs (neuro-dynamic programming) by Bertsekas and Tsitsiklis and applied to multi-dimensional nonlinear problems, such as those involved in vehicle routing or natural resources management, because of the ability of ANNs to mitigate losses of accuracy even when reducing the discretization grid density for numerically approximating the solution of the original control problems.
Tasks that fall within the paradigm of reinforcement learning are control problems, games and other sequential decision-making tasks.

2.4.11 | Learning algorithms
Training a neural network model essentially means selecting one model from the set of allowed models (or, in a Bayesian framework, determining a distribution over the set of allowed models) that minimizes the cost criterion. There are numerous algorithms available for training neural network models; most of them can be viewed as a straightforward application of optimization theory and statistical estimation.
Most of the algorithms used in training artificial neural networks employ some form of gradient descent. This is done by simply taking the derivative of the cost function with respect to the network parameters and then changing those parameters in a gradient-related direction.
Evolutionary methods, simulated annealing, expectation-maximization, nonparametric methods and particle swarm optimization are some other commonly used methods for training neural networks.

2.4.12 | Employing artificial neural networks
Perhaps the greatest advantage of ANNs is their ability to be used as an arbitrary function approximation mechanism that 'learns' from observed data. However, using them is not so straightforward, and a relatively good understanding of the underlying theory is essential.
Choice of model: This will depend on the data representation and the application. Overly complex models tend to lead to problems with learning.
Learning algorithm: There are numerous trade-offs between learning algorithms. Almost any algorithm will work well with the correct hyperparameters for training on a particular fixed data set. However, selecting and tuning an algorithm for training on unseen data requires a significant amount of experimentation.
Robustness: If the model, cost function and learning algorithm are selected appropriately, the resulting ANN can be extremely robust.
With the correct implementation, ANNs can be used naturally in online learning and large data set applications. Their simple implementation and the existence of mostly local dependencies exhibited in the structure allow for fast, parallel implementations in hardware.

2.4.13 | Applications
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.

| Real-life applications
The tasks artificial neural networks are applied to tend to fall within the following broad categories:
• Function approximation, or regression analysis, including time series prediction, fitness approximation and modeling.
• Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
• Data processing, including filtering, clustering, blind source separation and compression.
• Robotics, including directing manipulators and computer numerical control.
Application areas include system identification and control (vehicle control, process control, natural resources management), quantum chemistry, game-playing and decision making (backgammon, chess, poker), pattern recognition (radar systems, face identification, object recognition and more), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial
applications (automated trading systems), data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.
Artificial neural networks have also been used to diagnose several cancers. An ANN-based hybrid lung cancer detection system named HLND improves the accuracy of diagnosis and the speed of lung cancer radiology. These networks have also been used to diagnose prostate cancer. The diagnoses can be used to make specific models taken from a large group of patients compared to information of one given patient. The models do not depend on assumptions about correlations of different variables. Colorectal cancer has also been predicted using neural networks. Neural networks could predict the outcome for a patient with colorectal cancer with a lot more accuracy than the current clinical methods. After training, the networks could predict multiple patient outcomes from unrelated institutions.

| Neural networks and neuroscience
Theoretical and computational neuroscience is the field concerned with the theoretical analysis and computational modeling of biological neural systems. Since neural systems are intimately related to cognitive processes and behavior, the field is closely related to cognitive and behavioral modeling.
The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).

2.4.14 | Types of models
Many models are used in the field, defined at different levels of abstraction and modeling different aspects of neural systems.
They range from models of the short-term behavior of individual neurons, models of how the dynamics of neural circuitry arise from interactions between individual neurons and finally to models of how behavior can arise from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and their relations to learning and memory, from the individual neuron to the system level.
2.4.15 | Neural network software
Neural network software is used to simulate, research, develop and apply artificial neural networks, biological neural networks and, in some cases, a wider array of adaptive systems.

2.4.16 | Types of artificial neural networks
Artificial neural network types vary from those with only one or two layers of single-direction logic to complicated multi-input networks with many directional feedback loops and layers. On the whole, these systems use algorithms in their programming to determine the control and organization of their functions. Some may be as simple as a one-neuron layer with an input and an output, and others can mimic complex systems such as dANN, which can mimic chromosomal DNA through sizes at cellular level, into artificial organisms, and simulate reproduction, mutation and population sizes.
Most systems use "weights" to change the parameters of the throughput and the varying connections to the neurons. Artificial neural networks can be autonomous and learn by input from outside "teachers" or even by self-teaching from written-in rules.

2.4.17 | Confidence analysis of a neural network
Supervised neural networks that use an MSE cost function can use formal statistical methods to determine the confidence of the trained model. The MSE on a validation set can be used as an estimate for the variance. This value can then be used to calculate the confidence interval of the output of the network, assuming a normal distribution. A confidence analysis made this way is statistically valid as long as the output probability distribution stays the same and the network is not modified.
By assigning a softmax activation function on the output layer of the neural network (or a softmax component in a component-based neural network) for categorical target variables, the outputs can be interpreted as posterior probabilities. This is very useful in classification, as it gives a certainty measure on classifications.
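Both ideas in this subsection reduce to short formulas, sketched below in Python. The softmax turns raw output-layer values into numbers that sum to one, and the confidence interval treats the validation MSE as a variance estimate under the normality assumption stated above. The function names and the 1.96 multiplier for an approximate 95% interval are assumptions introduced for this illustration.

```python
import math

def softmax(z):
    """Map raw output-layer values to values that are positive and sum to 1."""
    m = max(z)                                # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_interval(output, validation_mse, z=1.96):
    """Interval output +/- z * sqrt(MSE), assuming normally distributed errors."""
    half_width = z * math.sqrt(validation_mse)
    return output - half_width, output + half_width
```

For a word classifier, the largest softmax value indicates the chosen word category and its magnitude gives a rough certainty measure for that choice.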
    • CHAPTER 3 Image Processing
    • Chapter 3 | Image Processing
3.1 | INTRODUCTION
This chapter is an introduction to handling images in MATLAB. When working with images in MATLAB, there are many things to keep in mind, such as loading an image, using the right format, saving the data as different data types, displaying an image, and converting between different image formats. This worksheet presents some of the commands designed for these operations. Most of these commands require you to have the Image Processing Toolbox installed with MATLAB. To find out if it is installed, type ver at the MATLAB prompt. This gives you a list of the toolboxes that are installed on your system.
For further reference on image handling in MATLAB, you are recommended to use MATLAB's help browser. There is an extensive (and quite good) online manual for the Image Processing Toolbox that you can access via MATLAB's help browser.
The first sections of this worksheet are quite heavy. The only way to understand how the presented commands work is to carefully work through the examples given at the end of the worksheet. Once you can get these examples to work, experiment on your own using your favorite image!

3.1.1 | What Is Digital Image Processing?
Transforming digital information representing images.

3.1.2 | Motivating Problems:
1. Improve pictorial information for human interpretation.
2. Remove noise.
3. Correct for motion, camera position, and distortion.
4. Enhance by changing contrast, color.
5. Segmentation: dividing an image up into constituent parts.
6. Representation: representing an image by some more abstract models.
7. Classification.
8. Reduce the size of image information for efficient handling.
9. Compression with loss of digital information that minimizes loss of "perceptual" information (JPEG, GIF, MPEG).
3.2 | COLOR VISION
The color-responsive chemicals in the cones are called cone pigments and are very similar to the chemicals in the rods. The retinal portion of the chemical is the same; however, the scotopsin is replaced with photopsins. Therefore, the color-responsive pigments are made of retinal and photopsins. There are three kinds of color-sensitive pigments:
  • Red-sensitive pigment
  • Green-sensitive pigment
  • Blue-sensitive pigment
Each cone cell has one of these pigments, so that it is sensitive to that color. The human eye can sense almost any gradation of color when red, green and blue are mixed. The wavelengths of the three types of cones (red, green and blue) are shown. The peak absorbance of the blue-sensitive pigment is at 445 nanometers, of the green-sensitive pigment at 535 nanometers, and of the red-sensitive pigment at 570 nanometers.
MATLAB stores most images as two-dimensional arrays (i.e., matrices), in which each element of the matrix corresponds to a single pixel in the displayed image. For example, an image composed of 200 rows and 300 columns of different colored dots would be stored in MATLAB as a 200-by-300 matrix. Some images, such as RGB, require a three-dimensional array, where the first plane in the third dimension represents the red pixel intensities, the second plane represents the green pixel intensities, and the third plane represents the blue pixel intensities. To reduce memory requirements, MATLAB supports storing image data in arrays of class uint8 and uint16. The data in these arrays is stored as 8-bit or 16-bit unsigned integers. These arrays require one-eighth or one-fourth as much memory, respectively, as data in double arrays. An image whose data matrix has class uint8 is called an 8-bit image; an image whose data matrix has class uint16 is called a 16-bit image.
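The storage layout and memory figures described above can be checked directly. The following is a minimal sketch using base Matlab only; the image content is random, made-up data and the variable names are hypothetical.

```matlab
% Sketch with made-up data: compare storage for double, uint16 and uint8.
rows = 200; cols = 300;
grayDouble = rand(rows, cols);                    % class double, 8 bytes per pixel
grayUint16 = uint16(round(grayDouble * 65535));   % 16-bit image, 2 bytes per pixel
grayUint8  = uint8(round(grayDouble * 255));      % 8-bit image, 1 byte per pixel

% An RGB image is a rows-by-cols-by-3 array: planes 1, 2, 3 = red, green, blue.
rgb = zeros(rows, cols, 3, 'uint8');
rgb(:, :, 1) = 255;                               % fill the red plane: a pure red image
redPlane = rgb(:, :, 1);                          % extract one colour plane as a matrix

infoD  = whos('grayDouble');
info16 = whos('grayUint16');
info8  = whos('grayUint8');
fprintf('double: %d bytes, uint16: %d bytes, uint8: %d bytes\n', ...
        infoD.bytes, info16.bytes, info8.bytes);
```

For the 200-by-300 example the three arrays occupy 480000, 120000 and 60000 bytes, which matches the one-fourth and one-eighth ratios quoted in the text.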
3.2.1 | Fundamentals
A digital image is composed of pixels, which can be thought of as small dots on the screen. A digital image is an instruction of how to color each pixel. We will see in detail later on how this is done in practice. A typical size of an image is 512-by-512 pixels. Later on in the course you will see that it is convenient to let the dimensions of the image be a power of 2. For example, 2^9 = 512. In the general case we say that an image is of size m-by-n if it is composed of m pixels in the vertical direction and n pixels in the horizontal direction. Let us say that we have an image in the format 512-by-1024 pixels. This means that the data for the image must contain information about 524288 pixels, which requires a lot of memory! Hence, compressing images is essential for efficient image processing. You will later on see how Fourier analysis and wavelet analysis can help us to compress an image significantly. There are also a few "computer scientific" tricks (for example entropy coding) to reduce the amount of data required to store an image.
3.2.2 | Image Formats Supported By Matlab
The following image formats are supported by Matlab:
  • BMP
  • HDF
  • JPEG
  • PCX
  • TIFF
  • XWD
Most images you find on the Internet are JPEG images; JPEG is the name of one of the most widely used compression standards for images. If you have stored an image, you can usually see from the suffix what format it is stored in. For example, an image named myimage.jpg is stored in the JPEG format, and we will see later on that we can load an image of this format into Matlab.
3.2.3 | Working Formats In Matlab:
If an image is stored as a JPEG image on your disc, we first read it into Matlab. However, in order to start working with an image, for example to perform a wavelet transform on it, we must convert it into a different format. This section explains four common formats.
3.3 | ASPECTS OF IMAGE PROCESSING
Image Enhancement: Processing an image so that the result is more suitable for a particular application (sharpening or deblurring an out-of-focus image, highlighting edges, improving image contrast, brightening an image, removing noise).
Image Restoration: This may be considered as reversing the damage done to an image by a known cause (removing blur caused by linear motion, removing optical distortions).
Image Segmentation: This involves subdividing an image into constituent parts, or isolating certain aspects of an image (finding lines, circles, or particular shapes in an image; in an aerial photograph, identifying cars, trees, buildings, or roads).
3.4 | IMAGE TYPES
3.4.1 | Intensity Image (Gray Scale Image)
This is the equivalent of a "gray scale image", and it is the image type we will mostly work with in this course. It represents an image as a matrix in which every element has a value corresponding to how bright or dark the pixel at the corresponding position should be colored. There are two ways to represent the number that gives the brightness of the pixel. The double class (or data type) assigns a floating-point number ("a number with decimals") between 0 and 1 to each pixel. The value 0 corresponds to black and the value 1 corresponds to white. The other class is called uint8, which assigns an integer between 0 and 255 to represent the brightness of a pixel. The value 0 corresponds to black and 255 to white. The class uint8 requires only roughly 1/8 of the storage compared to the class double. On the other hand, many mathematical functions can only be applied to the double class. We will see later how to convert between double and uint8.
Fig. (3.1)
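The relation between the two intensity classes can be illustrated with a small round trip. This is a sketch in base Matlab with a made-up 2-by-2 image; the square root is only a stand-in for "some operation carried out in double", not a step taken from the worksheet.

```matlab
% Sketch: convert a uint8 intensity image to double, process it, convert back.
I8 = uint8([0 64; 128 255]);     % 8-bit intensity image (0 = black, 255 = white)
Id = double(I8) / 255;           % same image as class double, values in [0, 1]
Id = sqrt(Id);                   % example operation in double (brightens mid-tones)
J8 = uint8(round(Id * 255));     % back to uint8, values in [0, 255]
disp(J8);
```

Black (0) and white (255) are unchanged by this particular operation, while the intermediate gray levels become brighter.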
3.4.2 | Binary Image:
This image format also stores an image as a matrix, but can only color a pixel black or white (and nothing in between). It assigns a 0 for black and a 1 for white.
3.4.3 | Indexed Image:
This is a practical way of representing color images. (In this course we will mostly work with gray scale images, but once you have learned how to work with a gray scale image you will also know the principle of how to work with color images.) An indexed image stores an image as two matrices. The first matrix has the same size as the image and one number for each pixel. The second matrix is called the color map, and its size may be different from the image. The numbers in the first matrix are an instruction of which entry to use in the color map matrix.
Fig. (3.2)
3.4.4 | RGB Image
This is another format for color images. It represents an image with three matrices of sizes matching the image format. Each matrix corresponds to one of the colors red, green or blue and gives an instruction of how much of each of these colors a certain pixel should use.
3.4.5 | Multi-frame Image:
In some applications we want to study a sequence of images. This is very common in biological and medical imaging, where you might study a sequence of slices of a cell. For these cases, the multi-frame format is a convenient way of working with a sequence of images. In case you choose to work with biological imaging later on in this course, you may use this format.
3.5 | HOW TO?
3.5.1 | How To Convert Between Different Formats:
The following table shows how to convert between the different formats given above. All these commands require the Image Processing Toolbox!
Table (3.1): Image format conversion (within the parentheses you type the name of the image you wish to convert)
  Matlab command   Operation
  dither()         Convert intensity/indexed/RGB format to binary format.
  gray2ind()       Convert intensity format to indexed format.
  ind2gray()       Convert indexed format to intensity format.
  ind2rgb()        Convert indexed format to RGB format.
  mat2gray()       Convert a regular matrix to intensity format by scaling.
  rgb2gray()       Convert RGB format to intensity format.
  rgb2ind()        Convert RGB format to indexed format.
The command mat2gray is useful if you have a matrix representing an image but the values representing the gray scale range between, let's say, 0 and 1000. The command mat2gray automatically rescales all entries so that they fall between 0 and 1; the result is of class double.
3.5.2 | How to Read Files
When you encounter an image you want to work with, it is usually in the form of a file (for example, if you download an image from the web, it is usually stored as a JPEG file). Once we are done processing an image, we may want to write it back to a JPEG file so that we can, for example, post the processed image on the web. This is done using the imread and imwrite commands. In current Matlab releases these two commands are part of the base product.
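The read and write steps described above can be sketched as a round trip, assuming imread and imwrite are available in your installation. The image is synthetic and the file name is a temporary, hypothetical one; BMP is used because it stores pixel values without loss. The gray conversion is done by hand with the luminance weights documented for rgb2gray, so the sketch does not depend on that function being installed.

```matlab
% Sketch: write a synthetic RGB image to disk, read it back, convert to gray.
[X, Y] = meshgrid(linspace(0, 1, 64), linspace(0, 1, 48));
rgb    = cat(3, X, Y, 0.5 * ones(size(X)));   % 48-by-64-by-3 image, class double
rgb8   = uint8(round(rgb * 255));             % same image as uint8

fileName = [tempname '.bmp'];                 % hypothetical temporary file
imwrite(rgb8, fileName);                      % note the semicolons: no numbers scroll by
A = imread(fileName);                         % 48-by-64-by-3 uint8 array

% Weighted sum of the red, green and blue planes gives an intensity image.
Ad   = double(A) / 255;
gray = 0.2989 * Ad(:, :, 1) + 0.5870 * Ad(:, :, 2) + 0.1140 * Ad(:, :, 3);

disp(size(A)); disp(class(A));
delete(fileName);                             % clean up the temporary file
```

With a lossy format such as JPEG the array read back would generally differ slightly from the array written.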
Table (3.2): Reading and writing image files
  Matlab command   Operation
  imread()         Read an image. (Within the parentheses you type the name of the image file you wish to read. Put the file name within single quotes.)
  imwrite()        Write an image to a file. (As the first argument within the parentheses you type the name of the image you have worked with. As the second argument you type the name of the file and format that you want to write the image to. Put the file name within single quotes.)
Make sure to put a semicolon (;) after these commands, otherwise you will get LOTS OF numbers scrolling on your screen. The commands imread and imwrite support the formats given in the section "Image formats supported by Matlab" above.
3.5.3 | Loading And Saving Variables in Matlab
This section explains how to load and save variables in Matlab. Once you have read a file, you probably convert it into an intensity image (a matrix) and work with this matrix. Once you are done, you may want to save the matrix representing the image in order to continue to work with it at another time. This is easily done using the commands save and load. Note that save and load are commonly used Matlab commands and work independently of which toolboxes are installed.
Table (3.3): Loading and saving variables
  Matlab command   Operation
  save X           Save the variable X.
  load X           Load the variable X.
3.5.4 | How to Display an Image in MATLAB
Here are a couple of basic Matlab commands (which do not require any toolbox) for displaying an image.
Table (3.4): Displaying an image given in matrix form
  Matlab command    Operation
  imagesc(X)        Display an image represented as the matrix X.
  brighten(s)       Adjust the brightness. s is a parameter such that -1<s<0 gives a darker image and 0<s<1 gives a brighter image.
  colormap(gray)    Change the colors to gray.
Sometimes your image may not be displayed in gray scale even though you have converted it into a gray scale image. You can then use the command colormap(gray) to "force" Matlab to use a gray scale when displaying an image. If you are using Matlab with the Image Processing Toolbox installed, I recommend using the command imshow to display an image.
Table (3.5): Displaying an image given in matrix form (with the Image Processing Toolbox)
  Matlab command   Operation
  imshow(X)        Display an image represented as the matrix X.
  zoom on          Zoom in (using the left and right mouse buttons).
  zoom off         Turn off the zoom function.
3.6 | SOME IMPORTANT DEFINITIONS
3.6.1 | Imread Function
A = imread(filename, fmt) reads a grayscale or true color image named filename into A. If the file contains a grayscale intensity image, A is a two-dimensional array. If the file contains a true color (RGB) image, A is a three-dimensional (m-by-n-by-3) array.
3.6.2 | Rotation
>> B = imrotate(A, ANGLE, METHOD)
where:
A: Your image.
ANGLE: The angle (in degrees) by which you want to rotate your image in the counterclockwise direction.
METHOD: A string that can have one of the values 'nearest', 'bilinear' or 'bicubic'.
If you omit the METHOD argument, imrotate uses the default method of 'nearest'.
Note: to rotate the image clockwise, specify a negative angle. The returned image matrix B is, in general, larger than A in order to include the whole rotated image. imrotate sets invalid values on the periphery of B to 0.
3.6.3 | Scaling
imresize resizes an image of any type using the specified interpolation method.
3.6.4 | Interpolation
Supported interpolation methods:
  'nearest'    nearest neighbor interpolation (default)
  'bilinear'   bilinear interpolation
  'bicubic'    bicubic interpolation
B = imresize(A, M, METHOD) returns an image that is M times the size of A. If M is between 0 and 1.0, B is smaller than A. If M is greater than 1.0, B is larger than A. If METHOD is omitted, imresize uses nearest neighbor interpolation (in the Matlab release used here; newer releases default to bicubic).
B = imresize(A, [MROWS MCOLS], METHOD) returns an image of size MROWS-by-MCOLS. If the specified size does not produce the same aspect ratio as the input image, the output image is distorted.
>> a = imread('image.fmt'); % put your image in place of image.fmt
>> B = imresize(a, [100 100], 'nearest');
>> imshow(B);
>> B = imresize(a, [100 100], 'bilinear');
>> imshow(B);
>> B = imresize(a, [100 100], 'bicubic');
>> imshow(B);
3.7 | EDGE DETECTION
3.7.1 | Canny Edge Detector
The Canny detector is designed around three criteria:
1. Low error rate of detection: the result should match human perception well.
2. Good localization of edges: the distance between the actual edges in an image and the edges found by a computational algorithm should be minimized.
3. Single response: the algorithm should not return multiple edge pixels when only a single edge exists.
3.7.2 | Edge Detectors
Fig. (3.4), Fig. (3.5) [panels: bw, color, Canny, Sobel]
3.7.3 | Edge Tracing
>> b = rgb2gray(a); % convert to gray; edge tracing works only on gray images
>> edge(b, 'prewitt');
>> edge(b, 'sobel');
>> edge(b, 'sobel', 'vertical');
>> edge(b, 'sobel', 'horizontal');
>> edge(b, 'sobel', 'both');
We can only do edge tracing using gray scale images (i.e. images without color).
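To show what a Sobel edge detector computes, the sketch below applies the two Sobel kernels by hand with conv2 from base Matlab. It is an illustration of the principle, not the toolbox's edge function: the test image is synthetic, and the threshold (half the maximum gradient magnitude) is an arbitrary assumption.

```matlab
% Sketch: Sobel gradient magnitude on a synthetic gray scale image.
b = zeros(64, 64);
b(17:48, 17:48) = 1;                     % white square on a black background

kx = [-1 0 1; -2 0 2; -1 0 1];           % kernel for intensity changes along a row
ky = kx';                                % kernel for intensity changes along a column
gx = conv2(b, kx, 'same');               % responds to vertical edges
gy = conv2(b, ky, 'same');               % responds to horizontal edges
mag = sqrt(gx.^2 + gy.^2);               % gradient magnitude

edges = mag > 0.5 * max(mag(:));         % assumed threshold: half the maximum
imagesc(edges); colormap(gray); axis image;
```

Using only gx or only gy corresponds to the 'vertical' and 'horizontal' options in the commands above, and the combined magnitude corresponds to 'both'.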
>> BW = rgb2gray(A);
>> edge(BW, 'prewitt')
Fig. (3.6): the output observed for the command above.
>> edge(BW, 'sobel', 'vertical')
>> edge(BW, 'sobel', 'horizontal')
>> edge(BW, 'sobel', 'both')
Table (3.6): Data types
  Type     Description                     Range
  int8     8-bit integer                   -128 to 127
  uint8    8-bit unsigned integer          0 to 255
  int16    16-bit integer                  -32768 to 32767
  double   Double precision real number    Machine specific
3.8 | MAPPING
3.8.1 | Mapping Images onto Surfaces Overview
Mapping an image onto geometry, also known as texture mapping, involves overlaying an image or function onto a geometric surface. Images may be realistic, such as satellite images, or representational, such as color-coded functions of temperature or elevation. Unlike volume visualizations, which render each voxel (volume element) of a three-dimensional scene, mapping an image onto geometry efficiently creates the appearance of complexity by simply layering an image onto a surface. The resulting realism of the display also provides information that is not as readily apparent as with a simple display of either the image or the geometric surface.
Mapping an image onto a geometric surface is a two-step process. First, the image is mapped onto the geometric surface in object space. Second, the surface undergoes view transformations (relating to the viewpoint of the observer) and is then displayed in 2D screen space. You can use IDL Direct Graphics or Object Graphics to display images mapped onto geometric surfaces. The following table introduces the tasks and routines.
Table (3.7): Tasks and Routines Associated with Mapping an Image onto Geometry
  Routine(s)/Object(s)                                  Description
  SHADE_SURF                                            Display the elevation data.
  IDLgrWindow::Init, IDLgrView::Init, IDLgrModel::Init  Initialize the objects necessary for an Object Graphics display.
  IDLgrSurface::Init                                    Initialize a surface object containing the elevation data.
  IDLgrImage::Init                                      Initialize an image object containing the satellite image.
  XOBJVIEW                                              Display the object in an interactive IDL utility allowing rotation and resizing.
3.8.2 | Mapping an Image onto Elevation Data
The following Object Graphics example maps a satellite image from the Los Angeles, California vicinity onto a DEM (Digital Elevation Model) containing the area's topographical features. The realism resulting from mapping the image onto the corresponding elevation data provides a more informative view of the area's topography.
The process is segmented into the following three sections:
  • “Opening Image and Geometry Files”
  • “Initializing the IDL Display Objects”
  • “Displaying the Image and Geometric Surface Objects”
Note: Data can be either regularly gridded (defined by a 2D array) or irregularly gridded (defined by irregular x, y, z points). Both the image and elevation data used in this example are regularly gridded. If you are dealing with irregularly gridded data, use GRIDDATA to map the data to a regular grid.
Complete the following steps for a detailed description of the process.
Example Code: See the examples/doc/image subdirectory of the IDL installation directory for code that duplicates this example. Run the example procedure by entering elevation_object at the IDL command prompt, or view the file in an IDL Editor window by entering .EDIT followed by the file name.
Opening Image and Geometry Files:
The following steps read in the satellite image and DEM files and display the elevation data.
1. Select the satellite image:
imageFile = FILEPATH('elev_t.jpg', $
   SUBDIRECTORY = ['examples', 'data'])
2. Import the JPEG file:
READ_JPEG, imageFile, image
3. Select the DEM file:
demFile = FILEPATH('elevbin.dat', $
   SUBDIRECTORY = ['examples', 'data'])
4. Define an array for the elevation data, open the file, read in the data and close the file:
dem = READ_BINARY(demFile, DATA_DIMS = [64, 64])
5. Enlarge the size of the elevation array for display purposes:
dem = CONGRID(dem, 128, 128, /INTERP)
6. To quickly visualize the elevation data before continuing on to the Object Graphics section, initialize the display, create a window and display the elevation data using the SHADE_SURF command:
DEVICE, DECOMPOSED = 0
WINDOW, 0, TITLE = 'Elevation Data'
SHADE_SURF, dem
Fig. (3.7): Visual Display of the Elevation Data
After reading in the satellite image and DEM data, continue with the next section to create the objects necessary to map the satellite image onto the elevation surface.
3.8.3 | Initializing the IDL Display Objects
After reading in the image and surface data in the previous steps, you will need to create objects containing the data. When creating an IDL Object Graphics display, it is necessary to create a window object (oWindow), a view object (oView) and a model object (oModel). These display objects, shown in the conceptual representation in the following figure, will contain a geometric surface object (the DEM data) and an image object (the satellite image). These user-defined objects are instances of existing IDL object classes and provide access to the properties and methods associated with each object class.
Note: The XOBJVIEW utility (described in “Mapping an Image Object onto a Sphere”) automatically creates window and view objects.
Complete the following steps to initialize the necessary IDL objects.
1. Initialize the window, view and model display objects. For detailed syntax, arguments and keywords available with each object initialization, see IDLgrWindow::Init, IDLgrView::Init and IDLgrModel::Init. The following three lines use the basic syntax oNewObject = OBJ_NEW('Class_Name') to create these objects:
oWindow = OBJ_NEW('IDLgrWindow', RETAIN = 2, COLOR_MODEL = 0)
oView = OBJ_NEW('IDLgrView')
oModel = OBJ_NEW('IDLgrModel')
2. Assign the elevation surface data, dem, to an IDLgrSurface object. The IDLgrSurface::Init keyword STYLE = 2 draws the elevation data using a filled line style:
oSurface = OBJ_NEW('IDLgrSurface', dem, STYLE = 2)
3. Assign the satellite image to a user-defined IDLgrImage object using IDLgrImage::Init:
oImage = OBJ_NEW('IDLgrImage', image, INTERLEAVE = 0, $
   /INTERPOLATE)
INTERLEAVE = 0 indicates that the satellite image is organized using pixel interleaving, and therefore has the dimensions (3, m, n). The INTERPOLATE keyword forces bilinear interpolation instead of the default nearest neighbor interpolation method.
3.8.4 | Displaying the Image and Geometric Surface Objects
This section displays the objects created in the previous steps. The image and surface objects will first be displayed in an IDL Object Graphics window and then with the interactive XOBJVIEW utility.
1. Center the elevation surface object in the display window. The default object graphics coordinate system is [–1,–1], [1,1]. To center the object in the window, position the lower left corner of the surface data at [–0.5, –0.5, –0.5] for the x, y and z dimensions:
oSurface -> GETPROPERTY, XRANGE = xr, YRANGE = yr, $
   ZRANGE = zr
xs = NORM_COORD(xr)
xs[0] = xs[0] - 0.5
ys = NORM_COORD(yr)
ys[0] = ys[0] - 0.5
zs = NORM_COORD(zr)
zs[0] = zs[0] - 0.5
oSurface -> SETPROPERTY, XCOORD_CONV = xs, $
   YCOORD_CONV = ys, ZCOORD_CONV = zs
2. Map the satellite image onto the geometric elevation surface using the IDLgrSurface::Init TEXTURE_MAP keyword:
oSurface -> SetProperty, TEXTURE_MAP = oImage, $
   COLOR = [255, 255, 255]
For the clearest display of the texture map, set COLOR = [255, 255, 255]. If the image does not have dimensions that are exact powers of 2, IDL resamples the image into a larger size whose dimensions are the next powers of two greater than the original dimensions. This resampling may cause unwanted sampling artifacts. In this example, the image does have dimensions that are exact powers of two, so no resampling occurs.
Note: If your texture does not have dimensions that are exact powers of 2 and you do not want to introduce resampling artifacts, you can pad the texture with unused data to a power of two and tell IDL to map only a subset of the texture onto the surface. For example, if your image is 40 by 40, create a 64 by 64 image and fill part of it with the image data:
textureImage = BYTARR(64, 64, /NOZERO)
textureImage[0:39, 0:39] = image ; image is 40 by 40
oImage = OBJ_NEW('IDLgrImage', textureImage)
Then, construct texture coordinates that map the active part of the texture to a surface (oSurface):
textureCoords = [[], [], [], []]
oSurface -> SetProperty, TEXTURE_COORD = textureCoords
The surface object in IDL 5.6 has been enhanced to perform the above calculation automatically. In the above example, just use the image data (the 40 by 40 array) to create the image texture and do not supply texture coordinates. IDL computes the appropriate texture coordinates to correctly use the 40 by 40 image.
Note: Some graphic devices have a limit on the maximum texture size. If your texture is larger than the maximum size, IDL scales it down to dimensions that work on the device. This rescaling may introduce resampling artifacts and loss of detail in the texture. To avoid this, use the TEXTURE_HIGHRES keyword to tell IDL to draw the surface in smaller pieces that can be texture mapped without loss of detail.
3. Add the surface object, covered by the satellite image, to the model object. Then add the model to the view object:
oModel -> Add, oSurface
oView -> Add, oModel
4. Rotate the model for better display in the object window. Without rotating the model, the surface is displayed at a 90° elevation angle, containing no depth information. The following lines rotate the model 90° away from the viewer along the x-axis and 30° clockwise along the y-axis and the x-axis:
oModel -> ROTATE, [1, 0, 0], -90
oModel -> ROTATE, [0, 1, 0], 30
oModel -> ROTATE, [1, 0, 0], 30
5. Display the result in the Object Graphics window:
oWindow -> Draw, oView
Fig. (3.9): Image Mapped onto a Surface in an Object Graphics Window
6. Display the results using XOBJVIEW, setting SCALE = 1 (instead of the default value of 1/SQRT(3)) to increase the size of the initial display:
XOBJVIEW, oModel, /BLOCK, SCALE = 1
This results in the following display:
Fig. (3.10): Displaying the Image Mapped onto the Surface in XOBJVIEW
After displaying the model, you can rotate it by clicking in the application window and dragging your mouse. Select the magnify button, then click near the middle of the image. Drag your mouse away from the center of the display to magnify the image, or toward the center of the display to shrink the image. Select the left-most button on the XOBJVIEW toolbar to reset the display.
7. Destroy unneeded object references after closing the display windows:
OBJ_DESTROY, [oView, oImage]
The oModel and oSurface objects are automatically destroyed when oView is destroyed. For an example of mapping an image onto a regular surface using both Direct and Object Graphics displays, see “Mapping an Image onto a Sphere”.
3.8.5 | Mapping an Image onto a Sphere
The following example maps an image containing a color representation of world elevation onto a sphere using both Direct and Object Graphics displays. The example is broken down into two sections:
  • “Mapping an Image onto a Sphere Using Direct Graphics”
  • “Mapping an Image Object onto a Sphere”
3.9 | MAPPING OFF LINE:
In the absence of a network or online services, we can still identify and follow a route by using image processing techniques. We store a map image of the places familiar to the person and determine clear and safe routes to reach them and to return from them. We calculate distances using the Matlab function imdistline and, by assuming a walking speed, calculate the time it takes to get from one point to another. We then guide the person through voice commands, for example to move forward, backward, left or right along the road. In this way we provide another way to do mapping without being online.
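The distance and time calculation described in Section 3.9 can be sketched as follows. Everything here is an assumption made for illustration: the map scale, the walking speed, the pixel coordinates and the rule for choosing a voice prompt are hypothetical and are not taken from the project. imdistline itself is an interactive tool; the sketch computes the same straight-line distance directly from two pixel coordinates.

```matlab
% Sketch with hypothetical values: distance, travel time and a coarse prompt.
metersPerPixel = 0.5;                 % map scale (assumed)
walkingSpeed   = 1.0;                 % metres per second (assumed)
p1 = [120, 80];                       % current position [x y] on the map, pixels
p2 = [120, 20];                       % destination [x y] on the map, pixels

dPixels  = norm(p2 - p1);             % straight-line distance in pixels
dMeters  = dPixels * metersPerPixel;  % distance on the ground
tSeconds = dMeters / walkingSpeed;    % estimated walking time

% Choose a prompt from the dominant direction, assuming the person faces
% "up" on the map (image coordinates: x grows right, y grows downward).
delta = p2 - p1;
if abs(delta(1)) >= abs(delta(2))
    if delta(1) >= 0, prompt = 'right'; else, prompt = 'left'; end
else
    if delta(2) < 0, prompt = 'forward'; else, prompt = 'backward'; end
end
fprintf('Move %s: %.1f m, about %.0f s\n', prompt, dMeters, tSeconds);
```

A real route would be a sequence of such segments, with the prompt recomputed at every waypoint.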
    • CHAPTER 4 GPS Navigation
    • Chapter 4 | GPS Navigation
4.1 | INTRODUCTION
4.1.1 | What Is GPS?
The Global Positioning System (GPS) is a satellite-based navigation system made up of a network of 24 satellites placed into orbit by the U.S. Department of Defense. GPS was originally intended for military applications, but in the 1980s the government made the system available for civilian use. GPS works in any weather conditions, anywhere in the world, 24 hours a day. There are no subscription fees or setup charges to use GPS.
4.1.2 | How It Works
GPS satellites circle the earth twice a day in a very precise orbit and transmit signal information to earth. GPS receivers take this information and use trilateration to calculate the user's location. Essentially, the GPS receiver compares the time a signal was transmitted by a satellite with the time it was received. The time difference tells the GPS receiver how far away the satellite is. With distance measurements from a few more satellites, the receiver can determine the user's position and display it on the unit's electronic map.
Fig. (4-1): Satellite Screen
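The step from time difference to distance can be written as a one-line formula. The symbols below are introduced here for illustration, and the 70 ms transit time is an assumed round number rather than a measured value.

```latex
d = c \,\bigl(t_{\text{received}} - t_{\text{transmitted}}\bigr),
\qquad c = 299\,792\,458\ \text{m/s}
\\[4pt]
\Delta t = 0.070\ \text{s}
\;\Rightarrow\;
d = 299\,792\,458 \times 0.070 \approx 2.10 \times 10^{7}\ \text{m}
\approx 21\,000\ \text{km}
```

The same relation shows why timing matters so much: a timing error of one microsecond corresponds to a distance error of about 300 m.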
Fig. (4-2): How GPS works
A GPS receiver must be locked on to the signal of at least three satellites to calculate a 2D position (latitude and longitude) and track movement. With four or more satellites in view, the receiver can determine the user's 3D position (latitude, longitude and altitude). Once the user's position has been determined, the GPS unit can calculate other information, such as speed, bearing, track, trip distance, distance to destination, sunrise and sunset time and more.
When people talk about "a GPS," they usually mean a GPS receiver. The Global Positioning System (GPS) is actually a constellation of 27 Earth-orbiting satellites (24 in operation and three extras in case one fails). The US military developed and implemented this satellite network as a military navigation system, but soon opened it up to civilian use.
4.2 | BASIC CONCEPT OF GPS
A GPS receiver calculates its position by precisely timing the signals sent by GPS satellites high above the Earth. Each satellite continually transmits messages that include:
  • The time the message was transmitted.
  • Precise orbital information (the ephemeris).
  • The general system health and rough orbits of all GPS satellites (the almanac).
The receiver uses the messages it receives to determine the transit time of each message and computes the distance to each satellite. These distances, along with the satellites' locations, are used (with the possible aid of trilateration, depending on which algorithm is used) to compute the position of the receiver. This position is then displayed, perhaps with a moving map display or as latitude and longitude; elevation information may be included. Many GPS units show derived information such as direction and speed, calculated from position changes.
Three satellites might seem enough to solve for position, since space has three dimensions and a position near the Earth's surface can be assumed. However, even a very small clock error multiplied by the very large speed of light (the speed at which satellite signals propagate) results in a large positional error. Therefore receivers use four or more satellites to solve for both the receiver's location and time. The very accurately computed time is effectively hidden by most GPS applications, which use only the location. A few specialized GPS applications do however use the time; these include time transfer, traffic signal timing, and synchronization of cell phone base stations.
Although four satellites are required for normal operation, fewer suffice in special cases. If one variable is already known, a receiver can determine its position using only three satellites. For example, a ship or aircraft may have known elevation. Some GPS receivers may use additional clues or assumptions to give a less accurate (degraded) position when fewer than four satellites are visible.
4.3 | POSITION CALCULATION INTRODUCTION
To provide an introductory description of how a GPS receiver works, error effects are deferred to a later section.
Using messages received from a minimum of four visible satellites, a GPS receiver is able to determine the times sent and then the satellite positions corresponding to these times sent. The x, y, and z components of position, and the time sent, are designated as $[x_i, y_i, z_i, t_i]$, where the subscript $i$ is the satellite number and has the value 1, 2, 3, or 4. Knowing the indicated time the message was received, $\bar{t}_r$, the GPS receiver could compute the transit time of the message as $(\bar{t}_r - t_i)$, if $\bar{t}_r$ were equal to the correct reception time, $t_r$. A pseudo range, $p_i = (\bar{t}_r - t_i)\,c$, would be the traveling distance of the message, assuming it traveled at the speed of light, $c$.
A satellite's position and pseudo range define a sphere, centered on the satellite, with radius equal to the pseudo range. The position of the receiver is somewhere on the surface of this sphere. Thus with four satellites, the indicated position of the GPS receiver is at or near the intersection of the surfaces of four spheres. In the ideal case of no errors, the GPS receiver would be at a precise intersection of the four surfaces.
If the surfaces of two spheres intersect at more than one point, they intersect in a circle; the mathematics of trilateration shows this. A figure, two sphere surfaces intersecting in a circle, is shown below. Two points where the surfaces of the spheres intersect are clearly shown in the figure. The distance between these two points is the diameter of the circle of intersection. The intersection of a third spherical surface with the first two will be its intersection with that circle; in most cases of practical interest, this means they intersect at two points. Another figure, surface of a sphere intersecting a circle (not a solid disk) at two points, illustrates the intersection. The two intersections are marked with dots. Again, the mathematics of trilateration shows this.
For automobiles and other near-earth vehicles, the correct position of the GPS receiver is the intersection closest to the Earth's surface. For space vehicles, the intersection farthest from Earth may be the correct one. The correct position for the GPS receiver is also on the intersection with the surface of the sphere corresponding to the fourth satellite.
Fig. (4-3): Two sphere surfaces intersecting in a circle; surface of a sphere intersecting a circle (not a solid disk) at two points
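The four-sphere intersection described above can be sketched numerically. The example below is a minimal illustration, not the algorithm of any particular receiver: the satellite coordinates, the receiver position and the clock error are made-up values, the pseudo ranges are simulated from them, and the unknowns (x, y, z and the clock error expressed as a distance) are found by a Gauss-Newton iteration, which is one common choice among several.

```matlab
% Sketch: solve for receiver position and clock error from four pseudo ranges.
c = 299792458;                                % speed of light, m/s

% Assumed satellite positions in metres, one row [x y z] per satellite.
sat = [15600e3,  7540e3, 20140e3;
       18760e3,  2750e3, 18610e3;
       17610e3, 14630e3, 13480e3;
       19170e3,   610e3, 18390e3];

truePos  = [-41.77e3, -16.79e3, 6370.06e3];   % assumed receiver position, m
trueBias = 3.0e-3;                            % assumed receiver clock error, s

% Simulated pseudo ranges: geometric range plus the clock-error term.
rho = sqrt(sum((sat - repmat(truePos, 4, 1)).^2, 2)) + c * trueBias;

% Unknowns est = [x y z b], where b = c * clock error (metres).
est = [0, 0, 6370e3, 0];                      % assumed initial guess on the surface
for k = 1:20
    diffs    = sat - repmat(est(1:3), 4, 1);  % receiver-to-satellite vectors
    r        = sqrt(sum(diffs.^2, 2));        % predicted geometric ranges
    residual = rho - (r + est(4));            % measured minus predicted
    H        = [-diffs ./ repmat(r, 1, 3), ones(4, 1)];  % Jacobian of the prediction
    step     = H \ residual;                  % linearised correction
    est      = est + step';
    if norm(step) < 1e-3, break; end          % stop when the update is below 1 mm
end

fprintf('position error: %.4f m, clock error: %.6f ms\n', ...
        norm(est(1:3) - truePos), 1e3 * est(4) / c);
```

Treating the clock error as a fourth unknown is what makes the fourth satellite necessary; with only three pseudo ranges the system above would have more unknowns than equations.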
4.4 | COMMUNICATION
The navigational signals transmitted by GPS satellites encode a variety of information including satellite positions, the state of the internal clocks, and the health of the network. These signals are transmitted on two separate carrier frequencies that are common to all satellites in the network. Two different encodings are used: a public encoding that enables lower resolution navigation, and an encrypted encoding used by the U.S. military.

4.5 | MESSAGE FORMAT
Table (4.1): GPS message format
Subframes | Description
1 | Satellite clock, GPS time relationship
2–3 | Ephemeris (precise satellite orbit)
4–5 | Almanac component (satellite network synopsis, error correction)

Each GPS satellite continuously broadcasts a navigation message on L1 C/A and L2 P/Y at a rate of 50 bits per second. Each complete message takes 750 seconds (12 1/2 minutes) to transmit. The message structure has a basic format of a 1500-bit-long frame made up of five subframes, each subframe being 300 bits (6 seconds) long. Subframes 4 and 5 are subcommutated 25 times each, so that a complete data message requires the transmission of 25 full frames. Each subframe consists of ten words, each 30 bits long. Thus, with 300 bits in a subframe, 5 subframes in a frame, and 25 frames in a message, each message is 37,500 bits long. At a transmission rate of 50 bps, this gives 750 seconds to transmit an entire almanac message. Each 30-second frame begins precisely on the minute or half minute as indicated by the atomic clock on each satellite.

The first part of the message encodes the week number and the time within the week, as well as the data about the health of the satellite. The second part of the
message, the ephemeris, provides the precise orbit for the satellite. The last part of the message, the almanac subcommutated in subframes 4 and 5, contains coarse orbit and status information for up to 32 satellites in the constellation as well as data related to error correction. Thus, in order to obtain an accurate satellite location from this transmitted message, the receiver must demodulate the message from each satellite it includes in its solution for 18 to 30 seconds. In order to collect all the transmitted almanacs, the receiver must demodulate the message for 732 to 750 seconds, or 12 1/2 minutes.

All satellites broadcast at the same frequencies. Signals are encoded using code division multiple access (CDMA), allowing messages from individual satellites to be distinguished from each other based on unique encodings for each satellite (which the receiver must know). Two distinct types of CDMA encodings are used: the coarse/acquisition (C/A) code, which is accessible by the general public, and the precise (P) code, which is encrypted so that only the U.S. military can access it.

The ephemeris is updated every 2 hours and is generally valid for 4 hours, with provisions for updates every 6 hours or longer in non-nominal conditions. The almanac is typically updated every 24 hours. Additionally, data for a few weeks following is uploaded in case of transmission updates that delay data upload.

4.6 | SATELLITE FREQUENCIES
Table (4.2): GPS frequency overview
Band | Frequency | Description
L1 | 1575.42 MHz | Coarse-acquisition (C/A) and encrypted precision P(Y) codes, plus the L1 civilian (L1C) and military (M) codes on future Block III satellites.
L2 | 1227.60 MHz | P(Y) code, plus the L2C and military codes on the Block IIR-M and newer satellites.
L3 | 1381.05 MHz | Used for nuclear detonation (NUDET) detection.
L4 | 1379.913 MHz | Being studied for additional ionospheric correction.
L5 | 1176.45 MHz | Proposed for use as a civilian safety-of-life (SoL) signal.

All satellites broadcast at the same two frequencies, 1.57542 GHz (L1 signal) and 1.2276 GHz (L2 signal). The satellite network uses a CDMA spread-spectrum technique where the low-bitrate message data is encoded with a high-rate pseudo-random noise (PRN) sequence that is different for each satellite. The receiver must know the PRN code of each satellite to reconstruct the actual message data. The C/A code, for civilian use, transmits data at 1.023 million chips per second, whereas the P code, for U.S. military use, transmits at 10.23 million chips per second. The L1 carrier is modulated by both the C/A and P codes, while the L2 carrier is only modulated by the P code. The P code can be encrypted as a so-called P(Y) code that is only available to military equipment with a proper decryption key. Both the C/A and P(Y) codes impart the precise time of day to the user.

The L3 signal at a frequency of 1.38105 GHz is used by the United States Nuclear Detonation (NUDET) Detection System (USNDS) to detect, locate, and report nuclear detonations (NUDETs) in the Earth's atmosphere and near space. One usage is the enforcement of nuclear test ban treaties.

The L4 band at 1.379913 GHz is being studied for additional ionospheric correction.

The L5 frequency band at 1.17645 GHz was added in the process of GPS modernization. This frequency falls into an internationally protected range for aeronautical navigation, promising little or no interference under all circumstances. The first Block IIF satellite providing this signal was launched in 2010. The L5 signal consists of two carrier components that are in phase quadrature with each other. Each carrier component is bi-phase shift key (BPSK) modulated by a separate bit train. L5, the third civil GPS signal, will eventually support safety-of-life applications for aviation and provide improved availability and accuracy.

4.7 | NAVIGATION EQUATIONS
The receiver uses messages received from satellites to determine the satellite positions and time sent.
The x, y, and z components of satellite position and the time sent are designated as $[x_i, y_i, z_i, t_i]$, where the subscript i denotes the satellite and has the value 1, 2, ..., n, where $n \ge 4$. Knowing when the message was received, $t_r$, the receiver computes the message's transit time as $t_r - t_i$. Note that the receiver actually knows only the reception time indicated by its on-board clock, $\bar{t}_r$, rather than $t_r$. Assuming the message traveled at the speed of light (c), the distance traveled is $(t_r - t_i)c$. Knowing the distance from receiver to satellite and the satellite's position implies that the receiver is on the surface of a sphere centered at the satellite's position. Thus the receiver is at or near the intersection of the
surfaces of the spheres. In the ideal case of no errors, the receiver is at the intersection of the surfaces of the spheres.

Let b denote the clock error or bias, the amount that the receiver's clock is off. The receiver has four unknowns, the three components of GPS receiver position and the clock bias, $[x, y, z, b]$. The equations of the sphere surfaces are given by:

$(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 = ([\bar{t}_r - b - t_i]c)^2, \quad i = 1, 2, \dots, n$

or, in terms of pseudoranges $p_i = (\bar{t}_r - t_i)c$, as

$p_i = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2} + bc, \quad i = 1, 2, \dots, n.$

These equations can be solved by algebraic or numerical methods.

4.8 | BANCROFT'S METHOD
Bancroft's method is algebraic rather than numerical and can be used for the case of four satellites or for the case of more than four satellites. If there are four satellites, Bancroft's method provides one or two solutions for the four unknowns. If there are more than four satellites, Bancroft's method provides the solution that minimizes the sum of the squares of the errors for the overdetermined system.

4.9 | TRILATERATION
The receiver can use trilateration and one-dimensional numerical root finding. Trilateration is used to determine the position based on three satellites' pseudoranges. In the usual case of two intersections, the point nearest the surface of the sphere corresponding to the fourth satellite is chosen. Let d denote the signed distance from the receiver position to the sphere around the fourth satellite.

4.10 | MULTIDIMENSIONAL NEWTON-RAPHSON CALCULATIONS
Alternatively, a multidimensional root-finding method such as the Newton-Raphson method can be used. The approach is to linearize around an approximate solution, say $[x^{(k)}, y^{(k)}, z^{(k)}, b^{(k)}]$ from iteration k, then solve the linear equations derived from the quadratic equations above to obtain $[x^{(k+1)}, y^{(k+1)}, z^{(k+1)}, b^{(k+1)}]$. Although there is no guarantee that the method always converges, due to the fact that multidimensional roots cannot be bounded,
when a neighborhood containing a solution is known, as is usually the case for GPS, it is quite likely that a solution will be found. It has been shown that results are comparable in accuracy to those of Bancroft's method.

4.11 | ADDITIONAL METHODS FOR MORE THAN FOUR SATELLITES
When more than four satellites are available, the calculation can use the four best or more than four, considering the number of channels, processing capability, and geometric dilution of precision (GDOP). Using more than four satellites yields an overdetermined system of equations that in general has no exact solution, so it must be solved by least squares or a similar technique. If all visible satellites are used, the results are as good as or better than using the four best. Errors can be estimated through the residuals. With each combination of four or more satellites, a GDOP factor can be calculated, based on the relative sky directions of the satellites used. As more satellites are picked up, pseudoranges from various 4-way combinations can be processed to add more estimates of the location and clock offset. The receiver then takes the weighted average of these positions and clock offsets. After the final location and time are calculated, the location is expressed in a specific coordinate system such as latitude and longitude, using the WGS 84 geodetic datum or a country-specific system.

4.12 | ERROR SOURCES AND ANALYSIS
Error analysis for the Global Positioning System is important for determining which errors are to be expected and how large they are. GPS errors are affected by geometric dilution of precision and depend on signal arrival time errors, numerical errors, atmospheric effects, ephemeris errors, multipath errors and other effects. The single largest source of error in modeling the orbital dynamics is variability in solar radiation pressure.
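The iterative approach of Sections 4.10 and 4.11 can be sketched in Python as a Newton-Raphson iteration that, with more than four satellites, solves each linearized step in the least-squares sense (often called Gauss-Newton). This is a minimal illustration under stated assumptions, not a receiver implementation: the function names are hypothetical, the clock bias is carried as a distance d = b·c in meters to keep the unknowns on similar scales, and the satellite coordinates in the usage example are synthetic.

```python
import math


def solve_linear(a, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_position(sats, pseudoranges, guess=(0.0, 0.0, 0.0, 0.0),
                   max_iterations=20, tol=1e-6):
    """Estimate [x, y, z, d] from n >= 4 satellites.

    sats: list of (x_i, y_i, z_i) in meters; pseudoranges: list of p_i in meters.
    d is the clock bias expressed as a distance (d = b * c), an assumption of this sketch.
    """
    est = list(guess)
    for _ in range(max_iterations):
        rows, resid = [], []
        for (sx, sy, sz), p in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            rows.append([dx / r, dy / r, dz / r, 1.0])  # Jacobian row
            resid.append(r + est[3] - p)                # equation residual
        # Normal equations (J^T J) step = -J^T f give the least-squares step.
        jtj = [[sum(row[a] * row[b] for row in rows) for b in range(4)]
               for a in range(4)]
        jtf = [sum(row[a] * f for row, f in zip(rows, resid)) for a in range(4)]
        step = solve_linear(jtj, [-v for v in jtf])
        est = [e + s for e, s in zip(est, step)]
        if max(abs(s) for s in step) < tol:
            break
    return est


# Synthetic usage example: five satellites, receiver near the Earth's surface.
sats = [(26560e3, 0.0, 0.0), (20000e3, 17000e3, 3000e3),
        (20000e3, -17000e3, -3000e3), (20000e3, 3000e3, 17000e3),
        (20000e3, -3000e3, -17000e3)]
truth = (6371000.0, 12345.0, -54321.0, 1500.0)
prs = [math.dist(s, truth[:3]) + truth[3] for s in sats]
estimate = solve_position(sats, prs)
```

Starting from the Earth's center is a common initial guess because it lies in the neighborhood where the iteration converges; a production receiver would additionally weight the satellites and check the residuals.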
4.13 | ACCURACY ENHANCEMENT AND SURVEYING
4.13.1 | Augmentation
Integrating external information into the calculation process can materially improve accuracy. Such augmentation systems are generally named or described
based on how the information arrives. Some systems transmit additional error information (such as clock drift, ephemeris, or ionospheric delay), others characterize prior errors, while a third group provides additional navigational or vehicle information. Examples of augmentation systems include the Wide Area Augmentation System (WAAS), the European Geostationary Navigation Overlay Service (EGNOS), Differential GPS, Inertial Navigation Systems (INS) and Assisted GPS.

4.13.2 | Precise Monitoring
Accuracy can be improved through precise monitoring and measurement of existing GPS signals in additional or alternate ways. The largest remaining error is usually the unpredictable delay through the ionosphere. The spacecraft broadcast ionospheric model parameters, but errors remain. This is one reason GPS spacecraft transmit on at least two frequencies, L1 and L2. Ionospheric delay is a well-defined function of frequency and the total electron content (TEC) along the path, so measuring the arrival time difference between the frequencies determines TEC and thus the precise ionospheric delay at each frequency.

Military receivers can decode the P(Y) code transmitted on both L1 and L2. Without decryption keys, it is still possible to use a codeless technique to compare the P(Y) codes on L1 and L2 to gain much of the same error information. However, this technique is slow, so it is currently available only on specialized surveying equipment. In the future, additional civilian codes are expected to be transmitted on the L2 and L5 frequencies. Then all users will be able to perform dual-frequency measurements and directly compute ionospheric delay errors.

A second form of precise monitoring is called Carrier-Phase Enhancement (CPGPS). This corrects the error that arises because the pulse transition of the PRN is not instantaneous, and thus the correlation (satellite-receiver sequence matching) operation is imperfect.
CPGPS uses the L1 carrier wave, which has a period of 1/(1575.42 MHz) ≈ 0.635 ns, which is about one-thousandth of the C/A Gold code bit period of 1/(1.023 MHz) ≈ 977.5 ns, to act as an additional clock signal and resolve the uncertainty. The phase difference error in normal GPS amounts to 2–3 meters (6.6–9.8 ft) of ambiguity. CPGPS working to within 1% of perfect transition reduces this error to 3 centimeters (1.2 in) of ambiguity. By eliminating this error source, CPGPS
coupled with DGPS normally realizes between 20 and 30 centimeters (7.9–12 in) of absolute accuracy.

Relative Kinematic Positioning (RKP) is a third alternative for a precise GPS-based positioning system. In this approach, the range signal can be resolved to a precision of less than 10 centimeters (3.9 in). This is done by resolving the number of cycles in which the signal is transmitted and received by the receiver, using a combination of differential GPS (DGPS) correction data, transmitted GPS signal phase information and ambiguity resolution techniques via statistical tests, possibly with processing in real time (real-time kinematic positioning, RTK).

4.14 | TIME KEEPING
4.14.1 | Timekeeping and leap seconds
While most clocks are synchronized to Coordinated Universal Time (UTC), the atomic clocks on the satellites are set to GPS time (GPST). The difference is that GPS time is not corrected to match the rotation of the Earth, so it does not contain leap seconds or other corrections that are periodically added to UTC. GPS time was set to match Coordinated Universal Time (UTC) in 1980, but has since diverged. The lack of corrections means that GPS time remains at a constant offset from International Atomic Time (TAI) (TAI – GPS = 19 seconds). Periodic corrections are performed on the on-board clocks to keep them synchronized with ground clocks.

The GPS navigation message includes the difference between GPS time and UTC. As of 2011, GPS time is 15 seconds ahead of UTC because of the leap second added to UTC on December 31, 2008. Receivers subtract this offset from GPS time to calculate UTC and specific time zone values. New GPS units may not show the correct UTC time until after receiving the UTC offset message.
The GPS-UTC offset field can accommodate 255 leap seconds (eight bits) that, given the current period of the Earth's rotation (with one leap second introduced approximately every 18 months), should be sufficient to last until approximately the year 2300.

4.14.2 | Timekeeping Accuracy
GPS time is accurate to about 14 nanoseconds.
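The offset subtraction described in Section 4.14.1 can be sketched in Python. This is a minimal illustration, assuming GPS time is supplied as a week number plus seconds into the week and counted from the GPS epoch of January 6, 1980; the function and parameter names are hypothetical, and the GPS-UTC offset is passed in explicitly because it changes whenever a leap second is added.

```python
from datetime import datetime, timedelta

# Start of GPS week zero (00:00:00 UTC on January 6, 1980).
GPS_EPOCH = datetime(1980, 1, 6)


def gps_to_utc(week, seconds_of_week, gps_minus_utc_seconds):
    """Convert a GPS week number and seconds-into-week to a UTC datetime.

    gps_minus_utc_seconds is the offset broadcast in the navigation message
    (15 seconds as of 2011, according to the text above).
    """
    gps_time = GPS_EPOCH + timedelta(weeks=week, seconds=seconds_of_week)
    return gps_time - timedelta(seconds=gps_minus_utc_seconds)


# Example: start of a (full, non-truncated) GPS week with the 2011 offset of 15 s.
print(gps_to_utc(1617, 0, 15))
```

Keeping the offset as an argument rather than a constant mirrors the receiver behavior described above: the correct UTC time is not available until the offset message has been received.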
4.14.3 | Timekeeping Format
As opposed to the year, month, and day format of the Gregorian calendar, the GPS date is expressed as a week number and a seconds-into-week number. The week number is transmitted as a ten-bit field in the C/A and P(Y) navigation messages, and so it becomes zero again every 1,024 weeks (19.6 years). GPS week zero started at 00:00:00 UTC (00:00:19 TAI) on January 6, 1980, and the week number became zero again for the first time at 23:59:47 UTC on August 21, 1999 (00:00:19 TAI on August 22, 1999). To determine the current Gregorian date, a GPS receiver must be provided with the approximate date (to within 3,584 days) to correctly translate the GPS date signal. To address this concern, the modernized GPS navigation message uses a 13-bit field that only repeats every 8,192 weeks (157 years), thus lasting until the year 2137 (157 years after GPS week zero).

4.14.4 | Carrier Phase Tracking (Surveying)
Another method that is used in surveying applications is carrier phase tracking. The period of the carrier frequency multiplied by the speed of light gives the wavelength, which is about 0.19 meters for the L1 carrier. Accuracy within 1% of wavelength in detecting the leading edge reduces this component of pseudorange error to as little as 2 millimeters. This compares to 3 meters for the C/A code and 0.3 meters for the P code.

However, 2 millimeter accuracy requires measuring the total phase (the number of waves multiplied by the wavelength, plus the fractional wavelength), which requires specially equipped receivers. This method has many surveying applications.

Triple differencing followed by numerical root finding, and a mathematical technique called least squares, can estimate the position of one receiver given the position of another. First, compute the difference between satellites, then between receivers, and finally between epochs. Other orders of taking differences are equally valid.
Detailed discussion of the errors is omitted.

The satellite carrier total phase can be measured with ambiguity as to the number of cycles. Let $\phi(r_i, s_j, t_k)$ denote the phase of the carrier of satellite j measured by receiver i at time $t_k$. This notation shows the meaning of the subscripts i, j, and k. The receiver (r), satellite (s), and time (t) come in alphabetical order as arguments of $\phi$, and to balance readability and conciseness, let $\phi_{i,j,k} = \phi(r_i, s_j, t_k)$ be a concise abbreviation. We also define three functions, $\Delta^r, \Delta^s, \Delta^t$, which return differences between receivers, satellites, and time points, respectively. Each function takes variables with three subscripts as its arguments. These three functions are defined below. If $\alpha_{i,j,k}$ is a function of the three integer arguments i, j, and k, then it is a valid argument for the functions $\Delta^r, \Delta^s, \Delta^t$, with the values defined as

$\Delta^r(\alpha_{i,j,k}) = \alpha_{i+1,j,k} - \alpha_{i,j,k}$,
$\Delta^s(\alpha_{i,j,k}) = \alpha_{i,j+1,k} - \alpha_{i,j,k}$, and
$\Delta^t(\alpha_{i,j,k}) = \alpha_{i,j,k+1} - \alpha_{i,j,k}$.

Also, if $\alpha_{i,j,k}$ and $\beta_{l,m,n}$ are valid arguments for the three functions and a and b are constants, then $(a\,\alpha_{i,j,k} + b\,\beta_{l,m,n})$ is a valid argument with values defined as

$\Delta^r(a\,\alpha_{i,j,k} + b\,\beta_{l,m,n}) = a\,\Delta^r(\alpha_{i,j,k}) + b\,\Delta^r(\beta_{l,m,n})$,
$\Delta^s(a\,\alpha_{i,j,k} + b\,\beta_{l,m,n}) = a\,\Delta^s(\alpha_{i,j,k}) + b\,\Delta^s(\beta_{l,m,n})$, and
$\Delta^t(a\,\alpha_{i,j,k} + b\,\beta_{l,m,n}) = a\,\Delta^t(\alpha_{i,j,k}) + b\,\Delta^t(\beta_{l,m,n})$.

Receiver clock errors can be approximately eliminated by differencing the phases measured from satellite 1 with that from satellite 2 at the same epoch. This difference is designated as

$\Delta^s(\phi_{1,1,1}) = \phi_{1,2,1} - \phi_{1,1,1}$.

Double differencing computes the difference of receiver 1's satellite difference from that of receiver 2. This approximately eliminates satellite clock errors. This double difference is:

$\Delta^r(\Delta^s(\phi_{1,1,1})) = \Delta^r(\phi_{1,2,1} - \phi_{1,1,1}) = \Delta^r(\phi_{1,2,1}) - \Delta^r(\phi_{1,1,1}) = (\phi_{2,2,1} - \phi_{1,2,1}) - (\phi_{2,1,1} - \phi_{1,1,1})$.

Triple differencing subtracts the receiver difference at time 1 from that at time 2. This eliminates the ambiguity associated with the integral number of wavelengths in carrier phase, provided this ambiguity does not change with time. Thus the triple difference result eliminates practically all clock bias errors and the integer ambiguity. Atmospheric delay and satellite ephemeris errors are also significantly reduced. This triple difference is:

$\Delta^t(\Delta^r(\Delta^s(\phi_{1,1,1})))$.

Triple difference results can be used to estimate unknown variables. For example, if the position of receiver 1 is known but the position of receiver 2 is unknown, it may be possible to estimate the position of receiver 2 using numerical root finding and least squares. Triple difference results for three independent time pairs quite possibly will be sufficient to solve for receiver 2's three position components. This may require the use of a numerical procedure. An approximation of receiver 2's
position is required to use such a numerical method. This initial value can probably be provided from the navigation message and the intersection of sphere surfaces. Such a reasonable estimate can be key to successful multidimensional root finding. Iterating from three time pairs and a fairly good initial value produces one observed triple difference result for receiver 2's position. Processing additional time pairs can improve accuracy, overdetermining the answer with multiple solutions. Least squares can estimate an overdetermined system. Least squares determines the position of receiver 2 that best fits the observed triple difference results for receiver 2 positions under the criterion of minimizing the sum of the squares.

4.15 | GPS NAVIGATION
A GPS navigation device is any device that receives Global Positioning System (GPS) signals for the purpose of determining the device's current location on Earth. GPS devices provide latitude and longitude information, and some may also calculate altitude, although this is not considered sufficiently accurate or continuously available enough (due to the possibility of signal blockage and other factors) to rely on exclusively to pilot aircraft. GPS devices are used in military, aviation, marine and consumer product applications.

GPS devices may also have additional capabilities such as:
- containing all types of maps, like street maps, which may be displayed in human-readable format via text or in a graphical format
- providing suggested directions to a human in charge of a vehicle or vessel via text or speech
- providing directions directly to an autonomous vehicle such as a robotic probe
- providing information on traffic conditions (either via historical or real-time data) and suggesting alternative directions
- providing information on nearby amenities such as restaurants, fueling stations, etc.
In other words, all GPS devices can answer the question "Where am I?", and may also be able to answer:
- Which roads or paths are available to me now?
- Which roads or paths should I take in order to get to my desired destination?
- If some roads are usually busy at this time or are busy right now, what would be a better route to take?
- Where can I get something to eat nearby or where can I get fuel for my vehicle?
    • CHAPTER 5 Ultrasound
    • Chapter 5 | Ultrasound
5.1 | INTRODUCTION
Ultrasound is a mechanical disturbance that moves as a pressure wave through a medium. When the medium is a patient, the wavelike disturbance is the basis for use of ultrasound as a diagnostic tool. Appreciation of the characteristics of ultrasound waves and their behavior in various media is essential to understanding the use of diagnostic ultrasound in clinical medicine.

5.1.1 | History
In 1880, French physicists Pierre and Jacques Curie discovered the piezoelectric effect [7]. French physicist Paul Langevin attempted to develop piezoelectric materials as senders and receivers of high-frequency mechanical disturbances (ultrasound waves) through materials [8]. His specific application was the use of ultrasound to detect submarines during World War I. This technique, sound navigation and ranging (SONAR), finally became practical during World War II. Industrial uses of ultrasound began in 1928 with the suggestion of Soviet physicist Sokolov that it could be used to detect hidden flaws in materials. Medical uses of ultrasound through the 1930s were confined to therapeutic applications such as cancer treatments and physical therapy for various ailments. Diagnostic applications of ultrasound began in the late 1940s through collaboration between physicians and engineers familiar with SONAR.

5.2 | WAVE MOTION
A fluid medium is a collection of molecules that are in continuous random motion. The molecules are represented as filled circles in the margin figure. When no external force is applied to the medium, the molecules are distributed more or less uniformly (A). When a force is applied to the medium (represented by movement of the piston from left to right in B), the molecules are concentrated in front of the piston, resulting in an increased pressure at that location. The region of increased pressure is termed a zone of compression.
Because of the forward motion imparted to the molecules by the piston, the region of increased pressure begins to migrate away from the piston and through the medium. That is, a mechanical disturbance introduced into the medium travels through the medium in a direction away from the source of the disturbance. In
clinical applications of ultrasound, the piston is replaced by an ultrasound transducer.

As the zone of compression begins its migration through the medium, the piston may be withdrawn from right to left to create a region of reduced pressure immediately behind the compression zone. Molecules from the surrounding medium move into this region to restore it to normal particle density; and a second region, termed a zone of rarefaction, begins to migrate away from the piston (C). That is, the compression zone (high pressure) is followed by a zone of rarefaction (low pressure) also moving through the medium.

If the piston is displaced again to the right, a second compression zone is established that follows the zone of rarefaction through the medium. If the piston oscillates continuously, alternate zones of compression and rarefaction are propagated through the medium, as illustrated in D. The propagation of these zones establishes a wave disturbance in the medium. This disturbance is termed a longitudinal wave because the motion of the molecules in the medium is parallel to the direction of wave propagation. A wave with a frequency between about 20 and 20,000 Hz is a sound wave that is audible to the human ear. An infrasonic wave is a sound wave below 20 Hz; it is not audible to the human ear. An ultrasound (or ultrasonic) wave has a frequency greater than 20,000 Hz and is also inaudible. In clinical diagnosis, ultrasound waves of frequencies between 1 and 20 MHz are used.

As a longitudinal wave moves through a medium, molecules at the edge of the wave slide past one another. Resistance to this shearing effect causes these molecules to move somewhat in a direction away from the moving longitudinal wave. This transverse motion of molecules along the edge of the longitudinal wave establishes shear waves that radiate transversely from the longitudinal wave. In general, shear waves are significant only in a rigid medium such as a solid.
In biologic tissues, bone is the only medium in which shear waves are important.

5.3 | WAVE CHARACTERISTICS
A zone of compression and an adjacent zone of rarefaction constitute one cycle of an ultrasound wave. A wave cycle can be represented as a graph of local pressure (particle density) in the medium versus distance in the direction of the ultrasound wave. The distance covered by one cycle is the wavelength of the ultrasound wave. The number of cycles introduced into the medium each second (cps, or simply sec⁻¹) is referred to as the frequency of the wave, expressed in units of hertz, kilohertz, or megahertz, where 1 Hz equals 1 cps. The maximum height of the wave cycle is the amplitude of the ultrasound wave. The product of the frequency (ν) and the wavelength (λ) is the velocity of the wave; that is, c = νλ. In most soft tissues, the velocity of ultrasound is about 1540 m/sec. Frequencies of 1 MHz and greater are required to furnish ultrasound wavelengths suitable for diagnostic imaging.

When two waves meet, they are said to “interfere” with each other.
There are two extremes of interference. In constructive interference the waves are “in phase” (i.e., peak meets peak). In destructive interference the waves are “out of phase” (i.e., peak meets valley). Waves undergoing constructive interference add their amplitudes, whereas waves undergoing destructive interference may completely cancel each other.
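The relation c = νλ and the two extremes of interference can be checked numerically with a short Python sketch. The function names are illustrative assumptions; the interference formula is the standard result for the amplitude of the sum of two sinusoids of the same frequency with a given phase difference.

```python
import math


def wavelength_mm(velocity_m_per_s, frequency_hz):
    """Wavelength in millimeters from c = (frequency) x (wavelength)."""
    return velocity_m_per_s / frequency_hz * 1000.0


def combined_amplitude(a1, a2, phase_difference_rad):
    """Amplitude of the sum of two same-frequency sinusoids.

    Phase difference 0 gives constructive interference (a1 + a2);
    phase difference pi gives destructive interference (|a1 - a2|).
    """
    squared = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * math.cos(phase_difference_rad)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative rounding


# Soft tissue (about 1540 m/sec): wavelength at 1 MHz and at 5 MHz.
print(wavelength_mm(1540.0, 1e6), wavelength_mm(1540.0, 5e6))
# Two equal waves in phase, then out of phase.
print(combined_amplitude(1.0, 1.0, 0.0), combined_amplitude(1.0, 1.0, math.pi))
```

Intermediate phase differences give amplitudes between these two extremes, which is the basis of the random interference patterns seen in echoes from many scatterers.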
Fig. (5.1): Characteristics of an ultrasound wave

Table (5-1): Frequency Classification of Ultrasound
Frequency (Hz) | Classification
20–20,000 | Audible sound
20,000–1,000,000 | Ultrasound
1,000,000–30,000,000 | Diagnostic medical ultrasound

Table (5-2): Quantities and Units Pertaining to Ultrasound Intensity
Quantity | Definition | Unit | Relationship
Energy (E) | Ability to do work | joule |
Power (P) | Rate at which energy is transported | watt (joule/sec) | P = E/t
Intensity (I) | Power per unit area (a), where t = time | watt/cm² | I = P/a = E/((t)(a))

5.4 | ULTRASOUND INTENSITY
Ultrasound frequencies of 1 MHz and greater correspond to ultrasound wavelengths of about 1.5 mm or less in human soft tissue (λ = c/ν, with c ≈ 1540 m/sec). As an ultrasound wave passes through a medium, it transports energy through the medium. The rate of energy transport is known as “power.” Medical ultrasound is produced in beams that are usually focused into a small area, and the beam is described in terms of the power per unit area, defined as the beam's “intensity.” The relationships among the quantities and units pertaining to intensity are summarized in Table 5-2.
Intensity is usually described relative to some reference intensity. For example, the intensity of ultrasound waves sent into the body may be compared with that of the ultrasound reflected back to the surface by structures in the body. For many clinical situations the reflected waves at the surface may be as much as a hundredth or so of the intensity of the transmitted waves. Waves reflected from structures at depths of 10 cm or more below the surface may be lowered in intensity by a much larger factor. A logarithmic scale is most appropriate for recording data over a range of many orders of magnitude. In acoustics, the decibel scale is used, with the decibel defined as

$\mathrm{dB} = 10 \log \frac{I}{I_0}$   (5-1)

where $I_0$ is the reference intensity. Table 5-3 shows examples of decibel values for certain intensity ratios. Several rules can be extracted from this table:
- Positive decibel values result when a wave has a higher intensity than the reference wave; negative values denote a wave with lower intensity.
- Increasing a wave's intensity by a factor of 10 adds 10 dB to the intensity, and reducing the intensity by a factor of 10 subtracts 10 dB.
- Doubling the intensity adds 3 dB, and halving subtracts 3 dB.

Table (5-3): Calculation of Decibel Values from Intensity Ratios and Amplitude Ratios
Ratio of ultrasound wave parameters | Intensity ratio (I/I0) (dB) | Amplitude ratio (A/A0) (dB)
1000 | 30 | 60
100 | 20 | 40
10 | 10 | 20
2 | 3 | 6
1 | 0 | 0
1/2 | −3 | −6
1/10 | −10 | −20
1/100 | −20 | −40
1/1000 | −30 | −60

No universal standard reference intensity exists for ultrasound. Thus the statement “ultrasound at 50 dB was used” is nonsensical. However, a statement such as “the returning echo was 50 dB below the transmitted signal” is
informative. The transmitted signal then becomes the reference intensity for this particular application. For audible sound, a statement such as “a jet engine produces sound at 100 dB” is appropriate because there is a generally accepted reference intensity of $10^{-16}$ W/cm² for audible sound. A 1-kHz tone (roughly the musical note C two octaves above middle C) at this intensity is barely audible to most listeners. A 1-kHz note at 120 dB ($10^{-4}$ W/cm²) is painfully loud.

Because intensity is power per unit area and power is energy per unit time (Table 5-2), Eq. (5-1) may be used to compare the power or the energy contained within two ultrasound waves. Thus we could also write

$\mathrm{dB} = 10 \log \frac{\mathrm{Power}}{\mathrm{Power}_0} = 10 \log \frac{E}{E_0}$

Ultrasound wave intensity is related to maximum pressure ($P_m$) in the medium by the following expression:

$I = \frac{P_m^2}{2 \rho c}$   (5-2)

where ρ is the density of the medium in grams per cubic centimeter and c is the speed of sound in the medium. Substituting Eq. (5-2) for $I$ and $I_0$ in Eq. (5-1) yields

$\mathrm{dB} = 10 \log \frac{P_m^2 / 2\rho c}{(P_m)_0^2 / 2\rho c} = 10 \log \left[ \frac{P_m}{(P_m)_0} \right]^2 = 20 \log \frac{P_m}{(P_m)_0}$   (5-3)

When comparing the pressure of two waves, Eq. (5-3) may be used directly. That is, the pressure does not have to be converted to intensity to determine the decibel value.

An ultrasound transducer converts pressure amplitudes received from the patient (i.e., the reflected ultrasound wave) into voltages. The amplitude of voltages recorded for ultrasound waves is directly proportional to the variations in pressure in the reflected wave. The decibel value for the ratio of two waves may be calculated from Eq. (5-1) or from Eq. (5-3), depending upon the information that is available concerning the waves. The “half-power value” (ratio of 0.5 in power between two waves) is –3 dB, whereas the “half-amplitude value” (ratio of 0.5 in amplitude) is –6 dB (Table 5-3). This difference reflects the greater sensitivity of the decibel scale to amplitude compared with intensity values.
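The decibel relations for intensity ratios and for amplitude (pressure or voltage) ratios can be checked with a short Python sketch. The function names are illustrative assumptions; the formulas are the 10 log and 20 log definitions used in this section.

```python
import math


def intensity_db(intensity, reference_intensity):
    """Decibel value of an intensity (or power, or energy) ratio: 10 log10(I / I0)."""
    return 10.0 * math.log10(intensity / reference_intensity)


def amplitude_db(amplitude, reference_amplitude):
    """Decibel value of a pressure or voltage amplitude ratio: 20 log10(A / A0)."""
    return 20.0 * math.log10(amplitude / reference_amplitude)


# Reproduce a few rows of the decibel table: ratios of 1000, 2 and 1/2.
for ratio in (1000.0, 2.0, 0.5):
    print(ratio, round(intensity_db(ratio, 1.0), 1), round(amplitude_db(ratio, 1.0), 1))

# Audible sound at 1e-4 W/cm^2 relative to the 1e-16 W/cm^2 reference.
print(round(intensity_db(1e-4, 1e-16)))
```

The same ratio of 0.5 gives about −3 dB as an intensity ratio but about −6 dB as an amplitude ratio, because intensity varies with the square of the amplitude.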
    • 5.5 | ULTRASOUND VELOCITY
The velocity of an ultrasound wave through a medium varies with the physical properties of the medium. In low-density media such as air and other gases, molecules may move over relatively large distances before they influence neighboring molecules. In these media, the velocity of an ultrasound wave is relatively low. In solids, molecules are constrained in their motion, and the velocity of ultrasound is relatively high. Liquids exhibit ultrasound velocities intermediate between those in gases and solids. With the notable exceptions of lung and bone, biologic tissues yield velocities roughly similar to the velocity of ultrasound in liquids. In different media, changes in velocity are reflected in changes in wavelength of the ultrasound waves, with the frequency remaining relatively constant. In ultrasound imaging, variations in the velocity of ultrasound in different media introduce artifacts into the image, with the major artifacts attributable to bone, fat, and, in ophthalmologic applications, the lens of the eye.

Table (5-4): Approximate Velocities of Ultrasound in Selected Materials
Nonbiologic Material    Velocity (m/sec)    Biologic Material           Velocity (m/sec)
Acetone                 1174                Fat                         1475
Air                     331                 Brain                       1560
Aluminum (rolled)       6420                Liver                       1570
Brass                   4700                Kidney                      1560
Ethanol                 1207                Spleen                      1570
Glass (Pyrex)           5640                Blood                       1570
Acrylic plastic         2680                Muscle                      1580
Mercury                 1450                Lens of eye                 1620
Nylon (6-6)             2620                Skull bone                  3360
Polyethylene            1950                Soft tissue (mean value)    1540
Water (distilled),      1498
Water (distilled),      1540

The velocities of ultrasound in various media are listed in Table 5-4. The velocity of an ultrasound wave should be distinguished from the velocity of molecules whose displacement into zones of compression and rarefaction constitutes the wave. The molecular velocity describes the velocity of the individual molecules in the medium, whereas the wave velocity describes the
    • velocity of the ultrasound wave through the medium. Properties of ultrasound such as reflection, transmission, and refraction are characteristic of the wave velocity rather than the molecular velocity.
5.6 | ATTENUATION OF ULTRASOUND
As an ultrasound beam penetrates a medium, energy is removed from the beam by absorption, scattering, and reflection. These processes are summarized in Figure (5-2). As with x rays, the term attenuation refers to any mechanism that removes energy from the ultrasound beam. Ultrasound is “absorbed” by the medium if part of the beam’s energy is converted into other forms of energy, such as an increase in the random motion of molecules. Ultrasound is “reflected” if there is an orderly deflection of all or part of the beam. If part of an ultrasound beam changes direction in a less orderly fashion, the event is usually described as “scatter.”
Constructive and destructive interference effects characterize the echoes from nonspecular reflections. Because the sound is reflected in all directions, there are many opportunities for waves to travel different pathways. The wave fronts that return to the transducer may constructively or destructively interfere at random. The random interference pattern is known as “speckle.”
Fig. (5.2)
    • The behavior of a sound beam when it encounters an obstacle depends upon the size of the obstacle compared with the wavelength of the sound. If the obstacle’s size is large compared with the wavelength of sound (and if the obstacle is relatively smooth), then the beam retains its integrity as it changes direction. Part of the sound beam may be reflected and the remainder transmitted through the obstacle as a beam of lower intensity. If the size of the obstacle is comparable to or smaller than the wavelength of the ultrasound, the obstacle will scatter energy in various directions. Some of the ultrasound energy may return to its original source after “nonspecular” scatter, but probably not until many scatter events have occurred. In ultrasound imaging, specular reflection permits visualization of the boundaries between organs, and nonspecular reflection permits visualization of tissue parenchyma (Figure 5-2). Structures in tissue such as collagen fibers are smaller than the wavelength of ultrasound. Such small structures provide scatter that returns to the transducer through multiple pathways. The sound that returns to the transducer from such nonspecular reflectors is no longer a coherent beam. It is instead the sum of a number of component waves that produces a complex pattern of constructive and destructive interference back at the source. This interference pattern, known as “speckle,” provides the characteristic ultrasonic appearance of complex tissue such as liver. The behavior of a sound beam as it encounters an obstacle such as an interface between structures in the medium is summarized in Figure 5-3. As illustrated in Figure (5-4), the energy remaining in the beam decreases approximately exponentially with the depth of penetration of the beam into the medium. The reduction in energy (i.e., the decrease in ultrasound intensity) is described in decibels, as noted earlier.
5.7 | REFLECTION
In most diagnostic applications of ultrasound, use is made of ultrasound waves reflected from interfaces between different tissues in the patient. The fraction of the impinging energy reflected from an interface depends on the difference in acoustic impedance of the media on opposite sides of the interface.
    • The acoustic impedance Z of a medium is the product of the density ρ of the medium and the velocity c of ultrasound in the medium:
$$Z = \rho c$$
Acoustic impedances of several materials are listed in Table (5-5). For an ultrasound wave incident perpendicularly upon an interface, the fraction of the incident energy that is reflected (i.e., the reflection coefficient α_R) is
$$\alpha_R = \left( \frac{Z_2 - Z_1}{Z_2 + Z_1} \right)^2$$
where Z1 and Z2 are the acoustic impedances of the two media. The fraction of the incident energy that is transmitted across an interface is described by the transmission coefficient α_T, where
$$\alpha_T = \frac{4 Z_1 Z_2}{(Z_1 + Z_2)^2}$$
Obviously α_R + α_T = 1. With a large impedance mismatch at an interface, much of the energy of an ultrasound wave is reflected, and only a small amount is transmitted across the interface. For example, ultrasound beams are reflected strongly at air–tissue and air–water interfaces because the impedance of air is much less than that of tissue or water.

Table (5-5): Approximate Acoustic Impedances of Selected Materials
Material                                    Acoustic Impedance (×10⁶ kg·m⁻²·s⁻¹)
Air at standard temperature and pressure    0.0004
Water                                       1.50
Polyethylene                                1.85
Plexiglas                                   3.20
Aluminum                                    18.0
Mercury                                     19.5
Brass                                       38.0
Fat                                         1.38
Aqueous and vitreous humor of eye           1.50
Brain                                       1.55
Blood                                       1.61
Kidney                                      1.62
Human soft tissue, mean value               1.63
Spleen                                      1.64
Liver                                       1.65
Muscle                                      1.70
Lens of eye                                 1.85
Skull bone                                  6.10
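To make the effect of impedance mismatch concrete, the sketch below evaluates the reflection and transmission coefficients for a few interfaces, using impedance values from Table (5-5). The function names and the choice of interfaces are illustrative assumptions; only the ratio of the impedances matters, so the common scale factor of the table cancels.

```python
def reflection_coefficient(z1, z2):
    """Fraction of incident energy reflected at normal incidence."""
    return ((z2 - z1) / (z2 + z1)) ** 2


def transmission_coefficient(z1, z2):
    """Fraction of incident energy transmitted at normal incidence."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


if __name__ == "__main__":
    # Impedances taken from Table (5-5), all in the same units.
    interfaces = {
        "air / water": (0.0004, 1.50),
        "fat / muscle": (1.38, 1.70),
        "soft tissue / skull bone": (1.63, 6.10),
    }
    for name, (z1, z2) in interfaces.items():
        r = reflection_coefficient(z1, z2)
        t = transmission_coefficient(z1, z2)
        print(f"{name:26s} reflected {r:.4f}, transmitted {t:.4f}")
```

The air/water case reflects almost all of the incident energy, while the fat/muscle interface reflects only about one percent, which is consistent with the discussion of impedance mismatch above.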
    • [Figure: Range of echoes from biologic interfaces and selection of internal echoes to be displayed over the major portion of the gray scale in an ultrasound unit. (From Kossoff, G., et al. [12] Used with permission.)]
The ultrasound waves will enter the patient with little reflection at the skin surface. Similarly, strong reflections of ultrasound occur at the boundary between the chest wall and the lungs and at the millions of air–tissue interfaces within the lungs. Because of the large impedance mismatch at these interfaces, efforts to use ultrasound as a diagnostic tool for the lungs have been unrewarding. The impedance mismatch is also high between soft tissues and bone, and the use of ultrasound to identify tissue characteristics in regions behind bone has had limited success. The discussion of ultrasound reflection above assumes that the ultrasound beam strikes the reflecting interface at a right angle. In the body, ultrasound impinges upon interfaces at all angles. For any angle of incidence, the angle at which the reflected ultrasound energy leaves the interface equals the angle of incidence of the ultrasound beam; that is,
Angle of incidence = Angle of reflection
In a typical medical examination that uses reflected ultrasound and a transducer that both transmits and detects ultrasound, very little reflected energy will be detected if the ultrasound strikes the interface at an angle more than about 3 degrees from perpendicular. A smooth reflecting interface must be essentially perpendicular to the ultrasound beam to permit visualization of the interface.
Fig. (5.3)
    • 5.8 | REFRACTION
As an ultrasound beam crosses an interface obliquely between two media, its direction is changed (i.e., the beam is bent). If the velocity of ultrasound is higher in the second medium, then the beam enters this medium at a more oblique (less steep) angle. This behavior of ultrasound transmitted obliquely across an interface is termed refraction. The relationship between incident and refraction angles is described by Snell’s law:
$$\frac{\sin \theta_i}{\sin \theta_r} = \frac{c_i}{c_r} \qquad (5\text{-}4)$$
where θi and θr are the angles of incidence and refraction, and ci and cr are the velocities of ultrasound in the incidence and refracting media. For example, an ultrasound beam incident obliquely upon an interface between muscle (velocity 1580 m/sec) and fat (velocity 1475 m/sec) will enter the fat at a steeper angle. If an ultrasound beam impinges very obliquely upon a medium in which the ultrasound velocity is higher, the beam may be refracted so that no ultrasound energy enters the medium. The incidence angle at which refraction causes no ultrasound to enter a medium is termed the critical angle θc. For the critical angle, the angle of refraction is 90 degrees, and the sine of 90 degrees is 1. From Eq. (5-4):
$$\frac{\sin \theta_c}{\sin 90^\circ} = \frac{c_i}{c_r}$$
But sin 90° = 1. Therefore
$$\theta_c = \sin^{-1} \left[ \frac{c_i}{c_r} \right]$$
where sin⁻¹, or arcsine, refers to the angle whose sine is ci/cr. For any particular interface, the critical angle depends only upon the velocity of ultrasound in the two media separated by the interface.
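The muscle/fat example can be worked numerically. The sketch below applies Snell’s law, Eq. (5-4), with the velocities quoted in the text; the function names and the 30-degree incidence angle are assumptions chosen only for illustration.

```python
import math


def refraction_angle_deg(incidence_deg, c_incident, c_refracted):
    """Refraction angle from Snell's law, or None if no energy enters
    the second medium (incidence beyond the critical angle)."""
    sin_r = math.sin(math.radians(incidence_deg)) * c_refracted / c_incident
    if sin_r > 1.0:
        return None
    return math.degrees(math.asin(sin_r))


def critical_angle_deg(c_incident, c_refracted):
    """Critical angle of incidence; it exists only when the velocity
    is higher in the second medium."""
    if c_refracted <= c_incident:
        return None
    return math.degrees(math.asin(c_incident / c_refracted))


if __name__ == "__main__":
    C_MUSCLE = 1580.0  # m/sec, from the text
    C_FAT = 1475.0     # m/sec, from the text

    # Muscle -> fat: the beam bends toward the perpendicular (steeper angle).
    print(refraction_angle_deg(30.0, C_MUSCLE, C_FAT))

    # Fat -> muscle: a critical angle exists because muscle is the faster medium.
    print(critical_angle_deg(C_FAT, C_MUSCLE))
```

For these velocities the critical angle for a beam passing from fat into muscle comes out at roughly 69 degrees from the perpendicular; for a beam passing from muscle into fat no critical angle exists.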
    • Refraction is a principal cause of artifacts in clinical ultrasound images. For example, the ultrasound beam is refracted at a steeper angle as it crosses the interface between medium 1 and 2 (c1 > c2). As the beam emerges from medium 2 and reenters medium 1, it resumes its original direction of motion. The presence of medium 2 simply displaces the ultrasound beam laterally for a distance that depends upon the difference in ultrasound velocity and density in the two media and upon the thickness of medium 2. Suppose a small structure below medium 2 is visualized by reflected ultrasound. The position of the structure would appear to the viewer as an extension of the original direction of the ultrasound through medium 1. In this manner, refraction adds spatial distortion and resolution loss to ultrasound images.
5.9 | ABSORPTION
Relaxation processes are the primary mechanisms of energy dissipation for an ultrasound beam traversing tissue. These processes involve (a) removal of energy from the ultrasound beam and (b) eventual dissipation of this energy primarily as heat. As discussed earlier, ultrasound is propagated by displacement of molecules of a medium into regions of compression and rarefaction. This displacement requires energy that is provided to the medium by the source of ultrasound. As the molecules attain maximum displacement from an equilibrium position, their motion stops, and their energy is transformed from kinetic energy associated with motion to potential energy associated with position in the compression zone. From this position, the molecules begin to move in the opposite direction, and potential energy is gradually transformed into kinetic energy. The maximum kinetic energy (i.e., the highest molecular velocity) is achieved when the molecules pass through their original equilibrium position, where the displacement and potential energy are zero.
If the kinetic energy of the molecule at this position equals the energy absorbed originally from the ultrasound beam, then no dissipation of energy has occurred, and the medium is an ideal transmitter of ultrasound. Actually, the conversion of kinetic to potential energy (and vice versa) is always accompanied by some dissipation of energy. Therefore, the energy of the ultrasound beam is gradually reduced as it passes through the medium. This reduction is termed relaxation energy loss. The rate at which the beam energy decreases is a reflection of the attenuation properties of the medium.
    • The effect of frequency on the attenuation of ultrasound in different media is described in Table (5-7) [14–18]. Data in this table are reasonably good estimates of the influence of frequency on ultrasound absorption over the range of ultrasound frequencies used diagnostically. However, complicated structures such as tissue samples often exhibit a rather complex attenuation pattern for different frequencies, which probably reflects the existence of a variety of relaxation frequencies and other molecular energy absorption processes that are poorly understood at present. These complex attenuation patterns are reflected in the data in Figure (5.3).

Table (5-7): Variation of Ultrasound Attenuation Coefficient α with Frequency in Megahertz, Where α₁ Is the Attenuation Coefficient at 1 MHz
Tissue or Material (Frequency Variation entries missing): Blood; Lung; Fat; Liver; Muscle (across fibers); Brain; Muscle (along fibers); Kidney; Aqueous and vitreous humor of eye; Lens of eye; Spinal cord; Water; Castor oil; Skull bone; Lucite

If gas bubbles are present in a material through which a sound wave is passing, the compressions and rarefactions cause the bubble to shrink and expand in resonance with the sound wave. The oscillation of such bubbles is referred to as stable cavitation. Stable cavitation is not a major mechanism for absorption at ultrasound intensities used diagnostically, but it can be a significant source of scatter. If an ultrasound beam is intense enough and of the right frequency, the ultrasound-induced mechanical disturbance of the medium can be so great that microscopic bubbles are produced in the medium. The bubbles are formed at foci, such as molecules in the rarefaction zones, and may grow to a cubic millimeter or so in size.
As the pressure in the rarefaction zone increases during the next phase of the ultrasound cycle, the bubbles shrink to 10⁻² mm³ or so and collapse, thereby creating minute shock waves that seriously disturb the medium if produced in large quantities. The effect, termed dynamic cavitation, produces high temperatures (up
    • to 10,000 °C) at the point where the collapse occurs [19]. Dynamic cavitation is associated with absorption of energy from the beam. Free radicals are also produced in water surrounding the collapse. Dynamic cavitation is not a significant mechanism of attenuation at diagnostic intensities, although there is evidence that it may occur under certain conditions.
5.10 | HARDWARE PART
5.10.1 | Introduction:
We use the HC-SR04 ultrasound sensor, which transmits ultrasound waves and uses their reflection to measure the range between the user and any barrier in the way. The distance is calculated from the time the waves take to travel from the source to the target and back from the target to the receiver:
Velocity of sound: v = 340 m/s
Time needed for the round trip, in seconds: t
Total distance = v × t
Distance to the barrier = total distance / 2
Technical specifications: This project started after I looked at the Polaroid Ultrasonic Ranging module. It has a number of disadvantages for use in small robots etc.
1. The maximum range of 10.7 meters is far more than is normally required, and as a result
2. The current consumption, at 2.5 A during the sonic burst, is truly horrendous.
3. The 150 mA quiescent current is also far too high.
4. The minimum range of 26 cm is useless. 1-2 cm is more like it.
5. The module is quite large to fit into small systems, and
    • 6. It’s EXPENSIVE.
The SRF04 was designed to be just as easy to use as the Polaroid sonar, requiring a short trigger pulse and providing an echo pulse. Your controller only has to time the length of this pulse to find the range. The connections to the SRF04 are shown below:
Fig. (5-7): SRF04 Connections
The SRF04 timing diagram is shown in Fig. (5.9). You only need to supply a short 10 µs pulse to the trigger input to start the ranging. The SRF04 will send out an 8-cycle burst of ultrasound at 40 kHz and raise its echo line high. It then listens for an echo, and as soon as it detects one it lowers the echo line again. The echo line is therefore a pulse whose width is proportional to the distance to the object. By timing the pulse it is possible to calculate the range in inches, centimeters, or any other unit. If nothing is detected then the SRF04 will lower its echo line anyway after about 36 ms. The schematic is shown in Fig. (5.8).
    • Fig. (5.8): SRF04 schematic
Fig. (5.9): SRF04 Timing Diagram
The circuit is designed to be low cost. It uses a PIC12C508 to perform the control functions and standard 40 kHz piezo transducers. The simplest way to drive the transmitting transducer would be directly from the PIC. The 5 V drive can give a useful range for large objects, but can be problematic detecting
    • smaller objects. The transducer can handle 20 V of drive, so I decided to get up close to this level. A MAX232 IC, usually used for RS232 communication, makes an ideal driver, providing about 16 V of drive. The receiver is a classic two-stage op-amp circuit. The input capacitor C8 blocks some residual DC which always seems to be present. Each gain stage is set to 24 for a total gain of about 576. This is close to the maximum gain of 25 available per stage using the LM1458: the gain-bandwidth product of the LM1458 is 1 MHz, so the maximum gain at 40 kHz is 1000000/40000 = 25. The output of the amplifier is fed into an LM311 comparator. A small amount of positive feedback provides some hysteresis to give a clean, stable output. The problem of getting operation down to 1-2 cm is that the receiver will pick up direct coupling from the transmitter, which is right next to it. To make matters worse, the piezo transducer is a mechanical object that keeps resonating for some time after the drive has been removed, up to 1 ms depending on when you decide it has stopped. It is much harder to tell the difference between this direct-coupled ringing and a returning echo, which is why many designs, including the Polaroid module, simply blank out this period. Looking at the returning echo on an oscilloscope shows that it is much larger in magnitude at close quarters than the cross-coupled signal. I therefore adjust the detection threshold during this time so that only the echo is detectable. The 100 nF capacitor C10 is charged to about –6 V during the burst. This discharges quite quickly through the 10 kΩ resistor R6 to restore sensitivity for more distant echoes. A convenient negative voltage for the op-amp and comparator is generated by the MAX232. Unfortunately, this also generates quite a bit of high-frequency noise. I therefore shut it down whilst listening for the echo. The 10 µF capacitor C9 holds the negative rail just long enough to do this.
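The gain figures quoted for the receiver can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (1 MHz gain-bandwidth product, 40 kHz signal, two stages with a gain of 24 each); the function and constant names are illustrative.

```python
GAIN_BANDWIDTH_HZ = 1_000_000  # LM1458 gain-bandwidth product, from the text
SIGNAL_HZ = 40_000             # ultrasonic burst frequency
STAGE_GAIN = 24                # gain chosen for each op-amp stage
NUM_STAGES = 2


def max_stage_gain(gain_bandwidth_hz, signal_hz):
    """Largest gain one stage can provide at the signal frequency."""
    return gain_bandwidth_hz / signal_hz


def total_gain(stage_gain, num_stages):
    """Overall gain of identical cascaded stages."""
    return stage_gain ** num_stages


if __name__ == "__main__":
    limit = max_stage_gain(GAIN_BANDWIDTH_HZ, SIGNAL_HZ)
    print(f"maximum gain per stage at 40 kHz: {limit:.0f}")
    print(f"chosen stage gain within limit: {STAGE_GAIN <= limit}")
    print(f"total receiver gain: {total_gain(STAGE_GAIN, NUM_STAGES)}")
```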
In operation, the processor waits for an active-low trigger pulse to come in. It then generates just eight cycles of 40 kHz. The echo line is then raised to signal the host processor to start timing. The raising of the echo line also shuts off the MAX232. After a while, normally no more than 10-12 ms, the returning echo will be detected and the PIC will lower the echo line. The width of this pulse represents the flight time of the sonic burst. If no echo is detected then it will automatically time out after about 30 ms (twice the WDT period of the PIC). Because the
    • MAX232 is shut down during echo detection, you must wait at least 10 ms between measurement cycles for the ±10 V rails to recharge. Performance of this design is, I think, quite good. It will reliably measure down to 3 cm and will continue detecting down to 1 cm or less, but below 2-3 cm the pulse width doesn’t get any smaller. Maximum range is a little over 3 m. As an example of the sensitivity of this design, it will detect a 1-inch-thick plastic broom handle at 2.4 m. Average current consumption is reasonable at less than 50 mA and typically about 30 mA.
5.10.2 | Calculating the Distance
The SRF04 provides an echo pulse proportional to distance. If the width of the pulse is measured in µs, then dividing by 58 will give you the distance in cm, or dividing by 148 will give the distance in inches: µs / 58 = cm, or µs / 148 = inches.
5.10.3 | Changing beam pattern and beam width
You can't! This is a question which crops up regularly; however, there is no easy way to reduce or change the beam width that I'm aware of. The beam pattern of the SRF04 is conical, with the width of the beam being a function of the surface area of the transducers, and is fixed. The beam pattern of the transducers used on the SRF04, taken from the manufacturer’s data sheet, is shown in Fig. (5.10).
    • Fig. (5.10)
5.10.4 | The development of the sensor
Since the original design of the SRF04 was published, there have been incremental improvements to performance and manufacturing reliability. The op-amp is now an LMC6032 and the comparator is an LP311. The 10 µF capacitor is now 22 µF and a few resistor values have been tweaked. These changes have happened over a period of time. All SRF04s manufactured after May 2003 have new software implementing an optional timing control input using the "do not connect" pin. This connection is the PIC's Vpp line used to program the chip after assembly. After programming it’s just an unused input with a pull-up resistor. When left unconnected, the SRF04 behaves exactly as it always has and as described above. When the "do not connect" pin is connected to ground (0 V), the timing is changed slightly to allow the SRF04 to work with slower controllers such as the PICAXE. The SRF04's "do not connect" pin now acts as a timing control. This pin is pulled high by default, and when left unconnected the timing remains exactly as before. With the timing pin pulled low (grounded), a 300 µs delay is added between the end of the trigger pulse and transmitting the sonic burst. Since the echo output is not raised until the burst is completed, there is no change to the range timing, but the 300 µs delay gives the PICAXE time to sort out which pin to look at and start
    • doing so. The new code has shipped in all SRF04s since the end of April 2003. The new code is also useful when connecting the SRF04 to the slower Stamps such as the BS2. Although the SRF04 works with the BS2, the echo line needs to be connected to the lower-numbered input pins. This is because the Stamps take progressively longer to look at the higher-numbered pins and can miss the rising edge of the echo signal. In this case you can connect the "do not connect" pin to ground and give it an extra 300 µs to get there.
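The range calculation described in Sections 5.10.1 and 5.10.2 can be sketched as follows. This is a minimal Python illustration of the arithmetic only, not the firmware used in the project: the function names are hypothetical, the pulse width is assumed to have been measured already by the controller, and the 36 ms no-echo limit is taken from the timing description above.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # value used in Section 5.10.1
NO_ECHO_PULSE_US = 36_000       # echo line drops after about 36 ms if nothing is detected


def distance_cm_from_flight_time(pulse_us, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Distance = (v * t) / 2, with t the round-trip time of the burst."""
    round_trip_s = pulse_us * 1e-6
    return speed_m_per_s * round_trip_s / 2.0 * 100.0


def distance_cm(pulse_us):
    """Rule of thumb from Section 5.10.2: microseconds / 58 = centimeters."""
    return pulse_us / 58.0


def distance_inches(pulse_us):
    """Rule of thumb from Section 5.10.2: microseconds / 148 = inches."""
    return pulse_us / 148.0


def obstacle_range_cm(pulse_us):
    """Return the range in cm, or None when the pulse is as long as the
    no-echo limit (assumed here to mean that no obstacle was detected)."""
    if pulse_us >= NO_ECHO_PULSE_US:
        return None
    return distance_cm(pulse_us)


if __name__ == "__main__":
    for width_us in (580, 5_800, 36_000):
        print(width_us, obstacle_range_cm(width_us),
              round(distance_cm_from_flight_time(width_us), 1))
```

The two conversions agree to within a couple of percent, since sound travelling at 340 m/s needs a little under 59 µs to cover one centimeter out and back.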
    • CHAPTER 6 Microcontroller
    • Chapter 6 | Microcontroller
6.1 | INTRODUCTION
A microcontroller is a microprocessor system which contains data and program memory, serial and parallel I/O, timers, and external and internal interrupts, all integrated into a single chip that can be purchased for as little as two dollars. About 40 percent of all microcontroller applications are found in office equipment, such as PCs, laser printers, fax machines, and intelligent telephones. About one third of all microcontrollers are found in consumer electronic goods. Products like CD players, hi-fi equipment, video games, washing machines, and cookers fall into this category. The communications market, the automotive market, and the military share the rest of the applications. Figure (6.1) shows the microcontroller block diagram. However, this project concentrates on designing the Instruction Register (IR), Program Counter (PC) and Arithmetic Logic Unit (ALU) only.
Fig. (6.1): Microcontroller Block Diagram
    • 6.1.1 | History of Microcontroller
The first single-chip microprocessor was the 4-bit Intel 4004 released in 1971, with the Intel 8008 and other more capable microprocessors becoming available over the next several years. However, both processors required external chips to implement a working system, raising total system cost and making it impossible to economically computerize appliances. The Smithsonian Institution says TI engineers Gary Boone and Michael Cochran succeeded in creating the first microcontroller in 1971. The result of their work was the TMS 1000, which went commercial in 1974. It combined read-only memory, read/write memory, processor and clock on one chip and was targeted at embedded systems. Partly in response to the existence of the single-chip TMS 1000 [2], Intel developed a computer system on a chip optimized for control applications, the Intel 8048, with commercial parts first shipping in 1977 [2]. It combined RAM and ROM on the same chip. This chip would find its way into over one billion PC keyboards and numerous other applications. At that time Intel's President, Luke J. Valenter, stated that the microcontroller was one of the most successful in the company's history, and expanded the division's budget by over 25%. Most microcontrollers at this time had two variants. One had an erasable EPROM program memory, which was significantly more expensive than the PROM variant, which was only programmable once. Erasing the EPROM required exposure to ultraviolet light through a transparent quartz lid. One-time parts could be made in lower-cost opaque plastic packages. In 1993, the introduction of EEPROM memory allowed microcontrollers (beginning with the Microchip PIC16x84) to be electrically erased quickly without an expensive package as required for EPROM, allowing both rapid prototyping and In-System Programming. The same year, Atmel introduced the first microcontroller using Flash memory.
Other companies rapidly followed suit, with both memory types. Cost has plummeted over time, with the cheapest 8-bit microcontrollers being available for under $0.25 in quantity (thousands) in 2009, and some 32-bit microcontrollers around $1 for similar quantities.
    • Nowadays microcontrollers are cheap and readily available for hobbyists, with large online communities around certain processors. In the future, MRAM could potentially be used in microcontrollers as it has infinite endurance and its incremental semiconductor wafer process cost is relatively low.
6.1.2 | Embedded Design
A microcontroller can be considered a self-contained system with a processor, memory and peripherals and can be used as an embedded system [5]. The majority of microcontrollers in use today are embedded in other machinery, such as automobiles, telephones, appliances, and peripherals for computer systems. While some embedded systems are very sophisticated, many have minimal requirements for memory and program length, with no operating system, and low software complexity. Typical input and output devices include switches, relays, solenoids, LEDs, small or custom LCD displays, radio frequency devices, and sensors for data such as temperature, humidity, light level etc. Embedded systems usually have no keyboard, screen, disks, printers, or other recognizable I/O devices of a personal computer, and may lack human interaction devices of any kind.
6.1.3 | Interrupts
Microcontrollers must provide real time (predictable, though not necessarily fast) response to events in the embedded system they are controlling. When certain events occur, an interrupt system can signal the processor to suspend processing the current instruction sequence and to begin an interrupt service routine (ISR, or "interrupt handler"). The ISR will perform any processing required based on the source of the interrupt before returning to the original instruction sequence. Possible interrupt sources are device dependent, and often include events such as an internal timer overflow, completing an analog to digital conversion, a logic level change on an input such as from a button being pressed, and data received on a communication link.
Where power consumption is important as in battery operated devices, interrupts may also wake a microcontroller from a low power sleep state where the processor is halted until required to do something by a peripheral event.
    • 6.1.4 | Programs
Typically microcontroller programs must fit in the available on-chip program memory, since it would be costly to provide a system with external, expandable memory. Compilers and assemblers are used to convert high-level language and assembler language codes into a compact machine code for storage in the microcontroller's memory. Depending on the device, the program memory may be permanent, read-only memory that can only be programmed at the factory, or program memory may be field-alterable flash or erasable read-only memory. Manufacturers have often produced special versions of their microcontrollers in order to help the hardware and software development of the target system. Originally these included EPROM versions that have a "window" on the top of the device through which program memory can be erased by ultraviolet light, ready for reprogramming after a programming ("burn") and test cycle. Since 1998, EPROM versions have been rare and have been replaced by EEPROM and flash, which are easier to use (they can be erased electronically) and cheaper to manufacture. Other versions may be available where the ROM is accessed as an external device rather than as internal memory; however, these are becoming increasingly rare due to the widespread availability of cheap microcontroller programmers. The use of field-programmable devices on a microcontroller may allow field update of the firmware or permit late factory revisions to products that have been assembled but not yet shipped. Programmable memory also reduces the lead time required for deployment of a new product. Where hundreds of thousands of identical devices are required, using parts programmed at the time of manufacture can be an economical option. These "mask programmed" parts have the program laid down in the same way as the logic of the chip, at the same time.
A customizable microcontroller incorporates a block of digital logic that can be personalized in order to provide additional processing capability, peripherals and interfaces that are adapted to the requirements of the application. For example, the AT91CAP from Atmel has a block of logic that can be customized during manufacture according to user requirements.
6.1.5 | Other microcontroller features
    • Microcontrollers usually contain from several to dozens of general purpose input/output pins (GPIO). GPIO pins are software configurable to either an input or an output state. When GPIO pins are configured to an input state, they are often used to read sensors or external signals. Configured to the output state, GPIO pins can drive external devices such as LEDs or motors. Many embedded systems need to read sensors that produce analog signals. This is the purpose of the analog-to-digital converter (ADC). Since processors are built to interpret and process digital data, i.e. 1s and 0s, they are not able to do anything with the analog signals that may be sent to them by a device. So the analog-to-digital converter is used to convert the incoming data into a form that the processor can recognize. A less common feature on some microcontrollers is a digital-to-analog converter (DAC) that allows the processor to output analog signals or voltage levels. In addition to the converters, many embedded microprocessors include a variety of timers as well. One of the most common types of timers is the Programmable Interval Timer (PIT). A PIT may either count down from some value to zero, or up to the capacity of the count register, overflowing to zero. Once it reaches zero, it sends an interrupt to the processor indicating that it has finished counting. This is useful for devices such as thermostats, which periodically test the temperature around them to see if they need to turn the air conditioner on, the heater on, etc. A dedicated Pulse Width Modulation (PWM) block makes it possible for the CPU to control power converters, resistive loads, motors, etc., without using lots of CPU resources in tight timer loops. A Universal Asynchronous Receiver/Transmitter (UART) block makes it possible to receive and transmit data over a serial line with very little load on the CPU.
Dedicated on-chip hardware also often includes capabilities to communicate with other devices (chips) in digital formats such as I²C and Serial Peripheral Interface (SPI).

6.1.6 | Higher integration

Microcontrollers may not implement an external address or data bus, as they integrate RAM and non-volatile memory on the same chip as the CPU. Using fewer pins, the chip can be placed in a much smaller, cheaper package.
Integrating the memory and other peripherals on a single chip and testing them as a unit increases the cost of that chip, but often results in decreased net cost of the embedded system as a whole. Even if the cost of a CPU that has integrated peripherals is slightly more than the cost of a CPU and external peripherals, having fewer chips typically allows a smaller and cheaper circuit board, and reduces the labor required to assemble and test the circuit board.

A microcontroller is a single integrated circuit, commonly with the following features:
• central processing unit, ranging from small and simple 4-bit processors to complex 32- or 64-bit processors
• volatile memory (RAM) for data storage
• ROM, EPROM, EEPROM or Flash memory for program and operating parameter storage
• discrete input and output bits, allowing control or detection of the logic state of an individual package pin
• serial input/output such as serial ports (UARTs)
• other serial communications interfaces like I²C, Serial Peripheral Interface and Controller Area Network for system interconnect
• peripherals such as timers, event counters, PWM generators, and watchdog
• clock generator, often an oscillator for a quartz timing crystal, resonator or RC circuit
• many include analog-to-digital converters, some include digital-to-analog converters
• in-circuit programming and debugging support

This integration drastically reduces the number of chips and the amount of wiring and circuit board space that would be needed to produce equivalent systems using separate chips. Furthermore, on low pin count devices in particular, each pin may interface to several internal peripherals, with the pin function selected by software. This allows a part to be used in a wider variety of applications than if pins had dedicated functions.

Microcontrollers have proved to be highly popular in embedded systems since their introduction in the 1970s.
Some microcontrollers use a Harvard architecture: separate memory buses for instructions and data, allowing accesses to take place concurrently. Where a Harvard architecture is used, instruction words may have a different bit width than the internal memory and registers; for example, 12-bit instructions used with 8-bit data registers.
The decision of which peripheral to integrate is often difficult. Microcontroller vendors often trade operating frequencies and system design flexibility against time-to-market requirements from their customers and overall lower system cost. Manufacturers have to balance the need to minimize the chip size against additional functionality.

Microcontroller architectures vary widely. Some designs include general-purpose microprocessor cores, with one or more ROM, RAM, or I/O functions integrated onto the package. Other designs are purpose built for control applications. A microcontroller instruction set usually has many instructions intended for bit-wise operations to make control programs more compact. [6] For example, a general purpose processor might require several instructions to test a bit in a register and branch if the bit is set, where a microcontroller could have a single instruction to provide that commonly required function.

Microcontrollers typically do not have a math coprocessor, so floating point arithmetic is performed by software.

6.1.7 | Programming environments

Microcontrollers were originally programmed only in assembly language, but various high-level programming languages are now also in common use to target microcontrollers. These languages are either designed specially for the purpose, or are versions of general purpose languages such as the C programming language. Compilers for general purpose languages typically have some restrictions as well as enhancements to better support the unique characteristics of microcontrollers. Some microcontrollers have environments to aid the development of certain types of applications. Microcontroller vendors often make tools freely available to make it easier to adopt their hardware.
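The bit-wise operations mentioned earlier in this section are usually written in C with shifts and masks. The sketch below shows one common way to test, set and clear a single bit of an 8-bit register value. The function names are chosen for this example, and whether a given compiler turns each of them into a single bit instruction depends on the compiler and the target device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the given bit (0..7) of reg is 1. */
bool bit_is_set(uint8_t reg, unsigned bit)
{
    return ((reg >> bit) & 1u) != 0;
}

/* Return reg with the given bit (0..7) forced to 1. */
uint8_t bit_set(uint8_t reg, unsigned bit)
{
    return (uint8_t)(reg | (1u << bit));
}

/* Return reg with the given bit (0..7) forced to 0. */
uint8_t bit_clear(uint8_t reg, unsigned bit)
{
    return (uint8_t)(reg & ~(1u << bit));
}
```

A control program would typically use the test inside a condition, for example `if (bit_is_set(status, 2)) { ... }`, which is the "test a bit and branch" pattern described above.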
Many microcontrollers are so quirky that they effectively require their own non-standard dialects of C, such as SDCC for the 8051, which prevents the use of standard tools (such as code libraries or static analysis tools) even for code unrelated to hardware features. Interpreters are often used to hide such low level quirks.

Interpreter firmware is also available for some microcontrollers. Examples include BASIC on the early Intel 8052 microcontroller;[7] BASIC and FORTH on the Zilog Z8;[8] and interpreters for some modern devices. Typically these interpreters support interactive programming.
Simulators are available for some microcontrollers. These allow a developer to analyze what the behavior of the microcontroller and their program should be if they were using the actual part. A simulator will show the internal processor state and also that of the outputs, as well as allowing input signals to be generated. While most simulators are limited in that they cannot simulate much other hardware in a system, they can exercise conditions that may otherwise be hard to reproduce at will in the physical implementation, and can be the quickest way to debug and analyze problems.

Recent microcontrollers are often integrated with on-chip debug circuitry that, when accessed by an in-circuit emulator via JTAG, allows debugging of the firmware with a debugger.

6.2 | TYPES OF MICROCONTROLLERS

As of 2008 there are several dozen microcontroller architectures and vendors including:
• ARM core processors (many vendors)
   o includes ARM9, ARM Cortex-A8, Sitara ARM Microprocessor
• Atmel AVR (8-bit), AVR32 (32-bit), and AT91SAM (32-bit)
• Cypress Semiconductor's M8C Core used in their PSoC (Programmable System-on-Chip)
• Freescale ColdFire (32-bit) and S08 (8-bit)
• Freescale 68HC11 (8-bit)
• Intel 8051
• Infineon: 8, 16, 32 Bit microcontrollers[9]
• MIPS
• Microchip Technology PIC, (8-bit PIC16, PIC18, 16-bit dsPIC33 / PIC24), (32-bit PIC32)
• NXP Semiconductors LPC1000, LPC2000, LPC3000, LPC4000 (32-bit), LPC900, LPC700 (8-bit)
• Parallax Propeller
• PowerPC ISE
• Rabbit 2000 (8-bit)
• Renesas RX, V850, Hitachi H8, Hitachi SuperH (32-bit), M16C (16-bit), RL78, R8C, 78K0/78K0R (8-bit)
• Silicon Laboratories Pipelined 8-bit 8051 Microcontrollers and mixed-signal ARM-based 32-bit microcontrollers
• STMicroelectronics STM8 (8-bit), ST10 (16-bit) and STM32 (32-bit)
• Texas Instruments TI MSP430 (16-bit)
• Toshiba TLCS-870 (8-bit/16-bit)

Many others exist, some of which are used in a very narrow range of applications or are more like applications processors than microcontrollers. The microcontroller market is extremely fragmented, with numerous vendors, technologies, and markets. Note that many vendors sell or have sold multiple architectures.

6.2.1 | Interrupt latency

In contrast to general-purpose computers, microcontrollers used in embedded systems often seek to optimize interrupt latency over instruction throughput. Issues include both reducing the latency and making it more predictable (to support real-time control).

When an electronic device causes an interrupt, the intermediate results (registers) have to be saved before the software responsible for handling the interrupt can run. They must also be restored after that software is finished. If there are more registers, this saving and restoring process takes more time, increasing the latency. Ways to reduce such context save/restore latency include having relatively few registers in the central processing unit (undesirable because it slows down most non-interrupt processing substantially), or at least having the hardware not save them all (this fails if the software then needs to compensate by saving the rest "manually"). Another technique involves spending silicon gates on "shadow registers": one or more duplicate registers used only by the interrupt software, perhaps supporting a dedicated stack.

Other factors affecting interrupt latency include:
• Cycles needed to complete current CPU activities. To minimize those costs, microcontrollers tend to have short pipelines (often three instructions or less) and small write buffers, and ensure that longer instructions are continuable or restartable. RISC design principles ensure that most instructions take the same number of cycles, helping avoid the need for most such continuation/restart logic.
• The length of any critical section that needs to be interrupted. Entry to a critical section restricts concurrent data structure access. When a data structure must be accessed by an interrupt handler, the critical section must block that interrupt. Accordingly, interrupt latency is increased by however long that interrupt is blocked. When there are hard external constraints on system latency, developers often need tools to measure interrupt latencies and track down which critical sections cause slowdowns.
   o One common technique just blocks all interrupts for the duration of the critical section. This is easy to implement, but sometimes critical sections get uncomfortably long.
   o A more complex technique just blocks the interrupts that may trigger access to that data structure. This is often based on interrupt priorities, which tend not to correspond well to the relevant system data structures. Accordingly, this technique is used mostly in very constrained environments.
   o Processors may have hardware support for some critical sections. Examples include supporting atomic access to bits or bytes within a word, or other atomic access primitives like the LDREX/STREX exclusive access primitives introduced in the ARMv6 architecture.
• Interrupt nesting. Some microcontrollers allow higher priority interrupts to interrupt lower priority ones. This allows software to manage latency by giving time-critical interrupts higher priority (and thus lower and more predictable latency) than less-critical ones.
• Trigger rate. When interrupts occur back-to-back, microcontrollers may avoid an extra context save/restore cycle by a form of tail call optimization.

Lower end microcontrollers tend to support fewer interrupt latency controls than higher end ones.

6.3 | Microcontroller embedded memory technology

Since the emergence of microcontrollers, many different memory technologies have been used. Almost all microcontrollers have at least two different kinds of memory: a non-volatile memory for storing firmware and a read-write memory for temporary data.

6.3.1 | Data

From the earliest microcontrollers to today, six-transistor SRAM is almost always used as the read/write working memory, with a few more transistors per bit used in the register file. MRAM could potentially replace it, as it is 4 to 10 times denser, which would make it more cost effective.
In addition to the SRAM, some microcontrollers also have internal EEPROM for data storage; even those that have none (or not enough) are often connected to an external serial EEPROM chip (as in the BASIC Stamp) or an external serial flash memory chip.

A few recent microcontrollers, beginning in 2003, have "self-programmable" flash memory.
6.3.2 | Firmware

The earliest microcontrollers used mask ROM to store firmware. Later microcontrollers (such as the early versions of the Freescale 68HC11 and early PIC microcontrollers) had quartz windows that allowed ultraviolet light in to erase the EPROM.

The Microchip PIC16C84, introduced in 1993,[10] was the first microcontroller to use EEPROM to store firmware. In the same year, Atmel introduced the first microcontroller using NOR Flash memory to store firmware.

6.4 | PIC MICROCONTROLLER

PIC is a family of modified Harvard architecture microcontrollers made by Microchip Technology, derived from the PIC1650 originally developed by General Instrument's Microelectronics Division. The name PIC initially referred to "Peripheral Interface Controller".

PICs are popular with both industrial developers and hobbyists due to their low cost, wide availability, large user base, extensive collection of application notes, availability of low cost or free development tools, and serial programming (and re-programming with flash memory) capability.

Microchip announced in September 2011 the shipment of its ten billionth PIC processor.

6.4.1 | Family core architectural differences

PIC microcontrollers have a Harvard architecture, and their instruction words have unusual sizes. Originally, 12-bit instructions included 5 address bits to specify the memory operand, and 9-bit branch destinations. Later revisions added opcode bits, allowing additional address bits.

Baseline core devices (12 bit)

These devices feature a 12-bit wide code memory, a 32-byte register file, and a tiny two level deep call stack. They are represented by the PIC10 series, as well as by some PIC12 and PIC16 devices. Baseline devices are available in 6-pin to 40-pin packages.

Generally the first 7 to 9 bytes of the register file are special-purpose registers, and the remaining bytes are general purpose RAM. Pointers are implemented using a register pair: after writing an address to the FSR (file select register), the INDF (indirect f) register becomes an alias for the addressed register. If banked RAM is implemented, the bank number is selected by the high 3 bits of the FSR. This affects register numbers 16–31; registers 0–15 are global and not affected by the bank select bits.

Because of the very limited register space (5 bits), 4 rarely-read registers were not assigned addresses, but are written by special instructions (OPTION and TRIS).

The ROM address space is 512 words (12 bits each), which may be extended to 2048 words by banking. CALL and GOTO instructions specify the low 9 bits of the new code location; additional high-order bits are taken from the status register. Note that a CALL instruction only includes 8 bits of address, and may only specify addresses in the first half of each 512-word page.

Lookup tables are implemented using a computed GOTO (assignment to the PCL register) into a table of RETLW instructions.

The instruction set is as follows. Register numbers are referred to as "f", while constants are referred to as "k". Bit numbers (0–7) are selected by "b". The "d" bit selects the destination: 0 indicates W, while 1 indicates that the result is written back to source register f. The C and Z status flags may be set based on the result; otherwise they are unmodified. Add and subtract (but not rotate) instructions that set C also set the DC (digit carry) flag, the carry from bit 3 to bit 4, which is useful for BCD arithmetic.

Third-party clones (13 bit)

ELAN Microelectronics Corp. makes a series of PICmicro-like microcontrollers with a 13-bit instruction word.[11] The instructions are mostly compatible with the mid-range 14-bit instruction set, but limited to a 6-bit register address (16 special-purpose registers and 48 bytes of RAM) and a 10-bit (1024 word) program space.
The 7 accumulator-immediate instructions are renumbered relative to the 14-bit PICmicro, to fit into 3 opcode bits rather than 4, but they are all there, as well as an additional software interrupt instruction.

There are a few additional miscellaneous instructions, and there are some changes to the terminology (the PICmicro OPTION register is called the Control register; the PICmicro TRIS registers are called I/O control registers), but the equivalents are obvious.

Mid-range core devices (14 bit)

These devices feature a 14-bit wide code memory, and an improved 8 level deep call stack. The instruction set differs very little from the baseline devices, but the 2 additional opcode bits allow 128 registers and 2048 words of code to be directly addressed. There are a few additional miscellaneous instructions, and two additional 8-bit literal instructions, add and subtract. The mid-range core is available in the majority of devices labeled PIC12 and PIC16.

The first 32 bytes of the register space are allocated to special-purpose registers; the remaining 96 bytes are used for general-purpose RAM. If banked RAM is used, the high 16 registers (0x70–0x7F) are global, as are a few of the most important special-purpose registers, including the STATUS register which holds the RAM bank select bits. (The other global registers are FSR and INDF, the low 8 bits of the program counter PCL, the PC high preload register PCLATH, and the master interrupt control register INTCON.)

The PCLATH register supplies high-order instruction address bits when the 8 bits supplied by a write to the PCL register, or the 11 bits supplied by a GOTO or CALL instruction, are not sufficient to address the available ROM space.

Enhanced Mid-range core devices (14 bit)

Enhanced Mid-range core devices introduce a deeper hardware stack, additional reset methods, 14 additional instructions and 'C' programming language optimizations. In particular, there are two INDF registers (INDF0 and INDF1), and two corresponding FSR register pairs (FSRnL and FSRnH). Special instructions use FSRn registers like address registers, with a variety of addressing modes.

PIC17 high end core devices (16 bit)

The 17 series never became popular and has been superseded by the PIC18 architecture.
It is not recommended for new designs, and availability may be limited.

Improvements over earlier cores are 16-bit wide opcodes (allowing many new instructions), and a 16 level deep call stack. PIC17 devices were produced in packages from 40 to 68 pins.

The 17 series introduced a number of important new features:
• a memory mapped accumulator
• read access to code memory (table reads)
• direct register to register moves (prior cores needed to move registers through the accumulator)
• an external program memory interface to expand the code space
• an 8-bit × 8-bit hardware multiplier
• a second indirect register pair
• auto-increment/decrement addressing controlled by control bits in a status register (ALUSTA)

PIC18 high end core devices (16 bit)

Microchip introduced the PIC18 architecture in 2000. [4] Unlike the 17 series, it has proven to be very popular, with a large number of device variants presently in manufacture. In contrast to earlier devices, which were more often than not programmed in assembly, C has become the predominant development language [5].

The 18 series inherits most of the features and instructions of the 17 series, while adding a number of important new features:
• the call stack is 21 bits wide and much deeper (31 levels deep)
• the call stack may be read and written (TOSU:TOSH:TOSL registers)
• conditional branch instructions
• indexed addressing mode (PLUSW)
• extending the FSR registers to 12 bits, allowing them to linearly address the entire data address space
• the addition of another FSR register (bringing the number up to 3)

The RAM space is 12 bits, addressed using a 4-bit bank select register and an 8-bit offset in each instruction. An additional "access" bit in each instruction selects between bank 0 (a=0) and the bank selected by the BSR (a=1).

A 1-level stack is also available for the STATUS, WREG and BSR registers. They are saved on every interrupt, and may be restored on return. If interrupts are disabled, they may also be used on subroutine call/return by setting the s bit (appending ", FAST" to the instruction).

The auto increment/decrement feature was improved by removing the control bits and adding four new indirect registers per FSR.
Depending on which indirect file register is being accessed, it is possible to post-decrement, post-increment, or pre-increment FSR, or to form the effective address by adding W to FSR.

In more advanced PIC18 devices, an "extended mode" is available which makes the addressing even more favorable to compiled code:
• a new offset addressing mode; some addresses which were relative to the access bank are now interpreted relative to the FSR2 register
• the addition of several new instructions, notably for manipulating the FSR registers

These changes were primarily aimed at improving the efficiency of a data stack implementation. If FSR2 is used either as the stack pointer or frame pointer, stack items may be easily indexed, allowing more efficient re-entrant code. Microchip's MPLAB C18 C compiler chooses to use FSR2 as a frame pointer.

PIC24 and dsPIC 16-bit microcontrollers

In 2001, Microchip introduced the dsPIC series of chips, which entered mass production in late 2004. They are Microchip's first inherently 16-bit microcontrollers. PIC24 devices are designed as general purpose microcontrollers. dsPIC devices include digital signal processing capabilities in addition.

Architecturally, although they share the PIC moniker, they are very different from the 8-bit PICs. The most notable differences are:[15]
• they feature a set of 16 working registers (W0-W15)
• they fully support a stack in RAM, and do not have a hardware stack
• bank switching is not required to access RAM or special function registers
• data stored in program memory can be accessed directly using a feature called Program Space Visibility
• interrupt sources may be assigned to distinct handlers using an interrupt vector table

Some features are:
• hardware MAC (multiply–accumulate)
• barrel shifting
• bit reversal
• (16×16)-bit single-cycle multiplication and other DSP operations
• hardware divide assist (19 cycles for 16/32-bit divide)
• hardware support for loop indexing
• direct memory access

dsPICs can be programmed in C using Microchip's C30 compiler, which is a variant of gcc.

PIC32 32-bit microcontrollers

In November 2007 Microchip introduced the new PIC32MX family of 32-bit microcontrollers. The initial device line-up is based on the industry standard MIPS32 M4K Core[6]. The device can be programmed using the Microchip MPLAB C Compiler for PIC32 MCUs, a variant of the GCC compiler.
The first 18 models currently in production (PIC32MX3xx and PIC32MX4xx) are pin to pin compatible and share the same peripheral set with the PIC24FxxGA0xx family of (16-bit) devices, allowing the use of common libraries, software and hardware tools.

The PIC32 architecture brings a number of new features to the Microchip portfolio, including:
• the highest execution speed: 80 MIPS (120+[16] Dhrystone MIPS @ 80 MHz)
• the largest flash memory: 512 Kbytes
• one instruction per clock cycle execution
• the first cached processor
• execution from RAM
• Full Speed Host/Dual Role and OTG USB capabilities
• full JTAG and 2 wire programming and debugging
• real-time trace

6.5 | PIC COMPONENT

6.5.1 | Logic Circuits

Some of the program instructions give the same results as logic gates. The principle of their operation is discussed below.

AND GATE

Fig.(6.2)

The logic gate 'AND' has two or more inputs and one output. Let us presume that the gate used in this example has only two inputs. A logic one (1) will appear on its output only if both inputs (A AND B) are driven high (1). The truth table shows the mutual dependence between the inputs and the output.

Fig.(6.3)

When used in a program, a logic AND operation is performed by a program instruction, which will be discussed later. For the time being, it is enough to remember that logic AND in a program is applied to the corresponding bits of two registers.

OR GATE

Fig.(6.4)

Similarly, OR gates also have two or more inputs and one output. If the gate has only two inputs, a logic one (1) will appear on its output if either input (A OR B) is driven high (1). If the OR gate has more than two inputs, a logic one (1) appears on its output if at least one input is driven high (1). If all inputs are at logic zero (0), the output will be at logic zero (0) as well.

Fig.(6.5)

In the program, the logic OR operation is performed in the same manner as the logic AND operation.

NOT GATE

The logic gate NOT has only one input and only one output. It operates in an extremely simple way. When logic zero (0) appears on its input, a logic one (1) appears on its output, and vice versa. This gate inverts the signal and is therefore often called an inverter.

Fig.(6.6)
Fig.(6.7)

In the program, the logic NOT operation is performed upon one byte. The result is a byte with inverted bits. If the bits of a byte are considered to be a number, the inverted value is its complement. The complement of a number is the value which, added to the number, gives the largest 8-digit binary number. In other words, the sum of an 8-digit binary number and its complement is always 255.

EXCLUSIVE OR GATE

Fig.(6.8)

The EXCLUSIVE OR (XOR) gate is a bit more complicated than the other gates. It represents a combination of all of them. A logic one (1) appears on its output only when its inputs have different logic states.

Fig.(6.9)

In the program, this operation is commonly used to compare two bytes. Subtraction may be used for the same purpose (if the result is 0, the bytes are equal). Unlike subtraction, this logic operation has the advantage that it cannot produce negative results.

REGISTER

In short, a register or a memory cell is an electronic circuit which can memorize the state of one byte.
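The logic operations described in this section, applied to whole bytes such as the contents of a register, can be sketched in C as follows. The function names are chosen for this example; the operators &, |, ~ and ^ are standard C and act on all eight bits at once, pairing corresponding bits of the two operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit-by-bit AND, OR and XOR of two bytes. */
uint8_t byte_and(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t byte_or(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t byte_xor(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }

/* NOT: every bit inverted. The cast discards the upper bits that C's
 * integer promotion would otherwise add to the result. */
uint8_t byte_not(uint8_t a) { return (uint8_t)~a; }

/* Two bytes are equal exactly when their XOR has no bit set. */
bool bytes_equal(uint8_t a, uint8_t b) { return (a ^ b) == 0; }
```

For any byte value x, x + byte_not(x) equals 255, which is the complement property stated above. For example, with a = 0xCC (11001100) and b = 0xAA (10101010), the AND is 0x88, the OR is 0xEE and the XOR is 0x66.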
Fig.(6.10)

SFR REGISTERS

In addition to registers which do not have any special and predetermined function, every microcontroller has a number of Special Function Registers (SFRs) whose function is predetermined by the manufacturer. Their bits are connected (literally) to internal circuits of the microcontroller such as timers, the A/D converter, oscillators and others, which means that they directly control the operation of these circuits, i.e. of the microcontroller. Imagine eight switches which control the operation of a small circuit within the microcontroller. Special Function Registers do exactly that.

Fig.(6.11)

In other words, the state of register bits is changed from within the program; the registers run small circuits within the microcontroller; and these circuits are connected via the microcontroller pins to peripheral electronics which is used for... Well, it's up to you.

INPUT / OUTPUT PORTS

In order to make the microcontroller useful, it has to be connected to additional electronics, i.e. peripherals. Each microcontroller has one or more registers (called ports) connected to the microcontroller pins. Why input/output? Because you can change a pin function as you wish. For example, suppose you want your device to turn three signal LEDs on and off and simultaneously monitor the logic state of five sensors or push buttons. One of the ports needs to be configured so that there are three outputs (connected to the LEDs) and five inputs (connected to the sensors). This is simply done in software, which means that a pin function can be changed during operation.

Fig.(6.12)

One of the important specifications of input/output (I/O) pins is the maximum current they can handle. For most microcontrollers, the current obtained from one pin is sufficient to activate an LED or some other low-current device (10-20 mA). The more I/O pins are in use, the lower the maximum current available to each pin. In other words, the maximum current stated in the data sheet for the microcontroller is shared across all I/O ports.

Another important pin feature is that it can have a pull-up resistor. These resistors connect pins to the positive power supply voltage and come into effect when the pin is configured as an input connected to a mechanical switch or a push button. Newer versions of microcontrollers have pull-up resistors configurable by software.

Each I/O port is usually under the control of a specialized SFR, which means that each bit of that register determines the state of the corresponding microcontroller pin. For example, by writing logic one (1) to a bit of the control register (SFR), the corresponding port pin is automatically configured as an input, and the voltage brought to it can be read as logic 0 or 1. Otherwise, by writing zero to the SFR bit, the corresponding port pin is configured as an output. Its voltage (0V or 5V) corresponds to the state of the corresponding port register bit.

MEMORY UNIT

Memory is the part of the microcontroller used for data storage. The easiest way to explain it is to compare it with a filing cabinet with many drawers. Suppose the drawers are clearly marked, so that their contents can easily be found by reading the label on the front of the drawer.
Fig.(6.13)

Similarly, each memory address corresponds to one memory location. The contents of any location can be accessed and read by addressing it. Memory can either be written to or read from. There are several types of memory within the microcontroller:

READ ONLY MEMORY (ROM)

Read Only Memory (ROM) is used to permanently save the program being executed. The size of the program that can be written depends on the size of this memory. Today's microcontrollers commonly use 16-bit addressing, which means that they are able to address up to 64 KB of memory, i.e. 65536 locations (addresses 0 to 65535). As a novice, your program will rarely exceed the limit of several hundred instructions. There are several types of ROM.

Masked ROM (MROM)

Masked ROM is a kind of ROM whose content is programmed by the manufacturer. The term 'masked' comes from the manufacturing process, where regions of the chip are masked off before the process of photolithography. In the case of large-scale production, the price is very low. Forget it...

One Time Programmable ROM (OTP ROM)

One time programmable ROM enables you to download a program into it, but, as its name states, one time only. If an error is detected after downloading, the only thing you can do is to download the correct program to another chip.

UV Erasable Programmable ROM (UV EPROM)

Both the manufacturing process and the characteristics of this memory are completely identical to OTP ROM. However, the package of a microcontroller with this memory has a recognizable 'window' on its top side. It enables data to be erased under strong ultraviolet light. After a few minutes it is possible to download a new program into it.
Installation of this window is complicated, which normally affects the price. From our point of view, unfortunately, negatively.

Flash Memory

This type of memory was invented in the 1980s and was brought to market, by Intel among others, as the successor to the UV EPROM. Since the content of this memory can be written and cleared a very large number of times (typically many thousands of cycles), microcontrollers with Flash ROM are ideal for learning, experimentation and small-scale production. Because of its great popularity, most microcontrollers are manufactured in flash technology today. So, if you are going to buy a microcontroller, the type to look for is definitely flash!

RANDOM ACCESS MEMORY (RAM)

Once the power supply is off, the contents of RAM are lost. RAM is used for temporarily storing data and intermediate results created and used during the operation of the microcontroller. For example, if the program performs an addition (of whatever), it is necessary to have a register representing what in everyday life is called the 'sum'. For this reason, one of the registers of RAM is called the 'sum' and used for storing the results of addition.

ELECTRICALLY ERASABLE PROGRAMMABLE ROM (EEPROM)

The contents of EEPROM may be changed during operation (similar to RAM), but remain permanently saved even after the loss of power (similar to ROM). Accordingly, EEPROM is often used to store values, created during operation, which must be permanently saved. For example, if you design an electronic lock or an alarm, it would be great to enable the user to create and enter the password, but it is useless if the password is lost every time the power supply goes off. The ideal solution is a microcontroller with an embedded EEPROM.

INTERRUPT

Most programs use interrupts in their regular execution. The purpose of the microcontroller is mainly to respond to changes in its surroundings. In other words, when an event takes place, the microcontroller does something...
For example, when you push a button on a remote controller, the microcontroller will register it and respond by changing the channel, turning the volume up or down etc. If the microcontroller spent most of its time endlessly checking a few buttons for hours or days, it would not be practical at all.
This is why the microcontroller has learnt a trick during its evolution. Instead of constantly checking each pin or bit, the microcontroller delegates the ‘wait issue’ to a ‘specialist’ which responds only when something worthy of attention happens. The signal which informs the central processor unit about such an event is called an INTERRUPT.

CENTRAL PROCESSOR UNIT (CPU)

As its name suggests, this is the unit which monitors and controls all processes within the microcontroller. It consists of several subunits, of which the most important are:

• Instruction Decoder is the part of the electronics which decodes program instructions and runs other circuits on the basis of that. The ‘instruction set’, which is different for each microcontroller family, expresses the abilities of this circuit;
• Arithmetical Logical Unit (ALU) performs all mathematical and logical operations upon data; and
• Accumulator is an SFR closely related to the operation of the ALU. It is a kind of working desk used for storing all data upon which some operation should be performed (addition, shift/move etc.). It also stores results ready for use in further processing. One of the SFRs, called the Status Register (PSW), is closely related to the accumulator. It shows at any given time the ‘status’ of the number stored in the accumulator (whether the number is greater or less than zero etc.). The accumulator is also called the working register and is therefore marked as the W register, or just W.

Fig.(6.14)

BUS

A bus consists of 8, 16 or more wires. There are two types of buses: the address bus and the data bus. The address bus consists of as many lines as are necessary for memory addressing. It is used to transmit an address from the CPU to the memory. The data bus is as wide as the data, in our case 8 bits (wires) wide. It is used to connect all the circuits within the microcontroller.

SERIAL COMMUNICATION
A parallel connection between the microcontroller and peripherals via input/output ports is the ideal solution over short distances, up to several meters. However, when it is necessary to establish communication between two devices over longer distances, it is not possible to use a parallel connection. Instead, serial communication is used.

Today, most microcontrollers have several different systems for serial communication built in as standard equipment. Which of these systems will be used depends on many factors, of which the most important are:

• How many devices does the microcontroller have to exchange data with?
• How fast does the data exchange have to be?
• What is the distance between the devices?
• Is it necessary to send and receive data simultaneously?

Fig.(6.15)

One of the most important things concerning serial communication is the Protocol, which should be strictly observed. It is a set of rules which must be applied so that devices can correctly interpret the data they exchange. Fortunately, the microcontroller automatically takes care of this, so that the work of the programmer/user is reduced to a simple write (data to be sent) and read (received data).

BAUD RATE

The term baud rate is used to denote the number of bits transferred per second [bps]. Note that it refers to bits, not bytes. The protocol usually requires that each byte is transferred along with several control bits, so one byte in the serial data stream may occupy up to 11 bits. For example, if the baud rate is 300 bps, then at most 37 bytes (8 bits each, no control bits) and at least 27 bytes (11 bits each) may be transferred per second.

The most commonly used serial communication systems are:
I2C (INTER INTEGRATED CIRCUIT)

Inter-integrated circuit is a system for serial data exchange between microcontrollers and specialized integrated circuits of a new generation. It is used when the distance between them is short (receiver and transmitter are usually on the same printed board). The connection is established via two conductors. One is used for data transfer, the other for synchronization (clock signal). As seen in the figure below, one device is always a master. It addresses one slave chip before communication starts. In this way one microcontroller can communicate with 112 different devices. The baud rate is usually 100 Kb/sec (standard mode) or 10 Kb/sec (slow baud rate mode). Systems with a baud rate of 3.4 Mb/sec have recently appeared. The distance between devices which communicate over an I2C bus is limited to several meters.

Fig.(6.16)

SPI (SERIAL PERIPHERAL INTERFACE BUS)

A serial peripheral interface (SPI) bus is a system for serial communication which uses up to four conductors, commonly three. One conductor is used for receiving data, one for sending data, one for synchronization and, optionally, one for selecting the device to communicate with. It is a full duplex connection, which means that data is sent and received simultaneously. The maximum baud rate is higher than that of the I2C communication system.

Fig.(6.17)

UART (UNIVERSAL ASYNCHRONOUS RECEIVER/TRANSMITTER)

This sort of communication is asynchronous, which means that a special line for transferring the clock signal is not used. In some applications, such as radio connection or infrared remote control, this feature is crucial. Since only one
communication line is used, both receiver and transmitter operate at the same predefined rate in order to maintain the necessary synchronization. This is a very simple way of transferring data since it basically represents the conversion of 8-bit data from parallel to serial format. The baud rate is not high, up to 1 Mbit/sec.

OSCILLATOR

Fig.(6.18)

Even pulses generated by the oscillator enable harmonic and synchronous operation of all circuits within the microcontroller. The oscillator is usually configured to use a quartz crystal or ceramic resonator for frequency stability, but it can also operate as a stand-alone circuit (like an RC oscillator). It is important to say that instructions are not executed at the rate imposed by the oscillator itself, but several times slower. This happens because each instruction is executed in several steps. In some microcontrollers, the same number of cycles is needed to execute all instructions, while in others, the number of cycles is different for different instructions. Accordingly, if the system uses a quartz crystal with a frequency of 20 MHz, the execution time of an instruction is not 50 ns, but 200, 400 or 800 ns, depending on the type of MCU!

EXTERNAL OSCILLATOR IN EC MODE

The external clock (EC) mode uses an external oscillator as a clock source. The maximum frequency of this clock is limited to 20 MHz.

Fig.(6.19)

The advantages of the external oscillator when configured to operate in EC mode:

• The independent external clock source is connected to the OSC1 input and the OSC2 pin is available as a general purpose I/O;
• It is possible to synchronize the operation of the microcontroller with the rest of the on-board electronics;
• In this mode the microcontroller starts operation immediately after the power is on. No time delay is required for frequency stabilization; and
• Temporarily disabling the external clock source causes the device to stop operation, while leaving all data intact. After restarting the external clock, the device proceeds with operation as if nothing has happened.

Fig.(6.20)

EXTERNAL OSCILLATOR IN LP, XT OR HS MODE

The LP, XT and HS modes use an external oscillator as a clock source, the frequency of which is determined by a quartz crystal or ceramic resonator connected to the OSC1 and OSC2 pins. Depending on the features of the component in use, select one of the following modes:

• LP mode (Low Power) is used for low-frequency quartz crystals only. This mode is designed to drive only 32.768 kHz crystals, usually embedded in quartz watches. It is easy to recognize them by their small size and specific cylindrical shape. The current consumption is the lowest of the three modes.
• XT mode is used for intermediate-frequency quartz crystals up to 8 MHz. The current consumption is the medium of the three modes.
• HS mode (High Speed) is used for high-frequency quartz crystals over 8 MHz. The current consumption is the highest of the three modes.

Fig.(6.21)
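The crystal mode ranges listed above, together with the instruction timing quoted earlier for a 20 MHz crystal, can be checked with a short calculation. The following Python sketch is only an illustration: the function names are hypothetical, the thresholds are the ones quoted in this text (the datasheet of a particular device may specify different limits), and the 4, 8 or 16 oscillator periods per instruction are inferred from the 200, 400 and 800 ns figures.

```python
def instruction_time_ns(f_osc_hz, clocks_per_instruction=4):
    """Duration of one instruction in nanoseconds.

    clocks_per_instruction is an assumption: 4, 8 or 16 oscillator
    periods reproduce the 200, 400 and 800 ns quoted for 20 MHz.
    """
    if f_osc_hz <= 0:
        raise ValueError("oscillator frequency must be positive")
    return clocks_per_instruction * 1_000_000_000 / f_osc_hz


def select_crystal_mode(f_crystal_hz):
    """Pick LP, XT or HS using the frequency ranges given in the text."""
    if f_crystal_hz <= 0:
        raise ValueError("crystal frequency must be positive")
    if f_crystal_hz == 32_768:        # watch crystal, LP mode only
        return "LP"
    if f_crystal_hz <= 8_000_000:     # intermediate frequencies
        return "XT"
    return "HS"                       # above 8 MHz


if __name__ == "__main__":
    for clocks in (4, 8, 16):
        print(clocks, instruction_time_ns(20_000_000, clocks))
    for f in (32_768, 4_000_000, 20_000_000):
        print(f, select_crystal_mode(f))
```

With a 20 MHz crystal and four oscillator periods per instruction, the sketch returns 200 ns, in agreement with the figure given for the oscillator.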
CERAMIC RESONATORS IN XT OR HS MODE

Ceramic resonators are similar to quartz crystals in their features and are therefore connected in the same way. Compared with quartz crystals, they are cheaper, and oscillators containing them have slightly poorer characteristics. They are used for clock frequencies ranging from 100 kHz to 20 MHz.

Fig.(6.22)

EXTERNAL OSCILLATOR IN RC AND RCIO MODE

There are certainly many advantages in using elements for frequency stabilization, but sometimes they are really unnecessary. In most cases the oscillator may operate at a frequency that is not precisely defined, so that embedding such elements is a waste of money. The simplest and cheapest solution in these situations is to use one resistor and one capacitor for the operation of the oscillator. There are two modes:

Fig.(6.23)

• RC mode. When the external oscillator is configured to operate in RC mode, the OSC1 pin should be connected to the RC circuit as shown in the figure. The OSC2 pin outputs the RC oscillator frequency divided by 4. This signal may be used for calibration, synchronization or other application requirements.

Fig.(6.24)
• RCIO mode. Likewise, the RC circuit is connected to the OSC1 pin. This time, the available OSC2 pin is used as an additional general-purpose I/O pin.

In both cases, it is recommended to use components as shown in the figure. The frequency of such an oscillator is calculated according to the formula

f = 1/T

in which:
f = frequency [Hz];
T = R * C = time constant [s];
R = resistor resistance [Ω]; and
C = capacitor capacitance [F].

6.5.2 | Power Supply Circuit

There are two things worth attention concerning the microcontroller power supply circuit:

• Brown out is a potentially dangerous condition which occurs at the moment the microcontroller is being turned off, or when the power supply voltage drops to a minimum due to electrical noise. As the microcontroller consists of several circuits with different operating voltage levels, this state can cause it to behave in an uncontrolled way. In order to prevent this, the microcontroller usually has a built-in circuit for brown out reset, which resets the whole electronics as soon as the supply voltage falls into this critical state.
• Reset pin is usually marked as MCLR (Master Clear Reset). It is used for an external reset of the microcontroller by applying a logic zero (0) or one (1) to it, depending on the type of the microcontroller. If the brown out circuit is not built in, a simple external circuit for brown out reset can be connected to the MCLR pin.

TIMERS/COUNTERS

The microcontroller oscillator uses a quartz crystal for its operation. Even though it is not the simplest solution, there are many reasons to use it. The frequency of such an oscillator is precisely defined and very stable, so that the pulses it generates are always of the same width, which makes them ideal for time measurement. Such oscillators are also used in quartz watches. If it is necessary to measure the time between two events, it is sufficient to count the pulses generated by this oscillator. This is exactly what the timer does.
Most programs use these miniature electronic ‘stopwatches’. These are commonly 8- or 16-bit SFRs whose contents are automatically incremented by each incoming pulse. Once a register reaches its maximum value and overflows, an interrupt may be generated!
If the timer uses the internal quartz oscillator for its operation, then it can be used to measure the time between two events (if the register value is T1 at the moment measurement starts, and T2 at the moment it terminates, then the elapsed time is equal to the result of the subtraction T2-T1). If the registers use pulses coming from an external source, then such a timer is turned into a counter. This is only a simple explanation of the operation itself. It is, however, more complicated in practice.

Fig.(6.25)

HOW DOES THE TIMER OPERATE?

In practice, pulses generated by the quartz oscillator are brought, once per machine cycle, directly or via a prescaler, to the circuit which increments the number stored in the timer register. If one instruction (one machine cycle) lasts for four quartz oscillator periods, then with a 4 MHz quartz crystal this number will be incremented a million times per second (once every microsecond).

Fig.(6.26)

It is easy to measure short time intervals, up to 256 microseconds, in the way described above, because an 8-bit register can count only 256 pulses before it overflows. This restriction may be easily overcome in several ways, such as by using a slower oscillator, registers with more bits, prescalers or interrupts. The first two solutions have some weaknesses, so it is recommended to use prescalers or interrupts.

USING A PRESCALER IN TIMER OPERATION
A prescaler is an electronic device used to reduce frequency by a predetermined factor. In order to generate one pulse on its output, it is necessary to bring 1, 2, 4 or more pulses to its input. Most microcontrollers have one or more prescalers built in, and their division rate may be changed from within the program. The prescaler is used when it is necessary to measure longer periods of time. If one prescaler is shared by the timer and the watchdog timer, it cannot be used by both of them simultaneously.

Fig.(6.27)

USING INTERRUPT IN TIMER OPERATION

If the timer register consists of 8 bits, the largest number it can store is 255. For 16-bit registers it is 65,535. If this number is exceeded, the timer will be automatically reset and counting will start at zero again. This condition is called an overflow. If enabled from within the program, the overflow can cause an interrupt, which opens up completely new possibilities. For example, the state of registers used for counting seconds, minutes or days can be changed in an interrupt routine. The whole process (except for the interrupt routine) is performed automatically behind the scenes, which enables the main circuits of the microcontroller to operate normally.

Fig.(6.28)
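The relation between the oscillator frequency, the prescaler and the moment of overflow can be worked out numerically. The Python sketch below is a minimal illustration with hypothetical function names. It assumes four oscillator periods per machine cycle, as in the 4 MHz example in the text, and the elapsed-time helper assumes that the timer overflowed at most once between the two readings.

```python
def overflow_period_us(f_osc_hz, timer_bits=8, prescaler=1, clocks_per_cycle=4):
    """Time in microseconds for the timer to count from 0 through overflow.

    clocks_per_cycle=4 is an assumption taken from the 4 MHz example.
    """
    if f_osc_hz <= 0 or prescaler < 1:
        raise ValueError("frequency and prescaler must be positive")
    ticks = 2 ** timer_bits                      # 256 for 8 bits, 65536 for 16
    return ticks * prescaler * clocks_per_cycle * 1_000_000 / f_osc_hz


def elapsed_ticks(t1, t2, timer_bits=8):
    """Ticks between readings T1 and T2 (T2 - T1), allowing one wrap-around."""
    return (t2 - t1) % (2 ** timer_bits)


if __name__ == "__main__":
    print(overflow_period_us(4_000_000))                 # 8-bit, no prescaler
    print(overflow_period_us(4_000_000, prescaler=256))  # 8-bit, 1:256
    print(overflow_period_us(4_000_000, timer_bits=16))  # 16-bit, no prescaler
    print(elapsed_ticks(250, 5))                         # reading taken across an overflow
```

For a 4 MHz crystal without a prescaler the 8-bit result is 256 microseconds, which is the limit mentioned in the timer description. A 1:256 prescaler or a 16-bit register extends it to 65,536 microseconds.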
Fig.(6.28) illustrates the use of an interrupt in timer operation. Delays of arbitrary duration, having almost no influence on the main program execution, can be easily obtained by assigning the prescaler to the timer.

COUNTERS

If the timer receives pulses from the microcontroller input pin, then it turns into a counter. Obviously, it is the same electronic circuit able to operate in two different modes. The only difference is that in this case the pulses to be counted come through the microcontroller input pin and their duration (width) is mostly undefined. This is why they cannot be used for time measurement, but for other purposes such as counting products on an assembly line, the number of axle rotations, passengers etc. (depending on the sensor in use).

WATCHDOG TIMER

A watchdog timer is a timer connected to a completely separate RC oscillator within the microcontroller. If the watchdog timer is enabled, every time it counts up to the maximum value, a microcontroller reset occurs and program execution starts from the first instruction. The point is to prevent this from happening by using a specific command. The whole idea is based on the fact that every program is executed in several longer or shorter loops. If instructions which reset the watchdog timer are placed at appropriate program locations, among the commands being regularly executed, then the operation of the watchdog timer will not affect program execution. If for any reason, usually electrical noise in industry, the program counter ‘gets stuck’ at some memory location from which there is no return, the watchdog timer will not be cleared, so the register’s value, being constantly incremented, will reach the maximum et voila! Reset occurs!

Fig.(6.29)

A/D CONVERTER
Fig.(6.30)

External signals are usually fundamentally different from those the microcontroller understands (ones and zeros) and therefore have to be converted into values understandable to the microcontroller. An analogue to digital converter is an electronic circuit which converts continuous signals to discrete digital numbers. In other words, this circuit converts an analogue value into a binary number and passes it to the CPU for further processing. This module is therefore used for measuring the voltage on an input pin (analogue value). The result of the measurement is a number (digital value) used and processed later in the program.

Fig.(6.31)

INTERNAL ARCHITECTURE

All modern microcontrollers use one of two basic design models, called Harvard and von-Neumann architecture. They represent two different ways of exchanging data between CPU and memory.

VON-NEUMANN ARCHITECTURE
Fig.(6.32)

Microcontrollers using von-Neumann architecture have only one memory block and one 8-bit data bus. As all data are exchanged through these 8 lines, the bus is overloaded and communication is slow and inefficient. The CPU can either read an instruction or read/write data from/to the memory. Both cannot occur at the same time, since instructions and data use the same bus. For example, if a program line states that the RAM register called ‘SUM’ should be incremented by one (instruction: incf SUM), the microcontroller will do the following:

1. Read the part of the program instruction specifying WHAT should be done (in this case it is the ‘incf’ instruction for increment).
2. Read the other part of the same instruction specifying upon WHICH data it should be performed (in this case it is the ‘SUM’ register).
3. After being incremented, the contents of this register should be written back to the register from which it was read (‘SUM’ register address).

The same data bus is used for all these intermediate operations.

HARVARD ARCHITECTURE

Fig.(6.33)

Microcontrollers using Harvard architecture have two different data buses. One is 8 bits wide and connects the CPU to RAM. The other consists of 12, 14 or 16 lines and connects the CPU to ROM. Accordingly, the CPU can read an instruction and access data memory at the same time. Since all RAM memory registers are 8
bits wide, all data being exchanged are of the same width. During the process of writing a program, only 8-bit data are considered. In other words, all you can change from within the program and all you can influence are 8 bits wide. All the programs written for these microcontrollers will be stored in the microcontroller’s internal ROM after being compiled into machine code. However, ROM memory locations do not have 8, but 12, 14 or 16 bits. The remaining 4, 6 or 8 bits represent the instruction, specifying for the CPU what to do with the 8-bit data.

The advantages of such a design are the following:

• All data in the program is one byte (8 bits) wide. As the data bus used for program reading has 12, 14 or 16 lines, both instruction and data can be read simultaneously using these spare bits. For this reason, all instructions are single-cycle instructions, except for the jump instruction, which is a two-cycle instruction.
• Owing to the fact that the program (ROM) and temporary data (RAM) are separate, the CPU can overlap two instructions. Simply put, while a RAM read or write is in progress (the end of one instruction), the next program instruction is read through the other bus.
• When using microcontrollers with von-Neumann architecture, one never knows how much memory will be occupied by the program. Basically, most program instructions occupy two memory locations (one contains information on WHAT should be done, whereas the other contains information upon WHICH data it should be done). This is not a hard and fast rule, but the most common case. In microcontrollers with Harvard architecture, the program word bus is wider than one byte, which allows each program word to consist of instruction and data, i.e. one memory location - one program instruction.

INSTRUCTION SET

Fig.(6.34)

All instructions understandable to the microcontroller are together called the Instruction Set.
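The idea that one 12-, 14- or 16-bit program word holds both the instruction and its 8-bit data can be made concrete with a little bit manipulation. The Python sketch below is only an illustration: the function names and the opcode value are hypothetical, and real instruction sets do not split every instruction into "upper bits = instruction, lower 8 bits = data" in exactly this way.

```python
def pack_program_word(opcode, data, word_bits=14):
    """Combine an opcode and one data byte into a single program word."""
    opcode_bits = word_bits - 8          # 4, 6 or 8 bits are left for the opcode
    if not 0 <= data <= 0xFF:
        raise ValueError("data must fit in 8 bits")
    if not 0 <= opcode < (1 << opcode_bits):
        raise ValueError("opcode does not fit in the remaining bits")
    return (opcode << 8) | data


def unpack_program_word(word):
    """Split a program word back into (opcode, data)."""
    return word >> 8, word & 0xFF


if __name__ == "__main__":
    EXAMPLE_OPCODE = 0b110000            # illustrative 6-bit opcode, an assumption
    word = pack_program_word(EXAMPLE_OPCODE, 0x50)
    print(hex(word))                     # one 14-bit word, one memory location
    print(unpack_program_word(word))
```

Because the opcode and the data byte travel in the same word, a single read from program memory delivers everything the CPU needs for that instruction, which is the point made in the list of advantages above.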
When you write a program in assembly language, you actually specify instructions in the order in which they should be executed. The main restriction here is the number of available instructions. Manufacturers usually adopt one of the two approaches described below:
RISC (REDUCED INSTRUCTION SET COMPUTER)

In this case, the microcontroller recognizes and executes only basic operations (addition, subtraction, copying etc.). Other, more complicated operations are performed by combining them. For example, multiplication is performed by successive addition. It’s the same as if you tried to explain to someone, using only a few different words, how to reach the airport in a new city. However, it’s not as black as it’s painted. First of all, this language is easy to learn. The microcontroller is very fast, so it is not possible to see all the arithmetic ‘acrobatics’ it performs. The user can only see the final results. Finally, it is not so difficult to explain where the airport is if you use the right words such as left, right, kilometers etc.

CISC (COMPLEX INSTRUCTION SET COMPUTER)

CISC is the opposite of RISC! Microcontrollers designed to recognize more than 200 different instructions can do a lot of things at high speed. However, one needs to understand how to take advantage of all that such a rich language offers, which is not at all easy...

HOW TO MAKE THE RIGHT CHOICE?

OK, you are a beginner and you have decided to go on the adventure of working with microcontrollers. Congratulations on your choice! However, it is not as easy to choose the right microcontroller as it may seem. The problem is not a limited range of devices, but the opposite!

Before you start to design a device based on a microcontroller, think of the following: how many input/output lines will I need for operation? Should it perform operations other than simply turning relays on/off? Does it need some specialized module such as serial communication, an A/D converter etc.? When you have a clear picture of what you need, the selection range is considerably reduced and it’s time to think of price. Are you planning to build several identical devices? Several hundred? A million? Anyway, you get the point.
If you think of all these things for the very first time, then everything seems a bit confusing. For this reason, go step by step. First of all, select the manufacturer, i.e. the microcontroller family you can easily get. Study one particular model. Learn as much as you need; don’t go into details. Solve a specific problem and something incredible will happen: you will be able to handle any model belonging to that microcontroller family.
Remember learning to ride a bicycle. After several bruises at the beginning, you were able to keep your balance, and then to easily ride any other bicycle. And of course, you will never forget programming, just as you will never forget riding bicycles!

6.6 | DEVELOPMENT TOOLS

Microchip provides a freeware IDE package called MPLAB, which includes an assembler, linker, software simulator, and debugger. They also sell C compilers for the PIC18 and dsPIC which integrate cleanly with MPLAB. Free student versions of the C compilers are also available with all features, but in the free versions optimizations are disabled after 60 days. Several third parties make C language compilers for PICs, many of which integrate with MPLAB and/or feature their own IDE. A fully featured compiler for the PICBASIC language to program PIC microcontrollers is available from melabs, Inc. Development tools are also available for the PIC family under the GPL or other free software or open source licenses.

6.6.1 | Device Programmers

Fig.(6.35) A development board for low pin-count MCU, from Microchip

Devices called "programmers" are traditionally used to get program code into the target PIC. Most PICs that Microchip currently sells feature ICSP (In Circuit Serial Programming) and/or LVP (Low Voltage Programming) capabilities, allowing the PIC to be programmed while it is sitting in the target circuit. ICSP programming is performed using two pins, clock and data, while a high voltage (12V) is present on the Vpp/MCLR pin. Low voltage programming dispenses with the high voltage, but reserves exclusive use of an I/O pin; it can therefore be disabled to recover the pin for other uses (once disabled, it can only be re-enabled using high voltage programming).

There are many programmers for PIC microcontrollers, ranging from the extremely simple designs which rely on ICSP to allow direct download of code
from a host computer, to intelligent programmers that can verify the device at several supply voltages. Many of these complex programmers themselves use a preprogrammed PIC to send the programming commands to the PIC that is to be programmed. The intelligent type of programmer is needed to program earlier PIC models (mostly EPROM type) which do not support in-circuit programming.

Many of the higher end flash based PICs can also self-program (write to their own program memory). Demo boards are available with a small factory-programmed boot loader that can be used to load user programs over an interface such as RS-232 or USB, thus obviating the need for a programmer device. Alternatively, there is boot loader firmware available that the user can load onto the PIC using ICSP. The advantages of a boot loader over ICSP are the far superior programming speed, immediate program execution following programming, and the ability to both debug and program using the same cable.

Fig.(6.36) Microchip PICSTART Plus programmer

Programmers/debuggers are available directly from Microchip. Third party programmers range from plans to build your own, to self-assembly kits and fully tested ready-to-go units. Some are simple designs which require a PC to do the low-level programming signaling (these typically connect to the serial or parallel port and consist of a few simple components), while others have the programming logic built into them (these typically use a serial or USB connection, are usually faster, and are often built using PICs themselves for control).

6.6.2 | Debugging

Software emulation

Commercial and free emulators exist for the PIC family processors.

In-circuit debugging

Later model PICs feature an ICD (in-circuit debugging) interface, built into the CPU core. ICD debuggers (MPLAB ICD2 and other third party) can
communicate with this interface using three lines. This cheap and simple debugging system comes at a price, however: a limited breakpoint count (1 on older PICs, 3 on newer PICs), loss of some I/O (with the exception of some surface mount 44-pin PICs which have dedicated lines for debugging) and loss of some features of the chip. For small PICs, where the loss of I/O caused by this method would be unacceptable, special headers are made which are fitted with PICs that have extra pins specifically for debugging.

In-circuit emulators

Microchip offers three full in-circuit emulators: the MPLAB ICE2000 (parallel interface, a USB converter is available); the newer MPLAB ICE4000 (USB 2.0 connection); and most recently, the REAL ICE. All of these ICE tools can be used with the MPLAB IDE for full source-level debugging of code running on the target. The ICE2000 requires emulator modules, and the test hardware must provide a socket which can take either an emulator module or a production device. The REAL ICE connects directly to production devices which support in-circuit emulation through the PGC/PGD programming interface, or through a high speed connection which uses two more pins. According to Microchip, it supports "most" flash-based PIC, PIC24, and dsPIC processors. The ICE4000 is no longer directly advertised on Microchip's website, and the purchasing page states that it is not recommended for new designs.

PICkit 2 open source structure and clones

The PICkit 2 has been an interesting PIC programmer from Microchip. It can program all PICs and debug most of them (as of May 2009, only the PIC32 family is not supported for MPLAB debugging). Ever since its first releases, all software source code (firmware, PC application) and hardware schematics have been open to the public. This makes it relatively easy for an end user to modify the programmer for use with a non-Windows operating system such as Linux or Mac OS.
In the meantime, it has also created a lot of DIY interest and clones. This open source structure has brought many features to the PICkit 2 community, such as Programmer-to-Go, the UART Tool and the Logic Tool, which have been contributed by PICkit 2 users. Users have also added features to the PICkit 2 such as 4MB Programmer-to-Go capability, USB buck/boost circuits, RJ12 type connectors and others.
Part number suffixes

The F in a name generally indicates that the PICmicro uses flash memory and can be erased electronically. Conversely, a C generally means that it can only be erased by exposing the die to ultraviolet light (which is only possible if a windowed package style is used). An exception to this rule is the PIC16C84, which uses EEPROM and is therefore electrically erasable. An L in the name indicates that the part will run at a lower voltage, often with frequency limits imposed. [19] Parts designed specifically for low voltage operation, within a strict range of 3 - 3.6 Volts, are marked with a J in the part number. These parts are also uniquely I/O tolerant, as they will accept up to 5V on their inputs. [19]

Fig.(6.37)

The figure below shows the most commonly used solution.

Fig.(6.38)

In order to prevent the appearance of a high self-induction voltage, caused by a sudden stop of the current flow through the coil, a reverse-biased diode is connected in parallel with the coil. The purpose of this diode is to 'cut off' the voltage peak.

6.7 | LCD DISPLAY
This component is specifically manufactured to be used with microcontrollers, which means that it cannot be activated by standard IC circuits. It is used for displaying different messages on a miniature liquid crystal display. The model described here is the one most frequently used in practice because of its low price and great capabilities. It is based on the HD44780 controller (Hitachi) and can display messages in two lines of 16 characters each. It can display all the letters of the alphabet, Greek letters, punctuation marks, mathematical symbols etc. It is also possible to display symbols made up by the user. Other useful features include automatic message shift (left and right), cursor appearance, LED backlight etc.

Fig.(6.39)

6.7.1 | LCD Display Pins

Along one side of the small printed board of the LCD display there are pins that enable it to be connected to the microcontroller. There are 14 pins in total, marked with numbers (16 if there is a backlight).

Fig.(6.40)

6.7.2 | LCD Screen
An LCD screen can display two lines of 16 characters each. Every character consists of a 5x8 or 5x11 dot matrix. This book covers the 5x8 character display, which is the most commonly used.

Display contrast depends on the power supply voltage and on whether messages are displayed in one or two lines. For this reason, a variable voltage between 0 and Vdd is applied to the pin marked as Vee. A trimmer potentiometer is usually used for this purpose. Some LCD displays have a built-in backlight (blue or green LEDs). When the backlight is used, a current limiting resistor should be connected in series with one of the backlight power supply pins (as with ordinary LEDs).

Fig.(6.41)

If there are no characters displayed, or if all of them are dimmed when the display is switched on, the first thing that should be done is to check the potentiometer for contrast adjustment. Is it properly adjusted? The same applies if the mode of operation has been changed (writing in one or two lines).

6.7.3 | LCD Memory

The LCD display contains three memory blocks:

• DDRAM Display Data RAM;
• CGRAM Character Generator RAM; and
• CGROM Character Generator ROM.
    • DDRAM Memory
      DDRAM memory is used for storing the characters to be displayed. It can store 80 characters. Some memory locations are directly connected to the character positions on the display.
      Everything works quite simply: it is enough to configure the display to increment addresses automatically (shift right) and to set the starting address for the message to be displayed (for example 00 hex).
      Afterwards, all characters sent through lines D0-D7 are displayed in the message format we are used to, from left to right. In this case, displaying starts from the first field of the first line because the initial address is 00 hex. If more than 16 characters are sent, all of them are stored, but only the first sixteen are visible. To display the rest, the shift command should be used. In effect, the LCD display behaves like a window that shifts left and right over memory locations containing different characters. This is how the effect of a message shifting across the screen is created.
      Fig.(6.42)
      If the cursor is on, it appears at the currently addressed location. In other words, when a character is written at the cursor position, the cursor automatically moves to the next addressed location.
      DDRAM is RAM, so data can be written to and read from it, but its content is irretrievably lost when the power goes off.
      CGROM Memory
      CGROM memory contains a standard character map with all the characters that can be displayed on the screen. Each character is assigned to one memory location:
    • Fig.(6.43)
      The addresses of CGROM memory locations match the ASCII codes of the characters. If the program being executed encounters a command ‘send character P to port’, then the binary value 0101 0000 appears on the port. This value is the ASCII equivalent of the character P. It is then written to the LCD, which displays the symbol stored at location 0101 0000 of the CGROM. In other words, the character ‘P’ is displayed. This applies to all letters of the alphabet (capital and small), but not to numbers.
      As seen on the previous map, the addresses of all digits are offset by 48 relative to their values (the address of digit 0 is 48, of digit 1 is 49, of digit 2 is 50, etc.). Accordingly, in order to display digits correctly, it is necessary to add the decimal number 48 to each of them before sending it to the LCD.
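The digit offset described above can be sketched in C as follows. This is only an illustration: the function names `digit_to_lcd_code` and `number_to_lcd_codes` are hypothetical and do not come from the project code, and the buffer size assumes a 32-bit `unsigned int`.

```c
#include <stddef.h>

/* Convert one decimal digit (0-9) to the code the LCD expects by adding 48. */
unsigned char digit_to_lcd_code(unsigned char digit)
{
    return (unsigned char)(digit + 48u);
}

/*
 * Write the decimal digits of `value` into `out` as LCD character codes,
 * most significant digit first, followed by a terminating 0.
 * Returns the number of digits written, or 0 if `out` is too small.
 */
size_t number_to_lcd_codes(unsigned int value, unsigned char *out, size_t out_size)
{
    unsigned char reversed[10]; /* assumption: value fits in 32 bits (max 10 digits) */
    size_t count = 0;
    size_t i;

    /* Peel off digits from the least significant end. */
    do {
        reversed[count++] = digit_to_lcd_code((unsigned char)(value % 10u));
        value /= 10u;
    } while (value != 0u);

    if (count + 1 > out_size) {
        return 0;
    }
    /* Reverse so the most significant digit is sent first. */
    for (i = 0; i < count; i++) {
        out[i] = reversed[count - 1 - i];
    }
    out[count] = 0;
    return count;
}
```

For example, a measured distance of 305 would be turned into the three codes 51, 48 and 53, which the LCD shows as the characters 3, 0 and 5.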
    • What is ASCII?
      From their inception until today, computers have been able to recognize only numbers, not letters. This means that all data a computer exchanges with a peripheral device is in binary format, even though a person perceives it as letters (the keyboard is an excellent example). In other words, every character corresponds to a unique combination of zeroes and ones. ASCII is a character encoding based on the English alphabet. The ASCII code specifies a correspondence between standard character symbols and their numerical equivalents.
      Fig.(6.44)
      CGRAM Memory
      Apart from the standard characters, the LCD display can also display symbols defined by the user. This can be any symbol of 5x8 pixels. A 64-byte RAM called CGRAM makes this possible.
      Memory registers are 8 bits wide, but only the 5 lower bits are used. Logic one (1) in a register represents a dimmed (dark) dot, and 8 locations grouped together represent one character. This is best illustrated in the figure below:
    • Fig.(6.45)
      Symbols are usually defined at the beginning of the program by simply writing zeros and ones to the registers of CGRAM memory so that they form the desired shapes. To display them, it is sufficient to specify their address. Pay attention to the first column in the CGROM map of characters. It does not contain RAM memory addresses, but the symbols being discussed here. In this example, ‘display 0’ means display ‘č’, ‘display 1’ means display ‘ž’, etc.
      6.7.4 | LCD Basic Commands
      All data transferred to the LCD through the lines D0-D7 is interpreted either as a command or as data, depending on the logic state of the RS pin:
      • RS = 1: Bits D0-D7 are addresses of the characters to be displayed. The LCD processor addresses one character from the character map and displays it. The DDRAM address specifies the location at which the character is displayed. This address is either defined before the character is transferred, or the address of the previously transferred character is automatically incremented.
      • RS = 0: Bits D0-D7 are commands for setting the display mode.
      Here is a list of commands recognized by the LCD:
      Command                    RS  RW  D7  D6  D5  D4  D3   D2   D1   D0   Execution time
      Clear display              0   0   0   0   0   0   0    0    0    1    1.64 ms
      Cursor home                0   0   0   0   0   0   0    0    1    x    1.64 ms
      Entry mode set             0   0   0   0   0   0   0    1    I/D  S    40 µs
      Display on/off control     0   0   0   0   0   0   1    D    U    B    40 µs
      Cursor/display shift       0   0   0   0   0   1   D/C  R/L  x    x    40 µs
      Function set               0   0   0   0   1   DL  N    F    x    x    40 µs
      Set CGRAM address          0   0   0   1   CGRAM address               40 µs
      Set DDRAM address          0   0   1   DDRAM address                   40 µs
      Read "BUSY" flag (BF)      0   1   BF  DDRAM address
      Write to CGRAM or DDRAM    1   0   D7  D6  D5  D4  D3   D2   D1   D0   40 µs
      Read from CGRAM or DDRAM   1   1   D7  D6  D5  D4  D3   D2   D1   D0   40 µs
      I/D  1 = Increment (by 1)             0 = Decrement (by 1)
      S    1 = Display shift on             0 = Display shift off
      D    1 = Display on                   0 = Display off
      U    1 = Cursor on                    0 = Cursor off
      B    1 = Cursor blink on              0 = Cursor blink off
      D/C  1 = Display shift                0 = Cursor shift
      R/L  1 = Shift right                  0 = Shift left
      DL   1 = 8-bit interface              0 = 4-bit interface
      N    1 = Display in two lines         0 = Display in one line
      F    1 = Character format 5x10 dots   0 = Character format 5x7 dots
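The bit layout in the command table can be turned into small helper functions that build a command byte from its flags. The following C sketch covers four of the commands; the function names are hypothetical and are not taken from the project code or from any LCD library.

```c
#include <stdint.h>

/* Entry mode set: 0 0 0 0 0 1 I/D S */
uint8_t lcd_cmd_entry_mode(int increment, int display_shift)
{
    return (uint8_t)(0x04 | (increment ? 0x02 : 0) | (display_shift ? 0x01 : 0));
}

/* Display on/off control: 0 0 0 0 1 D U B */
uint8_t lcd_cmd_display_control(int display_on, int cursor_on, int blink_on)
{
    return (uint8_t)(0x08 | (display_on ? 0x04 : 0) | (cursor_on ? 0x02 : 0) |
                     (blink_on ? 0x01 : 0));
}

/* Function set: 0 0 1 DL N F x x (the two "x" bits are sent as 0 here) */
uint8_t lcd_cmd_function_set(int eight_bit, int two_lines, int large_font)
{
    return (uint8_t)(0x20 | (eight_bit ? 0x10 : 0) | (two_lines ? 0x08 : 0) |
                     (large_font ? 0x04 : 0));
}

/* Set DDRAM address: 1 followed by the 7-bit DDRAM address */
uint8_t lcd_cmd_set_ddram(uint8_t address)
{
    return (uint8_t)(0x80 | (address & 0x7F));
}
```

With these helpers, an 8-bit, two-line display would be configured with `lcd_cmd_function_set(1, 1, 0)`, which gives 0x38, and a message starting at address 00 hex would be preceded by `lcd_cmd_set_ddram(0x00)`, which gives 0x80. All of these bytes are sent with RS = 0.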
    • WHAT IS THE BUSY FLAG?
      Compared to the microcontroller, the LCD is an extremely slow component. For this reason, it was necessary to provide a signal which, upon command execution, indicates that the display is ready for the next piece of data. That signal, called the busy flag, can be read from line D7. The display is ready to receive new data when the voltage on this line is 0 V (BF = 0).
      6.7.5 | LCD Connecting
      Depending on how many lines are used for connecting the LCD to the microcontroller, there are 8-bit and 4-bit LCD modes. The appropriate mode is selected at the beginning of operation in a process called 'initialization'. The 8-bit LCD mode uses lines D0-D7 to transfer data, as explained on the previous page.
      The main purpose of the 4-bit LCD mode is to save valuable I/O pins of the microcontroller. Only the 4 higher bits (D4-D7) are used for communication, while the others may be left unconnected. Each piece of data is sent to the LCD in two steps: the four higher bits are sent first (normally through lines D4-D7), then the four lower bits. Initialization enables the LCD to link and interpret the received bits correctly.
      Fig.(6.46)
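The two-step transfer used in 4-bit mode amounts to splitting each byte into its upper and lower halves. A minimal C sketch, with hypothetical function names, is:

```c
#include <stdint.h>

/* Upper four bits of the byte: sent first over D4-D7. */
uint8_t lcd_high_nibble(uint8_t value)
{
    return (uint8_t)(value >> 4);
}

/* Lower four bits of the byte: sent second over the same lines. */
uint8_t lcd_low_nibble(uint8_t value)
{
    return (uint8_t)(value & 0x0Fu);
}

/* Fill `steps` with the two 4-bit transfers in the order they go to the LCD. */
void lcd_split_byte(uint8_t value, uint8_t steps[2])
{
    steps[0] = lcd_high_nibble(value);
    steps[1] = lcd_low_nibble(value);
}
```

For the character P (0101 0000), the first transfer carries 0101 and the second carries 0000. How each nibble is placed on the port pins and strobed into the LCD depends on the wiring, which is not shown here.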
    • Data is rarely read from the LCD (it is mainly transferred from the microcontroller to the LCD), so it is often possible to save an extra I/O pin by simply connecting the R/W pin to ground. Such a saving has its price. Messages are displayed normally, but it is not possible to read the busy flag, since it is not possible to read from the display at all. Fortunately, there is a simple solution. After sending a character or a command, it is important to give the LCD enough time to do its job. Since the execution of a command may last approximately 1.64 ms, it is sufficient to wait about 2 ms for the LCD.
      6.7.6 | LCD Initialization
      The LCD is automatically cleared when powered up. This takes approximately 15 ms. After this, it is ready for operation. The mode of operation is set by default, which means that:
      Display is cleared.
      Mode:
        DL = 1 - Communication through 8-bit interface
        N = 0 - Messages are displayed in one line
        F = 0 - Character font 5 x 8 dots
      Display/Cursor on/off:
        D = 0 - Display off
        U = 0 - Cursor off
        B = 0 - Cursor blink off
      Character entry:
        I/D = 1 - Displayed addresses are automatically incremented by 1
        S = 0 - Display shift off
      Automatic reset mostly occurs without any problems. Mostly, but not always! If for any reason the power supply voltage does not reach its full value within 10 ms, the display will behave completely unpredictably. If the power supply cannot meet that condition, or if completely safe operation is required, the process of initialization is applied. Initialization, among other things, performs a new reset, which enables the display to operate normally.
      There are two initialization algorithms. Which one is performed depends on whether the connection to the microcontroller is through a 4-bit or an 8-bit interface. In both cases, all that is left to do after initialization is to specify the basic commands and, of course, to display messages.
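When R/W is tied to ground and the busy flag cannot be read, the program has to wait a fixed time after every transfer. The C sketch below picks that time from the execution times listed in the command table of Section 6.7.4. The function names and the 25% safety margin are assumptions made for illustration, not values from the project code.

```c
#include <stdint.h>

/* Execution times from the command table in Section 6.7.4, in microseconds. */
#define LCD_SLOW_CMD_US 1640u /* Clear display, Cursor home */
#define LCD_FAST_CMD_US   40u /* every other command and every data write */

/* Listed execution time for one transfer (rs = 0 for a command, 1 for data). */
uint32_t lcd_exec_time_us(int rs, uint8_t byte)
{
    /* Slow commands: 0000 0001 (clear display) and 0000 001x (cursor home). */
    if (rs == 0 && (byte == 0x01 || (byte & 0xFE) == 0x02)) {
        return LCD_SLOW_CMD_US;
    }
    return LCD_FAST_CMD_US;
}

/* Delay to apply after the transfer: listed time plus an assumed 25% margin. */
uint32_t lcd_wait_us(int rs, uint8_t byte)
{
    uint32_t listed = lcd_exec_time_us(rs, byte);
    return listed + listed / 4u;
}
```

With this margin the slow commands wait 2050 µs, close to the 2 ms suggested above, while ordinary characters wait only 50 µs. A simpler program can wait 2 ms after every transfer, at the cost of slower display updates.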
    • CHAPTER 7 System Implementation
    • 7.1| INTRODUCTION
      This section gives an overview of the whole project. The project aims to help visually impaired people deal with the problems they face in daily life. The work passed through several stages: searches, surveys, finding sponsors to help us reach the best form of the product, design, development, prototyping, and finally the finished product.
      Our system consists of two parts, hardware and software.
      The software is an outdoor navigation application with online and offline modes, designed initially using MATLAB. The user only has to say the place he wants to go to, and there are two cases:
      Case 1: if GPS is on, the code receives the GPS data, compares it with the database, and takes a specific action depending on the result.
      Case 2: if GPS is off, the code loads the pre-saved maps from the database and, from the user's speed and the calculated length of the road, computes the time between successive orders.
      The hardware section is an ultrasound sensor connected to a vibration motor. The sensor measures the distance to an obstacle, and the shorter the distance, the faster the motor vibrates. An RFID reader connected to an MP3 player helps the user identify the objects he usually uses. We arrived at these ideas and applications through searches and surveys.
      7.2| SURVEYS
      We wanted to make a product that solves real problems, so we went to several non-profit organizations, especially RESALA, which helped us meet visually impaired volunteers several times to learn about the real, high-risk problems they face. As a result we re-ordered the applications planned for the project. The ultrasound sensor is the most important part, because it helps them move freely. Next comes the RFID, which helps them identify the objects they usually use. Then comes outdoor navigation, which we implemented with MATLAB. The next stage was searches.
    • 7.3| SEARCHES
      We started searching for the best way to reach our goal. At first we wanted to build the sensor, indoor and outdoor navigation, and a security system. In this stage we gathered all the technical data we needed to start designing the systems.
      7.3.1| Ultrasound Sensor
      We searched for a suitable ultrasound sensor module for the system. We decided to use the MaxSonar, but because of its high price we decided to use the HC-SR04 sensor temporarily and move to the MaxSonar later.
      7.3.2| Indoor Navigation System
      This part needs an RFID reader with a suitable range of at least 2 meters. We initially started with image processing, looked for courses on it, and then began designing the code with the aid of the MATLAB help.
      7.3.3| Outdoor Navigation
      We searched for the best available GPS module and chose the MediaTek MT3329 GPS module because of the following:
      - Based on MediaTek Single Chip Architecture.
      - Dimension: 16mm x 16mm x 6mm
      - L1 Frequency, C/A code, 66 channels
      - High Sensitivity: Up to -165dBm tracking, superior urban performances
      - Position Accuracy: < 3m CEP (50%) without SA (horizontal)
      - Cold Start is under 35 seconds (Typical)
      - Warm Start is under 34 seconds (Typical)
      - Hot Start is under 1 second (Typical)
      - Low Power Consumption: 48mA @ acquisition, 37mA @ tracking
      - Low shut-down current consumption: 15uA, typical
      - DGPS (WAAS, EGNOS, MSAS) support (optional by firmware)
      - USB/UART Interface
      - Support AGPS function (Offline mode: EPO valid up to 14 days)
      - Includes a Molex cable adapter, 5 cm
      - Includes the new basic adapter
      - Weight: 0.3oz; 8 g
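The online navigation case compares the position received from the GPS module with positions saved in the database. The following C sketch shows one way this comparison could look. It assumes that the module reports coordinates as NMEA 0183 fields in ddmm.mmmm form, which is an assumption about the module output and not something stated in this chapter. The function names and the tolerance parameter are hypothetical, and the project's own implementation was written in MATLAB.

```c
/*
 * Convert an NMEA coordinate field (ddmm.mmmm or dddmm.mmmm) and its
 * hemisphere letter (N, S, E or W) to signed decimal degrees.
 */
double nmea_to_degrees(double ddmm, char hemisphere)
{
    int degrees = (int)(ddmm / 100.0);              /* leading dd or ddd part */
    double minutes = ddmm - (double)degrees * 100.0; /* remaining mm.mmmm part */
    double result = (double)degrees + minutes / 60.0;

    if (hemisphere == 'S' || hemisphere == 'W') {
        result = -result;
    }
    return result;
}

/*
 * Return 1 if the current position lies within `tol_deg` degrees of a
 * saved waypoint in both latitude and longitude, otherwise 0.
 */
int at_waypoint(double lat, double lon, double wp_lat, double wp_lon, double tol_deg)
{
    double dlat = lat - wp_lat;
    double dlon = lon - wp_lon;

    if (dlat < 0.0) dlat = -dlat;
    if (dlon < 0.0) dlon = -dlon;
    return (dlat <= tol_deg && dlon <= tol_deg) ? 1 : 0;
}
```

A tolerance is used instead of an exact equality test because two GPS readings taken at the same spot rarely agree in every digit. When `at_waypoint` returns 1, the program would play the direction message stored for that waypoint.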
    • We then looked for a way to connect the GPS module to MATLAB, so we bought an FTDI cable to connect the module to the PC.
      7.4| SPONSORS
      We found it difficult to convert our project entirely to hardware, so we decided to look for sponsors to help us choose the best way to reach our goal. After meeting our sponsors several times, we reached the present vision of our project. We looked for sponsors in the fields of embedded systems, medical equipment and programming.
      We wanted to design auditory outdoor navigation with a PIC, but we found it very difficult, and it needs technical and financial support that does not exist in Egypt, so we built this application using MATLAB. We wanted to design indoor navigation using RFID, but after surveying with the Resala non-profit organization we found that the users do not need it, so we cancelled this part and replaced it with object identification using RFID. We first used one sensor to calculate the distance, and then developed the design to use 3 sensors in 3 directions in order to find the way the user must go.
      Our sponsors are Futek, which works in embedded systems and power saving and which we need for designing the system, finishing it and getting the final product, and Brilliance, which we will need as technical support to help us find the best technical solutions for any problems.
      7.5| PRE-DESIGN
      First we set the acceptable specifications we need in our project.
      Table(7.2): Acceptable specifications
      No.  Need                                       Imp
      1    The suspension  Acceptable Range           5
      2    The suspension  Used Outdoor               5
      3    The suspension  Low Cost                   5
      4    The suspension  Light Intensity            5
      5    The suspension  Low Power                  5
      6    The suspension  Good style and finishing   3
      7    The suspension  Arabic                     5
      8    The suspension  Clear Voice                4
      9    The suspension  Easy to use                5
      10   The suspension  High Quality Materials     3
      7.5.1| List Of Metrics
      Table(7.2): List Of Metrics
      [Needs-metrics matrix. Legible row labels (needs): 1 Fast response; 2 Acceptable Range; 3 Low cost; 4 Used outdoor; 5 Light Intensity; 6 Low power. Legible column labels (metrics): 6 Meters Ultrasound Range; Mp3 Module; Very Small Size; Weather Resistance; Use Cam Module with uart data transfer; Two Buttons to choose Mode; 2.5V to start work; Design it a watch; 1.9cm * 2.1cm US Module; Fast Cycle Measurement. The √ marks relate needs to metrics.]
    • Table(7.2): List Of Metrics (continued)
      [Legible row labels (needs): 7 Good style and Finishing; 8 Arabic Language; 9 Easy to Use; 10 High quality materials; 11 Clear Voice.]
      7.5.2| Competitive Benchmarking Information
      Table(7.3): Competitive Benchmarking Information
      Metric No.  Needs no.     Metric                                   Imp  Units  Graduation project in Mansoura University  Our Project
      1           1             6 meters ultrasound range                5    m      0.3                                        3
      2           2,4,8,10,11   MP3 module                               4                                                      Found
      3           3,10          Weather resistance                       5           Not Found                                  Found
      4           6             Low power                                4    V      5V                                         2.5V
      5           7,10          Good design                              3           Bag                                        Stick (or) Bracelet
      6           2             Use cam module with UART data transfer   4           Found                                      Found
      7           9             Buttons to choose the mode               3           More than 5 buttons                        2 buttons
      8           5,7           Small dimensions US module               4    cm     4cm*2cm                                    1.9cm*2.1cm
    • 7.5.3| Ideal And Marginally Acceptable Target Values
      Table(7.4): Ideal and Marginally Acceptable Target Values
      Metric No.  Needs no.     Metric                                   Imp  Units  Marginal Values  Ideal Values
      1           1             6 meters ultrasound range                5    m      3                6
      2           2,4,8,10,11   MP3 module                               4    MB     32               64
      3           3,10          Weather resistance                       5           Not Found        Found
      4           6             Low power                                4    V      5V               2.5V
      5           7,10          Good design                              3           Bag              Stick (or) Bracelet
      6           2             Use cam module with UART data transfer   4           Found            Found
      7           9             Buttons to choose the mode               3           2 buttons        No buttons
      8           5,7           Small dimensions US module               4    cm     1.9cm*2.1cm      1.9cm*2.1cm
      7.5.4| Time Plan Diagram
      Fig.(7.1): Time plan (share of total time per activity): design 25.62; lectures 21.2; exams 21.2; other activities 18.18; develop 8.5; finishing 2.01; travelling 1.7; play 1.59.
    • 7.6| DESIGN
      We designed speech recognition, outdoor mapping using the Mapping Toolbox, indoor mapping using image processing, the ultrasound sensor using mikroC, and a GUI using MATLAB.
      7.6.1| Speech Recognition:
      Speech processing steps for one sample:
      [Figure: speech processing steps]
      The processed sample is then put with the other samples in the dataset, which is used for training with neural networks (NNs).
    • Fig.(7.3): Screenshot of some parts of speech recognition code
    • 7.6.2| Ultrasound Sensor
      Fig.(7.4): Screenshot of some parts of ultrasonic sensor code
      Fig.(7.5): Simulation of ultrasonic sensor circuit
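The two computations behind the ultrasound subsystem, turning an echo time into a distance and turning a distance into a vibration strength, can be sketched in C as follows. This is not the project's mikroC code. It assumes the HC-SR04 sensor, whose echo pulse width in microseconds divided by 58 gives the distance in centimeters, and an 8-bit PWM duty value for the motor. The function names, the 400 cm range limit and the linear mapping are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed usable range of the sensor in centimeters. */
#define MAX_RANGE_CM 400u

/*
 * HC-SR04: the echo pulse width covers the trip to the obstacle and back.
 * Dividing the width in microseconds by 58 gives the distance in cm.
 */
uint32_t echo_us_to_cm(uint32_t echo_us)
{
    return echo_us / 58u;
}

/*
 * Map a distance to a PWM duty value (0-255) for the vibration motor:
 * the shorter the distance, the higher the duty. Beyond the assumed
 * range the motor is off.
 */
uint8_t vibration_duty(uint32_t distance_cm)
{
    if (distance_cm >= MAX_RANGE_CM) {
        return 0;
    }
    return (uint8_t)(255u - (distance_cm * 255u) / MAX_RANGE_CM);
}
```

For example, an echo of 5800 µs corresponds to 100 cm, which this mapping turns into a duty of 192 out of 255. With three sensors, the same mapping could be applied to each direction separately.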
    • 7.6.3| Outdoor Navigation
      Mapping Code:
      Fig.(7.6): Screenshot of mapping code
      Fig.(7.7): Result of pre-map code
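In the offline case, the time between orders follows from the pre-saved length of each road segment and the user's speed. The C sketch below illustrates that calculation. The project's own version was written in MATLAB, and the function names and units (meters, meters per second, seconds) are assumptions made for illustration.

```c
/*
 * Time needed to walk one road segment: length divided by speed.
 * Returns -1.0 if the speed is not positive.
 */
double segment_time_s(double length_m, double speed_m_per_s)
{
    if (speed_m_per_s <= 0.0) {
        return -1.0;
    }
    return length_m / speed_m_per_s;
}

/*
 * Fill `delays_s` with the waiting time before each order, one per segment.
 * Returns the total walking time, or -1.0 if the speed is not positive.
 */
double order_delays(const double *lengths_m, int count, double speed_m_per_s,
                    double *delays_s)
{
    double total = 0.0;
    int i;

    if (speed_m_per_s <= 0.0) {
        return -1.0;
    }
    for (i = 0; i < count; i++) {
        delays_s[i] = segment_time_s(lengths_m[i], speed_m_per_s);
        total += delays_s[i];
    }
    return total;
}
```

For a path with segments of 90 m and 30 m walked at 1.5 m/s, the first order would be issued after 60 s and the next after a further 20 s.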
    • 7.7| PRODUCT ARCHITECTURE
      7.7.1| Product Schematic:
      For the visually impaired and blind women:
      Fig.(7.8): Visually impaired and blind women model. [Block diagram. User interface: Button 1 (Indoor), Button 2 (Outdoor). Main board: DC power supply, MCU. Input: ultrasound sensor modules, camera module. Output: MP3 module, speaker.]
    • For blind men:
      Fig.(7.9): Blind men model. [Block diagram. Main board: DC power supply, MCU. Input: ultrasound sensor modules. Output: MP3 module, speaker.]
      7.7.2| Rough Geometric Layout
      For blind men:
      Fig.(7.10): Blind men geometric layout. [White stick carrying the MP3 module, the speaker, and three ultrasound sensors (U_S) pointing left, forward and right.]
    • For the visually impaired and blind women:
      Fig.(7.11): [Layout with three ultrasound sensors (U_S) pointing left, center and right, plus Button 1 and Button 2.] This design is optional and is not the default design. The stick is the default design, and it is the one we worked on first.
      7.7.3| Incidental Interactions
      For blind men:
      Fig.(7.12): [Interaction diagram. Ultrasound module to MCU: analog data. MCU to MP3 module: SPI data, select mp3 file. MP3 module to speaker: PWM.]
    • For the visually impaired and blind women:
      Fig.(7.13): [Interaction diagram. Ultrasound module to MCU: analog data. RFID module to MCU: send image data. MCU to MP3 module: SPI data, select mp3 file. MP3 module to speaker: PWM.]
      7.8| DEFINING SECONDARY SYSTEMS
      • Power button (on/off).
      • Speaker (connected to the MP3 module).
      • LED connected to the MP3 module to show the status.
      • Mute button.
      • USB interface to connect the module to a computer to download files.
      • Rechargeable power supply.
      7.9| DETAILED INTERFACE SPECIFICATIONS
      Table(7.5)
      Line  Name    Properties
      1     Power   5 V
      2     Ground  0 V
      3     Input   analog
    • Fig.(7.14): [Control unit connected to the ultrasound sensor through lines 1 to 3.]
      Table(7.6)
      Line  Name    Properties
      1     Power   3.3 V
      2     Power   3.3 V
      3     Ground  0 V
      4     SPI     data
      5     SPI     CS
      6     SPI     clk
      Fig.(7.15): [Control unit connected to the MP3 module through lines 1 to 6.]
      7.10| ESTABLISHING THE ARCHITECTURE OF THE CHUNKS
      [Block diagram: WT588D-U module with Choose mode, Control port, USB Download and Voice output.]
    • CHAPTER 8 Conclusion
    • 8.1| INTRODUCTION
      Our purpose was to make a project that solves a real problem in daily life or eases the difficulties that some people face. As we said before, blind or deaf people need special care and special devices to make their life easier. After the survey we carried out and our meetings with technical and marketing sponsors, we chose the most wanted applications: outdoor navigation, the ultrasound sensor and the object identifier. We cancelled indoor navigation because the users do not need it. An overview of each part follows.
      8.2| OVERVIEW
      8.2.1| Outdoor navigation
      Outdoor navigation online
      In this part we use two subsystems:
      1. Speech recognition.
      2. Serial communication to read GPS.
      First, the user speaks to choose the place or location he wants to go to. The GPS module is then activated to detect the current location. The program we made compares the received data with the pre-saved data and takes a specific action depending on the result. There are 2 cases:
      Case 1: the 2 values are not equal, so no action is taken.
      Case 2: the 2 values are equal. For this location the program outputs a sound giving the direction he must go (forward, right or left) or telling him that he has arrived.
      Outdoor navigation offline
      In this part we use 3 subsystems:
      1. Speech recognition.
      2. Image processing.
      3. GUI.
      First, the user speaks to choose the place or location he wants to go to. The GUI then loads the image, and he enters his velocity through the GUI. The code loads the map of this path and calculates the time the user needs to finish the way by dividing the pre-saved length of the road by the velocity. It also calculates the delay between successive orders, and takes a specific action at each one.
      8.2.2| Ultrasound sensor
      This part calculates the distance between the user and any barrier in his way. The sensor is connected to a DC vibration motor, and the motor's speed increases as the distance decreases.
      8.2.3| Object identifier
      In this part we use an RFID reader connected to a microcontroller and an MP3 module. Every object carries a tag. When the user's hand approaches an object, the reader activates the tag, and the tag sends its ID to the reader, which passes it to the PIC. Depending on the ID, the PIC plays a specific WAV file saved in the MP3 module, which contains the name of the object.
      8.3| FEATURES
      We plan to develop the project further and add more features to solve as many problems as we can. These features are:
      • Help him read books.
      • Help him shop easily.
      • Help him find his lost objects.
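The object identifier of Section 8.2.3 reduces to a lookup from a tag ID to the number of the sound file that names the object. A minimal C sketch of such a lookup is given below. The tag IDs, track numbers and object names are invented for illustration, and the way the reader delivers the ID and the way the MP3 module is told to play a track are not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per tagged object: the tag ID and the sound file that names it. */
typedef struct {
    uint32_t tag_id; /* ID reported by the RFID reader (hypothetical values) */
    int track;       /* index of the WAV file stored in the MP3 module */
} TagEntry;

/* Hypothetical table: in a real system it would list the user's own objects. */
const TagEntry TAG_TABLE[] = {
    { 0x0A1B2C3Du, 0 }, /* e.g. "keys" */
    { 0x0A1B2C3Eu, 1 }, /* e.g. "wallet" */
    { 0x0A1B2C3Fu, 2 }  /* e.g. "medicine" */
};

/* Return the track number for a tag ID, or -1 if the tag is unknown. */
int track_for_tag(uint32_t tag_id)
{
    size_t i;
    size_t count = sizeof TAG_TABLE / sizeof TAG_TABLE[0];

    for (i = 0; i < count; i++) {
        if (TAG_TABLE[i].tag_id == tag_id) {
            return TAG_TABLE[i].track;
        }
    }
    return -1;
}
```

Returning -1 for an unknown tag lets the caller stay silent, or play a generic message, instead of announcing the wrong object.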
    • Appendix
    • Appendix A: GUI
      A.1 | INTRODUCTION
      A.1.1 | What Is a GUI?
      A graphical user interface (GUI) is a graphical display in one or more windows containing controls, called components, that enable a user to perform interactive tasks. The user of the GUI does not have to create a script or type commands at the command line to accomplish the tasks. Unlike coding programs to accomplish tasks, the user of a GUI need not understand the details of how the tasks are performed.
      GUI components can include menus, toolbars, push buttons, radio buttons, list boxes, and sliders, just to name a few. GUIs created using MATLAB® tools can also perform any type of computation, read and write data files, communicate with other GUIs, and display data as tables or as plots.
      The following figure illustrates a simple GUI that you can easily build yourself.
      Fig.(A.1): A simple GUI
      The GUI contains:
      • An axes component
      • A pop-up menu listing three data sets that correspond to MATLAB functions: peaks, membrane, and sinc
      • A static text component to label the pop-up menu
      • Three buttons that provide different kinds of plots: surface, mesh, and contour
      When you click a push button, the axes component displays the selected data set using the specified type of 3-D plot.
      A.1.2 | How Does a GUI Work?
      In the GUI described in “What Is a GUI?”, the user selects a data set from the pop-up menu, then clicks one of the plot type buttons. The mouse click invokes a function that plots the selected data in the axes.
      Most GUIs wait for their user to manipulate a control, and then respond to each action in turn. Each control, and the GUI itself, has one or more user-written routines (executable MATLAB code) known as callbacks, named for the fact that they “call back” to MATLAB to ask it to do things. The execution of each callback is triggered by a particular user action such as pressing a screen button, clicking a mouse button, selecting a menu item, typing a string or a numeric value, or passing the cursor over a component. The GUI then responds to these events. You, as the creator of the GUI, provide callbacks which define what the components do to handle events.
      This kind of programming is often referred to as event-driven programming. In the example, a button click is one such event. In event-driven programming, callback execution is asynchronous, that is, it is triggered by events external to the software. In the case of MATLAB GUIs, most events are user interactions with the GUI, but the GUI can respond to other kinds of events as well, for example, the creation of a file or connecting a device to the computer.
      A.1.3 | How can you code callbacks?
      You can code callbacks in two distinct ways:
      • As MATLAB language functions stored in files
      • As strings containing MATLAB expressions or commands (such as 'c = sqrt(a*a + b*b);' or 'print')
    • Using functions stored in code files as callbacks is preferable to using strings, as functions have access to arguments and are more powerful and flexible. MATLAB scripts (sequences of statements stored in code files that do not define functions) cannot be used as callbacks.
      Although you can provide a callback with certain data and make it do anything you want, you cannot control when callbacks will execute. That is, when your GUI is being used, you have no control over the sequence of events that trigger particular callbacks or what other callbacks might still be running at those times. This distinguishes event-driven programming from other types of control flow, for example, processing sequential data files.
      A.1.4 | Where Do I Start?
      Ways to Build MATLAB GUIs
      A MATLAB GUI is a figure window to which you add user-operated controls. You can select, size, and position these components as you like. Using callbacks you can make the components do what you want when the user clicks or manipulates them with keystrokes.
      You can build MATLAB GUIs in two ways:
      • Use GUIDE (GUI Development Environment), an interactive GUI construction kit.
      • Create code files that generate GUIs as functions or scripts (programmatic GUI construction).
      The first approach starts with a figure that you populate with components from within a graphic layout editor. GUIDE creates an associated code file containing callbacks for the GUI and its components. GUIDE saves both the figure (as a FIG-file) and the code file. Opening either one also opens the other to run the GUI.
      In the second, programmatic, GUI-building approach, you create a code file that defines all component properties and behaviors; when a user executes the file, it creates a figure, populates it with components, and handles user interactions. The figure is not normally saved between sessions because the code in the file creates a new one each time it runs.
    • As a result, the code files of the two approaches look different. Programmatic GUI files are generally longer, because they explicitly define every property of the figure and its controls, as well as the callbacks. GUIDE GUIs define most of the properties within the figure itself. They store the definitions in its FIG-file rather than in its code file. The code file contains callbacks and other functions that initialize the GUI when it opens.
      MATLAB software also provides functions that simplify the creation of standard dialog boxes, for example to issue warnings or to open and save files. The GUI-building technique you choose depends on your experience, your preferences, and the kind of application you need the GUI to operate. This table outlines some possibilities.
      Table(A.1): GUI Technique
      Type of GUI: Dialog box
      Technique: MATLAB software provides a selection of standard dialog boxes that you can create with a single function call. For an example, see the documentation for msgbox, which also provides links to functions that create specialized predefined dialog boxes.
      Type of GUI: GUI containing just a few components
      Technique: It is often simpler to create GUIs that contain only a few components programmatically. You can fully define each component with a single function call.
      Type of GUI: Moderately complex GUIs
      Technique: GUIDE simplifies the creation of moderately complex GUIs.
      Type of GUI: Complex GUIs with many components, and GUIs that require interaction with other GUIs
      Technique: Creating complex GUIs programmatically lets you control exact placement of the components and provides reproducibility.
      You can combine the two approaches to some degree. You can create a GUI with GUIDE and then modify it programmatically. However, you cannot create a GUI programmatically and later modify it with GUIDE.
      A.2 | WHAT IS GUIDE?
    • GUIDE, the MATLAB Graphical User Interface Development Environment, provides a set of tools for creating graphical user interfaces (GUIs). These tools greatly simplify the process of laying out and programming GUIs.
      Opening GUIDE
      There are several ways to open GUIDE from the MATLAB command line.
      Table(A.2): Ways to open GUIDE from the MATLAB command line
      Command              Result
      guide                Opens GUIDE with a choice of GUI templates
      guide FIG-file name  Opens FIG-file name in GUIDE
      You can also right-click a FIG-file in the Current Folder Browser and select Open in GUIDE from the context menu.
      Fig.(A.2)
      When you right-click a FIG-file in this way, the figure opens in the GUIDE Layout Editor, where you can work on it.
      Fig.(A.3)
      All tools in the tool palette have tool tips. Setting a GUIDE preference lets you display the palette in GUIDE with tool names or just their icons.
      A.2.1 | Getting Help in GUIDE
      When you open GUIDE to create a new GUI, a gridded layout area displays. It has a menu bar and toolbar above it, a tool palette to its left, and a status bar below it, as shown below. See “GUIDE Tools Summary” for a full description. At any point, you can access help topics from the GUIDE Help menu, shown in the following illustration.
      The first three options lead you to topics in the GUIDE documentation that can help you get started using GUIDE. The Example GUIs option opens a list of complete examples of GUIs built using GUIDE that you can browse, study, open in GUIDE, and run.
      The bottom option, Online Video Demos, opens a list of GUIDE- and related GUI-building video tutorials on MATLAB Central. You can access MATLAB video demos, as well as the page on MATLAB Central, by clicking links in the following table.
    • Table(A.3): Video demo links
      Type of video: MATLAB New Feature Demos
      Video content:
      • New Graphics and GUI Building Features in Version 7.6 (9 min, 31 s)
      • New Graphics and GUI Building Features in Version 7.5 (2 min, 47 s)
      • New Creating Graphical User Interfaces features in Version 7 (4 min, 24 s)
      Type of video: MATLAB Central Video Tutorials
      Video content:
      • Archive for the “GUI or GUIDE” Category from 2005 to present.
      A.2.2 | Laying Out a GUIDE GUI
      The GUIDE Layout Editor enables you to populate a GUI by clicking and dragging GUI components into the layout area.
      Fig.(A.4)
      There you can resize, group and align buttons, text fields, sliders, axes, and other components you add. Other tools accessible from the Layout Editor enable you to:
      • Create menus and context menus
      • Create toolbars
      • Modify the appearance of components
      • Set tab order
      • View a hierarchical list of the component objects
      • Set GUI options
      A.2.3 | Programming a GUIDE GUI
      When you save your GUI layout, GUIDE automatically generates a file of MATLAB code for controlling the way the GUI works. This file contains code to initialize the GUI and organizes the GUI callbacks. Callbacks are functions that execute in response to user-generated events, such as a mouse click. Using the MATLAB editor, you can add code to the callbacks to perform the functions you want.
      Simple GUIDE GUI Example
      Simple GUIDE GUI Components
      This section shows you how to use GUIDE to create the graphical user interface (GUI) shown in the following figure.
      Fig.(A.5)
    • Appendix A: GUI Fig.(A.6) To use the GUI, select a data set from the pop-up menu, then click one of the plot-type buttons. Clicking the button triggers the execution of a callback that plots the selected data in the axes. Lay Out the Simple GUI in GUIDE Open a New GUI in the GUIDE Layout Editor 1 Start GUIDE by typing guide at the MATLAB prompt. The GUIDE Quick Start dialog displays, as shown in the following figure. 2 In the GUIDE Quick Start dialog box, select the Blank GUI (Default) template. Click OK to display the blank GUI in the Layout Editor, as shown in the following figure. 169
    • Fig.(A.7)
      3 Display the names of the GUI components in the component palette. Select File > Preferences, then select GUIDE > Show names in component palette, and click OK. The Layout Editor then appears as shown in the following figure.
      Fig.(A.8)
    • Add Components to the Simple GUIDE GUI
      1 Add the three push buttons to the GUI. Select the push button tool from the component palette at the left side of the Layout Editor and drag it into the layout area. Create three buttons this way, positioning them approximately as shown in the following figure.
      Fig.(A.9)
      2 Add the remaining components to the GUI:
    • A static text area
    • A pop-up menu
    • An axes
      Arrange the components as shown in the following figure. Resize the axes component to approximately 2-by-2 inches.
      Align the Components
      If several components have the same parent, you can use the Alignment Tool to align them to one another. To align the three push buttons:
      1 Select all three push buttons by pressing Ctrl and clicking them.
      2 Select Tools > Align Objects.
      3 Make these settings in the Alignment Tool, as shown in the following figure:
    • Left-aligned in the horizontal direction
    • 20 pixels spacing between push buttons in the vertical direction
      Fig.(A.10)
      4 Click OK. Your GUI now looks like this in the Layout Editor.
      Fig.(A.11)
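The two alignment settings above (left-aligned, 20 pixels of vertical spacing) amount to a small piece of arithmetic on the button positions. The following Python sketch illustrates that arithmetic only; it is not MATLAB code and not how the Alignment Tool is implemented. The `Box` type, the sample coordinates, and the choice to keep the topmost button fixed are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box:
    """Hypothetical component rectangle in pixels, origin at the bottom-left."""
    name: str
    x: int
    y: int
    width: int
    height: int


def align_left_and_space(boxes, gap=20):
    """Left-align the boxes and stack them with `gap` pixels between them.

    Assumption: the topmost box keeps its position and the others move below it.
    """
    if not boxes:
        return []
    left = min(b.x for b in boxes)
    # Order from top to bottom by the position of each box's top edge.
    ordered = sorted(boxes, key=lambda b: b.y + b.height, reverse=True)
    result = []
    next_top = ordered[0].y + ordered[0].height
    for b in ordered:
        new_y = next_top - b.height
        result.append(replace(b, x=left, y=new_y))
        next_top = new_y - gap  # leave `gap` pixels before the next box
    return result


# Roughly placed buttons, as they might look after dragging them by hand.
buttons = [
    Box("pushbutton1", x=42, y=300, width=80, height=30),
    Box("pushbutton2", x=38, y=251, width=80, height=30),
    Box("pushbutton3", x=45, y=196, width=80, height=30),
]
aligned = align_left_and_space(buttons, gap=20)
for b in aligned:
    print(b.name, b.x, b.y)
```

After the call, all three buttons share the same left edge and each button's top edge sits 20 pixels below the bottom edge of the one above it.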
    • Label the Push Buttons
      Each of the three push buttons lets the GUI user choose a plot type: surf, mesh, and contour. This topic shows you how to label the buttons with those choices.
      1 Select Property Inspector from the View menu.
      Fig.(A.12)
      2 In the layout area, select the top push button by clicking it.
      Fig.(A.13)
      3 In the Property Inspector, select the String property and then replace the existing text with the word Surf.
      4 Click outside the String field. The push button label changes to Surf.
    • Fig.(A.14)
      5 Select each of the remaining push buttons in turn and repeat steps 3 and 4. Label the middle push button Mesh, and the bottom button Contour.
      List Pop-Up Menu Items
      The pop-up menu provides a choice of three data sets: peaks, membrane, and sinc. These data sets correspond to MATLAB functions of the same name. This topic shows you how to list those data sets as choices in the pop-up menu.
      1 In the layout area, select the pop-up menu by clicking it.
      2 In the Property Inspector, click the button next to String. The String dialog box displays.
      Fig.(A.15)
      3 Replace the existing text with the names of the three data sets: Peaks, Membrane, and Sinc. Press Enter to move to the next line.
    • Fig.(A.16)
      4 When you have finished editing the items, click OK. The first item in your list, Peaks, appears in the pop-up menu in the layout area.
      Fig.(A.17)
    • Modify the Static Text
      In this GUI, the static text serves as a label for the pop-up menu. The user cannot change this text. This topic shows you how to change the static text to read Select Data.
      1 In the layout area, select the static text by clicking it.
      2 In the Property Inspector, click the button next to String. In the String dialog box that displays, replace the existing text with the phrase Select Data.
      3 Click OK. The phrase Select Data appears in the static text component above the pop-up menu.
      Fig.(A.18)
      Completed Simple GUIDE GUI Layout
      In the Layout Editor, your GUI now looks like this. The next step is to save the layout, as described in the next topic, “Save the GUI Layout”.
      Fig.(A.19)
    • A.2.4| Save the GUI Layout
      When you save a GUI, GUIDE creates two files, a FIG-file and a code file. The FIG-file, with extension .fig, is a binary file that contains a description of the layout. The code file, with extension .m, contains MATLAB functions that control the GUI.
      1 Save and activate your GUI by selecting Run from the Tools menu.
      2 GUIDE displays the following dialog box. Click Yes to continue.
      Fig.(A.20)
      3 GUIDE opens a Save As dialog box in your current folder and prompts you for a FIG-file name.
      Fig.(A.21)
    • 4 Browse to any folder for which you have write privileges, and then enter the filename simple_gui for the FIG-file. GUIDE saves both the FIG-file and the code file using this name.
      5 If the folder in which you save the GUI is not on the MATLAB path, GUIDE opens a dialog box, giving you the option of changing the current folder to the folder containing the GUI files, or adding that folder to the top or bottom of the MATLAB path.
      Fig.(A.22)
      6 GUIDE saves the files simple_gui.fig and simple_gui.m and activates the GUI. It also opens the GUI code file in your default editor.
      The GUI opens in a new window. Notice that the GUI lacks the standard menu bar and toolbar that MATLAB figure windows display. You can add your own menus and toolbar buttons with GUIDE, but by default a GUIDE GUI includes none of these components.
      When you operate simple_gui, you can select a data set in the pop-up menu and click the push buttons, but nothing happens. This is because the code file contains no statements to service the pop-up menu and the buttons.
    • Fig.(A.23)
      To run a GUI created with GUIDE without opening GUIDE, execute its code file by typing its name:
      simple_gui
      You can also use the run command with the code file, for example:
      run simple_gui
      Note: Do not attempt to run a GUIDE GUI by opening its FIG-file outside of GUIDE. If you do so, the figure opens and appears ready to use, but the GUI does not initialize and its callbacks do not function.
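The GUI saved above does nothing yet because no callback services the pop-up menu or the push buttons. As a language-neutral illustration of the callback idea from A.2.3, the following Python sketch wires one function to the pop-up menu and one to each button. It is not the code that GUIDE generates, and every name in it (`state`, `popup_callback`, `make_plot_callback`, `click`) is hypothetical; a real callback would draw into the axes instead of returning a string.

```python
# Shared state, standing in for whatever data the GUI keeps between events.
state = {"current_data": "Peaks"}  # Peaks is the first pop-up menu item


def popup_callback(selected_item):
    """Runs when the user picks a data set in the pop-up menu."""
    state["current_data"] = selected_item


def make_plot_callback(plot_type):
    """Build the callback for one push button."""
    def callback():
        # A real GUI would plot here; this sketch only reports the action.
        return f"{plot_type} plot of {state['current_data']}"
    return callback


# One callback per button label used in the layout.
callbacks = {
    "Surf": make_plot_callback("surf"),
    "Mesh": make_plot_callback("mesh"),
    "Contour": make_plot_callback("contour"),
}


def click(button_label):
    """Simulate a mouse click: look up the button's callback and run it."""
    return callbacks[button_label]()


popup_callback("Membrane")   # user selects a data set
print(click("Mesh"))         # user clicks a plot-type button
```

The point of the sketch is the division of labour: the pop-up menu callback only records the selection, and each button callback uses whatever selection is current when the click happens.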
    • Appendix B: RFID
      B.1| INTRODUCTION
      RFID stands for Radio-Frequency IDentification. The acronym refers to small electronic devices that consist of a small chip and an antenna. The chip is typically capable of carrying 2,000 bytes of data or less. RFID technology has been available for more than fifty years. An RFID system has three parts:
      1- A scanning antenna.
      2- A transceiver with a decoder to interpret the data.
      3- A transponder - the RFID tag - that has been programmed with information.
      B.2| HOW RFID WORKS
      The scanning antenna puts out radio-frequency signals over a relatively short range. The RF radiation does two things:
    • It provides a means of communicating with the transponder (the RFID tag).
    • It provides the RFID tag with the energy to communicate (in the case of passive RFID tags).
      This is an absolutely key part of the technology: passive RFID tags do not need to contain batteries, and can therefore remain usable for very long periods of time (maybe decades).
      The scanning antennas can be permanently affixed to a surface; handheld antennas are also available. They can take whatever shape you need; for example, you could build them into a door frame to accept data from persons or objects passing through.
      When an RFID tag passes through the field of the scanning antenna, it detects the activation signal from the antenna. That "wakes up" the RFID chip, and it transmits the information on its microchip to be picked up by the scanning antenna.
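The wake-up-and-reply exchange described above can be sketched as a toy model in Python. This is a simplified illustration under stated assumptions, not a model of any real reader: the class names, the distances, and the 1.0 m field range are hypothetical, and radio effects and collisions between replies are ignored. The 2,000-byte limit is the typical chip capacity quoted in B.1.

```python
from dataclasses import dataclass
from typing import Optional

MAX_PAYLOAD_BYTES = 2000  # typical chip capacity quoted in B.1


@dataclass
class PassiveTag:
    """Hypothetical passive tag: no battery, powered only by the reader's field."""
    tag_id: str
    payload: bytes
    distance_m: float  # assumed distance from the scanning antenna

    def __post_init__(self):
        if len(self.payload) > MAX_PAYLOAD_BYTES:
            raise ValueError("payload exceeds the chip capacity")

    def respond(self, field_range_m: float) -> Optional[bytes]:
        """Return the stored data if the tag is inside the reader's field."""
        if self.distance_m <= field_range_m:
            return self.payload  # the field "wakes up" the chip
        return None  # no field, no energy: the tag stays silent


class Reader:
    """Hypothetical scanning antenna plus transceiver."""

    def __init__(self, field_range_m: float):
        self.field_range_m = field_range_m

    def scan(self, tags):
        """Energize the field once and collect every reply, keyed by tag id."""
        replies = {}
        for tag in tags:
            data = tag.respond(self.field_range_m)
            if data is not None:
                replies[tag.tag_id] = data
        return replies


tags = [
    PassiveTag("door-tag", b"Room 101", distance_m=0.4),
    PassiveTag("far-tag", b"Room 202", distance_m=3.0),
]
replies = Reader(field_range_m=1.0).scan(tags)
print(replies)
```

Only the tag inside the assumed field range answers; the other tag receives no energy and therefore transmits nothing.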
    • In addition, an RFID tag may be one of two types. Active RFID tags have their own power source; the advantage of these tags is that the reader can be much farther away and still get the signal. Even though some of these devices are built to last up to 10 years, their life span is limited. Passive RFID tags, however, do not require batteries, can be much smaller, and have a virtually unlimited life span.
      RFID tags can be read in a wide variety of circumstances where barcodes or other optically read technologies are useless:
    • The tag need not be on the surface of the object (and is therefore not subject to wear).
    • The read time is typically less than 100 milliseconds.
    • Large numbers of tags can be read at once rather than item by item.
      B.3| TECHNICAL PROBLEMS WITH RFID
      1- Problems with RFID Standards
      RFID has been implemented in different ways by different manufacturers; global standards are still being worked on. It should be noted that some RFID devices are never meant to leave their network. This can cause problems for companies.
      2- RFID Systems Can Be Easily Disrupted
      Since RFID systems make use of the electromagnetic spectrum (like Wi-Fi networks or cell phones), they are relatively easy to jam using energy at the right frequency. Although this would only be an inconvenience for consumers in stores (longer waits at the checkout), it could be disastrous in other environments where RFID is increasingly used, like hospitals or the military in the field. Also, active RFID tags (those that use a battery to increase the range of the system) can be repeatedly interrogated to wear the battery down, disrupting the system.
      3- RFID Reader Collision
    • Reader collision occurs when the signals from two or more readers overlap. The tag is unable to respond to simultaneous queries. Systems must be carefully set up to avoid this problem; many systems use an anti-collision protocol (also called a singulation protocol). Anti-collision protocols enable the tags to take turns in transmitting to a reader.
      4- RFID Tag Collision
      Tag collision occurs when many tags are present in a small area; but since the read time is very fast, it is easier for vendors to develop systems that ensure that tags respond one at a time.
      B.4| SECURITY, PRIVACY AND ETHICS PROBLEMS WITH RFID
      1- The contents of an RFID tag can be read after the item leaves the supply chain
      An RFID tag cannot tell the difference between one reader and another. RFID scanners are very portable; RFID tags can be read from a distance, from a few inches to a few yards. This allows anyone to see the contents of your purse or pocket as you walk down the street. Some tags can be turned off when the item has left the supply chain.
      2- RFID tags are difficult to remove
      RFID tags are difficult for consumers to remove; some are very small (less than a half-millimeter square, and as thin as a sheet of paper), while others may be hidden or embedded inside a product where consumers cannot see them. New technologies allow RFID tags to be "printed" right on a product, and such tags may not be removable at all.
      3- RFID tags can be read without your knowledge
      Since the tags can be read without being swiped or obviously scanned, anyone with an RFID tag reader can read the tags embedded in your clothes and other consumer products without your knowledge. For example, you could be scanned before you enter the store, just to see what you are carrying. You might then be approached by a clerk who knows what you have in your backpack or purse, and can suggest accessories or other items.
      4- RFID tags can be read at greater distances with a high-gain antenna
    • For various reasons, RFID reader/tag systems are designed so that the distance between the tag and the reader is kept to a minimum (see the material on tag collision above). However, a high-gain antenna can be used to read the tags from much further away, leading to privacy problems.
      5- RFID tags with unique serial numbers could be linked to an individual credit card number
      At present, the Universal Product Code (UPC) implemented with barcodes allows each product sold in a store to have a unique number that identifies that product. Work is proceeding on a global system of product identification that would allow each individual item to have its own number. When the item is scanned for purchase and is paid for, the RFID tag number for a particular item can be associated with a credit card number.
      B.5| RFID TAG
      An RFID tag is a microchip combined with an antenna in a compact package; the packaging is structured to allow the RFID tag to be attached to an object to be tracked. The tag's antenna picks up signals from an RFID reader or scanner and then returns the signal, usually with some additional data (like a unique serial number or other customized information). RFID tags can be very small - the size of a large rice grain. Others may be the size of a small paperback book.
      B.5.1| What Are Zombie RFID Tags?
      One of the main concerns with RFID tags is that their contents can be read by anyone with an appropriately equipped scanner - even after you take the item out of the store. One technology that has been suggested is a zombie RFID tag, a tag that can be temporarily deactivated when it leaves the store. The process would work like this: you bring your purchase up to the register, the RFID scanner reads the item, you pay for it, and as you leave the store, you pass a special device that sends a signal to the RFID tag to "die." That is, it is no longer readable. The "zombie" element comes in when you bring an item back to the store. A special device made for that kind of tag "re-animates" the RFID tag, allowing the item to reenter the supply chain.
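The anti-collision (singulation) idea from B.3, in which tags take turns transmitting to a reader, can be sketched in Python. The text does not say which protocol a given system uses, so the sketch below assumes a simple framed slotted-ALOHA style scheme chosen purely for illustration; the function name, slot count, and tag names are hypothetical.

```python
import random


def singulate(tag_ids, slots_per_round=8, seed=0, max_rounds=1000):
    """Identify tags one at a time with a framed slotted-ALOHA style scheme.

    Each round, every unread tag picks a random slot. A slot chosen by exactly
    one tag is read successfully; tags that share a slot collide and retry in
    the next round. Returns (order in which tags were read, rounds used).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs behave the same
    pending = set(tag_ids)
    read_order = []
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        slots = {}
        for tag in sorted(pending):  # sorted for reproducible slot choices
            slots.setdefault(rng.randrange(slots_per_round), []).append(tag)
        for slot in sorted(slots):
            occupants = slots[slot]
            if len(occupants) == 1:  # no collision in this slot
                read_order.append(occupants[0])
                pending.discard(occupants[0])
    return read_order, rounds


tags = [f"TAG-{n:03d}" for n in range(20)]
order, rounds = singulate(tags)
print(f"read {len(order)} tags in {rounds} rounds")
```

Because each successful slot removes one tag from contention, later rounds see fewer collisions, which is why many tags in a small area can still be read one at a time.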