In this paper, we present how Bell’s Palsy, a neurological disorder, can be detected just from a subject’s eyes in a video. We notice that Bell’s Palsy patients often struggle to blink their eyes on the affected side. As a result, we can observe a clear contrast between the blinking patterns of the two eyes. Although previous works did utilize images/videos to detect this disorder, none have explicitly focused on the eyes. Most of them require the entire face. One obvious advantage of having an eye-focused detection system is that subjects’ anonymity is not at risk. Also, our AI decisions are based on simple blinking patterns, making them explainable and straightforward. Specifically, we develop a novel feature called blink similarity, which measures the similarity between the two blinking patterns. Our extensive experiments demonstrate that the proposed feature is quite robust, for it helps in Bell’s Palsy detection even with very few labels. Our proposed eye-focused detection system is not only cheaper but also more convenient than several existing methods.
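The blink-similarity idea above can be sketched as a normalized cross-correlation between the two eyes' per-frame openness signals. This is an illustrative stand-in, not the paper's actual feature: the function name, the use of eye-aspect-ratio (EAR) style signals, and the normalization are all assumptions.

```python
import numpy as np

def blink_similarity(left_ear, right_ear):
    """Illustrative blink-similarity score: normalized cross-correlation
    of the two eyes' eye-aspect-ratio (EAR) time series.

    A healthy subject blinks both eyes together, so the two signals are
    highly correlated; a unilateral palsy lowers the score.
    """
    left = np.array(left_ear, dtype=float)    # copy so inputs are untouched
    right = np.array(right_ear, dtype=float)
    # Center each signal so the score reflects co-movement, not baseline.
    left -= left.mean()
    right -= right.mean()
    denom = np.linalg.norm(left) * np.linalg.norm(right)
    if denom == 0.0:
        return 0.0  # a perfectly flat signal carries no blink information
    return float(np.dot(left, right) / denom)

# Synchronized blinks (healthy) vs. one static eye (affected side).
t = np.linspace(0, 2 * np.pi, 100)
blink = 0.3 - 0.25 * np.exp(-((t - np.pi) ** 2))  # one blink dip mid-sequence
flat = np.full_like(blink, 0.3)                    # eye that never closes
print(blink_similarity(blink, blink))  # close to 1.0
print(blink_similarity(blink, flat))   # 0.0
```

A threshold on this score (a hypothetical design choice) would then separate symmetric from asymmetric blinking.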
Driving support systems, such as car navigation systems, are becoming common, and they assist the driver in several ways. A non-intrusive method for detecting fatigue and drowsiness, based on eye-blink count and eye-directed instruction control, helps the driver avoid collisions caused by drowsy driving. Eye detection and tracking under varying conditions of illumination, background, face alignment, and facial expression make the problem complex. This paper proposes a neural-network-based algorithm to detect the eyes efficiently. In the proposed algorithm, the neural network is first trained to reject non-eye regions using images of eye and non-eye features, with a Gabor filter and Support Vector Machines used to reduce dimensionality and classify efficiently. The face is first segmented in the L*a*b color space, and the eyes are then detected using HSV and a neural network approach. The algorithm was tested on nearly 100 images of different persons under different conditions, with satisfactory results and a success rate of 98%. The neural network was trained on 50 non-eye images and 50 eye images at different angles using a Gabor filter. This paper is part of the research project "Development of a Non-Intrusive System for Real-Time Monitoring and Prediction of Driver Fatigue and Drowsiness," sponsored by the Department of Science & Technology, Govt. of India, New Delhi, at Vignan Institute of Technology and Sciences, Vignan Hills, Hyderabad.
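The Gabor-filter feature step described above can be sketched with a hand-rolled filter bank. The kernel parameters, the orientation set, and the single centered-window response below are illustrative assumptions, not the paper's configuration, and the SVM classification stage is omitted.

```python
import numpy as np

def gabor_kernel(ksize=21, sigma=4.0, theta=0.0, lambd=10.0, gamma=0.5):
    """Real part of a Gabor filter: a Gaussian envelope modulating an
    oriented cosine carrier. Parameter values here are illustrative."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates to the filter orientation theta.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t ** 2 + gamma ** 2 * y_t ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * x_t / lambd)
    return envelope * carrier

def gabor_features(patch, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Response at several orientations -> small feature vector that a
    classifier (e.g. an SVM) could consume."""
    feats = []
    for theta in thetas:
        k = gabor_kernel(theta=theta)
        # One centered window keeps the sketch short; a full convolution
        # would scan every position.
        h, w = k.shape
        ph, pw = patch.shape
        r0, c0 = (ph - h) // 2, (pw - w) // 2
        window = patch[r0:r0 + h, c0:c0 + w]
        feats.append(float((window * k).sum()))
    return np.array(feats)

patch = np.random.default_rng(0).random((32, 32))  # stand-in eye patch
print(gabor_features(patch).shape)  # (4,)
```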
The document provides an overview of approaches to artificial vision and visual neuroprosthetics. It discusses the history of artificial vision from early electrical stimulation experiments to modern retinal implants. Current approaches to prosthetic rehabilitation include epiretinal, subretinal, transchoroidal, and optic nerve implants as well as cortical implants. Each approach has advantages and drawbacks related to safety, acuity restoration potential, and applicable patient populations. Key challenges to developing effective neuroprosthetics include electrode miniaturization, signal processing algorithms, and targeting specific cell types. The field remains nascent but promising strategies involving optogenetics and cell therapy are emerging.
The document summarizes research on automated face detection and recognition. It discusses common applications of face detection such as webcam tracking and photo tagging. Face recognition can be used for biometrics, mugshot databases, and detecting fake IDs. The document then compares human and computer abilities in face detection/recognition and describes the challenges computers face in representing multidimensional face data. It provides a brief history of the field and covers common approaches to face detection and recognition, including eigenfaces, Fisherfaces, neural networks, Gabor wavelets, and active shape models. The document also discusses challenges of 3D and video face recognition and of comparing face recognition systems.
Assistive System Using Eye Gaze Estimation for Amyotrophic Lateral Sclerosis ... - Editor IJCATR
In the later stages of the disease, amyotrophic lateral sclerosis (ALS) patients cannot control any muscles except those of the eyes. This paper aims to develop an eye-based assistive system controlled by eye gaze to help ALS patients improve their quality of life. Two main functions are proposed. The first, called HelpCall, detects the user's eye gaze to activate the corresponding events; ALS patients can "talk" with other people more easily by looking at specific buttons in the HelpCall system. The second is an eye-controlled browser that allows users to browse web pages on the Internet. We design an interface that embeds the IE browser into several buttons controlled by the user's eye gaze, so ALS patients can visit the Internet using only their eyes. This paper discusses our ideas for the assistive system and then describes the design and implementation of the proposed system in detail.
IRDO: Iris Recognition by Fusion of DTCWT and OLBP - IJERA Editor
This document proposes a new iris recognition method called IRDO that fuses Dual Tree Complex Wavelet Transform (DTCWT) and Overlapping Local Binary Pattern (OLBP) features. DTCWT is used to extract micro-texture features from the iris, while OLBP enhances the extraction of edge features. Fusing the two methods improves matching performance and classification accuracy over state-of-the-art techniques. The proposed IRDO method achieves higher iris recognition rates as measured by Total Success Rate and Equal Error Rate.
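The OLBP half of the fusion can be sketched as a stride-1 (overlapping-window) local binary pattern over the iris texture. This is one plausible reading of the operator, not the paper's exact definition, and the DTCWT branch is omitted entirely.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour LBP code for every interior pixel, computed over
    overlapping 3x3 windows (stride 1). Each neighbour that is >= the
    center contributes one bit, yielding a code in 0..255."""
    img = np.asarray(img, dtype=np.int32)
    c = img[1:-1, 1:-1]
    # Eight neighbours, clockwise from the top-left corner.
    neighbours = [img[:-2, :-2], img[:-2, 1:-1], img[:-2, 2:],
                  img[1:-1, 2:], img[2:, 2:], img[2:, 1:-1],
                  img[2:, :-2], img[1:-1, :-2]]
    codes = np.zeros(c.shape, dtype=np.int32)
    for bit, n in enumerate(neighbours):
        codes += (n >= c).astype(np.int32) << bit
    return codes

def lbp_histogram(img, bins=256):
    """Normalized code histogram: the feature vector a matcher would
    compare (e.g. by chi-squared distance)."""
    codes = lbp_codes(img)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / hist.sum()

img = np.random.default_rng(1).integers(0, 256, (64, 64))  # stand-in iris strip
print(lbp_histogram(img).shape)  # (256,)
```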
The document proposes a unified framework for iris recognition that addresses challenges in unconstrained acquisition, robust matching, and privacy. It uses random projections and sparse representations to select good quality iris images, recognize iris patterns in a single step, and introduce cancelable templates for enhanced privacy without compromising security or recognition performance. Experimental results on public datasets demonstrate benefits of the proposed approach for robust and accurate iris recognition.
1. Scientists are developing a bionic eye called an artificial retina that could restore vision. It works by using a small implanted chip with light-sensitive cells that transmit signals to the optic nerve and brain.
2. The chip detects incoming light and transmits electrical signals to stimulate remaining retinal cells. Early versions used silicon but now scientists are testing ceramic cells that are safer for the human body.
3. The surgery to implant the chip involves making small incisions to insert the chip, using fluid to lift the retina, and sealing the retina over the chip. The goal is to replace damaged photoreceptor cells and restore basic vision.
Iris Encryption using (2, 2) Visual Cryptography & Average Orientation Circul... - AM Publications
Biometric authentication is used for person identification. It relies on the uniqueness of physiological and behavioral characteristics to identify humans, so the technique is applied to criminal identification and in civil service areas. To secure the data, a (2, 2) secret sharing scheme is used. Iris recognition is among the most secure biometric schemes, and visual cryptography is a technique that divides a secret into shares.
The document discusses bionic eyes and their technological components. It describes how a bionic eye works by having electrodes implanted on the retina that are connected to a camera, video processing unit, and wireless transmitter. The Argus II is highlighted as the most advanced retinal prosthesis currently. It summarizes the key components of a bionic eye like the camera sensor technology, video processing unit, wireless transmission, and microelectrode array. The document also outlines improvements in resolution, material biocompatibility, wireless efficiency, and decreasing costs and size over time as important future opportunities to enhance bionic eye technology.
The document discusses bionic eyes, which are electronic devices that can replace dysfunctional parts of the eye and restore vision. It describes two approaches for a bionic eye - an artificial silicon retina (ASR) which is a microchip implanted under the retina, and a multi-unit artificial retina chipset (MARC) which uses an external camera, transmitter and implanted chip and electrodes. Both aim to stimulate remaining healthy retinal cells and transmit signals to the brain to provide a limited form of artificial vision. However, challenges remain in developing a bionic eye with sufficient resolution and long-term durability.
This document discusses current research on bionic eyes and their future prospects. It begins by describing the structure and function of the human eye. It then outlines causes of blindness and eye diseases. The document introduces the concept of a bionic eye as a bioelectronic device that can replace or enhance eye functionality. It describes different regions where implants can be placed and discusses approaches like epiretinal and subretinal implantation. The rest of the document focuses on the artificial silicon retina and multiple unit artificial retina chipset system as examples of bionic eye technologies, outlining their design, advantages, and limitations. It concludes by noting the progress made in bionic devices and remaining challenges in providing power and brain interfaces.
Final Artificial Eye Seminar Report, March 2016 - gautam221094
This document is a seminar report submitted by Arjun Ram on the topic of artificial eyes, also known as bionic eyes. It provides an overview of the human eye and various eye conditions that can lead to vision loss. It discusses the history of artificial eyes from early glass and plastic prosthetics to current developments in bionic eyes. The report covers various bionic eye technologies such as the artificial silicon retina and MARC system. It describes the working, implantation process, and manufacturing of bionic eyes, which aim to restore sight to the blind.
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N... - CSCJournals
The proposed method approaches current forensics and biometrics in a modern way, implementing iris recognition together with brain signals to resolve open issues and strengthen the digital forensics community. It offers rich biometric features for enhancing diverse security levels. A new method is proposed to identify individuals using iris patterns together with brain wave signals (EEG). Several algorithms were proposed for detecting, verifying, and extracting the deterministic patterns in a person's iris. The EEG recordings extracted from a person's brain have proved to be unique. The EEG signals are then combined with the iris patterns in a biometric application that uses a multi-modal combination architecture. The paper proposes forensic research directions and argues that, to move forward, the community needs to adopt standardized, modular approaches for person identification. The result of each authentication test is compared with the user's pre-recorded measurements using pattern recognition methods and signal-processing algorithms.
The document discusses the bio electronic eye, which replaces some or all functionality of the eye using electronics. It provides a history of the development of the bionic eye, describes how the human eye works compared to the bionic version, and details the key components and working principle of the MARC retinal prosthesis system. Some advantages are that it can be implanted with minimal surgery and has low power needs, though limitations include the difficulty of repairs and high costs. The conclusion is that while full vision may not be restored, bionic eyes can help the blind see shapes and objects.
This document summarizes an artificial silicon retina (ASR) seminar presentation. It describes how ASR technology works to restore vision by implanting a microchip retinal prosthesis that converts light into electrical signals. The ASR is small enough to fit in the eye and receives power from light, without needing batteries. It sends signals through the optic nerve to the brain, allowing some patients to perceive images. However, ASR technology remains highly experimental and can only provide basic vision currently. Much more research is still needed to develop it further.
Artificial Implants and the Field of Visual Prosthesis - Brittney Pfeifer
This document outlines a presentation on visual prostheses and a study testing the Alpha-IMS subretinal prosthesis. The presentation covers retinal anatomy, diseases like AMD and RP, and different types of prostheses. The study tested the Alpha-IMS device in 9 patients, finding it restored some visual function like light perception and motion detection. Patients reported recognizing faces, objects, and letters. While results were promising, long-term testing is still needed to evaluate stability and recognition abilities over time. Visual prostheses show potential to treat retinal diseases but continued research and development is required.
This document discusses the development of artificial vision systems to cure blindness. It describes how artificial silicon retinas work to restore some vision by bypassing damaged retinal cells. The artificial silicon retina is a tiny microchip implanted in the eye that contains solar cells to convert light into electrical pulses, similar to the function of retinal rods and cones. It receives power from light entering the eye without needing external batteries or wires. Researchers are also developing other implantable microchips like the artificial retina component chip to restore partial vision for the blind.
The paper explores iris recognition for personal identification and verification. A new iris recognition technique is proposed using the Scale Invariant Feature Transform (SIFT). The image-processing algorithms have been validated on a database of noisy real iris images. The proposed technique is computationally efficient as well as reliable in terms of recognition rates.
Biometric Iris Recognition Based on Hybrid Technique - ijsc
This document presents a study on implementing an iris recognition system using a hybrid technique. The system utilizes several image processing and machine learning techniques. It begins with preprocessing the iris image, including capturing, resizing and converting to grayscale. Histogram equalization is then used for enhancement. Two-dimensional discrete wavelet transform (2D DWT) is applied for feature extraction. Various edge detection algorithms including Canny, Prewitt, Roberts and Sobel are used to detect iris boundaries. The features are then stored in a vector for classification. The system is tested on different iris images and analysis shows 2D DWT and Canny edge detection provide adequate results for feature extraction and iris recognition.
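The 2D DWT feature-extraction stage described above can be sketched with a one-level Haar transform. The averaging normalization and the assumption of even image dimensions are illustrative choices, not necessarily the study's; the edge-detection and classification stages are omitted.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar DWT: returns the approximation (LL) and
    the three detail subbands (LH, HL, HH). Detail subbands capture the
    edge-like iris texture used as features. Assumes even dimensions."""
    img = np.asarray(img, dtype=float)
    # Filter along rows: pairwise averages (low-pass) and differences (high-pass).
    a, b = img[0::2, :], img[1::2, :]
    lo_r, hi_r = (a + b) / 2.0, (a - b) / 2.0
    # Filter along columns of each row-band.
    a, b = lo_r[:, 0::2], lo_r[:, 1::2]
    ll, lh = (a + b) / 2.0, (a - b) / 2.0
    a, b = hi_r[:, 0::2], hi_r[:, 1::2]
    hl, hh = (a + b) / 2.0, (a - b) / 2.0
    return ll, lh, hl, hh

# A simple ramp image: constant gradients show up as constant detail bands.
img = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (4, 4)
```

Concatenating (or summarizing) the subbands would give the feature vector stored for classification.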
A Novel Approach for Detecting the Iris Crypts - IJERDJOURNAL
ABSTRACT: The iris is a stable biometric trait that has been widely used for human recognition in various applications. However, deployment of iris recognition in forensic applications has not been reported, primarily because of the lack of human-friendly techniques for iris comparison. To further promote the use of iris recognition in forensics, the similarity between irises should be made visualizable and interpretable. Recently, a human-in-the-loop iris recognition system was developed based on detecting and matching iris crypts. Building on this framework, we propose a new approach for detecting and matching iris crypts automatically. Our detection method captures iris crypts of various sizes, and our matching scheme handles potential topological changes in the detection of the same crypt in different images. The approach outperforms the known visible-feature-based iris recognition method on three different data sets. After crypt detection, iris images were taken before and after treatment of eye disease, and the output shows the mathematical difference resulting from the treatment. A Gabor filter is used to extract the features. The iris recognition effectively withstood most ophthalmic diseases, such as corneal oedema, iridotomies, and conjunctivitis. The proposed iris recognition could be used to address potential problems in key biometric technology and medical diagnosis.
A Robust Approach in Iris Recognition for Person Authentication - IOSR Journals
The document describes a robust approach for iris recognition used for person authentication. It proposes using eight main stages: 1) scanning the iris image, 2) converting it to grayscale, 3) applying median filters to reduce noise, 4) detecting the pupil center, 5) using canny edge detection to identify iris and pupil edges, 6) determining the iris and pupil radii, 7) localizing the iris, and 8) unrolling the iris texture. It then uses k-means clustering to compare images and match them to authenticate individuals in a database. The approach aims to improve on previous iris recognition methods by more accurately detecting non-circular iris and pupil shapes.
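The first stages of the eight-stage pipeline (grayscale conversion, median filtering, pupil-center detection) can be sketched as follows. The threshold value, the centroid-of-dark-pixels localizer, and the function names are assumptions for illustration, not the paper's implementation; the later stages (edge detection, localization, unrolling, k-means matching) are omitted.

```python
import numpy as np

def to_grayscale(rgb):
    """Stage 2: luminance-weighted grayscale conversion."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def median3(img):
    """Stage 3: 3x3 median filter to suppress speckle noise before
    edge detection (image borders left untouched for brevity)."""
    out = img.copy()
    h, w = img.shape
    stack = np.stack([img[r:r + h - 2, c:c + w - 2]
                      for r in range(3) for c in range(3)])
    out[1:-1, 1:-1] = np.median(stack, axis=0)
    return out

def pupil_center(img, threshold=40):
    """Stage 4 (crude version): centroid of the darkest pixels --
    the pupil is the darkest roughly circular region of the eye."""
    rows, cols = np.nonzero(img <= threshold)
    if rows.size == 0:
        return None
    return float(rows.mean()), float(cols.mean())

rng = np.random.default_rng(2)
rgb = rng.integers(60, 255, (32, 32, 3)).astype(float)  # bright background
rgb[10:16, 10:16] = 10.0  # dark square standing in for the pupil
gray = median3(to_grayscale(rgb))
print(pupil_center(gray))  # near (12.5, 12.5)
```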
The document discusses artificial eyes and how they work. It describes that artificial eyes consist of a camera, video processing unit, radio transmitter, radio receiver, and retinal implant. The camera captures images and sends them to the video processing unit which simplifies the images into light spots. The processed images are then sent to the retinal implant via the radio transmitter and receiver. The retinal implant stimulates the retina and optic nerve to send signals to the brain, allowing individuals with eye diseases to regain vision. The technology provides basic object and shape recognition but has limitations such as the need for surgery and the high cost. It represents an important development for restoring sight.
The document discusses various methods for providing artificial vision to blind individuals, including microchips, nanotube implants, digital artificial vision, and ocular prosthetics. It focuses on the digital artificial vision method, which uses a video camera, computer, and microcontroller to process images into electrical impulses that are transmitted to electrodes implanted in the visual cortex of the brain. This allows blind individuals to perceive black and white images with their brain. Future research aims to provide color images and replace implants with non-invasive technologies like rays or waves. The goal is for artificial vision systems to eventually restore full sight to all blind people.
This document discusses an artificial vision system that could restore sight for those with retinal diseases. It describes the major components of the system: an artificial silicon retina implanted in the eye that converts light to electrical signals; a miniature video camera that captures images; a video processing unit that simplifies the images; and an infrared LCD screen on goggles that converts signals to light pulses for the retina. The system provides basic vision allowing users to identify objects, but has limitations in cost, resolution, and applicability to infants. Overall, it represents an important breakthrough for treating retinal degeneration and other vision-impairing diseases.
Graduation Project - Face Login: A Robust Face Identification System for Sec... - Ahmed Gad
Face Login is my 2015 graduation project; it started in 2014 and took 1.5 years of work.
Generally, it is an identification system using face images. It is a multi-use system, but it was mainly created to authorize users to log in to their systems.
An IEEE paper on the project's algorithm was published at ICCES 2014: http://ieeexplore.ieee.org/abstract/document/7030929/.
Citation: Semary, Noura A., and Ahmed Fawzi Gad. "A proposed framework for robust face identification system." Computer Engineering & Systems (ICCES), 2014 9th International Conference on. IEEE, 2014.
A YouTube video describing the project generally.
https://www.youtube.com/watch?v=OUvaPW70Eko
The document discusses the development of an artificial retina using thin film transistor technology. It describes how retinal conditions like retinitis pigmentosa and age-related macular degeneration cause vision loss and how an artificial retina could help. The artificial retina would be implanted either on top of the retina (epiretinal) or underneath (subretinal) and use thin film transistors to detect light and stimulate remaining retinal cells via electrical pulses. Both approaches have advantages and disadvantages relating to attachment, heat dissipation, power requirements and image processing. Complications from long-term implantation are also addressed. The technology shows promise for restoring basic vision but has limitations and further development is still needed.
This document describes the development of an artificial vision system to cure blindness. It discusses how researchers are developing retinal implants that can process images from a camera into electrical signals that the optic nerve can interpret as vision. The system includes a miniature video camera, a processor to translate images into signals, and an infrared screen on goggles to stimulate a silicon chip implanted on the retina. This technology has potential to restore limited sight to those blinded by retinal degeneration, though it cannot currently provide high-resolution images.
1) The document presents a new face parts detection algorithm that combines the Viola-Jones object detection framework with geometric information of facial features.
2) It detects faces, then isolates regions of interest for the eyes, nose, and mouth. Eye pupils are located using iris recognition techniques.
3) The algorithm was tested on hundreds of images and showed promising results for automated facial feature detection.
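The geometric step in 2) can be sketched by carving a detected face box into fixed-proportion regions of interest. The ratios below are hypothetical, since the algorithm's actual geometry is not given here, and the Viola-Jones detection stage itself is omitted.

```python
def face_part_rois(face):
    """Split a detected face box (x, y, w, h) into illustrative regions
    of interest using fixed geometric proportions. The ratios are
    assumptions, not the paper's values; each ROI is (x, y, w, h)."""
    x, y, w, h = face
    return {
        # Eyes sit in a band just above the vertical midline of the face.
        "eyes":  (x, y + h // 5, w, h // 4),
        # Nose occupies the central column of the middle third.
        "nose":  (x + w // 4, y + h // 3, w // 2, h // 3),
        # Mouth spans the lower third, narrower than the full face.
        "mouth": (x + w // 5, y + 2 * h // 3, 3 * w // 5, h // 4),
    }

# A face box as a Viola-Jones detector might return it.
rois = face_part_rois((100, 50, 200, 240))
print(rois["eyes"])   # (100, 98, 200, 60)
print(rois["mouth"])  # (140, 210, 120, 60)
```

Each ROI would then be searched with the feature-specific detector (e.g. iris localization within the eye band).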
Development of Novel BMIP Algorithms for Human Eyes Affected with Glaucoma an... - Premier Publishers
Glaucoma is one of the leading eye diseases in the world; if not treated properly, it may lead to permanent blindness. Glaucoma presents no particular symptoms, and the effect of this type of eye disease is vision loss. As the disease progresses, the cup area increases, which results in vision impairment. Normally, highly trained ophthalmologists manually review eye images, which is tedious. In this context, we are attempting to develop novel algorithms for the automatic detection of eyes affected by glaucoma using image-processing filtering and transform techniques, and to implement them on hardware using a microcontroller system. The software we develop could be embedded on hardware to test healthy and unhealthy fundus images for the detection of glaucoma. The algorithms can be implemented on eye images in an HDL language using Xilinx ISE, MATLAB, and MODELSIM; either a TI-based or an NI-based kit is the hardware tool considered for implementation.
INTEGRATING HEAD POSE TO A 3D MULTITEXTURE APPROACH FOR GAZE DETECTION - ijma
This document summarizes a research paper that proposes integrating head pose information with a 3D multi-texture active appearance model (MT-AAM) to improve gaze detection from webcam images. The 3D MT-AAM combines an iris texture model with an eye skin model that has a hole for the iris region. This allows the iris texture to rotate realistically under the skin for different gaze directions. The paper also proposes a multi-objective optimization that applies the 3D MT-AAM to both eyes, weighting the results based on detected head pose to determine which eye is more visible. Experimental results showed this approach outperformed a standard AAM and was comparable to state-of-the-art methods that require manual initialization.
The document discusses bionic eyes and their technological components. It describes how a bionic eye works by having electrodes implanted on the retina that are connected to a camera, video processing unit, and wireless transmitter. The Argus II is highlighted as the most advanced retinal prosthesis currently. It summarizes the key components of a bionic eye like the camera sensor technology, video processing unit, wireless transmission, and microelectrode array. The document also outlines improvements in resolution, material biocompatibility, wireless efficiency, and decreasing costs and size over time as important future opportunities to enhance bionic eye technology.
The document discusses bionic eyes, which are electronic devices that can replace dysfunctional parts of the eye and restore vision. It describes two approaches for a bionic eye - an artificial silicon retina (ASR) which is a microchip implanted under the retina, and a multi-unit artificial retina chipset (MARC) which uses an external camera, transmitter and implanted chip and electrodes. Both aim to stimulate remaining healthy retinal cells and transmit signals to the brain to provide a limited form of artificial vision. However, challenges remain in developing a bionic eye with sufficient resolution and long-term durability.
This document discusses current research on bionic eyes and their future prospects. It begins by describing the structure and function of the human eye. It then outlines causes of blindness and eye diseases. The document introduces the concept of a bionic eye as a bioelectronic device that can replace or enhance eye functionality. It describes different regions where implants can be placed and discusses approaches like epiretinal and subretinal implantation. The rest of the document focuses on the artificial silicon retina and multiple unit artificial retina chipset system as examples of bionic eye technologies, outlining their design, advantages, and limitations. It concludes by noting the progress made in bionic devices and remaining challenges in providing power and brain interfaces.
Final artificial eye seminar report, March 2016 (gautam221094)
This document is a seminar report submitted by Arjun Ram on the topic of artificial eyes, also known as bionic eyes. It provides an overview of the human eye and various eye conditions that can lead to vision loss. It discusses the history of artificial eyes from early glass and plastic prosthetics to current developments in bionic eyes. The report covers various bionic eye technologies such as the artificial silicon retina and MARC system. It describes the working, implantation process, and manufacturing of bionic eyes, which aim to restore sight to the blind.
Enhancement of Multi-Modal Biometric Authentication Based on IRIS and Brain N... (CSCJournals)
The proposed method situates current forensics and biometrics in a modern approach, implementing the concept of iris recognition together with brain signals, resolving open issues and strengthening the digital forensics community. It offers rich biometric features for enhancing diverse security levels. A new method is proposed to identify individuals using iris patterns together with brain-wave signals (EEG). Several algorithms were proposed for detecting, verifying and extracting the deterministic patterns in a person's iris. The EEG recordings extracted from a person's brain have proved to be unique. We then combine the EEG signals with the iris patterns in a biometric application that uses a future multi-modal combination architecture. The paper proposes forensic research directions and argues that, to move forward, the community needs to adopt standardized, modular approaches for person identification. The result of each authentication test is compared with the user's pre-recorded measurements using pattern-recognition methods and signal-processing algorithms.
The document discusses the bio electronic eye, which replaces some or all functionality of the eye using electronics. It provides a history of the development of the bionic eye, describes how the human eye works compared to the bionic version, and details the key components and working principle of the MARC retinal prosthesis system. Some advantages are that it can be implanted with minimal surgery and has low power needs, though limitations include the difficulty of repairs and high costs. The conclusion is that while full vision may not be restored, bionic eyes can help the blind see shapes and objects.
This document summarizes an artificial silicon retina (ASR) seminar presentation. It describes how ASR technology works to restore vision by implanting a microchip retinal prosthesis that converts light into electrical signals. The ASR is small enough to fit in the eye and receives power from light, without needing batteries. It sends signals through the optic nerve to the brain, allowing some patients to perceive images. However, ASR technology remains highly experimental and can only provide basic vision currently. Much more research is still needed to develop it further.
Artificial Implants and the Field of Visual Prosthesis (Brittney Pfeifer)
This document outlines a presentation on visual prostheses and a study testing the Alpha-IMS subretinal prosthesis. The presentation covers retinal anatomy, diseases like AMD and RP, and different types of prostheses. The study tested the Alpha-IMS device in 9 patients, finding it restored some visual function like light perception and motion detection. Patients reported recognizing faces, objects, and letters. While results were promising, long-term testing is still needed to evaluate stability and recognition abilities over time. Visual prostheses show potential to treat retinal diseases but continued research and development is required.
This document discusses the development of artificial vision systems to cure blindness. It describes how artificial silicon retinas work to restore some vision by bypassing damaged retinal cells. The artificial silicon retina is a tiny microchip implanted in the eye that contains solar cells to convert light into electrical pulses, similar to the function of retinal rods and cones. It receives power from light entering the eye without needing external batteries or wires. Researchers are also developing other implantable microchips like the artificial retina component chip to restore partial vision for the blind.
The paper explores iris recognition for personal identification and verification. A new iris recognition technique is proposed using the Scale Invariant Feature Transform (SIFT). The image-processing algorithms have been validated on a noisy real iris image database. The proposed technique is computationally efficient as well as reliable in terms of recognition rates.
Biometric Iris Recognition Based on Hybrid Technique (ijsc)
This document presents a study on implementing an iris recognition system using a hybrid technique. The system utilizes several image processing and machine learning techniques. It begins with preprocessing the iris image, including capturing, resizing and converting to grayscale. Histogram equalization is then used for enhancement. Two-dimensional discrete wavelet transform (2D DWT) is applied for feature extraction. Various edge detection algorithms including Canny, Prewitt, Roberts and Sobel are used to detect iris boundaries. The features are then stored in a vector for classification. The system is tested on different iris images and analysis shows 2D DWT and Canny edge detection provide adequate results for feature extraction and iris recognition.
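The feature-extraction step described above (grayscale image, then a two-dimensional discrete wavelet transform) can be illustrated with a minimal one-level 2-D Haar DWT in plain NumPy. This is a sketch of the general technique; the study's exact wavelet family and implementation are not specified here.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: returns (LL, LH, HL, HH) subbands.

    img must have even height and width.
    """
    a = img.astype(float)
    # Rows: average / difference of adjacent column pairs.
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # Columns: same averaging/differencing on adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh

img = np.arange(16.0).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2) -- the low-pass approximation subband
```

The LL subband (or statistics of the detail subbands) would then feed the feature vector used for classification.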
A Novel Approach for Detecting the IRIS Crypts (IJERDJOURNAL)
ABSTRACT: The iris is a stable biometric trait that has been widely used for human recognition in various applications. However, deployment of iris recognition in forensic applications has not been reported. A primary reason is the lack of human-friendly techniques for iris comparison. To further promote the use of iris recognition in forensics, the similarity between irises should be made visualizable and interpretable. Recently, a human-in-the-loop iris recognition system was developed, based on detecting and matching iris crypts. Building on this framework, we propose a new approach for detecting and matching iris crypts automatically. Our detection method is able to capture iris crypts of various sizes. Our matching scheme is designed to handle potential topological changes in the detection of the same crypt in different images. Our approach outperforms the known visible-feature-based iris recognition method on three different data sets. After crypt detection, iris images were taken before and after treatment of eye disease, and the output shows the mathematical difference obtained from treatment. A Gabor filter is used to extract the features. The iris recognition effectively withstood most ophthalmic diseases, such as corneal oedema, iridotomies and conjunctivitis. The proposed iris recognition could be used to solve potential problems that may arise in key biometric technology and medical diagnosis.
A Robust Approach in Iris Recognition for Person Authentication (IOSR Journals)
The document describes a robust approach for iris recognition used for person authentication. It proposes using eight main stages: 1) scanning the iris image, 2) converting it to grayscale, 3) applying median filters to reduce noise, 4) detecting the pupil center, 5) using canny edge detection to identify iris and pupil edges, 6) determining the iris and pupil radii, 7) localizing the iris, and 8) unrolling the iris texture. It then uses k-means clustering to compare images and match them to authenticate individuals in a database. The approach aims to improve on previous iris recognition methods by more accurately detecting non-circular iris and pupil shapes.
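Stage 3 of the pipeline above, median filtering to reduce noise, is easy to sketch; the 3x3 window size below is an assumption for illustration.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter (edge pixels kept as-is), as in stage 3."""
    out = img.copy()
    h, w = img.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

noisy = np.zeros((5, 5), dtype=float)
noisy[2, 2] = 255.0          # a single salt-noise pixel
clean = median_filter3(noisy)
print(clean[2, 2])  # 0.0 -- the isolated spike is removed
```

Median filtering is preferred over averaging here because a single outlier pixel cannot drag the neighborhood median, which matters before edge detection.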
The document discusses artificial eyes and how they work. It describes that artificial eyes consist of a camera, video processing unit, radio transmitter, radio receiver, and retinal implant. The camera captures images and sends them to the video processing unit which simplifies the images into light spots. The processed images are then sent to the retinal implant via the radio transmitter and receiver. The retinal implant stimulates the retina and optic nerve to send signals to the brain, allowing individuals with eye diseases to regain vision. The technology provides basic object and shape recognition but has limitations such as the need for surgery and the high cost. It represents an important development for restoring sight.
The document discusses various methods for providing artificial vision to blind individuals, including microchips, nanotube implants, digital artificial vision, and ocular prosthetics. It focuses on the digital artificial vision method, which uses a video camera, computer, and microcontroller to process images into electrical impulses that are transmitted to electrodes implanted in the visual cortex of the brain. This allows blind individuals to perceive black and white images with their brain. Future research aims to provide color images and replace implants with non-invasive technologies like rays or waves. The goal is for artificial vision systems to eventually restore full sight to all blind people.
This document discusses an artificial vision system that could restore sight for those with retinal diseases. It describes the major components of the system: an artificial silicon retina implanted in the eye that converts light to electrical signals; a miniature video camera that captures images; a video processing unit that simplifies the images; and an infrared LCD screen on goggles that converts signals to light pulses for the retina. The system provides basic vision allowing users to identify objects, but has limitations in cost, resolution, and applicability to infants. Overall, it represents an important breakthrough for treating retinal degeneration and other vision-impairing diseases.
Graduation Project - Face Login: A Robust Face Identification System for Sec... (Ahmed Gad)
Face Login is my 2015 graduation project; it started in 2014 and took 1.5 years of work.
Generally, it is an identification system using face images. It is a multi-use system, but it was mainly created to authorize users to log in to their systems.
The project's algorithm was published in an IEEE paper at ICCES 2014: http://ieeexplore.ieee.org/abstract/document/7030929/.
Citation: Semary, Noura A., and Ahmed Fawzi Gad. "A proposed framework for robust face identification system." Computer Engineering & Systems (ICCES), 2014 9th International Conference on. IEEE, 2014.
A YouTube video describes the project generally:
https://www.youtube.com/watch?v=OUvaPW70Eko
The document discusses the development of an artificial retina using thin film transistor technology. It describes how retinal conditions like retinitis pigmentosa and age-related macular degeneration cause vision loss and how an artificial retina could help. The artificial retina would be implanted either on top of the retina (epiretinal) or underneath (subretinal) and use thin film transistors to detect light and stimulate remaining retinal cells via electrical pulses. Both approaches have advantages and disadvantages relating to attachment, heat dissipation, power requirements and image processing. Complications from long-term implantation are also addressed. The technology shows promise for restoring basic vision but has limitations and further development is still needed.
This document describes the development of an artificial vision system to cure blindness. It discusses how researchers are developing retinal implants that can process images from a camera into electrical signals that the optic nerve can interpret as vision. The system includes a miniature video camera, a processor to translate images into signals, and an infrared screen on goggles to stimulate a silicon chip implanted on the retina. This technology has potential to restore limited sight to those blinded by retinal degeneration, though it cannot currently provide high-resolution images.
1) The document presents a new face parts detection algorithm that combines the Viola-Jones object detection framework with geometric information of facial features.
2) It detects faces, then isolates regions of interest for the eyes, nose, and mouth. Eye pupils are located using iris recognition techniques.
3) The algorithm was tested on hundreds of images and showed promising results for automated facial feature detection.
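The geometric step in point 2 can be sketched as fixed-ratio regions of interest derived from a detected face box. The ratios below are illustrative assumptions, not the paper's actual values.

```python
def face_part_rois(face):
    """Given a face box (x, y, w, h), return approximate regions of
    interest for the eyes, nose and mouth using fixed geometric ratios.
    """
    x, y, w, h = face
    return {
        # Eyes occupy the upper-middle horizontal band of the face.
        "eyes":  (x, y + h // 4, w, h // 4),
        # Nose: the central third horizontally, middle band vertically.
        "nose":  (x + w // 3, y + h // 2, w // 3, h // 4),
        # Mouth: the lower quarter, centered horizontally.
        "mouth": (x + w // 4, y + 3 * h // 4, w // 2, h // 4),
    }

rois = face_part_rois((0, 0, 120, 160))
print(rois["eyes"])  # (0, 40, 120, 40)
```

Restricting each detector to its ROI both speeds up detection and suppresses false positives outside the plausible region.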
Development of novel BMIP algorithms for human eyes affected with glaucoma an... (Premier Publishers)
Glaucoma is one of the leading eye diseases in the world; if not treated properly, it can cause permanent blindness. The disease has no specific early symptoms, yet its effect is loss of vision: as it progresses, the optic cup area increases, which results in visual impairment. Normally, highly trained ophthalmologists review eye images manually, which is time-consuming. In this context, we attempt to develop novel algorithms for automatic detection of eyes affected by glaucoma using image-processing filtering and transform techniques, and to implement them in hardware on a microcontroller system. The software we develop could be embedded in hardware to test healthy and unhealthy fundus images for glaucoma detection. The algorithms can be implemented on eye images in an HDL using Xilinx ISE, MATLAB and ModelSim; either a TI-based or an NI-based kit is considered as the hardware tool for implementation.
ROP (retinopathy of prematurity) is an eye disease in premature infants. In infants who are born earlier than normal, retinal vessel growth stops. Early treatment is crucial, as the condition can end in blindness if the diagnosis is not made quickly. The purpose of this research is to design an intelligent automated system for detecting the disease at an early stage to protect these babies from dangerous consequences. In this study, we analyzed the images in the Lab color space and evaluated the efficiency of applying the Canny, Laplacian and Sobel filters. The results indicate the relatively higher efficiency and quality of the Laplacian filter in ROP diagnosis.
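The filters being compared are small fixed kernels applied over the image. A minimal NumPy sketch follows; the kernel values are the textbook ones, and the loop performs correlation without kernel flipping, which is the usual convention in image libraries.

```python
import numpy as np

# Classic 3x3 kernels of the kind compared in the study.
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

def convolve_valid(img, k):
    """'Valid' 2-D correlation (no padding), enough for a demo."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

# A vertical step edge: the Sobel-x response peaks at the boundary.
img = np.hstack([np.zeros((4, 3)), np.ones((4, 3))])
print(convolve_valid(img, SOBEL_X)[0])  # [0. 4. 4. 0.]
```

The Laplacian responds to intensity curvature in all directions at once, which is one reason it can pick up the fine vessel boundaries relevant to ROP.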
This document describes a student project to develop a real-time eye gaze tracking system using a low-cost webcam. It provides background on eye tracking research and methods, including a brief history and modern techniques like video-based and invasive methods. It then discusses the general concepts and tools used in the project, including the structure of the human eye, Purkinje images, infrared light, webcams, and MATLAB. The implementation, applications, results and future improvements are also summarized.
Glaucoma progression detection based on Retinal Features.pptx (ssuser097984)
1. The document describes a thesis submitted to the University of Babylon for a Master's degree in Software. The thesis proposes a system for detecting glaucoma progression based on analyzing retinal features in fundus images.
2. The system includes stages for preprocessing, segmenting the optic disc and optic cup, extracting blood vessels, and extracting features for classification. Features like cup-to-disc ratio are extracted using techniques like gray-level co-occurrence matrix.
3. The contributions of the thesis include developing an automated system that can diagnose glaucoma cases and severity levels based on approved characteristics extracted from fundus images, achieving good results compared to previous work, and enabling real-time analysis of cases in under 4 seconds.
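The cup-to-disc ratio feature mentioned in point 2 can be computed directly from the segmentation masks. The sketch below uses a simple area-based ratio, which is one common definition; the 0.6 suspect threshold in the comment is a general clinical rule of thumb, not a value from the thesis.

```python
import numpy as np

def cup_to_disc_ratio(disc_mask, cup_mask):
    """Area-based cup-to-disc ratio (CDR) from binary segmentation masks.

    A CDR above roughly 0.6 is commonly treated as glaucoma-suspect.
    """
    disc_area = int(np.count_nonzero(disc_mask))
    cup_area = int(np.count_nonzero(cup_mask))
    if disc_area == 0:
        raise ValueError("empty optic-disc mask")
    return cup_area / disc_area

disc = np.zeros((10, 10), dtype=bool)
disc[2:8, 2:8] = True        # 36-pixel optic disc
cup = np.zeros((10, 10), dtype=bool)
cup[4:7, 4:7] = True         # 9-pixel optic cup
print(cup_to_disc_ratio(disc, cup))  # 0.25
```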
An intelligent strabismus detection method based on convolution neural network (TELKOMNIKA JOURNAL)
Strabismus is one of the widespread vision disorders in which the eyes are misaligned and asymmetric. Convolutional neural networks (CNNs) are well suited to analyzing images and detecting texture patterns. In this paper, we propose a system that uses deep-learning CNNs for automatically detecting and classifying strabismus disorder. The proposed system includes two main stages: first, facial eye segmentation using the Viola-Jones algorithm; second, mapping the segmented eye area according to the iris position of each eye. The method is applied to three strabismus datasets gathered as digital images. The paper also covers the segmentation of the eye region and the evaluation equations for measuring system performance. The system underwent numerous experiments in various stages to simulate and analyze the detection performance of the CNN layers across different classifiers and varying threshold ratios. The researchers investigated the experimental outcomes during the training and testing phases and obtained promising results that exhibit the effectiveness of the proposed system. According to the results, the accuracy of this technique reached 95.62%.
The document proposes an eye gaze-based electric wheelchair control system (EBEWC) that is controlled solely by human eyes. It aims to allow disabled individuals to safely and easily control a wheelchair independently. The system uses a laptop camera to track eye movements and detect the pupil location in order to select numbers, functions, and commands on a GUI to control the wheelchair. It aims to be robust against factors like different users, illumination changes, movement, and vibration. The design flow involves using OpenCV for image analysis tasks like face detection and eye pupil position computation to determine eye gaze direction and select commands to control the wheelchair via a microcontroller.
Multimodal Biometrics Recognition from Facial Video via Deep Learning (cscpconf)
Biometrics identification using multiple modalities has attracted the attention of many researchers as it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising autoencoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers to perform the multimodal recognition. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in 99.17% and 97.14% rank-1 recognition rates, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips.
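The final step, combining the modality-specific classifier outputs, can be sketched with a simple sum-rule score fusion. This is a common baseline standing in for the paper's sparse-classifier fusion; the identity names, scores, and uniform weights below are illustrative.

```python
def fuse_scores(modality_scores, weights=None):
    """Weighted sum-rule fusion of per-modality identification scores.

    modality_scores: dict mapping modality name -> {identity: score}.
    Returns the rank-1 identity under the fused scores.
    """
    weights = weights or {m: 1.0 for m in modality_scores}
    fused = {}
    for m, scores in modality_scores.items():
        for ident, s in scores.items():
            fused[ident] = fused.get(ident, 0.0) + weights[m] * s
    # Rank-1 decision: the identity with the highest fused score.
    return max(fused, key=fused.get)

scores = {
    "frontal_face": {"alice": 0.9, "bob": 0.4},
    "left_ear":     {"alice": 0.3, "bob": 0.6},
}
print(fuse_scores(scores))  # alice
```

Fusing at the score level lets a strong modality (here, the frontal face) outvote a weaker one when they disagree, which is why multimodal systems tolerate pose and illumination changes better than any single modality.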
This document presents a new iris segmentation method for iris recognition systems. The proposed method uses Canny edge detection and Hough transform to locate the iris boundary after finding the pupil boundary using image gray levels. Experiments on the CASIA iris image database of 756 images show the method can accurately detect the iris boundary in 99.2% of images. This is an improvement over other existing segmentation techniques. The key steps of the proposed method are preprocessing, segmentation using Canny edge detection and Hough transform, normalization using the rubber sheet model, feature encoding with Gabor wavelets, and matching with Hamming distance.
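The final matching step, the Hamming distance between binary iris codes, is compact enough to show directly. The optional occlusion mask is a standard refinement for ignoring bits hidden by eyelids or eyelashes.

```python
import numpy as np

def hamming_distance(code_a, code_b, mask=None):
    """Fractional Hamming distance between two binary iris codes.

    mask marks bits valid in both codes (e.g. not hidden by eyelids);
    only those bits are compared.
    """
    a = np.asarray(code_a, dtype=bool)
    b = np.asarray(code_b, dtype=bool)
    valid = np.ones_like(a) if mask is None else np.asarray(mask, dtype=bool)
    n = int(np.count_nonzero(valid))
    if n == 0:
        raise ValueError("no valid bits to compare")
    return int(np.count_nonzero((a ^ b) & valid)) / n

a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 0, 0, 1, 0, 1, 1, 0]
print(hamming_distance(a, b))  # 0.25  (2 of 8 bits differ)
```

In practice a distance below some threshold (around 0.3 in classic iris-code systems) is declared a match; genuine pairs cluster near zero while impostor pairs cluster near 0.5.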
This document discusses a study that aims to classify glaucomatous images based on wavelet features with high accuracy. It analyzes energy distributions from several wavelet filters to extract texture features from retinal images. These features are ranked and selected before being introduced to classifiers like support vector machine, sequential minimal optimization, random forest, and naive Bayes. The proposed system is expected to achieve over 95% accuracy in glaucoma classification, outperforming existing techniques.
A Review on Face Detection under Occlusion by Facial Accessories (IRJET Journal)
This document reviews various methods for detecting faces that are partially occluded by accessories like sunglasses or scarves. It discusses approaches that divide the face into patches and use PCA to detect occluded regions. Other methods use particle filtering to track occluded objects over multiple frames, or detect occlusion through Gabor wavelets and SVM classification of facial components. More advanced techniques apply deep convolutional neural networks to simultaneously estimate positions of facial landmarks while being robust to occlusion, pose variations and illumination changes. The document concludes that occlusion detection is important for face recognition systems and that future work could aim to improve detection accuracy.
Eye Blink and Hand Gesture detection For Aid of Paralyzed Patients (IRJET Journal)
This document describes an assistive technology system called Paral-Eye that uses eye blink detection and hand gesture recognition to help paralyzed patients control home appliances. The system tracks the user's face using Haar cascade classifiers to detect eye blinks. It also uses flex sensors in a glove to detect hand gestures by measuring changes in resistance as the user flexes their fingers. Different blinks and gestures are mapped to commands to control lights, fans, and other devices. The goal is to allow paralyzed patients to have more independence and functionality through this eye and hand-based assistive system when caregivers are not present. The document provides background on paralysis and reviews related works before describing the modules, algorithms, and software used to implement
Brain computer interfacing for controlling wheelchair movement (IRJET Journal)
This document describes research on developing a brain-computer interface (BCI) system to control a wheelchair using EEG brain wave signals. Specifically, it focuses on using alpha waves detected during a relaxed state to allow users to control wheelchair movement and direction. The system is intended to help people with disabilities who cannot move themselves. The document provides background on BCI and previous related work, then describes the proposed system which uses EEG signals from a low-cost headset to classify motor imagery and control a wheelchair wirelessly. It discusses the algorithms and experimental results, showing the system can accurately detect different movement intentions based on alpha wave detection with minimal training.
Abstract: This paper presents a new face-parts information analyzer as a promising model for detecting faces and locating facial features in images. The main objective is to build fully automated human facial measurement systems from images with complex backgrounds. Detection of facial features such as the eyes, nose, and mouth is an important step for many subsequent facial image analysis tasks. Face detection here aims to locate each part of the face and mark a circle or rectangle around it, with detection relying on matching the face against learned patterns. The study presents a novel and simple model based on a mixture of techniques and algorithms in a shared pool: the Viola-Jones object detection framework combined with geometric and symmetric information of the face parts from the image in a smart algorithm. Keywords: Face detection, Video frames, Viola-Jones, Skin detection, Skin color classification, Face recognition, Pattern recognition, Skin color.
Title: Face Detection Using Modified Viola Jones Algorithm
Author: Alpika Gupta, Dr. Rajdev Tiwari
International Journal of Recent Research in Mathematics Computer Science and Information Technology
ISSN 2350-1022
Paper Publications
The document discusses using eye tracking devices to assess cognitive functions through ocular-complex measurements during human-computer interaction. It describes that eye fixations allow users to process information by refreshing and extending short-term memory. Eye tracking is presented as the best technique to assess functions of the ocular complex like saccades, pursuits, and accommodation which relate to mental processes like attention, awareness, and working memory. Examples of potential tests are provided that measure fixation duration, number of fixations and saccades using a gradient scale.
Similar to Eye-focused Detection of Bell's Palsy in Videos
Most existing high-performance co-segmentation algorithms are usually complex due to the way of co-labelling a set of images as well as the common need of fine-tuning a few parameters for effective co-segmentation. In this paper, instead of following the conventional way of co-labelling multiple images, we propose to first exploit inter-image information through co-saliency, and then perform single-image segmentation on each individual image. To make the system robust and to avoid heavy dependence on one single saliency extraction method, we propose to apply multiple existing saliency extraction methods on each image to obtain diverse saliency maps. Our major contribution lies in the proposed method that fuses the obtained diverse saliency maps by exploiting the inter-image information, which we call saliency co-fusion. Experiments on five benchmark datasets with eight saliency extraction methods show that our saliency co-fusion based approach achieves competitive performance even without parameter fine-tuning when compared with the state-of-the-art methods.
Qcce quality constrained co saliency estimation for common object detectionKoteswar Rao Jerripothula
Despite recent advances in joint processing of images,
sometimes it may not be as effective as single image
processing for object discovery problems. In this paper while
aiming for common object detection, we attempt to address
this problem by proposing a novel QCCE: Quality Constrained
Co-saliency Estimation method. The approach here is to iteratively
update the saliency maps through co-saliency estimation
depending upon quality scores, which indicate the degree of
separation of foreground and background likelihoods (the easier
the separation, the higher the quality of saliency map). In this
way, joint processing is automatically constrained by the quality
of saliency maps. Moreover, the proposed method can be applied
to both unsupervised and supervised scenarios, unlike other
methods which are particularly designed for one scenario only.
Experimental results demonstrate superior performance of the
proposed method compared to the state-of-the-art methods.
Most of the existing co-segmentation methods are usually
complex, and require pre-grouping of images, fine-tuning a
few parameters and initial segmentation masks etc. These
limitations become serious concerns for their application on
large scale datasets. In this paper, Group Saliency Propagation
(GSP) model is proposed where a single group saliency
map is developed, which can be propagated to segment the
entire group. In addition, it is also shown how a pool of these
group saliency maps can help in quickly segmenting new
input images. Experiments demonstrate that the proposed
method can achieve competitive performance on several
benchmark co-segmentation datasets including ImageNet,
with the added advantage of speed up.
This document summarizes a project report on image segmentation using an advanced fuzzy c-means algorithm. The report was submitted by two students to fulfill requirements for a Bachelor of Technology degree in Electrical Engineering at the Indian Institute of Technology Roorkee. It describes implementing various clustering algorithms for image segmentation, including k-means, fuzzy c-means, bias-corrected fuzzy c-means, and Gaussian kernel fuzzy c-means. It then proposes improvements to the algorithms by automatically selecting the number of clusters and initial cluster centers based on a moving average filter on the image histogram. This approach removes problems of non-convergence and increases speed, enabling real-time video segmentation.
[PDF] Automatic Image Co-segmentation Using Geometric Mean Saliency (Top 10% ...Koteswar Rao Jerripothula
Most existing high-performance co-segmentation algorithms are usually complicated due to the way of co-labelling a set of images and the requirement to handle quite a few parameters for effective co-segmentation. In this paper, instead of relying on the complex process of co-labelling multiple images, we perform segmentation on individual images but based on a combined saliency map that is obtained by fusing single-image saliency maps of a group of similar images. Particularly, a new multiple image based saliency map extraction, namely geometric mean saliency (GMS) method, is proposed to obtain the global saliency maps. In GMS, we transmit the saliency information among the images using the warping technique. Experiments show that our method is able to outperform state-of-the-art methods on three benchmark co-segmentation datasets.
This document provides an overview of machine learning and different types of machine learning algorithms. It discusses how machine learning algorithms have been used to rank webpages, filter spam emails, and make predictions. The document explains that machine learning grew out of artificial intelligence research and allows computers to learn without being explicitly programmed. It describes supervised learning algorithms, which are taught using labeled training data, and unsupervised learning algorithms, which find hidden patterns in unlabeled data. Examples of supervised learning problems include regression to predict prices and classification to detect cancer. Unsupervised learning finds clusters within data without predefined labels.
Most existing high-performance co-segmentation algorithms are usually complicated due to the way of co-labelling a set of images and the requirement to handle quite a few parameters for effective co-segmentation. In this paper, instead of relying on the complex process of co-labelling multiple images, we perform segmentation on individual images but based on a combined saliency map that is obtained by fusing single-image saliency maps of a group of similar images. Particularly, a new multiple image based saliency map extraction, namely geometric mean saliency (GMS) method, is proposed to obtain the global saliency maps. In GMS, we transmit the saliency information among the images using the warping technique. Experiments show that our method is able to outperform state-of-the-art methods on three benchmark co-segmentation datasets.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
two sides. Our main idea is to model this contrast as a predictor (feature) for building the
required system.
However, there are some challenges in building such a system: (i) There are not many
publicly available videos of patients suffering from Bell’s Palsy, so an end-to-end
data-driven model is difficult to train; whatever model we build must be robust enough
to work even with very few labels. (ii) There are no ready-made video-level features that capture the
blink-related contrast discussed in the above paragraph. Although there are several works
on images, there are very few prior works that deal with videos to solve this problem. (iii)
Users may have privacy concerns too, and the system needs to have a privacy-preserving
feature to facilitate the anonymity of the data collected.
We propose a hybrid approach that efficiently blends deep learning and traditional ma-
chine learning (ML) approaches. We attempt Bell’s Palsy detection (for which there is lim-
ited video data available) at the video level by first performing blink detection (for which
there is plenty of image data available) at the frame level. Essentially, we break the problem
into two stages: frame-level blink detection and video-level Bell’s Palsy detection. While
we solve the blink detection problem through the feature learning idea prevalent in deep
learning, we rely on the hand-crafted feature design approach dominant in traditional ma-
chine learning for solving the Bell’s Palsy detection problem. To solve these two problems,
we develop two datasets by collecting images and videos from the internet. Our algorithm
needs only the eye regions; hence, subjects need not have privacy concerns, because the
entire face is no longer required.
This paper makes three contributions: (1) We develop novel benchmark datasets for
detecting Bell’s Palsy and blinks. (2) We create a novel eye-focused feature named blink
similarity for videos through a hybrid approach. (3) We show that a Bell’s Palsy diagnosis
system can be built while addressing privacy concerns by focusing on the eyes alone.
2. Related Work
Since this work deals with blink detection and Bell’s Palsy detection, we provide literature
reviews of both in this section.
The blink detection problem is a well-explored research problem. There are heuristic
methods such as [10], where the authors exploit the brightness feature. Then, there are
traditional ML-based methods using HOG [11] features, motion [12] features, etc. With the rise
of deep learning, researchers have recently explored several CNN-based approaches, such
as [12–14]. We also use a CNN-based architecture for this problem. However, our main
contribution here is the dataset we provide, comprising nearly 40k images, which, to the
best of our knowledge, is the largest dataset built so far for this particular problem.
Also, to the best of our knowledge, none of the existing works explicitly uses blink
detection to solve the Bell’s Palsy problem.
As far as Bell’s Palsy is concerned, in [15], the authors analyze video clips captured
with a PC webcam for diagnosis by measuring the asymmetry index around the mouth
region. In contrast, our work focuses on the eyes to detect Bell’s Palsy. In [16], the authors
propose a method named ASMLBP (Active Shape Models plus Local Binary Patterns). In
ASMLBP, the face is divided into eight local regions to describe each region using the facial
points extracted using Active Shape Models (ASM) and through region descriptors of Local
Binary Patterns (LBP). However, this method requires videos of the subjects to be taken
in a controlled environment (see [16] for more details). It also needs specific movements to
be carried out for accurate detection. Similarly, [17, 18] also impose constraints on the
video recording environment, video length, etc. Our proposed method does not have any such
constraints.
(a) A person's blink who is not suffering from Bell's Palsy: Both the eyes blink simultaneously.
(b) A person's blink who is suffering from Bell's Palsy: While one eye blinks normally, another fails to do the same.
Figure 1. This illustration demonstrates the difference between the blink of (a) a normal
person and (b) a Bell’s Palsy patient. While blinking patterns of the two eyes appear
the same for a normal person, they differ a lot for a Bell’s Palsy patient. Note how the
patient’s right eye remains open while his left blinks normally.
In [19], the authors propose a smartphone-based facial palsy diagnostic system that can
work in the wild. They localize and track facial landmarks with a modified linear regression
method and then compute the displacement ratio between the left- and right-side facial
landmarks located on the forehead and mouth region. Another work [20] proposes an in-
cremental face alignment method to handle videos in the wild for this problem. However,
there is always a possibility of facial points mismatching in such analysis. Recently, [21]
proposed a method to analyze facial palsy using a deep hierarchical network by detecting
asymmetric regions. Again, it depends upon the accuracy of the detected facial landmarks.
As far as the proposed method is concerned, we use facial landmarks only for localization
[22–25] of the eye regions; therefore, we can tolerate a significant level of inaccuracy
in landmark detection. Closest to our work is [26], which attempted facial palsy
detection using videos by applying transfer learning on the 3D-CNN of [27]. However, such
end-to-end networks are hardly explainable. Being a two-stage algorithm with an intuitive
intermediate blink-detection step, ours has an edge in explainability.
3. Proposed Method
Eye-blinking is a biological phenomenon of regular closure of eyelids for a brief period.
As discussed earlier, in Bell’s Palsy, one side of the face becomes weak or paralyzed, which
seriously affects eye-blinking on that side. It leads to an abnormal contrast in the blinking
patterns of the two eyes. As illustrated in Fig.1, there is a significant difference between the
blinking patterns of (a) a normal person and (b) a Bell’s Palsy patient. While both the eyes
of a normal person blink together, only the patient’s normal side’s eye blinks. Hence, we
can exploit this observation related to the eye-blinks for detecting Bell’s Palsy in a person.
However, even normal persons may sometimes blink just one eye when they wink or when an
instantaneous agitation occurs. Therefore, the blinks need to be monitored for a substantial
time, necessitating a video-based diagnosis system.
Let ECT denote the total time duration during which an eye remains closed due to the
blinking phenomenon in a video. For a patient, the ECT of the paralyzed side will be much
lower than that of the normal side. On the contrary, for a normal person, both the ECTs
should be almost equal.
There are two phases in the eye-blinking activity: eye-closure time (EC) and eye-open
time (EO). If we assume eye-blinking to be a periodic activity, these phases tend to have
constant values for a particular eye, although they may vary eye-to-eye and person-to-
person. Thus, the blink time-period becomes EC + EO. If L denotes the length (total time
duration) of a video, we can now compute ECT for an eye as
ECT = (EC / (EC + EO)) × L,  (3.1)
[Figure 2 (diagram): frames of the video → eyes extraction (the left eye is flipped) →
blink detector (CNN) → left/right eye-closed frame counts (ECF_L, ECF_R) →
blink similarity BS = min(ECF_L, ECF_R) / max(ECF_L, ECF_R) → ML model →
Bell’s Palsy / Normal.]
Figure 2. Proposed approach for the extraction of the blink similarity feature and the
end-to-end workflow for the detection of Bell’s Palsy.
where EC / (EC + EO) serves as the fraction coefficient of L used to compute ECT:
multiplying the length of the video by this fraction yields the total time duration
during which the eye remains closed, i.e., ECT. Although we can extract L from the
meta-data of a video, EC and EO vary not only across different persons but possibly
also between the two eyes of the same person (especially a patient); therefore, they
are difficult to compute. Nevertheless, we can also compute the number of eye-closed
frames, denoted ECF, using the same fraction coefficient, i.e.,

ECF = (EC / (EC + EO)) × F,  (3.2)
where F is the total number of frames. By dividing Eqn. (3.1) by Eqn. (3.2), we get a
relationship between ECT and ECF, which is as follows:

ECT = ECF × (L / F).  (3.3)
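The relationship in Eqn. (3.3) can be sanity-checked with made-up numbers; the clip length, frame rate, and ECF value below are purely illustrative, not measurements from the paper:

```python
# Illustrative check of Eqn. (3.3): ECT = ECF x (L / F).
# All concrete values here are invented for the example.

def ect_from_ecf(ecf: int, video_length_s: float, total_frames: int) -> float:
    """Total eye-closed time (seconds) recovered from the eye-closed frame count."""
    return ecf * video_length_s / total_frames

# A hypothetical 10-second clip at 30 fps (300 frames) in which one eye
# is detected as closed in 24 frames:
print(ect_from_ecf(ecf=24, video_length_s=10.0, total_frames=300))  # 0.8 seconds
```

Note that only the per-frame eye state (open/closed) is needed; the video length and frame count come for free from the video meta-data.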
Considering both the left (l) eye and the right (r) eye now, we have ECT_l = ECF_l × (L / F)
and ECT_r = ECF_r × (L / F). By dividing the two relationships by one another, we can
conclude that

ECT_l / ECT_r = ECF_l / ECF_r.  (3.4)
Thus, the ratio of two ECTs is equal to the ratio of two ECFs, and there is no dependence
on video parameters like L and F to compute that ratio. In this way, we model a ratio of
physical parameters (ECTs) as a ratio of computable parameters (ECFs). Therefore, we
need only to find the number of frames in which the left eye is closed (ECFl) and the number
of frames in which the right eye is closed (ECFr) to compute the ratio. This ratio should
be close to 1 for normal persons and deviate from 1 for Bell’s Palsy patients.
3.1. Blink Similarity
Inspired by the above ratio, which can identify Bell’s Palsy patients, we modify it so that
it has a physical meaning. We propose a novel video-level feature called blink similarity
(BS), which we compute in the following manner:
BS = min(ECF_l, ECF_r) / max(ECF_l, ECF_r),  (3.5)
where the ratio automatically becomes ECF_l / ECF_r or ECF_r / ECF_l depending upon whether the paralyzed
side is the left one or the right one, respectively, because the eye-closure time of the eye on
[Figure 3 (diagram): input 50×50 eye image/frame → 10 convolutional layers (kernel size
(2,2), ReLU activations, filters alternating 32 and 64), the first six each followed by
(2,2) max-pooling → flatten → fully connected layer of 1024 neurons (ReLU) with dropout
0.8 → fully connected layer of 2 neurons (softmax) → output class labels (Open or Close).]
Figure 3. CNN architecture used for developing our intermediate detection model (Blink
Detector).
the paralyzed side will be less than that of the other eye. Moreover, the feature has a
physical meaning: the similarity of blinks. It ranges from 0 to 1, and the higher the
value, the more similar the two blinking patterns.
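A minimal sketch of Eqn. (3.5) in Python; the ECF counts below are illustrative, and the zero-blink fallback is our own assumption, not something stated in the paper:

```python
# Minimal sketch of the blink-similarity feature of Eqn. (3.5).
# The ECF counts are illustrative; the zero-blink fallback is an assumption.

def blink_similarity(ecf_left: int, ecf_right: int) -> float:
    """min/max ratio of the two eye-closed frame counts; lies in [0, 1]."""
    if max(ecf_left, ecf_right) == 0:   # no closed-eye frames for either eye
        return 1.0                      # treat identical (empty) patterns as similar
    return min(ecf_left, ecf_right) / max(ecf_left, ecf_right)

print(blink_similarity(30, 29))  # near 1: both eyes blink together (normal-like)
print(blink_similarity(2, 31))   # far from 1: one eye barely closes (palsy-like)
```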
To develop this feature, we need an intermediate detection model at the frame level that
detects blinks (eye-closure). Also, to effectively use this feature, we need a final detection
model at the video level that detects Bell’s Palsy. We will discuss both these models next.
We give the entire end-to-end pipeline in Fig. 2.
3.2. Intermediate Detection Model
To detect the closed eye, we build an intermediate detector named blink detector using
a Convolutional Neural Network (CNN). It takes eye images as input and outputs the state
of the eye, i.e., whether it is closed or open. From any frame, we extract the person’s two
eyes and flip the left one horizontally, because we train the CNN only on right-type eyes
(left-type eyes are flipped into right-types). To create the blink detector, we build a
blink detection dataset (details in the next section) comprising open- and closed-eye
images to train the CNN. As presented in Fig. 3, the architecture’s convolutional base consists of 10
convolutional layers, each followed by a ReLU (Rectified Linear Unit) activation, with the first six also followed by max-pooling layers.
The convolutional base’s output is flattened and fed to the fully connected layers to generate
the final output. The last fully connected layer uses the softmax activation function, while
the other fully connected layer uses ReLU as its activation function. The purpose behind
having an additional Dropout (0.8) layer is to prevent overfitting. The input images for the
blink detector have a resolution of 50×50 pixels. Our CNN uses the categorical cross-entropy loss
(CEL), as given below:
CEL(y, ŷ) = − Σ_{j=1}^{C} Σ_{i=1}^{|I|} y_ij log(ŷ_ij),  (3.6)
where y_ij is the ground-truth label value of the i-th image with respect to the j-th class,
and ŷ_ij is the predicted label value of the i-th image with respect to the j-th class.
C is the number of classes, which is two (2) in this case. |I| is the total number of
images in a mini-batch. We use the
Adam optimization algorithm to train this CNN.
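To make Eqn. (3.6) concrete, the following stdlib-only snippet evaluates the loss on a toy mini-batch of three images and two classes (open/closed); the one-hot labels and predicted probabilities are invented for illustration:

```python
import math

# Toy mini-batch: |I| = 3 images, C = 2 classes (open, closed).
# y holds invented one-hot ground-truth labels; y_hat holds invented
# predicted class probabilities (each row sums to 1).
y     = [[1, 0], [0, 1], [1, 0]]
y_hat = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

# Double sum of Eqn. (3.6): only the true-class terms contribute,
# since y_ij is zero everywhere else.
cel = -sum(y[i][j] * math.log(y_hat[i][j])
           for i in range(len(y)) for j in range(len(y[0])))
print(cel)  # equals -(ln 0.9 + ln 0.8 + ln 0.6)
```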
3.3. Final Detection Model
Since we now have a video-level feature in the form of a blink similarity feature, we can
tune over several supervised learning algorithms to build the best final detection model based
[Figure 4 (image): sample eye crops predicted as open and predicted as closed.]
Figure 4. Sample results of our blink detector.
Table 1. Comparison of our Intermediate detector with those built using VGG and
Inception-ResNet features on our Blink Detection Dataset.
Detectors                                          Training   Validation   Testing
VGG + Random Forest Blink Detector                  99.96%      98.28%     98.09%
Inception-ResNet + Random Forest Blink Detector     99.90%      97.80%     97.56%
Proposed Blink Detector                             99.80%      99.72%     99.57%
on cross-validation classification accuracy. We found that Bell’s Palsy is best detected using
a model developed by applying Stochastic Gradient Descent (SGD) with hinge loss. The
resulting model is a binary classifier that detects whether the subject present in the video
has Bell’s Palsy, simply utilizing the intuitive blink similarity feature.
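The final classifier can be sketched without any ML library. The stdlib-only toy below trains a one-feature linear classifier with SGD on the hinge loss; the blink-similarity values, labels, learning rate, and epoch count are all synthetic choices of ours, not the paper’s:

```python
import random

random.seed(0)
# Synthetic blink-similarity (BS) values: label +1 = Bell's Palsy, -1 = normal.
# Patients show low BS (one eye barely blinks); normal subjects show BS near 1.
data = [(random.uniform(0.0, 0.4), +1) for _ in range(40)] + \
       [(random.uniform(0.8, 1.0), -1) for _ in range(40)]

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(200):
    random.shuffle(data)
    for bs, y in data:
        if y * (w * bs + b) < 1:      # hinge loss active: take a subgradient step
            w += lr * y * bs
            b += lr * y

predict = lambda bs: +1 if w * bs + b > 0 else -1
print(predict(0.2), predict(0.95))    # low BS -> palsy (+1), high BS -> normal (-1)
```

Since the feature is one-dimensional, the learned model is effectively a threshold on BS, which is what makes the decision easy to explain.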
4. Experimental Results
In this section, we discuss the datasets created, the evaluation process employed, our
detectors’ performances, and various comparisons.
4.1. Datasets
We develop two datasets: (i) One for developing the intermediate detection model (Blink
Detector). (ii) Another for developing the final detection model (Bell’s Palsy detector).
While the intermediate one takes an image/frame as input, the final one takes our video-level
feature as input. While the intermediate detector’s output is whether there is a blink or
not, the final detector’s output is whether the person has Bell’s Palsy or not.
Blink Detection Dataset: It consists of 21896 open-eye images and 19753 closed-eye
images. The purpose is to detect blinks, i.e., to detect if a given eye image contains a closed
eye. Most of these images were collected from YouTube videos by manually cropping out
the eyes frame-by-frame.
Bell’s Palsy Video Dataset: It consists of 41 videos of patients suffering from Bell’s
Palsy and 34 videos of normal persons. We collect most of these videos from YouTube.
We also test our method on the publicly available YouTube Facial Palsy (YFP) [21] dataset.
The YFP dataset contains videos of only Bell’s Palsy patients.
4.2. Evaluation
We report classification accuracy, which means the percentage of correct predictions. We
take different evaluation approaches for intermediate and final detections given the disparity
in the labeled data available for the two problems.
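Computing classification accuracy from a confusion matrix is straightforward; as a sanity check, the counts from Table 2 below yield a testing accuracy above 99%:

```python
def accuracy(confusion):
    """Accuracy = correct predictions / all predictions, for a 2x2
    confusion matrix whose diagonal holds the correct counts."""
    correct = confusion[0][0] + confusion[1][1]
    total = sum(sum(row) for row in confusion)
    return correct / total

# Counts taken from Table 2 (testing phase of our blink detector):
# rows = actually open / actually closed, cols = predicted open / closed.
acc = accuracy([[3538, 15], [7, 2849]])
# acc is above 0.99, consistent with the accuracies reported in Table 1
```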
Table 2. Confusion matrix obtained during the testing phase while building our intermediate
detector on our blink detection dataset.

Blink Detector      Predicted Open   Predicted Closed
Actually Open            3538               15
Actually Closed             7             2849
Table 3. Comparison of 3DPalsyNet [26] features with our Blink Similarity feature in
terms of classification accuracies obtained during cross-validation using different machine
learning algorithms on our Bell’s Palsy Video dataset.
AB LR NB NN RF SGD SVM DT kNN Average
3DPalsyNet [26] 62.2 73.8 66.8 81.6 68.2 71.0 70.8 63.0 79.7 70.8
Blink Similarity (Ours) 90.7 92.0 90.7 93.3 93.3 94.7 93.3 90.7 93.3 92.4
Table 4. Confusion Matrix obtained during cross-validation of our final detector on our
Bell’s Palsy Video dataset.
Bell’s Palsy
Predicted No Predicted Yes
Actual No 32 2
Actual Yes 2 39
For the intermediate detector, since we have labeled data in abundance, we can divide the
labeled data into three parts: training, validation, and testing. Specifically, we use the
hold-out approach to validation. In the hold-out approach, a portion of the dataset is kept
aside for validation. We manage to set the best hyperparameters while training the CNNs
for building the intermediate detector with the help of the validation scores. Once we set
them, we test the trained CNN on the testing dataset.
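A minimal sketch of this hold-out split, assuming the 70/15/15 train/validation/test partition used in our experiments (the shuffling and seed are our illustration; the paper does not prescribe them):

```python
import numpy as np

def holdout_split(n, train=0.70, val=0.15, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/val/test parts."""
    idx = np.random.default_rng(seed).permutation(n)
    n_tr = int(train * n)
    n_va = int(val * n)
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

# 21896 open-eye + 19753 closed-eye images = 41649 images in total.
tr, va, te = holdout_split(21896 + 19753)
# len(tr), len(va), len(te) -> 29154, 6247, 6248
```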
In contrast, since we have minimal labeled data for the final detector, we avoid the test-
ing phase altogether and employ the k-fold cross-validation approach on the entire labeled
dataset. Note that we set k to 3. The cross-validation helps us in identifying the best
algorithm to use among nine basic machine learning algorithms, namely AdaBoost (AB),
Logistic Regression (LR), Naive Bayes (NB), Neural Networks (NN), Random Forests (RF),
Stochastic Gradient Descent (SGD), Support Vector Machine (SVM), Decision Tree (DT),
and k-Nearest Neighbors (kNN). Once we choose the best learning algorithm, we train our
model on the entire labeled dataset to create our final detectors. We use the default setting
(as per Python’s Orange package) of hyper-parameters for these algorithms. In addition to
the classification accuracy, we also report appropriate confusion matrices while reporting
our detectors’ performances.
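This model-selection step can be sketched with scikit-learn (used here purely for illustration; the paper uses Python's Orange package with default hyper-parameters). The features are synthetic stand-ins for blink-similarity values, and only four of the nine candidate algorithms are shown:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic 1-D blink-similarity features: 34 healthy, 41 Bell's Palsy,
# mirroring the class sizes of our Bell's Palsy Video dataset.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.9, 0.05, 34),    # healthy: similar blinks
                    rng.normal(0.4, 0.15, 41)])   # palsy: dissimilar blinks
X = X.reshape(-1, 1)
y = np.array([0] * 34 + [1] * 41)

candidates = {
    "SGD": SGDClassifier(loss="hinge", random_state=0),
    "LR": LogisticRegression(),
    "DT": DecisionTreeClassifier(random_state=0),
    "kNN": KNeighborsClassifier(),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=3)  # 3-fold CV, as in the paper
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

The algorithm with the best mean cross-validation accuracy is then retrained on the entire labeled dataset.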
4.3. Intermediate Detection Results
We give the quantitative results of our intermediate detector in terms of classification
accuracy in Table 1, while also comparing with other detectors built using Random Forest
upon the features extracted from the pre-trained deep learning models like VGG19 and
Inception-ResNet. Our detector obtains superior results compared to others in challenging
validation and testing phases. It obtains classification accuracy of above 99% in all three
phases. We split the datasets into three parts: 70% for the training, 15% for validation,
and 15% for testing. The confusion matrix obtained during the testing phase using our
intermediate detector is given in Table 2. We give the sample detection results of our
intermediate detector in Figure 4. Our blink detector detects very well whether the eye is
in the open state or closed state.
4.4. Final Detection Results
We build our final detector using the blink similarity feature we have developed at the
video level. The quantitative results of our final detector in terms of classification accuracy
during cross-validation are given in Table 3. It is clear from the table that SGD (Stochastic
Gradient Descent) is the best algorithm for detecting Bell’s Palsy using our feature. Even
with very few labels, the algorithms’ high average accuracy shows that our feature is robust,
thanks to our disorder-specific design. As with the intermediate detector, we provide the
cross-validation confusion matrix for the final detector in Table 4. It clearly shows that
most videos are correctly classified; only four videos are misclassified.
4.5. Comparisons
There is a publicly available video dataset named the YouTube Facial Palsy (YFP) dataset.
It consists of 31 videos of Bell’s Palsy patients. On this dataset, we test the model trained
on our Bell’s Palsy Video dataset. It could detect 28/31 videos as Bell’s Palsy videos,
showing that our method can work cross-domain too. Since there is also a prior work [26] on
detecting Bell’s Palsy from videos, we extract the features from the network presented in
[26] and perform the same experiments as we did with our Blink Similarity feature on our
Bell’s Palsy Video dataset. The results are compared with ours in Table 3. It can be
seen that our blink similarity feature outperforms features of [26] no matter which machine
learning algorithm is used. [26] achieves the best cross-validation accuracy of 81.6% using
NN, whereas our feature achieves the best cross-validation accuracy of 94.7% using SGD.
In terms of average accuracy across the machine learning algorithms, while [26] obtains
70.8%, our feature obtains 92.4%. This comparison shows that our blink similarity feature
is superior to that of [26], thanks to our disorder-focused design.
4.6. Discussion & Limitations
Our feature can also indicate the severity of the disorder: the farther the blink
similarity feature value is from 1, the more severe the Bell's Palsy. Thus, the proposed
feature can also assist in the rehabilitation [28] process. Note that our proposed method
entirely relies on detecting blinks. Such events may not get adequately captured in some
videos. For example, a short video might not capture even a single blink. Similarly, a video
might capture only one eye.
In both cases, our method would fail. These are some of the limitations of our proposed
method.
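The severity reading of our feature can be sketched as a simple mapping; the clamping to [0, 1] is our illustration, not a calibrated clinical scale:

```python
def palsy_severity(blink_similarity):
    """Hypothetical severity score: the farther the blink-similarity
    value is from 1 (both eyes blinking identically), the more severe
    the palsy. Clamped to [0, 1]."""
    return min(max(abs(1.0 - blink_similarity), 0.0), 1.0)

# Nearly identical blinking -> low severity:
mild = palsy_severity(0.97)    # ~0.03
# Very dissimilar blinking -> high severity:
severe = palsy_severity(0.20)  # ~0.80
```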
5. Conclusion
We attempted a novel problem: eye-focused detection of Bell's Palsy. We designed a
novel eye-based feature named blink similarity at the video level. The feature is so robust
that it helps build a machine learning model with even very few labels. While designing
the feature, we also had to develop an intermediate detection model (a CNN) for blink
detection. In this way, we propose a two-stage hybrid approach that efficiently leverages the
benefits of both feature learning and hand-crafted feature design. We
build two datasets: one for blink detection and another for Bell’s Palsy detection. Our
extensive experiments demonstrate better performance than existing methods.
References
[1] D. N. Singh, M. F. Hashmi, and S. K. Sharma. “Predictive analytics & modeling for modern
health care system for cerebral palsy patients”. In: Multimedia Tools and Applications (2019),
pp. 1–23.
[2] G. Yolcu, I. Oztel, S. Kazan, C. Oz, K. Palaniappan, T. E. Lever, and F. Bunyak. “Facial
expression recognition for monitoring neurological disorders based on convolutional neural
network”. In: Multimedia Tools and Applications 78.22 (2019), pp. 31581–31603.
[3] D. Goyal, K. R. Jerripothula, and A. Mittal. “Detection of Gait Abnormalities caused by
Neurological Disorders”. In: 2020 IEEE 22nd International Workshop on Multimedia Signal
Processing (MMSP). 2020, pp. 1–6.
[4] F. Sullivan, F. Daly, and I. Gagyor. “Antiviral agents added to corticosteroids for early
treatment of adults with acute idiopathic facial nerve paralysis (Bell Palsy)”. In: Jama 316.8
(2016), pp. 874–875.
[5] D. B. Pitts, K. K. Adour, and R. L. Hilsinger Jr. “Recurrent Bell’s palsy: analysis of 140
patients”. In: The Laryngoscope 98.5 (1988), pp. 535–540.
[6] M. Divjak and H. Bischof. “Eye Blink Based Fatigue Detection for Prevention of Computer
Vision Syndrome.” In: MVA. 2009, pp. 350–353.
[7] T. J. Anderson and M. R. MacAskill. “Eye movements in patients with neurodegenerative
disorders”. In: Nature Reviews Neurology 9.2 (2013), pp. 74–85.
[8] G. Pusiol, A. Esteva, S. S. Hall, M. Frank, A. Milstein, and L. Fei-Fei. “Vision-Based Clas-
sification of Developmental Disorders Using Eye-Movements”. In: Medical Image Computing
and Computer-Assisted Intervention – MICCAI 2016. Ed. by S. Ourselin, L. Joskowicz, M. R.
Sabuncu, G. Unal, and W. Wells. Cham: Springer International Publishing, 2016, pp. 317–
325. isbn: 978-3-319-46723-8.
[9] T. Wang, S. Zhang, J. Dong, L. Liu, and H. Yu. “Automatic evaluation of the degree of facial
nerve paralysis”. In: Multimedia Tools and Applications 75.19 (2016), pp. 11893–11908.
[10] T. Appel, T. Santini, and E. Kasneci. “Brightness-and motion-based blink detection for head-
mounted eye trackers”. In: Proceedings of the 2016 ACM International Joint Conference on
Pervasive and Ubiquitous Computing: Adjunct. 2016, pp. 1726–1735.
[11] L. Schillingmann and Y. Nagai. “Yet another gaze detector: An embodied calibration free
system for the iCub robot”. In: 2015 IEEE-RAS 15th International Conference on Humanoid
Robots (Humanoids). IEEE. 2015, pp. 8–13.
[12] E. R. Anas, P. Henriquez, and B. J. Matuszewski. “Online eye status detection in the wild with
convolutional neural networks”. In: International Conference on Computer Vision Theory and
Applications. Vol. 7. SciTePress. 2017, pp. 88–95.
[13] K. W. Kim, H. G. Hong, G. P. Nam, and K. R. Park. “A study of deep CNN-based classifi-
cation of open and closed eyes using a visible light camera sensor”. In: Sensors 17.7 (2017),
p. 1534.
[14] K. Cortacero, T. Fischer, and Y. Demiris. “RT-BENE: A Dataset and Baselines for Real-Time
Blink Estimation in Natural Environments”. In: 2019 IEEE/CVF International Conference
on Computer Vision Workshop (ICCVW). 2019, pp. 1159–1168.
[15] M. Park, J. Seo, and K. Park. “PC-based asymmetry analyzer for facial palsy study in
uncontrolled environment: A preliminary study”. In: Computer methods and programs in
biomedicine 99.1 (2010), pp. 57–65.
[16] T. Wang, J. Dong, X. Sun, S. Zhang, and S. Wang. “Automatic recognition of facial movement
for paralyzed face”. In: Bio-medical materials and engineering 24.6 (2014), pp. 2751–2760.
[17] S. He, J. J. Soraghan, B. F. O’Reilly, and D. Xing. “Quantitative analysis of facial paraly-
sis using local binary patterns in biomedical videos”. In: IEEE Transactions on Biomedical
Engineering 56.7 (2009), pp. 1864–1870.
[18] T. H. Ngo, M. Seo, N. Matsushiro, W. Xiong, and Y.-W. Chen. “Quantitative analysis of
facial paralysis based on limited-orientation modified circular gabor filters”. In: 2016 23rd
International Conference on Pattern Recognition (ICPR). IEEE. 2016, pp. 349–354.
[19] H. Kim, S. Kim, Y. Kim, and K. Park. “A smartphone-based automatic diagnosis system for
facial nerve palsy”. In: Sensors 15.10 (2015), pp. 26756–26768.
[20] A. Asthana, S. Zafeiriou, S. Cheng, and M. Pantic. “Incremental face alignment in the wild”.
In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2014,
pp. 1859–1866.
[21] G.-S. J. Hsu, J.-H. Kang, and W.-F. Huang. “Deep hierarchical network with line segment
learning for quantitative analysis of facial palsy”. In: IEEE Access 7 (2018), pp. 4833–4842.
[22] K. R. Jerripothula, J. Cai, and J. Yuan. “CATS: Co-saliency Activated Tracklet Selection
for Video Co-localization”. In: Computer Vision – ECCV 2016. Ed. by B. Leibe, J. Matas,
N. Sebe, and M. Welling. Cham: Springer International Publishing, 2016, pp. 187–202. isbn:
978-3-319-46478-7.
[23] K. R. Jerripothula, J. Cai, and J. Yuan. “Efficient Video Object Co-Localization With Co-
Saliency Activated Tracklets”. In: IEEE Transactions on Circuits and Systems for Video
Technology 29.3 (2019), pp. 744–755.
[24] K. R. Jerripothula, J. Cai, and J. Yuan. “QCCE: Quality constrained co-saliency estima-
tion for common object detection”. In: 2015 Visual Communications and Image Processing
(VCIP). 2015, pp. 1–4.
[25] K. R. Jerripothula, J. Cai, and J. Yuan. “Quality-Guided Fusion-Based Co-Saliency Estima-
tion for Image Co-Segmentation and Colocalization”. In: IEEE Transactions on Multimedia
20.9 (2018), pp. 2466–2477.
[26] G. Storey, R. Jiang, S. Keogh, A. Bouridane, and C.-T. Li. “3DPalsyNet: A Facial Palsy
Grading and Motion Recognition Framework using Fully 3D Convolutional Neural Networks”.
In: IEEE Access 7 (2019), pp. 121655–121664.
[27] K. Hara, H. Kataoka, and Y. Satoh. “Can spatiotemporal 3d cnns retrace the history of 2d
cnns and imagenet?” In: Proceedings of the IEEE conference on Computer Vision and Pattern
Recognition. 2018, pp. 6546–6555.
[28] D. Avola, L. Cinque, G. L. Foresti, M. R. Marini, and D. Pannone. “VRheab: a fully immersive
motor rehabilitation system based on recurrent neural network”. In: Multimedia Tools and
Applications 77.19 (2018), pp. 24955–24982.