Augmented reality has been a topic of intense research for several years across many applications. It consists of inserting a virtual object into a real scene, where the virtual object must be accurately positioned at the desired place. Some measurements (calibration) are therefore required, and a set of correspondences between points on the calibration target and the camera images must be found. In this paper, we present a tracking technique based on both the detection of chessboard corners and a least-squares method; the objective is to estimate the perspective transformation matrix for the current view of the camera. This technique does not require any information about, or computation of, the camera parameters; it can be used in real time without any initialization, and the user can change the camera focal length without any fear of losing alignment between the real and virtual objects.
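The least-squares estimation of a perspective transformation from point correspondences can be sketched with the standard direct linear transform (DLT). The following is an illustrative sketch under that assumption, with function names of my own choosing, not the authors' exact implementation:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 perspective transform H (dst ~ H @ src) from
    point correspondences via the DLT least-squares formulation."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(A, dtype=float)
    # The homography is the null-space direction of A: the right
    # singular vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]          # normalise so that H[2, 2] == 1

def apply_homography(H, pt):
    """Map a 2-D point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

With four or more chessboard corner correspondences per frame, re-running this estimate for each camera view is what keeps the virtual object aligned without knowing the intrinsic camera parameters.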
Performance analysis on color image mosaicing techniques on FPGA (IJECE, IAES)
Today, surveillance and other monitoring systems capture image sequences rather than a single frame. The captured images can be combined to obtain a mosaiced image. However, the captured images may suffer from quality issues such as brightness, alignment (correlation), resolution, and manual image registration. Existing techniques such as cross-correlation can offer good image mosaicing but suffer from brightness issues. This paper therefore introduces two different mosaicing methods: (a) Sliding Window Module (SWM) based Color Image Mosaicing (CIM) and (b) Discrete Cosine Transform (DCT) based CIM, both on a Field Programmable Gate Array (FPGA). The SWM-based CIM performs corner detection on the two images and automatic image registration, while the DCT-based CIM handles both local and global alignment of the images using a phase-correlation approach. Finally, the performance of the two methods is analyzed by comparing parameters such as PSNR, MSE, device utilization, and execution time. The analysis concludes that the DCT-based CIM offers significantly better results than the SWM-based CIM.
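The phase-correlation alignment step can be illustrated in a few lines. This is a hedged NumPy sketch of classical FFT-based phase correlation recovering an integer translation between two frames, not the paper's FPGA implementation:

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer (dy, dx) translation of image b relative
    to image a using phase correlation."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    # Cross-power spectrum, normalised so only phase information
    # remains; this is what makes the peak brightness-insensitive.
    R = np.conj(Fa) * Fb
    R /= np.abs(R) + 1e-12
    corr = np.fft.ifft2(R).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts larger than half the image size to negative offsets.
    h, w = a.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

The peak location in the inverse transform of the normalised cross-power spectrum directly gives the translation between the two images to be mosaiced.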
Development of Human Tracking System for Video Surveillance (cscpconf)
Visual surveillance in dynamic scenes, especially of humans and objects, is one of the most active research areas, and an attempt is made to address it in this work. It has a wide spectrum of promising applications, including human identification for detecting suspicious behavior, crowd-flux statistics, and congestion analysis using multiple cameras. This paper deals with the problem of detecting and tracking multiple moving people against a static background. Foreground objects are detected by background subtraction; the detected objects are identified and analyzed through different blobs, and tracking is then performed by matching corresponding blob features. An algorithm has been developed in this perspective using the Angular Deviation of Center of Gravity (ADCG), which gives satisfying results for the segmentation of human objects.
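As a rough illustration of the background-subtraction and centre-of-gravity steps (a minimal sketch, not the authors' code; the ADCG criterion itself would then be computed from the angular change between successive centroids of a tracked blob):

```python
import numpy as np

def foreground_mask(frame, background, thresh=25):
    """Background subtraction: pixels differing from the static
    background by more than `thresh` grey levels are foreground."""
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

def blob_centroid(mask):
    """Centre of gravity (x, y) of the foreground pixels, the quantity
    on which the ADCG tracking criterion is built."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return xs.mean(), ys.mean()
```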
An Assessment of Image Matching Algorithms in Depth Estimation (CSCJournals)
Computer vision is often used with mobile robots for feature tracking, landmark sensing, and obstacle detection. Almost all high-end robotic systems are now equipped with pairs of cameras arranged to provide depth perception. In stereo vision applications, the disparity between the stereo images allows depth estimation within a scene. Detecting conjugate pairs in stereo images is a challenging problem known as the correspondence problem. The goal of this research is to assess the performance of SIFT, MSER, and SURF, three well-known matching algorithms, in solving the correspondence problem and then in estimating depth within the scene. The results of each algorithm are evaluated and presented. The conclusions and recommendations for future work point toward improving these powerful algorithms to achieve a higher level of efficiency within the scope of their performance.
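The disparity-to-depth relation this abstract relies on is the standard pinhole stereo model, Z = f·B/d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. A one-line sketch:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation: depth Z = f * B / d. Larger disparity
    means the point is closer to the cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For example, with an 800-pixel focal length and a 10 cm baseline, a 40-pixel disparity corresponds to a point 2 m away.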
Interactive Full-Body Motion Capture Using Infrared Sensor Network (ijcga)
Traditional motion capture (mocap) has been well studied in visual science for the last few decades. However, the field is mostly about capturing precise animation to be used in specific applications after intensive post-processing, such as studying biomechanics or rigging models in movies. These data sets are normally captured in complex laboratory environments with sophisticated equipment, making motion capture a field that is mostly exclusive to professional animators. In addition, obtrusive sensors must be attached to actors and calibrated within the capturing system, resulting in limited and unnatural motion. In recent years, the rise of computer vision and interactive entertainment has opened the gate for a different type of motion capture, one that focuses on optical markerless or mechanical sensorless capture. Furthermore, a wide array of low-cost devices has been released that are easy to use for less mission-critical applications. This paper describes a new technique that processes data from multiple infrared sensors to enhance the flexibility and accuracy of markerless mocap using commodity devices such as the Kinect. The method involves analyzing each individual sensor's data, then decomposing and rebuilding it into a uniform skeleton across all sensors. We then assign criteria to define the confidence level of the signal captured from each sensor. Each sensor operates in its own process and communicates through MPI. Our method emphasizes minimal calculation overhead for better real-time performance while maintaining good scalability.
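The confidence-weighted combination of per-sensor joint estimates described above can be sketched as follows. This is an illustrative stand-in with hypothetical names; the paper's actual confidence criteria and skeleton model are more involved:

```python
def fuse_joint(estimates):
    """Confidence-weighted fusion of one joint position reported by
    several sensors. `estimates` is a list of ((x, y, z), confidence);
    sensors with zero confidence (e.g. occluded views) contribute
    nothing."""
    total = sum(c for _, c in estimates)
    if total == 0:
        return None          # no sensor saw this joint reliably
    return tuple(
        sum(p[i] * c for p, c in estimates) / total for i in range(3)
    )
```

Running this per joint across all sensors yields one fused skeleton per frame, which is the kind of lightweight arithmetic that keeps the calculation overhead low.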
Occlusion and Abandoned Object Detection for Surveillance Applications (Editor IJCATR)
Object detection is an important step in any video analysis. The difficulties of object detection lie in finding hidden objects and finding unrecognized objects. Although many algorithms have been developed to discard occlusions as outliers, occlusion boundaries could potentially provide useful information about the scene's structure and composition. A novel framework for blob-based occluded object detection is proposed, together with a technique that can be used to detect occlusion. It detects and tracks occluded objects in video sequences captured by a fixed camera in crowded environments with occlusions. Initially, background subtraction is performed with a Mixture of Gaussians (MOG) model. Pedestrians are detected by computing Histogram of Oriented Gradients (HOG) descriptors and using a linear Support Vector Machine (SVM) as the classifier. In this work, a recognition and tracking system is built to detect abandoned objects in public transportation areas such as train stations and airports. Several experiments were conducted to demonstrate the effectiveness of the proposed approach; the results show the robustness and effectiveness of the proposed method.
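The background-modelling idea can be illustrated with a single Gaussian per pixel, a deliberate simplification of the full Mixture-of-Gaussians model the paper uses (MOG keeps several Gaussians per pixel and re-weights them):

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model: a simplified
    stand-in for the MOG model described above."""
    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(float)
        self.var = np.full(first_frame.shape, 100.0)
        self.alpha, self.k = alpha, k

    def apply(self, frame):
        """Return a boolean foreground mask and update the model."""
        frame = frame.astype(float)
        d2 = (frame - self.mean) ** 2
        fg = d2 > (self.k ** 2) * self.var     # Mahalanobis-style test
        # Adapt mean and variance only where the pixel matched the
        # background, so foreground objects do not pollute the model.
        a = np.where(fg, 0.0, self.alpha)
        self.mean += a * (frame - self.mean)
        self.var += a * (d2 - self.var)
        return fg
```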
Recognition and tracking moving objects using moving camera in complex scenes (IJCSEA Journal)
In this paper, we propose a method for effectively tracking moving objects in videos captured by a moving camera in complex scenes. The video sequences may contain highly dynamic backgrounds and illumination changes. Four main steps are involved in the proposed method. First, the video is stabilized using an affine transformation. Second, intelligent frame selection is performed to extract only those frames with a considerable change in content; this step reduces complexity and computational time. Third, the moving object is tracked using a Kalman filter and a Gaussian mixture model. Finally, object recognition using a bag of features is performed to recognize the moving objects.
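The Kalman-filter tracking step can be sketched with a textbook constant-velocity filter. This is an illustrative sketch of the standard predict/update cycle, not the authors' implementation; the paper pairs it with a Gaussian mixture model for detection:

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal 2-D constant-velocity Kalman filter of the kind used
    for tracking: state is (x, y, vx, vy), measurement is (x, y)."""
    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * q       # process noise
        self.R = np.eye(2) * r       # measurement noise

    def step(self, z):
        # Predict the next state from the motion model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the measured object centre z = (x, y).
        y = np.asarray(z) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]            # filtered position estimate
```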
Intelligent indoor mobile robot navigation using stereo vision (sipij)
The majority of existing robot navigation systems, which rely on laser range finders, sonar sensors, or artificial landmarks, can locate themselves in an unknown environment and then build a map of it. Stereo vision, while still a rapidly developing technique in the field of autonomous mobile robots, is currently less preferred because of its high implementation cost. This paper describes an experimental approach to building a stereo vision system that helps robots avoid obstacles and navigate through indoor environments while remaining very cost-effective. It discusses techniques for fusing stereo vision and ultrasound sensors that enable successful navigation through different types of complex environments. The data from the ultrasound sensors enable the robot to create a two-dimensional topological map of unknown environments, while the stereo vision system builds a three-dimensional model of the same environment.
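As a toy illustration of how repeated ultrasound range returns can populate a two-dimensional grid map (a hypothetical helper for intuition only, not the paper's mapping code):

```python
import math

def mark_obstacle(grid, robot_xy, heading_rad, range_m, cell=0.5):
    """Mark the grid cell hit by an ultrasound return. The sensor
    reports a range along the robot's heading; repeated over many
    readings this accumulates a 2-D map of the room. `cell` is the
    grid resolution in metres."""
    x = robot_xy[0] + range_m * math.cos(heading_rad)
    y = robot_xy[1] + range_m * math.sin(heading_rad)
    gx, gy = int(x / cell), int(y / cell)
    if 0 <= gy < len(grid) and 0 <= gx < len(grid[0]):
        grid[gy][gx] = 1       # occupied
    return gx, gy
```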
Object Detection and Tracking in Video Sequences (IDES Editor)
This paper focuses on key steps in video analysis: detection of moving objects of interest and tracking of such objects from frame to frame. The object shape representations commonly employed for tracking are first reviewed, and the criteria for feature selection in tracking are discussed. Various object detection and tracking approaches are then compared and analyzed.
Deconstructing SfM-Net architecture and beyond
"SfM-Net is a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion, and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), differentiably warps frames in time to match pixels, and back-propagates."
Alternative download:
https://www.dropbox.com/s/aezl7ro8sy2xq7j/sfm_net_v2.pdf?dl=0
Computer vision has received great attention over the last two decades.
This research field is important not only in security-related software but also in the advanced interface between people and computers, advanced control methods, and many other areas.
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION (sipij)
The implementation of a stand-alone system developed in the Java language for motion detection is discussed. The open-source OpenCV library has been adopted for video surveillance image processing, implementing the background subtraction algorithm, also known as foreground detection. Generally, the region of interest of a body or object to detect corresponds to precise objects (people, cars, etc.) standing out against a background; this technique is widely used for tracking moving objects. In particular, the BackgroundSubtractorMOG2 algorithm of OpenCV has been applied. This algorithm is based on Gaussian distributions and offers better adaptability to different scenes under changes in lighting, as well as detection of shadows. The implemented webcam system relies on saving frames and creating GIF and JPG files from previously saved frames. In particular, the findContours function has been adopted to detect contours after background subtraction. The number of these contours is compared with a sensitivity threshold set by a user-modifiable slider, which controls saving the frames as GIFs composed of different merged JPEGs. After the full design of the image-processing prototype, different motion tests were performed. The results showed the importance of using few sensitivity points in order to obtain more frequent image storage, even for minor movements. Sensitivity points can be modified through the slider function and are inversely proportional to the number of saved images; for a small object in motion, a low percentage of sensitivity points is detected. Experimental results prove that the settings are mainly a function of the type of moving object rather than of the lighting conditions. The proposed prototype system is suitable as a video surveillance smart camera in industrial systems.
Moving object detection using background subtraction algorithm using Simulink (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
Object Capturing in a Cluttered Scene by Using Point Feature Matching (IJERA Editor)
Capturing here means getting or catching a target. This project presents an algorithm for capturing a specific target based on point correspondences between a reference image and a target image. It can capture objects under in-plane rotation and is also effective for small amounts of out-of-plane rotation. This method of object capturing works best for objects that exhibit distinctive texture patterns in a cluttered scene, which give rise to unique point feature matches. When part of the object is occluded by other objects in the scene, only the features of that part are missed; as long as enough features are detected in the unoccluded part, the object can still be captured. The local representation is based on appearance, so there is no need to extract geometric primitives (e.g., lines), which are generally hard to detect reliably.
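The "unique point feature matches" idea is usually obtained with nearest-neighbour descriptor matching plus Lowe's ratio test; a match is kept only when the best candidate is clearly better than the second best. A hedged sketch on plain descriptor vectors (not the project's code, and agnostic to which detector produced the descriptors):

```python
import numpy as np

def match_features(desc_a, desc_b, ratio=0.75):
    """Nearest-neighbour matching with the ratio test: returns a list
    of (index_in_a, index_in_b) pairs that are unambiguous matches."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, j2 = np.argsort(dists)[:2]
        # Keep the match only if the best candidate beats the runner-up
        # by a clear margin; ambiguous matches are discarded.
        if dists[j] < ratio * dists[j2]:
            matches.append((i, int(j)))
    return matches
```

Because each match is judged independently, occluding part of the object simply removes some candidate matches without breaking the rest, which is exactly the robustness the abstract describes.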
Overview of Video Object Tracking System (Editor IJMTER)
The goal of a video object tracking system is to segment a region of interest from a video scene and keep track of its motion, position, and occlusion. A video object tracking system has three steps: object detection, object classification, and object tracking. Object detection is performed to check for the existence of objects in the video. The detected objects can then be classified into various categories on the basis of their shape, motion, color, and texture. Object tracking is performed by monitoring object changes. In this paper, we give an overview of different object detection, object classification, and object tracking techniques, along with a comparison of the techniques used at the various stages of tracking.
Color-based image processing, tracking and automation using MATLAB (Kamal Pradhan)
Image processing is a form of signal processing in which the input is an image, such as a photograph or a video frame. The output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. This project aims at processing real-time images captured by a webcam for motion detection, color recognition, and system automation using MATLAB programming.
In color-based image processing we work with colors instead of objects. Color provides powerful information for object recognition; a simple and effective recognition scheme is to represent and match images on the basis of color histograms.
Tracking refers to detecting the path of the color: once the color-based processing is done, the color becomes the object to be tracked, which can be very helpful for security purposes.
Automation refers to an automated system, that is, any system that does not require human intervention. In this project the mouse has been automated to work with gestures and perform the desired tasks.
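The colour-histogram representation mentioned above can be sketched in plain Python. This is a minimal illustration of histogram matching in the spirit of Swain and Ballard's colour indexing, not the project's MATLAB code:

```python
def color_histogram(pixels, bins=4):
    """Quantised RGB histogram: each channel is split into `bins`
    levels, giving a bins**3 vector that represents the image."""
    hist = [0] * bins ** 3
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    n = len(pixels)
    return [h / n for h in hist]

def histogram_intersection(h1, h2):
    """Histogram intersection similarity: 1.0 means identical
    colour distributions, 0.0 means no overlap at all."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Matching a candidate region's histogram against a stored model histogram is what turns "the color" into a trackable object.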
Matching algorithm performance analysis for autocalibration method of stereo ... (TELKOMNIKA JOURNAL)
Stereo vision is one of the interesting research topics in the computer vision field. Two cameras are used to generate a disparity map, from which depth is estimated. Camera calibration is the most important step in stereo vision: the calibration step generates the intrinsic parameters of each camera in order to obtain a better disparity map. In general, the calibration process is done manually using a chessboard pattern, but this process is an exhausting task. Self-calibration is an important ability required to overcome this problem, and it requires a robust matching algorithm to find key features between images as references. The purpose of this paper is to analyze the performance of three matching algorithms for the autocalibration process: SIFT, SURF, and ORB. The results show that SIFT performs better than the other methods.
Interactive full body motion capture using infrared sensor networkijcga
Traditional motion capture (mocap) has been
well
-
stud
ied in visual science for
the last decades
. However
the fie
ld is mostly about capturing
precise animation to be used in
specific
application
s
after
intensive
post
processing such as studying biomechanics or rigging models in movies. These data set
s are normally
captured in complex laboratory environments with
sophisticated
equipment thus making motion capture a
field that is mostly exclusive to professional animators.
In
addition
, obtrusive sensors must be attached to
actors and calibrated within t
he capturing system, resulting in limited and unnatural motion.
In recent year
the rise of computer vision and interactive entertainment opened the gate for a different type of motion
capture which focuses on producing
optical
marker
less
or mechanical sens
orless
motion capture.
Furtherm
ore a wide array of low
-
cost
device are released that are easy to use
for less mission critical
applications
.
This paper
describe
s
a new technique of using multiple infrared devices to process data from
multiple infrared sensors to enhance the flexibility and accuracy of the markerless mocap
using commodity
devices such as Kinect
. The method involves analyzing each individual sensor
data, decompose and rebuild
them into a uniformed skeleton across all sensors. We then assign criteria to define the confidence level of
captured signal from
sensor. Each sensor operates on its own process and communicates through MPI.
Our method emphasize
s on the need of minimum calculation overhead for better real time performance
while being able to maintain good scalability
Interactive Full-Body Motion Capture Using Infrared Sensor Network ijcga
Traditional motion capture (mocap) has been well-studied in visual science for the last decades. However the field is mostly about capturing precise animation to be used in specific applications after intensive post processing such as studying biomechanics or rigging models in movies. These data sets are normally captured in complex laboratory environments with sophisticated equipment thus making motion capture a
field that is mostly exclusive to professional animators. In addition, obtrusive sensors must be attached to actors and calibrated within the capturing system, resulting in limited and unnatural motion. In recent year the rise of computer vision and interactive entertainment opened the gate for a different type of motion capture which focuses on producing optical markerless or mechanical sensorless motion capture. Furthermore a wide array of low-cost device are released that are easy to use for less mission critical applications. This paper describes a new technique of using multiple infrared devices to process data from multiple infrared sensors to enhance the flexibility and accuracy of the markerless mocap using commodity
devices such as Kinect. The method involves analyzing each individual sensor data, decompose and rebuild
them into a uniformed skeleton across all sensors. We then assign criteria to define the confidence level of
captured signal from sensor. Each sensor operates on its own process and communicates through MPI.
Our method emphasizes on the need of minimum calculation overhead for better real time performance
while being able to maintain good scalability.
Occlusion and Abandoned Object Detection for Surveillance ApplicationsEditor IJCATR
Object detection is an important step in any video analysis. Difficulties of the object detection are finding hidden objects
and finding unrecognized objects. Although many algorithms have been developed to avoid them as outliers, occlusion boundaries
could potentially provide useful information about the scene’s structure and composition. A novel framework for blob based occluded
object detection is proposed. A technique that can be used to detect occlusion is presented. It detects and tracks the occluded objects in
video sequences captured by a fixed camera in crowded environment with occlusions. Initially the background subtraction is modeled
by a Mixture of Gaussians technique (MOG). Pedestrians are detected using the pedestrian detector by computing the Histogram of
Oriented Gradients descriptors (HOG), using a linear Support Vector Machine (SVM) as the classifier. In this work, a recognition and
tracking system is built to detect the abandoned objects in the public transportation area such as train stations, airports etc. Several
experiments were conducted to demonstrate the effectiveness of the proposed approach. The results show the robustness and
effectiveness of the proposed method.
Recognition and tracking moving objects using moving camera in complex scenesIJCSEA Journal
In this paper, we propose a method for effectively tracking moving objects in videos captured using a
moving camera in complex scenes. The video sequences may contain highly dynamic backgrounds and
illumination changes. Four main steps are involved in the proposed method. First, the video is stabilized
using affine transformation. Second, intelligent selection of frames is performed in order to extract only
those frames that have a considerable change in content. This step reduces complexity and computational
time. Third, the moving object is tracked using Kalman filter and Gaussian mixture model. Finally object
recognition using Bag of features is performed in order to recognize the moving objects.
Intelligent indoor mobile robot navigation using stereo visionsipij
Majority of the existing robot navigation systems, which facilitate the use of laser range finders, sonar
sensors or artificial landmarks, has the ability to locate itself in an unknown environment and then build a
map of the corresponding environment. Stereo vision,while still being a rapidly developing technique in the
field of autonomous mobile robots, are currently less preferable due to its high implementation cost. This
paper aims at describing an experimental approach for the building of a stereo vision system that helps the
robots to avoid obstacles and navigate through indoor environments and at the same time remaining very
much cost effective. This paper discusses the fusion techniques of stereo vision and ultrasound sensors
which helps in the successful navigation through different types of complex environments. The data from
the sensor enables the robot to create the two dimensional topological map of unknown environments and
stereo vision systems models the three dimension model of the same environment.
Object Detection and tracking in Video SequencesIDES Editor
This paper focuses on key steps in video analysis
i.e. Detection of moving objects of interest and tracking of
such objects from frame to frame. The object shape
representations commonly employed for tracking are first
reviewed and the criterion of feature Selection for tracking is
discussed. Various object detection and tracking approaches
are compared and analyzed.
Deconstructing SfM-Net architecture and beyond
"SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), differentiably warps frames in time to match pixels and back-propagates."
Alternative download:
https://www.dropbox.com/s/aezl7ro8sy2xq7j/sfm_net_v2.pdf?dl=0
Computer vision has received great attention over the last two decades.
This research field is important not only in security-related software but also in the advanced interface between people and computers, advanced control methods, and many other areas.
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONsipij
The implementation of a stand-alone system developed in JAVA language for motion detection has been discussed. The open-source OpenCV library has been adopted for video surveillance image processing thus implementing Background Subtraction algorithm also known as foreground detection algorithm. Generally the region of interest of a body or object to detect is related to a precise objects (people, cars, etc.) emphasized on a background. This technique is widely used for tracking a moving objects. In particular, the BackgroundSubtractorMOG2 algorithm of OpenCV has been applied. This algorithm is based on Gaussian distributions and offers better adaptability to different scenes due to changes in lighting and the detection of shadows as well. The implemented webcam system relies on saving frames and creating GIF and JPGs files with previously saved frames. In particular the Background Subtraction function, find Contours, has been adopted to detect the contours. The numerical quantity of these contours has been compared with the tracking points of sensitivity obtained by setting an user-modifiable slider able to save the frames as GIFs composed by different merged JPEGs. After a full design of the image processing prototype different motion test have been performed. The results showed the importance to consider few sensitivity points in order to obtain more frequent image storages also concerning minor movements.Sensitivity points can be modified through a slider function and are inversely proportional to the number of saved images. For small object in motion will be detected a low percentage of sensitivity points.Experimental results proves that the setting condition are mainly function of the typology of moving object rather than the light conditions. The proposed prototype system is suitable for video surveillance smart
camera in industrial systems.
Moving object detection using background subtraction algorithm using simulinkeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Object Capturing in a Cluttered Scene by Using Point Feature Matching (IJERA Editor)
Capturing means getting or catching. This project contains an algorithm for capturing a specific target based on point correspondences between a reference and a target image. It can capture objects under in-plane rotation and is also effective for small amounts of out-of-plane rotation. This method of object capturing works best for objects that exhibit cluttered texture patterns, which give rise to unique point feature matches. When a part of the object is occluded by other objects in the scene, only the features of that part are missed; as long as enough features are detected in the unoccluded part, the object can be captured. The local representation is based on appearance, so there is no need to extract geometric primitives (e.g. lines), which are generally hard to detect reliably.
Overview of Video Object Tracking System (Editor IJMTER)
The goal of a video object tracking system is segmenting a region of interest from a video scene and keeping track of its motion, position and occlusion. There are three steps in a video object tracking system: object detection, object classification and object tracking. Object detection is performed to check the existence of objects in the video. The detected objects can then be classified into various categories on the basis of their shape, motion, color and texture. Object tracking is performed by monitoring object changes. In this paper we give an overview of different object detection, object classification and object tracking techniques, together with a comparison of the different techniques used for the various stages of tracking.
Color based image processing, tracking and automation using Matlab (Kamal Pradhan)
Image processing is a form of signal processing in which the input is an image, such as a photograph or video frame. The output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. This project aims at processing real-time images captured by a webcam for motion detection and color recognition, and at system automation using MATLAB programming.
In color based image processing we work with colors instead of objects. Color provides powerful information for object recognition. A simple and effective recognition scheme is to represent and match images on the basis of color histograms.
Tracking refers to detection of the path of the color: once the color based processing is done, the color becomes the object to be tracked, which can be very helpful for security purposes.
Automation refers to an automated system, i.e. any system that does not require human intervention. In this project the mouse has been automated so that it works with our gestures and performs the desired tasks.
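As a concrete illustration of the histogram-based matching scheme mentioned above, here is a minimal NumPy sketch using the classic histogram intersection measure of Swain and Ballard; the function names are illustrative, not from the project itself:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Concatenated per-channel color histogram, normalized to sum to 1."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(image.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())
```

Recognition then amounts to computing the histogram of a candidate region and keeping the reference image whose histogram intersection with it is highest.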
Matching algorithm performance analysis for autocalibration method of stereo ... (TELKOMNIKA JOURNAL)
Stereo vision is one of the interesting research topics in the computer vision field. Two cameras are used to generate a disparity map, resulting in a depth estimation. Camera calibration is the most important step in stereo vision; the calibration step is used to generate the intrinsic parameters of each camera to get a better disparity map. In general, the calibration process is done manually using a chessboard pattern, but this process is an exhausting task. Self-calibration is an important ability required to overcome this problem. Self-calibration requires a robust and good matching algorithm to find the key features between images as reference. The purpose of this paper is to analyze the performance of three matching algorithms for the autocalibration process. The matching algorithms used in this research are SIFT, SURF, and ORB. The result shows that SIFT performs better than the other methods.
Wireless Vision based Real time Object Tracking System Using Template Matching (IDES Editor)
In the present work the concept of template matching through Normalized Cross Correlation (NCC) has been used to implement a robust real time object tracking system. In this implementation a wireless surveillance pinhole camera has been used to grab the video frames from a non-ideal environment. Once the object has been detected, it is tracked by employing an efficient template matching algorithm. The templates used for matching are generated dynamically; this ensures that any change in the pose of the object does not hinder the tracking procedure. To automate the tracking process the camera is mounted on a disc coupled with a stepper motor, which is synchronized with the tracking algorithm. As and when the object being tracked moves out of the viewing range of the camera, the setup automatically moves the camera so as to keep the object within an approximately 360 degree field of view. The system is capable of handling entry and exit of an object. The performance of the proposed object tracking system has been demonstrated in a real time environment, both indoor and outdoor, including a variety of disturbances and a means to detect loss of track.
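The NCC matching described above can be sketched in a few lines of NumPy; this is an exhaustive-search toy version, not the optimized implementation such a system would use, and the function names are illustrative:

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equal-size grayscale patches."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0

def match_template(image, template):
    """Top-left corner of the best-scoring template position (brute force)."""
    H, W = image.shape
    h, w = template.shape
    best, best_pos = -2.0, None
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            score = ncc(image[y:y + h, x:x + w], template)
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos
```

Dynamically regenerating the template, as the abstract describes, then amounts to re-cutting the template from the last matched patch on every frame.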
Design and implementation of video tracking system based on camera field of view (sipij)
The basic idea of this paper is the design and implementation of a video tracking system based on Camera Field of View (CFOV); Otsu's method was used to detect targets such as vehicles and people. Whereas most algorithms spend a lot of time executing this process, an algorithm was developed to achieve it in little time. Histogram projection was used in both directions to detect the target in the search region, which is robust to various light conditions in Charge Coupled Device (CCD) camera images and saves computation time.
Our algorithm is based on background subtraction, and a normalized cross correlation operation over a series of sequential sub-images can estimate the motion vector. The camera field of view (CFOV) was determined and calibrated to find the relation between real distance and image distance. The system was tested by measuring the real position of an object in the laboratory and comparing it with the computed result. These results are promising for developing the system further.
AUTO LANDING PROCESS FOR AUTONOMOUS FLYING ROBOT BY USING IMAGE PROCESSING BA... (csandit)
In today’s technological life, everyone is quite familiar with the importance of security measures in our lives. In this regard, many attempts have been made by researchers, and one of them is flying robot technology. One well-known usage of the flying robot, perhaps, is its capability in security and care measurements, which makes this device extremely practical, not only for its unmanned movement, but also for its unique manoeuvres during flight over arbitrary areas. In this research, the automatic landing of a flying robot is discussed. The system is based on frequent interrupts sent from the main microcontroller to a camera module in order to take images; these images are analysed by an image processing system based on edge detection, after which the system can tell whether or not to land on the ground. This method shows good performance in terms of precision in experiments.
IMAGE RECOGNITION USING MATLAB SIMULINK BLOCKSET (IJCSEA Journal)
The world over, image recognition systems are essential players in promoting quality object recognition, especially in emergency and search-and-rescue operations. In this paper a precise image recognition system using the Matlab Simulink Blockset to detect a selected object in a crowd is presented. The process involves extracting object features and then recognizing the object considering illumination, direction and pose. A Simulink model has been developed to eliminate tiny elements from the image and then create segments for precise object recognition. Furthermore, the simulation explores image recognition from colored and gray-scale images through image processing techniques in the Simulink environment. The tool employed for computation and simulation is the Matlab image processing blockset. The process comprises a morphological operation method which is effective for captured images and video. The results of extensive simulations indicate that this method is suitable for identifying a person in a crowd. The model can be used in emergency and search-and-rescue operations as well as in medicine, information security, access control, law enforcement, surveillance systems, microscopy, etc.
REGISTRATION TECHNOLOGIES and THEIR CLASSIFICATION IN AUGMENTED REALITY THE K... (IJCSEA Journal)
Registration in augmented reality is a process which merges virtual objects generated by computer with the real world image captured by the camera. This paper describes knowledge-based registration, computer vision-based registration and tracker-based registration technology, with a main focus on tracker-based registration technology in augmented reality. Methods, problems and solutions in tracker-based technology are also described.
An algorithm to quantify swelling by reconstructing a 3D model of the face from stereo images is presented. We analyzed the primary problems in computational stereo, which include correspondence and depth calculation. Work has been carried out to determine suitable methods for depth estimation and for standardizing volume estimations. Finally, we designed software for reconstructing 3D images from 2D stereo images, built on Matlab and Visual C++. Utilizing techniques from multi-view geometry, a 3D model of the face was constructed and refined. An explicit analysis of stereo disparity calculation methods was carried out, and filtering of unreliable disparity estimates was used to increase the reliability of the disparity map. Minimizing variability in position by using more precise positioning techniques and resources will increase the accuracy of this technique and is a focus for future work.
A Detailed Analysis on Feature Extraction Techniques of Panoramic Image Stitc... (IJEACS)
Image stitching is a technique used for obtaining a high resolution panoramic image. In this technique, distinct images taken from different views and angles are combined together to produce a panoramic image. In the fields of computer graphics, photography and computer vision, image stitching techniques are considered current research areas. For obtaining a stitched image it is mandatory to have knowledge of the geometric relations among the multiple image coordinate systems [1]. First, image stitching is done based on feature key point matches; the seam in the final image is then blended with an image blending technique. Hence in this paper we address multiple distinct techniques, such as the invariant features Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF) and the Harris Corner Detection technique, that are useful in sorting out the issues related to stitching of images.
Leader Follower Formation Control of Ground Vehicles Using Dynamic Pixel Coun... (ijma)
This paper deals with leader-follower formations of non-holonomic mobile robots, introducing a formation control strategy based on pixel counts using a commercial grade electro-optic camera. Localization of the leader for motions along the line of sight as well as obliquely inclined directions is considered, based on the pixel variation of the images with reference to two arbitrarily designated positions in the image frames. Based on an established relationship between the displacement of the camera along the viewing direction and the difference in pixel counts between reference points in the images, the range and angle estimates between the follower camera and the leader are calculated. The Inverse Perspective Transform is used to account for the nonlinear relationship between the height of a vehicle in a forward facing image and its distance from the camera. The formulation is validated with experiments.
Tracking Chessboard Corners Using Projective Transformation for Augmented Reality
S. Malek, N. Zenati-Henda, M. Belhocine & S. Benbelkacem
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 78
Tracking Chessboard Corners Using Projective Transformation
for Augmented Reality
S. Malek, s_malek@cdta.dz
Advanced Technologies Development Center (CDTA)
Algiers, 16005, Algeria

N. Zenati-Henda, nzenati@cdta.dz
Advanced Technologies Development Center (CDTA)
Algiers, 16005, Algeria

M. Belhocine, mbelhocine@cdta.dz
Advanced Technologies Development Center (CDTA)
Algiers, 16005, Algeria

S. Benbelkacem, sbenbelkacem@cdta.dz
Advanced Technologies Development Center (CDTA)
Algiers, 16005, Algeria
Abstract
Augmented reality has been a topic of intense research for several years for many applications. It
consists of inserting a virtual object into a real scene. The virtual object must be accurately
positioned in a desired place. Some measurements (calibration) are thus required and a set of
correspondences between points on the calibration target and the camera images must be found.
In this paper, we present a tracking technique based on both detection of Chessboard corners
and a least squares method; the objective is to estimate the perspective transformation matrix for
the current view of the camera. This technique does not require any information about or computation of the camera parameters; it can be used in real time without any initialization, and the user can change the camera focal length without any fear of losing alignment between the real and virtual objects.
Keywords: Pinhole Model, Least Squares Method, Augmented Reality, Chessboard Corners Detection.
1. INTRODUCTION
The term Augmented Reality (AR) is used to describe systems that blend computer generated
virtual objects with real environments. In real-time systems, AR is a result of mixing live video
from a camera with computer-generated graphical objects that are recorded in a user’s three-
dimensional environment [1]. This augmentation may include labels (text), 3D rendered models,
or shading and illumination variations.
In order for AR to be effective, the real and computer-generated objects must be accurately
positioned relative to each other. This implies that certain measurements or calibrations need to
be made initially.
Camera calibration is the first step toward computational computer vision. The process of camera
calibration determines the intrinsic parameters (internal characteristics of camera) and/or extrinsic
parameters of the camera (camera position related to a fixed 3D frame) from correspondences
between points in the real world (3D) and their projection points (2D) in one or more images.
There are different methods used to estimate the parameters of the camera model. They are
classified in three groups of techniques:
* Non-linear optimization techniques: the camera parameters are obtained through iteration with the constraint of minimizing a determined function. These techniques are used in many works [2][3]. Their advantage is that almost any model can be calibrated and accuracy usually increases with the number of iterations. However, these techniques require a good initial guess in order to guarantee convergence.
* Linear techniques which compute the transformation matrix: due to the slowness and
computational burden of the first techniques, closed-form solutions have also been suggested.
These techniques use the least squares method to obtain a transformation matrix which relates
3D points with their projections [4][5].
* Two-steps techniques: these approaches consider a linear estimation of some parameters while
others are iteratively estimated.
To ensure tracking of the virtual object, the camera position must be estimated for each view.
Several techniques can be used. They are classified in two groups: vision-based tracking
techniques and sensor-based tracking techniques [6].
Most of the available vision-based tracking techniques are divided into two classes: feature-based
and model-based. The principle of the feature-based methods is to find a correspondence
between 2D image features and their 3D world frame coordinates. The camera pose can then be
found from projecting the 3D coordinates of the feature into the observed 2D image coordinates
and minimizing the distance to their corresponding 2D features [7]. The features extracted are
often used to construct models and use them in model-based methods; edges are often the most
used features as they are computationally efficient to find and robust to changes in lighting [6].
The sensor-based techniques are based on measurements provided by different types of
sensors: magnetic, acoustic, inertial, GPS ... In augmented reality, they are mainly combined with
techniques of the first group (hybrid tracking). These methods have the advantage of being robust
to abrupt movements of the camera, but they have the disadvantage of being sensitive to environmental disturbances or of having a range limited to a small volume.
Recently, another classification has become more common to distinguish between vision-based tracking techniques: fiducial marker tracking and markerless tracking [8]. In the first, the fiducial marker is surrounded by a black rectangle or circle shape boundary for easy detection. Markerless techniques are the other vision-based tracking techniques, which do not use fiducial markers.
Common to augmented reality applications is the necessity to calibrate the camera first; if the user wants to change the camera, the resolution or the focal length (zoom lens), he must then calibrate his camera again before starting to work with it.
This paper introduces a technique to model a camera using a robust method to find the
correspondences without any need of calibration. We use the least squares method to estimate
the Perspective Transformation Matrix. For tracking, we use chessboard corners as features to
track; these corners are extracted from each video image and used to estimate the Perspective
Transformation Matrix. This system does not need any camera parameters information (intrinsic
and extrinsic parameters which are included in the Perspective Transformation Matrix). The only
requirement is the ability to track the chessboard corners across video images.
The rest of the paper is organized as follows. Section 2 describes related work. Section 3
discusses and evaluates the proposed approach of tracking using a chessboard pattern. In
section 4, we present how virtual objects are inserted into real scenes and we give some results. Finally, in the last section, conclusions and a discussion of the results are given.
2. RELATED WORKS
There are many works which use fiducial markers for tracking. ARToolKit [9][10] is one of the most popular libraries for augmented reality; it uses a rectangular shape which contains an arbitrary grayscale pattern. When a marker is detected, its pattern is extracted and cross-correlated with all known patterns stored in its database, and the camera pose is estimated by using the projective relation between the vertices of the fiducial marker in the scene and the world. ARToolKit has the inconvenience of becoming slower when using several patterns and markers. However, later research suggested applying digital communication coding techniques to improve the system's performance, at the cost of customization: QR-code [11], TRIP [12], ARTag [13], Cantag [14].
In markerless tracking, the principle is to match features extracted from scenes taken from different viewpoints. Lowe [15] presents a method named SIFT (Scale Invariant Feature Transform) for extracting distinctive invariant features from images. These features can be used to perform reliable matching between different viewpoints; they are invariant to image scale and rotation, and provide robust matching under changes in illumination or the addition of noise. Bay et al. [16] present a novel scale- and rotation-invariant interest point detector and descriptor, named SURF (Speeded Up Robust Features), which approximates and even outperforms previously proposed methods in robustness while being much faster.
Recently, Lee and Höllerer [17] extract distinctive image features of the scene and track them frame-by-frame by computing optical flow. To avoid time-consuming processing and achieve real-time performance, multiple operations are processed in a synchronized multithreaded manner.
3. IMPLEMENTATION
We will present here the camera model and the tracking method used for augmented reality.
3.1 Camera Model
The model is a mathematical formulation which approximates the behavior of a physical device, e.g. a camera. There are several camera models, depending on the desired accuracy. The simplest is the pinhole model.
In an AR system, it is necessary to know the relationship between the 3D object coordinates and
the image coordinates. The pinhole model defines the basic projective imaging geometry with
which 3D objects are projected onto a 2D image plane.
This is an ideal model commonly used in computer graphics and computer vision to capture the
imaging geometry. The transformation that maps the 3D world points into the 2D image
coordinates is characterized by the next expression:
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} \alpha_u & \gamma & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}    (1)
Equation (1) can be simplified to:
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = A \cdot [R \mid t] \cdot \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}    (2)
With: [u, v, 1]^T represents a 2D point position in pixel coordinates, and [X, Y, Z]^T represents a 3D point position in world coordinates. t describes the translation between the two frames (camera frame and world frame), R is a 3x3 orthonormal rotation matrix which can be defined by the three Euler angles, and A is the intrinsic matrix containing the 5 intrinsic parameters.
The product A·[R | t] represents the "Perspective Transformation Matrix" M, with:
M = A.R.t =
34
24
14
333231
232221
131211
m
m
m
mmm
mmm
mmm
(3)
From equations (2) and (3), we get:

u = \frac{m_{11}X + m_{12}Y + m_{13}Z + m_{14}}{m_{31}X + m_{32}Y + m_{33}Z + m_{34}}, \qquad
v = \frac{m_{21}X + m_{22}Y + m_{23}Z + m_{24}}{m_{31}X + m_{32}Y + m_{33}Z + m_{34}} \qquad (4)
Each 3D point gives two equations. Six points are then sufficient to estimate the 12 coefficients of
the matrix M. But it is possible to use more than 6 points to get better precision. To calculate the
different parameters, the constraint m34 = 1 is used.
To solve the system (4), we first transform it into a linear system, using the constraint m_{34} = 1, as described by (5):

u_1 = m_{11}X_1 + m_{12}Y_1 + m_{13}Z_1 + m_{14} - m_{31}X_1 u_1 - m_{32}Y_1 u_1 - m_{33}Z_1 u_1
v_1 = m_{21}X_1 + m_{22}Y_1 + m_{23}Z_1 + m_{24} - m_{31}X_1 v_1 - m_{32}Y_1 v_1 - m_{33}Z_1 v_1
      ...
u_N = m_{11}X_N + m_{12}Y_N + m_{13}Z_N + m_{14} - m_{31}X_N u_N - m_{32}Y_N u_N - m_{33}Z_N u_N
v_N = m_{21}X_N + m_{22}Y_N + m_{23}Z_N + m_{24} - m_{31}X_N v_N - m_{32}Y_N v_N - m_{33}Z_N v_N \qquad (5)
Then, we transform this system into matrix form:

U = P \cdot V_m \qquad (6)

with

U = (u_1, v_1, \ldots, u_N, v_N)^T,
V_m = (m_{11}, m_{12}, m_{13}, m_{14}, m_{21}, m_{22}, m_{23}, m_{24}, m_{31}, m_{32}, m_{33})^T,

and P the 2N x 11 matrix whose pair of rows for each point i (i = 1..N) is

\begin{pmatrix} X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -X_i u_i & -Y_i u_i & -Z_i u_i \\ 0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -X_i v_i & -Y_i v_i & -Z_i v_i \end{pmatrix}
To find the mij parameters, the least squares method is used. The following relation is obtained:
V_m = (P^T P)^{-1} P^T U \qquad (7)
Equation (7) involves solving a linear system, which can be done with a numerical technique such as the Gauss-Jacobi method.
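The estimation pipeline of equations (5)-(7) can be sketched as follows. This is a minimal illustration assuming NumPy is available; the function names (estimate_M, project) are ours, not from the paper.

```python
import numpy as np

def estimate_M(world_pts, image_pts):
    """Least-squares estimate of the 3x4 perspective matrix M (m34 = 1).

    world_pts: sequence of N >= 6 non-coplanar 3D points (X, Y, Z).
    image_pts: their detected pixel positions (u, v).
    """
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        # One pair of rows of P per correspondence, as in equation (6).
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -X * u, -Y * u, -Z * u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -X * v, -Y * v, -Z * v])
        rhs += [u, v]
    P = np.asarray(rows, dtype=float)
    U = np.asarray(rhs, dtype=float)
    Vm = np.linalg.solve(P.T @ P, P.T @ U)   # Vm = (P^T P)^-1 P^T U, eq (7)
    return np.append(Vm, 1.0).reshape(3, 4)  # re-insert the constraint m34 = 1

def project(M, pt3d):
    """Project a 3D point onto the image with equation (4)."""
    h = M @ np.append(np.asarray(pt3d, dtype=float), 1.0)
    return h[:2] / h[2]
```

With exact correspondences, M is recovered up to numerical precision; in practice more than six points are used, as the paper suggests, to average out detection noise.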
3.2 Checkpoints Auto Detection and Tracking
We use chessboard corners as features to locate the position of virtual objects in the world
coordinate; these corners are extracted from each video frame (Webcam) and used to compute
the perspective transformation matrix for the current view. The popular open source library for
computer vision OpenCV [18] is used. It provides a function for automatically finding grids from
chessboard patterns (cvFindChessboardCorners() and cvFindCornerSubPix()).
To ensure good tracking and detection, some precautions must be taken which are summarized
in the following points:
• The function cvFindChessboardCorners() lists the detected corners line by line, from left to right and bottom to top, starting from the first detected corner. There are thus four possible orderings, which can produce errors in locating the 3D world reference. To avoid this, a rectangular chessboard (containing MxN corners with M≠N) is used instead of a square one, and the corner whose top/right neighboring square is white is taken as the first detected corner. The origin of the 3D world coordinate system is fixed at the first corner of the second line (Figure 1).
• The two faces of the 3D object that carry the chessboard are inclined rather than perpendicular (Figure 1); this configuration facilitates the detection of the checkpoints (chessboard corners) by minimizing shadow effects and also gives the camera more freedom of movement.
FIGURE 1: Example of a 3D object with a 3x4 chessboard
3.3 The Performance Evaluation
To evaluate our method, several measurements are performed using three cameras. All tests are performed at a frame rate of 20 Hz with a resolution of 640x480.
The quality of each measurement is estimated by computing the re-projection error, i.e. the Euclidean distance between the detected corner and the projection of its corresponding 3D coordinates onto the image; about 100 calculations are performed for each case. The obtained results, which represent the average of each case, are reported in Table 1 and compared with the results obtained by Fiala and Shu [19].
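This metric follows directly from equation (4); the sketch below uses only the standard library, and the function name is ours.

```python
import math

def reprojection_error(M, pt3d, corner):
    """Euclidean distance between a detected corner (u, v) and the
    projection of its 3D coordinates (X, Y, Z) through the 3x4 matrix M."""
    X, Y, Z = pt3d
    w = M[2][0] * X + M[2][1] * Y + M[2][2] * Z + M[2][3]
    u = (M[0][0] * X + M[0][1] * Y + M[0][2] * Z + M[0][3]) / w
    v = (M[1][0] * X + M[1][1] * Y + M[1][2] * Z + M[1][3]) / w
    return math.hypot(u - corner[0], v - corner[1])
```

Averaging this quantity over all detected corners of a frame gives the per-case figure reported in Table 1.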
Our camera model method:
  Camera                        Reproj. error (pixel), Std.Dev/Max
  Sony VGP-VCC3                 0.265, 0.14/0.57
  Logitech QuickCam Pro 9000    0.15/0.53
  VRmUsbCam                     0.18/0.62

Results obtained by Fiala and Shu:
  Camera                        Reproj. error (pixel), Std.Dev/Max
  Creative Webcam Live Ultra    0.16/1.97
  SONY 999 NTSC camera          0.13/1.25
  Logitech QuickCam Pro 4000    0.29/2.62

TABLE 1: Comparison of our camera model method with the results obtained by Fiala and Shu [19]
According to Table 1, our results are very close to those obtained by Fiala and Shu.
The last important factor of this tracking technique is its robustness against lighting changes; corners have the property of being detectable under different lighting conditions. Figure 2 shows an example of corner detection in a very low-light environment, and Figure 3 shows another example under very strong lighting.
FIGURE 2: Corner detection in a very low-light environment
FIGURE 3: Corner detection in a very strong lighting environment
4. AUGMENTATION TECHNIQUE AND RESULTS
4.1 Principle of Virtual Object Insertion
With the proposed technique, the extrinsic parameters do not need to be calculated. We draw the virtual object directly on the current image using the standard graphics libraries provided by the C++Builder environment.
The principle of insertion is simple: equation (4) is used to compute the projection of any 3D point onto the current image. To insert a segment, a straight line is drawn between the projections of its two extremities. A polygon can be inserted using the projections of its apexes. The case of a 3D object is somewhat more complicated; such an object is inserted on the current image by drawing its faces one by one. An example with a cube is presented in Figure 4. In this example, a virtual cube is drawn and positioned as shown in Figure 5. First, the projections of its apexes (Pi, i = 1..8) are calculated; then its six faces are drawn one by one in a specific order, described in the next sub-section.
FIGURE 4: The virtual object to insert (a cube)
FIGURE 5: The technique used to insert a 3D object
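The insertion principle above can be sketched as follows: project the cube's eight apexes with equation (4), then draw each face as the polygon of its projected apexes. The vertex layout (taken from the cube of Section 4.2, edge length L) and the function names are illustrative assumptions.

```python
def project(M, X, Y, Z):
    """Project a 3D point through the 3x4 matrix M using equation (4)."""
    w = M[2][0] * X + M[2][1] * Y + M[2][2] * Z + M[2][3]
    return ((M[0][0] * X + M[0][1] * Y + M[0][2] * Z + M[0][3]) / w,
            (M[1][0] * X + M[1][1] * Y + M[1][2] * Z + M[1][3]) / w)

def cube_projections(M, L=1.0):
    """Image positions s1..s8 of the cube apexes S1..S8."""
    vertices = [(0, 0, 0), (0, 0, L), (L, 0, L), (L, 0, 0),
                (0, L, 0), (0, L, L), (L, L, L), (L, L, 0)]
    return [project(M, *v) for v in vertices]
```

Each face is then filled (or outlined, for the wire-frame technique) as the polygon of its four projected apexes.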
4.2 Technique Used to Avoid the Occlusion Problem
Occlusion between real and virtual objects is not treated here; we handle only occlusion among the components of the virtual object, i.e. the different faces that form the augmented object. The virtual cube shown in Figure 5 is inserted in a specific order (Face5, Face3, Face4, Face2, Face1 and Face6). If this order changes, the inserted object no longer appears as a cube. Figure 6 depicts a wrong rendering of a virtual object (a house) whose faces were inserted in another order; the front faces are occluded by the others because they were drawn first.
8. S. Malek, N. Zenati-Henda, M.Belhocine & S. Benbelkacem
International Journal of Image Processing (IJIP), Volume (5) : Issue (1) : 2011 85
We note that this technique is not necessary when the virtual object is drawn with the wire-frame technique.
FIGURE6: Example of wrong augmentation
This problem can be avoided by estimating the camera viewpoint, i.e. the orientation of the camera with respect to the axes of the 3D world reference frame.
The technique is based on a virtual cube placed on the horizontal plane (OXZ), with vertices S1(0,0,0), S2(0,0,L), S3(L,0,L), S4(L,0,0), S5(0,L,0), S6(0,L,L), S7(L,L,L) and S8(L,L,0), where L can take any positive value. The projections of these 3D points onto the current image, s1(u1,v1), s2(u2,v2), ..., s8(u8,v8), are then calculated using equation (4) (see Figure 7 and Figure 8). Once calculated, a set of tests is applied to these projections to estimate the camera viewpoint.
These tests are divided into two groups. In the first group, the four vertices of the bottom square are used to estimate their positions relative to the camera. For the second group, we calculate the projection of s5 onto the lines (s1s2) and (s1s4) and the projection of s7 onto the lines (s2s3) and (s3s4); these projections are respectively p5_12(u5_12,v5_12), p5_14(u5_14,v5_14), p7_32(u7_32,v7_32) and p7_34(u7_34,v7_34). Using these projections, we then determine which faces are opposed to the camera.
In the example shown in Figure 7, the verified tests are:
• (u4 < u1 < u2) and (v5 > v5_12) and (v5 > v5_14), which indicate the pair of world axes toward which the camera is oriented.
In the example shown in Figure 8, the verified tests are:
• (u1 < u2 < u3) and (v5 > v5_12) and (v7 > v7_32), which indicate the pair of world axes toward which the camera is oriented.
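The projections p5_12, p5_14, p7_32 and p7_34 used in these tests are ordinary orthogonal projections of an image point onto a line through two other projected points. A plain-Python sketch (the helper name is ours):

```python
def project_point_on_line(p, a, b):
    """Orthogonal projection of 2D point p onto the line through a and b,
    e.g. the projection of s5 onto the line (s1 s2)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)
```

Comparing the v coordinate of s5 (or s7) with that of its projection then tells on which side of the line the vertex lies, which is exactly what the tests above evaluate.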
FIGURE 7: First example of camera viewpoint estimation
FIGURE 8: Second example of camera viewpoint estimation
4.3 Results
We present here two application examples of our approach. The scene captured by the camera (a webcam) is augmented with virtual objects in real time. In the first example, a Logitech QuickCam Pro 9000 webcam is used at a resolution of 320x240; in the second, a Sony VGP-VCC3 webcam is used at a resolution of 640x480. The images presented in Figure 9 and Figure 10 were selected arbitrarily from the webcam streams.
FIGURE 9: Example of virtual object insertion into a real scene with the wire-frame technique
FIGURE 10: Example of augmentation with a 3D virtual model
These examples show that 12 fiducial points (corners) are sufficient to model the camera with a pinhole model; the virtual object is positioned in the desired place with good accuracy, and its size depends on the camera depth. We also note that the chessboard-corner tracking technique gives good results: the virtual object is correctly tracked across camera frames in real time. The last point is the efficacy of the developed insertion technique: the virtual object follows the motion of the camera and is positioned and drawn correctly without any information about the intrinsic or extrinsic parameters of the camera.
5. CONCLUSION
This paper introduces a technique for tracking virtual objects in real time. The technique was developed as part of an augmented reality system and does not require any knowledge or computation of the camera parameters (intrinsic and extrinsic). It is based on the detection of chessboard corners combined with a least squares method to estimate the perspective transformation matrix for the current view of the camera.
The technique is simple to use and requires no initialization steps or manual intervention; the only requirement is tracking the marker (the chessboard) across the images provided by the camera.
The results show the efficiency of the tracking method based on the detection of chessboard corners; the major advantages of tracking corners are their robust detection over a large range of distances, their reliability under severe orientations, and their tolerance of lighting changes and shadows. The technique used for the insertion of virtual objects proves effective even though the camera parameters are never calculated.
6. REFERENCES
1. R. Azuma, Y. Baillot,R. Behringer, S. Feiner, and S. Julier, “Recent advances in augmented
reality”, IEEE Computer Graphics and Applications, 21(6), pp. 34-47, 2001.
2. R. Tsai, “An efficient and accurate camera calibration technique for 3D machine vision”, IEEE
Proceedings CVPR, pp. 364-374, June 1986.
3. M. Tuceryan, D. Greer, R. Whitaker, D. Breen, C. Crampton, E. Rose, and K. Ahlers,
"Calibration Requirements and Procedures for Augmented Reality", IEEE Trans. on
Visualization and Computer Graphics, pp. 255-273, September 1995.
4. O.D. Faugeras and G. Toscani. “The calibration problem for stereo”, In Proceedings of
CVPR, pp. 15-20, 1986.
5. Z. Zhang, “Camera calibration with one-dimensional objects”, In Proc. 7th European
Conference on Computer Vision, (IV), pp. 161–174, 2002.
6. Zhou, F., Duh, H.B.L., Billinghurst, M. " Trends in Augmented Reality Tracking, Interaction
and Display: A Review of Ten Years of ISMAR". Cambridge, UK: 7th IEEE and ACM
International Symposium on Mixed and Augmented Reality (ISMAR 2008), pp. 15-18, Sep
2008.
7. H. Wuest, F. Vial and D. Stricker. "Adaptive line tracking with multiple hypotheses for
augmented reality". In ISMAR '05, pp. 62-69, 2005.
8. Hyun Seung Yang, Kyusung Cho, Jaemin Soh, Jinki Jung, Junseok Lee: "Hybrid Visual
Tracking for Augmented Books". ICEC 2008: pp. 161-166, 2008.
9. H. Kato and M. Billinghurst. "Marker tracking and HMD calibration for a video-based
augmented reality conferencing system". IWAR ’99 : Proceedings of the 2nd IEEE and ACM
International Workshop on Augmented Reality, Washington, DC, USA. IEEE Computer
Society, pp. 85–92, 1999.
10. ARToolKit site: http://www.hitl.washington.edu/artoolkit.
11. ISO/IEC 18004, International Standard, ISO, June 2000.
12. D. López de Ipiña, P. R. S. Mendonça, and A. Hopper. "TRIP: A low-cost vision-based
location system for ubiquitous computing". Personal and Ubiquitous Computing, 6(3), pp.
206-219, 2002.
13. Fiala, M.: ARTag: "a fiducial marker system using digital techniques". In: CVPR’05, vol. 1, pp.
590 – 596, 2005.
14. Rice, A., Beresford, A., Harle, R.: "Cantag: an open source software toolkit for designing and
deploying marker-based vision systems". In: 4th IEEE International Conference on Pervasive
Computing and Communications, 2006.
15. Lowe, D.G.: "Distinctive Image Features from Scale-Invariant Keypoints". International
Journal of Computer Vision, 60(2), pp. 91-110, 2004.
16. Bay, H., Tuytelaars, T., Van Gool, L.: "SURF: Speeded Up Robust Features". In: 9th
European Conference on Computer Vision, pp. 404-417, 2006.
17. T. Lee and T. Höllerer. "Multithreaded Hybrid Feature Tracking for Markerless
Augmented Reality". IEEE Transactions on Visualization and Computer Graphics, 15(3), pp.
355-368, May 2009.
18. Open source computer vision library. In:
http://www.intel.com/research/mrl/research/opencv.
19. Mark Fiala, Chang Shu: "Self-identifying patterns for plane-based camera calibration". Mach.
Vis. Appl. 19(4), pp. 209-216, 2008.