This document summarizes a project in which Baxter, an industrial robot made by Rethink Robotics, plays pool using its vision sensors and inverse kinematics. The key steps were: finding the desired orientation to strike the cue ball into the pocket using either a 3D sensor or Baxter's head camera; moving Baxter's arm to that orientation; then using visual servoing with Baxter's hand camera to align the end effector with the center of the ball while maintaining orientation. Once aligned, Baxter moved its end effector linearly to strike the ball. The main limitations were a restricted workspace, due to Baxter's fixed position, and insufficient striking force on the ball. Proposed future work includes making Baxter mobile and using linear actuators to increase striking force.
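The visual-servoing step lends itself to a short sketch. The following is a hypothetical proportional controller in the spirit of the approach described; the function name, gain, and tolerance are illustrative assumptions, not part of Baxter's SDK.

```python
# Hypothetical sketch of the image-based visual servoing step described above:
# a proportional controller drives the hand-camera image of the cue ball's
# center toward the image center while the strike orientation is held fixed.
# servo_step and its constants are placeholders, not Baxter SDK calls.

GAIN = 0.002          # m/s of end-effector motion per pixel of error (assumed)
TOLERANCE_PX = 3      # stop when the ball is within 3 px of the image center

def servo_step(ball_px, image_size):
    """Return an (x, y) end-effector velocity from one camera frame."""
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    ex, ey = ball_px[0] - cx, ball_px[1] - cy   # pixel error
    if abs(ex) < TOLERANCE_PX and abs(ey) < TOLERANCE_PX:
        return None                              # aligned: ready to strike
    return (-GAIN * ex, -GAIN * ey)              # move opposite the error
```

Each camera frame yields one velocity command; the loop ends when `servo_step` returns `None`, after which the linear striking motion begins.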
Interactive Full-Body Motion Capture Using Infrared Sensor Network - ijcga
Traditional motion capture (mocap) has been well studied in visual science for decades. However, the field has mostly concerned capturing precise animation for specific applications after intensive post-processing, such as studying biomechanics or rigging models in movies. These data sets are normally captured in complex laboratory environments with sophisticated equipment, making motion capture a field that is largely exclusive to professional animators. In addition, obtrusive sensors must be attached to actors and calibrated within the capturing system, resulting in limited and unnatural motion. In recent years, the rise of computer vision and interactive entertainment has opened the gate for a different type of motion capture, one that is optically markerless and mechanically sensorless. Furthermore, a wide array of low-cost, easy-to-use devices has been released for less mission-critical applications. This paper describes a new technique that processes data from multiple infrared sensors to enhance the flexibility and accuracy of markerless mocap using commodity devices such as the Kinect. The method analyzes each individual sensor's data, then decomposes and rebuilds it into a uniform skeleton across all sensors. We then assign criteria to define the confidence level of the signal captured by each sensor. Each sensor operates in its own process and communicates through MPI. Our method emphasizes minimal computational overhead for better real-time performance while maintaining good scalability.
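The confidence-weighted fusion idea in this abstract can be sketched as follows; the weighting scheme shown is a minimal illustration and stands in for the paper's actual confidence criteria.

```python
# Minimal sketch of fusing per-sensor skeleton joints by confidence weight.
# Each sensor reports a joint position plus a confidence in [0, 1]; the
# fused joint is the confidence-weighted average across sensors. The
# confidence values themselves are placeholders for the paper's criteria.

def fuse_joint(observations):
    """observations: list of ((x, y, z), confidence) pairs, one per sensor."""
    total = sum(conf for _, conf in observations)
    if total == 0:
        return None                      # no sensor saw this joint
    return tuple(
        sum(pos[i] * conf for pos, conf in observations) / total
        for i in range(3)
    )
```

In a multi-process setup as described, each sensor process would send its `(position, confidence)` pairs over MPI and a root process would call `fuse_joint` per joint.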
A Fast Single-Pixel Laser Imager for VR/AR Headset Tracking - Ping Hsu
In this work we demonstrate a highly flexible laser imaging system for 3D sensing applications such as in tracking of VR/AR headsets, hands and gestures. The system uses a MEMS mirror scan module to transmit low power laser pulses over programmable areas within a field of view and uses a single photodiode to measure the reflected light...
Digital 3D imaging can benefit from advances in VLSI technology in order to accelerate its deployment in many fields like visual communication and industrial automation. High-resolution 3D images can be acquired using laser-based vision systems. With this approach, the 3D information becomes relatively insensitive to background illumination and surface texture. Complete images of visible surfaces that are rather featureless to the human eye or a video camera can be generated. Intelligent digitizers will be capable of measuring accurately and simultaneously color and 3D.
Secure System based on Dynamic Features of IRIS Recognition - ijsrd.com
The idea behind this system is an advance in cybernetics: biometric person identification based on the pattern of the human iris is well suited to access control. The human eye is sensitive to visible light. Security systems have realized the value of biometrics for two basic purposes: to verify or identify users. In this busy world, identification should be fast and efficient. This paper focuses on an efficient methodology for iris identification and verification using the Haar transform and minimum Hamming distance. The Canny operator is used for edge detection. One biological phenomenon exploited is that the two pupils contract and dilate synchronously when one eye is illuminated by visible light. The Haar wavelet is applied to compress the data. By comparing the quantized vectors using the Hamming distance operator, we finally determine whether two irises are similar. The results show that the system is quite effective.
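The final matching step, comparing quantized iris codes by Hamming distance, can be sketched as below; the 0.32 threshold is an illustrative assumption, not a value from the paper.

```python
# Sketch of the final matching step: comparing two quantized iris codes
# with the normalized Hamming distance. Codes are equal-length bit strings;
# the decision threshold is an illustrative assumption.

def hamming_distance(code_a, code_b):
    """Fraction of differing bits between two equal-length bit strings."""
    assert len(code_a) == len(code_b)
    diff = sum(a != b for a, b in zip(code_a, code_b))
    return diff / len(code_a)

def same_iris(code_a, code_b, threshold=0.32):
    """Declare a match when the codes differ in fewer bits than the threshold."""
    return hamming_distance(code_a, code_b) < threshold
```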
Applying edge density based region growing with frame difference for detectin... - eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
REGISTRATION TECHNOLOGIES and THEIR CLASSIFICATION IN AUGMENTED REALITY THE K... - IJCSEA Journal
Registration in augmented reality is the process of merging virtual objects generated by a computer with the real-world image captured by a camera. This paper describes knowledge-based, computer vision-based, and tracker-based registration technologies, focusing mainly on tracker-based registration in augmented reality. It also describes methods used in tracker-based technology, along with problems and solutions.
Human action recognition with Kinect using a joint motion descriptor - Soma Boubou
- We proposed a novel descriptor for the motion of skeleton joints.
- The proposed descriptor outperforms state-of-the-art descriptors such as HON4D and the one proposed by Chen et al. (2013).
- Our proposed approach proved effective for periodic actions (e.g., waving, walking, jogging, side-boxing).
- Grouping was effective for actions with unique joint trajectories (e.g., tennis serving, side kicking).
- Grouping joints into eight groups is consistently effective for actions in the MSR3D dataset.
A STUDY OF VARIATION OF NORMAL OF POLYGONS CREATED BY POINT CLOUD DATA FOR A... - Tomohiro Fukuda
This slide deck was presented at CAADRIA 2011 (The 16th International Conference on Computer-Aided Architectural Design Research in Asia).
Abstract: Acquiring current 3D spatial data of cities, buildings, and rooms rapidly and in detail has become indispensable. When the point cloud data of an object or space scanned by a 3D laser scanner is converted into polygons, the result is an accumulation of small polygons. When the object or space is a closed flat plane, it is necessary to merge the small polygons into one polygon to reduce the volume of data. In that case, the normal vectors of the small polygons should theoretically all have the same angle; in practice, however, these angles differ. The purpose of this study is therefore to clarify, based on actual data, the variation in the normal angles of a small-polygon group that should become one polygon. Experiments showed that the small polygons converted from the point cloud data scanned with the 3D laser scanner do not share the same normal even when the group lies in a single closed flat plane. When the standard deviation of the extracted number of polygons is assumed to be less than 100, the variation of the angle of the normal vector is roughly 7 degrees.
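The quantity measured in this study, the angle between polygon normals within a nominally flat group, can be sketched as follows. This is a generic pure-Python illustration; the scanner data itself is not reproduced.

```python
# Sketch of the measurement behind the study above: the angle between
# triangle normals within a polygon group that should form one flat plane.

import math

def unit_normal(p0, p1, p2):
    """Unit normal of a triangle given three 3D vertices (counterclockwise)."""
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    # Cross product u x v gives the (unnormalized) face normal.
    n = [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]
    length = math.sqrt(sum(c * c for c in n))
    return [c / length for c in n]

def angle_deg(n1, n2):
    """Angle in degrees between two unit normals."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(n1, n2))))
    return math.degrees(math.acos(dot))
```

Applying `angle_deg` between each small polygon's normal and the group mean normal, and then taking the spread of those angles, gives the kind of variation figure (roughly 7 degrees) reported above.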
Preliminary study of multi-view imaging for accurate 3D reconstruction using... - eSAT Journals
Abstract: This paper presents a multi-view structured-light approach to surface scanning that reconstructs a three-dimensional (3D) object using a turntable. It is a modification of the DAVID SLS-1 (Structured-Light Scanner), taken as a starting point for improving and building a complete structured-light 3D scanning system. This type of scanner uses a video projector to project various patterns onto the object to be digitized and reconstructed as a 3D model. At the same time, a camera records at least one image of each pattern from a certain point of view, for example from the right, left, above, or below the video projector. 3D meshes of the object's surface are then computed from the deformations of the projected patterns. Preliminary results show that the objects, models of prostheses, are successfully reconstructed. Index Terms: 3D scanner, structured-light scanner, 3D reconstruction, multiple views
Tracking Chessboard Corners Using Projective Transformation for Augmented Rea... - CSCJournals
Augmented reality has been a topic of intense research for several years and for many applications. It consists of inserting a virtual object into a real scene; the virtual object must be accurately positioned in a desired place. Some measurements (calibration) are thus required, and a set of correspondences between points on the calibration target and the camera images must be found. In this paper, we present a tracking technique based on both detection of chessboard corners and a least-squares method; the objective is to estimate the perspective transformation matrix for the current view of the camera. This technique does not require any information about, or computation of, the camera parameters; it can be used in real time without any initialization, and the user can change the camera focal length without any fear of losing alignment between real and virtual objects.
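The least-squares estimation of a perspective transformation from matched corners can be sketched with the standard direct linear transform (DLT); this is a generic formulation, not necessarily the paper's exact solver.

```python
# Sketch of estimating a 3x3 projective transformation (homography) from
# matched chessboard corners via the direct linear transform, solved in a
# least-squares sense with SVD. Generic textbook method, not the paper's code.

import numpy as np

def estimate_homography(src, dst):
    """src, dst: sequences of (x, y) matching points, at least 4 pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the DLT system A h = 0.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)      # right singular vector of smallest value
    return H / H[2, 2]            # normalize so H[2, 2] == 1
```

With four exact correspondences the solution is exact; with more (and noisy) corners, the SVD yields the least-squares estimate the abstract refers to.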
Human action recognition using local space-time features and AdaBoost SVM - eSAT Publishing House
ABSTRACT: Feature extraction plays a vital role in the analysis and interpretation of remotely sensed data. Its two important components are image enhancement and information extraction. Image enhancement techniques help improve the visibility of any portion or feature of the image. Information extraction techniques help obtain statistical information about any particular feature or portion of the image. This work focuses on various feature extraction techniques; optical character recognition is a particularly important area of image processing. Keywords: image character recognition, methods for feature extraction, basic Gabor filter, IDA, and PCA.
Pontillo SemantiCode Using Content Similarity And Database Driven Matching T... - Kalle
Laboratory eyetrackers, constrained to a fixed display and a static (or accurately tracked) observer, facilitate automated analysis of fixation data. The development of wearable eyetrackers has extended the environments and tasks that can be studied, at the expense of automated analysis. Wearable eyetrackers provide a 2D point-of-regard (POR) in scene-camera coordinates, but the researcher is typically interested in some high-level semantic property (e.g., object identity, region, or material) surrounding individual fixation points. The synthesis of POR into fixations and semantic information remains a labor-intensive manual task, limiting the application of wearable eyetracking.
We describe a system that segments POR videos into fixations and allows users to train a database-driven, object-recognition system. A correctly trained library results in a very accurate and semi-automated translation of raw POR data into a sequence of objects, regions or materials.
NEURAL NETWORKS FOR HIGH PERFORMANCE TIME-DELAY ESTIMATION AND ACOUSTIC SOURC... - csandit
Time-delay estimation is an essential building block of many signal processing applications. This paper follows up on earlier work on acoustic source localization and time-delay estimation using pattern recognition techniques in adverse environments such as reverberant rooms or underwater; it presents unprecedented high-performance results obtained with supervised training of neural networks, which challenge the state of the art, and compares their performance to that of well-known methods such as Generalized Cross-Correlation and Adaptive Eigenvalue Decomposition.
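The Generalized Cross-Correlation baseline mentioned above can be sketched in its common PHAT-weighted form; this is a textbook formulation, not the paper's implementation.

```python
# Sketch of Generalized Cross-Correlation with phase transform (GCC-PHAT)
# for estimating the time delay between two microphone signals: whiten the
# cross-spectrum so only phase (i.e., delay) information remains, then find
# the lag of the correlation peak.

import numpy as np

def gcc_phat_delay(sig, ref, fs=1):
    """Estimated delay of `sig` relative to `ref`, in samples when fs=1."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12                  # phase transform: unit magnitude
    cc = np.fft.irfft(R, n=n)
    # Re-center so index n//2 corresponds to zero lag.
    cc = np.concatenate((cc[-(n // 2):], cc[: n // 2 + 1]))
    return (np.argmax(np.abs(cc)) - n // 2) / fs
```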
The peer-reviewed International Journal of Engineering Inventions (IJEI) was started with a mission to encourage contribution to research in science and technology, and to encourage and motivate researchers in challenging areas of science and technology.
A presentation on image recognition: the basic definition and working of image recognition, edge detection, neural networks, the use of convolutional neural networks in image recognition, applications, future scope, and conclusion.
Scene recognition using Convolutional Neural Network - DhirajGidde
Scene recognition is one of the hallmark tasks of computer vision, allowing definition of a context for object recognition. Whereas the tremendous recent progress in object recognition tasks is due to the availability of large datasets like ImageNet and the rise of Convolutional Neural Networks (CNNs) for learning high-level features, performance at scene recognition has not attained the same level of success.
A musculoskeletal model driven by Microsoft Kinect Sensor v2 data - Adam Frank
ABSTRACT
Objective. To develop a musculoskeletal model driven by data retrieved from the Microsoft Kinect Sensor v2 and compare the output to a musculoskeletal model driven by data from a marker-based motion capture system for three different movements; furthermore, to determine the optimal position of the Microsoft Kinect Sensor v2 for each movement.
Method. In the positioning test, a combination of seven angles, three heights, and three distances was evaluated to find the optimal position for obtaining data for a musculoskeletal model during a gait, squat, and shoulder-abduction cycle. Once the optimal positions for the three movements were determined, data for the comparison test were collected from five healthy male subjects. Eight Oqus 1 infrared high-speed cameras and two force platforms were used to collect the marker-based motion capture data. One Microsoft Kinect Sensor v2 was used to collect the marker-less motion capture data. The AnyBody Modeling System was used to analyze different variables for the two systems.
Results. Multiple positions were found to be optimal for the Microsoft Kinect Sensor v2 for the squat and shoulder-abduction movements; the same position (0°, 0.75 m/2.6 m) was chosen for both. The optimal position for the gait movement (0°, 0.75 m/3.4 m) was determined based on the highest percentage of tracked Kinect joints. Strong correlations were found in the comparison test for knee flexion angle and hip flexion angle for both the gait and squat movements. For the shoulder-abduction movement, strong correlations were found for shoulder-abduction angle (0.99) and moment (0.88). Although a strong correlation was found for the ankle flexion angle (0.71) in the squat movement, other results indicate that the Microsoft Kinect Sensor v2 has limitations in tracking the ankle sufficiently. A strong correlation in the ground reaction force (0.81) was observed for the gait movement, whereas the ground reaction force correlations in the squat movement were: left (0.49) and right (0.50).
Conclusion. The results of this study show that data obtained with the Microsoft Kinect Sensor v2 can be used as input to a musculoskeletal model. Although the Microsoft Kinect Sensor v2 shows encouraging results for some variables, it still proves insufficient as an alternative to marker-based systems.
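The correlations reported above compare a joint-angle time series from the Kinect v2 against one from the marker-based system; assuming the standard Pearson coefficient, the computation can be sketched as follows (data values are illustrative, not from the study).

```python
# Sketch of the comparison metric used above: Pearson correlation between a
# joint-angle series from the Kinect v2 and one from the marker-based system,
# sampled over the same movement cycle.

import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A value near 1.0 corresponds to the "strong correlation" cases (e.g., 0.99 for shoulder abduction), while values near 0.5 correspond to the weaker squat ground-reaction-force results.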
Page 1 of 14 ENS4152 Project Development Proposal a.docx - karlhennesey
Page 1 of 14
ENS4152 Project Development
Proposal and Risk Assessment Report
Baxter Research Robot: Solving a Rubik’s Cube
Chris Dawes
Student # 10282558
30 Mar 2015
Supervisor: Dr Alexander Rassau
Abstract
Robotics is currently used to perform many tasks, but many of these are simple repetitions of a predefined method. By combining AI with robotics we can greatly increase the applications of robotics. An algorithm that combines the vision and servo systems of a Baxter Research Robot with a solving solution for a Rubik's cube will demonstrate that the use of even simple AI with robotics allows complex tasks to be completed. Further integration of object recognition will allow the task to be completed in a dynamic environment, and further increase the areas robots are capable of working within.
1. Introduction
1.1. Motivation
The Baxter Research Robot by Rethink Robotics is a dual-arm robot, with seven degrees of freedom per arm, released in 2012. Developed to be affordable, flexible in its purpose, and above all else safe, Baxter includes three cameras, one on each wrist and the other on its head, and a screen for displaying information relating to Baxter's current task. The robot is designed to be a versatile research platform while containing the same hardware as its industry counterpart, allowing research to translate into industrial applications (Rethink Robotics, 2015).

In general, artificial intelligence (AI) has been developed separately from robotics, but the two are now starting to become integrated. Unfortunately, current AI is fragmented, as each application focuses on one area, as opposed to a true AI that thinks like a human (Bogue, 2014). Current usable AI is more akin to 'smart' robotics, where decisions are made and problems solved by the robot in very specific applications. In industry, robots are expanding into areas that require more flexibility, allowing them to fill many more positions in increasingly complex areas (Hajduk, Jenčík, Jezný, & Vargovčík, 2013). Mobile robots are even becoming more commonplace, allowing for dynamic and spread-out workspaces. These advances all stem from adding sensing and analysis to robots, allowing them to react to dynamic environments.
To further robotics in industry, multi-robot work cells have been designed that combine several robots working on the same part while cooperatively performing either one task, such as welding and the required handling, or multiple tasks at the same time (Hajduk, Jenčík, Jezný, & Vargovčík, 2013). The number of activities these work cells can perform increases dramatically, as the complexity of the task or tasks can be higher while the robots don't need to be capable of performing the whole task individually.

For performing more human tasks, dual-arm robots have begun to emerge (Hajduk, Jenčík, Jezný, & Vargovčík, 2013 ...
Human action recognition with kinect using a joint motion descriptorSoma Boubou
- We proposed a novel descriptor for motion of skeleton joints.
- Proposed descriptor proved to outperform the state-of-the-art descriptors such as HON4D and the one proposed by Chen et al 2013.
- Our proposed approached proved to be effective for periodic actions (e.g., Waving, Walking, Jogging, Side-Boxing, etc).
- Grouping was effective for actions with unique joints trajectories (e.g., Tennis serving, Side kicking , etc).
- Grouping joints into eight groups is always effective with actions of MSR3D dataset.
A STUDY OF VARIATION OF NORMAL OF POLY-GONS CREATED BY POINT CLOUD DATA FOR A...Tomohiro Fukuda
This slide is presented in CAADRIA2011 (The 16th International Conference on Computer Aided Architectural Design Research in Asia).
Abstracts: Acquiring current 3D space data of cities, buildings, and rooms rapidly and in detail has become indispensable. When the point cloud data of an object or space scanned by a 3D laser scanner is converted into polygons, it is an accumulation of small polygons. When object or space is a closed flat plane, it is necessary to merge small polygons to reduce the volume of data, and to convert them into one polygon. When an object or space is a closed flat plane, each normal vector of small polygons theoretically has the same angle. However, in practise, these angles are not the same. Therefore, the purpose of this study is to clarify the variation of the angle of a small polygon group that should become one polygon based on actual data. As a result of experimentation, no small polygons are converted by the point cloud data scanned with the 3D laser scanner even if the group of small polygons is a closed flat plane lying in the same plane. When the standard deviation of the extracted number of polygons is assumed to be less than 100, the variation of the angle of the normal vector is roughly 7 degrees.
Preliminary study of multi view imaging for accurate 3 d reconstruction using...eSAT Journals
Abstract This paper presents a multi-view structured-light approach for surface scanning to reconstruct three-dimensional (3D) object using a turntable. It is a modification from DAVID 3D Scanner SLS-1 (Structured-Light Scanner) as a starting point of study on improving and builds a complete system of 3D structured-light based scanner. This type of scanner uses a video projector to project various patterns onto an object which is going to be digitized or reconstruct to a 3D model. At the same time, a camera will record and capture the scene at least one image of each pattern from a certain point of view for example from right, left, above or below of the video projector. Then, 3D meshes of surface of the object will be computed based on the deformations of the projected patterns. The preliminary results show that object which are model of prostheses are successfully reconstructed. Index Terms: 3D scanner, structured-light scanner, 3D reconstruction, and multiple-view
Tracking Chessboard Corners Using Projective Transformation for Augmented Rea...CSCJournals
Augmented reality has been a topic of intense research for several years for many applications. It consists of inserting a virtual object into a real scene. The virtual object must be accurately positioned in a desired place. Some measurements (calibration) are thus required and a set of correspondences between points on the calibration target and the camera images must be found. In this paper, we present a tracking technique based on both detection of Chessboard corners and a least squares method; the objective is to estimate the perspective transformation matrix for the current view of the camera. This technique does not require any information or computation of the camera parameters; it can used in real time without any initialization and the user can change the camera focal without any fear of losing alignment between real and virtual object.
Human action recognition using local space time features and adaboost svmeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
ABSTRACT Feature extraction plays a vital role in the analysis and interpretation of remotely sensed data. The two important components of feature extraction are image enhancement and information extraction. Image enhancement techniques help in improving the visibility of any portion or feature of the image. Information extraction techniques help in obtaining statistical information about any particular feature or portion of the image. This work focuses on the various feature extraction techniques; the area of optical character recognition is a particularly important one in image processing. Keywords— Image character recognition, Methods for Feature Extraction, Basic Gabor Filter, IDA, and PCA.
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Kalle
Laboratory eyetrackers, constrained to a fixed display and static (or accurately tracked) observer, facilitate automated analysis of fixation data. Development of wearable eyetrackers has extended environments and tasks that can be studied at the expense of automated analysis. Wearable eyetrackers provide 2D point-of-regard (POR) in scene-camera coordinates, but the researcher is typically interested in some high-level semantic property (e.g., object identity, region, or material) surrounding individual fixation points. The synthesis of POR into fixations and semantic information remains a labor-intensive manual task, limiting the application of wearable eyetracking.
We describe a system that segments POR videos into fixations and allows users to train a database-driven, object-recognition system. A correctly trained library results in a very accurate and semi-automated translation of raw POR data into a sequence of objects, regions or materials.
NEURAL NETWORKS FOR HIGH PERFORMANCE TIME-DELAY ESTIMATION AND ACOUSTIC SOURC...csandit
Time-delay estimation is an essential building block of many signal processing applications. This paper follows up on earlier work on acoustic source localization and time-delay estimation using pattern recognition techniques in adverse environments such as reverberant rooms or underwater; it presents unprecedented high-performance results obtained with supervised training of neural networks, which challenge the state of the art, and compares their performance to that of well-known methods such as the Generalized Cross-Correlation or Adaptive Eigenvalue Decomposition.
The peer-reviewed International Journal of Engineering Inventions (IJEI) was started with a mission to encourage contributions to research in Science and Technology, and to encourage and motivate researchers in challenging areas of Sciences and Technology.
A presentation on Image Recognition, the basic definition and working of Image Recognition, Edge Detection, Neural Networks, use of Convolutional Neural Network in Image Recognition, Applications, Future Scope and Conclusion
Scene recognition using Convolutional Neural NetworkDhirajGidde
Scene recognition is one of the hallmark tasks of computer vision, allowing definition of a context for object recognition. Whereas the tremendous recent progress in object recognition tasks is due to the availability of large datasets like ImageNet and the rise of Convolutional Neural Networks (CNNs) for learning high-level features, performance at scene recognition has not attained the same level of success.
A musculoskeletal model driven by microsoft kinect sensor v2 dataAdam Frank
ABSTRACT
Objective. To develop a musculoskeletal model driven by data retrieved from the Microsoft Kinect Sensor v2 and compare the output to a musculoskeletal model driven by data from a marker-based motion capture system for three different movements. Furthermore, to determine the optimal position of the Microsoft Kinect Sensor v2 for each movement.
Method. In the positioning test, a combination of seven angles, three heights and three distances was evaluated to find the optimal position for obtaining data for a musculoskeletal model during a gait, squat and shoulder abduction cycle. When the optimal positions for the three different movements were determined, data for the comparison test were collected for five healthy male subjects. Eight Oqus 1 infrared high-speed cameras and two force platforms were used to collect the marker-based motion capture data. One Microsoft Kinect Sensor v2 was used to collect the marker-less motion capture data. The AnyBody Modeling System was used to analyze different variables for the two systems.
Results. Multiple positions were found to be optimal for the Microsoft Kinect Sensor v2 for the squat and the shoulder abduction movement; the same position (0°, 0.75/2.6) was chosen for both. The optimal position for the gait movement (0°, 0.75 m/3.4 m) was determined based on the highest percentage of tracked Kinect joints. Strong correlations were found in the comparison test for knee flexion angle and hip flexion angle for both the gait and the squat movement. During the shoulder abduction movement, strong correlations were found for shoulder abduction angle (0.99) and moment (0.88). Even though a strong correlation was found for the ankle flexion angle (0.71) in the squat movement, other results indicate that the Microsoft Kinect Sensor v2 has limitations in tracking the ankle sufficiently. A strong correlation in the ground reaction force (0.81) was observed for the gait movement, whereas the ground reaction force correlations in the squat movement were: left (0.49) and right (0.50).
Conclusion. The results of this study show that data obtained by the Microsoft Kinect Sensor v2 can be used as input to a musculoskeletal model. Though the Microsoft Kinect Sensor v2 shows some encouraging results for some variables, it still proves insufficient as an alternative to marker-based systems.
Page 1 of 14 ENS4152 Project Development Proposal a.docxkarlhennesey
Page 1 of 14
ENS4152 Project Development
Proposal and Risk Assessment Report
Baxter Research Robot: Solving a Rubik’s Cube
Chris Dawes
Student # 10282558
30 Mar 2015
Supervisor: Dr Alexander Rassau
Abstract
Robotics is currently used to perform many tasks but many of these are simple repetition of a predefined method. By combining AI with robotics we can greatly increase the applications of robotics. An algorithm that combines the vision and servo systems of a Baxter Research Robot with a solving solution for a Rubik's cube will demonstrate that the use of even simple AI with robotics allows complex tasks to be completed. Further integration of object recognition will allow the task to be completed in a dynamic environment, and further increase the areas robots are capable of working within.
1. Introduction
1.1. Motivation
The Baxter Research Robot by Rethink Robotics is a dual arm robot, with seven degrees of freedom per arm, released in 2012. Developed to be affordable, flexible in its purpose, and above all else safe, Baxter includes three cameras, one on each wrist and the other on its head, and a screen for displaying information relating to Baxter's current task. The robot is designed to be a versatile research platform while containing the same hardware as its industry counterpart, allowing research to translate into industrial applications (Rethink Robotics, 2015).

In general robotics artificial intelligence (AI) has been developed separately to robotics, but is now starting to become integrated. Unfortunately current AI is fragmented as each application focuses on one area, as opposed to making a true AI that thinks like a human (Bogue, 2014). Current usable AI is more akin to 'smart' robotics where decisions are made and problems solved by the robot in very specific applications. In industry, robots are expanding into areas that require more flexibility allowing robots to fill many more positions in increasingly complex areas (Hajduk, Jenčík, Jezný, & Vargovčík, 2013). Mobile robots are even becoming more commonplace, allowing for dynamic and spread out workspaces. These are all due to adding sensing and analysis to robots allowing them to react to dynamic environments.

To further robotics in industry, multi robot work cells have been designed that combine several robots working on the same part while cooperatively performing either one task, such as welding and the required handling, or multiple tasks at the same time (Hajduk, Jenčík, Jezný, & Vargovčík, 2013). The number of activities these work cells can perform increases dramatically, as the complexity of the task or tasks can be higher while the robots don't need to be capable of performing the whole task individually.

For performing more human tasks, dual arm robots have begun to emerge (Hajduk, Jenčík, Jezný, & Vargovčík, 2013 ...
REGISTRATION TECHNOLOGIES and THEIR CLASSIFICATION IN AUGMENTED REALITY THE K...IJCSEA Journal
Registration in augmented reality is the process that merges virtual objects generated by a computer with the real-world image captured by a camera. This paper describes knowledge-based registration, computer vision-based registration and tracker-based registration technology, with the main focus on tracker-based registration technology in augmented reality. Methods in tracker-based technology, along with their problems and solutions, are also described.
Tiny-YOLO distance measurement and object detection coordination system for t...IJECEIAES
A humanoid robot called BarelangFC was designed to take part in the Kontes Robot Indonesia (KRI) competition, in the robot coordination division. In this division, each robot is expected to recognize its opponents and to pass the ball towards a team member to establish coordination between the robots. In order to achieve this team coordination, a fast and accurate system is needed to detect and estimate the other robots' positions in real time. Moreover, each robot has to estimate its team members' locations based on its camera reading, so that the ball can be passed without error. This research proposes a Tiny-YOLO deep learning method to detect the location of a team member robot and presents a real-time coordination system using a ZED camera. To establish the coordinate system, the distance between the robots was estimated using a trigonometric equation to ensure that the robot was able to pass the ball towards another robot. To verify our method, real-time experiments were carried out using an NVIDIA Jetson Xavier NX, and the results showed that the robot could estimate the distance correctly before passing the ball toward another robot.
Goal location prediction based on deep learning using RGB-D camerajournalBEEI
In a navigation system, the desired destination position plays an essential role, since path planning algorithms take the current location, the goal location and the map of the surrounding environment as inputs. The path generated by the path planning algorithm is used to guide a user to their final destination. This paper presents a proposed algorithm based on an RGB-D camera to predict the goal coordinates in a 2D occupancy grid map for a navigation system for visually impaired people. In recent years, deep learning methods have been used in many object detection tasks, so an object detection method based on a convolutional neural network is adopted in the proposed algorithm. The distance between the current position of the sensor and the detected object is measured using the depth data acquired from the RGB-D camera. The detected object coordinates and the depth data have been integrated to get an accurate goal location in a 2D map. The proposed algorithm has been tested in various real-time scenarios, and the experimental results indicate its effectiveness.
Fuzzy-proportional-integral-derivative-based controller for object tracking i...IJECEIAES
This paper aims at designing and implementing an intelligent controller for the orientation control of a two-wheeled mobile robot. The controller is designed in LabVIEW and based on analyzed image parameters from cameras. The image program calculates the distance and angle from the camera to the object. The fuzzy controller will get these parameters as crisp input data and send the calculated velocity as crisp output data to the right and left wheel motor for the robot tracking the target object. The results show that the controller gives a fast response and high reliability and quickly carries out data recovery from system faults. The system also works well in the uncertainties of process variables and without mathematical modeling.
Intelligent indoor mobile robot navigation using stereo visionsipij
The majority of existing robot navigation systems, which rely on laser range finders, sonar sensors or artificial landmarks, have the ability to locate themselves in an unknown environment and then build a map of the corresponding environment. Stereo vision, while still a rapidly developing technique in the field of autonomous mobile robots, is currently less preferred due to its high implementation cost. This paper describes an experimental approach to building a stereo vision system that helps robots avoid obstacles and navigate through indoor environments while remaining very cost effective. The paper discusses fusion techniques for stereo vision and ultrasound sensors which help in successful navigation through different types of complex environments. The data from the sensors enable the robot to create a two-dimensional topological map of unknown environments, and the stereo vision system builds a three-dimensional model of the same environment.
Control of a Movable Robot Head Using Vision-Based Object TrackingIJECEIAES
This paper presents a visual tracking system to support the movement of a robot head for detecting objects. Object identification and object position estimation were conducted using image-based processing. The robot head moved in four directions, namely to the right, left, top, and bottom. Based on the object's distance, the object was shifted to many points to assess the accuracy of the tracking process. The targeted objects are detected through several steps, namely normalization of RGB images, thresholding, and object marking. The robot head tracked the object over 40 different object points with high accuracy. The further the object's distance from the robot, the smaller the angle of movement produced, compared to the movement of the robot head when tracking a closer object with the same stimulated object shift. However, for the same object distance and shift, the level of accuracy showed almost the same results. The results showed that movement of the robot head to track an object below it produced a larger angular error than movement of the robot head in other directions, even with the same object distance and the same object shift.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Automatic identification of animal using visual and motion saliencyeSAT Publishing House
Wireless network implementation is a viable option for building network infrastructure in rural communities. Rural people lack network infrastructures for information services and socio-economic development. The aim of this study was to develop a wireless network infrastructure architecture for network services to rural dwellers. A user-centered approach was applied in the study, and a wireless network infrastructure was designed and deployed to cover five rural locations. Data were collected and analyzed to assess the performance of the network facilities. The results show that the system has been performing adequately without any downtime, with an average of 200 users per month, and the quality of service has remained high. The transmit/receive rate of 300 Mbps was three times as fast as the normal Ethernet transmit/receive specification, with an average throughput of 1 Mbps. The multiple-input/multiple-output (MIMO) point-to-multipoint network design increased the network throughput and the quality of service experienced by the users.
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...ijcsit
3D reconstruction is a technique used in computer vision which has a wide range of applications in areas like object recognition, city modelling, virtual reality, physical simulations, video games and special effects. Previously, performing a 3D reconstruction required specialized hardware; such systems were often very expensive and were only available for industrial or research purposes. With the rise in availability of high-quality, low-cost 3D sensors, it is now possible to design inexpensive complete 3D scanning systems. The objective of this work was to design an acquisition and processing system that can perform 3D scanning and reconstruction of objects seamlessly. In addition, the goal of this work also included making the 3D scanning process fully automated by building and integrating a turntable alongside the software. This means the user can perform a full 3D scan with only a press of a few buttons from our dedicated graphical user interface. Three main steps were followed to go from acquisition of point clouds to the finished reconstructed 3D model. First, our system acquires point cloud data of a person/object using an inexpensive camera sensor. Second, we align and convert the acquired point cloud data into a watertight mesh of good quality. Third, we export the reconstructed model to a 3D printer to obtain a proper 3D print of the model.
BAXTER PLAYING POOL
Koyya Shiva Karthik Reddy, Vishnunandan Venkatesh
Department of Electrical and Electronics Engineering
Rochester Institute of Technology
Abstract --- The project focuses on accomplishing the complex task of playing pool with the industrial robot Baxter (14 DOF) made by Rethink Robotics. The process had many complex intermediate goals, such as analysing various views so as to find the required orientation to make the shot. The task of finding the orientation was attempted with the help of the 3-D sensor Xtion as well as Baxter's head camera. Once the desired orientation was found, visual servoing was accomplished using Baxter's hand camera to find the center of the ball so that the end effector could make a perfect strike. Baxter's inverse kinematics package provided by the Baxter SDK was used during the entire course of the project.
KEYWORDS: Point cloud, OpenCV, ROS, Point Cloud Library, Baxter, inverse kinematics, desired view, current view.
The paper consists of five sections. Section I gives a broad introduction to robotics, the recent improvements in the field, and our approach to the problem. Section II discusses the related work in the field. Section III discusses the flow of the system. Section IV discusses the results and limitations. Section V throws some light on the future scope of improvements.
I. INTRODUCTION
In today's modern world robotics has evolved a lot; robots accomplish complex tasks which a human body cannot achieve, and with the improvement in processors, computation of complex problems is becoming easier for highly advanced robotic machines. In the past 10 years or so, drastic improvements in the field of robotic vision have given a new dimension to the robotic world: affordable 3-D sensors like the Kinect or Xtion have opened the door to many new possibilities to be explored. Now, with the help of vision and other sensors, robots can be taught to understand their environments and perform actions with much more data and information. With various open-source libraries such as OpenCV and the Point Cloud Library, robots can be trained to detect specific objects in their surroundings easily.
Industrial robots such as Baxter are human-compliant robots; they are highly precise and have their own on-system processor and vision sensors that make computation much faster. Platforms like the Robot Operating System (ROS) have made it possible to build complex applications, as ROS provides easy integration of various modules independent of the platform. Many built-in packages in ROS include simulation and GUI tools which make understanding, coding and visualizing problems easier.
In our approach an Xtion sensor was placed on Baxter's wrist so as to record a desired-view pointcloud, wherein the cue ball and the pocket were perfectly aligned in a position which would result in a perfect shot (i.e. the cue ball being hit into the pocket). Once the desired view was recorded, the hand was moved to some random location where its current view was again analysed and recorded as a pointcloud. The pointcloud recorded as the current view was then aligned (registration) with the desired view based upon the unique features extracted from both pointclouds, so as to find the required orientation. Upon analysis of the results, due to inaccurate results and large computation time, a secondary approach was chosen to tackle the orientation problem, wherein Baxter's head camera was used to find the orientation between the cue ball and the pocket.
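The head-camera fallback reduces the orientation problem to finding the bearing from the cue ball to the pocket in the image plane. The sketch below illustrates that geometry; the function name and pixel inputs are hypothetical, not taken from the project's code:

```python
import math

def strike_yaw(ball_px, pocket_px):
    """Yaw (radians) of the line from the cue ball to the pocket in the
    head-camera image plane; both arguments are (x, y) pixel centres.
    (Hypothetical helper, assuming a roughly top-down view of the table.)"""
    dx = pocket_px[0] - ball_px[0]
    dy = pocket_px[1] - ball_px[1]
    return math.atan2(dy, dx)

# A pocket directly to the right of the ball in image coordinates
# gives a yaw of zero; the arm's wrist would then be rotated to match.
```

In practice this image-plane angle would still need to be mapped into Baxter's base frame via the head-camera extrinsics before being handed to the IK solver.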
After addressing the orientation part, using Baxter's hand camera and with the help of OpenCV, the ball's center was tracked and Baxter's hand was moved in a linear fashion with the same orientation, so that the end effector (cue stick) would always be straight to the center of the ball; this helps the end effector strike the ball perfectly in the center.
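The servoing step needs the ball's centre in the hand-camera image. The project used OpenCV for this; as a self-contained stand-in, the centroid of thresholded bright pixels illustrates the idea (an illustrative simplification, not the project's actual detector):

```python
def ball_center(gray, thresh=200):
    """Centroid (x, y) of pixels brighter than `thresh` in a 2-D grayscale
    image given as a list of rows. Returns None when nothing is detected.
    Stands in for the OpenCV-based circle detection used in the project."""
    sx = sy = n = 0
    for y, row in enumerate(gray):
        for x, v in enumerate(row):
            if v > thresh:
                sx += x
                sy += y
                n += 1
    if n == 0:
        return None
    return (sx / n, sy / n)
```

The servo loop would compare this centroid with the image centre and command small lateral corrections of the hand, keeping the orientation fixed, until the cue stick points straight at the ball.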
II. RELATED WORK IN THE FIELD
In [1] the author discusses the problem of 3-D pointcloud registration, the concern being the inaccuracy and computation time of conventional methods such as the iterative closest point (ICP) and coherent point drift (CPD) algorithms. He proposes a variation of CPD, fast CPD, based on a globally convergent squared iterative EM scheme, which improves the computation time of the registration process. Segmentation of the view of
interest is also a challenging and computationally expensive task, as discussed in [2], where the author proposes choosing normal estimates based on tensor voting; the results showed improved performance even with noisy observations and missing data. In [3] the author discusses another approach to segmentation, in which objects were manually segmented from scenes and automated approximations of the objects with high-level descriptors were made to recreate the required 3-D model. Algorithms working on pointclouds are computationally expensive due to the large volume of data, so proper pre-processing is a necessary step when working with large pointclouds; in [4] the author discusses supervoxels for pointcloud connectivity segmentation, which drastically reduce the time complexity of later processes such as feature generation and registration. The
type of sensor also affects the results of an experiment. Many sensors can be used for 3-D modelling, such as stereo cameras, depth cameras like the Xtion or Kinect, and LIDARs. In [5] the author compares the performance of two widely used sensors, the Microsoft Kinect and the Asus Xtion, and concludes that the depth resolution of both sensors worsens as distance increases. In terms of depth accuracy, the sensors' accuracy decreases as the distance between the sensor and the planar surface increases, and the Kinect is more sensitive than the Xtion to radiantly illuminated objects.
Visual servoing plays an important role in helping the end effector align itself perfectly straight to the ball so as to make the perfect shot. Visual servoing can be divided into three groups: position-based, feature-based and 2-1/2-D visual servoing. [6] describes position-based visual servoing with a moving camera rather than a fixed camera; the camera is moved about a single axis so as to compensate for the loss of depth. The paper deals with the complexities involved in deriving the position of a goal point using a single camera. Position-based visual servoing works well as long as the position of the object can be determined definitively; if the object cannot be located effectively, errors result. [7] talks
about a method which can perform visual servoing
using features from the camera rather than using
positions. To reduce errors in visual servoing (error in
the difference between the visual feedback and the
tracked object position), positional visual servoing
techniques use the camera calibration matrix and the
Cartesian points of the tracked object. However, if the position of the object cannot be determined effectively, this method would fail. The method
proposed in this paper is to obtain feature points from
the camera calibration matrix. Feature points would
have an advantage over position vector points as the
loss of information on depth would not destabilize
them. Another way to minimize errors in visual
servoing was proposed in [8] which used multiple
cameras instead of single cameras. The eye in hand
configuration was used here for visual servoing.
Apart from the camera in the end effector a stereo
camera setup was made. The proposed method was
that the stereo system would determine the error and
perform the error corrections and then the eye in hand
camera would carry on with visual servoing.
[9] talks about the Baxter robot tracking human body motion. Two approaches are used to control the joints of the robot, one being the inverse kinematics
approach and the other being the vector approach.
The paper concludes that the inverse kinematics
approach is much better than the vector approach.
The vector approach uses only four of the joints to control the position of the arm, and it led to a few errors which had to be stabilized. However, the inverse kinematics approach ensured all joints were used and gave satisfactory movements to the joints, provided inverse kinematic solutions existed for the various positions. [10] builds a
kinematics model for the Baxter robot with the help
of its universal robotic descriptive format (URDF)
file provided by its SDK. The kinematics model is
developed for the purpose of simulations and it
discusses the D-H parameters of various joints of
Baxter.
III. SYSTEM FLOW
Fig. 1. Flow of the system: find the desired orientation; move Baxter's hand to the home position; take the hand to the desired orientation; find the center of the ball while maintaining the orientation; move the end effector (cue stick) to the center of the ball.
The entire process was performed as per the flow described in figure 1. This section is subdivided to explain each of the steps in further detail.
A. Orientation
Finding the desired orientation to strike is one of
the most important steps in the entire process. The approaches attempted to achieve it are shown in figure 2.
Fig. 2. Orientation methods: Xtion pointcloud and head camera.
1) Xtion and pointcloud: The Xtion depth sensor developed by Asus was used in the experiment to obtain pointcloud (depth and RGB) data of the view. A pointcloud is a set of data points representing the extended surface of an object in some coordinate system; in the Cartesian system these are X, Y, Z, where Z is the depth information (i.e. the distance of the point from the sensor). Pointclouds are extensively used in 3-D modelling and related research. The pointclouds obtained using the Xtion were further processed using the Point Cloud Library [11] C++ bindings. The processing performed on the pointclouds to find the desired orientation is shown in figure 3.
Fig. 3. Point cloud processing: capture the desired and current views; remove NaN data; define the normal and feature radii; compute normals; extract features; registration; refinement of the registration (ICP); compute the rotation and translation matrix.
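The registration pipeline itself was implemented with the Point Cloud Library in C++. As a minimal, illustrative stand-in for its final step (computing the rotation and translation matrix between two corresponded clouds), the sketch below applies the standard SVD-based (Kabsch) rigid alignment in NumPy; the data and tolerances are invented for the demo and are not the project's code.

```python
import numpy as np

def rigid_registration(source, target):
    """Estimate the rotation R and translation t aligning source onto
    target, given known point correspondences (Kabsch/SVD method)."""
    src_c, tgt_c = source.mean(axis=0), target.mean(axis=0)
    H = (source - src_c).T @ (target - tgt_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

# Toy check: rotate a random cloud 30 degrees about Z and shift it.
rng = np.random.default_rng(0)
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.10, -0.20, 0.30])
cloud = rng.random((50, 3))
moved = cloud @ R_true.T + t_true
R_est, t_est = rigid_registration(cloud, moved)
```

In the actual pipeline the correspondences come from the extracted features, and ICP then refines the estimate; this sketch assumes the correspondences are already known.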
As mentioned in the introduction, due to the inaccurate results and large computation time obtained when working with pointclouds, Baxter's head camera was used instead to find the orientation.
2) Head Camera: Using the head camera located above the display screen on the Baxter robot, the orientation was determined. Considering that the Baxter robot was at a fixed position and the pool table arrangement was at a fixed
distance from Baxter, the positions of the ball and the pocket were determined with the help of the 2-D pixel coordinates obtained from Baxter's head camera. Using the OpenCV libraries in Python, the pool ball and the pocket were masked using various thresholding methods. Contours were drawn around the ball and the pocket, and the centers were found using the moments function. Once the center points were found, the slope between the pool ball and the pocket was calculated and the angle was obtained in radians.
B. Baxter hand movements
The Baxter robot is a 14-DOF industrial robot with 7 DOF in each arm. Each arm has seven rotational joints and eight links. The robot is human-compliant and is programmed to work safely in any environment.
Once the orientation is determined, the right limb of Baxter is taken to a pre-determined home position, in which the end effector's orientation is parallel to the Z axis. The orientation angle obtained in Section A is then converted to the corresponding Baxter end effector quaternion, and Baxter is commanded to align itself to the given orientation.
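A minimal sketch of the angle-to-quaternion conversion, assuming the orientation angle is a pure yaw about the Z axis (the real pose sent to Baxter would also carry the end effector's fixed pointing direction):

```python
import math

def yaw_to_quaternion(theta):
    """Quaternion (x, y, z, w) for a rotation of theta about the Z axis.
    Only the planar part of the end-effector orientation is shown here;
    the fixed tool-down component is omitted for brevity."""
    return (0.0, 0.0, math.sin(theta / 2.0), math.cos(theta / 2.0))

q = yaw_to_quaternion(math.pi / 2)   # a 90-degree turn about Z
```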
C. Visual servoing
Fig. 4. Visual servoing loop: the right hand camera of the Baxter robot finds the center coordinate of the ball using the OpenCV library; the 2-D pixel coordinates are converted to Baxter position coordinates; inverse kinematics is solved for the coordinates; the required joint positions are set.
In this step visual servoing is used to ensure that the end effector (cue stick) is aligned with the center of the cue ball in the desired orientation already set using the head camera. Baxter's right hand camera is used along with the OpenCV libraries. The ball is masked from the image with the help of threshold functions, and dilation and erosion are performed to remove small error blobs left on the masked image. Again using the moments function, the center of the ball is calculated. To move the hand based on the position of the center of the ball, the camera's 2-D coordinates need to be converted to Baxter's joint coordinates. Keeping the distance between Baxter's end effector and the pool table arrangement constant allowed the factor of depth to be removed from this process (i.e. the Z-axis coordinate of the end effector is kept constant). The pixels/centimetre for each of the camera's 2-D axes was then calculated, and these values were mapped to Baxter's end effector translation values about the respective axes. Once obtained, they were passed to the inverse kinematics solver provided by the Baxter SDK [12] to obtain a joint solution for Baxter to move to. Visual servoing is repeated until Baxter's end effector is perfectly aligned with the center of the ball, while remaining at a constant depth away from the ball.
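A simplified sketch of the pixel-to-translation mapping described above, with a placeholder pixels/centimetre calibration (the real values would be measured for Baxter's hand camera at the fixed working depth):

```python
def pixel_to_translation(ball_px, target_px, px_per_cm=(12.0, 12.0)):
    """Map the pixel offset between the detected ball center and the
    desired image point to an end-effector translation in centimetres.
    Depth (Z) is held constant, so one pixel/centimetre scale per image
    axis suffices; the 12.0 values here are illustrative only."""
    dx = (target_px[0] - ball_px[0]) / px_per_cm[0]
    dy = (target_px[1] - ball_px[1]) / px_per_cm[1]
    return dx, dy

# One servo step: a ball detected 60 px to the side of the target point
# maps to a 5 cm correction; the loop would pass this to the IK solver
# and repeat until the pixel offset is near zero.
step = pixel_to_translation((100, 80), (160, 80))
```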
D. End effector striking movement
Once the right arm of Baxter is in the specified orientation and is aligned with the center of the ball, it has to make a linear motion to strike the ball. This means that Baxter needs to translate along its X axis by a certain amount and along its Y axis by that amount times the tangent of the orientation angle obtained. Since Baxter does not follow a linearly interpolated form of movement when moving from one point to another, the translations were divided into small increments to achieve linear movement. A more efficient way of doing this would be to use Jacobian matrices, but that would greatly decrease the force with which the arm strikes the pool ball.
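The incremental linear motion can be sketched as follows; the step count, distance and angle are illustrative, not the values used in the experiment:

```python
import math

def strike_waypoints(x0, y0, theta, distance, steps=20):
    """Split the strike translation into small increments so the arm
    approximates a straight line: each step advances along X, with the
    Y offset given by the tangent of the orientation angle."""
    dx = distance / steps
    return [(x0 + i * dx, y0 + i * dx * math.tan(theta))
            for i in range(1, steps + 1)]

# A 10 cm strike at an orientation of atan(0.5), split into 10 steps.
path = strike_waypoints(0.0, 0.0, math.atan(0.5), 0.10, steps=10)
```

Each waypoint would be sent through the IK solver in turn, trading smoothness against the striking force noted above.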
IV. RESULTS AND LIMITATIONS
The game of pool was successfully played with the help of Baxter's vision sensors and Baxter's inverse kinematics package, but because of the fixed position of the Baxter robot the playing workspace
was limited.
The following observations were made: the use of the Xtion was not successful because, to get a proper depth image of the view, the Xtion had to be at a specific distance and height, and from that height the inverse kinematics failed more often than not. Moreover, the registration process on the pointcloud was a computationally expensive operation.
The system had the following limitations:
1. The pointcloud results were not accurate.
2. The workspace was limited.
3. Sufficient force could not be achieved in the hitting action, and the ball moved only a few centimetres.
4. Due to minute quivering movements of Baxter's end effector, the center detection was slightly affected.
V. FUTURE WORK
Making the Baxter robot mobile would add flexibility and give Baxter a much larger workspace to play in. Once mobile, the robot could overcome the limitations caused by the elementary inverse kinematics package by adjusting its position. Another improvement would be to use the Point Cloud Library more effectively to calculate the orientation and to bring the factor of depth into the project with the help of the Xtion 3-D camera. A linear actuator could be placed at the end effector to substantially increase the force with which the Baxter robot strikes the ball.
REFERENCES
[1] Min Lu; Jian Zhao; Yulan Guo; Jianping Ou; Li, J.,
"A 3D pointcloud registration algorithm based on fast
coherent point drift," in Applied Imagery Pattern
Recognition Workshop (AIPR), 2014 IEEE , vol., no.,
pp.1-6, 14-16 Oct. 2014
[2] Ming Liu; Pomerleau, F.; Colas, F.; Siegwart, R.,
"Normal estimation for pointcloud using GPU based
sparse tensor voting," in Robotics and Biomimetics
(ROBIO), 2012 IEEE International Conference on ,
vol., no., pp.91-96, 11-14 Dec. 2012
[3] Strand, M.; Dillmann, R., "Segmentation and
approximation of objects in pointclouds using
superquadrics," in Information and Automation, 2009.
ICIA '09. International Conference on , vol., no.,
pp.887-892, 22-24 June 2009
[4] Papon, J.; Abramov, A.; Schoeler, M.; Worgotter, F.,
"Voxel Cloud Connectivity Segmentation -
Supervoxels for Point Clouds," in Computer Vision
and Pattern Recognition (CVPR), 2013 IEEE
Conference on , vol., no., pp.2027-2034, 23-28 June
2013
[5] Haggag, H.; Hossny, M.; Filippidis, D.; Creighton, D.;
Nahavandi, S.; Puri, V., "Measuring depth accuracy in
RGBD cameras," in Signal Processing and
Communication Systems (ICSPCS), 2013 7th
International Conference on , vol., no., pp.1-7, 16-18
Dec. 2013
[6] Hespanha, J.P., "Single-camera visual servoing," in
Decision and Control, 2000. Proceedings of the 39th
IEEE Conference on , vol.3, no., pp.2533-2538 vol.3,
2000
[7] Navarro-Alarcon, D.; Yun-Hui Liu, "Lyapunov-stable
eye-in-hand kinematic visual servoing with
unstructured static feature points," in Intelligent
Robots and Systems (IROS 2014), 2014 IEEE/RSJ
International Conference on , vol., no., pp.755-760,
14-18 Sept. 2014
[8] LianKui Qiu; Quanjun Song; Jianhe Lei; Yong Yu;
Yunjian Ge, "Multi-Camera Based Robot Visual
Servoing System," in Mechatronics and Automation,
Proceedings of the 2006 IEEE International
Conference on , vol., no., pp.1509-1514, 25-28 June
2006
[9] Reddivari, H.; Yang, C.; Ju, Z.; Liang, P.; Li, Z.; Xu,
B., "Teleoperation control of Baxter robot using body
motion tracking," in Multisensor Fusion and
Information Integration for Intelligent Systems (MFI),
2014 International Conference on , vol., no., pp.1-6,
28-29 Sept. 2014
[10] Zhangfeng Ju; Chenguang Yang; Hongbin Ma,
"Kinematics modeling and experimental verification
of baxter robot," in Control Conference (CCC), 2014
33rd Chinese , vol., no., pp.8518-8523, 28-30 July
2014
[11] Point Cloud Library, pointclouds.org
[12] Rethink Robotics Baxter SDK, sdk.rethinkrobotics.com