This document presents image restoration and denoising methods that enable object detection on images from FingerVision, a vision-based tactile sensor. It proposes two methods: FVIC, which uses OpenCV operations (manual masking, inpainting-based restoration, and non-local means denoising), and FVICNN, which is based on a convolutional neural network. An experiment processes noisy FingerVision images with both methods and tests object detection accuracy using YOLOv3, finding that FVIC gives slightly better detection accuracy and is also faster than FVICNN (1.6 s vs. 7.5 s). The document concludes that appropriate image processing makes it possible to obtain multimodal sensing (visual and tactile) from FingerVision.
Image Restoration and Denoising for Simultaneous Tactile Sensing and Object Detection Using FingerVision (FingerVisionを用いた触覚センシングと物体検出を同時に実現するための画像修復とノイズ除去)
1. Image Restoration and Denoising for Simultaneous Tactile Sensing and Object Detection Using FingerVision
Yamasaki, Kakeru
2. Vision-based tactile sensor
• Reproduces the sense of touch by using a camera to read a membrane that deforms when it comes into contact with an object.
• FingerVision is one kind of vision-based tactile sensor.
(Figure: sensor structure)
3. Vision-based tactile sensor: Features
• Shape recognition
• Tactile data can be acquired as images.
• Object recognition
Yuan, Wenzhen, Siyuan Dong, and Edward H. Adelson. "GelSight: High-resolution robot tactile sensors for estimating geometry and force." Sensors 17.12 (2017): 2762.
4. Vision-based tactile sensor: Sensor types
GelSight, TacTip, FingerVision
https://news.mit.edu/2014/fingertip-sensor-gives-robot-dexterity-0919
http://www.brl.ac.uk/researchthemes/medicalrobotics/tactip.aspx
5. Vision-based tactile sensor: GelSight
Yuan, Wenzhen, Siyuan Dong, and Edward H. Adelson. "GelSight: High-resolution robot tactile sensors for estimating geometry and force." Sensors 17.12 (2017): 2762.
6. Vision-based tactile sensor: TacTip
https://softroboticstoolkit.com/tactip
Winstone, Benjamin, et al. "TACTIP—Tactile fingertip device, challenges in reduction of size to ready for robot hand integration." 2012 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2012.
7. Vision-based tactile sensor: FingerVision
8. Vision-based tactile sensor: FingerVision vs. others
FingerVision can recognize objects directly.
9. Vision-based tactile sensor: FingerVision vs. others
FingerVision can recognize objects directly.
• Even without applied pressure, it can still detect slip.
• It can detect objects.
http://akihikoy.net/notes/?project%2FFingerVision
11. Vision-based tactile sensor: FingerVision vs. others
FingerVision can recognize objects directly.
• Even without applied pressure, it can still detect slip.
• It can detect objects.
http://akihikoy.net/notes/?project%2FFingerVision
12. FingerVision: Object detection
• There is no existing object detection research that displays bounding boxes and class probabilities on FingerVision images.
(Figure: example detection with bounding box and class probability)
13. FingerVision: Object detection issues
The silicone membrane itself can introduce noise:
• Circle grid (dot markers)
• Reflection of light
14. FingerVision: Object detection approaches
1. Train object detection models on images that include the circle grid
2. Use existing trained models after denoising
15. FingerVision: Object detection approaches
1. Train object detection models on images that include the circle grid
2. Use existing trained models after denoising
Approach 1 requires data for every object class we want to recognize, and existing trained models cannot be reused.
16. FingerVision: Object detection approaches
1. Train object detection models on images that include the circle grid
2. Use existing trained models after denoising
Approach 2 only needs a dataset for denoising and image restoration, and existing trained models can be reused.
17. Purpose
(Pipeline figure: image obtained from FingerVision → FingerVision image restoration and denoising → object detection)
18. Related work: Raindrop image restoration
• (Yamashita Atsushi+, 2007) Background difference method using multiple cameras with parallax
• (Shaodi You+, 2015) Modeling raindrops based on the laws of physics
• (Takahashi Saki+, 2017) Image style transfer using a CNN
Shaodi You, Robby T. Tan, Rei Kawakami, Yasuhiro Mukaigawa, and Katsushi Ikeuchi. Adherent raindrop modeling, detection and removal in video. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 9, pp. 1721–1733, 2015.
19. Denoising and image restoration methods
• FVIC (FingerVision Image Correction): based on OpenCV
• FVICNN (FingerVision Inpainting Convolutional Neural Network): based on a convolutional neural network
20. FVIC (based on OpenCV)
① Mask
② Restoration
③ Denoise
21. FVIC (based on OpenCV)
① Mask: manual work. The cost is low because the dot positions do not change once they are set.
② Restoration
③ Denoise
22. FVIC (based on OpenCV)
① Mask
② Restoration: Fast Marching Method
③ Denoise
23. Fast Marching Method
OpenCV: cv.inpaint(img, mask, 3, cv.INPAINT_TELEA)
• Each missing pixel is replaced by a normalized weighted average of the known pixels in its neighborhood.
• The color of the pixel to be repaired is estimated from the known neighborhood pixels and their gradients.
https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html
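A minimal sketch of this restoration step, assuming the circle-grid mask is built once by hand from known dot positions; the image path, dot coordinates, and dot radius below are placeholders, not values from the slides:

```python
import cv2 as cv
import numpy as np

# Load a FingerVision frame (path is a placeholder).
img = cv.imread("fingervision_frame.png")

# Build a binary mask marking the circle-grid dots to be restored.
# The dot centers and radius are hypothetical; in FVIC they are set
# manually once, since the dot positions do not change.
mask = np.zeros(img.shape[:2], dtype=np.uint8)
dot_centers = [(40, 40), (40, 80), (80, 40), (80, 80)]  # placeholder coordinates
for cx, cy in dot_centers:
    cv.circle(mask, (cx, cy), 6, 255, thickness=-1)

# Restore the masked pixels with the Fast Marching Method (Telea).
restored = cv.inpaint(img, mask, 3, cv.INPAINT_TELEA)
cv.imwrite("restored.png", restored)
```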
24. FVIC (based on OpenCV)
① Mask
② Restoration
③ Denoise: Non-local means filter
25. Non-local means filter
OpenCV: cv.fastNlMeansDenoisingColored()
Each pixel of interest is corrected as a weighted average: points whose patches are similar to the template patch receive larger weights, and dissimilar points receive smaller weights.
https://www.slideshare.net/masayukitanaka1975/ssii2014
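A sketch of this final FVIC step applied to the inpainted image; the filter strengths and window sizes shown are common OpenCV example values, not settings reported in the slides:

```python
import cv2 as cv

# Restored image from the previous inpainting step (path is a placeholder).
restored = cv.imread("restored.png")

# Non-local means denoising for color images: each pixel is replaced by a
# weighted average of pixels whose surrounding patches look similar.
# h / hColor control filter strength; 7 and 21 are the template-patch and
# search-window sizes (assumed values, not from the slides).
denoised = cv.fastNlMeansDenoisingColored(restored, None, 10, 10, 7, 21)
cv.imwrite("denoised.png", denoised)
```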
26. FVICNN (based on a convolutional neural network)
• Since paired noisy data can be obtained, supervised learning is used.
• Although GANs are an option among generative models, we did not use one in this report because the parameters are difficult to set and training may not converge (left as future work).
27. FVICNN (based on a convolutional neural network)
• Based on DnCNN, which is widely used for denoising, and SRCNN, which is used for super-resolution.
• 400 epochs, batch size = 8, learning rate = 5e-3
Architecture (input 480×640×3 → output 480×640×3):
• Conv + ReLU: kernel (3, 3), 32 filters, strides (1, 1)
• [Conv + BatchNormalization + ReLU: kernel (3, 3), 32 filters, strides (1, 1)] ×6
• Conv + BatchNormalization + ReLU: kernel (9, 9), 32 filters, strides (1, 1)
• Conv + BatchNormalization + ReLU: kernel (1, 1), 16 filters, strides (1, 1)
• Conv + BatchNormalization + tanh: kernel (5, 5), 3 filters, strides (1, 1)
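A Keras sketch of the layer stack listed on this slide, assuming "same" padding so the 480×640×3 input and output sizes match; the slides do not state the padding mode or the framework actually used:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_fvicnn(input_shape=(480, 640, 3)):
    """Sketch of the FVICNN layer stack described on the slide."""
    inputs = layers.Input(shape=input_shape)

    # Conv + ReLU, 3x3 kernel, 32 filters
    x = layers.Conv2D(32, (3, 3), strides=(1, 1), padding="same", activation="relu")(inputs)

    # (Conv + BatchNormalization + ReLU, 3x3, 32 filters) x 6
    for _ in range(6):
        x = layers.Conv2D(32, (3, 3), strides=(1, 1), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)

    # Conv + BatchNormalization + ReLU, 9x9, 32 filters
    x = layers.Conv2D(32, (9, 9), strides=(1, 1), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)

    # Conv + BatchNormalization + ReLU, 1x1, 16 filters
    x = layers.Conv2D(16, (1, 1), strides=(1, 1), padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)

    # Conv + BatchNormalization + tanh, 5x5, 3 filters -> restored RGB image
    x = layers.Conv2D(3, (5, 5), strides=(1, 1), padding="same")(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.Activation("tanh")(x)

    return models.Model(inputs, outputs)

model = build_fvicnn()
model.summary()
```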
28. FVICNN: Dataset
• Prepare paired data without the membrane and with the membrane (×300 each).
• 10 randomly selected images were used as test data.
• 29 images were held out as validation data (hold-out).
30. FVICNN: Training result
• Loss function: mean squared error
• Optimizer: Adam (lr = 0.005, β1 = 0.900, β2 = 0.999)
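A training sketch matching these settings and the 400-epoch / batch-size-8 configuration from slide 27, using the model built in the previous sketch. The arrays x_train, y_train, x_val, and y_val are placeholders for the with-membrane inputs and the paired no-membrane targets, assumed here to be scaled to the tanh output range (the slides do not state the exact preprocessing):

```python
from tensorflow.keras.optimizers import Adam

# Training configuration from the slides: MSE loss, Adam with lr = 0.005,
# beta_1 = 0.900, beta_2 = 0.999, 400 epochs, batch size 8.
model.compile(optimizer=Adam(learning_rate=0.005, beta_1=0.900, beta_2=0.999),
              loss="mean_squared_error")

# x_train: images captured with the membrane (noisy input).
# y_train: paired images captured without the membrane (clean target).
# Both are assumed to be float arrays in the tanh range, e.g. [-1, 1].
history = model.fit(x_train, y_train,
                    batch_size=8,
                    epochs=400,
                    validation_data=(x_val, y_val))
```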
31. Object detection: YOLOv3
• To evaluate the experiment, we use the object detection model YOLOv3.
• 80 categories
• We selected the banana class from these categories for the experiment.
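The slides do not say which YOLOv3 implementation was used; one way to run this check is the OpenCV DNN module with standard Darknet config/weights and the 80-class label file, as sketched below (all file paths are placeholders):

```python
import cv2 as cv
import numpy as np

# Standard Darknet YOLOv3 config/weights and the 80-class label file (placeholder paths).
net = cv.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
with open("coco.names") as f:
    class_names = [line.strip() for line in f]
banana_id = class_names.index("banana")

img = cv.imread("denoised.png")  # restored and denoised FingerVision image
blob = cv.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

h, w = img.shape[:2]
for out in outputs:
    for det in out:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        confidence = scores[class_id]
        # Keep only confident banana detections and draw the bounding box
        # with its class probability.
        if class_id == banana_id and confidence > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            x, y = int(cx - bw / 2), int(cy - bh / 2)
            cv.rectangle(img, (x, y), (x + int(bw), y + int(bh)), (0, 255, 0), 2)
            cv.putText(img, f"banana {confidence:.2f}", (x, y - 5),
                       cv.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cv.imwrite("detection.png", img)
```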
32. Experiment: Accuracy verification for FVIC and FVICNN
1. Process the noisy data with FVIC and FVICNN, respectively.
2. Input the noise-free data, the FVIC output, and the FVICNN output to YOLOv3.
33. FVIC and FVICNN output
(Figure: example restored output images)
34. Result: Accuracy verification for FVIC and FVICNN
• With image restoration and denoising, only one of the five images allowed the object to be detected. FVIC processing gives slightly better detection accuracy than FVICNN.
• Processing time: FVIC 1.6 s, FVICNN 7.5 s
35. Discussion
• This research showed that it is possible to detect objects with existing models by applying appropriate image processing, even when the circle grid is embedded in the image.
• The object detection results depend greatly on the accuracy of image restoration and denoising, and we confirmed that accuracy drops considerably compared with the noise-free case.
• Since the dataset contained only 600 images, a more efficient way of creating the dataset would allow a larger dataset to be built, which should improve accuracy.
36. Discussion
• Another possibility is to use a large dataset such as CIFAR-10 to learn image restoration and then adapt the model to FingerVision.
• Although the same camera was used for both FingerVision systems in this study, we believe a clearer image could be obtained by using a high-resolution camera to capture the noise-free data.
37. Summary
• There has not been enough research on object detection using FingerVision.
• The most important point of this research is that it may be possible to obtain multimodal information (visual and tactile) with a single inexpensive mechanism by applying image restoration and denoising to FingerVision.
38. Future work
• Implement a GAN
• Enlarge the dataset
• Use a large dataset such as CIFAR-10 to learn image restoration and apply it to FingerVision
• Use high-resolution cameras to create the noise-free data