This presentation was made as a tutorial at NCVPRIPG (http://www.iitj.ac.in/ncvpripg/) at IIT Jodhpur on 18-Dec-2013.
Kinect is a multimedia sensor from Microsoft, shipped as the touch-free controller for the Xbox 360 video gaming platform. Kinect comprises an RGB camera, a depth sensor (IR emitter and IR camera), and a microphone array. It produces a multi-stream video containing RGB, depth, skeleton, and audio streams.
Compared to common depth cameras (laser-based or Time-of-Flight), a Kinect costs quite little, as it uses a novel structured-light diffraction and triangulation technique to estimate depth. In addition, Kinect is equipped with special software to detect human figures and produce their 20-joint skeletons.
Though Kinect was built for touch-free gaming, its cost effectiveness and human-tracking features have proved useful in many indoor applications beyond gaming, such as robot navigation, surveillance, medical assistance, and animation.
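The structured-light approach reduces to stereo triangulation: the IR emitter and IR camera form a baseline, and the shift (disparity) of the projected dot pattern encodes distance. A minimal sketch of the underlying relation (the focal length and baseline values below are illustrative, not Kinect's actual calibration constants):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic triangulation: depth is inversely proportional to disparity.

    focal_px     - focal length of the IR camera, in pixels
    baseline_m   - distance between IR emitter and IR camera, in metres
    disparity_px - observed shift of the projected pattern, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("no disparity: point at infinity or invalid match")
    return focal_px * baseline_m / disparity_px

# A near object shifts the pattern more than a far one.
near = depth_from_disparity(focal_px=580, baseline_m=0.075, disparity_px=40)
far = depth_from_disparity(focal_px=580, baseline_m=0.075, disparity_px=10)
```

The inverse relationship is why depth resolution degrades with distance: a one-pixel disparity error costs far more millimetres at long range than at short range.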
Develop Store Apps with Kinect for Windows v2 (Clemente Giorio)
Agenda:
Intro
The Sensor
Data Source
Kinect Evolution
Data Source
Windows Store App
Body Frame
Coordinate Mapper
Kinect Studio
Gesture Recognition
Gesture Builder
Nowadays, Natural User Interfaces (NUI) make new modes of interaction between user and device possible. We look at how Microsoft Kinect and Intel RealSense, with their respective SDKs, make it possible to implement these technologies in our applications, as well as the recent biometric authentication features introduced in Windows 10.
Welcome to the world of Microsoft Kinect for Windows–enabled applications. This document is your roadmap to building exciting human-computer interaction solutions you once thought were impossible.
All you need to know about Kinect v2 development. With these slides you can understand the system requirements for starting to develop with the Kinect v2 sensor, the data sources accessible through the sensor, and the gestures you can perform to interact with touchless interfaces.
How Augment your Reality: Different perspective on the Reality / Virtuality C... (Matteo Valoriani)
If you think there's been a lot of talk about Augmented Reality and Virtual Reality this year, 2018 is going to blow you away. Apple with ARKit, Google with ARCore, Microsoft with HoloLens, Facebook with Oculus, and many others are working to transform our reality with new products and services in the not-too-distant future. Apple, Microsoft, Google, and Facebook are approaching AR/VR from different perspectives, and in this session we will try to understand how these different technologies work and which best suits different areas (Industry 4.0, tourism, healthcare, ...).
Accelerate Large-Scale Inverse Kinematics with the Intel® Distribution of Ope... (Intel® Software)
Inverse kinematics (IK) technology helps game developers create natural character animations, but its complexity makes it time-consuming to implement across a large cast. This talk details how the Deep Learning Deployment Toolkit (DLDT) allows game developers to quickly deploy deep-learning algorithms to solve character IK problems and yield better results.
Create a Scalable and Destructible World in HITMAN 2* (Intel® Software)
Gain insight into how IO Interactive* (IOI) designed the crowd, environmental audio, non-playable character simulation, and physical destruction systems to take advantage of available hardware and dynamically upscale resolution and deliver more realism. See the design and architecture of the destruction system, including the asset pipeline and game runtime that enables IOI to create a more interesting world for their players.
The Architecture of 11th Generation Intel® Processor Graphics (Intel® Software)
Scheduled for release this year, this next generation brings significant improvements over the widely used 9th generation of Intel® Processor Graphics. The talk begins with an overview of Intel® Graphics architecture, its building blocks, and their performance implications. Next, take an in-depth look at the new and innovative features of this latest generation of integrated graphics.
What is a Digital Twin? Why is it another point of view on the IoT stack in Azure? What are its features? How does it relate to IoT Hub and other Azure IoT services?
Ray Tracing with Intel® Embree and Intel® OSPRay: Use Cases and Updates | SIG... (Intel® Software)
Explore practical examples of Intel® Embree and Intel® OSPRay in production rendering and the best practices of using the kernels in typical rendering pipelines.
Use Variable Rate Shading (VRS) to Improve the User Experience in Real-Time G... (Intel® Software)
Variable-rate shading (VRS) is a new feature of Microsoft DirectX* 12 and is supported on the 11th generation of Intel® graphics hardware. Get an overview and learn best practices, recommendations, and how to modify traditional 3D effects to take advantage of VRS.
Practical Core Bluetooth in IoT & Wearable projects @ UIKonf 2016 (Shuichi Tsutsumi)
In recent years, "IoT" and "wearables" have been buzzwords, so many people may be interested in building hardware products. But learning how to develop electric circuits, mechanical systems, or embedded systems from zero is difficult. However, iOS developers can contribute to hardware product projects with knowledge of Core Bluetooth / Bluetooth Low Energy (BTLE), even if they are not familiar with the hardware layer. In this session, he introduces BTLE, shows easy examples of Core Bluetooth, and shares knowledge from his experience developing more than 10 apps for IoT and wearable products.
What is Bluetooth Low Energy? Why use this?
Very easy examples of how to communicate using Core Bluetooth
What part was my responsibility in the projects? Communication with firmware engineer.
Designing GATT
Designing the behavior of the app in the background
Limitations in the background: what is possible and what is impossible?
State Preservation and Restoration
Develop without prototypes of the hardware
BTLE Module's Developer Kit
Prototyping tools
Build emulator apps
Troubleshooting
Debugging tools
Common cases: can't find / connect / send or receive information
Technical presentation of the gesture-based NUI I developed for the Aigaio smart conference room at IIT Demokritos
Demo In Greek:
https://www.youtube.com/watch?v=5C_p7MHKA4g
SIGGRAPH 2014 Course on Computational Cameras and Displays, part 3 (Matthew O'Toole)
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
SIGGRAPH 2014 Course on Computational Cameras and Displays, part 4 (Matthew O'Toole)
This presentation briefly describes image enhancement in the spatial domain: basic gray-level transformations, histogram processing, enhancement using arithmetic/logical operations, basics of spatial filtering, and local enhancement.
Synthetic Data: From 3D model to AI on the Edge (Sherry List)
This presentation shows how easily #SyntheticData can be created with the help of Three.js and Playwright, and how this data can be used to train #AI models that will be used on the #Edge.
Presentation on Microsoft Technologies in Teaching, Learning and Research, presented at the Microsoft IT Academy Summit, October 2011. (Presentation video in low quality to allow upload.)
Kinect is a motion-sensing input device by Microsoft for the Xbox 360 video game console and Windows PCs. Based around a webcam-style add-on peripheral for the Xbox 360 console, it enables users to control and interact with the Xbox 360 without the need to touch a game controller, through a natural user interface using gestures and spoken commands. Microsoft released the Kinect software development kit for Windows 7 on June 16, 2011. This SDK allows developers to write Kinecting apps in C++/CLI, C#, or Visual Basic .NET.
Storytelling using Immersive Technologies (Kumar Ahir)
This is a kickstarter presentation for understanding the domain of immersive technologies, and a guide to creating an immersive experience using Unity, Vuforia, and A-Frame.
Even before we get into how to do storytelling using this new medium, we need to understand what's possible and where it is heading, which positions us better to design the story and capture it properly.
This will help you understand the ecosystem of immersive technologies from business, product, design, and development perspectives.
HCI BASED APPLICATION FOR PLAYING COMPUTER GAMES | J4RV4I1014 (Journal For Research)
This paper describes a command interface for games based on hand gestures and voice commands defined by posture, movement, and location. The system uses computer vision, requiring no sensors or markers on the user. For voice commands, the speech recognizer recognizes the user's input and passes the command to the game, where the action takes place. We propose a simple architecture for performing real-time colour detection and motion tracking using a webcam. The next step is to track the motion of the specified colours; the resulting actions are given as input commands to the system. We specify blue for motion tracking and green for the mouse pointer. Speech recognition is the process of automatically recognizing a word spoken by a particular speaker, based on individual information included in speech waves. This application helps reduce hardware requirements and can be implemented in other electronic devices as well.
Robotics & AI User Group - Computer Vision - Azure Kinect (Stefano Tempesta)
Slide deck from the first Robotics & Artificial Intelligence Virtual User Group, with theme "Computer Vision". Includes a presentation of Azure Kinect.
https://www.meetup.com/robotics-artificial-intelligence-meetup-group/
Land of Pyramids, Petra, and Prayers - Egypt, Jordan, and Israel Tour (ppd1961)
This is a presentation of photos and history, Land of Pyramids, Petra, and Prayers, from our Egypt, Jordan, and Israel tour in February 2020. It was prepared and presented to family and friends on 19th July, 2020.
This is an overview of C++ (based on 1999 / 2003 standard) and its use in Object Oriented Programming. The presentation assumes that the audience knows C programming.
This presentation was made in PRISM workshop on Technology Innovations and Trends in IT in the second decade of 21st century. The agenda is from IEEE Computer Society.
The new standard for the C++ language was ratified in 2011. This new (extended) language, called C++11, has a number of new semantics (in terms of language constructs) and new standard library support. The major language extensions are discussed in this presentation; the library will be taken up in a later presentation.
Search and Society: Reimagining Information Access for Radical Futures (Bhaskar Mitra)
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Kubernetes & AI - Beauty and the Beast !?! @ KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I wondered, as an "infrastructure container Kubernetes guy", how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of the infrastructure requirements and technologies that could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
PHP Frameworks: I want to break free, IPC Berlin 2024 (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality (Inflectra)
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a PASSION for making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement front-end features just by following the standard rules for a framework, thinking that this is enough to launch the project successfully, and then the project fails. How do you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
"Impact of front-end architecture on development cost", Viktor Turskyi
Kinectic vision looking deep into depth
1. Looking Deep into Depth
Prof. Partha Pratim Das, ppd@cse.iitkgp.ernet.in & partha.p.das@gm
Department of Computer Science & Engineering
Indian Institute of Technology Kharagpur
18th December 2013
Tutorial: NCVPRIPG 2013
2. Natural Interaction
Understanding human nature
How we interact with other people
How we interact with environment
How we learn
Interaction methods the user is familiar and comfortable with
Touch
Gesture
Voice
Eye Gaze
6. Natural User Interface
Revolutionary change in Human-Computer interaction
Interpret natural human communication
Computers communicate more like people
The user interface becomes invisible, or looks and feels so real that it is no longer iconic
25.
1. Kinect hardware - the hardware components, including the Kinect sensor and the USB hub through which the Kinect sensor is connected to the computer.
2. Kinect drivers - Windows drivers for the Kinect:
- Microphone array as a kernel-mode audio device - standard audio APIs in Windows.
- Audio and video streaming controls (color, depth, and skeleton).
- Device enumeration functions for more than one Kinect.
3. Audio and Video Components - Kinect NUI for skeleton tracking, audio, and color and depth imaging.
4. DirectX Media Object (DMO) - for:
- microphone array beamforming
- audio source localization.
5. Windows 7 standard APIs - the audio, speech, and media APIs, and the Microsoft Speech SDK.
28. RGB camera
Infrared (IR) emitter and an IR depth sensor - estimate depth
Multi-array microphone - four microphones for capturing sound
3-axis accelerometer - configured for a 2g range, where g is the acceleration due to gravity; determines the current orientation of the Kinect
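When the sensor is static, the accelerometer reading is effectively a gravity vector, so the Kinect's orientation can be recovered from it with basic trigonometry. A small sketch of that idea (not an SDK call; the axis convention below, with gravity along -y when level, is an assumption for illustration):

```python
import math

def tilt_angles(ax, ay, az):
    """Pitch and roll, in degrees, from a static accelerometer reading.

    Assumes the sensor frame where a level device reads (0, -1, 0),
    i.e. gravity pulls along the negative y axis.
    """
    pitch = math.degrees(math.atan2(az, -ay))  # lean forward/back
    roll = math.degrees(math.atan2(ax, -ay))   # lean left/right
    return pitch, roll

pitch, roll = tilt_angles(0.0, -1.0, 0.0)  # level sensor: both angles zero
```

The 2g range is more than enough here, since a static sensor never reads beyond 1g per axis.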
31. The NUI API lets the user programmatically control and access four data streams:
Color Stream
Infrared Stream
Depth Stream
Audio Stream
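Each stream delivers timestamped frames independently, so applications that combine streams (as the Kinect Explorer samples do) typically pair frames by nearest timestamp. A hedged sketch of that pairing logic (the function name and frame representation are illustrative, not SDK types):

```python
def nearest_frame(frames, timestamp_ms):
    """Pick the frame whose timestamp is closest to the target time.

    frames: list of (timestamp_ms, payload) tuples, in any order.
    """
    return min(frames, key=lambda f: abs(f[0] - timestamp_ms))

# Depth frames arrive at ~30 fps, i.e. roughly every 33 ms.
depth_frames = [(0, "d0"), (33, "d1"), (66, "d2")]
# A color frame stamped at 30 ms pairs best with the depth frame at 33 ms.
ts, payload = nearest_frame(depth_frames, 30)
```

In practice the SDK also offers an all-frames-ready style callback so applications receive matched frames without doing this matching by hand.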
32. Source: Kinect for Windows Developer Toolkit v1.8
http://www.microsoft.com/en-in/download/details.aspx?id=40276
App Examples:
Color Basics - D2D
Infrared Basics - D2D
Depth Basics - D2D / Depth - D3D
Depth with Color - D3D
Kinect Explorer - D2D
Kinect Explorer - WPF
33. Source: Kinect for Windows Developer Toolkit v1.8
http://www.microsoft.com/en-in/download/details.aspx?id=40276
App Examples:
Color Basics - D2D
34. Color
Format and resolution:
RGB - 640x480 @ 30 fps, 1280x960 @ 12 fps
YUV - 640x480 @ 15 fps
Bayer - 640x480 @ 30 fps, 1280x960 @ 12 fps
New features of SDK 1.6 onwards:
• Low light (or brightly lit).
• Use hue, brightness, or contrast to improve visual clarity.
• Use gamma to adjust the way the display appears on certain hardware.
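The YUV option trades color fidelity for bandwidth, and the application converts it to RGB for display. A rough per-pixel conversion sketch using common BT.601-style coefficients (the coefficients are rounded textbook values, not the SDK's own converter):

```python
def yuv_to_rgb(y, u, v):
    """Convert one YUV pixel (all components 0-255, u/v biased by 128) to RGB."""
    d, e = u - 128, v - 128
    r = y + 1.403 * e
    g = y - 0.344 * d - 0.714 * e
    b = y + 1.770 * d
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(r), clamp(g), clamp(b)

# Neutral chroma (u = v = 128) leaves a gray pixel unchanged.
gray = yuv_to_rgb(128, 128, 128)
```

Because YUV 4:2:2 shares one chroma pair between two neighboring pixels, a full-frame converter would apply this function twice per 32-bit UYVY group.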
36. The IR stream is the test pattern observed from both the RGB and IR camera.
Resolution: 640x480 @ 30 fps
37. Source: Kinect for Windows Developer Toolkit v1.8
http://www.microsoft.com/en-in/download/details.aspx?id=40276
App Examples:
Depth Basics - D2D
Depth - D3D
Depth with Color - D3D
The depth data stream merges two separate types of data:
Depth data, in millimeters.
Player segmentation data. Each player segmentation value is an integer indicating the index of a unique player detected in the scene.
The Kinect runtime processes depth data to identify up to six human figures in a segmentation map.
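In the Kinect v1 packed 16-bit depth format, the low three bits of each pixel carry the player segmentation index and the upper bits carry the depth in millimeters. A small Python sketch of the unpacking; the function name is illustrative:

```python
def unpack_depth_pixel(value):
    # Split one packed 16-bit Kinect v1 depth pixel:
    # low 3 bits  -> player index (0 = no player, 1-6 = figures)
    # high 13 bits -> depth in millimeters
    player_index = value & 0x7
    depth_mm = value >> 3
    return depth_mm, player_index

# A pixel 1500 mm away, belonging to player 2:
packed = (1500 << 3) | 2
print(unpack_depth_pixel(packed))  # → (1500, 2)
```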
App Examples:
Kinect Explorer - D2D
Kinect Explorer - WPF
High-quality audio capture
Audio input from +50 to -50 degrees in front of the sensor
The array can be pointed in 10-degree increments within this range
Identification of the direction of audio sources
Raw voice data access
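Given a source-localization estimate, an application typically snaps it to the nearest of the array's fixed beams (per the slide, -50 to +50 degrees in 10-degree steps). A quick Python sketch of that snapping; the real DMO does this internally in its adaptive beamforming mode:

```python
def nearest_beam(source_angle_deg):
    # Snap an estimated source direction to the nearest fixed beam.
    # The beams span -50..+50 degrees in 10-degree steps (11 beams).
    beams = range(-50, 60, 10)
    return min(beams, key=lambda b: abs(b - source_angle_deg))

print(nearest_beam(23.0))   # → 20
print(nearest_beam(-47.0))  # → -50
```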
In addition to the hardware capabilities, the
Kinect software runtime implements:
Skeleton Tracking
Speech Recognition - Integration with the
Microsoft Speech APIs.
App Examples:
Speech Basics - D2D
Tic-Tac-Toe - WPF
The microphone array is an excellent input device for speech recognition.
Applications can use the Microsoft.Speech API to access the latest acoustic algorithms.
Acoustic models have been created to allow speech recognition in several locales in addition to the default locale of en-US.
App Examples:
Face Tracking 3D - WPF
Face Tracking Basics - WPF
Face Tracking Visualization
Input Images
The Face Tracking SDK accepts Kinect color and depth images as input. The tracking quality may be affected by the image quality of these input frames.
Face Tracking Outputs
Tracking status
2D points
3D head pose
Action Units (AUs)
Source: http://msdn.microsoft.com/en-us/library/jj130970.aspx
App Examples:
Kinect Fusion Basics - D2D
Kinect Fusion Color Basics - D2D
Kinect Fusion Explorer - D2D
Kinect Fusion takes depth images from the Kinect camera, which contain lots of missing data, and, as the sensor is moved around a static scene, produces a realistic, smooth 3D reconstruction within a few seconds.
From this, a point cloud or a 3D mesh can be produced.
Kinect Interaction provides:
Identification of up to 2 users and identification
and tracking of their primary interaction hand.
Detection services for user's hand location and
state.
Grip and grip release detection.
Press and scroll detection.
Information on the control targeted by the user.
Matlab is a high-level language and interactive
environment for numerical computation,
visualization, and programming.
Open Source Computer Vision (OpenCV) is a
library of programming functions for real-time
computer vision.
MATLAB and OpenCV functions perform
numerical computation and visualization on
video and depth data from sensor.
Source:
Matlab – http://msdn.microsoft.com/en-us/library/dn188693.aspx
Provides libraries and tools to help software
developers create robot applications.
Provides hardware abstraction, device
drivers, libraries, visualizers, message-passing, package management, and more.
Provides additional RGB-D processing
capabilities.
Source: ROS –
Binary image processing
Image registration and model fitting
Interest point detection
Camera calibration
Provides RGB-D processing capabilities.
[Figures: depth image from a Kinect sensor; 3D point cloud created from RGB & depth images]
Source: BoofCV –
Body Parts | Gestures | Applications | References
Fingers | 9 hand gestures for numbers 1 to 9 | Sudoku game | Ren et al. 2011
Hand, Fingers | Pointing gestures: pointing at objects or locations of interest | Real-time hand gesture interaction with the robot; pointing gestures translated into goals for the robot | Bergh et al. 2011
Arms | Lifting both arms to the front, to the side and upwards | Physical rehabilitation | Chang 2011
Hand | 10 hand gestures | HCI system; experimental datasets include hand-signed digits | Doliotis 2011
Body Parts | Gestures | Applications | References
Hand and Head | Clap, Call, Greet, Yes, No, Wave, Clasp, Rest | HCI applications; not built for any specific application | Biswas 2011
Single Hand | Grasp and Drop | Various HCI applications | Tang 2012
Right hand; Left hand, Head, Foot or Knee | Push, Hover (default) | View and select recipes for cooking in the kitchen when hands are messy | Panger 2012
Hand, Arms | Right/Left-arm swing, Right/Left-arm push, Right/Left-arm back, Zoom-in/out | Various HCI applications; not built for any specific application | Lai 2012
Example gestures:
Left and right swipe
Start presentation / End presentation
Hand as a marker
Circling gesture
Push gesture
Image source: Kinect for Windows Human Interface Guidelines v1.5
• Consider an example: take a left swipe path.
[Figure: plot of the left-swipe hand trajectory, with x and y values in the range -0.4 to 0.5]
• The sequence of angles will be of the form [180,170,182, … ,170,180,185]
• The corresponding quantized feature vector is [9,9,9, … , 9,9,9]
• How does it distinguish left swipe from right swipe? Very straightforward!
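The quantization step above can be sketched in a few lines of Python, assuming 20-degree bins so that angles near 180 degrees all map to symbol 9, as in the example; the exact bin width used in the implementation is an assumption here:

```python
def quantize_angles(angles_deg, bin_width=20):
    # Map each motion-direction angle to a discrete symbol by
    # rounding to the nearest multiple of bin_width. With 20-degree
    # bins, angles near 180 all become symbol 9.
    return [int(a / bin_width + 0.5) for a in angles_deg]

print(quantize_angles([180, 170, 182, 170, 180, 185]))
# → [9, 9, 9, 9, 9, 9]
```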
[Figure: four-state (L1-L4) HMM model for right swipe]
• For a real-time gesture, the HMM returns likelihoods, which are converted to probabilities by normalization, followed by (empirical) thresholding to classify gestures.
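The normalize-then-threshold step can be sketched as follows; the threshold value is illustrative (the slide notes it was chosen empirically), and the gesture names are placeholders:

```python
def classify(likelihoods, threshold=0.6):
    # Normalize per-gesture likelihoods into probabilities and accept
    # the winning gesture only if it clears an empirical threshold;
    # otherwise report no gesture.
    total = sum(likelihoods.values())
    probs = {g: l / total for g, l in likelihoods.items()}
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None

print(classify({"left_swipe": 0.9, "right_swipe": 0.1}))  # → left_swipe
print(classify({"left_swipe": 0.5, "right_swipe": 0.5}))  # → None
```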
• Here are the results obtained by the implementation:

 | Correct gesture | Incorrect gesture
Correctly classified | 281 | 8
Wrongly classified | 19 | 292

• Precision = 97.23 %
• Recall = 93.66 %
• F-measure = 95.41 %
• Accuracy = 95.5 %
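These four figures follow directly from the confusion matrix; a short Python check (the slide's 93.66 % and 95.41 % are the same quantities truncated rather than rounded):

```python
tp, fn = 281, 19   # correct gestures: classified right / wrong
fp, tn = 8, 292    # incorrect gestures: accepted / rejected

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"{precision:.2%} {recall:.2%} {f_measure:.2%} {accuracy:.2%}")
# → 97.23% 93.67% 95.42% 95.50%
```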
Additional libraries and APIs
Skeleton Tracking
KinectInteraction
Kinect Fusion
Face Tracking
OpenNI Middleware
NITE
3D Hand
KSCAN3D
3-D FACE
Source:
NITE – http://www.openni.org/files/nite/
3D Hand – http://www.openni.org/files/3d-hand-tracking-library/
3D face – http://www.openni.org/files/3-d-face-modeling/
KSCAN3D – http://www.openni.org/files/kscan3d-2/
Evaluating a Dancer's Performance using Kinect-based Skeleton Tracking
Joint Positions
Joint Velocity
3D Flow Error
Source: http://doras.dcu.ie/16574/1/ACMGC_Doras.pdf
Kinect SDK
Better skeleton representation: 20-joint skeleton; joints are mobile
Higher frame rate – better for tracking actions involving rapid pose changes
Significantly better skeletal tracking algorithm (by Shotton, 2013); robust and fast tracking even in complex body poses
OpenNI
Simpler skeleton representation: 15-joint skeleton; fixed basic frame of the [shoulder left : shoulder right : hip] triangle and the [head : shoulder] vector
Lower frame rate – slower tracking of actions
Weak skeletal tracking
The IR emitter projects
an irregular pattern of IR
dots of varying
intensities.
The IR camera
reconstructs a depth
image by recognizing
the distortion in this
pattern.
Kinect works on a stereo matching algorithm, but it captures "stereo" with only one IR camera: the known projected pattern plays the role of the second view.
Projects a speckle pattern onto the scene.
Infers depth from the deformation of the speckle pattern.
Source:
http://users.dickinson.edu/~jmac/selected-talks/kinect.pdf
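The depth recovery itself is plain triangulation: a speckle dot displaced by d pixels from its reference position lies at depth Z = f * b / d. A sketch using assumed (but commonly cited) Kinect-like numbers, roughly a 580 px focal length and 7.5 cm emitter-camera baseline:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    # Classic triangulation: Z = f * b / d, where f is the focal
    # length in pixels, b the emitter-camera baseline in meters, and
    # d the observed speckle shift in pixels.
    return focal_px * baseline_m / disparity_px

# With the assumed values, a 29-pixel shift maps to 1.5 m:
print(depth_from_disparity(580, 0.075, 29.0))  # → 1.5
```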
Highly varied training dataset (driving, dancing, kicking, running, navigating menus, etc.; 500k images) allows the classifier to estimate body parts invariant to pose, body shape, and clothing.
The pairs of depth and body-part images are used as fully labelled data for learning the classifier.
Randomized decision trees and forests are used to classify each pixel to a body part, based on a depth image feature and a threshold.
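The feature used at each tree node is Shotton et al.'s depth-comparison feature: probe two offsets around the pixel, each scaled by 1/depth so the response is depth-invariant, and compare their depth difference to a learned threshold. A toy Python sketch; the dict-based depth map and all values are illustrative:

```python
def depth_feature(depth, x, u, v, big=1e6):
    # Probe two offsets u, v around pixel x, scaled by 1/depth(x) so
    # the same body part responds similarly near and far from the
    # camera. Probes that miss the map read a large constant, as the
    # background does in the real feature.
    d0 = depth[x]
    def probe(off):
        p = (x[0] + round(off[0] / d0), x[1] + round(off[1] / d0))
        return depth.get(p, big)
    return probe(u) - probe(v)

# Toy neighbourhood around a pixel at 2 m depth:
dm = {(5, 5): 2.0, (5, 6): 2.0, (5, 4): 3.0}
print(depth_feature(dm, (5, 5), (0, 2), (0, -2)))  # → -1.0
```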
Joint positions are found within each body part by a local mode-finding approach based on mean shift with a weighted Gaussian kernel.
This process considers both the inferred body-part probability at the pixel and the world surface area of the pixel.
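One weighted mean-shift iteration, the core of that mode-finding step, can be sketched as below (1D for brevity; the weights stand in for per-pixel body-part probabilities, and the bandwidth is an arbitrary choice):

```python
import math

def mean_shift_step(points, weights, center, bandwidth=0.1):
    # One iteration of weighted mean shift with a Gaussian kernel:
    # each candidate point pulls the estimate toward it in proportion
    # to its weight and its kernel distance from the current center.
    num = den = 0.0
    for p, w in zip(points, weights):
        k = w * math.exp(-((p - center) ** 2) / (2 * bandwidth ** 2))
        num += k * p
        den += k
    return num / den

# Points clustered near 1.0 plus one low-weight outlier at 2.0,
# which the kernel effectively ignores:
pts = [0.98, 1.0, 1.02, 2.0]
ws = [0.9, 1.0, 0.9, 0.1]
print(round(mean_shift_step(pts, ws, 1.0), 3))  # → 1.0
```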
[Figures: input depth, inferred body parts, and inferred body joint positions, shown in front, side, and top views; no tracking or smoothing]
Source: http://research.microsoft.com/pubs/145347/Kinect%20Slides%20CVPR2011.pptx
PROS
Multimedia sensor
Ease of use
Low cost
Strong library support
CONS
2.5D depth data
Limited field of view (43° vertical, 57° horizontal)
Limited range (0.8 – 3.5 m)
Uni-directional view
Depth shadows
Limited resolution
Missing depth values
For a single Kinect, a minimum of 10 feet by 10 feet (3 meters by 3 meters) of space is needed.
[Figures: side and top views of the capture volume]
Source: iPiSoft Wiki:
http://wiki.ipisoft.com/User_Guide_for_Dual_Depth_Sensor_Configuration
Microsoft's Kinect SDK v1.6 introduced a new API
(KinectSensor.ForceInfraredEmitterOff) to control the IR emitter as a
software shutter.
This API works for Kinect for Windows only.
Earlier, the IR emitter was always active when the camera was active;
using this API, the IR emitter can be switched off as needed.
This is the most effective TDM scheme. However, no multi-Kinect
application has yet been reported with TDM-SS.
Quality Factor | SDM | TDM | PDM
Accuracy | Excellent | Bad | Very Good
Scalability | Good (no overlaps) | Bad | Excellent
Frame rate | 30 fps | 30/n fps | 30 fps
Ease of Use | Very easy | Cumbersome shutters | Inconvenient vibrators
Cost | Low | High | Moderate
Limitations | Few configurations | Unstable | Blurred RGB
Robustness | Change geometry | Adjust set-up | Quite robust
Kinect for Xbox 360 – good for gaming
The gaming console sensor for Xbox; costs around $100 without the Xbox.
http://www.xbox.com/en-IN/Kinect
The Kinect for Windows SDK is partially supported for this.
Kinect for Windows – recommended for research
This is for Kinect app development; costs around $250.
http://www.microsoft.com/en-us/kinectforwindows/
The Kinect for Windows SDK is fully supported for this.
1.
Ren, Z., Meng, J., and Yuan, J. Depth camera based hand gesture
recognition and its applications in human-computer-interaction. In
Information, Communications and Signal Processing (ICICS) 2011
8th International Conference on (2011), pp. 1-5.
2.
Biswas, K. K., and Basu, S. K. Gesture recognition using Microsoft
Kinect. In Proceedings of the 5th International Conference on
Automation, Robotics and Applications (2011), pp. 100 - 103.
3.
Chang, Y.-J., Chen, S.-F., and Huang, J.-D. A Kinect-based system
for physical rehabilitation: A pilot study for young adults with motor
disabilities. Research in Developmental Disabilities
(2011), 2566-2570.
4.
Doliotis, P., Stefan, A., McMurrough, C., Eckhard, D., and
Athitsos, V. Comparing gesture recognition accuracy using color
and depth information. In Proceedings of the 4th International
Conference on PErvasive Technologies Related to Assistive
Environments, PETRA '11 (2011).
5.
Lai, K., Konrad, J., and Ishwar, P. A gesture-driven computer
interface using Kinect. In Image Analysis and Interpretation
(SSIAI), 2012 IEEE Southwest Symposium on (2012), pp. 185-188.
6.
Panger, G. Kinect in the kitchen: testing depth camera interactions
in practical home environments. In CHI '12 Extended Abstracts on
Human Factors in Computing Systems (2012), pp. 1985 - 1990.
7.
Tang, M. Recognizing hand gestures with Microsoft's Kinect.
Department of Electrical Engineering, Stanford University.
8.
Ahmed, N. A system for 360° acquisition and 3D animation
reconstruction using multiple RGB-D cameras. Unpublished article,
2012.
URL: http://www.mpi-inf.mpg.de/~nahmed/casa2012.pdf
9.
Caon, M., Yue, Y., Tscherrig, J., Mugellini, E., and Khaled, O. A.
Context-aware 3D gesture interaction based on multiple Kinects. In
Ambient Computing, Applications, Services and Technologies, Proc.
AMBIENT First Int'l. Conf. on, pages 7-12, 2011.
10.
Berger, K., Ruhl, K., Albers, M., Schröder, Y., Scholz, A.,
Kokemüller, J., Guthe, S., and Magnor, M. The capturing of
turbulent gas flows using multiple Kinects. In Computer Vision
Workshops, IEEE Int'l. Conf. on, pages 1108-1113, 2011a.
11.
Berger, K., Ruhl, K., Brümmer, C., Schröder, Y., Scholz, A., and
Magnor, M. Markerless motion capture using multiple color-depth
sensors. In Vision, Modeling and Visualization, Proc. 16th Int'l.
Workshop on, pages 317-324, 2011b.
12.
Faion, F., Friedberger, S., Zea, A., and Hanebeck, U. D. Intelligent
sensor scheduling for multi-Kinect-tracking. In Intelligent Robots
and Systems (IROS), IEEE/RSJ Int'l. Conf. on, pages 3993-3999,
2012.
13.
Butler, A., Izadi, S., Hilliges, O., Molyneaux, D., Hodges, S., and
Kim, D. Shake'n'Sense: Reducing interference for overlapping
structured light depth cameras. In Human Factors in Computing
Systems, Proc. ACM CHI Conf. on, pages 1933-1936, 2012.
14.
Kainz, B., Hauswiesner, S., Reitmayr, G., Steinberger, M., Grasset,
R., Gruber, L., Veas, E., Kalkofen, D., Seichter, H., and
Schmalstieg, D. OmniKinect: Real-time dense volumetric
data acquisition and applications. In Virtual reality software and
technology, Proc. VRST'12: 18th ACM symposium on, pages 25-32, 2012.
15.
Maimone, A. and Fuchs, H. Reducing interference between
multiple structured light depth sensors using motion. In Virtual
Reality, Proc. IEEE Conf. on, pages 51-54, 2012.
16.
Kramer, J., Burrus, N., Echtler, F., Herrera C., D., and Parker, M.
Hacking the Kinect. Apress, 2012.
17.
Zelnik-Manor, L. and Or-El, R. Kinect Depth Map. Seminar in
"Advanced Topics in Computer Vision" (048921 – Winter 2013).
URL:
http://webee.technion.ac.il/~lihi/Teaching/2012_winter_048921/PPT/Roy.pdf
18.
iPiSoft Wiki. User Guide for Dual Depth Sensor Configurations.
Last accessed 17-Dec-2013.
URL:
http://wiki.ipisoft.com/User_Guide_for_Dual_Depth_Sensor_Configuration
19.
Tong, J., Zhou, J., Liu, L., Pan, Z., and Yan, H. (2012). Scanning
3D full human bodies using Kinects. Visualization and Computer
Graphics, IEEE Transactions on, 18:643-650, 2012.
20.
Mallick, T., Das, P.P., Majumdar, A. Study of Interference Noise in
Multi-Kinect Set-up, Accepted for presentation at VISAPP 2014: 9th
Int’l Jt. Conf. on Computer Vision, Imaging and Computer Graphics
Theory and Applications, Lisbon, Portugal, 5-8, Jan, 2014.
21.
Huawei/3DLife ACM Multimedia Grand Challenge for 2013.
http://mmv.eecs.qmul.ac.uk/mmgc2013/
22.
Schröder, Y., Scholz, A., Berger, K., Ruhl, K., Guthe, S., and
Magnor, M. Multiple Kinect studies. Technical Report 0915, ICG, 2011.
Tanwi Mallick, TCS Research Scholar, Department of
Computer Science and Engineering, IIT, Kharagpur has
prepared this presentation.
Full Body Gaming – Controller-free gaming means full-body play. Kinect responds to how you move. So if you have to kick, then kick. If you have to jump, then jump. You already know how to play; all you have to do now is get off the couch.
It's All About You – Once you wave your hand to activate the sensor, your Kinect will be able to recognize you and access your avatar. Then you'll be able to jump in and out of different games, and show off and share your moves.
Something For Everyone – Whether you're a gamer or not, anyone can play and have a blast. And with advanced parental controls, Kinect promises a gaming experience that's safe, secure and fun for everyone.
Simply step in front of the sensor and Kinect recognizes you and responds to your gestures.
Kinect hardware - the hardware components, including the Kinect sensor and the USB hub through which the Kinect sensor is connected to the computer.
Kinect drivers - the Windows drivers for the Kinect, installed as part of the SDK setup process as described in this document. The Kinect drivers support:
the Kinect microphone array as a kernel-mode audio device that you can access through the standard audio APIs in Windows;
audio and video streaming controls for streaming audio and video (color, depth, and skeleton);
device enumeration functions that enable an application to use more than one Kinect.
Audio and video components - the Kinect natural user interface for skeleton tracking, audio, and color and depth imaging.
DirectX Media Object (DMO) - for microphone array beamforming and audio source localization.
Windows 7 standard APIs - the audio, speech, and media APIs in Windows 7, as described in the Windows 7 SDK and the Microsoft Speech SDK; these APIs are also available to desktop applications in Windows 8.
There is only one color image stream available per sensor. The infrared image stream is provided by the color image stream, therefore, it is impossible to have a color image stream and an infrared image stream open at the same time on the same sensor.
The beamforming functionality supports 11 fixed beams, which range from -0.875 to 0.875 radians (approximately -50 to 50 degrees in 10-degree increments). Applications can use the DMO’s adaptive beamforming option, which automatically selects the optimal beam, or specify a particular beam. The DMO also includes a source localization algorithm, which estimates the source direction.
The Face Tracking SDK uses the Kinect coordinate system to output its 3D tracking results. The origin is located at the camera’s optical center (sensor), Z axis is pointing towards a user, Y axis is pointing up. The measurement units are meters for translation and degrees for rotation angles.
The Face Tracking SDK tracks the 87 2D points indicated in the following image (in addition to 13 points that aren’t shown in Figure 2 - Tracked Points):
The X,Y, and Z position of the user’s head are reported based on a right-handed coordinate system (with the origin at the sensor, Z pointed towards the user and Y pointed UP – this is the same as the Kinect’s skeleton coordinate frame). Translations are in meters.The user’s head pose is captured by three angles: pitch, roll, and yaw.
The Face Tracking SDK results are also expressed in terms of weights of six AUs and 11 SUs, which are a subset of what is defined in the Candide3 model (http://www.icg.isy.liu.se/candide/). The SUs estimate the particular shape of the user’s head: the neutral position of their mouth, brows, eyes, and so on. The AUs are deltas from the neutral shape that you can use to morph targets on animated avatar models so that the avatar acts as the tracked user does.The Face Tracking SDK tracks the following AUs. Each AU is expressed as a numeric weight varying between -1 and +1.
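Driving an avatar from the AU weights is ordinary morph-target blending: each AU's delta shape is scaled by its tracked weight in [-1, +1] and added to the neutral shape. A minimal Python sketch with made-up shape data (the AU name and vertex lists are illustrative):

```python
def morph(neutral, deltas, au_weights):
    # Blend a neutral vertex list with per-AU delta shapes, scaling
    # each delta by its tracked AU weight in [-1, +1].
    out = list(neutral)
    for au, w in au_weights.items():
        for i, d in enumerate(deltas[au]):
            out[i] += w * d
    return out

neutral = [0.0, 0.0, 0.0]
deltas = {"jaw_lower": [0.0, -1.0, 0.0]}  # hypothetical AU delta shape
print(morph(neutral, deltas, {"jaw_lower": 0.5}))  # → [0.0, -0.5, 0.0]
```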
http://pointclouds.org/documentation/
The Point Cloud Library (PCL) is a large-scale, open project for point cloud processing. The PCL framework contains numerous algorithms including filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation. These algorithms can be used, for example, to filter outliers from noisy data, stitch 3D point clouds together, segment relevant parts of a scene, extract keypoints and compute descriptors to recognize objects in the world based on their geometric appearance, and create surfaces from point clouds and visualize them, to name a few.
PCL is cross-platform, and has been successfully compiled and deployed on Linux, MacOS, Windows, and Android. To simplify development, PCL is split into a series of smaller code libraries that can be compiled separately.