
Kinect for Windows SDK Dr David Brown


Installing and Using the Kinect Sensor
Setting up your Development Environment
Camera Fundamentals
Working with Depth Data
Skeletal Tracking Fundamentals
Audio Fundamentals

  • Speech samples: the Speech Platform Runtime and SDK must be the x86 editions. The Microsoft Kinect Speech Platform uses the same speech recognition (acoustic model) as Xbox. The .NET 4.0 System.Speech namespace can also be used, but it is not as up to date.
  • Kinect sensor specifications:
      Colour and depth stream range: 4 to 11.5 feet (1.2 to 3.5 meters)
      Skeletal tracking range: 4 to 11.5 feet (1.2 to 3.5 meters)
      Viewing angle: 43° vertical by 57° horizontal field of view
      Mechanized tilt range (vertical): ±28°
      Frame rate (depth and colour stream): 30 frames per second (FPS)
      Resolution, depth stream: QVGA (320 × 240)
      Resolution, colour stream: VGA (640 × 480)
      Audio format: 16-kHz, 16-bit mono pulse code modulation (PCM)
      Audio input characteristics: a four-microphone array with a 24-bit analogue-to-digital converter (ADC) and Kinect-resident signal processing such as echo cancellation and noise suppression
  • WPF, event-driven RGB and depth frames; camera tilt
  • Samples:
      Skeletal Viewer (C++ and C#): The Kinect sensor includes two cameras: one delivers depth information and the other delivers color data. The NUI API enables applications to access and manipulate this data. The SkeletalViewer sample uses the NUI API to render data from the Kinect sensor's cameras as images on the screen. The managed sample uses WPF to render captured images, and the native application uses DirectX.
      ShapeGame, Creating a Game with Audio and Skeletal Tracking (C#): Displays the tracked skeletons of two players together with shapes falling from the sky. Players can control the shapes by moving and speaking commands.
      AudioCaptureRaw (C++): The Kinect sensor's audio component is a four-element microphone array. The AudioCaptureRaw sample uses the Windows Audio Session API (WASAPI) to capture the raw audio stream from the Kinect sensor's microphone array and write it to a .wav file.
      MicArrayEchoCancellation, Acoustic Echo Cancellation, Beam Forming, and Source Localization (C++): The primary way for C++ applications to access the Kinect sensor's microphone array is through the MSRKinectAudio DirectX Media Object (DMO). The MSRKinectAudio DMO supports all standard microphone array functionality, and adds support for beamforming and source localization. The sample shows how to use the MSRKinectAudio DMO in a DirectShow graph. It uses acoustic echo cancellation to record a high-quality audio stream, and beamforming and source localization to determine the selected beam and the direction to the sound source.
      MFAudioFilter, Media Foundation Audio Filter (C++): Shows how to capture an audio stream from the Kinect sensor's microphone array by using the MSRKinectAudio DMO in filter mode in a Windows Media Foundation topology.
      RecordAudio, Recording an Audio Stream and Monitoring Direction (C#): Demonstrates how to capture an audio stream from the Kinect sensor's microphone array and monitor the currently selected beam and sound source direction.
      Speech, Recognizing Voice Commands (C#): Demonstrates how to use the Kinect sensor's microphone array with the Microsoft.Speech API to recognize voice commands.
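The stream figures quoted in the notes (QVGA depth at 30 FPS, 16-kHz 16-bit mono PCM) together with the 2 bytes per depth pixel stated on the Depth Data slide are enough to estimate raw data rates. The following is a minimal arithmetic sketch; the function and constant names are hypothetical and are not part of the SDK.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not SDK functions.

// Bytes per second for an uncompressed image stream.
constexpr std::uint64_t videoBytesPerSecond(std::uint64_t width, std::uint64_t height,
                                            std::uint64_t bytesPerPixel, std::uint64_t fps) {
    return width * height * bytesPerPixel * fps;
}

// Bytes per second for uncompressed PCM audio.
constexpr std::uint64_t pcmBytesPerSecond(std::uint64_t sampleRateHz,
                                          std::uint64_t bitsPerSample,
                                          std::uint64_t channels) {
    return sampleRateHz * (bitsPerSample / 8) * channels;
}

// Depth stream: 320 x 240 pixels, 2 bytes per pixel, 30 frames per second.
constexpr std::uint64_t kDepthBytesPerSecond = videoBytesPerSecond(320, 240, 2, 30);

// Audio stream: 16 kHz, 16-bit, mono.
constexpr std::uint64_t kAudioBytesPerSecond = pcmBytesPerSecond(16000, 16, 1);
```

Under these figures the depth stream comes to 153,600 bytes per frame, or 4,608,000 bytes per second, and the audio stream to 32,000 bytes per second.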

    1. SDK Overview
       Dr David Brown
       Microsoft Technology Centre
    2. Agenda
       Installing and Using the Kinect Sensor
       Setting up your Development Environment
       Camera Fundamentals
       Working with Depth Data
       Skeletal Tracking Fundamentals
       Audio Fundamentals
    3. Hardware
       Computer with a dual-core 2.66-GHz or faster processor
       2 GB RAM
       Windows 7-compatible graphics card that supports DirectX 9.0c
    4. Kinect Sensor
       3D depth sensors
       RGB camera
       Motorized tilt
       Multi-array mic
    5. Development Environment
       Microsoft Visual Studio 2010 Express or other Visual Studio 2010 edition
       .NET Framework 4.0
       SDK, http://research.microsoft.com/kinectsdk
       DirectX Samples
         Microsoft DirectX® SDK, June 2010 or later version
         Current runtime for Microsoft DirectX® 9
       Speech Samples
         Microsoft Speech Platform Runtime, version 10.2 (x86 edition)
         Microsoft Kinect Speech Platform (US-English version)
         Microsoft Speech Platform Software Development Kit, version 10.2 (x86 edition)
    6. Image API
    7. Demo
    8. Depth Image
       Array of bytes (ImageFrame.Image.Bits)
       Left to right, top to bottom
       Represents distance for pixel in mm (850 to 4,000 mm)
       0 means unknown
         Shadows, low reflectivity, and high reflectivity are among the few reasons
       Player Index
         0: No player
         1: Skeleton 0
         2: Skeleton 1
    9. Depth Data
       2 bytes per pixel (16 bits)
       Depth (distance per pixel)
         Bit-shift second byte by 8
         Distance (0,0) = (int)(Bits[0] | Bits[1] << 8);
       DepthAndPlayerIndex (includes player index)
         Bit-shift first byte by 3 (player index), second byte by 5
         Distance (0,0) = (int)(Bits[0] >> 3 | Bits[1] << 5);
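The two shift expressions on the Depth Data slide can be exercised outside the SDK. The sketch below assumes a raw byte buffer laid out as the Depth Image slide describes (left to right, top to bottom, 2 bytes per pixel). All function names are hypothetical. Reading the player index from the low 3 bits of the first byte is inferred from the slide's shift by 3, not quoted from it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; only the bit operations come from the slide.

// Byte offset of pixel (x, y) in a row-major buffer with 2 bytes per pixel.
std::size_t pixelOffset(int x, int y, int width) {
    return (static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
            + static_cast<std::size_t>(x)) * 2;
}

// Depth stream: the two bytes form a 16-bit little-endian distance in mm.
int depthMm(const std::vector<std::uint8_t>& bits, std::size_t i) {
    return static_cast<int>(bits[i] | (bits[i + 1] << 8));
}

// DepthAndPlayerIndex stream: drop the low 3 bits of the first byte,
// then append the second byte above the remaining 5 bits.
int depthMmWithPlayer(const std::vector<std::uint8_t>& bits, std::size_t i) {
    return static_cast<int>((bits[i] >> 3) | (bits[i + 1] << 5));
}

// Assumption: the 3 bits shifted out above carry the player index
// (0 = no player, 1 = skeleton 0, 2 = skeleton 1).
int playerIndex(const std::vector<std::uint8_t>& bits, std::size_t i) {
    return bits[i] & 0x07;
}
```

For example, a pixel at 1,500 mm belonging to player 1 is stored as the bytes 0xE1, 0x2E in the DepthAndPlayerIndex format, and as 0xDC, 0x05 in the plain depth format. A decoded value of 0 should be treated as unknown rather than as a distance.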
    10. Demo
    11. Skeleton API
    12. Skeleton Data
    13. Joint Data
        Maximum two players tracked at once
        Six player proposals
        Each player with set of joints
        <x, y, z> in meters
        Tracking state
          Tracked
          Inferred
            Occluded, clipped, or low confidence joints
          Not tracked
            Rare, but your code must check for this state
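The Joint Data slide stresses that code must check the tracking state before using a joint position. The sketch below shows one way such a check might look. The `TrackingState` enum and `Joint` struct are hypothetical stand-ins that mirror the slide's vocabulary; they are not the SDK's own declarations.

```cpp
#include <array>
#include <cstddef>

// Hypothetical types for illustration; the SDK defines its own equivalents.
enum class TrackingState { NotTracked, Inferred, Tracked };

struct Joint {
    float x, y, z;        // position in meters
    TrackingState state;  // confidence of this joint's position
};

// Returns true when the joint position may be used.
// Inferred joints (occluded, clipped, or low confidence) are accepted
// only if the caller opts in; untracked joints are always rejected.
bool isUsable(const Joint& joint, bool allowInferred) {
    switch (joint.state) {
        case TrackingState::Tracked:    return true;
        case TrackingState::Inferred:   return allowInferred;
        case TrackingState::NotTracked: return false;
    }
    return false;
}

// Counts the joints in a skeleton that pass the check above.
template <std::size_t N>
std::size_t countUsable(const std::array<Joint, N>& joints, bool allowInferred) {
    std::size_t count = 0;
    for (const Joint& joint : joints) {
        if (isUsable(joint, allowInferred)) {
            ++count;
        }
    }
    return count;
}
```

Whether to accept inferred joints is a design choice: drawing a skeleton overlay can usually tolerate them, while gesture logic that depends on precise positions may prefer to skip them.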
    14. Demo
    15. Audio Processing
        Four-microphone array with hardware-based audio processing
        Multichannel echo cancellation (MEC)
        Sound position tracking
        Other digital signal processing (noise suppression and reduction)
    16. Audio API
    17. Speech Recognition
        Grammar: what we are listening for
        Code: GrammarBuilder, Choices
        Speech Recognition Grammar Specification (SRGS)
        C:\Program Files (x86)\Microsoft Speech Platform SDK\Samples\Sample Grammars
        Set AutomaticGainControl = false
    18. Demo
    19. Samples
        NUI
          Skeletal viewer, C++, C#
          Shape Game Demo, C#
        Audio
          Raw capture, C++
          Audio filtering, C++
          Echo cancellation, C++
          Recording, C#
          Speech, C#
    20. Resources
        SDK, http://research.microsoft.com/kinectsdk
        Channel 9 quick-starts, http://channel9.msdn.com/series/KinectSDKQuickstarts/
        Coding4Fun gallery & blog, http://channel9.msdn.com/coding4fun/kinect
    22. Architecture
