Writing applications using the Microsoft Kinect Sensor
Phil Denoncourt firstname.lastname@example.org Philknows.net
About me
  - Consultant based in Concord, NH
  - Writing software for over 20 years
  - Writing .NET applications for 10 years
  - MCPD, MCITP, MCSD, MCDBA, MCSE
Kinect Features
  - Motion sensing device for Xbox 360 + Windows
  - Contains
    - RGB camera
    - Depth sensor
    - Multi-array microphone
Kinect SDK
  Hardware
    - Dual core, > 2.66 GHz
    - 2 GB RAM (4 GB recommended)
    - Kinect for Windows
    - Can use Xbox Kinect with power adapter for development
  Software
    - Windows 7
    - Windows 7 Embedded
    - DirectX 9.0c
    - Visual Studio 2010
    - Microsoft Speech Platform 11
SDK Features
  - Kinect drivers
  - Supports up to 4 connected devices
    - Each device needs a dedicated USB bus
  - Managed + native libraries
  - Access to the various streams
    - Video
    - Depth
    - Skeleton
  - Manipulate camera elevation
  - Access to multi-array microphone
What it doesn’t do
  - Doesn’t work with XNA for Xbox
    - Need XDK to develop Kinect for Xbox
    - Does work with XNA for Windows
  - Skeleton limitations
    - Doesn’t determine fingers
    - Doesn’t determine skull features (eyes, jaw, nose…)
    - Only works on humanoid figures
  - No person/face recognition
  - Speech recognition doesn’t support dictation
Depth Stream
  - Depth “image” captured 30 times per second
  - Returned as byte array
    - Left to right, top to bottom
  - Returns distance of pixel in millimeters
    - Between 850 and 4000 mm
    - -1 = unknown (shadows, reflectivity)
    - Near mode allows between 400 and 3000 mm
  - Also contains info describing which player occupies that pixel
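
The depth slide says each pixel carries both a distance and a player index, but not how the two are packed. The sketch below shows one plausible way to unpack a pixel from the byte array. It assumes 2 bytes per pixel, little-endian, with the player index in the low 3 bits and the distance in the remaining bits. That layout, along with the names `decode_depth_pixel` and `in_default_range`, is an assumption for illustration, not something taken from the slides or the SDK.

```python
import struct

# Assumption: the low 3 bits of each 16-bit value hold the player index.
PLAYER_INDEX_BITS = 3


def decode_depth_pixel(frame: bytes, x: int, y: int, width: int):
    """Return (depth_mm, player_index) for the pixel at column x, row y."""
    # Pixels are stored left to right, top to bottom, 2 bytes each (assumed).
    offset = (y * width + x) * 2
    (raw,) = struct.unpack_from("<H", frame, offset)
    player_index = raw & ((1 << PLAYER_INDEX_BITS) - 1)
    depth_mm = raw >> PLAYER_INDEX_BITS
    return depth_mm, player_index


def in_default_range(depth_mm: int) -> bool:
    """True if the distance falls inside the default 850-4000 mm range."""
    return 850 <= depth_mm <= 4000
```

A player index of 0 is treated here as "no player at this pixel", which is also an assumption.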
Skeleton Streams
  - Can capture and track up to 2 skeletons
    - Can monitor up to 6
  - Captures data at 30 frames per second
  - Captures a collection of 20 joints
    - X, Y, Z position in meters from the sensor
    - Some joints are inferred
  - Recognizes “partial” skeletons
  - No indication of joint’s orientation
    - Where is the person looking?
  - No built-in gesture support
  - Choose which skeleton to track, or sensor can automatically determine
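
Because there is no built-in gesture support, gestures have to be derived from the joint positions. A minimal sketch of a "hand raised above head" check follows. The joint names (`"head"`, `"hand_right"`), the dictionary layout, and the `hand_raised` helper are hypothetical, and the code assumes the Y axis points up.

```python
from typing import Dict, Tuple

# (x, y, z) position in meters from the sensor.
Joint = Tuple[float, float, float]


def hand_raised(joints: Dict[str, Joint], margin_m: float = 0.10) -> bool:
    """Return True if the right hand is at least margin_m above the head."""
    head = joints.get("head")
    hand = joints.get("hand_right")
    if head is None or hand is None:
        # Partial skeleton: a needed joint is missing, so report no gesture.
        return False
    # Assumption: larger Y means higher up.
    return hand[1] > head[1] + margin_m
```

The margin keeps small jitter around head height from toggling the gesture on and off. A real application would likely also want the condition to hold for several consecutive frames.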
Basic Models of Interaction
  - Event based
    - Event is raised for every frame
    - You must copy data from frame before next frame comes in
    - Routines should read data quickly
  - Interrogation based
    - You ask the sensor for the latest frame
    - Up to you when you ask
    - Might miss frames
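
The difference between the two models can be sketched with a simulated sensor. `FakeSensor` and `demo` are hypothetical stand-ins written for this illustration and do not correspond to SDK classes. The sensor reuses one buffer for every frame, which is why the event handler has to copy the data, and polling once after several frames shows how frames can be missed.

```python
class FakeSensor:
    """Hypothetical stand-in for a sensor that reuses one frame buffer."""

    def __init__(self):
        self._buffer = bytearray(4)  # overwritten on every new frame
        self._handlers = []
        self._frame_number = 0

    def on_frame(self, handler):
        """Event model: register a callback raised for every frame."""
        self._handlers.append(handler)

    def produce_frame(self):
        """Simulate the hardware delivering one new frame."""
        self._frame_number += 1
        for i in range(len(self._buffer)):
            self._buffer[i] = self._frame_number % 256
        for handler in self._handlers:
            handler(self._buffer)

    def latest_frame(self) -> bytes:
        """Interrogation model: return a copy of the most recent frame."""
        return bytes(self._buffer)


def demo():
    sensor = FakeSensor()
    copies = []
    # Copy inside the handler; the buffer is overwritten by the next frame.
    sensor.on_frame(lambda buf: copies.append(bytes(buf)))
    for _ in range(3):
        sensor.produce_frame()
    # Polling only once after three frames: the first two are never seen.
    polled = sensor.latest_frame()
    return copies, polled
```

In this sketch the event handler ends up with three distinct frames, while the single poll sees only the last one.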
Audio Processing
  - 4-microphone array
  - Processing occurs on Kinect hardware
    - Echo cancellation
    - Position tracking
    - Other noise suppression and reduction
  - Recording is done on a separate thread
    - Make sure apps are MTA, not STA
Speech Recognition
  - Command-based recognition only
  - Kinect uses Microsoft.Speech libraries
    - Not System.Speech
    - Needs Speech Platform Runtime (v11)
  - App needs to be MTA, not STA
Possible Applications
  - Kiosk / self-service portals
  - Cheap security monitors
  - Video conferencing / recording
Upcoming
  - New SDK release in late May
    - Should be compatible with v1
    - Gesture recording
    - Stronger support for “seated” skeleton
  - ASUS is rumored to be releasing a laptop with an embedded Kinect
Resources
  - http://www.microsoft.com/en-us/kinectforwindows/ -- SDK site
  - http://channel9.msdn.com/coding4fun/kinect -- Bunch of good samples, walkthroughs
  - http://www.codeplex.com -- Bunch of user-submitted code
    -- Make sure samples have been updated from the Beta SDK.
  - http://www.meetup.com/kinectboston/ -- Next meeting April 12 2012