Table of Contents
• Introduction
  ◦ Who is the author?
  ◦ Overview
  ◦ Kinect basics
• Chapter 1: Tech Side
  ◦ Hardware/software preparation
  ◦ Hacks and tips
• Chapter 2: Biz Side
  ◦ Original intention and actual feedback
  ◦ Video view analyses
  ◦ What else happened
Overview (1)
• What is this presentation all about?
  ◦ My Kinect Hacks, done as a holiday project:
    ◦ http://code.google.com/p/kinect-ultra/
    ◦ http://code.google.com/p/kinect-kamehameha/
  ◦ How much fun Kinect hacking can be
• What are Kinect Hacks?
  ◦ Creating your own cool stuff using Kinect, the motion-sensing gaming system for Xbox 360
• When and how did I start Kinect Hacks?
  ◦ In Dec 2010, a month after Kinect’s release
  ◦ From my friend’s tweet about Kinect and Kinect Hacks
  ◦ Me: “Wow, I’ve gotta do this! I don’t mind spending my whole winter holiday on it!”
Overview (2)
• What is interesting about my Kinect Hacks?
  ◦ Intensive crash project
    ◦ Major part(*) done in a week (before my wife ran out of patience)
    ◦ No special knowledge of motion detection or 3D CG at the beginning
  ◦ Challenge for “the silliest thing ever”
    ◦ Me: “I take my hat off to the smart hacks created by brilliant people all around the world. Then I’ll create something silly that nobody has ever thought of, dedicating the best of my intelligence, energy, and CPU & GPU power. It must be fun!”
  ◦ Got an unexpectedly huge response from the public
    ◦ Huge view counts on YouTube & Nicovideo (300k in the 1st week)
    ◦ Appeared on news blogs, in newspapers, on TV, and in other media
    ◦ Received contest awards
    ◦ Contacted by an investor about commercialization
    ◦ …
(*) kinect-ultra V1, which earned the largest public response
Overview (3)
• What you may learn today
  ◦ How to start cool Kinect Hacks by yourself
    ◦ Chapter 1: Tech Side
  ◦ Some hints for a geek to make a “hit” (Well, I hope so)
    ◦ Chapter 2: Biz Side
• Disclaimers
  ◦ I am a total amateur in image recognition, motion detection, and 3D CG
  ◦ I only know the things that were interesting and/or necessary for me
  ◦ I do not care much about academic accuracy (Be careful, I may be lying)
  ◦ I am a geek, not a business person
Kinect Basics (1)
• What is Kinect actually?
  ◦ Gaming system for Xbox 360 that enables intuitive and natural game play without controllers
  ◦ Released in Nov 2010
• What is the Kinect Sensor?
  ◦ Input device with an RGB camera, an IR depth sensor, and some other auxiliary sensors
    ◦ 640x480@30fps, 1280x1024@10fps(*)
    ◦ Internals developed by PrimeSense
  ◦ Connectable to a PC via USB
    ◦ Drivers and libraries available for free
  ◦ In this presentation, “Kinect” refers to the Kinect Sensor
(*) With Avin’s Windows driver
Kinect Basics (2)
• What can you do with Kinect? Generally speaking…
  ◦ RGB camera + depth sensor: Kinect provides the color of, and the distance to, the object for each pixel
  ◦ Skeleton recognition by PC (so you will get 3D positions for each joint)
  ◦ 3D object recognition by PC
• Don’t you see you can build any cool stuff on this? Let’s hack!
[Figure: depth image with a legend from “Near” through “Far” to “Very Far”]
Chapter 1: Tech Side
• This chapter explains the nuts and bolts behind this crash project
  ◦ Like the tricks behind a magic show, nothing is surprising once you get to know them
  ◦ General mathematics (especially geometry) required
• How much time did I spend?
  ◦ Study: 3 days
  ◦ kinect-ultra: 7 days (for V1, the one that got the huge public response) + 2 days (for V2)
  ◦ kinect-kamehameha: 1 day (for V1) + 1 day (for V2)
  ◦ Actually, I think I should count “nights” rather than “days”
Hardware Preparation
• Kinect, of course!
  ◦ Caution: buy the standalone package, not the Xbox bundle
    ◦ The Xbox bundle does not include the adapter for the USB connector
• Windows PC
  ◦ With a fairly fast CPU and GPU
    ◦ The more powerful your hardware is, the more energy you can spend on the cool essential stuff rather than on performance optimization
    ◦ Mine: Core i7 2600 + GeForce GTX 285
  ◦ How about Mac and Linux?
    ◦ I am not so familiar with them, but Windows is probably safer because of its good driver support(*) and Microsoft’s upcoming SDK
• You don’t need an Xbox
(*) Avin’s Windows driver can automatically calibrate the RGB camera and the IR depth sensor, but I was not able to find the same feature in the Linux drivers when I tried. It may be better now.
Software Preparation (1)
• OpenNI + NITE + Avin’s SensorKinect
  ◦ Basic software component set for sensor information access and recognition algorithms
    ◦ OpenNI: Framework
    ◦ NITE: OpenNI-compatible implementation
    ◦ Avin’s SensorKinect: OpenNI-compatible Kinect driver
  ◦ Advantages over other options (such as OpenKinect)
    ◦ Released by PrimeSense
    ◦ Player recognition and skeleton tracking available out of the box! Actually, this was the key success factor that let me finish this project so quickly without any special knowledge of motion recognition
    ◦ Auto calibration between the RGB camera and the IR sensor (thanks to Avin for the nice driver implementation)
  ◦ In this presentation, “OpenNI” refers to all of these software components as a set
Software Preparation (2)
• OpenGL support libraries
  ◦ Chose OpenGL as my first 3D API to learn
  ◦ Just followed “OpenGL SuperBible 5th Edition”
    ◦ Standard support libraries (e.g. freeglut)
    ◦ The book’s own library (GLTools)
• Others
  ◦ OpenCV
    ◦ Only used for reading image files and generating Gaussian random numbers
Hack 0: Study with Sample Programs
• Studied for 3 days before starting kinect-ultra
  ◦ Surveyed both OpenKinect and OpenNI, and chose the latter
  ◦ Learned basic pixel information access and OpenGL usage from OpenNI’s sample programs
• First practice piece: depth-aware delayed overlay
  ◦ See “Algorithm March by Kinect”: http://www.youtube.com/watch?v=j4ABDmFhkgA
Hack 1: Transformation
• Use the “calibration complete” event to trigger the transformation
  ◦ Calibration by the “psi pose” (Ψ) is a common way for Kinect apps to start skeleton tracking
  ◦ “Something happens on calibration complete” is Kinect-ish entertainment
• Modulate the color of the player area to represent the superhero suit
  ◦ OpenNI reports “hey, this pixel seems to be a part of player #1”, so the app easily knows which pixels should be modulated
  ◦ Switch the color (red or gray) of each pixel based on its distance from the head
    ◦ The app can calculate the Euclidean distance between any pixels/joints in real world coordinates
    ◦ It is slow, however; some optimization is required
  ◦ You: “Isn’t it too rough?” Me: “Well, that’s OK, this is meant to be funny after all!”
    ◦ Skinning would be ideal, but it is too serious and challenging
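A minimal sketch of the per-pixel color switch, assuming each player pixel has already been converted to real world coordinates in millimeters. The function names, the 300 mm radius, and the choice of gray near the head and red elsewhere are illustrative assumptions, not taken from kinect-ultra. Comparing squared distances is just one possible optimization; the slides do not say which one was actually used.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 3D points (same units, e.g. mm)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def suit_color(pixel_xyz, head_xyz, gray_radius_mm=300.0):
    """Pick the suit color for one player pixel from its distance to the head.

    Assumption: gray inside the radius, red outside. The real mapping and
    radius are not given in the slides.
    """
    # Comparing squared values avoids one square root per pixel.
    if squared_distance(pixel_xyz, head_xyz) <= gray_radius_mm ** 2:
        return "gray"
    return "red"


def color_player_pixels(points, labels, head_xyz, player_id=1):
    """Return a color for every pixel labeled as the player, None for the rest."""
    return [
        suit_color(p, head_xyz) if label == player_id else None
        for p, label in zip(points, labels)
    ]
```

Here `labels` stands in for the per-pixel player label that the slide says OpenNI reports; how that label is fetched is outside this sketch.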
TIP: A Bit about Coordinate Systems
• Kinect coordinates
  ◦ Raw pixel & depth data from Kinect
  ◦ Each XY plane: (0, 0)~(640, 480)
  ◦ Depth: 0 to 10000~ (seems linear)
• OpenGL coordinates
  ◦ Raw vertex & pixel data for OpenGL
  ◦ Each XY plane: (-1.0, -1.0)~(1.0, 1.0)
  ◦ Z-buffer: 0.0 to 1.0 (non-linear)
• Real world coordinates
  ◦ Skeleton positions from OpenNI
  ◦ Virtual 3D polygon objects
• Between Kinect coordinates and real world coordinates: projected by the OpenNI API (a little slow)
• Between real world coordinates and OpenGL coordinates: transformed by the OpenGL API
Hack 2: Detect Pose, Shoot Laser
• No motion detection, only pose detection!
  ◦ Calculation is tremendously easy without time derivatives
  ◦ Once the positions of the skeleton parts are given, elementary vector operations (distance, dot product, cross product) work very well
  ◦ Trial and error to decide good parameters (e.g. thresholds)
• Spawn a laser while the pose is detected
  ◦ The laser is a flat rectangle object in 3D space with an alpha texture, laid over the image from the RGB camera
  ◦ Position/direction/initial velocity are calculated from the pose
• Same approach for shooting the Eye Slugger
  ◦ With an additional stability check
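To make “elementary vector operations” concrete, here is a small sketch that tests a hypothetical “straight arm” pose from three joint positions and derives a laser origin and velocity from it. The pose itself, the threshold `cos_threshold`, the speed, and all function names are assumptions for illustration; the actual poses and parameters of kinect-ultra are not described in the slides.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    """Scale a vector to length 1; joints are assumed not to coincide."""
    length = math.sqrt(dot(a, a))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / length for x in a)


def arm_is_straight(shoulder, elbow, hand, cos_threshold=0.9):
    """True when upper arm and forearm point in nearly the same direction."""
    upper_arm = unit(sub(elbow, shoulder))
    forearm = unit(sub(hand, elbow))
    # The dot product of unit vectors is the cosine of the elbow bend.
    return dot(upper_arm, forearm) >= cos_threshold


def laser_from_arm(elbow, hand, speed_mm_per_s=3000.0):
    """Spawn position and initial velocity of a laser leaving the hand."""
    direction = unit(sub(hand, elbow))
    velocity = tuple(speed_mm_per_s * c for c in direction)
    return hand, velocity
```

No time derivative appears anywhere: each frame is judged on its own, which is why thresholds found by trial and error are enough.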
Hack 3: Hidden Surface Processing
• Place each pixel from Kinect as a point object in 3D space
  ◦ Not texture mapping
  ◦ So pixels and other 3D objects hide each other
• Handle pixels in projective coordinates for good performance
  ◦ 3D objects basically reside in real world coordinates, but mapping all pixels into the real world is too slow
  ◦ Instead, map pixels directly from Kinect coordinates to OpenGL raw coordinates by transforming the depth value into an OpenGL Z-buffer value
  ◦ See the next page; it was a hack
TIP: Fast Depth Transformation
• Direct transformation from Kinect’s depth value to the OpenGL Z-buffer value is much faster! Some hacking was needed to figure out the formula.
• Unifying everything into real world coordinates makes the logic easier, but it is slow.
[Figure: the diagram from “TIP: A Bit about Coordinate Systems” again, relating Kinect depth (0 to 10000~, seems linear) to the OpenGL Z-buffer (0.0 to 1.0, non-linear)]
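The slides do not give the formula that was worked out, so the following is only the textbook relationship, not necessarily the one used in kinect-ultra. It assumes a standard perspective projection (as built by glFrustum or gluPerspective), the default glDepthRange(0, 1), and that the Kinect depth value is the distance along the view axis in the same unit as the near and far planes. The values of `near_mm` and `far_mm` and the function names are hypothetical.

```python
def depth_to_window_z(depth_mm, near_mm=500.0, far_mm=10000.0):
    """Map a linear depth value to a non-linear Z-buffer value in [0, 1].

    For eye-space distance d, a standard perspective matrix yields
    window z = far * (d - near) / (d * (far - near)).
    """
    if depth_mm <= 0:
        return None  # assumption: a non-positive depth means "no reading"
    d = min(max(depth_mm, near_mm), far_mm)  # clamp into the view volume
    return far_mm * (d - near_mm) / (d * (far_mm - near_mm))


def pixel_to_ndc(x, y, depth_mm, width=640, height=480):
    """Map a pixel and its depth straight to OpenGL coordinates (-1..1)."""
    window_z = depth_to_window_z(depth_mm)
    if window_z is None:
        return None
    ndc_x = 2.0 * (x + 0.5) / width - 1.0
    ndc_y = 1.0 - 2.0 * (y + 0.5) / height  # image rows grow downward
    ndc_z = 2.0 * window_z - 1.0
    return ndc_x, ndc_y, ndc_z
```

With these example planes, a depth halfway between near and far already maps to a Z-buffer value above 0.9, which is the non-linearity noted on the slide.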
Hack 4: Hit testing
• Hit-test between lasers (= rectangles in 3D space) and image pixels (= points in 3D space), and convert lasers into sparks
  ◦ Impractical to check the distances between all the objects
  ◦ Instead, divide the real world space into coarse 1-bit voxels, and mark the voxels that contain points
    ◦ No distance calculation; a voxel lookup is enough for hit testing
    ◦ Mark voxels with down-sampled pixels, because marking has to be done in real world coordinates and is therefore slow
  ◦ Maybe inaccurate, but fun!
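A rough sketch of the voxel idea, assuming points are already in real world coordinates (millimeters). A Python set of integer indices plays the role of the 1-bit voxel grid. The voxel size, the down-sampling step, the function names, and the idea of testing a few sample points per laser rectangle are all illustrative assumptions.

```python
import math

VOXEL_SIZE_MM = 100.0  # hypothetical coarse voxel edge length


def voxel_index(point, size=VOXEL_SIZE_MM):
    """Integer voxel coordinates containing a real world point."""
    return tuple(math.floor(c / size) for c in point)


def mark_voxels(points, step=4, size=VOXEL_SIZE_MM):
    """Mark voxels that contain at least one of the down-sampled points."""
    return {voxel_index(p, size) for p in points[::step]}


def laser_hits(occupied, laser_samples, size=VOXEL_SIZE_MM):
    """True if any sample point of the laser falls in a marked voxel."""
    return any(voxel_index(p, size) in occupied for p in laser_samples)
```

Each hit test is a set lookup per sample point, so the cost no longer depends on how many image points there are.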
TIP: How Does Kinect Work in Darkness?
• IR laser depth sensing works even in a dark room
  ◦ http://www.youtube.com/watch?v=nvvQJxgykcU
  ◦ Casts a random dot pattern and analyzes the parallax
[Figure: capture from the above URL]
Hack 5: Light Ball
• Drawing a white circle does not look like a light ball at all…
• Instead, brighten the surroundings according to their distance from the light ball center
  ◦ You feel the dazzling light and heat! (Thanks to human illusion)
• Use an approximation, because calculating the real Euclidean distance for all pixels is slow
  ◦ Calculate a “pseudo” distance in projective coordinates (tweaking the Z value a bit)
  ◦ Trial and error to decide how to modulate brightness by the pseudo distance
• Not 100% scientific or realistic, but good enough and, most importantly, fun!
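One way the pseudo distance could look, as a sketch only: pixel X/Y and raw depth are used as they are, with the depth difference scaled by a fudge factor so that it is comparable to pixel distances. The factor `z_scale`, the falloff curve, and all names are assumptions; the slides say only that the real parameters were found by trial and error.

```python
def pseudo_distance_sq(pixel, ball, z_scale=0.25):
    """Squared 'distance' in projective coordinates: (x, y) in pixels, z in mm.

    z_scale is a hypothetical tweak that makes depth differences comparable
    to pixel differences without converting to real world coordinates.
    """
    dx = pixel[0] - ball[0]
    dy = pixel[1] - ball[1]
    dz = (pixel[2] - ball[2]) * z_scale
    return dx * dx + dy * dy + dz * dz


def brighten(rgb, pixel, ball, strength=2.0, falloff=80.0):
    """Brighten one 8-bit RGB pixel; the gain fades with pseudo distance."""
    gain = 1.0 + strength / (1.0 + pseudo_distance_sq(pixel, ball) / falloff ** 2)
    return tuple(min(255, int(c * gain)) for c in rgb)
```

Pixels near the ball saturate toward white while distant pixels are left almost unchanged, which gives the glow without any square root or coordinate conversion.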
Hack 6: Energy Wave (1)
• Represented by a long-stretched polygon sphere
• Decide transparency by the dot product between the normal of the polygon surface and the sight vector (for a nebular effect)
  ◦ Solid around the center, transparent around the edge
  ◦ Implemented in GLSL (shading language)
    ◦ Although it was my first time working with this language, it was done in about 30 minutes by tweaking sample code from a book
• Add random fluctuation to the normal (for a misty/swirly effect)
  ◦ Accidentally discovered from a bug
Hack 6: Energy Wave (2)
• Simple reflection: the factor acts as brightness
  $\mathrm{rgb} = \mathrm{rgb} \cdot \left( \frac{n \cdot v}{|v|} \right)^{k}$
• After a quick tweak… Nebular effect: the same factor acts as transparency
  $a = \left( \frac{n \cdot v}{|v|} \right)^{k}$
  ◦ n: normal, v: sight vector
• Add random fluctuation to the normal so that the transparency is roughly modulated by position and time. This makes the energy wave look misty or swirly.
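The original effect was written in GLSL; the sketch below only evaluates the same factor on the CPU to show what it does. It assumes a unit-length normal, a sight vector pointing from the surface toward the viewer, and clamping of back-facing values to zero. The Gaussian jitter and its `sigma` are an illustrative guess at the “random fluctuation”, and the function names are made up.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)


def facing_factor(normal, view, k=2.0):
    """(n . v / |v|)^k for a unit normal n, clamped to 0 when facing away."""
    cos_angle = dot(normal, view) / math.sqrt(dot(view, view))
    return max(0.0, cos_angle) ** k


def jitter_normal(normal, rng, sigma=0.2):
    """Perturb a normal with Gaussian noise and renormalize it."""
    noisy = tuple(c + rng.gauss(0.0, sigma) for c in normal)
    return normalize(noisy)
```

Used as alpha, the factor is close to 1 where the surface faces the viewer (the center of the stretched sphere) and falls to 0 at the silhouette; feeding it a jittered normal makes that falloff flicker from point to point.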
Hack 7: Hair!
• Secret formula to model the hair
  ◦ O = center of the head, P = each pixel on the player’s border near and above O
  ◦ Render a narrow triangle from P in the direction of OP with a length of n|OP|, where n is a simple linear saw-wave function of r, and r is the angle of OP against the horizon
• Add some repulsion against the energy ball
• Randomly blend graded yellow (for a “goldish shine” effect)
• Everything is calculated/rendered in 2D on the projective plane
  ◦ Easy and unrealistic, but cartoonish and funny
[Figure: O, P, and the angle r on the player’s border, with a hair triangle of length n|OP|; plot of the saw-wave n against r from 0 to π/2]
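The hair formula can be sketched in 2D as follows. The number of saw teeth, the maximum of n, the triangle’s base width, and the function names are all assumptions; the slides give only the shape of the idea. The sketch ignores the repulsion and color blending, and works with whatever 2D axis orientation the caller uses.

```python
import math


def saw_wave(r, teeth=5, n_max=0.8):
    """Linear saw-wave of the angle r (0..pi/2): ramps from 0 to n_max per tooth."""
    phase = r / (math.pi / 2) * teeth
    return n_max * (phase - math.floor(phase))


def hair_triangle(o, p, half_width=1.5, teeth=5, n_max=0.8):
    """Narrow 2D triangle rooted at border pixel P, pointing away from O."""
    dx, dy = p[0] - o[0], p[1] - o[1]
    length_op = math.hypot(dx, dy)
    if length_op == 0.0:
        return None  # P coincides with O, so there is no direction
    r = math.atan2(abs(dy), abs(dx))  # angle of OP against the horizon
    n = saw_wave(r, teeth, n_max)
    tip = (p[0] + n * dx, p[1] + n * dy)  # tip lies n * |OP| beyond P
    # The base is a short segment through P, perpendicular to OP.
    px, py = -dy / length_op, dx / length_op
    base_a = (p[0] + half_width * px, p[1] + half_width * py)
    base_b = (p[0] - half_width * px, p[1] - half_width * py)
    return base_a, base_b, tip
```

Because n jumps back to zero at the end of each tooth, neighboring border pixels get very different lengths at those angles, which produces distinct spikes rather than a smooth halo.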
Chapter 2: Biz Side
• Got an unexpectedly huge response to the uploaded videos
• Maybe we can read from it some hints for a geek to make a “hit”…
What Did I Actually Intend?
• Absolutely no intention to be “successful”, but I had other clear intentions that may have turned out to be success factors
  ◦ Desire to stand in the same line as other Kinect Hackers
  ◦ Must be differentiated: useless, nonsense, and never seen before
  ◦ Must be done quickly
    ◦ Before real game studios publish their serious work
    ◦ Before someone else (as crazy as myself) shoots lasers
  ◦ Completeness as entertainment
    ◦ First created laser shooting only (in 2 days), then added other features one by one until satisfied with the “completeness”
    ◦ Motivated by “hey, this idea is too good! I can’t finish without it!”
    ◦ Transformation, hidden surface, hit testing, Eye Slugger, timeout, flying out, …
  ◦ Targeted at a worldwide audience
    ◦ Created videos in both Japanese and English, and uploaded them to both YouTube and Nicovideo (a Japanese video site)
    ◦ Creating for only one community would mean not welcoming the other
Examples of unexpected feedback
• It’s for kids!
  ◦ “My kid keeps the PC and never leaves.”
  ◦ “When my kids and I play heroes and bad guys, they identify themselves with the heroes in their mind. If they can actually become the heroes out of their imagination, it will be wonderful.”
• It makes my dream come true!
  ◦ “I wanted to do this since I was a kid.”
  ◦ “The kid’s part of me says ‘Look! He transforms! I wanna do it!’ and drowns out my adult’s words.”
• Me: “I did not mean it at all. I just tried to be silly and funny. But it is definitely a pleasure to see people get excited about the future of the technology demonstrated by this.”
Video View Analysis of kinect-ultra
• Exploded within 24 hours and reached 300k views in a week
  ◦ More discussion on the next page
• Japan heats up and cools down very quickly, while the rest of the world seems a little slower
• Forgotten while nothing happens, and remembered at occasional events
[Figure: two charts, “Total Views” and “Views/day in first two weeks”, with series Nico (ja), YT (ja), and YT (en); annotations “Explosion” and “Nicovideo-award nominee”]
Hypothesis of explosion mechanism
• Interesting to think about how the views could grow so largely and rapidly
• Hypothesis: multistage explosive chain reaction among video, tweets, and news sites(*)
  ◦ Stage 1
    ◦ Maniac communities first notice the video, and start tweeting (~10h)
    ◦ Views and tweets increase slowly
  ◦ Stage 2
    ◦ Number of tweets passes some threshold (~20h)
    ◦ News sites notice it and post articles (independent blog sites first and then major news sites such as Yahoo! News)
    ◦ Views and tweets increase rapidly through a positive feedback effect
  ◦ Stage 3
    ◦ Number of views passes some threshold and the video ranks among the most popular videos
    ◦ Feedback effect accelerates even more
  ◦ Cool down
    ◦ Tweets cool down and the feedback effect gradually stops (48h~)
• Is it possible to make it happen intentionally?
  ◦ Not sure, probably very difficult
(*) My colleague tracked the public activity and came up with this hypothesis. Great job by him.
Video View Analysis of kinect-kamehameha
• No explosion
  ◦ Got many views at first on Nicovideo (more than kinect-ultra, in fact), but did not ignite an explosion
  ◦ Probably insufficient impact to make viewers tweet and pass the threshold
• Sustained popularity, more from the rest of the world than from Japan
  ◦ From DBZ fans around the world? Most views come from Brazil
  ◦ Sporadic jumps: I don’t know what is happening
[Figure: two charts, “Total Views” and “Views/day in first two weeks”, with series Nico (ja), YT (ja), and YT (en)]
What else happened (1)
• Appeared in the media
  ◦ Blogs, news, and tech review sites
  ◦ Newspapers and magazines (e.g. Japan Times)
  ◦ TV shows (e.g. NHK BS1/2 in Japan)
  ◦ Netcasts (in Japan and France)
  ◦ For more information:
    ◦ http://code.google.com/p/kinect-ultra/wiki/Articles
    ◦ http://code.google.com/p/kinect-kamehameha/wiki/Articles
• Public demos and presentations
  ◦ 3D Vision & Kinect Hacking Meetup
  ◦ JTPA Geek Saloon
  ◦ Maker Faire (Thanks to Matt Bell for involving me)
  ◦ Campus Party (Did not make it, though)
What else happened (2)
• Won and was nominated for awards
  ◦ Matt Cutts’s Kinect Contest Winner
  ◦ Maker Faire 2011 Bay Area Editor’s Choice Winner
  ◦ Nicovideo Award 2011 Spring Nominee
• Other interesting contacts from
  ◦ Other hackers, of course!
  ◦ Investors
  ◦ Artists (who wanted to use the video in their artwork)
  ◦ 3D modelers (who kindly contributed the Eye Slugger model)