Kinect v1+Processing workshop fabcafe_taipei

This is a basic introduction to Kinect v1 and Processing from 2014. Note that some of the practice code is not included in these slides; they only cover the concepts, to help you understand how to use Processing to play with the Kinect.

Published in: Devices & Hardware

Kinect v1+Processing workshop fabcafe_taipei

  1. Space x Technology x Art (空間 x 科技 x 藝術)
  2. 吳冠穎 (MAO) — founder, 木天寮互動設計有限公司 (Mutienliao Interactive Design Co., Ltd.); Ph.D. student, Graduate Institute of Architecture, National Chiao Tung University; mao@mutienliao.com
  3. Make Things See — Kinect + Processing Workshop. MAO WU, FabCafe Taipei, 2014-12-30
  4. Meet Kinect
     Introduction > Meet the Kinect for Windows Sensor and SDK: the Kinect for Windows sensor and SDK provide the ears and eyes of your application, so you will want to keep their capabilities in mind as you design your application.
     Hardware: RGB camera, infrared projector and sensor, motorised tilt, microphone array.
     "…intended for non-commercial use to enable experimentation in the world of natural user interface experiences"
  5. What Kinect sees
     Kinect for Windows is versatile, and can see people holistically, not just smaller hand gestures. Six people can be tracked, including two whole skeletons. The sensor has an RGB (red-green-blue) camera for color video, and an infrared emitter and camera that measure depth. The measurements for depth are returned in millimeters. The Kinect for Windows sensor enables a wide variety of interactions, but any sensor has "sweet spots" and limitations. With this in mind, we defined its focus and limits as follows:
     • Physical limits – the actual capabilities of the sensor and what it can see.
     • Sweet spots – areas where people experience optimal interactions, given that they'll often have a large range of movement and need to be tracked with their arms or legs extended.
     Near mode depth ranges: physical limits 0.4 m / 1.3 ft to 3 m / 9.8 ft; sweet spot 0.8 m / 2.6 ft to 2.5 m / 8.2 ft.
     Default mode depth ranges: physical limits 0.8 m / 2.6 ft to 4 m / 13.1 ft; sweet spot 1.2 m / 4 ft to 3.5 m / 11.5 ft. Extended depth (beyond 4 m) can also be retrieved, but skeleton and player tracking get noisier the further away you get, and therefore may be unreliable.
     Angle of vision (depth and RGB): horizontal 57.5 degrees; vertical 43.5 degrees, with a -27 to +27 degree tilt range up and down.
     Note that Near mode is an actual setting for Kinect for Windows, and is different from the various ranges detailed in Interaction Ranges later in that document.
     (Source: Kinect for Windows Human Interface Guidelines v1.8, Introduction > Meet the Kinect for Windows Sensor and SDK)
  6. Depth Sensor Comparison
     Both Microsoft Kinect and ASUS Xtion (Live) / PrimeSense Carmine sensors are based on the same PrimeSense infrared technology, so all basic characteristics critical for full-body motion capture are generally the same. But there are certain differences that you can take into account (http://wiki.ipisoft.com/Depth_Sensors_Comparison):
     Microsoft Kinect
     ▪ Pros: high quality of device drivers; stable work with various hardware models; has a motor that can be controlled.
     ▪ Cons: bigger size (12" x 3" x 2.5" against 7" x 2" x 1.5"); higher weight (3.0 lb against 0.5 lb); requires an AC/DC power supply; lower RGB image quality in comparison with the ASUS Xtion / Carmine.
     ASUS Xtion / PrimeSense Carmine
     ▪ Pros: more compact (7" x 2" x 1.5" against 12" x 3" x 2.5"); lighter weight (0.5 lb against 3.0 lb); does not require a power supply other than USB; better RGB image quality.
     ▪ Cons: less popular device; lower driver quality; does not work with some USB controllers (especially USB 3.0); no motor, allowing only manual positioning.
     Devices pictured: MS Kinect for Windows, ASUS Xtion Live, ASUS Xtion, PrimeSense Carmine 1.08.
     • ASUS Xtion Live or PrimeSense Carmine is recommended because it includes a color sensor as well. The color image is currently not used for tracking, but eventually will be. It also helps to operate the system.
  7. Simple-OpenNI — OpenNI library for Processing
     https://code.google.com/p/simple-openni/wiki/Installation
     Processing:
     • Open your Processing (2.0 or later)
     • Go to the menu: Sketch -> Import Library… -> Add Library…
     • Select and install SimpleOpenNI
     Windows — the Kinect SDK must be installed:
     • Download the Kinect SDK
     • Run the Kinect SDK installer
     If everything worked out, you should see the plugged-in camera in your Device Manager (under 'Kinect for Windows'). If you get an error when you start up a Processing sketch with SimpleOpenNI, try installing the Runtime Libraries from Microsoft. Then install SimpleOpenNI.
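     To check that the install works end to end, a minimal sketch along these lines can be run (not from the slides; it assumes the SimpleOpenNI 1.96-style API, whose isInit() call appears in the library's own examples):

     import SimpleOpenNI.*;

     void setup() {
       // try to connect to the sensor through the freshly installed driver stack
       SimpleOpenNI context = new SimpleOpenNI(this);
       if (context.isInit()) {
         println("SimpleOpenNI is installed and the camera was found.");
       } else {
         println("Can't initialise SimpleOpenNI - check the driver / SDK installation.");
       }
       exit();
     }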
  8. Depth Image Interaction — In this section, you will learn how to play with the pixels of the depth image and implement some simple interactive code.
  9. Simple-OpenNI reference — Context
     The context is the top-level object that encapsulates all the camera and image functionality. The context is typically declared globally and instantiated within setup(). There is an optional flag argument for forcing single- or multi-threading, but in our experience we haven't found a difference between the two.
     SimpleOpenNI context = new SimpleOpenNI(this)
     SimpleOpenNI context = new SimpleOpenNI(this, SimpleOpenNI.RUN_MODE_SINGLE_THREADED)
     SimpleOpenNI context = new SimpleOpenNI(this, SimpleOpenNI.RUN_MODE_MULTI_THREADED)
     For each frame in Processing, the context needs to be updated with the most recent data from the Kinect:
     context.update()
     The image drawn by the context defaults to showing the world from its point of view, so when facing the Kinect and looking at the resulting image, your movements are not mirrored. It is easy to change this configuration so that the Kinect image acts as a mirror; the setMirror() method controls it. The first line below turns mirroring on and the second turns it off.
     context.setMirror(true)
     context.setMirror(false)
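     As a small illustration (an assumed usage pattern built only from the calls listed above), the run-mode flag and the mirror setting normally go right where the context is created in setup():

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       // the optional second argument forces single- or multi-threaded capture
       context = new SimpleOpenNI(this, SimpleOpenNI.RUN_MODE_MULTI_THREADED);
       context.setMirror(true);     // flip the image so it behaves like a mirror
       context.enableDepth();
       size(context.depthWidth(), context.depthHeight());
     }

     void draw() {
       context.update();            // pull the newest frame before drawing anything
       image(context.depthImage(), 0, 0);
     }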
  10. Simple-OpenNI reference — Image
     The different cameras provide different data and functionality.
     RGB: the RGB camera is the simplest camera and does no more than a standard webcam. Note that it cannot be used while the IR (not depth) image is enabled. It first needs to be enabled within setup():
     context.enableRGB()
     To create a window the same size as the RGB camera image, use rgbWidth() and rgbHeight(). The context needs to be instantiated and RGB enabled before these methods can be called:
     size(context.rgbWidth(), context.rgbHeight())
     To draw what the RGB camera sees, the current frame is drawn with an image() call in draw():
     image(context.rgbImage(), 0, 0)
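     A minimal sketch tying these calls together (an assumption, not from the slides; it just shows the colour stream):

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       context = new SimpleOpenNI(this);
       context.enableRGB();                               // colour camera only; leave IR off
       size(context.rgbWidth(), context.rgbHeight());     // typically 640 x 480
     }

     void draw() {
       context.update();
       image(context.rgbImage(), 0, 0);                   // draw the current colour frame
     }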
  11. Simple-OpenNI reference — Depth
     The depth image is calculated from the IR camera and the pseudorandom array of IR points projected onto the scene. It first needs to be enabled within setup():
     context.enableDepth()
     To create a window the same size as the depth image, use depthWidth() and depthHeight(). The context needs to be instantiated and depth enabled before these methods can be called:
     size(context.depthWidth(), context.depthHeight())
     To draw a grayscale image of the depth values, the current frame is drawn with an image() call in draw():
     image(context.depthImage(), 0, 0)
     The default colour of the drawn depth image is gray, but the colour can be changed. For instance, the code below shades the image in blue instead of gray:
     context.setDepthImageColor(100, 150, 200)
     Like Processing, there are two colour modes for the depth image. The default is RGB, but it can be switched to HSB:
     context.setDepthImageColorMode(0) // for RGB
     context.setDepthImageColorMode(1) // for HSB
     An array containing all of the distances in millimetres can be requested with depthMap():
     int[] dmap = context.depthMap()
     The size of the depth map can also be requested:
     int dsize = context.depthMapSize()
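     A small sketch built from the calls above (an assumption, not slide code) that draws the depth image and prints the distance, in millimetres, of the pixel under the mouse:

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       context = new SimpleOpenNI(this);
       context.enableDepth();
       context.setDepthImageColor(100, 150, 200);     // tint the depth image blue
       size(context.depthWidth(), context.depthHeight());
     }

     void draw() {
       context.update();
       image(context.depthImage(), 0, 0);

       // look up the raw distance of the depth pixel under the mouse
       int[] dmap = context.depthMap();
       int i = mouseX + mouseY * context.depthWidth();
       if (i >= 0 && i < context.depthMapSize()) {
         println("distance at mouse: " + dmap[i] + " mm");
       }
     }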
  12. Simple-OpenNI reference — IR
     The IR image is what the IR camera sees. It cannot be enabled while the RGB image is also enabled, or the RGB image will not appear. It first needs to be enabled within setup():
     context.enableIR()
     To create a window the same size as the IR camera image, use irWidth() and irHeight(). The context needs to be instantiated and IR enabled before these methods can be called:
     size(context.irWidth(), context.irHeight())
     To draw what the IR camera sees, the current frame is drawn with an image() call in draw():
     image(context.irImage(), 0, 0)
     The timestamp returns the number of frames that have passed since the IR stream was enabled:
     context.depthMapTimeStamp()
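     For completeness, a minimal IR viewer along the same lines (an assumption; remember not to enable RGB at the same time):

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       context = new SimpleOpenNI(this);
       context.enableIR();                            // IR and RGB cannot be on together
       size(context.irWidth(), context.irHeight());
     }

     void draw() {
       context.update();
       image(context.irImage(), 0, 0);                // raw view from the IR camera
     }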
  13. …that will contain all the values of the array, and the int that comes before it is a label telling us that everything that goes in this box must be an integer. So, we have an array of integers. How can this box full of numbers store the same kind of information we've so far seen in the pixels of an image? The Kinect is, after all, a camera. The data that comes from it is two-dimensional, representing all the depth values in its rectangular field of view, whereas an array is one-dimensional: it can only store a single stack of numbers. How do you represent an image as a box full of numbers? Here's how. Start with the pixel in the top-leftmost corner of the image. Put it in the box. Then, moving to the right along the top row of pixels, put each pixel into the box on top of the previous ones. When you get to the end of the row, jump back to the left side of the image, move down one row, and repeat the procedure, continuing to stick the pixels from the second row on top of the ever-growing stack you began in the first row. Continue this procedure for each row of pixels in the image until you reach the very last pixel in the bottom right. Now, instead of a rectangular image, you'll have a single stack of pixels: a one-dimensional array. All the pixels from each row will be stacked together, and the last pixel from each row will be right in front of the first pixel from the next row, as Figure 2-12 shows.
     [Figure 2-12. Pixels in a two-dimensional image get stored as a flat array. Understanding how to split this array back into rows is key to processing images.]
     [Figure 2-14. Our red circle following my outstretched fist.]
     Next, let's look at our two for loops. We know from our pseudocode that we want to go through every row in the image, and within every row we want to look at every point in that row. How did we translate that into code? What we've got here is two for loops, one inside the other. The outer one increments a variable y from 0 up to 479. We know that the depth image from the Kinect is 480 pixels tall; in other words, it consists of 480 rows of pixels. This outer loop will run once for each one of those rows, setting y to the number of the current row (starting at 0). The next line kicks off a for loop that does almost the same thing, but with a different variable, x, and a different constraint, 640. This inner loop will run once per row. We want it to cover every pixel in the row. Since the depth image from the Kinect is 640 pixels wide, we know that it'll have to run 640 times in order to do so. The code inside of this inner loop, then, will run once per pixel in the image.
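     The classic worked example of this indexing (a sketch in the spirit of the red-circle figure above; details such as the 8000 mm starting value are assumptions, not from the slides) walks every row and every column, flattens (x, y) into the array index, and circles the closest point it finds:

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       size(640, 480);
       context = new SimpleOpenNI(this);
       context.enableDepth();
     }

     void draw() {
       context.update();
       image(context.depthImage(), 0, 0);

       int[] depthValues = context.depthMap();    // one int per pixel, in millimetres
       int closestValue = 8000;                   // anything nearer than 8 m will beat this
       int closestX = 0;
       int closestY = 0;

       // outer loop: one pass per row (y); inner loop: one pass per pixel in that row (x)
       for (int y = 0; y < 480; y++) {
         for (int x = 0; x < 640; x++) {
           int i = x + y * 640;                   // flatten (x, y) into the 1-D array index
           int d = depthValues[i];
           if (d > 0 && d < closestValue) {       // 0 means "no reading", so skip it
             closestValue = d;
             closestX = x;
             closestY = y;
           }
         }
       }

       fill(255, 0, 0);
       ellipse(closestX, closestY, 25, 25);       // red circle on the closest point
     }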
  14. Gesture & Skeleton — In this section, you will learn how to play with the default gesture controls and do something with the skeleton points.
  15. GESTURE_WAVE, GESTURE_HAND_RAISE, GESTURE_CLICK
     Gesture > Gesture Interaction Design — design for variability of input: people's previous experience and expectations affect how they interact with your application. Keep in mind that one person might not perform a gesture the same way as someone else.
     Gesture interpretation: simply "asking users to wave" doesn't guarantee the same motion. They might wave: from their wrist; from their elbow; with their whole arm; with an open hand moving from left to right; or by moving their fingers up and down together.
     Basics: in this document we use the term gesture broadly to mean any form of movement that can be used as an input or interaction to control or influence an application. Gestures can take many forms, from simply using your hand to target something on the screen, to specific, learned patterns of movement, to long stretches of continuous movement using the whole body. Gesture is an exciting input method to explore, but it also presents some intriguing challenges. Following are a few examples of commonly used gesture types.
     (Source: Kinect for Windows Human Interface Guidelines v1.8, Gesture > Basics)
  16. Simple-OpenNI reference — Hand
     To start capturing hand gestures, we need to enable hand tracking within setup():
     context.enableHand()
     Then choose which gestures we need:
     context.startGesture(SimpleOpenNI.GESTURE_CLICK)
     context.startGesture(SimpleOpenNI.GESTURE_WAVE)
     context.startGesture(SimpleOpenNI.GESTURE_HAND_RAISE)
     Note: any skeleton data from SimpleOpenNI needs to be converted from real-world coordinates to projective (screen) coordinates:
     context.convertRealWorldToProjective(realworld_pos, converted_pos)
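     A hand-following sketch built on these calls might look like the following. This is an assumption rather than slide code: the callback names (onCompletedGesture, onNewHand, onTrackedHand, onLostHand) and startTrackingHand() follow the SimpleOpenNI 1.96 Hands example and differ in older library versions.

     import SimpleOpenNI.*;

     SimpleOpenNI context;
     PVector screenPos = new PVector();   // tracked hand position in image coordinates
     boolean handVisible = false;

     void setup() {
       size(640, 480);
       context = new SimpleOpenNI(this);
       context.enableDepth();
       context.enableHand();                                  // switch on hand tracking
       context.startGesture(SimpleOpenNI.GESTURE_WAVE);       // wave at the sensor to begin
     }

     void draw() {
       context.update();
       image(context.depthImage(), 0, 0);
       if (handVisible) {
         fill(255, 0, 0);
         ellipse(screenPos.x, screenPos.y, 20, 20);           // follow the tracked hand
       }
     }

     void onCompletedGesture(SimpleOpenNI curContext, int gestureType, PVector pos) {
       curContext.startTrackingHand(pos);                     // promote the gesture to a tracked hand
     }

     void onNewHand(SimpleOpenNI curContext, int handId, PVector pos) {
       handVisible = true;
     }

     void onTrackedHand(SimpleOpenNI curContext, int handId, PVector pos) {
       // hand positions arrive in real-world millimetres; convert them for drawing
       curContext.convertRealWorldToProjective(pos, screenPos);
     }

     void onLostHand(SimpleOpenNI curContext, int handId) {
       handVisible = false;
     }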
  17. Skeleton Tracking — This tutorial will explain how to track human skeletons using the Kinect. The OpenNI library can identify the position of key joints on the human body, such as the hands, elbows, knees, head, and so on. These points form a representation we call the 'skeleton'.
  18. Simple-OpenNI reference — User
     To start capturing user information, we need to enable depth and user tracking within setup():
     context.enableUser()
     Get the list of current users:
     context.getUsers();
     Check whether a user's skeleton is being tracked:
     context.isTrackingSkeleton(userid)
     Get a user's centre of mass:
     context.getCoM(userid, center_pos)
     Detecting new users and losing users:
     // when a person ('user') enters the field of view
     void onNewUser(int userId) {
       println("New User Detected - userId: " + userId);
       // start pose detection
       context.startPoseDetection("Psi", userId);
     }
     // when a person ('user') leaves the field of view
     void onLostUser(int userId) {
       println("User Lost - userId: " + userId);
     }
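     A centre-of-mass sketch built from these calls (an assumption; it presumes the SimpleOpenNI 1.96 behaviour where getUsers() returns an int[] and getCoM() returns a boolean — older builds differ, including in the callback signatures shown above):

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       size(640, 480);
       context = new SimpleOpenNI(this);
       context.enableDepth();
       context.enableUser();                      // needed before any user data is available
     }

     void draw() {
       context.update();
       image(context.depthImage(), 0, 0);

       int[] users = context.getUsers();          // ids of everyone currently detected
       for (int i = 0; i < users.length; i++) {
         PVector com = new PVector();             // centre of mass, in real-world mm
         PVector comScreen = new PVector();
         if (context.getCoM(users[i], com)) {
           context.convertRealWorldToProjective(com, comScreen);
           fill(0, 255, 0);
           ellipse(comScreen.x, comScreen.y, 20, 20);
           text("user " + users[i], comScreen.x + 15, comScreen.y);
         }
       }
     }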
  19. Simple-OpenNI reference — Drawing the Skeleton
     Now we will use our drawSkeleton() function to draw lines between joints. Each joint has an identifier (just a reference to a simple integer), and there are 15 joints in all. They are:
     SimpleOpenNI.SKEL_HEAD, SimpleOpenNI.SKEL_NECK,
     SimpleOpenNI.SKEL_LEFT_SHOULDER, SimpleOpenNI.SKEL_LEFT_ELBOW, SimpleOpenNI.SKEL_LEFT_HAND,
     SimpleOpenNI.SKEL_RIGHT_SHOULDER, SimpleOpenNI.SKEL_RIGHT_ELBOW, SimpleOpenNI.SKEL_RIGHT_HAND,
     SimpleOpenNI.SKEL_TORSO,
     SimpleOpenNI.SKEL_LEFT_HIP, SimpleOpenNI.SKEL_LEFT_KNEE, SimpleOpenNI.SKEL_LEFT_FOOT,
     SimpleOpenNI.SKEL_RIGHT_HIP, SimpleOpenNI.SKEL_RIGHT_KNEE, SimpleOpenNI.SKEL_RIGHT_FOOT
     Draw a line (limb) between two joints:
     context.drawLimb(userId, SimpleOpenNI.SKEL_HEAD, SimpleOpenNI.SKEL_NECK);
     Get a joint's skeleton position:
     context.getJointPositionSkeleton(userId, SimpleOpenNI.SKEL_LEFT_HAND, pos);
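     Putting the pieces together, a skeleton-drawing sketch might look like this. It is an assumption, not slide code: drawSkeleton() here only draws a few of the 15 limbs, and the onNewUser()/startTrackingSkeleton() callback follows SimpleOpenNI 1.96, whereas older versions start tracking via pose detection as shown on the previous slide.

     import SimpleOpenNI.*;

     SimpleOpenNI context;

     void setup() {
       size(640, 480);
       context = new SimpleOpenNI(this);
       context.enableDepth();
       context.enableUser();
     }

     void draw() {
       context.update();
       image(context.depthImage(), 0, 0);

       int[] users = context.getUsers();
       for (int i = 0; i < users.length; i++) {
         if (context.isTrackingSkeleton(users[i])) {
           drawSkeleton(users[i]);
         }
       }
     }

     void drawSkeleton(int userId) {
       // draw a few of the limbs; extend with the remaining joint pairs as needed
       context.drawLimb(userId, SimpleOpenNI.SKEL_HEAD, SimpleOpenNI.SKEL_NECK);
       context.drawLimb(userId, SimpleOpenNI.SKEL_NECK, SimpleOpenNI.SKEL_LEFT_SHOULDER);
       context.drawLimb(userId, SimpleOpenNI.SKEL_LEFT_SHOULDER, SimpleOpenNI.SKEL_LEFT_ELBOW);
       context.drawLimb(userId, SimpleOpenNI.SKEL_LEFT_ELBOW, SimpleOpenNI.SKEL_LEFT_HAND);

       // mark the left hand, converting from real-world mm to image pixels
       PVector hand = new PVector();
       PVector handScreen = new PVector();
       context.getJointPositionSkeleton(userId, SimpleOpenNI.SKEL_LEFT_HAND, hand);
       context.convertRealWorldToProjective(hand, handScreen);
       fill(255, 0, 0);
       ellipse(handScreen.x, handScreen.y, 20, 20);
     }

     // SimpleOpenNI 1.96-style callback: start skeleton tracking as soon as a user appears
     void onNewUser(SimpleOpenNI curContext, int userId) {
       curContext.startTrackingSkeleton(userId);
     }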
