State of the art (Part I) – Object recognition with Kinect

26 04 2011

We have identified the following tasks in order to accomplish our goals:

  • obtain a 3D-model of the objects by working with the Kinect sensor
  • project images over those objects, considering their shapes
  • build the robotic arm that will hold the Kinect sensor as well as the projector

In this post we focus on the state of the art for the first point: object recognition using the Kinect. The aim is to scan real objects with a Kinect sensor.
The device contains two cameras (IR and RGB), an IR projector, and four microphones.

Although Microsoft has not made any official releases yet, many people are working on drivers and libraries for today's popular operating systems. We have investigated and tested some of those Kinect drivers on different operating systems (mainly Windows and Linux).

We’ve focused on the following:


OpenNI

Open source API that enables communication with vision and audio sensors, as well as with vision and audio perception middleware. It is developed by the same people who developed the technology behind the Kinect.

Advantages:

  • It has a neat architecture of extensible modules for detecting gestures.
  • It is not restricted to the Kinect.
  • It has a big community around it.

Disadvantages:

  • The main disadvantage is that it only works on Windows and Linux, not on MacOS, which is a requirement for our project, so we will not be able to use it.

For further information, please refer to the OpenNI page.


libfreenect

Open source library for accessing the Kinect RGB and depth images on Windows, Linux and MacOS. One advantage of libfreenect is that it can be combined with ofxKinect, a free openFrameworks addon/wrapper for the library. The ofxKinect wrapper works on Linux, MacOS and Windows (in its development branch).


Advantages:

  • Multi-platform.
  • Easy to install.
  • An addon for openFrameworks exists (ofxKinect).

Disadvantages:

  • Only basic operations.
  • Development is at an early stage (there are no operations for the microphones yet).

The operations that ofxKinect provides are:

  • Adjust the tilt angle of the Kinect.
  • Get acceleration information.
  • Get the distance of a certain pixel.
  • Get the color of a certain pixel.
  • Transform a screen-coordinate to a real-world-coordinate.
  • Get the RGB texture.
  • Get the grayscale depth texture.
  • Draw the RGB output.
  • Draw the depth output (as grayscale video).
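To make two of these operations more concrete, here is a minimal sketch of how a depth pixel could be turned into a real-world coordinate and into a grayscale value. It is not ofxKinect's actual code: it assumes a simple pinhole camera model, and the names `Intrinsics`, `pixel_to_world` and `depth_to_gray`, the focal length, the principal point and the near/far range are all illustrative assumptions rather than calibrated values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole parameters of the depth camera (assumed, not calibrated)."""
    fx: float = 580.0  # focal length in pixels along x (assumption)
    fy: float = 580.0  # focal length in pixels along y (assumption)
    cx: float = 320.0  # principal point x for a 640x480 image (assumption)
    cy: float = 240.0  # principal point y for a 640x480 image (assumption)


def pixel_to_world(u, v, depth_mm, k=Intrinsics()):
    """Back-project pixel (u, v) with a depth in mm to camera-space (x, y, z) in mm."""
    x = (u - k.cx) * depth_mm / k.fx
    y = (v - k.cy) * depth_mm / k.fy
    return (x, y, depth_mm)


def depth_to_gray(depth_mm, near_mm=500.0, far_mm=4000.0):
    """Map a distance to an 8-bit grey level: near is bright (255), far is dark (0)."""
    if depth_mm <= 0:  # treat non-positive values as "no reading"
        return 0
    clamped = min(max(depth_mm, near_mm), far_mm)
    t = (far_mm - clamped) / (far_mm - near_mm)
    return int(round(t * 255))


if __name__ == "__main__":
    # A pixel at the image centre lies on the optical axis.
    print(pixel_to_world(320, 240, 1000))
    # Closer surfaces come out brighter in the grayscale depth image.
    print(depth_to_gray(800), depth_to_gray(3000))
```

With real hardware the intrinsics would come from a calibration step, and the raw sensor values would first have to be converted to millimetres.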

More information on this library can be found at the OpenKinect community website.

We found some interesting projects that are already doing things similar to what we are trying to do.


Intel has a project named Oasis (Object Aware Situated Interaction System), which recognizes objects in front of the camera and augments them by projecting information onto them.

Here’s a video of what they have achieved:

They have published several papers about this project, which we will analyze in forthcoming posts on this blog.

Skeleton Tracking with line projection

Video report of an Augmented Dance session test.

The Kinect sensor is used to track body movement. More information can be found here.

Augmented reality fireballs

Street Fighter fireball effect with a virtual dynamic light source. For more information, go to the Kimchi and Chips blog.

Interactive Projection Mapping | Kinect Hack

Using the depth map provided by the Kinect, this project uses the hand's position as the light source for the physical model. Developed in C++, openFrameworks, and OpenNI.
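As a rough illustration of the idea, the sketch below treats the tracked hand as a point light and computes simple diffuse (Lambertian) shading for a point on the model. This is only an assumption about how such an effect could work, not the project's actual implementation; the function name `lambert_intensity` and the coordinates used are hypothetical.

```python
import math


def lambert_intensity(point, normal, light_pos):
    """Diffuse intensity in [0, 1] at `point` for a point light at `light_pos`."""
    # Vector from the surface point towards the light (here: the hand).
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist = math.sqrt(sum(c * c for c in to_light))
    n_len = math.sqrt(sum(c * c for c in normal))
    if dist == 0.0 or n_len == 0.0:
        return 0.0
    # Cosine of the angle between the surface normal and the light direction.
    cos_theta = sum(t * n for t, n in zip(to_light, normal)) / (dist * n_len)
    # Surfaces facing away from the light receive no diffuse light.
    return max(0.0, cos_theta)


if __name__ == "__main__":
    hand = (0.0, 0.0, 2.0)  # hypothetical hand position from the depth map
    print(lambert_intensity((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), hand))
```

The resulting intensity could then be used to brighten or darken the image projected onto each part of the physical model as the hand moves.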



