 

Zenithal 3D tracking with Kinect and OpenCV

We finally got the code for the “detection room” installation working. There were a few problems to sort out before we reached an acceptable accuracy.

 

Grayscale data vs raw data

The first problem we encountered was that the point cloud showed heavy stepping. Using the 8-bit grayscale depth buffer made the stepping even worse, because of the downscale conversion from the 11-bit raw depth to the 8-bit grayscale image. The 11-bit image has a range from 0 to 2047, but after the disparity calculation the usable values go approximately from 500 to 1000, which converted to meters correspond to the range limits of the Kinect sensor. So there is a loss of accuracy when converting from the roughly 500 (1000-500) valid values to the 256 possible values of an 8-bit grayscale image.
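To make the loss concrete, here is a minimal sketch of that downscale. It assumes a plain linear mapping of the usable raw range onto 0..255; the function name `quantize_to_8bit` and the exact mapping are illustrative assumptions, not necessarily what the Kinect driver does.

```python
def quantize_to_8bit(raw, raw_min=500, raw_max=1000):
    """Linearly map a raw depth reading onto 0..255.

    Assumption: a simple linear mapping over the usable raw range;
    the real grayscale buffer may be computed differently.
    """
    clamped = max(raw_min, min(raw_max, raw))
    return round((clamped - raw_min) / (raw_max - raw_min) * 255)


# Every usable raw reading, and the gray levels they collapse into.
raw_values = range(500, 1001)
gray_levels = {quantize_to_8bit(raw) for raw in raw_values}

print(len(raw_values), "raw values ->", len(gray_levels), "gray levels")
```

Roughly every two neighbouring raw readings end up sharing one gray level, which is why the steps in the point cloud get coarser.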

The stepping improved slightly when we switched to a float image, assigning to each pixel the real-world depth in meters.
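As a rough illustration of the float image, the sketch below converts raw 11-bit readings to meters. The constants are a widely cited approximation from the OpenKinect community and are an assumption here; the installation may have used a different calibration. `raw_to_meters` and `raw_frame` are hypothetical names.

```python
def raw_to_meters(raw):
    """Approximate depth in meters for a raw 11-bit Kinect reading.

    Assumption: the commonly cited OpenKinect approximation
    depth = 1 / (raw * -0.0030711016 + 3.3309495161).
    Returns None when the reading is invalid or out of range.
    """
    if raw >= 2047:  # 2047 marks "no reading" in the raw buffer
        return None
    denominator = raw * -0.0030711016 + 3.3309495161
    if denominator <= 0:  # beyond the range where the formula is meaningful
        return None
    return 1.0 / denominator


# A tiny stand-in for a raw depth frame (rows of raw readings).
raw_frame = [[500, 750, 1000], [600, 2047, 900]]

# Float image: one real-world depth per pixel, None where there is no data.
depth_image = [[raw_to_meters(raw) for raw in row] for row in raw_frame]

for row in depth_image:
    print(row)
```

With this approximation, raw values of 500 and 1000 land at roughly 0.56 m and 3.85 m, which matches the usable range mentioned above.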

 

 

Figure 3D point cloud stepping


 

Angled position of the sensor

Another problem was that we couldn’t attach the sensor to the ceiling (which is about 5 meters high), and even if we could, the sensor noise at that distance makes the data useless. Therefore, we decided to attach the sensor to a wall, 3 m from the floor, angled to take advantage of the vertical field of view.

Figure position of the sensor


This setup forced us to apply a 3D rotation and translation to the point cloud, so that we could detect the highest point of each blob relative to the floor rather than the point nearest to the sensor.
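A minimal sketch of such a transform, under stated assumptions: the sensor frame has x to the right, y up and z along the optical axis, the sensor is pitched downward by a tilt angle, and the floor is the plane y = 0 in the world frame. The 3 m height comes from the setup above; the 45 degree tilt and the name `sensor_to_world` are illustrative only.

```python
import math


def sensor_to_world(point, tilt_deg, sensor_height_m):
    """Rotate a sensor-frame point about the x axis and lift it by the
    sensor height, so that y becomes the height above the floor.

    Assumptions: sensor frame is x right, y up, z forward; the sensor
    is pitched downward by tilt_deg; the floor is y = 0.
    """
    x, y, z = point
    tilt = math.radians(tilt_deg)
    world_x = x
    world_y = y * math.cos(tilt) - z * math.sin(tilt) + sensor_height_m
    world_z = y * math.sin(tilt) + z * math.cos(tilt)
    return (world_x, world_y, world_z)


# A point on the optical axis, exactly where the axis meets the floor
# for a sensor mounted at 3 m and tilted 45 degrees (hypothetical tilt).
distance_to_floor = 3.0 / math.sin(math.radians(45.0))
floor_point = sensor_to_world((0.0, 0.0, distance_to_floor), 45.0, 3.0)
print(floor_point)
```

After the transform, comparing the y values of the points in a blob gives the highest point relative to the floor, regardless of how far each point is from the sensor.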

The steps to detect the highest point relative to the floor are these:


Figure Depth buffer blobs highest point detection

Occlusion

The detection of the highest point of each blob was successful. But we encountered a problem: if two people stood too close together, so that from the camera’s point of view they formed one single blob, only one highest point was detected for the pair, instead of one for each person.

 


Figure Occlusion problem

To solve this issue, we generated a completely new zenithal image by projecting the rotated 3D point cloud onto the floor plane. The detection was then done on the image as seen from above, where two people next to each other appear as two different blobs.
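The idea can be sketched as follows, assuming the points are already in floor-aligned coordinates (x across the room, y height above the floor, z distance from the wall). The grid size, cell size and height threshold are made-up values, and a simple flood fill stands in for the OpenCV blob detection used in the installation.

```python
import math


def zenithal_height_map(points, cell_m=0.1, cols=40, rows=40):
    """Project (x, y, z) points onto a top-down grid, keeping the
    greatest height y seen in each cell (x = 0 is the grid centre)."""
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, z in points:
        col = math.floor(x / cell_m) + cols // 2
        row = math.floor(z / cell_m)
        if 0 <= row < rows and 0 <= col < cols and y > grid[row][col]:
            grid[row][col] = y
    return grid


def highest_point_per_blob(grid, min_height_m=1.0):
    """Group 4-connected cells above min_height_m into blobs and
    return one (height, row, col) peak per blob."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    peaks = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < min_height_m:
                continue
            seen[r][c] = True
            stack = [(r, c)]
            best = (grid[r][c], r, c)
            while stack:  # flood fill over one blob
                cr, cc = stack.pop()
                best = max(best, (grid[cr][cc], cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] >= min_height_m):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            peaks.append(best)
    return peaks


# Two heads side by side at the same distance from the wall
# (hypothetical positions and heights).
points = [(-0.55, 1.7, 2.05), (0.55, 1.8, 2.05)]
peaks = highest_point_per_blob(zenithal_height_map(points))
print(peaks)
```

Seen from the sensor the two people could merge into one blob, but in the top-down grid they fall into separate cells with empty floor between them, so each one gets a peak.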

Figure Zenithal projection of depth buffer detecting one person

With the detection goal achieved, we calculated the distance from each blob’s highest point to the PTZ camera unit and aimed the camera at the nearest one.
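A sketch of that last step, assuming the peaks and the PTZ unit are expressed in the same floor-aligned coordinates in meters (x across, y up, z away from the wall). The pan and tilt conventions, the camera position and the function names are assumptions for illustration; the real PTZ unit has its own control protocol.

```python
import math


def nearest_target(peaks, ptz_position):
    """Return the peak with the smallest Euclidean distance to the PTZ unit."""
    return min(peaks, key=lambda peak: math.dist(peak, ptz_position))


def pan_tilt_towards(target, ptz_position):
    """Pan and tilt angles in degrees that point from the PTZ unit to the target.

    Assumed convention: pan is measured around the vertical axis from
    the +z direction, tilt is positive upwards.
    """
    dx = target[0] - ptz_position[0]
    dy = target[1] - ptz_position[1]
    dz = target[2] - ptz_position[2]
    pan = math.degrees(math.atan2(dx, dz))
    tilt = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return pan, tilt


# Hypothetical head positions and camera position.
peaks = [(1.0, 1.7, 1.0), (-2.0, 1.8, 3.0)]
ptz_position = (0.0, 2.5, 0.0)

target = nearest_target(peaks, ptz_position)
pan, tilt = pan_tilt_towards(target, ptz_position)
print(target, pan, tilt)
```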

Figure Zenithal projection of depth buffer detecting two people

Here is the system in action:

[Embedded YouTube video]

~ by David Sanz Kirbis on 2 July, 2011.
