This thesis presents a complete methodological approach to modeling objects held in human hands. A major advantage of our method is that it runs in real time and requires no complex manual calibration. Another advantage is that the object can be modeled while resting in the user's hand, which is particularly valuable in human-robot interaction (HRI): objects can be learned without placing them in a designated location. To achieve this, we present a solution for efficiently segmenting the user's hand from the background using an RGB-D camera. We extract the skin color from the user's face, use it to detect the user's hand, and remove the hand from the image; afterwards, the object itself is segmented. To combine the different views, we use the V4R library to merge the single images into a 3D model. To demonstrate the effectiveness of our method, we conclude by presenting several models created with it and discussing its limitations.
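The core idea of transferring a skin-color model from the face to the hand can be sketched as follows. This is a minimal, illustrative sketch under assumed details: the chromaticity representation, the tolerance value, and all function names are my own assumptions, not the thesis's actual implementation.

```python
# Illustrative sketch (not the thesis's implementation): sample the user's
# face, build a simple chromaticity model of skin color, then flag pixels
# elsewhere in the image that match that model. Thresholds are assumptions.

def rg_chromaticity(pixel):
    """Map an (R, G, B) pixel to illumination-robust (r, g) chromaticity."""
    r, g, b = pixel
    s = r + g + b
    if s == 0:
        return (0.0, 0.0)
    return (r / s, g / s)

def skin_model(face_pixels):
    """Mean (r, g) chromaticity over the sampled face region."""
    coords = [rg_chromaticity(p) for p in face_pixels]
    n = len(coords)
    return (sum(c[0] for c in coords) / n,
            sum(c[1] for c in coords) / n)

def is_skin(pixel, model, tol=0.05):
    """Classify a pixel as skin if its chromaticity lies near the model."""
    r, g = rg_chromaticity(pixel)
    mr, mg = model
    return abs(r - mr) <= tol and abs(g - mg) <= tol

# Hypothetical sampled face pixels and test pixels:
face = [(200, 140, 120), (190, 135, 115), (210, 150, 125)]
model = skin_model(face)
print(is_skin((205, 142, 118), model))   # skin-like pixel -> True
print(is_skin((40, 90, 200), model))     # blue background pixel -> False
```

In a real pipeline the skin mask would be combined with the RGB-D depth data before the remaining object pixels are fused across views.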