Classifying Retail Store Cabinets with Missing or Misplaced Products Using Verification Learning

by Stefan Bonhof


Performing tasks in dynamic environments is still an open challenge in robotics. To perform a task reliably in such scenarios, the state of the world has to be monitored continuously. In this context, most state-of-the-art perception methods focus on the recognition and classification of individual objects. However, these methods require extensive data collection and artificial neural network training, especially in complex scenes where the number of unique objects to recognise is large. This is the case in retail stores, for instance, where there can be as many as 120,000 different products. Applying state-of-the-art learning methods in this domain is not only expensive in terms of data gathering, but would also require models so complex that product recognition would slow down significantly. This research tackles the problem of cabinet classification in a retail store, introducing a method to identify cabinets with missing or misplaced products without individual object recognition. Prior knowledge of the store layout is used to generate an image of what a cabinet should look like when it is correctly stocked. A verification network then compares an image of the cabinet's current state with this generated image to verify whether the cabinet is still fully and correctly stocked. This research provides three main results. First, verification learning is demonstrated to transfer well to the retail store cabinet domain, maintaining high speed and accuracy. Second, the verification network generalises well to both unseen cabinet configurations and unseen products, eliminating the need to include every product in the dataset used to train the network. Lastly, verification learning transfers well from simulation to the real world when classifying cabinets with missing products.
However, this last result does not hold for cabinets with misplaced products, because the difference between the image of a correctly stocked cabinet and that of an incorrectly stocked one is smaller. Furthermore, while the verification network is very fast on the hardware used for this research, it will be significantly slower on the less powerful hardware more commonly found in robots. This thesis represents a starting point for the detection of missing and misplaced products in retail store cabinets, and it serves as a foundation for future research in this domain.
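
The compare-against-expected idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's verification network: the learned network is replaced here by a hand-written embedding (mean intensity per grid cell) and a fixed distance threshold, purely to show the data flow. The names `render_cabinet`, `embed`, `is_correctly_stocked`, the slot intensities, and the threshold value are all hypothetical.

```python
import numpy as np


def render_cabinet(layout: np.ndarray, cell: int = 16) -> np.ndarray:
    """Render a grid of per-slot intensities into an image (0.0 = empty slot).

    Hypothetical stand-in for generating the expected cabinet image from
    prior knowledge of the store layout.
    """
    return np.kron(layout, np.ones((cell, cell)))


def embed(image: np.ndarray, grid: int = 4) -> np.ndarray:
    """Hypothetical stand-in for a learned embedding: mean intensity per grid cell."""
    h, w = image.shape
    cropped = image[: h - h % grid, : w - w % grid]
    cells = cropped.reshape(grid, h // grid, grid, w // grid)
    return cells.mean(axis=(1, 3)).ravel()


def is_correctly_stocked(expected: np.ndarray, observed: np.ndarray,
                         threshold: float = 0.05) -> bool:
    """Verify the observed image against the expected one.

    The threshold is an assumed value; a trained verification network
    would learn the decision boundary instead.
    """
    e, o = embed(expected), embed(observed)
    distance = np.linalg.norm(e - o) / np.sqrt(e.size)
    return bool(distance < threshold)


# Expected state: a fully stocked 4x4 cabinet.
layout = np.full((4, 4), 0.8)
expected_image = render_cabinet(layout)

# Observed state: one product is missing from the cabinet.
observed_layout = layout.copy()
observed_layout[1, 2] = 0.0
observed_image = render_cabinet(observed_layout)

stocked_ok = is_correctly_stocked(expected_image, expected_image)
missing_ok = is_correctly_stocked(expected_image, observed_image)
```

In this sketch the decision depends only on how far the observed image is from the expected one, so no individual product has to be recognised.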

  • Delft
  • Perception
  • Robotics