Problems in geometric modelling and perceptual grouping of man-made objects in aerial images

Research on the recognition of man-made objects in satellite and aerial images has been published for several decades. Model-based approaches are frequently proposed for recognising roads and buildings. Models are described in terms of relations between parts that form hierarchical arrangements. The recognition process then consists of corresponding substeps of perceptual grouping, which search for constructions consistent with the model. This approach has been proposed for quite different recognition tasks: detection (i.e. searching for particular objects), classification (i.e. assigning object class labels to data objects) and even reconstruction of all objects in a scene have been investigated. Data may be acquired by different sensors (e.g. visible light, IR, SAR or laser range data). Perspectives vary from very oblique to perpendicular. Objects of interest may be fixed in position (e.g. buildings) or mobile (e.g. vehicles or containers). Buildings may vary widely in shape, whereas vehicles may be categorised into fairly homogeneous classes. Civil tasks frequently differ from military tasks in the types of objects, sensors and perspectives involved. This implies distinct object descriptions and distinct strategies for modelling and searching. The euphoric activism of the 1980s concerning the application of model-based approaches to automatic photogrammetry, remote sensing and cartography has declined in recent years. This is not the result of a saturation process that has settled the research on secure ground and common views. On the contrary, there seems to be resignation because of practical problems with structural models and a lack of success in urban scenery. This paper discusses typical problems in the construction and application of such grouping approaches to man-made object recognition in aerial images. There are different strategies for decomposing object aggregates into object parts.
A simple parallelogram shape may, for instance, be constructed from a pair of angles or from two pairs of parallel lines. Using two examples, we demonstrate how distinct modelling strategies affect search performance. Often the modelled parts have a functional aspect that can be exploited to guide recognition. Different possible representational schemes, such as semantic nets and production systems, are also presented. Complex models with many degrees of freedom permit classical one-step template matching (with tolerable effort) only if there is enough prior knowledge (e.g. from maps). Otherwise, perceptual grouping of object parts seems to be the only reasonable alternative. This gives the best results if grouping and matching are alternated. Generic models pose the most severe performance problems because of their inherent combinatorics. This is discussed with examples of grouping buildings into settlement structures. Two main types of grouping are distinguished, cyclical and cycle-free, with implications for the associated computational effort. Frequently used grouping structures are line prolongation for roads and rectilinear pairs or triples for buildings. Important geometrical relations are therefore collinearity, parallelism, vicinity, etc. Many systems use inverting techniques (such as content-addressable retrieval) for some of these relations, which help to accelerate the search for permissible group partners in the database. We also give two simple examples where relation inverting is impractical. For tasks with very large amounts of data, some interdisciplinary cross-fertilisation with database research is helpful.
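To make the idea of inverting a relation concrete, the following minimal sketch indexes line segments by orientation so that candidate partners for the parallelism relation are retrieved by lookup rather than by comparing every pair. It is an illustration under stated assumptions, not the method of any system discussed in the paper: the names `orientation`, `parallel` and `OrientationIndex`, the bin count of 36 and the 5 degree tolerance are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def orientation(seg):
    """Undirected orientation of a segment ((x1, y1), (x2, y2)), in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi


def angle_diff(a, b):
    """Smallest difference between two undirected orientations."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def parallel(s, t, tol=math.radians(5)):
    """Parallelism relation with an angular tolerance (assumed: 5 degrees)."""
    return angle_diff(orientation(s), orientation(t)) <= tol


class OrientationIndex:
    """Hypothetical inverted index: orientation bin -> segment indices."""

    def __init__(self, segments, n_bins=36):
        self.segments = segments
        self.n_bins = n_bins
        self.bins = defaultdict(list)
        for i, seg in enumerate(segments):
            self.bins[self._bin(orientation(seg))].append(i)

    def _bin(self, theta):
        # The final modulo guards against theta rounding up to exactly pi.
        return int(theta / math.pi * self.n_bins) % self.n_bins

    def parallel_partners(self, i, tol=math.radians(5)):
        """Indices of segments parallel to segment i, found via nearby bins."""
        b = self._bin(orientation(self.segments[i]))
        # Number of neighbouring bins that can still hold a partner.
        reach = math.ceil(tol / (math.pi / self.n_bins))
        found = set()
        for k in range(b - reach, b + reach + 1):
            # Bins wrap around because orientations 0 and pi coincide.
            for j in self.bins.get(k % self.n_bins, []):
                if j != i and parallel(self.segments[i], self.segments[j], tol):
                    found.add(j)
        return sorted(found)


segments = [
    ((0, 0), (10, 0)),    # horizontal
    ((0, 5), (10, 5)),    # horizontal
    ((0, 0), (0, 10)),    # vertical
    ((10, 6), (0, 6.2)),  # nearly horizontal, drawn in the opposite direction
]
index = OrientationIndex(segments)
print(index.parallel_partners(0))  # → [1, 3]
```

Only the bins near the query orientation are visited, so the cost of a query depends on how many segments share that orientation rather than on the size of the whole database. This works because parallelism constrains a single scalar attribute; a relation that does not reduce to such a key cannot be indexed this simply.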
Michaelsen E, Stilla U (2000) Problems in geometric modelling and perceptual grouping of man-made objects in aerial images. International Archives of Photogrammetry and Remote Sensing. Vol. 33, Part B3, 577-583