In common with much work in AI, logic-based approaches have a great deal to offer in terms of consistency checking and explicit, declarative knowledge representation. In particular, formal approaches using well-defined languages with clear semantics for time, events, and causality, e.g., those of Allen and Shoham, are useful for validating and prototyping new approaches in many AI subfields. For image interpretation, the reconstruction of MAPSEE within a logical framework is a classic example. Spatial and temporal logics are characterised by declarative representation in some formal description language and by reasoning using some form of theorem-proving or calculus. However, a major trend in the field is to translate the knowledge into a precompiled procedural form for fast execution, as in the work of Kaelbling and Rosenschein. For example, the work described earlier by Mackworth using constraint-based vision in situated agents was based on underlying formal logic notions, so that the designer can achieve provably correct behaviour. The constraint net models were transformed into their dynamic forms to allow fast processing in the on-line system. This trend makes formal modelling extremely useful for robotics and for classifying objects and types of events, as well as for spatiotemporal reasoning in knowledge-based vision.
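To make the idea of a well-defined temporal language concrete, the following is a minimal sketch of the thirteen qualitative interval relations associated with Allen's interval-based temporal logic. It is an illustration only, not code from any of the systems discussed here: the `Interval` type, the function name `allen_relation`, and the relation labels are assumptions introduced for this example, and intervals are assumed to be proper (start strictly before end).

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A proper time interval with start < end (hypothetical helper type)."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the single Allen relation that holds between intervals a and b."""
    if a.start >= a.end or b.start >= b.end:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"

    # Intervals sharing one or both endpoints.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"

    # Strict containment in either direction.
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"

    # Remaining case: partial overlap with all four endpoints distinct.
    return "overlaps" if a.start < b.start else "overlapped-by"
```

As a usage example under the same assumptions, an event description might ask whether a hypothetical "vehicle stopped" interval lies within a hypothetical "vehicle in junction" interval: `allen_relation(Interval(4.0, 6.5), Interval(2.0, 9.0))` evaluates to `"during"`. Because exactly one relation holds for any pair of proper intervals, such predicates can be checked for consistency declaratively and then evaluated procedurally at run time.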
Recently, the need for formal descriptions in visual knowledge representation has been emphasised by Schroeder and Neumann. They advocate the use of an object-centred description logic tailored to the requirements of image understanding, together with an effective calculus. Their language can be used to formalise scene-independent domain knowledge using a set of axioms. However, there is still some way to go in making the calculus tractable for realistic problems. More applied work on spatiotemporal reasoning in VIEWS for advanced surveillance used logical rules which could be converted into executable networks for incident detection or used for occlusion reasoning. Semantic regions underlying the interpretation of behaviour in traffic scenes and trajectories for event descriptions have also been learnt from images to support high-level reasoning.