As discussed in the introduction, the task-specific requirements of applied vision systems often drive the development of high-level vision capabilities. Thus, a great deal of innovative research in interpretation and understanding is both developed and exploited in a variety of application contexts. For example, there has been important research on integrating vision and language to deliver conceptual descriptions for advanced surveillance. Pioneering research on describing behaviour in traffic scenes by Nagel and Neumann established a useful ontology for the events and episodes observed. More recently, this has been extended in terms of the complexity of the vehicle interactions analysed by Howarth and Buxton [14,33] and the sophistication of the linguistic descriptions computed by Nagel and colleagues [40,25]. Real-time constraints on descriptions in video-surveillance applications have also received attention in the new PASSWORDS project. These techniques were developed for advanced surveillance but are also more generally applicable in interactive vision systems.
Suchman proposed a situated approach for general human-computer interaction and here, again, there is a clear requirement for systems that integrate both vision and language. Interdisciplinary work in cognitive science, HCI, and AI approaches to vision and language will be an important component of long-term work in this area. In the short term, many researchers are developing useful techniques for multimodal and multimedia interaction. For example, Kender has been active in bringing spatial reasoning and gesture recognition to these problems. Bobick [8,9] has also been leading work at the MIT Media Lab on a variety of interactive vision applications. These applications seek to understand actions directly from image sequences, using approximate models in order to meet real-time constraints.
Smart cars using new sensor technology for vehicle control are also being developed in conjunction with traffic monitoring in intelligent highway system projects by Malik and colleagues [34,44]. This type of application is also closely linked to innovative work on behavioural control in robotics by Bajcsy and colleagues [41,57] using discrete event dynamic systems. The idea of integrating scene understanding with behavioural control for automatic vehicle guidance has great commercial potential, and exciting new work is being done in this area. In addition, a new situated approach using constraint-based vision by Mackworth is being developed to integrate knowledge-based and behavioural control in robotics. These developments, then, involve fundamental science while being highly applicable to real-world problems.