Military significance: Visual and Acoustic Surveillance
and Monitoring, University of Maryland
Our vision of battlefield surveillance systems involves a distributed
suite of heterogeneous, relocatable sensors that monitor a large
geographic area, in the context of a site model, for the entrances,
exits, and activities of people and vehicles. Scarce human operators must
monitor the outputs of the surveillance system under constraints of limited
bandwidth and possibly severe psychological pressure.
We envision a surveillance system for monitoring the urban
battlefield, where the movements and actions of even a small number
of individuals and pieces of equipment can lead to great loss of life, and
in which one must rely on incomplete and qualitative site modeling to
control and focus perception systems.
For the surveillance system to monitor an area of any significant
geographic extent, it must employ a suite of sensor platforms that generally must
be relocatable, either to bring more surveillance power to bear on potentially interesting
situations or simply to provide adequate coverage of a large
surveillance site. However, to control the cost and complexity of the surveillance
system, the number of sensor platforms must be limited. This suggests that
the surveillance system should employ multiple levels of analysis of the
area. Our research considers two levels: a coarse level
in which a significant portion of the area is monitored at low resolution,
and a fine level in which a much smaller area is monitored at
much higher resolution.
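
As a rough illustration of the two-level idea, the sketch below ranks coarse-level
cells by an interest score and assigns a limited number of fine-resolution sensors
to the top-ranked cells. The names CoarseCell, interest, and assign_fine_sensors
are hypothetical and do not come from the project itself; how an interest score
would actually be computed from low-resolution data is left open here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoarseCell:
    """One region of the site as seen by the coarse, low-resolution level."""
    cell_id: str
    interest: float  # hypothetical score produced by coarse-level analysis


def assign_fine_sensors(cells, num_fine_sensors):
    """Return the ids of the cells that should receive fine-level coverage.

    Assumption: each relocatable fine-resolution sensor covers one cell,
    and the most interesting cells are served first.
    """
    ranked = sorted(cells, key=lambda c: c.interest, reverse=True)
    return [c.cell_id for c in ranked[:num_fine_sensors]]


cells = [
    CoarseCell("plaza", 0.2),
    CoarseCell("north_gate", 0.9),
    CoarseCell("alley", 0.6),
]
print(assign_fine_sensors(cells, 2))  # ['north_gate', 'alley']
```

With three cells and only two fine-level sensors, the least interesting cell
remains under coarse monitoring only, which reflects the constraint on the
number of sensor platforms described above.
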
Decades of research on vision systems have demonstrated that if they are
to be employed in visually complex scenes, then they must be armed with
prior knowledge about the structures in those scenes. In RADIUS this
was reflected in the construction and employment of geometrically precise
site models, which delineated and described the 3-D geometry of buildings,
road networks, etc. In ground-based surveillance of urban areas,
both time to deployment and the dynamic nature of the area dictate
that these surveillance systems utilize less precise models of the environments
in which they must operate. We envision a surveillance system in which site models
contain information about building footprints, entrances, and rooflines, and
additionally include information about roads, park areas, etc. The
models will be used to control the visual and acoustic focus of attention and
to provide a spatial context for interpretation.
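
One possible way to picture such an imprecise site model is sketched below. It
stores approximate building footprints and entrances, lists the entrances as
candidate points for focus of attention, and gives a detection spatial context
by finding the nearest known entrance. The classes Building and SiteModel and
their methods are illustrative assumptions, not the representation used in
this research; rooflines, roads, and park areas are omitted for brevity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Building:
    name: str
    footprint: list  # approximate (x, y) vertices of the building outline
    entrances: list  # approximate (x, y) locations of known entrances


@dataclass
class SiteModel:
    buildings: list = field(default_factory=list)

    def attention_points(self):
        """Entrances are natural places to direct sensors (assumption)."""
        return [(b.name, e) for b in self.buildings for e in b.entrances]

    def nearest_entrance(self, point):
        """Spatial context for a detection: the closest known entrance."""
        return min(self.attention_points(),
                   key=lambda item: math.dist(item[1], point))


site = SiteModel([
    Building("warehouse", [(0, 0), (0, 5), (-4, 5), (-4, 0)], [(0, 0), (0, 5)]),
    Building("depot", [(10, 0), (14, 0), (14, 3), (10, 3)], [(10, 0)]),
])
print(site.nearest_entrance((9, 1)))  # ('depot', (10, 0))
```

A detection at (9, 1) is associated with the depot entrance, which is the kind
of context a later interpretation stage could use when reasoning about
entering or leaving a building.
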
Finally, a battlefield surveillance system must be able to reasonably infer
what people and vehicles are doing from its sensor data analysis. This involves
not only the detection and tracking of people and vehicles, but also some
way to represent and interpret their individual and collective activities.
This is clearly an enormous problem given the wide range of activities that
people engage in. Our research focuses on several important classes of actions
involving people, vehicles, and buildings: entering, leaving, carrying, exchanging,
and acting suspiciously.