A Survey of Digital Map Processing Techniques

YAO-YI CHIANG, University of Southern California

STEFAN LEYK, University of Colorado, Boulder

CRAIG A. KNOBLOCK, University of Southern California

 

Abstract: Maps depict natural and human-induced changes on earth at a fine resolution for large areas and over long periods of time. In addition, maps, especially historical maps, are often the only information source about the earth as surveyed using geodetic techniques. To preserve these unique documents, increasing numbers of digital map archives have been established, driven by advances in software and hardware technologies. Since the early 1980s, researchers from a variety of disciplines, including computer science and geography, have been working on computational methods for the extraction and recognition of geographic features from archived images of maps (digital map processing). The typical result of map processing is geographic information that can be used in spatial and spatiotemporal analyses in a Geographic Information System (GIS) environment, which benefits numerous research fields in the spatial, social, environmental, and health sciences. However, the map processing literature is spread across a broad range of disciplines in which maps are treated as a special type of image. This article presents an overview of existing map processing techniques, with the goals of bringing together past and current research efforts in this interdisciplinary field, characterizing the advances that have been made, and identifying future research directions and opportunities.

 

CHALLENGES AND OPPORTUNITIES

 

As seen in this article, there has been a significant amount of interest in and research on map processing over the years. Existing research efforts have been scattered across many organizations and countries, have addressed a wide variety of issues, and have been published in many different venues. The authors hope that this article provides a more integrated view of the large body of work that has been conducted. Past research has typically focused on specific map types, because the research is often driven by the need to extract data from a specific set or series of maps. Although much progress has been made on numerous problems in map processing, many problems remain unsolved. Furthermore, we believe that the ability to extract and georeference the data in maps will unlock a wealth of information with many applications, including constructing new maps, building better and more detailed gazetteers, performing historical research about areas that have changed over time, and conducting medical research on links between man-made features and diseases.

 

Given the history of scattered funding and the narrow focus on specific map types, the question is how to accelerate progress in the area of map processing. We believe that this can best be achieved by encouraging researchers to make their software and datasets available. Doing so will allow other researchers to build on the current state of the art instead of having to reimplement each component individually. The availability of datasets also means that researchers can directly compare their techniques against other algorithms and evaluate their techniques on maps that have been processed by others. If researchers release their software under open source licenses, the community can begin to assemble integrated systems that combine the best algorithms, providing a set of tools that others can use to extract data from their own maps. As described in this survey, it is difficult to draw direct linkages between particular techniques and the types or qualities of maps to which they are suited. Map processing still depends on the expertise of the analyst to choose an appropriate technique for a particular extraction task, depending on the type and quality of the map document at hand. Thus, there is currently no unified or generic map processing framework that would make it possible to select the best-performing techniques for various kinds of maps. However, if the community becomes better integrated, a systematic understanding of existing solutions can be developed and evaluated, providing the basis for an initial framework in the near future.

 

Beyond the issue of developing a more integrated community, there are several specific research areas that we want to highlight as deserving more attention. First, processing maps is hard because many maps contain complex, overlapping layers of information. We therefore believe it is especially important to develop interactive techniques that reduce the user effort in map processing and allow users to apply their own expertise to extract, verify, and even correct the extracted data. Efficient interactive techniques would better exploit human knowledge and abilities in understanding map representations, and could incorporate learning processes to improve the systems over time. Second, there is a vast amount of historical data captured in maps that could serve a wide variety of uses, but historical maps are often the hardest documents to process. Therefore, we believe that more focus and more research funding should be directed to the problems of extracting data from noisy maps. Third, little work has addressed the provenance, accuracy, and reliability of the extracted results. As map processing research comes into its prime, there is the opportunity to extract a great deal of information from maps. As a community, we need to record the provenance of these extracted data, develop techniques to estimate their accuracy, and provide methods to quantify the reliability of the generated spatial data.
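As one illustration of this third point, extracted features could carry provenance and quality metadata alongside their geometry. The following is a minimal Python sketch, assuming a GeoJSON-style output; the schema, field names, and map identifier are hypothetical and do not reflect an established map processing standard.

```python
import json
from dataclasses import dataclass, asdict

# Illustrative sketch: attaching provenance and quality metadata to a
# feature extracted from a scanned map. All field names are hypothetical;
# no standard map-processing schema is implied.

@dataclass
class Provenance:
    source_map: str          # identifier of the scanned map sheet
    scan_dpi: int            # resolution of the source scan
    extraction_method: str   # algorithm or tool that produced the feature
    extracted_on: str        # ISO 8601 timestamp of the extraction run

@dataclass
class ExtractedFeature:
    geometry: dict                 # GeoJSON geometry
    properties: dict               # recognized attributes (e.g., label text)
    provenance: Provenance
    positional_accuracy_m: float   # estimated positional error in meters
    recognition_confidence: float  # recognition confidence in [0, 1]

    def to_geojson(self) -> dict:
        """Serialize as a GeoJSON Feature with metadata in 'properties'."""
        props = dict(self.properties)
        props["provenance"] = asdict(self.provenance)
        props["positional_accuracy_m"] = self.positional_accuracy_m
        props["recognition_confidence"] = self.recognition_confidence
        return {"type": "Feature", "geometry": self.geometry, "properties": props}

feature = ExtractedFeature(
    geometry={"type": "Point", "coordinates": [-118.2851, 34.0224]},
    properties={"label": "University of Southern California"},
    provenance=Provenance(
        source_map="USGS_LA_1966_sheet_12",  # hypothetical map identifier
        scan_dpi=600,
        extraction_method="template-matching + OCR",
        extracted_on="2014-01-15T10:30:00Z",
    ),
    positional_accuracy_m=12.5,
    recognition_confidence=0.87,
)

print(json.dumps(feature.to_geojson(), indent=2))
```

Embedding such metadata directly in each feature, rather than in a separate log, would let downstream GIS analyses filter or weight extracted data by source and estimated quality.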

 
