Benefits of Web-Based Data Distribution Systems
This paper describes the challenges and benefits surrounding a
web-based data distribution system.
Implementing a web-based spatial data distribution system can be a challenge, but
many organizations are doing it successfully and reaping the benefits. Before
enterprises undertake such projects, however, they must first allocate time to
understanding the quality of their data and their target audience.
Traditionally, it has been very difficult for an enterprise to consolidate its disparate map data into a single, seamless database and integrate this significant asset into the decision-making process. For the past 30 years, organizations around the world have been capturing spatial data digitally in a wide variety of data formats. With thousands of data formats, sharing mapping data is a complex process. Industry sectors, governments, and even departments often work in the data formats that most appropriately address their needs. Recent developments in spatial database technology from a variety of vendors are making it possible for organizations to realize the dream of integrated spatial and attribute corporate databases. Although the transition from mapsheet files into a spatial database can be difficult, many organizations are now successfully doing this in order to leverage their significant spatial data investment. This article provides guidance in preparing for such a migration, independent of spatial database type.
Spatial data is being used by an ever-increasing number of organizations - from city to national governments, and from small companies to large corporations - who all view spatial data as a strategic asset. As spatial data increases in importance, both businesses and governments need to disseminate and have access to the latest data as cost-effectively and as fast as possible.
As the need for spatial data grows, there is also an increasing number of web-based mapping systems that enable users to view data and perform simple analysis and other basic GIS operations. The focus of these mapping systems has been on providing GIS-based functionality over the Internet/intranet; however, the products have a limited native ability to distribute data.
Historically, spatial data has been distributed using physical media; and, since spatial data is voluminous, data providers were often forced to provide the data in a single format and a single coordinate system. As a result, data consumers who wanted the data in a different format or coordinate system had to convert the data either by writing customized software or by using a commercial data translator such as Safe Software's Feature Manipulation Engine (FME) or Blue Marble Geographics' Geographic Translator.
The growth of Internet and web-based technologies provides new distribution possibilities for spatial data users and providers. Web-based data distribution products are now hitting the market, such as Safe Software's SpatialDirect.
When you are choosing a web-based data distribution product, you will need to consider some key points to ensure that the system satisfies both your immediate and future needs.
Systems built on relational databases provide superior performance in addition to the benefits of a Relational Database Management System (RDBMS).
Web-based data distribution systems that are built on relational databases are also not limited or complicated by file boundaries or other tiling issues; that is, the complete data holding can be represented as one contiguous dataset.
The system must be able to serve clients ranging from those running the smallest single-CPU data distribution systems to large clients running multi-machine systems, and it must be able to grow easily from one extreme to the other without causing organizations to lose their investment. The architecture must therefore be flexible, enabling software components to be moved easily from one machine to another with minimal change to configuration files.
The system must be secure in two ways:
it must not allow users to see any restricted data, and
it must guard against requests for too much data that, if processed blindly, would result in loss or degradation of service.
Since the Internet can be a very hostile environment, there must be a layer of software between the underlying database and users, ensuring that users cannot find ways to reach sensitive data. If the system detects any attempt to circumvent its security, it should log the attempt with as much user and/or IP information as possible and notify system operators.
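The mediating layer described above can be sketched in a few lines. This is only an illustration under stated assumptions: the layer names, the role names, the `Request` fields, and the `notify` callback are all hypothetical and do not come from any particular product.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("distribution.security")

# Hypothetical permissions table: layer name -> roles allowed to read it.
LAYER_ROLES = {
    "roads": {"public", "staff"},
    "parcels": {"staff"},
}


@dataclass(frozen=True)
class Request:
    user: str
    role: str
    ip: str
    layer: str


class AccessDenied(Exception):
    """Raised when a request targets a layer the user may not read."""


def authorize(request: Request, notify=None) -> None:
    """Return silently if the request is allowed; otherwise log, notify, and raise."""
    # Layers missing from the table are denied by default.
    allowed_roles = LAYER_ROLES.get(request.layer, set())
    if request.role in allowed_roles:
        return
    message = (
        f"denied layer={request.layer!r} user={request.user!r} ip={request.ip}"
    )
    logger.warning(message)
    if notify is not None:
        notify(message)  # assumed hook, e.g. e-mail or page the operators
    raise AccessDenied(message)
```

Every data request would pass through `authorize` before any query reaches the database, so a refused request never touches the restricted layer and always leaves a log entry that includes the user and IP address.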
The system must also be capable of handling requests for too much data. For example, if there is a theme named "Roads" that contains all the roads in the continental United States, the system should guard against a misinformed or hostile user who requests all the roads for a particular state or for the whole country. This is too much data for a real-time request, and processing such a request would greatly degrade system performance.
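To refuse an oversized request before doing any work, the system needs a cheap estimate of how much data the request would return. One rough way to do this, assuming features are spread evenly across a theme (which is rarely true for roads, so treat it only as a first approximation), is to scale the theme's total size by the fraction of its extent that the request covers. The `BBox` and `ThemeStats` types below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def area(self) -> float:
        # An inverted (empty) box has zero area.
        width = max(0.0, self.max_x - self.min_x)
        height = max(0.0, self.max_y - self.min_y)
        return width * height

    def intersection(self, other: "BBox") -> "BBox":
        # For disjoint boxes the result is inverted, so its area is zero.
        return BBox(
            max(self.min_x, other.min_x),
            max(self.min_y, other.min_y),
            min(self.max_x, other.max_x),
            min(self.max_y, other.max_y),
        )


@dataclass(frozen=True)
class ThemeStats:
    extent: BBox       # full extent of the theme
    total_bytes: int   # size of the whole theme when extracted


def estimate_bytes(theme: ThemeStats, requested: BBox) -> int:
    """Estimate the extract size, assuming uniform feature density."""
    theme_area = theme.extent.area()
    if theme_area == 0:
        return 0
    overlap = theme.extent.intersection(requested).area()
    return round(theme.total_bytes * overlap / theme_area)
```

A production system would more likely use per-tile feature counts or database statistics, but even this crude estimate is enough to reject a whole-country request without running it.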
Ideally, system administrators should be able to define, on a layer-by-layer basis, the size of data that may be distributed, and to provide different levels of service based on the amount of data that is requested. One possible set of levels is described below:
Real-time Service: This is for small requests. The size limit for this level
depends on a number of factors: server bandwidth, client bandwidth, number of
expected simultaneous clients, and throughput of the server. For these requests,
the system processes the request immediately, with a turnaround time that would
be acceptable for a user waiting at a browser.
E-mail Service: These requests are the next level in size. The server still processes the requests immediately, but it is recognized that the delay is beyond the threshold of a user waiting at a browser. The user is sent an e-mail message with an FTP link that points to the extracted data.
Physical Media Service: This level of service is for data requests that are performed off-line and then put on physical media. These results are deemed to be simply too big to be sent via the communication infrastructure.
Prohibited Service: This is for requests that are deemed too large to process. The request is logged and the client is simply notified that the data request is too big for the data distribution system.
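The levels above amount to a per-layer lookup of size thresholds. The sketch below shows one way an administrator-defined table could drive that decision. The enum names, the layer name, and every threshold value are hypothetical; real limits would be tuned to the bandwidth and throughput factors listed above.

```python
from enum import Enum


class ServiceLevel(Enum):
    REAL_TIME = "real-time"            # answered while the user waits
    EMAIL = "e-mail"                   # processed now, link sent by e-mail
    PHYSICAL_MEDIA = "physical media"  # processed off-line, shipped on media
    PROHIBITED = "prohibited"          # logged and refused


MB = 1024 * 1024

# Hypothetical per-layer limits, in bytes:
# (largest real-time request, largest e-mail request, largest media request)
LAYER_LIMITS = {
    "roads": (5 * MB, 200 * MB, 5000 * MB),
}
DEFAULT_LIMITS = (1 * MB, 50 * MB, 1000 * MB)


def classify(layer: str, estimated_bytes: int) -> ServiceLevel:
    """Pick the service level for a request from its estimated size."""
    real_time_max, email_max, media_max = LAYER_LIMITS.get(layer, DEFAULT_LIMITS)
    if estimated_bytes <= real_time_max:
        return ServiceLevel.REAL_TIME
    if estimated_bytes <= email_max:
        return ServiceLevel.EMAIL
    if estimated_bytes <= media_max:
        return ServiceLevel.PHYSICAL_MEDIA
    return ServiceLevel.PROHIBITED
```

Keeping the thresholds in a table rather than in code lets administrators adjust them per layer as server capacity changes, without touching the request-handling logic.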
The system must be reliable, and at the same time it must have an administrative capability that catches and reports faults. It must also have a statistics reporting capability so that administrators can see how the system is performing. If any bottlenecks exist, administrators need to know where they are located so that future performance issues can be identified before there is a serious impact on the users.
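As a rough illustration of the kind of statistics reporting meant here, the following sketch times each processing stage, counts faults per stage, and reports the stage with the highest mean duration as the likely bottleneck. The class name and the stage names used with it are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean


class StageStats:
    """Collects per-stage timings and fault counts for administrators."""

    def __init__(self):
        self._durations = defaultdict(list)
        self._faults = defaultdict(int)

    def record(self, stage: str, seconds: float) -> None:
        self._durations[stage].append(seconds)

    @contextmanager
    def timed(self, stage: str):
        """Time a block of work; count a fault if it raises, then re-raise."""
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self._faults[stage] += 1
            raise
        finally:
            self.record(stage, time.perf_counter() - start)

    def report(self) -> dict:
        stages = set(self._durations) | set(self._faults)
        return {
            stage: {
                "requests": len(self._durations[stage]),
                "mean_seconds": (
                    mean(self._durations[stage]) if self._durations[stage] else 0.0
                ),
                "faults": self._faults[stage],
            }
            for stage in stages
        }

    def bottleneck(self):
        """Return the stage with the highest mean duration, or None if empty."""
        report = self.report()
        if not report:
            return None
        return max(report, key=lambda stage: report[stage]["mean_seconds"])
```

A request handler might wrap each step in `with stats.timed("query"):` or `with stats.timed("translate"):`, and an administrative page could then display `stats.report()` alongside the current `stats.bottleneck()`.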
The data distribution system must be cost-effective, providing a cost based on server configuration or number of concurrent users and not on the total number of users.
The data distribution system must also be usable without requiring software to be installed on the client machine. For Internet-based solutions, it is best if the software can run from a standard browser such as Internet Explorer or Netscape without requiring plug-ins.
The move to web-based data distribution systems builds on the trends to move spatial data into databases and GIS functionality to the web. When choosing a web-based data distribution system, an organization must ensure that the system meets both its immediate and future needs. The chosen data distribution system must have an open architecture and must adhere to industry standards so that it can easily work with the web mapping solutions from both current vendors and future standards-based products. The product must be scalable, able to grow with the need to distribute data. Last but not least, it must be cost-effective - not priced on number of users, but on server configuration, which enables the deploying organization to benefit from the continual decline in computer hardware pricing.