This chapter offers an introduction to database components, application-database interaction, data warehouses, and data marts. Technical terms and examples that are useful in improving test planning and bug-report communication are also discussed.
Databases play an important role in Web application technology. They house the content that Web applications manage—fulfilling user requests for data storage and record queries. Understanding how databases operate within Web applications is essential to effective database testing. Databases are repositories that are organized in such a way that it is easy to manage and update the data they contain. One of the database technologies commonly used in Web-based applications is the relational database. Relational databases are tabular databases that can be easily reorganized and queried. Additionally, in a Web environment, the term distributed database is used to refer to databases that are dispersed over multiple servers on a network.
Two common approaches used to address the needs of target users are online transaction processing (OLTP) and online analytical processing (OLAP). Online transaction processing is transaction-oriented. The design objective is to support users who need access to systems to process sales or other types of transactions. An example of an OLTP-type of application is an e-commerce system where sales transactions are processed between buyers and merchants. In contrast, OLAP is intended for users who need access to systems such as data warehouses and data marts to obtain various types of metrics or analytical reports. Figure shows an example of OLTP versus OLAP design.
Three databases containing operational information are used for product sales, training registration, and purchasing (OLTP). A data warehouse collects and transforms the raw data from the operational databases and stores it in a read-only informational database. This information is used to support decision-making processes (OLAP). Data replication executes every hour 24/7/52. Data warehouse information is further parsed and distributed to data marts that are designed for sales and fulfillment departments to help in marketing expenditure and inventory control decisions.
Data warehouses are large databases that aggregate and process information from multiple databases. The data is stored in a format that supports various analytical needs. Data marts are customized databases normally derived from a data warehouse that has been formatted to meet the needs of specific workgroups. Data warehouses are structured around data. Data marts are structured around user needs. Data warehouses also allow Web sites to catalog large amounts of user profile data, e-commerce purchases, use and session data information, trends, and statistics.
Data warehouses are large databases that aggregate information from multiple databases. Raw data is transformed via a filtering process and stored in a format that accommodates the database designer's informational needs. Generally, the data warehousing process supplies data marts (see following) or users with the data they require. Data warehouses are commonly referred to as informational databases. Data marts are informational databases that have been custom formatted to meet the needs of specific workgroups. They differ from data warehouses, which are structured around data, in that they are built around the needs of users. Generally, both database types are read-only. They may be used to collect database activity statistics, such as numbers of calls, page hits, sources of hits, MIME types, header names, and so on.
OLTP versus OLAP (data warehouse and data mart) example
There are two ways to copy and update databases. With synchronous updates, changes in the operational database immediately affect the informational database. With asynchronous updates, changes in the operational database are uploaded to the informational database at regular time intervals.
Data warehouses can be designed to selectively collect data about operational activities in an OLTP system. The data can then be transferred to an OLAP system where it can be used for reporting, analysis, and decision support. Information stored in data warehouses is typically summarized, historical, read-only data. The goal of such a design is to improve query performance for decision-supporting systems (DSSs).
Several different data extraction processes may be implemented:
- Subsets of rows and columns may be configured to copy data from a source database to a target database.
- Aggregations of data, such as summaries or number of hourly transactions, may be copied from a source database to a target database.
- Data that are sent to target databases may be derived via calculations on raw-source data.