Introduction to XML
The essence of Web development is markup, even when you use a powerful visual editor such as Dreamweaver or GoLive to create it. Markup drives everything on the Web; without it, there would be no World Wide Web. A markup language defines a set of rules that a document must follow in order for the software processing that document to read it correctly. The process of software reading a marked-up document is often referred to as parsing. If the document is not marked up correctly, the software can’t parse it.
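To make the idea of parsing concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` parser. The two sample documents and the helper name `can_parse` are invented for illustration. The parser accepts the first string and rejects the second, whose closing tag does not match its opening tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for this illustration.
well_formed = "<note><to>Reader</to><body>Hello</body></note>"
malformed = "<note><to>Reader</body></note>"  # <to> is closed by </body>

def can_parse(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised when the markup breaks the rules the parser expects.
        return False

print(can_parse(well_formed))
print(can_parse(malformed))
```

A strict parser of this kind refuses the broken document outright rather than guessing at what the author meant.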
In theory, HTML was designed to maintain a strict set of markup rules, but those rules were enforced rather loosely by the Web browsing software designed to parse HTML. The result was inconsistency, and browser vendors who added their own markup “rules” exacerbated the problem; each browser, in essence, followed its own set of rules. The frame element, for example, found its way into the HTML specification when it gained popularity shortly after Netscape introduced it. Browser developers raced ahead with new features, while the W3C, the organization responsible for Web standards, lagged behind.
Over the past few years, the situation has reversed, and the W3C has released a slew of specifications that vendors are having difficulty keeping up with. One of these specifications, Extensible Markup Language (XML), was introduced by the W3C to address general inconsistencies in markup, and to add another data-centric layer to the user interface paradigm. This chapter introduces you to XML and how and when to deploy it.
The Need for XML
The Web is all about markup, but it’s also about data. This is true whether the data is document-centric, such as the kind of content in a magazine or journal, or more granular, such as the kind of data extracted from a database.
One problem with this type of data is that it can be difficult to exchange across different software environments and platforms because it has traditionally been stored in proprietary formats. What if you could instead develop a set of rules defining a table of text-based data and simply wrap markup around each chunk of data? Such data could be as simple as a Web configuration file that stores a Web server’s settings, such as this piece of code from a .NET web.config file:
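As a hedged illustration, a fragment of that kind might look like the one embedded in the Python sketch below. The `configuration`, `appSettings`, and `add` elements follow the usual web.config layout, but the keys `SiteName` and `PageSize`, their values, and the helper `read_settings` are hypothetical. The sketch shows how any software with an XML parser could read the settings back, regardless of platform.

```python
import xml.etree.ElementTree as ET

# Hypothetical web.config-style fragment; keys and values are invented.
CONFIG = """\
<configuration>
  <appSettings>
    <add key="SiteName" value="Example Site" />
    <add key="PageSize" value="25" />
  </appSettings>
</configuration>
"""

def read_settings(xml_text):
    """Collect every <add key="..." value="..."/> pair into a dictionary."""
    root = ET.fromstring(xml_text)
    settings = {}
    for add in root.findall("./appSettings/add"):
        settings[add.get("key")] = add.get("value")
    return settings

print(read_settings(CONFIG))
```

Note that every value comes back as text; deciding that `PageSize` is a number is left to the application reading the file.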
Or, the data could be much more complex, derived from a large number of relational tables requiring a carefully constructed set of rules in order for the processing software to know what each element means. Many modern database systems, such as Oracle, can now export such data as sets of marked-up elements. The resulting documents can be easily shared across platforms, software environments, and even companies and organizations, because the markup these documents are based on, XML, has consistent rules worldwide. The key to this kind of integration is the use of documents that define rules for what an element means. There isn’t much point in having the following element if you don’t know what the element is supposed to do:
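As a hypothetical illustration, consider an element such as `<title>Director</title>`: on its own it could name a book, a job, or an honorific. The Python sketch below uses a small dictionary, `RULES`, as a stand-in for a real rules document such as a DTD or schema. It is not an actual DTD or schema validator, and the element names and the `violations` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set standing in for a DTD or schema: each element name
# maps to the set of child element names it may contain.
RULES = {
    "employee": {"name", "title"},
    "name": set(),
    "title": set(),
}

def violations(element, rules):
    """Return a list of rule violations in element and its descendants."""
    if element.tag not in rules:
        return ["unknown element <%s>" % element.tag]
    problems = []
    for child in element:
        if child.tag not in rules[element.tag]:
            problems.append(
                "<%s> not allowed inside <%s>" % (child.tag, element.tag)
            )
        else:
            problems.extend(violations(child, rules))
    return problems

good = ET.fromstring(
    "<employee><name>Ada</name><title>Director</title></employee>"
)
bad = ET.fromstring(
    "<employee><name>Ada</name><salary>1</salary></employee>"
)
print(violations(good, RULES))
print(violations(bad, RULES))
```

Inside an `employee` element governed by these rules, `title` has an agreed place, so the sender and the receiver can interpret it the same way.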
Some specific uses for XML include the following:
- Use it to store data outside your HTML.
- Use it to store data inside HTML pages as “Data Islands.”
- Use it to share and exchange data between incompatible systems.
- Use it as a data storage mechanism completely outside the HTML layer.
- Use it to make your data human-readable.
- Use it to invent new languages or plug data into an existing language (a process called transformation, which you’ll see in the section on XSL).
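Two of these uses, exchanging data between systems and plugging data into an existing language, can be sketched in a few lines of Python. This is a hand-rolled illustration under stated assumptions, not XSL: the `books`, `book`, `title`, and `year` element names, the sample record, and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

def records_to_xml(records):
    """Serialize a list of dictionaries as XML (the 'sending' system)."""
    root = ET.Element("books")
    for record in records:
        book = ET.SubElement(root, "book")
        for field, value in record.items():
            ET.SubElement(book, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_html_list(xml_text):
    """Plug the XML data into an HTML list (the 'receiving' system)."""
    root = ET.fromstring(xml_text)
    ul = ET.Element("ul")
    for book in root.findall("book"):
        li = ET.SubElement(ul, "li")
        li.text = "%s (%s)" % (book.findtext("title"), book.findtext("year"))
    return ET.tostring(ul, encoding="unicode")

xml_text = records_to_xml([{"title": "Sample Book", "year": 2001}])
print(xml_text)
print(xml_to_html_list(xml_text))
```

The data lives in its own XML document, separate from any HTML, and the second function is only one of many possible presentations of it.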