Sharing structured data

XML Magazine

XML & Java: The Why and the How

Today the technical media talks a great deal about the Java platform and its importance in creating a ubiquitous Internet execution environment. While most of us have bought into this concept, other rapidly emerging technologies promise to smooth the road to the computing promised land. XML is one of these technologies, and it needs to be taken seriously. There are many aspects of XML: Document Type Definitions (DTDs), style sheets (XSL), viewers, parsers, HTML 4.0 and data. Of these, perhaps the most promising aspect of XML is its ability to represent data. Because its markup describes the content of a document, XML can behave like a universal data format for any number of applications.

Data representation using XML is a major step toward creating a ubiquitous data environment. XML allows authors to define their own tags, which in turn describe their content and make it possible to define a reusable data layer. Authors can use the structure and meaning of their documents to attach special processing instructions to parts of a document. XML can also be used by two or more entities as an exchange format for transaction protocols. This allows XML documents to be manipulated in batch mode without human interaction. Some examples of exchange formats are defined by the RosettaNet and Microsoft BizTalk standards.

Why XML?
XML by itself has nothing to do with Java and vice versa. So why should the Java community care about XML? The answer lies in the data layer. The Java language alone doesn't provide a mechanism for standardizing data formats. Java programs need to rely on predefined, nonflexible, hard-coded formats for reading information. This makes it difficult to extend or add functionality to a program without breaking the existing code base.

Take a business scenario. Imagine that you do business with two partners, one on the West Coast, the other on the East Coast. The latter expects his purchase orders to contain three fields: part number, quantity and delivery date. The West Coast partner expects her purchase orders to contain part number, quantity, delivery date and preferred shipping carrier. Thus they each have different definitions of a purchase order. How will they converse? While this particular problem doesn't seem that complicated, multiply the number of partners by 10 or 100, each with his or her own definition of a purchase order. Now we have a problem!

The naive approach to dealing with this problem would be to have our Java code deal with the individual partners in a special way. The problem with this approach is that each partner that requires special information forces the modification of the Java code used to implement the business model.

The ideal solution is to create a generic Java program that doesn't have to deal with the individual requirements of each partner. This can be done using XML. A core exchange format can be set up between you and your partners, and the individual information required by each partner can be abstracted in a properties file. The file will be responsible for matching additional information to a specific partner. In this particular scenario each partner will deal with the information he or she understands; the remainder of the information will be ignored. As new partners join your "circle of friends," the only information that needs to be modified is the properties file and the XML data file. This is where the power and flexibility of the XML data format complements the power and flexibility of the Java runtime environment. Furthermore, a neat side effect is that the properties file could have been written using XML.
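The properties-file idea above can be sketched as follows. This is only an illustration under stated assumptions: the class name `PartnerFieldFilter`, the `<partner>.fields` property key and the field names are hypothetical and do not come from the original listings. The purchase order is modeled as a simple map to keep the sketch short; in practice it would be read from the XML document.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: selects the purchase order fields a given partner understands.
final class PartnerFieldFilter {

    // Returns only the fields listed under "<partner>.fields" in the configuration.
    // Fields the partner did not ask for are ignored, as are unknown partners.
    static Map<String, String> fieldsFor(String partner, Properties config,
                                         Map<String, String> order) {
        String wanted = config.getProperty(partner + ".fields", "");
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : wanted.split(",")) {
            String key = name.trim();
            if (!key.isEmpty() && order.containsKey(key)) {
                result.put(key, order.get(key));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Assumed properties-file content: one line per partner.
        Properties config = new Properties();
        config.load(new StringReader(
                "east.fields=PartNumber,Quantity,DeliveryDate\n"
              + "west.fields=PartNumber,Quantity,DeliveryDate,Carrier\n"));

        Map<String, String> order = new LinkedHashMap<>();
        order.put("PartNumber", "P-100");
        order.put("Quantity", "600");
        order.put("DeliveryDate", "2000-01-15");
        order.put("Carrier", "Ground");

        System.out.println("east sees: " + fieldsFor("east", config, order));
        System.out.println("west sees: " + fieldsFor("west", config, order));
    }
}
```

Adding a partner in this sketch means adding one line to the properties data; the Java code itself does not change.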

XML Documents and Dynamic Class Loading
When people talk about XML for data representation, the most basic concept they refer to is a document structure with data. This structure, similar to a populated C structure, outlines a tree whose nodes describe the content found on the leaves. Simple documents don't contain any behavior that defines how to access the content on the tree. Thus an XML document can be thought of as a data object with accessor methods. This idea can be heavily leveraged to implement exchange formats for transaction protocols.

More complex XML documents leverage the concept of mobile agents to provide behavior to XML documents. This approach uses URL links embedded inside the document as object repositories from which functionality can be downloaded over the Web and used to process specific document tags. It is here that XML leverages the power of Java to extend its data model to add behavior. The Java code referenced by the URL links is downloaded via the URL class loader mechanism contained in the Java platform. Once the class bytecodes are downloaded over the Web, a class object is created, and temporary object instances are created and used to evaluate the information contained inside the XML file. This enables the dynamic extension of program behavior. Another way in which Java components can be leveraged is to send mobile agents to evaluate information stored inside XML files by analyzing the tags contained inside the document.

Although the Quantity attribute could have been expressed as an element, for our particular example it's more advantageous to define it as an attribute because it can be directly manipulated by the SAX API inside the callback's attribute list for the OrderNumber tag. Figure 1 illustrates the document hierarchy. The XML document format is shown in Listing 1.

Some partners may take the orders, process them and notify senders of the status of their order. These partners parse the information contained in the document in a batch manner and create objects that are used by their purchase order systems. Based on the information contained in the document, the purchase order system might create three objects: a purchase request object, a buyer object and an order object. The purchase request object contains the buyer object and a list of order objects. The DOM interface is the appropriate mechanism for creating these objects from the XML document. Listing 2 shows the use of the DOM Java API to retrieve the document information needed to create the buyer object.
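As a rough illustration of the DOM approach (not the article's Listing 2), the sketch below builds a buyer object from a small document using the standard JAXP DOM API. The sample document, the `Buyer`, `Name` and `Address` element names and the `BuyerFromDom` class are assumptions made for this example; only the `PurchaseRequest` and `OrderNumber` tags and the `Quantity` attribute are named in the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: create a buyer object from a purchase request via DOM.
final class BuyerFromDom {

    // Assumed document layout; the article's actual format is in its Listing 1.
    static final String SAMPLE =
            "<PurchaseRequest>"
          + "<Buyer><Name>Acme Corp</Name><Address>1 Main St</Address></Buyer>"
          + "<OrderNumber Quantity=\"600\">A-100</OrderNumber>"
          + "</PurchaseRequest>";

    // Minimal value object standing in for the purchase order system's buyer.
    static final class Buyer {
        final String name;
        final String address;

        Buyer(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    static Buyer parseBuyer(String xml) {
        try {
            // DOM loads the whole document into a tree before we navigate it.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element buyer = (Element) doc.getElementsByTagName("Buyer").item(0);
            String name = buyer.getElementsByTagName("Name").item(0).getTextContent();
            String address = buyer.getElementsByTagName("Address").item(0).getTextContent();
            return new Buyer(name, address);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read buyer from document", e);
        }
    }

    public static void main(String[] args) {
        Buyer buyer = parseBuyer(SAMPLE);
        System.out.println(buyer.name + ", " + buyer.address);
    }
}
```

Because the whole tree is in memory, the same `Document` could then be walked again to build the order objects and the enclosing purchase request object.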

Some partners may want to evaluate requests whose item quantity is greater than or equal to 500. To facilitate processing, the application programmer may wish to evaluate the "Quantity" attribute of the "OrderNumber" tag independently of any other information in the document tree. In this case we use the SAX interface, which, among other things, allows us to register a document handler as a callback object that's triggered when a document tag is found. When the tag being processed is equal to "OrderNumber," the Quantity attribute will be evaluated against the quantity rule. In this scenario it's irrelevant that the "OrderNumber" tag is contained inside the "PurchaseRequest" tag. Listing 3 shows the use of the SAX Java API to capture the "OrderNumber" tag from the document information and evaluate its Quantity attribute.
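The quantity rule can be sketched with a SAX callback as follows. This is not the article's Listing 3: it uses the SAX2 `DefaultHandler` class available through JAXP rather than the original SAX document handler interface, and the `QuantityFilter` class name and sample document in `main` are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: pick out OrderNumber tags whose Quantity is at least 500.
final class QuantityFilter {

    static final int MIN_QUANTITY = 500;

    // Returns the quantities that satisfy the rule, in document order.
    static List<Integer> qualifyingQuantities(String xml) {
        List<Integer> accepted = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The callback fires for every tag; where OrderNumber sits in
                // the tree does not matter here.
                if (!"OrderNumber".equals(qName)) {
                    return;
                }
                String raw = attributes.getValue("Quantity");
                if (raw != null) {
                    int quantity = Integer.parseInt(raw.trim());
                    if (quantity >= MIN_QUANTITY) {
                        accepted.add(quantity);
                    }
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse purchase request", e);
        }
        return accepted;
    }

    public static void main(String[] args) {
        String xml = "<PurchaseRequest>"
                + "<OrderNumber Quantity=\"120\">A-1</OrderNumber>"
                + "<OrderNumber Quantity=\"750\">A-2</OrderNumber>"
                + "</PurchaseRequest>";
        System.out.println(qualifyingQuantities(xml));
    }
}
```

No document tree is built in this sketch; the handler sees each tag once as the parser streams through the input, which is what makes SAX suitable as a cheap pre-filter.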

In this particular example, orders for fewer than 500 items will not be processed by the system. Orders for 500 items or more will be processed using the DOM API. The SAX API thus allows the application writer to filter the information first and avoid overloading the system with unprofitable requests.

The definition of a standard representation of the purchase order document enables the various partners to manipulate the information as they see fit, independent of each other. This flexibility can be extended by allowing sophisticated partners to add tags to the document hierarchy. As long as the main tag dependencies are kept, these partners will be able to use the additional information in their transactions. One example of an additional tag is a "DeliveryDateOffset" tag that allows a partner to identify a range of days from the "DeliveryDate" tag within which the order can be supplied. If the information is present in the purchase request, the partner can use it. If it's not present, the partner system simply proceeds without it.
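One way a partner system might tolerate the optional tag is sketched below. The `OptionalTagReader` class, the assumption that "DeliveryDateOffset" holds a whole number of days, and the sample documents are all hypothetical; the point is only that a missing tag yields an empty result rather than an error.

```java
import java.io.StringReader;
import java.util.OptionalInt;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: read an optional extension tag without requiring it.
final class OptionalTagReader {

    // Returns the offset in days if the tag is present, or an empty value if not.
    static OptionalInt deliveryDateOffset(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getElementsByTagName("DeliveryDateOffset");
            if (matches.getLength() == 0) {
                return OptionalInt.empty(); // tag absent: nothing to use
            }
            String text = matches.item(0).getTextContent().trim();
            return OptionalInt.of(Integer.parseInt(text));
        } catch (Exception e) {
            throw new IllegalStateException("Could not read purchase request", e);
        }
    }

    public static void main(String[] args) {
        String plain = "<PurchaseRequest>"
                + "<DeliveryDate>2000-01-15</DeliveryDate>"
                + "</PurchaseRequest>";
        String extended = "<PurchaseRequest>"
                + "<DeliveryDate>2000-01-15</DeliveryDate>"
                + "<DeliveryDateOffset>3</DeliveryDateOffset>"
                + "</PurchaseRequest>";
        System.out.println(deliveryDateOffset(plain).isPresent());
        System.out.println(deliveryDateOffset(extended).isPresent());
    }
}
```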

The URL class-loading capabilities of the Java 2 platform supplement the XML data model by allowing a document to contain behavior in addition to data. This is accomplished by embedding URL links to Java classes inside a document. Class loading coupled with reflection form a powerful mechanism that allows Java programs to dynamically download functionality from a partner Web site in order to process new XML tags. However, this particular mechanism requires an adapter-based framework similar to the Beans model that allows application developers to dynamically define interactions between their legacy systems and the newly downloaded functionality. (This mechanism will be covered in a separate article.)
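The combination of URL class loading and reflection might look roughly like the sketch below. The `TagHandlerLoader` class is hypothetical, and the adapter framework mentioned above is not shown. So that the example runs without a network, `main` passes an empty URL list and loads a JDK class through the loader's parent; a real document would supply a partner URL and a handler class name, both of which would need to be trusted before any downloaded code is run.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch: load a class by name from a class loader and
// instantiate it reflectively, as a document-supplied handler might be.
final class TagHandlerLoader {

    // Loads className through the given loader and calls its no-argument constructor.
    static Object newInstance(ClassLoader loader, String className) {
        try {
            Class<?> type = loader.loadClass(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create " + className, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // With real data, the array would hold the URL embedded in the document.
        // An empty array means every lookup is delegated to the parent loader.
        URL[] repositories = new URL[0];
        try (URLClassLoader loader = new URLClassLoader(repositories)) {
            Object handler = newInstance(loader, "java.util.ArrayList");
            System.out.println(handler.getClass().getName());
        }
    }
}
```

In this sketch the loader is closed once the instance has been created; a longer-lived handler that loads further classes on demand would need the loader kept open for as long as it is in use.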

In this article we discussed the advantages of marrying the XML and Java technologies. XML is to Java as cream is to coffee; it makes the coffee drinkable. While Java provides a great deal of dynamic behavior through its dynamic class loading and reflection mechanisms, by itself it's not the best way to deal with data format issues. XML takes Java to the next level by providing a flexible and extensible tag definition environment that is machine independent. Java applications coupled with XML data formatting are better able to adapt to data format changes in a generic, nonprogrammatic way. This shortens time to market and gives developers the ability to react more quickly to market changes.

More Stories By Israel Hilerio

Israel Hilerio is a program manager at Microsoft in the Windows Workflow Foundation team. He has 15+ years of development experience doing business applications and has a PhD in Computer Science.
