Sharing structured data

XML Magazine


CFDJ: Article

Building Blocks

In the past two years, we have witnessed an explosion of Web services and XML communication technologies. While WSDL, SOAP, and UDDI have become the accepted bases of Web services, there are even more standards in the making.

This article is the first of a two-part series that examines the Web services technological space in order to provide an overview of some of the major Web services standards now in progress in various organizations and consortiums across the country.

General Classifications of Web Services
Web services technology can be broadly classified into three main groups, as shown in Figure 1.


The description stack deals with a wide range of technologies that describe Web services in order to facilitate their common use for business process modeling and workflow choreography in B2B collaborations. The discovery stack deals with technologies that allow for directory, discovery, and inspection services. The wire stack consists of technologies that provide the steam for the runtime engines of Web services.

Figure 2 breaks these stacks into their subcomponents. Many of the available Web services technologies can be mapped to these stacks, although not all stacks have a corresponding specification or technology.


Functional Classifications of Web Services Technology
Another useful way to organize the Web services technology space is according to function. We are interested in understanding why these technologies matter, what their purpose is with regard to solving business issues, and what their relationship to one another is. If we want to create a basic Web service, for example, what kind of technologies exist that can help? If we want to upgrade our basic service to a mission-critical Web service, what steps must we take?

The list below shows the various functional areas of the Web services technology space. This list is not comprehensive, but it does cover most of the available technologies.

  • Basic service
    - Service description
    - Communication protocols
    - Transport protocols

  • Complex payloads
  • Discovery
    - Inspection
    - Directory services

  • Enterprise strength
    - Transaction
    - Security
    - Reliability
    - Routing

  • B2B collaboration
    - Process modeling and orchestration

    In the remainder of this article, I'll address each of the functional categories of Web services technology and discuss the standards that apply.

    Basic Service
    There are two main groups of technologies we must consider to create a basic Web service: service description and communication protocols. Transport protocols aren't specific to Web services, and hence aren't covered here.

    Service Description
    Service description standards specify what a service is about, what actions are supported by the service, what input and output parameters the service takes, and how the service deals with error conditions. In 2000, IBM and Microsoft came out with competing technologies: NASSL and SDL, respectively. Fortunately, they were soon merged into WSDL (Web Services Description Language). As of today, WSDL stands as one of the most widely adopted technologies for describing Web services. WSDL 1.1 has been submitted to the W3C, which has recently started a working group to ratify it.

    WSDL takes a two-step approach to describing Web services. The first step is to provide an abstract definition of services and the data format; the second is to bind this abstract definition to concrete protocols. This two-step process permits reuse; it's possible to have many similar Web services based on one abstract definition with each implemented using different protocols.

    WSDL is independent of any network and communication protocols, although it does define default bindings to HTTP, SOAP, and MIME. Similarly, WSDL isn't tied to any particular type system, although it does use XML Schema. WSDL is designed to be extensible so that it can work with different type systems and with other network and communication protocols. Listing 1 demonstrates the simple WSDL grammar used to describe services. Note: This example isn't complete and won't parse; more namespaces need to be defined.

    In Listing 1 the message element, along with the part element, defines the data in abstract terms. The operation element defines the action supported by the service. WSDL defines four basic operations: one-way, request-response, solicit-response, and notification. The portType element acts as a container for a set of abstract operations. In this example, we define a portType element, "StockQuotePortType", with a single operation, "GetLastTradePrice", which takes an input message, "GetLastTradePriceRequest", and gives an output message, "GetLastTradePriceResponse".

    These abstract definitions are then bound to concrete protocols using the binding element. The port element captures the communication endpoint details, and the service element contains a list of related ports. The types element (not shown) acts as a data container holding various data type definitions. In the example, the operations in "StockQuotePortType" are bound to SOAP and HTTP.
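
    The split between abstract definitions and concrete bindings can be sketched with Python's standard xml.etree.ElementTree module. The WSDL fragment below is a trimmed illustration built around the names used in this article. The example.com URIs, the part names, the binding and port names, and the describe helper are all assumptions made for this sketch, and the binding leaves out the per-operation soap:body details a complete document would carry.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_BINDING_NS = "http://schemas.xmlsoap.org/wsdl/soap/"

# Trimmed, illustrative WSDL 1.1 document; the example.com URIs are made up.
WSDL_DOC = """\
<definitions name="StockQuote"
    targetNamespace="http://example.com/stockquote.wsdl"
    xmlns:tns="http://example.com/stockquote.wsdl"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="GetLastTradePriceRequest">
    <part name="tickerSymbol" type="xsd:string"/>
  </message>
  <message name="GetLastTradePriceResponse">
    <part name="price" type="xsd:float"/>
  </message>
  <portType name="StockQuotePortType">
    <operation name="GetLastTradePrice">
      <input message="tns:GetLastTradePriceRequest"/>
      <output message="tns:GetLastTradePriceResponse"/>
    </operation>
  </portType>
  <binding name="StockQuoteSoapBinding" type="tns:StockQuotePortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="GetLastTradePrice">
      <soap:operation soapAction="http://example.com/GetLastTradePrice"/>
    </operation>
  </binding>
  <service name="StockQuoteService">
    <port name="StockQuotePort" binding="tns:StockQuoteSoapBinding">
      <soap:address location="http://example.com/stockquote"/>
    </port>
  </service>
</definitions>
"""


def local(qname):
    """Drop the namespace prefix from a QName such as 'tns:Foo'."""
    return qname.split(":")[-1]


def describe(wsdl_text):
    """Return (abstract operations, concrete bindings, endpoints)."""
    root = ET.fromstring(wsdl_text)
    ns = {"w": WSDL_NS, "soap": SOAP_BINDING_NS}

    # Step 1: abstract definitions (portType -> operation -> messages).
    abstract = {}
    for port_type in root.findall("w:portType", ns):
        operations = {}
        for op in port_type.findall("w:operation", ns):
            operations[op.get("name")] = (
                local(op.find("w:input", ns).get("message")),
                local(op.find("w:output", ns).get("message")),
            )
        abstract[port_type.get("name")] = operations

    # Step 2: concrete bindings (which portType, over which transport).
    concrete = {}
    for binding in root.findall("w:binding", ns):
        soap_binding = binding.find("soap:binding", ns)
        concrete[binding.get("name")] = (
            local(binding.get("type")),
            soap_binding.get("transport"),
        )

    # Ports carry the communication endpoint for each binding.
    endpoints = {}
    for port in root.findall("w:service/w:port", ns):
        endpoints[port.get("name")] = port.find("soap:address", ns).get("location")
    return abstract, concrete, endpoints


abstract, concrete, endpoints = describe(WSDL_DOC)
```

    Because the portType is described separately from the binding, a second binding element could reuse "StockQuotePortType" over a different protocol without touching the message or operation definitions.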

    WSDL enjoys the support of many tools. Some help generate WSDL from existing Java and C++ classes, and others generate Java and C++ classes from WSDL documents.

    Communication Protocols
    Standards in the communication protocols area deal with message format and serialization details. In order for a receiver to correctly parse and digest a message, the format of the message must be known. In contrast to the service description area, many protocols have been published in the communication area. These protocols include XML-RPC, SOAP, the ebXML messaging specification, WDDX, and Jabber.

    XML-RPC is an XML-based RPC protocol that came from UserLand Software in 1998. It is based on HTTP POST and has a simple data model. Compared to SOAP it is simple; in addition to RPC, SOAP provides much richer processing semantics, an enhanced data model, and support for messaging. SOAP has garnered a great deal of attention and a huge user base.
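
    To give a feel for how small the XML-RPC data model is, the sketch below uses Python's standard xmlrpc.client module to serialize a call and a response without contacting any server. The method name getStockQuote and the values are assumptions chosen to match this article's stock quote theme.

```python
import xmlrpc.client

# Serialize a request: one string parameter, one method name.
request_xml = xmlrpc.client.dumps(("IBM",), methodname="getStockQuote")

# A receiver parses the same document back into parameters and method name.
params, method = xmlrpc.client.loads(request_xml)

# Serialize and parse a response carrying a single double value.
response_xml = xmlrpc.client.dumps((98.5,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)

print(request_xml)
```

    In a real exchange the request document would be the body of an HTTP POST; here it is only printed.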

    The ebXML messaging specification, built on top of SOAP, is one part of a set of ebXML specifications. WDDX, an effort from Allaire, is focused on providing a simple, lightweight data exchange mechanism for Web programming languages such as ColdFusion, ASP, Perl, and PHP. Though RPC semantics can be layered on top of WDDX, it isn't as widely adopted as SOAP for RPC purposes. Jabber is an open-source protocol that enables exchange of structured information in a near-real-time manner between two or more end points. Jabber is used in the instant messaging areas.

    Let's look at SOAP in detail, since it is the protocol of choice for most Web services.

    SOAP, the Protocol of Choice
    SOAP has come a long way since its 0.9 release by Microsoft in 1999. SOAP is now handled by the W3C, which was close to publishing a last-call working draft of SOAP 1.2 at the time of this writing.

    SOAP is a lightweight XML-based communication protocol for the exchange of information in a decentralized, distributed environment. SOAP is neutral with regard to language, platform, and programming model, allowing both the sender and the receiver to operate in their environment of choice. SOAP documents can be exchanged over many transport protocols.

    The SOAP specification can be broadly classified into four main parts:

    • A framework for describing the content of a message and how to process it
    • A simple data model and a set of encoding rules for serialization
    • A convention for representing remote procedure calls and responses
    • A binding to HTTP

    The SOAP "grammar" can be best demonstrated by a SOAP message, as shown below.

    <env:Envelope xmlns:env=

    In this example, the SOAP message is identified by the namespace-qualified root element "Envelope". The Envelope namespace determines the version of the SOAP specification to which a SOAP message conforms. The header element is optional; it is typically used to carry out-of-band information, such as transaction or security information. The header can contain any number of namespace-qualified XML elements, called entries or blocks. The above example contains one header entry named "app:transactionId". The body element contains the essence of the message intended for the endpoint. Unlike the header element, the body element must be contained in every SOAP message; the body element can contain one or more namespace-qualified XML elements, called entries or blocks. The above example contains one application-defined body entry named "app:getStockQuote". SOAP defines one body block, called Fault, to represent errors.
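
    The envelope, header, and body structure just described can be sketched in Python with xml.etree.ElementTree. This is an illustration, not a reproduction of the message above: the envelope namespace used here is the one from the final SOAP 1.2 Recommendation (drafts current at the time of writing used other URIs), and the app namespace URI, the symbol element, and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the final SOAP 1.2 Recommendation.
ENV_NS = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical application namespace.
APP_NS = "http://example.com/stockquote"

ET.register_namespace("env", ENV_NS)
ET.register_namespace("app", APP_NS)


def build_envelope(transaction_id, symbol):
    """Build an Envelope with one header entry and one body entry."""
    envelope = ET.Element(f"{{{ENV_NS}}}Envelope")

    # Optional header carrying out-of-band information.
    header = ET.SubElement(envelope, f"{{{ENV_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}transactionId").text = transaction_id

    # Mandatory body carrying the application payload.
    body = ET.SubElement(envelope, f"{{{ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}getStockQuote")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return envelope


def body_entries(envelope):
    """Return the qualified names of the body entries."""
    body = envelope.find(f"{{{ENV_NS}}}Body")
    if body is None:
        raise ValueError("a SOAP message must contain a Body element")
    return [child.tag for child in body]


def is_fault(envelope):
    """True if the body carries the SOAP-defined Fault block."""
    return f"{{{ENV_NS}}}Fault" in body_entries(envelope)


message = ET.tostring(build_envelope("5", "IBM"), encoding="unicode")
print(message)
```

    A receiver can decide which version of SOAP it is dealing with by comparing the namespace of the root element against the URIs it supports, which is why the check in body_entries is namespace-qualified.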

    As part of its encoding rules, SOAP defines a simple data model consisting of simple types, compound types similar to structs in programming languages, an array type, and an ID/HREF type that represents references. The encoding rules define a particular serialization rule for this data model. SOAP data model and encoding rules are optional. SOAP defines an "encodingStyle" attribute under the "env" namespace, which can be used to specify a particular encoding rule in effect for a specific element or group of elements.

    Like encoding rules, the RPC conventions defined by SOAP are optional. In SOAP, both the request and the response of an RPC call are modeled as structs; they can also be modeled as arrays, according to recent changes in the SOAP specification. The name of the struct represents the name of the method being invoked. The parameters of a request or the results of an invocation are modeled as named accessors inside the struct. Our example message is an RPC request defined according to SOAP-RPC conventions. Though SOAP has defined a set of conventions for RPC, SOAP is not RPC-centric. It can be used for any general-purpose messaging.
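
    A minimal sketch of the RPC convention in Python: the call becomes a struct named after the method, and each parameter becomes a named accessor inside it. The namespace URI and the encode_call and decode_call helpers are hypothetical, and every value is serialized as plain text rather than following the SOAP encoding rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical application namespace for the method struct.
APP_NS = "http://example.com/stockquote"


def encode_call(method, params):
    """Model an RPC request as a struct named after the method."""
    struct = ET.Element(f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        # Each parameter is a named accessor inside the struct.
        ET.SubElement(struct, name).text = str(value)
    return struct


def decode_call(struct):
    """Recover the method name and parameters from the struct."""
    method = struct.tag.split("}", 1)[-1]
    params = {child.tag: child.text for child in struct}
    return method, params


call = encode_call("getStockQuote", {"symbol": "IBM"})
method, params = decode_call(call)
```

    The struct produced by encode_call would travel as an entry in the SOAP body; nothing about the envelope itself changes when SOAP is used for RPC instead of general messaging.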

    SOAP can be exchanged over many transport protocols, but the SOAP 1.2 specification defines a binding to HTTP and provides an e-mail binding.

    The W3C working group on SOAP is expected to publish their recommendation around August 2002. To participate or follow their progress, go to

    Complex Payloads
    So far we've looked at technologies that help create a basic XML Web service. The data exchange format and the message format in all these technologies is XML. But not all of the world's data is in XML. We have legacy systems, EDI systems, images, and many more formats. How can we use these new technologies for non-XML data? Converting all this data into XML is inefficient and time consuming; also, XML may not be the best representation for all kinds of data. For example, JPEG may make better sense for images. Even sending arbitrary XML could be a problem. We cannot simply take one XML document, insert it into another, and expect to end up with a valid XML document. Even to carry arbitrary XML in XML-based protocols such as SOAP, we need help.

    There are at least two technologies that address this space:

    • SOAP with Attachments
    • DIME

    SOAP with Attachments
    SOAP with Attachments (SwA) was an effort by a group of individuals to combine the existing SOAP and MIME technologies to facilitate carrying arbitrary data in SOAP. The W3C has published SwA as a W3C note.

    SwA doesn't introduce any new technology. Rather, it uses the referencing facilities in SOAP (HREF attribute) and Multipart MIME (RFC 2045) to make it possible to carry arbitrary data. The whole message is constructed as a multipart MIME message with the SOAP message as the root part. The MIME message can have any number of MIME parts, and the SOAP message can refer to any of these parts using the HREF attribute. In addition, the specification places a few more constraints (such as content-type and start parameter), and makes some recommendations on how the reference URIs in the HREF attribute can be resolved using existing RFCs.
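
    The packaging described above can be sketched with Python's standard email package: a multipart/related message whose root part is the SOAP envelope, plus one binary part that the envelope points to through a cid: reference. The example.com identifiers, the claim element names, and both helper functions are assumptions made for illustration, and the image bytes are a stand-in rather than a real TIFF.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical SOAP payload that refers to the attachment by Content-ID.
SOAP_PART = """<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <app:claim xmlns:app="http://example.com/claims">
      <app:signedForm href="cid:claim061400a.tiff@example.com"/>
    </app:claim>
  </env:Body>
</env:Envelope>"""


def build_swa_message(soap_xml, attachment_bytes):
    """Wrap a SOAP envelope and one binary attachment in multipart/related."""
    root_cid = "<claim-envelope@example.com>"
    # The type and start parameters identify the SOAP envelope as the root part.
    package = MIMEMultipart("related", type="text/xml", start=root_cid)

    soap = MIMEText(soap_xml, "xml", "utf-8")
    soap["Content-ID"] = root_cid
    package.attach(soap)

    image = MIMEImage(attachment_bytes, _subtype="tiff")
    image["Content-ID"] = "<claim061400a.tiff@example.com>"
    package.attach(image)
    return package


def resolve_href(package, href):
    """Find the MIME part a cid: reference points to and return its bytes."""
    wanted = "<" + href[len("cid:"):] + ">"
    for part in package.get_payload():
        if part["Content-ID"] == wanted:
            return part.get_payload(decode=True)
    raise KeyError(href)


fake_tiff = b"II*\x00 placeholder image data"
package = build_swa_message(SOAP_PART, fake_tiff)
attachment = resolve_href(package, "cid:claim061400a.tiff@example.com")
```

    Note that the binary part is base64-encoded by default when the message is serialized, which is the encoding overhead that comes up later when SwA is compared with DIME.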

    Listing 2 shows a SOAP 1.2 message with an attached facsimile image of a signed claim form (claim061400a.tiff).

    Until recently, SwA was the most popular way to carry arbitrary data in SOAP; now DIME seems to be shifting the balance. The W3C has not yet started any work on SwA.

    DIME (Direct Internet Message Encapsulation) came from Microsoft and is published as an Internet-Draft by the Internet Engineering Task Force (IETF). DIME is a packaging protocol for multiple binary records with a fixed format and a variable record length. DIME allows for chunking, a process in which data is streamed out in pieces, so the sender doesn't have to hold it all in memory to calculate its total length. DIME has "begin record" and "end record" boundaries so the records can be assembled in order at the receiving end. Figure 3 provides details of the DIME record structure.

    The MB, ME, and CF fields are single-bit flags marking the beginning of a message, the end of a message, and a chunked record, respectively. The Type Name field is a 3-bit field indicating the structure of the value of the type field. DIME provides a numeric value mapping for different media and MIME types. The ID field is used to give an identifier for each DIME payload. The maximum size of the data field is limited to 4 GB.
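
    The idea behind length-prefixed, chunkable records can be sketched in a few lines of Python. To be clear, this is not the DIME wire format: the real record also carries version, type, ID, and padding fields, and the bit positions below are hypothetical. The sketch keeps only a flags byte and a 32-bit length, which is also where the 4 GB ceiling on the data field comes from.

```python
import struct

# Hypothetical bit positions; the real DIME header is laid out differently.
MB, ME, CF = 0x04, 0x02, 0x01
# One flags byte followed by a 32-bit unsigned data length (big-endian).
HEADER = struct.Struct(">BI")


def encode_stream(chunks):
    """Emit one record per chunk without knowing the total length up front."""
    iterator = iter(chunks)
    current = next(iterator, b"")
    first = True
    for upcoming in iterator:
        # More data follows, so mark this record as a chunk.
        flags = (MB if first else 0) | CF
        yield HEADER.pack(flags, len(current)) + current
        current, first = upcoming, False
    # Final record: chunk flag cleared, message-end flag set.
    flags = (MB if first else 0) | ME
    yield HEADER.pack(flags, len(current)) + current


def decode(buffer):
    """Reassemble the payload by following lengths, not by scanning for boundaries."""
    payload = bytearray()
    offset = 0
    while True:
        flags, length = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        payload += buffer[offset:offset + length]
        offset += length
        if not flags & CF:
            return bytes(payload)


records = list(encode_stream([b"abc", b"de", b"f"]))
reassembled = decode(b"".join(records))
```

    The decoder never inspects the payload bytes themselves; it jumps from header to header using the declared lengths.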


    Microsoft has also published a companion Internet-Draft that shows how SOAP messages can use DIME to send arbitrary data. Using SOAP with DIME is somewhat similar to using SwA. In both cases, the SOAP message is wrapped in a compound structure with the SOAP message as the root or first message, and the referenced parts as the second. DIME specifies rules for resolving the URIs referenced through HREF attributes in SOAP messages; they are similar to SwA rules (RFC 2396 and 2557). DIME also adds on to SOAP-HTTP binding semantics by specifying the content-type as application/dime, rather than the default text/xml specified by SOAP.

    It's important to note that in both SwA and DIME, the SOAP message itself travels as either a MIME or DIME message with respect to the carrier or transport protocols.

    SwA Versus SOAP with DIME
    DIME is designed for simplicity, with SOAP and XML Web services in mind, while MIME offers great flexibility. DIME makes data handling a bit easier, as it requires that the data length be specified. DIME also makes parsing easier, since it's easier to identify the boundaries of different records using data length, rather than scanning for the string separators used in Multipart MIME to separate data records. The compulsory inclusion of data length may also help in heap management. DIME does not require encoding of binary data, and hence may be faster. MIME on the other hand is very flexible and well-understood, with many implementations supporting it.

    If programmatic discovery of Web services needs to be supported, there are a few technologies that can help:

    • WS-Inspection (WS-I)
    • UDDI
    • ebXML-Registry and Repository Specification

    UDDI has garnered as much attention as SOAP and WSDL; the three together are now considered the basic building blocks of any Web service. Though ebXML has a registry specification of its own that can be used as a standalone specification, it hasn't attracted much attention. WS-I is a companion technology to UDDI that addresses a specific purpose in the area of service discovery. I'll focus on UDDI and WS-I.

    UDDI (Universal Description, Discovery, and Integration) is an effort by a group of companies that hasn't yet been submitted to any other consortium or standards body. UDDI consists of an XML Schema that allows a user to provide a description of a Web service along with its business information. The description of a service in UDDI schema takes a business-centric view, whereas WSDL takes a functional view. In addition, UDDI also defines an API specification that allows a user to publish Web services and to query and obtain information on other published services. UDDI publish/query is based on SOAP messages.
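
    As a rough illustration of a SOAP-based UDDI query, the Python sketch below builds an inquiry message that searches for businesses by name. It assumes the UDDI version 2 API (the find_business element in the urn:uddi-org:api_v2 namespace with generic="2.0") carried in a SOAP 1.1 envelope; the helper name and the search string are hypothetical, and no registry is contacted.

```python
import xml.etree.ElementTree as ET

SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption: UDDI version 2 inquiry API namespace.
UDDI_NS = "urn:uddi-org:api_v2"

ET.register_namespace("soapenv", SOAP11_NS)
ET.register_namespace("uddi", UDDI_NS)


def find_business_request(name):
    """Build a SOAP message asking a registry for businesses matching a name."""
    envelope = ET.Element(f"{{{SOAP11_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP11_NS}}}Body")
    query = ET.SubElement(body, f"{{{UDDI_NS}}}find_business", {"generic": "2.0"})
    ET.SubElement(query, f"{{{UDDI_NS}}}name").text = name
    return ET.tostring(envelope, encoding="unicode")


request = find_business_request("Example Corp")
print(request)
```

    In practice this document would be sent by HTTP POST to the inquiry URL of a registry node, and the response would list matching business entries.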

    The other face of UDDI is the repository itself. A repository implements the API specification with which users can publish or discover services. UDDI repositories are logically centralized and physically distributed. As of this writing, there are four node operators running UDDI registries: Microsoft, IBM, SAP, and HP.

    The UDDI repository implementations are open source; users could get them and run their own in-house UDDI repositories. Though UDDI was initially touted as the technology that would open the gates for dynamic discovery of Web services and dynamic collaboration, it is more and more frequently used for intranet and in-house repository needs.

    WS-I was a joint offering from IBM and Microsoft released in 2001. While UDDI involves going to a central place to publish and query about services, WS-I involves going to a site offering Web services and seeking information about the services offered there.

    WS-I defines a simple grammar to aggregate service description documents of various services offered at that site. The service descriptions can be in any format, such as WSDL or UDDI. There can be many service descriptions per service, and many services can be defined in a single WS-I document. WS-I also defines an extended binding grammar for both WSDL and UDDI that provides hints about what may be found in the referred service description documents.
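
    A small Python sketch of how a client might read such a document. The element and attribute names (inspection, service, abstract, description, referencedNamespace, location) and the namespace URI are assumed to follow WS-Inspection 1.0 and should be checked against the specification; the example.com locations and the list_descriptions helper are made up.

```python
import xml.etree.ElementTree as ET

# Assumption: WS-Inspection 1.0 namespace.
WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"

# Illustrative inspection document with one service and two descriptions.
INSPECTION_DOC = """\
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/">
  <service>
    <abstract>Stock quote service</abstract>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
                 location="http://example.com/stockquote.wsdl"/>
    <description referencedNamespace="urn:uddi-org:api"
                 location="http://example.com/uddi/inquiry"/>
  </service>
</inspection>
"""


def list_descriptions(document):
    """Return (abstract, [(description type, location), ...]) per service."""
    root = ET.fromstring(document)
    ns = {"wsil": WSIL_NS}
    services = []
    for service in root.findall("wsil:service", ns):
        abstract = service.findtext("wsil:abstract", default="", namespaces=ns)
        descriptions = [
            (d.get("referencedNamespace"), d.get("location"))
            for d in service.findall("wsil:description", ns)
        ]
        services.append((abstract, descriptions))
    return services


services = list_descriptions(INSPECTION_DOC)
```

    The referencedNamespace value tells the client what kind of description sits at each location, so it can pick the WSDL entry and ignore the rest.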

    WS-I makes some recommendations on how its documents may be made available to users, so they are easily found. WS-Inspection documents may also be placed within a content medium such as HTML.

    So far we have looked at technologies in service description, communication protocols, complex payloads, on-site inspection, and general discovery. In the next part of this series, we will look at technologies that deal with enterprise-strength issues, such as transactions and security, and technologies that cover routing and process orchestration.


    By Murali Janakiraman

    Murali Janakiraman is Rogue Wave's software architect for the XML Products team. He has been a developer, senior developer, and tech lead on almost all of Rogue Wave's database products, including DBTools.h++, JDBTools, DBTools.h++ XA, and RWMetro. For the past five years, Murali has focused on databases, distributed transactions, and object-relational mapping.
