My blog has moved and can now be found at http://blog.aniljohn.com

Tuesday, July 19, 2005

Some time back (in late March of this year), there was a very interesting thread on the SOAPBuilders mailing list discussing web services InterOp. Many people took part, including folks from IBM, Microsoft, WS-I and more, and some great real-life issues were brought up.

At the time I made a note to document the points raised, if for nothing else than my own benefit. I am sure the thread is archived, so you can check who said what if you need to, but here are some of the important (at least to me) points made during that discussion:

  • Most InterOp challenges occur at the point where SOAP implementations attempt to map language objects into XML data and vice versa. A key factor affecting InterOp is the impedance mismatch between language type systems and the XML type system.
  • The impedance mismatch in question is caused by the various web service stacks each implementing a subset of W3C XML Schema... and they are not all the same subset!
  • Worse, the toolkits implement the same subset in different ways. My favourite example is .NET 1.0's problems with handling xsi:nil for value types like integers and enums, and how that interacts poorly with Axis' tendency to use xsi:nil everywhere. You can argue about which platform is wrong, but at the end of the day it is still difficult to build interoperable applications.
  • ..just to clarify -- Anil said that you shouldn't have problems if you are using the same technology/platform -- and by that he refers to a particular SOAP implementation -- not just a language. You may experience interop problems going Java-to-Java if, for example, you are using Apache Axis on one side and Sun JWSDP on the other.
  • An example on the Java side is that the <choice> content group is not supported in JAX-RPC, while an example on the .NET side is that substitution groups are not supported by the .NET XmlSerializer.
  • The idea behind the WS-I Basic Profile was to make sure that a service description defines an unambiguous wire format.  One of the first decisions taken by the WS-I was to drop SOAP encoding in favor of XML Schema to express the wire format for data types.  Further, the Basic Profile expressly allows the use of all types and constructs in the W3C XML Schema 1.0 Specification 2nd Edition.

    Of course, lots of toolkits have problems with XML Schema -- especially around language bindings.  But this is a thorny problem to solve.  The text of the XML Schema spec can be hard to digest and type systems are complex subjects.  Even with its shortcomings, there aren't really any major ambiguities or internal inconsistencies in the XML Schema spec itself.  This is a different situation from some of the original web service specs, whose ambiguities partly led to the creation of the WS-I Basic Profile.

    So how could this problem be resolved?  You could "profile away" XML Schema features and disallow the use of constructs that cause toolkits to choke.  But many (including me) think this is a bad idea.  What would you remove and for what language/toolkit problem (since toolkit support for XML schema varies widely)?

    Would you eliminate the use of xs:nonPositiveInteger because there is no directly related type in most languages?  That would mean those who rely on XML Schema validation would have to implement that particular bounds checking in their own code.

    The other issue is that there are lots of people (including me) who don't want to see the capabilities of XML and Infosets reduced for the sake of easier binding to languages.  But there are also lots of people who just want object-based (for example) code to remote in an interoperable way via SOAP.

    So two bad outcomes of all this would be (a) winners and losers get picked or (b) multiple web service standards emerge (perhaps one emphasizing full "XML capabilities", one emphasizing easy language bindings).

    I'm still hopeful that the toolkits will simply continue to improve their support of the XML Schema spec.  Unsupported schema constructs are OK as long as the toolkit allows a developer to mitigate the problem.  For example, don't error out (or worse, blow up) because there's an xs:redefine present -- expose the construct as an XML node type and let the developer process it in their code.
  • > I am curious to find out if the WS-I or anyone else currently have
    > pointers to any documentation from the various toolkit vendors that
    > show exactly which schema artifacts are currently unsupported by their
    > products.

    I don't know of any matrix comparing schema features and toolkit support. I wouldn't expect the WS-I to take that effort on because it wants to avoid certifying toolkits directly.  But I've heard a lot of people *wishing* for such a matrix -- in the spirit of the earlier SoapBuilders interop grids.
  • > But therefore I wonder: given the importance that XSD bindings have
    > for interoperability, how is that the WS-I BP doesn't put any limit
    > into the XSD constructs? If they do, toolkits would agree what it
    > should be supported. In this sense, maybe the limit could be those
    > constructs that were defined in the SOAP encoding soap1.1 section 5.

    So which XML Schema features would you pick to eliminate?  Coming up with an XML Schema sweet spot would mean deciding which toolkits and/or languages "matter" more than others. 

    Also, many of the WS-* specs use XML Schema to describe message formats – not to mention that XML Schema itself has a schema.  So, in eliminating something like derivations, redefinitions, or the xs:choice compositor you can actually cause the web services stack itself to get internally inconsistent (self-non-conformant?  Hmm.).

    Anyway, I seriously doubt the WS-I would reconsider SOAP encoding.  It was tossed from consideration in the BP very early on and with broad consensus.  XML Schema – which wasn't a standard when the SOAP spec was initially written, BTW – was considered a superior type system for describing XML documents.  So, XML Schema was adopted even though it set the effectiveness of existing toolkits back. 

    I find this last bit interesting.  The WS-I is basically run by the major toolkit vendors.  And although nothing gets published without affirmation by the whole membership, the WS-I Board approves what gets voted on.  Board discussions and votes are secret and it only takes 2 "no" votes out of 9 to defeat a motion to publish material. 
    Despite all this, the WS-I published the BP even though these same vendors knew it would exacerbate areas where their toolkits were pretty weak.
  • > although XSD is very powerful for describing type semantics, current
    > toolkits don't support all the constructs and this produces impedance
    > mismatches.

    The impedance mismatch is between the XSD types (hierarchies) and OO language types (rich object graphs), regardless of the abilities of the current toolkits -- the type systems are fundamentally different.

    I actually think that the major interoperability issues are caused from the other direction -- current toolkits aren't very good at mapping rich object graphs to hierarchical structures. Many toolkits attempt to treat SOAP/WSDL as just another distributed object system (similar to CORBA, RMI, and DCOM). The toolkits focus on generating XSD definitions from code -- and lots of developers try to expose language-specific object types (Java hashmaps, .NET Datasets, etc) through their SOAP interfaces. This approach often results in interoperability challenges.

    If developers start with WSDL descriptions and XSD types and generate code from them, the interop issues are definitely lessened. And if the toolkit doesn't support a specific XSD construct, the toolkit can always resort to DOM.
  • > Can any language implement any XSD type? If the answer is not, here
    > there is an interoperability problem.

    I understand the argument, but respectfully disagree.  The purpose of XML Schema in WSDL is to describe the format of an XML message.  As long as that format is unambiguously described, interoperability is maintained.

    WSDL / XML Schema was never intended to prescribe -- or even describe -- a programming model.
  • > If I am a Java programmer that wants to create a service that sends
    > say, a hashmap, by using WSDL-first I would define somehow a XSD type
    > (something like a list of key-value pairs), but then the toolkit will
    > not generate the hashmap type, so I would have to program the hashmap
    > behaviour myself. Isn't this an inconvenience? As a programmer, I
    > would prefer to use predefined types.

    Not all languages have hashmaps, and not all hashmaps are the same (e.g., C++ STL vs Java).  If you don't care about non-Java languages, then you don't need the interop provided by WSDL-first.

    But if you don't care about non-Java, why not just use RMI?
  • My point of view was that with the WSDL-first approach the service implementation (whether it is Java, .NET, ...) cannot take advantage of the language's capabilities. You get a mapping of the XSD-defined types (arrays, lists, etc.), but not a mapping of more complex structures like trees (unless you do the parsing yourself using DOM, as was said before). If your language supports graph types, IMHO you can't take advantage of that using WSDL-first.
  • Right.  And my point was that if you want to support clients written in many different languages, then this is the price you have to pay.
  • Or, alternatively, you must build an abstraction layer between your WSDL interface and your internal object model.
  • In our enterprise we do not consider support via DOM to be support; it is a vendor cop-out that is only slightly better than no support, and too many vendors use DOM support to claim support.
     
    In our strictly WSDL-first-development environment, we have found the interoperability problems associated with varying xsd support to be so significant that we maintain and enforce internal standards regarding allowed/disallowed XSD constructs. Furthermore, we have found toolkit xsd support variability and implementation quality to be a significant problem and, therefore, we have standards that strictly limit the WS toolkits that may be used. We allow three toolkits and life would be improved if we could reduce it to one.
     
    In my view, the state of WS interoperability is reminiscent of CORBA interoperability. And like CORBA interoperability, these problems will be solved over time.
  • > I'm still curious what are the existing interop issues for ints and
    > dateTimes?
    > I see it mentioned from time to time, but what are the concrete
    > examples?

    I know that in .NET 1.x DateTime types are "value types", which means that there is no notion of a NULL value or empty content.  So there is no way to interpret an empty node in a message as a DateTime.

    In .NET 2.0, all value types get a "default" value that will let NULL get interpreted at least in a consistent way.

    I don't know the issue with int values.  However, in .NET 1.x some integer-ish XML Schema types -- like xs:nonNegativeInteger -- get cast as strings in generated types.  I never got a good reason why.
  • For ints, it's the problem when you want to make them optional. So it's not strictly fair to say it's int interop; it's optional parameters that are value types in Java and/or C#.

  • Right, nullable value types were not supported in .NET 1.0/1.1; this was fixed in 2.0 by introducing Nullable<T>.

  • nil and minOccurs='0' typically get handled in different ways on different platforms; if you're lucky, the platforms you care about handle them in ways that don't cause trouble. Unfortunately, today Java tools tend to serialize nulls explicitly as xsi:nil='true' unless told otherwise, whilst .NET tends to do either that or not serialize the element at all, depending on whether it's a value type or a reference type. This gets better in .NET 2.0, but that is still in beta. I expect people to run into this with .NET 1.1 for quite a while, because an xsi:nil='true' on an element that's an int in .NET will cause it to barf.

    >I've never personally had a dateTime problem, which in retrospect
    >surprises me. Our users have a lot of confusion about timezones, but
    >the interop is actually working the way it is supposed to.

    Steve Loughran has written a number of times about problems with dateTime. I've never fully understood the issue he talks about, although lots of users get confused over timezones and whether their toolkit works with UTC or local times (as most platforms' DateTime datatype typically doesn't retain TZ info).

    One other problem I regularly see is that xsd:long is used in a few services but has no usable mapping to COM. COM itself does support the equivalent of an xsd:long with the VT_I8 type, but every COM environment except for C++ doesn't support the type. This, for example, makes it tough to use the Google AdWords API with PocketSOAP.

    One final problem I haven't seen mentioned, and I suspect most vendors don't even consider it an issue, is the evolution of code-generated proxies. For example, code written against an Axis 1.1 generated proxy is not necessarily going to work when the proxy is regenerated with Axis 1.2 (as the xsd -> java type mapping rules are now different). Or with .NET, if the WSDL is evolved to contain 2 services instead of 1, .NET will start using the binding name instead of the service name for its generated classes, breaking any existing code.

  • >Barring the null issue, I think the interop on basic constructs (value
    >types, structs, arrays) has been fairly satisfactory. But I could be
    >proven wrong...

    Between Java and .NET, I think you're right that basic interop works. And I apologize for continuing to bring up the null issues, it's just a specific example I understand really well.

    But the whole world isn't Java and .NET. In Perl, SOAP::Lite did a pretty rough job with document/literal up until the 0.65 release a few months ago, and even then it still fakes it in a lot of places. In Python, SOAPpy doesn't really do document/literal at all (although you can fake it talking to Axis); ZSI is better, but I don't have much experience with it. PHP now offers us three choices for SOAP but last I checked none of them really worked right. gSOAP for C is apparently quite good at talking to Axis in doc/lit.

    ... It's not so much that it's impossible to make doc/lit services interop, it's that it's difficult. That it requires a lot of expertise. This isn't off the shelf technology; users have to understand a lot about XML, and Schema, and they have to fiddle with the code and their usage of it. And the debugging experience for them is awfully confusing.
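
To make the nil versus minOccurs='0' point above concrete, here is a minimal Python sketch of a receiver that tolerates both styles for an optional integer. It is not taken from the thread: the element names (order, quantity) and the helpers read_optional_int and parse_order are made up for illustration, and only the standard library's xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace of the xsi:nil attribute (XML Schema instance namespace).
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_NIL = f"{{{XSI_NS}}}nil"


def read_optional_int(parent: ET.Element, name: str) -> Optional[int]:
    """Return the int value of child `name`, or None if it is absent or nil.

    Hypothetical helper for illustration only.
    """
    child = parent.find(name)
    if child is None:
        # minOccurs="0" style: the sender left the element out entirely.
        return None
    if child.get(XSI_NIL, "false").strip() in ("true", "1"):
        # nillable="true" style: the element is present but flagged as nil.
        return None
    text = (child.text or "").strip()
    if not text:
        # Empty content without xsi:nil is not a valid integer; fail loudly.
        raise ValueError(f"<{name}> is empty but not marked xsi:nil")
    return int(text)


def parse_order(xml_text: str) -> Optional[int]:
    """Parse a (hypothetical) <order> message and return its optional quantity."""
    return read_optional_int(ET.fromstring(xml_text), "quantity")
```

Either way the caller gets None, which is roughly what a nullable int gives you; a stack that maps the element straight onto a non-nullable value type has nowhere to put that None.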

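The hashmap exchange above (publish a list of key-value pairs in the schema and keep the map behaviour in a thin layer of your own code) might look something like this in Python. Again, this is a sketch under assumptions: the properties/entry/key/value element names and the two helper functions are hypothetical, not from any toolkit discussed in the thread.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def map_to_xml(values: Dict[str, str], tag: str = "properties") -> ET.Element:
    """Serialize a dict as a flat list of <entry><key/><value/></entry> elements."""
    root = ET.Element(tag)
    for key, value in values.items():
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "key").text = key
        ET.SubElement(entry, "value").text = value
    return root


def xml_to_map(root: ET.Element) -> Dict[str, str]:
    """Rebuild the dict from the entry list.

    A plain list on the wire does not guarantee unique keys,
    so the map semantics are checked here, in our own layer.
    """
    result: Dict[str, str] = {}
    for entry in root.findall("entry"):
        key = entry.findtext("key")
        if key is None:
            raise ValueError("<entry> without <key>")
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        # findtext returns "" for an element that is present but empty.
        result[key] = entry.findtext("value", default="")
    return result
```

Any client that can read a repeating element can consume this shape, whether or not its language has a built-in map type.
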
A very interesting and educational discussion. Hopefully it is of value.

Monday, August 1, 2005 4:39:41 PM (Eastern Daylight Time, UTC-04:00)
Very good article. I've been struggling with the variable and unpredictable Schema support by JAX-RPC (JWSDP 1.5) and Axis 2 (former doesn't allow XML "any" but latter does). Gimme good old CORBA/IDL back - it seems so much simpler than WS-*!

-- andre
Andre