Well… it appears I may have found another bug in .NET, this time related to ASP.NET Web Services and DataSet serialization. I’m not sure whether it is actually a bug, but it really does look like one. Perhaps someone from Microsoft could explain to me what’s happening here.
Here’s the scenario:
I’m sure just about everybody has worked with a DataSet, and most people have also serialized and deserialized one. In a typical Windows environment, where you’re in “complete” control of the serialization / deserialization process, this works like a charm. The problem comes in when you need to manipulate a DataSet within a web service using a custom SoapExtension.
I have written a custom SoapExtension to manipulate a DataSet in the web services pipeline, and it has been a bit of a mission to understand exactly what ASP.NET is doing under the hood when reconstructing the return data types from the SOAP Body. For the sake of confidentiality, I’ll simplify what I’m doing with the DataSet, and hopefully someone can enlighten me as to what’s actually happening here.
Let’s assume we have a DataSet being returned to the client from a web service. This DataSet has a single table with a single column called “Name”. Whenever the Name column is empty, I want to make it “Unknown”. So I intercept the SoapMessage and do what I need to do. I pick out the specific node that represents the DataSet, deserialize it, and make the necessary changes. When I’m done, I serialize the modified DataSet and overwrite the existing node with my changed DataSet.
The problem lies in the XML generated by the DataSet serialization process. When the method returns to the client, I am left with an empty DataSet. The schema is fully intact, but there is absolutely no data. Having traced the input and output at various stages, I found that the only difference between my changed DataSet and the unchanged DataSet was a single attribute: the xmlns="" on the NewDataSet element inside the diffgram, which is present in the unchanged version and missing from mine. Both versions are shown below:
My Serialized Version:
<DataSet>
  <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
    <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:Locale="en-ZA">
      <xs:complexType>
        <xs:choice maxOccurs="unbounded">
          <xs:element name="MyTable">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="Name" type="xs:string" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <NewDataSet>
      <MyTable diffgr:id="MyTable1" msdata:rowOrder="0">
        <Name>Stuart Gunter</Name>
      </MyTable>
    </NewDataSet>
  </diffgr:diffgram>
</DataSet>
Unchanged Version:
<?xml version="1.0" encoding="Windows-1252"?>
<DataSet>
  <xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
    <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:Locale="en-ZA">
      <xs:complexType>
        <xs:choice maxOccurs="unbounded">
          <xs:element name="MyTable">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="Name" type="xs:string" minOccurs="0" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <NewDataSet xmlns="">
      <MyTable diffgr:id="MyTable1" msdata:rowOrder="0">
        <Name>Stuart Gunter</Name>
      </MyTable>
    </NewDataSet>
  </diffgr:diffgram>
</DataSet>
From what I can see… this causes problems in the deserialization process in ASP.NET. It appears that the missing namespace attribute prevents ASP.NET from linking the schema to the data. The client receives the full data, but cannot match it to the schema because the two end up in different namespaces. The schema (“xs:schema” element) always carries the xmlns attribute (even if it’s empty), whereas in my serialized output the data element only gets an xmlns attribute when the DataSet’s namespace is not empty.
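To illustrate what I suspect is going on, here is a small sketch of how an XML parser resolves the NewDataSet element once the fragment is nested inside a parent that declares a default namespace. It uses Python's standard ElementTree rather than .NET, purely to show the namespace scoping rule. The wrapper element name (GetDataResult) and the http://tempuri.org/ namespace are my assumptions about what a SOAP response might look like; they are not taken from the trace above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the SOAP response element that wraps the DataSet.
# The element name and namespace URI are assumptions, not from the real trace.
WRAPPER = '<GetDataResult xmlns="http://tempuri.org/">{}</GetDataResult>'

# Data element as my serialization wrote it (no xmlns attribute).
WITHOUT_XMLNS = "<NewDataSet><MyTable><Name>Stuart Gunter</Name></MyTable></NewDataSet>"
# Data element as the unchanged version has it (explicit empty default namespace).
WITH_XMLNS = '<NewDataSet xmlns=""><MyTable><Name>Stuart Gunter</Name></MyTable></NewDataSet>'


def resolved_tag(fragment, wrapper=None):
    """Return the namespace-qualified tag of the NewDataSet element.

    If a wrapper is given, the fragment is parsed as a child of that wrapper,
    so it is subject to any default namespace the wrapper declares.
    """
    if wrapper is None:
        return ET.fromstring(fragment).tag
    root = ET.fromstring(wrapper.format(fragment))
    return root[0].tag


# Parsed on its own (like reading from a file), both forms are in no namespace.
print(resolved_tag(WITHOUT_XMLNS))           # NewDataSet
print(resolved_tag(WITH_XMLNS))              # NewDataSet

# Nested in the wrapper, only the xmlns="" form stays in no namespace.
print(resolved_tag(WITHOUT_XMLNS, WRAPPER))  # {http://tempuri.org/}NewDataSet
print(resolved_tag(WITH_XMLNS, WRAPPER))     # NewDataSet
```

If the real SOAP body does declare a default namespace on an ancestor element, this would explain why the same XML deserializes from a file but not via the web service: the schema is declared with xmlns="" and describes a NewDataSet in no namespace, while the data element silently inherits the ancestor's namespace.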
So ultimately, the solution to this problem is to assign a namespace value to the DataSet via the Namespace property on the DataSet. Having done this… it all works fine. It sounds pretty simple, but it’s very frustrating when you’re trying to figure out why the exact same XML will deserialize from a file, but not via the web service.
If this doesn’t make sense to anyone, please let me know. I’ll try to make a sample web service and client to demonstrate what’s going on. This is not a trivial issue (in my opinion), and it caused a lot of frustration because it’s not well documented. I searched Google a few times and there seems to be nothing on the net explaining this problem. It’s unusual that manual deserialization of a DataSet without this xmlns attribute works, but the ASP.NET deserialization does not. Are these using two different processes? I understand the significance of a namespace, but surely this should be documented somewhere?
So my question is… is this another bug, or is it by design? If it’s by design, please could someone explain it?
Thanks