Overview
In this module, you will learn:
This is the first of two modules that introduce you to the XML capabilities of the Microsoft .NET Framework. XML plays a major role in .NET as an enabling technology, and the .NET Framework provides full support for just about everything you’ll need to do with XML. This module assumes that you already know something about XML: you should be comfortable with elements, attributes, validation, namespaces, and the other paraphernalia that surrounds XML. There isn’t space here to give you a grounding in XML and its related technologies, so if you haven’t met XML before, you might want to consult a book on XML or a brief Web tutorial first.
XML and .NET
One of the major features of the .NET Framework is that it enables you to easily produce distributed applications that are language-independent and that will be platform-independent when .NET is ported to other platforms. XML plays a major part in this plan by acting as a simple, portable glue layer that’s used to pass data around in distributed applications. Microsoft has XML-enabled many parts of the .NET Framework, and a few of the main ones are listed below to give you a flavor of where and how XML is used:
Microsoft contributes to the efforts of the W3C working groups who define and set standards for XML and other Web protocols. Among the XML standards Microsoft currently provides developer support for are the following:
The XML Schema definition language (XSD), a current W3C standard for using XML to create XML Schemas. XML Schemas can be used to validate other XML documents.
Extensible Stylesheet Language Transformations (XSLT) 1.0, a current W3C XML style sheet language standard. XSLT is recommended for transforming XML documents.
The XML Path Language (XPath) 1.0, a current W3C XML standard used by XSLT and other XML programming vocabularies to query and filter data stored in XML documents.
The .NET Framework contains a number of namespaces supporting XML functionality, as summarized in the following table.
Namespace | Description |
System::Xml | The overall namespace for XML support. |
System::Xml::Schema | Support for World Wide Web Consortium (W3C) XML Schema definition language (XSD) schemas and Microsoft XML-Data Reduced (XDR) schemas. |
System::Xml::Serialization | Supports serializing objects to and from XML. |
System::Xml::XPath | Supports XPath parsing and evaluation. |
System::Xml::Xsl | Supports Extensible Stylesheet Language Transformations (XSLT). |
Table 1. |
This module will be mainly concerned with the System::Xml namespace and will touch on some of the capabilities of System::Xml::Schema. Module 26 will cover using the XPath and Xsl namespaces.
There are four main classes in the System::Xml namespace for processing XML. We’ll briefly list their capabilities and functionality here, before getting into more detailed examination in the rest of the module.
The XmlTextReader class is used for fast, forward-only parsing without validation. It checks that a document is well-formed and, if the document has a Document Type Definition (DTD), that the DTD is well-formed too, but it doesn’t use the DTD for validation. Forward-only parsing means that you parse the document from start to finish, and you can’t back up to reparse an earlier part of the document.
XmlValidatingReader implements a forward-only parser that provides more functionality than XmlTextReader, in particular, the ability to validate input using DTDs, W3C XML Schema Definition (XSD) schemas, or XDR schemas. Both XmlTextReader and XmlValidatingReader are derived from the abstract class XmlReader, which provides much of the basic functionality.
XmlTextWriter provides a fast, forward-only way to write XML to streams or files. The XML produced conforms to the W3C XML 1.0 specification, complete with namespace support.
XmlDocument implements the W3C Document Object Model (DOM), providing an in-memory representation of an XML document.
Let’s start by looking at how you can parse XML with the XmlTextReader class. XmlTextReader provides you with a way to parse XML data that minimizes resource usage by reading forward through the document, recognizing elements as it reads. Very little data is cached in memory, but the forward-only style has two main consequences. The first is that it isn’t possible to go back to an earlier point in the file without starting to read from the top again. The second consequence is slightly more subtle: elements are read and presented to you one by one, with no context. So, if you need to keep track of where an element occurs within the document structure, you’ll need to do it yourself. If either of these consequences sounds like a limitation to you, you might need to use the XmlDocument class, which is discussed in the “Using XmlDocument” section later in this module.
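To make the "no context" point concrete, here is a minimal sketch of the bookkeeping you would do yourself. It is plain standard C++, not C++/CLI, and it doesn’t use the .NET classes at all: the Kind enumeration, the Event structure, and the text_paths function are hypothetical names invented for this illustration. The idea is simply that a stack of open element names, updated as start and end tags go by, gives you the position of each piece of text in the document.

```cpp
#include <string>
#include <vector>

// Hypothetical event kinds, loosely modeled on the Element, EndElement,
// and Text node types. This is an illustration, not the .NET API.
enum class Kind { Element, EndElement, Text };

struct Event {
    Kind kind;
    std::string data;  // element name, or text content for Text events
};

// Walks the events once, front to back, and returns one entry per Text
// event in the form "/outer/inner = text". The stack of open element
// names is the context that a forward-only reader does not keep for you.
std::vector<std::string> text_paths(const std::vector<Event>& events) {
    std::vector<std::string> open;   // names of the elements currently open
    std::vector<std::string> paths;  // results, in document order
    for (const Event& e : events) {
        switch (e.kind) {
        case Kind::Element:
            open.push_back(e.data);
            break;
        case Kind::EndElement:
            if (!open.empty()) open.pop_back();
            break;
        case Kind::Text: {
            std::string path;
            for (const std::string& name : open) path += "/" + name;
            paths.push_back(path + " = " + e.data);
            break;
        }
        }
    }
    return paths;
}
```

Feeding this function the events for a title element nested inside book inside library yields an entry that starts with /library/book/title. With a real forward-only reader you would update the same kind of stack each time a start or end tag is reported.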
XmlTextReader uses a pull model, which means that you call a function to get the next node when you’re ready. This model contrasts with the widely used SAX (Simple API for XML) API, which uses a push model: the parser fires events at callback functions that you provide. The following tables list the members exposed by the XmlTextReader type.
Public Constructors | |
Name | Description |
XmlTextReader | Overloaded. Initializes a new instance of the XmlTextReader. |
Table 2. |
Protected Constructors | |
Name | Description |
XmlTextReader | Overloaded. Initializes a new instance of the XmlTextReader. |
Table 3. |
Public Properties | |
Name | Description |
AttributeCount | Overridden. Gets the number of attributes on the current node. |
BaseURI | Overridden. Gets the base URI of the current node. |
CanReadBinaryContent | Overridden. Gets a value indicating whether the XmlTextReader implements the binary content read methods. |
CanReadValueChunk | Overridden. Gets a value indicating whether the XmlTextReader implements the ReadValueChunk method. |
CanResolveEntity | Overridden. Gets a value indicating whether this reader can parse and resolve entities. |
Depth | Overridden. Gets the depth of the current node in the XML document. |
Encoding | Gets the encoding of the document. |
EntityHandling | Gets or sets a value that specifies how the reader handles entities. |
EOF | Overridden. Gets a value indicating whether the reader is positioned at the end of the stream. |
HasAttributes | Gets a value indicating whether the current node has any attributes. (Inherited from XmlReader.) |
HasValue | Overridden. Gets a value indicating whether the current node can have a Value other than String.Empty. |
IsDefault | Overridden. Gets a value indicating whether the current node is an attribute that was generated from the default value defined in the DTD or schema. |
IsEmptyElement | Overridden. Gets a value indicating whether the current node is an empty element (for example, <MyElement/>). |
Item | Overloaded. When overridden in a derived class, gets the value of the attribute. (Inherited from XmlReader.) |
LineNumber | Gets the current line number. |
LinePosition | Gets the current line position. |
LocalName | Overridden. Gets the local name of the current node. |
Name | Overridden. Gets the qualified name of the current node. |
Namespaces | Gets or sets a value indicating whether namespace support is enabled. |
NamespaceURI | Overridden. Gets the namespace URI (as defined in the W3C Namespace specification) of the node on which the reader is positioned. |
NameTable | Overridden. Gets the XmlNameTable associated with this implementation. |
NodeType | Overridden. Gets the type of the current node. |
Normalization | Gets or sets a value indicating whether to normalize white space and attribute values. |
Prefix | Overridden. Gets the namespace prefix associated with the current node. |
ProhibitDtd | Gets or sets a value indicating whether to allow DTD processing. |
QuoteChar | Overridden. Gets the quotation mark character used to enclose the value of an attribute node. |
ReadState | Overridden. Gets the state of the reader. |
SchemaInfo | Gets the schema information that has been assigned to the current node as a result of schema validation. (Inherited from XmlReader.) |
Settings | Overridden. Gets the XmlReaderSettings object used to create this XmlTextReader instance. |
Value | Overridden. Gets the text value of the current node. |
ValueType | Gets the common language runtime (CLR) type for the current node. (Inherited from XmlReader.) |
WhitespaceHandling | Gets or sets a value that specifies how white space is handled. |
XmlLang | Overridden. Gets the current xml:lang scope. |
XmlResolver | Sets the XmlResolver used for resolving DTD references. |
XmlSpace | Overridden. Gets the current xml:space scope. |
Table 4. |
Public Methods | |
Name | Description |
Close | Overridden. Changes the ReadState to Closed. |
Create | Overloaded. Creates a new XmlReader instance. (Inherited from XmlReader.) |
Equals | Overloaded. Determines whether two Object instances are equal. (Inherited from Object.) |
GetAttribute | Overloaded. Overridden. Gets the value of an attribute. |
GetHashCode | Serves as a hash function for a particular type. GetHashCode is suitable for use in hashing algorithms and data structures like a hash table. (Inherited from Object.) |
GetNamespacesInScope | Gets a collection that contains all namespaces currently in-scope. |
GetRemainder | Gets the remainder of the buffered XML. |
GetType | Gets the Type of the current instance. (Inherited from Object.) |
HasLineInfo | Gets a value indicating whether the class can return line information. |
IsName | Gets a value indicating whether the string argument is a valid XML name. (Inherited from XmlReader.) |
IsNameToken | Gets a value indicating whether or not the string argument is a valid XML name token. (Inherited from XmlReader.) |
IsStartElement | Overloaded. Tests if the current content node is a start tag. (Inherited from XmlReader.) |
LookupNamespace | Overridden. Resolves a namespace prefix in the current element's scope. |
MoveToAttribute | Overloaded. Overridden. Moves to the specified attribute. |
MoveToContent | Checks whether the current node is a content (non-white space text, CDATA, Element, EndElement, EntityReference, or EndEntity) node. If the node is not a content node, the reader skips ahead to the next content node or end of file. It skips over nodes of the following type: ProcessingInstruction, DocumentType, Comment, Whitespace, or SignificantWhitespace. (Inherited from XmlReader.) |
MoveToElement | Overridden. Moves to the element that contains the current attribute node. |
MoveToFirstAttribute | Overridden. Moves to the first attribute. |
MoveToNextAttribute | Overridden. Moves to the next attribute. |
Read | Overridden. Reads the next node from the stream. |
ReadAttributeValue | Overridden. Parses the attribute value into one or more Text, EntityReference, or EndEntity nodes. |
ReadBase64 | Decodes Base64 and returns the decoded binary bytes. |
ReadBinHex | Decodes BinHex and returns the decoded binary bytes. |
ReadChars | Reads the text contents of an element into a character buffer. This method is designed to read large streams of embedded text by calling it successively. |
ReadContentAs | Reads the content as an object of the type specified. (Inherited from XmlReader.) |
ReadContentAsBase64 | Overridden. Reads the content and returns the Base64 decoded binary bytes. |
ReadContentAsBinHex | Overridden. Reads the content and returns the BinHex decoded binary bytes. |
ReadContentAsBoolean | Reads the text content at the current position as a Boolean. (Inherited from XmlReader.) |
ReadContentAsDateTime | Reads the text content at the current position as a DateTime object. (Inherited from XmlReader.) |
ReadContentAsDecimal | Reads the text content at the current position as a Decimal object. (Inherited from XmlReader.) |
ReadContentAsDouble | Reads the text content at the current position as a double-precision floating-point number. (Inherited from XmlReader.) |
ReadContentAsFloat | Reads the text content at the current position as a single-precision floating point number. (Inherited from XmlReader.) |
ReadContentAsInt | Reads the text content at the current position as a 32-bit signed integer. (Inherited from XmlReader.) |
ReadContentAsLong | Reads the text content at the current position as a 64-bit signed integer. (Inherited from XmlReader.) |
ReadContentAsObject | Reads the text content at the current position as an Object. (Inherited from XmlReader.) |
ReadContentAsString | Reads the text content at the current position as a String object. (Inherited from XmlReader.) |
ReadElementContentAs | Overloaded. Reads the current element and returns the contents as an object of the type specified. (Inherited from XmlReader.) |
ReadElementContentAsBase64 | Overridden. Reads the element and decodes the Base64 content. |
ReadElementContentAsBinHex | Overridden. Reads the element and decodes the BinHex content. |
ReadElementContentAsBoolean | Overloaded. Reads the current element value as a Boolean object. (Inherited from XmlReader.) |
ReadElementContentAsDateTime | Overloaded. Reads the current element and returns the contents as a DateTime object. (Inherited from XmlReader.) |
ReadElementContentAsDecimal | Overloaded. Reads the current element value as a Decimal object. (Inherited from XmlReader.) |
ReadElementContentAsDouble | Overloaded. Reads the current element and returns the contents as a double-precision floating-point number. (Inherited from XmlReader.) |
ReadElementContentAsFloat | Overloaded. Reads the current element value as a single-precision floating-point number. (Inherited from XmlReader.) |
ReadElementContentAsInt | Overloaded. Reads the current element and returns the contents as a 32-bit signed integer. (Inherited from XmlReader.) |
ReadElementContentAsLong | Overloaded. Reads the current element and returns the contents as a 64-bit signed integer. (Inherited from XmlReader.) |
ReadElementContentAsObject | Overloaded. Reads the current element and returns the contents as an Object. (Inherited from XmlReader.) |
ReadElementContentAsString | Overloaded. Reads the current element and returns the contents as a String object. (Inherited from XmlReader.) |
ReadElementString | Overloaded. This is a helper method for reading simple text-only elements. (Inherited from XmlReader.) |
ReadEndElement | Checks that the current content node is an end tag and advances the reader to the next node. (Inherited from XmlReader.) |
ReadInnerXml | When overridden in a derived class, reads all the content, including markup, as a string. (Inherited from XmlReader.) |
ReadOuterXml | When overridden in a derived class, reads the content, including markup, representing this node and all its children. (Inherited from XmlReader.) |
ReadStartElement | Overloaded. Checks that the current node is an element and advances the reader to the next node. (Inherited from XmlReader.) |
ReadString | Overridden. Reads the contents of an element or a text node as a string. |
ReadSubtree | Returns a new XmlReader instance that can be used to read the current node, and all its descendants. (Inherited from XmlReader.) |
ReadToDescendant | Overloaded. Advances the XmlReader to the next matching descendant element. (Inherited from XmlReader.) |
ReadToFollowing | Overloaded. Reads until the named element is found. (Inherited from XmlReader.) |
ReadToNextSibling | Overloaded. Advances the XmlReader to the next matching sibling element. (Inherited from XmlReader.) |
ReadValueChunk | Reads large streams of text embedded in an XML document. (Inherited from XmlReader.) |
ReferenceEquals | Determines whether the specified Object instances are the same instance. (Inherited from Object.) |
ResetState | Resets the state of the reader to ReadState.Initial. |
ResolveEntity | Overridden. Resolves the entity reference for EntityReference nodes. |
Skip | Overridden. Skips the children of the current node. |
ToString | Returns a String that represents the current Object. (Inherited from Object.) |
Table 5. |
Protected Methods | |
Name | Description |
Dispose | Releases the unmanaged resources used by the XmlReader and optionally releases the managed resources. (Inherited from XmlReader.) |
Finalize | Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.) |
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.) |
Table 6. |
Explicit Interface Implementations | |
Name | Description |
System.Xml.IXmlLineInfo.HasLineInfo | - |
System.Xml.IXmlNamespaceResolver.GetNamespacesInScope | For a description of this member, see IXmlNamespaceResolver.GetNamespacesInScope. |
System.Xml.IXmlNamespaceResolver.LookupNamespace | For a description of this member, see IXmlNamespaceResolver.LookupNamespace. |
System.Xml.IXmlNamespaceResolver.LookupPrefix | For a description of this member, see IXmlNamespaceResolver.LookupPrefix. |
Table 7. |
The most important function in the Public Methods table (Table 5) is Read, which tells the XmlTextReader to fetch the next node from the document. Once you’ve got the node, you can use the NodeType property to find out what kind of node you have. The value is one of the members of the XmlNodeType enumeration, which are listed in the following table.
Node Type | Description |
Attribute | An attribute, for example, type=hardback. |
CDATA | A CDATA section. |
Comment | An XML comment. |
Document | The document object, representing the root of the XML tree. |
DocumentFragment | A fragment of XML that isn’t a document in itself. |
DocumentType | A document type declaration. |
Element, EndElement | The start and end of an XML element. |
Entity, EndEntity | The start and end of an entity declaration. |
EntityReference | An entity reference (for example, &lt;). |
None | Used if the node type is queried when no node has been read. |
Notation | A notation entry in a DTD. |
ProcessingInstruction | An XML processing instruction. |
SignificantWhitespace | White space in a mixed content model document, or when xml:space=preserve has been set. |
Text | The text content of an element. |
Whitespace | White space between markup. |
XmlDeclaration | The XML declaration at the top of a document. |
Table 8. |
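The pattern described before Table 8 (call Read to fetch the next node, then check its type) can be modeled with a toy reader. The following is only a sketch of the pull style in plain standard C++, not C++/CLI: ToyReader, its Type and Value members, the cut-down NodeType enumeration, and the describe function are all hypothetical names made up for this illustration, and the reader understands nothing beyond simple start tags, end tags, and text.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical subset of node types. The names echo XmlNodeType, but this
// is not the .NET enumeration.
enum class NodeType { None, Element, EndElement, Text };

// Toy pull reader for a tiny XML subset: <name>, </name>, and text only.
// No attributes, comments, entities, or well-formedness checks.
class ToyReader {
public:
    explicit ToyReader(std::string xml) : xml_(std::move(xml)) {}

    // Advances to the next node. Returns false at the end of the input
    // (or if a tag is never closed with '>').
    bool Read() {
        type_ = NodeType::None;
        value_.clear();
        if (pos_ >= xml_.size()) return false;
        if (xml_[pos_] == '<') {
            const std::size_t close = xml_.find('>', pos_);
            if (close == std::string::npos) { pos_ = xml_.size(); return false; }
            const bool isEnd = (pos_ + 1 < close && xml_[pos_ + 1] == '/');
            const std::size_t start = pos_ + (isEnd ? 2 : 1);
            value_ = xml_.substr(start, close - start);  // the tag name
            type_ = isEnd ? NodeType::EndElement : NodeType::Element;
            pos_ = close + 1;
        } else {
            const std::size_t next = xml_.find('<', pos_);
            const std::size_t end = (next == std::string::npos) ? xml_.size() : next;
            value_ = xml_.substr(pos_, end - pos_);      // the text content
            type_ = NodeType::Text;
            pos_ = end;
        }
        return true;
    }

    NodeType Type() const { return type_; }
    const std::string& Value() const { return value_; }

private:
    std::string xml_;
    std::size_t pos_ = 0;
    NodeType type_ = NodeType::None;
    std::string value_;
};

// The caller drives the loop: ask for a node, inspect its type, act on it.
std::string describe(const std::string& xml) {
    ToyReader reader(xml);
    std::string out;
    while (reader.Read()) {
        switch (reader.Type()) {
        case NodeType::Element:    out += "start:" + reader.Value() + ";"; break;
        case NodeType::EndElement: out += "end:" + reader.Value() + ";";   break;
        case NodeType::Text:       out += "text:" + reader.Value() + ";";  break;
        case NodeType::None:       break;
        }
    }
    return out;
}
```

Notice that the loop in describe decides when to fetch the next node, which is what distinguishes a pull model from a push model. In a push parser, the while loop would live inside the parser, and your code would be the three case bodies, supplied as callbacks.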