

 

 

Reading and Writing XML 1

 

 

What do we have on this page?

  1. Overview

  2. XML and .NET

  3. The .NET XML Namespaces

  4. The XML Processing Classes

  5. Parsing XML with XmlTextReader

---------Next------------------------------

  1. Verifying Well-Formed XML

  2. Handling Attributes

  3. Parsing XML with Validation

  4. Writing XML Using XmlTextWriter

  5. Using XmlDocument

  6. What Is the W3C DOM?

  7. The XmlNode Class

  8. A Very Quick Reference

Overview

 

In this module, you will learn:

  1. Why XML is so important to Microsoft .NET.

  2. The classes that make up the .NET XML namespaces.

  3. How to parse XML files using XmlTextReader.

  4. How to validate XML using XmlValidatingReader.

  5. How to write XML using XmlTextWriter.

  6. How to use the XmlDocument class to manipulate XML in memory.

This is the first of two modules that introduce you to the XML capabilities of the Microsoft .NET Framework. XML plays a major role in .NET as an enabling technology, and the .NET Framework provides full support for just about everything you’ll need to do with XML. This module assumes that you already know something about XML. You should be comfortable with elements, attributes, validation, namespaces, and all the other paraphernalia that surrounds XML. There isn’t space here to give you a grounding in XML and the XML technologies, so if you haven’t met XML before, you might want to consult a book on XML or a brief Web primer first.

 

XML and .NET

 

One of the major features of the .NET Framework is that it enables you to easily produce distributed applications that are language-independent and that will be platform-independent when .NET is ported to other platforms. XML plays a major part in this plan by acting as a simple, portable glue layer that’s used to pass data around in distributed applications. Microsoft has XML-enabled many parts of the .NET Framework, and a few of the main ones are listed below to give you a flavor of where and how XML is used:

  1. It’s possible for the results of database queries to be returned as XML so that they are far more portable than ActiveX data object (ADO) recordset objects. It’s also possible to interact with databases more fully using XML.

  2. Calls can be made to Web services using SOAP, an XML-based protocol for making remote procedure calls (instead of using RPC mechanisms that are vulnerable to security problems).

  3. Finding out what a Web service provider can do for you involves using UDDI, the Universal Description, Discovery, and Integration service. When you query a UDDI service, you post a query in XML and a description of what is available comes back as more XML.

 

The .NET XML Namespaces

 

Microsoft contributes to the efforts of the W3C working groups who define and set standards for XML and other Web protocols. Among the XML standards Microsoft currently provides developer support for are the following:

 

 

The .NET Framework contains a number of namespaces supporting XML functionality, as summarized in the following table.

 

Namespace                   Description
--------------------------  -----------
System::Xml                 The overall namespace for XML support.
System::Xml::Schema         Support for World Wide Web Consortium (W3C) XML Schema (XSD) and Microsoft XML-Data Reduced (XDR) schemas.
System::Xml::Serialization  Supports serializing objects to and from XML.
System::Xml::XPath          Supports XPath parsing and evaluation.
System::Xml::Xsl            Supports Extensible Stylesheet Language Transformations (XSLT).

 

Table 1.

 

This module will be mainly concerned with the System::Xml namespace and will touch on some of the capabilities of System::Xml::Schema. Module 26 will cover using the XPath and Xsl namespaces.

 

 

 

 

The XML Processing Classes

 

There are four main classes in the System::Xml namespace for processing XML. We’ll briefly list their capabilities here, before examining them in more detail in the rest of the module.

  1. The XmlTextReader class is used for fast, forward-only parsing without validation. It checks that documents are well-formed and, if a Document Type Definition (DTD) is present, that the DTD itself is well-formed, but it doesn’t use the DTD for validation. Forward-only parsing means that you parse the document from start to finish, and you can’t back up to reparse an earlier part of the document.

  2. XmlValidatingReader implements a forward-only parser that provides more functionality than XmlTextReader, in particular, the ability to validate input using DTDs, W3C XML Schema Definition (XSD) schemas, or XDR schemas. Both XmlTextReader and XmlValidatingReader are derived from the abstract class XmlReader, which provides much of the basic functionality.

  3. XmlTextWriter provides a fast, forward-only way to write XML to streams or files. The XML produced conforms to the W3C XML 1.0 specification, complete with namespace support.

  4. XmlDocument implements the W3C Document Object Model (DOM), providing an in-memory representation of an XML document.

Parsing XML with XmlTextReader

 

Let’s start by looking at how you can parse XML with the XmlTextReader class. XmlTextReader provides you with a way to parse XML data that minimizes resource usage by reading forward through the document, recognizing elements as it reads. Very little data is cached in memory, but the forward-only style has two main consequences. The first is that it isn’t possible to go back to an earlier point in the file without starting to read from the top again. The second consequence is slightly more subtle: elements are read and presented to you one by one, with no context. So, if you need to keep track of where an element occurs within the document structure, you’ll need to do it yourself. If either of these consequences sounds like a limitation to you, you might need to use the XmlDocument class, which is discussed in the “Using XmlDocument” section later in this module.
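The second consequence, keeping track of document structure yourself, can be illustrated with a minimal standard C++ sketch. This is not .NET code: Kind, Node, and textPaths are hypothetical names invented for this illustration, and the node sequence simply stands in for whatever a forward-only parser would report. The idea is to keep a stack of the currently open elements while walking the nodes once, front to back.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the nodes a forward-only reader hands you.
enum class Kind { StartElement, EndElement, Text };

struct Node {
    Kind kind;
    std::string text;  // element name, or character data for Text nodes
};

// Walk the nodes once, front to back, and return the element path of
// every Text node, for example "/library/book/title".
std::vector<std::string> textPaths(const std::vector<Node>& nodes) {
    std::vector<std::string> open;    // names of the currently open elements
    std::vector<std::string> result;
    for (const Node& n : nodes) {
        switch (n.kind) {
        case Kind::StartElement:
            open.push_back(n.text);   // entering an element
            break;
        case Kind::EndElement:
            if (!open.empty()) open.pop_back();  // leaving an element
            break;
        case Kind::Text: {
            // The reader gives no context, so rebuild it from the stack.
            std::string path;
            for (const std::string& name : open) path += "/" + name;
            result.push_back(path);
            break;
        }
        }
    }
    return result;
}
```

Feeding in the nodes for a title element nested inside book inside library yields the path "/library/book/title" for the title text. The stack is the only state that has to be kept, which is why this style of parsing stays cheap in memory.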

XmlTextReader uses a pull model, which means that you call a function to get the next node when you’re ready. This model contrasts with the widely used SAX (Simple API for XML), which uses a push model: the parser fires events at callback functions that you provide. The following tables list the members exposed by the XmlTextReader type.
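Before the reference tables, the difference between the two models can be made concrete with a small standard C++ sketch. It does not use the .NET classes or a real SAX parser: PullSource, pushAll, and readsUntil are hypothetical names, and the "document" is just a list of strings. The point is who owns the loop: in the pull model the caller does, so it can stop early; in the push model the source does, and the caller only supplies a callback.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Pull model: the caller asks for the next item when it is ready.
class PullSource {
public:
    explicit PullSource(std::vector<std::string> items)
        : items_(std::move(items)) {}

    // Advance to the next item; returns false when the input is exhausted.
    bool read() {
        if (next_ >= items_.size()) return false;
        current_ = items_[next_++];
        return true;
    }

    const std::string& current() const { return current_; }

private:
    std::vector<std::string> items_;
    std::size_t next_ = 0;
    std::string current_;
};

// Push model: the source drives the loop and calls back into user code.
void pushAll(const std::vector<std::string>& items,
             const std::function<void(const std::string&)>& onItem) {
    for (const std::string& item : items) onItem(item);
}

// Pull-style consumer: stops as soon as it has found what it wants.
// Returns the number of items read, including the match.
std::size_t readsUntil(PullSource& src, const std::string& wanted) {
    std::size_t reads = 0;
    while (src.read()) {
        ++reads;
        if (src.current() == wanted) break;
    }
    return reads;
}
```

With the pull source, a consumer looking for "book" in the sequence library, book, title, price performs two reads and can then carry on from "title" whenever it likes. With pushAll, every item is delivered to the callback whether or not the consumer is still interested.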

 

Public Constructors

Name           Description
-------------  -----------
XmlTextReader  Overloaded. Initializes a new instance of the XmlTextReader.

 

Table 2.

 

Protected Constructors

Name           Description
-------------  -----------
XmlTextReader  Overloaded. Initializes a new instance of the XmlTextReader.

 

Table 3.

 

 

 

 

Public Properties

Name                  Description
--------------------  -----------
AttributeCount        Overridden. Gets the number of attributes on the current node.
BaseURI               Overridden. Gets the base URI of the current node.
CanReadBinaryContent  Overridden. Gets a value indicating whether the XmlTextReader implements the binary content read methods.
CanReadValueChunk     Overridden. Gets a value indicating whether the XmlTextReader implements the ReadValueChunk method.
CanResolveEntity      Overridden. Gets a value indicating whether this reader can parse and resolve entities.
Depth                 Overridden. Gets the depth of the current node in the XML document.
Encoding              Gets the encoding of the document.
EntityHandling        Gets or sets a value that specifies how the reader handles entities.
EOF                   Overridden. Gets a value indicating whether the reader is positioned at the end of the stream.
HasAttributes         Gets a value indicating whether the current node has any attributes. (Inherited from XmlReader.)
HasValue              Overridden. Gets a value indicating whether the current node can have a Value other than String.Empty.
IsDefault             Overridden. Gets a value indicating whether the current node is an attribute that was generated from the default value defined in the DTD or schema.
IsEmptyElement        Overridden. Gets a value indicating whether the current node is an empty element (for example, <MyElement/>).
Item                  Overloaded. When overridden in a derived class, gets the value of the attribute. (Inherited from XmlReader.)
LineNumber            Gets the current line number.
LinePosition          Gets the current line position.
LocalName             Overridden. Gets the local name of the current node.
Name                  Overridden. Gets the qualified name of the current node.
Namespaces            Gets or sets a value indicating whether to do namespace support.
NamespaceURI          Overridden. Gets the namespace URI (as defined in the W3C Namespace specification) of the node on which the reader is positioned.
NameTable             Overridden. Gets the XmlNameTable associated with this implementation.
NodeType              Overridden. Gets the type of the current node.
Normalization         Gets or sets a value indicating whether to normalize white space and attribute values.
Prefix                Overridden. Gets the namespace prefix associated with the current node.
ProhibitDtd           Gets or sets a value indicating whether to allow DTD processing.
QuoteChar             Overridden. Gets the quotation mark character used to enclose the value of an attribute node.
ReadState             Overridden. Gets the state of the reader.
SchemaInfo            Gets the schema information that has been assigned to the current node as a result of schema validation. (Inherited from XmlReader.)
Settings              Overridden. Gets the XmlReaderSettings object used to create this XmlTextReader instance.
Value                 Overridden. Gets the text value of the current node.
ValueType             Gets the Common Language Runtime (CLR) type for the current node. (Inherited from XmlReader.)
WhitespaceHandling    Gets or sets a value that specifies how white space is handled.
XmlLang               Overridden. Gets the current xml:lang scope.
XmlResolver           Sets the XmlResolver used for resolving DTD references.
XmlSpace              Overridden. Gets the current xml:space scope.

 

Table 4.

 

Public Methods

Name                          Description
----------------------------  -----------
Close                         Overridden. Changes the ReadState to Closed.
Create                        Overloaded. Creates a new XmlReader instance. (Inherited from XmlReader.)
Equals                        Overloaded. Determines whether two Object instances are equal. (Inherited from Object.)
GetAttribute                  Overloaded. Overridden. Gets the value of an attribute.
GetHashCode                   Serves as a hash function for a particular type. GetHashCode is suitable for use in hashing algorithms and data structures like a hash table. (Inherited from Object.)
GetNamespacesInScope          Gets a collection that contains all namespaces currently in-scope.
GetRemainder                  Gets the remainder of the buffered XML.
GetType                       Gets the Type of the current instance. (Inherited from Object.)
HasLineInfo                   Gets a value indicating whether the class can return line information.
IsName                        Gets a value indicating whether the string argument is a valid XML name. (Inherited from XmlReader.)
IsNameToken                   Gets a value indicating whether or not the string argument is a valid XML name token. (Inherited from XmlReader.)
IsStartElement                Overloaded. Tests if the current content node is a start tag. (Inherited from XmlReader.)
LookupNamespace               Overridden. Resolves a namespace prefix in the current element's scope.
MoveToAttribute               Overloaded. Overridden. Moves to the specified attribute.
MoveToContent                 Checks whether the current node is a content (non-white space text, CDATA, Element, EndElement, EntityReference, or EndEntity) node. If the node is not a content node, the reader skips ahead to the next content node or end of file. It skips over nodes of the following type: ProcessingInstruction, DocumentType, Comment, Whitespace, or SignificantWhitespace. (Inherited from XmlReader.)
MoveToElement                 Overridden. Moves to the element that contains the current attribute node.
MoveToFirstAttribute          Overridden. Moves to the first attribute.
MoveToNextAttribute           Overridden. Moves to the next attribute.
Read                          Overridden. Reads the next node from the stream.
ReadAttributeValue            Overridden. Parses the attribute value into one or more Text, EntityReference, or EndEntity nodes.
ReadBase64                    Decodes Base64 and returns the decoded binary bytes.
ReadBinHex                    Decodes BinHex and returns the decoded binary bytes.
ReadChars                     Reads the text contents of an element into a character buffer. This method is designed to read large streams of embedded text by calling it successively.
ReadContentAs                 Reads the content as an object of the type specified. (Inherited from XmlReader.)
ReadContentAsBase64           Overridden. Reads the content and returns the Base64 decoded binary bytes.
ReadContentAsBinHex           Overridden. Reads the content and returns the BinHex decoded binary bytes.
ReadContentAsBoolean          Reads the text content at the current position as a Boolean. (Inherited from XmlReader.)
ReadContentAsDateTime         Reads the text content at the current position as a DateTime object. (Inherited from XmlReader.)
ReadContentAsDecimal          Reads the text content at the current position as a Decimal object. (Inherited from XmlReader.)
ReadContentAsDouble           Reads the text content at the current position as a double-precision floating-point number. (Inherited from XmlReader.)
ReadContentAsFloat            Reads the text content at the current position as a single-precision floating point number. (Inherited from XmlReader.)
ReadContentAsInt              Reads the text content at the current position as a 32-bit signed integer. (Inherited from XmlReader.)
ReadContentAsLong             Reads the text content at the current position as a 64-bit signed integer. (Inherited from XmlReader.)
ReadContentAsObject           Reads the text content at the current position as an Object. (Inherited from XmlReader.)
ReadContentAsString           Reads the text content at the current position as a String object. (Inherited from XmlReader.)
ReadElementContentAs          Overloaded. Reads the current element and returns the contents as an object of the type specified. (Inherited from XmlReader.)
ReadElementContentAsBase64    Overridden. Reads the element and decodes the Base64 content.
ReadElementContentAsBinHex    Overridden. Reads the element and decodes the BinHex content.
ReadElementContentAsBoolean   Overloaded. Reads the current element value as a Boolean object. (Inherited from XmlReader.)
ReadElementContentAsDateTime  Overloaded. Reads the current element and returns the contents as a DateTime object. (Inherited from XmlReader.)
ReadElementContentAsDecimal   Overloaded. Reads the current element value as a Decimal object. (Inherited from XmlReader.)
ReadElementContentAsDouble    Overloaded. Reads the current element and returns the contents as a double-precision floating-point number. (Inherited from XmlReader.)
ReadElementContentAsFloat     Overloaded. Reads the current element value as a single-precision floating-point number. (Inherited from XmlReader.)
ReadElementContentAsInt       Overloaded. Reads the current element and returns the contents as a 32-bit signed integer. (Inherited from XmlReader.)
ReadElementContentAsLong      Overloaded. Reads the current element and returns the contents as a 64-bit signed integer. (Inherited from XmlReader.)
ReadElementContentAsObject    Overloaded. Reads the current element and returns the contents as an Object. (Inherited from XmlReader.)
ReadElementContentAsString    Overloaded. Reads the current element and returns the contents as a String object. (Inherited from XmlReader.)
ReadElementString             Overloaded. This is a helper method for reading simple text-only elements. (Inherited from XmlReader.)
ReadEndElement                Checks that the current content node is an end tag and advances the reader to the next node. (Inherited from XmlReader.)
ReadInnerXml                  When overridden in a derived class, reads all the content, including markup, as a string. (Inherited from XmlReader.)
ReadOuterXml                  When overridden in a derived class, reads the content, including markup, representing this node and all its children. (Inherited from XmlReader.)
ReadStartElement              Overloaded. Checks that the current node is an element and advances the reader to the next node. (Inherited from XmlReader.)
ReadString                    Overridden. Reads the contents of an element or a text node as a string.
ReadSubtree                   Returns a new XmlReader instance that can be used to read the current node, and all its descendants. (Inherited from XmlReader.)
ReadToDescendant              Overloaded. Advances the XmlReader to the next matching descendant element. (Inherited from XmlReader.)
ReadToFollowing               Overloaded. Reads until the named element is found. (Inherited from XmlReader.)
ReadToNextSibling             Overloaded. Advances the XmlReader to the next matching sibling element. (Inherited from XmlReader.)
ReadValueChunk                Reads large streams of text embedded in an XML document. (Inherited from XmlReader.)
ReferenceEquals               Determines whether the specified Object instances are the same instance. (Inherited from Object.)
ResetState                    Resets the state of the reader to ReadState.Initial.
ResolveEntity                 Overridden. Resolves the entity reference for EntityReference nodes.
Skip                          Overridden. Skips the children of the current node.
ToString                      Returns a String that represents the current Object. (Inherited from Object.)

 

Table 5.

Protected Methods

Name             Description
---------------  -----------
Dispose          Releases the unmanaged resources used by the XmlReader and optionally releases the managed resources. (Inherited from XmlReader.)
Finalize         Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.)
MemberwiseClone  Creates a shallow copy of the current Object. (Inherited from Object.)

 

Table 6.

 

Explicit Interface Implementations

 

Name                                                   Description
-----------------------------------------------------  -----------
System.Xml.IXmlLineInfo.HasLineInfo                    -
System.Xml.IXmlNamespaceResolver.GetNamespacesInScope  For a description of this member, see IXmlNamespaceResolver.GetNamespacesInScope.
System.Xml.IXmlNamespaceResolver.LookupNamespace       For a description of this member, see IXmlNamespaceResolver.LookupNamespace.
System.Xml.IXmlNamespaceResolver.LookupPrefix          For a description of this member, see IXmlNamespaceResolver.LookupPrefix.

 

Table 7.

 

The most important function in the Public Methods table (Table 5) is Read, which tells the XmlTextReader to fetch the next node from the document. Once you’ve got the node, you can use the NodeType property to find out what you have. The property returns one of the members of the XmlNodeType enumeration, which are listed in the following table.

 

Node Type              Description
---------------------  -----------
Attribute              An attribute, for example, type="hardback".
CDATA                  A CDATA section.
Comment                An XML comment.
Document               The document object, representing the root of the XML tree.
DocumentFragment       A fragment of XML that isn’t a document in itself.
DocumentType           A document type declaration.
Element, EndElement    The start and end of an XML element.
Entity, EndEntity      The start and end of an entity declaration.
EntityReference        An entity reference (for example, &lt;).
None                   Used if the node type is queried when no node has been read.
Notation               A notation entry in a DTD.
ProcessingInstruction  An XML processing instruction.
SignificantWhitespace  White space in a mixed content model document, or when xml:space="preserve" has been set.
Text                   The text content of an element.
Whitespace             White space between markup.
XmlDeclaration         The XML declaration at the top of a document.

 

Table 8.
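The read-then-inspect pattern described above (call Read, then look at the node type) can be sketched in standard C++ as follows. This is a toy stand-in written for illustration, not the .NET XmlTextReader: TinyReader, its Read, Type, and Value members, and the dump function are hypothetical names that merely echo the real members, and the NodeType enumeration here covers only a handful of the node types in Table 8. The sketch assumes simple input with no attributes, declarations, CDATA sections, or entities, and it does not check well-formedness.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical subset of node types; the names echo Table 8 for readability.
enum class NodeType { None, Element, EndElement, Text, Comment };

// Toy forward-only pull reader (an assumption-laden sketch, not the .NET class).
class TinyReader {
public:
    explicit TinyReader(std::string xml) : xml_(std::move(xml)) {}

    // Fetch the next node; returns false at the end of the input.
    bool Read() {
        if (pos_ >= xml_.size()) {
            type_ = NodeType::None;
            value_.clear();
            return false;
        }
        if (xml_[pos_] != '<') {
            // Character data runs up to the next tag (or the end of input).
            std::size_t end = std::min(xml_.find('<', pos_), xml_.size());
            type_ = NodeType::Text;
            value_ = xml_.substr(pos_, end - pos_);
            pos_ = end;
        } else if (xml_.compare(pos_, 4, "<!--") == 0) {
            // Comment: the value is the text between <!-- and -->.
            std::size_t end = std::min(xml_.find("-->", pos_ + 4), xml_.size());
            type_ = NodeType::Comment;
            value_ = xml_.substr(pos_ + 4, end - (pos_ + 4));
            pos_ = std::min(end + 3, xml_.size());
        } else {
            // Start tag <name> or end tag </name>; the value is the name.
            std::size_t end = std::min(xml_.find('>', pos_), xml_.size());
            bool closing = pos_ + 1 < xml_.size() && xml_[pos_ + 1] == '/';
            std::size_t nameStart = pos_ + (closing ? 2 : 1);
            type_ = closing ? NodeType::EndElement : NodeType::Element;
            value_ = xml_.substr(nameStart, end - nameStart);
            pos_ = std::min(end + 1, xml_.size());
        }
        return true;
    }

    NodeType Type() const { return type_; }
    const std::string& Value() const { return value_; }

private:
    std::string xml_;
    std::size_t pos_ = 0;
    NodeType type_ = NodeType::None;  // nothing has been read yet
    std::string value_;
};

// Pull every node and dispatch on its type, one line of output per node.
std::string dump(const std::string& xml) {
    TinyReader reader(xml);
    std::string out;
    while (reader.Read()) {
        switch (reader.Type()) {
        case NodeType::Element:    out += "Element(";    break;
        case NodeType::EndElement: out += "EndElement("; break;
        case NodeType::Text:       out += "Text(";       break;
        case NodeType::Comment:    out += "Comment(";    break;
        default:                   out += "None(";       break;
        }
        out += reader.Value() + ")\n";
    }
    return out;
}
```

Calling dump on the string `<book><title>Dune</title></book>` produces five lines: Element(book), Element(title), Text(Dune), EndElement(title), and EndElement(book). Note that, as with the None member in Table 8, Type reports NodeType::None both before the first call to Read and after the input is exhausted.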

 

 

 
