Select the product you need help with
- Internet Explorer
- Windows Phone
- More products
Information about Microsoft Excel 2002 and XML
Article ID: 288215 - View products that this article applies to.
This article was previously published under Q288215
Excel 2002 enables you to open and save files in the Extensible Markup Language (XML) format. This article provides an overview of XML and XML stylesheets and how they can be used with Excel. The following topics are discussed:
What is XML?
What is an XML stylesheet?
How does Excel translate XML files?The XML Spreadsheet (XMLSS) format
Integrating XML in your Excel solutions
What is XML?XML is a text format for structured data. Because XML is easy to create and interpret, it is used in a wide variety of applications. Proper, or well-formed, XML follows a set of guidelines that dictates the data structure yet gives the XML developer great flexibility to create an unlimited number of customized tags that best describe the data and relationships that might exist within the data.
Like HTML, XML is a markup language in which text is organized with a combination of tags (words enclosed in angle brackets < and >) and attributes (of the form name = "value"). But whereas HTML specifies what each tag and attribute means and how it should be rendered for display, XML uses the tags only to delineate elements of data. Because you can use any tag names with XML, it is at the sole discretion of an application to interpret the data and its meaning. Although XML developers are free to create whatever tags or hierarchy best fit their data, there is a set of specifications that should be followed to ensure that the XML is considered well-formed. Well-formed XML has the following characteristics:
Figure 1 - Well-Formed XML Sample (Dictionary.xml)
The XML in Figure 1 describes entries in a dictionary. The root element is <Dictionary>, and the root contains one child element named <Entries>. <Entries> contains three children, each of which describes the data for an individual entry in the dictionary. For each entry, there are two child elements: <Word> and <Definition>. The value of each <Word> element is the word itself; <Word> elements also have a Type attribute that indicates whether the word is an adjective (1) or a noun (2). <Definition> elements have a text value and no attributes.
What is an XML stylesheet?XML tags describe the data in a text file, but XML alone does not specify how the data should be presented to the user. Formatting rules for XML data are often contained in XML stylesheets. XML stylesheets are well-formed XML documents that use Extensible Style Language (XSL) to transform XML data for presentation. You can have many stylesheets for the same XML data. While your actual XML data structure might remain the same, stylesheets give you flexibility to present the data in many different ways.
Figure 2: XML Stylesheet (Dictionary.xsl)
Dictionary.xsl, shown in Figure 2, is a stylesheet that can be used to transform the sample Dictionary.xml. The transformation results in an HTML presentation of the data that can be viewed in a Web browser. The XSL sorts the dictionary entries alphabetically in a bulleted list and formats each component of the dictionary entry. The transformed XML that is rendered in the browser resembles the following:
How does Excel translate XML files?Excel can open any XML file that is well-formed. XML files can be opened directly from the Open command on the File menu. XML files can also be opened programmatically using either the Open or OpenXML methods of the Workbooks collection. Regardless of whether you open the XML through code or through the user interface (UI), the file is marked as Read-only so that you cannot accidentally replace your original source file with one in the XML Spreadsheet (XMLSS) format.
In Excel, when you import XML that is not in the XML Spreadsheet (XMLSS) format, Excel uses a special flattening algorithm to load the data into rows and columns. Excel worksheets are two-dimensional entities comprised of rows and columns; because XML can have more than two dimensions, flattening is needed to interpret the XML so that it can be loaded into cells on a worksheet.
Consider the following two sets of XML, which essentially represent the same data in different ways:
Figure 3: Customer XML with Child Elements
Figure 4: Customer XML with Attributes
If you load the XML in Figure 3 into Excel, the data is imported into cells as shown below:
Excel treats attributes exactly like child elements. To eliminate name collision with existing elements, Excel prepends the at sign (@) to the front of the attribute names for consistency with the naming standards for XSL patterns. Therefore, the XML in Figure 4 is loaded into cells in the same way, with the exception that cells A2, B2, and C2 contain "/@CustomerID", "/@FirstName", and "/@LastName", respectively.
The Customer XML in Figures 3 and 4 follows a two-dimensional structure, which is easy to see. But consider the following Customer XML, which cannot be easily interpreted as two-dimensional because of the addition of the <Order> level:
Figure 5: Customer XML with Two Levels Beneath the Root Element
In the XML in Figure 5, there is one customer with two orders and another customer with one order. Excel must import the XML in such a way that the relationship between Orders and Customers is maintained. Excel imports the XML into three rows as shown below:
Two additional columns are present: a #id column that associates the row with the <Customer> node, and a #agg column that represents what can be considered a "FillDown". With #agg, the value of the first node in the level above it is listed once and then omitted for remaining rows.
Also note that the column for <Amt> precedes the column for <ProdID> even though it appears after <ProdID> in the actual XML. The flattener loads the elements in alphabetical order from left to right, not in the natural order in which they appear in the original XML document.
Using stylesheets can provide significant advantages for opening XML files in Excel. By using a stylesheet, you can better control how data is placed in cells and even provide some formatting for the cells. When you open an XML file in Excel and that XML file contains one or more xml-stylesheet processing instructions, Excel prompts you to select the stylesheets you want to use for transformation.
Once again consider the Customer XML in Figure 5. If you add a processing instruction to the Customer XML (<?xml-stylesheet type="text/xsl" href="Customer.xsl" ?>), you can control which columns in the worksheet the data appears in, and apply custom formatting. The following is a stylesheet sample that can be used as a transform for the Customer XML sample.
Figure 6 - Customer XML Stylesheet (Customer.xsl)
The stylesheet transforms the XML into valid HTML (which Excel can translate into its native format). When you open the Customer XML with this stylesheet applied, Excel displays the data as a four-column table with headers (Customer ID, Name, Product ID, and Amount). Cell formatting such as font attributes, cell background colors, and borders are also present in the table.
Try it out: Copy and paste the XML in Figure 5 into any text editor and save it as Customer.xml. Then copy the XML stylesheet in Figure 6 and save it as Customer.xsl in the same folder as Customer.xml. Start Excel 2002, and open the file with and without the stylesheet applied to see the differences that the stylesheet makes.
Troubleshooting errors opening your XML files
If you attempt to open an XML file that is not well-formed, you will receive an alert or a "Script Parse Error". When an alert appears, it provides a description of why the XML could not be opened. The alert description may also provide details on the node that caused the translation to fail. In some cases in which there is a parse error, you may receive a dialog box that lists errors instead of an alert message; script parse errors are logged to a file named XMLErr.log in your Temp folder. To quickly examine the contents of XMLErr.log, follow these steps:
The XML Spreadsheet (XMLSS) formatExcel can save workbooks in XML. When you save a workbook as XML, Excel saves the workbook in the XML Spreadsheet (XMLSS) format. XMLSS describes the contents of the workbook by persisting cell data and formulas, cell formats, worksheet settings, and workbook settings. However, some Excel features cannot be persisted in XML. These include:
Integrating XML in your Excel solutionsUsing XML data presents the developer with many opportunites for Excel 2002 solutions.
For example, you can create multisheet workbooks in XMLSS at runtime without automating Excel (which requires loading the Excel executable file). Because Excel is an out-of-process Automation server, calls to the Excel object model can be costly in terms of performance. You can create XMLSS completely in-process using only the Microsoft XML Parser (MSXML) and XML stylesheets. Or, you can use the Office XP Spreadsheet component (in-process) to build a spreadsheet and retrieve the XMLSS for the spreadsheet. In either case, the resulting XMLSS can be opened directly in Excel as a native format. By streaming the XMLSS to client browsers using Excel's Multipurpose Internet Mail Extensions (MIME) content type, Web applications such as those that use Active Server Pages (ASP) can create complex workbooks without the overhead of running Excel on the server. For sample code that demonstrates possible uses of XMLSS generation, click the article numbers below to view the articles in the Microsoft Knowledge Base:
(http://support.microsoft.com/kb/288130/ )How to use ASP to build spreadsheet XML for client-side display
285891Another potential use of XML in an Excel solution is to extract data from a range of cells in a format that can be easily shared with other applications. XMLSS can be extracted from cells on a worksheet by calling the Value property of the Range object with the xlRangeValueXMLSpreadsheet parameter. You can use a stylesheet to transform the XMLSS to any XML that can be used by another application. One possible use of this is to transform the XMLSS to custom XML that can be posted and processed by a Web application. For an example of this technique, see the following article in the Microsoft Knowledge Base:
(http://support.microsoft.com/kb/285891/ )How to use Visual Basic or ASP to create an XML spreadsheet for Excel 2002 and Excel 2003
(http://support.microsoft.com/kb/278976/ )How to use XSL to transform Excel XML spreadsheet for server-side use
For more information, see the following Web sites:
World Wide Web Consortium (W3C)
XSLT Developer's Guide
XML Developer Center