Microsoft Support WebCasts

How to use Microsoft Office Word 2003 as your XML editor

March 9, 2004

 

Note This document is based on the original spoken Support WebCast transcript. It has been edited for clarity.

Risa Andrews: Welcome to today's WebCast on how to use Microsoft® Office Word 2003 as your XML editor. First, I'm going to go over the objectives (slide 2). I want to explain the importance of XML support in Microsoft Office Word 2003, provide an introduction to XML in Word 2003, and then show how to use XML in Word 2003.

This WebCast is basically a general overview of XML, why it was introduced into Word 2003, and how to use it. One of the things that we will be doing in the presentation today is going over a demo on how to add a schema. This schema is available for download, and I will have that link listed later in the presentation.

There is another WebCast that I also have a link to, concerning how to go in and create custom solutions. This presentation will not get into creating custom solutions. That is geared more towards development, and we will not be covering that today.

One thing I want to note is that the XML features are available only in the Microsoft Office Professional Edition 2003 and the stand-alone edition of Microsoft Office Word 2003 (slide 3). The only thing that is available in all versions of Word 2003 is the ability to save documents as an XML file.

Some of the features that we have added into Word 2003 are the ability to insert XML data into a Word document and the ability to mark up a Word document with your own customized XML. This is basically done from a custom-defined XML schema. We also have the ability to save a Word document as an XML file. We've also added support for customizing and programming XML files, which allows you to design your own schemas, and to do so programmatically as well.

When XML was brought into Word 2003 (slide 4), we were trying to decide who would benefit from having XML introduced into Word 2003. End users will benefit from having XML in Word 2003, because they don't even need to know what's going on behind the scenes. They may have a Word document, a form, that's been created for them by their company. They can go in and enter data, just as they normally would, without ever realizing that they're entering XML data and having that data saved in an XML file.

It's the user friendliness that's still available within Word, but the XML developers also benefit from it. They benefit because they can create this form that users use, tag it with the schema, tag it with their tags, be able to save the data as an XML file. This allows them to go in across platforms and across applications, take that data, and use it for other applications.

For instance, you may have data that can be used to create different reports, news articles that are created for the company, performance reviews, anything that companies may use as far as forms and data. It's kind of a one-stop way of gathering data and then using it amongst other applications.

The next thing I want to cover (slide 5) is what exactly is XML? Everybody hears about what is XML and how is it used. XML is really derived from an earlier language called SGML. Its sole objective is really to serve as a universal standard to describe and transport data. What it does is it allows developers to create and use their own custom tags to mark up data within a document. By using the custom tags, they're able to go in and make data that is specific for their company. That data can be used across all the company, across various applications that they may have to create different reports.

It allows the programmer, the person who is designing the forms, to go in and actually create their own tags, whereas with something like HTML, you have to follow a set of predefined tags. You don't really have too much leeway there, as far as creating your own custom tags, and that type of thing. You have to pretty much follow a set of predefined tags.

One thing about it is XML does have stricter rules on syntax than HTML does. Because of that, it allows you to use the custom tags. But the reason for the stricter rules is also so data will be well formed. By using the well-formed data and complying with the set of strict rules, it allows you to share that information across different applications. If the data is not well formed, if it doesn't fall into this set of rules, then some other application may not be able to read that XML data and it just won't be of any use.
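Editor's note: The well-formedness point above can be checked with any XML parser. The following is a minimal sketch in Python, using only the standard library. The resume fragments are hypothetical and are not taken from the WebCast sample files.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments for illustration only.
good = "<Resume><Name>John Smith</Name></Resume>"
bad = "<Resume><Name>John Smith</Resume>"  # <Name> is never closed

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A parser in another application would reject the second fragment in the same way, which is why data that is not well formed cannot be shared.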

Part of what defines the data for an XML file is a file called a schema (slide 6). A schema generally has an .xsd file name extension. XML files generally have an .xml file extension. Schemas will allow you to validate data. What it does is it provides the framework for structuring data and ensuring that it makes sense to the creator or the other users that are using the data.

As long as the data and the XML file conform to the rules in the schema, any program that can support XML will be able to take that data and open it up, and interpret it.

Another part of the XML system that gets used is a transform. A transform is basically a way of taking that same data and reusing it for another application. It's also known as an XSLT document, where XSLT stands for Extensible Stylesheet Language Transformations.

You can have different transforms, and by having different transforms you can present the same data in different ways. For instance, you can open an XML file in Word, apply a transform to it, and it may produce something like a marketing report or a newsletter that you send out to the company. You can take that same XML file and open it up in, for instance, Excel. You apply a transform to the data in Excel, and it may create something like a payroll report or a monthly report that you use in your office.

So, in summary, think of the styling of an XML file as being done through a style sheet, that is, through the schema and the transform. XML separates the data from the presentation, whereas with an HTML file the two are combined into one. XML is basically just data. It offers developers a way to go in and define their own vocabulary for how they want to describe data. They do it via the schema, using tags. It's the combination of the data file (the XML file), the schema, and the transform that makes up the whole XML system and how it works.
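Editor's note: To illustrate the separation of data from presentation, here is a small Python sketch that renders one XML data file in two different ways. This is not XSLT. Plain Python functions stand in for the transforms, and the element names and sample data are assumptions made for the example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical data-only resume; the element names are assumed.
DATA = (
    "<Resume>"
    "<Name><FirstName>John</FirstName><LastName>Smith</LastName></Name>"
    "<Objective>Technical writer</Objective>"
    "</Resume>"
)


def full_name(root):
    """Join the first and last name found under the Name element."""
    return f"{root.findtext('Name/FirstName')} {root.findtext('Name/LastName')}"


def as_text_report(root):
    """First 'presentation': a one-line plain-text summary."""
    return f"{full_name(root)}: {root.findtext('Objective')}"


def as_html(root):
    """Second 'presentation': a small HTML fragment built from the same data."""
    name = html.escape(full_name(root))
    objective = html.escape(root.findtext("Objective"))
    return f"<h1>{name}</h1><p>{objective}</p>"


root = ET.fromstring(DATA)
print(as_text_report(root))
print(as_html(root))
```

The data file never changes; only the function applied to it does, which is the role a transform plays in the XML system described above.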

With Word 2003 and XML (slide 7), the benefits are that XML makes for easier portability, meaning you can share data amongst other programs. It allows you to take a form in Word, save the data as XML, and then share the data.

Microsoft Word also allows you to work with XML documents in two ways. You can use the Word XML schema, which is known as WordML. You can create a document in Word, as you normally would, and then save it as an XML document.

By default, when you save within Word as an XML file, the WordML or the Word XML schema automatically gets saved. The difference between the WordML and regular XML is that the WordML will tag the information that's within the document such as formatting, paragraph styles, layout information, and information in the document properties. It creates tags for those and saves it within the XML file.

The second way that Word enables you to work with XML is to save it as data only. It creates the same kinds of tags, but it strips out all the WordML. We'll have an example of that later on in the presentation.

Word also enables you to work with XML by using a schema. You can create or open up a document in Word, and attach a custom XML schema to it. After you attach the XML schema, you can go in, apply the tags to it to define the structure of how the document will be laid out, and then save that document as an XML file. That includes the tags and how it will lay out within the context of the schema file.

You can open up XML files in a couple of different ways in Word 2003 (slide 8). The first way is basically through the same "file open" (on the File menu, click Open) that you use for any other document, just through the Open dialog box. Try to open it up. It'll come up. It recognizes the file type as being an XML and allows you to open it.

You can also insert an XML file into Word by clicking the Insert menu and choosing File, and browsing to the XML file at that point. It will insert the file into Word that way.

The next thing I want to go over is basically how to save a document as an XML file in Word 2003 (slide 9). Remember that any version of Word 2003 does have the ability to save, so this should be available on all versions of Word and not just the Office Professional or the Word stand-alone.

Basically, you save a document as an XML file the same way you would save any other document. On the File menu, click Save As. Change the file type to XML Document, and it will get saved. Remember, by default, when you save as an XML document, it will get saved with the WordML included in it.

On the Save As screen in front of you, you'll notice that you have an option towards the bottom that says Save data only. If that box is checked and you choose to save the file at that point, as an XML file, what that does is it takes out the WordML information such as the styles, the formatting, the paragraph styles, the heading styles, that type of thing in the document information. So, at that point, it just conforms to the regular XML standards and saves strictly data only.

One other way that you can save an XML file in Word 2003 is by applying a transform (slide 10). In the Save As dialog box, before the Save data only check box, you'll see an Apply transform check box. If you select it, the Transform button becomes available, and you can point to the XSLT or XSL file that has been supplied for you. Again, these transforms are typically created by developers, for instance, developers within your company who supply them for you to apply to your data.

When working with XML documents within Word, the easiest way to have access to all the information is through the XML task pane (slide 11). You can access the XML task pane by going up to the View menu and choosing Task Pane. The task pane appears on the right-hand side of the Word application window. Click the down arrow, and there is an option for XML Structure. This brings up the XML information available in Word and how to access it.

The rest of this presentation is basically going to go through taking a very simple document (slide 12). I've created a very simple resume document, and this is basically it. I wanted everybody to be able to see it on the screen, so this is pretty much the extent of what the document looks like. But I wanted to take a sample of just a regular Word document, add a schema to it, and go through the whole process of how XML can benefit the users and the companies, and how to apply the data.

What you have on the screen before you, like I said, it's just a regular Word document, basically set up as a very simple resume. It has the person's name, address, city, state, ZIP, their objective, their work experience, their education, and then interests.

In order to go in and have the information workable for an XML file (slide 13), you need to go in and add a schema. So, at this point, from the XML Structure task pane that's on the right, you have a Templates and Add-Ins link. You can click on the Templates and Add-Ins link, or you can also access it from the Tools menu, by clicking Templates and Add-Ins on the Word menu bar.

Once you have the Templates and Add-ins dialog box up, you have a tab for XML schema. Within the XML Schema dialog, you have an Add Schema button. You can click on that. What you do is browse to your actual .xsd file (slide 14).

I have two samples here. One is Resume.xsd and one is Myresume.xsd. The Resume.xsd file is actually one that's available from our Web site if somebody wants to practice with it. I created my own .xsd file by going in and adding a little bit more information to it, so that I would have a little bit more to show you.

I choose Myresume.xsd file, choose Open, and this will take me to another screen that gives me a little bit more information about the schema settings (slide 15). Once you click OK, a Schema Settings dialog box will come up. Basically what it has in here is a URI that's set within the .xsd file. URI basically stands for Uniform Resource Identifier. It's information that's put into the .xsd file that tells Word to associate this file with this type of information. It has the location of the .xsd file and then it also has an alias name.
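Editor's note: The following Python sketch shows one way to look at the URI inside an .xsd file. It assumes that the URI shown in the Schema Settings dialog box corresponds to the schema's targetNamespace attribute. The schema text below is a heavily trimmed, hypothetical stand-in, and the namespace urn:example:resume is made up. It is not the Resume.xsd or Myresume.xsd file from the demo.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed schema. The namespace URI is invented for this example.
XSD_TEXT = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:resume"
           xmlns="urn:example:resume"
           elementFormDefault="qualified">
  <xs:element name="Resume" type="xs:string"/>
</xs:schema>
"""


def schema_uri(xsd_text):
    """Return the targetNamespace declared on the schema root, or None."""
    root = ET.fromstring(xsd_text)
    return root.get("targetNamespace")


print(schema_uri(XSD_TEXT))  # urn:example:resume
```

Two schemas with similar content would still report different URIs here, which is the same distinction the friendly alias is meant to make easier to read.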

The alias can be a friendly name. It's basically for clarification purposes. For example, you may want to call it Resume, which is what I did. You may want to call it something else. You may have different .xsd files that are very similar in nature and you need to clarify it a little bit. So by giving it an alias, it's just basically for clarification purposes.

At the bottom of the screen, on the right, with the XML structure, you'll notice that there's an XML Options link. If you click on that, it takes you to the XML Options dialog box that basically goes over XML options that are available and how you want to treat the data (slide 16).

You have the ability here (under XML save options) to save the data, to choose to have it save data only, or you can go in here and have it apply a custom transform. If you leave both of them blank, you still have that option at the file Save As area.

You also have schema validation options. You can decide whether you want to validate the document against the attached schemas, whether you want to hide the violations, whether you want to be able to save a document even if the data is not valid.

You can choose all these different options. If you choose, for instance, Allow saving as XML even if not valid, you won't get an error when you save, but you may not be able to use that data in other applications, because the data may not conform to the schema. At the beginning, we discussed well-formed data and why it matters when you share data with other applications, and the same concern applies here. You may wind up saving the document, taking it to another application, and having it not open correctly, so that's just my "two cents' worth" there.

There are also some XML view options. One that we'll go over a little bit is Show placeholder text for all empty elements. When you start adding tags to a document, you can set generic text for each tag, kind of like when you work with form fields.

You can set an attribute on that XML tag to have placeholder text. It may just say enter last name here or enter first name here, or something. By marking this, Show placeholder text for all empty elements, if you have just basically a generic document printed up or a generic XML file, and you don't have any information within those tags, it will show that generic text within those empty elements.
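Editor's note: The placeholder behavior described above can be pictured with a short Python sketch. It does not reproduce how Word stores placeholder text. The PLACEHOLDERS table and the element names are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder strings, keyed by element name.
PLACEHOLDERS = {
    "FirstName": "Enter first name here",
    "LastName": "Enter last name here",
}


def display_text(element):
    """Return the element's own text, or its placeholder when it is empty."""
    text = (element.text or "").strip()
    return text if text else PLACEHOLDERS.get(element.tag, "")


# FirstName has content; LastName is an empty element.
name = ET.fromstring("<Name><FirstName>John</FirstName><LastName/></Name>")
shown = [display_text(child) for child in name]
print(shown)  # ['John', 'Enter last name here']
```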

Once you have added the schema, you'll notice that the left-hand side (slide 17) now shows the friendly alias that I gave the sample resume schema, Resume. You'll notice now that it shows up at the bottom of the screen as Resume.

At this point you've added the schema to the resume document, but you've not done anything with it. In order to start adding tags to it and start actually working with the creation of the XML file itself, you need to go and apply that schema.

So what you need to do is, within that document, once you have that document up on the screen, you can come down to this bottom screen where it says Resume, and you click on it.

Once you click on it, another dialog box is going to appear and it basically says do you want to apply this to the entire document or to the selection only? If you've got certain parts of it selected and you want to apply just this schema to it, you can do so. For today's purposes we're going to apply it to the entire document.

Once you've chosen to apply it to the entire document, you'll notice that the Resume element now moves up to the top of the XML Structure task pane (slide 18), and at this point, you can see the tags for Resume at the beginning of this document, right before John Smith, and at the end of the document at the very end. It basically says take everything that's contained in here, apply this Resume element to it, and when it's saved everything will be associated with that Resume schema.

After you apply the Resume schema, you have to actually go in and start adding elements of that schema (slide 19). One of the first things we will do is mark up the name. You select the text, John Smith, and then choose Name, which is a child element of the Resume element. So you've now applied Name to John Smith. It's put the tags in there for Name, and you'll notice at the top of the XML Structure task pane that Resume and Name are now both included. However, where Name is, it's got a question mark by it. When you see a question mark next to an element like this, it's basically saying that this data is not valid. It does not conform to the validation rules of this schema.

If you look down at the bottom part of this XML Structure task pane, you'll notice that there are other elements listed for Name.

Basically, these are subelements of the Name element. You've got FirstName and LastName. In order to correct it and to make the document valid, you must go in and add these FirstName, LastName elements. To do this, you basically select the first part, which is John, and choose FirstName from the bottom structure. You can go to the next name, the last name of Smith. Select that and add the LastName structure to it.
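Editor's note: The reason for the question mark can be sketched in a few lines of Python. This is not Word's validator and it is not a full XML Schema validation. It is a hand-written check of one assumed rule, modeled on the demo: a Name element must contain FirstName and LastName children.

```python
import xml.etree.ElementTree as ET

# Assumed rule for illustration: Name requires these child elements.
REQUIRED_CHILDREN = {"Name": ["FirstName", "LastName"]}


def missing_children(element):
    """List the required child elements that are absent from element."""
    required = REQUIRED_CHILDREN.get(element.tag, [])
    present = {child.tag for child in element}
    return [tag for tag in required if tag not in present]


# Before: Name wraps the raw text only, as in the first markup step.
before = ET.fromstring("<Name>John Smith</Name>")
# After: the text has been split into the two subelements.
after = ET.fromstring(
    "<Name><FirstName>John</FirstName> <LastName>Smith</LastName></Name>"
)

print(missing_children(before))  # ['FirstName', 'LastName']
print(missing_children(after))   # []
```

An empty result corresponds to the point in the demo where the question mark next to Name goes away.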

At that point, you will notice that the question mark next to Name is no longer there, and the element now validates (slide 20). You now have Name, FirstName, and LastName. You would do this basically for the rest of the document: highlight the data, apply the elements from the bottom of the task pane, and check that the data validates.

You can also add elements by using the right-click (slide 21). You can go in and select text within the document. You can also right-click (press your right mouse button to right-click) and you'll see an option in there to apply an XML element. At that point when you apply XML, when you choose Apply XML Element, you get all the other objects there. You get the name, the address, phone. It's just basically the main elements and not the subelements, though.

Also from the right-click menu, you have the option to remove tags (slide 22). You also have attributes. You also have the ability to go in and view the XML structure. You'll notice, in this example, that when I was marking up the data, I marked up the address so that it covers only the street information and the apartment number.

If you look to the right at the XML structure, you'll notice that the address is up at the top. It has a question mark by it. The subelements are for street, city, state, and ZIP Code. It has other elements within the address element. Therefore, I need to correct my mistake. I need to go in and actually remove the address tag from where it is and then reapply it to include the street, the city, the state, and the ZIP Code.

If you put your cursor anywhere between the two tags for address and right-click, you will get the option. Word automatically recognizes it as being the address tag, and you can choose Remove Address tag.

This is an example where basically everything has been marked up (slide 23). Everything's been verified. At that point, you want to go in and save the document as an XML file. You just go up, and on the File menu, click Save As, choose XML, and this is basically an example of what the data looks like when you've saved it as an XML file with the WordML included. This is basically the XML file showing the tags that have been entered.

This is an example of what the data will actually look like if you choose the save as data only option (slide 24). You'll notice if you choose Save data only, you'll have a warning, basically, that says anything saved as data only or through a custom transform may result in loss of document features. That's basically what it does. It basically strips out the formatting, the pictures, and the objects. It basically leaves the straight text at that point with the tags left. As you can see, it is just pretty much tags and that's all that's left of it.

If you go to view the different saves and review the XML (slide 25), you'll see the data of how it looks. These were opened up in Internet Explorer. It's just basically an XML file that's been saved. You can open up the XML file in Internet Explorer or your browser. You can notice the difference right off the bat.

In the WordML version, you can see, from the very beginning of it, that it's got a lot of the properties, for instance, the author of the document, total editing time, creation date, number of pages, characters, paragraph styles, lines, etc. It has a lot of what gets saved within a document itself and what allows a Word document to retain its formatting. So basically it's got all the formatting information that would be saved within a Word document.

On the left-hand side, where the data-only screen is, you'll notice most of that's just been stripped out. You're basically dealing with basic data at that point. It's just straight raw names, the tags, and no formatting. That is data only.
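Editor's note: A program that receives one of these files can tell the two kinds of save apart by looking at the root element. The sketch below assumes that a WordML file has a root element named wordDocument in the Word 2003 namespace shown in the code, and that a data-only file has the custom schema's root instead. Neither the namespace URI nor the sample strings come from the transcript.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Word 2003 WordML documents (root element w:wordDocument).
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def looks_like_wordml(xml_text):
    """Return True if the root element is wordDocument in the WordML namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{WORDML_NS}}}wordDocument"


# Tiny stand-ins for the two saved files; real files are much larger.
wordml = f'<w:wordDocument xmlns:w="{WORDML_NS}"><w:body/></w:wordDocument>'
data_only = "<Resume><Name>John Smith</Name></Resume>"

print(looks_like_wordml(wordml))     # True
print(looks_like_wordml(data_only))  # False
```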

Once you've created an XML file (slide 26), once you've gone through the process of actually creating the file, adding all the tags to it, saving the file as an XML file, you want to have a way for people to not go in and actually mess up all the tags that you've spent this time putting in.

Word 2003 allows XML files to be locked down. Whoever sets the standards, whoever is creating these forms, can lock down the documents so that somebody doesn't accidentally delete the XML tags that have been added.

It allows you to restrict users' access to the XML tags, and you can also lock down the document so that users can only work within editable sections of that document. Basically, you can go in and it's kind of like if any of you've ever used form fields, created a form in Word and used form fields, and then locked the document for forms, it's pretty much along that concept.

In order to lock down a file (slide 27), once you've created all your XML tags, you use the Protect Document task pane. The document that you see right here is the XML file that was created. The XML tags are turned off. You can toggle between turning tags on and turning tags off by pressing CTRL+SHIFT+X. This view is with tags off.

At this point, you'll notice that there are different options for locking down the document. The two main ones you want to be concerned with, when locking down an XML file, are in section number two, under Editing restrictions. You want to select Allow only this type of editing in the document, and you want to have it set to No changes (Read only).

Then you want to go to the Exceptions (optional) area underneath that, and this is where you can start marking sections of the document as editable, to allow users to go in and type the information that they need.

In order to do that, you basically select the text (slide 28). You go down to Exceptions and select the Everyone check box. You tell it, basically, I want everyone to have access to this text that I just selected. I want everybody to be able to edit that.

Once you do that, you click the Yes, Start Enforcing Protection button that was on the previous screen. Once you start enforcing protection, the document is basically locked at that point. What you get is the document that's on the screen right now (slide 28). You can see kind of the bookmark areas. It's only going to allow users to go in and type within those areas.

Once the document's been locked down, the Protect Document task pane has different options. It allows people to go in and find the next region that they can edit. It basically just skips from one editable area to the next.

Where you see John Smith, the address, the city, the state, and the ZIP, if you had the placeholders on, remember, we talked earlier about setting attributes within the XML tags to set default text and then you had an option to go in and make sure to show the placeholder text if the area was blank?

If you had nothing in these fields at this point, if John and Smith were not entered into this document, if you had basically taken that out, what you would see is whatever default text you had placed in there. As an example, you might say "insert first name here" for John and where Smith is you might say "insert last name here." You would do the same with address, city, state, and ZIP.

Basically, that was the general overview of XML and working with a sample document. We talked about how to open an XML file, save a document as an XML file, apply a transform, add a schema and apply the tags, and lock down the document.

In reviewing (slide 29), XML data allows for easy access from other applications. It allows you to open up XML data from within Word. It allows you to save data from within Word 2003 as XML to be used across other platforms and other applications.

Word 2003 allows you to have customizable schemas that are geared more towards your company's environment. WordML is basically the default schema applied when you choose to save as an XML file.

You do have the option to go in and change it to save data only. One thing about the WordML is that it does allow round-tripping of the formatting. It does keep the bolding and the paragraph styles, headings, etc.

Word 2003 also allows for the validation of the elements (slide 30). When you add the schema to the document and add the different tags and elements of that schema, you can check the validity of those elements. It also gives you the ability to lock down the document to prevent the deletion of the elements once you have gone in and created them.

The last thing is the user friendliness of the Word interface. Users are still able to have the user friendliness of the Word layout, what they're used to. Developers have the ability to go in and create customized schemas and transforms to allow them to be able to use the data to take to other applications, whether it is their own custom applications within the company or other applications that can read XML.

The last thing I wanted to give you is a list of additional resources (slide 31). There are several of them on our Web site. A good way to go in and do it is just go to Microsoft.com. You can do a search for XML and there's a ton of stuff out there.

Some of the ones that I wanted to mention: The first one is the "XML Developer's Guide." You can download it. It's more programming. It does have some schemas included in it. It has a lot of good information, if you're from a developer point of view and you want to have that.

The second one, the "Architect Webcast: XML in Word 11" is another webcast that was given at an earlier date. It is very in-depth, very much about the architecture of the whole XML schema, the whole XML process in Word 11. It's pretty in-depth, so anybody that's interested, as far as development there, you may want to go there.

There's also the "Office 2003 XML Reference Schemas." It has schemas included with it, along with some programmability options for Word 2003, Excel 2003, and I think it's also got some Microsoft Office InfoPath™ 2003 schemas.

Another area that's very, very popular is the World Wide Web Consortium Standards, which is www.w3.org. It's basically a Web site that has all the standardizations that are used throughout the whole XML and HTML process.

The last one is "Creating and Applying an XML Resume Template in Microsoft Office Word 2003." This page has an executable file that you can download. It's basically what this whole Support WebCast and the sample were geared off of. It's where I took an example, messed around with the .xsd file that was in that sample, just added some more tags and stuff to it. But it's a great spot to go in and look if you want to take a look at one and start working with it.

There's one other thing that might be useful. There's an MSPress book called XML Step by Step, Second Edition. It has a lot of good information for beginners. It comes with a CD that has example schemas and .xsd files, and it basically walks you through the process of going in and creating one.

{Editor's note: For more information about XML Step by Step, Second Edition, see the Microsoft Press Web site at http://www.microsoft.com/MSPress/books/5518.asp. The ISBN number is 0-7356-1465-2.}

That pretty much takes care of today's WebCast at this point, and thank you very much for joining us at this time. I'll hand it over to Otto at this point.

Otto Cate: Thank you very much for the presentation. Before we move on to the Q&A portion of the Support WebCast here, I'd like to share a couple of quick program notes with our listeners. For more information on future events or to review any of our sessions on demand, feel free to visit our main Support WebCast site there at support.microsoft.com/webcasts.

There you will also be able to find the on-demand streaming media for this particular live session, as well as a downloadable version of the PowerPoint® slides and a full transcript, within two to three weeks' time.

We'd also like to ask for any feedback as well, on the events that we produce. Feel free to submit any feedback on today's event, for instance, subjects that you'd like to see in the future, or even the quality of the events that we produce. Anything is certainly fair game there and we'd love to hear from you. You can use that URL on the last slide to go to the "Contact Us" page at support.microsoft.com/servicedesks/webcasts/feedback.asp.

So, with that, let's go ahead and answer some questions that were submitted during the presentation today.

After you add the tags to the Word document, can you save out the .xsd schema data?

Risa: At this point Word does not allow you to save out as an .xsd file. I know that was actually brought up, I think, in, maybe, an Excel WebCast as well. At this point in time, we do not have the ability to save out as an .xsd file. I'm sure, you know, it's probably a question that's been asked a lot and it may be that, somewhere down the road, that may be incorporated, but at this time, no. It's just not available at this time.

Otto: If I create an XML document outside of MS Word, is there any way to use the MS Word field codes INCLUDETEXT and INCLUDEPICTURE?

Risa: I'm not sure exactly what you may be trying to do, but I can take it offline and do some research, and get back with you on that.

Follow-up information: We tested this and did not see a way to incorporate the fields. We tested creating a document in Word using the INCLUDETEXT and INCLUDEPICTURE fields and saving as an XML file. However, we saved it with the WordML included and not as data only. This seemed to be fine. I think that a straight XML file would not recognize those fields and be able to interpret them. However, this would fall more into the boundaries of editing XML data, which is more a development type level of support.

I would suggest trying the newsgroups that are available from our site and posting the question there. There are people out on the newsgroups that can probably address the question in more detail. I would suggest that you post to one or both of these newsgroups:

There are XML areas in both of these groups.

Otto: Okay. Next question: Can I take an XML Word doc and then enter it into an RDIF tag?

Risa: That I'm not really sure of either. I'd have to get back with you on that one as well.

Follow-up answer: I was not able to find any information regarding your question. However, this would fall more into the boundaries of a development type level of support where most of our XML support is addressed.

I would suggest trying the newsgroups that are available from our site and posting the question there. There are people out on the newsgroups that can probably address the question in more detail. I would suggest that you post to one or both of these newsgroups:

There are XML areas in both of these groups.

Otto: The final question here appears to be a clarification question: What was the extension that the schema will actually have?

Risa: A schema file will have an .xsd extension.

Otto: With that, it appears that we have answered all the questions that were submitted to the queue today, so I'm going to go ahead and wrap up our session. I certainly wanted to thank our presenter for coming out and giving us a great presentation here. Of course, as always, I'd like to thank you, our listeners, for attending today's event. We certainly hope that this information was useful to you and your business. We look forward to seeing everyone again in the near future. Thanks, everyone, and have a great day.