How To Locate and Replace Special Characters in an XML Document with Visual Basic

This article was previously published under Q251354
This article has been archived. It is offered "as is" and will no longer be updated.
XML predefines five entity references for special characters that would otherwise be interpreted as part of the markup language. These five are listed as follows:
Entity Reference                Character      &amp;                        &        &lt:                         <      &gt                         >      &quot;                       "      &apos;                       '                    				
When you create an XML Document, you must input these special characters appropriately to insure proper parsing, interpretation, and data display. When you create an XML Document from Visual Basic, one way to search for these special characters and replace them with the proper entity reference is to use the Replace function.

NOTE: Microsoft Visual Basic 6.0 and later products, and Microsoft Office 98 and later products come with a Replace function that is built-in.
The following sample opens an ADO Recordset, creates Document Elements for the records, and saves the data as an XML Document using the FileSystemObject object. Before you save the file, however, the content is searched for special characters and replaced with the appropriate XML Entity References for those characters utilizing the Replace Method.

NOTE: For simplicity of demonstration, the instructions are for a search and replace for the Ampersand (&) character in the following sample. The same function can be applied for the other four characters as well.
  1. To run this sample, start a new Visual Basic Standard Exe Project. Form1 is created by default.
  2. On the Project menu, under References, select the following: Microsoft ActiveX Data Objects Library (2.1 or later)
    Microsoft XML, Version 1.0 or later
    Microsoft Scripting Runtime*
    *The FileSystemObject Object Model is contained in the Scripting Type Library.

  3. Draw the following controls on Form1 and set their Properties as listed:
       Control                   Name			   CaptionLabel                  lblFSORecordCount           Record Count:Text Box               txtFSORecordCountCommand Button         cmdOpenAdoFsoRs             Open ADO Recordset	Command Button         cmdSaveRSasXMLwithFSO       Save RS as XML using FSOCommand Button         cmdFsoXmltoTreeView         Place XML Data Into TreeViewTreeView Control*      xtvFSO*The TreeView Control is part of the Windows Common Controls. Therefore, you want to check this off under Components in the Project menu.					
  4. From the Project menu, add a standard Module and copy the following to create the appropriate variables needed in the project:
    Option ExplicitPublic cn As New adodb.ConnectionPublic rs As New adodb.RecordsetPublic myI As IntegerPublic StrTmp As StringPublic Errs1 As ErrorsPublic myFSO As New FileSystemObjectPublic myTxtStream As TextStreamPublic myfld As adodb.FieldPublic domdoc As New MSXML.DOMDocumentPublic domdoc As New MSXML2.DOMDocument 'For use with XML 2.6					
  5. Copy the following code into the General Declarations Section of the code window.NOTE: Replace your Server Name, User ID, and Password information in the Connection String.
    Private Sub cmdOpenAdoFsoRs_Click()Screen.MousePointer = vbHourglass'Check to be sure the connection is not already opened:If cn.State = 1 Then    cn.Close    Set cn = NothingEnd Ifcn.Open "driver={SQL Server};Server=YourServer;uid=YourUID;pwd=YourPWD;database=Northwind"'Check to be sure a Recordset is not already opened:If rs.State = 1 Then    rs.Close    Set rs = NothingEnd IfWith rs    .CursorLocation = adUseClient    .CursorType = adOpenKeyset    .LockType = adLockOptimistic    .CacheSize = 50    .Source = "Select CustomerID, " & _    "CompanyName, ContactName, Region from Customers"    .ActiveConnection = cn    .OpenEnd Withrs.MoveLastrs.MoveFirsttxtFsoRecordCount.Text = rs.RecordCountScreen.MousePointer = vbNormalEnd SubPrivate Sub cmdExit_Click()If cn.State = 1 Then    cn.Close    Set cn = NothingEnd IfIf rs.State = 1 Then    rs.Close    Set rs = NothingEnd IfUnload MeEnd SubPrivate Sub cmdSaveRSasXMLwithFso_Click()Screen.MousePointer = vbHourglass'If the file already exists, first Kill it'and then create it.'If it does not already exist, simply create it:On Error GoTo mdkNoSeeEmConst mdkNoSeeEm = 53 'File Not FoundKill "Customers.xml"'Note: Unicode data is stored as 2 Bytes per character. This will allow'proper parsing of international language characters. However, either'omitting or explicitly setting the last 'argument in the CreateTextFile 'Method to False will create the file as Ascii. Subsequently, a '"An 'Invalid Character Was Found in Text Content" error will occur if these 'characters exist in 'the text. Therefore, you use Unicode (True). 'Create a Text file using the FileSystemObject:Set myTxtStream = myFSO.CreateTextFile(App.Path & "\Customers.xml", True, True)'After searching for and replacing special characters with'their respective entities (via the ReplaceEntXMLSpecChar Function),'write the data to Text File to compile XML Document using the TextStream:myTxtStream.WriteLine "<?xml version=""1.0"" standalone=""yes""?>"myTxtStream.WriteLine "<Customers>"rs.MoveFirst    Do While Not rs.EOFmyTxtStream.WriteLine "<Customer>"myTxtStream.WriteLine "<CustomerID>" & ReplaceEntXMLSpecChar(rs.Fields("CustomerID") & "", "&", "amp;") & "</CustomerID>"myTxtStream.WriteLine "<CompanyName>" & ReplaceEntXMLSpecChar(rs.Fields("CompanyName") & "", "&", "amp;") & "</CompanyName>"myTxtStream.WriteLine "<ContactName>" & ReplaceEntXMLSpecChar(rs.Fields("ContactName") & "", "&", "amp;") & "</ContactName>"myTxtStream.WriteLine "<Region>" & ReplaceEntXMLSpecChar(rs.Fields("Region") & "", "&", "amp;") & "</Region>"myTxtStream.WriteLine "</Customer>"rs.MoveNextLooprs.MoveFirstmyTxtStream.WriteLine "</Customers>"myTxtStream.CloseScreen.MousePointer = vbNormalExit SubmdkNoSeeEm:    If Err.Number = mdkNoSeeEm Then        Resume Next    End If    Screen.MousePointer = vbNormalEnd Sub'When this function is applied to each record in the recordset,'simply provide the Source to perform the search on, the item'to be Searched For and the item to Replace with:Private Function ReplaceEntXMLSpecChar(ByVal strSource As String, ByVal strSearchFor As String, ByVal strReplace As String) _    As String    Dim lngPointer As Long, strNew As String    lngPointer = InStr(strSource, strSearchFor)    If lngPointer = 0 Then        ReplaceEntXMLSpecChar = strSource    Else        strNew = Left$(strSource, lngPointer - 1) & strReplace & Mid(strSource, lngPointer + 1, Len(strSource))        ReplaceEntXMLSpecChar = strNew    End IfEnd FunctionPrivate Sub cmdFsoXmltoTreeView_Click()'Fill The TreeView with data from the XML Document:With DOMdoc    .async = False    .Load App.Path & "\customers.xml"End WithIf DOMdoc.readyState = 4 Then    xtvFSO.Nodes.Clear    AddNode DOMdoc.documentElementEnd IfEnd SubPrivate Sub AddNode(ByRef myNode As IXMLDOMNode, Optional ByRef TreeNode As Node)       Dim tvcusterrNode As Node       Dim tvcusterrNodeList As IXMLDOMNodeList       Dim i As Long                       If TreeNode Is Nothing Then           Set tvcusterrNode = xtvFSO.Nodes.Add       Else           Set tvcusterrNode = xtvFSO.Nodes.Add(TreeNode, tvwChild)       End If                 tvcusterrNode.Expanded = True       tvcusterrNode.Text = myNode.nodeName                 If tvcusterrNode.Text = "#text" Then              tvcusterrNode.Text = myNode.nodeTypedValue       Else              tvcusterrNode.Text = "<" & tvcusterrNode.Text & ">"  End If					
  6. Run the project.
  7. Click the Open ADO Recordset CommandButton to open the Connection and Recordset, displaying a RecordCount.
  8. Click the Save RS as XML Using FSO CommandButton to parse the data, replace special characters in the Recordset, and then save it as an XML document through the FileSystemObject object.
  9. Subsequently, click the Place FSO XML Data into TreeView CommandButton to open the XML File and display the data in a TreeView Control.

    If you open the XML document produced in Notepad or any other text editor and do a search for the ampersand (&), you find:
    Split Rail Beer &amp;						
    In the treeview, this appears as:
    Split Rail Beer &						
If a newer version of MSXML has been installed in side-by-side mode, you must explicitly use the Globally Unique Identifiers (GUIDs) or ProgIDs for that version to run the sample code. For example, MSXML version 4.0 can only be installed in side-by-side mode. For additional information about the code changes that are required to run the sample code with the MSXML 4.0 parser, click the following article number to view the article in the Microsoft Knowledge Base:
305019 INFO: MSXML 4.0 Specific GUIDs and ProgIds
For additional information, click the following article number to view the article in the Microsoft Knowledge Base:
308060 How To Locate and Replace Special Characters in an XML File with Visual Basic .NET

Article ID: 251354 - Last Review: 12/05/2015 18:18:20 - Revision: 3.2

Microsoft Visual Basic 6.0 Learning Edition, Microsoft Visual Basic 6.0 Professional Edition, Microsoft Visual Basic Enterprise Edition for Windows 6.0, Microsoft ActiveX Data Objects 2.5, Microsoft ActiveX Data Objects 2.6, Microsoft XML Parser 3.0, Microsoft XML Core Services 4.0

  • kbnosurvey kbarchive kbdatabase kbhowto KB251354