HOW TO: Retrieve Metadata from Excel by Using the GetOleDbSchemaTable Method in Visual Basic .NET

For a Microsoft Visual C# .NET version of this article, see
318452 .

Summary

This step-by-step article describes how to retrieve table and column metadata from Microsoft Excel data sources by using the GetOleDbSchemaTable method with the Microsoft OLE DB Managed Provider and the Microsoft OLE DB Provider for Jet.

The GetOleDbSchemaTable method that is exposed by the System.Data.OleDb class of Microsoft .NET Framework is the .NET successor to the OpenSchema method in earlier versions of Microsoft ActiveX Data Objects (ADO).

Description of the Technique

After you connect to an Excel data source by using ADO.NET, you extract a list of table metadata by using GetOleDbSchemaTable, and then use the same method with different arguments to obtain column metadata for the selected table. You can also use a DataGridTableStyle object to lay out and format your query results in a data grid.

Requirements

The following list outlines the recommended hardware, software, network infrastructure, and service packs that you will need:
  • Visual Studio .NET installed on a compatible Microsoft Windows operating system
  • At least one Microsoft Excel workbook (.xls) file with some rows and columns of data
This article assumes that you have at least basic familiarity with the following topics:
  • Visual Basic .NET
  • ADO.NET data access
  • Excel workbooks and worksheets

Sample

  1. Start Microsoft Visual Studio .NET, and create a new Visual Basic .NET Windows Application project.
  2. Add three Button controls and two DataGrid controls to the default form (Form1). Change the Text property of the Button controls to Retrieve Metadata, Format Tables List, and Format Columns List respectively.
  3. Switch to the form's code module, and add the following Imports statements at the top:
    Imports System.Data
    Imports System.Data.OleDb
    Imports System.ComponentModel
  4. Insert the following module-level declarations in the Form class after the "Inherits System.Windows.Forms.Form" line. Adjust the connection string as necessary to point to an Excel workbook file that contains some rows and columns of data.
        Dim cn As OleDbConnection
    Dim strCn As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=c:\test.xls;Extended Properties=Excel 8.0"
    Dim dtTables As DataTable
    Dim WithEvents cm As CurrencyManager
    Dim dtColumns As DataTable
    Dim dvColumns As DataView
  5. Insert the following code in the Form class after the "Windows Form Designer generated code" region. This code calls GetOleDbSchemaTable to load the table and columns lists, populates the DataGrid controls, and updates the columns list when the selected table changes.
        Private Sub Button1_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles Button1.Click
    Call GetTablesList()
    End Sub

    Private Sub GetTablesList()
    cn = New OleDbConnection(strCn)
    cn.Open()
    dtTables = cn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)
    DataGrid1.DataSource = dtTables
    DataGrid1.ReadOnly = True
    cn.Close()
    Call GetColumnsList()
    End Sub

    Private Sub GetColumnsList()
    If cm Is Nothing Then
    cm = CType(Me.BindingContext(dtTables), CurrencyManager)
    End If
    Dim r As Integer = cm.Position
    Dim strTable As String = dtTables.Rows(r)("TABLE_NAME")
    cn = New OleDbConnection(strCn)
    cn.Open()
    dtColumns = cn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, _
    New Object() {Nothing, Nothing, strTable, Nothing})
    dvColumns = New DataView(dtColumns)
    dvColumns.Sort = "ORDINAL_POSITION"
    DataGrid2.DataSource = dvColumns
    DataGrid2.ReadOnly = True
    cn.Close()
    End Sub

    Private Sub cm_PositionChanged(ByVal sender As Object, _
    ByVal e As System.EventArgs) Handles cm.PositionChanged
    Call GetColumnsList()
    End Sub
  6. Insert the following code to lay out and format the Tables DataGrid by using TableStyles. Note the use of PropertyDescriptor to facilitate non-default formatting of the date columns.
        Private Sub Button2_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles Button2.Click
    Call FormatTablesGrid(dtTables)
    End Sub

    Private Sub FormatTablesGrid(ByVal dt2format As DataTable)
    Dim gs As DataGridTableStyle = New DataGridTableStyle()
    gs.MappingName = dt2format.TableName

    Dim cs As DataGridColumnStyle = New DataGridTextBoxColumn()
    With cs
    .MappingName = "TABLE_NAME"
    .HeaderText = "Table Name"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    cs = New DataGridTextBoxColumn()
    With cs
    .MappingName = "TABLE_TYPE"
    .HeaderText = "Table Type"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    Dim cm As CurrencyManager = CType(Me.BindingContext(dt2format), CurrencyManager)
    Dim pd As PropertyDescriptor = cm.GetItemProperties()("DATE_CREATED")

    cs = New DataGridTextBoxColumn(pd, "d")
    With cs
    .MappingName = "DATE_CREATED"
    .HeaderText = "Date Created"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    cm = CType(Me.BindingContext(dt2format), CurrencyManager)
    pd = cm.GetItemProperties()("DATE_MODIFIED")

    cs = New DataGridTextBoxColumn(pd, "d")
    With cs
    .MappingName = "DATE_MODIFIED"
    .HeaderText = "Date Modified"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    DataGrid1.TableStyles.Add(gs)
    Me.Button2.Enabled = False

    End Sub
  7. Insert the following code to lay out and format the Columns
    DataGrid by using TableStyle:
        Private Sub Button3_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles Button3.Click
    Call FormatColumnsGrid(dtColumns)
    End Sub

    Private Sub FormatColumnsGrid(ByVal dt2format As DataTable)
    Dim gs As DataGridTableStyle = New DataGridTableStyle()
    gs.MappingName = dtColumns.TableName

    Dim cs As DataGridColumnStyle = New DataGridTextBoxColumn()
    With cs
    .MappingName = "COLUMN_NAME"
    .HeaderText = "Column Name"
    .Width = 100
    End With
    gs.GridColumnStyles.Add(cs)

    cs = New DataGridTextBoxColumn()
    With cs
    .MappingName = "ORDINAL_POSITION"
    .HeaderText = "Ordinal Position"
    .Width = 100
    End With
    gs.GridColumnStyles.Add(cs)

    cs = New DataGridTextBoxColumn()
    With cs
    .MappingName = "DATA_TYPE"
    .HeaderText = "Data Type"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    cs = New DataGridTextBoxColumn()
    With cs
    .MappingName = "CHARACTER_MAXIMUM_LENGTH"
    .HeaderText = "Text Length"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    cs = New DataGridTextBoxColumn()
    With cs
    .MappingName = "NUMERIC_PRECISION"
    .HeaderText = "Numeric Precision"
    .Width = 75
    End With
    gs.GridColumnStyles.Add(cs)

    DataGrid2.TableStyles.Add(gs)
    Me.Button3.Enabled = False

    End Sub
  8. Run the project.
  9. Click Retrieve Metadata to fill the Tables list (DataGrid1) with all of the columns of information that are returned for each table in the Excel workbook by GetOleDbSchemaTable. The Columns list (DataGrid2) is filled at the same time with all of the columns of information that are returned for the columns in the first table in the Tables list.
  10. Select a different table in the Tables list. The Columns list changes to display the columns from the selected table because of the cm_PositionChanged event handler that you implemented.
  11. Click Format Tables List to define and apply a TableMappingStyle to DataGrid1. This makes the column headers more "friendly" and displays only the four columns of useful data from the nine columns that are returned by GetOleDbSchemaTable. These columns are displayed:
    TABLE_NAME
    TABLE_TYPE
    DATE_CREATED
    DATE_MODIFIED
    These blanks columns are not displayed:
    TABLE_CATALOG
    TABLE_SCHEMA
    TABLE_GUID
    DESCRIPTION
    TABLE_PROPID
  12. Click Format Columns List to define and to apply a TableMappingStyle to DataGrid2. This makes the column headers more "friendly" and displays only the 5 most useful columns of data from the 28 columns that are returned by GetOleDbSchemaTable.
    COLUMN_NAME
    ORDINAL_POSITION
    DATA_TYPE
    MAXIMUM_CHARACTER_LENGTH
    NUMERIC_PRECISION
    These mostly blank columns are not displayed:
    TABLE_CATALOG
    TABLE_SCHEMA
    TABLE_NAME (already known)
    COLUMN_GUID
    COLUMN_PROPID
    COLUMN_HASDEFAULT (always false)
    COLUMN_DEFAULT
    COLUMN_FLAGS
    IS_NULLABLE (always true)
    TYPE_GUID
    CHARACTER_OCTET_LENGTH
    NUMERIC_SCALE
    DATETIME_PRECISION
    CHARACTER_SET_CATALOG
    CHARACTER_SET_SCHEMA
    CHARACTER_SET_NAME
    COLLATION_CATALOG
    COLLATION_SCHEMA
    COLLATION_NAME
    DOMAIN_CATALOG
    DOMAIN_SCHEMA
    DOMAIN_NAME
    DESCRIPTION
  13. Click a different row in DataGrid1 to select a different table. The Columns list is automatically updated. The TableStyles that you previously applied to each DataGrid remain in effect.

Troubleshooting

  • The column data types that are returned by GetOleDbSchemaTable for an Excel data source are not identical in all cases to the data types that are returned by the OpenSchema method in classic ADO:
    Column typeClassic ADOADO.NET (OLE DB)
    Numeric5 - adDouble5 - OleDbType.Double
    Currency6 - adCurrency6 - OleDbType.Currency
    Date/Time7 - adDate7 - OleDbType.Date
    Boolean11 - adBoolean11 - OleDbType.Boolean
    Text < 255202 - adVarWChar130 - OleDbType.WChar
    Memo203 - adLongVarWChar130 - OleDbType.WChar
  • GetOleDbSchemaTable, like OpenSchema, returns a single column "F1" from an empty Excel worksheet, although in fact no data or column headings are present.

References

For additional information, click the following article number to view the article in the Microsoft Knowledge Base:
257819 HOWTO: Use ADO with Excel Data from Visual Basic or VBA
プロパティ

文書番号:318373 - 最終更新日: 2009/03/23 - リビジョン: 1

フィードバック