DataSet contains duplicate records based on the SQL outer join query

This article was previously published under Q317058
This article has been archived. It is offered "as is" and will no longer be updated.
This article refers to the following Microsoft .NET Framework Class Library namespaces:
  • System.Data
  • System.Data.SqlClient
If you call the Fill method of the DataAdapter object twice on a DataTable in a DataSet that is based on a SQL outer join statement, if you do not first clear the current contents of the DataTable, the DataTable may contain two copies of each record.
The records are duplicated because a DataTable that is based on a SQL outer join query does not contain primary key information to identify each record uniquely. Instead of merging with the existing records based on key information, a second set of exactly the same records is appended.
To resolve this problem, clear the DataTable before you reload the data. To do this, call the Clear method of the DataTable or the DataSet before you call the Fill method of the DataAdapter the second time.

This is the recommended method. This method is more efficient than if you require the DataAdapter to merge the new data with the existing data.
This behavior is by design.
A SQL outer join query requests records from two or more tables, even when some of the records in one table may have no matching records in the other table. For example, if you want a list of all of the customers in the Northwind sample database, including those who had never placed an order, you use the following query:
USE NorthwindSELECT Customers.CustomerID, Customers.ContactName, Orders.OrderID, Orders.OrderDateFROM Customers LEFT OUTER JOIN OrdersON Customers.CustomerID = Orders.CustomerIDORDER BY Customers.CustomerID				
In the case of a customer who had not placed any orders, the results appear as follows:


Because the Primary Key column of the Orders table (OrderID) may contain these NULL values, Microsoft SQL Server marks the column as nullable. However, because the DataSet does not accept a nullable Primary Key column, the DataSet does not import any primary key information with the table meta data. The lack of primary key information causes the records to be duplicated instead of merged.

Steps to reproduce behavior

  1. Start Visual Studio .NET or Visual Studio 2005, and then create a new Visual Basic Windows application.
  2. Add the following statements at the top of the code window for the default form:
    Imports System.DataImports System.Data.SqlClient					
  3. Drag a Button control from the toolbox onto the form.
  4. Add the following code in the Button1_Click event:
    Try   Dim cn As SqlConnection   Dim cmd As SqlCommand   Dim da As SqlDataAdapter   Dim ds As DataSet   Dim dt As DataTable   Dim strSelect As String   Dim duplicateRows As DataRow()   cn = New SqlConnection("Data Source=YourSQLServer;" & _     "Initial Catalog=Northwind;User ID=YourUser;Password=YourPassword;")   cmd = New SqlCommand()   cmd.CommandText = "SELECT Customers.CustomerID, " & _     "Customers.ContactName, Orders.OrderID, Orders.OrderDate " & _     "FROM Customers LEFT OUTER JOIN Orders " & _     "ON Customers.CustomerID = Orders.CustomerID " & _     "ORDER BY Customers.CustomerID"   Console.WriteLine(cmd.CommandText)   cmd.Connection = cn   da = New SqlDataAdapter(cmd)   ds = New DataSet()   da.Fill(ds, "CustomerOrders")   dt = ds.Tables("CustomerOrders")   Console.WriteLine("Row count after first Fill: " & _     dt.Rows.Count.ToString)   strSelect = "CustomerID='ALFKI' AND OrderID=10643"   duplicateRows = dt.Select(strSelect)   Console.WriteLine("Rows for ALFKI order 10643: " & _      duplicateRows.Length)   da.Fill(ds, "CustomerOrders")   Console.WriteLine("Row count after second Fill: " & _      dt.Rows.Count.ToString)   duplicateRows = dt.Select(strSelect)   Console.WriteLine("Rows for ALFKI order 10643: " & _      duplicateRows.Length)   Catch exc As Exception     MessageBox.Show(exc.Message)   End Try					
  5. Modify the connection string in the declaration for cn as necessary for your environment.
  6. Run the project. Review the results in the Output window. Notice that, after the second execution of the Fill method of the DataAdapter, the "CustomersOrders" DataTable contains duplicate records.

Article ID: 317058 - Last Review: 02/27/2014 21:11:29 - Revision: 4.5

  • Microsoft ADO.NET 1.0
  • Microsoft ADO.NET 1.1
  • Microsoft ADO.NET 2.0
  • Microsoft Visual Basic .NET 2002 Standard Edition
  • Microsoft Visual Basic .NET 2003 Standard Edition
  • Microsoft Visual Basic 2005
  • kbnosurvey kbarchive kbtshoot kbdataadapter kbprb kbsqlclient kbsystemdata KB317058