This step-by-step article describes how to implement and
how to use a DataSetHelper class that includes sample code to create a DataTable object that contains aggregate data from another DataTable object.
To do this, use the following public
methods:
CreateGroupByTable
InsertGroupByInto
SelectGroupByInto
The SelectGroupByInto method calls the CreateGroupByTable and the InsertGroupByInto methods. You can also use private helper methods and data members
to store the parsed field list, and to handle NULL values correctly when the
aggregate values are calculated.
The DataSetHelper class includes a DataSet member variable. Optionally, you can assign an existing DataSet. If the member variable points to a valid DataSet, any DataTable objects that the CreateGroupByTable or the SelectGroupByInto method creates are added to the DataSet. In either case, the method call returns a reference to the DataTable object. The InsertGroupByInto method requires an existing target DataTable, and does not return anything.
For additional information about DataSet objects, click the following article number to view the article in the Microsoft Knowledge Base:
313485
(http://support.microsoft.com/kb/313485/EN-US/
)
INFO: Roadmap for ADO.NET DataSet, DataView, and DataViewManager
The code in this section declares the shell class that all DataSetHelper articles add methods and member variables to.
Start Visual Studio .NET.
On the File menu, point to New, and then click Project.
In the New Project dialog box, click Visual C# Projects under Project Types, and then click Class Library under Templates.
In the Name text box, type DataSetHelper. Click
OK.
In the code, at the top of the code window, add the
following line:
using System.Data;
Replace the class code with the following code:
public class DataSetHelper
{
public DataSet ds;
public DataSetHelper(ref DataSet DataSet)
{
ds = DataSet;
}
public DataSetHelper()
{
ds = null;
}
}
The two overloads for the constructor allow you to create an instance
of the class with or without a reference to a valid DataSet. For a class that contains a reference to a valid DataSet, the DataTable objects that the methods return are also added automatically to the DataSet.
This section contains the code for two field list parsers: one for the field list, and one for the fields to group by. The parsed
structure is used so that the CreateGroupByTable and InsertGroupByInto methods do not have to reparse the field list. These methods must reparse the field list if they are called from the SelectGroupByInto method or from your own code. The parsed field list and the unparsed field list
are stored in Private member variables of the DataSetHelper class.
Add the following Private class definition in the DataSetHelper class that you created in the "" section:
private class FieldInfo
{
public string RelationName;
public string FieldName; //source table field name
public string FieldAlias; //destination table field name
public string Aggregate;
}
Note This class is common to other DataSetHelper articles, and contains some fields that this article does not use.
Add the following Private member variables to the class definition immediately after the DataSet declaration:
Add the following Private method to the DataSetHelper class definition. This method is the same as the method that is used in other DataSetHelper articles and is used to parse the list of fields to group by.
private void ParseFieldList(string FieldList, bool AllowRelation)
{
/*
* This code parses FieldList into FieldInfo objects and then
* adds them to the m_FieldInfo private member
*
* FieldList systax: [relationname.]fieldname[ alias], ...
*/
if (m_FieldList == FieldList) return;
m_FieldInfo = new System.Collections.ArrayList();
m_FieldList = FieldList;
FieldInfo Field; string[] FieldParts; string[] Fields=FieldList.Split(',');
int i;
for (i=0; i<=Fields.Length-1; i++)
{
Field=new FieldInfo();
//parse FieldAlias
FieldParts = Fields[i].Trim().Split(' ');
switch (FieldParts.Length)
{
case 1:
//to be set at the end of the loop
break;
case 2:
Field.FieldAlias=FieldParts[1];
break;
default:
throw new Exception("Too many spaces in field definition: '" + Fields[i] + "'.");
}
//parse FieldName and RelationName
FieldParts = FieldParts[0].Split('.');
switch (FieldParts.Length)
{
case 1:
Field.FieldName=FieldParts[0];
break;
case 2:
if (AllowRelation==false)
throw new Exception("Relation specifiers not permitted in field list: '" + Fields[i] + "'.");
Field.RelationName = FieldParts[0].Trim();
Field.FieldName=FieldParts[1].Trim();
break;
default:
throw new Exception("Invalid field definition: " + Fields[i] + "'.");
}
if (Field.FieldAlias==null)
Field.FieldAlias = Field.FieldName;
m_FieldInfo.Add (Field);
}
}
Add the following Private method to the class definition. This method is used to parse the field
list, including aggregate functions.
private void ParseGroupByFieldList(string FieldList)
{
/*
* Parses FieldList into FieldInfo objects and adds them to the GroupByFieldInfo private member
*
* FieldList syntax: fieldname[ alias]|operatorname(fieldname)[ alias],...
*
* Supported Operators: count,sum,max,min,first,last
*/
if (GroupByFieldList == FieldList) return;
GroupByFieldInfo = new System.Collections.ArrayList();
FieldInfo Field; string[] FieldParts; string[] Fields = FieldList.Split(',');
for (int i=0; i<=Fields.Length-1;i++)
{
Field = new FieldInfo();
//Parse FieldAlias
FieldParts = Fields[i].Trim().Split(' ');
switch (FieldParts.Length)
{
case 1:
//to be set at the end of the loop
break;
case 2:
Field.FieldAlias = FieldParts[1];
break;
default:
throw new ArgumentException("Too many spaces in field definition: '" + Fields[i] + "'.");
}
//Parse FieldName and Aggregate
FieldParts = FieldParts[0].Split('(');
switch (FieldParts.Length)
{
case 1:
Field.FieldName = FieldParts[0];
break;
case 2:
Field.Aggregate = FieldParts[0].Trim().ToLower(); //we're doing a case-sensitive comparison later
Field.FieldName = FieldParts[1].Trim(' ', ')');
break;
default:
throw new ArgumentException("Invalid field definition: '" + Fields[i] + "'.");
}
if (Field.FieldAlias==null)
{
if (Field.Aggregate==null)
Field.FieldAlias=Field.FieldName;
else
Field.FieldAlias = Field.Aggregate + "of" + Field.FieldName;
}
GroupByFieldInfo.Add(Field);
}
GroupByFieldList = FieldList;
}
This call sample creates a new DataTable with a TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). The four fields have the same data type as the EmployeeID and the Amount
fields in the Orders table.
Use the following syntax to specify
fields in the field list:
The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable.
You can rename a field in the destination DataTable by specifying an alias name.
The field list can contain a subset of field names that are
listed in a different order than in the source DataTable. If the field list is blank, an exception is thrown.
Relation specifiers are not supported as part of the field
name. All fields must come from the same DataTable.
To call the CreateGroupByTable method, add the following method to the DataSetHelper class that you created in the "" section:
public DataTable CreateGroupByTable(string TableName, DataTable SourceTable, string FieldList)
{
/*
* Creates a table based on aggregates of fields of another table
*
* RowFilter affects rows before GroupBy operation. No "Having" support
* though this can be emulated by subsequent filtering of the table that results
*
* FieldList syntax: fieldname[ alias]|aggregatefunction(fieldname)[ alias], ...
*/
if (FieldList == null)
{
throw new ArgumentException("You must specify at least one field in the field list.");
//return CreateTable(TableName, SourceTable);
}
else
{
DataTable dt = new DataTable(TableName);
ParseGroupByFieldList(FieldList);
foreach (FieldInfo Field in GroupByFieldInfo)
{
DataColumn dc = SourceTable.Columns[Field.FieldName];
if (Field.Aggregate==null)
dt.Columns.Add(Field.FieldAlias, dc.DataType, dc.Expression);
else
dt.Columns.Add(Field.FieldAlias, dc.DataType);
}
if (ds != null)
ds.Tables.Add(dt);
return dt;
}
}
This section contains code for the InsertGroupByInto method.
The results are sorted on the fields that are listed in the GroupBy argument. The GroupBy argument must comply with a valid Sort field list (minus
ASC and DESC modifiers). If the GroupBy argument is blank, the target DataTable contains only a single record that aggregates all the input.
When you call the ParseGroupByFieldList and the ParseFieldList properties, you can parse lists that were previously parsed, if
these lists are available. If the field list is blank, an exception is thrown.
This is the calling convention for the InsertGroupByInto method:
This call sample reads records from the DataTable that is named Orders and writes records to the DataTable that is named OrderSummary. The OrderSummary DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the resultant DataTable.
To call the InsertGroupByInto method, add the following method to the DataSetHelper class that you created in the "" section:
public void InsertGroupByInto(DataTable DestTable, DataTable SourceTable, string FieldList,
string RowFilter, string GroupBy)
{
/*
* Copies the selected rows and columns from SourceTable and inserts them into DestTable
* FieldList has same format as CreateGroupByTable
*/
if (FieldList == null)
throw new ArgumentException("You must specify at least one field in the field list.");
ParseGroupByFieldList(FieldList); //parse field list
ParseFieldList(GroupBy,false); //parse field names to Group By into an arraylist
DataRow[] Rows = SourceTable.Select(RowFilter, GroupBy);
DataRow LastSourceRow = null, DestRow = null; bool SameRow; int RowCount=0;
foreach(DataRow SourceRow in Rows)
{
SameRow=false;
if (LastSourceRow!=null)
{
SameRow=true;
foreach(FieldInfo Field in m_FieldInfo)
{
if (!ColumnEqual(LastSourceRow[Field.FieldName], SourceRow[Field.FieldName]))
{
SameRow=false;
break;
}
}
if (!SameRow)
DestTable.Rows.Add(DestRow);
}
if (!SameRow)
{
DestRow = DestTable.NewRow();
RowCount=0;
}
RowCount+=1;
foreach(FieldInfo Field in GroupByFieldInfo)
{
switch(Field.Aggregate) //this test is case-sensitive
{
case null: //implicit last
case "": //implicit last
case "last":
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
break;
case "first":
if (RowCount==1)
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
break;
case "count":
DestRow[Field.FieldAlias]=RowCount;
break;
case "sum":
DestRow[Field.FieldAlias]=Add(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
case "max":
DestRow[Field.FieldAlias]=Max(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
case "min":
if (RowCount==1)
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
else
DestRow[Field.FieldAlias]=Min(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
}
}
LastSourceRow = SourceRow;
}
if(DestRow!=null)
DestTable.Rows.Add(DestRow);
}
private FieldInfo LocateFieldInfoByName(System.Collections.ArrayList FieldList, string Name)
{
//Looks up a FieldInfo record based on FieldName
foreach (FieldInfo Field in FieldList)
{
if (Field.FieldName==Name)
return Field;
}
return null;
}
private bool ColumnEqual(object a, object b)
{
/*
* Compares two values to see if they are equal. Also compares DBNULL.Value.
*
* Note: If your DataTable contains object fields, you must extend this
* function to handle them in a meaningful way if you intend to group on them.
*/
if ((a is DBNull) && (b is DBNull))
return true; //both are null
if ((a is DBNull) || (b is DBNull))
return false; //only one is null
return (a==b); //value type standard comparison
}
private object Min(object a, object b)
{
//Returns MIN of two values - DBNull is less than all others
if ((a is DBNull) || (b is DBNull))
return DBNull.Value;
if (((IComparable)a).CompareTo(b)==-1)
return a;
else
return b;
}
private object Max(object a, object b)
{
//Returns Max of two values - DBNull is less than all others
if (a is DBNull)
return b;
if (b is DBNull)
return a;
if (((IComparable)a).CompareTo(b)==1)
return a;
else
return b;
}
private object Add(object a, object b)
{
//Adds two values - if one is DBNull, then returns the other
if (a is DBNull)
return b;
if (b is DBNull)
return a;
return ((decimal)a + (decimal)b);
}
This section contains the code for the SelectGroupByInto method. This method is a combination of the CreateGroupByTable and the InsertGroupByInto methods. The SelectGroupByInto method creates a new DataTable based on existing DataTable objects, and copies the records that are sorted and filtered to
the new DataTable.
The following is the calling convention for the SelectGroupByInto method:
This call sample creates a new DataTable with a TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). These four fields have the same data type as the EmployeeID and the
Amount fields in the Orders table. Then this sample reads records from the
Orders DataTable, and writes records to the OrderSummary DataTable. The OrderSummary DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID. If the GroupBy argument is blank, the target DataTable contains only a single record that aggregates all the
input.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the DataTable that results.
To call the SelectGroupByInto method, add the following method to the DataSetHelper class that you created in the "" section:
public DataTable SelectGroupByInto(string TableName, DataTable SourceTable, string FieldList,
string RowFilter, string GroupBy)
{
/*
* Selects data from one DataTable to another and performs various aggregate functions
* along the way. See InsertGroupByInto and ParseGroupByFieldList for supported aggregate functions.
*/
DataTable dt = CreateGroupByTable(TableName, SourceTable, FieldList);
InsertGroupByInto(dt, SourceTable, FieldList, RowFilter, GroupBy);
return dt;
}
Save, and then compile the DataSetHelper class that you created in the previous sections.
Follow these steps to create a new Visual C# Windows
Application:
Start Visual Studio .NET.
On the File menu, point to New, and then click Project.
In the New Project dialog box, click Visual C# Projects under Project Types, and then click Windows Application under Templates.
In Solution Explorer, right-click the solution, and
then click Add Existing Project. Add the DataSetHelper project.
On the Project menu, click Add Reference.
In the Add Reference dialog box, click the Projects tab, and then add a reference to the DataSetHelper project to the
Windows Form application.
In the form designer, drag three Button controls and a DataGrid control from the toolbox to the form. Name the buttons
btnCreateGroupBy,
btnInsertGroupByInto, and
btnSelectGroupByInto. Keep the default name for the DataGrid control (DataGrid1).
In the form code, add the following using statement to the top of the code window:
using System.Data;
Add the following variable declarations to the form
definition:
Run the application, and then click each of the buttons. Notice that the DataGrid is populated with the tables and data from the code. Note You can only click the btnCreateGroupBy and the btnSelectGroupByInto buttons one time. If you click either of these buttons more than
one time, you receive an error message that you are trying to add the same
table two times. Additionally, you must click btnCreateGroupBy before you click btnInsertGroupByInto; otherwise, the destination DataTable is not created. If you click the btnInsertGroupByInto button multiple times, you populate the DataGrid with duplicate records.
The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable. You can extend the CreateTable method to copy additional properties, such as the MaxLength property, or you can create new key columns.
You can expand the functionality so that you can specify
fields from related DataTable objects.
You can expand the functionality by adding support for
additional aggregate functions, such as the AVG or the STDEV function. If you do this, you must have dynamically created
variables to track intermediate values. To track intermediate values, you can
add member variables to the FieldInfo class, and then use those member variables to store intermediate
values.
The fieldname and the alias parts of the field list must
comply with DataColumn naming conventions. The parser also restricts the names, in that
the name must not contain a period (.), a comma (,), or a space ( ).
If you click a button more than one time, the same table is
added two times to the DataSet. This results in an exception. To work around this problem, you
can add code to the test application to check whether a DataTable of the same name already exists. Alternatively, you can create
the DataSetHelper class without a reference to a DataSet, and then bind the DataGrid.DataSource property directly to the dt variable instead of by using the SetDataBinding method call.
If the source table uses custom data types (for example, a
class), you must add code to the InsertGroupByInto method to perform a deep copy of the data. Otherwise, only a
reference is copied.