Article ID: 242377 - Last Review: February 5, 2002 - Revision: 1.1 INF: How to Use Data Transformation Services (DTS)
This article was previously published under Q242377 On This PageSUMMARY
Microsoft SQL Server 7.0 shipped with a feature known as Data Transformation Services (DTS). This article provides some general information regarding DTS.
MORE INFORMATIONWhat is Data Transformation Services?DTS is a high-end, sophisticated, COM-based solution for a wide range of data exchange. COM makes the perfect implementation because it is first and foremost a separation of interface and implementation.DTS is a way to move data from one OLE DB data source to another OLE DB data source. This can be from Microsoft Excel to Microsoft SQL Server, SQL Server to Microsoft Access, Microsoft Excel to Microsoft Access and so forth. The design is such that it is solely based on the OLE DB specification for the source and destination. DTS can exchange data with, but is not limited to:
What Data Transformation Services is NotDTS does not provide generic Object Transfer Manager features (the creation of indexes, referential integrity constraints, defaults, and so forth). DTS is designed only to manipulate and move data. Schema definition and movement is not part of the specification or design of the product. For those activities you should use the Transfer Manager and appropriate upgrade wizards (when you work with SQL Server databases), or generate and execute scripts for indexes, constraints, defaults, and so forth (when you work with non-SQL Server databases) after the data has been loaded.Data Transformation Services Rules of Thumb
Use Fewer TransformationsWhen you copy data, you can move each column in its own transformation, or you can copy many columns in a single transformation. Transformations are serialized, so fewer transformations may speed up the copy if you have a lot of columns. As a rule of thumb, use a single transformation if you have more than 20 columns (you can see up to a 5% improvement). If you have 20 or fewer columns you will not see much difference, so do not bother with fewer transforms. It is easier to see what's happening in the user interface with multiple transforms.Use Ordinals Instead of Names in ActiveX ScriptsSometimes, if you use ordinals instead of names, you may see a small performance improvement, especially on scripts with many columns. For example:
DTSDestination(1) = DTSSource(1)
A performance improvement is not always the case and it depends on where the other bottlenecks in your transformation are. In general, the performance gain is 5% to 10% and this may not be enough to compensate for the loss of readability of the script.Use SQL Instead of ActiveX and Use Fewer ActiveX ScriptsTake advantage of SQL Server's Query Processor. If you can perform an operation in the SELECT statement that is the source SQL statement for a DTS pump, then do that instead of using an ActiveX script. To call an ActiveX script may cut performance by 2 to 4 times over a straight DTS copy.Combine ActiveX Scripts And Save TimeEach call to the ActiveX scripting engine generates the majority of the overhead for the transformation. So, if you need to use an ActiveX script to perform data transformations on several columns, use a single script for all the columns to limit the number of calls to the scripting engine. If you have to use an ActiveX script, since the majority of overhead is in the call to the scripting engine and not in the calculations that appear in the script, do what you need to do to transform your data and do not worry about the cost of performing complex calculations in the script.Scripting LanguagesVBScript is faster than JScript, which is faster than PerlScript. Each incurs about a 10% reduction in speed. PerlScript does not ship with SQL Server but is available from ActiveState.com.Other scripting languages can be used if they are installed; however, they are not tested, the functions will not show up in the scripting dialog, and the main function is not autogenerated. File Import/ExportThe fastest way to import data from a file into SQL Server is to use the Transact-SQL BULK INSERT statement or the DTS Bulk Insert task. The DTS Bulk Insert task is a user-interface that creates the SQL statement for you. This technique is new to SQL Server 7.0 and is not the same as BCP or DTS. It tends to be about twice as fast as either BCP or DTS because it is a server-side process and is executed entirely in the SQL Server process space. Bulk Insert can use existing BCP format files or handle some standard delimiters.There are certain limitations to the use of Bulk Insert:
Simple ExportThe fastest way to export to a text file in SQL Server 7.0 is to use BCP.exe. DTS is much slower than BCP when you export to a text file.Since you cannot perform data transformations with BCP, this may not be an option if you do not export the contents of a table. If you need transformation on export, use the DTS Data Pump task, although performance may be slower. SQL Server is the only allowed source (although multiple versions of SQL Server are supported). Import with TransformationsA bulk insert to a staging table followed by a DTS copy may be faster than importing and transforming all in one step with a single ActiveX transformation. Use SQL instead of ActiveX scripts even if it takes more steps. Using more steps is not always slower.Import of Files (or other data) to Non-SQL Server DestinationsUnless your third-party provider has a bulk insert program, your best bet is DTS. You can use either a DTS Data Pump copy task or Data-Driven Query (DDQ) task. Depending on the driver, DDQ may be marginally faster than the pump task (you'll have to try it yourself, though). The pump task is far easier to create and maintain. DDQ cannot export to a text file or to Microsoft Excel.SQL Server 6.5 and DTSUpgrade/Import to SQL Server 7.0Use the Upgrade wizard to bring data and schema from SQL Server 6.5 to SQL Server 7.0. There are several reasons for this:
ODBC or OLEDB?You can use the SQL Server 7.0 ODBC driver to talk to previous versions of SQL Server. This has some limitations, such as:
Load the Data WarehouseUse Pump Tasks for InsertsDo not use a DDQ task to insert records when the destination is SQL Server. Use DDQ for updates and other non-insert tasks. This is even true when you split records.Splitting RecordsSuppose you want to send data from a single staging table to both a dimension table and a fact table. You might think it would be more efficient to eliminate multiple reads on the source and do this in a single task such as a DDQ task. This is possible, but typically slow.A faster method is to use two data pump tasks. Even though you are doing two source reads, you make up for it by being able to do fast load on the destination. Dealing with a Surrogate KeyA common special case of loading two tables at once is when the dimension table has a surrogate key like an identity column. You then need the ID of each dimension record when you load the fact table. You can use the same two methods that were mentioned in the Splitting Records section.You can use a single DDQ task to create both records at once (this won't do a bulk load and is best when the destination is not SQL Server). Here's an example that uses a stored procedure to INSERT into a table and perform an UPDATE if the insert fails: Update the Data WarehouseCreate Staging TablesStaging tables are very useful for several reasons:
Select Only the Records Needed to Add or UpdateDo not attempt to copy every record, relying on a script or a primary key to reject records that should not be copied. This takes much longer than selecting only the records you need to copy (see the "Creating Surrogate Keys" section for information on how to do this).As a side benefit, selecting only changed records makes packages restartable. If not all the records were copied, restart the package and it picks up where it left off. This is the fastest form of row level restartability. Create Surrogate Keys for Data Comparison (Change Capture)When you check to see what source records do not exist on the destination, you have several options (listed best to worst):
REFERENCESFor additional information about DTS, click the article numbers below to view the articles in the Microsoft Knowledge Base:242391
(http://support.microsoft.com/kb/242391/EN-US/
)
INF: DTS Package Development, Deployment and Performance
243775
(http://support.microsoft.com/kb/243775/EN-US/
)
INF: How to Use Lookups in Data Transformation Services
242543
(http://support.microsoft.com/kb/242543/EN-US/
)
INF: DTS Row Level Restartability After an Unexpected Failure
| Other Resources Other Support Sites
CommunityGet Help NowArticle Translations
|






Windows Live
Facebook
Twitter
Linkedin
Digg it
Yahoo
Delicious
StumbleUpon
Yammer
Reddit
Technorati
FriendFeed
Email
Back to the top
