|
Provide Feedback on this Broadcast
Microsoft Support WebCast
Host Integration Server 2004 Data Providers for DB2 and Host Files
June 22, 2004
Note This document is based on the original spoken Support WebCast transcript. It has been edited for clarity.
Paul Larsen: Welcome, everyone. Jeff and I are extremely pleased to be here today talking to you about new technologies that we find very exciting in Microsoft® Host Integration Server 2004. Jeff and I work in the product team delivering this technology to our customers such as yourselves. We've been working on this release for about two and a half years now, and we're really pleased that it's nearing the end of its release cycle and will be available to you shortly.
(Slide 2) Let's look at our objectives today. From a high level, we want to give you a survey of the data access technologies that are available in Microsoft Host Integration Server 2004. While doing this, we want to give you an understanding of the various data access architectures upon which we build these technologies, the components of these architectures, and the different roles that the components play.
Finally, we want to wrap up the session with some basic problem resolution by looking at the tools and techniques that we offer to our Microsoft Product Support Services (PSS) Support Professionals our support professionals within our enterprise customer sites.
(Slide 3) Today's agenda takes us through an overview of the data integration approach to IBM platform integration. When I say IBM platform integration, primarily what I mean is enabling Microsoft Windows® desktops and Windows servers to connect to, access, and integrate vital data stored in heritage — we don't say legacy here at Microsoft — heritage data sources, such as IBM DB2 databases on the mainframe z/OS platform, midrange iSeries platform, and other open platforms such as Windows, UNIX, AIX, HP-UX, etc.
At the same time, we're talking about vital data that might be stored in IMS DB, through which you access the data through DB2 stored procedures on the mainframe, as well as the native file system on the mainframe and midrange systems. On the mainframe we're talking about Virtual Storage Access Method (VSAM), Sequential Access Method (SAM), Partitioned Data Set (PDS), data sets and members, and on the midrange system we're talking about those physical files and logical files over physical files.
Next, we'll be talking specifically about DB2 data access. We have a number of technologies that enable you to very efficiently access and integrate DB2 data in new solutions developed for the Microsoft Windows platform. Then we'll segue over to host record file access, accessing those data sets and members on the native file systems. As I noted before, we want to wrap up this session with problem resolution.
(Slide 4) In Host Integration Server 2004, there are many layers of technologies or what we call approaches to integration. At the very base level is what we call network integration. The fact is that the IBM mainframe and midrange systems run on proprietary Systems Network Architecture networking protocols, and you just can't connect these boxes to Windows boxes without some sort of protocol conversion. Yes, there is modern connectivity to the mainframe in Application System/400 (AS/400) through open system TCP/IP, but it doesn't offer support for all of the rich System Network Architecture (SNA) application programming interfaces or technologies that we like to offer our customers.
There are also other layers of technologies or approaches to integration. Moving a little higher up the stack, there's what we call security integration. Primarily what that means is the ability to map Windows accounts as they are stored in Active Directory to IBM Resource Access Control Facility (RACF), Access Control Facility 2 (ACF/2), Top Secret accounts on the mainframe, or the integrated AS/400 security system. We do that through our enterprise single sign-on technology. We have a WebCast that's going to be offered in upcoming weeks specifically on this technology.
Enterprise single sign-on enables the Windows developer to map Windows IDs to foreign credentials to provide for seamless logon to the application and end user. It also provides the framework for things like password synchronization, taking password changes from the host, bringing them down to Windows, taking password changes from Windows, and mapping them up to the host.
Moving a little higher up the stack, we have three areas of technology or approaches to integration that we call our enterprise application integration technologies. These include messaging integration, data integration, and application integration. What we're talking about today is our data access and integration technologies.
The reason Microsoft offers a data integration approach to IBM platform interoperability is because we believe, and based on customer feedback we know, that this provides our enterprise developers with the most rapid deployment of services-oriented architecture (SOA) and Web access solutions. If you want to get at the vital data that's stored in these legacy — excuse me — heritage data sources as quickly as possible, you want to take a data access approach.
Also, we're seeing quite a trend in our enterprise accounts for customers that want to move portions of their data sets and DB2 instance data down into a Microsoft SQL Server™ data warehouse, and they're doing it for three reasons. One, it's very easy to write new Web applications against SQL Server. All the Microsoft Visual Studio® tools and sample applications and documentation show you how to do that. Also, we find that there's a significant trend for customers that want to create a SQL Server data warehouse to use the rich analysis and reporting tools in SQL Server 2000, to really get at the information that's in that legacy data, and analyze it and report it to the business decision makers within the organization.
In a data integration approach, what are the technologies? Fundamentally, what Jeff and I do, and what other members of our team do, is map IBM data access architectures to Microsoft data access architectures. On the IBM side, there is something called Distributed Data Management (DDM). That's the overall architecture for local and remote access to information wherever it resides. It's been around for about 25 years, and it's embedded into the mainframe and midrange platforms as well as the DB2 servers.
DDM, as it's called, has three protocols. We're going to talk about two of those today. The first, distributed relational database architecture, DRDA or Dr. DA - lots of acronyms - enables access to DB2 from remote clients and servers. The DRDA protocol is owned by an open standards groups called The Open Group. You might know this group. They also own Bluetooth and other industry technologies.
Jeff and I sit on the standards group for DRDA and DDM, and help IBM and other independent software vendors (ISVs) move this architecture forward, introducing new technologies and features that eventually make their way into IBM DB2 servers, into the native host file systems, as well as into the data technologies that we're talking about today for Microsoft.
Another protocol that we'll be looking at today in DDM is called record-level input/output, and that's what we use to access the native file system. So that's IBM or the open platforms world.
On the Microsoft side, we have industry-standard Open Database Connectivity or ODBC. We also have component object model (COM)-based OLE Database (OLE DB). You might know that as Active data object (ADO). That's the high-level interface to OLE DB. Then more recently, you probably know about Microsoft .NET Framework in Microsoft Visual Studio .NET, which enables access to local and remote heritage data stores through the System.Data framework for ADO.NET. ADO.NET was devised to allow for a more connectionless or disconnected way of accessing data, to enable interoperability with Extensible Markup Language (XML), and to very efficiently move data between tiers - between a server tier that might be accessing a remote data source or a local SQL Server data source, as well as between client tiers. That might be Web browsers, Windows Forms, or Web Forms applications.
Other technologies that we'll be talking about today are the data providers for DB2 and host files — those are the primary deliverables from the Host Integration Server team — as well as the various consumer applications that are available within Microsoft: Microsoft Office applications and SQL Server applications.
(Slide 5) Let's take a look at, from a high level, the data providers that are in Host Integration Server 2004. If you're an enterprise developer, an architect, or a technology decision maker, and you're trying to figure out how you can efficiently move your enterprise towards a services-oriented architecture, or very efficiently allow a business decision maker who is using, say, Office tools such as Microsoft Excel or Microsoft Access, to access and integrate vital data with their daily operations, you'll want to make use of one of these data providers in Host Integration Server 2004. They fall within two categories. The first category is data providers for DB2 databases: our new Managed Provider for DB2, our updated OLE DB provider for DB2, and our ODBC driver for DB2.
Second, for accessing host file systems, we offer an OLE DB provider for the native AS/400 and mainframe VSAM file systems. There is also a very efficient component object model or ActiveX® control for host file transfer that will allow you to move data sets in bulk from the host down to the PC or from the PC up to the host.
In all of these data providers, we offer some very common infrastructure. Of course, we support the Windows data access architectures and the open DDM data access architectures on these legacy platforms. We also support things like connection startup and tear down, session pooling where appropriate, and single sign-on to access the foreign system with Windows credentials. We also support things like data conversion. So when we access DB2 on the mainframe from a Windows data provider, we access the host as if we're a peer application that talks to the DB2 database, sending the DB2 database native mainframe machine data types, Extended Binary Coded Decimal Interchange Code (EBCDIC) encoded strings.
What that means is that the DB2 database, the native file systems on these foreign platforms, they don't have to do non-native data conversion. That really allows the mainframe to do what it's designed to do: support high levels of transaction-processing programs, batch programs, and not have to worry about supporting data conversion of Windows machine types. That really allows the host system to operate efficiently and offloads the data conversion down to the Windows server and client desktops.
(Slide 6) Let's look at an architectural depiction of our data providers for DB2. On the left side, we have a silo that represents ODBC, COM+, ADO, and OLE DB, and on the right side we have a silo that represents the managed ADO.NET world. In between, we have three call-outs that talk about the different roles that one can play in these data access architectures.
At the highest level of these silos or the stacks are the data consumers. On the left side we have ODBC-enabled Office applications such as Excel and Access. We have COM+-enabled applications such as SQL Server consumers, Microsoft BizTalk® as a consumer of our data sources, and also custom applications written to ADO and OLE DB.
On the right side, again, we have custom applications such as Microsoft InfoPath™, which can consume ADO.NET, and also custom applications written in Visual Studio .NET consuming ADO.NET. Also, SQL Server custom applications and BizTalk server applications can consume the managed data space.
At the bottom, we have data providers. We have our ODBC and OLE DB providers in the COM+ and ODBC world, and on the right side we have our Managed Data Provider for DB2. These data providers sit right on top of a Microsoft-developed DRDA application requester or DB2 network client, which ships in Host Integration Server 2004.
That network client in turn communicates to the local Host Integration Server SNA WINAPPC library or TCP/IP Windows sockets. The message here is that all of the technology that we're talking about today is developed and engineered within Microsoft. It's supported by Microsoft, and it's tested against the Microsoft tools and consumers, so that you get a reliable solution that we believe scales and performs up to our customers' expectations.
In the middle of this technology stack, we have these middleware components that are called service components. They offer services such as session pooling or connection pooling in ODBC, client cursor engines, and transaction enlistment for two-phase or reliable transactional updates, one of the solutions we'll be talking about today.
(Slide 7) Why don't we jump into the product itself, spend a little less time on the slides and give you a demo? At this point I want to talk about where you begin as a developer or an evaluator of this technology.
(Demo1) This server computer is connected across an Ethernet LAN to an IBM MP3000 mainframe on which we're running a number of mainframe z/OS 1.4 logical partitions (LPARs). I'm connecting up to an IBM DB2 Version 7 (V7) database, and I'm using a new technology in Host Integration Server 2004, which allows for SNA over IP connectivity.
Here I have my SNA Manager, and what I want to do is bring up a little 3270 client for a second and show you my mainframe database that we'll be connecting to using our Data Access Tool. At the same time, let me start up one of the key tools in Host Integration Server 2004 for the data access developer; it's called our Data Access Tool. What it allows you to do is very easily create and manage data source definitions, the information needed for our data providers to connect up to and integrate data, access data, in foreign databases and file systems.
You'll see that the Data Access Tool lists a number of folders. It allows you to view and manage your ODBC data sources as well as your OLE DB data sources to access DB2, and OLE DB data sources to access the native file system.
It also has something called host column descriptions, and these are used by our data access providers for the host file system. Mainframe data sets and many of the physical files on AS/400 are what we call program-described files. Only the owning host application knows what the structure of these data sets are, whether they include one column or multiple columns, what the data types for the columns are, and how many keys or what the data types are for the keys to access these key sequence data sets or key physical files on the 400.
What we offer in terms of the Data Access Tool is a little utility to create a PC-sized metadata map, mapping host data types. You can see here a variable-length string, where we do EBCDIC to ANSI conversion, and then the local data types, in this case OLE DB data types. We allow the Windows developer to access the native file system by using a local map that maps host types to PC types, and then we handle the data conversion on behalf of the developer.
Jeff Derstadt: How would I actually create a data source? How would I start out?
Paul: There are a couple of ways to start out creating a data source. Thanks, Jeff, for that leading question. One method of creating a DB2 data source is to use one of our new tools in the Data Access Tool called Import. It allows you to import an existing DB2 Connect data definition. IBM DB2 Connect is very popular for connecting the DB2. What I have done is I've taken my DB2 Connect instance, and I exported out all of the data source definitions into a file.
What I want to do is open that file up using our Import DB2 Connect File Wizard. It will parse that file, and it will pull from DB2 Connect different data source definitions. I'm going to pick the first one, which allows for a TCP/IP connection to a DB2 400 database, and from that I'm going to create a universal data link (UDL). OLE DB offers us a data source definition.
If we refresh this folder here, we can see that it's created this data source definition here. We can edit that data source and see that it's pre-populated with data source information that it pulled from DB2 Connect, including where DB2 Connect was storing a DB2 object called Packages. Packages are used in DB2 to hold certain object references. For example, our data providers for DB2 offer a server cursor, so you can open up a DB2 table, navigate to a record, lock the record, read the record, update the record, close the record, and navigate off of it. It's very useful for data entry applications. What we need to do is create what are called DB2 packages, and we do that by referencing a collection or a schema in DB2, where we will run a create package process.
Let's just quickly run through this. Let's see if this created a useful data source. What I want to do is make one change. What we'll see is it indeed does allow you to connect using a DB2 Connect data link, and I can save that off and use it for later. That's one way to create data sources. So if you're an existing IBM DB2 Connect customer, you can import your DB2 Connect data source definitions and it will generate a new one for you.
However, let's walk through this process with another component of the Data Access Tool called the Data Source Wizard. The idea here is to provide for a very easy way to allow you to create a connection to a foreign DB2 database, and through these wizard panes to map the Windows terminology to the IBM terminology. It's the biggest problem our customers have, mapping the terminology.
Here we're going to create a connection to an IBM DB2 for Multiple Virtual Storage (MVS) database. We can connect to these different database platforms. By specifying the database platform, we can instruct the data provider as to whether or not to convert the Windows machine types to the foreign machine types. So if I select DB2/MVS here, we will convert Windows to mainframe machine types and offload the mainframe from doing that conversion.
An optional feature would be to select DB2/NT, and then we wouldn't do full data conversion; we'd actually offload from that conversion on the mainframe. If you have extra CPUs around out there for your host you can do that. It's highly unlikely, though; it's very expensive to do that conversion on the host.
I'm going to pick an SNA connection to the mainframe. Here the Data Source Wizard is picking up from my underlying Host Integration Server client or server configuration, the SNA addressing that my HIS administrator or network administrator has already configured for my computer. What I need to do is pick a local SNA address or a remote SNA address, and then the mode for that SNA connectivity.
Also, here you'll see that we offer optional security types for Advanced Program to Program Communication (APPC) security, and then there's a check box for distributed transactions. Jeff will be talking more about that in a little bit, and that's our two-phase commit support.
The first major question that I have to ask is what is the initial catalog? What is the DB2 instance that I want to connect to? Here, if I go back to my 3270 screen, you'll see that I'm using an IBM tool called Query Management Facility (QMF), and I'm connecting to a Data Source Name1 (DSN1) instance, and this item right here is a DSN1D037. That is the catalog for the database instance name for what IBM calls the location, and that location is what we map to in the ODBC, OLE DB, and ADO.NET world initial catalog.
That's the hardest thing to come up with: what is the DSN that I want to connect to?
Jeff: If I was using an AS/400 or a midrange system, what would the name be?
Paul: If it was an AS/400 system, the initial catalog would map to the relational database name, and for every AS/400, there is one integrated DB2 database and only one. Out of the factory, it comes with a nice cryptic name, but most AS/400 administrators will map that to a more user-friendly name. We have a number of AS/400s in our environment and they're all named after birds. You'll see us connecting to Condor like we did before, and that's our relational database name on that platform. Our documentation and wizard will point you to the commands that you can use on the foreign system, to go and find out what the relational database name is on your DB2 400 box.
Another thing that I configure on this data source is my default schema, and that tells generic applications like data design tools in Visual Studio and SQL Server consumer applications what collection or group of objects, tables, views, and stored procedures to look at. I don't want to look at the entire database.
I have a Humongous Insurance demo collection, Hum Insurance. We also have another new feature called Default Qualifier. What that allows you to do is to pre-qualify unqualified object names. So a lot of the generic tools in Visual Studio use one-part naming conventions. However, in an IBM DB2 environment, typically objects are named with two parts: a collection or owner ID, as in the case of the mainframe, and then the actual table name.
I'm trying to remember, Jeff, when we use Default Qualifier on the mainframe, aren't we actually issuing a set statement at connect time?
Jeff: Yes. We use the SetCurrentSQLID command. So if I say select * from insurance table, what I'm actually getting is select * from Hum Ins.insurance table. So it allows you to pre-qualify that without actually having to specify that qualifier.
Paul: That's great. In a little bit we'll show you a table as to which DB2 platforms and versions support which of these optional features. Again, we're going to do the encoding of the strings on the PC side, offload the mainframe.
We now have the option of using an interactive sign-on, which I'll do right now. I can persist this user ID inside the file universal data link. There's a warning here that we don't encrypt this data today. It's not part of the Windows Data Access architecture to encrypt this. Probably, in most enterprises, based on customer feedback, you don't actually save this into your UDL.
We have our own flavor of connection pooling that we offer that's been optimized to work with our enterprise single sign-on. We'll offer read access to legacy DB2 mainframe environments, and then we have support for alternate Transaction Program (TP) names, which is pretty obscure and not used in many environments.
We can test the connection from the wizard. We can create the DB2 packages. Basically, you would do this once per DB2 instance. You would create the packages on a well known collection such as NULLID, or we offer up a naming convention called MSPKG, and your administrator would create these packages once, and all of your users, all of your developers would use the same collection of packages.
Then we can now, from the Data Source Wizard, actually sample the data and pull back what were the tables in that collection, and know indeed that this data source works. Use an X; give it a friendly name, and you can save out different types of data sources. I'll create an OLE DB UDL initialization string and finish that.
You can see here it displays that. From the context menu you can test the connection. You can create the packages. You can sample the query. You can display the connection string, which I find very useful. As a developer you can display the connection string, the initialization string, copy that into the clipboard, paste that right into your Visual Studio .NET application code, and know that you have a connection that will work at runtime.
That is our Data Access Tool, Data Source Wizard and editor for the host column description files that are used when accessing the host file system.
Let me show you one other really quick demo here. I'm going to edit this data source and walk through it to the logon portion. I'm going to choose this optional button here called Single sign-on. What that will enable me to do is, without passing the user ID and password, it allows me to connect to the mainframe. Let's try a connection. And we test, and the connection works.
That's because on this Host Integration Server 2004 box, I have installed our enterprise single sign-on, and stored in an encrypted SQL Server database are my host credentials mapped to my Windows user ID. At runtime, our data provider calls out to enterprise single sign-on and says, "I am Redman P. Larsen. Authenticate me to the mainframe. Give me my host password, username password." It gets back the host username password in a secure way, and a data provider passes it up to the mainframe. In this case, it's part of the SNA 5.0 or the attachment to the APPC connection.
Jeff: Does that work just over APPC, or can I use that over IP?
Paul: Good question, Jeff. It works over SNA and IP, which is really cool. That's new for this release. Before, it was just an SNA option.
Very quickly, let's jump through a couple of slides here and let's get back to the product again. (Slide 8) Let's look at the improvements that we made in our DB2 data access features for HIS 2004.
Basically we rewrote the DB2 network client, the DRDA application requester. That was one of Jeff's big projects in this last release. It's been rewritten to support existing IBM-supported DB2 platforms and versions, as well as the upcoming DB2 version such as DB2 version 8.0 on the mainframe, that now supports a new version of the wire-level protocol, DRDA version 3.0, so we can support new features that come out of the data access architecture in IBM.
While doing this work, we've improved our OLE DB provider, and really optimized the data conversions. In some cases, we had two or three data conversions occurring. We've got that down to a singular data conversion in almost all scenarios. We're really seeing better scalability and performance due to these optimizations.
We have the new Data Access Tool that we've seen. The improved network client can now support advanced functionality such as reliable transactions or two-phase commit (TPC) protocols over TCP/IP, as well as existing SNA support.
We support single and multiple result sets on calls to DB2 stored procedures. Without the DRDA application requester rewrite, could we support multiple result sets very easily, Jeff?
Jeff: No. I think HIS 2000; even in the first service pack you could only pull back one. So this is a really neat new feature that I think a lot of customers will want to take advantage of.
Paul: For our far Eastern locales, we now have full support for our double-byte character set (DBCS) and multi-byte character set (MBCS), as well as single-byte character set. Then for simplified Chinese locales, we support encoding the Unicode, which is used in our Managed Provider for DB2, as well as the big five Chinese code page conversion. And, as we noted, single sign-on over IP and SNA.
(Slide 9) Well, looking at a matrix of these new features, this shows you the DB2 platforms and versions that we do most of our testing against here at Microsoft, and that we certify full support for. Notice here at the bottom, SQL long identifiers. That's a new feature in DB2 version 8.0, something that we'll be working to certify with the availability of DB2 version 8.0 this fall.
We also have another little feature called Client Application Name. That is a feature we slipped in after our final beta and before our release candidate, based upon a critical request from one of our early adopters, a rapid deployment site. They really needed the ability to send up to the mainframe a custom application name so that they could identify specific applications in their DB2 mainframe logs. They do that for not only problem resolution, which we'll show you in a minute, but also for chargeback to specific applications and groups within their organization.
(Slide 10) In the new Managed Data Provider for DB2, the key thing here is that this data provider, which is written in Microsoft Visual C#®, runs directly on top of our new DB2 network client, our DRDA application requestor (AR). So it is a very efficient solution, unlike some of the competing solutions out there, where their Managed Providers run on top of their ODBC driver. Our Managed Provider is tightly coupled with our DB2 network client and runs right on top of Windows sockets, or right on top of Host Integration Server's SNA connectivity.
It offers up very common object interfaces that would be familiar to you if you've used the Microsoft SQL client or if you've used the OLE DB data provider, such as connection and properties where you can set individual connection attributes that we saw on the Data Access Tool. Or you can load in an initialization string like I copied from the Data Access Tool. It supports commands for issuing select statements and calling DB2 stored procedures: flat, select, insert, update, and delete, as well as parameterized commands.
It has a data adapter and data reader for very efficiently pulling data back from the DB2 database and sending updates back. Then it supports DB2 transactions. What's interesting is that we support both remote unit of work and distributed unit of work in the Managed Provider. So we will support two-phase commit, and we do that through the enterprise services COM+ interoperability down to the Microsoft Distributed Transaction Coordinator (DTC) layer. Jeff will talk a little bit more about the architecture of our two-phase commit.
Of course, for the developer, the Managed Provider offers much better scalability and performance than, say, interoperating down to our OLE DB provider from within Visual Studio .NET in the .NET Framework. Developers can take advantage of the object browser and see the object interfaces exposed to our Managed Provider. They can use Visual Studio .NET IntelliSense® to help complete their code as they're writing it, with the comments that are returned back and the object requirements that are returned back from our provider at design time. Then we have fairly good online documentation and samples that really help jump-start the developer in using this Managed Provider.
At this point, I'd like to pass the microphone over to Jeff, and have him give you a nice walkthrough of our two-phase commit support in our managed data product for DB2.
Jeff: (Slide 11) Thanks, Paul. Let me just pull up the remote desktop connection here. (Demo2) What we're going to do in this demo is, I have a Web application using the new Web Forms feature of Visual Studio. That connects to a back-end Web service. What this particular Web service does is it allows me to simulate a banking application. Perhaps one system is running on DB2, and another system is running on a SQL Server. What I want to do is I want to be able to transfer funds from my DB2 system to my SQL Server system.
One of the problems with that is networks are unreliable. Computers are unreliable. They can crash. Networks can fail. Someone can trip over the network cable. Something bad can happen. With our new two-phase commit feature, it not only allows you to have two potentially different types of database systems enlisted in this same transaction, but it also protects you from the kind of network failures or computer failures that are inevitable in your business environment.
So I'm going to just go ahead and run this really quickly and show you what it does. Then I'll step through a little bit of code, and then show you how I might go about coding this with the Managed Provider for DB2.
What it's just done is it's gone out to the DB2 system and done a SELECT from a table and retrieved three rows, which are accounts. So I'm going to select this particular account. Then it's also gone out to my local SQL Server system, and I'm going to select an account there.
I'm going to type in an amount that I'm going to transfer from the Jones account to the Quan account, and I'm going to click Transfer. As I see, the transfer succeeded. Jones' balance was decreased and Quan's balance was increased, and that happened within a single distributed transaction. If I transfer again, I can see Jones' balance continue to decrease and Quan's continue to increase. That's great. I've now got DB2 and SQL Server running in the exact same transaction.
But what happens when things go wrong, when let's say another part of my transaction fails, or the network goes down or something like that? I don't want the DB2 system to have deducted the money from the Jones account, but not added it to the Quan account. Then I'll have lost my $1,000 somewhere in the bit bucket that is who knows where.
In that case, I have this little Transfer with Failure button that I'll show you, and if I click on that it just – basically what happened was that I did the deduction from the Jones account and I tried to add it to the Quan account, but I threw an exception in my code, which I'll show you, and then it rolled back the transaction. So both systems were kept in sync of that behavior.
Really quickly here I'm going to step through some of the Web service code. So I'm going to go here and view the code of my Web service, and go down to the transfer function. The transfer function is going to create a new connection using the Microsoft Managed Provider for DB2 connection. It's going to take a connection string and it's going to create command, and it's also going to use the SQL provider to create a SQL connection.
Then inside the body of this function I'm going to open the connection to DB2, and I'm going to open the connection to SQL Server. Here I have the UPDATE statement to update the table from DB2, and I have the statement for SQL Server, where I'm going to update that database also.
If something goes wrong, I can throw an exception, if perhaps they couldn't find the account or something like that. Then when that works, I call the enterprise services SetComplete method which says this transaction has worked okay, and it is okay to commit this transaction. If anywhere in this an exception is thrown, that exception is caught here. I can call enterprise service to abort that transaction, and it will pass that exception on to the Web service.
That other button that I showed you, Transfer with Failure, does essentially the same thing, but I have inserted an exception here that I've thrown that simulates a potential failure that might occur somewhere. So I open the connections. I perform the update statement. The actual data, at this point, has been changed on the host and on the SQL Server. But then because I've thrown the exception and called abort, that transaction will then be rolled back and both DB2 and SQL Server will rollback that transaction.
As you can see, in a very short amount of code, you can implement enterprise services distributed transactions with our Managed Provider, and this sample is available in our SDK that ships with HIS 2004.
I think I will turn it back over to Paul now, so he can show you another demo.
Paul: That's really cool, Jeff, and that hopefully gives you an idea as to why Microsoft and our enterprise customers are telling us the quickest time to market approach, to integrating vital data stored on these heritage systems. When the data ultimately is not in the applications (it's stored in VSAM data sets, in IMS, which is accessible through DB2, or in DB2 databases themselves), the quickest way of getting at that data is to write very efficient code with Visual Studio .NET and the .NET Framework. We think the Managed Provider is going to help customers do that.
Jeff: Yes. Anyone who knows how to use the SQL provider or any of the ADO.NET providers will already know how to use the Managed Provider for DB2, because the interfaces are very, very similar. I think any application that you have to write there should be very easy to do.
Paul: (Slide 12) Here's a graphical depiction of really what Jeff was doing. He had a data consumer application. It was an ASP.NET Windows Forms application, and it was calling out to a separate ASP.NET Web service. He had that part of a single Visual Studio .NET solution, so it's really easy to write these applications in Visual Studio. Then it instantiated our Managed Data Provider for DB2, which called down – were you doing an IP connection?
Jeff: Yes. That particular demo was using two-phase commit over TCP/IP.
Paul: Our latest technology. That in turn relied upon our distributed re-sync service for TCP/IP, that stores a transaction mapping log that maps Windows transaction IDs and DB2 transaction IDs in a local SQL Server, Microsoft Database Engine (MSDE), or a SQL Server instance database.
(Slide 13) Looking at this diagram for a second, I want to just show you a couple of things in another solution that I have on one of my computers. (Demo3) As you'll recall, we have QMF up here. Why don't we query the database? I have an insurance claims database, and it should have one record in it that I created earlier for one claim, "Drove convertible through carwash and ruined leather seats." Just my luck isn't it, Jeff?
Jeff: Absolutely.
Paul: You're not supposed to agree with me on that. So if we go to my desktop here, I have a Visual Studio .NET solution, a nice little sample application, and it has a number of Web service interfaces, a Visual Basic .NET Web service. I like VB .NET, because I think it accurately reflects what our enterprise customers use the most. Jeff was also showing –
Jeff: Yes, the two-phase commit sample is written in C#.
Paul: C#, which we find to be popular as well. I have a number of different Web service interfaces. I have the ability to enter a new claim, the ability to report a claim as XML. What I want to show here is that it's really easy using our Managed Data Provider and System.Data or ADO.NET to import and export XML. That's because one of the architectural design goals of ADO.NET was to make integration with XML seamless for the end user or the developer.
It's really easy to take XML output that we get back, in this case from doing a SELECT statement against the database with an input parameter, and passing that back as an XML document to the calling application. Once you get the data back from the mainframe DB2 instance, our data provider creates a data set, like an ADO recordset, a managed version of that, a number of data tables, relations inside a data set. And the data set, you can write it out to an XML schema document, an XML document. It makes for really easy interoperability with XML consumers, such as InfoPath, BizTalk, or your custom XML application.
The same thing with XML input, I can parse an XML document and from that create an actual ADO.NET data set. So I can read the XML document in the schema and create a data set from that, and then very easily map that with a parameterized Insert statement to insert statements up to DB2 or call a stored procedure as an insert into a table.
What I wanted to show you here was a really simple sample that will call a DB2 stored procedure and get back multiple result sets. I've got to show you all the new features.
I'm going to debug this Visual Basic .NET XML Web service. I'm going to click on this interface and invoke that Web service interface and I'll walk through the code. The first thing is that there are no input parameters for this stored procedure or this Web interface. I'm going to specify a connection string, which I created with a Data Access Tool. It looks an APPC connection to a mainframe DB2 instance that we have access to in our 3270 session.
I'm going to create a Microsoft DB2 connection object and use the connection string to fully qualify that object. I'm going to create a Microsoft DB2 data adapter, and that's the object that actually pulls the data back from the DB2 database and passes it back to ADO.NET.
I'm going to declare a data set. The query that I'm doing, the SQL string, is a call to Humongous Insurance store proc multi-result set; intuitive name. I put all of my real code inside a Try Catch, where I open up the connection, create the data adapter with the call statement and the connection object, and fill the data set from the data adapter results.
So we've gone and connected to the mainframe DB2 instance over SNA, using this new SNA over IP or IBM enterprise extender functionality in Host Integration Server 2000 [sic], connected to the DB2 version 7.0 database, called the stored procedure, got the multiple results back, and created a data set with two data tables. If we run through that and look at the results, you can see here we have Table1 and Table2. Table1 is a listing of the claims, and Table2 are the claim types, and all of that data is returned as XML output to a Simple Object Access Protocol (SOAP) consumer list Web service.
It's really easy to use business logic in your DB2 database in the form of a stored procedure and wrap that in XML Web services, and move your enterprise towards a services-oriented architecture, allowing you to move into a modern computing infrastructure, develop new solutions quicker, and offer new products and services to your constituents and your customers.
(Slide 13) Here you see a graphical depiction of our data providers for host files. Again, on the left side we have COM+, ADO, and OLE DB. On the right side we have managed data. In the case of COM+, ADO, and OLE DB, there are consumers such as Excel, Access, basically any Office application you can use Visual Basic for applications, and ADO to call into our Microsoft OLE DB provider for AS/400 and VSAM and have bulk or incremental record-level access to the vital data stored in these legacy host files.
Our OLE DB provider for AS/400 VSAM runs on top of a Microsoft-designed and -developed DDM record-level input/output or DDM client. Again, it handles session startup and tear down. It handles data conversion, so the host system doesn't have to do non-native data conversion. It is a non-SQL command and rowset provider. So it will open up a rowset based upon a data set name, including a member. So it opens up AS/400 members and mainframe partition data sets, partition data set extended members. It opens up mainframe key sequence data sets, relative record data sets, ultimate indexes based upon a key, and it will open up sequential data sets such as SAMs and entry sequence data sets as well.
It will connect via SNA or IP to an AS/400, and connect via SNA only to the mainframe. Yet with the new support in Host Integration Server 2004 for IBM's enterprise extender on the mainframe, you can connect to the mainframe all the SNA connectivity flows over an IP-routed intranet local area network and actual mainframe data center. So we effectively support IP into the mainframe as well.
(Slide 14) On the 400, the server transaction program, our DDM server that we connect to, is built into the AS/400. On the mainframe there's a component called Distributed File Manager. It's part of your system's management storage. Most of our enterprise customers use IBM System Managed Storage (SMS) or Tivoli to manage their data sets, and that component is an optional feature that can be enabled in very quick order.
If our enterprise customers have difficulty with the host-side configuration, we encourage them to work with us through Product Support Services, and the product team is quite amenable to handholding customers into configuring their host environments to get this file system provider to work.
Again, the architecture, we're taking IBM's architecture and mapping it to Windows architectures for data access. We have data consumers, we mentioned those. Service components: in this case there are session pooling components and client cursor engines that can be made to work in a read-only manner with our file system provider. And then the data providers themselves: OLE DB for AS/400 VSAM, and then an ActiveX control host file transfer, the difference being that OLE DB provider is for incremental data access. File transfer is for bulk data access. File transfer is much quicker along the lines of an FTP access, and yet it will handle the code page and data conversion for you that FTP does not. So there's an advantage to using our host file transfer over FTP.
The other thing about the VSAM OLE DB provider is that it's, as I mentioned, for non-SQL commands. So it won't consume a select * from or a select where clause, or an INSERT, UPDATE, or DELETE, but it will work really well with ADO and COM+ OLE DB semantics for opening up a recordset or a rowset, navigating by key, locking the record, and updating. So it's really good in a connected fashion.
What I want to show you is how you can use ADO.NET to call into our OLE DB provider and pull data back in bulk form. New in this release is support for SQL Server Data Transformation Services (DTS). It's a consumer in SQL Server 2000 that allows for bidirectional data movement between OLE DB data sources.
What we find customers doing is using DTS to pull VSAM data sets into SQL Server, create a data warehouse, and run reporting or analysis services on that rich data. Let me give you a quick demo of this (slide 15).
(Demo4) I have a simple little Visual Basic .NET Web service sample. Here we have one Web service method. You'll see that we're going to consume our OLE DB provider for AS/400 VSAM, connect to the mainframe directly from ADO.NET through the Data.OleDb data provider.
Again, I have an initialization string to access the database that I created with the Data Access Tool and SNA connection, and it's going to pull back two data sets from a fictitious book catalog. In this case, I'm just going to pull back one data set, and we'll walk through that.
We'll debug this DB.xml Web service. Click on this interface, and that's going to go up and pull back a list of categories in this book store. No input parameters. Again, an initialization string. The Microsoft Managed Provider for OLE DB connection object with an init string that calls into our provider, OLE DB. Create an OLE DB data adapter. The command syntax to open up a data set on the mainframe is EXEC OPEN, kind of like calling a stored procedure, and then the fully qualified path to the data set HISdemo.demo.ESDS.category for the entry sequence data set.
We create a data set from the results. I try that and create an OLE DB connection. We look here, active connections. We have an APPC connection. That represents our SNA OLE DB provider connecting to the mainframe, and we'll pull the data back.
I'm going to show you really quickly in our host column description files, here is the mapping that we've created locally to map from host types to PC types. So we'll do that data conversion on behalf of the Web service application. We'll run through that and we'll get the data back and display it in the Web browser.
There it is. There's the data coming back from the mainframe. It's that easy to wrap your heritage data sets on the mainframe as an XML Web service and move to an SOA. It's truly amazing technology.
Another thing I want to show you really quickly is from Excel. They have these things called query files. If we take a look at those, what that is is just an initialization string to a data source, in this case an AS/400, a command that our file system provider can access, EXEC OPEN. It's going to instruct Excel to call into our data provider and pull back this data set. It's a really easy way to pull data into an Excel application.
We open up this query file. It will connect to the 400, in this case over TCP/IP, and pull back all the data in that data set and display it in Excel. With that you can create a graph or report and really help the business decision maker by using familiar tools, in this case Excel, to access heritage data.
Should we look at Data Transformation Services (DTS)?
Jeff: Sure. I think they would like to see that.
Paul: Alright. Let's show you another demo here, where we will bring up SQL Server Enterprise Manager. In SQL Server I have a bunch of databases. I have one called VSAM Demo, and here are some tables. You can see there is no table in here called WebCast, so why don't we go ahead and import from within SQL Server foreign data, as represented by VSAM data we're going to import.
What most customers want to do is they want to be able to import the data from the legacy system. Let me show you another really cute tool in Data Access Tool here. I'm going to create a really quick data source connection to the native AS/400 file system via TCP/IP. It shows you what it takes to connect to the 400. Connect to the relational database name in Condor. I've got HISDEMO, I think I've got some data tables there.
Cheat again by storing a password in this string, connect, sample the query. It looks like we have some data up here. It looks like we can pull back one of these data sets, the claims. We'll create a UDL file and finish that. There's the data source we created.
Now I want to be able to move this to another location, so I'm using a really cool feature in the Data Access Tool (DAT) called Locate. So I'll copy this data set, and paste it on my desktop so I can find it when I'm inside SQL Server, and that will make my life easier.
Let's do this; let's show that we've indeed mated our file system provider to Data Transformation Services in SQL Server. The data source will be an OLE DB data source. There's that UDL. Yes, it connects to the 400; that's cool.
Let's bring this data into this VSAM demo SQL Server. Bring this claims table in here. Why don't we just call this WebCast? We'll preview the data. There's no data in that data set.
Jeff: Bad choice.
Paul: We'll try this one, Title. Yes, there's data there. So what we're doing is we're telling SQL Server that we want to pull from this AS/400 physical file to be a single-member or a multi-member file, and we want to create a table in SQL Server and bring the data down from the 400 into SQL Server. So we can probably run Analysis Services or Reporting Services against that.
What DTS has done, without writing any code, just from this nice little SQL Server Wizard, is created a target table in SQL Server, gone up to the 400 using our OLE DB provider for AS/400, opened up the AS/400 physical file, read from the first member, and pulled all the records down. The data provider in Host Integration Server converted from AS/400 machine types and EBCDIC strings to Windows OLE DB type and passed that data to DTS as a consumer of our provider, and that information is now in the SQL Server database. We'll refresh this view and you can see the data now here in SQL Server, cool.
Just quickly, slides (slide 16, 17, 18), we've shown all this and we'll move on to a problem resolution demo (slide 19). We'll let Jeff talk. He'll be efficient and he'll cover more than I possibly can.
Jeff: Thanks, Paul. (Demo5) One of the new things we wanted to do in HIS 2004 was make it easier to troubleshoot and trace errors that customers run into, especially with our DB2 provider.
I don't know if you're aware of our tracing facilities in Host Integration Server, but we have the SNA Trace utility, which, despite its name, also traces data over TCP/IP. For DB2, all of our tracing flags can be set on the DB2 network library section. So if you click on Properties, you can turn on all of the tracing, especially into this message tracing, these network data streams.
This one is really important, because what it allows us to do is capture all of the data, all of the commands that we send to DB2, all of their replies at the low level, that DRDA protocol that Paul spoke of earlier —
Paul: That option is new, isn't it, Jeff? In the past, connecting over TCP/IP we didn't get a lot of data back from tracing.
Jeff: That's correct, and a lot of times we relied on another tool called NetMon, which is an internal tool that wasn't available at a lot of customer sites. So now they can turn this on. They don't need any extra tools, and we can still, in a very easy way, retrieve all of the commands we're sending up there, all the replies, the metadata that DB2 sends back, and the actual columns. So if there is a data conversion problem, it's easy to identify.
I'm going to turn those on. Then I can go to Rowset Viewer here. I already have a connection open to a host. So if I do select * from authors and execute that query, it retrieved some data. I can go ahead and clear the traces and you get a bunch of tracing files.
If you look inside the message tracing file, you can see information that we've traced. Here it says we're preparing the SQL SELECT statement, select * from authors. Here, in these sections that say DDM, are the actual data that we've sent and received from DB2.
What I've done is I've gone through and collected some interesting traces. What you can do is if you save the trace as a text format, we have a tool that will take this DDM data and parse it out for you so it makes sense. So I'm going to show you that really quickly.
I've saved them already, to save some time. Here's a Trace.txt file. If I open that up, it's basically the trace file in a text format. If I open a command line, I can run my trace program, trace.txt, and I'm going to save it. It actually exports it as an XML file, and I used the character set 1252, which is the ANSI code page, and that information is needed for this tool.
That runs. I get an XML file. What does that look like? Here is my trace, all parsed nicely so I can read all of the commands and DDM code points, and their data that are sent up to the host and received back from the host.
So if I go down here, I can see how we're accessing the database. Here's the character set, 1252. A little more interesting is here I am preparing the SQL statement. Here is the SQL statement, select * from authors, that we sent up there.
Then here is the reply data that we get back, and it's saying that there are nine columns in this column count, and here are their descriptions. Here's the column name and its label, its type, the length of the column, precision scale, the code page that's used or the Coded Character Set Identifier (CCSID) that's used. So you can get a lot more information using this new set of tools.
Then, of course, it doesn't parse it out for you quite yet, but we also have all of the data. So if there is an actual problem converting the data, someone in the product team will be able to take this information and perform a much quicker turnaround time, when doing a Request for Comments (RFC) or a quick fix engineering (QFE) or something like that.
We also added some new error messages. So if go into my collection of UDLs, I have a connection to one of our hosts; that's that connection. I guess that host is offline.
Paul: That's one of our older DB2 NT boxes. I'm not sure that's the box you want. So the tracer tool that you showed is not on the product CD.
Jeff: Right.
Paul: But it's available by working with Product Support Services, and what we'll do is we're in the process of publishing a Microsoft Knowledge Base (KB) article and allowing you access to that tool.
Jeff: Right. Here, for example, if a user types in the wrong password, we have a lot more informative error messages now: "Test connection failed; an error in initializing provider. The user does not have the authority to access the host resource. Check your authentication credentials or contact your administrator."
Paul: Basically what we did is the product team prioritized their usability improvements based upon feedback and analysis from Product Support Services. Where do we spend the most time, in terms of numbers of days to solution, in resolving customer reported problems? What we did is we focused on the top issues, and over the course of this project we have implemented things like the Data Access Tool, the Data Source Wizard, improved error tracing, improved error command strings, return strings, that really help Product Support Services and our enterprise customers that support these data providers reduce their overall total cost of ownership.
Jeff: That's right.
Here's another one. What I did was I typed in an incorrect initial catalog, which is, as Paul said, one of the ones that a lot of people get wrong. It says, "The host resource could not be found. Check that the initial catalog value matches the host resource name." So on this particular mainframe, that host, DSNKD036, did not exist but 37 does.
If I were to go up here and try and do a command for a table that does not exist, it will tell me that, "JEFFDERS.AUTHORS22 is an undefined name." It gives you, of course, the native DB2 error codes and error messages, in case you want to go to those reference manuals and look up additional error information.
Paul: Can you bring up a data link really quickly, Jeff?
Jeff: Sure.
Paul: Just edit a DB2 data link. If you look at the All tab for a moment, there is the client application name. You'll see that that by default is blank. That's what I spoke of earlier. One of the key features that we added at the very last minute, based upon a customer request, was the ability to send up a custom property value that the developer in a custom application can set at runtime, based upon the user and the type of application, send it up to the mainframe or up to the DB2 400 or the DB2 server on open platforms, and identify uniquely that application instance.
The idea is that really makes it easier to identify a rogue application on the mainframe and destroy that thread. So if some user issues a query to that, they should note that it might run for days. Usually the administrator can very easily find it and kill it. Also, it allows for you on the 400 to go in through the active user or the DB2 server contacts, identify the thread, and get custom error information from the 400. Also, as we mentioned before, it allows for a chargeback within an enterprise, where host services are charged back to the consuming group.
(Slide 20) In summary, we really believe that you can lower your costs for developing new Windows Server™-based Web applications, and help modernize your heritage infrastructure by moving towards an SOA through the use of Visual Studio .NET, .NET Framework, and the data providers for DB2 in the native file systems that are available in Host Integration Server 2004. The focus, for us, has been on reducing overall developer time to market, making the product more scalable, more supportable, and more performant.
These are a list of the technologies that we spoke of today.
(Slide 21) Our schedule for 2004: the product is nearing release to our manufacturing facilities. We are working with our rapid deployment sites to certify the ship readiness of the product. As more of those sites chime in and give us detailed reports on the uptime over a course of a number of weeks to prove the readiness of the product, we will then obtain permission to release this product. We expect that to be mid-calendar year 2004, very soon.
(Slide 22) Here are some information sites for you. We have our external Web sites. We have an MSDN presence. We have newsgroups, which are very active and supported by our MVPs, our product support personnel, as well as members of the product team.
We have a contact here. I am plarsen@microsoft.com. Feel free to send me mail if you have questions, concerns, or issues for follow-up.
With that, we'd like to thank you for joining us for today's event, and move into the Q&A portion.
Zherina Salamanca: At this point, we'd like to hear from you, our listeners, about this topic. If you find that you need more complex technical assistance that might be outside of the scope of this discussion, go to support.microsoft.com/ or call Microsoft Product Support Services directly to speak to a support professional.
If you would like more information on future Support WebCasts or to review any of our sessions on demand, visit our main Support WebCast site at support.microsoft.com/webcasts/. For this particular session, you will find a downloadable version of the slides and on-demand streaming media within 24 hours, and a full written transcript within three weeks.
With that, let's go ahead an answer some questions submitted during the slide presentation.
Our first question is: Are you able to access native data queues on AS/400 and pull data back into Visual Studio using a Web service?
Paul: The answer to that question is yes. We have another COM+ ActiveX control in our stable of data access technologies. It's designed to allow a Windows developer to call an AS/400 data queue and pull a record back from a first in, first out (FIFO), last in, first out (LIFO), or keyed queue, or to send a message up to one of these queues. The data queue ActiveX control, although it's COM+-based, does work well from Visual Studio .NET ADO.NET application context through COM+ interoperability.
The only caveat I would mention is that we've kind of neglected this data queue control. It needs some upgrades, in the sense that it only supports converting message data from Windows to 400. It doesn't convert inbound messages. It doesn't have full support for our Far Eastern locales, in terms of double-byte character set conversions, and it only connects with a 400 over an SNA connection, and we find that the predominant method of connecting 400 is with native TCP/IP.
We really want your feedback, and you're welcome to send me e-mail and tell us if you'd like us to upgrade this ActiveX control and maybe produce a managed component for accessing data queues, because it is a very common method for inter-process communication on the 400, and it is very popular with Windows developers to access the 400 and move data between applications.
Zherina: What are the best practices for integrating the AS/400 with BizTalk using HIS 2004?
Paul: That's a good question. We have a specific purpose session that will be coming up in a couple of weeks on this topic, where we'll talk about the technologies in Host Integration Server 2004, and how they enable you to extend solutions that you develop in BizTalk Server 2004.
With regards to the AS/400, personally I think the best way of accessing AS/400 resources, whether they're application or data resources, is using the Managed Provider for DB2. There's a little bit of code, as I showed, either a Web service or a managed Windows Server component. You can front-end your AS/400 data, whether it's file system data, DB2 data, or even applications, by wrapping a native AS/400 application, a callable AS/400 application, in a DB2 400 stored procedure.
You can wrap all those in a very easily accessible XML Web service or a .NET server component, and then BizTalk orchestration can reference that Windows representation of the AS/400 resources, and include that in as a Web reference or a .NET reference right inside orchestration. And with that, BizTalk creates the port definitions, the message type, and you can very easily pass messages to the 400, whether it's file system data, DB2 tables, stored procedures that wrap AS/400 applications.
We also have in Host Integration Server 2004 the ability to call AS/400 applications directly, and that is a topic of another session, and will be covered in the BizTalk session in about two weeks.
Zherina: It looks like that was our last question. It looks like Paul and Jeff did a very thorough job covering their topic today.
We hope this presentation has been helpful to you and your business, and we look forward to your participation in upcoming WebCasts. Again, I'd like to thank Paul Larsen and Jeff Derstadt for joining us in our discussion today, and of course we'd like to thank you, our listeners, for attending today's session. Thanks and have a great day, everyone.
|