Summary
This article answers common questions concerning Automation to Microsoft Office from Visual C++.
More Information
Table of Contents
-
What is Automation?
-
I'm new to Automation, where can I find good resources to learn more?
-
Are there different ways I can use Automation?
-
What is COM?
-
How do I attach to the running instance of an Office application?
-
How do I pass optional parameters?
-
How do I catch events exposed by the Office applications?
-
My automation code is too slow. How can I speed things up?
-
What do these huge error values, like -2147352573 or 0x80030002, mean?
-
What is a type library?
-
My automation code worked with Microsoft Excel 95, but fails with Microsoft Excel 97. Why?
-
Why does the application I'm automating stay in memory after my program is finished?
-
I know what I want to do as a Microsoft Office application user, but how do I do this programmatically using Automation?
-
Can I automate an embedded Microsoft Office application?
-
How do I access my document properties in a Microsoft Office document?
Questions and Answers
-
What is Automation?
Automation (formerly OLE Automation) is a technology that allows you to take advantage of an existing program's functionality and incorporate it into your own applications. For instance, you can utilize the Microsoft Word spelling and grammar checking capabilities into your application without Microsoft Word visible to your users. You can even use all of the Microsoft Excel charting, printing, and data analysis tools. This technology can greatly simplify and speed up your development. -
I'm new to Automation, where can I find good resources to learn more? Chapter 24 of David Kruglinski's "Inside Visual C++" (ISBN:1-57231-565- 2) supplies a general overview as well as some great examples. Also, the Microsoft Knowledge Base is a good source of information.
If you prefer learning by example, please see the following article in the Microsoft Knowledge Base:179706 HOWTO Use MFC to Automate Excel & Create/Format a New Workbook
-
Are there different ways I can use Automation?
There are three basic ways you can use Automation: MFC, #import, and C/C++:-
With MFC, use the Visual C++ ClassWizard to generate "wrapper classes" from the Microsoft Office type libraries. These classes, as well as other MFC classes, such as COleVariant, COleSafeArray, COleException, simplify the tasks of Automation. This method is usually recommended over the others, and most of the Microsoft Knowledge Base examples use MFC.
-
#import, a new directive that became available with Visual C++ 5.0, creates VC++ "smart pointers" from a specified type library. It is very powerful, but often not recommended because of reference- counting problems that typically occur when used with the Microsoft Office applications.
-
C/C++ Automation is much more difficult, but sometimes necessary to avoid overhead with MFC, or problems with #import. Basically, you work with such APIs as CoCreateInstance(), and COM interfaces such as IDispatch and IUnknown.
It is important to note that there are some slight differences between Automation from C++ compared to plain C, because COM was designed around the C++ class.
-
-
What is COM?
Automation is based on the Component Object Model (COM). COM is a standard software architecture based on interfaces, and designed to have code separated into self-contained objects. Think of it as an extension of the Object Oriented Programming (OOP) paradigm, but applicable to separate applications. Each object exposes a set of interfaces, and all communication to an object, such as initialization, notifications, and data transfer, happens through these interfaces. COM is also a set of services provided by dynamic-link libraries (DLLs) installed with the operating system. Automation uses many of those services. One example is the "Marshalling" service, which packages the client application's calls to the member functions of the server application's interfaces and passes those, with their arguments, to the server application. It makes it appear that the server's interfaces are exposed in the client's memory space, which is not the case when the client is an .exe running in its own process space. Marshalling also gets the return values from the server's methods back across the process boundaries and safely into the hands of the client's call. There are many other services essential to Automation that are provided by the various COM libraries. Sources of information about those include "Inside Ole - Second Edition" by Kraig Brockschmidt, ISBN 1-55615-843-2, "Inside COM" by Dale Rogerson - ISBN 1-57231-349-8, and "Automation Programmer's Reference," ISBN 1-57231-584-9. -
How do I attach to the running instance of an Office application?
Use the GetActiveObject() API. Automation servers register themselves in the ROT (Running Object Table), via the RegisterActiveObject() API. Automation clients can get at the running instance with code such as:// Translate server ProgID into a CLSID. ClsidFromProgID // gets this information from the registry. CLSID clsid; CLSIDFromProgID(L"Excel.Application", &clsid); // Get an interface to the running instance, if any.. IUnknown *pUnk; HRESULT hr = GetActiveObject(clsid, NULL, (IUnknown**)&pUnk); ASSERT(!FAILED(hr)); // Get IDispatch interface for Automation... IDispatch *pDisp; hr = pUnk->QueryInterface(IID_IDispatch, (void **)&pDisp); ASSERT(!FAILED(hr)); // Release the no-longer-needed IUnknown... pUnk->Release();
NOTE: If there are multiple instances running of the Office application you want to attach, you will only be able to attach to the first instance that was launched using the GetActiveObject() API.
Theoretically, you can iterate the ROT for each individual instance, but the Office apps don't register themselves if another instance is already in the ROT because the moniker for itself is always the same (it couldn't be distinguished anyway). This means that you can't attach to any instance except for the first. However, because the Office apps also register their documents in the ROT, you can successfully attach to other instances by iterating the ROT looking for a specific document, attaching to it, then getting the Application object from it. You won't need to do this for PowerPoint, because it is a single- instance application; you can only have one instance of it running. -
How do I pass optional parameters?
Some methods have "optional" parameters. In Visual Basic, you can casually omit them when calling the method. However, when calling with Visual C++ you have to pass a special VARIANT whose .vt field is VT_ERROR, and .scode field is DISP_E_PARAMNOTFOUND. That is:// VARIANT used in place of optional-parameters. VARIANT varOpt; varOpt.vt = VT_ERROR; varOpt.scode = DISP_E_PARAMNOTFOUND;
This is really what Visual Basic is doing behind-the-scenes.
-
How do I catch events exposed by the Office applications?
Basically you implement the event interface you want to catch (the "sink"), and setup an advisory connection with the application (the "source"). In general, to setup the advisory connection, you get the server's IConnectionPointContainer and call FindConnectionPoint() with the IID of the event interface. This gives you an IConnectionPoint interface and all that's left is to call Advise() with an instance of your event interface. The server will then call back through this interface when these events occur. -
My automation code is too slow. How can I speed things up?
A common cause of speed problems with Automation is with repetitive reading and writing of data. This is typical for Excel Automation clients. However, most people aren't aware that this data can usually be written or read all at once using SAFEARRAY. See the following Microsoft Knowledge Base article for more information and informative examples:179706 HOWTO: Use MFC to Automate Excel and Create/Format a New Workbook Also, it is important to point out that using the clipboard can sometimes improve performance. For instance, you can copy your data to the clipboard, then use automation to tell the server to paste. Or vice- versa; tell the server to copy-to-clipboard, and paste into your application.
-
What do these huge error values, such as -2147352573, or 0x80030002 mean?
These values are known as HRESULTs and are defined in winerror.h. The numbers are so large because the first bit represents whether or not it is an error result. You can use the ErrLook.Exe utility that comes with Visual C++ to translate these numbers into meaningful descriptions. If you want to programmatically obtain a description for the errors, you can use the FormatMessage() API.NOTE: If you are using Visual C++ 6.0 and have a variable containing this value in the debug watch window, append ", hr" (without the quotes) to it to have Visual C++ translate it for you!
-
What is a type library?
A type library is similar to a C/C++ header file. It contains the interfaces, methods, and properties that a server is publishing. You can view the type library with the OLE/COM Object Viewer (Oleview.exe) that comes with Visual C++. Here is a list of the type library filenames for Microsoft Office 95, 97, and 2000: Office Application | Type library ------------------------+---------------- Word 95 and prior | wb70en32.tlb Excel 95 and prior | xl5en32.olb Powerpoint 95 and prior | Powerpoint.tlb Access 95 and prior | msaccess.tlb Binder 95 | binder.tlb Schedule+ | sp7en32.olb Project | pj4en32.olb Team Manager | mstmgr1.olb Word 97 | msword8.olb Excel 97 | excel8.olb Powerpoint 97 | msppt8.olb Access 97 | msacc8.olb Binder 97 | msbdr8.olb Graph 97 | graph8.olb Outlook 97 | msoutl8.olb Outlook 98 | msoutl85.olb Word 2000 | msword9.olb Excel 2000 | excel9.olb Powerpoint 2000 | msppt9.olb Access 2000 | msacc9.olb Outlook 2000 | msoutl9.olb Word 2002 | msword.olb Excel 2002 | excel.exe Powerpoint 2002 | msppt.olb Access 2002 | msacc.olb Outlook 2002 | msoutl.olb
-
My automation code worked with Excel 95, but fails with Excel 97. What's happening?
The object model for Excel made a significant change from version 95 to 97. Excel 95 implemented all its methods and properties in a single implementation of IDispatch. This meant that often you could call methods meant for object X, from object Y. This was not a good design, so in Office 97, each object has its own separate Idispatch implementation. This means that if you ask for a method or property from object X from a separate object Y, you get the error 0x80020003, -2147352573, "Member not found." To avoid this error, you need to make sure that the underlying IDispatch interface you are making calls from is the semantically correct one. -
The application I'm automating stays in memory after my program is finished. What's happening?
Most likely, this is because you have forgotten to release an acquired interface and you'll need to track it down. Here are some general suggestions, and things to looks for:-
If you're using #import, it is very likely you could be running into one of the reference-counting bugs associated with it. Often times the bugs can be worked around, but usually it is preferred to use one of the other Automation methods. #import doesn't work very well with the Office applications, because its type libraries and use are fairly complex. Also, such reference counting problems are hard to track down because a lot of the interface-level COM calls are behind-the-scenes when using #import.
-
Check to see if you are calling any methods, such as Open, or New, that return an IDispatch * (LPDISPATCH), and ignoring the return value. If you are, then you are abandoning this returned interface and will need to change your code so that you release it when no longer needed.
-
Gradually comment out sections of your code until the problem disappears, then add it back judiciously to track down where the problem starts.
-
Note that some applications will stay running if the user has "touched" the application. If this occurs while you are automating, then the application will probably stay running afterwards. The Office applications have a "UserControl" property on the Application object that you can read/write to change this behavior.
-
Also, some applications will decide to stay running if enough user-interface "action" has occurred. If you are intending the application to exit, then call its Quit() method on the Application object. Word will shutdown regardless of its reference count when Quit is called. This isn't expected COM behavior. Excel, however, will properly just hide itself but stay running until all outstanding interfaces are released. In general, you should release all outstanding references, and only call Quit() if you intend the application to quit.
-
-
I know what I want to do as a Office application user, but how do I do this programmatically through Automation?
What you are interested in is what objects, methods, and properties you need to use. The best way to learn how to navigate the object models of Word, Excel, and Powerpoint, based on what you want to do as a user, is to use the Macro Recorder. Just choose Macro\'Record New Macro' from the Tools menu, execute the task you're interested in, then choose Macro\'Stop Recording.' Once you're done recording, choose Macro\Macros from the Tools menu, select the macro you recorded, then click Edit. This will take you to the generated VBA code that will accomplish the task you recorded. Keep in mind the recorded macro won't be the best possible code in most cases, but it's does very well for a quick example. -
Can I automate an embedded Office application?
Absolutely. The trick is getting the IDispatch pointer: this is given in the Visual C++ Technical Note 39 (TN039). -
How do I access my document properties in an Office document?
The document properties are accessible through Automation, or directly through IPropertyStorage.