How To Convert from ANSI to Unicode & Unicode to ANSI for OLE

All strings that are passed to and received from 32-bit OLE APIs and interface methods use Unicode. Applications that use ANSI strings must therefore convert them to Unicode before passing them to OLE, and must convert the Unicode strings that are received from OLE back to ANSI. This article demonstrates how these conversions can be done.
More information
Windows NT implements both Unicode (wide-character) and ANSI versions of Win32 functions that take string parameters. Windows 95, however, does not implement the Unicode version of most Win32 functions that take string parameters. It implements only the ANSI versions of these functions.

A major exception to this rule is 32-bit OLE. 32-bit OLE APIs and interface methods on Windows NT and Windows 95 use Unicode exclusively. ANSI versions of these functions are not implemented on either Windows NT or Windows 95.

This means that a 32-bit application that needs to run on both Windows 95 and Windows NT must use the ANSI versions of the non-OLE Win32 functions and must convert ANSI strings to Unicode before they are passed to OLE.

A 32-bit Unicode application that runs only on Windows NT need not use any ANSI/Unicode conversion functions.

Win32 provides MultiByteToWideChar and WideCharToMultiByte to convert ANSI strings to Unicode and Unicode strings to ANSI. This article provides AnsiToUnicode and UnicodeToAnsi, which use these functions for ANSI/Unicode conversion.
/*
 * AnsiToUnicode converts the ANSI string pszA to a Unicode string
 * and returns the Unicode string through ppszW. Space for the
 * converted string is allocated by AnsiToUnicode.
 */
HRESULT __fastcall AnsiToUnicode(LPCSTR pszA, LPOLESTR* ppszW)
{
    ULONG cCharacters;
    DWORD dwError;

    // If input is null then just return the same.
    if (NULL == pszA)
    {
        *ppszW = NULL;
        return NOERROR;
    }

    // Determine number of wide characters to be allocated for the
    // Unicode string.
    cCharacters = strlen(pszA) + 1;

    // Use of the OLE allocator is required if the resultant Unicode
    // string will be passed to another COM component and if that
    // component will free it. Otherwise you can use your own allocator.
    *ppszW = (LPOLESTR) CoTaskMemAlloc(cCharacters * sizeof(OLECHAR));
    if (NULL == *ppszW)
        return E_OUTOFMEMORY;

    // Convert to Unicode.
    if (0 == MultiByteToWideChar(CP_ACP, 0, pszA, cCharacters,
                  *ppszW, cCharacters))
    {
        dwError = GetLastError();
        CoTaskMemFree(*ppszW);
        *ppszW = NULL;
        return HRESULT_FROM_WIN32(dwError);
    }

    return NOERROR;
}

/*
 * UnicodeToAnsi converts the Unicode string pszW to an ANSI string
 * and returns the ANSI string through ppszA. Space for the
 * converted string is allocated by UnicodeToAnsi.
 */
HRESULT __fastcall UnicodeToAnsi(LPCOLESTR pszW, LPSTR* ppszA)
{
    ULONG cbAnsi, cCharacters;
    DWORD dwError;

    // If input is null then just return the same.
    if (pszW == NULL)
    {
        *ppszA = NULL;
        return NOERROR;
    }

    cCharacters = wcslen(pszW) + 1;

    // Determine number of bytes to be allocated for ANSI string. An
    // ANSI string can have at most 2 bytes per character (for Double
    // Byte Character Strings.)
    cbAnsi = cCharacters * 2;

    // Use of the OLE allocator is not required because the resultant
    // ANSI string will never be passed to another COM component. You
    // can use your own allocator.
    *ppszA = (LPSTR) CoTaskMemAlloc(cbAnsi);
    if (NULL == *ppszA)
        return E_OUTOFMEMORY;

    // Convert to ANSI.
    if (0 == WideCharToMultiByte(CP_ACP, 0, pszW, cCharacters, *ppszA,
                  cbAnsi, NULL, NULL))
    {
        dwError = GetLastError();
        CoTaskMemFree(*ppszA);
        *ppszA = NULL;
        return HRESULT_FROM_WIN32(dwError);
    }

    return NOERROR;
}
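Both routines above size their buffers with a worst-case estimate (two bytes per character). An alternative is to ask the conversion routine for the exact size first. With the Win32 functions, passing 0 as the output buffer size makes MultiByteToWideChar and WideCharToMultiByte return the required size instead of converting. The following is only a sketch of that two-pass pattern, written with the standard C function wcstombs so that it compiles without the Windows headers. The helper name WideToMultiByteExact is hypothetical, and the null-destination size query is POSIX and Microsoft C run-time behavior that is assumed here.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <string>

// Hypothetical helper (not part of the article): converts a wide string to a
// multibyte string in the current C locale, sizing the buffer with a query call.
// Returns false if the string contains a character the locale cannot represent.
bool WideToMultiByteExact(const wchar_t* pszW, std::string& out)
{
    assert(pszW != nullptr);

    // Pass 1: a null destination asks wcstombs for the required byte count,
    // not including the terminating null (assumed POSIX / Microsoft CRT behavior).
    std::size_t cb = std::wcstombs(nullptr, pszW, 0);
    if (cb == static_cast<std::size_t>(-1))
        return false;

    // Pass 2: convert into a buffer of the reported size plus the terminator.
    std::string buffer(cb + 1, '\0');
    if (std::wcstombs(&buffer[0], pszW, cb + 1) == static_cast<std::size_t>(-1))
        return false;

    buffer.resize(cb);   // drop the terminator from the std::string length
    out = buffer;
    return true;
}
```

The trade-off is one extra pass over the string in exchange for an allocation that is never larger than necessary and never assumes a maximum number of bytes per character.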
Sample use of AnsiToUnicode and UnicodeToAnsi follows. CoTaskMemFree is used to free the converted string if CoTaskMemAlloc was used to allocate it. The converted string must not be freed if it is returned through an out-parameter to another OLE component, because that component is responsible for freeing the string. LPOLESTR is a pointer to a Unicode string.
// The following code gets an ANSI filename that is specified by the
// user in the OpenFile common dialog. This file name is converted into
// a Unicode string and is passed to the OLE API CreateFileMoniker. The
// Unicode string is then freed.

OPENFILENAME ofn;
LPOLESTR pszFileNameW;
LPMONIKER pmk;
:

// Get file name from OpenFile Common Dialog. The ANSI file name will
// be placed in ofn.lpstrFile
GetOpenFileName(&ofn);
:
AnsiToUnicode(ofn.lpstrFile, &pszFileNameW);
CreateFileMoniker(pszFileNameW, &pmk);
CoTaskMemFree(pszFileNameW);

// The following code implements IOleInPlaceFrame::SetStatusText.
// The pszStatusTextW string, which is received from another OLE
// component, uses Unicode. The string is converted to ANSI before it is
// passed to the ANSI version of SetWindowText. Windows 95 supports only
// the ANSI version of SetWindowText.

STDMETHODIMP COleInPlaceFrame::SetStatusText(LPCOLESTR pszStatusTextW)
{
    LPSTR pszStatusTextA;

    UnicodeToAnsi(pszStatusTextW, &pszStatusTextA);
    SetWindowText(m_hwndStatus, pszStatusTextA);
    CoTaskMemFree(pszStatusTextA);
    return S_OK;
}
NOTE: See the comments in AnsiToUnicode and UnicodeToAnsi regarding the allocator that is used to allocate the converted string. CoTaskMemAlloc (the OLE allocator) is required only if the resultant string will be passed to another OLE component and that component can free the string. This means that strings that are passed as in-parameters to OLE interface methods need not use the OLE allocator. Strings that are passed as in-out-parameters or returned through out-parameters must be allocated using the OLE allocator.
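The ownership rule in this note can be illustrated without OLE. The following is a minimal sketch, not OLE code: SharedAlloc and SharedFree are hypothetical stand-ins for CoTaskMemAlloc and CoTaskMemFree (implemented here with malloc and free), and GetDisplayName and CountCharacters are hypothetical functions that play the role of interface methods.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>

// Stand-ins for the shared OLE allocator. Caller and callee must agree on
// this pair for any string that one side allocates and the other side frees.
inline void* SharedAlloc(std::size_t cb) { return std::malloc(cb); }
inline void  SharedFree(void* p)         { std::free(p); }

// Out-parameter: the callee allocates and the caller frees, so the string
// must come from the shared allocator.
bool GetDisplayName(wchar_t** ppszName)
{
    assert(ppszName != nullptr);
    static const wchar_t kName[] = L"har.doc";

    *ppszName = static_cast<wchar_t*>(SharedAlloc(sizeof(kName)));
    if (*ppszName == nullptr)
        return false;

    // Copy the characters including the terminating null.
    std::wmemcpy(*ppszName, kName, sizeof(kName) / sizeof(kName[0]));
    return true;
}

// In-parameter: the callee only reads the string, so the caller may keep it
// in any storage (stack, static, or heap) and remains responsible for it.
std::size_t CountCharacters(const wchar_t* pszName)
{
    return pszName ? std::wcslen(pszName) : 0;
}
```

A caller would release the string from GetDisplayName with SharedFree once it is done with it, but could pass a local array to CountCharacters without involving the shared allocator at all.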

String constants can be converted to Unicode at compile time by using the OLESTR macro. For example:
CreateFileMoniker(OLESTR("c:\\boo\\har.doc"), &pmk);				
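To show what a compile-time conversion of this kind amounts to, here is a small sketch that can be built without the OLE headers. WIDE_LITERAL is a hypothetical stand-in for OLESTR that pastes the L prefix onto a string literal, which is assumed to be how OLESTR behaves on Win32.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical stand-in for OLESTR: token-pastes the L prefix onto a literal.
#define WIDE_LITERAL(str) L##str

// The literal is stored as wide characters by the compiler. Nothing is
// converted or allocated when the program runs.
static const wchar_t kDocPath[] = WIDE_LITERAL("c:\\boo\\har.doc");

// The array length (14 characters plus the terminator) is known at compile time.
static_assert(sizeof(kDocPath) / sizeof(kDocPath[0]) == 15,
              "wide literal has 14 characters plus a terminating null");

std::size_t DocPathLength()
{
    return std::wcslen(kDocPath);
}
```

Because no run-time work is involved, this approach applies only to string constants. Strings obtained at run time still need one of the conversion routines.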
Another example of ANSI/Unicode conversion routines can be found in the Microsoft Foundation Classes (MFC) source code, which ships with the Visual C++ 4.0 compiler. These routines are described in MFC Technote 59: 'Using MFC MBCS/Unicode Conversion Macros'. The definitions of the macros OLE2T, T2OLE, OLE2CT, T2COLE, A2W, W2A, A2CW, W2CA and USES_CONVERSION are in \msdev\mfc\include\afxpriv.h. Also see AfxA2WHelper and AfxW2AHelper, and the use of OLE2T, T2OLE, OLE2CT and T2COLE, in the MFC source code in \msdev\mfc\src. These macros allow code to be compiled either for Unicode or ANSI depending on whether the _UNICODE preprocessor definition has been made. For example, the CreateFileMoniker call in the above example can be made as follows with the MFC macros:
USES_CONVERSION;
GetOpenFileName(&ofn);
CreateFileMoniker(T2OLE(ofn.lpstrFile), &pmk);
If _UNICODE is defined, T2OLE is defined as follows:
inline LPOLESTR T2OLE(LPTSTR lp) { return lp; }				
If _UNICODE is not defined, T2OLE is defined as follows:
#define T2OLE(lpa) A2W(lpa)				
T in T2OLE indicates that the type being converted to an OLE string (Unicode string) is a Unicode string when _UNICODE is defined and an ANSI string when _UNICODE is not defined. Similarly, LPTSTR is defined as a pointer to a Unicode string when _UNICODE is defined and as a pointer to an ANSI string when _UNICODE is not defined. T2OLE doesn't do any conversion when _UNICODE is defined (LPTSTR == LPOLESTR). When _UNICODE is not defined, A2W is called. A2W converts an ANSI string to Unicode as follows:
#define A2W(lpa) (\
    ((LPCSTR)lpa == NULL) ? NULL : (\
        _convert = (strlen(lpa)+1),\
        AfxA2WHelper((LPWSTR) alloca(_convert*2), lpa, _convert)\
    )\
)
AfxA2WHelper uses MultiByteToWideChar to do the conversion.

The MFC conversion macros use _alloca to allocate space from the program stack for the converted string. The space is automatically deallocated when the procedure call has completed. OLE requires the OLE allocator to be used for all strings (data) that will be allocated by one component and freed by another. This means that strings passed through out-parameters and in-out-parameters of OLE interfaces must be allocated with the OLE allocator. In-parameters need not be allocated with the OLE allocator because the caller is responsible for freeing them. Most Linking/Embedding OLE interfaces and APIs pass strings as in-parameters. Consequently, the MFC conversion macros can be used in most cases. The MFC conversion macros cannot be used for in-out-parameters or for returning values through out-parameters because they do not allocate space using the OLE allocator. AnsiToUnicode and UnicodeToAnsi can be used in these cases.
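The lifetime argument above can be sketched in portable code. The example below is not how the MFC macros are implemented: it swaps _alloca for a fixed-size local array and uses the standard C function mbstowcs instead of MultiByteToWideChar. UseName and CallWithStackConversion are hypothetical names, and the 259-character limit is an arbitrary assumption for the sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <cwchar>

// Hypothetical consumer that takes a wide string as an in-parameter.
// It reads the string during the call and does not keep the pointer.
std::size_t UseName(const wchar_t* pszName)
{
    assert(pszName != nullptr);
    return std::wcslen(pszName);
}

std::size_t CallWithStackConversion(const char* pszA)
{
    const std::size_t kMaxChars = 259;   // arbitrary limit for this sketch
    wchar_t buffer[kMaxChars + 1];       // lives on this function's stack

    // Convert at most kMaxChars characters; longer input is truncated.
    std::size_t n = std::mbstowcs(buffer, pszA, kMaxChars);
    if (n == static_cast<std::size_t>(-1))
        return 0;                        // invalid multibyte sequence
    buffer[n] = L'\0';                   // mbstowcs does not terminate on truncation

    // Safe: the buffer outlives the call to UseName. Returning the buffer
    // through an out-parameter would hand the caller a dangling pointer.
    return UseName(buffer);
}
```

This is why stack-based conversion suits in-parameters only: the converted string disappears when the converting function returns, so it can never be something another component is expected to free.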

Yet another set of Unicode/ANSI conversion routines can be found in Don Box's column on OLE in Microsoft Systems Journal, August 1995, Vol. 10 No. 8, Page 86. Don Box defines a C++ class with a cast operator which will return a Unicode/ANSI converted string. The allocated space is automatically freed when the object goes out of scope. This class can be modified to allocate using the OLE allocator and to not free the allocated space for strings that are passed through in-out or out-parameters.

One of the classes from Don Box's column, String16, which converts an ANSI string to Unicode, is shown below. Another class, String8, that is similar to this one is used for Unicode to ANSI conversion. The CreateFileMoniker call from the previous example can be made as follows with this class:
GetOpenFileName(&ofn);
CreateFileMoniker(String16(ofn.lpstrFile), &pmk);
In the above code, an instance of String16 is created. The constructor of the class will convert the ANSI string to Unicode. The language implementation will call the cast operator, operator const wchar_t *, to cast this parameter to the type of CreateFileMoniker's first parameter. The cast operator will return the Unicode string, which is passed to CreateFileMoniker. The object will be destroyed when it goes out of scope.
// String16 ////////////////////////////////////////////////////////
// Shim class that converts both 8-bit (foreign) and
// 16-bit (native) strings to 16-bit wideness

class String16 {
public:
// native and foreign constructors
    String16(const char *p8);
    String16(const wchar_t *p16);

// non-virtual destructor (this class is concrete)
    ~String16(void);

// native conversion operator
    operator const wchar_t * (void) const;

private:
// native wideness string
    wchar_t *m_sz;
// is foreign??
    BOOL m_bIsForeign;

// protect against assignment!
    String16(const String16&);
    String16& operator=(const String16&);
};

// native constructor is a pass-through
inline String16::String16(const wchar_t *p16)
: m_sz((wchar_t *)p16), m_bIsForeign(FALSE)
{
}

// simply give out the native wideness string
inline String16::operator const wchar_t * (void) const
{
    return m_sz;
}

// foreign constructor requires allocation of a native
// string and conversion
inline String16::String16(const char *p8)
: m_bIsForeign(TRUE)
{
// calculate string length
    size_t len = strlen(p8);

// calculate required buffer size (some characters may
// already occupy 16-bits under DBCS)
    size_t size = mbstowcs(0, p8, len) + 1;

// alloc native string and convert
    if (m_sz = new wchar_t[size])
        mbstowcs(m_sz, p8, size);
}

// delete native string only if synthesized in foreign constructor
inline String16::~String16(void) {
    if (m_bIsForeign)
        delete[] m_sz;
}
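The String8 class is mentioned above but not listed. The following is a sketch of what an 8-bit counterpart could look like, written by analogy with String16. It is an assumption for illustration and is not the class from Don Box's column. It uses wcstombs with a null destination to obtain the required size, which is POSIX and Microsoft C run-time behavior.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Sketch of an 8-bit counterpart to String16 (hypothetical, by analogy only).
class String8 {
public:
    // native constructor is a pass-through
    String8(const char* p8)
        : m_sz(const_cast<char*>(p8)), m_bIsForeign(false) {}

    // foreign constructor allocates an 8-bit string and converts
    String8(const wchar_t* p16)
        : m_sz(nullptr), m_bIsForeign(true)
    {
        assert(p16 != nullptr);
        // Ask for the required byte count (excluding the terminator).
        std::size_t cb = std::wcstombs(nullptr, p16, 0);
        if (cb == static_cast<std::size_t>(-1))
            return;                       // unconvertible: m_sz stays null
        m_sz = new char[cb + 1];
        std::wcstombs(m_sz, p16, cb + 1);
    }

    // delete the string only if it was synthesized in the foreign constructor
    ~String8() { if (m_bIsForeign) delete[] m_sz; }

    // native conversion operator
    operator const char*() const { return m_sz; }

    // protect against copying and assignment
    String8(const String8&) = delete;
    String8& operator=(const String8&) = delete;

private:
    char* m_sz;          // 8-bit string (owned only when m_bIsForeign)
    bool  m_bIsForeign;  // true when m_sz was allocated by this object
};
```

With a class of this shape, the SetStatusText example could pass String8(pszStatusTextW) directly to an ANSI function, and the temporary would free its buffer at the end of the full expression.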

Article ID: 138813 – Last Review: 06/22/2014 17:58:00 – Revision: 4.0
