
Introduction

COM (Component Object Model) is the popular TLA (three-letter acronym) that seems to be everywhere in the Windows world these days. There are tons of new technologies coming out all the time, all based on COM. The documentation throws around lots of terms like COM object, interface, server, and so on, but it all assumes you're familiar with how COM works and how to use it.

This article introduces COM from the beginning, describes the underlying mechanisms involved, and shows how to use COM objects provided by others (specifically, the Windows shell). By the end of the article, you will be able to use the COM objects built into Windows and provided by third parties.

This article assumes you are proficient in C++. I use a little bit of MFC and ATL in the sample code, but I will explain the code thoroughly, so you should be able to follow along if you are not familiar with MFC or ATL.

The sections in this article are:

COM - What Exactly Is It? - A quick introduction to the COM standard, and the problems it was created to solve. You don't need to know this to use COM, but I'd still recommend reading it to get an understanding of why things are done the way they are in COM.
Definitions of the Basic Elements - COM terminology and descriptions of what those terms represent.
Working with COM Objects - An overview of how to create, use, and destroy COM objects.
The Base Interface - IUnknown - A description of the methods in the base interface, IUnknown.
Pay Close Attention - String Handling - How to handle strings in COM code.
Bringing it All Together - Sample Code - Two sets of sample code that illustrate all the concepts discussed in the article.
Handling HRESULTs - A description of the HRESULT type and how to test for error and success codes.
References - Books you should expense if your employer will let you. :)

COM - What exactly is it?


COM is, simply put, a method for sharing binary code across different applications and languages. This is unlike the C++ approach, which promotes reuse of source code. ATL is a perfect example of this. While source-level reuse works fine, it only works for C++. It also introduces the possibility of name collisions, not to mention bloat from having multiple copies of the code in your projects.

Windows lets you share code at the binary level using DLLs. After all, that's how Windows apps function - reusing kernel32.dll, user32.dll, etc. But since the DLLs are written to a C interface, they can only be used by C or languages that understand the C calling convention. This puts the burden of sharing on the programming language implementer, instead of on the DLL itself.

MFC introduced another binary sharing mechanism with MFC extension DLLs. But these are even more restrictive - you can only use them from an MFC app.

COM solves all these problems by defining a binary standard, meaning that COM specifies that the binary modules (the DLLs and EXEs) must be compiled to match a specific structure. The standard also specifies exactly how COM objects must be organized in memory. The binaries must also not depend on any feature of any programming language (such as name decoration in C++). Once that's done, the modules can be accessed easily from any programming language. A binary standard puts the burden of compatibility on the compiler that produces the binaries, which makes it much easier for the folks who come along later and need to use those binaries.

The structure of COM objects in memory just happens to use the same structure that is used by C++ virtual functions, so that's why a lot of COM code uses C++. But remember, the language that the module is written in is irrelevant, because the resulting binary is usable by all languages.

Incidentally, COM is not Win32-specific. It could, in theory, be ported to Unix or any other OS. However, I have never seen COM mentioned outside of the Windows world.
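The remark about C++ virtual functions can be made concrete with a small, portable sketch. This is not real COM code: the names CounterObject, CounterVtbl, and MakeCounter are invented for illustration, and the layout shown (an object whose first member points to a table of function pointers) is only a simplified picture of the kind of thing a binary standard pins down. The point is that a caller needs nothing but the table layout, so any language that can follow a pointer and call through a function pointer could use the object.

```cpp
#include <cassert>

struct CounterObject;  // hypothetical object type, for illustration only

// The "vtable": a plain table of function pointers in a fixed order.
struct CounterVtbl {
    int (*Increment)(CounterObject* self);
    int (*GetValue)(const CounterObject* self);
};

// The object: its first member points at the table, followed by its data.
struct CounterObject {
    const CounterVtbl* vtbl;
    int value;
};

// Implementations that the table entries point to.
int CounterIncrement(CounterObject* self) {
    assert(self != nullptr);
    return ++self->value;
}

int CounterGetValue(const CounterObject* self) {
    assert(self != nullptr);
    return self->value;
}

// One table shared by every CounterObject.
const CounterVtbl kCounterVtbl = { CounterIncrement, CounterGetValue };

// Creates an object wired to the shared table, starting at zero.
CounterObject MakeCounter() {
    return CounterObject{ &kCounterVtbl, 0 };
}
```

A caller would write c.vtbl->Increment(&c), which is roughly what a C++ compiler generates behind the scenes for a virtual call.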

Definitions of the Basic Elements


Let's go from the bottom up. An interface is simply a group of functions. Those functions are called methods. Interface names start with I, for example IShellLink. In C++, an interface is written as an abstract base class that has only pure virtual functions.

Interfaces may inherit from other interfaces. Inheritance works just like single inheritance in C++. Multiple inheritance is not allowed with interfaces.

A coclass (short for component object class) is contained in a DLL or EXE, and contains the code behind one or more interfaces. The coclass is said to implement those interfaces. A COM object is an instance of a coclass in memory. Note that a COM "class" is not the same as a C++ "class", although it is often the case that the implementation of a COM class is a C++ class.

A COM server is a binary (DLL or EXE) that contains one or more coclasses.

Registration is the process of creating registry entries that tell Windows where a COM server is located. Unregistration is the opposite - removing those registry entries.

A GUID (rhymes with "fluid", stands for globally unique identifier) is a 128-bit number. GUIDs are COM's language-independent way of identifying things. Each interface and coclass has a GUID. Since GUIDs are unique throughout the world, name collisions are avoided (as long as you use the COM API to create them). You will also see the term UUID (which stands for universally unique identifier) at times. UUIDs and GUIDs are, for all practical purposes, the same.

A class ID, or CLSID, is a GUID that names a coclass. An interface ID, or IID, is a GUID that names an interface.

There are two reasons GUIDs are used so extensively in COM:

1. GUIDs are just numbers under the hood, and any programming language can handle them.
2. Every GUID created, by anyone on any machine, is unique when created properly. Therefore, COM developers can create GUIDs on their own with no chance of two developers choosing the same GUID. This eliminates the need for a central authority to issue GUIDs.

An HRESULT is an integral type used by COM to return error and success codes. It is not a "handle" to anything, despite the H prefix. I'll have more to say about HRESULTs and how to test them later on.

Finally, the COM library is the part of the OS that you interact with when doing COM-related stuff. Often, the COM library is referred to as just "COM," but I will not do that here, to avoid confusion.
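The interface, coclass, and GUID definitions above can be pictured in plain C++. The following is a minimal sketch under stated assumptions: Guid is a portable stand-in for the Windows GUID structure, and IGreeter, Greeter, and the IID_IGreeter value are all made up for illustration (a real IID would be generated with a GUID tool, never typed by hand). It also leaves out IUnknown, which every real COM interface derives from.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Portable stand-in for the Windows GUID struct: 128 bits in total.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};
static_assert(sizeof(Guid) == 16, "a GUID is a 128-bit number");

// Two GUIDs name the same thing only if all 128 bits match.
bool operator==(const Guid& a, const Guid& b) {
    return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
           std::memcmp(a.data4, b.data4, sizeof a.data4) == 0;
}

// Hypothetical interface: an abstract class with only pure virtual methods.
struct IGreeter {
    virtual std::string Greet(const std::string& name) = 0;
protected:
    ~IGreeter() = default;  // callers never delete through an interface
};

// Made-up IID naming the interface (illustrative value only).
const Guid IID_IGreeter = { 0x12345678, 0x1234, 0x5678,
                            { 0x90, 0xAB, 0xCD, 0xEF, 0x01, 0x23, 0x45, 0x67 } };

// Hypothetical coclass: the code behind the interface.
class Greeter : public IGreeter {
public:
    std::string Greet(const std::string& name) override {
        assert(!name.empty());
        return "Hello, " + name;
    }
};
```

An instance of Greeter held through an IGreeter pointer plays the role of a COM object in this picture: the caller sees only the interface, never the class behind it.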

Working with COM Objects


Every language has its own way of dealing with objects. For example, in C++ you create them on the stack, or use new to dynamically allocate them. Since COM must be language-neutral, the COM library provides its own object-management routines. A comparison of COM and C++ object management is listed below:

Creating a new object

In C++, use operator new or create an object on the stack.
In COM, call an API in the COM library.

Deleting objects

In C++, use operator delete or let a stack object go out of scope.
In COM, all objects keep their own reference counts. The caller must tell the object when the caller is done using the object. COM objects free themselves from memory when the reference count reaches 0.
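The reference-counting rule can be sketched in portable C++. This is only an illustration of the idea, not how any particular COM object is written: the class name RefCounted and the destroyedFlag parameter are invented here so that the moment of destruction can be observed, and the counter is not thread-safe.

```cpp
#include <cassert>

// Simplified stand-in for a COM object that manages its own lifetime.
class RefCounted {
public:
    // A new object starts with one reference, owned by its creator.
    explicit RefCounted(bool* destroyedFlag)
        : refs_(1), destroyed_(destroyedFlag) {}

    // Called when another copy of the pointer will be kept.
    unsigned long AddRef() { return ++refs_; }

    // Called when a holder is done; the object frees itself at zero.
    unsigned long Release() {
        assert(refs_ > 0);
        const unsigned long remaining = --refs_;
        if (remaining == 0) {
            delete this;
        }
        return remaining;
    }

private:
    // Private destructor: callers cannot use delete, only Release().
    ~RefCounted() {
        if (destroyed_ != nullptr) {
            *destroyed_ = true;
        }
    }

    unsigned long refs_;
    bool* destroyed_;
};
```

Making the destructor private mirrors the COM rule stated above: the caller never deletes the object directly, it only reports that it is finished with it.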

Now, in between those two stages of creating and destroying the object, you actually have to use it. When you create a COM object, you tell the COM library what interface you need. If the object is created successfully, the COM library returns a pointer to the requested interface. You can then call methods through that pointer, just as if it were a pointer to a regular C++ object.

Creating a COM object


To create a COM object and get an interface from the object, you call the COM library API CoCreateInstance(). The prototype for CoCreateInstance() is:

HRESULT CoCreateInstance (
    REFCLSID  rclsid,
    LPUNKNOWN pUnkOuter,
    DWORD     dwClsContext,
    REFIID    riid,
    LPVOID*   ppv );

The parameters are:


rclsid

The CLSID of the coclass. For example, you can pass CLSID_ShellLink to create a COM object used to create shortcuts.
pUnkOuter

This is only used when aggregating COM objects, which is a way of taking an existing coclass and adding new methods to it. For our purposes, we can just pass NULL to indicate we're not using aggregation.
dwClsContext

Indicates what kind of COM servers we want to use. For this article, we will always be using the simplest kind of server, an in-process DLL, so we'll pass CLSCTX_INPROC_SERVER. One caveat: you should not use CLSCTX_ALL (which is the default in ATL) because it will fail on Windows 95 systems that do not have DCOM installed.
riid

The IID of the interface you want returned. For example, you can pass IID_IShellLink to get a pointer to an IShellLink interface.
ppv

Address of an interface pointer. The COM library returns the requested interface through this parameter.

When you call CoCreateInstance(), it handles looking up the CLSID in the registry, reading the location of the server, loading the server into memory, and creating an instance of the coclass you requested. Here's a sample call, which instantiates a CLSID_ShellLink object and requests an IShellLink interface pointer to that COM object.

HRESULT     hr;
IShellLink* pISL;

hr = CoCreateInstance ( CLSID_ShellLink,        // CLSID of coclass
                        NULL,                   // not used - aggregation
                        CLSCTX_INPROC_SERVER,   // type of server
                        IID_IShellLink,         // IID of interface
                        (void**) &pISL );       // Pointer to our interface pointer

if ( SUCCEEDED ( hr ) )
{
    // Call methods using pISL here.
}
else
{
    // Couldn't create the COM object. hr holds the error code.
}

First we declare an HRESULT to hold the return from CoCreateInstance() and an IShellLink pointer. We call CoCreateInstance() to create a new COM object. The SUCCEEDED macro returns TRUE if hr holds a code indicating success, or FALSE if hr indicates failure. There is a corresponding macro FAILED that tests for a failure code.
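What the two macros test can be shown without any Windows headers. The sketch below uses stand-in names (HResult, Succeeded, Failed, FirstFailure, and the k-prefixed constants are all mine, not part of the Windows SDK); it assumes only that success codes are nonnegative and failure codes are negative, and that S_OK is 0, S_FALSE is 1, and E_FAIL is 0x80004005.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using HResult = std::int32_t;  // stand-in for the Windows HRESULT type

constexpr HResult kOk    = 0;                                  // value of S_OK
constexpr HResult kFalse = 1;                                  // value of S_FALSE
constexpr HResult kFail  = static_cast<HResult>(0x80004005u);  // value of E_FAIL

// Same idea as SUCCEEDED and FAILED: only the sign of the value matters.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr)    { return hr < 0; }

// Returns the first failing code in an array of results, or kOk if none failed.
HResult FirstFailure(const HResult* results, std::size_t count) {
    assert(results != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (Failed(results[i])) {
            return results[i];
        }
    }
    return kOk;
}
```

Note that a code such as S_FALSE still counts as success, which is one reason to test with SUCCEEDED or FAILED rather than comparing against S_OK.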

Deleting a COM object


As stated before, you don't free COM objects, you just tell them that you're done using them. The IUnknown interface, which every COM object implements, has a method Release(). You call this method to tell the COM object that you no longer need it. Once you call Release(), you must not use the interface pointer any more, since the COM object may disappear from memory at any time.

If your app uses a lot of different COM objects, it's vitally important to call Release() whenever you're done using an interface. If you don't release interfaces, the COM objects (and the DLLs that contain the code) will remain in memory, and will needlessly add to your app's working set. If your app will be running for a long time, you should call the CoFreeUnusedLibraries() API during your idle processing. This API unloads any COM servers that have no outstanding references, so this also reduces your app's memory usage.

Continuing the above example, here's how you would use Release():

// Create COM object as above. Then...

if ( SUCCEEDED ( hr ) )
{
    // Call methods using pISL here.

    // Tell the COM object that we're done with it.
    pISL->Release();
}

The IUnknown interface is explained fully in the next section.

The Base Interface - IUnknown


Every COM interface is derived from IUnknown. The name is a bit misleading, in that it's not an unknown interface. The name signifies that if you have an IUnknown pointer to a COM object, you don't know what the underlying object is, since every COM object implements IUnknown.
IUnknown has three methods:

1. AddRef() - Tells the COM object to increment its reference count. You would use this method if you made a copy of an interface pointer, and both the original and the copy would still be used. We won't need to use AddRef() for our purposes in this article.
2. Release() - Tells the COM object to decrement its reference count. See the previous example for a code snippet demonstrating Release().
3. QueryInterface() - Requests an interface pointer from a COM object. You use this when a coclass implements more than one interface.

We've already seen Release() in action, but what about QueryInterface()? When you create a COM object with CoCreateInstance(), you get an interface pointer back. If the COM object implements more than one interface (not counting IUnknown), you use QueryInterface() to get any additional interface pointers that you need. The prototype of QueryInterface() is:

HRESULT IUnknown::QueryInterface (
    REFIID iid,
    void** ppv );

The parameters are:


iid

The IID of the interface you're requesting.


ppv

Address of an interface pointer. QueryInterface() returns the interface through this parameter if it is successful.

Let's continue our shell link example. The coclass for making shell links implements IShellLink and IPersistFile. If you already have an IShellLink pointer, pISL, you can request an IPersistFile interface from the COM object with code like this:

HRESULT       hr;
IPersistFile* pIPF;

hr = pISL->QueryInterface ( IID_IPersistFile, (void**) &pIPF );

You then test hr with the SUCCEEDED macro to determine if QueryInterface() worked. If it succeeded, you can then use the new interface pointer, pIPF, just like any other interface. You must also call pIPF->Release() to tell the COM object that you're done using the interface.
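To show what happens on the other side of that call, here is a portable sketch of an object that answers QueryInterface() for two interfaces. Everything in it is a simplified stand-in: the Iid enum replaces real IIDs, the method returns bool instead of an HRESULT, and the names IUnknownSketch, ILinkSketch, IPersistFileSketch, and ShortcutSketch are invented for illustration. It is not how the shell's own coclass is implemented.

```cpp
#include <cassert>
#include <string>

enum class Iid { Unknown, Link, PersistFile };  // stand-ins for real IIDs

struct IUnknownSketch {
    virtual bool QueryInterface(Iid iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownSketch() = default;
};

struct ILinkSketch : IUnknownSketch {
    virtual void SetPath(const std::string& path) = 0;
protected:
    ~ILinkSketch() = default;
};

struct IPersistFileSketch : IUnknownSketch {
    virtual std::string Save(const std::string& fileName) const = 0;
protected:
    ~IPersistFileSketch() = default;
};

// One object, two interfaces; each interface pointer shares one reference count.
class ShortcutSketch final : public ILinkSketch, public IPersistFileSketch {
public:
    bool QueryInterface(Iid iid, void** ppv) override {
        assert(ppv != nullptr);
        if (iid == Iid::Unknown || iid == Iid::Link) {
            *ppv = static_cast<ILinkSketch*>(this);
        } else if (iid == Iid::PersistFile) {
            *ppv = static_cast<IPersistFileSketch*>(this);
        } else {
            *ppv = nullptr;
            return false;  // interface not supported
        }
        AddRef();  // the caller now owns one more reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }
    void SetPath(const std::string& path) override { path_ = path; }
    std::string Save(const std::string& fileName) const override {
        return fileName + " -> " + path_;  // pretend to write the shortcut
    }
private:
    ~ShortcutSketch() = default;
    unsigned long refs_ = 1;
    std::string path_;
};
```

Because a successful QueryInterface() adds a reference, the caller in this sketch has to call Release() once on each pointer it holds, which is the same rule the text gives for pIPF and pISL.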

Pay Close Attention - String Handling


I need to make a detour for a few moments, and discuss how to handle strings in COM code. If you are familiar with how Unicode and ANSI strings work, and know how to convert between the two, then you can skip this section. Otherwise, read on.

Whenever a COM method returns a string, that string will be in Unicode. (Well, all methods that are written to the COM spec, that is!) Unicode is a character encoding scheme, like ASCII, only on Windows each character is stored in 2-byte units (wchar_t) instead of single bytes. If you want to get the string into a more manageable state, you should convert it to a TCHAR string.

TCHAR and the _t functions (for example, _tcscpy()) are designed to let you handle Unicode and ANSI strings with the same source code. In most cases, you'll be writing code that uses ANSI strings and the ANSI Windows APIs, so for the rest of this article, I will refer to chars instead of TCHARs, just for simplicity. You should definitely read up on the TCHAR types, though, to be aware of them in case you ever come across them in code written by others.

When you get a Unicode string back from a COM method, you can convert it to a char string in one of several ways:

1. Call the WideCharToMultiByte() API.
2. Call the CRT function wcstombs().
3. Use the CString constructor or assignment operator (MFC only).
4. Use an ATL string conversion macro.

WideCharToMultiByte()
You can convert a Unicode string to an ANSI string with the WideCharToMultiByte() API. This API's prototype is:

int WideCharToMultiByte (
    UINT    CodePage,
    DWORD   dwFlags,
    LPCWSTR lpWideCharStr,
    int     cchWideChar,
    LPSTR   lpMultiByteStr,
    int     cbMultiByte,
    LPCSTR  lpDefaultChar,
    LPBOOL  lpUsedDefaultChar );

The parameters are:


CodePage

The code page to convert the Unicode characters into. You can pass CP_ACP to use the current ANSI code page. Code pages are sets of 256 characters. Characters 0-127 are always identical to the ASCII encoding. Characters 128-255 differ, and can contain graphics or letters with diacritics. Each language or region has its own code page, so it's important to use the right code page to get proper display of accented characters.
dwFlags

dwFlags determine how Windows deals with "composite" Unicode characters, which are a letter followed by a diacritic. An example of a composite character is è. If this character is in the code page specified in CodePage, then nothing special happens. However, if it is not in the code page, Windows has to convert it to something else. Passing WC_COMPOSITECHECK makes the API check for non-mapping composite characters. Passing WC_SEPCHARS makes Windows break the character into two, the letter followed by the diacritic, for example e`. Passing WC_DISCARDNS makes Windows discard the diacritics. Passing WC_DEFAULTCHAR makes Windows replace the composite characters with a "default" character, specified in the lpDefaultChar parameter. The default behavior is WC_SEPCHARS.
lpWideCharStr

The Unicode string to convert.
cchWideChar

The length of lpWideCharStr in Unicode characters. You will usually pass -1, which indicates that the string is zero-terminated.
lpMultiByteStr

A char buffer that will hold the converted string.


cbMultiByte

The size of lpMultiByteStr, in bytes.


lpDefaultChar

Optional - a one-character ANSI string that contains the "default" character to be inserted when dwFlags contains WC_COMPOSITECHECK | WC_DEFAULTCHAR and a Unicode character cannot be mapped to an equivalent ANSI character. You can pass NULL to have the API use a system default character (which as of this writing is a question mark).
lpUsedDefaultChar

Optional - a pointer to a BOOL that will be set to indicate if the default char was ever inserted into the ANSI string. You can pass NULL if you don't care about this information.

Whew, a lot of boring details! Like always, the docs make it seem much more complicated than it really is. Here's an example showing how to use the API:

// Assuming we already have a Unicode string wszSomeString...
char szANSIString [MAX_PATH];

WideCharToMultiByte ( CP_ACP,                // ANSI code page
                      WC_COMPOSITECHECK,     // Check for accented characters
                      wszSomeString,         // Source Unicode string
                      -1,                    // -1 means string is zero-terminated
                      szANSIString,          // Destination char string
                      sizeof(szANSIString),  // Size of buffer
                      NULL,                  // No default character
                      NULL );                // Don't care about this flag

After this call, szANSIString will contain the ANSI version of the Unicode string.

wcstombs()
The CRT function wcstombs() is a bit simpler, but it just ends up calling WideCharToMultiByte(), so in the end the results are the same. The prototype for wcstombs() is:

size_t wcstombs (
    char*          mbstr,
    const wchar_t* wcstr,
    size_t         count );

The parameters are:


mbstr

A char buffer to hold the resulting ANSI string.


wcstr

The Unicode string to convert.


count

The size of the mbstr buffer, in bytes.


wcstombs() uses the WC_COMPOSITECHECK | WC_SEPCHARS flags in its call to WideCharToMultiByte(). To reuse the earlier example, you can convert a Unicode string with code like this:

wcstombs ( szANSIString, wszSomeString, sizeof(szANSIString) );
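Since wcstombs() is part of the standard C library, the call above can be wrapped in a small portable helper. The sketch below is one cautious way to do it, not the only one: the helper name ToNarrow and the 260-byte buffer (chosen to match the usual value of MAX_PATH) are my own assumptions. It reserves one byte for the terminator, because wcstombs() does not write a terminating zero when the converted text fills the whole buffer, and it checks for the error return value of (size_t)-1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Converts a zero-terminated wide string to a narrow string using the
// current C locale. Returns an empty string if a character cannot be
// converted. Input longer than the buffer is truncated.
std::string ToNarrow(const wchar_t* wide) {
    assert(wide != nullptr);

    char buffer[260];  // hypothetical size, same as the usual MAX_PATH

    // Leave room for the terminator: wcstombs() omits it when the
    // output exactly fills the space it was given.
    const std::size_t written = std::wcstombs(buffer, wide, sizeof(buffer) - 1);
    if (written == static_cast<std::size_t>(-1)) {
        return std::string();  // conversion failed
    }

    buffer[written] = '\0';
    return std::string(buffer);
}
```

With plain ASCII input such as L"wallpaper.bmp", the helper returns the same characters as a narrow string; how other characters convert depends on the locale in effect.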

CString

The MFC CString class contains constructors and assignment operators that accept Unicode strings, so you can let CString do the conversion work for you. For example:

// Assuming we already have wszSomeString...
CString str1 ( wszSomeString );   // Convert with a constructor.
CString str2;

str2 = wszSomeString;             // Convert with an assignment operator.

ATL macros
ATL has a handy set of macros for converting strings. To convert a Unicode string to ANSI, use the W2A() macro (a mnemonic for "wide to ANSI"). Actually, to be more accurate, you should use OLE2A(), where the "OLE" indicates the string came from a COM or OLE source. Anyway, here's an example of how to use these macros.

#include <atlconv.h>

// Again assuming we have wszSomeString...
{
    char szANSIString [MAX_PATH];
    USES_CONVERSION;   // Declare local variable used by the macros.

    lstrcpy ( szANSIString, OLE2A(wszSomeString) );
}

The OLE2A() macro "returns" a pointer to the converted string, but the converted string is stored in a temporary stack variable, so we need to make our own copy of it with lstrcpy(). Other macros you should look into are W2T() (Unicode to TCHAR), and W2CT() (Unicode string to const TCHAR string). There is an OLE2CA() macro (Unicode string to a const char string) which we could've used in the code snippet above. OLE2CA() is actually the correct macro for that situation, since the second parameter to lstrcpy() is a const char*, but I didn't want to throw too much at you at once.

Sticking with Unicode

On the other hand, you can just keep the string in Unicode if you won't be doing anything complicated with the string. If you're writing a console app, you can print Unicode strings with the std::wcout global variable, for example:

wcout << wszSomeString;

But keep in mind that wcout expects all strings to be in Unicode, so if you have any "normal" strings, you'll still need to output them with std::cout. If you have string literals, prefix them with L to make them Unicode, for example:

wcout << L"The Oracle says..." << endl << wszOracleResponse;

If you keep a string in Unicode, there are a couple of restrictions:


You must use the wcsXXX() string functions, such as wcslen(), on Unicode strings.
With very few exceptions, you cannot pass a Unicode string to a Windows API on Windows 9x. To write code that will run on 9x and NT unchanged, you'll need to use the TCHAR types, as described in MSDN.

Bringing it All Together - Sample Code


Following are two examples that illustrate the COM concepts covered in the article. The code is also contained in the article's sample project.

Using a COM object with a single interface

The first example shows how to use a COM object that exposes a single interface. This is the simplest case you'll ever encounter. The code uses the Active Desktop coclass contained in the shell to retrieve the filename of the current wallpaper. You will need to have the Active Desktop installed for this code to work. The steps involved are:

1. Initialize the COM library.
2. Create a COM object used to interact with the Active Desktop, and get an IActiveDesktop interface.
3. Call the GetWallpaper() method of the COM object.
4. If GetWallpaper() succeeds, print the filename of the wallpaper.
5. Release the interface.
6. Uninitialize the COM library.

WCHAR   wszWallpaper [MAX_PATH];
CString strPath;
HRESULT hr;
IActiveDesktop* pIAD;

// 1. Initialize the COM library (make Windows load the DLLs). Normally you would
// call this in your InitInstance() or other startup code. In MFC apps, use
// AfxOleInit() instead.
CoInitialize ( NULL );

// 2. Create a COM object, using the Active Desktop coclass provided by the shell.
// The 4th parameter tells COM what interface we want (IActiveDesktop).
hr = CoCreateInstance ( CLSID_ActiveDesktop,
                        NULL,
                        CLSCTX_INPROC_SERVER,
                        IID_IActiveDesktop,
                        (void**) &pIAD );

if ( SUCCEEDED(hr) )
{
    // 3. If the COM object was created, call its GetWallpaper() method.
    hr = pIAD->GetWallpaper ( wszWallpaper, MAX_PATH, 0 );

    if ( SUCCEEDED(hr) )
    {
        // 4. If GetWallpaper() succeeded, print the filename it returned.
        // Note that I'm using wcout to display the Unicode string wszWallpaper.
        // wcout is the Unicode equivalent of cout.
        wcout << L"Wallpaper path is:\n    " << wszWallpaper << endl << endl;
    }
    else
    {
        cout << _T("GetWallpaper() failed.") << endl << endl;
    }

    // 5. Release the interface.
    pIAD->Release();
}
else
{
    cout << _T("CoCreateInstance() failed.") << endl << endl;
}

// 6. Uninit the COM library. In MFC apps, this is not necessary since MFC does
// it for us.
CoUninitialize();

In this sample, I used std::wcout to display the Unicode string wszWallpaper.

Using a COM object with multiple interfaces

The second example shows how to use QueryInterface() with a COM object that exposes multiple interfaces. The code uses the Shell Link coclass contained in the shell to create a shortcut to the wallpaper file that we retrieved in the last example. The steps involved are:

1. Initialize the COM library.
2. Create a COM object used to create shortcuts, and get an IShellLink interface.
3. Call the SetPath() method of the IShellLink interface.
4. Call QueryInterface() on the COM object and get an IPersistFile interface.
5. Call the Save() method of the IPersistFile interface.
6. Release the interfaces.
7. Uninitialize the COM library.

CString       sWallpaper = wszWallpaper;   // Convert the wallpaper path to ANSI
IShellLink*   pISL;
IPersistFile* pIPF;

// 1. Initialize the COM library (make Windows load the DLLs). Normally you would
// call this in your InitInstance() or other startup code. In MFC apps, use
// AfxOleInit() instead.
CoInitialize ( NULL );

// 2. Create a COM object, using the Shell Link coclass provided by the shell.
// The 4th parameter tells COM what interface we want (IShellLink).
hr = CoCreateInstance ( CLSID_ShellLink,
                        NULL,
                        CLSCTX_INPROC_SERVER,
                        IID_IShellLink,
                        (void**) &pISL );

if ( SUCCEEDED(hr) )
{
    // 3. Set the path of the shortcut's target (the wallpaper file).
    hr = pISL->SetPath ( sWallpaper );

    if ( SUCCEEDED(hr) )
    {
        // 4. Get a second interface (IPersistFile) from the COM object.
        hr = pISL->QueryInterface ( IID_IPersistFile, (void**) &pIPF );

        if ( SUCCEEDED(hr) )
        {
            // 5. Call the Save() method to save the shortcut to a file. The
            // first parameter is a Unicode string.
            hr = pIPF->Save ( L"C:\\wallpaper.lnk", FALSE );

            // 6a. Release the IPersistFile interface.
            pIPF->Release();
        }
    }

    // 6b. Release the IShellLink interface.
    pISL->Release();
}

// Printing of error messages omitted here.

// 7. Uninit the COM library. In MFC apps, this is not necessary since MFC
// does it for us.
CoUninitialize();

Handling HRESULTs
I've already shown some simple error handling, using the SUCCEEDED and FAILED macros. Now I'll give some more details on what to do with the HRESULTs returned from COM methods.

An HRESULT is a 32-bit signed integer, with nonnegative values indicating success, and negative values indicating failure. An HRESULT has three fields: the severity bit (to indicate success or failure), the facility code, and the status code. The "facility" indicates what component or program the HRESULT is coming from. Microsoft assigns facility codes to the various components, for example COM has one, the Task Scheduler has one, and so on. The "code" is a 16-bit field that has no intrinsic meaning; the codes are just an arbitrary association between a number and a meaning, just like the values returned by GetLastError().

If you look up error codes in the winerror.h file, you'll see a lot of HRESULTs listed, with the naming convention [facility]_[severity]_[description]. Generic HRESULTs that can be returned by any component (like E_OUTOFMEMORY) have no facility in their name. Examples:

REGDB_E_READREGDB: Facility = REGDB, for "registry database"; E = error; READREGDB is a description of the error (couldn't read the database).
S_OK: Facility = generic; S = success; OK is a description of the status (everything's OK).

Fortunately, there are easier ways to determine the meaning of an HRESULT than looking through winerror.h. HRESULTs for built-in facilities can be looked up with the Error Lookup tool. For example, say you forgot to call CoInitialize() before CoCreateInstance(). CoCreateInstance() will return a value of 0x800401F0. You can enter that value into Error Lookup and you'll see the description: "CoInitialize has not been called."

You can also look up HRESULT descriptions in the debugger. If you have an HRESULT variable called hres, you can view the description in the Watch window by entering "hres,hr" as the value to watch. The ",hr" tells VC to display the value as an HRESULT description.
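The three fields described in this section can also be pulled apart by hand. The sketch below is a portable illustration, with the names HResultParts, Decompose, and CheckDocumentedValue invented for it; the shifts and masks follow my understanding of the HRESULT_SEVERITY, HRESULT_FACILITY, and HRESULT_CODE macros in winerror.h, so treat the exact mask widths as an assumption to verify against your SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// The three fields of an HRESULT, split out for inspection.
struct HResultParts {
    bool     failed;    // severity bit (bit 31): set means failure
    unsigned facility;  // which component the code comes from
    unsigned code;      // 16-bit status code within that facility
};

// Splits a raw 32-bit HRESULT value into its fields.
HResultParts Decompose(std::uint32_t hr) {
    HResultParts parts;
    parts.failed   = ((hr >> 31) & 0x1u) != 0;
    parts.facility = (hr >> 16) & 0x1FFFu;
    parts.code     = hr & 0xFFFFu;
    return parts;
}

// Checks the value quoted in the text for a missing CoInitialize() call.
void CheckDocumentedValue() {
    const HResultParts parts = Decompose(0x800401F0u);
    assert(parts.failed);             // top bit set: an error
    assert(parts.facility == 4);      // facility field holds 4
    assert(parts.code == 0x01F0);     // low 16 bits
}
```

In day-to-day code the SUCCEEDED and FAILED macros are usually all you need; splitting out the facility is mostly useful when logging or when deciding which component to blame.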

The Factory Method (Creational) Design Pattern


Gopalan Suresh Raj

The Problem
One of the goals of object-oriented design is to delegate responsibility among different objects. This kind of partitioning is good since it encourages Encapsulation and Delegation.

Sometimes, an Application (or framework) cannot anticipate at runtime the class of object that it must create. The Application (or framework) may know that it has to instantiate classes, but it may only know about abstract classes (or interfaces), which it cannot instantiate. Thus the Application class may only know when it has to instantiate a new Object of a class, not what kind of subclass to create.
A class may want its subclasses to specify the objects to be created.
A class may delegate responsibility to one of several helper subclasses so that knowledge can be localized to specific helper subclasses.

The Solution
Factory Method is a creational pattern. This pattern helps to model an interface for creating an object which at creation time can let its subclasses decide which class to instantiate. We call this a Factory Pattern since it is responsible for "Manufacturing" an Object. It helps instantiate the appropriate Subclass by creating the right Object from a group of related classes. The Factory Pattern promotes loose coupling by eliminating the need to bind application-specific classes into the code.

Factories have a simple function: churn out objects. Obviously, a factory is not needed to make an object. A simple call to new will do it for you. However, the use of factories gives the programmer the opportunity to abstract the specific attributes of an Object into specific subclasses which create them.

As defined by Gamma et al, the Factory Pattern is all about "Define an interface for creating an object, but let the subclasses decide which class to instantiate. The Factory Method lets a class defer instantiation to subclasses." Figure 1 below illustrates the roles of the Factory Pattern.

Figure 1: The roles of the Factory Pattern

As shown in the figure, the Factory Pattern has a couple of roles - a Creator and a Concrete Creator. This pattern is used when a class (the Creator) does not know beforehand all the subclasses that it will create. Instead, each subclass (the Concrete Creator) is left with the responsibility of creating the actual object instances.
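The Creator and Concrete Creator roles can be sketched in a few lines of C++. This is a minimal illustration under my own naming, not code from the article or from Gamma et al: Document, TextDocument, Application, TextApplication, and OpenNew are all hypothetical. The Creator knows when a document is needed, while the Concrete Creator decides which class to instantiate.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Product: the abstract type the Creator works with.
struct Document {
    virtual ~Document() = default;
    virtual std::string Kind() const = 0;
};

// Concrete Product.
struct TextDocument : Document {
    std::string Kind() const override { return "text"; }
};

// Creator: knows WHEN to create a Document, but not WHICH subclass.
class Application {
public:
    virtual ~Application() = default;

    std::string OpenNew() const {
        const std::unique_ptr<Document> doc = CreateDocument();
        assert(doc != nullptr);
        return "opened " + doc->Kind() + " document";
    }

protected:
    // The factory method: subclasses decide which class to instantiate.
    virtual std::unique_ptr<Document> CreateDocument() const = 0;
};

// Concrete Creator: supplies the actual object.
class TextApplication : public Application {
protected:
    std::unique_ptr<Document> CreateDocument() const override {
        return std::unique_ptr<Document>(new TextDocument());
    }
};
```

Supporting another kind of document would mean adding one Document subclass and one Application subclass; OpenNew() itself would not change, which is the loose coupling the pattern is after.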

Explanation
Example Scenario 1 - One typical use of the Factory Pattern in a Component Object Model (COM) application:

In this scenario, the Creator is played by a COM interface called the IClassFactory and the Concrete Creator is played by the class which implements that interface. COM alleviates the need for the COM runtime to know about all the possible object types that it needs to create ahead of time. Typically, Object creation may require acquisition of system resources, coordination between several Objects, etc. With the introduction of the IClassFactory interface, all the details of object creation can be hidden in a Concrete derivation, manifested as a Class Factory. This way, it is only the class factory that has to be explicitly created by the COM support system. Figure 2 below shows the control flow details of component creation from the client using the Factory Pattern in a typical COM application.

Figure 2: Control Flow Details of Component Creation from the client using a Factory Pattern approach in COM

As shown in figure 2, the client first calls CoCreateInstance, which is implemented in the COM library. CoCreateInstance is implemented using CoGetClassObject. CoGetClassObject looks for the component in the Windows Registry. If it finds the component in the registry, it loads the associated DLL that serves the component. After the DLL is loaded, CoGetClassObject calls DllGetClassObject. DllGetClassObject is implemented in the DLL Server. Its job is to create the Class Factory, which it does using the new operator. DllGetClassObject then queries the Class Factory for the IClassFactory interface, which is returned to CoCreateInstance. CoCreateInstance then uses the IClassFactory interface to call its CreateInstance function. Here, IClassFactory::CreateInstance calls the new operator to create the component. In addition, it queries for the IAccount interface. After getting the interface, CoCreateInstance releases the Class Factory and returns the IAccount interface pointer to the client. The client can now use the interface pointer to call methods on the component.

Example Scenario 2 - One typical use of the Factory Pattern in the Standard Template Library (STL):

In the STL, an iterator is a sort of smart pointer for which the operator*() function is overloaded in order to provide not the iterator object itself but the individual data object in the container, such as a table row. Because the iterator must have knowledge of the container internals in order to traverse the data structure, the definition of the iterator class is included with each container in the STL; Gamma et al would probably include this as an example of a Factory design pattern, in that the container, in a way, creates the iterator. In other words, the list makes its iterator. "Iterator" itself is also a design pattern described by Gamma et al (which we will discuss in a future article).

Example Scenario 3 - One typical use of the Factory Pattern in an Enterprise JavaBean (EJB) Application:

An entity bean is an object representation of persistent data that are maintained in a permanent data store, such as a database. A primary key identifies each instance of an entity bean. Entity beans can be created by creating an object using an object factory Create method. Similarly, Session beans can be created by creating an object using an object factory Create method. Figure 3 below shows the control flow details of component creation from the client using the Factory Pattern in a typical EJB application.

Figure 3: Control Flow Details of Component Creation from the client using a Factory Pattern approach in EJB Starting at the top left of the image, the client first creates a new context to look up the EJBObject with the help of a JNDI server. Given this context, the client then creates an EJB using the home interface. The client can then call the available methods of the EJBObject.

When all the activities are complete, the client calls the remove() method, also through the home interface, to terminate the session. The equivalent code looks like Listing 1:

import javax.naming.*;

public class EJBClient {
    // main declares "throws Exception" because the JNDI and EJB calls below
    // can throw checked exceptions
    public static void main (String[] argv) throws Exception {
        // get the JNDI naming context
        Context initialCtx = new InitialContext ();
        // use the context to lookup the EJB Home interface
        AccountHome home = (AccountHome) initialCtx.lookup ("Account");
        // use the Home Interface to create a Session bean object
        Account account = home.create (10001, "Athul", 100000000.25d);
        // invoke business methods
        account.credit (200000000.25d);
        // remove the object
        account.remove ();
    }
}
Listing 1: The EJB Client code to talk to an EJB

Example Scenario 4 - One typical use of the Factory Pattern in the Common Object Request Broker Architecture (CORBA): The CORBA Object Services - Naming Service Specification (COSNaming) describes the entire namespace in terms of NamingContext objects. These contexts can be connected to each other and can contain references to actual object instances for which clients will ask. The contexts are all contained within the CosNaming.Factory and CosNaming.ExtFactory servers. When the factory server is executed, it always creates a singleton persistent object implementing the CosNaming::NamingContextFactory interface. All NamingContext objects created within a particular factory server are associated with that server's single NamingContextFactory. The factories support the interfaces shown in Listing 2:

// CORBA Interface Definition Language
module CosNaming {
    interface NamingContextFactory {
        NamingContext create_context ();
        oneway void shutdown ();
    };
    interface ExtendedNamingContextFactory : NamingContextFactory {
        NamingContext root_context ();
    };
};

Listing 2: The COSNaming interface in CORBA is modelled using the Factory Pattern

Implementation
Our Example: To restrict access to authorized personnel, and to avoid overloading users with information that would be useless to them, many e-commerce applications provide profile management by storing static and dynamic information about users. Access by individual or multiple users to resources such as individual files, directory structures, and user-defined entities (catalogs, purchase orders, etc.) can be controlled based on the role of the individual or group within an organization. Our example enforces information control by giving Employees access to Confidential Resources, while restricting the general public to Public Resources. We will use this example to illustrate the different implementation options available to us.

The Classes and Interfaces used:

Figure 4: The Creator as an Abstract Class (*)

1. Resource - An abstract base class that defines a common interface to different types of Resources in an Organization. These resources may represent individual files, directory structures, user-defined entities (such as catalogs, purchase orders, etc.). The listings below illustrate how the Resource class is implemented in Java and C++.

In Java,

public abstract class Resource {
    public abstract void printMessage ();
}

Listing 3: Resource.java

In C++,

#include <stdio.h>

class CResource {
public:
    CResource() {}
    virtual ~CResource() {}
    virtual void printMessage () = 0;
};

Listing 4: Resource.h
2. ConfidentialResource - A specialization of the Resource interface that represents Company Confidential resources, which may not be accessed by anyone except Company Employees. The listings below illustrate how the ConfidentialResource class is implemented in Java and C++.

In Java,

public class ConfidentialResource extends Resource {
    public void printMessage () {
        System.out.println ("This is a Company Confidential Resource...");
    }
}

Listing 5: ConfidentialResource.java

In C++,

#include "Resource.h"

class CConfidentialResource : public CResource {
public:
    CConfidentialResource() {}
    virtual ~CConfidentialResource() {}
    virtual void printMessage (void) {
        printf ("This is a Company Confidential Resource...\n");
    }
};

Listing 6: ConfidentialResource.h

3. PublicResource - A specialization of the Resource interface that represents resources which are available to anyone. The listings below illustrate how the PublicResource class is implemented in Java and C++.

In Java,

public class PublicResource extends Resource {
    public void printMessage () {
        System.out.println ("This is a Public Resource...");
    }
}

Listing 7: PublicResource.java

In C++,

#include "Resource.h"

class CPublicResource : public CResource {
public:
    CPublicResource() {}
    virtual ~CPublicResource() {}
    virtual void printMessage (void) {
        printf ("This is a Public Resource...\n");
    }
};


Listing 8: PublicResource.h

4. Profile - An abstract base class that defines a common interface to hold information about users, such as their name and e-mail address. A getResource() method is used to retrieve a reference to Resources. Employees have access to ConfidentialResources, whereas Non-Employees have access to PublicResources. The listings below illustrate how the Profile class is implemented in Java and C++.

In Java,

public abstract class Profile {
    public Profile (String sName, String sEmail) {
        m_sName = sName;
        m_sEmail = sEmail;
    }
    public String getName () { return m_sName; }
    public String getEmail () { return m_sEmail; }
    public boolean IsEmployee () { return m_bIsEmployee; }
    public abstract Resource getResource ();
    protected String m_sName, m_sEmail;
    protected boolean m_bIsEmployee;
}

Listing 9: Profile.java

In C++,

#include <string.h>
#include "Resource.h"
#include "ResourceCreator.h"

class CProfile {
public:
    // Members start out null/false so the destructor is safe even when a
    // null name or e-mail is passed in.
    CProfile(char* psName, char* psEmail)
        : m_psName(0), m_psEmail(0), m_bIsEmployee(false) {
        if (psName) {
            m_psName = new char [strlen(psName)+1];
            strcpy (m_psName, psName);
        }
        if (psEmail) {
            m_psEmail = new char [strlen(psEmail)+1];
            strcpy (m_psEmail, psEmail);
        }
    }
    virtual ~CProfile() {
        delete [] m_psName;
        delete [] m_psEmail;
    }
    virtual char* getName (void) { return m_psName; }
    virtual char* getEmail (void) { return m_psEmail; }
    virtual bool IsEmployee (void) { return m_bIsEmployee; }
    virtual CResource* getResource (void) = 0;
protected:
    char *m_psName;
    char *m_psEmail;
    bool m_bIsEmployee;
};

Listing 10: Profile.h
5. Employee - A specialization of the Profile interface that represents an Employee of the Company. The IsEmployee() method, which returns a boolean value, is used to test whether the Object represents an employee of the Company.

6. NonEmployee - A specialization of the Profile interface that represents a user who is not an Employee of the Company. The IsEmployee() method, which returns a boolean value, is used to test whether the Object represents an employee of the Company.

Implementation 1 - When the Creator class is an Abstract class:


This requires subclasses to define an implementation, because there is no reasonable default. It gets around the dilemma of having to instantiate unforeseeable classes.

public class Employee extends Profile {
    public Employee (String sName, String sEmail) {
        super (sName, sEmail);
        m_bIsEmployee = true;
    }
    public Resource getResource () {
        return new ConfidentialResource ();
    }
}

Listing 11: Employee.java

public class NonEmployee extends Profile {
    public NonEmployee (String sName, String sEmail) {
        super (sName, sEmail);
        m_bIsEmployee = false;
    }
    public Resource getResource () {
        return new PublicResource ();
    }
}

Listing 12: NonEmployee.java

public class FactoryPattern {
    public static void main (String[] args) {
        Employee employee = new Employee ("Athul", "athul@eCommWare.com");
        NonEmployee nonEmployee = new NonEmployee ("Aditya", "aditya@inventor.com");
        employee.getResource ().printMessage ();
        nonEmployee.getResource ().printMessage ();
    }
}

Listing 13: FactoryPattern.java

Implementation 2 - When the Creator class is a Concrete class:


In this type of implementation, the concrete Creator uses the factory method primarily for flexibility. It says "Create objects in a separate operation so that subclasses can change the way they are created". This ensures that, if necessary, designers of subclasses can change the class of objects their parent class instantiates.
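The article gives no listing for this variant, so the following is only a minimal C++ sketch of the idea, not the author's code. It assumes simplified stand-ins for the Resource classes (they return their message as a string instead of printing it), and a Profile class that is concrete and ships a default factory method. Every name here is illustrative.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified, hypothetical stand-ins for the article's Resource classes.
class Resource {
public:
    virtual ~Resource() = default;
    virtual std::string message() const = 0;
};

class PublicResource : public Resource {
public:
    std::string message() const override { return "This is a Public Resource..."; }
};

class ConfidentialResource : public Resource {
public:
    std::string message() const override {
        return "This is a Company Confidential Resource...";
    }
};

// Concrete Creator: it can be instantiated directly because the factory
// method has a reasonable default (a public resource).
class Profile {
public:
    virtual ~Profile() = default;
    virtual std::unique_ptr<Resource> getResource() const {
        return std::unique_ptr<Resource>(new PublicResource());
    }
};

// A subclass overrides the factory method only when it needs to change
// the class of object that the parent would otherwise instantiate.
class Employee : public Profile {
public:
    std::unique_ptr<Resource> getResource() const override {
        return std::unique_ptr<Resource>(new ConfidentialResource());
    }
};
```

A caller holding a `Profile&` gets a public resource by default and a confidential one when the reference actually refers to an `Employee`, without naming either product class itself.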

Implementation 3 - Parameterized Factory Methods:


This type of implementation lets the factory method create multiple kinds of Objects. The factory method takes a parameter that identifies the kind of object to create. All Objects that the factory method creates share the same interface. Once the identifier is read, the framework calls createResource () with the identifier passed in. The createResource () method instantiates and returns the appropriate Resource reference.

public class ResourceCreator {
    public static final int CONFIDENTIAL = 0;
    public static final int PUBLIC = 1;

    public Resource createResource (int nID) {
        switch (nID) {
            case CONFIDENTIAL:
                return new ConfidentialResource ();
            case PUBLIC:
                return new PublicResource ();
        }
        return null;
    }
}

Listing 14: ResourceCreator.java

public class Employee extends Profile {
    public Employee (String sName, String sEmail) {
        super (sName, sEmail);
        m_bIsEmployee = true;
    }
    public Resource getResource () {
        ResourceCreator creator = new ResourceCreator ();
        return creator.createResource (ResourceCreator.CONFIDENTIAL);
    }
}

Listing 15: Employee.java

public class NonEmployee extends Profile {
    public NonEmployee (String sName, String sEmail) {
        super (sName, sEmail);
        m_bIsEmployee = false;
    }
    public Resource getResource () {
        ResourceCreator creator = new ResourceCreator ();
        return creator.createResource (ResourceCreator.PUBLIC);
    }
}

Listing 16: NonEmployee.java

Implementation 4 - Using Templates to avoid subclassing:


A potential problem with factory methods is that they might force you to subclass just to create the appropriate Objects. One way to get around this in C++ is to provide a template subclass of the Creator class that is parameterized by the appropriate object class.

template <class aResource>
class CResourceCreator {
public:
    virtual CResource* createResource ();
};

template <class aResource>
CResource* CResourceCreator<aResource>::createResource () {
    return new aResource;
}

Listing 17: ResourceCreator.h

#include "Profile.h"
#include "ConfidentialResource.h"

class CEmployee : public CProfile {
public:
    CEmployee (char* psName, char* psEmail) : CProfile (psName, psEmail) {
        m_bIsEmployee = true;
    }
    virtual ~CEmployee() {}
    virtual CResource* getResource (void) {
        CResourceCreator<CConfidentialResource> creator;
        return creator.createResource ();
    }
};

Listing 18: Employee.h

#include "Profile.h"
#include "PublicResource.h"

class CNonEmployee : public CProfile {
public:
    CNonEmployee (char* psName, char* psEmail) : CProfile (psName, psEmail) {
        m_bIsEmployee = false;
    }
    virtual ~CNonEmployee() {}
    virtual CResource* getResource (void) {
        CResourceCreator<CPublicResource> creator;
        return creator.createResource ();
    }
};

Listing 19: NonEmployee.h

#include "Resource.h"
#include "Employee.h"
#include "NonEmployee.h"

int main(int argc, char* argv[]) {
    CEmployee employee ("Athul", "athul@eCommWare.com");
    CNonEmployee nonEmployee ("Aditya", "aditya@inventor.com");

    // getResource () returns an object created with new, so the caller
    // owns it and must delete it after use.
    CResource* pResource = employee.getResource ();
    pResource->printMessage ();
    delete pResource;

    pResource = nonEmployee.getResource ();
    pResource->printMessage ();
    delete pResource;

    return 0;
}

Listing 20: FactoryPattern.cpp

Checks and Balances of using this pattern


Plusses
Eliminates the need to bind application-specific classes into your code.
Provides hooks for subclassing. Creating objects inside a class with a factory method is always more flexible than creating an object directly. This method gives subclasses a hook for providing an extended version of an object.
Connects parallel hierarchies. Factory method localises knowledge of which classes belong together. Parallel class hierarchies result when a class delegates some of its responsibilities to a separate class.

Minus
Clients might have to subclass the Creator class.

Abstract
This paper is intended to be a quick reference for the primary rules of using and implementing Microsoft Component Object Model (COM) objects. Readers interested in gaining a better understanding of what COM is, as well as the motivations behind its design and philosophy, should read the first two chapters of the Component Object Model Specification (MSDN Library, Specifications). Chapter 1 is a brief introduction, and Chapter 2 provides a thorough overview. The information presented here is all taken from the COM specification.

Rule #1: Must Implement IUnknown


An object is not a Microsoft Component Object Model (COM) object unless it implements at least one interface that at minimum is IUnknown.

Interface Design Rules


Interfaces must directly or indirectly inherit from IUnknown.
Interfaces must have a unique interface identifier (IID).
Interfaces are immutable. Once assigned an IID and published, no element of the interface definition may change.
Interface member functions should have a return type of HRESULT to allow the remoting infrastructure to report remote procedure call (RPC) errors.
String parameters in interface member functions should be Unicode.

Implementing IUnknown

Object identity. It is required that any call to QueryInterface on any interface for a given object instance for the specific interface IUnknown must always return the same physical pointer value. This enables calling QueryInterface(IID_IUnknown, ...) on any two interfaces and comparing the results to determine whether they point to the same instance of an object (the same COM object identity).

Static interface set. It is required that the set of interfaces accessible on an object via QueryInterface be static, not dynamic. That is, if QueryInterface succeeds for a given IID once, it will always succeed on subsequent calls on the same object (except in catastrophic failure situations), and if QueryInterface fails for a given IID, subsequent calls for the same IID on the same object must also fail.

Object integrity. QueryInterface must be reflexive, symmetric, and transitive with respect to the set of interfaces that are accessible. That is, given the code snippet below:

IA * pA = (some function returning an IA *);
IB * pB = NULL;
HRESULT hr;
hr = pA->QueryInterface(IID_IB, &pB); // line 4

Reflexive:
pA->QueryInterface(IID_IA, ...)
must succeed (a>>a).

Symmetric: If, in line 4, pB was successfully obtained, then
pB->QueryInterface(IID_IA, ...)
must succeed (a>>b, then b>>a).

Transitive: If, in line 4, pB was successfully obtained, and we do
IC * pC = NULL;
hr = pB->QueryInterface(IID_IC, &pC); // line 7
and pC is successfully obtained in line 7, then
pA->QueryInterface(IID_IC, ...)
must succeed (a>>b, and b>>c, then a>>c).

Minimum reference counter size. AddRef implementations are required to maintain a counter that is large enough to support 2^31 - 1 outstanding pointer references to all the interfaces on a given object taken as a whole. A 32-bit unsigned integer fits this requirement.

Release cannot indicate failure. If a client needs to know that resources have been freed, and so forth, it must use a method in some interface on the object with higher-level semantics before calling Release.
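The QueryInterface rules above can be exercised without the Windows SDK. The sketch below is a portable mock, not real COM. `MockUnknown`, the string-based `MockIID` values, and the `bool` return type are all assumptions made for illustration (real COM uses `IUnknown`, GUIDs, and `HRESULT`). It shows one way an object exposing two interfaces can keep the IUnknown pointer stable, and how a client can compare identities.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins: interface IDs are plain strings in this mock.
typedef const char* MockIID;
static const MockIID MOCK_IID_IUnknown = "IUnknown";
static const MockIID MOCK_IID_IA = "IA";
static const MockIID MOCK_IID_IB = "IB";

struct MockUnknown {
    virtual bool QueryInterface(MockIID iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~MockUnknown() {}   // objects are destroyed only through Release
};
struct IA : MockUnknown { virtual int a() = 0; };
struct IB : MockUnknown { virtual int b() = 0; };

class Object : public IA, public IB {
    unsigned long refs_;
public:
    Object() : refs_(1) {}      // creation hands out one reference
    bool QueryInterface(MockIID iid, void** ppv) override {
        *ppv = 0;
        if (std::strcmp(iid, MOCK_IID_IUnknown) == 0)
            // Always answer IUnknown through the same base (IA here), so
            // every query returns the same physical pointer value.
            *ppv = static_cast<MockUnknown*>(static_cast<IA*>(this));
        else if (std::strcmp(iid, MOCK_IID_IA) == 0)
            *ppv = static_cast<IA*>(this);
        else if (std::strcmp(iid, MOCK_IID_IB) == 0)
            *ppv = static_cast<IB*>(this);
        if (*ppv == 0) return false;   // static interface set: always fails
        AddRef();
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long r = --refs_;
        if (r == 0) delete this;
        return r;
    }
    int a() override { return 1; }
    int b() override { return 2; }
};

// Identity test: two interface pointers belong to the same object when
// their IUnknown pointers compare equal.
bool sameObject(MockUnknown* x, MockUnknown* y) {
    void* ux = 0;
    void* uy = 0;
    bool ok = x->QueryInterface(MOCK_IID_IUnknown, &ux) &&
              y->QueryInterface(MOCK_IID_IUnknown, &uy);
    bool same = ok && ux == uy;
    if (ux) static_cast<MockUnknown*>(ux)->Release();
    if (uy) static_cast<MockUnknown*>(uy)->Release();
    return same;
}
```

Because `Object` answers every supported IID from the same fixed table of casts, reflexivity, symmetry, and transitivity all follow: any interface pointer on the object reaches the same `QueryInterface` implementation.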

Memory Management Rules

The lifetime management of pointers to interfaces is always accomplished through the AddRef and Release methods found on every COM interface. (See "Reference-Counting Rules" below.)

The following rules apply to parameters to interface member functions, including the return value, that are not passed "by-value":
o For in parameters, the caller should allocate and free the memory.
o The out parameters must be allocated by the callee and freed by the caller using the standard COM memory allocator.
o The in-out parameters are initially allocated by the caller, then freed and reallocated by the callee if necessary. As with out parameters, the caller is responsible for freeing the final returned value. The standard COM memory allocator must be used.

If a function returns a failure code, then in general the caller has no way to clean up the out or in-out parameters. This leads to a few additional rules:
o In error returns, out parameters must always be reliably set to a value that will be cleaned up without any action on the caller's part.
o Further, it is the case that all out pointer parameters (including pointer members of a caller-allocate callee-fill structure) must explicitly be set to NULL. The most straightforward way to ensure this is (in part) to set these values to NULL on function entry.
o In error returns, all in-out parameters must either be left alone by the callee (and thus remain at the value to which they were initialized by the caller; if the caller didn't initialize it, then it's an out parameter, not an in-out parameter) or be explicitly set as in the out parameter error return case.
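As a rough illustration of the out-parameter rules above, here is a portable sketch. It is not Windows code: `mockTaskAlloc` and `mockTaskFree` are hypothetical stand-ins for the standard COM allocator, `GetDisplayName` is an invented method, and the `long` return value only loosely mimics an HRESULT (zero for success, negative for failure).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins for the shared COM task allocator.
inline void* mockTaskAlloc(std::size_t n) { return std::malloc(n); }
inline void mockTaskFree(void* p) { std::free(p); }

// Invented example method: copies `name` (an in parameter, owned by the
// caller) into a buffer returned through `ppszOut` (an out parameter).
long GetDisplayName(const char* name, char** ppszOut) {
    if (ppszOut == 0) return -2;
    *ppszOut = 0;                 // out pointer is set to NULL on entry
    if (name == 0) return -1;     // failure: out parameter stays NULL
    std::size_t n = std::strlen(name) + 1;
    char* buf = static_cast<char*>(mockTaskAlloc(n));
    if (buf == 0) return -3;      // failure: nothing for the caller to free
    std::memcpy(buf, name, n);
    *ppszOut = buf;               // callee allocated; the caller must free
    return 0;
}
```

On success the caller releases the buffer with `mockTaskFree`; on failure it has nothing to clean up, because the out pointer is already NULL.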

Reference-Counting Rules
Rule 1: AddRef must be called for every new copy of an interface pointer, and Release called for every destruction of an interface pointer, except where subsequent rules explicitly permit otherwise.

The following rules call out common nonexceptions to Rule 1.


Rule 1a: In-out-parameters to functions. The caller must AddRef the actual parameter, since it will be Released by the callee when the out-value is stored on top of it.

Rule 1b: Fetching a global variable. The local copy of the interface pointer fetched from an existing copy of the pointer in a global variable must be independently reference counted, because called functions might destroy the copy in the global while the local copy is still alive.

Rule 1c: New pointers synthesized out of "thin air." A function that synthesizes an interface pointer using special internal knowledge, rather than obtaining it from some other source, must do an initial AddRef on the newly synthesized pointer. Important examples of such routines include instance creation routines, implementations of IUnknown::QueryInterface, and so on.

Rule 1d: Returning a copy of an internally stored pointer. After the pointer has been returned, the callee has no idea how its lifetime relates to that of the internally stored copy of the pointer. Thus, the callee must call AddRef on the pointer copy before returning it.

Rule 2: Special knowledge on the part of a piece of code of the relationships of the beginnings and the endings of the lifetimes of two or more copies of an interface pointer can allow AddRef/Release pairs to be omitted.

From a COM client's perspective, reference-counting is always a per-interface concept. Clients should never assume that an object uses the same reference count for all interfaces.

The return values of AddRef and Release should not be relied upon, and should be used only for debugging purposes.

Pointer stability; see details in the OLE Help file under "Reference-Counting Rules," subsection "Stabilizing the this Pointer and Keeping it Valid."

See the excellent "Managing Object Lifetimes in OLE" technical article by Douglas Hodges, and Chapter 3 of Inside OLE, 2nd edition, by Kraig Brockschmidt (MSDN Library, Books) for more information on reference-counting.
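Rules 1c and 1d can be illustrated with a small portable sketch. This is a mock, not COM itself: `Counted` stands in for any reference-counted object, `Holder` for a component that stores a pointer internally, and the `liveObjects` counter exists only so the example can observe destruction.

```cpp
#include <cassert>

// Hypothetical reference-counted object.
class Counted {
    unsigned long refs_;
    ~Counted() {}                 // private: destroyed only via Release
public:
    static int liveObjects;       // instrumentation for this example only
    // Rule 1c: a newly synthesized pointer starts with one reference.
    Counted() : refs_(1) { ++liveObjects; }
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        unsigned long r = --refs_;
        if (r == 0) { --liveObjects; delete this; }
        return r;
    }
};
int Counted::liveObjects = 0;

// Hypothetical component that keeps an internally stored pointer.
class Holder {
    Counted* stored_;
public:
    Holder() : stored_(new Counted()) {}
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    ~Holder() { stored_->Release(); }   // drops only the holder's own reference
    // Rule 1d: AddRef the copy before returning it, because the caller's
    // copy may outlive the internally stored one.
    Counted* get() { stored_->AddRef(); return stored_; }
};
```

A caller that obtains a pointer from `get()` owns one reference and must call `Release` on it exactly once, even if the `Holder` has already been destroyed by then.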

COM Application Responsibilities


Each process that uses COM in any way (client, server, object implementor) is responsible for three things:
1. Verify that the COM Library is a compatible version with the COM function CoBuildVersion.
2. Initialize the COM Library before using any other functions in it by calling CoInitialize.
3. Uninitialize the COM Library when it is no longer in use by calling CoUninitialize.

In-process servers can assume that the process they are being loaded into has already performed these steps.

Server Rules

In-process servers must export DllGetClassObject and DllCanUnloadNow.
In-process servers must support COM self-registration.
o In-process and local servers should put an OLESelfReg string in their file version information.
o In-process servers should export DllRegisterServer and DllUnRegisterServer.
o Local servers should support the /RegServer and /UnRegServer command-line switches.

Creating Aggregatable Objects


Creating objects that can be aggregated is optional; however, it is simple to do, and doing so has significant benefits. The following rules must be followed in order to create an object that is aggregatable (often called the inner object).

The inner object's implementation of QueryInterface, AddRef, and Release for the IUnknown interface controls the inner object's reference count alone, and must not delegate to the outer unknown. This IUnknown implementation is called the implicit IUnknown.
The implementation of QueryInterface, AddRef, and Release members of all interfaces that the inner object implements, other than IUnknown itself, must delegate to the outer unknown. These implementations must not directly affect the inner object's reference count.
The implicit IUnknown must implement the QueryInterface behavior for only the inner object.
The aggregatable object must not call AddRef when holding a reference to the outer unknown pointer.
If, when the object is created, any interface other than IUnknown is requested, the creation must fail with E_UNKNOWN.

The code fragment below illustrates a correct implementation of an aggregatable object using the nested class approach to implementing interfaces:
// CSomeObject is an aggregatable object that implements
// IUnknown and ISomeInterface
class CSomeObject : public IUnknown
{
    private:
        DWORD       m_cRef;         // Object reference count
        IUnknown*   m_pUnkOuter;    // Outer unknown, no AddRef

        // Nested class to implement the ISomeInterface interface
        class CImpSomeInterface : public ISomeInterface
        {
            friend class CSomeObject ;
            private:
                DWORD       m_cRef;         // Interface ref-count, for debugging
                IUnknown*   m_pUnkOuter;    // Outer unknown, for delegation
            public:
                CImpSomeInterface() { m_cRef = 0; };
                ~CImpSomeInterface(void) {};

                // IUnknown members delegate to the outer unknown
                // IUnknown members do not control lifetime of object
                STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
                {   return m_pUnkOuter->QueryInterface(riid,ppv);   };

                STDMETHODIMP_(DWORD) AddRef(void)
                {   return m_pUnkOuter->AddRef();   };

                STDMETHODIMP_(DWORD) Release(void)
                {   return m_pUnkOuter->Release();   };

                // ISomeInterface members
                STDMETHODIMP SomeMethod(void)
                {   return S_OK;   };
        } ;
        CImpSomeInterface m_ImpSomeInterface ;

    public:
        CSomeObject(IUnknown * pUnkOuter)
        {
            m_cRef=0;
            // No AddRef necessary if non-NULL as we're aggregated.
            m_pUnkOuter=pUnkOuter;
            m_ImpSomeInterface.m_pUnkOuter=pUnkOuter;
        } ;
        ~CSomeObject(void) {} ;

        // Static member function for creating new instances (don't use
        // new directly). Protects against outer objects asking for interfaces
        // other than IUnknown
        static HRESULT Create(IUnknown* pUnkOuter, REFIID riid, void **ppv)
        {
            CSomeObject*    pObj;
            if (pUnkOuter != NULL && riid != IID_IUnknown)
                return CLASS_E_NOAGGREGATION;
            pObj = new CSomeObject(pUnkOuter);
            if (pObj == NULL)
                return E_OUTOFMEMORY;
            // Set up the right unknown for delegation (the non-aggregation case).
            // The nested interface implementation must be pointed at it as well;
            // otherwise its delegating members would dereference NULL.
            if (pUnkOuter == NULL)
            {
                pObj->m_pUnkOuter = (IUnknown*)pObj ;
                pObj->m_ImpSomeInterface.m_pUnkOuter = (IUnknown*)pObj ;
            }
            HRESULT hr;
            if (FAILED(hr = pObj->QueryInterface(riid, (void**)ppv)))
                delete pObj ;
            return hr;
        }

        // Implicit IUnknown members, non-delegating
        // Implicit QueryInterface only controls inner object
        STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
        {
            *ppv=NULL;
            if (riid == IID_IUnknown)
                *ppv=this;
            if (riid == IID_ISomeInterface)
                *ppv=&m_ImpSomeInterface;
            if (NULL==*ppv)
                return ResultFromScode(E_NOINTERFACE);
            ((IUnknown*)*ppv)->AddRef();
            return NOERROR;
        } ;
        STDMETHODIMP_(DWORD) AddRef(void)
        {   return ++m_cRef;   };
        STDMETHODIMP_(DWORD) Release(void)
        {
            if (--m_cRef != 0)
                return m_cRef;
            delete this;
            return 0;
        };
};

Aggregating Objects
When developing an object that aggregates in another object, these rules must be followed:

When creating the inner object, the outer object must explicitly ask for IUnknown.
The outer object must protect its implementation of Release from reentrancy with an artificial reference count around its destruction code.
The outer object must call its own outer unknown's Release if it queries for a pointer to any of the inner object's interfaces. To free this pointer, the outer object calls its own outer unknown's AddRef followed by Release on the inner object's pointer:

// Obtaining inner object interface pointer
pUnkInner->QueryInterface(IID_IFoo, &pIFoo);
pUnkOuter->Release();

// Releasing inner object interface pointer
pUnkOuter->AddRef();
pIFoo->Release();

The outer object must not blindly delegate a query for any unrecognized interface of the inner object unless that behavior is specifically the intention of the outer object.

Apartment Threading Model


The details of apartment-model threading are actually quite simple, but must be followed carefully, as follows:

Every object lives on a single thread (within a single apartment).

All calls to an object must be made on its thread (within its apartment). It is forbidden to call an object directly from another thread. Applications that attempt to use objects in this free-threaded manner will likely experience problems that will prevent them from running properly in future versions of the operating systems. The implication of this rule is that all pointers to objects must be marshalled between apartments.

Each apartment/thread with objects in it must have a message queue in order to handle calls from other processes and apartments within the same process. This means simply that the thread's work function must have a GetMessage/DispatchMessage loop. If other synchronization primitives are being used to communicate between threads, the Microsoft Win32 function MsgWaitForMultipleObjects can be used to wait for both messages and thread synchronization events.

DLL-based or in-process objects must be marked in the registry as "apartment aware" by adding the named value "ThreadingModel=Apartment" to their InprocServer32 key in the registration database.

Apartment-aware objects must write DLL entry points carefully. Each apartment that calls CoCreateInstance on an apartment-aware object will call DllGetClassObject from its thread. DllGetClassObject should therefore be able to give away multiple class objects or a single thread-safe object. Calls to CoFreeUnusedLibraries from any thread always route through the main apartment's thread to call DllCanUnloadNow.

Containment/Delegation
The most common mechanism for object reuse in COM is containment/delegation. This type of reuse is a familiar concept found in most object-oriented languages and systems. The outer object, which needs to make use of the inner object, acts as an object client to the inner object. The outer object "contains" the inner object, and when the outer object requires the services of the inner object, the outer object explicitly delegates implementation to the inner object's methods. That is, the outer object uses the inner object's services to implement itself. It is not necessary for the outer and inner objects to support the same interfaces, although it certainly is reasonable to contain an object that implements an interface that the outer object does not and implement the methods of the outer object simply as calls to the corresponding methods in the inner object. When the complexity of the outer and inner objects differs greatly, however, the outer object may implement some of the methods of its interfaces by delegating calls to interface methods implemented in the inner object.

It is simple to implement containment for an outer object. The outer object creates the inner objects it needs to use as any other client would. This is nothing new: the process is like a C++ object that itself contains a C++ string object that it uses to perform certain string functions, even if the outer object is not considered a string object in its own right. Then, using its pointer to the inner object, a call to a method in the outer object generates a call to an inner object method.
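Containment/delegation as described above can be sketched in plain C++. The classes below are hypothetical and are not COM objects: `Formatter` plays the inner object and `Account` the outer object, which does not expose the inner object's interface but uses it to implement one of its own methods.

```cpp
#include <cassert>
#include <string>

// Hypothetical inner object providing a small text service.
class Formatter {
public:
    std::string bracket(const std::string& s) const { return "[" + s + "]"; }
};

// Hypothetical outer object. It contains the inner object and acts as an
// ordinary client of it; callers of Account never see Formatter.
class Account {
    Formatter inner_;       // the contained inner object
    std::string owner_;
public:
    explicit Account(const std::string& owner) : owner_(owner) {}
    // Delegation: this method is implemented by calling the inner object.
    std::string label() const { return inner_.bracket(owner_); }
};
```

In real COM the outer object would hold an interface pointer obtained through a creation call and release it in its own cleanup code; the direct member used here only keeps the sketch short.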
Introduction
The Component Object Model (COM) is a software architecture that allows applications to be built from binary software components. COM is the underlying architecture that forms the foundation for higher-level software services, like those provided by OLE. OLE services span various aspects of commonly needed system functionality, including compound documents, custom controls, interapplication scripting, data transfer, and other software interactions.

Figure 1. OLE technologies build on one another, with COM as the foundation.

These services provide distinctly different functionality to the user. However, they share a fundamental requirement for a mechanism that allows binary software components, derived from any combination of pre-existing customers' components and components from different software vendors, to connect to and communicate with each other in a well-defined manner. This mechanism is supplied by COM, a software architecture that does the following:

Defines a binary standard for component interoperability
Is programming-language-independent
Is provided on multiple platforms (Microsoft Windows, Windows 95, Windows NT, Apple Macintosh, and many varieties of UNIX)
Provides for robust evolution of component-based applications and systems
Is extensible by developers in a consistent manner
Uses a single programming model for components to communicate within the same process, and also across process and network boundaries
Allows for shared memory management between components
Provides rich error and status reporting
Allows dynamic loading and unloading of components

It is important to note that COM is a general architecture for component software. Although Microsoft is applying COM to address specific areas such as controls, compound documents, automation, data transfer, storage and naming, and others, any developer can take advantage of the structure and foundation that COM provides.

How does COM enable interoperability? What makes it such a useful and unifying model? To address these questions, it will be helpful to first define the basic COM design principles and architectural concepts. In doing so, we will examine the specific problems that COM is meant to solve, and how COM provides solutions for these problems.

The Component Software Problem
The most fundamental problem that COM solves is: How can a system be designed so that binary executables from different vendors, written in different parts of the world and at different times, are able to interoperate? To solve this problem, we must first find answers to these four questions:

Basic interoperability. How can developers create their own unique binary components, yet be assured that these binary components will interoperate with other binary components built by different developers?
Versioning. How can one system component be upgraded without requiring all the system components to be upgraded?
Language independence. How can components written in different languages communicate?
Transparent cross-process interoperability. How can we give developers the flexibility to write components to run in-process or cross-process, and even cross-network, using one simple programming model?

Additionally, high performance is a requirement for a component software architecture. Although cross-process and cross-network transparency is a laudable goal, it is critical for the commercial success of a binary component marketplace that components interacting within the same address space be able to use each other's services without any undue "system" overhead. Otherwise, the components will not realistically be scalable down to very small, lightweight pieces of software equivalent to C++ classes or graphical user interface (GUI) controls.

COM Fundamentals

The Component Object Model defines several fundamental concepts that provide the model's structural underpinnings. These include:

A binary standard for function calling between components.
A provision for strongly-typed groupings of functions into interfaces.
A base interface providing:
    A way for components to dynamically discover the interfaces implemented by other components.
    Reference counting to allow components to track their own lifetime and delete themselves when appropriate.
A mechanism to identify components and their interfaces uniquely, worldwide.
A "component loader" to set up component interactions and, additionally (in the cross-process and cross-network cases), to help manage component interactions.

Binary Standard

For any given platform (hardware and operating system combination), COM defines a standard way to lay out virtual function tables (vtables) in memory, and a standard way to call functions through the vtables. Thus, any language that can call functions via pointers (C, C++, Smalltalk, Ada, and even BASIC) can be used to write components that interoperate with other components written to the same binary standard. Indirection (the client holds a pointer to a pointer to a vtable) allows for vtable sharing among multiple instances of the same object class. On a system with hundreds of object instances, vtable sharing can reduce memory requirements considerably, because each instance needs only a pointer to the shared vtable rather than its own copy of the table.
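The layout described above can be sketched in portable C++ by building the table of function pointers by hand. This is only an illustration under stated assumptions: the names CounterVtbl, CounterObject, MakeCounter, and CallIncrement are hypothetical, and a real COM vtable would additionally begin with the three IUnknown entries and use the platform's calling convention.

```cpp
#include <cassert>

// Hypothetical hand-built "interface": a table of function pointers.
struct CounterObject;  // forward declaration

struct CounterVtbl {
    int (*Increment)(CounterObject* self);
    int (*Get)(const CounterObject* self);
};

// What the client holds a pointer to: the first member points at the vtable.
struct CounterObject {
    const CounterVtbl* vtbl;  // shared by every instance of this class
    int value;                // per-instance state, private by convention
};

static int CounterIncrement(CounterObject* self) { return ++self->value; }
static int CounterGet(const CounterObject* self) { return self->value; }

// One vtable for the whole class, no matter how many instances exist.
static const CounterVtbl kCounterVtbl = { CounterIncrement, CounterGet };

CounterObject MakeCounter() {
    CounterObject obj = { &kCounterVtbl, 0 };
    return obj;
}

// A call goes through two pointers: object -> vtable -> function.
int CallIncrement(CounterObject* obj) { return obj->vtbl->Increment(obj); }
```

Every object returned by MakeCounter carries the same vtbl pointer, which is the sharing the paragraph refers to; only the value field is stored per instance.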

Figure 2. Virtual function tables (VTBL)

Objects and Components

The word object tends to mean something different to everyone. To clarify: In COM, an object is a piece of compiled code that provides some service to the rest of the system. To avoid confusion, it is probably best to refer to an object used in COM as a COM component or simply as a component. This avoids confusing COM components with source-code OOP objects such as those defined in C++. COM components support a base interface called IUnknown (described later), along with a combination of other interfaces, depending on what functionality the COM component chooses to expose.

COM components usually have some associated data, but unlike C++ objects, a given COM component will never have direct access to another COM component in its entirety. Instead, COM components always access other COM components through interface pointers. This is a primary architectural feature of the Component Object Model, because it allows COM to completely preserve encapsulation of data and processing, a fundamental requirement of a true component software standard. It also allows for transparent remoting (cross-process or cross-network calling) because all data access is through methods that can be accessed through a proxy-stub pair that forwards the request from the client component to the server component and also sends back the response.

Interfaces

In COM, applications interact with each other and with the system through collections of functions called interfaces. Note that all OLE services are simply COM interfaces. A COM "interface" is a strongly-typed contract between software components to provide a small but useful set of semantically related operations (methods). An interface is the definition of an expected behavior and expected responsibilities. OLE's drag-and-drop support is a good example. All of the functionality that a component must implement to be a drop target is collected into the IDropTarget interface; all the drag source functionality is in the IDropSource interface.
Interface names begin with "I" by convention. OLE provides a number of useful general-purpose interfaces (which generally begin with "IOle"), but because anyone can define custom interfaces as well, developers can develop their own interfaces as they deploy component-based applications. Incidentally, a pointer to a COM component is really a pointer to one of the interfaces that the COM component implements; this means that you can only use a COM component pointer to call a method, and not to modify data, as described above. Here is an example of an interface, ILookup, with two member methods:

    interface ILookup : public IUnknown
    {
    public:
        virtual HRESULT __stdcall LookupByName ( LPTSTR lpName, TCHAR **lplpNumber) = 0;
        virtual HRESULT __stdcall LookupByNumber ( LPTSTR lpNumber, TCHAR **lplpName) = 0;
    };

Attributes of Interfaces

Given that an interface is a contractual way for a COM component to expose its services, there are several very important points to understand:

An interface is not a class. Although an instance of a class can be created (instantiated) to form a COM component, an interface cannot be instantiated by itself because it carries no implementation. A COM component must implement that interface and that COM component must be instantiated in order for an interface to exist. Furthermore, different COM component classes may implement an interface differently, so long as the behavior conforms to the interface definition for each associated component (such as two COM components that implement IStack where one uses an array and the other a linked list). Thus the basic principle of polymorphism fully applies to COM components.

An interface is not a COM component. An interface is just a related group of functions and is the binary standard through which clients and COM components communicate. The COM component can be implemented in any language with any internal state representation, so long as it can provide pointers to interface member functions.

COM clients only interact with pointers to interfaces. When a COM client has access to a COM component, it has nothing more than a pointer through which it can access the functions in the interface, called simply an interface pointer. The pointer is said to be opaque because it hides all aspects of internal implementation. You cannot see the COM component's data, as opposed to C++ object pointers through which a client may directly access the object's data. In COM, the client can only call methods of the interface to which it has a pointer. This encapsulation is what allows COM to provide the efficient, safe, robust binary standard that enables local or remote transparency.

COM components can implement multiple interfaces. A COM component can, and typically does, implement more than one interface. That is, the COM class has more than one set of services to provide.
For example, a COM class might support the ability to exchange data with COM clients, as well as the ability to save its persistent state information (the data it would need to reload to return to its current state) in a file at the client's request. Each of these abilities is expressed through a different interface (IDataObject and IPersistFile), so the COM component must implement two interfaces. Interfaces are strongly typed. Every interface has its own interface identifier (a GUID, which is described later), thereby eliminating any chance of collision that would occur with human-readable names. The difference between components and interfaces has two important implications. When you create a new interface, you must also create a new identifier for that interface. When you use an interface, you must use the identifier for the interface to request a pointer to the interface. This explicit identification improves robustness by eliminating naming conflicts that would result in run-time failure. Interfaces are immutable. COM interfaces are never versioned, which means that version conflicts between new and old components are avoided. A new version of an interface, created by adding more functions or changing semantics, is an entirely new interface and is assigned a new, unique identifier. Therefore a new interface does not conflict with an old interface even if all that changed is one operation or the semantics (but not even the syntax) of an existing method. Note that, as an implementation efficiency, it is likely that two very similar interfaces can share a common internal implementation. For example, if a new interface adds only one method to an existing interface, and you as the component author wish to support both old-style and new-style clients, you would express both collections of capabilities through two interfaces, yet internally implement the old interfaces as a proper subset of the implementation of the new.
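The last point, two similar interfaces sharing one internal implementation, might look like the following sketch. It uses plain C++ abstract classes rather than the real COM headers, and the names IShape, IShape2, and Square are hypothetical; in real COM each interface would also derive from IUnknown and carry its own IID.

```cpp
#include <cassert>

// Original interface: published once and never changed afterwards.
struct IShape {
    virtual double Area() const = 0;
    // Plain-C++ convenience only; COM interfaces have no virtual destructor.
    virtual ~IShape() {}
};

// "New version": a separate interface that adds one method.
// The old interface is left untouched, so old clients keep working.
struct IShape2 : IShape {
    virtual double Perimeter() const = 0;
};

// One implementation serves both old-style and new-style clients.
class Square : public IShape2 {
public:
    explicit Square(double side) : side_(side) {}
    double Area() const override { return side_ * side_; }
    double Perimeter() const override { return 4.0 * side_; }
private:
    double side_;
};

// An old client knows only IShape.
double OldClient(const IShape& shape) { return shape.Area(); }

// A new client uses the extra method as well.
double NewClient(const IShape2& shape) {
    return shape.Area() + shape.Perimeter();
}
```

The old interface is implemented as a subset of the new one, so supporting both kinds of client costs the component author a single class.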

It is convenient to adopt a standard pictorial representation for COM components and their interfaces. The current convention is to draw each interface on a COM component as a "plug-in jack."

Figure 3. A typical picture of a COM component that supports three interfaces: A, B, and C.

Figure 4. Interfaces extend towards the clients connected to them.

Figure 5. Two applications may connect to each other's objects, in which case they extend their interfaces towards each other.

The unique use of interfaces in COM provides five major benefits:

1. The ability for functionality in applications (clients or servers of COM components) to evolve over time. This is accomplished through a request called QueryInterface, which absolutely all COM components support (or else they are not COM components). QueryInterface allows a COM component to make more interfaces available to new clients (that is, support new groups of functions) while at the same time retaining complete binary compatibility with existing client code. In other words, revising a COM component by adding new functionality will not require any recompilation of any existing clients of that component. This is a key solution to the problem of versioning, and is a fundamental requirement for achieving a component software market. COM additionally provides for robust versioning because COM interfaces are immutable, and COM components continue to support old interfaces even while adding new functionality through additional interfaces. This guarantees backwards compatibility as components are upgraded. Other proposed system object models generally allow developers to change existing interfaces, leading ultimately to versioning problems as components are upgraded. This freedom in other object models to change interfaces may appear on the surface to handle versioning, but in practice it does not work. For example, if version-checking is done only at object creation time, subsequent users of an instantiated object can easily fail because the object is of the right type but the wrong version. (Per-call version-checking is too expensive to even contemplate!)

2. Fast and simple object interaction. Once a client establishes a connection to a COM component, calls to that COM component's services (interface functions) are simply indirect function calls through two memory pointers. As a result, the performance overhead of interacting with an in-process COM component (a COM component that is in the same address space as the calling code) is negligible. Calls between COM components in the same process are only a handful of processor instructions slower than a standard direct function call and no slower than a compile-time bound C++ object invocation. In addition, using multiple interfaces per object is efficient because the cost of negotiating interfaces (via QueryInterface) is paid once per group of functions instead of once per function.

3. Interface reuse. Design experience suggests that there are many sets of operations that are useful across a broad range of components. For example, it is commonly useful to provide or use a set of functions for reading or writing streams of bytes.
In COM, components can reuse an existing interface (such as IStream) in a variety of areas. This not only allows for code reuse, but by reusing interfaces, the programmer learns the interface once and can apply it throughout many different applications.

4. "Local/Remote Transparency." The binary standard allows COM to intercept an interface call to an object and make instead a remote procedure call (RPC) to an object that is running in another process or on another machine. A key point is that the caller makes this call exactly as it would for an object in the same process. The binary standard enables COM to perform inter-process and cross-network function calls transparently. Although there is, of course, more overhead in making a remote procedure call, no special code is necessary in the client to differentiate an in-process object from an out-of-process object. This means that as long as the client is written from the start to handle RPC exceptions, all objects (in-process, cross-process, and remote) are available to clients in a uniform, transparent fashion. Microsoft will be providing a distributed version of COM that will require no modification to existing components in order to gain distributed capabilities. In other words, programmers are completely isolated from networking issues, and components shipped today will operate in a distributed fashion when this future version of COM is released.

5. Programming language independence. Any programming language that can create structures of pointers and explicitly or implicitly call functions through pointers can create and use COM components. COM components can be implemented in a number of different programming languages and used from clients that are written using completely different programming languages. Again, this is because COM, unlike an object-oriented programming language, represents a binary object standard, not a source code standard.

Globally Unique Identifiers (GUIDs)

COM uses globally unique identifiers (GUIDs), which are 128-bit integers that are, for all practical purposes, guaranteed to be unique across space and time, to identify every interface and every COM component class. These globally unique identifiers are UUIDs (universally unique IDs) as defined by the Open Software Foundation's Distributed Computing Environment. Human-readable names are assigned only for convenience and are locally scoped. This helps ensure that COM components do not accidentally connect to the "wrong" component, interface, or method, even in networks with millions of COM components. CLSIDs are GUIDs that refer to COM component classes, and IIDs are GUIDs that refer to interfaces. Microsoft supplies a tool (uuidgen) that automatically generates GUIDs. Additionally, the CoCreateGuid function is part of the COM application programming interface (API). Thus, developers create their own GUIDs when they develop COM components and custom interfaces. Through the use of defines, developers don't need to be exposed to the actual 128-bit GUID. For those who want to see real GUIDs in all their glory, the following example shows two GUIDs. CLSID_PHONEBOOK is a COM component class that gives users lookup access to a phone book.
IID_ILOOKUP is a custom interface implemented by the PhoneBook class that accesses the phonebook's database:

    DEFINE_GUID(CLSID_PHONEBOOK, 0xc4910d70, 0xba7d, 0x11cd, 0x94, 0xe8,\
        0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3);
    DEFINE_GUID(IID_ILOOKUP, 0xc4910d71, 0xba7d, 0x11cd, 0x94, 0xe8,\
        0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3);

The GUIDs are embedded in the component binary itself and are used by the COM system dynamically at bind time to ensure that no false connections are made between components.

IUnknown

COM defines one special interface, IUnknown, to implement some essential functionality. All COM components are required to implement the IUnknown interface and, conveniently, all other COM and OLE interfaces are derived from IUnknown. IUnknown has three methods: QueryInterface, AddRef, and Release. In C++ syntax, IUnknown looks like this:

    interface IUnknown
    {
        virtual HRESULT QueryInterface(const IID& iid, void** ppvObj) = 0;
        virtual ULONG AddRef() = 0;
        virtual ULONG Release() = 0;
    };
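To make the three methods concrete, here is a minimal sketch of how a component might implement them. It is written as portable C++ with stand-in types so that it compiles without the Windows headers: HResult, Iid, IUnknownSketch, IGreeter, Greeter, and the IID constants are all hypothetical, and a production implementation would use the real HRESULT and IID types together with thread-safe reference counting.

```cpp
#include <cassert>

typedef long HResult;                 // stand-in for HRESULT
const HResult kOk = 0;                // plays the role of S_OK
const HResult kNoInterface = -1;      // plays the role of E_NOINTERFACE
typedef int Iid;                      // stand-in for a 128-bit IID
const Iid kIidUnknown = 1;
const Iid kIidGreeter = 2;

struct IUnknownSketch {
    virtual HResult QueryInterface(Iid iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

struct IGreeter : IUnknownSketch {
    virtual int GreetingLength() = 0;
};

class Greeter final : public IGreeter {
public:
    // The object starts with one reference, owned by its creator.
    explicit Greeter(bool* destroyedFlag)
        : refs_(1), destroyed_(destroyedFlag) {}

    HResult QueryInterface(Iid iid, void** ppv) override {
        if (iid == kIidUnknown || iid == kIidGreeter) {
            *ppv = static_cast<IGreeter*>(this);
            AddRef();                 // the caller now owns one more reference
            return kOk;
        }
        *ppv = nullptr;
        return kNoInterface;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long remaining = --refs_;
        if (remaining == 0) {         // last reference gone: self-destruct
            *destroyed_ = true;
            delete this;
        }
        return remaining;
    }
    int GreetingLength() override { return 5; }  // length of "hello"

private:
    ~Greeter() {}                     // only Release may destroy the object
    unsigned long refs_;
    bool* destroyed_;                 // lets a caller observe destruction
};
```

Making the destructor private is a design choice in this sketch: it forces clients to go through Release, so the reference count stays the single authority on the object's lifetime.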

Figure 6 is a graphical representation of IUnknown.

Figure 6. The IUnknown interface AddRef and Release are simple reference counting methods. A COM component's AddRef method is called when another COM component is using the interface; the COM component's Release method is called when the other component no longer requires use of that interface. While the COM component's reference count is non-zero it must remain in memory; when the reference count becomes zero, the COM component can safely unload itself, because no other components hold references to it. QueryInterface is the mechanism that allows clients to dynamically discover (at run time) whether or not an interface is supported by a COM component; at the same time, it is the mechanism that a client uses to get an interface pointer from a COM component. When an application wants to use some function of a COM component, it calls that object's QueryInterface, requesting a pointer to the interface that implements the desired function. If the COM component supports that interface, it will return the appropriate interface pointer and a success code. If the COM component doesn't support the requested interface, it will return an error value. The application will then examine the return code; if successful, it will use the interface pointer to access the desired method. If the QueryInterface failed, the application will take some other action, letting the user know that the desired method is not available. The following example shows a call to QueryInterface on the PhoneBook component. We are asking this component, "Do you support the ILookup interface?" If the call returns successfully, we know that the COM component supports the ILookup interface and we have a pointer to use to call methods contained in the ILookup interface (either LookupByName or LookupByNumber). If not, we know that the PhoneBook COM component does not implement the ILookup interface.

    ILookup *pLookup = NULL;
    TCHAR *pszNumber = NULL;
    HRESULT hRes;

    // Call QueryInterface on the COM component PhoneBook, asking for a pointer
    // to the ILookup interface identified by a unique interface ID.
    hRes = pPhoneBook->QueryInterface( IID_ILOOKUP, (void **)&pLookup);
    if( SUCCEEDED( hRes ) )
    {
        TCHAR szName[] = TEXT("Daffy Duck");
        pLookup->LookupByName(szName, &pszNumber); // Use ILookup interface pointer.
        pLookup->Release(); // Finished using the ILookup interface pointer.
    }
    else
    {
        // Failed to acquire ILookup interface pointer.
    }

Note that AddRef() is not explicitly called in this case because the QueryInterface implementation increments the reference count before it returns an interface pointer.

COM Library

The COM Library is a system component that provides the mechanics of COM. The COM Library provides the ability to make IUnknown calls across processes; it also encapsulates all the "legwork" associated with launching components and establishing connections between components. Typically, when an application creates a COM component, it passes the CLSID of that COM component class to the COM Library. The COM Library uses that CLSID to look up the associated server code in the registration database. If the server is an executable, COM launches the EXE and waits for it to register its class factory through a call to CoRegisterClassObject. (A class factory is the mechanism in COM used to instantiate new COM components.) If that code happens to be a DLL, COM loads the DLL and calls DllGetClassObject. COM uses the object's IClassFactory to ask the class factory to create an instance of the COM component, and sends a pointer to the requested interface back to the calling application. The calling application neither knows nor cares where the server application is run; it just uses the returned interface pointer to communicate with the newly created COM component. The COM Library is implemented in COMPOBJ.DLL on 16-bit Windows, and OLE32.DLL on Windows 95 and Windows NT.
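The lookup-and-create sequence described above can be imitated in a few lines of portable C++. This is a toy model rather than the COM Library's actual mechanism: ClassId, Registry, Register, CreateInstance, IService, and PhoneBookService are hypothetical names, a plain std::map stands in for the registration database, and a function pointer stands in for a class factory.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for a component's interface.
struct IService {
    virtual std::string Name() const = 0;
    virtual ~IService() {}
};

typedef int ClassId;                               // stand-in for a CLSID
typedef std::unique_ptr<IService> (*FactoryFn)();  // stand-in for a class factory

// Stand-in for the registration database plus the component loader.
class Registry {
public:
    void Register(ClassId clsid, FactoryFn factory) {
        factories_[clsid] = factory;
    }

    // Look up the factory for clsid and ask it for a new instance.
    // Returns an empty pointer when the class is not registered.
    std::unique_ptr<IService> CreateInstance(ClassId clsid) const {
        std::map<ClassId, FactoryFn>::const_iterator it = factories_.find(clsid);
        if (it == factories_.end()) return std::unique_ptr<IService>();
        return it->second();
    }

private:
    std::map<ClassId, FactoryFn> factories_;
};

class PhoneBookService : public IService {
public:
    std::string Name() const override { return "PhoneBook"; }
};

std::unique_ptr<IService> MakePhoneBook() {
    return std::unique_ptr<IService>(new PhoneBookService());
}
```

The caller only ever names a class identifier and receives an interface pointer; where the implementation lives is hidden behind the registry, which mirrors the role the text assigns to the COM Library.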
Interfaces Summary

To summarize, COM defines several basic fundamentals that provide the underpinnings of the object model. The binary standard allows components written in different languages to call each other's functions. Interfaces are logical groups of related functions, that is, functions that together provide some well-defined capability. IUnknown is the interface that COM defines to allow components to control their own lifespan and to dynamically determine another component's capabilities. A COM component implements IUnknown to control its lifespan and to provide access to the interfaces it supports. A COM component does not provide direct access to its data. GUIDs provide a unique identifier for each class and interface, thereby preventing naming conflicts. And finally, the COM Library is implemented as part of the operating system, and provides the "legwork" associated with finding and launching COM components. Now that we have a good understanding of COM's fundamental pieces, let's look at how these pieces fit together to enable component software.

COM Solves the Component Software Problem

COM addresses the four basic problems associated with component software:

Basic component interoperability
Versioning
Language independence
Transparent cross-process interoperability

Additionally, COM provides a high-performance architecture to meet the requirements of a commercial component market.

Basic Interoperability and Performance

These are provided by COM's use of vtables to define a binary interface standard for method calling between components. Calls between COM components in the same process are only a handful of processor instructions slower than a standard direct function call and no slower than a compile-time bound C++ object invocation.

Versioning

A good versioning mechanism allows one system component to be updated without requiring updates to all the other components in the system. Versioning in COM is implemented using interfaces and IUnknown::QueryInterface. The COM design completely eliminates the need for things like version repositories or central management of component versions. When a software module is updated, it is generally to add new functionality, or to improve existing functionality. In COM, you add new functionality to your COM component by adding support for new interfaces. Because the existing interfaces don't change, other components that rely on those interfaces continue to work. Newer components that know about the new interfaces can use those newly exposed interfaces. Because QueryInterface calls are made at run time without any expensive call to some "capabilities database" (as used in some other system object models), the current capabilities of a COM component can be efficiently evaluated each time the component is used. When new features become available, applications that know how to use them will begin to do so immediately. Improving existing functionality is even easier. Because the syntax and semantics of an interface remain constant, you are free to change the implementation of an interface, without breaking other developers' components that rely on the interface.
For example, say you have a component that supports the (hypothetical) IStack interface, which (hypothetically) includes methods like Push and Pop. You've currently implemented the interface as an array, but you decide that a linked list would be more appropriate. Because the methods and parameters do not change, you can freely replace the old implementation with a new one, and applications that use your component will get the improved linked list functionality "for free." Windows and OLE use this technique to provide improved system support. For example, in OLE today, Structured Storage is implemented as a set of interfaces that currently use the C run-time file I/O functions internally. In the next major release of the Windows NT operating system, those same interfaces will write directly to the file system. The syntax and semantics of the interfaces remain constant; only the implementation changes. Existing applications will be able to use the new implementation without any changes; they will get the improved functionality "for free."

The combination of the use of interfaces (immutable, well-defined "functionality sets" that are exposed by components) and QueryInterface (the ability to cheaply determine at run time the capabilities of a specific COM component) enables COM to provide an architecture in which components can be dynamically updated, without requiring updates to other reliant components. This is a fundamental strength of COM over other proposed object models. COM solves the versioning/evolution problem: the functionality of objects can change independently of the clients of those objects without rendering existing clients incompatible. In other words, COM defines a system in which components continue to support the interfaces through which they provided services to older clients, as well as support new and better interfaces through which they can provide services to newer clients. At run time, old and new clients can safely coexist with a given COM component. Errors can only occur at easily handled times: bind time or during a QueryInterface call. There is no chance for random crashes, such as those that occur when an expected method on an object simply does not exist, or its parameters have changed.
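The run-time capability check described in this section can be sketched as follows. To keep the example short, it uses a simplified stand-in for QueryInterface that returns a pointer or null and omits reference counting; Iid, IQueryable, ISpell, ISpell2, SpellerV1, SpellerV2, and CorrectWord are hypothetical names, not part of COM.

```cpp
#include <cassert>
#include <string>

typedef int Iid;            // stand-in for a 128-bit IID
const Iid kIidSpell = 1;    // original interface
const Iid kIidSpell2 = 2;   // later, separate interface

struct IQueryable {
    // Simplified QueryInterface: returns the interface or a null pointer.
    virtual void* Query(Iid iid) = 0;
    virtual ~IQueryable() {}
};
struct ISpell {
    virtual bool Check(const std::string& word) = 0;
    virtual ~ISpell() {}
};
struct ISpell2 {
    virtual std::string Suggest(const std::string& word) = 0;
    virtual ~ISpell2() {}
};

// Version 1 of the component supports only ISpell.
class SpellerV1 : public IQueryable, public ISpell {
public:
    void* Query(Iid iid) override {
        if (iid == kIidSpell) return static_cast<ISpell*>(this);
        return nullptr;
    }
    bool Check(const std::string& word) override { return word == "component"; }
};

// Version 2 keeps ISpell unchanged and adds ISpell2.
class SpellerV2 : public IQueryable, public ISpell, public ISpell2 {
public:
    void* Query(Iid iid) override {
        if (iid == kIidSpell) return static_cast<ISpell*>(this);
        if (iid == kIidSpell2) return static_cast<ISpell2*>(this);
        return nullptr;
    }
    bool Check(const std::string& word) override { return word == "component"; }
    std::string Suggest(const std::string&) override { return "component"; }
};

// A newer client probes for ISpell2 at run time and falls back to ISpell.
std::string CorrectWord(IQueryable& speller, const std::string& word) {
    if (void* p = speller.Query(kIidSpell2)) {
        return static_cast<ISpell2*>(p)->Suggest(word);
    }
    ISpell* basic = static_cast<ISpell*>(speller.Query(kIidSpell));
    return (basic && basic->Check(word)) ? word : "?";
}
```

The same client binary works against both versions of the component: the only place a missing capability can surface is the Query call, which the client handles explicitly.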
Language Independence Components can be implemented in a number of different programming languages and used from clients that are written using completely different programming languages. Again, this is because COM, unlike an object-oriented programming (OOP) language, represents a binary object standard, not a source code standard. This is a fundamental benefit of a component software architecture over object-oriented programming languages. Objects defined in an OOP language typically interact only with other objects defined in the same language. This necessarily limits their reuse. At the same time, an OOP language can be used in building COM components, so the two technologies are actually quite complementary. COM can be used to "package" and further encapsulate OOP objects into components for widespread reuse, even within very different programming languages. Transparent Cross-Process Interoperability It would be relatively easy to address the problem of providing a component software architecture if software developers could assume that all interactions between components occurred within the same process space. In fact, other proposed system object models do make this basic assumption. The bulk of the work in defining a true component software model involves the transparent bridging of process barriers. In the design of COM, it was understood from the beginning that interoperability had to occur across process spaces because most applications could not be expected to be rewritten as dynamic-link libraries (DLLs) loaded into shared memory.

Also, by solving the problem of cross-process interoperability, COM solves the problem of components communicating transparently between different computers across a network, using exactly the same programming interface used for components communicating on the same computer. The COM Library is the key to providing transparent cross-process interoperability. As discussed in the last section, the COM Library encapsulates all the "legwork" associated with finding and launching components and managing the communication between components. As shown earlier, the COM Library insulates components from the location differences. This means that COM components can interoperate freely with other COM components running in the same process, in a different process, or across the network. The code needed to implement or use a COM component in any of those cases is exactly the same. Thus, when a new COM Library is released with support for cross-network interaction, existing COM components will be able to work in a distributed fashion without requiring source-code changes, recompilation, or redistribution to customers. Local and Remote Transparency COM is designed to allow clients to transparently communicate with components regardless of where those components are running, be it the same process, the same machine, or a different machine. This means that there is a single programming model for all types of COM components, not only for clients of those COM components but also for the servers of those COM components. From a client's point of view, all COM components are accessed through interface pointers. A pointer must be in-process, and in fact, any call to an interface function always reaches some piece of in-process code first. If the COM component is in-process, the call reaches it directly. 
If the COM component is out-of-process, the call first reaches what is called a proxy object provided by COM itself that generates the appropriate remote procedure call to the other process or the other machine. Note that the client should be programmed from the start to handle RPC exceptions; then it can transparently connect to an object that is in-process, cross-process, or remote. From a server's point of view, all calls to a COM component's interface functions are made through a pointer to that interface. Again, a pointer only has context in a single process, and so the caller must always be some piece of in-process code. If the COM component is in-process, the caller is the client itself. Otherwise, the caller is a stub object provided by COM that picks up the remote procedure call from the proxy in the client process and turns it into an interface call to the server COM component. As far as both clients and servers know, they always communicate directly with some other in-process code, as illustrated in Figure 7.
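The proxy and stub roles can be illustrated with a small in-process simulation. This is a sketch under stated assumptions, not how COM marshals calls: IAdder, Adder, AdderStub, AdderProxy, and ClientSum are hypothetical names, the message is a plain text string, and a direct function call stands in for the RPC transport between processes.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// The interface that both the real component and the proxy expose.
struct IAdder {
    virtual int Add(int a, int b) = 0;
    virtual ~IAdder() {}
};

// The real component, living in the "server" process.
class Adder : public IAdder {
public:
    int Add(int a, int b) override { return a + b; }
};

// Stub: unpacks a request message, calls the real component, packs the reply.
class AdderStub {
public:
    explicit AdderStub(IAdder& target) : target_(target) {}
    std::string Dispatch(const std::string& request) {
        std::istringstream in(request);
        std::string method;
        int a = 0, b = 0;
        in >> method >> a >> b;
        std::ostringstream out;
        if (in && method == "Add") out << "OK " << target_.Add(a, b);
        else out << "ERR";
        return out.str();
    }
private:
    IAdder& target_;
};

// Proxy: looks like IAdder to the client but only forwards messages.
// A direct call to the stub stands in for the real RPC transport.
class AdderProxy : public IAdder {
public:
    explicit AdderProxy(AdderStub& stub) : stub_(stub) {}
    int Add(int a, int b) override {
        std::ostringstream request;
        request << "Add " << a << ' ' << b;
        std::istringstream reply(stub_.Dispatch(request.str()));
        std::string status;
        int result = 0;
        reply >> status >> result;
        return result;  // a real proxy would surface transport errors instead
    }
private:
    AdderStub& stub_;
};

// Client code is identical for the in-process object and the proxy.
int ClientSum(IAdder& adder) { return adder.Add(2, 3); }
```

ClientSum cannot tell whether it was handed the Adder itself or the AdderProxy, which is the transparency the text describes: both client and server only ever talk to in-process code.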

Figure 7. Clients always call in-process code; COM components are always called by in-process code. COM provides the underlying transparent RPC. The bottom line is that dealing with local or remote COM components is transparent and identical to dealing with in-process COM components. This local/remote transparency has a number of key benefits: A common solution to problems that are independent of the distance between client and server: For example, connection, function invocation, interface negotiation, feature evolution, and so forth, occur the same way for components interoperating in the same process as for components interoperating across global networks. Programmers leverage their learning. New services are simply exposed through new interfaces, and once programmers learn how to deal with interfaces, they already know how to deal with new services that will be created in the future. This is a great improvement over environments in which each service is exposed in a completely different fashion. For example, Microsoft is working with other ISVs to extend OLE services. These new services, which will be quite diverse in function, will all be very similar in their implementations because they will simply be sets of COM interfaces.

Systems implementation is centralized. The implementors of COM can focus on making the central process of providing this transparency as efficient and powerful as possible, benefitting every piece of code that uses COM.

Interface designers concentrate on design. In designing a suite of interfaces, the designers can spend their time on the essence of the design (the contracts between the parties) without having to think about the underlying communication mechanisms for any interoperability scenario. COM provides those mechanisms for free, including network transparency.

COM and the Client/Server Model

Clients, Servers, and Object Implementors

The interaction between COM components and their users is, in one sense, based on a client/server model. We have already used the term client to refer to some piece of code that is using the services of a COM component. Because a COM component supplies services, the implementor of that component is usually called the server: the COM component that serves those capabilities. A client/server architecture in any computing environment leads to greater robustness: if a server process crashes or is otherwise disconnected from a client, the client can handle that problem gracefully and even restart the server if necessary. Because robustness is a primary goal in COM, a client/server model naturally fits.

Because COM allows clients and servers to exist in different process spaces (as desired by component providers), crash protection can be provided between the different components making up an application. For example, if one component in a component-based application fails, the entire application will not crash. In contrast, object models that are only in-process cannot provide this same fault tolerance. The ability to cleanly separate object clients and object servers in different process spaces is very important for a component software standard that promises to support sophisticated applications.

Unlike other object models we know of, COM allows clients to also represent themselves as servers. In fact, many interesting designs have two (or more) components using interface pointers on each other, thus becoming clients and servers simultaneously. In this sense, COM also supports the notion of peer-to-peer computing, and so is quite different from (and, we think, more flexible and useful than) other proposed object models in which clients never represent themselves as objects.

Servers: In-Process and Out-Of-Process

In general, a server is some piece of code that implements some COM component such that the COM Library and its services can run that code and have it create COM components. Any specific server can be implemented in one of a number of flavors, depending on the structure of the code module and its relationship to the client process that will be using it. A server is either in-process, which means its code executes in the same process space as the client (as a DLL), or out-of-process, which means it runs in another process on the same machine or in another process on a remote machine (as an EXE). These three types of servers are called in-process, local, and remote.
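Whichever flavor a server takes, the client calls it through the same interface. The following is only a portable sketch of that idea, not COM's actual marshaling: the Proxy and Stub classes, the "OK:"/"ERR" reply format, and the phone book entry are all hypothetical stand-ins invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface, loosely modeled on a phone book lookup.
struct ILookup {
    virtual ~ILookup() = default;
    virtual bool LookupByName(const std::string& name, std::string* number) = 0;
};

// The real object. Used directly, it plays the role of an in-process server.
class PhoneBook : public ILookup {
public:
    PhoneBook() { entries_["Alice"] = "555-0100"; }  // made-up sample entry
    bool LookupByName(const std::string& name, std::string* number) override {
        auto it = entries_.find(name);
        if (it == entries_.end()) return false;
        *number = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> entries_;
};

// Stub: sits on the server side, unpacks a request and calls the real object.
class Stub {
public:
    explicit Stub(ILookup& target) : target_(target) {}
    std::string Dispatch(const std::string& request) {
        std::string number;
        if (!target_.LookupByName(request, &number)) return std::string("ERR");
        return "OK:" + number;
    }
private:
    ILookup& target_;
};

// Proxy: sits on the client side and implements the same interface by
// packing each call into a message for the stub.
class Proxy : public ILookup {
public:
    explicit Proxy(Stub& stub) : stub_(stub) {}
    bool LookupByName(const std::string& name, std::string* number) override {
        const std::string reply = stub_.Dispatch(name);
        if (reply.compare(0, 3, "OK:") != 0) return false;
        *number = reply.substr(3);
        return true;
    }
private:
    Stub& stub_;
};

// Client code sees only ILookup and cannot tell which kind of server it has.
std::string Client(ILookup& book, const std::string& name) {
    std::string number;
    return book.LookupByName(name, &number) ? number : std::string("(not found)");
}
```

In this sketch, Client(real, "Alice") and Client(proxy, "Alice") return the same string, which is the sense in which all servers look the same to the client.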

COM component implementors choose the type of server based on the requirements of implementation and deployment. COM is designed to handle all situations, from those that require the deployment of many small, lightweight in-process components (like OLE Controls, but conceivably even smaller) up to those that require deployment of huge components, such as a central corporate database server. And as discussed, all COM servers look the same to client applications, whether they are in-process, local, or remote.

Custom Interfaces and Interface Definitions

When a developer defines a new custom interface, she or he can create an interface definition using the Interface Description Language (IDL). From this interface definition, the Microsoft IDL compiler generates header files for use by applications using that interface, and source code to create proxy and stub objects that handle remote procedure calls. The IDL used and supplied by Microsoft is based on simple extensions to the OSF DCE IDL, a growing industry standard for RPC-based distributed computing.

IDL is only a tool for the convenience of the interface designer and is not central to COM's interoperability. It simply saves the developer from manually creating header files for each programming environment and from creating proxy and stub objects by hand. Note that IDL is not necessary unless you are defining a custom interface for an object; proxy and stub objects are already provided with the COM Component Library for all COM and OLE interfaces. Here is the IDL file used to define a custom interface, ILookup, that is implemented by the PhoneBook object:

    [
        object,
        // Use the GUID for the ILookup interface.
        uuid(c4910d71-ba7d-11cd-94e8-08001701a8a3),
        pointer_default(unique)
    ]
    interface ILookup : IUnknown  // ILookup interface derives from IUnknown.
    {
        import "unknwn.idl";  // Bring in the supplied IUnknown IDL.

        // Define member function LookupByName:
        HRESULT LookupByName([in] LPTSTR lpName, [out, string] WCHAR **lplpNumber);

        // Define member function LookupByNumber:
        HRESULT LookupByNumber([in] LPTSTR lpNumber, [out, string] WCHAR **lplpName);
    }

COM and Application Structure

COM is not a specification for how applications are structured; it is a specification for how applications interoperate. For this reason, COM is not concerned with the internal structure of an application. That is the job of the programmer, and also depends on the programming languages and development environments used. Conversely, programming environments have no set standards for working with objects outside the immediate application. C++, for example, works extremely well with objects inside an application, but has no support for working with objects outside the application. Generally, other programming languages are the same. COM, through language-independent interfaces, picks up where programming languages leave off, providing network-wide interoperability of components making up an integrated application.

Client/Server Summary

The core of the Component Object Model is a specification for how components and their clients interact. As a specification it defines a number of other standards for interoperability of software components:

A binary standard for function calling between components.
A provision for strongly typed groupings of functions into interfaces.
A base IUnknown interface providing:
    A way for components to dynamically discover the interfaces supported by other components (QueryInterface).
    Reference counting to encapsulate component lifetime (AddRef and Release).
A mechanism to uniquely identify components and their interfaces (GUIDs).
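The roles of QueryInterface, AddRef, and Release can be illustrated with a small portable sketch. It does not use the real Windows headers: IUnknownLike, ILookupLike, the Iid enumeration (standing in for GUIDs), the bool return values (standing in for HRESULTs), and the sample phone book entry are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for interface GUIDs (real COM uses 128-bit identifiers).
enum class Iid { Unknown, Lookup, Other };

// Hypothetical portable analogue of IUnknown.
struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() = default;  // objects are destroyed through Release
};

// Hypothetical analogue of a custom interface derived from the base interface.
struct ILookupLike : IUnknownLike {
    virtual bool LookupByName(const std::string& name, std::string* number) = 0;
};

class PhoneBook final : public ILookupLike {
public:
    explicit PhoneBook(bool* destroyedFlag) : destroyed_(destroyedFlag) {
        entries_["Alice"] = "555-0100";  // made-up sample entry
    }
    // Interface discovery: hand out a pointer only for supported interfaces.
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == Iid::Lookup) {
            *out = static_cast<ILookupLike*>(this);
        } else {
            *out = nullptr;
            return false;
        }
        AddRef();  // every pointer handed out carries its own reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // the object manages its own lifetime
        return left;
    }
    bool LookupByName(const std::string& name, std::string* number) override {
        auto it = entries_.find(name);
        if (it == entries_.end()) return false;
        *number = it->second;
        return true;
    }
private:
    ~PhoneBook() override { *destroyed_ = true; }
    bool* destroyed_;
    unsigned long refs_ = 1;  // the creator holds the first reference
    std::map<std::string, std::string> entries_;
};

// Walks through a typical client sequence and reports whether it behaved.
bool DemoLifetime() {
    bool destroyed = false;
    IUnknownLike* unk = new PhoneBook(&destroyed);              // count is 1
    void* raw = nullptr;
    if (!unk->QueryInterface(Iid::Lookup, &raw)) return false;  // count is 2
    ILookupLike* lookup = static_cast<ILookupLike*>(raw);
    std::string number;
    const bool found = lookup->LookupByName("Alice", &number);
    lookup->Release();                                          // count is 1
    unk->Release();                                             // count is 0
    return found && number == "555-0100" && destroyed;
}
```

The client never calls delete; it only releases the references it was given, and the object destroys itself when the last one goes away.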

In addition to being a specification, COM is also an implementation contained in the COM Library. The implementation is provided through a library (such as a DLL on Microsoft Windows, Windows 95, or Windows NT) that includes:

A small number of fundamental API functions that facilitate the creation of COM applications, both clients and servers. For clients, COM supplies basic COM component creation functions; for servers it supplies facilities to expose their COM components.
Implementation locator services through which COM determines from a class identifier which server implements that class and where that server is located. This includes support for a level of indirection, usually a system registry, between the identity of a COM component class and the packaging of the implementation such that clients are independent of the packaging (so packaging can change in the future).
Transparent remote procedure calls when a COM component is running in a local or remote server, as illustrated in Figure 7.
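The level of indirection between a class identifier and its packaging can be sketched as a lookup table of factories. This is a portable illustration under stated assumptions, not the COM Library's API: Locator, Component, the string class identifier, and both phone book classes are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical component interface seen by clients.
struct Component {
    virtual ~Component() = default;
    virtual std::string Describe() const = 0;
};

using ClassId = std::string;  // stand-in for a CLSID (a GUID in real COM)
using Factory = std::function<std::unique_ptr<Component>()>;

// Stand-in for the registry-backed locator: maps a class identifier to
// whatever code currently knows how to create that class.
class Locator {
public:
    void Register(const ClassId& id, Factory factory) {
        table_[id] = std::move(factory);
    }
    std::unique_ptr<Component> Create(const ClassId& id) const {
        auto it = table_.find(id);
        if (it == table_.end()) return nullptr;  // class not registered
        return it->second();
    }
private:
    std::map<ClassId, Factory> table_;
};

// Two hypothetical packagings of the same class.
class InProcPhoneBook : public Component {
public:
    std::string Describe() const override { return "in-process phone book"; }
};
class LocalServerPhoneBook : public Component {
public:
    std::string Describe() const override { return "local-server phone book"; }
};

// Client code: it names only the class identifier, never the packaging.
std::string DescribeViaLocator(const Locator& locator, const ClassId& id) {
    std::unique_ptr<Component> component = locator.Create(id);
    return component ? component->Describe() : std::string("(unknown class)");
}
```

Re-registering the same identifier with a different factory changes which implementation the client receives without any change to DescribeViaLocator, which is the point of the indirection.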

In general, only one vendor needs to, or should, implement a COM Library for any particular operating system. For example, Microsoft is implementing COM on Windows, Windows NT, and the Apple Macintosh. Other vendors are implementing COM on other operating systems, including specific versions of UNIX. Also, it is important to note that COM draws a very clear distinction between the object model and the wire-level protocols for distributed services, which are the same on all platforms, and platform-specific operating system services (for example, local security, network transports, and so on).

Therefore, developers are not constrained to use new and specific models for the services of different operating systems, yet they can develop components that interoperate with components on other platforms. All in all, only with a binary standard on a given platform and a wire-level protocol for cross-machine component interaction can an object model provide the type of structure necessary for full interoperability between all applications and between all different machines in a network. With a binary and network standard, COM opens the doors for a revolution in innovation without a revolution in programming or programming tools.

Appendix 1. The Problem with Implementation Inheritance

Implementation inheritance, the ability of one component to "subclass" or inherit some of its functionality from another component, is a very useful technology for building applications. Implementation inheritance, however, can create many problems in a distributed, evolving object system. The problem with implementation inheritance is that the "contract" or relationship between components in an implementation hierarchy is not clearly defined; it is implicit and ambiguous. When the parent or child component changes its behavior unexpectedly, the behavior of related components may become undefined. This is not a problem when the implementation hierarchy is under the control of a defined group of programmers who can make updates to all components simultaneously. But it is precisely this ability to control and change a set of related components simultaneously that differentiates an application, even a complex application, from a true distributed object system. So although implementation inheritance can be a very good thing for building applications, it is not appropriate for a system object model that defines an architecture for component software.
In a system built of components provided by a variety of vendors, it is critical that a given component provider be able to revise, update, and distribute (or redistribute) a product without breaking existing code in the field that is using the previous revision or revisions of that component. In order to achieve this, it is necessary that the actual interface on the component used by such clients be crystal clear to both parties. Otherwise, how can the component provider be sure to maintain that interface, and thus not break the existing clients? From observation, the problem with implementation inheritance is that it is significantly easier for programmers not to be clear about the actual interface between a base and derived class than it is to be clear. This usually leads implementors of derived classes to require source code to the base classes; in fact, most application framework development environments that are based on inheritance provide full source code for exactly this reason.

The bottom line is that inheritance, although very powerful for managing source code in a project, is not suitable for creating a component-based system where the goal is for components to reuse each other's implementations without knowing any internal structures of the other objects. Inheritance violates the principle of encapsulation, the most important aspect of an object-oriented system.

An Example of the Implementation Inheritance Problem

The following C++ example illustrates the technical heart of the robustness problem:

    class CBase
    {
    public:
        void DoSomething(void) { ... if (condition) this->Sample(); ... }
        virtual void Sample(void);
    };

    class CDerived : public CBase
    {
    public:
        virtual void Sample(void);
    };

This is the classic paradigm of reuse in implementation inheritance: a base class periodically makes calls to its own virtual functions, which may be overridden by its derived classes. In practice, in such a situation CDerived can become, and therefore often will become, intimately dependent on exactly when and under what conditions Sample will be invoked by the class CBase. If, at present, all such Sample invocations by CBase are intended (long-term) as hooks for the derived class, there is no problem. There are two cases, however: either CBase::Sample is implemented as

    void CBase::Sample(void)
    {
        // Do absolutely nothing.
    }

or it is not empty. If the implementation of CBase::Sample is not empty, it is carrying out some useful and likely needed transformation on the internal state of CBase. Thus, it is questionable whether all of the invocations of Sample are for the support of derived classes; some of them instead are likely to be only for the purpose of carrying out this transformation. That transformation is part of the current implementation of CBase. Thus, in summary, CDerived becomes coupled to details of that current implementation of CBase; the interface between the two is not clear and precise.

Further coupling comes from the fact that the implementation of CDerived::Sample, in addition to performing its own role, must be sure to carry out the transformation carried out in CBase::Sample. It can do this by reimplementing the code in CBase::Sample itself, but this causes obvious coupling problems. Alternatively, it can itself invoke the CBase::Sample method:

    void CDerived::Sample(void)
    {
        [Do some work]
        ...
        CBase::Sample();
        ...
        [Do other work]
    }

However, it is very unclear what is appropriate or possible for CDerived to do in the areas marked "Do some work" and "Do other work." What is appropriate depends, again, heavily on the current implementation of CBase::Sample.

If, in contrast, CBase::Sample is empty, we likely are not in immediate danger of surreptitious coupling. In the implementation of CBase, invoking Sample clearly serves no immediately useful purpose, and so it is likely that indeed all invocations of Sample in CBase are only for the support of CDerived. Consider, however, the case in which the CDerived class is reused:

    class CAnother : public CDerived
    {
    public:
        virtual void Sample(void);
    };

Though CBase::Sample had a trivial implementation, CDerived::Sample will not (why override it if otherwise?). The relationship of CAnother to CDerived thus becomes as problematic as the CDerived-CBase relationship in the previous case. This is the architectural heart of the problem observed in practice that leads to a view that implementation inheritance is unacceptable as the mechanism by which independently developed binary components are reused and refined.

COM provides two other mechanisms for code reuse, called containment/delegation and aggregation. Both of these reuse mechanisms allow objects to exploit existing implementation while avoiding the problems of implementation inheritance. See Appendix 2 of this article for an overview of these alternate reuse mechanisms.

Appendix 2. COM Reusability Mechanisms

The key point to building reusable components is black-box reuse, which means the piece of code attempting to reuse another component knows nothing, and needs to know nothing, about the internal structure or implementation of the component being used. In other words, the code attempting to reuse a component depends upon the behavior of the component and not the exact implementation. As illustrated in Appendix 1, implementation inheritance does not achieve black-box reuse.
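To make that failure of black-box reuse concrete, here is a small runnable sketch of the CBase/CDerived situation from Appendix 1. The revision parameter, the items counter, and the ItemsAfterOneCall helper are hypothetical additions: the revision number simulates a vendor shipping a new base class whose DoSomething calls Sample a different number of times, an internal change the derived class never agreed to.

```cpp
#include <cassert>

class CBase {
public:
    explicit CBase(int revision) : revision_(revision) {}
    virtual ~CBase() = default;

    void DoSomething() {
        Sample();                      // revision 1 calls the hook once
        if (revision_ >= 2) Sample();  // revision 2 calls it again internally
    }
    virtual void Sample() {}           // empty hook in the base class

private:
    int revision_;  // hypothetical stand-in for "which base version shipped"
};

class CDerived : public CBase {
public:
    using CBase::CBase;  // reuse the base constructor

    // Silently assumes Sample runs exactly once per DoSomething call.
    void Sample() override { ++items_; }
    int items() const { return items_; }

private:
    int items_ = 0;
};

// Runs one DoSomething call against the given base revision and reports
// how many items the derived class believes it has processed.
int ItemsAfterOneCall(int revision) {
    CDerived derived(revision);
    derived.DoSomething();
    return derived.items();
}
```

ItemsAfterOneCall(1) yields 1 while ItemsAfterOneCall(2) yields 2, even though CDerived's source did not change; the derived class was coupled to when and how often the base class invokes its own virtual function.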
To achieve black-box reusability, COM supports two mechanisms through which one COM component may reuse another. For convenience, the object being reused is called the inner object and the object making use of that inner object is the outer object.

Containment/Delegation. The outer object behaves like an object client to the inner object. The outer object "contains" the inner object, and when the outer object wishes to use the services of the inner object, the outer object simply delegates implementation to the inner object's interfaces. In other words, the outer object uses the inner object's services to implement some (or possibly all) of its own functionality.

Aggregation. The outer object wishes to expose interfaces from the inner object as if they were implemented on the outer object itself. This is useful when the outer object would always delegate every call to one of its interfaces to the same interface of the inner object. Aggregation is a convenience to allow the outer object to avoid extra implementation overhead in such cases.

These two mechanisms are illustrated in Figures 8 and 9. The important part of both these mechanisms is how the outer object appears to its clients. As far as the clients are concerned, both objects implement interfaces A, B, and C. Furthermore, the client treats the outer object as a black box, and thus does not care, nor does it need to care, about the internal structure of the outer object; the client only cares about behavior.
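Containment and delegation can be sketched in a few lines of portable C++. The interfaces IA and IC, the Inner and Outer classes, and the arithmetic they perform are hypothetical, chosen only to show the outer object acting as an ordinary client of the inner object while presenting the interface as its own.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Two hypothetical interfaces the outer object presents to its clients.
struct IA {
    virtual ~IA() = default;
    virtual std::string NameA() const = 0;
};
struct IC {
    virtual ~IC() = default;
    virtual int Compute(int x) const = 0;
};

// Inner object: an existing implementation of IC that is being reused.
class Inner : public IC {
public:
    int Compute(int x) const override { return x * 2; }
};

// Outer object: implements IA itself and implements IC by delegating to a
// contained Inner, which it uses only through the IC interface.
class Outer : public IA, public IC {
public:
    Outer() : inner_(new Inner()) {}
    std::string NameA() const override { return "outer-A"; }
    int Compute(int x) const override {
        return inner_->Compute(x) + 1;  // delegate, then add the outer's own step
    }
private:
    std::unique_ptr<IC> inner_;  // contained; never exposed to clients
};

// A client sees only IC and cannot tell whether containment is involved.
int UseC(const IC& c) { return c.Compute(20); }
```

Because Outer depends only on the behavior promised by IC, the Inner class could be replaced by any other IC implementation without touching Outer's clients.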

Figure 8. Containment of an inner object and delegation to its interfaces

Aggregation is almost as simple to implement. The trick here is for COM to preserve the function of QueryInterface for COM component clients even as an object exposes another COM component's interfaces as its own. The solution is for the inner object to delegate IUnknown calls in its own interfaces, but also allow the outer object to access the inner object's IUnknown functions directly. COM provides specific support for this solution.

Figure 9. Aggregation of an inner object where the outer object exposes one or more of the inner object's interfaces as its own
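The delegation rule for aggregation can be sketched as follows. This is a simplified portable model, not the real COM aggregation protocol: IUnknownLike, the Iid enumeration, InnerQueryInterface (standing in for the inner object's directly accessible IUnknown), and the returned values are all assumptions made for illustration.

```cpp
#include <cassert>

enum class Iid { Unknown, A, C };  // stand-ins for interface GUIDs

// Hypothetical portable analogue of IUnknown.
struct IUnknownLike {
    virtual ~IUnknownLike() = default;
    virtual bool QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct IA : IUnknownLike { virtual int ValueA() = 0; };
struct IC : IUnknownLike { virtual int ValueC() = 0; };

// Inner object: the IUnknown methods on its IC interface are forwarded to
// the outer object, so clients see a single identity and lifetime.
class Inner final : public IC {
public:
    explicit Inner(IUnknownLike* controlling) : outer_(controlling) {}
    bool QueryInterface(Iid iid, void** out) override {
        return outer_->QueryInterface(iid, out);
    }
    unsigned long AddRef() override { return outer_->AddRef(); }
    unsigned long Release() override { return outer_->Release(); }
    int ValueC() override { return 3; }

    // Direct access that does not forward the query; for the outer object only.
    bool InnerQueryInterface(Iid iid, void** out) {
        if (iid != Iid::C) { *out = nullptr; return false; }
        *out = static_cast<IC*>(this);
        AddRef();  // forwarded, so the outer object's count goes up
        return true;
    }
private:
    IUnknownLike* outer_;  // not reference-counted, which avoids a cycle
};

// Outer object: implements IA itself and exposes the inner IC as its own.
class Outer final : public IA {
public:
    Outer() : refs_(1), inner_(this) {}
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown) { *out = static_cast<IUnknownLike*>(this); AddRef(); return true; }
        if (iid == Iid::A) { *out = static_cast<IA*>(this); AddRef(); return true; }
        return inner_.InnerQueryInterface(iid, out);
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // also destroys the aggregated inner object
        return left;
    }
    int ValueA() override { return 1; }
private:
    unsigned long refs_;
    Inner inner_;
};

// Queries from IC back to IA, which only works because the inner object
// forwards QueryInterface to the outer object.
bool DemoAggregation() {
    Outer* outer = new Outer();                                // count is 1
    void* rawC = nullptr;
    if (!outer->QueryInterface(Iid::C, &rawC)) return false;   // count is 2
    IC* c = static_cast<IC*>(rawC);
    void* rawA = nullptr;
    if (!c->QueryInterface(Iid::A, &rawA)) return false;       // count is 3
    IA* a = static_cast<IA*>(rawA);
    const int sum = a->ValueA() + c->ValueC();
    a->Release();                                              // count is 2
    c->Release();                                              // count is 1
    return outer->Release() == 0 && sum == 4;
}
```

Unlike containment, the outer object writes no forwarding code for ValueC; the client calls the inner object's implementation directly, and only the IUnknown methods are routed through the outer object.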
