INTRODUCTION
September 14, 2005

In the late summer of 2004, I became interested in identifying the best return type for factory functions. I ultimately drafted an article on the topic, which I intended to submit for publication in some magazine. I sent the draft to a number of people for comments, but then I got caught up in finishing Effective C++, Third Edition, and by the time I got back to the article, it was many months later.

As I reviewed the article and the comments I'd received on it, I realized that (1) a lot of the material had been incorporated into the new version of Effective C++, where I'd done a better job writing it up, and (2) revising the article for publication and bringing it up to date would require more time and energy than I had available.

Still, I didn't want to abandon the article completely, because (1) I'd already put a lot of time and energy into it, and (2) I felt that some of the material -- notably the interactions of auto_ptr and tr1::shared_ptr -- continued to be worthwhile. I thus decided to annotate my November 2004 draft with comments on things I'd change if I were to revise the article for publication, then make the annotated draft available on the web. The result is what you see below.

Please note that what you see does not necessarily reflect what would appear in a final version of the article. If I were to pursue publication, I'd:

- Modify the draft in accord with the annotations.
- Iteratively review the revised draft myself, making new changes until I was happy with it.
- Send the new draft to colleagues for review.
- Revise the draft to take the latest comments into account.
- Iteratively review the new draft, making changes until I was happy with it.
- Submit it for publication somewhere, possibly somewhere where I knew there'd be an additional round of review.
I hope that the technical information in the draft is interesting and useful, but I also hope that this document sheds some light on the process of producing something ready for publication.

Annotations are in square brackets ("[" and "]") and are indented.

In case you're wondering why the draft is in simple text form, it's because I happened to write it that way. Sometimes I use a word processor when writing, sometimes a simple text editor (though calling Emacs "simple" is a stretch). In this case, I happened to use a text editor.


THE RESOURCE RETURN PROBLEM
20 November 2004 Draft
Scott Meyers

A "resource" in a C++ program is something of limited availability such that if you acquire some of it but you fail to return it when you are done using it, something bad is likely to happen. Common resources include memory, file handles, and synchronization primitives like mutexes and semaphores. If you acquire memory or file handles but fail to later release them, your software may be unable to acquire additional memory or file handles later. If you acquire synchronization primitives and fail to release them later, your software is likely to suffer from poor performance or deadlock.

In many cases, callers get access to resources by calling factory functions. Such functions often return pointers to dynamically allocated resources, thus allowing callers to manipulate the underlying resource through the pointers.

Consider, then, a factory function for some object of type Resource. The "obvious" way to declare the function is like this:

  Resource* factory();     // parameters omitted, because they're
                           // not germane to this discussion

However, any design yielding a pointer-returning function should certainly consider returning a smart pointer instead.
After all, the standard C++ library supports one smart pointer now (auto_ptr) and is poised to support two more in the very near future (tr1::shared_ptr and tr1::weak_ptr -- see the sidebar for more on TR1, shared_ptr, and weak_ptr), so it makes sense to consider using such standard functionality whenever possible.

    [Need to clarify that I'm talking about resource-managing smart pointers here. Iterators are smart pointers, too, and they're also in the standard library.]

Designs for factory's interface should thus consider at least the following additional possibilities:

  std::auto_ptr<Resource> factory();
  std::tr1::shared_ptr<Resource> factory();
  std::tr1::weak_ptr<Resource> factory();

The question, then, is how to choose among these options.

    [The article should mention that I'm limiting myself to off-the-shelf return types. Factory functions could also return custom smart pointer types, e.g., those employing custom interfaces and/or using intrusive reference counting. Returning custom smart pointer types has its own set of advantages and disadvantages.]

One of the most powerful ideas in interface design is that interfaces should be easy to use correctly and hard to use incorrectly [IEEESoftwareCol].

    [Should also cite Item 18 of Effective C++, Third Edition, which does a better job covering this than the IEEE Software article.]

Given that resources should always be released when clients are done using them, an easy way to vet our interface possibilities is to see how likely they are to lead to resource leakage, i.e., to clients failing to release resources they acquire.

One of the most straightforward usage scenarios is for a client to acquire a resource in a block, use it during the block, then release it at the end of the block. With a raw-pointer-returning factory, client code could look something like this:

  {
    Resource *pResource = factory();   // Step 1: Acquire resource
    ...                                // Step 2: Use pResource
    delete pResource;                  // Step 3: Release resource
  }

But consider all the things that can go wrong here:

- Something during Step 2 might prevent control from ever getting to Step 3. For example, there might be a return statement, a continue statement, even a goto statement. An exception might be thrown. Furthermore, these things might be added to the code in Step 2 long after Step 3 had been written, meaning that a harried maintenance programmer might inadvertently introduce a resource leak. Note that a particularly subtle way in which this can happen is if some function called in Step 2 is modified to throw an exception under conditions when it didn't do so before. Yes, this is an interface change, but it's a largely silent change, so there's a good chance that client code will fail to be updated in response.

- The programmer might forget to perform Step 3. Both C and C++ have a long, if not proud, tradition of resource leaks, and certainly some of them are due to people simply failing to write the code to release the resource. In addition, it's possible that a later maintenance programmer will remove Step 3, possibly by simple accident during editing.

- Step 3 might be written incorrectly. For whatever reason, a client might use the array form of delete (i.e., "delete []") instead of the single-object form, thus engendering undefined behavior.

Clients can avoid these problems by storing the factory's returned pointer in a local smart pointer that automatically handles resource release, i.e., an auto_ptr or a tr1::shared_ptr:

  {
    std::auto_ptr<Resource> pResource1(factory());          // acquire a resource
    ...                                                     // use it
    std::tr1::shared_ptr<Resource> pResource2(factory());   // acquire another res.
    ...                                                     // use it
  }                                                         // both resources are
                                                            // automatically deleted

But how are clients to know to do this, and how will their oversight be brought to their attention if they forget?
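The first failure mode listed above (an exception in Step 2 skipping Step 3), and the way a local smart pointer sidesteps it, can be sketched as follows. This is only an illustration under stated assumptions: the live-object counter inside Resource, the useResource function, and the liveAfter helper are hypothetical, and std::unique_ptr (C++11) stands in for auto_ptr, which later C++ standards deprecated and then removed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource type that counts how many instances are alive.
struct Resource {
    static int live;
    Resource()  { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Raw-pointer factory, as in the article: ownership passes to the caller.
Resource* factory() { return new Resource; }

// Hypothetical "Step 2" that unexpectedly throws.
void useResource(Resource*) { throw std::runtime_error("step 2 failed"); }

// Raw pointer: the exception skips the delete, so the resource leaks.
void rawClient() {
    Resource* pResource = factory();   // Step 1: acquire
    useResource(pResource);            // Step 2: throws
    delete pResource;                  // Step 3: never reached
}

// Smart pointer (stand-in for auto_ptr): its destructor releases the
// resource while the exception unwinds the stack.
void smartClient() {
    std::unique_ptr<Resource> pResource(factory());
    useResource(pResource.get());
}

// Returns how many resources remain alive after running 'client'.
int liveAfter(void (*client)()) {
    int before = Resource::live;
    try { client(); } catch (const std::runtime_error&) {}
    return Resource::live - before;
}
```

Under these assumptions, liveAfter(smartClient) reports that nothing is left alive, whereas liveAfter(rawClient) would report one leaked Resource.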
There is nothing about the factory interface that documents that ownership of the returned resource is being turned over to the client, nothing to remind callers to take steps to ensure that the resulting resource is correctly released. This is important, because not all pointer-returning functions do turn over ownership to callers. For example, the traditional implementation of the Singleton pattern [GOFBook] features an Instance function returning a pointer to the Singleton -- a pointer that clients should NOT delete.

    [Perhaps mention here or elsewhere that if a factory returns a type that should not be deleted, the interface of that type should express that constraint by not offering a public destructor? But what about factories that return types whose interfaces they do not control?]

Taking everything above into consideration, having a factory return a raw pointer yields an interface that is relatively easy to use correctly, but one that is easy to use incorrectly, too. We'd like to do better.

A logical way to approach the problem is to look at the implications of having factory return a smart pointer instead of a raw pointer. We can dispense with the idea of returning a tr1::weak_ptr, because it sends callers the wrong message: it suggests that ownership is NOT being transferred to the caller. That leaves us with the following interfaces to examine:

  std::auto_ptr<Resource> factory();
  std::tr1::shared_ptr<Resource> factory();

Both document that the returned resource needs to be released when it is no longer in use, both prevent resource leaks in the scenarios considered above, and both eliminate the possibility that clients will use the wrong form of delete (because the smart pointer itself does the deletion).

So how do we choose between these alternatives? There are at least two ways. The first is to note that an auto_ptr holds exclusive ownership of a resource, while a tr1::shared_ptr participates in shared ownership.
Unfortunately, whether ownership of a factory-generated resource should be shared is generally a decision to be made by resource CLIENTS, not resource providers. That is, the caller of factory is generally better able to determine whether ownership of the resource should be exclusive or shared. If the factory returns a smart pointer with the type of ownership that the client wants, everything is ducky, of course:

  // Case 1: factory returns auto_ptr, client wants auto_ptr
  std::auto_ptr<Resource> pResource(factory());
  ...                                                   // use pResource

  // Case 2: factory returns tr1::shared_ptr, client wants tr1::shared_ptr
  std::tr1::shared_ptr<Resource> pResource(factory());
  ...                                                   // use pResource

But consider the code that clients must write if factory's interface is "incorrect" as regards sharing, i.e., if the factory returns an auto_ptr (exclusive ownership) but the client wants a tr1::shared_ptr (shared ownership), or vice versa. The first case is straightforward:

  std::auto_ptr<Resource> factory();         // Case 3: factory
                                             // returns auto_ptr
  std::tr1::shared_ptr<Resource>             // move raw ptr from
    pResource(factory().release());          // auto_ptr to shared_ptr
  ...                                        // use pResource

Moving from exclusive to shared ownership is easy, because auto_ptr has a member function (release) that directly releases ownership.

    [There is no need to use release, because tr1::shared_ptr has a constructor that takes an auto_ptr.]

The same cannot be said for moving in the opposite direction:

  std::tr1::shared_ptr<Resource> factory();          // Case 4: factory
                                                     // returns tr1::shared_ptr
  std::tr1::shared_ptr<Resource> pTemp(factory());   // copy shared_ptr
  assert(pTemp.unique());                            // assert exclusive ownership
  std::auto_ptr<Resource> pResource(pTemp.get());    // copy raw ptr into auto_ptr
  pTemp.reset();                                     // set shared_ptr to null
  ...                                                // use pResource

The code itself makes clear that moving from a tr1::shared_ptr to an auto_ptr is more work than is the opposite transformation, but the situation is far grimmer than that, because the code above MAY have UNDEFINED BEHAVIOR, and THERE IS NO WAY TO DETERMINE whether it does! That's because it's possible that the resource held by a tr1::shared_ptr should not be released by calling delete. We'll explore this issue in more detail in a moment. For now, the simplicity of the auto_ptr-to-tr1::shared_ptr code compared with the more complex (and inherently risky) tr1::shared_ptr-to-auto_ptr code suggests that, all other things being equal, factories should return auto_ptrs instead of tr1::shared_ptrs, because if the ownership policy of the return type is inappropriate for the caller, it's easier for the caller to fix it when an auto_ptr is returned.

    [Added Nov. 2006: Joseph Link points out that it's far worse than this, because the call to reset will cause the resource to be released, and the auto_ptr will dangle. There is no way to get a collection of tr1::shared_ptrs to relinquish ownership of a resource, not even if you know that only one tr1::shared_ptr points to the resource.]

A second way to distinguish between returning an auto_ptr or a tr1::shared_ptr is to consider the runtime cost imposed on clients. auto_ptr was designed to be small and fast, with no data space penalty compared to raw pointers and with no "really expensive" operations. (Different programmers define "really expensive" differently, but auto_ptr does about as little as possible consistent with its mission of maintaining exclusive ownership and ensuring resource release when the owning auto_ptr is destroyed.) tr1::shared_ptr is more expensive.
In the Boost implementation (the de facto "reference" implementation [BoostSharedPtr]), for example, a shared_ptr is twice the size of a raw pointer, it stores the reference count in dynamically allocated memory, and it uses an auxiliary class with virtual functions, thus necessitating the generation of a vtable. This doesn't mean that tr1::shared_ptrs are "expensive," it just means that they are more expensive than auto_ptrs. This is to be expected: implementing shared ownership of a resource is more complicated than is implementing exclusive ownership (which is in turn more expensive than implementing raw pointers, which enforce no ownership policy at all).

    [Bartosz Milewski has sent test code and timing data showing that vector+tr1::shared_ptr runs about half as fast as his vector+auto_ptr-like auto_vector.]

From the point of view of a factory's interface design, then, it seems that the case for auto_ptr over tr1::shared_ptr is rather strong. Returning an auto_ptr imposes a smaller runtime penalty on callers vis-a-vis returning a tr1::shared_ptr, and clients can more easily "upgrade" a returned auto_ptr to a tr1::shared_ptr than they can "downgrade" a tr1::shared_ptr to an auto_ptr. Case closed, yes?

Maybe. It depends on whether all the evidence has been heard. There are situations where it has not.

One such situation is when release of a resource requires something other than a simple delete. For example, suppose that factory returns a pointer to a resource that, when the client is done with it, is supposed to be passed to a reclamation function. That is, proper client code looks like this (where I've changed factory's signature back to returning a raw pointer, because we don't yet know what the proper return type should be):

  Resource* factory();               // call this to get the resource
  void reclaim(Resource*);           // pass it here when you're done with it

  Resource* pResource = factory();   // get resource
  ...                                // use it
  reclaim(pResource);                // relinquish it

This is a usage scenario where auto_ptr just doesn't fill the bill. auto_ptr's destructor hardwires in a call to delete, so client use of auto_ptr would just introduce the possibility that the client would delete a pointer it wasn't supposed to delete. factory should thus not return an auto_ptr, because that would encourage clients to use one themselves -- precisely what they should NOT do.

On the other hand, tr1::shared_ptr can handle this situation with complete aplomb. That's because tr1::shared_ptr can have a custom destruction function passed in at the time the shared_ptr object is created, and when the last shared_ptr to the underlying resource is destroyed, the raw pointer to the resource will be automatically passed to the custom destruction function. Furthermore, the type of the destruction function has no effect on the type of the shared_ptr, so clients can program with shared_ptr objects and remain ignorant of how the underlying resource is ultimately released. As a result, factory can be implemented something like this:

  std::tr1::shared_ptr<Resource> factory()
  {
    Resource *pRes = new Resource;                        // acquire raw resource
    std::tr1::shared_ptr<Resource> sp(pRes, reclaim);     // give it and its
                                                          // release func
                                                          // to a shared_ptr
    return sp;                                            // return the shared_ptr
  }

Clients just use the tr1::shared_ptr as usual:

  std::tr1::shared_ptr<Resource> pResource(factory());    // acquire resource
  ...                                                     // use it; it will be
                                                          // automatically
                                                          // destroyed when
                                                          // it's no longer in
                                                          // use

The fact that a tr1::shared_ptr may contain a reference to a hidden custom destruction function explains why it's dangerous to move a resource from a tr1::shared_ptr to an auto_ptr, as I mentioned above. The auto_ptr will ultimately use delete to release the resource, but that may not be appropriate for the pointer inside the tr1::shared_ptr. Furthermore, there is no way to determine whether it IS appropriate.
True, tr1 offers a get_deleter function that returns the custom destruction function associated with a tr1::shared_ptr, but there are two problems with it. First, you must know the type of the deletion function in order to get it from get_deleter, but the type isn't always obvious. For example, consider this reasonable-looking client code:

  std::tr1::shared_ptr<Base> factory();      // factory returns a (smart) ptr to Base

  typedef void (*DelFunc)(Base*);            // DelFunc is a pointer-to-function
                                             // type taking a Base*

  std::tr1::shared_ptr<Base>                 // call factory, save returned ptr
    pResource(factory());

  DelFunc* pDeleter =                        // get the shared_ptr's deletion func
    tr1::get_deleter<DelFunc>(pResource);    // assuming DelFunc is its signature

Here, the client works with a tr1::shared_ptr-to-Base, so it asks for the deletion function as one that takes a pointer-to-Base. Unfortunately, this may be the wrong type to ask for. Consider this possible implementation code for factory:

  class Base { ... };
  class Derived: public Base { ... };

  void derivedGoByeBye(Derived*) { ... }     // custom deletion function

  std::tr1::shared_ptr<Base> factory()
  {
    return std::tr1::shared_ptr<Base>(new Derived, derivedGoByeBye);
  }

In this case, the deletion function is of type

  void (*)(Derived*)

instead of

  void (*)(Base*)

so the client's call to get_deleter will yield the null pointer. (This generally signifies that the signature for the requested deleter is incorrect, not that there is no deletion function associated with the tr1::shared_ptr.)

Second, even if get_deleter returns a pointer to a deletion function, that does little good for clients who want to move the resource into an auto_ptr, because there is, in general, no way to determine whether the behavior of the deletion function is the same as a simple call to delete, which is all auto_ptr knows how to do.
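The type-mismatch problem with get_deleter can be reproduced in a few lines. The sketch below uses std::shared_ptr and std::get_deleter, the C++11 descendants of the TR1 facilities, because std::tr1 is not available in every toolchain; that a given TR1 implementation behaves identically is an assumption. Base, Derived, and derivedGoByeBye follow the example above, while the two typedefs and the two helper functions are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Custom deletion function, as in the article's example.
void derivedGoByeBye(Derived* p) { delete p; }

// The deleter is stored with the type void (*)(Derived*), even though
// the shared_ptr itself points to Base.
std::shared_ptr<Base> factory() {
    return std::shared_ptr<Base>(new Derived, derivedGoByeBye);
}

typedef void (*BaseDelFunc)(Base*);        // the "obvious" guess
typedef void (*DerivedDelFunc)(Derived*);  // the deleter's actual type

// True if asking for the deleter under the Base signature succeeds.
bool foundAsBaseDeleter() {
    std::shared_ptr<Base> pResource(factory());
    return std::get_deleter<BaseDelFunc>(pResource) != nullptr;
}

// True if asking under the deleter's actual type succeeds and yields
// the function that was passed to the shared_ptr constructor.
bool foundAsDerivedDeleter() {
    std::shared_ptr<Base> pResource(factory());
    DerivedDelFunc* pDeleter = std::get_deleter<DerivedDelFunc>(pResource);
    return pDeleter != nullptr && *pDeleter == derivedGoByeBye;
}
```

A client that only sees shared_ptr-to-Base has no way to know it should ask for DerivedDelFunc, so the first query comes back null even though a custom deleter is present.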
The conclusion bears repeating: moving raw pointers from tr1::shared_ptrs to auto_ptrs is inherently risky, because it can lead to undefined behavior (i.e., the use of a simple delete on a pointer that should not simply be deleted).

One might wish for auto_ptr to also offer support for custom deletion functions. Clearly, the technology that allows tr1::shared_ptr to function the way that it does could be added to auto_ptr. However, doing that would make auto_ptr almost tr1::shared_ptr's twin. auto_ptr would double in size, it would use dynamically allocated memory, it would have an associated helper class with virtual functions, etc. It's possible that elimination of reference counting capabilities would allow for a less expensive implementation if custom resource destruction capability were to be added to auto_ptr, but from a pragmatic point of view, it makes little difference, as there appears to be no support within the standardization committee for the idea of adding such functionality to auto_ptr.

Furthermore, the collection of smart pointers at Boost [BoostSmartPtr] -- elements of which served as the models for tr1::shared_ptr and tr1::weak_ptr -- fails to include anything offering custom resource destruction in conjunction with auto_ptr's combination of both copyability and exclusive ownership. This suggests that there's little demand for such a combination, at least among Boosters. If that's a combination of features you think you want, you'll probably have to write your own custom smart pointer class. Before embarking on that, however, I advise you to read about the difficulties that bedevilled auto_ptr design both prior to and after standardization [AutoPtrUpdatePage AutoPtrFlaws]. The fact that auto_ptrs hold exclusive ownership and can also be copied leads to significant design challenges. Be sure to educate yourself about them before venturing into those murky waters.
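To make the shape of such a custom class concrete, here is a minimal sketch of an exclusive-ownership pointer with a custom release function. It is an illustration, not a recommendation: the name ReclaimingPtr and everything inside it are hypothetical, and it avoids the copyable-yet-exclusive design problem by being move-only, which assumes C++11 move semantics that were not available when this draft was written.

```cpp
#include <cassert>
#include <utility>

// Hypothetical exclusive-ownership smart pointer with a custom release
// function. Move-only, so ownership is transferred but never duplicated.
template <typename T>
class ReclaimingPtr {
public:
    typedef void (*ReleaseFunc)(T*);

    ReclaimingPtr(T* p, ReleaseFunc release) : ptr_(p), release_(release) {}
    ~ReclaimingPtr() { reset(); }

    ReclaimingPtr(ReclaimingPtr&& other) noexcept
        : ptr_(other.ptr_), release_(other.release_) { other.ptr_ = nullptr; }

    ReclaimingPtr& operator=(ReclaimingPtr&& other) noexcept {
        if (this != &other) {
            reset();
            ptr_ = other.ptr_;
            release_ = other.release_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    ReclaimingPtr(const ReclaimingPtr&) = delete;
    ReclaimingPtr& operator=(const ReclaimingPtr&) = delete;

    T* get() const { return ptr_; }
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }

    // Pass the held resource (if any) to the custom release function.
    void reset() {
        if (ptr_) { release_(ptr_); ptr_ = nullptr; }
    }

private:
    T* ptr_;
    ReleaseFunc release_;   // extra member: the object is larger than a raw pointer
};

struct Resource { int value; };

int reclaimCount = 0;                       // counts calls to reclaim
void reclaim(Resource* p) { ++reclaimCount; delete p; }

ReclaimingPtr<Resource> factory() {
    Resource* r = new Resource;
    r->value = 42;
    return ReclaimingPtr<Resource>(r, reclaim);
}

// Returns how many times reclaim ran while one resource changed owners once.
int reclaimsForOneResource() {
    int before = reclaimCount;
    {
        ReclaimingPtr<Resource> a(factory());
        ReclaimingPtr<Resource> b(std::move(a));   // a gives up ownership
        assert(a.get() == nullptr && b->value == 42);
    }                                              // b releases via reclaim
    return reclaimCount - before;
}
```

Unlike tr1::shared_ptr, this sketch bakes the release function's type into the class, so it does not offer the type-erased deleter that lets a factory change its release policy behind a stable interface.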
But perhaps you know that the resource returned by your factory can be dispatched with a simple delete. Taking everything above into account, it would seem that the preferred interface is this,

  std::auto_ptr<Resource> factory();

which facilitates client code like this:

  std::auto_ptr<Resource> pResource1(factory());     // for clients who want
  ...                                                // exclusive ownership

  std::tr1::shared_ptr<Resource>                     // for clients who want
    pResource2(factory().release());                 // shared ownership
  ...

Alas, what we know tends to change over time, so it's possible that at some point in the future, the need for a simple delete may be replaced by a need to call a custom resource-reclamation function. If that happens, factory's interface must be changed, and all client code using factory must be not only recompiled, but rewritten. If, on the other hand, factory returned a tr1::shared_ptr, only the implementation of factory would require modification. Its interface would remain stable, and clients would only have to relink to upgrade to the new behavior. That's quite a feather in tr1::shared_ptr's cap.

    [Due to apparent revised thinking about the proper behavior of auto_ptr, contemporary auto_ptr implementations don't support inheritance, e.g.:

      std::auto_ptr<Derived> factory();

      std::auto_ptr<Base> pResource(factory());   // error! no implicit
                                                  // conversion from
                                                  // auto_ptr<Derived> to
                                                  // auto_ptr<Base>

    In contrast, tr1::shared_ptr supports such implicit conversions in the expected manner.]


CONCLUSIONS

When designing the interfaces for functions that return resources, at least the following considerations should be taken into account:

- How can the idea of ownership transferral be conveyed to callers?
- How can the likelihood of caller errors (e.g., resource leaks, incorrect resource release) be minimized?
- How can caller convenience be maximized?
- How can performance overhead be minimized?
- How can the use of custom approaches (e.g., nonstandard smart pointers) be minimized?
- How can implementations retain the flexibility to modify resource release policies in the future?

As with most nontrivial design spaces, there is no design that simultaneously optimizes the result for each of these considerations. In general, for pointer-returning factory functions, tr1::shared_ptr is probably the best bet when its size/speed profile is acceptable, because it offers "normal" copying semantics (e.g., it can be stored in STL containers), and it allows for implementations to change resource destruction policies without disturbing clients. If present or future support for custom deletion functions is not an issue, auto_ptr is also an attractive choice.


ACKNOWLEDGMENTS

This article was motivated by, and based in part on, discussions with Kevlin Henney. Marco Dalla Gasperina, Matthew Wilson, and Uwe Schnitker offered helpful comments on earlier drafts of this piece.

    [Need to add Bartosz Milewski.]


SIDEBAR: TR1 AND SMART POINTERS

"TR1" is shorthand for a technical report soon to be issued by the C++ Standardization Committee specifying 14 new sets of library functionality almost certain to be included in the next standard for C++. Those 14 types of functionality are identified in TR1 as follows. (Entries marked with asterisks are modeled on Boost libraries, as explained below.)

  * Reference Wrappers
  * Smart Pointers
    Function Return Types
  * Member Pointer Adapters
  * Function Object Binders
  * Polymorphic Function Wrappers
  * Metaprogramming and Type Traits
  * Random Number Generation
    Mathematical Special Functions
  * Tuple Types
  * Fixed Size Array
    Unordered Associative Containers
  * Regular Expressions
    C Compatibility

Unfortunately, many of these terms are not very descriptive. "Regular Expressions" is straightforward, but "Unordered Associative Containers" is TR1 codespeak for hash tables, and "Polymorphic Function Wrappers" refers to an astonishingly useful ability to allow all compatible function-like entities to be treated as a single type.
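The ability to treat all compatible function-like entities as a single type, which TR1 provides through tr1::function, can be sketched as follows. The example uses std::function, its C++11 standardization, on the assumption that it behaves like the TR1 original for this purpose; twice, AddN, applyAll, and demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

int twice(int x) { return 2 * x; }              // plain function

struct AddN {                                   // function object
    int n;
    int operator()(int x) const { return x + n; }
};

// Applies each callable in turn, feeding each result into the next call.
int applyAll(const std::vector<std::function<int(int)> >& fs, int x) {
    for (std::size_t i = 0; i < fs.size(); ++i) x = fs[i](x);
    return x;
}

// Stores three unrelated kinds of callable in one container of one type.
int demo() {
    std::vector<std::function<int(int)> > fs;
    fs.push_back(twice);                        // function pointer
    fs.push_back(AddN{3});                      // function object
    fs.push_back([](int x) { return x * x; });  // lambda (C++11)
    return applyAll(fs, 5);                     // ((5 * 2) + 3) squared = 169
}
```

All three callables have different types, yet each converts to the same std::function<int(int)>, which is what lets them share a container.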
To learn more about TR1, you can download the most recent draft of the technical report itself (currently [April2004TR1Draft]), but you are likely to gain a more useful understanding if you also consult the extension proposals that gave rise to TR1 in the first place. A good place to find links to those proposals is a newsgroup discussion from April 2004 [TR1Discussion]. In addition, Matt Austern has written about hash table support in TR1 [TR1HashTables], and Herb Sutter has written about tuples [TR1Tuples] and generalized function objects [TR1Function].

TR1's support for smart pointers consists primarily of the shared_ptr template, a noninvasive reference-counting smart pointer. An object pointed to by one or more shared_ptrs is automatically deleted when the last shared_ptr pointing to it is destroyed. Furthermore, copying a shared_ptr does the "obvious" thing (it yields a new shared_ptr that points to the same object as the original shared_ptr). These "natural" copying semantics make shared_ptr behavior more intuitive to programmers than that of auto_ptr, and they also allow shared_ptrs to be stored in arrays and in STL containers. (auto_ptrs may not be stored in either.) The best source of information on shared_ptr is the documentation for Boost's smart pointer of the same name [BoostSharedPtr], as it is the model behind TR1's shared_ptr.

The sand in the gears of all reference-counting schemes is cycles of references, because cycles can keep resources alive even when they're no longer really being used. One way to allow programmers to benefit from reference counting while still allowing the creation of cyclic data structures is to offer special kinds of smart pointers that work with shared_ptrs, but that don't affect reference counts. In TR1 (as well as at Boost), that's what weak_ptrs do.
Programmers who want both reference counting and cycles in their data structures use shared_ptrs for most links, but when adding a link that would create a cycle, they use weak_ptrs instead. For details, the best source of information is the Boost weak_ptr page [BoostWeakPtr].

It is highly likely that partial or full implementations of TR1 functionality will begin to ship as part of next-generation C++ implementations, but currently, the best way to work with TR1 functionality is to download the various components on a piecemeal basis. Since 10 of the 14 components specified by TR1 are based on Boost libraries (noted with asterisks in the list above), the logical place to start is Boost (www.boost.org). All TR1 components are in the namespace std::tr1.


REFERENCES

[IEEESoftwareCol]    http://www.aristeia.com/Papers/IEEE_Software_JulAug_2004.pdf
[BoostSmartPtr]      http://boost.org/libs/smart_ptr/smart_ptr.htm
[GOFBook]            Design Patterns: Elements of Reusable Object-Oriented Software, Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, Addison-Wesley, 1995, ISBN 0-201-63361-2.
[AutoPtrUpdatePage]  http://tinyurl.com/yh9t
[AutoPtrFlaws]       http://tinyurl.com/6skrp
[April2004TR1Draft]  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1647.pdf
[TR1Discussion]      http://tinyurl.com/4be49
[TR1HashTables]      http://www.cuj.com/documents/s=7984/cujcexp2004austern/
[TR1Tuples]          http://www.cuj.com/documents/s=8250/cujcexp2106sutter/
[TR1Function]        http://www.cuj.com/documents/s=8464/cujcexp0308sutter/
[BoostSharedPtr]     http://boost.org/libs/smart_ptr/shared_ptr.htm
[BoostWeakPtr]       http://boost.org/libs/smart_ptr/weak_ptr.htm