Scott Meyers

Comments on Item 27 of More Effective C++

Last Updated December 15, 1999
by Scott Meyers

The messages below make up correspondance related to More Effective C++'s discussion of how to determine whether an object is on the heap. I've edited out unnecessary email headers and omitted parts of the messages not germane to the issue at hand, but other than that, these are verbatim messages sent to/from me on this topic. I'm grateful to each of the authors for their permission to make their messages available for general viewing.

After Ron van der Wal and I exchanged email, Ron put up a web page summarizing his extensive and insightful thoughts on this topic. That page contains essentially the same information as is shown in our correspondance below, but he's formatted his page, so it's easier to read. If you read his page, however, please don't forget to come back here to read the other authors' comments, because they describe interesting approaches to this problem that are different from both mine and Ron's.

 
 From: Ron van der Wal
 Date: Mon, 26 Feb 96 10:44:40 PST
 Subject: On the heap checks
 To: Scott Meyers
 
 Thank you for another well written and useful book on C++ - meaning of course
 "More effective C++". I would like to comment on the central issue of Item
 27, i.e. ways to determine whether or not some object is allocated off the
 heap rather than, say, stack or static data space. Unless noted otherwise, my
 comments refer to the UPNumber and HeapTracked samples that you use
 throughout Item 27.


 1. A static bool 'onTheHeap' is not satisfactory (pp. 148-149)
 --------------------------------------------------------------
 This is true. There is, however, a fairly simple way to circumvent some of
 the problems you outline: replace the bool flag by a counter. For example:

    class UPNumber {
    public:
    ...as before...
    private:
    static int heapPending;  // Replaces 'onTheHeap'
    };

    int UPNumber::heapPending = 0;

    void *UPNumber::operator new(size_t sz)
    {
    ++heapPending;
    return ::operator new(sz);
    }

    UPNumber::UPNumber()
    {
    if (heapPending < 1)
        throw HeapConstraintViolation();
    ...
    --heapPending;
    }

 This solves most of the problems you mention (we'll get to array allocations
 in a moment), including the new UPNumber(*new UPNumber()) situation. In
 fact, it could only fail (as far as single objects are concerned) in the
 following situation:

    new UPNumber(UPNumber());

 This fails if (and only if) the compiler happens to generate code as follows:

    1. Call operator new
    2. Call the constructor of the temporary
    3. Call the constructor of the dynamic object

 The order of 2 and 3 will never be reversed, of course, so the question is
 simply whether or not the operator new call will precede the construction of
 the temporary. (This scenario would also cause the 'onTheHeap' approach to
 fail.) Any other combination of constructors and/or operator new invocations
 will be safe with the counter approach. In the example above it remains to
 be seen if the use of a temporary makes sense, but it's legal anyway so we
 have to deal with it.

 As to arrays of UPNumbers, this could be solved by also overloading operator
 new[] for the UPNumber class and implementing it as follows:

    void *UPNumber::operator new[](size sz)
    {
    heapPending += sz / sizeof(UPNumber);
    return ::operator new[](sz);
    }

 This solves the new UPNumber[100] you mention, assuming that 'sz' is indeed
 the correct multiple of sizeof(UPNumber). When will this not be the case? I
 can see two possibilities:

    1. When the C++ implementation under consideration needs some extra space
    in the array for bookkeeping, and to this end increases the passed-in
    'sz'.  This is a possibility allowed by the ANSI/ISO C++ draft standard
    (see section 5.3.4), but it did not occur with the C++ compilers I tried
    (Borland, Microsoft, Symantec). It may with others, though.

    2. When UPNumber::operator new[] is inherited and used by a class derived
    from UPNumber *and* sizeof(Derived) > sizeof(UPNumber). There is no
    foolproof way to check for this (in particular, the size of the derived
    class may itself be an integral multiple of the size of UPNumber), so
    this will cause problems too.

 To summarize: The counter approach does not completely solve the problems you
 outlined on pp. 148-149, but it improves on the bool flag approach. In fact,
 it still is a cheap solution and it may suffice for a large number of
 situations, especially if portability is not an issue or will be considered
 as the need arises (as frequently happens :-).


 2. Tracking heap based objects (pp. 154-156)
 --------------------------------------------
 Your HeapTracked class does indeed exactly that in a portable and bullet
 proof way. It is, however, a fairly expensive solution both in space and
 time. In particular, the list template will often (if not always)
 be implemented as a doubly-linked list and therefore require an allocation of
 a list element per address stored. This may be a quite considerable overhead,
 both in space and time. Furthermore, HeapTracked::isOnHeap() operates in
 linear time, and this too could be a problem.

 I have two alternatives to offer. Neither is as generally applicable or as
 foolproof as your solution, but they are a lot cheaper (requiring only a
 single integer per tracked object, and checking in constant time) and could
 be appropriate in many situations. After all, if you really need to track
 heap based objects, you may also be willing to impose some constraints on the
 use of the tracked class. (I have done so anyway.)
 
 Both methods add a signature to the allocated object, which is initialized by
 the overloaded operator new for the class (although the location of the
 signature differs between the two approaches). The signature is not
 initialized if the object is constructed otherwise, and this is also the
 possibly fatal flaw: a correct signature may be accidentally present even in
 objects that do not come off the heap. I'll return to that later.
 

 A. Prepend a signature to the allocated block
 ---------------------------------------------
 This first solution would work out as follows: overload operator new for the
 class and have it allocate some extra space before the actual block, then
 fill this with a special signature. For example:
 
    class HeapTracked {
    public:
    ...as before (operator delete not needed)...
    private:
    static const int32 signature = 0x57647652L;
    static bool checkSignature(const void *);
    ...RawAddress stuff deleted...
    };

    void *HeapTracked::operator new(size_t sz)
    {
    // Increase requested size by signature size
    sz += sizeof(int32);

    // Get somewhat larger block
    void *block = ::operator new(sz);

    // Initialize heap signature, then return pointer to subblock
    // beyond signature. (Second static_cast not strictly necessary.)
    int32 *signptr = static_cast(block);
    *signptr = signature;
    return static_cast(signptr + 1);
    }

    bool HeapTracked::isOnHeap() const
    {
    return checkSignature(dynamic_cast(this));
    }

    bool HeapTracked::checkSignature(const void *ptr)
    {
    // Cast back to int32 pointer
    const int32 *signptr = static_cast(ptr);

    // Check signature in front of the block.
    // ***This is the problem area, see text***
    return signptr[-1] == signature;
    }

 (By the way, 'int32' is a typedef to ensure I have a 4-byte integer
 regardless of my compiler's preferences.)

 As you've probably spotted, the problem with this approach arises when
 HeapTracked::checkSignature() attempts to retrieve the signature field that
 precedes the actual object. In general, the address some_array[-1] is not
 guaranteed to exist. If the block was allocated by HeapTracked::operator new,
 it will exist, but we're just about to find that out and can't be sure yet.

 This problem cannot be solved in a portable way, as far as I know. It may
 just so happen that the address signptr[-1] is not valid; it will depend on
 the platform at hand whether or not this is likely to occur. In my practice
 (DOS, Win16, Win32, OS/2 and Mac apps), the most threatening situations are
 in 16-bit segmented DOS/Win16 programs when signptr[-1] would cause segment
 'underwrap' (this can be checked, albeit in a non-portable way) and in
 others, when the passed-in block pointer actually addresses something on the
 stack exactly at the bottom of a stack page, and signptr[-1] causes a page
 fault on the stack's guard page. The latter condition is almost impossible to
 check for, but it has never occurred to me (yet). Anyway, this method is not
 without its problems, although it is a generally applicable as your
 list approach in all other respects (inheritance, mix-in use,
 etc.).

 Finally, it does not work with arrays of heap tracked objects, but you wisely
 ignored that subject too :-).


 B. Include a signature in the object itself
 -------------------------------------------
 This second solution is one you'll probably balk at, but it still has its
 merits and I've used it with success on several occasions (as I did with the
 previous method). The idea is to place the signature field inside the
 allocated object, yet still initialize it in HeapTracked::operator new. For
 example:
 
    class HeapTracked {
    public:
    ...as before (operator delete not needed)...
    private:
    static const int32 signature = 0x57647652L;
    int32 signfield;         // Added
    ...RawAddress stuff deleted...
    };

    void *HeapTracked::operator new(size_t sz)
    {
    // Get block of correct size
    void *block = ::operator new(sz);

    // Point into the object-to-be and set its signature...
    HeapTracked *htptr = static_cast(block);
    htptr->signfield = signature;
    return block;
    }

    bool HeapTracked::isOnHeap() const
    {
    return signfield == signature;
    }

 The HeapTracked constructors must leave this field alone of course, as must
 the assignment operator for the class, so it's probably a good idea to define
 at least the copy constructor and assignment operator for this class and
 implement them to that effect.
 
 If you're really into it, you might even want to implement
 HeapTracked::operator new[] and step through the array of objects-to-be,
 setting all their signature fields. In for a penny, in for a pound, don't you
 think?

 When is this method not applicable? In essence, when class HeapTracked is
 used as a base class (e.g. as the mix-in you mention) and it is either not
 the first base class, or it is a virtual base class. In both cases, the
 start of the new object will not coincide with the start of the allocated
 block, and the htptr->signfield reference will be off. The same applies if
 the C++ implementation under consideration adds some bookkkeeping space in
 front of the actual object (again allowed by ANSI/ISO C++). It's up to the
 client to decide whether or not these restrictions are acceptable.
 

 Accidental signature match
 --------------------------
 As I hinted at above, there still remains the problem of accidental signature
 matches. Both method A and B assume that if the signature value matches the
 predefined constant, we have a valid heap block. This may not always be true,
 however, especially not so because in the case of not-heap-based objects, the
 signature fields aren't initialized at all and may contain any value
 (including the signature value). What can we do about it? Nothing absolutely
 foolproof, but a few things might raise the confidence level enough to make
 either method usable.
 
    1. Use an 'improbable' signature value. Clearly, 0 would be a poor
    choice, as would be 0xFFFFFFFF. I normally use a 4-byte value that does
    not match anything systematic. In the above examples the value
    0x57647652L was used; I find it nicely conspicuous in dumps, yet fairly
    unlikely to occur by chance (try interpreting it as 4 ASCII characters).
    If necessary, you can make the signature value less probable by
    increasing its size (what about a 128-bit GUID?), but some compromise is
    probably called for.

    2. Clear the signature field in HeapTracked::operator delete (method
    A and B) or in the HeapTracked destructor (only method B). This might
    help a bit, but not much, since we're mainly interested in distinguishing
    heap based objects from others, and not one heap based object from
    another. Still, this could reduce the chance of accidents in method B if
    we do it in the HeapTracked destructor for every HeapTracked object.

    3. Use additional information above and beyond the signature field. For
    example, add a signature field at the end of the object field as well
    (this limits the application of the methods). Or, prepend another field
    with a dynamically magical value, say the address of the object itself.
    Or prepend an additional counter field and XOR the signature field with
    that. All these methods would still allow HeapTracked::isOnHeap() to
    operate in constant time, and don't require auxiliary data structures.


 C. Dig into your C++ compiler
 -----------------------------
 One final method to get what you want (i.e. knowledge about heap-basedness)
 is to consult your compiler's runtime implementation details. This is
 extremely nonportable (it might even change from one compiler release to the
 next), but what gives if you're desparate?
 
 Many compilers will use some hidden information passed to the constructors
 and destructors they generate, indicating whether or not the constructor
 should call operator new before proceeding (effectively combining the
 constructor call with the operator new call), and conversely, whether the
 destructor should call operator delete when it's done. They will also add
 some information to memory blocks indicating how many objects are in the
 block (to take care of vector construction and destruction). If you can get
 at these and are prepared to face the maintenance problems it causes, this
 could also solve your problems.
 
 Obviously, this approach is off-limits for a book like "More effective C++"
 and I have never seen it being used by anybody but the compiler manufacturer
 himself, but...
 

 Date: Fri, 01 Mar 1996 21:40:24 EST
 From: Scott Meyers
 To: Ron van der Wal
 Subject: Re: On the heap checks
 
 Thanks for your thoughtful comments on checking to see if objects are on
 the heap.  
 
 >1. A static bool 'onTheHeap' is not satisfactory (pp. 148-149)
 >--------------------------------------------------------------
 >This is true. There is, however, a fairly simple way to circumvent some of
 >the problems you outline: replace the bool flag by a counter. For example:
 
 I considered adding this to the book, but I decided against it.  The
 book is longer than I wanted it to be anyway, and my experience has
 been that anybody who can figure out the onTheHeap bool trick can
 pretty easily extend it to the counter idea.  In retrospect, perhaps I
 should have added it anyway, but if I did that in all the places where
 there were logical extensions (especially reference counting), I'd have
 a 400-page book instead of a 300-page book.  (I wanted a 200-page
 book!)

 >Your HeapTracked class does indeed exactly that in a portable and bullet
 >proof way. It is, however, a fairly expensive solution both in space and
 >time. In particular, the list template will often (if not always)
 >be implemented as a doubly-linked list and therefore require an allocation of
 >a list element per address stored. This may be a quite considerable overhead,
 >both in space and time. Furthermore, HeapTracked::isOnHeap() operates in
 >linear time, and this too could be a problem.

 True.  However, I assume that the problem of optimizing the data
 structure storing/finding addresses is independent of the
 heap-or-no-heap problem I consider in the book. 

 >    void *HeapTracked::operator new(size_t sz)
 >    {
 >	// Increase requested size by signature size
 >	sz += sizeof(int32);
 >
 >	// Get somewhat larger block
 >	void *block = ::operator new(sz);
 >
 >	// Initialize heap signature, then return pointer to subblock
 >	// beyond signature. (Second static_cast not strictly necessary.)
 >	int32 *signptr = static_cast(block);
 >	*signptr = signature;
 >	return static_cast(signptr + 1);
 >    }
 
 There is another potential problem with this idea, and that's that you
 may violate an alignment restriction.  operator new is required to
 return an address that satisfies alignment requirements for all types,
 and this implementation assumes that 4-byte boundaries satisfies that.
 Whether this is a problem in practice, I don't know.  I've heard that on
 some machines, doubles must be double-word aligned. 


 From: Ron van der Wal
 Date: Sat,  2 Mar 96 14:04:31 PST
 Subject: Re: On the heap checks 
 To: Scott Meyers
 
 Yes, you're right and I should have mentioned it, because one of those 
 machines is the DEC Alpha which I occasionally use. It might even require 
 8-byte alignment for other types, since its long int is 64 bits (as is 
 that of the Sun Sparc).
 
 
 
 From: Luke Tomasello
 To: Scott Meyers
 Subject: Heap-based objects
 Date: Wed, 22 Apr 1998 16:34:28 -0700

 I have been enjoying your book MEC++ and have what I believe is a
 reasonable solution to the "is it on the heap" OR "Is it on the stack"
 problem (Item 27).
 
 I believe my solution is both elegant and portable.
 
 The basic idea is to:
 1. implement both new and delete for the class in question.
 2. You then use malloc() in 'new' to allocate the memory.
 3. Now initialize all of the memory to some known value. E.g.. "ALLOCATED ON
    HEAP"
 4. Your class will contain a char array of say 16 bytes. E.g.. TestBuf[16];
 5. In the constructor you will check TestBuf for a match with our known
    initialization value.
 (The string probably won't match-up exactly, so you will need write a small
 bit of code.)
 
 Your ctor can now have logic something like:
 
 myclass::myclass()
 {
 	// set alloc state for this class instance
 	if (match(TestBuff,"ALLOCATED ON HEAP") == true)
 		allocatedOnHeap=true;
	else
		allocatedOnHeap=false;
 }

 I like using a string for the initialization value because it is eaiser to
 match and does not suffer any alignment issues.

 What do you think?


 Date: Wed Apr 22 21:11:02 1998
 To: Luke Tomasello
 From: Scott Meyers
 Subject: Re: Heap-based objects
 
 This will often work, but it may fail under the following conditions:
 
   - The address of TestBuff fails to be the same as the address returned
     from operator new.  This can happen if (1) multiple inheritance and/or
     virtual inheritance is being used, or (2) the address returned from the
     new operator isn't the same as the address returned from operator new.
     (The C++ standard makes no guarantee that they are the same, though I
     don't know of any implementations where they're not.)
 
   - By chance, the memory for a non-heap object contains the byte string
     you test for.  For an object of size 16 bytes or larger, the chances of
     this are remote, but as objects get smaller, the chances of this
     happening increases.  
 
 I still like the solution I describe in the book, because it's portable, it
 can never fail, and I believe it will offer acceptable performance most of
 the time.
 
 
 From: Luke Tomasello
 To: Scott Meyers
 Subject: RE: Heap-based objects
 Date: Thu, 23 Apr 1998 09:36:22 -0700
 
 What I meant in my original message is that when you initialize the memory
 returned from malloc, you initialize the whole memory block and not simply
 the memory occupied by TestBuf (doing so would cause the problems you
 mention above.)
 
 I think this only leaves your second point which we both agree is extremely
 unlikely if a buffer of say 16 bytes is used.
 
 
 Date: Thu Apr 23 10:15:10 1998
 To: Luke Tomasello
 From: Scott Meyers
 Subject: RE: Heap-based objects
 
 >I think this only leaves your second point which we both agree is extremely
 >unlikely if a buffer of say 16 bytes is used.
 
 Right, though it would be interesting to characterize the cases where your
 solution would be preferable to the one I published.  My solution may use
 more total memory, but yours makes objects larger, so my guess is the
 performance of the different approaches would depend on how often queries
 were made to find out if an object was on the heap.
 
 Is there a particular reason you like your solution better?
 
 
 From: Luke Tomasello
 To: Scott Meyers
 Subject: RE: Heap-based objects
 Date: Thu, 23 Apr 1998 10:42:14 -0700
 
 I think my solution may be better:
 1.  	if hundreds of objects are getting allocated and objects are often
     deleted.  (The list in your solution would be long, so search times are
     long.) 
 2.  	if the application is multi-threaded.  (There is some problems with
     your approach in a multi-threaded app accessing the 'address list'
     unless more code is added to handle this.) 
 
 I think we could eliminate the 16 byte buffer and simply do something like:
 
 threshold=some_number; // probably 10-20 is sufficient
 magicString="SOME MAGIC STRING";
 myClass::myClass()
 {
	// make sure this class is big enough to eliminate virtually
	// all possibility of having the random value on the stack
	assert(sizeof(*this) > threshold);

	// now do the test
	if (match(this,magicString,sizeof(*this)))
		IsOnHeap=true;
	else
		IsOnHeap=false;
 }


 Date: Thu Apr 23 11:37:37 1998
 To: Luke Tomasello
 From: Scott Meyers
 Subject: RE: Heap-based objects

 >I think my solution may be better:
 >1.	if hundreds of objects are getting allocated and objects are often
 >   deleted.   (The list in your solution would be long, so search times are
 >   long.) 

 Fair enough.

 >2.	if the application is multi-threaded.  (There is some problems with
 >   your approach in a multi-threaded app accessing the 'address list'
 >   unless more code is added to handle this.)

 Right.  Access to the list would clearly have to be made thread-safe.

 >I think we could eliminate the 16 byte buffer and simply do something like:
 >threshold=some_number; // probably 10-20 is sufficient
 >magicString="SOME MAGIC STRING";
 >myClass::myClass()
 >{
 >	// make sure this class is big enough to eliminate virtually
 >	// all possibility of having the random value on the stack
 >	assert(sizeof(*this) > threshold);
 >
 >	// now do the test
 >	if (match(this,magicString,sizeof(*this)))
 >		IsOnHeap=true;
 >	else
 >		IsOnHeap=false;
 >}

 Frankly, you could probably just cast the first 4 bytes of memory to an
 int and then put a magic value there.  Working with ints is fast, and the
 chances of static or stack memory happening to have a particular value is,
 well, one in about four billion.





 Subject: MEC++ Item 27 (require/prohib. heap) suggestions
    Date: Tue, 20 Apr 1999 23:19:02 -0400
    From: Eric Anderson
      To: Scott Meyers

 I just read through "Item 27: Requiring or prohibiting heap-based objects",
 and I'd like to offer a few suggestions.
 
 A. To improve the HeapTracked mixin class:

 1. Add a new data member bool createdWithNew (or the equivalent),
    initialized to false for all new HeapTracked objects.
 
 2. In the body of the HeapTracked constructor, place code to search the
    static "new" list for a match to the address of "this".  If a match is
    found, then createdWithNew is set to true for that object and the
    address is immediately removed from the static "new" list.
 
 3. In HeapTracked::isOnHeap, substitute a simple return of the state of the
    local bool createdWithNew  data member. 
 
 4. Eliminate the specialized operation delete function.  (If necessary, use
    some other function to make the class abstract.) 
 
 In comparison with the solution in MEC++, this should be more time and more
 space efficient:
 
 Time --  Returning a bool value is much more efficient than doing a search
 through a sequential list of length N.  This savings occurs with each use
 of isOnHeap.  
   Plus, there is also a savings for each search that checks for removing and
 address from the list.  If this operation is deferred to deletion time,
 that means the "new" list must contain entries for every undeleted object.
 This could be hundreds, thousands or more, depending on the application.
 Each such search must make an O(N) search of that potentially long list.
   When the search is performed during construction with immediate removal,
 then the absolute worst case would be bounded by the number of nested,
 simultaneously-in-process constructions of HeapTracked objects.  At worst,
 this is likely to be only a few at any time, and usually this will only be
 one object deep or else empty (if "new" was not used), i.e. a search
 through a list of typical length <= 1.
 
 Space -- The space required for the extra bool member is much less than
 keeping a whole list entry in the static "new" list, which includes space
 for each address to remember, plus the overhead of the list pointers that
 tie the list together (e.g. two extra pointers, if it is doubly linked).
 
 Early on, you show the inadequacy of a simplistic solution that uses a
 specialized "operator new" to toggle the value of a bool value that is read
 by the next constructor.  As it is currently presented, one could reason
 that the problem was simply in using a bool instead of an integer value.
 Rather than toggle between false and true, one could think your objections
 would be met by using incrementing (when memory is allocated by new) and
 decrementing (by constructor), i.e. if the count is > 0, then assume a
 "new" was used and then decrement.  This much would take care of the
 nesting problem -- or so it might seem to a reader.
 
 However, I think the real problem and the deeper unresolved issue is that
 the construction of nested objects could include an object that has a data
 member of the class in question.  That data member might not involve the
 use of new, yet it may be constructed before the completion of the
 construction of other instances that did use "new".  So, the issue is not
 merely the possibility of nesting construction (which could be dealt with
 by the inc/dec method), but of the fact that that nesting could involve
 both "new" and non-"new" instances of the class in question.  Without the
 more explicit comparison of addresses, one could not be sure that the use
 of "new" was not being attributed to the wrong instances of the class.
 
 
 Subject: Re: MEC++ Item 27 (require/prohib. heap) suggestions
    Date: Wed, 21 Apr 1999 16:33:46 -0700
    From: Scott Meyers
      To: Eric Anderson
 
 >Time -- Returning a bool value is much more efficient than doing a search
 >through a sequential list of length N.  This savings occurs with each use
 >of isOnHeap.

 True, but don't forget that because each object is now a bit larger (and,
 due to alignment and padding issues, possibly by more than just the size of
 a bool).  That means that fewer objects will fit in cache or main memory.
 This could compromise locality of reference if many objects are frequently
 accessed together.  Depending on the mix of operations performed, your
 design might therefore run more slowly than the one in MEC++.  It might
 also run faster.  A priori, there's just no way to know.  This isn't a
 criticism of your design, it's just an argument that your design is not
 obviously faster.
 
 >Plus, there is also a savings for each search that checks
 >for removing and address from the list.  If this operation is deferred to
 >deletion time, that means the "new" list must contain entries for every
 >undeleted object.  This could be hundreds, thousands or more, depending on
 >the application.  Each such search must make an O(N) search of that
 >potentially long list.
 
 True, but there are other data structures with better access
 characteristics, e.g., a hash table.  My goal in the book was to come up
 with a portable design that works, not to come up with the fastest possible
 design.
 
 >Space -- The space required for the extra bool member is much less than
 >keeping a whole list entry in the static "new" list, which includes space
 >for each address to remember, plus the overhead of the list pointers that
 >tie the list together (e.g. two extra pointers, if it is doubly linked).
 
 True, but as I note above, by making each object bigger, you may decrease
 the speed of the application.  Sometimes it's preferable to use more total
 bytes in order to make each object as small as possible.  Again, this is
 not a criticism of your design, just a remark that size and speed sometimes
 interact in unanticipated ways.
 
 
 Subject: Re: MEC++ Item 27 (require/prohib. heap) suggestions
    Date: Wed, 21 Apr 1999 22:17:00 -0400
    From: Eric Anderson
      To: Scott Meyers
 
 One other minor observation is that Ron van der Wal's problem case of 
 
    new UPNumber(UPNumber());

 is only the simplest specific case of the more general one I mention,
 i.e. the nested creation of any object that contains a non-heap instance of
 the class in question.  In order to apply the appropriate constructor for
 the enclosing new object, the compiler will have to finish creating the
 nested object first (which becomes the argument to the enclosing
 constructor), so the counter solution won't work.
 
 
 
 
 
 From: "Bill Wade" 
 To: "Scott Meyers" 
 Subject: Re: New Web Page:  Is an object on the heap?
 Date: Thu, 22 Apr 1999 09:01:39 -0500

 Scott Meyers, Ron van der Wal and Luke Tomasello discuss solutions that
 involve having operator new() put a "signature" (or a bunch of signatures)
 into a block so that the object's constructor can tell if the object is on
 the heap.  There is some discussion of "accidental" signatures.  In the
 book Scott presents a list-based method (with no signatures) that avoids
 this problem, but can require O(N) time (N is number of heap objects) to
 delete a single item.
 
 There is an amortized O(1) technique to make signatures while avoiding the
 false (accidental) signature problem.  I first saw this as exercise 2.12 in
 "The Design and Analysis of Computer Algorithms" by Aho, Hoppcroft and
 Ullman.  An interesting application was presented by P. Briggs and L.
 Torczon, "An Efficient Representation for Sparse Sets", ACM Letters on
 Programming Languages and Systems (LOPLAS) 2(1-4), March-December 1993,
 pages 59-70.
 
 Here is the technique to efficiently and unambiguously "sign" a word of
 memory.  It should be easy to extend to sign a range of words.  To keep
 things simple I'll ignore issues such as alignment or potential overlap of
 words and build a class which keeps track of signed integers in memory.
 Pointers must be dereferencable.  You can't sign NULL.
 
 I haven't even attempted to compile this code as written.  All operations
 are amortized constant time.  There are two words of external storage
 (class Notary::Entry) for each signed word.  If vector is contiguous it is
 easy to reduce the overhead to single word (in addition to the word being
 signed).  As written, at most MAX_INT words can be signed at any one time.
 
 class Notary
 {
 public:
  Notary();

  // Sign a word.  Word must not already be signed or overlap a signed word.
  // Code does not check for overlap.  Once a word is signed it should not be
  // modified until after its signature has been Revoke'd.
  void Sign(int *);

  void Revoke(const int*);     // Unsign a word.  Word must already be signed.
  bool IsSigned(const int*) const;   // Tells if a word is signed.
 private:
  struct Entry
  {
    Entry(): word(0), next_free(-1){}
    const int* word;   // The word that is signed or NULL if not in use.
    int next_free;     // Index in logbook of next free Entry (linked list) or -1.
  };

  std::vector logbook;
  int first_free;  // Index of first (linked list) free Entry or -1.
 };

 Notary::Notary(): first_free(-1){}

 void Notary::Sign(int* p)
 {
  assert(!IsSigned(p));
  if(first_free == -1)
  {    // logbook is full.  Make some more room.
    first_free = logbook.size();
    logbook.resize(first_free+1);
  }

  // In the word being signed, remember the location of the entry in the logbook.
  *p = first_free;

  // Update the logbook entry to indicate the word is signed.
  logbook[first_free].word = p;

  // Remove the entry from the free list.
  first_free = logbook[first_free].next_free;
 }

 void Notary::Revoke(const int* p)
 {
  assert(IsSigned(p));
  int index = *p;

  // Make the Entry point nowhere.
  logbook[index].word = 0;

  // Add the entry to the free list.
  logbook[index].next_free = first_free;
  first_free = index;
 }

 bool Notary::IsSigned(const int* p) const
 {
  // This next line assumes it is always ok to read an uninitialized integer
  // as an integer.  Strictly speaking I don't believe the standard allows that.
  // You could make this code strictly conforming by only accessing *p as an array
  // of characters (here and in Sign and Revoke), but that seems like overkill.
  int index = *p;

  // Is index in range?
  if(index < 0 || logbook.size() <= index)
    return false;

  // Does the logbook agree that *p is the word associated with index?
  return p == logbook[index].word;
 }
	

Return to errata page for More Effective C++.