Scott Meyers
Comments on Item 27 of More Effective C++
Last Updated December 15, 1999
by Scott Meyers
The messages below make up correspondance related to More Effective C++'s discussion of how to determine whether an object is on the heap. I've edited out unnecessary email headers and omitted parts of the messages not germane to the issue at hand, but other than that, these are verbatim messages sent to/from me on this topic. I'm grateful to each of the authors for their permission to make their messages available for general viewing.
After Ron van der Wal and I exchanged email, Ron put up a web page summarizing his extensive and insightful thoughts on this topic. That page contains essentially the same information as is shown in our correspondance below, but he's formatted his page, so it's easier to read. If you read his page, however, please don't forget to come back here to read the other authors' comments, because they describe interesting approaches to this problem that are different from both mine and Ron's.
From: Ron van der Wal
Date: Mon, 26 Feb 96 10:44:40 PST
Subject: On the heap checks
To: Scott Meyers
Thank you for another well written and useful book on C++ - meaning of course
"More effective C++". I would like to comment on the central issue of Item
27, i.e. ways to determine whether or not some object is allocated off the
heap rather than, say, stack or static data space. Unless noted otherwise, my
comments refer to the UPNumber and HeapTracked samples that you use
throughout Item 27.
1. A static bool 'onTheHeap' is not satisfactory (pp. 148-149)
--------------------------------------------------------------
This is true. There is, however, a fairly simple way to circumvent some of
the problems you outline: replace the bool flag by a counter. For example:
class UPNumber {
public:
...as before...
private:
static int heapPending; // Replaces 'onTheHeap'
};
int UPNumber::heapPending = 0;
void *UPNumber::operator new(size_t sz)
{
++heapPending;
return ::operator new(sz);
}
UPNumber::UPNumber()
{
if (heapPending < 1)
throw HeapConstraintViolation();
...
--heapPending;
}
This solves most of the problems you mention (we'll get to array allocations
in a moment), including the new UPNumber(*new UPNumber()) situation. In
fact, it could only fail (as far as single objects are concerned) in the
following situation:
new UPNumber(UPNumber());
This fails if (and only if) the compiler happens to generate code as follows:
1. Call operator new
2. Call the constructor of the temporary
3. Call the constructor of the dynamic object
The order of 2 and 3 will never be reversed, of course, so the question is
simply whether or not the operator new call will precede the construction of
the temporary. (This scenario would also cause the 'onTheHeap' approach to
fail.) Any other combination of constructors and/or operator new invocations
will be safe with the counter approach. In the example above it remains to
be seen if the use of a temporary makes sense, but it's legal anyway so we
have to deal with it.
As to arrays of UPNumbers, this could be solved by also overloading operator
new[] for the UPNumber class and implementing it as follows:
void *UPNumber::operator new[](size sz)
{
heapPending += sz / sizeof(UPNumber);
return ::operator new[](sz);
}
This solves the new UPNumber[100] you mention, assuming that 'sz' is indeed
the correct multiple of sizeof(UPNumber). When will this not be the case? I
can see two possibilities:
1. When the C++ implementation under consideration needs some extra space
in the array for bookkeeping, and to this end increases the passed-in
'sz'. This is a possibility allowed by the ANSI/ISO C++ draft standard
(see section 5.3.4), but it did not occur with the C++ compilers I tried
(Borland, Microsoft, Symantec). It may with others, though.
2. When UPNumber::operator new[] is inherited and used by a class derived
from UPNumber *and* sizeof(Derived) > sizeof(UPNumber). There is no
foolproof way to check for this (in particular, the size of the derived
class may itself be an integral multiple of the size of UPNumber), so
this will cause problems too.
To summarize: The counter approach does not completely solve the problems you
outlined on pp. 148-149, but it improves on the bool flag approach. In fact,
it still is a cheap solution and it may suffice for a large number of
situations, especially if portability is not an issue or will be considered
as the need arises (as frequently happens :-).
2. Tracking heap based objects (pp. 154-156)
--------------------------------------------
Your HeapTracked class does indeed exactly that in a portable and bullet
proof way. It is, however, a fairly expensive solution both in space and
time. In particular, the list template will often (if not always)
be implemented as a doubly-linked list and therefore require an allocation of
a list element per address stored. This may be a quite considerable overhead,
both in space and time. Furthermore, HeapTracked::isOnHeap() operates in
linear time, and this too could be a problem.
I have two alternatives to offer. Neither is as generally applicable or as
foolproof as your solution, but they are a lot cheaper (requiring only a
single integer per tracked object, and checking in constant time) and could
be appropriate in many situations. After all, if you really need to track
heap based objects, you may also be willing to impose some constraints on the
use of the tracked class. (I have done so anyway.)
Both methods add a signature to the allocated object, which is initialized by
the overloaded operator new for the class (although the location of the
signature differs between the two approaches). The signature is not
initialized if the object is constructed otherwise, and this is also the
possibly fatal flaw: a correct signature may be accidentally present even in
objects that do not come off the heap. I'll return to that later.
A. Prepend a signature to the allocated block
---------------------------------------------
This first solution would work out as follows: overload operator new for the
class and have it allocate some extra space before the actual block, then
fill this with a special signature. For example:
class HeapTracked {
public:
...as before (operator delete not needed)...
private:
static const int32 signature = 0x57647652L;
static bool checkSignature(const void *);
...RawAddress stuff deleted...
};
void *HeapTracked::operator new(size_t sz)
{
// Increase requested size by signature size
sz += sizeof(int32);
// Get somewhat larger block
void *block = ::operator new(sz);
// Initialize heap signature, then return pointer to subblock
// beyond signature. (Second static_cast not strictly necessary.)
int32 *signptr = static_cast(block);
*signptr = signature;
return static_cast(signptr + 1);
}
bool HeapTracked::isOnHeap() const
{
return checkSignature(dynamic_cast(this));
}
bool HeapTracked::checkSignature(const void *ptr)
{
// Cast back to int32 pointer
const int32 *signptr = static_cast(ptr);
// Check signature in front of the block.
// ***This is the problem area, see text***
return signptr[-1] == signature;
}
(By the way, 'int32' is a typedef to ensure I have a 4-byte integer
regardless of my compiler's preferences.)
As you've probably spotted, the problem with this approach arises when
HeapTracked::checkSignature() attempts to retrieve the signature field that
precedes the actual object. In general, the address some_array[-1] is not
guaranteed to exist. If the block was allocated by HeapTracked::operator new,
it will exist, but we're just about to find that out and can't be sure yet.
This problem cannot be solved in a portable way, as far as I know. It may
just so happen that the address signptr[-1] is not valid; it will depend on
the platform at hand whether or not this is likely to occur. In my practice
(DOS, Win16, Win32, OS/2 and Mac apps), the most threatening situations are
in 16-bit segmented DOS/Win16 programs when signptr[-1] would cause segment
'underwrap' (this can be checked, albeit in a non-portable way) and in
others, when the passed-in block pointer actually addresses something on the
stack exactly at the bottom of a stack page, and signptr[-1] causes a page
fault on the stack's guard page. The latter condition is almost impossible to
check for, but it has never occurred to me (yet). Anyway, this method is not
without its problems, although it is a generally applicable as your
list approach in all other respects (inheritance, mix-in use,
etc.).
Finally, it does not work with arrays of heap tracked objects, but you wisely
ignored that subject too :-).
B. Include a signature in the object itself
-------------------------------------------
This second solution is one you'll probably balk at, but it still has its
merits and I've used it with success on several occasions (as I did with the
previous method). The idea is to place the signature field inside the
allocated object, yet still initialize it in HeapTracked::operator new. For
example:
class HeapTracked {
public:
...as before (operator delete not needed)...
private:
static const int32 signature = 0x57647652L;
int32 signfield; // Added
...RawAddress stuff deleted...
};
void *HeapTracked::operator new(size_t sz)
{
// Get block of correct size
void *block = ::operator new(sz);
// Point into the object-to-be and set its signature...
HeapTracked *htptr = static_cast(block);
htptr->signfield = signature;
return block;
}
bool HeapTracked::isOnHeap() const
{
return signfield == signature;
}
The HeapTracked constructors must leave this field alone of course, as must
the assignment operator for the class, so it's probably a good idea to define
at least the copy constructor and assignment operator for this class and
implement them to that effect.
If you're really into it, you might even want to implement
HeapTracked::operator new[] and step through the array of objects-to-be,
setting all their signature fields. In for a penny, in for a pound, don't you
think?
When is this method not applicable? In essence, when class HeapTracked is
used as a base class (e.g. as the mix-in you mention) and it is either not
the first base class, or it is a virtual base class. In both cases, the
start of the new object will not coincide with the start of the allocated
block, and the htptr->signfield reference will be off. The same applies if
the C++ implementation under consideration adds some bookkkeeping space in
front of the actual object (again allowed by ANSI/ISO C++). It's up to the
client to decide whether or not these restrictions are acceptable.
Accidental signature match
--------------------------
As I hinted at above, there still remains the problem of accidental signature
matches. Both method A and B assume that if the signature value matches the
predefined constant, we have a valid heap block. This may not always be true,
however, especially not so because in the case of not-heap-based objects, the
signature fields aren't initialized at all and may contain any value
(including the signature value). What can we do about it? Nothing absolutely
foolproof, but a few things might raise the confidence level enough to make
either method usable.
1. Use an 'improbable' signature value. Clearly, 0 would be a poor
choice, as would be 0xFFFFFFFF. I normally use a 4-byte value that does
not match anything systematic. In the above examples the value
0x57647652L was used; I find it nicely conspicuous in dumps, yet fairly
unlikely to occur by chance (try interpreting it as 4 ASCII characters).
If necessary, you can make the signature value less probable by
increasing its size (what about a 128-bit GUID?), but some compromise is
probably called for.
2. Clear the signature field in HeapTracked::operator delete (method
A and B) or in the HeapTracked destructor (only method B). This might
help a bit, but not much, since we're mainly interested in distinguishing
heap based objects from others, and not one heap based object from
another. Still, this could reduce the chance of accidents in method B if
we do it in the HeapTracked destructor for every HeapTracked object.
3. Use additional information above and beyond the signature field. For
example, add a signature field at the end of the object field as well
(this limits the application of the methods). Or, prepend another field
with a dynamically magical value, say the address of the object itself.
Or prepend an additional counter field and XOR the signature field with
that. All these methods would still allow HeapTracked::isOnHeap() to
operate in constant time, and don't require auxiliary data structures.
C. Dig into your C++ compiler
-----------------------------
One final method to get what you want (i.e. knowledge about heap-basedness)
is to consult your compiler's runtime implementation details. This is
extremely nonportable (it might even change from one compiler release to the
next), but what gives if you're desparate?
Many compilers will use some hidden information passed to the constructors
and destructors they generate, indicating whether or not the constructor
should call operator new before proceeding (effectively combining the
constructor call with the operator new call), and conversely, whether the
destructor should call operator delete when it's done. They will also add
some information to memory blocks indicating how many objects are in the
block (to take care of vector construction and destruction). If you can get
at these and are prepared to face the maintenance problems it causes, this
could also solve your problems.
Obviously, this approach is off-limits for a book like "More effective C++"
and I have never seen it being used by anybody but the compiler manufacturer
himself, but...
Date: Fri, 01 Mar 1996 21:40:24 EST
From: Scott Meyers
To: Ron van der Wal
Subject: Re: On the heap checks
Thanks for your thoughtful comments on checking to see if objects are on
the heap.
>1. A static bool 'onTheHeap' is not satisfactory (pp. 148-149)
>--------------------------------------------------------------
>This is true. There is, however, a fairly simple way to circumvent some of
>the problems you outline: replace the bool flag by a counter. For example:
I considered adding this to the book, but I decided against it. The
book is longer than I wanted it to be anyway, and my experience has
been that anybody who can figure out the onTheHeap bool trick can
pretty easily extend it to the counter idea. In retrospect, perhaps I
should have added it anyway, but if I did that in all the places where
there were logical extensions (especially reference counting), I'd have
a 400-page book instead of a 300-page book. (I wanted a 200-page
book!)
>Your HeapTracked class does indeed exactly that in a portable and bullet
>proof way. It is, however, a fairly expensive solution both in space and
>time. In particular, the list template will often (if not always)
>be implemented as a doubly-linked list and therefore require an allocation of
>a list element per address stored. This may be a quite considerable overhead,
>both in space and time. Furthermore, HeapTracked::isOnHeap() operates in
>linear time, and this too could be a problem.
True. However, I assume that the problem of optimizing the data
structure storing/finding addresses is independent of the
heap-or-no-heap problem I consider in the book.
> void *HeapTracked::operator new(size_t sz)
> {
> // Increase requested size by signature size
> sz += sizeof(int32);
>
> // Get somewhat larger block
> void *block = ::operator new(sz);
>
> // Initialize heap signature, then return pointer to subblock
> // beyond signature. (Second static_cast not strictly necessary.)
> int32 *signptr = static_cast(block);
> *signptr = signature;
> return static_cast(signptr + 1);
> }
There is another potential problem with this idea, and that's that you
may violate an alignment restriction. operator new is required to
return an address that satisfies alignment requirements for all types,
and this implementation assumes that 4-byte boundaries satisfies that.
Whether this is a problem in practice, I don't know. I've heard that on
some machines, doubles must be double-word aligned.
From: Ron van der Wal
Date: Sat, 2 Mar 96 14:04:31 PST
Subject: Re: On the heap checks
To: Scott Meyers
Yes, you're right and I should have mentioned it, because one of those
machines is the DEC Alpha which I occasionally use. It might even require
8-byte alignment for other types, since its long int is 64 bits (as is
that of the Sun Sparc).
From: Luke Tomasello
To: Scott Meyers
Subject: Heap-based objects
Date: Wed, 22 Apr 1998 16:34:28 -0700
I have been enjoying your book MEC++ and have what I believe is a
reasonable solution to the "is it on the heap" OR "Is it on the stack"
problem (Item 27).
I believe my solution is both elegant and portable.
The basic idea is to:
1. implement both new and delete for the class in question.
2. You then use malloc() in 'new' to allocate the memory.
3. Now initialize all of the memory to some known value. E.g.. "ALLOCATED ON
HEAP"
4. Your class will contain a char array of say 16 bytes. E.g.. TestBuf[16];
5. In the constructor you will check TestBuf for a match with our known
initialization value.
(The string probably won't match-up exactly, so you will need write a small
bit of code.)
Your ctor can now have logic something like:
myclass::myclass()
{
// set alloc state for this class instance
if (match(TestBuff,"ALLOCATED ON HEAP") == true)
allocatedOnHeap=true;
else
allocatedOnHeap=false;
}
I like using a string for the initialization value because it is eaiser to
match and does not suffer any alignment issues.
What do you think?
Date: Wed Apr 22 21:11:02 1998
To: Luke Tomasello
From: Scott Meyers
Subject: Re: Heap-based objects
This will often work, but it may fail under the following conditions:
- The address of TestBuff fails to be the same as the address returned
from operator new. This can happen if (1) multiple inheritance and/or
virtual inheritance is being used, or (2) the address returned from the
new operator isn't the same as the address returned from operator new.
(The C++ standard makes no guarantee that they are the same, though I
don't know of any implementations where they're not.)
- By chance, the memory for a non-heap object contains the byte string
you test for. For an object of size 16 bytes or larger, the chances of
this are remote, but as objects get smaller, the chances of this
happening increases.
I still like the solution I describe in the book, because it's portable, it
can never fail, and I believe it will offer acceptable performance most of
the time.
From: Luke Tomasello
To: Scott Meyers
Subject: RE: Heap-based objects
Date: Thu, 23 Apr 1998 09:36:22 -0700
What I meant in my original message is that when you initialize the memory
returned from malloc, you initialize the whole memory block and not simply
the memory occupied by TestBuf (doing so would cause the problems you
mention above.)
I think this only leaves your second point which we both agree is extremely
unlikely if a buffer of say 16 bytes is used.
Date: Thu Apr 23 10:15:10 1998
To: Luke Tomasello
From: Scott Meyers
Subject: RE: Heap-based objects
>I think this only leaves your second point which we both agree is extremely
>unlikely if a buffer of say 16 bytes is used.
Right, though it would be interesting to characterize the cases where your
solution would be preferable to the one I published. My solution may use
more total memory, but yours makes objects larger, so my guess is the
performance of the different approaches would depend on how often queries
were made to find out if an object was on the heap.
Is there a particular reason you like your solution better?
From: Luke Tomasello
To: Scott Meyers
Subject: RE: Heap-based objects
Date: Thu, 23 Apr 1998 10:42:14 -0700
I think my solution may be better:
1. if hundreds of objects are getting allocated and objects are often
deleted. (The list in your solution would be long, so search times are
long.)
2. if the application is multi-threaded. (There is some problems with
your approach in a multi-threaded app accessing the 'address list'
unless more code is added to handle this.)
I think we could eliminate the 16 byte buffer and simply do something like:
threshold=some_number; // probably 10-20 is sufficient
magicString="SOME MAGIC STRING";
myClass::myClass()
{
// make sure this class is big enough to eliminate virtually
// all possibility of having the random value on the stack
assert(sizeof(*this) > threshold);
// now do the test
if (match(this,magicString,sizeof(*this)))
IsOnHeap=true;
else
IsOnHeap=false;
}
Date: Thu Apr 23 11:37:37 1998
To: Luke Tomasello
From: Scott Meyers
Subject: RE: Heap-based objects
>I think my solution may be better:
>1. if hundreds of objects are getting allocated and objects are often
> deleted. (The list in your solution would be long, so search times are
> long.)
Fair enough.
>2. if the application is multi-threaded. (There is some problems with
> your approach in a multi-threaded app accessing the 'address list'
> unless more code is added to handle this.)
Right. Access to the list would clearly have to be made thread-safe.
>I think we could eliminate the 16 byte buffer and simply do something like:
>threshold=some_number; // probably 10-20 is sufficient
>magicString="SOME MAGIC STRING";
>myClass::myClass()
>{
> // make sure this class is big enough to eliminate virtually
> // all possibility of having the random value on the stack
> assert(sizeof(*this) > threshold);
>
> // now do the test
> if (match(this,magicString,sizeof(*this)))
> IsOnHeap=true;
> else
> IsOnHeap=false;
>}
Frankly, you could probably just cast the first 4 bytes of memory to an
int and then put a magic value there. Working with ints is fast, and the
chances of static or stack memory happening to have a particular value is,
well, one in about four billion.
Subject: MEC++ Item 27 (require/prohib. heap) suggestions
Date: Tue, 20 Apr 1999 23:19:02 -0400
From: Eric Anderson
To: Scott Meyers
I just read through "Item 27: Requiring or prohibiting heap-based objects",
and I'd like to offer a few suggestions.
A. To improve the HeapTracked mixin class:
1. Add a new data member bool createdWithNew (or the equivalent),
initialized to false for all new HeapTracked objects.
2. In the body of the HeapTracked constructor, place code to search the
static "new" list for a match to the address of "this". If a match is
found, then createdWithNew is set to true for that object and the
address is immediately removed from the static "new" list.
3. In HeapTracked::isOnHeap, substitute a simple return of the state of the
local bool createdWithNew data member.
4. Eliminate the specialized operation delete function. (If necessary, use
some other function to make the class abstract.)
In comparison with the solution in MEC++, this should be more time and more
space efficient:
Time -- Returning a bool value is much more efficient than doing a search
through a sequential list of length N. This savings occurs with each use
of isOnHeap.
Plus, there is also a savings for each search that checks for removing and
address from the list. If this operation is deferred to deletion time,
that means the "new" list must contain entries for every undeleted object.
This could be hundreds, thousands or more, depending on the application.
Each such search must make an O(N) search of that potentially long list.
When the search is performed during construction with immediate removal,
then the absolute worst case would be bounded by the number of nested,
simultaneously-in-process constructions of HeapTracked objects. At worst,
this is likely to be only a few at any time, and usually this will only be
one object deep or else empty (if "new" was not used), i.e. a search
through a list of typical length <= 1.
Space -- The space required for the extra bool member is much less than
keeping a whole list entry in the static "new" list, which includes space
for each address to remember, plus the overhead of the list pointers that
tie the list together (e.g. two extra pointers, if it is doubly linked).
Early on, you show the inadequacy of a simplistic solution that uses a
specialized "operator new" to toggle the value of a bool value that is read
by the next constructor. As it is currently presented, one could reason
that the problem was simply in using a bool instead of an integer value.
Rather than toggle between false and true, one could think your objections
would be met by using incrementing (when memory is allocated by new) and
decrementing (by constructor), i.e. if the count is > 0, then assume a
"new" was used and then decrement. This much would take care of the
nesting problem -- or so it might seem to a reader.
However, I think the real problem and the deeper unresolved issue is that
the construction of nested objects could include an object that has a data
member of the class in question. That data member might not involve the
use of new, yet it may be constructed before the completion of the
construction of other instances that did use "new". So, the issue is not
merely the possibility of nesting construction (which could be dealt with
by the inc/dec method), but of the fact that that nesting could involve
both "new" and non-"new" instances of the class in question. Without the
more explicit comparison of addresses, one could not be sure that the use
of "new" was not being attributed to the wrong instances of the class.
Subject: Re: MEC++ Item 27 (require/prohib. heap) suggestions
Date: Wed, 21 Apr 1999 16:33:46 -0700
From: Scott Meyers
To: Eric Anderson
>Time -- Returning a bool value is much more efficient than doing a search
>through a sequential list of length N. This savings occurs with each use
>of isOnHeap.
True, but don't forget that because each object is now a bit larger (and,
due to alignment and padding issues, possibly by more than just the size of
a bool). That means that fewer objects will fit in cache or main memory.
This could compromise locality of reference if many objects are frequently
accessed together. Depending on the mix of operations performed, your
design might therefore run more slowly than the one in MEC++. It might
also run faster. A priori, there's just no way to know. This isn't a
criticism of your design, it's just an argument that your design is not
obviously faster.
>Plus, there is also a savings for each search that checks
>for removing and address from the list. If this operation is deferred to
>deletion time, that means the "new" list must contain entries for every
>undeleted object. This could be hundreds, thousands or more, depending on
>the application. Each such search must make an O(N) search of that
>potentially long list.
True, but there are other data structures with better access
characteristics, e.g., a hash table. My goal in the book was to come up
with a portable design that works, not to come up with the fastest possible
design.
>Space -- The space required for the extra bool member is much less than
>keeping a whole list entry in the static "new" list, which includes space
>for each address to remember, plus the overhead of the list pointers that
>tie the list together (e.g. two extra pointers, if it is doubly linked).
True, but as I note above, by making each object bigger, you may decrease
the speed of the application. Sometimes it's preferable to use more total
bytes in order to make each object as small as possible. Again, this is
not a criticism of your design, just a remark that size and speed sometimes
interact in unanticipated ways.
Subject: Re: MEC++ Item 27 (require/prohib. heap) suggestions
Date: Wed, 21 Apr 1999 22:17:00 -0400
From: Eric Anderson
To: Scott Meyers
One other minor observation is that Ron van der Wal's problem case of
new UPNumber(UPNumber());
is only the simplest specific case of the more general one I mention,
i.e. the nested creation of any object that contains a non-heap instance of
the class in question. In order to apply the appropriate constructor for
the enclosing new object, the compiler will have to finish creating the
nested object first (which becomes the argument to the enclosing
constructor), so the counter solution won't work.
From: "Bill Wade"
To: "Scott Meyers"
Subject: Re: New Web Page: Is an object on the heap?
Date: Thu, 22 Apr 1999 09:01:39 -0500
Scott Meyers, Ron van der Wal and Luke Tomasello discuss solutions that
involve having operator new() put a "signature" (or a bunch of signatures)
into a block so that the object's constructor can tell if the object is on
the heap. There is some discussion of "accidental" signatures. In the
book Scott presents a list-based method (with no signatures) that avoids
this problem, but can require O(N) time (N is number of heap objects) to
delete a single item.
There is an amortized O(1) technique to make signatures while avoiding the
false (accidental) signature problem. I first saw this as exercise 2.12 in
"The Design and Analysis of Computer Algorithms" by Aho, Hoppcroft and
Ullman. An interesting application was presented by P. Briggs and L.
Torczon, "An Efficient Representation for Sparse Sets", ACM Letters on
Programming Languages and Systems (LOPLAS) 2(1-4), March-December 1993,
pages 59-70.
Here is the technique to efficiently and unambiguously "sign" a word of
memory. It should be easy to extend to sign a range of words. To keep
things simple I'll ignore issues such as alignment or potential overlap of
words and build a class which keeps track of signed integers in memory.
Pointers must be dereferencable. You can't sign NULL.
I haven't even attempted to compile this code as written. All operations
are amortized constant time. There are two words of external storage
(class Notary::Entry) for each signed word. If vector is contiguous it is
easy to reduce the overhead to single word (in addition to the word being
signed). As written, at most MAX_INT words can be signed at any one time.
class Notary
{
public:
Notary();
// Sign a word. Word must not already be signed or overlap a signed word.
// Code does not check for overlap. Once a word is signed it should not be
// modified until after its signature has been Revoke'd.
void Sign(int *);
void Revoke(const int*); // Unsign a word. Word must already be signed.
bool IsSigned(const int*) const; // Tells if a word is signed.
private:
struct Entry
{
Entry(): word(0), next_free(-1){}
const int* word; // The word that is signed or NULL if not in use.
int next_free; // Index in logbook of next free Entry (linked list) or -1.
};
std::vector logbook;
int first_free; // Index of first (linked list) free Entry or -1.
};
Notary::Notary(): first_free(-1){}
void Notary::Sign(int* p)
{
assert(!IsSigned(p));
if(first_free == -1)
{ // logbook is full. Make some more room.
first_free = logbook.size();
logbook.resize(first_free+1);
}
// In the word being signed, remember the location of the entry in the logbook.
*p = first_free;
// Update the logbook entry to indicate the word is signed.
logbook[first_free].word = p;
// Remove the entry from the free list.
first_free = logbook[first_free].next_free;
}
void Notary::Revoke(const int* p)
{
assert(IsSigned(p));
int index = *p;
// Make the Entry point nowhere.
logbook[index].word = 0;
// Add the entry to the free list.
logbook[index].next_free = first_free;
first_free = index;
}
bool Notary::IsSigned(const int* p) const
{
// This next line assumes it is always ok to read an uninitialized integer
// as an integer. Strictly speaking I don't believe the standard allows that.
// You could make this code strictly conforming by only accessing *p as an array
// of characters (here and in Sign and Revoke), but that seems like overkill.
int index = *p;
// Is index in range?
if(index < 0 || logbook.size() <= index)
return false;
// Does the logbook agree that *p is the word associated with index?
return p == logbook[index].word;
}

