The Mysteries of Distributed Garbage Collection
(or "How does lifetime management in .NET Remoting really work?")
Each object's lifetime in the .NET Framework is determined
by the garbage collector (GC). To perform its work, the GC checks for each and
every object if it is "reachable". To determine reachability, the GC
starts at all root objects - which are basically all static (Shared in VB.NET)
members and all in-scope local variables - and traverses the complete object reference
graph. Each object which is encountered by the GC on its way is marked as alive.
In a second pass, all non-marked objects are destroyed, their resources freed,
and finally the heap is compacted again to prevent memory fragmentation.
When garbage collection is implemented in this way, it works
like a charm because it is relatively easy to determine which objects are
alive and which aren't, simply by checking whether references from other
live objects point to them.
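The marking pass described above can be sketched with a toy object graph. This is only an illustration of the reachability idea, not how the CLR implements its collector; the Node type and the Mark method are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a managed object (hypothetical type).
class Node
{
    public string Name;
    public List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

static class MarkDemo
{
    // Mark phase: everything reachable from the roots counts as alive.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var alive = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!alive.Add(current)) continue;   // already visited
            foreach (Node next in current.References) pending.Push(next);
        }
        return alive;
    }

    public static void Main()
    {
        var a = new Node("a");
        var b = new Node("b");
        var c = new Node("c");
        a.References.Add(b);                     // a -> b, nothing points to c

        HashSet<Node> alive = Mark(new[] { a }); // a is the only root
        Console.WriteLine(alive.Contains(b));    // True
        Console.WriteLine(alive.Contains(c));    // False: c would be collected
    }
}
```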
But let's increase complexity and add remote objects. How
can the garbage collector work in a Remoting environment, in which numerous
disconnected clients can still hold references to objects? How will it
determine which objects are still in use? Well, let's put it this way: it
can't, and therefore it won't. Well-performing and scalable distributed garbage
collection as it is used for local applications is simply not possible in these
distributed environments.
Fortunately though, a number of possible workarounds exist.
The designers of .NET Remoting chose to employ a lease-based lifetime
management system. In this system, a lease object is attached to the server
side of each remoted object. There is exactly one lease for each
MarshalByRefObject after it has been sent over a remoting boundary or manually
marshaled by the use of RemotingServices.Marshal(). It's the responsibility of
this lease to track the object's lifetime. The lease therefore stores
the object's remaining lifetime in its public property CurrentLeaseTime,
which is of type TimeSpan.
You can even access this lease and output an object's
remaining lifetime by using code similar to the following:
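    using System;
    using System.Runtime.Remoting.Lifetime;

    public static void DumpStats(MarshalByRefObject o)
    {
        // GetLifetimeService() returns the lease attached to the remoted object
        ILease le = (ILease) o.GetLifetimeService();
        Console.WriteLine("Current TTL: {0}", le.CurrentLeaseTime);
    }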
In addition to tracking an object's current time to live,
the lease has a second property called RenewOnCallTime. This property is
used by the framework to increase the lease time whenever a method of the
destination object is called.
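As an illustration of where these lease values come from, a remotable class can override InitializeLifetimeService() and adjust the lease before it becomes active. The following is a minimal sketch; the class name CustomerManager and the concrete time spans are assumptions chosen for this example, not values taken from the article.

```csharp
using System;
using System.Runtime.Remoting.Lifetime;

// Hypothetical remotable class, used only for illustration.
public class CustomerManager : MarshalByRefObject
{
    public override object InitializeLifetimeService()
    {
        ILease lease = (ILease) base.InitializeLifetimeService();

        // Lease properties can only be changed while the lease
        // is still in its initial state.
        if (lease.CurrentState == LeaseState.Initial)
        {
            lease.InitialLeaseTime = TimeSpan.FromMinutes(1);  // assumed value
            lease.RenewOnCallTime = TimeSpan.FromSeconds(20);  // assumed value
        }
        return lease;
    }
}
```

With settings like these, the object would start out with one minute to live, and RenewOnCallTime would come into play each time a client calls one of its methods.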
But the lease object itself only tracks the
lifetime. It neither increases the time to live when a method call is
received, nor does it periodically decrement the object's lifetime. These tasks
are performed by two other classes. The first one is called LeaseManager
and can be found in System.Runtime.Remoting.Lifetime. This class uses a
background thread to periodically check the leases of all remoted objects and
decrement their time to live. The second one is a sink in the server-side
remoting sink stack. It is called LeaseSink and can also be found in
System.Runtime.Remoting.Lifetime. This sink is used for each and every
server-side object and will increase the object's time to live whenever a
method call is received.
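The division of labor between these two classes can be modeled with a few lines of code. This is a toy model only: ToyLease, ToyLeaseManager and ToyLeaseSink are hypothetical stand-ins, not the framework's internal classes, and the renewal rule (a call raises the remaining time to RenewOnCallTime but never shortens it) is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy lease: it only stores values and has no behavior of its own.
class ToyLease
{
    public TimeSpan CurrentLeaseTime;
    public TimeSpan RenewOnCallTime;
}

// Stand-in for the background thread's work: one polling pass.
static class ToyLeaseManager
{
    public static void Tick(IEnumerable<ToyLease> leases, TimeSpan elapsed)
    {
        foreach (ToyLease lease in leases)
        {
            TimeSpan remaining = lease.CurrentLeaseTime - elapsed;
            lease.CurrentLeaseTime =
                remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
        }
    }
}

// Stand-in for the server-side sink: runs once per incoming call.
static class ToyLeaseSink
{
    public static void OnCall(ToyLease lease)
    {
        // Assumption: a call never shortens the remaining time.
        if (lease.CurrentLeaseTime < lease.RenewOnCallTime)
            lease.CurrentLeaseTime = lease.RenewOnCallTime;
    }
}

static class LeaseDemo
{
    public static void Main()
    {
        var lease = new ToyLease
        {
            CurrentLeaseTime = TimeSpan.FromMinutes(5),
            RenewOnCallTime = TimeSpan.FromMinutes(2)
        };

        ToyLeaseManager.Tick(new[] { lease }, TimeSpan.FromMinutes(4));
        Console.WriteLine(lease.CurrentLeaseTime);   // 00:01:00

        ToyLeaseSink.OnCall(lease);
        Console.WriteLine(lease.CurrentLeaseTime);   // 00:02:00
    }
}
```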
In addition to the purely TTL-based system of leases and
lifetime, the .NET Remoting Framework also allows the developer to attach
so-called sponsor objects to a lease. These sponsors will be contacted by
the LeaseManager whenever the corresponding lease's CurrentLeaseTime
reaches zero.
The combination of these two approaches allows the framework
to determine with fairly high probability whether an object is still in use
or whether it can be destroyed.
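To make the sponsor idea more concrete, here is a minimal sketch of a sponsor that implements the ISponsor interface. The class name CountingSponsor, the renewal budget and the two-minute extension are assumptions made for this example.

```csharp
using System;
using System.Runtime.Remoting.Lifetime;

// Hypothetical sponsor: grants a fixed number of renewals,
// then lets the lease expire.
public class CountingSponsor : MarshalByRefObject, ISponsor
{
    private int remainingRenewals;

    public CountingSponsor(int renewals)
    {
        remainingRenewals = renewals;
    }

    // Called by the framework when the sponsored lease has run out.
    public TimeSpan Renewal(ILease lease)
    {
        if (remainingRenewals > 0)
        {
            remainingRenewals--;
            return TimeSpan.FromMinutes(2);   // assumed extension
        }
        return TimeSpan.Zero;                 // do not renew
    }
}
```

A sponsor like this would be attached by obtaining the lease via GetLifetimeService(), as in DumpStats() above, and then calling lease.Register(new CountingSponsor(3)).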
What happens next?
Now that you know how lifetime management works on the
outside, you might be interested in the internal workings. Is the object
immediately destroyed after timing out? Are there actually two different
garbage collectors in .NET? One for local applications, and another GC for
remote ones? Fortunately not.
But before getting into the guts of GC combined with
Remoting, let's first look at how MarshalByRefObjects (MBRs) are marshaled.
The kind of marshaling I'm about to describe happens at the following
points:
- Whenever an MBR is published using
RemotingServices.Marshal().
- Whenever a server-activated object (SAO) in Singleton mode
is accessed for the first time.
- Whenever a registered client-activated object (CAO) is
instantiated.
Or, more generally: marshaling takes place whenever an
MBR is passed over a remoting boundary.
As soon as this happens, a so-called Identity object
is created. This identity points to the server-side MarshalByRefObject itself
and also to its objectURI, which is either manually specified when
dealing with SAOs, or dynamically created when you are working with CAOs. To
complete the marshaling process, this identity is stored in a Hashtable
using the objectURI as a key. This Hashtable will later be used to
resolve incoming requests by looking up the correct server-side destination
object based on the URL used for the call.
To illustrate this process, let's look at the following
Hashtable:
| Key             | Value                |
|-----------------|----------------------|
| /URI/to/Object1 | Identity -> Object#1 |
| /URI/to/Object2 | Identity -> Object#2 |
| /URI/to/Object3 | Identity -> Object#3 |
When the Remoting stack now receives an incoming request
targeted at /URI/to/Object2, it can grab the corresponding Identity
object and forward the call to the correct server-side destination object.
But wait! If all remoted objects are stored in one big Hashtable,
how can the garbage collector ever free them? Well, it can't. And this is
exactly where the lease-based lifetime management system comes together with
the normal GC. As soon as an object's lease reaches zero, the corresponding
identity object is simply removed from the Hashtable, which makes the
destination object subject to GC. The object immediately becomes unreachable
for remoting clients, but the real GC will happen at a later time.
So, one could say that the .NET Remoting lifetime management
system with all its leases, sponsors, background threads and sinks is in fact
just a highly sophisticated way of removing entries from a Hashtable.
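That conclusion can be illustrated with a toy version of the identity table. Everything here is a simplified assumption: ToyIdentity and ToyIdentityTable are hypothetical names, and the real framework keeps far more state per identity than this sketch does.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for the framework's identity object.
class ToyIdentity
{
    public object Target;
    public TimeSpan RemainingLease;
}

class ToyIdentityTable
{
    private readonly Hashtable table = new Hashtable();

    // Publish an object under the given URI.
    public void Marshal(string uri, object target, TimeSpan lease)
    {
        table[uri] = new ToyIdentity { Target = target, RemainingLease = lease };
    }

    // Resolve an incoming request; null means "no longer published".
    public object Resolve(string uri)
    {
        var identity = (ToyIdentity) table[uri];
        return identity == null ? null : identity.Target;
    }

    // One polling pass: age all leases and drop the expired entries.
    public void Tick(TimeSpan elapsed)
    {
        var expired = new ArrayList();
        foreach (DictionaryEntry entry in table)
        {
            var identity = (ToyIdentity) entry.Value;
            identity.RemainingLease -= elapsed;
            if (identity.RemainingLease <= TimeSpan.Zero)
                expired.Add(entry.Key);
        }
        // Removing the entry drops the table's reference to the target,
        // so the ordinary GC is free to collect it later.
        foreach (object key in expired) table.Remove(key);
    }
}

static class IdentityDemo
{
    public static void Main()
    {
        var table = new ToyIdentityTable();
        table.Marshal("/URI/to/Object1", new object(), TimeSpan.FromMinutes(5));
        table.Marshal("/URI/to/Object2", new object(), TimeSpan.FromMinutes(1));

        table.Tick(TimeSpan.FromMinutes(2));

        Console.WriteLine(table.Resolve("/URI/to/Object1") != null); // True
        Console.WriteLine(table.Resolve("/URI/to/Object2") != null); // False
    }
}
```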