

 

 

Early Stages of the C++ .Net 5

(Managed Extensions for C++)

 

 

Note: The code snippets used in this module are for Visual C++ .Net 2003; if you compile them with Visual C++ .Net 2005, you need to use the /clr:oldSyntax option. Here you will learn the new keywords used in C++ .Net, although in VC++ 2005 these keywords are already deprecated. We simply want to see how things changed in moving from the native, unmanaged C++ used in COM, MFC etc. to managed C++ .Net.

  1. Managed Pointers

  2. Pinning Pointers

  3. Passing by Reference and by Value

  4. Properties

Managed Pointers

 

Reference types are accessed through managed pointers. There are a couple of types of managed pointers, depending on what they point to, and the rules for these differ significantly from the rules applied to unmanaged pointers. Managed pointers must point to an object. You cannot initialize them to some arbitrary section of memory because unlike C pointers, managed pointers are strongly typed and can be initialized only with a pointer to the specified type. You can use casts to fool the compiler like this:

int* p = reinterpret_cast<int*>(0x1000000);

String* s = reinterpret_cast<String*>(p);

This code is perverse, and you should avoid ever getting into the position of writing such code. Here I am using reinterpret_cast<> to initialize an unmanaged pointer with an integer value, because the compiler does not allow an unmanaged pointer to be initialized directly from an integer. Then we cast the unmanaged pointer to a String* pointer. At run time, the code that uses this String* pointer will throw an exception. If you have a managed array, the pointer is to the array object and not to the memory that the array uses. In general, if you have a managed pointer to an object, you cannot perform pointer arithmetic.

When you declare a managed pointer, the compiler will generate code that initializes the pointer to zero, so it is redundant to do this operation yourself. In general, an untyped pointer to a reference type (for example, a member of a collection, or if you want to write a generic algorithm) is an Object* pointer. For unboxed value types, the equivalent is Void __gc *. However, be wary of pointers to value types because when you cast from an address of a value type to a Void __gc*, you get a managed pointer but you do not get a boxed object.

 

 

 

 

 

 

// pointers.cpp

// Don't do this!

__gc class BadCast

{

   Queue* q;

 public:

   BadCast()

   {

      q = new Queue;

      int i = 99;

      q->Enqueue(reinterpret_cast<Object*>((Void __gc*)&i));

   }

   int Pop();

};

In this case, the address of the local variable is obtained, cast to a managed pointer, and then cast to an Object* pointer so that it can be put in Queue. This code will compile and run, but it has an inherent problem. The lifetime of the value type is determined by the stack frame, but the queue’s lifetime is determined by the lifetime of the instance of BadCast. Take a look at Pop:

// pointers.cpp

int BadCast::Pop()

{

   return *reinterpret_cast<int __gc*>(q->Dequeue());

}

This code obtains the first item in Queue and treats the item as a pointer to an int. However, the original address was an address on the stack, whose contents will have changed by now: the original int was lost well before this method was called. The value returned from Pop will be some random value. The message is clear: be wary of pointers to value types; in most cases, they refer to an address on the stack frame and should be considered only temporary.
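The lifetime problem above has a close parallel in native C++, and a small sketch may make the remedy easier to see. In Managed Extensions the usual remedy is to queue a boxed copy of the value (the __box operator used later in this module) rather than the address of a local. The following standard C++ analogy copies each value to the heap before queuing it, so the stored item no longer depends on the stack frame that produced it. The names SafeQueue and PushFromLocalFrame are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <queue>

// Hypothetical native stand-in for BadCast: each queued value is copied to the
// heap, so its lifetime is tied to the queue entry and not to a stack frame.
class SafeQueue
{
    std::queue<std::shared_ptr<int>> q;

public:
    void Push(int value)
    {
        // make_shared copies 'value' into heap storage owned by the queue entry.
        q.push(std::make_shared<int>(value));
    }

    int Pop()
    {
        assert(!q.empty());
        std::shared_ptr<int> item = q.front();
        q.pop();
        return *item;
    }
};

// The local 'i' is gone once this function returns, but the queued copy survives.
inline void PushFromLocalFrame(SafeQueue& sq)
{
    int i = 99;
    sq.Push(i);
}
```

Calling PushFromLocalFrame and then Pop returns the value that was queued, because the queue owns a copy instead of a pointer into a frame that no longer exists.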

If a __gc type has a data member (__value or __gc types), the member will be allocated on the managed heap, and the lifetime of the member will be determined by the lifetime of the containing object. You can create a pointer to such a member, but again you have to be careful because the pointer is not to a whole object, but only to part of the object, so it is called an interior pointer. The ECMA specification talks about object references (O types) and managed pointers (& types). An object reference is equivalent to what we call a whole object pointer, and what the ECMA spec calls a managed pointer is what I call an interior pointer. In both cases, they point to memory on the managed heap, which is why we call them, collectively, managed pointers.

An interior pointer can be a stack variable, passed as a method parameter or returned from a method. However, interior pointers cannot be stored as fields in a __gc or __value class, in a static variable or in an array, in order to guarantee that the lifetime of the pointer is not longer than the item it points to. In general, any __gc pointer to a __value type will be an interior pointer and the compiler will issue an error if you try to store the pointer as described earlier. Interior pointers are special in that the runtime allows certain limited pointer arithmetic to occur, but this code will not be verifiable by the runtime. Interior pointers can be incremented or decremented, or you can subtract one interior pointer from another to get the offset between the two members. Subtraction of interior pointers in IL gives the number of bytes between the pointers, but the C++ compiler inserts code to divide this by the size of the item pointed to by the interior pointers so that the result mirrors the behavior in C++. Of course, you always have to be careful when you get free access to memory.
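The statement that the compiler divides the byte offset by the element size "so that the result mirrors the behavior in C++" can be checked against ordinary native pointers. The sketch below is standard C++, not Managed Extensions code, and the function names are hypothetical. It only shows the native rule being mirrored: subtraction counts elements, while the underlying distance in memory is measured in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Distance in elements: what C++ pointer subtraction reports.
inline std::ptrdiff_t ElementDistance(const std::int64_t* first, const std::int64_t* last)
{
    return last - first;
}

// Distance in bytes: the raw figure that, according to the text, IL subtraction produces.
inline std::ptrdiff_t ByteDistance(const std::int64_t* first, const std::int64_t* last)
{
    return reinterpret_cast<const char*>(last) - reinterpret_cast<const char*>(first);
}

// Self-check: dividing the byte distance by the element size gives the element distance.
inline void CheckDistances()
{
    std::int64_t values[3] = {1, 2, 3};
    const std::ptrdiff_t bytes = ByteDistance(&values[0], &values[2]);
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(sizeof(std::int64_t));
    assert(bytes / size == ElementDistance(&values[0], &values[2]));
}
```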

// pointers.cpp

// Don't do this!

__gc class BadInteriorPointers

{

   __int64 x;

   __int64 y;

   String* s;

 public:

   BadInteriorPointers()

   { x=1; y=2; s=S"Test"; }

   void KillMe()

   {

      Dump();   // Initial values

      __int64 __gc * p = &x;

      *p = 3;

      Dump();   // Changed x

      p++;

      *p = 4;

      Dump();   // Changed y

      p++;

      *p = 5;

      Dump();   // Oops! Changed s

   }

   void Dump()

   {

      Console::WriteLine(S"{0} {1} {2}", __box(x), __box(y), s);

   }

};

Here we have two 64-bit integers and a __gc String member. The Dump method just prints out the values of these members to the console. We call this method in the KillMe method and then obtain an interior pointer to the first item. After that, we write a value through this pointer, which changes the value of the member x. The next code increments the pointer and changes member y, and then we do something that is fatal to this code: we increment the pointer again so that some of the memory that the pointer points to is the memory occupied by the string pointer s. (We have used 64-bit integers for x and y so that the interior pointer will be __int64 __gc*, and thus incrementing the pointer after it points to y will make the pointer refer to memory other than the packing between members.) No exception is thrown when we change the memory pointed to by this interior pointer, but when Dump then accesses the member s (and hence treats the overwritten memory as a String* pointer), an exception occurs. Here are the results that we get:

1 2 Test

3 2 Test

3 4 Test

 

Fatal execution engine error.

The error is so serious that we cannot catch this error, and there is no automatic stack dump.
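To see why the second increment lands on s, it can help to look at how the same three fields would be laid out in a native struct. This is only an analogy: the layout of a __gc object on the managed heap is decided by the runtime, and the struct NativeLayout and the function OffsetAfterSteps below are hypothetical names introduced for this sketch. On common platforms the native layout places s immediately after the two 64-bit integers, which is the situation the example above relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical native mirror of the fields of BadInteriorPointers.
struct NativeLayout
{
    std::int64_t x;
    std::int64_t y;
    const char*  s;
};

// Byte offset reached after stepping a 64-bit integer pointer 'steps' times from x.
inline std::size_t OffsetAfterSteps(std::size_t steps)
{
    assert(steps <= 2); // further steps would leave the struct entirely
    return offsetof(NativeLayout, x) + steps * sizeof(std::int64_t);
}
```

Zero steps give the offset of x, one step gives the offset of y, and two steps give the offset of s, so a write through the pointer at that stage overwrites the pointer member rather than an integer.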

 

Pinning Pointers

 

Managed pointers are managed by the garbage collector so that when copies are made or the pointer is assigned to zero, the garbage collector knows that references are created or lost. When a pointer is passed to native code, the garbage collector cannot track its usage and so cannot determine any change in object references. Furthermore, if a garbage collection occurs, the object can be moved in memory, so the garbage collector changes all managed pointers (including interior pointers) so that they point to the new location. Because the garbage collector does not have access to the pointers passed to native code, potentially a pointer used in native code could suddenly become invalid. The runtime does not allow managed pointers to be passed to native code; instead, a pinned pointer must be used.

When a managed pointer is pinned, the garbage collector is informed and this pinned pointer represents an extra object reference; in addition, pinning a pointer tells the garbage collector that during the lifetime of the pointer the object will be pinned in memory, which means that the garbage collector cannot move the object. Note that the lifetime of the pointer is the entire method where the object is used, not just the scope of the C++ pinned pointer (although if you assign a pinning pointer to zero, the object will no longer be pinned). An interior pointer will always be a __gc pointer even if the member pointed to is a __value type with no __gc pointers. To convert an interior pointer to a __nogc pointer, you must pin the pointer, as shown here:

// pinning.cpp

#include <stdio.h>

#pragma unmanaged

void print(int* p)

{

   // *p is an int, so the format specifier is %d rather than %ld

   printf("%d\n", *p);

}

#pragma managed

 

__gc struct Test{int i;};

 

void main()

{

   Test* t = new Test;

   int __pin* p = &t->i;

   print(p);

}

In this example, we have a function that is compiled to native code (it could be a method called through platform invoke, for instance), and we want to pass an interior pointer to this function. To do this, we create a pinning pointer, p, and assign it to the interior pointer. During the lifetime of the pinning pointer, the entire object, t, will be pinned.

 

Passing by Reference and by Value

 

When you pass parameters to a method, a copy of those parameters is made on the stack. If the parameter is a __gc type, the parameter will be a pointer to the instance. If the parameter is a __value type, a bitwise copy is made of __value type members and copies are made of object reference members. If a change is made to a __value type or to its __value type members, the change is made to the copy on the stack and will not affect the original. This code will work fine for calls within the same application domain. An application domain, or AppDomain, as it is more commonly called after its class name, is a unit of code isolation used within a .NET process. However, if you pass the value across application domain boundaries, the type must be serializable. The simplest way to do this is to apply the [Serializable] attribute to the type, as shown in the following code:

[Serializable]

__value struct Point

{

   int x; int y;

};

This attribute instructs the runtime, when an instance of this type is passed across context boundaries, to serialize all members that are not marked with [NonSerialized] and transmit these to the new context, where a new (uninitialized) instance will be created on the stack and initialized with the serialized data. Again, if you make changes to the value instance, the change will be made to the copy on the stack in the method. A __value type can be passed by reference, in which case you have to pass a pointer to the object (a C++ pointer or a C++ reference).

void MirrorX(Point& p)

{

   p.y = -p.y;

}

This code will work if the call is made within the same process, although it is interesting that even though the parameter is accessed through the C++ reference (a pointer) the type still has to be serializable. This code cannot be called across a process boundary because .NET remoting does not support passing pointers to value types. The reason is that if you want to pass a parameter by reference, its type must be derived from MarshalByRefObject, and of course value types cannot be derived from this class (or any class).

A reference type is usually passed by reference, so if, in a method, you change the parameter’s members through the pointer, the original object will be changed. This works fine for calls within the same application domain, but if the call is made outside of the domain (either in the same process or in another process), the __gc type must derive from MarshalByRefObject, which means that the object will be created and will live in one domain, but it can be accessed by code in other domains. You can also pass __gc types by value, in which case you have to apply the [Serializable] attribute. The object will be serialized only if remoting is used, that is, if the call is made into another application domain.
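The copy-versus-original distinction that runs through this section behaves the same way in native C++, so a short standard C++ sketch may help. It leaves out everything specific to .NET (application domains, serialization, remoting), and the names MirroredCopy, MirrorXInPlace and CheckMirror are hypothetical. Only the Point layout and the negation of y are taken from the text.

```cpp
#include <cassert>

struct Point
{
    int x;
    int y;
};

// By value: the function receives its own copy, so the caller's Point is untouched.
inline Point MirroredCopy(Point p)
{
    p.y = -p.y;
    return p;
}

// By reference: the function changes the caller's Point directly.
inline void MirrorXInPlace(Point& p)
{
    p.y = -p.y;
}

// Self-check showing the difference between the two calls.
inline void CheckMirror()
{
    Point a = {3, 4};
    Point b = MirroredCopy(a);
    assert(a.y == 4 && b.y == -4); // original unchanged, copy mirrored
    MirrorXInPlace(a);
    assert(a.y == -4);             // original changed through the reference
}
```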

 

Properties

 

Both __gc types and __value types can have properties. Strictly speaking, a property is not really a member of a type. It is a description, metadata, which identifies methods on the type that can be called through property access. Data members of a type are called fields by .NET and can have any type that you choose, including arrays. Fields have the disadvantage that they allow the data member to be both read and written, and they have no mechanism to perform validation. On the other hand, properties are implemented using methods, which means that you can determine whether a property is read-only, write-only, or read/write by the methods that you implement. Furthermore, the property methods can perform validation on the values passed to them or returned from them, so they can take evasive action if the values are invalid.

Properties are implemented with get_ and set_ methods. The get_ method is used to return the property, so its return type should be the same as the property. The set_ method is used to assign the property, so the method should not have a return type and its last parameter should be the same type as the property. To tell the compiler to generate the .property metadata, you use the __property modifier on the property methods.

__gc class GrimesPerson

{

   String* name;

  public:

   __property String* get_Name()

   {

      if (name == 0) name = S"the man with no name";

      return name;

   }

   __property void set_Name(String* n)

   {

      if (n == 0)

         throw new ArgumentException(S"name cannot be null");

      name = n;

   }

};

This class has a string property named Name. The name of the property is the name after the get_ or the set_. In this case, the property methods change the private field name, but this behavior is an implementation detail of my class. The property could generate a name dynamically, or it could read the name from a database or a file. The choice is entirely yours. The metadata for the property looks like this:

.property specialname instance string Name()

{

  .get instance string GrimesPerson::get_Name()

  .set instance void GrimesPerson::set_Name(string)

}

The European Computer Manufacturers Association (ECMA) spec says that the property can also have a method marked with .other, but there is no way that you can define these methods in C++, nor is it clear how such methods are called other than directly through their name. Code that uses the property treats the property as if it were a data member. The compiler will convert the property access to one of the methods mentioned in the .property metadata, for example:

GrimesPerson* me = new GrimesPerson, *you = new GrimesPerson;

me->Name = S"Richard";

you->set_Name(S"Ellinor");

Console::WriteLine(S"{0} and {1}", me->Name, you->get_Name());
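For comparison, native C++ has no property syntax, so the same pair of accessors has to be called by name, much like the get_Name and set_Name calls above. The sketch below is a standard C++ analogy with hypothetical names (NativePerson, GetName, SetName, CheckPerson). It assumes that an empty string stands in for the null String* pointer of the managed version, and that std::invalid_argument stands in for ArgumentException.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

class NativePerson
{
    std::string name; // empty plays the role of a null String* in the managed class

public:
    // Getter: falls back to a default when no name has been set.
    std::string GetName() const
    {
        return name.empty() ? std::string("the man with no name") : name;
    }

    // Setter: validates the value before storing it.
    void SetName(const std::string& n)
    {
        if (n.empty())
            throw std::invalid_argument("name cannot be empty");
        name = n;
    }
};

// Self-check: a rejected value leaves the stored name untouched.
inline void CheckPerson()
{
    NativePerson p;
    p.SetName("Richard");
    bool rejected = false;
    try { p.SetName(""); } catch (const std::invalid_argument&) { rejected = true; }
    assert(rejected && p.GetName() == "Richard");
}
```

The validation lives in the setter in both versions. What the __property keyword adds is the metadata that lets callers write the access as if it were a field.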

Properties can be static or instance members. They can also be virtual, and an abstract class can have pure virtual implementations for either access method.

__property static String* get_SurName()

{

   return S"Grimes"; 

}

The access methods cannot differ by the static specifier, but they can differ by the virtual specifier and the member accessibility. When you declare a property, you do not automatically add storage to a type. This behavior, coupled with the fact that properties can be pure virtual, means that properties can be members of interfaces. Thus, any class that implements the interface must implement the property. The interface can mention only one of the accessor methods, but any class can implement both accessors, meaning that the other accessor can be accessed only through an object reference.

Properties can have indexes, which means that they look (in code) similar to arrays. To add an index, you have to add a parameter to the get_ and set_ methods. The last parameter of the set_ method, of course, is the value that you are passing to the property. The index can be any type that you want.

// properties.cpp

public __gc class FileStore

{

public:

   __property StreamReader* get_Document(String* name)

   {

      return File::OpenText(name);

   }

   // Other members

};

 

void main()

{

   FileStore* fs = new FileStore();

   StreamReader* stm = fs->Document[S"readme.txt"];

   Console::WriteLine(stm->ReadToEnd());

   stm->Close();

}

Here the Document property is indexed with a string parameter. To call this property, we give the name of the property followed by the index value in square brackets. Properties with parameters can be overloaded.

__property StreamReader* get_Document(String* name);

__property StreamReader* get_Document(int i);

In this case, there are two properties, one indexed with a file name and the other indexed with an integer which might be an index into some other list maintained by the object. This declaration will result in two .property metadata descriptions. However, note that not all languages support indexed properties. As it stands, the FileStore::Document property can be accessed in C# only through the accessor methods directly.

// C#

FileStore fs = new FileStore();

StreamReader sr = fs.get_Document("readme.txt");

This code is quite ugly and is not what C# developers expect to see. In C#, indexed properties are called indexers. C# does not allow access to indexed properties through indexer syntax unless you add the [DefaultMember] attribute to your class identifying the indexed property.

[DefaultMember("Document")]

public __gc class FileStore

{

public:

   __property StreamReader* get_Document(String* name);

};

Now the C# code will look like this:

// C#

// puser.cs

FileStore fs = new FileStore();

StreamReader sr = fs["readme.txt"];

Because no name is specified, the C# compiler looks for the default member and accesses the specified property. This syntax means that only one indexed property can be accessed in this way. All others have to be accessed directly through their accessor methods. The converse is a little odd: C# can define indexers, but by default the C# compiler calls the property Item. The C# developer can change the name of the property using the [IndexerName] attribute.

In C++, a property with an integer index looks as if it is an array field. Indexed properties can have more than one index, and in this case, the syntax looks like native C++ array syntax because the calling code has to enclose each parameter with square brackets, so for this code:

// arrayprop.cpp

__gc class Multiplier

{

public:

   __property int get_Value(int x, int y)

   { return x*y; }

};

the calling code looks like this:

Multiplier* m = new Multiplier;

int i = m->Value[5][6];

Console::WriteLine(i);
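With Managed Extensions the two bracketed indexes are passed to the single get_Value(int x, int y) accessor. Native C++ has no indexed properties, so reproducing the same calling syntax needs a different mechanism. One common approach, sketched here in standard C++ with hypothetical names (NativeMultiplier, Row, CheckMultiplier), is a small proxy object: the first operator[] captures x, and the second operator[] supplies y and computes the result.

```cpp
#include <cassert>

class NativeMultiplier
{
public:
    // Proxy returned by the first index; it remembers x until y is supplied.
    class Row
    {
        int x;

    public:
        explicit Row(int x_) : x(x_) {}

        // Second index: plays the role of the y parameter of get_Value.
        int operator[](int y) const { return x * y; }
    };

    // First index: plays the role of the x parameter of get_Value.
    Row operator[](int x) const { return Row(x); }
};

// Self-check: the chained indexes behave like a two-parameter accessor.
inline void CheckMultiplier()
{
    NativeMultiplier m;
    assert(m[5][6] == 5 * 6);
}
```

The proxy is a design choice of this sketch and not something the managed compiler generates. It is shown only to contrast the native workaround with the metadata-based property.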

Of course, you can use any type for the indexes. C# can handle indexed properties with more than one index as long as they are treated as indexers, that is, as long as the property is the default member. The C# code for accessing Multiplier::Value looks like this:

// C#

Multiplier m = new Multiplier();

int i = m[5,6];

Console.WriteLine(i);

 

 

 

 
