Early Stages of the C++ .Net 4

(Managed Extensions for C++)

 

 

 

Note: The code snippets used in this module are for Visual C++ .Net 2003; if you compile them with Visual C++ .Net 2005, you need to use the /clr:oldSyntax option. Here you will learn the keywords introduced by Managed Extensions for C++, although in VC++ 2005 these keywords are already deprecated. We simply want to see how things changed from native, unmanaged C++ (as used in COM, MFC, etc.) to managed C++ .Net.

  1. Constructors

  2. Operators

  3. Value Types

  4. Enumerations

  5. Boxing

 

Constructors

 

Constructors are used to initialize a newly created instance of a class. In managed C++, constructors of __gc classes are declared in the same way as in native C++: the name of the class is used as if it were a method without a return type. In metadata, a constructor has the special name .ctor. Constructors can be overloaded but, like methods, they do not permit you to define default values for parameters. You can pass in/out and out parameters to constructors, although returning a value this way runs against the purpose of a constructor, which is to construct the object.

Classes can also have a static constructor (also known as a type constructor). A static constructor of a .NET class created with C++ is called just before the first access is made to a member. The static constructor is called by the runtime, and thus you are not able to pass any parameters to it. This arrangement means that only one, parameterless static constructor is allowed on each class. In metadata, a static constructor is named .cctor. If your class has a static field and you initialize this inline, the compiler will generate a static constructor with the code to initialize the field; if you define a static constructor, the compiler will put the initialization code before your code.

// Needed by the snippets in this module

#using <mscorlib.dll>

using namespace System;

 

public __gc class Data

{

   static Data()

   {

      Console::WriteLine(S"we are called {0}", str);

   }

   public:

      static String* str = S"the Data class";

};

 

 

 

 

 

In this code, there is a static member named str; this member is initialized to a string within the class. (Contrast this behavior with native C++, where only constant static integral members can be initialized like this.) The class also has a static constructor that prints out the value of the static field. This class is fine because the compiler will inject code before the call to WriteLine to initialize the string to the specified value.

ldstr "the Data class"    // Initialize the string...

stsfld string Data::str   // ... and store it in the static field.

ldstr "we are called {0}" // Load the format string.

ldsfld string Data::str   // Load the parameter, and call WriteLine.

call void [mscorlib]System.Console::WriteLine(string, object)
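For comparison, here is a minimal sketch of the native C++ side of the contrast mentioned above. It is standard, unmanaged C++ rather than Managed Extensions code, and the names NativeData, count and checkNativeData are hypothetical, introduced only for this illustration. A constant static integral member can be initialized inside the class, whereas the string member has to be defined and initialized outside it.

```cpp
#include <cassert>
#include <cstring>

// Native (unmanaged) counterpart of the Data class.
// NativeData is a hypothetical name used only for this illustration.
struct NativeData
{
    // Allowed in-class: a constant static integral member.
    static const int count = 3;

    // A static pointer member can only be declared here...
    static const char* str;
};

// ...and has to be defined and initialized outside the class.
const char* NativeData::str = "the Data class";

// Out-of-class definition of the integral constant, needed if it is ever
// bound to a reference or has its address taken.
const int NativeData::count;

// Usage sketch: both members are readable without creating an instance.
inline void checkNativeData()
{
    assert(NativeData::count == 3);
    assert(std::strcmp(NativeData::str, "the Data class") == 0);
}
```

In the managed version the compiler generates the equivalent initialization inside the static constructor, so no separate definition is needed.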

Finally, it is worth pointing out that because reference types are created on the managed heap and the garbage collector tracks the pointers that are used, you cannot define a copy constructor on a class. If you want to make an exact copy of an object, you should implement ICloneable and call the Clone method.

 

Operators

 

Managed types can implement .NET operators and in this section, we will make a few comments about C++ operators. Class instances are created using the operator new. For a __gc managed class, this operator is defined by the runtime, so you cannot create your own operator new on the class. Similarly, because objects are removed from the heap by the garbage collector, you cannot implement operator delete, and because the garbage collector manages pointers, you cannot define operator &.

Your access to a managed object should be through its members. You cannot change the pointer to the object, and you cannot increment a whole object pointer. Thus, the C++ sizeof and offsetof operators do not work on __gc types. There are cases when you might need to know the size of an unmanaged type represented by a .NET value type, or the position of a member within that type; for example, if you are defining custom interop marshalling. In this case, you can use Marshal::SizeOf and Marshal::OffsetOf (in the System::Runtime::InteropServices namespace). However, these do not work on managed objects.
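For contrast, the following is a minimal sketch of what sizeof and offsetof give you on a native type. It is standard, unmanaged C++; NativePoint and the helper functions are hypothetical names used only for this illustration. The "no padding" expectation holds on common compilers but is an assumption rather than a guarantee of the language.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical native record, used only for this illustration.
struct NativePoint
{
    int x;
    int y;
};

// For a native struct the compiler fixes the layout at compile time,
// so both operators are available.
inline std::size_t sizeOfPoint() { return sizeof(NativePoint); }
inline std::size_t offsetOfY()   { return offsetof(NativePoint, y); }

inline void checkNativeLayout()
{
    // x is the first member, so it sits at offset 0.
    assert(offsetof(NativePoint, x) == 0);
    // y follows x; common compilers insert no padding between two ints.
    assert(offsetOfY() == sizeof(int));
    assert(sizeOfPoint() == 2 * sizeof(int));
}
```

The layout of a __gc object, by comparison, is decided by the runtime, which is why the managed equivalents live in the Marshal class and apply only to types with an unmanaged representation.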

 

Value Types

 

Value types are typically small, short-lived objects and they are usually created on the stack. In managed C++, you can define a value type as a class or a struct. The important point is that the value type is marked with __value, as shown here:

__value class Point

{

   public:

      int x;

      int y;

};

You cannot create a value type directly on the managed heap. Typically, instances are created on the stack.

Point p = {100, 200};

This example shows an initializer list used for the value type. The compiler will generate code to pass 100 to the first member (x) and 200 to the second member (y). If an initializer list is not used, the members will be initialized to their default values, which is zero for primitive types. A value type can also implement constructors (including a static constructor), but if you define a constructor, you cannot use an initializer list to initialize an instance.

__value class Point

{

    public:

       int x;

       int y;

       Point(int i, int j) : x(i), y(j) {}

       // Default constructor to define the default value of this type

       Point() : x(-1), y(-1) {}

};

 

void Useit()

{

   Point p(100, 200);

}

A value type is implicitly sealed; you do not have to apply the __sealed modifier. A value type cannot derive from a __gc type. Thus, the only methods that you can override in the value type are the methods of System::ValueType, which is the base class of all value types. Methods inherited from System::ValueType are virtual, but other than these, it makes no sense to define new virtual methods on a value type.

Value types are typically used as records of data, much as you would use a struct in C. By default, the items are sequential; that is, in memory the fields appear in the order in which they are declared, but the amount of memory taken up by each member is determined according to the .pack metadata for the type. (The default is a packing of eight.) You can change this behavior with the [StructLayout] pseudo custom attribute (in the System::Runtime::InteropServices namespace). This attribute takes one of the three members of the LayoutKind enumeration:

Auto (the default for reference types): the runtime determines the order and the amount of memory taken up by each member (this amount will be at least as large as the size of the member).

Sequential (the default for value types): only the order is defined; the actual space taken up is determined by the size of the member and the packing specified.

Explicit: you specify the exact layout of the members (their byte location within the type and the size of each member), and you do this with the [FieldOffset] attribute.

The [StructLayout] pseudo custom attribute adds the auto, explicit or sequential metadata attribute to the type. Here is an example of using LayoutKind::Explicit:

// union.cpp

[StructLayout(LayoutKind::Explicit)]

__value class LargeInteger

{

    public:

       [FieldOffset(0)] int lowPart;

       [FieldOffset(4)] int highPart;

       [FieldOffset(0)] __int64 quadPart;

};

The first two members are 32-bit integers. Thus, the first member appears at offset 0 within the type and the second member appears at offset 4. However, notice that I have also intentionally put the third member (quadPart) at offset 0. There are no unions in .NET, but by using [StructLayout(LayoutKind::Explicit)] and [FieldOffset] like this you can simulate a union. Here, the quadPart member is a 64-bit integer. On a little-endian platform such as x86, the lower 32 bits can be obtained through the lowPart member and the higher 32 bits through the highPart member.

Value types are typically small, which usually means that they contain primitive types, but there are no restrictions on the types that you can use. A value type can contain pointers to __gc types, which will be allocated on the managed heap. If the value type does not contain __gc pointers, it can be created on the unmanaged heap by calling __nogc new. Of course, you have to remember to delete instances allocated this way. Value types cannot be created directly on the managed heap. Apart from boxing (covered later in this module), there are two cases when a value type will appear on the managed heap: when it is in a managed array or when it is a member of a __gc type.
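The overlay that LargeInteger simulates can be sketched in native C++ as follows. This is standard, unmanaged C++ and only an analogy: NativeParts, splitQuad and checkSplitQuad are hypothetical names, and memcpy is used here in place of a union read so that the sketch stays within what standard C++ defines. Which half ends up in lowPart depends on the byte order of the machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical native counterpart of LargeInteger's two 32-bit halves.
struct NativeParts
{
    std::int32_t lowPart;    // bytes 0..3
    std::int32_t highPart;   // bytes 4..7
};

// The overlay only makes sense if both views cover the same eight bytes.
static_assert(sizeof(NativeParts) == sizeof(std::int64_t), "views must have the same size");
static_assert(offsetof(NativeParts, highPart) == 4, "highPart expected at offset 4");

// View the bytes of a 64-bit value as two 32-bit halves.
inline NativeParts splitQuad(std::int64_t quadPart)
{
    NativeParts parts;
    std::memcpy(&parts, &quadPart, sizeof parts);
    return parts;
}

inline void checkSplitQuad()
{
    const std::int64_t value = 0x0000000100000002LL;
    const NativeParts parts = splitQuad(value);
    // Little-endian (x86): lowPart holds the low 32 bits.
    const bool littleEndian = (parts.lowPart == 2 && parts.highPart == 1);
    // Big-endian: the two halves appear swapped.
    const bool bigEndian = (parts.lowPart == 1 && parts.highPart == 2);
    assert(littleEndian || bigEndian);
}
```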

 

Enumerations

 

Enumerations are value types and have similar characteristics (allocated on the stack, implicitly sealed). However, enumerations do have some distinct differences. For a start, enumerations are derived from System::Enum (derived from System::ValueType), which gives access to methods to convert enumerated values to other types, to get the names and values of members, and to create an enumerated value from a string. Further, you cannot provide implementations of methods on enums. Enumerated values are integral types. You can specify the underlying type that will be used. The syntax looks like inheritance, but you do not specify an access level.

// enums.cpp

__value enum Color : unsigned int {RED=0xff0000, GREEN=0xff00, BLUE=0xff};

Here we have defined a new enum named Color that has 32-bit values. The enumeration has three items, and we have explicitly given them values. If you omit a value, the item will have the incremented value of the previous item (or zero for the first item). Of course, the items in the enum are not members in the same sense as other value types. The items are named values for the enum. Thus, an instance of Color can be initialized using an integral value or the named value.

Color red = Color::RED;

Color white = (Color)0xffffff;

Color cyan = (Color)(Color::BLUE | Color::GREEN);

Color gray = (Color)0x010101;

Here we have qualified the name of the enumerated value with the name of the enum; this means that there is no ambiguity. It is possible to omit the enum name (to get a weak enumerator name), and the compiler will search for an appropriate value. If the compiler finds another symbol with the same name, you might not get the result you expect. For example:

Color red = RED;

This will initialize red with a value of 0xff0000 as long as RED is not defined as any other symbol. If you define another enum, then there will be a problem.

__value enum UKTrafficLight {RED, AMBER, GREEN};

Then the compiler will complain because it does not know whether RED refers to Color or UKTrafficLight. Further, if you declare a variable with the same name,

int RED;

the compiler will attempt to convert the integer variable to an enum and, because no implicit conversion exists, you will get an error. This error is dangerous because, as we have shown previously, you can assign an integral value to an enum variable as long as you cast to the enum type. The error caused by using a weak enumerator name suggests that an explicit cast will solve the problem, but in fact the cast makes the problem worse: the code then compiles, but red is initialized from the value of the integer variable rather than from the enumerator. It is always better to use qualified names for enumerators.

The compiler allows you to define anonymous enums and will generate a name for you. However, an anonymous enum implies that you will use weak enumerator names.
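The advice to qualify enumerator names can be illustrated with a native analogy. The sketch below uses the scoped enumerations of standard C++11, a feature that postdates Managed Extensions for C++ and is not part of the old syntax; NativeColor, NativeTrafficLight, combine and checkScopedEnums are hypothetical names used only here. With scoped enumerations the qualified form is the only form, so two enums with a RED item and a variable named RED can coexist.

```cpp
#include <cassert>

// Underlying type given with the same inheritance-like syntax.
enum class NativeColor : unsigned int { RED = 0xff0000, GREEN = 0xff00, BLUE = 0xff };

// A second enum with overlapping item names causes no ambiguity,
// because every use has to be qualified.
enum class NativeTrafficLight { RED, AMBER, GREEN };

// Combining items still goes through the underlying integral type
// and needs a cast back to the enum type.
inline NativeColor combine(NativeColor a, NativeColor b)
{
    return static_cast<NativeColor>(static_cast<unsigned int>(a) |
                                    static_cast<unsigned int>(b));
}

inline void checkScopedEnums()
{
    int RED = 42;   // a variable named RED does not interfere either
    NativeColor red = NativeColor::RED;
    NativeColor cyan = combine(NativeColor::BLUE, NativeColor::GREEN);

    assert(static_cast<unsigned int>(red) == 0xff0000u);
    assert(static_cast<unsigned int>(cyan) == 0x00ffffu);
    assert(static_cast<int>(NativeTrafficLight::RED) == 0);
    assert(RED == 42);
}
```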

Normally, when you call System::Object::ToString on an object, you get back a string version of the object's value. ToString called on an enum does a little more work. First, ToString checks whether the [Flags] attribute has been applied. The documentation says that the members of such an enum can be combined with the bitwise OR operator, but C++ still treats the result as the underlying integral type and (as shown earlier) you have to cast back to the enum type.

Without the [Flags] attribute, ToString expects the enumerated value to be a single item from the enum. If this is the case, the item name is returned: calling ToString on the red variable mentioned earlier returns the string “RED”. If ToString cannot find a single item that matches (for example, white, cyan, and gray defined earlier), it returns a string that represents the number.

When ToString sees the [Flags] attribute, it attempts to build a comma-separated list of the names of the items that constitute the value. If the number cannot be represented completely by the items in the enum, the string representation of the number is returned. So if Color had the [Flags] attribute, the formatted string for white would contain the names of all three items (the runtime lists them in ascending order of value, giving “BLUE, GREEN, RED”), whereas gray would still return 65793 (if the default formatting is used).
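The two formatting rules described above can be sketched in native C++. This is an illustration of the described behavior, not the runtime's implementation: the table kColorItems and the functions formatPlain and formatFlags are hypothetical, and the ascending-by-value order of the names is an assumption about how the runtime lists them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical table describing the Color enum from this section,
// listed in ascending order of value.
struct NamedValue
{
    const char*  name;
    unsigned int value;
};

static const NamedValue kColorItems[] = {
    { "BLUE",  0x0000ffu },
    { "GREEN", 0x00ff00u },
    { "RED",   0xff0000u },
};
static const std::size_t kColorItemCount = sizeof(kColorItems) / sizeof(kColorItems[0]);

// Without [Flags]: the name of the single matching item, or the number
// as decimal text if no item matches exactly.
inline std::string formatPlain(unsigned int value)
{
    for (std::size_t i = 0; i < kColorItemCount; ++i)
        if (kColorItems[i].value == value)
            return kColorItems[i].name;
    return std::to_string(value);
}

// With [Flags]: collect every item whose bits are all present in the value.
// If any bits are left over, fall back to the number.
inline std::string formatFlags(unsigned int value)
{
    unsigned int remaining = value;
    std::string names;
    for (std::size_t i = 0; i < kColorItemCount; ++i)
    {
        const unsigned int bits = kColorItems[i].value;
        if (bits != 0 && (remaining & bits) == bits)
        {
            if (!names.empty()) names += ", ";
            names += kColorItems[i].name;
            remaining &= ~bits;
        }
    }
    return (remaining == 0 && !names.empty()) ? names : std::to_string(value);
}

inline void checkFormatting()
{
    assert(formatPlain(0xff0000u) == "RED");     // a single item matches
    assert(formatFlags(0x010101u) == "65793");   // gray: leftover bits
}
```

Gray falls back to the number in both modes because none of the three items has all of its bits set in 0x010101.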

 

Boxing

 

Value types can have methods, and you access these through the dot operator just like any other member of the value type. Value types also derive from System::ValueType (directly or, in the case of enums, indirectly through System::Enum). However, if you look up ValueType, you’ll see that it is a __gc type and not a value type, which means that its members should be accessed through a __gc pointer and not a value type instance. .NET allows you to convert a value type to a __gc object through a process named boxing. Boxing is explicit in C++ (unlike other languages supported by .NET) because the operation has a performance cost, so you have to state that a boxed value is being used rather than the value itself. When you box a value type, the runtime creates an object on the managed heap that holds an exact copy of the value being boxed. The type of this object on the heap is called the boxed type. Here is an example using the Color enum declared in the last section:

Color cyan = (Color)(Color::GREEN | Color::BLUE);

__box Color* boxedCyan = __box(cyan);

Console::WriteLine(boxedCyan->ToString());

Here we have used the __box operator on the cyan value to get a pointer to an object of type __box Color. This object is on the managed heap, so we can call ToString using pointer syntax, and we can access any of the other public members defined on the value type. If the value type overrides a method in ValueType, then we have the choice of accessing the method through the value type (with the dot operator) or through the boxed type (with the -> operator). Primitive types are value types, and they implement methods that allow you to convert instances to other types. These methods are part of the IConvertible interface, and to get access to this interface, you have to box the object first, as shown here:

Int32 i = 42;

__box Int32* b = __box(i);

IConvertible* cvt = b;

Double d = cvt->ToDouble(new NumberFormatInfo);

Console::WriteLine(d);

Note that if a value type is boxed, a copy of its fields is made. The boxed object is a clone of the value type but located on the managed heap. Consider the Point class shown earlier.

Point p1(100, 200);

__box Point* p2 = __box(p1);

p2->x = 300;

Trace::Assert(p1.x != p2->x);

When a change is made to a member of the boxed object, it affects the copy on the managed heap, not the original value, which is why we have performed an assert. In this case, the assertion holds because p1.x is not equal to p2->x. This behavior is one reason why it is important that the C++ team decided to provide boxing through an explicit operator.

If you intend to call a method of System::Object on the boxed object, you can make the type of the pointer Object*; however, try to avoid this practice because the pointer type then no longer says that it points to a boxed value. (You cannot use __box Object* because you can box only value types.)

You will have to box a value type whenever you pass it to a method that takes an Object* pointer. The most frequent occasions are when you pass value types to Console::WriteLine or when you put value types into a collection. Console::WriteLine has many overloads, some of which take value types, so the following statement will compile and run because there is an overload that takes an Int32 parameter:

Console::WriteLine(999);

If we want to pass a format string to print the integer in hex, we could try this:

// Does not compile

Console::WriteLine(S"{0:x}", 999);

This statement will not compile because no overload exists that takes a string and an Int32. The nearest version takes a string and an Object* pointer, so you can get the line to compile by boxing the value, that is, by passing __box(999) as the second argument.

The System::Collections namespace has various general-purpose collection classes. These classes are written to hold any type of object, so they contain Object* pointers. Thus, you have to box value types, which creates an object on the heap for each item. If you have many items that you want to put into a collection, boxing each one is inefficient; value types exist precisely to avoid having many small items on the heap. The alternative is to use an array.

Once a value type has been boxed, you can obtain a managed pointer to the value type from the boxed object, and you can initialize a value by dereferencing the pointer. Pointers to value types obtained through the address-of operator (&) are __nogc pointers.

// Implicit conversion from pointer to boxed type

// to a managed pointer to a value type

Point __gc* p3 = p2; 

// Dereference pointer to initialize a value type.

Point p4 = *p2;

Point p5 = *p3;

Dereferencing a pointer to a boxed value is called unboxing. If the type of the pointer is a boxed type, no cast is required during unboxing. If the pointer type is Object*, you have to cast to the appropriate boxed value type first. For example, System::Enum has a method named Parse to which you can pass a string holding either the name of an item in the enum or a numeric value, as shown here:

Color red;

Object* o = Enum::Parse(__typeof(Color), S"RED");

red = *static_cast<__box Color*>(o);

Parse takes the type of the enum to parse, but the method actually returns an Object* pointer. The type of the object is __box Color, so we can use static_cast<> to get a pointer of that type, and then we can dereference this pointer to unbox the object and initialize the value type.

Reference types can have value types as members, and the memory for the value type will actually be allocated on the heap. However, this memory behaves like a stack frame insofar as the lifetime of the value type depends on the lifetime of the reference type object. Contrast this behavior with a reference type pointer within a reference type: the lifetime of the referred-to object might depend on the lifetime of the containing object, but there could be other pointers to the same object, and those pointers could also affect its lifetime.
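The copy semantics of boxing and unboxing described in this section can be sketched with a native analogy. The code below is standard, unmanaged C++ and only mimics the copying; PlainPoint, boxCopy, unboxCopy and checkBoxingCopies are hypothetical names, and std::unique_ptr stands in for the managed heap even though it frees the object deterministically rather than through a garbage collector.

```cpp
#include <cassert>
#include <memory>

// Hypothetical plain value, standing in for the Point value type.
struct PlainPoint
{
    int x;
    int y;
};

// "Box": copy the value into a separately allocated heap object.
inline std::unique_ptr<PlainPoint> boxCopy(const PlainPoint& value)
{
    return std::unique_ptr<PlainPoint>(new PlainPoint(value));
}

// "Unbox": copy the heap object's fields back into a value.
inline PlainPoint unboxCopy(const std::unique_ptr<PlainPoint>& boxed)
{
    return *boxed;
}

inline void checkBoxingCopies()
{
    PlainPoint p1 = { 100, 200 };
    std::unique_ptr<PlainPoint> p2 = boxCopy(p1);

    p2->x = 300;                    // changes only the heap copy
    assert(p1.x == 100);
    assert(p2->x == 300);

    PlainPoint p4 = unboxCopy(p2);  // another independent copy
    p4.y = 0;                       // does not touch the heap copy
    assert(p2->y == 200);
    assert(p4.x == 300);
}
```

As in the managed example, each step produces an independent copy, so a change made through one of them is never visible through the others.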

 

 

 

 
