

 

 

Early Stages of the (Visual) C++ .Net 3

(Managed Extensions for C++)

 

 

Note: The code snippets used in this module are for Visual C++ .Net 2003; if you compile them using Visual C++ .Net 2005, you need to use the /clr:oldSyntax option. Here you will learn the new keywords used in C++ .Net, though in VC++ 2005 these keywords are already deprecated. However, we just want to see how the whole thing changed from the native, unmanaged C++ used in COM, MFC, etc. to managed C++ .Net. You will learn the following topics, divided into several parts.

  1. Managed Types and Value Types

  2. Managed Objects

  3. Methods on Managed Types

  4. Constructors

-----------Next-----------------------

 

 

 

  1. Operators

  2. Value Types

  3. Enumerations

  4. Boxing

  5. Managed Pointers

  6. Pinning Pointers

  7. Passing by Reference and by Value

  8. Properties

  9. Delegates and Events

  10. Attributes

  11. Managed Interfaces

  12. Managed Strings

  13. Managed Arrays

  14. Exceptions and Managed Code

  15. Unmanaged Exceptions

 

Managed Types and Value Types

 

.NET languages are described as consumers or extenders. A consumer language can only use existing .NET types, whereas an extender such as C++ can create new types. .NET defines two different sorts of types, depending on where instances of the type are allocated and how they are used. Reference types are created on the garbage-collector-managed heap, where allocation and deallocation are cheap but heap cleanup during garbage collection is expensive. Reference types are usually passed to methods by reference. Value types are typically created on the stack and are passed to methods by value.

Garbage-collected reference types appear to solve the problem of leaking memory: your code merely has to allocate the objects, and the garbage collector does the de-allocation. However, garbage collection does more than merely solve memory leaks within client code. In a distributed application, memory allocation is extremely important because objects can be accessed across process or machine boundaries, which introduces the issue of which code has the responsibility to perform the cleanup. Furthermore, when data is passed from one context to another by value, the data has to be serialized into a form that can be transmitted and then de-serialized at the other end into the form that the receiving code expects to get. In both cases, memory allocation has to be performed, and this brings into question how long these memory buffers should exist and who has the responsibility of releasing them.

In synchronous code, the issues were straightforward because both sides of the call know when a buffer is no longer being used. COM provided rules about who had the responsibility of managing memory based on parameter attributes, and this strategy worked well in most cases. However, when you passed variable-length buffers out of a method, the code got a little messy and involved using a global memory manager. (Allocations are performed with CoTaskMemAlloc, and memory is freed with CoTaskMemFree.) With asynchronous COM code, memory management started to get more complicated and required a final clean-up call to be made when it was clear that the call was completed. .NET makes asynchronous calls easy, and you can decide to ignore any return values from the call, in which case the final clean-up call is not made; because memory is allocated on the managed heap, this lack of a clean-up call is not a problem.

If your application uses many small objects with short lifetimes, individually allocating these objects on the heap can be a significant performance hit. For this reason, the .NET Framework provides value types. Value types are short-lived, small objects that are usually created on the stack. Allocating them is cheap: when you declare a value type variable, the stack pointer is merely moved to provide space. De-allocation is also cheap and is automatic: when the variable goes out of scope, the stack pointer is moved to indicate that the space is now available. Furthermore, accessing the data members involves direct access, so a dereference is not required. Because value types are normally created on the stack, their lifetime is short (except, of course, for those created in the entry point method).
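The difference between copying a small value and reaching an object through a pointer can be sketched in plain native C++. This is only an analogy, not Managed Extensions code: the Point type and the MoveByValue and MoveByPointer functions are hypothetical names invented for the illustration, and no garbage collector is involved.

```cpp
#include <cassert>

// Hypothetical small type, used only for this illustration.
struct Point
{
    int x;
    int y;
};

// Value semantics: the callee works on its own copy, so the caller's
// object is left unchanged and the result comes back as a new value.
Point MoveByValue(Point p, int dx)
{
    p.x += dx;
    return p;
}

// Reference semantics: the callee reaches the caller's object through
// a pointer, so the change is visible to the caller.
void MoveByPointer(Point* p, int dx)
{
    assert(p != 0);   // a dereference is needed to reach the members
    p->x += dx;
}
```

A caller that passes a Point to MoveByValue keeps its original object, whereas a caller that passes the address of a Point to MoveByPointer sees the object modified in place.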

 

Managed Objects

 

In C++ you identify that a class is managed by the garbage collector by using the __gc modifier. This modifier can be used on classes and structs, and it can be used on pointers to explicitly specify that the pointer is to a managed object. All __gc class members are private by default; __gc struct members are public. This scheme follows the usual C++ meaning of these types. Here is an example of a managed type:

 

__gc class DataFile
{
    System::String* name;   // private by default in a __gc class
public:
    DataFile(System::String* n) : name(n) {}
    // Other members omitted
};

This class is named DataFile. It uses the C++ public keyword to indicate that the constructor can be accessed by any code outside of the class, and the default member access to indicate that the name field can be accessed only by code within the class. The name field is a managed string, and in this example its type is given by the fully qualified name, including the namespace. The name field is initialized in the initializer list of the constructor, and the syntax here is similar to native C++: the pointer of the name field is initialized with the pointer n, but this does not mean that a constructor is called. Because the string is a reference type, all that occurs is an assignment of the reference. This behavior is important because when an instance of DataFile is created with this constructor, the name field is initialized with a reference to a managed string. Instances of this class can be created only on the managed heap, as shown in the following code:

// strFile is a managed string initialized elsewhere.
DataFile* df = __gc new DataFile(strFile);

You cannot create instances of a __gc class on the stack. If you attempt to create a stack-based instance, the compiler will issue an error (C3149). Notice that we have explicitly used the __gc modifier on the new operator to indicate that the managed operator is used. You do not have to use this syntax. If you omit this modifier, the compiler will still use the managed new operator because the class that is being created is managed. If you omit the __gc modifier from the class declaration or you use the __nogc modifier, a native C++ class will be created, as shown here:

// Must compile with /EHsc to enable unmanaged exception handling
#include <string>   // std::wstring

__nogc class natDataFile
{
    std::wstring name;
public:
    natDataFile(std::wstring n) : name(n) {}
    // Other members omitted
};

This code can exist in the same source file as DataFile; you can use pointers to native C++ objects in __gc classes, and native C++ objects can refer to __gc objects. Note, however, that you cannot use a raw __gc pointer as a data member of a native C++ object. The reason is that the native object will not be allocated on the managed heap and the pointers to the object will not be managed. (You can explicitly identify them as __nogc pointers.) This arrangement means the garbage collector will not be able to identify when the native object is destroyed and thus when the reference to the managed data member is freed. Instead, the native class must manage the reference itself and tell the runtime when the reference should be treated as being freed.

All __gc classes look similar to native C++ classes, but they are subject to the .NET rules of reference types. Some of these rules are similar to C++; others apply more restrictions. The most significant restriction is that .NET allows only single-implementation inheritance, which means you cannot derive a class from more than one other class.

 

Methods on Managed Types

 

Managed types can have methods, and methods contain code. There are several types of methods that can be called; for example, the metadata devices known as properties and events are really descriptions of methods that can be called (respectively, to get or set a property value, and to add or remove a delegate from an event and to raise that event). Methods can be called on a type (static methods) or on an instance. The default is for a method to be an instance method, but this can be changed with the static keyword. Methods are called with a special calling convention named __clrcall. You do not specify this (because the compiler will not recognize the keyword), and the only time that you will see it mentioned is in the error that is generated if you attempt to apply a different calling convention to a class method. However, you can apply other calling conventions to global functions. Note also that __gc class methods cannot be marked using the C++ const or volatile keywords.

Class methods can be overloaded. The .NET specification allows methods to be overloaded by return type as well as parameters, but this has not been carried over to the Managed Extensions. Instead, the normal C++ rules apply: methods can be overloaded only on the parameters. There is an exception to this C++ rule: if you define a static operator named op_Explicit or op_Implicit to perform conversions between managed types, the operator can be overloaded on the return value. Native C++ methods can have default values for parameters so that the method can be called without mentioning the parameter. Default parameters are not legal in .NET. A method on a __gc type with a default parameter will not compile and will generate the error C3222.
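Because default parameters are not available on a __gc type, the usual workaround is a pair of overloads that differ in their parameters. The sketch below shows the pattern in plain native C++ so that it can be compiled anywhere; the Logger class and its Write method are hypothetical names, and in a __gc class the string parameter would be a System::String* instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical native class used only to illustrate the overload pattern.
class Logger
{
    int count;   // number of entries written so far
public:
    Logger() : count(0) {}

    // Full version: the caller supplies every argument.
    int Write(const std::string& text, int repeat)
    {
        assert(repeat >= 0);
        if (!text.empty())
            count += repeat;   // empty text is ignored
        return count;
    }

    // Overload that stands in for a default argument "int repeat = 1".
    int Write(const std::string& text)
    {
        return Write(text, 1);
    }
};
```

The two methods are distinguished by their parameter lists only, which is the one form of overloading the normal C++ rules allow.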

Methods can be implemented inline in the class, or you can split the declaration and the implementation into separate header and .cpp files. The concept of inlining is redundant for two reasons. First, if a method is public, it could be used by another assembly unknown to the compiler at compile time, so the method must be available as a single item. Second, inlining code is actually performed by the Just-In-Time (JIT) compiler. The first time a method is called, the JITter will analyze the code, and it can decide to optimize the JITted method by compiling small methods as inline code. This decision is not yours to make; it is purely the choice of the JITter, so the C++ inline keyword has no effect.

The method parameters can be an instance of any .NET type, and they can be in, out, or in/out. By default a parameter is an in parameter, which means that it is passed from the calling method to the called method via the stack. If the parameter is an instance of a __gc reference type, the parameter will be passed via a pointer, so it is possible that the called method can change the instance by accessing its members through the pointer. It is the pointer that is treated as an in parameter. The parameter is in/out if it is passed in both directions, that is, initialized in the calling method and then used in the called method before being reinitialized and passed back to the calling method. To use an in/out parameter in managed C++, the parameter should be passed by reference, which means that a C++ reference or a pointer to a __gc reference type pointer should be used, as shown here:

void UseDataFile(DataFile __gc*& file)
{
   if (file == 0)
      file = new DataFile(S"Default.dat");
   // Use file here.
}

void PassDataFile()
{
   // Initialized to zero automatically
   DataFile __gc* df;
   UseDataFile(df);
   // Use df here.
}

UseDataFile takes a reference to a DataFile __gc* variable, and if this value is zero, the method creates an instance. Because the parameter is a reference, the variable in the calling code, PassDataFile, will be initialized with this new object, so PassDataFile can call the members of the new object. In this code, we have explicitly declared the parameter as DataFile __gc*&, but because the class is a __gc type, it is perfectly acceptable to omit the __gc modifier and declare the parameter as DataFile *&. C++ references are fine, but in this situation the call to UseDataFile is confusing because it is not obvious that an instance can be returned; hence, the equivalent syntax using pointers and the address-of operator is preferred:

void UseDataFile(DataFile __gc* __gc* file)
{
   if (*file == 0)
      *file = new DataFile(S"Default.dat");
   // Use file here.
}

void PassDataFile()
{
   // Initialized to zero automatically
   DataFile __gc* df;
   UseDataFile(&df);
   // Use df here.
}

Again, it is acceptable to omit the __gc modifier on the pointer declarations. Although it appears that PassDataFile calls the address-of operator, the address is not obtained in this call. The compiler recognizes the use of & here to mean that the parameter is passed as in/out. The same IL will be generated whether you use a reference or a pointer, but if the code is in the same C++ file, you cannot mix the two: the C++ compiler will refuse to allow you to pass a pointer to a method that requires a reference. If you call UseDataFile (either version) from C#, the parameter should be passed using the ref modifier.

The runtime does not distinguish between parameters passed as in/out or passed as out within the same context. However, some languages do make a distinction; C#, for example, uses the out and ref modifiers. The preceding examples pass the parameter as in/out. To indicate that the parameter should be passed as an out-only parameter, you should use the [Out] attribute of the System::Runtime::InteropServices namespace. When it sees the [Out] attribute, the compiler adds the [out] metadata attribute to the parameter. Note that the attribute you add in C++ has an uppercase O whereas the metadata attribute that is applied has a lowercase o.
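The two in/out forms have a direct counterpart in native C++, which may help readers coming from unmanaged code. The sketch below is an analogy only: NativeFile, UseByReference, and UseByPointer are hypothetical names, the pointers are ordinary unmanaged pointers, and, unlike a __gc pointer, a native pointer is not zeroed automatically and the object must be deleted by the caller.

```cpp
#include <cassert>

// Hypothetical native stand-in for the managed DataFile class.
struct NativeFile
{
    const char* name;
    explicit NativeFile(const char* n) : name(n) {}
};

// Reference form: the call site gives no hint that 'file' may be replaced.
void UseByReference(NativeFile*& file)
{
    if (file == 0)
        file = new NativeFile("Default.dat");
}

// Pointer form: the caller must write '&', which makes the in/out
// intent visible at the call site.
void UseByPointer(NativeFile** file)
{
    assert(file != 0);
    if (*file == 0)
        *file = new NativeFile("Default.dat");
}
```

A caller would write UseByReference(df) for the first form and UseByPointer(&df) for the second, after explicitly initializing df to zero.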

In a similar way, by default a value type is passed as an in parameter. Value types, of course, are not passed through a pointer. To pass a value type as in/out, you have to use a managed pointer, and to pass the parameter as an out parameter, you have to apply the [Out] attribute. In the following declaration, we have explicitly used __gc on the pointer because int is a primitive C++ type, and without __gc an unmanaged pointer will be used. It is interesting to note that in MSIL a managed pointer is identified with & whereas an unmanaged pointer is identified by a *.

void PassValueTypes(int inParam, int __gc* inoutParam, [Out] int __gc* outParam);

.NET classes can have virtual methods, so the runtime determines, from the type of the this pointer when a method is called, which particular implementation of the method will be called. In fact, the runtime can call a virtual method virtually or non-virtually, and the compiler decides which. When your code calls a method that is declared as virtual, the C++ compiler will always call it virtually. Virtual methods are usually identified with the C++ virtual keyword.

Additionally, .NET classes can be abstract; that is, you do not intend that instances of the class should be created, and you do intend that it should be used only as a base class. There are two ways to do this. The first way is to use the __abstract keyword on the class declaration, as shown here:

// abstract.cpp
using namespace System::IO;   // Stream is declared in System::IO

__gc __abstract class FileBase
{
protected:
    Stream* stm;
public:
    // Get a stream to read/write to the disk.
    virtual Stream* GetStream() { return stm; }
    // Other methods omitted
};

Because FileBase has the __abstract modifier, the class is abstract, even though the method has an implementation. The compiler puts the abstract metadata attribute on the class in the assembly so that code in other languages is also aware that the class cannot be created. A class derived from FileBase can override the GetStream method, or the derived class can leave the method as-is and allow client code to access the method through the pointer to an instance of the derived class. This pattern is useful for providing partial implementations of classes, and the documentation should indicate the extra code that should be implemented. You do not have to use the __abstract keyword. If one or more virtual methods have no implementation, the compiler will generate the metadata to indicate that the class is abstract (although it is useful to use __abstract because it gives a visual clue in your code of what your intentions are). This second way is shown here:

// abstract.cpp
__gc class FileBase2
{
protected:
    Stream* stm;
public:
    // Get a stream to read/write to the disk.
    virtual Stream* GetStream() = 0;
    // Other methods omitted
};

In this case, the C++ syntax for a pure virtual method is used. In C++, any class that has a pure virtual method is abstract. The compiler also adds the metadata attribute abstract to the method to indicate that it has no implementation, and the pure virtual syntax is the only way that you can get this attribute applied. To use FileBase2, you not only have to derive a class from it, but you also have to implement the pure virtual methods.

In this case, the pure virtual methods indicate an interface that derived classes should support. This system was how the C++ bindings for COM interfaces were implemented in versions of Visual C++ prior to the .NET Framework SDK and Visual Studio .NET. The new version of the compiler introduces a new keyword, __interface, which enforces the semantics of interfaces. .NET allows multiple interface inheritance, but unlike native C++, abstract classes are not treated as interfaces. So, the rule is that a class can derive from at most one class and from any number of __interfaces. Methods that are used to implement interfaces are virtual (but you do not have to mark them as such).
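The requirement that a derived class must implement every pure virtual method before it can be instantiated is the same in native C++, so it can be sketched without the managed keywords. The names SourceBase, DiskSource, and CheckDispatch below are hypothetical and are not part of the module's FileBase example; a managed version would add __gc and return a Stream*.

```cpp
#include <cassert>
#include <string>

// Hypothetical native analogue of FileBase2: one pure virtual method
// is enough to make the class abstract.
class SourceBase
{
public:
    virtual ~SourceBase() {}
    virtual std::string GetName() = 0;   // no implementation here

    // Non-virtual helper that relies on the derived implementation.
    std::string Describe() { return "source: " + GetName(); }
};

// A concrete class must implement every pure virtual method.
class DiskSource : public SourceBase
{
public:
    virtual std::string GetName() { return "disk"; }
};

// Usage sketch: the call is dispatched on the object's actual type.
inline void CheckDispatch()
{
    DiskSource disk;
    SourceBase& base = disk;
    assert(base.Describe() == "source: disk");
}
```

Attempting to declare a SourceBase variable directly would be rejected by the compiler, which mirrors the way a class with a pure virtual method is treated as abstract in the text.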

The antithesis of __abstract is __sealed. This keyword can be applied to virtual methods and to classes. When applied to an overridden virtual method, it indicates that the method is complete; the implementation cannot be overridden in a derived class. It seems nonsensical to make a method both virtual and sealed because virtual implies that the method can be overridden, but sealed prevents overriding. However, the compiler does allow this usage. When a method is sealed, the method is marked as final in its metadata. If you apply the __sealed keyword to a class, all the methods are considered to be sealed. Think carefully before you apply the __sealed keyword to a class, because the keyword means that another developer cannot extend your code, and you are unlikely to know about all the uses other developers might have for it. The only reason that we can think of for using __sealed on a class is to prevent other developers from accessing protected members.

 

 

 

 
