< C++ .Net Early Stages 1 | Main | C++ .Net Early Stages 3 >


 

 

Early Stages of the C++ .Net 2

(Managed Extensions for C++)

 

 

Continue from the previous module...

  1. Microsoft Intermediate Language (MSIL) bytecode and the Native Code

 

 

 

 

 

The C++ compiler compiles the code in all C++ functions, whether they belong to managed or unmanaged classes, to MSIL (now known as Common Intermediate Language, or CIL), with a few exceptions. The first exception is when you explicitly specify, with a pragma, that you do not want code to be compiled to MSIL. One complaint often made about .NET is that assemblies contain metadata and IL that can be readily viewed with the IL disassembler (ILDASM), so your algorithms are an open secret. One way to get around this problem is to compile the sensitive code to native x86:

#pragma unmanaged

 

char Encrypt(char cClear, char cKey)

{

   return cClear ^ cKey;

}

#pragma managed

The code that encrypts a string can pass each character to Encrypt. The following code shows a simple use of this function. This code assumes that no data is lost when the characters in the managed string strClear are converted from the 16-bit Unicode characters that System::String uses internally to the 8-bit char.

// strClear is the string to encrypt.

// strKey is the key to encrypt the data.

// bEncrypted is an array with the encrypted data.

// Create an array to hold the encrypted data.

 

Byte bEncrypted[] = new Byte[strClear->Length];

int posKey = 0;

 

for (int pos = 0; pos < strClear->Length; pos++)

{

   // String::Chars[] returns the character at the specified position.

   bEncrypted[pos] = Encrypt(strClear->Chars[pos], strKey->Chars[posKey]);

   posKey++;

   if (posKey == strKey->Length) posKey = 0;

}
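For readers without the managed toolchain, the same loop can be sketched in standard C++. This is only an illustration under stated assumptions: the names xorByte, fitsInBytes, and xorWithKey are hypothetical, std::u16string stands in for the 16-bit characters of System::String, and a std::vector of bytes stands in for the managed Byte array.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// XOR one cleartext byte with one key byte (the same operation as Encrypt).
inline std::uint8_t xorByte(std::uint8_t clear, std::uint8_t key)
{
    return static_cast<std::uint8_t>(clear ^ key);
}

// Returns true if every 16-bit character fits in 8 bits, which is the
// "no data is lost" assumption the text makes about strClear.
inline bool fitsInBytes(const std::u16string& text)
{
    for (char16_t ch : text)
    {
        if (ch > 0xFF) return false;
    }
    return true;
}

// XOR each character with the corresponding key character, wrapping back to
// the start of the key when it runs out. Characters above 0xFF are truncated
// to their low byte, so call fitsInBytes first if that matters.
inline std::vector<std::uint8_t> xorWithKey(const std::u16string& clear,
                                            const std::u16string& key)
{
    if (key.empty()) throw std::invalid_argument("key must not be empty");

    std::vector<std::uint8_t> out;
    out.reserve(clear.size());
    std::size_t posKey = 0;
    for (char16_t ch : clear)
    {
        out.push_back(xorByte(static_cast<std::uint8_t>(ch),
                              static_cast<std::uint8_t>(key[posKey])));
        if (++posKey == key.size()) posKey = 0;
    }
    return out;
}
```

Because XOR is its own inverse, applying xorByte a second time with the same key byte restores the original byte, so the same routine serves for decryption.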

 

You could use code such as this to encrypt data before passing the byte array to a stream, for example, a FileStream to write to a file or a NetworkStream to pass the data to a socket. The simple encryption algorithm XORs each character of the cleartext with the corresponding character in the secret key. Because we do not want the secret algorithm to be widely known, we have compiled it as native code. When a snooper uses ILDASM to view the assembly, he will see the following code:

.method public static pinvokeimpl(/* No map */)

int8 modopt([Microsoft.VisualC]Microsoft.VisualC.NoSignSpecifiedModifier)

   modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl)

Encrypt(

   int8 modopt([Microsoft.VisualC]Microsoft.VisualC.NoSignSpecifiedModifier) A_0,

   int8 modopt([Microsoft.VisualC]Microsoft.VisualC.NoSignSpecifiedModifier) A_1) native unmanaged preservesig

{

   .custom instance void [mscorlib]System.Security.SuppressUnmanagedCodeSecurityAttribute::.ctor() = ( 01 00 00 00 )

  // Embedded native code

  // Disassembly of native methods is not supported.

  // Managed TargetRVA = 0x1000

} // end of method 'Global Functions'::Encrypt

In essence, the C++ compiler has generated a managed function that wraps the unmanaged function. Because ILDASM cannot disassemble x86 machine code, the snooper does not get to see the secret algorithm. Of course, a determined hacker could locate the native code referenced by the managed function and use an x86 disassembler to recover the assembly code for the algorithm, as shown here:

Encrypt:

00401010  push  ebp

00401011  mov   ebp,esp

00401013  movsx eax,byte ptr [cClear]

00401017  movsx ecx,byte ptr [cKey]

0040101B  xor   eax,ecx

0040101D  pop   ebp

0040101E  ret

It is interesting to compare this with the IL that would be generated if the method had been compiled to IL, as shown here:

.maxstack  2

IL_0000:  ldarg.0

IL_0001:  ldarg.1

IL_0002:  xor

IL_0003:  conv.i1

IL_0004:  br.s IL_0006

IL_0006:  ret

In this simple example, it is clear in both cases what the algorithm does. In a more complicated algorithm, one that makes library calls, performs Boolean checks, or loops, there will be a marked difference between the disassembled x86 and the IL. The main difference is that, without symbols, the disassembled x86 gives no indication of which procedures are called, whereas in IL the metadata identifies the calls.

The compiler also checks whether the code you are compiling can be compiled to MSIL at all. This check is important if you are compiling existing C++ code. If a function contains code that cannot be compiled to MSIL, the entire function is compiled to x86 native code. This happens in the following cases:

  1. Functions that have __asm blocks.

  2. Functions that have varargs parameters. There is in fact an equivalent of varargs in .NET, and C++ can call such methods. However, the current version of the C++ compiler cannot compile methods that have vararg parameters to MSIL.

  3. Functions that call setjmp.

  4. Functions with intrinsics such as _ReturnAddress and _AddressOfReturnAddress that directly access the machine code.

  5. Functions with variables that are aligned types (using __declspec(align)).

With these rules taken into account, most code will compile to MSIL and the remaining code will compile to native x86.
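As an illustration of case 2 in the list above, here is a minimal sketch of a function with a C-style variable argument list. The name sumInts is hypothetical. According to the rule described above, a function of this shape would be left as native x86 by the compiler version the text discusses, while still being callable from the surrounding code.

```cpp
#include <cstdarg>

// Sums `count` int arguments passed through a C-style variable argument list.
// The caller must pass exactly `count` additional int arguments.
int sumInts(int count, ...)
{
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
    {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

A call such as sumInts(3, 1, 2, 3) adds the three trailing arguments.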

 

C++ Primitive Types

 

The .NET Framework defines value types for all the primitive types used in C++. You can continue to use the C++ types, and the compiler will ensure that the correct .NET type is used. These types are shown in the following table. All of them are defined in the System namespace, and the corresponding value from the System::TypeCode enumeration is also given. If you use the C++ equivalent type (other than void*, std::time_t, and std::wstring), the compiler will use the equivalent .NET type. If your code uses void*, std::time_t, or std::wstring and you want to pass the values to .NET code, you will have to change your code to use the equivalent .NET type.

The table also lists two basic .NET Framework types that have no equivalent in C++ or the C++ standard library: DBNull, which represents a NULL value in a database, and Decimal, which represents a decimal value with 29 significant digits. In addition, it lists three types together with their nearest C++ equivalents: DateTime, which holds a time; String, which holds a Unicode string; and Object, which is the top class in all class hierarchies in .NET, so an Object* is used in situations where a void* is typically used in C++.

 

.NET Type    Size (Bits)    TypeCode    C++ Equivalent Type
Boolean      8              0x03        bool
Char         16             0x04        wchar_t
Byte         8              0x06        unsigned char
SByte        8              0x05        char
Int16        16             0x07        short
UInt16       16             0x08        unsigned short
Int32        32             0x09        int
UInt32       32             0x0a        unsigned int
Int32        32             0x09        long
UInt32       32             0x0a        unsigned long
Int64        64             0x0b        __int64
UInt64       64             0x0c        unsigned __int64
Single       32             0x0d        float
Double       64             0x0e        double
DateTime     -              0x10        std::time_t
DBNull       -              0x02        -
Decimal      -              0x0f        -
Object       -              0x01        void*
String       -              0x12        std::wstring

Note on SByte: if you use the /J compiler switch, a C++ char is compiled as a Byte.

 

Primitive .NET value types and their equivalent C++ types
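The sizes in the table reflect the Visual C++ compiler. As a rough check on whatever compiler is in use, a sketch like the following reports the width of a type in bits. The helper bitsOf is hypothetical, and long long is used here as the standard spelling of the 64-bit integer that the table calls __int64.

```cpp
#include <climits>
#include <cstddef>

// Width of type T in bits on the compiler in use.
template <typename T>
constexpr std::size_t bitsOf()
{
    return sizeof(T) * CHAR_BIT;
}

// These widths match the table on Visual C++ and on most current desktop
// compilers. long and wchar_t are deliberately not asserted: on typical
// 64-bit Linux compilers long is 64 bits and wchar_t is 32 bits, unlike
// the 32-bit and 16-bit widths the table gives for Visual C++.
static_assert(bitsOf<unsigned char>() == 8, "Byte is 8 bits");
static_assert(bitsOf<short>() == 16, "Int16 is 16 bits");
static_assert(bitsOf<int>() == 32, "Int32 is 32 bits");
static_assert(bitsOf<long long>() == 64, "Int64 is 64 bits");
static_assert(bitsOf<float>() == 32, "Single is 32 bits");
static_assert(bitsOf<double>() == 64, "Double is 64 bits");
```

If one of the assertions fails to compile on a given platform, that row of the table does not carry over to it unchanged.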

 

Notice that int and long have the same underlying .NET Framework type: Int32. Thus, the same code will be generated for int as for long, which might appear to imply that a method cannot be overloaded on long and int. However, the C++ compiler uses a special modifier (Microsoft::VisualC::IsLongModifier) to indicate that the type is long rather than int, so the runtime treats methods overloaded on long and int parameters as different.

Each of the .NET types for primitive types derives from System::ValueType. These .NET types have methods to convert to other primitive types, to compare values, to create a value from a string, and to convert to a formatted string. Each also has a method named GetTypeCode that returns a value of the TypeCode enumeration. The TypeCode identifies the particular type, so you can pass a primitive type through a ValueType pointer and use the TypeCode to identify which type is being passed. Here are some examples:

// Use a .NET primitive type.

Int32 i32 = 99;

// Convert to a string.

String* s = i32.ToString();

// Use as a C++ primitive type.

int i = i32;
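The point about overloading on int and long can be seen in standard C++ as well, where the two are distinct types even on compilers that give them the same size. The following is a minimal sketch. The function name describe is hypothetical and is not part of any .NET or Visual C++ library.

```cpp
#include <string>

// Two overloads that differ only in int versus long. The compiler selects
// one from the static type of the argument, even where both are 32 bits.
std::string describe(int)
{
    return "int";
}

std::string describe(long)
{
    return "long";
}
```

Calling describe(1) selects the int overload and describe(1L) selects the long overload. This is the distinction that the modifier described above preserves in the metadata.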

The compiler automatically converts between C++ primitive types and the .NET primitive types, so you can assign an int to an Int32 and vice versa. To call the other conversion methods (for example, ToSingle and ToDecimal), the call must be made through a managed interface, IConvertible, and this requires that the value be boxed. Alternatively, the System::Convert class can be used to convert from one primitive type to another. You can use its generic ChangeType method, which takes an Object* pointer to the value you want converted and the type you want to convert the value to, but since most primitive types are value types, this operation involves boxed values. The Convert class also has overloaded methods to convert between specific types:

int i = 0;

// ToBoolean returns true for any nonzero value; i is 0, so b is false.

bool b = Convert::ToBoolean(i);

String* s = Convert::ToString(i);

In addition, most .NET types will have a ToString method to get a string version of the object.
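For comparison, the same two conversions can be sketched in standard C++ without the managed Convert class. The helper names toBoolean and toString are hypothetical. They only mirror the behavior described above and are not part of any library.

```cpp
#include <string>

// Zero converts to false and any nonzero value converts to true, the same
// rule that Convert::ToBoolean applies to an integer.
inline bool toBoolean(int value)
{
    return value != 0;
}

// Decimal string form of an integer, using the standard library.
inline std::string toString(int value)
{
    return std::to_string(value);
}
```

Unlike the managed versions, these helpers involve no boxing, because standard C++ has no common Object base type for primitives.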

 

 

 

 
