EveC tour



Introduction

EveC is a new programming language that appears to be very close to Java as regards syntax. However, despite this apparent similarity, the EveC source code is translated into ANSI-C code, optimized through some high-level optimization targeted to object-oriented languages, and finally compiled by an ANSI-C compiler that produces optimized native code. EveC inherits some features of Java. For example, it ensures to have a similar robustness and portability between operating systems and architectures to Java, with the only difference that it is necessary to rebuild the source code for the target architecture and operating system, without the need to change any line of code. The Runtime System is provided with a concurrent garbage collector, an exception system and concurrent primitives. EveC provides a full generic class support and, unlike Java, it is possible to determine at run-time the type of every object. Moreover, EveC supports also unsigned primitive types and arithmetic.

System architecture

The system developed consists of three principal phases.

The first phase concerns the translation of the EveC source code into ANSI-C: an ANTLR grammar is appointed to perform this translation, handling all those object-oriented features, like objects and classes, that do not exist in the target code.

The second phase consists of a set of optimizations, most of which are targeted for object-oriented programming languages and, therefore, cannot be performed by an ANSI-C compiler. In this phase the compiler manipulates the C code already generated in order to make it more efficient.

The purpose of the last phase is to compile the optimized C code generated into native code. This task can be performed by any ANSI-C compiler, therefore theEveC programming language is portable on a huge number of architectures and operating systems.


System architecture of the EveC compiler.

Translation from EveC to C

The first phase of the compilation process consists in the translation of the EveC source code into C code. This phase can be divided into two different subphases. In the first subphase, the source code is parsed for the first time and information about the classes or the interfaces, the methods and the attributes of the class or the interface are collected. During the second subphase, the source code is parsed for a second time and the translation from EveC to C is performed. This first phase of the compilation process is performed using an ANTLR grammar, differently annotated in the two phases. In the first subphase, the ANTLR grammar is annotated to collect all the necessary information. In the second subphase, the ANTLR grammar is annotated to directly translate the EveC code into C, performing all the necessary type checking, verification of existence of called methods and so on, using the information gathered by the previous subphase. Moreover, all the high-level features, such as classes and interfaces, are managed in order to be translated in one or more constructs present in C.
A detailed description of how the translation phase is performed and how all the EveC features are translated in C is available in this document.

Optimizations

In order to reach better performances, the EveC compiler has been enriched with some high-level optimizations. Since the EveC language is translated into C code and then compiled by a C compiler, most optimizations are already performed by the ANSI-C compiler that represent the back-end of this project.

However, the EveC compiler, in order to handle and optimize the typical features of an object-oriented language, perform some high-level optimizations not implemented in an ANSI-C compiler. Unlike the optimizations implemented in the ANSI-C compiler, that are performed during the compilation of the C code into native code, the optimizations implemented in the EveC compiler are performed, at an higher level, i.e. before the compilation with the ANSI-C compiler. These optimizations are: Devirtualization, Method Inlining , Bounds Checks Elimination and Escape Analysis.

Actually, Method Inlining is already performed by any ANSI-C compiler; however, performing it also at an higher level permits the enhancing of the other optimizations executed by the EveC compiler

Devirtualization

In the EveC programming language, every method is implicitly defined as virtual; thus, every method call that is performed must be executed using a look-up table instead of a direct call to the corresponding function. Since the language allows a class to extend only one father class, the indirect call is not as expensive as in other languages, for instance C++. Nevertheless, the indirect method call does not allow further optimizations, such as Method Inlining; moreover, an indirect method call can break hardware mechanisms like speculative execution and prefetching because they rely on control flow prediction, which is harder to analyse in presence of indirect method calls.

Method Inlining

Method Inlining trades code space for time by replacing a procedure call with the called procedure’s body. More in details, such a trade-off constists in the reduction of the time overhead generated by the call of a method, at the cost of a simultaneous code growth that can lead to a loss of locality of reference and excessive code growth.

void Method () 
{
  for(int i = 0; i <= 10; i++) 
  {
    A(i); 
  }
}
void A(int i) 
{
  array[i] = i; 
}

In the code below, the call to the method A shown in the code above, is replaced by its body, and, in this example, it can lead to a further optimization of the code, by performing, after Method Inlining, also the Range Check Elimination.

void Method ()
 {
  for(int i = 0; i <= 10; i++)
  {
    array[i] = i;
  }
}

Inlining has two important benefits: first of all, it eliminates procedure call overhead, including the cost of passing arguments, saving and restoring registers, building return linkage information and branching to the procedure body. In addition, by merging the procedure body with its calling context, it enables other optimizations to specialize the caller and the callee (i.e. the method called) together.

Bounds Checks Optimization

Whenever an array element is accessed, the EveC compiler executes a comparison instruction to ensure that the index value is within the valid range; this is made to guarantee a typesafe execution: if a reference is invalid, an exception must be thrown. Hence, whenever the compiler finds an array access, a check is added to determine if the access is safe. This clearly reduces execution speed, since there are more instructions to be executed. In particular, when the array access is placed inside a loop, as shown in the code below, the check is executed many times. In this case the check may be redundant and the processor would execute many useless instructions.

for(int i = 0; i < A.length; i++)
{
   if(i >= A.length || i < 0)
       RaiseException(); 
   A[i] = i;
 }

Bounds Check Elimination has the purpose to eliminate this redundant checks whenever possible, guaranteeing the safety of each array access. This lead to better performances as for the executions time since it reduces the number of instructions.

Escape Analysis

When an object is created through the new statement in EveC, the compiler translates it in a call to a function defined in the Runtime System, SuperMalloc, whose aim is to allocate the object on the heap. Therefore, each object in EveC is allocated on the heap, and so, it can be deallocated only by the garbage collector. Escape Analysis is a modern compiler optimization technique for the identification and optimization of objects that are accessible only within a method. Such objects are said not to escape the method. When it is verified that a certain object does not escape the method in which it was declared, it can be allocated on the method’s stack frame. This has two important implications. First of all, stack allocation is inherently cheaper than heap allocation; moreover it reduces garbage collector overhead, since the storage on the stack is automatically reclaimed when the method returns.

According to the adopted policy, an object escapes a method if one of the following situations take place:

  • The object, or one of its members, is returned by the method.
  • The object is passed as an argument to a function call.
  • The object is assigned to another object that escapes from the method.

When the analysis on a certain object has determined that it does not escape the method, the call to the SuperMalloc function can be replaced with the assignment of the object to another variable, allocated on the stack frame of the creating method.

void method() {
  ...
  struct Object * obj = SuperMalloc(sizeof(Object));
  ...
}

If the object obj does not escape from method, it can be allocated on the stack, as shown below.

void method () 
{
  ...
  struct Object tmp;
  struct Object * obj = &tmp;
  ...
}