Jussi Laakkonen, Teemu Ikonen, Arttu Kuukankorpi, Jari Kytöjoki, Olli Auvinen, Vesa Kärpijoki
Last updated 1998-04-23
We have been porting Kaffe, the popular Java JIT-compiler, to the Plan9 operating system since September 1997. We are doing this as a part of our computer science studies at Helsinki University of Technology, Finland. The project is a part of the Software Project course and is to be completed in early May 1998. Our work is being supervised by Mikko Tiusanen, PhD.
To make it easier for others to port Kaffe to other platforms, we have gathered the experiences and problems we have encountered in this document. We hope our experiences will help others in their work. We also give the guidelines we are following in our task. This document will be written during our work.
We will port Kaffe to the Intel x86 version of Plan9. Thus we have chosen to base the port on the Linux version of Kaffe.
We have approached the problem bottom-up. This means we fit Kaffe piece by piece to the Plan9 system calls and try to get each piece to work. Some parts of Kaffe can't be tested in isolation (e.g. threads, garbage collection), so in the beginning we have to assume that "if it compiles, it works".
Porting Kaffe is a big and complicated task, so it is necessary to divide it into smaller parts. We have decided to port only the interpreter in the first phase and the JIT-compiler later. The interpreter has few technical peculiarities, and from the interpreter's point of view the functionality of Plan9 is very similar to that of Linux. Only the implementation of threads differs: Kaffe uses signals to implement context switching between threads, whereas Plan9 offers a notify-based interface. The main task in porting the JIT-compiler is to adapt the code it generates from the conventions of gcc to those of 8c, the Plan9 C-compiler, because the Linux version of Kaffe assumes that the features of gcc can be used.
We start the porting by fitting the Java API to Plan9. We have to get java.lang and java.io to work before we can test Kaffe.
We have checked that it is possible to implement the JIT-compiler in Plan9. Plan9 memory protection allows a user program to execute code that resides in its data segment, because the code and data segments overlap (so the address space is the same for both of them). We checked this by exploiting the F00F bug in the Pentium.
The original code of Kaffe was quite messy and it relied much on the properties of gcc. Because there is no gcc for Plan9 and the C-compiler for Plan9 also has some restrictions, we had to clean up the code and make it closer to ANSI C. Because there is a huge amount of code, we made a tool called cupp to perform the cleaning.
cupp stands for 'clean-up preprocessor'. It preprocesses the C code by removing conditionals but leaving the define directives untouched. Using this tool it's easy to make different variations from Kaffe source code.
First, the Java API had to be ported to Plan9. It was necessary to port the packages java.lang, java.io and java.lang.reflect to Plan9 before porting the interpreter. We also ported java.net, because we wanted to use the network.
To be able to port the Java API to Plan9 we had to replace the system calls in the Linux version of Kaffe with the appropriate Plan9 system calls. This meant that we changed the #include-directives in the files so that we could include the correct system header files, cleaned up the code so that the Plan9 C-compiler accepted it and finally traversed the code and changed every system call to the corresponding one in Plan9.
Some system calls were similar in Plan9 and Linux, but many were not. We had to check the Plan9 programmer's manuals to see if a system call exists in Plan9 and if not, we had to find one that could be used to replace the Linux-call. Fortunately, there was a work-around for every system call in Kaffe (sometimes it took a while to find a work-around and implement it, but anyhow we could do it). If there had been calls that couldn't be ported, we would have been in trouble.
There was already a Java-interpreter, Javar, for Plan9, which was done using an older version of JDK. However, most of the methods were similar and we could use Javar as a good example when we fitted the code to Plan9. The method interfaces were a bit different in Kaffe and Javar, so we were not able to do a straightforward copy-and-paste, but Javar was a good help in many cases.
After replacing the system calls we tried to compile the classes in Plan9. The compiler reported hundreds of errors and warnings in total, but each of them was small and easy to fix, so it took only a few days to correct all of them.
After we had successfully compiled the code, we wrote small test programs to test each method. The programs just called a method with some input, and we checked that the output and functionality were correct. This kind of testing is not very extensive, but considering the huge number of methods it would have taken ages to test each method separately in an extensive way.
After testing the methods separately we wrote programs to test entire classes and libraries. These test programs used many methods in the class/library. However, the final testing of the methods was left to be done through the interpreter. We found only two bugs in the native methods when testing the interpreter.
After the interpreter was complete and working, we decided to test the native methods extensively through Java-programs to ensure that there were no bugs in this fundamental part of Kaffe. We wrote Java-programs that exercised the native methods thoroughly enough to test all of their functionality.
Because one can't use the native methods directly from Java-programs, we first had to find out which Java API methods had to be used and how to test the native methods. We did this by checking the API source code. Some methods can't be called from Java-programs explicitly. In such cases we used an indirect way of testing the method. Most of such methods were general methods that were used by all Java-programs, so there was no need to test them explicitly.
After figuring out which Java API methods had to be used, and in what order, to test the native methods extensively, we wrote small Java-programs to test the native methods. We compared the results of the programs to the Java API Specification and to the results of running the same programs on the IRIX release version of Kaffe. In some cases the native methods didn't behave as specified, but if they behaved the same way in the release version of Kaffe, we didn't fix them. We fixed only methods which had bugs not found in the release version of Kaffe.
In the Java-tests we found some features in Plan9 that restricted some features of Kaffe:
We also found one misfeature in Kaffe: The getFields0()-method of class Class doesn't return the public fields of a class as an array of type Field (as specified by the Java API specification). In Kaffe the result depends on the order the fields have been written in the class being tested.
Porting the interpreter required the Java Virtual Machine to be ported to Plan9. Because we based our work on the Linux Kaffe, all we had to do was to replace the Linux system calls with Plan9 system calls. However, this was not as easy as it may sound, because some parts of Linux and Plan9 were very different. We encountered many problems. In this chapter we briefly describe the biggest problems.
Problem | Description |
---|---|
Dividing by zero | Dividing by zero causes a notify in Plan9. One can't handle the exception very well, because one can't use file descriptors while handling the exception, and in Plan9 everything is file-based. A part of the zero-division problem was solved when we found an improperly ported macro left over from the interpreter porting phase, but there are still situations in which this exception cannot be handled well. These situations occur if there are page faults while running in the exception handler. |
Blocking I/O | In Plan9 one can't know beforehand whether an I/O operation is going to block. In UNIX this can be determined (see 'man select'). We handled blocking I/O with a thread pool: several threads wait for requests, and all the threads in the pool yield repeatedly. |
Time | In Plan9 one can't get the exact time in milliseconds. We built a crude kludge to get something that is close to the exact time. |
Notify | Not all notify-signals in Plan9 pend, so some nested-exception situations are very difficult to handle. Nested exceptions are currently handled with a notify trampoline based on a similar solution in APE. |
Bug in Plan9 malloc | Plan9 behaves strangely if too much memory has been allocated. The Plan9 malloc doesn't return any error code if it runs out of memory. One just has to make sure that one doesn't use too much memory. |
Floats | At first it seemed that Plan9 floats do not work as specified by IEEE / ANSI, or that the Plan9 C-compiler has bugs concerning floats: some common float operations raised an arithmetic exception. The problem disappeared when we found and fixed a macro definition that had not been properly ported to Plan9. |
Fork | Fork doesn't work in Plan9 as stated in the manuals. If two processes share the open files, Plan9 closes stdin, stdout and stderr when the child process dies, even though the manual states that this shouldn't happen. We tried to solve the I/O blocking problem using native threads, but this undocumented feature in fork made it impossible. |
Alloca | There is no alloca-function in Plan9 and it is impossible to implement it with the Plan9 C-compiler. We replaced calls to alloca with static allocation. |
Plan9 C-compiler (8c) | The compiler is quite rough and is poorly suited to compiling C code written for gcc. Kaffe relies very much on the special features of gcc. We have found some clear mis-features in 8c. E.g. a variable with the name of a function can't be a part of a structure, and some identifiers made with typedef don't work with structures. |
The principles of Plan9 notify-signals | A notify-signal can't be addressed to a specific handler, so one has to build a chain of handlers, and each handler then checks with strcmp whether it should handle the signal. This makes the operation quite slow. |
Plan9 manuals | Plan9 actually works differently than stated in the manuals in many cases, e.g. the problems with fork, pending signals and the behaviour of malloc when it runs out of memory. |
Mis-features in Plan9 | There are some serious problems with files in Plan9, even though Plan9 is based completely on a file-based approach (e.g. lack of 'select'-like system call). |
Lack of documentation of Plan9 | We had to find out many of the things by trial and error and there are still some small things that we have not been able to solve. Some things took many hours to solve (e.g. setting the netmask). |
Porting the just-in-time compiler required porting the part of the Java Virtual Machine that generates native code from Java bytecode. The problems we encountered far exceeded our expectations. The table below describes them.
It turned out that it was not possible to implement exception handling in Plan9 for the JIT-compiler. Therefore the JIT-compiler remained incomplete.
Problem | Description |
---|---|
Plan9 C-compiler (8c) | The biggest problem with the 8c compiler was the lack of documentation. How arguments and return values are passed on the stack had to be found out by examining the behaviour of the compiler by hand ("8c -S" gives the assembler code). |
Plan9 debugger (acid) | There are problems setting breakpoints in code that runs on the heap. The normal disassembler does not work on the heap, but we found a workaround. |
Stackdumps in exception handling | Stackdumps cannot be traced. Tracing fails both when an exception is thrown from native code and when it is thrown from Java code. This is because, when going backwards in the stack, it is impossible to know where the JIT opcodes end and the native calls begin. In the Linux version this is accomplished by iterating backwards through the stack until the frame pointer of the main function is found. |
Plan9 assembler (8a) and linker (8l) | Neither the assembler nor the linker seems to know all the opcodes of the i386 instruction set. Therefore some instructions must be encoded by hand as raw opcode bytes. E.g. MOVL 20(BP)(AX*4),AX (in Motorola notation), mov eax, [4*eax+ebp+20] (normal Intel notation): 8bh, 44h, 85h, 14h. |
Handling of long longs as return values | The main problem with long longs is that gcc and the Plan9 C compiler return the value differently. Gcc returns it in the register pair edx:eax, whereas 8c writes it to the address given by a pointer passed as an argument on the top of the stack. |
Plan9 C compiler handling of registers in function calls | The 8c compiler does not save the state of the registers (e.g. ebp, esi, edi, ebx) before making a function call so that the state could be restored after returning from the function. With gcc the callee saves these registers, while with 8c it is the caller's job to do it. |