The last two weeks of working on the LIR assembler provided one of the most challenging pieces of work I have done yet. Multiple fragment support in LIRasm was implemented almost completely, and landed in the tracemonkey repo. I got to read the current source code, understand how it works, modify it to support multiple fragments, battle with segfaults and memory leaks, and do a whole lot of implementation and re-implementation before finally getting an acceptable version. It would be correct to say that the hardest things bear the sweetest fruits in the end. I am really happy that I finished this part of the project.
Undoubtedly, I got great support from my project mentor, jorendorff, humph, and graydon. Without their support, guidance and everlasting patience, I would never have been able to come this far. Although the assembler is far from finished, it has been a great journey, with its ups and downs, and we have managed to pull through. I really respect this community and the people I have been working with, and I just can't see my engagement with it ending.
In the process, I fear I have disappointed people and made some stupid mistakes. In particular, I think the last post on my blog sent the wrong message to many people, and I would like to clarify whatever I believe was uncalled for on my part. I think I sounded disappointed and possibly hateful (I am not finding the right word here) in the overall tone of the post. It reflected how I felt at the time, since I had worked hard on the project and most of my work was not going to be used in the new world. I think it was just a manifestation of the feelings associated with the unfortunate event of code duplication. Everyone loves their work.
Anyway, I am very grateful to Mozilla Education for giving me this opportunity to work on a real project. I think it's a great gesture, and I will definitely publicise it in my college so that more juniors can benefit from it.
Tuesday, June 30, 2009
Monday, June 15, 2009
[LIR Compiler/Assembler?] Current Status
We hit a roadblock last week when it came to our notice that another version of the LIR Assembler (https://bugzilla.mozilla.org/show_bug.cgi?id=484142) was being developed by Graydon Hoare (graydon), using only hand-coded C/C++. The whole of last week went into deciding what should be done with the two working versions, which seemed to be about equally far along.
The major problem was in merging the two forks - we were using a parser generator based approach, and graydon was using hand-coded C++. It is far from trivial to merge the two, and hence, one version had to go. We spent one day going over the pros and cons of either approach. The summary of our discussion follows -
Graydon argued that parser generators lead to 'mucky' .y files, add a tool dependency, and are difficult to land.
Our case was that the .y files lead to much cleaner, better organised code, separating the logic from the implementation to a certain extent. That approach involved significantly fewer lines of code, and would also be easy to extend or modify in the future if additional syntax requirements appear.
As things turned out, we were deadlocked, and the decision to choose one particular fork lay upon me.
I went with graydon's version.
graydon's version was actually going to be landed in the tracemonkey repo, and this was one of the reasons I chose his version. The remainder of the week was spent making lirasm (as our new LIR assembler is now called) appropriate for the source tree, and then landing it in the repository.
During this time, I read up on Mercurial Queues (MQ), and learnt about the Bugzilla way of doing things (using patches, filing bugs) compared to the BitBucket approach we used earlier.
Things will probably get back on track from today, and we have a bug to work on (https://bugzilla.mozilla.org/show_bug.cgi?id=497991).
The main setback of the whole duplication situation was time. When it occurred, things were going very - I mean very - smoothly, and it really seemed most of the work for the LIR Compiler would be finished in two weeks' time. I will get a better idea today of how much time this is going to take now.
Tuesday, May 26, 2009
How to use Loads and Stores in LIR
Well, if you read the Wiki on LIR, and if you aren't previously familiar with LIR, then you might not have gotten the hang of pointers.
Pointers - They are like integers in that they contain addresses, and can be used anywhere an integer can (LIR has only one type).
What is different is that you cannot create a pointer (that is, get a valid memory address) by yourself, unless you are extremely lucky!
That's where the alloc instruction comes in.
Alloc - The wiki will give you the more technical information. Alloc is basically like calloc() in C - it allocates a specified amount of space for your later use, and returns a pointer to it. Hence the instruction format is:
p = alloc size
Illustration
I will illustrate this with an example straight from jorendorff's desk -
addr = alloc 4;
zero = int 0;
st addr[0] = zero ... (that's roughly how it's intended to be used)
That stores the value 0 at the memory address specified by the addr pointer. Now you can load that value into a variable, store another value, etc.
Going one step further, if you allocate 8 bytes of memory
addr = alloc 8;
zero = int 0;
one = int 1;
st addr[0] = zero;
st addr[4] = one;
Note the offset 4 in the second store. The offset is the number of bytes from the base, as expected. Now you can load the two values into separate variables, or mess things up by specifying an offset not divisible by 4!
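For readers more at home in C, here is a rough analogy of the 8-byte example, taking the calloc() comparison at face value. It is only a sketch: store32, load32 and demo_two_stores are hypothetical helper names made up for this illustration, not part of LIR or of any assembler, and it assumes each stored value is a 4-byte integer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, loosely mirroring `st addr[offset] = value`:
   copies a 32-bit value to `offset` bytes past `base`. */
static void store32(unsigned char *base, size_t size, size_t offset, int32_t value)
{
    assert(offset + sizeof value <= size);   /* stay inside the allocation */
    memcpy(base + offset, &value, sizeof value);
}

/* Hypothetical helper: reads a 32-bit value back from a byte offset. */
static int32_t load32(const unsigned char *base, size_t size, size_t offset)
{
    int32_t value;
    assert(offset + sizeof value <= size);
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Mirrors the 8-byte example: two stores at offsets 0 and 4, then two loads.
   Returns 1 on success, 0 if the allocation failed. */
static int demo_two_stores(int32_t *first, int32_t *second)
{
    unsigned char *addr = calloc(8, 1);      /* addr = alloc 8 */
    if (addr == NULL)
        return 0;
    store32(addr, 8, 0, 0);                  /* st addr[0] = zero */
    store32(addr, 8, 4, 1);                  /* st addr[4] = one  */
    *first = load32(addr, 8, 0);
    *second = load32(addr, 8, 4);
    free(addr);
    return 1;
}
```

In this sketch, a store at an offset such as 2 would overlap both 4-byte slots, which is the C-side picture of "messing things up" with an offset not divisible by 4.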
Having implemented basic loads and stores today, and consolidated my code which looks much more readable now, I am hoping to get information on:
1. Jumps and labels, and how to handle those (are labels solely the responsibility of the parser?).
2. Subroutine calls.
3. Guards
I will go about them in the above order.
Sunday, May 24, 2009
[LIR Compiler]Progress and Issues
I have added support for a number of instructions (what is left is mostly floating point, guards, and 64-bit instructions). The project feels pretty robust currently.
I need to add code for reporting errors in a more informative manner (currently, I use the trivial "syntax error at line number ...." format). I will need to look into YYLOC and the related token locating parameters.
Also, I need information on how and what to display as output. Currently, I am using the program here as the base program, and modifying it to suit my needs.
Also, I am not aware of how loads and stores work. For instance, what addresses are we providing to the compiler? If we store an integer using store, where is it actually getting stored? Are we directly addressing memory words, or is it stored in some LIns * pointer like other data is stored?
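On the error-reporting item above, one possible shape for a more informative message is sketched below in plain C. It assumes Bison's location tracking is used, whose default YYLTYPE carries first_line, first_column, last_line and last_column. SrcLoc and format_error are hypothetical names invented for this sketch, and the message layout is just one option, not what the project settled on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in with the same four fields as Bison's default YYLTYPE. */
typedef struct {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
} SrcLoc;

/* Formats "line:col-col: msg" when the span sits on one line, otherwise
   "line:col-line:col: msg". Returns whatever snprintf returns. */
static int format_error(char *buf, size_t size, const SrcLoc *loc, const char *msg)
{
    assert(buf != NULL && loc != NULL && msg != NULL);
    if (loc->first_line == loc->last_line)
        return snprintf(buf, size, "%d:%d-%d: %s",
                        loc->first_line, loc->first_column,
                        loc->last_column, msg);
    return snprintf(buf, size, "%d:%d-%d:%d: %s",
                    loc->first_line, loc->first_column,
                    loc->last_line, loc->last_column, msg);
}
```

A parser's error hook could pass its current location and message to a function like this and print the resulting buffer, so the user sees the column range as well as the line.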
Wednesday, April 22, 2009
[LIR Compiler]Bison Works, Finally
I finally got Bison to work for the first time for LIR Compiler.
The error was in the makefile itself - a statement that, though incorrect, escaped my notice since it worked for the earlier version using only flex.
What I was doing for making main.o was -
$(CC) $(CFLAGS) $^ -o $@
where the dependencies were a .cc file, and a library.
By a strange coincidence, this makefile worked fine with my earlier attempts using only flex. With bison also included, things got a little messier, since I had to include the token header tok.h in the dependencies for main.cc. The above line broke down in that case: $^ expands to every dependency, so the header was passed to the compiler as an input too.
The remedy was to replace this line with what should have been there in the first place -
$(CC) $(CFLAGS) -c main.cc -o main.o
and then, include the other libraries in the linking phase when all the .o files were linked.
The makefile and compilation process is starting to make more sense now, although I still don't know what lies inside the .a files.
Once the build is working fine, what remains is the task of building the parser, which is the fun part. So I should be making good progress with the project in the coming days.
Friday, April 17, 2009
Updates
I have been trying to get Bison to work along with flex using the basic 4 line snippet that was used initially, but I haven't had much success. I haven't been able to devote much time to the project: with the end of the semester approaching, all the curriculum projects are in a rush, and a lot of practical vivas and presentations are due. I will try to find some time and write some code this weekend, but the coming month is going to be very hectic.
Saturday, April 11, 2009
Updates - LIR Compiler
I am currently reading up on Bison. I have today and tomorrow off, so I can get a lot of reading done in these two days, and hopefully I will then have sufficient skills to write the parser.
I am really enjoying reading about Bison - the way it parses is very natural yet error free.