Compiler代写:COMP2932CompilerProject


代写编译器,用于编译JACK语言。

The Brief

In this project, you will develop a compiler for the JACK programming language
described in our textbook. This involves implementing the lexical analyser,
the syntactic analyser, the symbol table, the semantic analyser, and the code
generation phases of the compiler. The target machine for the compiler is the
virtual machine described in. The compiler should produce VM code that
successfully run on the VM emulator provided with the textbook (see this link
https://www.nand2tetris.org/software
. The specifications of both the JACK language and the target VM code are
clearly laid out in. For further information, resources, software tools and
many other useful things see this site https://www.nand2tetris.org/ .

Compiler Details

The JACK source files

A JACK program is a collection of one or more classes. Each class is defined
in its own file. JACK source files (you can call them class files) must have
the .jack extension. All the source files of a single JACK program should be
stored in the same directory.
You will develop a working compiler that accepts as input a directory
containing one or more JACK source files. For each source file the compiler
should produce an equivalent VM code file having the same name as the source
file but with the .vm extension (vm=virtual machine). The target code files
should be created in the same directory as the source files. After
compilation, the directory will contain the .jack source files and the
corresponding .vm files. To run the program using the provided VM emulator,
copy the provided OS (library) files into the same program directory, then
load the whole thing into the emulator.

Invoking the compiler

It should be possible to invoke your compiler at the command line. The
compiler should accept one command line argument representing the name of the
directory that contains the JACK source files (the JACK program). For example,
if your compiler’s executable file is called myjc and the JACK program
directory is called myprog, the compiler would be invoked by typing this at
the command line:
myjc myprog

Error Reporting

At any phase of the compilation process, if an error is encountered, the
compiler should print out an informative error message, citing the line number
where the error was encountered. The message should also cite the token near
which the error occurred. Here is an example of a typical error message:
Error, line 103, close to “;”, an identifier is expected at this position
Uninformative generic messages such as “syntax error” must be avoided. Upon
encountering an error, the compiler should report this error and immediately
stop (yes, it is a short-tempered compiler!). The compiler is not required to
attempt error recovery or report more than one error at a time.

The Lexical Analyser

You will write a lexical analyser (lexer) module that reads a JACK source file
(having a .jack extension) and extracts all the tokens from this file. The
lexer should reveal itself to other modules (i.e. the parser) through two main
functions:

  • Token GetNextToken(). Whenever this function is called it will return the next available token from the input stream, and the token is removed from the input (i.e. consumed).
  • Token PeekNextToken(). When this function is called it will return the next available token in the input stream, but the token is not consumed (i.e. it will stay in the input). So, the next time the parser calls GetNextToken, or PeekNextToken, it gets this same token.
    The lexer should successfully remove white space and comments from the input
    file and correctly extract all the tokens of the source code. It should not
    crash or become unstable if the source file contains any kind of lexical
    errors. One particular type of error to watch for is the unexpected end of
    file while scanning a string literal, or a multi-line comment.
    Once you have finished developing the lexer, you must test its operation using
    a set of JACK source files which will be provided in due course.

The Grammar

The grammar of the JACK language is given in. However, we will write another
version of the grammar. The main purpose of writing a new version is to
account for operator precedence in arithmetic expressions which is not
currently accounted for in the original grammar described in.

The Parser

You will implement a recursive descent parser. A recursive descent parser is a
collection of recursive functions, and can be very easily developed from the
grammar of the language. Once you have finished implementing the parser it
should be thoroughly tested using the set of JACK source files. All kinds of
syntax errors should be reported properly. However, as stated earlier, the
compiler should stop on encountering the first error in a source file. It
should not crash or become unstable when a syntax error is encountered.

The Symbol Table

You will implement a symbol table to store all program identifiers and their
properties. A symbol in this context is any identifier defined in the program
such as a variable or method name. We will explain symbol tables and their
implementation in due course.

The Semantic Analyser

You will implement a semantic analyser to find and report possible semantic
errors in the JACK program. However, this analyser will not be a standalone
module. You will insert semantic analysis statements at suitable points in the
parser functions. Once the semantic analyser is complete, it should be
thoroughly tested.

Code Generation

You will implement code generation in your compiler. However, this will not
require a separate module. Code generation statements will be inserted to the
parser functions at appropriate points.

Implementation and Planning

You can write your compiler in any of the following languages:

  • C
  • C++
  • Java
  • Python
    You are allowed to use any standard library that comes with the programming
    language. But you are not allowed to use libraries that can perform entire
    compilation tasks (e.g. tokenization or parsing). You are also not allowed to
    use complier generator tools. You must write your compiler from scratch
    following the methods and guidelines explained in this module.
    You will adhere to the specifications of the JACK language and the virtual
    machine code described in. However, you will not follow the development
    guidelines or the plan of the book. Instead, you will follow the plan detailed
    in the Teaching and Project Plan spreadsheet handed out at the outset of this
    module. A copy of this spreadsheet is available on Minerva. It is very
    important to keep to the plan and make sure that you reach each milestone of
    the compiler project at the specified date. Each milestone represents the
    completion of one phase of the compiler. Except for the final project
    submission deadline, there are three major milestones, corresponding to the
    lexical, syntactic, and semantic analysis phases respectively.

Submissions and Version Control

You must create a GitLab repository for your project and give me full access
to it. My GitLab username is drsalka. You must regularly push commits to the
repository particularly at project milestones.
At each milestone date, you must submit your current compiler code base
(without executables) to Minerva with a brief description (200 - 300 words) of
what you have achieved so far. It does not matter if your code at a certain
milestone still contains some bugs or errors, as long as the main
functionality required at this stage has been achieved. 10 marks have been
assigned for achieving the milestones in due time, as detailed in the marking
scheme below.
You may not be able to work ahead of the plan, mainly because we will be
explaining the techniques and the implementation guidelines as we proceed with
the teaching in lectures and tutorials.
Before submitting your compiler, you must make sure that you can compile your
compiler on our school’s Linux machines (e.g. DEC-10 lab machines). You must
also thoroughly test your compiler on these machines. Submissions that do not
compile and run on our Linux machines will score a zero mark, even if they
work perfectly well on another platform. So, if you develop your compiler on a
Windows or Mac PC, you must not submit your project until you have made sure
that it does compile and run on our Linux computers without any problems.
The final project code base should be committed to GitLab and submitted to
Minerva. Before submitting to Minerva, clean your project directory of any
object or executable files, compress it using ZIP, and upload to Minerva.
You are also required to write and submit a brief report (5-10 pages) at the
end of the project. A template for the report will be provided in due course.

Marking Scheme

  • The lexical analysis phase (the lexer) works correctly
  • The syntactic analysis phase (the parser) works correctly
  • The semantic analysis phase (including the symbol table) works correctly
  • The code generation phase works correctly
  • The compiler code is well structured, efficient, clear, and properly commented Adherence to project milestones (the plan):
    • First milestone, the lexer
    • Second milestone, the parser
    • Third milestone, the symbol table
  • Final project report

文章作者: SafePoker
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 SafePoker !
  目录