代写一个精简版的interpreter,可以解析functions, loops, conditional statements,
arithmetic和recursion等。
General
For our final project we will write our own little toy programming language.
The language will have functions, loops, conditional statements, arithmetic
and will even support recursion. For Phase B of the project, we’re introducing
real program statements (well, I suppose “real” here is a relative term). In
addition to output and assignment statements from Phase A, we’ll have
conditional statements (if), loops (do) and functions (defun and call). We’ll
also introduce the concept of statement blocks, since loops, functions and
conditions have “bodies” consisting of a sequence of statements (i.e., a
block).
The Blip do statement
While if statements are conceptually easier than loops, it’s perhaps best to
start with the do construct. At the very least, the do statement is more
straightforward to parse than an if.
As you can see, the syntax consists of the do keyword followed by a loop
condition expression, followed by a block of statements followed by the od
keyword. As always in Blip, the spacing, indenting and new lines are all
irrelevant. Please note that the block of statements can consist of zero or
more statements. In Blip, a do loop with zero statements in the block is
always silly (an infinite loop with no output). However, technically that is a
legal loop.
The semantics of a do statement in Blip are essentially the same as a while
loop in C. First evaluate the expression. If the expression is true (i.e., any
integer value other than 0), then execute the block of statements. Then re-
evaluate the expression and continue repeating the block of statements until
the expression evaluates to false.
Parsing the do statement
Parsing do loops in Blip is not incredibly hard. Once you recognize the do
keyword, you know the next thing in the input will be an expression, so that’s
easy (about the same as an output statement in Blip). The challenge comes when
you start parsing the block of statements. Depending on how you wrote your
parser during Phase A, you may have no problems at all with the statement
block, but if your code is highly structured, you might find it a pain in the
neck to deal with the od. The issue is that by the time you realize that the
next token is an od, you’ve already read the token (duh). But, if the token is
not od, then that token is the first token in a regular statement, by which
time you may wish that you hadn’t already read it. To help with this
situation, the Input.cpp file now has a function “peek_next_token”, which
returns a C string representing the next token in the stream. To first
approximation, peek_next_token works a lot like the sequence
read_next_token(); next_token(). However, peek_next_token does not actually
remove the token from the input stream. So, after you have “peeked” at the
token, you still have to actually read the token to remove it. I used
peek_next_token a lot in my solution. Some of you will find it useful, some of
you will have no idea what it’s supposed to be for. If it doesn’t appear
really, really useful to you, then your code probably doesn’t benefit from it
- it will be obvious how to use this function if it’s useful to you. And if
it’s not useful, please don’t feel obligated to use it.
Delaying and Repeating execution
The big change from Phase A to Phase B is that when you implement Phase B
you’re forced to parse and save Blip statements. In Phase A, you could parse,
execute and throw away each statement as you executed it. Statements could
only be executed once, so there was no need to save them. However, in Phase B
we have loops that force us to repeatedly execute the same statement(s).
Obviously, the statements appear only once in the input stream even though
we’re executing them repeatedly in the loop. So, you’re forced to save those
statements somehow and then re-execute them later. We strongly recommend that
you use a parse tree to store your statements. In fact, we recommend that you
use a parse tree to store ALL your statements, even those statements that are
not inside a loop. Read/parse with one (recursive) function, and then execute
with another (recursive) function. If you used a parse tree in Phase A, you’re
probably in good shape to extend that parse tree for Phase B. Please note that
you will probably have two different types of parse trees: trees of statements
and trees of expressions. A do statement is a perfect example. The condition
for the do is an expression. If you have an expression parse tree, that’s a
great way to store that expression. The body of the loop is a block (hint, a
vector) of statements. Each of those statements can be any Blip statement,
including do or if statements! So, a vector of statement parse trees is the
right way to store the body of a do loop.
If statements
Once you’ve figured out how to do a loop (no pun intended), if statements are
pretty easy.
Where, as before, expr is an expression and there is a block of statements in
between the start of the if and the end of the if. Note that the end of the if
is marked with “fi” (if spelled backwards). The semantics of a basic if
statement is the same as in C. If expr evaluates to any integer other than
zero, then all the statements in the body are executed. If expr evaluates to
0, then none of the statements in the body are executed.
The else statement
Just as in C, if statements can have an optional else attached.
The semantics of an if-then-else in Blip are just like those in C. If expr
evaluates to true then execute all the statements inside the first statement
block, but none of the statements in the else block. If expr evaluates to
false, then evaluate none of the statements in the first block, but evaluate
all the statements in the else block. Please keep in mind that the else
construct will not appear on every if statement.
And that’s all there is for Project 8
The rest of this document describes additions to Blip. Please note that the
effort implied by the requirements below (i.e., how much time it takes to
finish everything) is outside the bounds of what’s reasonable for a
programming project. So, it really is extra credit, and I would not expect
very many (if any) students to undertake the entire project. With that caveat,
we will test the following requirements, including the quality of your
solution (e.g., testing for memory leaks) and assign up to five additional
extra credit points for extraordinarily good solutions.
Functions (oh my)
The complicated part of Phase B starts with functions. Adding functions to
Blip means adding three new commands, two statements and one expression type.
But the hard part of functions is implementing their semantics. We suggested
you approach functions as follows.
- Implement basic functions with no parameters. These functions, when called, always run to the end of their function bodies - i.e., the return statement will always be the last line in the function.
- Implement functions with parameters. Note that parameters require that you have a new “local” scope for your variables, so if there’s a variable “x” defined and then there’s a parameter “x”, the parameter and the original variable are different. We’ll approximate the rules that C uses for variable scoping. Once you implement parameters, implementing local variables are relatively easy.
- Implement return statements that happen in the middle of your function. A return statement can appear anywhere inside a function (or even anywhere inside the main program), including inside loops or if statements. When the return statement is executed, the function immediately stops executing and control returns to the calling function.
You can actually swap the order you implement #2 and #3 above, depending on
how you’re feeling. I just strongly recommend that you do #1 to completion,
and then pick either #2 or #3 to work on next. Don’t try to do 2 and 3
simultaneously - build the project incrementally!
Basic Functions and defun/nufed
A function can be defined anywhere, at any time. In fact, we can nest
functions in functions in Blip (you can’t do that in C). When you define a
function (with the defun keyword), you create a list of statements (the
function body) that are not executed yet, but will be executed when the
function is called (with the call expression).
In this example, a new function called “sayHello” is created. The function has
zero parameters (there is nothing between the params and smarap markers), and
one statement (there is one statement between the parameter list and the nufed
marker). When this function is parsed, you should build a parse tree for it.
As a hint, since my implementation always parses and then executes statements,
my defun statements are also both parsed and executed. However, “executing” a
defun statement does not execute the body. Rather executing a defun statement
simply adds the function to the symbol table once my defun has been executed,
the symbol “sayHello” is bound to the function. There are many other ways to
implement the semantics, but however you do it, you must not execute the body
of the function until it is actually called.
Call expressions
The only way to invoke a function in Blip is to use a call expression. This is
actually similar to what C does. However, in Blip, every function has a return
value. If you don’t include a return statement in your Blip function, then the
value 0 must be returned by that function (when it ends). Most importantly,
the return value cannot be ignored by the caller - C allows the function
return values to be ignored. So, calling the sayHello function requires that
we do something with the returned value (zero for that example since there’s
no explicit return statement).
Here I’ve used a var statement. I have a bogus variable “not_used” and I’m
initializing this variable to the value returned by sayHello. I use this
technique extensively in my Blip programs to simulate “void” functions.
Anyway, since Blip functions always return a value and since that value can
never be ignored, our task to execute them is relatively easy. We have a new
expression type “call”. The call expression invokes the function using the
arguments from the argument list and then evaluates to the value returned by
the function.
Params and variable scoping
Blip functions can have an arbitrary number of parameters. The parameter list
is marked with params/smarap in the defun statement. The corresponding set of
arguments is marked with args/sgra. Just like in C, the first argument is
bound to the first parameter, the second argument is bound to the second
parameter and so forth. The first step in implementing parameters is to modify
your call expression so that, when it is executed, each of the arguments is
evaluated and the value is assigned to a variable with the name of the
parameter.
There are many possible implementations of variable scoping. You’re free to
choose any implementation strategy that you would like. However, my personal
favorite involves using multiple symbol tables. When the parameter x comes
into existence, it is placed into its own symbol table - the local variables
for function fun. When the function returns, its local-variable symbol table
is completely erased. However you decide to implement local variables, the
correct behavior in Blip is defined as follows
- Global statements (statements that are outside of any function) have access only to variables that are also outside of any function. Variables defined outside of any function are known as “global variables”
- Statements inside a function have access to both local variables and global variables. Local variables are variables defined with var statements that appear within the function’s body, or parameters that appear within the statement’s parameter list. In Blip, parameters and local variables are the same scope. Also, new scopes are only created for functions.
- If a variable appears in an expression inside a function body, and that variable is defined as a local variable, then the value of the local variable must be used for the expression. It does not matter if the variable is also a global variable.
- If a variable is used as the target of a var statement inside a function, then a new “local” variable is being created. If there is a global variable with the same name, then the new local variable hides the global. If there is a parameter with the same name, or if there is a previously executed var statement in the same function with the same name, then a warning is generated (same as for Phase A warnings).
- Note that parameters and locals are considered the same scope.
- Note that only functions create new scopes - if and do blocks do not create new scopes (this is different from C).
- If a variable is used as the target of a set statement, and the variable has previously been set by an executed var statement from the same function, then the local variable is used, even if a global variable exists by the same name.
- If a variable is used as the target of a set statement, and the function has not previously executed a var statement with the same name, then the set statement assigns to the global variable. If there is no global variable, then a warning is generated (same as described for Phase A).
- If a function is nested inside other functions, then statements and expressions within that function can access only the local variables for that specific function or globals. A function cannot access local variables defined by an outer function.
Early return statements
Blip function can have multiple return statements. If there are zero return
statements, then the function returns the value 0. Otherwise, the first return
statement that is executed supplies the returned value. Additionally, as soon
as a function executes a return statement, the function must immediately stop
executing and return control to its caller. To make things consistent, return
statements can also appear in the “main program” (i.e., the Blip statements
that appear outside of any functions constitute the main program). The first
of these return statements that are executed forces the Blip program to
immediately stop.
How you go about implementing the return statement depends entirely on how you
structured your program - how you execute statements and functions. I have no
specific advice. There are no fancy data structures or complex semantics to
worry about. Please do keep in mind that we prefer clean and well-designed
code over sloppy code and that your style is part of your grade for this
project.