Jupyter Kernel for P0: Documentation and Examples¶

[Linas Aleknevicius, Kenneth Mak]

This project allows a user to enter any P0 code into a cell within a Jupyer Notebook and run it without the need of external libraries such as WABT installed on the system. To do this, we started with the base parser for the P0 language provided by this course. We then modified it to compile a runnable Python file instead of WebAssembly. The kernel will take the compiled python file, run it, and output the result to the user below the cell.

Installation and Setup¶

  1. Clone the repository and ```cd``` into the directory
    
  2. Make sure you are outside of the p0 directory, and run jupyter kernelspec install --user p0. You can run jupyter kernelspec list to check the list of installed kernels
  3. cd into the p0 directory, and run jupyter notebook
  4. You can now select the p0 kernel from the dropdown

The files within this project are:

  • p0kernel.py: The kernel itself.
  • kernel.json: Attributes required for the kernel to run.
  • convert.py: Helper file to run the compilation and conversion.
  • CGpy.py: Code Generator for Python. Interprets lines from P0 and outputs the appropriate command in Python. Based off CGwat.py, CGmips.py, and CGast.py in the course.
  • CGpy Tests.ipynb: A notebook dedicated to testing if CGpy.py generates the correct program in Python
  • program.py: The interpreted Python file compiled by convert.py for kernel output.
  • P0.py, SC.py, ST.py: Parser files, taken directly from the course notes.
  • CGwat.py, CGast.py: Code Gen files for other languages, taken directly from the course notes.

This notebook contains programs relevant to the explanations. For a complete list of more detailed programs, refer to the CGpy Tests notebook.

The P0 Language in Jupyter¶

The Current Method for P0¶

  1. Write the entire P0 program as a string
  2. Compile the string into a program using one of the available Code Generators
    • wat, mips, ast
  3. Run the newly generated program and capture the output
compileString("""
type A = [2 .. 9] → integer
var a: A
procedure (r: A) q()
  r[2] := 7
program p
  writeln() //a.q()
"""), 'assign.wat', target = 'wat')

!wat2wasm assign.wat

runpywasm('assign.wasm')

The Goal¶

Our goal is to allow the running of P0 code directly within the cell. This is accomplished by replacing the default Python kernel in Jupyter with a P0 kernel that would compile and output the P0 code inline with the expected behaviour.

A Quick Overview of Jupyter Kernels¶

When receiving input from a code cell in Jupyter, the function that parses and returns the appropriate response is the do_execute function.

The parameter code represents the code to be executed (i.e. the contents of the cell). When the kernel is ready to send back a response to the user, it constructs a stream message type, with the appropriate message inside the content. This is then passed through to the kernel messaging protocol to output to the user

def do_execute(self, code, silent=False, store_history=True, user_expressions=None, allow_stdin=True):

        if not silent:
            stream_content = {'name': 'stdout', 'text': code}
            self.send_response(self.iopub_socket, 'stream', stream_content)

        return {'status': 'ok',
                'execution_count': self.execution_count,
                'payload': [],
                'user_expressions': {},
               }

In this case, do_execute receives the code code and simply transmit it back, by storing it in stream_content and sending it back with send_response.

The value of stream_content['text'] is what the user will receive.

stream_content = {'name': 'stdout', 'text': 'hello world'}

For example, this would make the kernel reply to all code cells with 'hello world'.

Development History¶

We worked on two separate solutions for developing a Jupyter Kernel for P0. The first method was a quick combination of the a P0 kernel with the course's method on running P0 code via Web Assembly.

The second method is a development of a new Code Generator that would target Python. We chose the Python language because it is embedded within Jupyter, reducing the requirement for additional tools to run it.

The Naive Method - P0 Code Cell into WebAssembly¶

The first method involved simply feeding the code input into the pre-established process and sending the output to the kernel.

In this case, code written in the cell would be automatically run through P0.compileString when the cell is run. The resulting wat file would then be converted with wat2wasm, and the .wasm file would be run with runpywasm.

This entire procedure is the same as what we encountered in the course notes and assignments, except the compilation, conversion, and running of the WebAssembly file would occur behind the scenes when the cell is ran.

The function that takes in contents of the cell and returns the output after compilation, conversion, and running.

def output(code):
    P0.compileString(input, "program.wat", 'wat')   # Compiling text from cell
    os.system('wat2wasm program.wat')               # Converting generated file to binary
    return runpywasm('program.wasm')                # Running the .wasm file; return value is the output of the file

The function that runs the wasm file and captures the output. This is the same code encountered in our course notes.

def runpywasm(wasmfile):
    str = io.StringIO()
    def write(s, i): print(i)
    def writeln(s): print('\n')
    def read(s): return int(input())
    with contextlib.redirect_stdout(str):
        pywasm.load(wasmfile, {'P0lib: {'write': write, 'writeln': writeln, 'read': read}})
    output = str.getvalue()
    return output

Scrapped on April 1¶

The method was scrapped because it relies on an external tool of WebAssembly.

Preferably this project should use the Jupyter protocols directly and rely on other tools as little as possible.

This method used WebAssembly, and libraries such as pywasm, or the command wat2wasm had to be present on the machine prior to execution. This meant that this implementation could not function by itself, and we realized that we should focus on using as little outside tools as possible.

Our Solution - A Python Code Generator¶

The second method was the development of a new Code Generator for Python. This would translate the P0 code into Python code, and we could then run this generated python program and feed the results back to the kernel.

By translating the P0 code into a Python file, we will then be able to run and return the outputs of this program without the need for external tools, preventing this kernel from having too many additional dependencies.

P0 Kernel - p0kernel.py¶

def do_execute(self, code, silent=False, store_history=True, user_expressions=None,
                   allow_stdin=True):
        if not silent:
            try:
                compile(code, "program.py", 'py')     # compiles program into python file
                response = output("program.py")       # runs and stores output from python file with given inputs
            except Exception as e:
                response = "P0 Parse Error: " + str(e)

            stream_content = {'name': 'stdout', 'text': str(response)}
            self.send_response(self.iopub_socket, 'stream', stream_content)

P0 Kernel - convert.py¶

The file that handles the conversion process of the string P0 programs into Python. Used by p0kernel.py.

# Compiles the P0 code into a file with the given name and the target language
def compile(code, name, type = 'py'):
    P0.compileString(code, name, type)

# Runs the given python file and returns the output.
def output(filename):
    fileout = runpyfile(filename)
    return fileout

# Runs a python file with the given arguments while capturing the stdout and stderr values
def runpyfile(pyfile):
    process = subprocess.run('py ' + pyfile, capture_output=True, text=True)
    output = str(process.stdout) + "\n" + str(process.stderr)
    return output

P0.py¶

The function compileString was modified to allow targeting to the Python Code Generator.

def compileString(src, dstfn = None, target = 'py'):
    global CG
    if target == 'py': import CGpy as CG         # <- new Code Generator
    elif target == 'wat': import CGwat as CG
    elif target == 'mips': import CGmips as CG
    elif target == 'ast': import CGast as CG
    else: print('unknown target'); return
    try:
        SC.init(src); ST.init(); p = program()
        if dstfn == None: return str(p)
        else: with open(dstfn, 'w') as f: f.write(p)
        print("Compiled " + str(dstfn) + '\n')
    except Exception as msg:
        raise Exception(str(msg))

SC.py¶

There was a bug that did not allow the letter n to be used as a variable of a parameter. To fix this, the function getSym was slightly modified when n is detected:

if ch == 'n':
            getChar(); 
            if ch == 'o': 
                getChar(); 
                if ch == 't': 
                    getChar(); 
                    if ch == ' ': sym = NOT
                    else: identKW() 
                else: identKW()
            else: identKW()

How CGpy Was Tested¶

  1. We ran the P0 program using the new CGpy, and again with CGwat. We then compared the output of both the Python and the WebAssembly result.
  2. We then reviewed the generated Python code to ensure proper structure and commands.

Some examples can be found in this notebook. For more in-depth testing, visit the 'CGpy Tests.ipynb' notebook.

Here is a sample comparison made between a WebAssembly result and our Python equivalent within a Jupyter Notebook:

result-2.png

Furthermore, here is a sample comparison made between a WebAssembly compilation and our Python equivalent:

(module
(import "P0lib" "write" (func $write (param i32)))
(import "P0lib" "writeln" (func $writeln))
(import "P0lib" "read" (func $read (result i32)))
(global $_memsize (mut i32) i32.const 0)
(func $program
(local $a i32)
(local $b i32)
(local $i i32)
(local $0 i32)
i32.const 1
i32.const 3
local.set $0
i32.const 1
local.get $0
i32.shl
i32.const 5
local.set $0
i32.const 1
local.get $0
i32.shl
i32.or
i32.const 7
local.set $0
i32.const 1
local.get $0
i32.shl
i32.or
local.set $a
local.set $i
local.get $i
local.set $0
i32.const 1
local.get $0
i32.shl
local.get $a
i32.and
i32.eqz
if
i32.const 7
local.set $0
i32.const 1
local.get $0
i32.shl
i32.const 9
local.set $0
i32.const 1
local.get $0
i32.shl
i32.or
i32.const 0xffffffff
i32.xor
local.get $a
i32.and
local.set $a
end
loop
local.get $i
i32.const 12
i32.lt_s
if
local.get $i
local.set $0
i32.const 1
local.get $0
i32.shl
local.get $a
i32.and
if
local.get $i
call $write
end
local.get $i
i32.const 1
i32.add
local.set $i
br 1
end
end
)
(memory 1)
(start $program)
)

uSet = set(list(range(-10000, 10000)))
def bitsets():
    global uSet
    a = None
    b = None
    i = None
    i = 1
    a = ({3, 5, 7})
    if (not ((i) in a)):
        a = (a.intersection((uSet - ({9, 7}))))
    while (i < 12):
        if ((i) in a):
            print(i)
        i = (i+1)

bitsets()

CGpy - Code Generator for Python¶

Overall, we tried to include the same functionality as the original P0 parser. This means:

  • Procedures and programs (with parameters)
  • Returning specified values from the parameters
  • Calling procedures and programs within the body with support for passing values
  • Global or local variable and constant declaration and assignment
  • Getting user input and as well as printing output
  • Support for different types such as Arrays, Sets, Records, Integers, and Booleans
  • Support for Unary and Binary operations
  • If and While Loop constructs with support for multiple conditions

We also added support for automatic indentation according to Python standards. In order to accomplish this, we had to account for fundamental differences between WebAssembly and Python, where one language has a binary instruction format on a stack, and the other is a traditional object-oriented language. However, we kept the same methodology of pushing and popping instructions from an array. Often this meant we had to merge multiple instructions into a single line.

CGpy - Programs and Procedures with Parameters¶

Indents in Python¶

Python, like P0, uses indentation as part of its flow control. We define a function that meant to control and shift indentation.

# Modifies the current indent by the selected value
def moveIndent(val): 
    global indent 
    indent += (val*4)

P0 Programs in Python¶

When defining a program, we save the name of the program to be used later to run the function. We then also increment the curlev, current level of our variables, and increase the indentation by 1.

# Entering program name 'ident'.
def genProgEntry(ident):
    global curlev, programName
    programName = ident
    asm.append('def ' + ident + '():')  # Defining the function; i.e. def arithmitic():

    curlev = curlev + 1
    moveIndent(1) 
    declareGlobalVars()                 # Appends a line of global variables for access

When a program ends, we reduce the indentation by 1 for future lines, and reduce the level. We also add a call to programName to call this program in the python file.

# Ending program; returning all lines of code in asm with \n appended
def genProgExit(x):
    global curlev, programName
    curlev = curlev - 1                     
    moveIndent(-1)                         # reduce indents

    asm.append ('\n' + programName + '()') # Call this program at the end
    return '\n'.join(l for l in asm)       # Flattens asm stack into a string

P0 Procedures in Python¶

Procedures are treated similarly to Programs. We simply increment the curlev - the current level of our variables - write down the declaration, and increase the indentation by 1. Finally we also declare the global vars as a line so the function has access to them.

# Starting procedure name 'ident'; declaration, moving indents, and declaring global variables
def genProcStart(ident, fp, rp):
    global curlev 
    curlev = curlev + 1
    if rp: asm.append(' '*indent + 'def ' + ident + '(' + ','.join(e.name for e in rp) + '):')
    moveIndent(1)
    declareGlobalVars()
    return rp

# Entering procedure 'ident'; occurs after local vars are loaded, so nothing is done here
def genProcEntry(ident, para, local):
    pass

P0 defines the returned values next to the parameters in a procedure, whereas Python has an explicit return statement at the very end of function to return the values. To support returning values, the genProcExit function is modified to append a return statement, along with the parameters that were originally passed:

# Exiting procedure; return any parameters as needed, and reduce indent
def genProcExit(x, para, local):
    global curlev
    curlev = curlev - 1
    if para: asm.append(' '*indent + 'return (' + ','.join(e.name for e in para) + ')')
    moveIndent(-1)
In [20]:
procedure useful(x, y: integer) → (q, r: integer)
    q, r := x+5, y+8

program veryuseful
    var a, b, c, d: integer
    a := 1
    b := 2
    c, d ← useful(a, b)
    write(c)
    write(d)
6
10

In [17]:
program useful
    write(1)
1

CGpy - Constants, Variables, and Types¶

Constants¶

Constants are handled by the genConst function, where each one is handled in the same fashion as the original implementation of CGwat.py.

def genConst(x):
    x.lev = None
    return x

Constants refer to both the Int and Bool classes.

In [25]:
const N = 4
var r: integer 

program useful
    var x, y: integer
    r := 7
    y, x := 6, 10
    write(x)
    write(y)
    write(N)
10
6
4
Global Variables¶

Global variables required creating a separate array, 'globalVars, within genProgStart to keep track of assignment. Then, the function genGlobalVars is modified to append the global variables to said array, as well as call to initialize each variable.

def genGlobalVars(sc, start):
    global memsize
    # Declare a universal set
    asm.append('uSet = set(list(range(-' + str(uniSetRange) +', '+ str(uniSetRange) +')))')
    for i in range(start, len(sc)):
        if type(sc[i]) == Var:
            if sc[i].tp in (Int, Bool) or type(sc[i].tp) in (Array, Record, Set):
                asm.append(' '*indent + initializeVar(sc[i]))
                globalVars.append(sc[i].name)
            else: mark('genGlobalVars - error handling unknown type ' + str(type(sc[i])))

There is also a special case when dealing with Sets, specifically the complement and intersection operations within Python. For these scenarios, a universal set is introduced with a range of -N to N.

# Declare a universal set
    asm.append('uSet = set(list(range(-' + str(uniSetRange) +', '+ str(uniSetRange) +')))')

For example, in P0, the Complement of an empty set is the set of all integers. As Python cannot do this, we declare a uSet that will represent this concept of a universal set of all integers from -N to N.

    uSet = set(list(range(-10, 10)))
    uSet = {-10, -9, -8, ..., 8, 9, 10}

Upon entry of a program or procedure, a call to the function declareGlobalVars is made to go over the global variables array and append each one, separated by commas, to the main array as a Python instruction:

def declareGlobalVars():
    if (len(globalVars) > 0):
        asm.append(' '*indent + "global " + ",".join(v for v in globalVars))

This provides the function access to the global variables

a, b, c = 0, 4, 7
def function1:
    global a, b, c
Local Variables¶

Local variables only required modifying the genLocalVars function to append and simultaneously initialize each variable:

def genLocalVars(sc, start):
    for i in range(start, len(sc)):
        if type(sc[i]) == Var:
            asm.append(' '*indent + initializeVar(sc[i]))
    return sc[start:]
Variable Initialization¶

The initialization of each variable is done with the inclusion of the initializeVar function. Upon a call, it uses the getInitialValue function to determine the value of each variable based on its type, and based on its rational Python equivalent.

def initializeVar(var):
    res = var.name + ' = ' + getInitialValue(var.tp)
    return res

For example, a variable of Integer type is set to 0, a set is set(), and empty set. Arrays are initialized to an array with a length determined by the variable attributes, whereas records are initialized to a Python dictionary that holds each of the record values.

This function takes the type of the given variable, and returns the initial value of said type. If the type is an Array or Record, it recursively visits the inner fields/element's type.

def getInitialValue(varType):
    if varType == Int: return '0'              # integers initialize to 0
    elif varType == Bool: return 'False'       # booleans initialize to False
    elif type(varType) == Set: return 'set()'  # python set
    elif type(varType) == Array:               # ararys initiaze to ['type' for i in range 'length']
        res = '[' + getInitialValue(varType.base) + ' for i in range(' + str(varType.length + varType.lower) + ')]' 
        # handles [2..6] by doing [0 for i in range 6]
        return res 
    elif type(varType) == Record:              # Python dictionary {"f1": v1, "f2": v2, ..., "fn": vn}
        recordValues = []
        for f in varType.fields:
            val = getInitialValue(f.tp)
            line = '"' + f.name + '": ' + val
            recordValues += [line]
        return '{' + ','.join(i for i in recordValues) + '}'  # python dict
    else : return 'None'

CGpy - Assignment of Variables and Constants¶


The loadItem function return the appropriate value of a scanned item depending on its type. For example, if the item is a constant, then we want to retrieve its value, checking if it is a Bool or Int.

If it is a variable, we check to see if it can be retrieved via the item's name, or if we have the pop the result from the stack asm and use that instead.

def loadItem(x):
    if (type(x) == Const): 
        if (x.tp == Bool): return (x.val == 1) 
        else: return str(x.val) # integer 
    elif type(x) == Var:
        # return the name of the variable
        if x.lev == Global or \
           x.lev == curlev: return (x.name)

        # pop result from stack; i.e. 'x[i]', x['first'], '(x+y)', and so on
        elif x.lev == Stack or \
             x.lev == MemInd or \
             x.lev == MemAbs: return encaseParenthesis(asm.pop())    # Check if popped element needs to be encased with '()'
        else: mark('loadItem - error; received var level not handled' + str(x.lev))
    else: mark('loadItem - encountered unknown type' + str(x)); return None

When loading, if a result is popped from the stack it also enclosed in a new set of parenthesis. This is done with the inclusion of the encaseParenthesis function:

def encaseParenthesis(msg):
    if (str(msg[0]) != '(' or str(msg[-1]) != ')'): msg = '(' + str(msg) + ')'
    return msg

The genLeftassign function was originally used to help with array values in WebAssembly code generation. In Python,

def genLeftAssign(x):
    global lhsAssign; lhsAssign += 1
    return x

The genRightAssign function was find and loads the expression on the right of the assignment and stores it in as a name in a temporary variable.

def genRightAssign(x):
    global rhsAssign; rhsAssign += 1
    y = Var(x.tp)
    y.name = loadItem(x)
    y.lev = curlev
    return y

Both lhsAssign and rhsAssign is used to keep track of how many assignments have occurred on the same line.

For example, P0 commands

  • a, b := 3, 5 gives lhsAssign = 2 = rhsAssign
  • a, b, c := 4, 2, 1 lhsAssign = 3 = rhsAssign

This is used to later ensure the correct of order assigning, such that the python conversion results in

a = 3
b = 5

This is especially important when going through loops.

j = 1                 j = 1
while (j < 32):       while (j < 32):
    print(j)              j = j+1 
    j = j + 1             print(j)

Finally, the genAssign function is prompted to simply load the variables (x, y) and assign them directly. It then appends the Python equivalent:

def genAssign(x, y):
    global lhsAssign, rhsAssign, listWaitAssign
    y2 = loadItem(y)  # y2 is loaded first due to stack
    x2 = loadItem(x)

    # store assignment statement in waiting list
    listWaitAssign += [' '*indent + x2 + ' = ' + str(y2)]
    lhsAssign -= 1; rhsAssign -= 1  # removed one assignment statement from the (a,b,c :=3,5,6)

    # if all assignment statements in this line have been assigned...
    if (lhsAssign == 0 == rhsAssign):
        while (len(listWaitAssign) > 0): asm.append(listWaitAssign.pop())  # append the lines to the stack

Handling Arrays¶


Arrays are handled with the genIndex function. It links the proper variable and index together in Python form, and changes the appropriate type modifiers.

#x[y]
def genIndex(x, y):
    y2 = loadItem(y)
    x2 = loadItem(x)
    asm.append(x2 + '[' + y2 + ']')
    if x.lev == MemAbs and type(y) == Const: 
        x.tp = x.tp.base 
    else:
        x = Var(x.tp.base)
        if x.tp in (Int, Bool) or type(x.tp) == Set: x.lev = MemInd
        else: x.lev = Stack
    return x

Handling Records¶


Similarly, supporting record assignment involved modifying the genSelect function to append the proper key-value pair together with the brackets in Python form:

#x.f becomes x["f"]
def genSelect(x, f):
    x2 = loadItem(x)
    asm.append(x2 + '["' + f.name + '"]')
    x.lev = Stack
    x.tp = f.tp
    return x

Altogether, this means that variables and the different types of variables now properly work. Here are a few examples of the different types in use:

Constants and Variables¶


In [8]:
const N = 4
var r: integer 

program useful
    var x, y: integer
    r := 7
    y, x := 6, 10
    write(x)
    write(y)
    write(N)
10
6
4

Arrays¶

In [2]:
var c: [0 .. 1] → integer
var a, b: [2 .. 9] → integer
program p
  b[2] := 3; b[3] := 5
  a := b
  write(a[2]); write(a[3]); write(a[4]) // writes 3, 5, 0
  
3
5
0
In [3]:
type A = [2 .. 9] → integer
type B = [0 .. 1] → A
var b: B
procedure q(x: A) → (y: A)
    y := x; write(x[4]) // writes 0
program p
  b[1][2] := 3; b[1][3] := 5
  b[0] ← q(b[1])
  write(b[0][2]); write(b[0][3]) 
0
3
5

Records¶

In [7]:
type T = (first: integer, second: integer, third: integer)
var r: T
program records
    r.first := 1 
    r.second := 10 
    r.third := -6
    write(r.second)
    write(r.first)
    write(r.third)
10
1
-6
In [6]:
type T = (first: integer, second: integer, third: integer)
var r: T
var s: T
program records
    r.first := 1 
    r.second := 10 
    r.third := -6
    s.first := r.second 
    s.third := r.first + r.second + r.third
    write(s.first)
    write(s.third)
10
5

Sets¶

In [8]:
type S = set [1..10]
procedure elements(s: S)
    var i: integer
    writeln(); i := 0
    while i < 11 do
        if i ∈ s then write(i)
        i := i + 1
program p
    var s: S
    s := {3,4,1,8}; elements(s) 
1
3
4
8

CGpy - Operations¶


To support Operations we generally had to modify the genUnaryOp, genBinaryOp, and genRelation functions to append the Python equivalent. Loading the operands was done the same way through the loadItem function.

Unary Operations¶

def genUnaryOp(op, x):
    # load the name/value/expr of x
    x2 = loadItem(x)
    if op ... 
        ...
    else: mark('genUnaryOp - unhandled operator received: ' +str(op))
    return x
Minus¶
if op == MINUS:
        asm.append('-' + str(x2))
        x = Var(Int); x.lev = Stack
In [9]:
program unary
    var y: integer
    y := -(1)
    write(y)
-1
Set¶
elif op == SET:
        asm.append(encaseBraces(x2))
        x = Var(Set(0, 32)); x.lev = Stack

The function encaseBraces simply takes the value and encases it in braces if required.

def encaseBraces(msg):
    if (str(msg[0]) != '{' or str(msg[-1]) != '}'): msg = '{' + str(msg) + '}'
    return msg
In [10]:
type S = set [1..10]
procedure elements(s: S)
    var i: integer
    writeln(); i := 0
    while i < 10 do
        if i ∈ s then write(i)
        i := i + 1
program sets
    var s, t: S
    s := {3,4}; elements(s)
    
3
4
Cardinality¶
elif op == CARD:
        # Gets the number of elements in the set
        asm.append('len' + encaseParenthesis(x2))
        x = Var(Int); x.lev = Stack
In [11]:
type S = set [1..10]
program sets
    var s: S
    s := {3,4}; write(#s)
2
Complement¶
elif op == COMPLEMENT:
        # an empty set has value 0
        if (x2 == '0'): x2 = 'set()'
        asm.append('uSet - ' + x2)
        x = Var(x.tp); x.lev = Stack
In [12]:
program sets
  var a, b: set [1 .. 11]
  var i: integer
    i, a := 1, {3, 5, 7}
    a := a ∩ ∁{7, 9}
    while i < 12 do
      if i ∈ a then write(i)
      i := i + 1
3
5
Not¶
elif op == NOT:
        asm.append('not ' + encaseParenthesis(x2))
        x = Var(Bool); x.lev = Stack
In [13]:
program logic
    var x, y: boolean
    x := true
    y := ¬(true)
    if (x and y) then write(1) else write(2)
2
In [14]:
program logic
    var x, y: boolean
    x := ¬(false)
    y := ¬(false)
    if (x and y) then write(1) else write(2)
1
Binary Operations¶

def genBinaryOp(op, x, y):
    y2 = loadItem(y)
    x2 = loadItem(x)
    if ...
        ...
    else: mark('genBinaryOp - unknown binary operator encountered ' + str(op))
    return x
Addition¶
if op in PLUS: 
        asm.append((x2 + '+' + y2) if op == PLUS)
In [56]:
program arithmetic
    var x, y: integer
    x, y := 2, 3
    write(x+y)
5
Subtraction¶
if op in MINUS: 
        asm.append((x2 + '-' + y2) if op == MINUS)
In [57]:
program arithmetic
    var x, y: integer
    x, y := 2, 3
    write(x-y)
-1
Multiplication¶
if op in TIMES: 
        asm.append((x2 + '*' + y2) if op == TIMES)
In [58]:
program arithmetic
    var y, z: integer
    y, z := -2, 3
    write(y×z)
-6
In [59]:
program arithmetic
    var y, z: integer
    y, z := 2, 3
    write(y*z)
6
Division¶
if op in DIV: 
        asm.append(('int(' + x2 + '/' + y2 + ')') if op == DIV)
In [60]:
program arithmetic
    var x, y: integer
    x, y := 3, 3
    write(x div y)
1
In [15]:
program arithmetic
    var x, y: integer
    x, y := 7, 3
    write(x div y)
2
Mod¶
if op in MOD: 
        asm.append((x2 + '%' + y2) if op == MOD)
In [61]:
program arithmetic
    var x, y: integer
    x, y := 3, 3
    write(x mod y)
0
In [62]:
program arithmetic
    var x, y: integer
    x, y := 10, 3
    write(x mod y)
1
Union and Intersection Operations¶

For binary operations that involved Union or Intersection between sets, we decided to include a separate function genSetOp that performed the operation betwen the sets and returned a simplified Python equivalent when available. This is done to reduce the amount of clutter that occurs in the Python code.

For example, instead of writing ({3}).union(({5})).union(({7}), we instead try to write {3, 5, 7}):

def genSetOp(x, y, op):
    # removes ({...}) from string, and returns elements as int if possible
    # i.e. '({3, 5})' and '({7})' => [3, 5] and [7]
    def listFromString(s):
        if (s[0] == '(' and s[1] == '{' and \
            s[-1] == ')' and s[-2] == '}'):
            s = s[2:-2]
            s = [int(v) for v in s.split(',')]
        return s
    # Generate int lists from x and y, and try and use .union
    xList = listFromString(x); yList = listFromString(y)
    if (isinstance(xList, list) and isinstance(yList, list)):
        if (op == UNION): res = set(xList).union(set(yList))
        elif (op == INTERSECTION): res = set(xList).intersection(set(yList))
        else: mark('setOp - unhandled operation ' + str(op))
# failed cleanup (i.e vars were present). This is the default handling
    else: 
        if (op == UNION): opStr = ".union("
        elif (op == INTERSECTION): opStr = ".intersection("
        else: mark('setOp - unhandled operation ' + str(op))
        res = x + opStr + y + ")"
    return str(res)
Union, Intersection¶
elif op in {UNION, INTERSECTION}:
        asm.append(genSetOp(x2, y2, op))
        x = Var(x.tp); x.lev = Stack
In [17]:
type S = set [1..10]
procedure elements(s: S)
    var i: integer
    writeln(); i := 0
    while i < 10 do
        if i ∈ s then write(i)
        i := i + 1
program uni
    var s: S
    s := {3,4,1,8}
    s := s ∪ {1, 9}; elements(s) 
    
1
3
4
8
9
In [64]:
type S = set [1..10]
procedure elements(s: S)
    var i: integer
    writeln(); i := 0
    while i < 10 do
        if i ∈ s then write(i)
        i := i + 1
program inter
    var s: S
    s := {3,4,1,9}
    s := s ∩ {5, 7, 9}; elements(s)
    
9
And¶
elif op == AND:
        asm.append(x2 + ' and ' + y2)
        x = Var(Bool); x.lev = Stack
In [18]:
program logic
    var x, y: boolean
    x := true
    y := false
    if (x and y) then write(1) else write(2)
2
In [19]:
program logic
    var x, y: boolean
    x := true
    y := true
    if (x and y) then write(1) else write(2)
1
Or¶
elif op == OR:
        asm.append(x2 + ' or ' + y2)
        x = Var(Bool); x.lev = Stack
In [20]:
program logic
    var x, y: boolean
    x := false
    y := false
    if (x or y) then write(1) else write(2)
2
In [21]:
program logic
    var x, y: boolean
    x := true
    y := false
    if (x or y) then write(1) else write(2)
1
Relational Operations¶

def genRelation(op, x, y):
    y2 = loadItem(y)
    x2 = loadItem(x)
    if op == ...
        ...
    return x
Equal¶
asm.append((x2 + ' == ' + y2) if op == EQ)
In [28]:
program relation
    var x, y: integer
    x, y := 3, 3
    if (x = y) then write(1)
1
In [30]:
program relation
    var x, y: integer
    x, y := 3, 4
    if (x = y) then write(1) else write(2)
2
Not Equal¶
asm.append((x2 + ' != ' + y2) if op == NEQ)
In [32]:
program relation
    var x, y: integer
    x, y := 3, 4
    if (x != y) then write(1)
1
In [33]:
program relation
    var x, y: integer
    x, y := 4, 4
    if (x != y) then write(1) else write(2)
2
Less Than¶
asm.append((x2 + ' < ' + y2) if op == LT)
In [34]:
program relation
    var x, y: integer
    x, y := 3, 4
    if (x < y) then write(1)
1
In [35]:
program relation
    var x, y: integer
    x, y := 4, 4
    if (x < y) then write(1) else write(2)
2
Greater Than¶
asm.append((x2 + ' > ' + y2) if op == GT)
In [36]:
program relation
    var x, y: integer
    x, y := 4, 3
    if (x > y) then write(1)
1
In [37]:
program relation
    var x, y: integer
    x, y := 4, 4
    if (x > y) then write(1) else write(2)
2
Less Than or Equal To¶
asm.append((x2 + ' <= ' + y2) if op == LE)
In [44]:
program relation
    var x, y: integer
    x, y := 3, 4
    if (x <= y) then write(1)
1
In [43]:
program relation
    var x, y: integer
    x, y := 3, 3
    if (x <= y) then write(1)
1
Greater Than or Equal To¶
asm.append((x2 + ' >= ' + y2) if op == GE)
In [41]:
program relation
    var x, y: integer
    x, y := 3, 3
    if (x >= y) then write(1)
1
In [42]:
program relation
    var x, y: integer
    x, y := 4, 3
    if (x >= y) then write(1)
1
Element¶
asm.append((x2 + ' in ' + y2) if op == ELEMENT)
In [45]:
type S = set [1..10]
program element
    var s, t: S
    s := {3,4,1,9}
    if 3 ∈ s then write(1) else write(2)
1
In [46]:
type S = set [1..10]
program element
    var s, t: S
    s := {3,4,1,9}
    if 10 ∈ s then write(1) else write(2)
2
Subset¶
asm.append((x2 + '.issubset(' + y2 + ')') if op == SUBSET)
In [47]:
type S = set [1..10]
program sub
    var s, t: S
    s := {3,4,1,9}
    t := {4,3}
    if t ⊆ s then write(1)
    
1
In [48]:
type S = set [1..10]
program sub
    var s: S
    s := {3,4,1,9}
    if {3,2,21} ⊆ s then write(1) else write(2)
    
2
Superset¶
asm.append((x2 + '.issuperset(' + y2 + ')') if op == SUPERSET)
In [49]:
type S = set [1..10]
program super
    var s, t: S
    t := {3,4,1,9}
    s := {4,3}
    if t ⊇ s then write(1)
1
In [50]:
type S = set [1..10]
program super
    var s, t: S
    s := {3,4,1,9}
    t := {4,3}
    if t ⊇ s then write(1) else write(2)
2

Procedure Calls¶


The functions genActualPara and the genCall function are used by P0.py when calling a function.

genActualPara attempts to loads the actual parameters that the procedure will use when it is called.

# loads the actual parameters onto the stack, and also returns the new parameter to be used in genCall's `ap` parameter
def genActualPara(ap, fp, n):
    newParameter = loadItem(ap)
    asm.append(newParameter)
    return newParameter

genCall contains the return parameters, procedure, and the actual parameters it will execute the procedure with. ap contains the list of all actual parameters generated by genActualPara.

def genCall(rp, pr, ap):
    # genActualPara pushed ap onto the stack in addition to passing it to this function
    # we can use this to check if genCall is going to receive the correct parameters
    for a in reversed(ap):
        b = asm.pop()
        if a != b: mark("\ngenCall - invalid parameter found between " + str(a) + " and " + str(b))

    # Start line of code
    res = ' '*indent
    # adding return parameters; i.e   b = a(1)
    if (len(rp) > 0): res +=','.join(loadItem(r) for r in rp) + ' = ' 
    # adding actual procedure call with ap parameters
    res += pr.name + '(' + ','.join(a for a in ap) + ')'
    ...
...
    res += pr.name + '(' + ','.join(a for a in ap) + ')'
    # store assignment statement in waiting list to handle multiple assignments in the same line
    listWaitAssign += [res]
    lhsAssign -= len(rp) # lhsAssign should equal rhsAssign; i.e. x, y = quotrem(5, 3)

    # all assignments in line appended to listWaitAssign; now we add the list to the stack
    if (lhsAssign == 0 == rhsAssign):
        while (len(listWaitAssign) > 0): asm.append(listWaitAssign.pop())
In [22]:
procedure useful(x, y: integer) → (q, r: integer)
    q, r := x, y

program veryuseful
    var a, b, q, r: integer
      a := 2
      b := 3
      q, r ← useful(a, b)
      write(q); write(r)
2
3

Read and Write P0 Statements¶


In P0, we used read(), write(), and writeln() from a P0 library. These are written in Python with input() and print().

Print Statements from write and writeln¶

Print statements simply involved modifying the genWrite and genWriteLn functions to instead append the Python equivalent print and println, depending on the command received.

def genWrite(x):
    eleml = asm.pop()  # genActualPara pushes the target onto the stack before genWrite is called
    asm.append(' '*indent + 'print(' + str(eleml) + ')')

def genWriteln():
    asm.append(' '*indent + "print('\\n')")
In [23]:
program arithmetic
    var x, y: integer
    x, y := 4, 2
    write(x+y)
6
In [24]:
program blank
    write(1)
    writeln()
1
Input from Read¶

The genRead function appends the Python equivalent input without the need of calling genAssign.

def genRead(x):
    global lhsAssign; lhsAssign -= 1          # x = input(); finished an assignment statement, lhs -= 1
    asm.append(' '*indent + x.name + " = int(input())")
    y = Var(Int); y.lev = Stack
In [42]:
program arithmetic
    var x, y: integer
    x ← read(); y ← read()
    write(x+y)
Traceback (most recent call last):
  File "C:\Users\linas\Desktop\4tb3 new\group-21\p0\program.py", line 10, in <module>
    arithmetic()
  File "C:\Users\linas\Desktop\4tb3 new\group-21\p0\program.py", line 6, in arithmetic
    x = int(input())
EOFError: EOF when reading a line
A Limitation to Inputs¶

Unfortunately, we encountered a limitation of the use of input. Despite implementing the proper conversion into Python code, it cannot be run from within the wrapper kernel for any Jupyter Notebook.

If run, it will throw an EOF error or communications error due to the input feed being cut off immediately. We had multiple attempts of solving this limitation with little success.

Some attempts included using different kernel messages on the shell (stdin, execute_reply to allow for shell input), detecting the number of input lines after and feeding them as arguments separately, programmatically creating new cells used for input, using custom ipython widgets and so on.

CGpy - Control Structures¶


If-Then, If-Then-Else, and While Loops

If¶


For any If statements, the functions genThen, genIfThen, and genIfElse were modified to better support the Python equivalent block construction.

The genThen function was modified to retrieve the condition and put it within the condition statement:

# This generates the condition to check against
def genThen(x):    
    x2 = loadItem(x)
    asm.append(' '*indent + 'if (' + x2 + '):')
    moveIndent(1)
    return x

The genElse function was only slightly modified to append the start of a Python equivalent Else block:

# Generates the 'else:' line after y has been loaded
def genElse(x, y):
    moveIndent(-1)
    asm.append(' '*indent + 'else:')
    moveIndent(1)

The genIfThen and genIfElse functions were modified to remove any unnecessary WebAssembly instructions from being appended:

# If X, then Y; Y is built from alternate call; we just end If-Then here
def genIfElse(x, y, z):
    moveIndent(-1)
    return x
# Ends the If X Then Y Else Z clause
def genIfThen(x, y):
    moveIndent(-1)
In [26]:
program arithmetic
    var x, y: integer
      x := 1
      y := 2
      if x ≠ y then write(x × y)
2
In [36]:
program arithmetic
    var x, y: integer
      x := 1
      y := 2
      if (x ≤ y) or (x ≥ y) then write(x + y)
3

While¶

To support While loops, the functions genWhile, genDo, and genWhileDo were all modified.

The genWhile function was only slightly modified to append the start of a Python equivalent while loop.

def genWhile():
    asm.append('while (')

The genDo function was modified to retrieve (and subsequently delete) all of the separate conditions for the loop and put them within the condition statement:

# After all conditions are appended to stack; go back through them and combine them into one string
def genDo(x):
    asm.append(')')           # End of condition element
    elemc = ' '*indent + "while ("                         # Start of code line to be added

    for i in range(asm.index('while (') + 1, len(asm)):    # Gathering all conditions 
        elemc += asm[i]
    elemc += ':'                                           # while (x..y):
    del asm[asm.index('while (') : ]                       # remove from stack the gathered conditions

    asm.append(elemc)                                     # Append the new condition element
    moveIndent(1)
    return x

The genWhileDo function was modified to remove any unnecessary WebAssembly instructions from being appended:

# Simply ends the while loop by moving the indent back once
def genWhileDo(t, x, y):
    moveIndent(-1)
In [11]:
program arithmetic
    var x, y: integer
      x := 1
      y := 4
      while y ≥ x do
          x := x + 1
          write(y)
2
3
4
5

In [13]:
program arithmetic
    var x, y, z: integer
      x := 1
      y := 4
      z := 4
      while (y ≥ x) and (z ≥ x) do
          x := x + 1
          write(y)
2
3

CGpy - More on Indentation¶


Python requires indentation to indicate different blocks of code and compile correctly. To achieve proper indentation there is a global variable called indent, and a function called moveIndent to modify indent at any point:

def moveIndent(val): 
    global indent 
    indent += (val*4)

This function is called at multiple points, depending on what block of code is occurring. For example, whenever a program or procedure starts, the indent is increased by 4:

def genProgEntry(ident):
    moveIndent(1)

Then, whenever it exits, it decrements the indent by 4:

def genProgExit(x):
    moveIndent(-1)

Other scenarios of changing the indent include inside of If blocks and While loops: \

def genIfThen(x, y):
    moveIndent(-1)
def genWhileDo(t, x, y):
    moveIndent(-1)

Every instruction appended to the array depends on this global level of indent. There is white space prepended to every string that is multiplied by the indent:

asm.append(' '*indent + 'print(' + str(eleml) + ')')

Summary: The Main Problems Encountered¶


Two problems stood out the most

  1. The late switch to the development of a Python Code Generator
  2. The even later realization that input() cannot work within the P0 kernel.

Problem 1 - Switch from WebAssembly compilation in Kernel to Python¶


Our first attempt involved using compilation of the cell contents to WebAssembly via CGwat.py. However, we eventually encountered problems on our devices where certain commands and libraries were not natively installed on the host OS.

I.e. wat2wasm, pywasm libraries, and more.

We later scrapped this idea around the end of March and worked towards developing a new Python Code Generator instead that could natively run the compiled Python program and return the outputs.

Problem 2 - Python input() giving EOF errors¶


During development of CGpy.py, we eventually came across a rather significant error that occurred only when running through Jupyter Notebook.

For the majority of the development, we had been testing our compiled programs in terminal. However, when we tried running the compiled program in the Kernel, we came across the EOF error that occurs when the python program prompts the user with input().

However, the Kernel does not pause the execution and allow further inputs from the user to the program. This results in the program immediately halting with the EOF error, as it never receives an input.

However, running the program in terminal does prompt the user for input.

We tried multiple methods of getting around this issue.

  1. Getting the kernel to pause during execution of the Python program
  2. Opening widgets to the notebook that take in input when needed
  3. Using send_response with a stdin socket to request user input
  4. Scanning the compiled file for number of input() statements, prompting the user to fill them in, and then feeding the user inputs with to the program (I.e. basically replace input() with argv[i].