### Disclaimer

This is a fantasy architecture on which I intend to write fantasy compilers. It was born out of the
"fuck around and find out" philosophy, and is a toy project. I will change a lot of stuff as I learn
how it's done in the real world. For now, I'm just gonna guess and have fun.

Since I'm studying RISC-V, this will be heavily RISC-V inspired.

# The GRAVEJIT virtual machine

The GRAVEJIT virtual machine sports sixteen 16-bit registers (plus the program counter!) and 16 operations.

Here is the list of registers together with their mnemonics.

```
0 : zero // register 0 is always 0.
1 : ra   // return address
2 : sp   // stack pointer
3 : t0   // temporary
4 : t1
5 : t2
6 : t3
7 : a0   // function arguments
8 : a1
9 : a2
10: a3
11: s0   // saved registers
12: s1
13: s2
14: s3
15: t4   // don't know what to do with this

pc: program counter.
```

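
The register list above can be sketched as a tiny register file in Python. This is only an illustration: the `RegisterFile` class and its method names are made up here, and it assumes that writes to `zero` are silently dropped and that stored values wrap to 16 bits (the spec only says that register 0 is always 0).

```python
# Hypothetical register file for the 16 numbered registers listed above.
REG_NAMES = ["zero", "ra", "sp", "t0", "t1", "t2", "t3",
             "a0", "a1", "a2", "a3", "s0", "s1", "s2", "s3", "t4"]


class RegisterFile:
    def __init__(self):
        self.regs = [0] * 16
        self.pc = 0  # the program counter lives outside the 16 numbered registers

    def read(self, index):
        return self.regs[index]

    def write(self, index, value):
        # Assumption: a write to register 0 is ignored, so it always reads as 0.
        if index != 0:
            self.regs[index] = value & 0xFFFF  # assumption: values wrap to 16 bits


rf = RegisterFile()
rf.write(REG_NAMES.index("s0"), 0x12345)  # too wide: only the low 16 bits are kept
rf.write(REG_NAMES.index("zero"), 7)      # dropped
```
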
## ISA

| opcode | mnemonic       | format  | description                             |
| ------ | -------------- | ------- | --------------------------------------- |
| 0000   | NOP            | just 0s | Does nothing.                           |
| 0001   | ADD s0 s1 s2   | R       | s0 = s1 + s2                            |
| 0010   | SUB s0 s1 s2   | R       | s0 = s1 - s2                            |
| 0011   | AND s0 s1 s2   | R       | s0 = s1 & s2                            |
| 0100   | XOR s0 s1 s2   | R       | s0 = s1 xor s2                          |
| 0101   | SLL s0 s1 s2   | R       | s0 = s1 << s2                           |
| 0110   | SLI s0 c       | I       | s0 = s0 << c                            |
| 0111   | ADDI s0 c      | I       | s0 = s0 + c                             |
| 1000   | BEQ s0 s1 s2   | R       | if (s1 == s2) -> pc = s0                |
| 1001   | BGT s0 s1 s2   | R       | if (s1 > s2) -> pc = s0                 |
| 1010   | JAL s0 s1 c    | J       | s0 = pc+1; pc += s1 + c;                |
| 1011   |                |         | #TODO?                                  |
| 1100   | LOAD s0 s1 s2  | R       | loads the value at address s1 + s2 into s0 |
| 1101   | STORE s0 s1 s2 | R       | stores s0 at address s1 + s2            |
| 1110   | CALL s0 c      | I       | performs system call                    |
| 1111   | HALT           | just 1s | Halt, and possibly catch fire.          |

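
As a rough illustration of the arithmetic rows in the table above, here is a small Python sketch of the register-register operations. The function name `alu` is made up, and the sketch makes three assumptions the spec does not state: AND is bitwise, results wrap around at 16 bits, and only the low 4 bits of the shift amount are used.

```python
# Opcodes copied from the ISA table.
ADD, SUB, AND, XOR, SLL = 0b0001, 0b0010, 0b0011, 0b0100, 0b0101


def alu(opcode, a, b):
    """Compute the result of one R-type arithmetic instruction."""
    if opcode == ADD:
        result = a + b
    elif opcode == SUB:
        result = a - b
    elif opcode == AND:
        result = a & b           # assumption: AND is bitwise
    elif opcode == XOR:
        result = a ^ b
    elif opcode == SLL:
        result = a << (b & 0xF)  # assumption: shift amount is taken modulo 16
    else:
        raise ValueError(f"not an arithmetic opcode: {opcode:04b}")
    return result & 0xFFFF       # assumption: results wrap around at 16 bits
```

With these assumptions, `alu(SUB, 0, 1)` wraps around to `0xFFFF`, which is the two's complement encoding of -1.
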
### Operation formats:

Each instruction is 16 bits long.
The 4 most-significant bits are the opcode.
Constants (c in the above table) are always considered signed, and written in
two's complement. Sign extension also takes place whenever needed.
So, to perform an immediate subtraction, one just needs to add a negative number.

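
The opcode extraction and sign extension described above could look like this in Python. The helper names are illustrative, and the 16-bit wraparound in `addi` is an assumption rather than something the spec states.

```python
def opcode_of(word):
    """Return the 4 most-significant bits of a 16-bit instruction word."""
    return (word >> 12) & 0xF


def sign_extend(value, bits):
    """Read the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def addi(register_value, constant_field):
    """ADDI with an 8-bit constant field; wrapping at 16 bits is an assumption."""
    return (register_value + sign_extend(constant_field, 8)) & 0xFFFF
```

For example, `addi(10, 0b11111110)` adds -2 and gives 8, which is the "immediate subtraction" mentioned above.
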
#### R-type

- opcode: 4 bits
- dest register: 4 bits
- source 1 register: 4 bits
- source 2 register: 4 bits

Example: `ADD s0 s1 s2` = `0001 1011 1100 1101`

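
A minimal Python sketch of packing and unpacking an R-type word, assuming the fields are laid out from most to least significant in the order listed above (which matches the `ADD` example). The function names are made up for illustration.

```python
def encode_r(opcode, dest, src1, src2):
    """Pack four 4-bit fields into one 16-bit R-type instruction word."""
    for field in (opcode, dest, src1, src2):
        if not 0 <= field <= 0xF:
            raise ValueError(f"field does not fit in 4 bits: {field}")
    return (opcode << 12) | (dest << 8) | (src1 << 4) | src2


def decode_r(word):
    """Split a 16-bit word into (opcode, dest, src1, src2)."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF


# ADD s0 s1 s2, with s0 = 11, s1 = 12, s2 = 13 from the register list.
word = encode_r(0b0001, 11, 12, 13)
```
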
#### I-type

- opcode: 4 bits
- dest register: 4 bits
- constant: 8 bits

Examples:

- `ADDI s0 28` = `0111 1011 00011100`
- `ADDI s0 -2` = `0111 1011 11111110`

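
The two `ADDI` examples above can be reproduced with a short Python sketch. The names `encode_i` and `decode_i` are illustrative, and the field layout (opcode, then register, then constant, from the most significant bit down) is inferred from the examples.

```python
def encode_i(opcode, dest, constant):
    """Pack an I-type word; `constant` is a signed value in -128..127."""
    if not -128 <= constant <= 127:
        raise ValueError(f"constant does not fit in 8 signed bits: {constant}")
    # Masking with 0xFF yields the 8-bit two's complement encoding.
    return (opcode << 12) | (dest << 8) | (constant & 0xFF)


def decode_i(word):
    """Split a 16-bit word into (opcode, dest, signed constant)."""
    raw = word & 0xFF
    constant = raw - 0x100 if raw & 0x80 else raw
    return (word >> 12) & 0xF, (word >> 8) & 0xF, constant


plus_28 = encode_i(0b0111, 11, 28)   # ADDI s0 28
minus_2 = encode_i(0b0111, 11, -2)   # ADDI s0 -2
```
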
#### J-type

- opcode: 4 bits
- dest register: 4 bits
- jump address register: 4 bits
- constant: 4 bits

The constant is added to the value of the second register argument (the jump address register).

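
Here is a hedged Python sketch of decoding a J-type word and applying the `JAL` row of the ISA table (`s0 = pc+1; pc += s1 + c`). The function names are made up, the two statements are applied in the order the table lists them, and the sketch assumes the pc wraps at 16 bits and that a write to register 0 is ignored.

```python
def decode_j(word):
    """Split a 16-bit word into (opcode, dest, base, signed 4-bit constant)."""
    raw = word & 0xF
    constant = raw - 16 if raw & 0x8 else raw  # 4-bit two's complement: -8..7
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, constant


def jal(pc, regs, dest, base, constant):
    """Apply `s0 = pc+1; pc += s1 + c` and return the new pc."""
    if dest != 0:                          # assumption: register 0 stays 0
        regs[dest] = (pc + 1) & 0xFFFF
    return (pc + regs[base] + constant) & 0xFFFF  # assumption: pc wraps at 16 bits


regs = [0] * 16
regs[3] = 20                               # t0 holds the jump base
fields = decode_j(0b1010_0001_0011_1101)   # JAL ra t0 -3
new_pc = jal(100, regs, fields[1], fields[2], fields[3])
```
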
### JIT's system calls:

The `CALL` instruction is a bit of a hack because I want to load more functionality into the thing.
The JIT can decide what to do with the register s0 and the number c.
It should be possible to open files, write files, read stdin, write to stdout, etc.

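
One possible way to let the JIT "decide what to do" is a table of handlers keyed by the constant c. Everything in this Python sketch is hypothetical: the call number 0, the `write_char` handler, and the convention that a handler receives the register value and returns the new one are all invented for illustration and are not part of the spec.

```python
import io


def make_syscalls(out):
    """Build a hypothetical system call table that writes to the stream `out`."""
    def write_char(value):
        out.write(chr(value & 0xFF))  # made-up call: print the low byte as a character
        return value                  # leave the register unchanged

    return {0: write_char}            # call number 0 is an arbitrary choice


def call(syscalls, regs, reg_index, c):
    """Run `CALL s0 c`: look up handler c and hand it the register value."""
    handler = syscalls.get(c)
    if handler is None:
        raise ValueError(f"unknown system call {c}")
    regs[reg_index] = handler(regs[reg_index]) & 0xFFFF


out = io.StringIO()
regs = [0] * 16
regs[7] = ord("A")                    # a0 holds the character to print
call(make_syscalls(out), regs, 7, 0)
```
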
#### io\_vec: first system call environment

Working on this, quick and dirty.

### Binary executable format:

Binary files start with a header: two 16-bit numbers, a constant and a length N, followed by a list
of N pairs of 16-bit numbers.

The initial constant is currently unused and unimportant. In this draft-toy-spec, the initial constant
is always 39979.

In each pair, the first number is an offset and the second number is a size S.

The offset points at a null-terminated UTF-8 (yes.) string, located offset\*16 bits after the end of the header in the binary file, followed by arbitrary binary content of size S\*16 bits.

The UTF-8 string cannot contain the null character anywhere, as that is used as the terminator.

Together, these pairs form the "symbols table" of the binary file, where functions and data can be stored.

There must exist a symbol named "main", and it must point to a function: this will be the entrypoint to our program.

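
A rough Python sketch of reading the header and a symbol name, using the standard `struct` module. The spec does not say which byte order the 16-bit numbers use, so little-endian is an assumption here (exposed as a parameter), and the function names are made up. The sketch stops at the name because the spec does not say whether the content that follows is padded to a 16-bit boundary.

```python
import struct


def parse_header(data, byteorder="<"):
    """Return (constant, entries); entries is a list of (offset, size) pairs."""
    constant, count = struct.unpack_from(byteorder + "HH", data, 0)
    entries = [struct.unpack_from(byteorder + "HH", data, 4 + 4 * i)
               for i in range(count)]
    return constant, entries


def read_name(data, count, offset):
    """Read the null-terminated UTF-8 name found offset*16 bits after the header."""
    start = 4 + 4 * count + 2 * offset  # header is 2 + 2*count 16-bit numbers
    end = data.index(b"\0", start)
    return data[start:end].decode("utf-8")


# One symbol at offset 0 with one 16-bit word of content (a HALT instruction).
blob = struct.pack("<HHHH", 39979, 1, 0, 1) + b"main\0" + struct.pack("<H", 0xFFFF)
constant, entries = parse_header(blob)
name = read_name(blob, len(entries), entries[0][0])
```
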
### Executable, in memory.

When loading a binary program, all the code in the binary file is placed at the start of our memory, followed by the data sections, in the order they appeared.

The "text" sections (or code) are put in the order they appeared in the binary file, with the only exception of the "main" section, which goes first.

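
The placement rule above can be sketched in Python as follows. The binary format does not say how text and data sections are told apart, so the `"text"`/`"data"` tag on each section is a hypothetical input, as are the function name and the tuple layout.

```python
def layout(sections):
    """Place sections in memory: "main" first, other text, then data.

    `sections` is a list of (name, kind, words) tuples in file order, where
    `kind` is a hypothetical "text" or "data" tag and `words` is a list of
    16-bit values. Returns (memory, addresses).
    """
    text = [s for s in sections if s[1] == "text"]
    data = [s for s in sections if s[1] == "data"]
    # list.sort is stable, so everything except "main" keeps its file order.
    text.sort(key=lambda s: s[0] != "main")

    memory, addresses = [], {}
    for name, _kind, words in text + data:
        addresses[name] = len(memory)  # address of the section's first word
        memory.extend(words)
    return memory, addresses


sections = [("helper", "text", [1, 2]), ("msg", "data", [9]),
            ("main", "text", [3]), ("tail", "text", [4])]
memory, addresses = layout(sections)
```

With this input, "main" lands at address 0 even though it appears third in the file, and the single data section ends up after all the code.
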