# 8086-rs

8086-rs is a Rust-based toolchain for analyzing and interpreting 16-bit 8086 binaries, made with the intention of interpreting binaries compiled for MINIX 1.x.

It includes:
- An a.out parser for legacy MINIX 1.x executables.
- An 8086 disassembler that decodes the 16-bit instructions into an IR and prints them in an `objdump(1)`-style format.
- An 8086 interpreter that executes the instructions with MINIX 1.x conventions (e.g. interrupts, memory layout, ...) in mind and obeys segment register indirection, which makes the entire 20-bit address space usable.
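
As a rough illustration of the segment register indirection mentioned above, the following sketch shows how an 8086 forms a 20-bit physical address from a `segment:offset` pair. The function name `physical_address` and the constant `ADDRESS_SPACE` are made up for this example and are not taken from the crate.

```rust
/// Size of the 8086 address space: 2^20 bytes (1 MiB).
/// (Name chosen for this sketch, not part of 8086-rs.)
const ADDRESS_SPACE: u32 = 1 << 20;

/// Translate a `segment:offset` pair into a 20-bit physical address.
/// Hypothetical helper, only meant to illustrate the scheme.
fn physical_address(segment: u16, offset: u16) -> u32 {
    // The segment is shifted left by four bits (multiplied by 16),
    // the offset is added, and the sum wraps around at 1 MiB.
    (((segment as u32) << 4) + offset as u32) % ADDRESS_SPACE
}

fn main() {
    // 0x1234 * 16 + 0x0010 = 0x12350
    assert_eq!(physical_address(0x1234, 0x0010), 0x12350);
    // The highest segment plus a small offset wraps back to address 0.
    assert_eq!(physical_address(0xFFFF, 0x0010), 0x00000);
    println!("{:#07x}", physical_address(0x1234, 0x0010));
}
```

Since a 16-bit offset alone only reaches 64 KiB, the segment registers are what make the full 1 MiB reachable.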

## Usage

To compile the tool, use Cargo:
```
cargo build --release
```

Or run it directly:
```
cargo run -- --help
```

Run with debug output:
```
RUST_LOG=debug cargo run -- interpret -p ./a.out 2>&1 | less
```

CLI Options:
```
$ cargo run -- --help
Simple program to disassemble and interpret 8086 a.out compilates, e.g. such for MINIX

Usage: 8086-rs [OPTIONS] <COMMAND>

Commands:
  disasm     Disassemble the binary into 8086 instructions
  interpret  Interpret the 8086 instructions
  help       Print this message or the help of the given subcommand(s)

Options:
  -p, --path <PATH>  Path of the binary
  -d, --dump         Dump progress of disassembly, in case of encountering an error
  -h, --help         Print help
  -V, --version      Print version
```

## Status

This project is under active development and primarily used by me to explore some Intel disassembly and learn some more Rust.
Expect bugs and some missing features.
I mainly test with 'official' binaries from the MINIX source tree.

Currently, everything lives in the binary crate, but I want to move some parts into a library. That would make it much easier to ignore the MINIX 1.x specifics (e.g. the currently hardcoded interrupt handler) and would allow more generic usage of this 8086 implementation (e.g. implementing your own simple BIOS or OS).
Before I do that, however, I want to implement all features correctly and add tests for all of them.

## Caveats

The code to be interpreted is first disassembled into a vector of instructions, and that same vector is used for execution.
This means that the code is not actually loaded into the interpreter's memory, although the `CS:IP` addressing scheme is still used to select the next instruction.
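
To make this caveat more concrete, here is a minimal sketch of how an instruction pointer can be resolved against a vector of decoded instructions instead of raw memory. The `Decoded` struct and the `fetch` function are hypothetical and only illustrate the idea; the actual data structures in 8086-rs may look different.

```rust
/// Hypothetical decoded instruction: its offset in the text segment,
/// its encoded length in bytes, and a mnemonic for display.
#[derive(Debug, Clone, PartialEq)]
struct Decoded {
    offset: u16,
    len: u16,
    mnemonic: &'static str,
}

/// Find the instruction that starts at `ip`.
/// Assumes `program` is sorted by offset, as a linear disassembly would be.
fn fetch(program: &[Decoded], ip: u16) -> Option<&Decoded> {
    program
        .binary_search_by_key(&ip, |ins| ins.offset)
        .ok()
        .map(|idx| &program[idx])
}

fn main() {
    let program = vec![
        Decoded { offset: 0x0000, len: 3, mnemonic: "mov bx, 0x0010" },
        Decoded { offset: 0x0003, len: 2, mnemonic: "int 0x20" },
        Decoded { offset: 0x0005, len: 1, mnemonic: "hlt" },
    ];

    // Step through the program by advancing IP by each instruction's length.
    let mut ip: u16 = 0;
    while let Some(ins) = fetch(&program, ip) {
        println!("{:04x}: {}", ins.offset, ins.mnemonic);
        ip = ip.wrapping_add(ins.len);
    }

    // A jump into the middle of an instruction has no entry in the vector.
    assert!(fetch(&program, 0x0001).is_none());
}
```

One consequence of such a design is that self-modifying code, or jumps to offsets that are not instruction boundaries in the original disassembly, cannot be executed.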


## Documentation

The documentation of the project itself can be generated with `cargo doc` and then opened in a browser:
```
$ cargo doc
$ firefox target/doc/8086_rs/index.html 
```

For the implementation of the disassembler, I used the Intel "8086 16-BIT HMOS MICROPROCESSOR" spec, as well as [this overview of all opcode variants](http://www.mlsite.net/8086/8086_table.txt) in conjunction with [this decoding matrix](http://www.mlsite.net/8086/).

For the implementation of the interpreter, I used the "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z".


## FAQ

#### Why hassle with interpretation and not just emulate 8086?
For one, this project stemmed from a university exercise about the 8086 instruction set and disassembly.
An interpreter for these assembly instructions was the logical (?) next step.
Maybe I will add raw 8086 emulation some day.

#### Why no `nom`?
There is no real reason; I just wanted to try implementing most parts myself, even if it meant more boilerplate code.
I have used `nom` extensively in the past and wanted to see what it would be like without that crate.
In hindsight, using `nom` would have been the cleaner option, but hey, that is something I only learned by not using `nom` for once.