# 8086-rs

8086-rs is a Rust-based toolchain for analyzing and interpreting 16-bit 8086 binaries, built with the goal of interpreting binaries compiled for MINIX 1.x.

It includes:
- An a.out parser for legacy MINIX 1.x executables.
- An 8086 disassembler that parses the 16-bit instructions into an IR and prints them in an `objdump(1)`-style fashion.
- An 8086 interpreter that executes the instructions with MINIX 1.x conventions (e.g. interrupts, memory layout, ...) in mind.

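To give an idea of what the a.out parser has to deal with, here is a minimal sketch of reading the 32-byte short-form header. The field layout is assumed from MINIX's `a.out.h` (`struct exec`), and `AoutHeader`, `le_u32` and `parse_header` are hypothetical names for this example, not the actual API of this crate.

```rust
/// Hypothetical, simplified view of the short-form MINIX a.out header.
/// Offsets are an assumption based on `struct exec` in MINIX's a.out.h.
#[derive(Debug, PartialEq, Eq)]
pub struct AoutHeader {
    pub flags: u8,
    pub cpu: u8,
    pub hdrlen: u8,
    pub text: u32,
    pub data: u32,
    pub bss: u32,
    pub entry: u32,
    pub total: u32,
    pub syms: u32,
}

/// Reads a little-endian u32 at `off`; the caller guarantees the bounds.
fn le_u32(bytes: &[u8], off: usize) -> u32 {
    u32::from_le_bytes([bytes[off], bytes[off + 1], bytes[off + 2], bytes[off + 3]])
}

/// Parses the header. Returns `None` if the input is shorter than
/// 32 bytes or does not start with the magic bytes 0x01 0x03.
pub fn parse_header(bytes: &[u8]) -> Option<AoutHeader> {
    if bytes.len() < 32 || bytes[0] != 0x01 || bytes[1] != 0x03 {
        return None;
    }
    Some(AoutHeader {
        flags: bytes[2],
        cpu: bytes[3],
        hdrlen: bytes[4],
        // bytes[5] is unused, bytes[6..8] hold a version stamp.
        text: le_u32(bytes, 8),
        data: le_u32(bytes, 12),
        bss: le_u32(bytes, 16),
        entry: le_u32(bytes, 20),
        total: le_u32(bytes, 24),
        syms: le_u32(bytes, 28),
    })
}
```

In this sketch the text segment would start at offset `hdrlen` in the file and be followed directly by the data segment.
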
## Usage

To build the tool, use Cargo:
```
cargo build --release
```

Or run it directly:
```
cargo run -- --help
```

Run with debug output:
```
RUST_LOG=debug cargo run -- interpret -p ./a.out 2>&1 | less
```

CLI options:
```
$ cargo run -- --help
Simple program to disassemble and interpret 8086 a.out compilates, e.g. such for MINIX

Usage: 8086-rs [OPTIONS] <COMMAND>

Commands:
  disasm     Disassemble the binary into 8086 instructions
  interpret  Interpret the 8086 instructions
  help       Print this message or the help of the given subcommand(s)

Options:
  -p, --path <PATH>  Path of the binary
  -d, --dump         Dump progress of disassembly, in case of encountering an error
  -h, --help         Print help
  -V, --version      Print version
```

## Status

This project is under active development and is primarily used by me to explore Intel disassembly and to learn more Rust.
Expect bugs and some missing features.
I mainly test with 'official' binaries from the MINIX source tree.

Currently, everything lives in the binary crate, but I want to move some parts into a library. That would make it much easier to leave out the MINIX 1.x specifics (e.g. the currently hardcoded interrupt handler) and would allow more generic use of this 8086 implementation (e.g. implementing your own simple BIOS or OS).
Before moving on to that, though, I want to implement all features correctly and add tests for all of them.

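As a rough illustration of what such a split could look like, the sketch below puts the interrupt handling behind a trait so that the core no longer needs to know about MINIX. Nothing here exists in the project yet: `Registers`, `Flow`, `InterruptHandler`, `ToyBios` and `dispatch` are hypothetical names, and the "INT 0x10 prints AL" convention is made up for the example.

```rust
/// Hypothetical register file; only what the example needs.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Registers {
    pub ax: u16,
    pub bx: u16,
}

/// What the interpreter should do after an interrupt was handled.
#[derive(Debug, PartialEq, Eq)]
pub enum Flow {
    Continue,
    Halt,
}

/// Hypothetical extension point: the library core would call this
/// for every `INT n` instead of using a hardcoded MINIX handler.
pub trait InterruptHandler {
    fn handle(&mut self, vector: u8, regs: &mut Registers) -> Flow;
}

/// Toy handler standing in for a user-provided BIOS or OS layer.
pub struct ToyBios {
    pub output: Vec<u8>,
}

impl InterruptHandler for ToyBios {
    fn handle(&mut self, vector: u8, regs: &mut Registers) -> Flow {
        match vector {
            // Made-up convention for this example: INT 0x10 emits the byte in AL.
            0x10 => {
                self.output.push(regs.ax as u8);
                Flow::Continue
            }
            // Every other vector stops execution in this toy handler.
            _ => Flow::Halt,
        }
    }
}

/// The core only sees the trait object, never a concrete OS.
pub fn dispatch(handler: &mut dyn InterruptHandler, vector: u8, regs: &mut Registers) -> Flow {
    handler.handle(vector, regs)
}
```

A MINIX 1.x handler would then be just one more implementation of the same trait, living outside the library core.
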
## Caveats

Interpreted code is disassembled into a vector of instructions, which is also used for execution.
This means that the code is not actually loaded into memory, but the `CS:IP` addressing scheme is still used.

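One way to picture this is the minimal sketch below, which resolves an `IP` value to an entry of the instruction vector through a lookup table. It is not the project's actual data structure: `Instr`, `Program`, `fetch` and `physical` are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

/// Hypothetical decoded instruction: its byte offset in the text
/// segment, its encoded length and a printable form.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Instr {
    pub offset: u16,
    pub len: u16,
    pub text: &'static str,
}

/// Instructions kept in a vector, plus a map from IP value to vector index.
pub struct Program {
    instrs: Vec<Instr>,
    by_offset: HashMap<u16, usize>,
}

impl Program {
    pub fn new(instrs: Vec<Instr>) -> Self {
        let by_offset = instrs
            .iter()
            .enumerate()
            .map(|(index, instr)| (instr.offset, index))
            .collect();
        Program { instrs, by_offset }
    }

    /// Returns the instruction that starts exactly at `ip`, if any.
    pub fn fetch(&self, ip: u16) -> Option<&Instr> {
        self.by_offset.get(&ip).map(|&index| &self.instrs[index])
    }
}

/// Real-mode physical address: segment * 16 + offset, wrapped to 20 bits.
pub fn physical(cs: u16, ip: u16) -> u32 {
    (((cs as u32) << 4) + ip as u32) & 0xF_FFFF
}
```

In a scheme like this, a jump into the middle of an instruction cannot be resolved, because only offsets at which a decoded instruction starts are known.
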
## Documentation

The documentation of the project itself can be generated with `cargo doc` and then opened in a browser.
```
$ cargo doc
$ firefox target/doc/8086_rs/index.html
```

For the implementation of the disassembler, I used the Intel "8086 16-BIT HMOS MICROPROCESSOR" spec, as well as [this](http://www.mlsite.net/8086/8086_table.txt) overview of all opcode variants, used in conjunction with [this](http://www.mlsite.net/8086/) decoding matrix.

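To show the kind of decoding these tables describe, here is a minimal sketch that splits the byte following the opcode into its `mod`, `reg` and `r/m` fields and uses them to print a register-to-register `mov`. It only covers opcodes `0x89` and `0x8B` with `mod = 11`, and `ModRm`, `split_modrm`, `REG16` and `decode_mov_reg` are hypothetical names, not the disassembler's real API.

```rust
/// The three fields of the byte that follows the opcode.
#[derive(Debug, PartialEq, Eq)]
pub struct ModRm {
    pub mode: u8, // bits 7-6
    pub reg: u8,  // bits 5-3
    pub rm: u8,   // bits 2-0
}

pub fn split_modrm(byte: u8) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: (byte >> 3) & 0b111,
        rm: byte & 0b111,
    }
}

/// 16-bit register names indexed by a 3-bit register field (w = 1).
pub const REG16: [&str; 8] = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"];

/// Decodes a two-byte register-to-register `mov` (opcode 0x89 or 0x8B).
/// Returns `None` for other opcodes and for memory operands (mod != 11).
pub fn decode_mov_reg(bytes: [u8; 2]) -> Option<String> {
    let [opcode, modrm] = bytes;
    let fields = split_modrm(modrm);
    if fields.mode != 0b11 {
        return None;
    }
    let reg = REG16[fields.reg as usize];
    let rm = REG16[fields.rm as usize];
    match opcode {
        // d = 0: the `reg` field is the source, `r/m` the destination.
        0x89 => Some(format!("mov {}, {}", rm, reg)),
        // d = 1: the `reg` field is the destination, `r/m` the source.
        0x8B => Some(format!("mov {}, {}", reg, rm)),
        _ => None,
    }
}
```

For example, the bytes `89 D8` split into `mod = 11`, `reg = 011` (`bx`) and `r/m = 000` (`ax`), which gives `mov ax, bx`.
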
For the implementation of the interpreter, I used the "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z".

## FAQ

#### Why bother with interpretation instead of just emulating the 8086?
For one, this project stemmed from a university exercise about the 8086 instruction set and disassembly.
An interpreter for these assembly instructions was the logical (?) next step.
Maybe I will add raw 8086 emulation some day.

#### Why no `nom`?
There is no real reason; I just wanted to implement most parts myself, even if it meant more boilerplate code.
I have used `nom` extensively in the past and wanted to see what it would be like without that crate.
In hindsight, using `nom` would have been the cleaner option, but hey, that is something I only learned by not using `nom` for once.