# 8086-rs

8086-rs is a Rust-based toolchain for analyzing and interpreting 16-bit 8086 binaries, built with the intention of interpreting binaries compiled for MINIX 1.x.

It includes:

- an a.out parser for legacy MINIX 1.x executables.
- an 8086 disassembler that decodes the 16-bit instructions into an IR and prints them in an `objdump(1)`-style fashion.
- an 8086 interpreter that executes the instructions with MINIX 1.x conventions (e.g. interrupts, memory layout, ...) in mind and obeys segment register indirection, which makes the entire 20-bit address bus usable.

## Usage

To compile the tool, use Cargo:

```
cargo build --release
```

Or run it directly:

```
cargo run -- --help
```

Run with debug output:

```
RUST_LOG=debug cargo run -- interpret -p ./a.out 2>&1 | less
```

CLI Options:

```
$ cargo run -- --help
Simple program to disassemble and interpret 8086 a.out compilates, e.g. such for MINIX

Usage: 8086-rs [OPTIONS]

Commands:
  disasm     Disassemble the binary into 8086 instructions
  interpret  Interpret the 8086 instructions
  help       Print this message or the help of the given subcommand(s)

Options:
  -p, --path     Path of the binary
  -d, --dump     Dump progress of disassembly, in case of encountering an error
  -h, --help     Print help
  -V, --version  Print version
```

## Status

This project is under active development and is primarily used by me to explore Intel disassembly and learn more Rust. Expect bugs and missing features. I mainly test with 'official' binaries from the MINIX source tree.

Currently, everything lives in the binary, but I want to move some parts into a library. That would make it much easier to leave out the MINIX 1.x specifics (e.g. the currently hardcoded interrupt handler) and would allow for more generic usage of this 8086 implementation (e.g. implementing your own simple BIOS or OS). Before I make that move, however, I want to implement all features correctly and add tests for all of them.

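
For context on the segment register indirection mentioned in the feature list: the 8086 forms a 20-bit physical address by shifting a 16-bit segment value left by four bits and adding a 16-bit offset. The following is a minimal sketch of that calculation. The function name `physical_address` and the constant `ADDRESS_MASK` are made up for this illustration and are not taken from this crate's code.

```rust
/// Mask for the 20 address lines of the 8086 (1 MiB address space).
/// Hypothetical name, chosen for this sketch only.
const ADDRESS_MASK: u32 = 0xF_FFFF;

/// Translate a `segment:offset` pair into a 20-bit physical address.
///
/// The segment is multiplied by 16 (shifted left by 4 bits), the offset
/// is added, and the result is truncated to 20 bits, so addresses past
/// 1 MiB wrap around to the start of memory.
pub fn physical_address(segment: u16, offset: u16) -> u32 {
    (((segment as u32) << 4) + offset as u32) & ADDRESS_MASK
}
```

For example, `physical_address(0x1234, 0x0010)` evaluates to `0x12350`, while `physical_address(0xFFFF, 0x0010)` wraps around to `0x00000`. The same calculation applies to any segment register, including the `CS:IP` pair used for instruction fetches.
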
## Caveats

Interpreted code is disassembled into a vector of instructions, which is also used for execution. This means that the code is not actually loaded into memory, although the `CS:IP` addressing scheme is still used.

## Documentation

The documentation of the project itself can be built with `cargo doc`.

```
$ cargo doc
$ firefox target/doc/8086_rs/index.html
```

For the implementation of the disassembler, I used the Intel "8086 16-BIT HMOS MICROPROCESSOR" spec, as well as [this](http://www.mlsite.net/8086/8086_table.txt) overview of all opcode variants, in conjunction with [this](http://www.mlsite.net/8086/) decoding matrix.

For the implementation of the interpreter, I used the Intel "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z" spec.

## FAQ

#### Why hassle with interpretation and not just emulate 8086?

For one, this project stemmed from a university exercise about the 8086 instruction set and disassembly. An interpreter for these assembly instructions was the logical (?) next step. Maybe I will add raw 8086 emulation some day.

#### Why no `nom`?

There is no real reason; I just wanted to try implementing most parts myself, even if it meant more boilerplate code. I have used `nom` extensively in the past and wanted to see what it would be like without that crate. In hindsight, using `nom` would have been the cleaner option, but hey, that is something I only learned by not using `nom` for once.