Running Doom on the PMR-171

As part of the ongoing PMR-171 reverse engineering project, I ported Doom to the radio’s STM32H743 as an exercise in bare-metal development on this hardware. The STM32H743 is a Cortex-M7 at 400 MHz with 2 MB of flash, 1 MB of SRAM, and a hardware FPU — the original Doom ran on a 33 MHz 486 with 4 MB of RAM. This post documents the port.
Hardware
Relevant peripherals:
- MCU: STM32H743IIT6, Arm Cortex-M7 at 400 MHz. 1 MB of SRAM split across six banks that do not form a single contiguous range, the largest being 512 KB.
- Display: 320×240 ST7789V panel over SPI. Doom’s native resolution is 320×200.
- USB host: OTG_HS in FS mode (PB14/PB15) with a USB-A port for FAT32 flash drive WAD storage.
- Constraints: No external RAM, no SD card, SPI-only display. The main PCB does have unpopulated pads in an 8-WSON (8×6) footprint next to the MCU for what appears to be an I2C memory IC.
| Region | Address | Size | Bus Domain |
|---|---|---|---|
| DTCM | 0x20000000 | 128 KB | Zero-wait-state, tightly coupled |
| AXI SRAM | 0x24000000 | 512 KB | D1 domain (main bus) |
| SRAM1 | 0x30000000 | 128 KB | D2 domain |
| SRAM2 | 0x30020000 | 128 KB | D2 domain |
| SRAM3 | 0x30040000 | 32 KB | D2 domain |
| SRAM4 | 0x38000000 | 64 KB | D3 domain |
The regions span different bus domains with different DMA accessibility. SAI2 audio DMA must target D2 SRAM; the display blit buffer resides in SRAM4.
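To make the memory map concrete, here is a minimal sketch that encodes the table as data and classifies an address by region. The names SramRegion, SramRegionFor and SramTotalBytes are made up for illustration and are not part of the port's firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per SRAM region in the table above. */
typedef struct {
    const char *name;
    uint32_t    base;   /* first byte address */
    uint32_t    size;   /* length in bytes */
} SramRegion;

static const SramRegion kSramRegions[] = {
    { "DTCM",     0x20000000u, 128u * 1024u },
    { "AXI SRAM", 0x24000000u, 512u * 1024u },
    { "SRAM1",    0x30000000u, 128u * 1024u },
    { "SRAM2",    0x30020000u, 128u * 1024u },
    { "SRAM3",    0x30040000u,  32u * 1024u },
    { "SRAM4",    0x38000000u,  64u * 1024u },
};

#define SRAM_REGION_COUNT (sizeof kSramRegions / sizeof kSramRegions[0])

static_assert(SRAM_REGION_COUNT == 6, "the table lists six regions");

/* Return the region containing addr, or NULL if it is outside all six. */
static const SramRegion *SramRegionFor(uint32_t addr) {
    for (size_t i = 0; i < SRAM_REGION_COUNT; i++) {
        const SramRegion *r = &kSramRegions[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return r;
        }
    }
    return NULL;
}

/* Sum of all region sizes in bytes. */
static uint32_t SramTotalBytes(void) {
    uint32_t total = 0;
    for (size_t i = 0; i < SRAM_REGION_COUNT; i++) {
        total += kSramRegions[i].size;
    }
    return total;
}
```

A lookup like this is handy when reading addresses out of a crash dump: SramRegionFor(0x2407FFD8) lands in AXI SRAM, for example. The six listed regions add up to 992 KB.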
Testbed

All development and testing was done through a bench testbed. A USB camera was pointed at the radio’s LCD for monitoring without physically being at the bench. A Segger J-Link connected over SWD handled firmware flashing and live memory reads for diagnostics. An MCP2221A USB-to-I2C adapter controlled a PCF8574 I/O expander driving relays for remote power cycling of the radio.

This setup allowed the entire flash-test-debug cycle to run remotely — flash new firmware over SWD, power-cycle via relay, and observe the LCD output through the camera feed.
Engine
PureDOOM is a single-header C library implementing the complete Doom engine with a callback-based I/O model. The integration contract:
- File I/O: doom_open/read/seek/close → FatFS reading the WAD from USB
- Memory: doom_malloc/free → custom 6-pool bump allocator
- Video: 320×200 paletted framebuffer → palette LUT → RGB565 SPI blit
- Audio: currently disabled (see below)
- Timer: SysTick millisecond counter
Doom requests 4–6 MB of heap. After optimization (see Heap Optimization below), ~492 KB is available for the zone allocator. PureDOOM loads WAD lumps on demand, so the zone only holds the active working set. DOOM1.WAD (shareware) fits within this constraint.
Sound
Sound is not implemented. The audio hardware path (SAI2 DMA → WM8731L DAC → amplifier) is initialized, but the mixer callbacks are stubbed. Doom’s sound mixer requires vol_lookup[128 * 256] — a 128 KB mixing table — plus additional working buffers, totaling ~228 KB. With the table removed to reclaim heap space (see Heap Optimization below), re-enabling audio would require restoring it and accepting the heap pressure.
Multi-Pool Memory Allocator
Standard malloc cannot span non-contiguous memory. I implemented a 6-pool bump allocator where each pool corresponds to one SRAM region (or the unused tail of one). Pools are tried in a fixed order: the DTCM pool first, then the remaining pools from smallest to largest:
| Pool | Region | Size | Contents |
|---|---|---|---|
| 0 | DTCM | 64 KB | screen_buffer (PureDOOM primary screen) |
| 1 | SRAM1 tail | ~2 KB | Screen aliases from V_Init() |
| 2 | SRAM3 | 32 KB | Visplanes, small allocations |
| 3 | SRAM4 tail | ~33 KB | Overflow allocations |
| 4 | SRAM2 tail | ~118 KB | Medium allocations |
| 5 | AXI SRAM tail | ~492 KB | Zone heap |
Small allocations land in the fast, small pools first. The zone heap — PureDOOM’s single largest allocation — falls through to the AXI SRAM tail.
void* DoomMalloc(int size) {
    size = (size + 3) & ~3;  /* round up to a multiple of 4 bytes */
    for (int i = 0; i < HEAP_NUM_POOLS; i++) {
        int avail = (int)(pools_[i].end - pools_[i].ptr);
        if (avail >= size) {
            void* p = pools_[i].ptr;
            pools_[i].ptr += size;
            return p;
        }
    }
    return NULL;  /* no pool has enough room left */
}
No free is implemented. PureDOOM allocates its heap once at startup; all other allocations are permanent. The zone allocator handles dynamic allocation internally.
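The pool bookkeeping around that function is not shown above, so here is a minimal host-side sketch of how it could look. The HeapPool struct, HeapInit, PoolAlloc, PoolRemaining and the three small static arrays standing in for SRAM regions are all assumptions for illustration; the real port uses six pools placed at fixed addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_NUM_POOLS 3   /* the port uses 6; three keep the sketch short */

typedef struct {
    uint8_t *ptr;   /* next free byte */
    uint8_t *end;   /* one past the last usable byte */
} HeapPool;

/* Host-side stand-ins for SRAM regions; sizes are illustrative only. */
static _Alignas(4) uint8_t bank_small_[64];
static _Alignas(4) uint8_t bank_medium_[256];
static _Alignas(4) uint8_t bank_large_[4096];

static HeapPool pools_[HEAP_NUM_POOLS];

/* Point every pool at the start of its backing region. */
static void HeapInit(void) {
    pools_[0].ptr = bank_small_;
    pools_[0].end = bank_small_ + sizeof bank_small_;
    pools_[1].ptr = bank_medium_;
    pools_[1].end = bank_medium_ + sizeof bank_medium_;
    pools_[2].ptr = bank_large_;
    pools_[2].end = bank_large_ + sizeof bank_large_;
}

/* Same in-order walk as DoomMalloc, with a guard against bad sizes. */
static void *PoolAlloc(int size) {
    assert(size > 0);
    size = (size + 3) & ~3;   /* round up to a multiple of 4 bytes */
    for (int i = 0; i < HEAP_NUM_POOLS; i++) {
        ptrdiff_t avail = pools_[i].end - pools_[i].ptr;
        if (avail >= size) {
            void *p = pools_[i].ptr;
            pools_[i].ptr += size;   /* bump; nothing is ever returned */
            return p;
        }
    }
    return NULL;
}

/* Bytes still unallocated in pool i. */
static size_t PoolRemaining(int i) {
    return (size_t)(pools_[i].end - pools_[i].ptr);
}
```

One property worth noticing: because every request restarts from pool 0, a later small allocation can still land in an earlier pool that a larger request skipped. That is what keeps the small, fast regions filled before anything touches the big one.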
Display Pipeline
Doom outputs a 320×200 framebuffer as 8-bit palette indices. The ST7789V expects RGB565 over SPI, big-endian. The conversion pipeline:
- Palette LUT: 256-entry table mapping palette indices to RGB565 values, rebuilt on palette changes.
- Batch conversion: 50 rows at a time, indexed → RGB565 with byte-swap. 50 × 320 × 2 = 32 KB per batch, stored in SRAM4.
- SPI bulk transfer: Four 32 KB batches per frame.
The ST7789V MADCTL register is configured with BGR=1, requiring R↔B swap in the palette LUT:
static uint16_t Rgb888ToRgb565(uint8_t r, uint8_t g, uint8_t b) {
    /* Blue goes in the top 5 bits because the panel runs with BGR=1. */
    return (uint16_t)(((uint16_t)(b & 0xF8) << 8) |
                      ((uint16_t)(g & 0xFC) << 3) |
                      ((uint16_t)(r & 0xF8) >> 3));
}
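The LUT rebuild and the 50-row batch conversion can be sketched on top of that packing function. This is a rough illustration rather than the port's actual code: PackBgr565, PaletteLutBuild and ConvertBatch are hypothetical names, and the palette is assumed to arrive as 256 RGB triplets (3 bytes per entry). The output is written high byte first so the result does not depend on host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCREEN_W    320
#define SCREEN_H    200
#define BATCH_ROWS   50

/* Same packing as above: blue in the top 5 bits for a BGR=1 panel. */
static uint16_t PackBgr565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((uint16_t)(b & 0xF8) << 8) |
                      ((uint16_t)(g & 0xFC) << 3) |
                      ((uint16_t)(r & 0xF8) >> 3));
}

static uint16_t palette_lut_[256];

/* Rebuild the LUT from 256 RGB triplets; call again on palette changes. */
static void PaletteLutBuild(const uint8_t rgb[256 * 3]) {
    for (int i = 0; i < 256; i++) {
        palette_lut_[i] = PackBgr565(rgb[i * 3], rgb[i * 3 + 1], rgb[i * 3 + 2]);
    }
}

/* Convert BATCH_ROWS rows starting at first_row into big-endian 16-bit
 * pixels. out must hold BATCH_ROWS * SCREEN_W * 2 bytes.
 * Returns the number of bytes written. */
static size_t ConvertBatch(const uint8_t *indexed, int first_row, uint8_t *out) {
    assert(first_row >= 0 && first_row + BATCH_ROWS <= SCREEN_H);
    const uint8_t *src = indexed + (size_t)first_row * SCREEN_W;
    size_t n = (size_t)BATCH_ROWS * SCREEN_W;
    for (size_t i = 0; i < n; i++) {
        uint16_t px = palette_lut_[src[i]];
        out[2 * i]     = (uint8_t)(px >> 8);     /* high byte first on the wire */
        out[2 * i + 1] = (uint8_t)(px & 0xFF);
    }
    return n * 2;
}
```

Each call produces 32,000 bytes, so a full frame is four calls with first_row set to 0, 50, 100 and 150, each followed by an SPI transfer of the batch buffer.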
Doom’s screen-melt wipe transition requires two additional screen buffers (~128 KB). To conserve RAM, I aliased these to screens[0], which corrupted the animation. Wipes are disabled; level transitions cut to black.
Boot Safety
The board has a power amplifier chain, T/R switching relays, and eight GPIO expanders controlling RF filter banks. Custom firmware must safe all RF-active hardware before initializing anything else.
PA Disable
PB1 controls the PA enable line and has an external pull-up. On MCU reset, GPIO pins default to high-impedance, allowing the pull-up to enable the HMC482ST89 gain block chain, which draws over 6 W within microseconds. Three bare register writes therefore execute as the first code in main(), completing within a handful of bus accesses at the 64 MHz HSI boot clock (one cycle is about 15.6 ns):
RCC->AHB4ENR |= RCC_AHB4ENR_GPIOBEN; /* clock on */
(void)RCC->AHB4ENR; /* barrier */
GPIOB->MODER = (GPIOB->MODER & ~(3U << (1U * 2)))
             | (1U << (1U * 2)); /* PB1 = output */
GPIOB->BSRRH = GPIO_PIN_1; /* PB1 = LOW */
Boot Sequence
The firmware writes a DIAG_STEP counter to AXI SRAM at 0x2407FFD8, readable over SWD without halting the CPU:
- IWDG extend (0x01): Reconfigure the stock firmware’s ~512 ms watchdog to ~32.8 s (prescaler 6, max reload). Enable IWDG freeze during debug (DBGMCU_APB4FZ1 bit 18).
- SWD safety delay (0x02): 500 ms spin-wait for debugger attachment.
- Safe state init (0x03): All RF-critical GPIO pins set to safe defaults — PA off, T/R switch to RX, BK4819 SPI deselected, codec held in reset.
- NCA9555 GPIO expanders (0x04): Eight I2C expanders programmed to stock firmware RX-idle state via bit-banged I2C. These control the 12 FM3418 SP8T antenna switches forming the RF filter bank. Values captured from logic analyzer traces of the stock firmware.
- Clock tree (0x10): PLL configuration, CPU to 400 MHz (VOS1).
- Peripheral setup (0x11–0x20): SPI2, I2C3, SAI2, OTG_HS, USART1.
- PureDOOM init (0x30): Register callbacks, load WAD, enter game loop.
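Two pieces of this sequence are easy to sketch in isolation: the DIAG_STEP breadcrumb and the watchdog timeout arithmetic. The enum names, DiagMark and IwdgTimeoutMs are hypothetical. The timeout formula assumes the usual IWDG relationship (divider of 4 << PR, reload + 1 counts) and a nominal 32 kHz LSI, which matches the ~32.8 s figure above.

```c
#include <assert.h>
#include <stdint.h>

/* Step codes taken from the list above. */
enum {
    DIAG_IWDG_EXTEND  = 0x01,
    DIAG_SWD_DELAY    = 0x02,
    DIAG_SAFE_STATE   = 0x03,
    DIAG_EXPANDERS    = 0x04,
    DIAG_CLOCK_TREE   = 0x10,
    DIAG_PERIPH_FIRST = 0x11,
    DIAG_PERIPH_LAST  = 0x20,
    DIAG_DOOM_INIT    = 0x30
};

/* Record the current boot step. On target, slot would be
 * (volatile uint32_t *)0x2407FFD8 so a debugger can poll it. */
static void DiagMark(volatile uint32_t *slot, uint32_t step) {
    *slot = step;
}

/* IWDG timeout in milliseconds for prescaler code pr (divider = 4 << pr),
 * 12-bit reload value, and LSI frequency in Hz. */
static uint32_t IwdgTimeoutMs(uint32_t pr, uint32_t reload, uint32_t lsi_hz) {
    assert(pr <= 6 && reload <= 0xFFFu && lsi_hz > 0);
    uint64_t lsi_cycles = (uint64_t)(reload + 1u) * (4u << pr);
    return (uint32_t)(lsi_cycles * 1000u / lsi_hz);
}
```

With prescaler code 6 and reload 0xFFF this gives 32768 ms. For comparison, prescaler code 0 with the same reload gives 512 ms. That is one combination that would produce the stock timeout, though I have not confirmed it is the one the stock firmware uses.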
Boot to title screen takes 3–4 seconds, dominated by USB enumeration and WAD header verification.
Crash: Dangling argv Pointer
After boot, navigating the menu triggered a hard fault. The crash was observed via the camera feed: the display froze mid-menu. The fault handlers are __attribute__((naked)) functions that write the stacked PC and LR, plus the CFSR and BFAR fault status registers, to fixed SRAM addresses starting at 0x2407FFE0, so the J-Link could read the crash dump over SWD without resetting the chip. The relay then power-cycled the radio for the next test iteration. The full observe → read crash data → patch → flash → power-cycle → observe loop ran entirely from the host machine, with no physical bench access required.
The crash dump:
CFSR = 0x00008200 (PRECISERR + BFARVALID)
BFAR = 0x2001FF10
PC → doom_strlen()
LR → M_CheckParm()
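For readers unfamiliar with CFSR, the value decodes as follows. The bit positions are the Armv7-M BusFault status bits (BFSR sits in CFSR bits 15:8); the BusFaultInfo struct and function names are only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BusFault status bits as they appear inside the 32-bit CFSR. */
#define CFSR_IBUSERR      (1u << 8)   /* instruction bus error */
#define CFSR_PRECISERR    (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR  (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR     (1u << 11)  /* fault on exception return unstacking */
#define CFSR_STKERR       (1u << 12)  /* fault on exception entry stacking */
#define CFSR_LSPERR       (1u << 13)  /* fault during lazy FP state save */
#define CFSR_BFARVALID    (1u << 15)  /* BFAR holds the faulting address */

static_assert((CFSR_PRECISERR | CFSR_BFARVALID) == 0x00008200u,
              "matches the CFSR value in the dump");

typedef struct {
    bool precise;
    bool imprecise;
    bool bfar_valid;
} BusFaultInfo;

/* Pull out the bus fault flags that matter for triage. */
static BusFaultInfo DecodeBusFault(uint32_t cfsr) {
    BusFaultInfo info;
    info.precise    = (cfsr & CFSR_PRECISERR) != 0;
    info.imprecise  = (cfsr & CFSR_IMPRECISERR) != 0;
    info.bfar_valid = (cfsr & CFSR_BFARVALID) != 0;
    return info;
}

/* The raw BFSR byte, i.e. CFSR bits 15:8. */
static uint8_t CfsrBusFaultByte(uint32_t cfsr) {
    return (uint8_t)((cfsr >> 8) & 0xFFu);
}
```

A precise fault with BFARVALID set is the friendly case: the stacked PC points at the faulting instruction and BFAR can be trusted as the address it tried to access.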
M_CheckParm() iterates over PureDOOM’s internal myargv[] pointer. doom_init() saves argv without copying the data. The argv[] array was declared as a local variable in DoomPmr171_Init() — after the function returned, myargv pointed to a recycled stack frame.
/* Bug: local array — dangling pointer after return */
char* argv[] = {"doom", "-iwad", "/DOOM1.WAD", "-nosound", NULL};
/* Fix: static storage duration */
static char* argv[] = {"doom", "-iwad", "/DOOM1.WAD", "-nosound", NULL};
Heap Optimization
Early builds crashed after 15–30 seconds of gameplay. Z_Malloc would fail to find a free block, trigger a software reset, and the cycle would repeat: title screen → E1M1 → 15 seconds → OOM → reset.
The zone heap at this point was 349 KB. Doom’s zone allocator is first-fit: it walks a linked list of blocks, purging PU_CACHE entries (cached textures) when it can’t find space. Composite textures allocated as PU_STATIC are never freed — they accumulate during gameplay until the heap is exhausted.
Running arm-none-eabi-nm --size-sort on the ELF revealed the problem: vol_lookup[128 * 256] — a 128 KB sound volume mixing lookup table — was sitting in BSS consuming AXI SRAM. The audio mixer was disabled (I_UpdateSound returns immediately), but the table was still allocated. Three one-line edits eliminated it:
- vol_lookup[128*256] → vol_lookup[1] (128 KB → 4 bytes)
- #if 0 around the initialization loop in I_SetChannels()
- Null the mixer pointer assignments in addsfx()
An additional 15 KB came from shrinking the RTT debug buffer from 16 KB to 1 KB. Total savings: ~143 KB. Zone heap went from 349 KB to 492 KB (+41%).
The side effect was unexpected: FPS increased noticeably. A larger heap means fewer texture cache evictions, so R_GenerateComposite — which rebuilds multi-patch wall textures from the WAD — gets called far less often. A memory fix produced a performance fix.
Tick Catch-Up Fix
USB flash drive reads block for 1–3 seconds during WAD lump loading. PureDOOM’s doom_update() measures elapsed time and processes that many game ticks as catch-up, causing multi-second freezes during level transitions.
Fix: cap delta_time in doom_update():
if (delta_time > 4) delta_time = 4;
This limits catch-up to 4 ticks (~114 ms) per update call.
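The arithmetic behind that cap, as a small standalone sketch. TicksElapsed and ClampCatchUp are hypothetical helpers, and I am assuming delta_time is counted in game ticks at Doom's 35 Hz tick rate.

```c
#include <assert.h>
#include <stdint.h>

#define TICRATE      35u   /* Doom game ticks per second */
#define MAX_CATCHUP   4u   /* cap on ticks processed per update call */

static_assert(MAX_CATCHUP * 1000u / TICRATE == 114u,
              "4 ticks is about 114 ms of game time");

/* Whole game ticks between two millisecond timestamps.
 * Unsigned subtraction keeps this correct across counter wrap. */
static uint32_t TicksElapsed(uint32_t now_ms, uint32_t last_ms) {
    uint32_t elapsed_ms = now_ms - last_ms;
    return (uint32_t)(((uint64_t)elapsed_ms * TICRATE) / 1000u);
}

/* Limit how many ticks a single update is allowed to replay. */
static uint32_t ClampCatchUp(uint32_t delta_ticks) {
    return delta_ticks > MAX_CATCHUP ? MAX_CATCHUP : delta_ticks;
}
```

A 2-second USB stall would otherwise queue 70 ticks for a single update; with the clamp, the engine replays 4 and drops the rest, trading game-clock accuracy for responsiveness.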
Controls
Controls are partially implemented. The primary goal was getting Doom running on the display; full input mapping was not a priority. Basic input from the front panel buttons and the hand microphone’s keypad is working, but neither is fully mapped to gameplay actions yet. Complete keypad coverage is still in progress.
Results
Doom runs on the PMR-171 at ~35 fps. The binary is 354 KB of the 2 MB flash.
Working:
- Full Doom gameplay (DOOM1.WAD shareware)
- USB flash drive WAD loading (FAT32, OTG_HS MSC)
- Basic front panel input (charlieplexed matrix via GPIO) and hand mic keypad input
- ~35 fps SPI2 palette-converting blit
- Audio hardware initialized (SAI2 DMA + WM8731L + amplifier)
- SWD diagnostics (proof-of-life sentinel, fault register capture)
- Stack canary + zone heap integrity monitoring
- IWDG ~33 s timeout, 500 ms SWD safety delay
Not yet working:
- Sound (mixer disabled; restoring vol_lookup costs 128 KB of zone heap)
- Full keypad mapping for gameplay (SCT3258 key event format undocumented)
- Rotary encoder direction (PG10/SAI2_SD_B pin conflict)
- Save games (no persistence layer)
- Wipe transitions (insufficient screen buffer memory)
At ~35 fps, the port runs without any performance optimization — no DMA framebuffer transfer, no double-buffering, no SIMD intrinsics. The remaining gaps are just things I didn’t get to.
Acknowledgments
- PureDOOM by Daivuk — single-header Doom engine
- UHSDR project — STM32H7 HAL reference
- id Software — Doom source code release (1997)
Resources
- PMR-171 Teardown: PMR-171 Teardown — the hardware analysis that started this
- UART Reverse Engineering: AI-Assisted Reverse Engineering of the PMR-171 Programming Interface — the first post in this series
Disclosure: both the implementation and this writeup were done with heavy AI assistance. I directed the project, provided the hardware and SWD access, and verified results on the physical radio. The LLM handled code generation, debugging analysis, and drafting this post. There may be mistakes or inaccuracies in the technical details.
