CPUlator Simulator Documentation

CPUlator is a simulator and debugger of a computer system (CPU, memory, and I/O devices) that runs inside a web browser. Simulation allows running and debugging programs without needing to use a hardware board. When you load a program onto the simulator, the program will behave (almost) identically to the same program running on real hardware. The simulated systems are based on the computer systems from the Altera University Program (Nios II, ARMv7, and RISC-V) or SPIM (MIPS).

A simulator has several advantages over hardware. It is more convenient and easily accessible than a hardware board. Also, compared to hardware, a simulator has greater visibility into the system state and can warn about suspicious behaviours, such as self-modifying code or violating calling conventions, making debugging easier.

Features

Instruction sets: Nios II, ARMv7, MIPS, and RISC-V RV32
I/O devices:
- Nios II, ARMv7, and RISC-V: Includes most devices found on a DE1-SoC (and other board models used by the Altera University Program), including interrupt support.
- MIPS: Includes SPIM-compatible terminal
Nothing to install: Runs entirely inside a web browser
Debugger: Single-step, breakpoints, watchpoints, trace, call stack, examine disassembly, memory and registers
Debug assertions: Optional runtime assertions catch many potential errors
Input: Accepts both assembly source code and ELF executables

Comparison Chart

The following is a feature comparison of CPUlator with some popular simulators used for teaching.

	CPUlator	MARS 4.5	RARS 1.6	QtSPIM 9.1.20	ARMSim# 2.1
No download/install
Platform	Web browser	Java JRE	Java JRE	Windows, OSX, Linux	.NET 3.0
Free
Open-source
Editor
Code completion				n/a	n/a
Assembler	GNU	custom	custom	custom	GNU
C or other languages
Debugger
Breakpoints
Single-step
Reverse step
Step over function
Step out of function
Modify registers		(except pc/ra)
Modify memory
Show call stack
Runtime calling convention checks
Data watchpoints
Instruction sets	MIPS32 r5, MIPS32 r6, ARMv7, Nios II, RV32IMAFD+	MIPS32	RV32/64 IMFDN	MIPS32	ARMv5
Self-modifying code					Partial
MMU
FPU
Memory model	4 GB flat	5 segments	5 segments	5 segments	1 segment
Maximum usable memory	2048 MB	4+4+4 MB data, 4+4 MB code	4+4+4 MB data, 4+4 MB code	4+1+0.5 MB data, 256+64 KB code	96 KB data, 512 MB code
I/O devices
Terminal
File I/O
Other devices
Simulation speed (Minst/second)	20	12	0.8	10	1.8

Quick Start

Choose which system to simulate. This setting determines which CPU (instruction set) and I/O devices will be included.
In the Editor window, use the Compile and Load (F5) button to compile your assembly source and load the executable into memory.
Set breakpoints, step, run, stop, and debug your program.

If you want to simulate a compiled executable instead of assembly source code, see the Compiling Code section below.

TL;DR

The CPUlator user interface is similar to a typical debugger. If you're already familiar with other debuggers, it should be possible to use CPUlator without reading the rest of this documentation. There are a few things to keep in mind:

I/O devices: CPUlator simulates not only the CPU, but also I/O devices. The Devices panel (right side by default) allows you to interact with the I/O devices.
Debug assertions: The simulator checks for many common errors or unusual/unexpected behaviours in your program at runtime, and will stop execution with a warning if any such conditions are encountered. It is usually better to change your program to avoid the warnings, but they can be disabled in Settings → Debugging Checks if necessary.

System Requirements

CPUlator is a client-side WebAssembly application running in a web browser. Any web browser new enough to support WebAssembly should work.

Firefox 53+
Chrome 57+
Edge 16+
Internet Explorer: No support for WebAssembly

Mobile touch-screen devices are strongly discouraged because the user interface is designed to use a mouse (for resizing and moving the panels).

Chrome 97+, Android
Safari, iOS 11+

What is CPUlator?

CPUlator is a functional simulator of a computer system (CPU and I/O devices), a debugger, and (an interface to) an assembler.

There are four main components in a typical development flow for simple assembly programs: Source code written by the user, an assembler (or compiler for a higher-level language) to transform the source code into executable machine code, a computer system to run the machine code, and a debugger to observe the behaviour of the program running on the computer. CPUlator provides the assembler, computer, and debugger inside a web browser.

CPUlator's design was based on the Altera Monitor Program, so there are many similarities. In the Altera Monitor Program development flow, the Monitor Program provides the assembler and debugger, while the computer system is running on an FPGA development board attached to the host PC over a programming cable. (Top-most option in the figure below)

CPUlator replaces both the computer system and debugger, and optionally also the assembler. For simple programs (one source file), using the built-in editor and assembler (GNU assembler) is the most convenient option (Above figure, middle option). For more complex programs, it is also possible to assemble or compile your project into an ELF executable using any tools you wish (e.g., Altera Monitor Program) and then simulate the executable (above figure, bottom-most option).

Compiling Code

Internally, the CPUlator simulator simulates executable machine code (it does not directly simulate assembly source code). You can compile your source code using the built-in assembler, Altera Monitor Program, or any other development tools you wish. The built-in editor and assembler are limited to working with a single assembly-language source file. If your program requires more than one file or is written in another language (e.g., C), you must compile the program yourself and simulate the compiled executable.

Using the Built-in Assembler

The built-in assembler is an interface to the GNU assembler. When Compile and Load is clicked, the contents of the Editor are uploaded to the server, saved as a single file, then run through the GNU assembler. The server then sends back the ELF executable and any assembler output messages, and the executable is loaded into the simulator.

Because the ELF file is generated at the server, there are limits on the size of the compiled executable (currently 12 MB). If your program is bigger than a megabyte or so (which sometimes happens if it includes images or audio as initialized data), it is usually faster to compile your program locally than to download megabytes at every compile. If your program is substantially bigger than this, be aware that the executable size causes the simulator itself to consume memory to hold the executable. Don't expect the simulator to work with gigabyte-sized executables.

ARMv7 only: The GNU assembler uses "divided" syntax by default. If you're writing in the newer "unified" syntax, use a .syntax unified directive to change the syntax. The differences are minor, so this won't affect most programs. The CPUlator disassembler uses instruction names in unified syntax.

Compiling using Altera Monitor Program

The Altera Monitor Program can be used to compile projects without being connected to an FPGA board, as long as actions that interact with the board are avoided.

When creating a project, specify the "DE1-SoC Computer" system (or the system you'll be simulating) as usual. This is important because the compiler makes certain assumptions (e.g., system memory size) that depend on the computer system, and should match what's being simulated.

When prompted to select the programming cable, leave it blank, as we won't be using it.
[Figure: System Parameters settings box]

When prompted, do not download the computer system to the board.
[Figure: Download system? dialog box]

You can now "Compile" your project, but do not use "Load" nor "Compile & Load". Compiling the program produces a filename.elf (or filename.axf for ARM) executable file in your project directory. To load the executable into the CPUlator simulator, use File → Load ELF Executable and choose this executable file.
[Figure: File → Load ELF Executable menu item]

Using the Simulator

The simulator interface consists of the Toolbar and a collection of movable panels. The panels can be moved, resized, undocked as a movable window, and docked to any part of the browser window, by dragging the title bar (or tab, if docked). The default arrangement is organized by function:

Toolbar: Includes the File and Help drop-down menus.
Registers and debugging tools: Registers, Call stack, Trace, Breakpoints, Watchpoints, Symbol table, and Statistics (performance counters).
Settings: The list of settings is quite long. Scroll down to see the rest.
Editor, Disassembly, and Memory
Messages
I/O devices: The contents depend on what I/O devices your chosen system contains. If your system contains no I/O devices with a graphical panel, the I/O Devices panel is not shown.

[Figure: The user interface split into 6 sections]

The toolbar along the top has the usual debugging operations: step, step over, step out, continue, stop, restart, and reload. Most of these functions are mapped to the same shortcut keys as in the Altera Monitor Program. There are also two drop-down menus: File and Help.

Step into single-steps the execution of one instruction. Step over differs from step into when single-stepping a call instruction. Step into will single-step into the first instruction of the function, while Step over will skip over the function entirely, stopping at the instruction following the call. This can be useful for skipping over big assumed-bug-free functions such as printf. Step out runs the program until the current function returns to its caller.

Step over and step out are implemented by tracking function calls and returns (the same mechanism used to show the call stack). If your function calls and returns are not properly paired or you use an unusual call or return method that is not recognized as a call or return, you may get strange behaviour.
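As a rough illustration of why unpaired calls and returns cause trouble, the following sketch models Step over and Step out as depth counting over a stream of executed instructions. The `step_over` and `step_out` helpers and the "call"/"ret"/"other" labels are hypothetical and are not CPUlator's actual implementation.

```python
def step_over(trace, index):
    """Return the index where execution pauses after stepping over trace[index].

    `trace` lists the kind of each executed instruction in order:
    "call", "ret", or "other" (a hypothetical model for illustration).
    """
    depth = 0
    i = index
    while i < len(trace):
        if trace[i] == "call":
            depth += 1
        elif trace[i] == "ret":
            depth -= 1
        i += 1
        # Pause once we are back at (or below) the starting call depth.
        if depth <= 0:
            return i
    return i  # Ran off the end: the call never returned in a recognized way.


def step_out(trace, index):
    """Return the index where execution pauses after the current function returns."""
    depth = 0
    i = index
    while i < len(trace):
        if trace[i] == "call":
            depth += 1
        elif trace[i] == "ret":
            depth -= 1
        i += 1
        # Pause once the depth drops below where we started.
        if depth < 0:
            return i
    return i


paired = ["other", "call", "other", "other", "ret", "other"]
unpaired = ["call", "other", "other"]  # Callee "returns" in an unrecognized way.
```

With `paired`, stepping over the call at index 1 pauses at index 5, the instruction after the call. With `unpaired`, the depth never returns to zero, so the step only ends when the trace does.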

The Reload function reloads the previously-loaded executable into memory. If you loaded an ELF executable from disk, it does not reload the program from disk. You will need to select and load the ELF file again. (Web browsers generally do not allow JavaScript programs to read a file from disk without user interaction, including a file that has changed since the user last selected it, for security reasons.)

File Menu

The File menu includes operations to load and save files to disk.

Open Assembly Source loads a file into the Editor window. Alternatively, copy-pasting code into the Editor window is equivalent.

Save Assembly Source downloads a file containing the contents of the Editor window. Alternatively, copy-pasting the Editor window to save your code elsewhere is equivalent.

Share Assembly Source uploads the contents of the Editor window and creates a link to it. Anyone with the link can see the uploaded code. The new link opens in a new window.

Load ELF Executable loads a compiled ELF executable into the simulator. You can then run and debug the program using the Disassembly and Memory windows. The Editor window will be unused when simulating an executable.

Help Menu

The Help menu contains a link to documentation (this page), and some sample programs.

Sample programs come in two forms, ELF executables without source code and assembly source code, which can be distinguished by their icons. Most of the sample programs test various I/O devices, so they will only behave as expected when running on a system that contains the I/O device.

Editor Window

The Editor window provides an interface to conveniently write simple (single-file) assembly language programs. Clicking Compile and Load (or pressing F5) will compile the code in the Editor and load it into the simulator. See the section on Compiling Code for more details or if you need to simulate a more complex program.

Code Completion

Code completion is activated by pressing Ctrl-Space in the editor. This pops up a list of instructions. Code completion only suggests instructions, but not the operands, so the instruction list only pops up while typing the first word of a line. The list of instructions is not exhaustive. It omits some variations of the instructions (e.g., it only contains the basic form and omits the 15 conditional variations of each ARM instruction). The list may also include instructions that the simulator does not support (some uncommonly-used ARM instructions).

Disassembly Window

The disassembly window shows the disassembly of the machine code (bytes) in a memory region. The disassembly is an interpretation of the machine code in memory, and can differ from the source code, especially if the machine code in memory has been modified (self-modifying code).

The disassembly window optionally shows the source code lines that generated the machine code, interleaved in the same window, in a lighter gray colour. How much source code is shown depends both on the Show source code setting and whether your program includes debugging information. For programs compiled using the built-in assembler, debugging information is included and full source code can be displayed. When simulating ELF executables, the simulator does not have access to the source code that generated the executable, so it can only display the source code file name and line number that corresponds to each machine instruction. The CPUlator debugger uses debugging information in DWARF format embedded in the ELF executable. Usually, compiling with a -g flag will generate debugging information in DWARF format, as DWARF is usually the default (or use --gdwarf2 to explicitly specify DWARF version 2 format).

Immediately above the disassembly window, there is a "Go to address, label, or register" text input. This allows you to jump to a particular location in your program. The box will accept hexadecimal numbers (3a8), labels (_start), or registers (pc).

You can force the program to move to a different instruction by changing pc. This can be done by changing pc in the registers window, or by double-clicking the target line in the disassembly (this works the same way in the Altera Monitor Program).

Memory Window

The memory window shows a region of memory as an array of 32-bit words. The number format and number of words per row is configurable in the Settings window. You may edit values in memory by clicking in the window to pop up a text box, entering a new value, then pressing Enter. Pressing ESC will cancel the edit. Above the Memory Window, the "Go to..." box works in the same way as for the Disassembly window.

The memory contents in each row interpreted as ASCII characters are also shown (to make text strings easier to read). Unprintable characters are shown as a dot (•), and null bytes (those with value 0) are shown as a black dot.
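A minimal sketch of how such an ASCII column could be produced for one row of memory. The `ascii_column` helper is hypothetical, it treats bytes 0x20 to 0x7E as printable (an assumption), and for simplicity it uses a plain "." for both unprintable and null bytes rather than two visually distinct dots.

```python
def ascii_column(row_bytes):
    """Render a row of memory bytes as text, replacing unprintable bytes with '.'."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in row_bytes)
```

For example, a row holding the bytes of "Hi" followed by a null byte, a newline, and "!" would render as `Hi..!`.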

When viewing the memory address ranges of memory-mapped I/O devices, the simulator will show you the values at those memory locations as if a load were performed, but without triggering any of the side effects that would normally be performed in response to a load (such as dequeueing a FIFO). This makes it easier to observe the system state without disturbing it. This behaviour differs from using a debugger with real hardware, where side effects can occur. (However, there is currently no method to intentionally trigger the side effects from the debugger.)
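The difference between a real load and the debugger's view can be sketched with a toy FIFO device. The `FifoDevice` class and its `load` and `peek` methods are hypothetical names for illustration only and do not correspond to any CPUlator interface.

```python
from collections import deque


class FifoDevice:
    """Toy memory-mapped FIFO with a single data register (hypothetical)."""

    def __init__(self, items):
        self._queue = deque(items)

    def load(self):
        """A CPU load: return the head of the FIFO and dequeue it (side effect)."""
        return self._queue.popleft() if self._queue else 0

    def peek(self):
        """A debugger view: return what a load would return, without dequeuing."""
        return self._queue[0] if self._queue else 0
```

Calling `peek` repeatedly returns the same value and leaves the device untouched, while each `load` consumes one entry.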

Loading and Saving Memory Contents

Memory contents can be loaded from or saved to a file. Raw and text file formats are supported. This can be used to import or export memory contents to a spreadsheet or other program for post-processing or viewing. The file load and save options can be accessed by clicking on the button near the upper-right corner of the Memory window to show the sidebar.

The text file formats encode the memory contents as a series of numbers, each separated by a delimiter. The delimiters supported are LF (UNIX-style line break), CR+LF (Windows-style line break), and comma (all numbers on one line, separated by commas). The Element size setting specifies the number of memory bytes that form each number, which is useful for arrays of integers or floating-point numbers that exceed one byte each. If the size of the memory region is not a multiple of the element size, the memory region is shortened to a multiple of the element size. Each number can be formatted using octal, decimal (signed or unsigned), or hexadecimal base.
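The following sketch shows one way the text encoding described above could work. The `export_text` function and its parameter names are hypothetical, little-endian byte order within each element is an assumption, and the exact digit formatting (prefixes, padding, trailing delimiter) of CPUlator's real output may differ.

```python
def export_text(region, element_size=4, radix="hex", delimiter="\n", signed=False):
    """Encode a bytes-like memory region as delimited numbers (illustrative only)."""
    formats = {"oct": "o", "dec": "d", "hex": "x"}
    # Shorten the region to a whole number of elements.
    usable = len(region) - (len(region) % element_size)
    fields = []
    for offset in range(0, usable, element_size):
        chunk = region[offset:offset + element_size]
        # Assumption: little-endian byte order within each element.
        value = int.from_bytes(chunk, "little", signed=(signed and radix == "dec"))
        fields.append(format(value, formats[radix]))
    return delimiter.join(fields)
```

A 9-byte region with a 4-byte element size yields two numbers, because the trailing byte does not form a complete element.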

The raw file format just contains each memory byte in the memory region, unmodified. The Radix and Element size settings are not used for raw files.

For an example of the file formats, save a region of memory to file and look at the output.

When loading memory contents from a file, you need to specify the starting address at which the file contents are written. The contents of the input file determine the number of bytes written to memory. The maximum input file size is limited to 256 MB. When loading from a text file, the delimiter setting is ignored. Both whitespace (including newlines) and commas are accepted as delimiters between numbers. After entering the start address, click Load from file to choose the input file.
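A minimal sketch of the tolerant parsing described above, where any mix of whitespace and commas separates numbers. The `import_text` function is hypothetical, little-endian byte order is an assumption, and wrapping out-of-range values to the element size is also an assumption about behaviour the text does not specify.

```python
import re


def import_text(text, element_size=4, radix=16):
    """Parse delimited numbers into the bytes to be written to memory (illustrative)."""
    # Whitespace (including newlines) and commas are all accepted as delimiters.
    tokens = [t for t in re.split(r"[\s,]+", text) if t]
    out = bytearray()
    for token in tokens:
        value = int(token, radix)
        # Assumption: values wrap to the element size; bytes are little-endian.
        wrapped = value % (1 << (8 * element_size))
        out += wrapped.to_bytes(element_size, "little")
    return bytes(out)
```

The length of the returned bytes is what determines how much memory is written, starting at the chosen start address.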

When saving memory contents to a file, you need to specify the memory region to save. The memory region is specified by entering the start address (inclusive) and either the end address (exclusive) or length. The start address and the end address or length (whichever is more recently focused) are used to calculate the third field (shown in gray). The size of the memory region to be saved is limited to 96 MB (which could create a text file up to 576 MB). Once the memory region has been entered, click Save to file, which generates the output file and saves it. Whether it shows a dialog box to choose the output filename depends on your browser settings ("Ask where to save each file before downloading").
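The relationship between the three fields can be sketched as follows, with the end address exclusive as described above. The `resolve_region` helper is hypothetical. Note that the quoted 576 MB upper bound is six times the 96 MB region limit, that is, six bytes of text per byte of memory in the worst case.

```python
MAX_SAVE_BYTES = 96 * 1024 * 1024  # Region size limit stated in the text.


def resolve_region(start, end=None, length=None):
    """Return (start, end, length) given start and exactly one of end or length."""
    if (end is None) == (length is None):
        raise ValueError("specify exactly one of end or length")
    if length is None:
        length = end - start      # End address is exclusive.
    else:
        end = start + length
    if length < 0 or length > MAX_SAVE_BYTES:
        raise ValueError("region size out of range")
    return start, end, length
```

For example, a start address of 0x1000 with an end address of 0x1010 gives a length of 16 bytes, and the same start with a length of 16 gives the same end address.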

Registers Window

This shows the current values of the CPU registers. Registers that have changed since the last time the simulator was paused are highlighted in red. You can change the value of a register by clicking on the register value, entering a new number, and pressing Enter.

Trace and Call stack

The trace and call stack show what the program has been executing. The trace shows the last 200 instructions that were executed. The call stack shows the current stack of function calls since the last processor restart. Click on the instruction in the list to highlight and scroll to the instruction in the disassembly window.
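A fixed-length trace like this is commonly kept in a ring buffer. The sketch below uses Python's `collections.deque` with `maxlen` to show the idea; the `make_trace` helper is hypothetical and says nothing about how CPUlator stores its trace internally.

```python
from collections import deque


def make_trace(depth=200):
    """Create a trace buffer that keeps only the most recent `depth` entries."""
    return deque(maxlen=depth)


trace = make_trace()
for pc in range(0, 4 * 250, 4):   # Simulate 250 executed instructions.
    trace.append(pc)              # The oldest entries are dropped automatically.
```

After 250 instructions, the buffer holds only the last 200, starting with the 51st instruction executed.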

Breakpoints

This lists all of the breakpoints that have been set. Set or remove breakpoints by clicking in the left-most gray column in the disassembly window, next to the instruction where you want the breakpoint. You can also remove all breakpoints using the Clear button in the Breakpoints panel.

The list in the breakpoint window allows you to make breakpoints conditional. A conditional breakpoint stops the program only if the specified condition evaluates to true (non-zero). For example, a condition of r3==0 would cause the breakpoint to trigger only if register r3 is zero when this instruction is reached. A blank condition is treated as the condition "true" (an unconditional breakpoint).
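The decision rule can be sketched as below. To avoid guessing at CPUlator's expression grammar, the condition is modelled as a Python callable over a dictionary of register values, with `None` standing in for a blank condition. The `should_stop` helper and this representation are hypothetical.

```python
def should_stop(condition, registers):
    """Decide whether a breakpoint triggers when its instruction is reached."""
    if condition is None:
        return True                       # Blank condition: unconditional breakpoint.
    return condition(registers) != 0      # Any non-zero result counts as true.


def r3_is_zero(regs):
    """Stands in for the condition r3==0."""
    return regs["r3"] == 0
```

With this model, `should_stop(r3_is_zero, {"r3": 0})` triggers the breakpoint, while the same condition with `r3` equal to 5 does not.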

Watchpoints

A watchpoint is used to pause the simulation (like a breakpoint) whenever a certain region in memory is read or written. There are four data watchpoints to allow monitoring four memory regions for loads and stores (but not instruction fetches).

Each memory region has a start and end address (in hexadecimal). The R and W checkboxes select whether the simulation should stop when a memory read or write occurs. Regardless of whether these checkboxes are enabled, the number of reads and writes into the memory region and the most recent instruction that accessed the region are recorded. The Reset button resets these counters.
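A minimal sketch of the bookkeeping described above: every access to the region is counted, but the simulation only stops if the matching checkbox is enabled. The `Watchpoint` class is hypothetical, and treating the end address as inclusive is an assumption that the text does not confirm.

```python
class Watchpoint:
    """Toy data watchpoint over one memory region (hypothetical model)."""

    def __init__(self, start, end, stop_on_read=False, stop_on_write=False):
        self.start = start
        self.end = end                    # Assumption: end address is inclusive.
        self.stop_on_read = stop_on_read
        self.stop_on_write = stop_on_write
        self.reset()

    def reset(self):
        """Clear the access counters, like the Reset button."""
        self.reads = 0
        self.writes = 0
        self.last_pc = None

    def access(self, address, is_write, pc):
        """Record a load or store; return True if the simulation should stop."""
        if not (self.start <= address <= self.end):
            return False
        # Counters update regardless of the R and W checkboxes.
        if is_write:
            self.writes += 1
        else:
            self.reads += 1
        self.last_pc = pc
        return self.stop_on_write if is_write else self.stop_on_read
```

A watchpoint with only W enabled still counts reads into its region; it just does not pause on them.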

Symbols

This table lists all of the symbols (labels) found in the most recently loaded program. The symbol name and value (address) are listed. For convenience, you can jump to the address in the Disassembly or Memory windows by clicking on one of the two icons next to each address.

Counters

This shows some performance counters, which are reset when the simulation is restarted.

Devices Window

The devices window allows you to interact with the hardware devices in the computer system outside the CPU, such as LEDs and switches. Because we're limited to operating in a web browser, the interface sometimes differs from a hardware board. For more complicated I/O devices, the simulated device may have limited functionality. In some cases, the interface also shows extra debugging information about the state of the I/O device, to help in debugging your program.

The panels in the Devices window can be reordered by dragging them, so you can group commonly-used devices together for convenience. For even more flexibility, a panel can also be "floated" out of the Devices window to become its own window that can be moved and docked. Use the Show in a separate box option in the device's drop-down menu to cause the device panel to float. Selecting the option again will unfloat the device panel (move the device's panel back into the Devices window).

Once a panel is floated into its own box, it can be "rolled up" to save screen space. A rolled-up box shows only its title bar and not its contents. Click on the title bar icon (upper left corner) to roll up or roll down a floating box.

Settings Window

Number Display Options

This changes how numbers are displayed in various places in the simulator. For the ASCII byte option, each byte is shown as a single ASCII character if it is printable, or as two hexadecimal digits if not.

Editor Options

There is currently one setting here, which enables or disables code completion in the Editor.

Disassembly Options

The Show source code option controls how much source code should be shown in the disassembly window. There are three options:

None: Show only the disassembly. This option is the most compact and least cluttered.
Some: Show non-blank source code lines that are not the same as the disassembled code. This shows any source code lines that contain comments, and instructions that are different in source code and disassembled form (e.g., pseudo-instructions). This mode reduces clutter by hiding source code lines that are equivalent to the disassembly. For these lines, it marks the disassembled instruction with a line number instead of showing the line of source code.
All: Show all source code lines, including those that are blank. This mode also preserves the original source code's indentation.

Debugging Checks

The checkboxes control whether each assertion should be enforced. Turning off some of these may be useful if your program does something unusual and the assertion is too stringent. Hover over each checkbox for a description of each assertion.

System

Changing the system affects which CPU and I/O devices are in the system. Changing the system requires a reload of the simulator. See the Simulated Systems section for a description of the systems.

Memory Usage Warning

The simulator's memory usage grows with the amount of simulated memory that is written to by the simulated program. A program that writes to memory in a loop can hit the 2 GB WebAssembly heap limit fairly quickly, at which point the simulator stops working. The Memory usage warning setting stops the simulation when the simulator allocates memory in excess of the warning threshold, to prevent a program from inadvertently using a large amount of memory. Currently, changing this threshold requires reloading the simulator. If your simulated program uses a large amount of memory, you may need to reload the simulator with a higher warning threshold.

(The memory use tracked here is the size of the WebAssembly heap, used by the portion of the simulator that uses WebAssembly.)
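One common way for memory usage to grow only with written addresses is to allocate backing storage lazily, page by page. The sketch below illustrates that idea together with a warning threshold. The `SparseMemory` class, the 4096-byte page size, the `MemoryUsageWarning` exception, and reading unwritten memory as zero are all assumptions for illustration, not a description of CPUlator's internals.

```python
class MemoryUsageWarning(Exception):
    """Raised when allocated simulated memory exceeds the warning threshold."""


class SparseMemory:
    PAGE_SIZE = 4096  # Hypothetical allocation granularity.

    def __init__(self, warning_threshold_bytes):
        self.pages = {}  # Page index -> bytearray, created on first write.
        self.warning_threshold_bytes = warning_threshold_bytes

    def allocated_bytes(self):
        return len(self.pages) * self.PAGE_SIZE

    def read_byte(self, address):
        # Reads never allocate; unwritten memory is assumed to read as zero.
        page = self.pages.get(address // self.PAGE_SIZE)
        return page[address % self.PAGE_SIZE] if page is not None else 0

    def write_byte(self, address, value):
        index = address // self.PAGE_SIZE
        if index not in self.pages:
            self.pages[index] = bytearray(self.PAGE_SIZE)
        self.pages[index][address % self.PAGE_SIZE] = value & 0xFF
        if self.allocated_bytes() > self.warning_threshold_bytes:
            raise MemoryUsageWarning("allocated memory exceeds warning threshold")
```

In this model, a loop that writes one byte to each page of a large address range allocates a full page per write, which is how a small program can consume a large amount of host memory.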

Messages Window

The messages window at the bottom shows messages generated by the assembler, simulator, and debug assertions. The clear button clears the window.

Debug Assertions

The CPUlator simulator does more sanity checks than a real processor, to catch common errors. When it detects something suspicious, the simulator will stop executing (as if a breakpoint occurred), allowing you to examine the cause of the problem as soon as it's detected (instead of later when an incorrect result or unexpected behaviour becomes visible). Any of these checks can be disabled by unchecking the desired checkbox in Settings → Debugging Checks.

Simulated Systems

A computer system contains a processor and I/O devices. This section discusses the simulated processors and systems separately. A processor can be used in many different computer systems that differ in the set of I/O devices.

Processors

There are currently five supported processor instruction set architectures:

Nios II: Models a Nios II/f without MMU. All instructions are supported, including FPH1 floating-point (fmuls, fadds, fsubs, fdivs), and internal interrupt controller. Some exception conditions aren't implemented (e.g., divide by zero exception), and some are implemented differently from hardware (e.g., break stops the simulator instead of taking a breakpoint-type software exception).

Nios II Instruction Set Reference
ARMv7: Models a Cortex-A9. Includes integer and floating-point instructions (VFPv3) and GIC interrupt controller. Excludes MMU, Thumb, Thumb-2, and Neon vector instructions. Some exception conditions aren't implemented, and some are implemented differently from hardware (e.g., bkpt).
- CP15 control registers are unimplemented except CP15 c1 CPACR for enabling and disabling the FPU.
- FPU powers up enabled. Although a real CPU powers up with the FPU disabled, the Altera Monitor Program enables the FPU for you, and CPUlator aims to match the Altera Monitor Program environment.
ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition
MIPS32 release 5: Models a MIPS32 release 5 CPU, without MMU. Includes a 64-bit double-precision FPU. Traps, exceptions, interrupts, and syscalls are implemented. There is also a simulation model of a hypothetical variant of the architecture where all branch delay slots are disabled, because introductory courses often teach MIPS without the delay slots. When delay slots are disabled, the return address of branch-and-link instructions behaves differently (pointing to the instruction after the branch rather than after the delay slot), and exceptions can no longer occur in a delay slot.

MIPS32 release 5 is generally backwards-compatible with older MIPS instruction sets, so the MIPS32r5 model is the right one to use unless you know you are using release 6.
- FPU: The FPU contains 32 64-bit floating-point registers, and can do both single and double precision operations. Single-precision operations use the lower half of each register. Double precision operations can either use full 64-bit registers, or use pairs of the lower half of each register as was done on older MIPS processors. This CPU boots up with the FPU in 32-bit mode, and can be set to 64-bit mode by changing CP0.status_FR to 1. The following code fragment enables the 64-bit FPU mode by setting bit 26 (the FR bit) of CP0.status:
```
li $t0, 0x04000000   # Bit 26
mfc0 $t1, $12, 0     # Read CP0.status (Coprocessor 0, register 12, select 0)
or $t1, $t1, $t0     # Set bit 26
mtc0 $t1, $12, 0     # Write back to CP0.status
```
  The simulated FPU ignores rounding modes and does not support floating-point exceptions.
The MIPS32 Instruction Set v5.04
MIPS32 release 6: Models a MIPS32 release 6 CPU, without MMU. Includes a 64-bit double-precision FPU. Traps, exceptions, interrupts, and syscalls are implemented. There is also a simulation model of a hypothetical variant of the architecture where all branch delay slots (and forbidden slots) are disabled, because introductory courses often teach MIPS without the delay slots. When delay slots are disabled, the return address of branch-and-link instructions behaves differently (pointing to the instruction after the branch rather than after the delay slot), and exceptions can no longer occur in a delay slot.

MIPS32 release 6 makes major changes to the instruction set and is not backwards-compatible with any earlier MIPS instruction set. There are a significant number of new instructions, re-encoded instructions, and removed instructions (e.g., addi has been removed and replaced with addiu). If you're not specifically using release 6, the MIPS32 release 5 model is probably the right one to use.
- FPU: The FPU contains 32 64-bit floating-point registers, and can do both single and double precision operations. Double precision operations use the full 64-bit register, while single-precision operations use the lower half of each register. Release 6 forbids using pairs of single-precision registers as double-precision registers, as was done on older MIPS architectures. The simulated FPU ignores rounding modes and does not support floating-point exceptions.
The MIPS32 Instruction Set v6.06
RISC-V RV32: This is a 32-bit RISC-V CPU. Unaligned load/store is supported (but a warning is enabled by default). Only Machine mode is supported (no MMU/paging), including support for machine-level interrupts without an external interrupt controller. The MTI (machine timer) and MSI (machine software interrupt) standard interrupts are supported. Other I/O devices are attached to the platform-defined IRQs 16 to 31.

A floating-point unit is modelled, but rounding modes and exceptions are ignored (including the accrual of exception flags in fcsr.fflags). The only rounding mode is round-to-nearest ties-to-even (the only rounding mode supported by WebAssembly), and RISC-V instructions using other rounding modes will silently use the default mode. NaN handling may also differ from the RISC-V specification. For example, signalling NaNs are ignored because exceptions are not supported, and there may be other cases where WebAssembly defines operations on NaNs differently from RISC-V.

The instruction set modelled is RV32IMAFDZicsrZicbom:
- RV32I: Base instruction set
- M: Integer multiplication and division
- A: Atomic instructions
- F: Single-precision floating-point instructions
- D: Double-precision floating-point instructions
- Zicsr: Control and Status Register (CSR) Instructions
- Zicbom: Cache-block management instructions (implemented as nops)
RISC-V Specifications

Computer Systems

The computer system you choose to simulate determines its CPU and the I/O devices it contains.

Nios II Generic

This is a Nios II CPU with 4 GB of memory and no I/O devices.

Nios II DE1-SoC

This is a Nios II system with most of the FPGA-side I/O devices found in the DE1-SoC Computer, the 1 GB DDR3 memory attached to the HPS (Hard Processor System), but no other HPS-attached devices.