I think of S4 as a stack-based, RPN, Forth-like, virtual CPU/VM that can have as many registers, functions, and amount of user ram as the system supports.

A register is identified by up to 3 UPPER-case characters, so there is a maximum or (26x26x26) = 17576 registers available. I tend to think of registers as built-in variables. Reading, setting, incrementing or decrementing a register is a single operation.

A function is identified by any number of UPPER-case characters. The maximum number of functions is set in the config.h file.

The number of registers, function vectors, and user memory can be scaled as desired to fit into a system of any size. For example, on an ESP-8266 board, a typical configuration might be 576 (2626) registers, 5000 functions, and 24K of user ram. In such a system, register names would be in the range of [aa.zz]. For a RPI Pico, I use 576 registers, 1000 functions, and 128K USER RAM. On a Arduino Leonardo, you might configure the system to have 26 registers and functions, and 1K USER. On a Windows or Linux system, I use 17576 registers (2626*26), 5000 functions, and 1MB USER.

There were multiple reasons why to do this:

  1. Many interpreted environments use tokens and a large SWITCH statement in a loop to execute the user's program. In these systems, the "machine code" (i.e. - byte-code ... the cases in the SWITCH statement) are often arbitrarily assigned and are not human-readable, so they have no meaning to the programmer when looking at the code that is actually being executed. Additionally there is a compiler and/or interpreter, often something similar to Forth, that is used to create the programs in that environment. For these enviromnents, there is a steep learning curve ... the programmer needs to learn the user environment and the hundreds or thousands of user functions in the libraries (or "words" in Forth). I wanted to avoid as much as that as possible, and have only one thing to learn: the machine code.

  2. I wanted to be free of the need for a multiple gigabyte tool chain and the edit/compile/run paradigm for developing everyday programs.

  3. I wanted a simple, minimal, and interactive programming environment that I could modify easily.

  4. I wanted an environment that could be easily configured for and deployed to many different types of development boards via the Arduino IDE.

  5. I wanted to be able to use the same environment on my personal computer as well as development boards.

  6. I wanted short commands so there was not a lot of typing needed.

S4 is the result of my work towards those goals.

The implementation of S4

S4 Reference


Opcode Stack Description
+ (a b--n) n: a+b - addition
- (a b--n) n: a-b - subtraction
* (a b--n) n: a*b - multiplication
/ (a b--q) q: a/b - division
^ (a b--r) r: MODULO(a, b)
& (a b--q r) q: DIV(a,b), r: MODULO(a,b) - /MOD


Opcode Stack Description
b& (a b--n) n: a AND b
b| (a b--n) n: a OR b
b^ (a b--n) n: a XOR b
b~ (a--b) b: NOT a (ones-complement, e.g - 101011 => 010100)


Opcode Stack Description
# (a--a a) Duplicate TOS (DUP)
\ (a b--a) Drop TOS (DROP)
$ (a b--b a) Swap top 2 stack items (SWAP)
% (a b--a b a) Push 2nd (OVER)
_ (a--b) b: -a (Negate)
xA (a--b) b: abs(a) (Absolute)


Opcode Stack Description
c@ (a--n) Fetch BYTE n from address a
@ (a--n) Fetch CELL n from address a
c! (n a--) Store BYTE n to address a
! (n a--) Store CELL n to address a


- Register names are 1 to 3 UPPERCASE characters: [rA..rZZZ]
- LOCALS: S4 provides 10 locals per call [r0..r9].
- The number of registers is controlled by the NUM_REGS #define in "config.h".
- Register 'rI' is the FOR Loop index special

Opcode Stack Description
rXXX (--n) read register/local XXX
sXXX (n--) set register/local XXX to n
iXXX (--) increment register/local XXX
dXXX (--) decrement register/local XXX
nXXX (--) increment register/local XXX by the size of a cell (next cell)


- Word/Function names are ProperCase identifiers.
- They must begin with [A..Z], and can include lowercase characters [a..z].
- The number of words is controlled by the NUM_FUNCS #define in "config.h"
- NUM_FUNCS needs to be a power of 2.
- If a word has already been defined, S4 prints "-redef-".

Opcode Stack Description
: (--) Define word/function. Copy chars to (HERE++) until closing ';'.
ABC (--) Execute/call word/function ABC
; (--) Return: PC = rpop()
- Returning while inside of a loop is not supported; break out of the loop first.
- Use '|' to break out of a FOR or WHILE loop.


Opcode Stack Description
. (N--) Output N as decimal number.
, (N--) Output N as character (Forth EMIT).
" (?--?) Output characters until the next '"'.
- %d outputs TOS as an integer (eg - 123"x%dx" outputs x123x)
- %c outputs TOS as a character (eg - 65"x%cx" outputs xAx)
- %n outputs CR/LF
- % output (eg - "x%" %% %"x" outputs x" % "x)
0..9 (--n) Scan DECIMAL number. For multiple numbers, separate them by space (47 33).
- To enter a negative number, use "negate" (eg - 490_).
hNNN (--h) h: NNN as a HEX number (0-9, A-F, UPPERCASE only).
'x (--n) n: the ASCII value of x
XXX (a--a b) Copies XXX to address a, b is the next address after the NULL terminator.
xZ (a--) Output the NULL terminated string starting at address a.
xK? (--f) f: 1 if a character is waiting in the input buffer, else 0.
xK@ (--c) c: next character from the input buffer. If no character, wait.


Opcode Stack Description
< (a b--f) f: (a < b) ? 1 : 0;
= (a b--f) f: (a = b) ? 1 : 0;
> (a b--f) f: (a > b) ? 1 : 0;
~ (n -- f) f: (a = 0) ? 1 : 0; (Logical NOT)
( (f--) IF: if (f != 0), continue into '()', else skip to matching ')'
[ (F T--) FOR: start a For/Next loop. if (T < F), swap T and F
rI (--n) n: the index of the current FOR loop
] (--) NEXT: increment index (rI) and restart loop if (rI <= T)
NOTE: A FOR loop always runs at least one iteration.
- It can be put it inside a '()' to keep it from running.
{ (f--f) WHILE: if (f == 0) skip to matching '}'
} (f--f?) WHILE: if (f != 0) jump to matching '{', else drop f and continue
uL (--) Unwind the loop stack. Use with ';'.eg = (uL;)
uF (--) Exit FOR. Continue after the next ']'.
uW (--) Exit WHILE. Continue after the next '}'.
uC (--) Continue. Jump to the beginning of the current loop.


Opcode Stack Description
xBO (n m--fh) File: Block Open (block-nnn.s4, m: 0=>read, 1=write)
xBR (n a sz--) File: Block Read (block-nnn.s4, max sz bytee).
xBW (n a sz--) File: Block Write (block-nnn.s4, sz bytes).
xBL (n--) File: Load code from file (block-nnn.src). This can be nested.
xFL (--) File: Load code from ./Code.S4.
xFS (--) File: Save code to ./Code.S4.
xFO (n m--h) File: Open - n: file name, m: mode, h: file handle (0 means not opened)
xFC (h--) File: Close - h: file handle
xFD (n--) File: Delete - n: file name
xFR (h--c f) File: Read - h: file handle, c: char, n: success?
xFW (c h--f) File: Write - h: file handle, c: char, n: success?
NOTE: File operations are enabled by #define FILES
xPI (p--) Arduino: Pin Input (pinMode(p, INPUT))
xPU (p--) Arduino: Pin Pullup (pinMode(p, INPUT_PULLUP))
xPO (p--) Arduino: Pin Output (pinMode(p, OUTPUT)
xPRA (p--n) Arduino: Pin Read Analog (n = analogRead(p))
xPRD (p--n) Arduino: Pin Read Digital (n = digitalRead(p))
xPWA (n p--) Arduino: Pin Write Analog (analogWrite(p, n))
xPWD (n p--) Arduino: Pin Write Digital (digitalWrite(p, n))
xR (n--r) r: a random number in the range [0..(n-1)]
NB: if n=0, r is the entire 32-bit random number
xT (--n) Milliseconds (Arduino: millis(), Windows: GetTickCount())
xN (--n) Microseconds (Arduino: micros(), Windows: N/A)
xW (n--) Wait (Arduino: delay(), Windows: Sleep())
xIAF (--a) INFO: a: address of first function vector
xIAH (--a) INFO: a: address of HERE variable
xIAR (--a) INFO: a: address of first register vector
xIAS (--a) INFO: a: address of beeginning of system structure
xIAU (--a) INFO: a: address of beeginning of user area
xIF (--n) INFO: n: number of words/functions
xIH (--n) INFO: n: value of HERE variable
xIR (--n) INFO: n: number of registers
xIU (--n) INFO: n: number of bytes in the USER area
xSR (--) S4 system reset
xQ (--) PC: Exit S4


; To enter a comment:
0( here is a comment )
; here is another comment

; if (c) { print("Yes") } else { print("No") }
rC #("Yes")~("No")

; x = (a == b) ? c : d;
rA rB=#(rC$)~(rD)sX;

; To make sure F < T
; S4 code: %%>($)
; Forth equivalent: OVER OVER > IF SWAP THEN
; C equivalent: if (f > t) { int x = f; f = t; t = x; }

; To do something (in this case, execute LP) N times:
1 rN[LP]

; Increment Register x, decrement register Y
iX dY

; To print numbers from F to T:
; S4 code: rF rT[rI." "]
; Forth equivalent: F @ T @ FOR I . NEXT
; C equivalent: for (int i = F; i <= T; i)) { printf("%d ", i); }

; One way to copy N bytes from A to B (n f t--)
N A B s2 s1 1[r1 c@ r2 c! i1 i2]

; One way to copy N CELLS from A to B (N A B--)
N A B s2 s1 1[r1 @ r2 ! n1 n2]

; A simple benchmark for a 100 million FOR loop:
1000#* 100* xT$ 1[] xT$-." ms"

; A simple benchmark for a 100 million WHILE loop:
1000#* 100* xT$ {1-} xT$-." ms"

; Define a word to display the currently defined code:
:CODE xIAU xIH 1-[rI c@ #,';=(rI 1+ c@': =(13,10,))];

; Exit S4:

Adding new functionality to S4:

In run(start) (S4.cpp), in the "switch" statement, there are cases available for direct support of new instructions, mostly unused lowercase characters. New functionality can be put there by comandeering one of the /* FREE */ cases.

S4 also has "extended" instructions. These are triggered by the 'x' case. All extended instructions begin with 'x'. For extended instructions, function doExt() is called. It behaves in a similar way to run(), by setting 'ir = *(pc++)' and a switch statement. The "default:" case calls an external function, doCustom(ir, pc). That is where I usually put system-specific functionality; (eg - pin manipulation for Arduino Boards). Function doCustom(ir, pc) needs to return the address where pc should continue.

As an example, I will add Gamepad/Joystick simulation to S4 as extended instructions. This example uses the HID-Project from NicoHood (

Import the library into Arduino using "Import Library".

In the doCustom(ir, pc) function (S4.ino), add a new case for ir. In this case, I am adding case 'G'.

addr doCustom(byte ir, addr pc) {
    ir = *(pc++);
    switch (ir) {
    case 'G': pc = doGamePad(ir, pc);        break;
    return pc;

Then it is a simple matter of implementing doGamePad(ir, pc):

\#ifdef __GAMEPAD__
\#include <HID-Project.h>
\#include <HID-Settings.h>
addr doGamePad(byte ir, addr pc) {
    ir = *(pc++);
    switch (ir) {
    case 'X': Gamepad.xAxis(pop());          break;
    case 'Y': Gamepad.yAxis(pop());          break;
    case 'P':;          break;
    case 'R': Gamepad.release(pop());        break;
    case 'A': Gamepad.dPad1(pop());          break;
    case 'B': Gamepad.dPad2(pop());          break;
    case 'L': Gamepad.releaseAll();          break;
    case 'W': Gamepad.write();               break;
        isError = 1;
    return pc;
addr doGamePad(addr pc) { printString("-noGamepad-"); return pc; }

The last thing to do is #define __GAMEPAD__. This is best done in "config.h".

#elif __BOARD__ == XIAO
  #define __GAMEPAD__
  #define STK_SZ          8
  #define LSTACK_SZ       4
  #define USER_SZ        (22*1024)
  #define NUM_REGS       (26*26)

Now in S4, you can do things like:

3 xGP xGW 0(Press button 3)
3 xGR xGW 0(Release button 3)

Tags: vm  

Last modified 08 March 2024