sprout-bytecode-spec
Sprout Bytecode Specification
The instruction set for a self-modifying bytecode VM where code is data is text.
Philosophy
Everything compiles down to: pause → fetch → (call out) → splice → continue
The stream IS the program. It's also the data. It's also the library. Everything is bytes.
Core Principle: No Magic
The VM has zero built-in knowledge about the domain.
The VM knows how to:
- Read bytes from a stream
- Do arithmetic (increment, modulo, compare)
- Write bytes back
- Follow peg indirection
The VM does NOT know:
- What "shape" means
- What "circle" looks like
- That shapes cycle in a particular order
- What a "sprout" is
- What "harvest" means
Everything comes from three sources:
- Stream - the bytecode being executed
- Fences - tools/operations loaded at runtime
- Pegboard - data loaded from environment configuration
If it's not in the configuration, it doesn't exist. Period.
This means:
- A level defines its own schema (what attributes exist)
- A level defines its own enums (what values each attribute can have)
- A level defines its own cycles (what order values rotate)
- The VM just executes bytes according to simple rules
The same VM can run a shape-matching game, a color-mixing puzzle, a document workflow, or an API orchestration - it doesn't know or care which.
Stream Layout
┌─────────────────────────┬────┬───────────────────────┬──────────────────┐
│ INSTRUCTION AREA │ FF │ PEGBOARD │ DATA AREA │
├─────────────────────────┼────┼───────────────────────┼──────────────────┤
│ [program bytecode...] │STOP│ 256 × 3-byte pegs │ [text] [configs] │
│ │ │ [how][arg1][arg2] │ │
│ 4-byte picks: │ @0 │ @1025 to @1793 │ @1794+ │
│ [op][peg1][peg2][peg3] │ │ │ │
└─────────────────────────┴────┴───────────────────────┴──────────────────┘
Pick format: 4 bytes per instruction (not 3!)
Board size (16×16 grid): 1024 bytes (256 cells × 4 bytes)
Pegboard: 768 bytes (256 pegs × 3 bytes each)
Total before data area: 1024 (board) + 1 (FF) + 768 (pegboard) = 1793 bytes
Environment Bootstrap
Before any program runs, the environment loads configuration into the pegboard. This is where all domain knowledge lives.
The Six Layers
LAYER 1: Labels (what attributes exist)
┌──────┬─────────────────────────────┐
│ Peg │ Contents │
├──────┼─────────────────────────────┤
│ 01 │ "SHAPE" FF │
│ 02 │ "COLOR" FF │
└──────┴─────────────────────────────┘
FF-terminated ASCII strings. These are just names.
The VM doesn't know "SHAPE" means anything special.
LAYER 2: Values (what each attribute can be)
┌──────┬─────────────────────────────┐
│ Peg │ Contents │
├──────┼─────────────────────────────┤
│ 03 │ "red" FF │
│ 04 │ "blue" FF │
│ 05 │ "yellow" FF │
│ 06 │ "green" FF │
│ 07 │ "circle" FF │
│ 08 │ "square" FF │
│ 09 │ "triangle" FF │
│ 0A │ "star" FF │
└──────┴─────────────────────────────┘
More FF-terminated ASCII. Just strings in memory.
LAYER 3: Cycles (rotation order for each enum)
┌──────┬─────────────────────────────┐
│ Peg │ Contents │
├──────┼─────────────────────────────┤
│ 10 │ 03 04 05 06 FF │
│ │ (colors: red→blue→yellow→green)
│ 11 │ 07 08 09 0A FF │
│ │ (shapes: circle→square→triangle→star)
└──────┴─────────────────────────────┘
Lists of peg references, FF-terminated.
The VM just sees: "here's a list of bytes, cycle through them."
LAYER 4: Slot Bindings (which cycle belongs to which attribute)
┌──────┬─────────────────────────────┐
│ Peg │ Contents │
├──────┼─────────────────────────────┤
│ 12 │ 01 11 FF │
│ │ (label@01 "SHAPE" uses cycle@11)
│ 13 │ 02 10 FF │
│ │ (label@02 "COLOR" uses cycle@10)
└──────┴─────────────────────────────┘
Binds a label peg to a cycle peg. Still just bytes.
LAYER 5: Contexts (complete attribute states)
┌──────┬───────────────────────────────────────────┐
│ Peg │ Contents │
├──────┼───────────────────────────────────────────┤
│ 20 │ 01 07 02 03 FF │
│ │ (SHAPE=circle, COLOR=red as peg refs) │
│ 21 │ 01 08 02 04 FF │
│ │ (SHAPE=square, COLOR=blue) │
└──────┴───────────────────────────────────────────┘
Key-value pairs as peg references. Format: [label][value][label][value]...FF
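A minimal sketch of how a Layer 5 context could be decoded back into names, assuming the resolved peg contents are held in a plain Python dict. The names `peg_text`, `decode_context`, `describe`, and `PEGS` are hypothetical and not taken from vm.py.

```python
TERMINATOR = 0xFF

def peg_text(raw: bytes) -> str:
    """Decode an FF-terminated ASCII peg into a Python string."""
    return raw.split(bytes([TERMINATOR]), 1)[0].decode("ascii")

def decode_context(raw: bytes) -> dict[int, int]:
    """Parse [label][value]...FF into {label_peg: value_peg}."""
    pairs: dict[int, int] = {}
    i = 0
    while i < len(raw) and raw[i] != TERMINATOR:
        if i + 1 >= len(raw) or raw[i + 1] == TERMINATOR:
            raise ValueError("label without a value")
        pairs[raw[i]] = raw[i + 1]
        i += 2
    return pairs

def describe(pegs: dict[int, bytes], context_peg: int) -> dict[str, str]:
    """Follow every peg reference in a context down to its string."""
    refs = decode_context(pegs[context_peg])
    return {peg_text(pegs[k]): peg_text(pegs[v]) for k, v in refs.items()}

# Assumed in-memory view of a few pegs from layers 1, 2 and 5.
PEGS = {
    0x01: b"SHAPE\xff",
    0x02: b"COLOR\xff",
    0x03: b"red\xff",
    0x07: b"circle\xff",
    0x20: bytes([0x01, 0x07, 0x02, 0x03, 0xFF]),
}

print(describe(PEGS, 0x20))
```

Only the presentation side would ever call something like `describe`; the VM itself keeps working on the raw peg references.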
LAYER 6: Indices and Counts (mutable state)
┌──────┬─────────────────────────────┐
│ Peg │ Contents │
├──────┼─────────────────────────────┤
│ 30 │ 01 FF │
│ │ (count = 1) │
│ 31 │ 00 │
│ │ (shape cycle index = 0) │
│ 32 │ 00 │
│ │ (color cycle index = 0) │
└──────┴─────────────────────────────┘
Single bytes for indices, FF-terminated for counts.
Bootstrap is the First Fetch
PAUSE → "I need an environment"
FETCH → load level configuration into pegboard
SPLICE → layers 1-6 now exist in memory
CONTINUE → VM can execute, all knowledge is indirect
A different level loads different configuration. Same VM, different game.
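The bootstrap step can be sketched as a function that turns a level configuration into peg contents. This is a sketch under assumptions: the config shape (`strings`, `lists`, `bytes`) and the names `bootstrap`, `ff_string`, and `ff_list` are hypothetical, and the result models resolved peg contents rather than the 3-byte [how][arg1][arg2] encoding.

```python
def ff_string(text: str) -> bytes:
    """FF-terminated ASCII, as used by layers 1 and 2."""
    return text.encode("ascii") + b"\xff"

def ff_list(refs: list[int]) -> bytes:
    """FF-terminated list of peg references, as used by layers 3 to 5."""
    return bytes(refs) + b"\xff"

def bootstrap(config: dict) -> dict[int, bytes]:
    """Load a level configuration into an (assumed) peg-contents mapping."""
    pegs: dict[int, bytes] = {}
    for peg, text in config.get("strings", {}).items():
        pegs[peg] = ff_string(text)
    for peg, refs in config.get("lists", {}).items():
        pegs[peg] = ff_list(refs)
    for peg, value in config.get("bytes", {}).items():
        pegs[peg] = bytes([value])  # single-byte index, no terminator
    return pegs

# Hypothetical config mirroring the six layers above.
LEVEL = {
    "strings": {0x01: "SHAPE", 0x02: "COLOR", 0x03: "red", 0x04: "blue",
                0x07: "circle", 0x08: "square"},
    "lists": {0x10: [0x03, 0x04, 0x05, 0x06], 0x11: [0x07, 0x08, 0x09, 0x0A],
              0x12: [0x01, 0x11], 0x13: [0x02, 0x10],
              0x20: [0x01, 0x07, 0x02, 0x03], 0x30: [0x01]},
    "bytes": {0x31: 0x00, 0x32: 0x00},
}

pegs = bootstrap(LEVEL)
```

Swapping `LEVEL` for a different dict is the whole of "loading a different game" in this sketch; `bootstrap` never inspects what the strings say.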
Two Kinds of Holes
00 and 70 are thermodynamic inverses.
| Opcode | Name | Direction | Role | Resolves To |
|---|---|---|---|---|
| 00 | Virtual Fence | IN | Understanding | DATA only |
| 70 | Physical Fence | OUT | Agency | ACTIONS only |
00 = Virtual Fence = Understanding
Format: [00][fence_addr][reserved]
- Data flows INTO the system
- Reading the environment ("what is the switch state?")
- Passive observation, reduces uncertainty
- Can only resolve to values/parameters
- Gets REPLACED by resolved bytes (interpolation)
- Like ${variable} in a template
70 = Physical Fence = Agency
Format: [70][fence_id]
- Work flows OUT of the system
- Acting on the environment ("change this color", "send API request")
- Active intervention, spends energy
- Can only invoke side effects
- STAYS in place, splices result after (invocation)
- Like a function call
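The difference between "gets replaced" and "stays in place" can be sketched on a mutable byte stream. This is an illustration only: it follows the byte formats exactly as written in this section (3 bytes for 00, 2 bytes for 70), and the callbacks `read` and `act` are hypothetical stand-ins for whatever the environment provides.

```python
from typing import Callable

def resolve_virtual(stream: bytearray, pos: int,
                    read: Callable[[int], bytes]) -> None:
    """00: the hole [00][fence_addr][reserved] is REPLACED by resolved data."""
    if stream[pos] != 0x00:
        raise ValueError("not a virtual fence")
    fence_addr = stream[pos + 1]
    stream[pos:pos + 3] = read(fence_addr)  # interpolation

def invoke_physical(stream: bytearray, pos: int,
                    act: Callable[[int], bytes]) -> None:
    """70: [70][fence_id] STAYS; the result is spliced in right after it."""
    if stream[pos] != 0x70:
        raise ValueError("not a physical fence")
    fence_id = stream[pos + 1]
    stream[pos + 2:pos + 2] = act(fence_id)  # invocation, then splice

virtual = bytearray([0x00, 0x05, 0x00, 0xFF])
resolve_virtual(virtual, 0, lambda addr: b"\x07")

physical = bytearray([0x70, 0x01, 0xFF])
invoke_physical(physical, 0, lambda fence_id: b"\x30\x11\x31")
```

After the calls, `virtual` no longer contains the 00 hole, while `physical` still starts with `70 01` and has the returned bytes spliced in behind it.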
Information Flow
Environment ──00──▶ System ──70──▶ Environment
           (read)          (write)
       understanding       agency
Execution Pattern
[resolve 00s] → [execute ops] → [hit 70s]
observe → orient/decide → act
This is OODA. This is the sensorimotor loop. This is how cognition works.
The opcode position tells you which way information flows. That's the type system without types.
"The abstraction is clean when it works the same for a child's game as it does for their mind."
Instruction Set
All instructions are 4 bytes: [op][peg1][peg2][peg3]
Every argument is a peg reference. The VM resolves pegs, does math, writes back. No magic.
50 05 00 ; JUMP to address in peg 05 (second peg unused)
FF ; STOP has no arguments at all (1 byte exception)
Fence 0
| HOW | Hex | Cache Semantics |
|---|---|---|
| INLINE | 00 | L1 hit - data is right here |
| ABSOLUTE | 50 | L2 lookup - known location |
| RELATIVE | 51 | L2 lookup - computed from context |
| INDIRECT | 52 | L3 lookup - pointer chase |
| CONTEXT | 53 | Read from runtime context slot |
| FENCE | 54 | Virtual fence - resolve via call |
| UNBOUND | FF | Cache miss - must CALL to resolve |
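A sketch of how the HOW byte might be decoded from a 3-byte peg. The hex values come from the table above; the names `How`, `Peg`, and `read_peg`, and the choice to model the pegboard as one flat 768-byte buffer, are assumptions for illustration.

```python
from enum import IntEnum
from typing import NamedTuple

class How(IntEnum):
    INLINE = 0x00
    ABSOLUTE = 0x50
    RELATIVE = 0x51
    INDIRECT = 0x52
    CONTEXT = 0x53
    FENCE = 0x54
    UNBOUND = 0xFF

class Peg(NamedTuple):
    how: How
    arg1: int
    arg2: int

PEG_SIZE = 3     # [how][arg1][arg2]
PEG_COUNT = 256

def read_peg(pegboard: bytes, index: int) -> Peg:
    """Decode peg `index` from a flat pegboard buffer."""
    if not 0 <= index < PEG_COUNT:
        raise IndexError("peg index out of range")
    base = index * PEG_SIZE
    how, arg1, arg2 = pegboard[base:base + PEG_SIZE]
    return Peg(How(how), arg1, arg2)

# Example board: every peg UNBOUND (an assumption), then peg 1 set to INDIRECT.
board = bytearray([0xFF, 0x00, 0x00] * PEG_COUNT)
board[3:6] = bytes([0x52, 0x11, 0x00])
```

`How(how)` raises `ValueError` for a byte that is not in the table, which is one way to surface a corrupted pegboard early.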
Opcode Details
SPRING (0x01) - Emitter
01 [context_peg] [count_peg]
context_peg → resolves to context bytes (e.g., peg 20 → 01 07 02 03 FF)
count_peg → resolves to emission count (e.g., peg 30 → 01 FF = 1)
Operation:
1. Resolve context_peg → get context template
2. Resolve count_peg → count until FF → N
3. Emit N sprouts, each with a copy of context
Example: 01 20 30
peg 20 = [SHAPE=circle, COLOR=red]
peg 30 = [01 FF] = count 1
Result: emit 1 sprout with {SHAPE: circle, COLOR: red}
POKE (0x20) - Write Context
20 [slot_peg] [value_peg]
slot_peg → binding peg that resolves to a label peg (e.g., 12 → 01 → "SHAPE")
value_peg → resolves to value peg (e.g., 08 → "square")
Operation:
1. Find slot_peg's label in current context
2. Replace its value reference with value_peg
3. Context updated
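The three steps above can be sketched as a byte replacement. This assumes slot_peg is a Layer 4 binding whose first byte is the label peg, and that the context is the [label][value]...FF sequence from Layer 5; `poke` and the `pegs` dict are hypothetical names.

```python
def poke(context: bytes, pegs: dict[int, bytes],
         slot_peg: int, value_peg: int) -> bytes:
    """Return a copy of `context` with the slot's value reference replaced."""
    label = pegs[slot_peg][0]          # binding [label][cycle] FF -> label peg
    out = bytearray(context)
    i = 0
    while i + 1 < len(out) and out[i] != 0xFF:
        if out[i] == label:
            out[i + 1] = value_peg     # replace one byte, nothing else
            return bytes(out)
        i += 2
    raise KeyError(f"label peg {label:#04x} not present in context")

pegs = {0x12: bytes([0x01, 0x11, 0xFF])}            # SHAPE binding
before = bytes([0x01, 0x07, 0x02, 0x03, 0xFF])      # SHAPE=circle, COLOR=red
after = poke(before, pegs, 0x12, 0x0A)              # 20 12 0A
```

Raising on a missing label is a choice made for the sketch; the spec does not say what POKE does when the slot is absent from the context.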
Example: 20 12 0A
peg 12 = binding for SHAPE slot
peg 0A = "star"
Result: context now has SHAPE=star
The VM doesn't know what "star" means. It just replaced a byte.
NEXT (0x30) - Cycler
30 [cycle_peg] [index_peg]
cycle_peg → resolves to cycle list (e.g., peg 11 → 07 08 09 0A FF)
index_peg → resolves to current index byte (e.g., peg 31 → 00)
Operation:
1. Read cycle from cycle_peg, count until FF → length (4)
2. Read current index from index_peg → 0
3. new_index = (0 + 1) % 4 → 1
4. Write 01 back to index_peg
Example: 30 11 31
peg 11 = [07 08 09 0A FF] (shape cycle)
peg 31 = [00] (current index)
Result: peg 31 becomes [01]
To get current value: cycle[index] → peg 08 → "square"
The VM just did: read list, read index, add 1, modulo length, write back.
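As a sketch, assuming resolved peg contents live in a dict of mutable `bytearray` values (the name `op_next` is hypothetical):

```python
def op_next(pegs: dict[int, bytearray], cycle_peg: int, index_peg: int) -> int:
    """Advance the index stored in index_peg by one, wrapping at cycle length."""
    length = pegs[cycle_peg].index(0xFF)   # count entries until FF
    if length == 0:
        raise ValueError("empty cycle")
    new_index = (pegs[index_peg][0] + 1) % length
    pegs[index_peg][0] = new_index         # write back in place
    return new_index

pegs = {
    0x11: bytearray([0x07, 0x08, 0x09, 0x0A, 0xFF]),  # shape cycle
    0x31: bytearray([0x00]),                          # current index
}
first = op_next(pegs, 0x11, 0x31)            # 30 11 31
current_value = pegs[0x11][pegs[0x31][0]]    # cycle[index]
```

Calling `op_next` three more times would wrap the index back to 0, since the cycle has four entries.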
No magic. Pure math on bytes.
CHECK (0x40) - Gate
40 [slot_peg] [expected_peg]
slot_peg → which slot to check (binding peg)
expected_peg → expected value reference
Operation:
1. Look up slot_peg in current context → get current value peg
2. Compare current value peg byte with expected_peg byte
3. Match → continue execution
4. No match → blocked, execution stops
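A sketch of the gate as a pure function, assuming the same binding and context layouts as the bootstrap layers. `check` is a hypothetical name, and returning False for a slot that is missing from the context is an assumption.

```python
def check(context: bytes, pegs: dict[int, bytes],
          slot_peg: int, expected_peg: int) -> bool:
    """True if the slot's current value reference equals expected_peg."""
    label = pegs[slot_peg][0]              # binding -> label peg
    i = 0
    while i + 1 < len(context) and context[i] != 0xFF:
        if context[i] == label:
            return context[i + 1] == expected_peg   # one byte comparison
        i += 2
    return False                           # slot not in context: treat as blocked

pegs = {0x12: bytes([0x01, 0x11, 0xFF])}            # SHAPE binding
circle_red = bytes([0x01, 0x07, 0x02, 0x03, 0xFF])
star_red = bytes([0x01, 0x0A, 0x02, 0x03, 0xFF])

passes = check(circle_red, pegs, 0x12, 0x07)        # 40 12 07
blocked = check(star_red, pegs, 0x12, 0x07)
```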
Example: 40 12 07
peg 12 = SHAPE binding
peg 07 = "circle"
If context has SHAPE=circle (value peg 07) → pass
If context has SHAPE=star (value peg 0A) → blocked
Just byte comparison. The VM doesn't know circles from stars.
Direction Opcodes (0x80-0x83)
80 [next_peg] [_] RIGHT
81 [next_peg] [_] DOWN
82 [next_peg] [_] LEFT
83 [next_peg] [_] UP
Operation:
1. Move in direction
2. If destination accepts entry → continue there
3. If blocked → stop
The opcode itself encodes routing:
- RIGHT (80) cell accepts from LEFT
- DOWN (81) cell accepts from UP
- etc.
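One way to read this rule in code: a directional cell accepts a sprout only if the sprout is already travelling in the cell's own direction. The sketch assumes x grows to the right and y grows downward, and it fills in the "etc." cases by symmetry (LEFT accepts from RIGHT, UP accepts from DOWN). `DELTAS`, `move`, and `accepts_entry` are hypothetical names, and only cells holding opcodes 80-83 are covered.

```python
# Opcode -> (dx, dy) on the grid; y increasing downward is an assumption.
DELTAS = {
    0x80: (1, 0),    # RIGHT
    0x81: (0, 1),    # DOWN
    0x82: (-1, 0),   # LEFT
    0x83: (0, -1),   # UP
}

def move(x: int, y: int, opcode: int) -> tuple[int, int]:
    """Coordinates of the cell a sprout reaches when leaving (x, y)."""
    dx, dy = DELTAS[opcode]
    return x + dx, y + dy

def accepts_entry(dest_opcode: int, travel: tuple[int, int]) -> bool:
    """A RIGHT cell accepts from LEFT, i.e. from a sprout moving right."""
    return DELTAS.get(dest_opcode) == travel

destination = move(3, 3, 0x81)                      # DOWN from (3, 3)
ok = accepts_entry(0x81, DELTAS[0x81])              # DOWN cell, entered from UP
refused = accepts_entry(0x80, DELTAS[0x81])         # RIGHT cell, entered from UP
```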
No metadata needed. The opcode IS the routing rule.
SINK (0xFE) - Collector
FE [context_peg] [count_peg]
context_peg → expected context pattern
count_peg → how many needed
Operation:
1. Arriving sprout's context is normalized (keys sorted)
2. Resolve context_peg → expected bytes, also normalized
3. Byte-compare the two
4. Match → success, decrement count
5. No match → rejected
6. When count reaches 0 → harvest complete
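The normalize-then-compare steps can be sketched as follows. The assumptions: contexts are [label][value]...FF sequences, "keys sorted" means sorting pairs by label byte, and the remaining count is tracked as a plain integer. `normalize` and `collect` are hypothetical names.

```python
def normalize(context: bytes) -> bytes:
    """Sort the [label][value] pairs by label and re-terminate with FF."""
    body = context[:context.index(0xFF)]
    pairs = sorted(zip(body[0::2], body[1::2]))
    return bytes(b for pair in pairs for b in pair) + b"\xff"

def collect(arriving: bytes, expected: bytes, remaining: int) -> int:
    """Return the new remaining count: decremented on match, unchanged otherwise."""
    if remaining > 0 and normalize(arriving) == normalize(expected):
        return remaining - 1
    return remaining

expected = bytes([0x01, 0x08, 0x02, 0x04, 0xFF])    # SHAPE=square, COLOR=blue
same_reordered = bytes([0x02, 0x04, 0x01, 0x08, 0xFF])
different = bytes([0x01, 0x07, 0x02, 0x04, 0xFF])   # SHAPE=circle

after_match = collect(same_reordered, expected, 1)  # 0 means harvest complete
after_reject = collect(different, expected, 1)
```

Normalizing both sides means two contexts with the same pairs in a different order compare equal, which a raw byte compare would miss.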
Example: FE 21 30
peg 21 = expected context [SHAPE=square, COLOR=blue]
peg 30 = count [01 FF] = 1
Sprout arrives with matching context → success
Comparison is just memcmp on normalized byte sequences.
The VM has no idea what's being compared.
STOP (0xFF) - Terminator
FF
No arguments. 1 byte. Always blocks.
Used for walls, boundaries, end of program.
Also marks the boundary between instruction area and pegboard.
SPRING/SINK Symmetry
These are thermodynamic inverses:
| | SPRING (01) | SINK (FE) |
|---|---|---|
| Direction | Emits OUT | Collects IN |
| Context | Template to copy | Pattern to match |
| Count | How many to create | How many needed |
| Format | [context_peg][count_peg] | [context_peg][count_peg] |
Same structure, opposite flow. The game loop:
- SPRING emits sprouts with context
- Sprouts traverse the grid, context transforms
- SINK collects sprouts if context matches
Type System
Three first-class types for the cognitive stack:
| Type | Role | Representation |
|---|---|---|
| Fence | Code | Capability reference, late-bound |
| Dict | Data | Context, key-value pairs |
| ByteStr | Text | Serialized bytes, manipulable |
CODE  ←→  DATA  ←→  TEXT
fence     dict      bytestr
All three interconvert. That's the homoiconic property.
Homoiconic Nature
Code is data is text:
- Store routines in data area as bytes
- SPLICE them into instruction area to run
- Fences can return bytecode
- Bytecode can call fences
- Everything round-trips through serialization
# A fence that returns code based on context
def dynamic_routine(ctx):
    if ctx.get('mode') == 'fast':
        return ctx, bytes([0x30, 0x11, 0x31, 0xFF])  # NEXT cycle@11 idx@31
    else:
        return ctx, bytes([0x20, 0x12, 0x07, 0xFF])  # POKE slot@12 val@07
The fence reads context, decides what bytecode to emit, and that bytecode gets spliced into the stream and executed. Self-modifying code as a feature, not a bug.
Cache Semantics
The three-level hole structure maps directly to cache behavior:
| HOW | Hex | Cache Semantics |
|---|---|---|
| INLINE | 00 | L1 hit - data is right here |
| ABSOLUTE | 50 | L2 lookup - known location |
| RELATIVE | 51 | L2 lookup - computed from context |
| INDIRECT | 52 | L3 lookup - pointer chase |
| UNBOUND | FF | Cache miss - must CALL to resolve |
LOOKUP → peg indirection: "where do I look?"
FETCH → CALL fires if unbound: "go get it"
SPLICE → result fills the slot: "now it's here"
CONTINUE → execution resumes with resolved value
This is why KV-cache in transformers, DNS resolution, and hippocampal memory consolidation all work the same way. Not similar - identical algorithm forced by the constraints.
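The LOOKUP → FETCH → SPLICE → CONTINUE sequence can be sketched as a fill-on-miss lookup. This is an illustration under assumptions: unbound pegs are modelled as missing dict keys, and `resolve` and `call` are hypothetical names rather than the vm.py API.

```python
from typing import Callable

def resolve(pegs: dict[int, bytes], peg_id: int,
            call: Callable[[int], bytes]) -> bytes:
    """Return the peg's value, calling out only when the peg is unbound."""
    value = pegs.get(peg_id)        # LOOKUP: where do I look?
    if value is None:               # unbound: cache miss
        value = call(peg_id)        # FETCH: go get it
        pegs[peg_id] = value        # SPLICE: now it's here
    return value                    # CONTINUE with the resolved value

calls: list[int] = []

def fetch_from_environment(peg_id: int) -> bytes:
    """Stand-in for a fence call; records how often it is invoked."""
    calls.append(peg_id)
    return bytes([0x01, 0x07, 0x02, 0x03, 0xFF])

pegs: dict[int, bytes] = {}
first = resolve(pegs, 0x20, fetch_from_environment)
second = resolve(pegs, 0x20, fetch_from_environment)
```

The second call returns the same bytes without touching the environment, because the first call spliced the result into the pegboard.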
Fence 0
Visual Mapping (Game)
| Bytecode | Game Element |
|---|---|
| NOP (00) | Empty plot, interactive anchor |
| SPRING (01) | Source, emitter |
| POKE (20) | Stamper (sets value) |
| NEXT/BACK (30/31) | Cycler (rotates value) |
| CHECK/SKIP (40/41) | Gate (pass/block) |
| JUMP (50) | Warp pad |
| SPLICE (60) | Construction zone |
| CALL (70) | Pipe/tunnel (subroutine) |
| 80-83 | Directional vines |
| SINK (FE) | Harvest point |
| STOP (FF) | Wall/rock |
Architecture Stack
┌─────────────────────────────────────────┐
│ PRESENTATION LAYER (UI) │
│ - Renders tiles, icons, arrows │
│ - Shows human-readable attribute names │
│ - Visual mode / expression mode │
└─────────────────────────────────────────┘
│
│ reads state
▼
┌─────────────────────────────────────────┐
│ STATE OBJECT │
│ - Current context (attribute values) │
│ - Pegboard contents │
│ - Program counter │
└─────────────────────────────────────────┘
│
│ modified by
▼
┌─────────────────────────────────────────┐
│ BYTECODE VM │
│ - Executes 4-byte pick instructions │
│ - Resolves peg indirection │
│ - Pure math on bytes │
│ - Zero domain knowledge │
└─────────────────────────────────────────┘
│
│ initialized from
▼
┌─────────────────────────────────────────┐
│ LEVEL CONFIGURATION │
│ - Schema (labels, values, cycles) │
│ - Initial pegboard state │
│ - Grid layout, constraints │
└─────────────────────────────────────────┘
Python API Requirements
Core Engine
# Bootstrap environment from configuration
env = Environment.from_config("level-01.yaml")
# Load bytecode program
vm = VM(env)
vm.load(program_bytes)
# Step debugger
while not vm.halted:
    state = vm.step()  # Execute one instruction
    print(f"PC: {state.pc}, OP: {state.op}")
    print(f"Pegboard: {state.pegs}")
    # Can poke into state here
result = vm.result()
Level Operations
# Load level (schema + layout + initial state)
level = Level.load("p1-01.yaml")
game = Game(level)
# Gameplay loop
game.spawn_sprouts()
while not game.complete:
    state = game.tick()
    print(state.grid)
    print(state.active_sprouts)
REPL Modes
| Mode | Commands | Purpose |
|---|---|---|
| env | dump, peek, poke | Inspect/edit pegboard |
| ctx | show, set | Inspect/edit context |
| grid | set x,y op, clear | Edit grid cells |
| level | load, save, play | Level management |
| debug | step, run, break | Execution control |
Limits (Current)
- 256 opcodes (single byte)
- 256 pegs per pegboard
- 256 values per cycle
- 768 bytes pegboard area
Extension Paths
If limits become constraining:
- 16-bit peg addresses (4-byte instructions)
- Nested pegboards (bank selection)
- Varint encoding for variable-width addresses
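For the varint option, one common scheme is unsigned LEB128 (7 payload bits per byte, high bit set on every byte except the last). The sketch below shows that scheme only as an example; the spec does not say which encoding would be chosen, and `encode_varint` / `decode_varint` are hypothetical names.

```python
def encode_varint(value: int) -> bytes:
    """Unsigned LEB128: low 7 bits first, high bit marks continuation."""
    if value < 0:
        raise ValueError("unsigned only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # final byte, high bit clear
            return bytes(out)

def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, bytes_consumed) for the varint starting at pos."""
    value = 0
    shift = 0
    start = pos
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos - start
        shift += 7

small = encode_varint(0x20)     # one byte, same as today's peg address
large = encode_varint(300)      # two bytes
```

Note that LEB128 continuation bytes can take the value FF, which this format also uses as a terminator, so any real extension would need to decide how the two interact.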
For Sprout Garden, 256 pegs is plenty. For production (Loom), we use slug→section→coordinate addressing which effectively gives unlimited namespace.
Provenance
- 2026-01-22: Updated to 4-byte pick format, added HOLE/COPY/WAIT/YIELD/JUMPIF/BRANCH opcodes, added CONTEXT/FENCE addressing modes to match actual vm.py implementation
Document
- Status: 🟡 In Review
Changelog
- 2026-01-20: Major revision - removed magic, consolidated to single opcode table, made environment bootstrap canonical
- 2026-01-19 19:54: Node created - initial bytecode specification
Next: Level Bootstrap Example
[To be added: A complete worked example showing how a level configuration maps to pegboard state, how the VM executes a simple program, and how the full bootstrap→execute→collect cycle works. This becomes the acceptance test for the implementation.]
North
slots:
- slug: sprout-garden-engine
context:
- Linking spec to engine case node