Appendix A — Bytecode Patterns by Construct
This appendix is a code-reading companion to the JVM backend. It does not try to list every opcode the emitter can produce. It shows the stable bytecode shapes used for the major Nex constructs, so that a reader can move between a Nex example and the corresponding emitter path in src/nex/compiler/jvm/emit.clj without reverse-engineering the whole file each time.
The sequences below are schematic. They omit some boxing, unboxing, casts, and local-variable-table details when those are not the point of the example. The important thing is the control-flow and stack shape the backend relies on.
A.1 Reading This Appendix
The compiled JVM path has three layers:
- lowering in src/nex/lower.cljc
- IR definitions in src/nex/ir.cljc
- bytecode emission in src/nex/compiler/jvm/emit.clj
This appendix starts at the last of these. Each section answers the same question: once lowering has decided that a construct stays on the compiled path, what bytecode pattern does the emitter actually write?
A.2 Constants and Local State
The simplest compiled constructs become direct stack operations.
A.2.1 Constants
Integer, boolean, character, string, and nil literals become ordinary JVM constants through the emitter’s constant path. In practice the sequence is a short constant push (LDC, or a dedicated constant opcode such as ICONST_1) followed, when needed, by boxing or coercion at the use site.
Nex: 42
Bytecode shape:
LDC 42
Nex: true
Bytecode shape:
ICONST_1
Nex: "hello"
Bytecode shape:
LDC "hello"
A.2.2 Local Loads and Stores
Lowered locals become typed JVM local slots. The emitter uses a typed load opcode on read and a typed store opcode on assignment.
Nex:
let x: Integer := 10
x + 1
Bytecode shape:
LDC 10
ISTORE <slot-x>
ILOAD <slot-x>
ICONST_1
IADD
The actual load and store opcodes vary with the lowered JVM type:
- ILOAD/ISTORE for Integer, Boolean, and Char
- LLOAD/LSTORE for Integer64
- DLOAD/DSTORE for Real
- ALOAD/ASTORE for object-typed values
The relevant emitter paths are the :local and :set-local branches in emit.clj.
A.3 Top-Level REPL Bindings
The compiled REPL does not treat top-level let as a normal JVM local. Top-level values live in the values atom inside NexReplState, and compiled cells mutate that canonical state.
Nex:
let count := 10
Bytecode shape:
ALOAD state
GETFIELD NexReplState.values
CHECKCAST clojure/lang/Atom
INVOKEVIRTUAL Atom.deref
CHECKCAST java/util/HashMap
LDC "count"
LDC 10
<box if needed>
INVOKEVIRTUAL HashMap.put
POP
A top-level read mirrors this:
Bytecode shape:
ALOAD state
GETFIELD NexReplState.values
...
LDC "count"
INVOKEVIRTUAL HashMap.get
<unbox-or-cast>
This is why the compiled REPL can execute multiple cells coherently without pretending every top-level name is a local variable in one giant synthetic method. The read and write paths live under :top-get, :top-set, and the emit-load-values-map! helper in emit.clj.
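Seen from Java, the write and read paths amount to roughly the following. This is a minimal sketch, not the backend's code: `AtomicReference` stands in for the Clojure `Atom` so the example needs no Clojure jar, and `TopLevelSketch`, `State`, `topSet`, and `topGetInt` are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.concurrent.atomic.AtomicReference;

final class TopLevelSketch {
    // Hypothetical stand-in for NexReplState. AtomicReference plays the role of
    // the Clojure Atom (an assumption made so the sketch is self-contained).
    static final class State {
        final AtomicReference<Object> values =
                new AtomicReference<>(new HashMap<String, Object>());
    }

    // Mirrors GETFIELD values, deref, CHECKCAST java/util/HashMap.
    @SuppressWarnings("unchecked")
    static HashMap<String, Object> loadValuesMap(State state) {
        return (HashMap<String, Object>) state.values.get();
    }

    // Write shape: push the key and the boxed value, call put, discard the result.
    static void topSet(State state, String name, Object boxedValue) {
        loadValuesMap(state).put(name, boxedValue);
    }

    // Read shape: call get, then unbox or cast at the use site.
    static int topGetInt(State state, String name) {
        return (Integer) loadValuesMap(state).get(name);
    }

    // Two separately compiled "cells" that share one state object.
    static int runTwoCells() {
        State state = new State();
        topSet(state, "count", 10);            // cell 1: let count := 10
        return topGetInt(state, "count") + 1;  // cell 2: count + 1
    }

    public static void main(String[] args) {
        System.out.println(runTwoCells());
    }
}
```

The second cell sees the first cell's binding only because both go through the same map, which is the property the REPL relies on.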
A.4 Arithmetic, Comparison, and Boolean Operators
A.4.1 Arithmetic
Primitive arithmetic is emitted directly after any required numeric promotion.
Nex:
x + y
Bytecode shape:
<emit x>
<coerce to operand type>
<emit y>
<coerce to operand type>
IADD | LADD | DADD
The same pattern applies to -, *, /, and %, with the opcode chosen from the lowered JVM type.
A.4.2 Comparisons
Comparisons compile to branch-and-materialize sequences. The emitter does not leave a raw JVM comparison result on the stack. It branches to produce a Nex boolean value explicitly.
Nex:
x < y
Bytecode shape for int-like operands:
<emit x>
<emit y>
IF_ICMPLT true
ICONST_0
GOTO end
true:
ICONST_1
end:
For long and double, the JVM requires a compare instruction first:
Bytecode shape:
<emit x>
<emit y>
LCMP | DCMPL
IFLT | IFGT | IFEQ ...
ICONST_0
GOTO end
true:
ICONST_1
end:
Object comparison is deliberately narrower. The compiled path currently supports only = and /= on object operands, which become IF_ACMPEQ-style and IF_ACMPNE-style branches over references.
A.4.3 Short-Circuit Boolean Operators
and and or are emitted as real short-circuit control flow.
Nex:
a and b
Bytecode shape:
<emit a>
IFEQ false
<emit b>
IFEQ false
ICONST_1
GOTO end
false:
ICONST_0
end:
Nex:
a or b
Bytecode shape:
<emit a>
IFNE true
<emit b>
IFNE true
ICONST_0
GOTO end
true:
ICONST_1
end:
This is handled in emit-boolean-short-circuit! in emit.clj.
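The two shapes above correspond to the following Java control flow. This is an illustrative sketch only: operands are modeled as `BooleanSupplier` values so that "not evaluated" is observable, and the class and method names are hypothetical.

```java
import java.util.function.BooleanSupplier;

final class ShortCircuitSketch {
    // Same shape as the emitted `a and b`: a false operand jumps to the false label.
    static boolean and(BooleanSupplier a, BooleanSupplier b) {
        if (!a.getAsBoolean()) return false;  // <emit a>; IFEQ false
        if (!b.getAsBoolean()) return false;  // <emit b>; IFEQ false
        return true;                          // ICONST_1
    }

    // Same shape as the emitted `a or b`: a true operand jumps to the true label.
    static boolean or(BooleanSupplier a, BooleanSupplier b) {
        if (a.getAsBoolean()) return true;    // <emit a>; IFNE true
        if (b.getAsBoolean()) return true;    // <emit b>; IFNE true
        return false;                         // ICONST_0
    }

    // Counts how often the right operand actually runs across four evaluations.
    static int rightOperandRuns() {
        int[] runs = {0};
        BooleanSupplier right = () -> { runs[0]++; return true; };
        and(() -> false, right);  // right operand skipped
        and(() -> true, right);   // right operand runs
        or(() -> true, right);    // right operand skipped
        or(() -> false, right);   // right operand runs
        return runs[0];
    }

    public static void main(String[] args) {
        System.out.println(rightOperandRuns());
    }
}
```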
A.5 Branching and Looping
A.5.1 Expression if
Expression-shaped if lowers to one-expression branches with an explicit merge label.
Nex:
if test then a else b end
Bytecode shape:
<emit test>
IFEQ else
<emit a>
<coerce to result type>
GOTO end
else:
<emit b>
<coerce to result type>
end:
The critical detail is the result-type coercion before the merge point. Lowering already decided the result type; emission enforces it on both branches.
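A Java rendering of the same idea, as a sketch under one assumption: that lowering has chosen Real (a JVM double) as the result type for an if whose branches are an Integer and a Real. The names are hypothetical.

```java
final class IfExpressionSketch {
    // Models `if test then a else b end` with a: Integer, b: Real,
    // assuming the lowered result type is Real (double).
    static double choose(boolean test, int a, double b) {
        double result;
        if (test) {
            result = (double) a;  // <emit a>; coerce to result type (I2D)
        } else {
            result = b;           // <emit b>; already the result type
        }
        return result;            // merge point: both paths carry a double
    }

    public static void main(String[] args) {
        System.out.println(choose(true, 3, 2.5));
        System.out.println(choose(false, 3, 2.5));
    }
}
```

Without the coercion on the first branch, the two paths would reach the merge label with different stack types, which the JVM verifier rejects.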
A.5.2 Statement if
Statement-form if uses the same branch skeleton without preserving a result on the stack.
Bytecode shape:
<emit test>
IFEQ else
<emit then statements>
GOTO end
else:
<emit else statements>
end:
A.5.3 Loops
The lowered loop form is a guard-at-top loop with a conventional back edge.
Bytecode shape:
<emit init statements>
loop:
<emit test>
IFNE end
<emit body>
GOTO loop
end:
This looks inverted only if one expects the JVM code to mirror the Nex source text exactly. Lowering has already normalized the loop condition into the internal guard form the emitter uses.
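In Java terms the skeleton is a loop whose test is an exit condition. The sketch below assumes an until-style guard and uses a hypothetical summation body purely to make the shape executable.

```java
final class LoopSketch {
    // Sums 1..n using the guard-at-top shape: a true test leaves the loop.
    static int sumTo(int n) {
        int i = 1;                 // <emit init statements>
        int total = 0;
        while (true) {             // loop:
            if (i > n) break;      // <emit test>; IFNE end
            total += i;            // <emit body>
            i++;
        }                          // GOTO loop
        return total;              // end:
    }

    public static void main(String[] args) {
        System.out.println(sumTo(4));
    }
}
```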
A.6 Functions, Methods, and Calls
A.6.1 Function Argument Prologue
Compiled REPL functions and compiled instance methods receive arguments in an Object[]. The first step in the emitted method body is to unpack that array into typed locals.
Bytecode shape for one argument:
ALOAD __args
LDC 0
AALOAD
<unbox-or-cast to declared type>
<typed store into local slot>
The emitter repeats this for each parameter in emit-function-arg-prologue!.
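The prologue corresponds to the first lines of the following Java method. This is a sketch for a hypothetical two-parameter function (x: Integer, name: String); the method body after the prologue is invented for illustration.

```java
final class ArgPrologueSketch {
    // All arguments arrive boxed in one array; the prologue gives each a typed local.
    static Object call(Object[] args) {
        int x = (Integer) args[0];        // AALOAD; cast and unbox; ISTORE
        String name = (String) args[1];   // AALOAD; CHECKCAST; ASTORE
        return name + ":" + (x + 1);      // the body then works on typed locals only
    }

    public static void main(String[] args) {
        System.out.println(call(new Object[] { 41, "n" }));
    }
}
```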
A.6.2 Top-Level REPL Function Calls
Top-level functions in the compiled REPL are registered in the session state’s functions map as reflective Method objects. Calling one therefore goes through that map and then through Method.invoke.
Bytecode shape:
<load functions map from state>
LDC "fn-name"
INVOKEVIRTUAL HashMap.get
CHECKCAST java/lang/reflect/Method
ACONST_NULL
ICONST_2
ANEWARRAY java/lang/Object
...
INVOKEVIRTUAL Method.invoke
<unbox-or-cast result>
This is not how file-compiled static calls work. It is specifically the compiled REPL strategy, where the current session state is part of the call protocol.
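The call protocol can be reproduced with plain Java reflection. The sketch below is an approximation under stated assumptions: the functions map is built locally instead of being loaded from session state, the callee takes a single boxed argument, and `twice` and `callByName` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.HashMap;

final class ReflectiveCallSketch {
    // Hypothetical compiled top-level function taking one boxed argument.
    static Object twice(Object n) {
        return ((Integer) n) * 2;
    }

    static int callByName(String name, int arg) {
        try {
            // Stand-in for the session's functions map.
            HashMap<String, Method> functions = new HashMap<>();
            functions.put("twice",
                    ReflectiveCallSketch.class.getDeclaredMethod("twice", Object.class));

            Method m = functions.get(name);             // HashMap.get; CHECKCAST Method
            Object[] boxedArgs = new Object[] { arg };  // ANEWARRAY java/lang/Object
            Object result = m.invoke(null, boxedArgs);  // null target: static method
            return (Integer) result;                    // <unbox-or-cast result>
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("twice", 21));
    }
}
```

The lookup by name is what lets a later cell redefine a function without recompiling earlier callers.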
A.6.3 Compiled Instance and Virtual Calls
Ordinary method calls on compiled objects use a direct virtual invocation after the target is emitted and cast to the expected owner type.
Bytecode shape:
<emit target>
CHECKCAST owner
ALOAD state
<emit boxed arg array>
INVOKEVIRTUAL owner.method
<unbox-or-cast result>
Higher-order function-object calls take a different path: they invoke runtime helper machinery rather than inlining the closure protocol into raw bytecode. That choice keeps the backend readable while still staying on the compiled path.
A.7 Object Construction, Fields, and Class Initialization
A.7.1 Plain Object Construction
A lowered create C eventually becomes the ordinary JVM object-construction triplet:
Bytecode shape:
NEW C
DUP
INVOKESPECIAL C.<init>
That is the :new branch in emit.clj.
A.7.2 Field Reads and Writes
Compiled field access is direct.
Nex:
obj.x
Bytecode shape:
<emit obj>
CHECKCAST owner
GETFIELD owner/x <descriptor>
Nex:
obj.x := value
Bytecode shape:
<emit obj>
CHECKCAST owner
<emit value>
<coerce to field type>
PUTFIELD owner/x <descriptor>
A.7.3 User Default Constructors
User-defined classes get a default constructor that does more than call Object.<init>. It also initializes composition fields, assigns default values, and sets the __outer__ pointer used for dynamic dispatch through composed parent objects.
Bytecode shape:
ALOAD 0
INVOKESPECIAL super.<init>
ALOAD 0
ALOAD 0
PUTFIELD owner.__outer__
ALOAD 0
NEW ParentPart
DUP
INVOKESPECIAL ParentPart.<init>
PUTFIELD owner.parent_part
ALOAD 0
GETFIELD owner.parent_part
ALOAD 0
PUTFIELD ParentPart.__outer__
ALOAD 0
<default value>
PUTFIELD owner.field
RETURN
The back-pointer setup is specific to Nex’s compiled treatment of composed inheritance and is worth understanding before changing constructor emission.
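Written as Java source, the constructor shape above looks roughly like this. It is a sketch with hypothetical class and field names (`Child`, `ParentPart`, `outer` standing in for __outer__); the real field layout is whatever emit.clj generates.

```java
final class OuterPointerSketch {
    // Hypothetical composed parent part.
    static class ParentPart {
        Object outer;                        // stands in for __outer__
        ParentPart() { this.outer = this; }  // a part starts out as its own outer
    }

    // Hypothetical user class composed with one parent part.
    static class Child {
        Object outer;
        ParentPart parentPart;
        int counter;

        Child() {
            super();                             // INVOKESPECIAL super.<init>
            this.outer = this;                   // PUTFIELD owner.__outer__
            this.parentPart = new ParentPart();  // NEW; DUP; INVOKESPECIAL; PUTFIELD
            this.parentPart.outer = this;        // GETFIELD parent_part; PUTFIELD ParentPart.__outer__
            this.counter = 0;                    // <default value>; PUTFIELD owner.field
        }
    }

    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(c.parentPart.outer == c);
    }
}
```

After construction, code running inside the parent part can reach the outermost object through the back pointer, which is what makes dispatch through composed parents possible.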
A.7.4 Class Initializers
Static constants become a straightforward class initializer:
Bytecode shape:
<emit constant value>
PUTSTATIC owner/CONST
...
RETURN
A.8 Collections
Collection literals compile directly to Java collection objects.
A.8.1 Arrays
Nex arrays become java.util.ArrayList.
Nex:
[1, 2, 3]
Bytecode shape:
NEW java/util/ArrayList
DUP
INVOKESPECIAL ArrayList.<init>
DUP
LDC 1
<box if needed>
INVOKEVIRTUAL ArrayList.add
POP
DUP
LDC 2
...
A.8.2 Maps
Maps become java.util.HashMap.
Bytecode shape:
NEW java/util/HashMap
DUP
INVOKESPECIAL HashMap.<init>
DUP
<emit key>
<emit value>
INVOKEVIRTUAL HashMap.put
POP
A.8.3 Sets
Sets become java.util.LinkedHashSet, which preserves insertion order.
Bytecode shape:
NEW java/util/LinkedHashSet
DUP
INVOKESPECIAL LinkedHashSet.<init>
DUP
<emit element>
INVOKEVIRTUAL LinkedHashSet.add
POP
A.8.4 Collection Operations
Several core collection operations are emitted as direct host-library calls:
- array get -> ArrayList.get
- array put -> ArrayList.set
- array add -> ArrayList.add
- array length -> ArrayList.size
- map put -> HashMap.put
- map size -> HashMap.size
- set size -> LinkedHashSet.size
The emitter uses runtime helpers only where a direct Java collection call would hide Nex semantics rather than clarify them. String rendering, cloning, equality, and some cursor-related operations still go through helper calls for exactly that reason.
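The literal and operation shapes in this section map onto ordinary Java collection code. The sketch below uses hypothetical method names and example values; only the java.util calls themselves come from the text above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;

final class CollectionSketch {
    // [1, 2, 3]: one ArrayList.add per element, each boolean result discarded.
    static ArrayList<Object> arrayLiteral() {
        ArrayList<Object> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        return list;
    }

    // A map literal with two entries: one HashMap.put per entry.
    static HashMap<Object, Object> mapLiteral() {
        HashMap<Object, Object> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    // A set literal with a repeated element: first-insertion order is kept.
    static LinkedHashSet<Object> setLiteral() {
        LinkedHashSet<Object> set = new LinkedHashSet<>();
        set.add(3);
        set.add(1);
        set.add(3);
        set.add(2);
        return set;
    }

    public static void main(String[] args) {
        ArrayList<Object> list = arrayLiteral();
        list.set(0, 10);                   // array put -> ArrayList.set
        System.out.println(list.get(0));   // array get -> ArrayList.get
        System.out.println(list.size());   // array length -> ArrayList.size
        System.out.println(setLiteral());  // iterates in order 3, 1, 2
    }
}
```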
A.9 Exceptions, Contracts, and Retry
A.9.1 raise
raise expr first evaluates the value, boxes it if necessary, converts it to a runtime exception object, then throws it.
Bytecode shape:
<emit expr>
<box if primitive>
<runtime call make-raised-exception>
CHECKCAST java/lang/Throwable
ATHROW
A.9.2 Contract Assertions
Compiled preconditions, postconditions, invariants, and variants are ordinary guard checks that construct and throw a contract violation when false.
Bytecode shape:
<emit condition>
IFNE ok
LDC "Postcondition"
LDC "label"
<runtime call make-contract-violation>
CHECKCAST java/lang/Throwable
ATHROW
ok:
This directness is one of the strengths of the compiled backend. Contract checks are not magic metadata. They are visible control flow.
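The guard shape reads the same way in Java. In this sketch `ContractViolation` and `makeContractViolation` are hypothetical stand-ins for the runtime's exception type and helper, and the `withdraw` routine and its labels are invented for illustration.

```java
final class ContractSketch {
    // Hypothetical stand-in for the runtime's contract-violation exception.
    static final class ContractViolation extends RuntimeException {
        ContractViolation(String message) { super(message); }
    }

    // Hypothetical stand-in for the make-contract-violation runtime helper.
    static ContractViolation makeContractViolation(String kind, String label) {
        return new ContractViolation(kind + " violation: " + label);
    }

    static int withdraw(int balance, int amount) {
        // <emit condition>; IFNE ok; otherwise build and throw the violation.
        if (!(amount <= balance)) {
            throw makeContractViolation("Precondition", "enough_funds");
        }
        int result = balance - amount;
        if (!(result >= 0)) {
            throw makeContractViolation("Postcondition", "non_negative");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withdraw(10, 3));
    }
}
```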
A.9.3 rescue and retry
The compiled try/rescue path uses explicit JVM try/catch regions plus a retry signal recognized by the runtime.
Bytecode shape:
loop-start:
body-start:
<emit body>
body-end:
GOTO end
body-handler:
ASTORE throwable
ALOAD throwable
<runtime call retry-signal?>
...
IFEQ not-retry
ALOAD throwable
ATHROW
not-retry:
ALOAD throwable
<runtime call exception-value>
ASTORE exception
rescue-start:
<emit rescue body>
rescue-end:
GOTO end
rescue-handler:
ASTORE rescue-throwable
ALOAD rescue-throwable
<runtime call retry-signal?>
...
IFEQ rescue-not-retry
GOTO loop-start
rescue-not-retry:
ALOAD rescue-throwable
ATHROW
end:
This is more substantial than a helper call wrapped around the interpreter. The control flow is genuinely compiled; the runtime only supplies the retry-signal protocol and exception-value extraction.
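The same region structure can be written as a Java loop with nested try/catch. This is a sketch under assumptions: `RetrySignal` is a hypothetical exception class playing the role of the runtime's retry signal, and the flaky body and three-attempt rescue policy are invented to make the flow observable.

```java
final class RetrySketch {
    // Hypothetical stand-in for the runtime's retry signal.
    static final class RetrySignal extends RuntimeException {}

    static String runWithRescue(int failuresBeforeSuccess) {
        int attempt = 0;
        while (true) {                                    // loop-start:
            try {
                attempt++;                                // <emit body>
                if (attempt <= failuresBeforeSuccess) {
                    throw new IllegalStateException("flaky");
                }
                return "ok after " + attempt;             // body-end: GOTO end
            } catch (RuntimeException bodyFailure) {      // body-handler:
                if (bodyFailure instanceof RetrySignal) {
                    throw bodyFailure;                    // retry signal from the body propagates
                }
                try {
                    // <emit rescue body>: retry until the third attempt has failed.
                    if (attempt < 3) throw new RetrySignal();
                    return "gave up after " + attempt;    // rescue-end: GOTO end
                } catch (RetrySignal retry) {             // rescue-handler, retry case
                    continue;                             // GOTO loop-start
                }                                         // anything else is rethrown
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithRescue(2));
    }
}
```

Only the rescue region turns a retry signal into a jump back to the top; a failure of any other kind inside the rescue body leaves the construct as an ordinary exception.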
A.10 Concurrency and Helper Calls
Concurrency is compiled, but not by inlining channel and task mechanics into raw JVM instructions. The emitter uses dedicated runtime helper calls for the concurrency surface.
Nex:
t.await()
Bytecode shape:
<load runtime var task-await-method>
<emit boxed task object>
INVOKEVIRTUAL Var.invoke
<unbox-or-cast result>
Nex:
ch.send(value)
Bytecode shape:
<load runtime var channel-send-method>
<emit boxed channel>
<emit boxed value>
INVOKEVIRTUAL Var.invoke
<unbox-or-cast or POP>
The important design point is that these are still compiled calls. They are not deopts. Lowering produces :concurrency-method IR, and emission maps each supported method to a named runtime helper such as:
- task-await-method
- task-cancel-method
- channel-send-method
- channel-receive-method
- channel-close-method
This keeps the bytecode layer small while still preserving a compiled execution path for concurrency-heavy code.
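The dispatch idea can be sketched in Java as a table of named helpers that receive boxed arguments. Everything concrete here is an assumption made for illustration: the real backend resolves Clojure vars rather than a `Map`, and `CompletableFuture` and `LinkedBlockingQueue` merely stand in for Nex tasks and channels.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

final class ConcurrencyHelperSketch {
    // Hypothetical helper table keyed by the helper names used in the text.
    static final Map<String, Function<Object[], Object>> HELPERS = Map.of(
        "task-await-method", args -> ((CompletableFuture<?>) args[0]).join(),
        "channel-send-method", args -> {
            @SuppressWarnings("unchecked")
            LinkedBlockingQueue<Object> channel = (LinkedBlockingQueue<Object>) args[0];
            return channel.offer(args[1]);
        });

    // Mirrors: load the helper, push boxed operands, invoke.
    static Object invokeHelper(String name, Object... boxedArgs) {
        return HELPERS.get(name).apply(boxedArgs);
    }

    static int demo() {
        CompletableFuture<Integer> task = CompletableFuture.completedFuture(42);
        LinkedBlockingQueue<Object> channel = new LinkedBlockingQueue<>();
        invokeHelper("channel-send-method", channel, "hello");            // result discarded (POP)
        int awaited = (Integer) invokeHelper("task-await-method", task);  // <unbox-or-cast result>
        return awaited + channel.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The call site stays the same size no matter how involved the helper is, which is the trade the emitter makes for the concurrency surface.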
A.11 File Compilation and Program Launch
File compilation adds one more layer: a launcher class with main.
The launcher bytecode shape is:
<load clojure.core/require>
<require nex.compiler.jvm.runtime>
<runtime call make-repl-state>
CHECKCAST NexReplState
ASTORE state
<runtime call bootstrap-compiled-state! state classes-edn imports-edn>
POP
ALOAD state
INVOKESTATIC Program.eval
POP
<runtime call print-state-output! state>
POP
RETURN
This structure is why the file compiler can share so much code with the compiled REPL. The program class still exposes an eval entry point that consumes NexReplState; the launcher is just the small wrapper that creates and bootstraps that state in a standalone jar.
A.12 How to Use This Appendix While Reading the Code
For actual code reading, the most efficient order is:
- Read the lowering rule in src/nex/lower.cljc for the construct you care about.
- Find the corresponding IR shape in src/nex/ir.cljc.
- Jump to the matching :op branch or helper in src/nex/compiler/jvm/emit.clj.
- Check src/nex/compiler/jvm/runtime.clj only if emission routes through a runtime helper.
The main practical distinction to keep in mind is this:
- direct JVM emission is used when the construct has a stable and readable stack-level meaning
- runtime helper calls are used when they preserve Nex semantics more clearly than expanding everything inline
That design is visible throughout the backend, and the bytecode patterns in this appendix are the quickest way to see it.