The static semantics describes what can be known about a program before it runs: the type of every expression, the conformance of every class to its parents, and the well-formedness of every contract. A program that passes these checks is elaborated; one that fails is rejected, and never reaches the dynamic semantics of Chapter 5.
We proceed in the manner of The Definition of Standard ML: first the semantic objects—the types and bindings the rules manipulate; then the relations among them—conformance and lookup; then the inference rules that assign types to phrases.
4.1Types
A type is one of the following.
| \(\tau\) | ::= | \(\texttt{Integer} \mid \texttt{Integer64} \mid \texttt{Real} \mid \texttt{Decimal}\) | — numeric |
| | | \(\texttt{Char} \mid \texttt{Boolean} \mid \texttt{String}\) | — other scalars | |
| | | \(A[\tau_1,\dots,\tau_n]\) | — class type (\(n \ge 0\)) | |
| | | \(\alpha\) | — type variable | |
| | | \(\tau_1 \times \cdots \times \tau_n \to \tau\) | — function type | |
| | | \(\mathsf{Fun}\) | — the unconstrained function type | |
| | | \(\tau\,?\) | — optional type | |
| | | \(\mathsf{Any} \mid \mathsf{Void} \mid \mathsf{Nil}\) | — top, no-result, the type of nil |
Here \(A\) ranges over class names. A class type with no arguments is written \(A\) rather than \(A[\,]\). The scalar types, \(\mathsf{Any}\), \(\mathsf{Function}\) and the collection classes are not primitive in any deep sense: they are the classes of the standard environment (Appendix B), and \(\texttt{Integer}\) and the rest are abbreviations for particular class types. We keep them syntactically distinct only because the lexer does.
\(\mathsf{Any}\) is the type of which every type is a subtype; it is the
elaborated form of the root class Any. \(\mathsf{Void}\) is the
type assigned to a statement, and to a routine with no declared return type: it
is not a value type and may not be the type of a field or parameter.
\(\mathsf{Nil}\) is the type of the constant nil; it is a subtype of
every optional type and of no other.
4.2The Static Environment
Elaboration takes place against a static environment, written \(C\), comprising three components:
- a class bindings \(\Sigma\), mapping each class name
\(A\) in scope to its class signature — its generic parameters
and their constraints, its modifiers (
deferred,sealed), its parents, the typed fields it declares, and the typed signatures of its routines and constructors; - a variable bindings \(\Gamma\), mapping each identifier in scope (parameters, locals, fields of the current object) to its type;
- a set \(\Delta\) of type variables currently in scope, introduced by the generic parameters of the enclosing class or routine.
We write \(C(x) = \tau\) for the lookup of an ordinary variable, \(C \oplus \{x : \tau\}\) for the environment \(C\) extended with the binding \(x:\tau\) (shadowing any earlier binding of \(x\)), and \(C(A)\) for the signature of class \(A\). The functions \(\mathit{fields}(C, A)\) and \(\mathit{routine}(C, A, m)\) collect, respectively, the fields and the named routine \(m\) visible in \(A\)—those declared in \(A\) together with those inherited from its parents, with \(A\)’s own declarations overriding inherited ones of the same name.
4.3Conformance
The central relation of the static semantics is conformance, written \(\tau \preceq \tau'\) and read “\(\tau\) conforms to \(\tau'\)” or “\(\tau\) is a subtype of \(\tau'\).” A value of a conforming type may be used wherever the wider type is expected. Conformance is the reflexive, transitive closure of the rules below.
A class type conforms to its parents, with type arguments substituted through
the inheritance clause. If \(A\) is declared inherit \(B[\sigma_1,
\dots]\) under generic parameters \(\alpha_1,\dots,\alpha_n\), then
The numeric types form a small chain reflecting widening: \(\texttt{Integer} \preceq \texttt{Integer64} \preceq \texttt{Real} \preceq \texttt{Decimal}\) is not assumed here; numeric conversions in Nex are explicit except where the standard environment provides a coercion. The one implicit numeric rule the language relies upon is that integer operands are admitted where a real is required by the arithmetic of the standard environment; this is recorded with the arithmetic operators in Appendix B rather than as a conformance.
Optional types
An optional type \(\tau\,?\) is inhabited by the values of \(\tau\) together
with nil. Hence \(\tau\) conforms to \(\tau\,?\), and
\(\mathsf{Nil}\) conforms to every optional type; but \(\tau\,?\) does
not conform to \(\tau\).
Function types
A function type conforms to another when it accepts at least the arguments the other accepts and returns at most what the other promises: its parameters vary contravariantly and its result covariantly. The unconstrained type \(\mathsf{Fun}\) is a supertype of every function type.
Note the directions in (C-Fun): for the subtype on the left, each parameter
type \(\sigma_i\) must be a supertype of the corresponding \(\tau_i\)
(\(\tau_i \preceq \sigma_i\)), while the result \(\sigma\) must be a
subtype of \(\tau\). Thus a Function(a: Animal) conforms to
a Function(a: Dog) — a handler for any animal may stand in
wherever a dog-handler is wanted — but not the reverse.
Function(a: A) genuinely accepts every \(A\), so
a call through that type cannot go wrong — there is no “catcall”
to chase down later. This is the opposite of the Eiffel tradition, which makes
parameters covariant and so admits an unsound relation that must be rescued by
a separate whole-program analysis; Nex instead rejects the unsafe direction at
the point of conformance, with a diagnostic at the offending value. The same
rule governs method redefinition (Section 4.9): an overriding routine may
widen a parameter and narrow a return, but not the reverse.
The contravariant parameter rule also aligns with the Design-by-Contract
substitution principle the language follows elsewhere — an heir may
weaken what it demands and strengthen what it guarantees.
4.4Elaboration of Expressions
The judgement \(C \vdash e \Rightarrow \tau\) asserts that, in the static environment \(C\), the expression \(e\) is well-typed with type \(\tau\). The rules follow the grammar of Section 2.7.
Constants and variables
The constant this has the type of the current class, available in
\(C\) within the body of a routine; result has the declared return
type of the enclosing routine, and is in \(C\) only there (Section 2.9).
Operators
An infix operator denotes a routine of the operand’s class. The
arithmetic and comparison operators are elaborated through the signatures
recorded in the standard environment; we summarise their effect. For an arithmetic
operator \(\odot\) — one of + - * / % ^ — on numeric operands the result is
the wider of the two operand types; comparison operators yield
Boolean; the equality operators \(=\), \(/\!=\), \(==\), \(!\!=\)
yield Boolean for any operands of conforming type; and the logical
operators require and yield Boolean.
Here \(\tau_1 \sqcup \tau_2\) is the wider of two numeric types. Rule (4.10)
stands for the conjunction; disjunction (or) and negation
(not) are analogous.
Member access and calls
A member access \(e.f\) requires that the static type of \(e\) be a class type possessing an accessible field or routine named \(f\). If \(f\) is a field, its type is the result; if \(f\) is a nullary routine, its return type is the result; if \(f\) is applied to arguments, the argument types must conform to the parameter types of the routine.
The accessibility condition is that \(f\) is not declared in a
private feature section of \(A\), unless the access occurs textually
within \(A\). A bare call \(m(e_1,\dots,e_n)\) is elaborated as a call on
this if \(m\) is a routine of the current class, and otherwise as a
call of the free function \(m\) (Section 4.8).
nil when \(e\) is
nil, and so is defined precisely on optional receivers.
Object creation
The expression create \(A[\bar\tau].k(e_1,\dots,e_n)\) requires
that \(A\) be a non-deferred class, that \(k\) be one of its constructors, and
that the arguments conform to \(k\)’s parameters; its type is
\(A[\bar\tau]\). When the constructor and argument list are omitted, \(A\) must
have a constructor taking no arguments, or none at all.
Conditional and anonymous-function expressions
The conditional expression requires a boolean condition and two branches; its type is the join—the least common supertype—of the branch types, written \(\tau_1 \sqcup \tau_2\) and always defined because \(\mathsf{Any}\) is a top.
4.5Elaboration of Statements
A statement is elaborated by the judgement \(C \vdash \mathit{stmt} \Rightarrow C'\), producing a possibly extended environment \(C'\); a statement that binds no new name produces \(C\) unchanged. A block is elaborated by threading the environment through its statements.
A let with no type annotation takes \(\sigma\) to be the inferred
type \(\tau\) of its initialiser. The premise “\(x\) not a once
field outside a constructor” in (4.17) is the static enforcement of
immutability: assigning a once field anywhere but in a constructor
of its class is rejected. The loop, case, match,
across, and do/rescue statements elaborate
their constituent blocks under the environment extended with any variable they
introduce (the cursor variable of across, the bound variable of a
match clause), and produce the original environment: a block does not leak
its locals to the statements that follow it.
Type dispatch and exhaustiveness
A match on an expression of class type \(A\) dispatches on the
runtime class. Each clause when \(B\) as \(x\)
then \(\mathit{block}\) requires \(B \preceq A\) and elaborates its
block with \(x\) bound to \(B\).
When \(A\) is sealed, the set of clause classes must exhaust
\(\mathit{subclasses}(C, A)\), unless an else clause is present to
cover the remainder. A missing variant is a compile-time error. This
exhaustiveness check is the whole purpose of the sealed modifier,
and is why a sealed class is required to be deferred (Section 4.9).
Type conversion
The form convert \(e\) to \(x{:}\sigma\) is a boolean
expression that attempts a downward or upward cast: it succeeds, binding \(x\) of
type \(\sigma\), when the runtime type of \(e\) is related to \(\sigma\) (a
supertype or subtype), and otherwise yields false with \(x\) bound
to nil. Statically it requires that the type of \(e\) and \(\sigma\)
be related by conformance in one direction or the other.
4.6Contract Formation
Every assertion—in a require, an ensure, an
invariant, or a loop invariant—must elaborate to
Boolean. A precondition is elaborated in the environment of the
routine’s entry (parameters and fields in scope); a postcondition in that
environment extended with result and with the old forms; an
invariant in the environment of the class’s fields.
Rule (4.22) records the restriction of Section 2.9: old
applies only to a field of the current object, and only within a postcondition
(the superscript \(\mathsf{ensure}\) marks that environment). The label of an assertion
has no effect on its type; it is carried into the dynamic semantics solely to
name the condition in a violation report.
4.7Generic Elaboration
A generic class \(A[\alpha_1{\to}K_1,\dots]\) is elaborated once, with its parameters \(\alpha_i\) added to \(\Delta\) and constrained so that any actual argument must conform to \(K_i\). An application \(A[\tau_1,\dots,\tau_n]\) is well-formed when each \(\tau_i\) conforms to the constraint \(K_i\) of the corresponding parameter; its fields and routine signatures are those of \(A\) with \(\tau_i\) substituted for \(\alpha_i\) throughout.
A parameter written \(?\alpha\) additionally permits \(\mathsf{Nil}\) as an argument. A generic routine is elaborated in the same way, its type parameters local to the routine. Because elaboration substitutes rather than erases, the type \(\texttt{Array[Integer]}\) and the type \(\texttt{Array[String]}\) are distinct types with distinct field and routine signatures, and a value of one does not conform to the other.
4.8Free Functions and the Static World
The free functions of a program are elaborated together with its classes, so
that any function may call any other regardless of textual order. A bare
application \(m(e_1,\dots,e_n)\), where \(m\) is not a routine of the current
class, is elaborated against the signature of the free function \(m\); a
declare function signature supplies that type before the definition
is seen, which is how mutual recursion is admitted (Section 3.4). A later
definition must match its declaration exactly. Free functions, unlike methods,
may not be overloaded: each free function name is meant to be unique, and a
definition matching an earlier declare function must agree with it
in full. (The current implementation does not enforce this uniqueness; see the
note on enforcement gaps in Section 4.9.)
4.9Well-Formedness of Classes
Beyond the typing of expressions, a program must satisfy conditions on its class structure. These are checked once, over the whole static world.
- Acyclic inheritance. The relation “\(A\) inherits \(B\)” must be acyclic; its reflexive-transitive closure is the conformance among class types of (C-Inherit).
- Attachment (void safety). Every constructor of a class
must leave each of the class’s fields attached—holding a
value of its declared type rather than
nil—unless the field is of an optional type \(\tau\,?\), which admitsnil. A constructor that returns with a non-optional reference field still unattached is rejected. Scalar fields are attached by their zero value and so need no explicit initialiser; optional fields are attached tonilby default. It is this discipline that makes the absence of a rule fornil-dereference (Section 5.3) a guarantee rather than a hazard: a non-optional reference is never observed to benil. - Conformant overriding. When \(A\) inherits \(B\) and redefines a routine \(m\) of \(B\), the signature of \(A\)’s \(m\) must conform to that of \(B\)’s \(m\) under (C-Fun): each parameter may be widened but not narrowed (contravariant), and the return may be narrowed but not widened (covariant). A field of \(B\) may not be redeclared with a non-conforming type. This is checked at the override; a routine that narrows a parameter or returns a non-conforming type is rejected there, naming the routine and the offending position.
- Complete deferral. A non-deferred class must supply a
body for every routine declared
deferredin any of its ancestors; a class with an unimplemented deferred routine is itself deferred and may not be instantiated. - Sealed implies deferred. A
sealedclass must bedeferred. Were a sealed class instantiable, a bare instance of the parent would be a runtime value that an exhaustivematchover the parent’s subclasses (4.19) would fail to cover; requiring deferral removes that value and so keeps exhaustiveness meaningful. - Once discipline. A
oncefield is assigned only in constructors of its class (the premise of (4.17)).
A program all of whose phrases elaborate, and all of whose classes are well-formed, is statically valid. The dynamic semantics of the next chapter is defined only for statically valid programs.