Static Semantics — The Definition of Nex

The static semantics describes what can be known about a program before it runs: the type of every expression, the conformance of every class to its parents, and the well-formedness of every contract. A program that passes these checks is elaborated; one that fails is rejected, and never reaches the dynamic semantics of Chapter 5.

We proceed in this manner: first the semantic objects—the types and bindings the rules manipulate; then the relations among them—conformance and lookup; then the inference rules that assign types to phrases.

4.1Types

A type is one of the following.

\(\tau\)	::=	\(\texttt{Integer} \mid \texttt{Real}\)	— numeric
	\|	\(\texttt{Char} \mid \texttt{Boolean} \mid \texttt{String}\)	— other scalars
	\|	\(A[\tau_1,\dots,\tau_n]\)	— class type (\(n \ge 0\))
	\|	\(\alpha\)	— type variable
	\|	\(\tau_1 \times \cdots \times \tau_n \to \tau\)	— function type
	\|	\(\mathsf{Fun}\)	— the unconstrained function type
	\|	\(\tau\,?\)	— optional type
	\|	\(\mathsf{Any} \mid \mathsf{Void} \mid \mathsf{Nil}\)	— top, no-result, the type of `nil`

Here \(A\) ranges over class names. A class type with no arguments is written \(A\) rather than \(A[\,]\). The scalar types, \(\mathsf{Any}\), \(\mathsf{Function}\) and the collection classes are not primitive in any deep sense: they are the classes of the standard environment (Appendix B), and \(\texttt{Integer}\) and the rest are abbreviations for particular class types. We keep them syntactically distinct only because the lexer does.

\(\mathsf{Any}\) is the type of which every type is a subtype; it is the elaborated form of the root class Any. \(\mathsf{Void}\) is the type assigned to a statement, and to a routine with no declared return type: it is not a value type and may not be the type of a field or parameter. \(\mathsf{Nil}\) is the type of the constant nil; it is a subtype of every optional type and of no other.

4.2The Static Environment

Elaboration takes place against a static environment, written \(C\), comprising three components:

a class bindings \(\Sigma\), mapping each class name \(A\) in scope to its class signature — its generic parameters and their constraints, its modifiers (deferred, sealed), its parents, the typed fields it declares, and the typed signatures of its routines and constructors;
a variable bindings \(\Gamma\), mapping each identifier in scope (parameters, locals, fields of the current object) to its type;
a set \(\Delta\) of type variables currently in scope, introduced by the generic parameters of the enclosing class or routine.

We write \(C(x) = \tau\) for the lookup of an ordinary variable, \(C \oplus \{x : \tau\}\) for the environment \(C\) extended with the binding \(x:\tau\) (shadowing any earlier binding of \(x\)), and \(C(A)\) for the signature of class \(A\). The functions \(\mathit{fields}(C, A)\) and \(\mathit{routine}(C, A, m)\) collect, respectively, the fields and the named routine \(m\) visible in \(A\)—those declared in \(A\) together with those inherited from its parents, with \(A\)’s own declarations overriding inherited ones of the same name.

4.3Conformance

The central relation of the static semantics is conformance, written \(\tau \preceq \tau'\) and read “\(\tau\) conforms to \(\tau'\)” or “\(\tau\) is a subtype of \(\tau'\).” A value of a conforming type may be used wherever the wider type is expected. Conformance is reflexive and transitive, and makes \(\mathsf{Any}\) a top to which every type conforms (rules C-Refl, C-Trans, C-Top in Appendix D). Its one substantive rule is inheritance: a class type conforms to its parents, with type arguments substituted through the inheritance clause. If \(A\) is declared inherit \(B[\sigma_1,\dots]\) under generic parameters \(\alpha_1,\dots,\alpha_n\), then

\[ \frac{A[\alpha_1,\dots,\alpha_n] \;\text{inherits}\; B[\sigma_1,\dots,\sigma_m]}{A[\tau_1,\dots,\tau_n] \preceq B[\sigma_1,\dots,\sigma_m][\tau_i/\alpha_i]} \tag{C-Inherit} \]

class Animal
  create make() do end
  feature
    speak(): String do result := "..." end
end

class Dog
  inherit Animal
  create make() do end
  feature
    speak(): String do result := "woof" end
end

let a: Animal := create Dog.make()
print(a.speak())

The binding type-checks because Dog conforms to Animal by (C-Inherit), so a Dog may stand where an Animal is expected; the access a.speak() elaborates to String. (At run time it prints woof: the call dispatches on the object’s actual class, Section 5.4.)

There are exactly two numeric types, \(\texttt{Integer}\) and \(\texttt{Real}\), and a widening chain \(\texttt{Integer} \preceq \texttt{Real}\) is not assumed here as a conformance; numeric conversions in Nex are explicit except where the standard environment provides a coercion. The one implicit numeric rule the language relies upon is that integer operands are admitted where a real is required by the arithmetic of the standard environment—so that \(\texttt{Integer} \sqcup \texttt{Real} = \texttt{Real}\) in rule (4.8). This is recorded with the arithmetic operators in Appendix B rather than as a conformance.

Optional types

An optional type \(\tau\,?\) is inhabited by the values of \(\tau\) together with nil. Hence \(\tau\) conforms to \(\tau\,?\), and \(\mathsf{Nil}\) conforms to every optional type; but \(\tau\,?\) does not conform to \(\tau\) (rules C-Opt, C-Nil, C-OptMono in Appendix D).

Function types

A function type conforms to another when it accepts at least the arguments the other accepts and returns at most what the other promises: its parameters vary contravariantly and its result covariantly, and the unconstrained type \(\mathsf{Fun}\) is a supertype of every function type (rules C-Fun, C-FunTop in Appendix D). The direction is the crux: for the subtype, each parameter type must be a supertype of the other’s, while the result must be a subtype. Thus a Function(a: Animal) conforms to a Function(a: Dog) — a handler for any animal may stand in wherever a dog-handler is wanted — but not the reverse.

Why contravariant, and why it is sound Contravariance is what makes conformance a local guarantee. A value that conforms to Function(a: A) genuinely accepts every \(A\), so a call through that type cannot go wrong — there is no “catcall” to chase down later. This is the opposite of the Eiffel tradition, which makes parameters covariant and so admits an unsound relation that must be rescued by a separate whole-program analysis; Nex instead rejects the unsafe direction at the point of conformance, with a diagnostic at the offending value. The same rule governs method redefinition (Section 4.9): an overriding routine may widen a parameter and narrow a return, but not the reverse. The contravariant parameter rule also aligns with the Design-by-Contract substitution principle the language follows elsewhere — an heir may weaken what it demands and strengthen what it guarantees.

Refinement types

A refinement type \(R\), declared declare type R = B where n: p (Section 3.5.1), conforms to its base: \(R \preceq B\) (rule C-Refine in Appendix D). Widening is therefore free—an \(R\) flows into any context that expects \(B\). The reverse direction is not a conformance: a \(B\) does not conform to \(R\), because not every \(B\) satisfies \(p\). Instead a base value enters a refinement only at a narrowing site—a binding, parameter, field, or return typed \(R\), or an explicit convert … to —where the elaboration attaches the predicate \(p\) as a contract to be checked at run time (Section 5.6). The refinement name is not erased during elaboration, so these narrowing sites remain visible to the checker; it is erased only in the value representation, where an \(R\) is exactly a \(B\). Consequently an operation on refinement values has the base type: q1 + q2 for two Quantity operands is an Integer, and is re-checked only if it flows back into a Quantity-typed target.

4.4Elaboration of Expressions

The judgement \(C \vdash e \Rightarrow \tau\) asserts that, in the static environment \(C\), the expression \(e\) is well-typed with type \(\tau\). The rules follow the grammar of Section 2.7.

Constants and variables

A literal takes the type of its scalar class — an integer literal is Integer, a real Real, and so for String, Char, and Boolean; nil has the type \(\mathsf{Nil}\); and a variable has the type \(C\) records for it (rules 4.1–4.7 in Appendix D). The constant this has the type of the current class, available in \(C\) within the body of a routine. The constant super is available there too, with the type of that class’s direct superclass—provided it has exactly one. A class declaring no inherit clause, or one naming more than one parent (Section 3.2), makes every occurrence of super within it ill-formed; the intended ancestor must then be named explicitly, as in Shape.describe (rule 4.7a in Appendix D). result has the declared return type of the enclosing routine, and is in \(C\) only there (Section 2.9).

Operators

An infix operator denotes a routine of the operand’s class. The arithmetic and comparison operators are elaborated through the signatures recorded in the standard environment; we summarise their effect. For an arithmetic operator \(\odot\) — one of + - * / % ^ — on numeric operands the result is the wider of the two operand types; comparison operators yield Boolean; the equality operators \(=\), \(/\!=\), \(==\), \(!\!=\) yield Boolean for any operands of conforming type; and the logical operators require and yield Boolean. Formally these are rules 4.8–4.10 of Appendix D, where \(\tau_1 \sqcup \tau_2\) is the wider of two numeric types; disjunction and negation are analogous to the conjunction rule.

The opening sentence is meant literally, and a class may make it so for itself. Write \(\mathrm{alias}_\tau(\odot) = m\) when the class \(\tau\), or an ancestor of it, declares a one-argument routine \(m\) bound to \(\odot\) by an alias clause (Section 3.4). Where the operands of an arithmetic operator are not numeric—so that rule 4.8 does not apply—the operator elaborates as the call it abbreviates: the right operand must conform to the parameter of \(m\), and the type of the operator expression is the return type of \(m\). This is rule 4.8a of Appendix D. The numeric rule is tried first, so an alias can never displace built-in arithmetic; and where neither applies the expression is ill-typed, as before.

Member access and calls

A member access \(e.f\) requires that the static type of \(e\) be a class type possessing an accessible field or routine named \(f\). If \(f\) is a field, its type is the result; if \(f\) is a nullary routine, its return type is the result; if \(f\) is applied to arguments, the argument types must conform to the parameter types of the routine.

Field access \(e.f\) has the field’s type (rule 4.11 in Appendix D); the call rule is the keystone of the chapter:

\[ \frac{\begin{array}{c}C \vdash e \Rightarrow A[\bar\tau] \quad \mathit{routine}(C, A[\bar\tau], m) = (\sigma_1 \times \cdots \times \sigma_n \to \sigma) \\ C \vdash e_i \Rightarrow \rho_i \quad \rho_i \preceq \sigma_i \;\; (1 \le i \le n)\end{array}}{C \vdash e.m(e_1,\dots,e_n) \Rightarrow \sigma} \tag{4.12} \]

class Counter
  create make() do count := 0 end
  feature
    count: Integer
    bump(n: Integer): Integer do result := count + n end
end

let c := create Counter.make()
print(c.bump(5))

The receiver c has type Counter, the routine bump takes one Integer and returns one, and the argument 5 conforms; so by (4.12) the call c.bump(5) elaborates to Integer (and evaluates to 5). An argument of the wrong type, or the wrong number of arguments, is rejected here.

The accessibility condition is that \(f\) is not declared in a private feature section of \(A\), unless the access occurs textually within \(A\). A bare call \(m(e_1,\dots,e_n)\) is elaborated as a call on this if \(m\) is a routine of the current class, and otherwise as a call of the free function \(m\) (Section 4.8).

Method overloading by arity A class may declare several routines of the same name differing in the number of parameters. \(\mathit{routine}(C, A, m)\) therefore selects among them by the count of arguments at the call site; there is no overloading by parameter type, and free functions may not be overloaded at all. A safe member access \(e\,?.f\) has type \(\sigma\,?\) where \(\sigma\) is the type of \(e.f\): the access yields nil when \(e\) is nil, and so is defined precisely on optional receivers.

Object creation

The expression create \(A[\bar\tau].k(e_1,\dots,e_n)\) requires that \(A\) be a non-deferred class, that \(k\) be one of its constructors, and that the arguments conform to \(k\)’s parameters; its type is \(A[\bar\tau]\). When the constructor and argument list are omitted, \(A\) must have a constructor taking no arguments, or none at all. The type of the expression is the class type created (rule 4.13 in Appendix D).

Conditional and anonymous-function expressions

The conditional expression requires a boolean condition and two branches; its type is the join—the least common supertype—of the branch types, written \(\tau_1 \sqcup \tau_2\) and always defined because \(\mathsf{Any}\) is a top. An anonymous function fn… has the function type built from its parameter and result types, its body checked under the parameters (rules 4.14–4.15 in Appendix D).

4.5Elaboration of Statements

A statement is elaborated by the judgement \(C \vdash \mathit{stmt} \Rightarrow C'\), producing a possibly extended environment \(C'\); a statement that binds no new name produces \(C\) unchanged. A block is elaborated by threading the environment through its statements.

\[ \frac{C \vdash e \Rightarrow \tau \quad \tau \preceq \sigma \quad x : \notin C}{C \vdash \texttt{let}\; x{:}\sigma := e \Rightarrow C \oplus \{x : \sigma\}} \tag{4.16} \]

let x: Integer := 10
if x > 5 then
  let y: Integer := x * 2
  print(y)
else
  print(0)
end

The let extends the environment with \(x : \texttt{Integer}\) by (4.16), and that environment is threaded into the if; the then block extends it further with \(y\), which is in scope only within the block (it does not leak to the statements after the if). Assignment requires the value to conform to the variable’s type and leaves the environment unchanged; the if checks each branch under the environment it is given (rules 4.17–4.18 in Appendix D).

A let with no type annotation takes \(\sigma\) to be the inferred type \(\tau\) of its initialiser. The premise “\(x\) not a once field outside a constructor” in (4.17) is the static enforcement of immutability: assigning a once field anywhere but in a constructor of its class is rejected. The loop, case, match, across, and do/rescue statements elaborate their constituent blocks under the environment extended with any variable they introduce (the cursor variable of across, the bound variable of a match clause), and produce the original environment: a block does not leak its locals to the statements that follow it.

Type dispatch and exhaustiveness

A match on an expression of class type \(A\) dispatches on the runtime class. Each clause when \(B\) as \(x\) then \(\mathit{block}\) requires \(B \preceq A\) and elaborates its block with \(x\) bound to \(B\) (rule 4.19 in Appendix D). A destructuring clause when \(B(\ldots)\) additionally binds each named field of the matched variant in the block, at that field’s declared type; a guard if \(g\) requires \(g\) to elaborate to Boolean in the scope of those bindings.

When \(A\) is sealed, the set of clause classes must exhaust \(\mathit{subclasses}(C, A)\), unless an else clause is present to cover the remainder. A missing variant is a compile-time error. A clause that can fail after its class matches—one carrying a guard, a literal field, or a nested pattern—does not count toward this coverage, since it may fall through; a variant whose only clause is so qualified therefore still requires an unguarded clause, a wildcard, or an else. This exhaustiveness check is the whole purpose of the sealed modifier, and is why a sealed class is required to be deferred (Section 4.9).

Type conversion

The form convert \(e\) to \(x{:}\sigma\) is a boolean expression that attempts a downward or upward cast: it succeeds, binding \(x\) of type \(\sigma\), when the runtime type of \(e\) is related to \(\sigma\) (a supertype or subtype), and otherwise yields false with \(x\) bound to nil. Statically it requires that the type of \(e\) and \(\sigma\) be related by conformance in one direction or the other; the form has type Boolean and binds \(x : \sigma\,?\) (rule 4.20 in Appendix D).

4.6Contract Formation

Every assertion—in a require, an ensure, an invariant, a loop invariant, or an assert—must elaborate to Boolean. A precondition is elaborated in the environment of the routine’s entry (parameters and fields in scope); a postcondition in that environment extended with result and with the old forms; an invariant in the environment of the class’s fields. An assert is elaborated in the environment holding at the point it appears, which is the environment of an ordinary statement in that body; the bare form, carrying no label, is elaborated exactly as the named one.

Every assertion must elaborate to Boolean (rule 4.21 in Appendix D). The keystone is the typing of old, which is admitted only in a postcondition and only on a field of the current object:

\[ \frac{C \vdash f \Rightarrow \tau \quad f \in \mathit{fields}(C, \text{current class})}{C^{\mathsf{ensure}} \vdash \texttt{old}\; f \Rightarrow \tau} \tag{4.22} \]

class Counter
  create make() do count := 0 end
  feature
    count: Integer
    bump()
    do
      count := count + 1
    ensure
      advanced: count = old count + 1
    end
end

let c := create Counter.make()
c.bump()
print(c.count)

The postcondition advanced elaborates because count is a field of the current class, so old count has type Integer by (4.22); the assertion as a whole is Boolean by (4.21). At run time old count is the value held on entry, so after one bump the count is 1 and the postcondition holds. Writing old on a parameter, or outside an ensure, is a compile-time error.

Rule (4.22) records the restriction of Section 2.9: old applies only to a field of the current object, and only within a postcondition (the superscript \(\mathsf{ensure}\) marks that environment). The label of an assertion has no effect on its type; it is carried into the dynamic semantics solely to name the condition in a violation report.

4.7Generic Elaboration

A generic class \(A[\alpha_1{\to}K_1,\dots]\) is elaborated once, with its parameters \(\alpha_i\) added to \(\Delta\) and constrained so that any actual argument must conform to \(K_i\). An application \(A[\tau_1,\dots,\tau_n]\) is well-formed when each \(\tau_i\) conforms to the constraint \(K_i\) of the corresponding parameter; its fields and routine signatures are those of \(A\) with \(\tau_i\) substituted for \(\alpha_i\) throughout.

\[ \frac{A[\alpha_1{\to}K_1,\dots,\alpha_n{\to}K_n] \in \Sigma \quad \tau_i \preceq K_i \;\;(1 \le i \le n)}{C \vdash A[\tau_1,\dots,\tau_n] \;\mathbf{type}} \tag{4.23} \]

A constraint is not merely a gate on the actual argument; it is what the body of the generic may rely on. Within the scope of \(\alpha \to K\), a value of type \(\alpha\) has the features of \(K\): its fields may be read and its routines called, and the call is elaborated against \(K\)’s signature. A routine deferred in \(K\) may be called on an \(\alpha\) even though no implementation is known statically—the receiver’s dynamic class supplies one (Section 5.5). An unconstrained parameter has no features beyond those of Any.

function total_area[T -> Shape](xs: Array[T]): Integer do
  result := 0
  across xs as s do
    result := result + s.area()   -- area is deferred in Shape
  end
end

class Box [G]
  create make(x: G) do value := x end
  feature
    value: G
    get(): G do result := value end
end

let b: Box[Integer] := create Box.make(42)
print(b.get())

The application Box[Integer] is a well-formed type by (4.23) — the unconstrained parameter G admits any argument — and within it G stands for Integer, so get() returns an Integer. Had G carried a constraint such as [G -> Comparable], the argument would have to conform to it.

A parameter written \(?\alpha\) additionally permits \(\mathsf{Nil}\) as an argument. A generic routine is elaborated in the same way, its type parameters local to the routine. Because elaboration substitutes rather than erases, the type \(\texttt{Array[Integer]}\) and the type \(\texttt{Array[String]}\) are distinct types with distinct field and routine signatures, and a value of one does not conform to the other.

4.8Free Functions and the Static World

The free functions of a program are elaborated together with its classes, so that any function may call any other regardless of textual order. A bare application \(m(e_1,\dots,e_n)\), where \(m\) is not a routine of the current class, is elaborated against the signature of the free function \(m\); a declare function signature supplies that type before the definition is seen, which is how mutual recursion is admitted (Section 3.4). A later definition must match its declaration exactly. Free functions, unlike methods, may not be overloaded: each free function name is meant to be unique, and a definition matching an earlier declare function must agree with it in full. Both rules are enforced: a repeated definition, and a definition whose signature disagrees with its declaration, are each rejected, naming the function.

4.9Well-Formedness of Classes

Beyond the typing of expressions, a program must satisfy conditions on its class structure. These are checked once, over the whole static world.

Acyclic inheritance. The relation “\(A\) inherits \(B\)” must be acyclic; its reflexive-transitive closure is the conformance among class types of (C-Inherit).
Attachment (void safety). Every constructor of a class must leave each of the class’s fields attached—holding a value of its declared type rather than nil—unless the field is of an optional type \(\tau\,?\), which admits nil. A constructor that returns with a non-optional reference field still unattached is rejected. Scalar fields are attached by their zero value and so need no explicit assignment; optional fields are attached to nil by default. It is this discipline that makes the absence of a rule for nil-dereference (Section 5.3) a guarantee rather than a hazard: a non-optional reference is never observed to be nil.
Conformant overriding. When \(A\) inherits \(B\) and redefines a routine \(m\) of \(B\), the signature of \(A\)’s \(m\) must conform to that of \(B\)’s \(m\) under (C-Fun): each parameter may be widened but not narrowed (contravariant), and the return may be narrowed but not widened (covariant). A field of \(B\) may not be redeclared with a non-conforming type. This is checked at the override; a routine that narrows a parameter or returns a non-conforming type is rejected there, naming the routine and the offending position.
Complete deferral. A non-deferred class must supply a body for every routine declared deferred in any of its ancestors; a class with an unimplemented deferred routine is itself deferred and may not be instantiated.
Sealed implies deferred. A sealed class must be deferred. Were a sealed class instantiable, a bare instance of the parent would be a runtime value that an exhaustive match over the parent’s subclasses (4.19) would fail to cover; requiring deferral removes that value and so keeps exhaustiveness meaningful.
Once discipline. A once field is assigned only in constructors of its class (the premise of (4.17)).

A program all of whose phrases elaborate, and all of whose classes are well-formed, is statically valid. The dynamic semantics of the next chapter is defined only for statically valid programs.