Fabric

Jakub Bachurski
Supervisors: Dominic Orchard and Alan Mycroft
University of Cambridge

Abstract

Subtyping is a classical programming language feature, usually associated with nominal subtyping in object-oriented programming. Structural subtyping — where the subtyping relation is derived from the shared structure of types — is an alternative. Structural typing is present in many forms in modern languages, but it is often incomplete without subtyping. Regardless, subtyping often ends up omitted because of its inherent complexity.
I argue that with the introduction of algebraic subtyping — a new framework for ML-style type inference — the time has come to revisit structural subtyping. In my thesis I propose a novel language design, Fabric, and demonstrate the relevance of structural subtyping to modern design problems. I propose a new approach to typing array programs, show how to capture record extension within algebraic subtyping without row polymorphism, and compile Fabric into WebAssembly with a focus on efficient core data structures. I aim to show structural subtyping narrows the expressive gap between static and dynamic languages, it helps us statically type check dynamic programs, and its ideas can improve how we program generally.

Note! Fabric is a work-in-progress project under active development.

Documents

Overview

Type system

Fabric's type system follows the framework of algebraic subtyping, as set out by Stephen Dolan in his thesis (MLsub) and later refined by Lionel Parreaux (Simple-sub, MLstruct). When following this tradition, we define a lattice algebra over types, with meets and joins over any pair of types. For instance, in MLsub: $$ \texttt{select}\,p\,v\,d = \texttt{if}\, p\,v \,\texttt{then}\, v \,\texttt{else}\, d $$ $$ \forall \alpha, \beta \ldotp (\alpha \to \texttt{bool}) \to \alpha \to \beta \to (\alpha \sqcup \beta) $$ In Fabric I implement the type system following MLstruct, which removes the polarity restriction and supports a limited form of negation types. I explore the capabilities and limitations of this Boolean algebraic approach to algebraic subtyping. Subtyping, as in similar work, is implicit in Fabric.

Records

A key feature present in both statically and dynamically typed languages are aggregate record types. Dynamic languages (Python, JavaScript, Lua) often call these objects, and check attribute accesses at runtime. I believe we can get much of the benefits of objects without sacrificing safety by typing them structurally. In the past, successful approaches to this have relied on the application of row polymorphism – an extension to Hindley-Milner type systems (e.g. OCaml's object types, or PureScript's records).


(* type annotations are added for clarity *)
let p: {x: int, y: int} = {x: 3, y: 4}
let cp: {x: int, y: int, c: [ Green ]} = {c: Green | p}
let hypot v = v.x*v.x + v.y*v.y
let h = hypot cp (* 25 *)
            
I propose to use algebraic subtyping as a basis for typing records instead, without the inclusion of row polymorphism. To make this possible, I propose a solution to the update problem of subtyping (see Cardelli and Mitchell).

Arrays

It is known that structural types (be it via row polymorphism or otherwise) can be useful for typing operations on dataframes. However, this approach remains unexplored for arrays, with the array programming community failing to discover a satisfactory approach to typing array programs in modern workloads (e.g. for machine learning models and scientific computing). Most widely adopted languages only give a cursory treatment of array types, only providing (some of) the following: This lack of structure fails to reflect the rich array transformations we need to perform. I argue to resolve this via the use of structural types, with the following core ideas: For example, consider a matrix of floats. We may write its type as \([\mathrm{row}, \mathrm{col}] \Rightarrow \texttt{float}\), where \( \mathrm{row} \) and \( \mathrm{col} \) are index types (we could set them to \( \texttt{int} \), or to be variables). Let us add padding rows to this matrix, one at the left and right. This changes its structure significantly enough to mandate reflecting this in its type: $$ [(\mathrm{Left}: \langle \rangle \mid \mathrm{Center}: \mathrm{row} \mid \mathrm{Right}: \langle \rangle), \mathrm{col}] \Rightarrow \texttt{float} $$

Code generation

I implemented a Fabric compiler into WebAssembly, a modern and portable compilation target.
WebAssembly is a convenient target. WebAssembly runtimes are shipped in most web browsers, but local runtimes and ahead-of-time compilers also exist. It aims to provide a solid abstraction over hardware. I thus expect its performance to be a good baseline above other possible targets.
I rely on the Binaryen toolchain for generating WebAssembly code. Fabric is a managed memory language, and so I made rich use of the recently adopted WasmGC extensions. The main motive that leads me to look at code generation is that structural subtyping requires careful compilation schemes for its core data structures (records, variants). I thus aim to answer the following questions:

Properties

Many type systems limit the user's capabilities to define their own rules in the type system. Powerful type system features like GADTs permit encoding this, but the proof burden is often left to the user.
I would like to explore a structural system of properties, describing extra information about a value and supplementing its type. Properties can be required by a function, and they can be derived from other known properties via user-defined inference rules.

(* say, BLAS routine for symmetric matrix multiply *)
val symm : matrix @ {symm} -> matrix -> matrix -> matrix

(* encode rules on higher-level properties of matrices *)
rule {upper_triangular, lower_triangular} |- {diagonal}.
rule {orthogonal} |- {invertible}.