Skip to content

Latest commit

 

History

History
537 lines (414 loc) · 26.6 KB

TypesCheatSheet.org

File metadata and controls

537 lines (414 loc) · 26.6 KB

Reference Sheet for Simple Type Theory

More

(setq org-latex-pdf-process ‘(“pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f”))

(setq org-latex-pdf-process ‘(“pdflatex -shell-escape -output-directory %o %f”))

LaTeX Setup

Abbreviations

The phrase “ϕᵢ : sᵢ → tᵢ is a family of functions for each $i$” is abbreviated “$ϕ : ∀ i • sᵢ → tᵢ$”.

Whereas the phrase “ψ maps each $i$ of $I$ to a member of $tᵢ$” is abbreviated “$ψ : (i : I) → tᵢ$”.

Higher-Order Logic

higher-order logic, HOL, is an extension of the simply typed lambda calculus with logical connectives and quantifiers, permitting logical reasoning about functions in a very general way. In particular, it allows quantification over higher-order entities such as predicates and relations. It is sufficient for expressing most ordinary mathematical theories and has a simple set-theoretic semantics.

In informal mathematics, functions have to be introduced with an explicit name before they can be used. Lambda calculus provides a machanism for building expressions that denote functions, so that functions can be used without explicit definitions and naming. Functions are constructed by λ-abstraction; e.g., $λ n : ℕ • n + 1$ is the function sending a number $n$ to its successor $n + 1$; the “λ n : ℕ” porition indicates that $n$ is the argument to the function and it should be of ‘type’ ℕ, the rest is the result of the function. We may apply a λ-expression to obtain a value by replacing the argument; e.g., $(λ n : ℕ • n + 1)\,2 = 2 + 1 = 3$.

A signature is a set of pairs consisting of a ‘name’ $c$ and furthur data which will be denoted $τ(c)$.

Types

A ‘type’ is an expression that denotes a nonempty set. Higher-order logic postulates two atomic types —𝔹 to denote the two-element set of truth values, and ι to denote an infinite set of individuals— and one type-operator which denotes a set-forming operation —namely, $α → β$ denotes all total functions from the set denoted by α to the set denoted by β.

For practicality, let $AT$ be a signature ‘type operators’; i.e., a set of name-number pairs.

-- “Types over AT”
𝒯 ::= ι | 𝔹 | 𝒯 𝒯| Op(𝒯₁, , 𝒯ₙ)

In the last alternative, $Op$ is an n-ary type operator in $AT$ that constructs a new type out of the types $𝒯₁, …, 𝒯ₙ$.

  • Part of the power of HOL comes from the fact that we can iterate type construction to ever higher levels of functionality. For instance, $(ι → 𝔹) → 𝔹$ denotes the set of functions that take functions from individuals to truth values as arguments and return truth values as results. This is a set of higher-order functions.

Terms

The terms of HOL are expressions that denote elements of types. Each term is associated with a unique type.

For practicality, let $SG$ be a signature of ‘constants’; i.e., a set of name-type pairs.

$t$ ::= $𝓋_α$ (variable)
$\;\,∣$ 𝒸 (constant)
$\;\,∣$ $t₁ \, t₂$ (application)
$\;\,∣$ $(λ 𝓋_α • t₁)$ (abstraction)

Terms are associatedd with types as follows; where we write $t : α$ to express that $t$ has type $α$.

  • $𝓋_α : α$
  • $𝒸 : τ(c)$
  • Likewise application and abstraction.

A term is well-typed if it has a type according to these rules; in which case it denotes a value in the set denoted by the type.

  • It is common to write $𝓋 : α$ instead of $𝓋_α$, especially when $α$ is complex. Moreover, one writes $𝓋$ rather than $𝓋_α$ when $α$ can be easily deduced.

Standard Types & Constants

Just as we insist that every 𝔹 is a nullary type operator, we insist that every constant signature contains a constant called equality, $= : α → α → 𝔹$, for each type $α$, which denotes equality on the set denoted by the type $α$.

Semantics

Let 𝒮ℯ𝓉 denote a collection of sets.

A model of a type signature is given by associating every n-ary type operation with a function $𝒮ℯ𝓉ⁿ → 𝒮ℯ𝓉$, which acts as the meaning of the operation. The type operators 𝔹 and → must be given their standard meaning as the Booleans and as the collection of all total functions between sets. We write $⟦-⟧₀$ for a model.

A model for a constant signature is a model for a type signature along with a function $⟦-⟧₁ : (c : C) → ⟦τ(c)⟧₀$ that gives the meaning of $c$ as a value of $⟦τ(c)⟧₀$. We insist that the equality constant is given its meaning as the equality operation on a set.

Deductive Systems

A deductive system lets us prove properties of elements denoted by terms.

  • A formula is a term of type 𝔹.
  • A sequent is a pair $(ϕ, t)$, that is written “ϕ ⊢ t”, where ϕ is a finite set of formulas and $t$ is a single formula known as the assumptions and conclusion, respectively.
    • A sequent is satisfied if any environment making the formulas of ϕ true also makes $t$ true.
    • A sequent is valid if it is satisfied in every standard model.
  • An inference (rule) is a list of n+1 sequents; it is usually written as a fraction with the hypotheses, the first n, sequents are above the line; and the last sequent, the conclusion is below the line.
    • An inference is valid if any standard model that satisfies the hypothesis also satisfies the conclusion.
    • An inference with no hypothesis is called an axiom.
    • A schema is an inference that mentions metavariables, which stand for arbitrary (sets of) phrases; the instantiations of which may need to satisfy ‘associated side conditions’.

A deductive system $DS$ is a set of inference rules.

  • Inference rules can be ‘composed’: If the conclusion of R₁ is one of the premises of R₂, then we draw R₂ in fraction form then above it’s i-th hypothesis we draw a fraction line above which is the hypotheses of R₁. The result of such an iterated process is a deduction (tree).
    • For schemas, such a tree with hypothesis and a conclusion and accumulated side-conditions is known as a derivied inference rule.
  • A theorem is a sequent having a proof; i.e., a deduction without any hypothesis and with the sequent as its conclusion —this is known as a natural deduction proof.
    • This form does not scale for large deductions; an alternative is the calculational proof style where-in rules are displayed sideways & upsidedown: One places the conclusion at the top then itemises the conclusions vertically underneath; then composition means each item will have its own children hypothesis vertically under it, possibly none. (See Back and von Wright)
type signatureWhat types can be formed?
constant signatureWhat terms can be formed?
inference rulesWhat theorems can be proven & how?
Consistent
There’s some underivable sequent; in particular, if false cannot be proven.
Sound
(wrt to a given model) if every provable sequent is valid in the model.
Complete
(wrt to a given model) if every valid theorem can be proved using the deductive system.

Quantification

The quantifiers ∀ and ∃ take predicates as their arguments

∀, ∃ : (α → 𝔹) → 𝔹

In FOL, we have to build quantifiers into the syntax of formulas; in HOL, quantifiers may be taken as constants. One writes $(∀ x • t)$ instead of $∀ (λ x • t)$; likewise for ∃.

The selection, or choice, operator ε takes a predicate and returns a solution to it, if there are any; otherwise the result is an unknown element. This operator embodies the “axiom of choice”.

ε : (α → 𝔹) → α
  • Introduction: (∃ x • t) ⇒ t[x ≔ (ε x • t)]
  • Elimination: If we can show t ⇒ s where x is not free in s, then t[x ≔ (ε x • t)] ⇒ s.
    • That is, we prove properties involving ε by proving properties without making any assumptions about which value is chosen excpet that it is a solution to $t$.

    We have an arbitrary element of any type: $⊥_α = (ε x : α • false)$; the only information about it we have is its type.

STT has a simple and highly uniform syntax

#

STT has to kinds of syntactic objects.

  • Expressions denote values including truth values true and false; they do what both terms and formulae do in first-order logic.
  • Types denote nonempty sets of values; they are used to restrict the scope of variables, control the formation of expressions, and classify expressions by their values.

A ‘type’ is a string of symbols defined inductively by the following formation rules. The first two being being for the base types of ‘individuals’ and Booleans, while the third being an infinite hierarchy of ‘function’ types.

𝒯 ::= ι | 𝔹 | (𝒯  𝒯)

Semantically, a function type denotes a domain of total functions: $⟦α → β⟧ = ⟦α⟧ → ⟦β⟧$.

A language is a pair $L = (C, τ)$ consisting of a set of constant symbols $C$ which are assigned types by the total function $τ : C → 𝒯$; i.e., a ‘signature’ of primitive symbols used to construct expressions of the language.

Expressions E of type α of a language L are strings of symbols defined inductively by the following formation rules.

e  ::=  (𝓋 : α)    -- annotation of variable
      | e e       -- application
      | (λ 𝓋  e) -- abstraction
      | 𝒸         -- constant symbol
      | e = e     -- equality
      | (I 𝓋  e) -- definite description

Of course not all combinations of symbols are meaningful. A string of symbols e is considered an expression only if it can be assigned a type; i.e., we can find some γ ∈ 𝒯 so that e : γ according to the following rules. That is “_:_” is an inductively defined binary predicate; which is determinstic in its second argument.

0. e = (x : α)         e : α
1. f : α  β, e : α    f e : β
2. e : β               (λ x : α  e) : α  β
3. c  C               c : τ(c)
4. e : α, d : α        (e = d) : 𝔹
5. p : 𝔹               (I x : α  p) : α

Write $ℰ \, α$ for the collection of expressions of type $α$.

  • Abstraction defines functions from expressions mentioning variables.
  • Definite description builds an expression that denotes the unique value that satisfies a given property.

    $(I x : α • p)$ denotes the unique value $x$ of type $α$ satisfying $p$, if it exists and is a canonical “error” value of type $α$ otherwise.

  • Annotation constructs a variable by assingin a type to a member of 𝓋, an infinite set of symbols.
  • Different kinds of expressions are distinuisghed by type instead of by form.
  • Within the body of a qunatification over variable $x : α$, we will write just $x$ rather than $(x : α)$.

The syntax of STT with types in addition to expressions is a bit more complex than the syntax of first-order logic. However, the syntax is also more uniform than the syntax of first-order logic since expressions serve as both terms and formulae and constant include individual constants, function symbols, and predicate symbols as well as constants of many other types. There are no propositional or predicate connectives in STT, but we will see later that these can be easily defined in STT using function application, abstraction, and equality.

The semantics of STT is based on a small collection of well-established ideas

A standard model for a language $L$ is a triple $𝓜 = (⟦-⟧₀, ⟦-⟧₁, err)$ where

  1. $⟦-⟧₀ : 𝒯 → 𝒮ℯ𝓉$ assigns each type a non-emppty set.
    • $⟦𝔹⟧₀$ is the Booleans.
    • $⟦α → β⟧₀ = ⟦α⟧₀ → ⟦β⟧₀$ are the total functions between the interpreted sets.
  2. $⟦-⟧₁ : (c : C) → ⟦τ c⟧₁$ assigns each constant symbol to a value in the interpreation of the symbol’s type.
  3. $err : (α : 𝒯) → ⟦α⟧₀$ is a function associating each type with a ‘error’ value.

The subscripts of ⟦-⟧ᵢ will generally be omitted.

  • ⟦ι⟧ corresponds to the ‘universe of discourse’ in a first-order model.
  • Note that the function domains of a standard model are determined by the choice of ⟦ι⟧ alone.
  1. A variable assignment into ℳ is a variable-to-value assignment $σ : ∀ α • 𝓋 → ⟦α⟧$.
  2. One then defines the valuation function $⟦-⟧₂ = ⟦-⟧_σ : ∀ α • ℰ α → ⟦α⟧$ that takes expressions to values as follows:
    1. $⟦(x : α)⟧₂ = σ(x)$
    2. $⟦c⟧₂ = ⟦c⟧₁$ for $c ∈ C$
    3. $⟦f e⟧₂ = ⟦f⟧₂ ⟦e⟧₂$, for function application.
    4. $⟦ λ x : α • e⟧₂$, for $e : β$, is the function $f : ⟦α⟧ → ⟦β⟧$ such that, for each $a ∈ ⟦α⟧$, we have $f(a) = ⟦e⟧₂′$ where $⟦-⟧₂′$ uses $σ[(x : α) ↦ a]$ instead of $σ$.

      The semantics of function abstraction is defined using the same trick, due to Tarski, that is used to define the semantics for universal quantification in first-order logic.

    5. $⟦e = d⟧₂$ is true if $⟦e⟧₂ = ⟦d⟧₂$ and false otherwise.
    6. $⟦I x : α • p⟧₂$ is the unique $a ∈ ⟦α⟧₀$ with $⟦p⟧′₂ = true$ if it exists, and is $err_α$ otherwise.

Some jargon:

  1. A value $a ∈ ⟦α⟧$ is nameable if it is the value of some closed expression e; i.e., $⟦e⟧₂ = a$.
  2. A is valid in 𝓜, written $M ⊨ A$, if $⟦A⟧_σ = true$ for all variable assignment $σ$ into 𝓜.
  3. A is a semantic consequence of a set Σ of sentences, written $Σ ⊨ A$, if for every model ℳ, whenever $ℳ ⊨ B$ for all $B ∈ Σ$ then $ℳ ⊨ A$.
  4. A theory $T = (L, Γ)$ is a language $L$ along with a set of sentences $Γ$, called the axioms of T.
    • Semantic conseqence for theories: $T ⊨ A$ means $Γ ⊨ A$.
    • A standard model for theory T is a standard model ℳ for $L$ such that $M ⊨ B$ for all $B ∈ Γ$.

The two semantics of first-order logic and STT are based on essentially the same ideas: Domains of discourse, truth values, and functions; models for languages; variable assignments; and valuation functions defined recursively on the syntax of expressions.

STT is a highly expressive (classical) logic

Even though STT is formulated as a ‘function theory’, it is a form of higher-order predicate logic. Indeed its higher-order quantification lets us define the following connectives.

Operation Definition
true $(λ x : 𝔹 • x) = (λ x : 𝔹 • x)$
false $(λ x : 𝔹 • true) = (λ x : 𝔹 • x)$
p ∧ q $(λ f : 𝔹 → (𝔹 → 𝔹) • f \; true \; true) = (λ f : 𝔹 → (𝔹 → 𝔹) • f \; p \; q)$
∀ x : α • p $(λ x : α • p) = (λ x : α • true)$
$⊥_α$ $I x : α • x ≠ x$

Where the remaining connectives are obtained by duality: $¬ p ≡ (p = false)$, $p ∨ q ≡ ¬(¬ p ∧ ¬ q)$, $(∃ x : α • p) = ¬ (∀ x : α • ¬ p)$, and $p ≠ q ≡ ¬(p = q)$.

Since STT is equipped with full higher-order quantification and definite description, most mathematical notions can be directly and naturally expressed in STT. None of the following examples can be directly expressed in first-order logic.

  1. Function composition is expressed by $λ f : ι → ι • λ g : ι → ι • λ x : ι • f (g\, x)$.
  2. Inverse image is expressed by $λ f : ι → ι • λ s : ι → 𝔹 • I s′ : ι → 𝔹 • ∀ x : ι • s′ \,x = s\, (f\, x)$.
⟪Its simple semantics makes STT suitable for general purpose logic.⟫

Theoretically speaking the inclusion of definite description adds no additional expressivitiy, however from a practicaly stand-point it makes it easy to actually express ideas. See Russell’s “On Denoting”. The removal of definite description does simplify the semantics since error values would no longer be needed. However, without definite description it would not be possible to directly express the many mathematical concepts that are defined in informal mathematics using the phrasing “the $x$ that satisfies $P$” such as the notion of the limit of a function.

STT admits categorical theories of infinite structures

A theory is categorical if it has exactly one model up to isomorphism.

  • These are useful for when we want to capture eseentially the only model we’re interested in; in a clean minimal fashion.
  • E.g., completely ordered fields are categorical —the real numbers.
  • E.g., any algebraic data type is categorical, that’s why we can code safely in functional programming.
    • ADTs are initial and so unique up to unique isomorphism.
  • E.g., Peano Arithmetic is categorical —the natural numbers.

The theory PA, of Peano Arithmetic, has constants 0, s with τ = 0 ↦ ι, s ↦ ι → ι and axioms that zero is not a successor, s is injective, and the induction principle (IP) which may be rendered as the STT expression

$∀ p : ι → 𝔹 • (p\,0 \;∧\; (∀ x : ι • p\,x ⇒ p\,(s\, x))) ⇒ ∀ x : ι • p\, x$

PA cannot be directly formalised in first-order logic because the induction principle involves quantification over predicates, which is not directly expressible in first-order logic. Instead, in FOL one resorts to an infinite collection of formulaes called the induction schema:

As axiom take each sentence that is a universal closure of a formula of the form

(B[0] ∧ (∀ x • B[x] ⇒ B[s\, x])) ⇒ ∀ x • B[x]

where B[x] is a formula.

This approximates the IP since it includes just a countable number of instances of IP —one for each equivalence class of formulas related by logical equivalence— while IP has a continuum number of instances —one for each property of the natural numbers. Consequently, such a formulation of PA is not categorical since it does not contain all instances of IP. As such, it has “nonstandard” models containing infinite natural numbers by the compactness theorem of FOL.

By the compactness theorem of FOL, any first-order theory that has an infinite model has infinitely many (infinite) nonisomorphic models! Whence, lack of categoricity is a fundamental weakness of FOL: A first-order theory that is inteded to specifiy a single infinite structure cannot be categorical.

STT has a simple, elegant, and powerful proof system

  • A proof system 𝑷 consists of a finite set of axiom schemas and rules of inference.
  • A proof of a formula $A$ from theory $T = (L, Γ)$ in 𝑷 is a finite sequence of formulas of $L$ where $A$ is the last member of the sequence and every member of the sequence is an instance of one of the axiom schemas of 𝑷, a member of Γ, or is inferred from previous members by a rule of inference of 𝑷.
  • Let $T ⊢_𝑷 A$ mean there is a proof of $A$ from $T$ in 𝑷.
  • If for every theory $T$ and formula $A$, if $T ⊢_𝑷 A \;⇒\; T ⊨ A$ then say 𝑷 is sound.
  • If for every theory $T$ and formula $A$, if $T ⊢_𝑷 A \;⇐\; T ⊨ A$ then say 𝑷 is complete.

By Godel’s Incompleteness Theorem, STT has no sound & complete proof system, which is a sign of its strength.

STT has simple proof system consisting of the following 6 axiom schemas and a single rule of inference. ( They are schemas since the type variables ‘α’ cannot be quantified. )

A1
Truth Values; $∀ f : 𝔹 → 𝔹 • (f \,\mathsf{true} \,∧\, f\,\mathsf{false}) \,=\, (∀ x : 𝔹 • f\, x)$.
A2
Leibniz’ Law; $∀ x, y : α • x = y ⇒ ∀ p : α → 𝔹 • p\,x = p\,y$
A3
Extensionality; $∀ f, g : (α → β) • (f = g) = (∀ x : α • f\,x = g\, x)$
A4
Beta-Reduction; $(λ x : α • e) \, d \;=\; e[x ≔ d]$

( λ-abstraction acts as the converse rule. )

A5
Definite Description Introduction; $(∃₁ x : α • p) ⇒ p[x ≔ (I x : α • p)]$ where ∃₁ is the usual unique-existence.
A6
Definite Description Elimination: $¬(∃₁ x : α • p) ⇒ (I x : α • p) = ⊥_α$
R1
Equality Substitution: From $L = R$ and $E$ infer the result of replacing one occurence of $L$ in $E$ by an occurence of $R$, provided that the occurrence of $L$ in $E$ is not a variable immediately preceded by λ or I.

Even though its necessairly incomplete, as mentioned earlier, the following system is sound since each instance of the schemas can be shown to be valid in any standard model and the rule preserves validity.

Does this system have sufficient provability power to be useful? Yes! Why? By experience, including the number of proof assistants based on this system such as HOL and Isabelle.

vfill

\vfill

References

The Seven Virtues of Simple Type Theory \newline by William Farmer

\vspace{1em}

A nifty exposition of type theory and how it compares to first-order logic, with an emphasies on practicality.

\vspace{1em}

A beautiful read.