hadley edited this page Sep 29, 2010 · 26 revisions

The S3 object system

(Contents adapted from the R language definition)

Central to any object-oriented language are the concepts of class and of methods. A class is a definition of an object, and typically a class contains several elements that hold class-specific details. Every object must be an instance of some class.

R implements a style of object orientation based on generic functions. This differs from the message-passing OO (MPOO) used by most programming languages, because it is centered on the generic functions rather than on the objects. In MPOO, messages are sent to objects, and the messages are interpreted by object-specific methods. This means that method selection is based only on the object to which the message is sent, not on the parameters of the message. Typically this object has a special appearance in the method call, usually appearing before the name of the method/message. You're probably familiar with this style of OO, because it's what most popular OO languages (like C++, Java and C#) use.

With generic functions, computations are still carried out via methods, but rather than the object choosing which method to call, a special function called a generic function decides. Methods are functions that are specialized to carry out specific calculations on objects of a specific class - they are defined exactly like a normal R function, but are typically called in a different way.

The greatest use of OO programming in R is for print, summary and plot methods. These methods allow us to have one generic function call, e.g. print(), that displays the object differently depending on its type. To do this, each modelling function attaches a class attribute to its output and then provides a special method that takes that output and produces a nice readable version of it. The user then only needs to remember that print or summary will provide nice output for the results of any analysis.
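As a minimal sketch of this pattern, the function fit_mean below stands in for a modelling function. Both fit_mean and the class name "mean_fit" are made up for illustration and are not part of R.

```r
# Hypothetical "model": returns a list tagged with the class "mean_fit"
fit_mean <- function(x) {
  structure(list(estimate = mean(x), n = length(x)), class = "mean_fit")
}

# print() finds this method through its name: <generic>.<class>
print.mean_fit <- function(x, ...) {
  cat("Mean of", x$n, "observations:", format(x$estimate), "\n")
  invisible(x)  # print methods conventionally return their input invisibly
}

fit <- fit_mean(c(2, 4, 6))
print(fit)  # the same method also runs when you type `fit` at the console
```

The user never calls print.mean_fit directly. They call print, and the class attribute does the rest.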

Object class

The class of an object is determined by its class attribute, a character vector of class names. For example, to create an object of class foo you just set the class attribute to a vector that contains "foo":

x <- 1
attr(x, "class") <- "foo"
x

# Or all in one line
x <- structure(1, class = "foo")
x

This means virtually anything can be turned into an object of class "foo" (whether it makes sense or not!).
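A few base R helpers make the class attribute easier to inspect and change. The short sketch below uses class(), inherits() and unclass(). The class name "foo" is arbitrary, as above.

```r
x <- structure(1, class = "foo")
class(x)            # the class vector
inherits(x, "foo")  # TRUE: "foo" appears in the class vector
inherits(x, "bar")  # FALSE

# class<- is the usual shorthand for setting the attribute
y <- 1:3
class(y) <- "foo"
z <- unclass(y)     # strips the class again, leaving a plain integer vector
```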

An object can have more than one class:

attr(x, "class") <- c("a", "b")

As you'll learn in the next section, methods are looked for in the order in which they appear in the class vector. So in this example, it would be as if class "a" inherits from class "b" - if a method isn't defined for "a", it will fall back to "b". However, if you switched the order of the classes, the opposite would be true!
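Here is a small sketch of how the order of the class vector changes which method is found. The generic speak and its methods are hypothetical, and UseMethod is explained in the next section.

```r
# Hypothetical generic that starts with a method for "b" only
speak <- function(x) UseMethod("speak")
speak.b <- function(x) "method for b"

ab <- structure(1, class = c("a", "b"))
first <- speak(ab)   # no speak.a yet, so this falls back to speak.b

speak.a <- function(x) "method for a"
second <- speak(ab)  # "a" comes first in the class vector, so speak.a wins

ba <- structure(1, class = c("b", "a"))
third <- speak(ba)   # order reversed: speak.b is found first
```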

This is because S3 doesn't have any formal relationship between classes, or even any definition of what an individual class is. If you're coming from a stricter environment like Java, this will seem pretty frightening (and it is!) but it also gives the user a tremendous amount of freedom. It makes it very difficult to stop someone from doing something you don't want them to do, but on the other hand, your users will never be held back because there is something you haven't implemented yet.

Method dispatch

Specific methods are chosen (dispatched) by a generic function. Generic functions all have the same form: a call to UseMethod that specifies the generic name, and the object to dispatch on. This means that generic functions are usually very simple, like mean:

 mean <- function (x, ...) {
   UseMethod("mean", x)
 }

The first argument to mean is special because UseMethod uses it to figure out which method should be called. Suppose that x had a class of c("foo", "bar"). Then UseMethod would search first for a function called mean.foo, and if it didn't find it, it would then look for mean.bar. If it couldn't find that either, it would try mean.default, and if that didn't exist it would raise an error. The same approach applies regardless of how many classes an object has:

x <- structure(1, class = letters)
bar <- function(x) UseMethod("bar", x)
bar.z <- function(x) "z"
bar(x)
# [1] "z"
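The fallback to a default method can be sketched the same way. The generic area and the classes "square", "rectangle" and "blob" are invented for this example.

```r
# Hypothetical generic with one class-specific method and a default
area <- function(shape, ...) UseMethod("area")
area.rectangle <- function(shape, ...) shape$width * shape$height
area.default <- function(shape, ...) NA_real_

sq <- structure(list(width = 3, height = 3), class = c("square", "rectangle"))
area(sq)    # no area.square, so area.rectangle is used

blob <- structure(list(), class = "blob")
area(blob)  # no area.blob, so area.default is used

area(1)     # objects without a class attribute also end up at the default
```

Without area.default, the last two calls would fail with a "no applicable method" error.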

Once a method has been determined, UseMethod invokes it in a special way. Rather than creating a new evaluation environment, it uses the environment of the current function call (the call to the generic), so any assignments or evaluations that were made before the call to UseMethod will be accessible to the method. Do not rely on this, though: recent versions of R (4.4.0 and later, going by the release notes) no longer forward variables assigned in the generic to the method. The arguments that were used in the call to the generic are rematched to the formal arguments of the method.

Inheritance

NextMethod is used to provide a simple inheritance mechanism. Commonly a specific method performs a few operations to set up the data and then calls the next appropriate method through a call to NextMethod. A function may have a call to NextMethod anywhere in it. It works like UseMethod, but instead of starting at the first element of the class vector, it dispatches on the element that follows the class of the current method (the second element, when the current method was found for the first):

baz <- function(x) UseMethod("baz", x)
baz.A <- function(x) "A"
baz.B <- function(x) "B"

ab <- structure(1, class = c("A", "B"))
ba <- structure(1, class = c("B", "A"))
baz(ab)
# [1] "A"
baz(ba)
# [1] "B"

baz.C <- function(x) NextMethod()
ca <- structure(1, class = c("C", "A"))
cb <- structure(1, class = c("C", "B"))
baz(ca)
# [1] "A"
baz(cb)
# [1] "B"

The exact details are a little tricky: NextMethod doesn't actually work with the class attribute of the object. Instead it uses a special variable (.Class), set up by the dispatch mechanism when the method is called, to keep track of which class to try next. This means that manually changing the class of the object inside a method will have no impact on which method NextMethod calls. The following example illustrates this:

# Try to turn the object into class A - has no effect on dispatch!
baz.D <- function(x) {
  class(x) <- "A"
  NextMethod()
}
da <- structure(1, class = c("D", "A"))
db <- structure(1, class = c("D", "B"))
baz(da)
# [1] "A"
baz(db)
# [1] "B"

Methods invoked as a result of a call to NextMethod behave as if they had been invoked from the previous method. The arguments to the inherited method are in the same order and have the same names as the call to the current method. This means that they are the same as for the call to the generic. However, the expressions for the arguments are the names of the corresponding formal arguments of the current method. Thus the arguments will have values that correspond to their value at the time NextMethod was invoked. Unevaluated arguments remain unevaluated. Missing arguments remain missing.
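To make the point about argument values concrete, here is a minimal sketch in which a method changes its argument before calling NextMethod. The generic describe and the class "doubled" are hypothetical.

```r
# Hypothetical generic with a default method and one class-specific method
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) paste("value:", x)

describe.doubled <- function(x, ...) {
  x <- unclass(x) * 2  # modify the argument before handing over
  NextMethod()         # the next method sees the current value of x
}

result <- describe(structure(21, class = "doubled"))
result
# [1] "value: 42"
```

The default method receives 42 rather than the original 21, because the argument is passed on with the value it had when NextMethod was invoked.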

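If NextMethod is called when there are no classes left to try and no default method exists, or when it is called outside of method dispatch altogether, it will raise an error. A selection of these errors is shown below so that you know what to look for (the exact wording may differ between versions of R).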

c <- structure(1, class = "C")
baz(c)
# Error in UseMethod("baz", x) : 
#   no applicable method for 'baz' applied to an object of class "C"

# Calling the method directly, without going through the generic
baz.C(c)
# Error in NextMethod() : generic function not specified
baz.C(1)
# Error in NextMethod() : object not specified
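One way to avoid the first kind of error is to give the generic a default method, so that NextMethod always has somewhere to go. The sketch below uses two hypothetical generics, qux (with a default) and quux (without), and traps the failure with tryCatch instead of relying on a particular error message.

```r
# With a default method, NextMethod() has a fallback
qux <- function(x) UseMethod("qux")
qux.default <- function(x) "default"
qux.C <- function(x) NextMethod()

with_default <- qux(structure(1, class = "C"))

# Without a default method, the same call fails; trap the error
quux <- function(x) UseMethod("quux")
quux.C <- function(x) NextMethod()

without_default <- tryCatch(
  quux(structure(1, class = "C")),
  error = function(e) "caught an error"
)
```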