Skip to content

Linear Algebra Vision

mrocklin edited this page Feb 9, 2012 · 15 revisions

Introduction

Linear algebra is an important part of mathematics and a very important part of computational mathematics. The number of linear algebraic projects started within SymPy reflects this. Unfortunately this enthusiastic development has resulted in a number of very good but very separated projects. This article outlines a broad top-down vision of what we would like in a Linear Algebra system

Note: this article so far has only a single author. Help by adding your thoughts.

Needs

SymPy has the following needs for linear algebra

External

  • Represent and manipulate small dense matrices with explicit entries (currently Matrix)
  • Represent large sparse matrices with explicit entries (currently SparseMatrix?)
  • Standard mathematical operations on these two types (i.e. solve, eigenvectors)
  • Represent these matrices inside SymPy objects (ImmutableMatrix in PR 1015)
  • Represent purely symbolic matrices, i.e. (X*Y).T (currently MatExpr)
  • Matrix Calculus (i.e. d(xT A x)/ dx)
  • Everything above, but now replace "Matrix" with "Tensor"
  • Tensor index notation (What is a good summation notation easily expressible in Python?)
  • Implement some well known tensor math - i.e. Relativity, fluids
  • Interact with numpy - NumPy objects can contain SymPy ones. SymPy Matrices can not contain NumPy arrays
  • Code generation (somewhat implemented in tensor.Indexed)
  • Fancy indexing? How much of numpy functionality can we bring over?
  • A common case in scientific computing is to have a linear operator rather than a matrix. I.e. an object for which individual indices are not available but the "multiply by a vector" operation is readily available.

Internal

Matrices are used internally by SymPy modules. These are currently using Dense Matrix objects but could benefit from a good efficient sparse solver

  • Integration (indefinite)
  • Solvers

Known issues

Mutability

For algorithmic performance some Matrix types will need to have rapidly changing entries. SymPy Basic objects must be immutable. This forces a split in the type system. Good design must be planned out so that this split can be executed with a minimum of code complexity.

Complex Interaction

Matrices and Tensors are used in a great variety of contexts. Building a simple system which can handle all of these contexts in a unified way may be challenging. Care must be taken to avoid code complexity.

Replication of Expr code

For pure symbolics (i.e. MatrixSymbol) we must build a system to represent and simplify tensor/matrix expressions. This system will be very similar but not identical to the Expr subclasses (Add, Mul, One, Zero, etc....) How can we make our own Add/Mul/Identity/Zero objects and robust simplification routines without redoing all of this work?

Variety

The word Tensor means many things to many people. There are a wide variety of applications. To meet all expectations is a difficult task. Meeting all expectations with simple and elegant code will require some planning. Lets break up and categorize this variety.

Storage

If we choose to explicitly represent elements of an array we may store these elements in a number of ways. There are the following broad categories

  • Dense Storage - keep each element in memory
  • Sparse Storage - our object holds mostly zeros. We can store all the elements in less space by being clever
  • Functional Storage - our object can be represented as a function on indices. For example the identity matrix can be represented by lambda i,j: 1 if i==j else 0

As mentioned above we must also determine whether or not the storage should be Mutabile. This could be accomplished by choosing between containers list, Tuple, dict, etc....

There has also been talk of involving different ground types. I.e. we may want a Matrix of SymPy real numbers, sympy expressions, python floats, numpy floats, or any of the ground types in the poly's class.

NumPy is a powerful numerical library which deals with the same problem. We may wish to provide a special container which directly wraps the numpy.ndarray object or the scipy.sparse objects.

Algebra

Operations on tensors form an algebra. I.e. we often want to combine tensors in some way and the rules for those combinations have some significant structure which we may wish to exploit. However, in different domains we care about different things and we may wish to expose a different algebraic structure to each domain. Lets discuss the variety here.

  • Explicit computation - As in the current SymPy Matrix object, we often want to explicitly compute the elements of any operation.
    • This can happen as we enter the expression
    • Or this can happen lazily
  • Forming Expressions - In this case we don't want to evaluate the elements of the resulting matrix expression. Instead we want to manipulate the expression itself. Think "(U.invXU).T". This occurs in a number of fields
    • We may want a fully abstract treatment involving linear operators, inner product spaces, metric tensors, with potentially infinite dimensions
    • For many physics applications we want fairly but not fully abstract rules. Think fluid dynamics, relativity, continuum mechanics
    • Matrices are a special class of tensors and matrix expressions deserve special treatment. There are a number of additional operations (such as inverse) that we would like to include
    • Code generation

Algorithms

Many algorithms can be performed without knowledge of the underlying storage. Merely having index access is sufficient. Most of the code in the current Matrix class consists of these. Can we make these representation independent?

One of our goals though is for fast Sparse Matrices. For this it seems we may need to have a few special methods which are aware of underlying representation. How can we balance general code with efficiency?

Issues with variety and code structure

Lets consider the number of potential types involved in this project by enumerating the number of types in each category.

  • Storage - Dense, Sparse, Functional
  • Mutability - Immutable, Mutable
  • Ground Type - float, Real, Expr, numpy.float, ...
  • Algebra - Tensor, Matrix, Code generator ...

Now consider all possible combinations. This is infeasible. We'll need to combine these ideas in a nice modular way so that we're not defining a DenseImmutableTensorOfExpr, DenseMutableMatrixOfFloat, etc... objects.

Some Proposals

Tensor Indexing

All objects we may wish to consider share a few traits. One is that they are indexed. A good general model at this level for indexing tensors may simplify the rest of the code. Care should be taken at this step.

Tensor calculus asks that we write equations like the following

$$R_{ab} - \frac{1}{2}g_{ab}R + g_{ab}\Lambda = \frac{8\pi G}{c^4}T_{ab}$$

We would like to be able to express this in the SymPy language easily and elegantly. What is the best way to get equations out of this out of the Python syntax?

What about contravariance/covariance? How should we represent different types of indices

One option

i,j,k,a,b,c,d = indicies('ijkabcd')
g[i,j]*u[i] # $g^i_j * u^i$
R[a|b, c|d] # $R^{ab}_{cd}$

Can this syntax be used throughout the code. I.e. it would be nice if Matrix.matrix_multiply and trace could be written as

def matrix_multiply(self, other):
    i,j,k = indicies('ijk')
    return self[i,j]*other[j,k]

def trace(self):
    i = Index('i')
    return self[i,i]

This would work toward SymPy's goal of writing clear code that reflects the mathematics.

Just as a small side note, what about symmetrized/antisymmetrized index tuples? These may even span over multiple objects. Probably gets really difficult here.

Some individual projects

Here are some ideas on how this work can be split up.

Framework Building

This framework needs to be built so that other projects can build off of it. This requires work and feedback from the community.

A Good NDArray container object

To reduce complexity we should separate the storage from the algorithms. A good NDArray container object might include the following

  • NumPy like syntax including strides X[:,:2:, 0]
  • Dense container objects
  • Sparse container objects
  • Functional container objects
  • Views like transposes, slices, or reshapings that refer back to the same data
  • Integration of numpy and scipy sparse
  • Efficiency

Sparse Matrix Algorithms

Sherjil wrote a lot of code on this front last Summer but it was never merged into master. Much of it can likely be salvaged.

Applications of Tensors to Physics

There is lots of work in physics to do. When designing a general framework its important to know what demands will be made downstream. How general do we need to be to solve interesting problems in General Relativity for example?

User Interface

Whatever we build should be comfortable and intuitive to the user. There are too many systems out there that are powerful, have a steep learning curve, and unused. Lets not become one of them.

Interested Parties

Currently most of this page is written by user mrocklin. If you're also interested feel free to shoot him an e-mail. He is playing around with some ideas on this github branch https://github.com/mrocklin/sympy/tree/tensor . It's undocumented and changing so don't expect much.

Clone this wiki locally