Project Mata is an approach to managing Mata projects. It handles dependencies and macro definitions, and automates the assembly of ado-files.
You have a complex Mata project, perhaps a lengthy ado-file, that you have broken into a series of separate files. Great! By using multiple files, you facilitate code readability and version control.
However, Mata itself provides no means to compile a file that depends on other files or to assemble the files into an ado-file. Project Mata offers a set of ado-files to fill this gap.
- Dependency management
  - Declaration dependencies
  - External dependencies
- Macro definitions
  - Type macros
  - Constants
  - Enumerations
  - External dependencies
- File structure
- Assembling an ado-file
- Installation
Your project has many Mata source files, and you're updating one: algorithm.mata. You try compiling it, but you've already run into a challenge: algorithm.mata uses the functions defined in two other files, set.mata and graph.mata. This means you need to run those files anytime you compile algorithm.mata:
clear mata
do set.mata
do graph.mata
do algorithm.mata
Project Mata allows you to automate this step by defining these dependencies at the start of algorithm.mata:
* Top of file
matainclude set graph
mata:
// Mata code
end
Here matainclude finds set.mata and runs it, then does the same for graph.mata. If those files have dependencies of their own, matainclude runs those as well, without ever compiling the same file twice. Compiling algorithm.mata becomes that much easier:
clear mata
do algorithm.mata
You can combine these two lines by running matainclude from the Command Window:
matainclude algorithm
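Running each dependency once, before the file that needs it, amounts to a depth-first traversal that remembers which files it has already visited. The sketch below models that idea in Python. It is not matainclude's actual implementation: the DEPENDENCIES table, the compile_order function, and the util source are hypothetical and exist only for illustration.

```python
# Hypothetical dependency table: each source name maps to the names it
# would list on its matainclude line. "set" and "graph" come from the
# example above; "util" is an invented shared dependency.
DEPENDENCIES = {
    "algorithm": ["set", "graph"],
    "set": ["util"],
    "graph": ["util", "set"],
    "util": [],
}


def compile_order(target, dependencies):
    """Return the order in which files would be run so that every
    dependency precedes the file that needs it and no file runs twice."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return  # already scheduled: never run the same file twice
        seen.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        order.append(name)  # run the file only after its dependencies

    visit(target)
    return order


print(compile_order("algorithm", DEPENDENCIES))
```

Marking a file as seen before visiting its dependencies also keeps the traversal from looping forever if two files happen to include each other.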
In rare cases, two classes' methods reference each other. In this case, you need to compile the classes' declarations before their method definitions. Do so using declareclass. This parses a Mata source file that contains a single class, extracting and compiling only its class declaration.
Here's an example. A container stores a set of member objects. Both the container and its members have names.
// Container class declaration
class Container {
    public:
        string scalar name()
        string rowvector member_names()
    private:
        string scalar name
        pointer(class Member scalar) rowvector members
}
// Container method definitions
string rowvector Container::member_names()
{
    real scalar i
    string rowvector names
    names = J(1, length(members), "")
    for (i = 1; i <= length(members); i++)
        names[i] = members[i]->name()
    return(names)
}
...
// Member class declaration
class Member {
    public:
        string scalar name()
        string scalar container_name()
    private:
        string scalar name
        pointer(class Container scalar) scalar container
}
// Member method definitions
string scalar Member::container_name()
    return(container->name())
...
Container::member_names() calls Member::name(), which means that Member must be declared before Container::member_names() is defined. Likewise, Container must be declared before Member::container_name() is defined. Mata needs us to declare both classes before defining either.
Say Container.mata defines class Container, and Member.mata defines Member. Then you would add declareclass to the top of both files:
Container.mata
declareclass Member
mata:
class Container {
...
}
...
end
Member.mata
declareclass Container
mata:
class Member {
...
}
...
end
One Project Mata project may depend on the code of another. For example, set.mata and graph.mata may belong to one Project Mata project while algorithm.mata belongs to another.
Say set and graph are part of project Foundation, and algorithm is part of project Levenshtein. You need to specify that Levenshtein depends on Foundation so that matainclude and declareclass know to search the src directory of Foundation for source files.
To do so, add a file named .external to the root of the dependent project (Levenshtein). Add a line to the file for each dependency, specifying the dependency's root path. Don't list the absolute path of the dependency's project root: Project Mata projects should work across computers. Rather, list the fastcd code that maps to the project root. (For more on fastcd, see the SSC package of the same name.) For example:
.external
foundation
Here, Project Mata assumes that c foundation points to the root of project Foundation.
As Bill Gould has outlined, macros serve a useful purpose in Mata as well as Stata. We wholeheartedly agree.
By storing a long type in a macro of shorter length, we reduce the verbosity of type declarations:
local SR string rowvector
...
`SR' union(`SR' list1, `SR' list2)
This is especially true for classes and structs:
local ListS class my_list_class scalar
...
`ListS' intersection(`ListS' list1, `ListS' list2)
Mata has no namespaces, so programmers use long class names to avoid conflicts. With macros, you can use project-specific short names rather than the full class names.
Mata has few exposed classes (most classes are wrapped in functions), which means that almost every value is real, string, or transmorphic. Macros allow us to add semantic aliases of these types:
// Rowvector of Stata names
local NameR string rowvector
// File handle scalar
local FileHandleS real scalar
...
// We immediately understand the format of these arguments.
void save_dta_subset(`NameR' variables, `FileHandleS' out)
Many StataCorp ado-files use macros in Mata: rename.ado is an excellent example. These ado-files define their macros immediately above their Mata code.
However, this pattern fails for projects with multiple Mata source files, as each file needs to be able to use project macros. Enter matamac.
When you run matamac at the top of a Mata source file, it adds the predefined macros of your Project Mata project:
* Top of file
matamac
mata:
// Immediately start using type macros.
`SR' union(`SR' list1, `SR' list2) ...
`ListS' intersection(`ListS' list1, `ListS' list2) ...
void save_dta_subset(`NameR' variables, `FileHandleS' out) ...
end
You define your project's macros in a file named .matamac at the root of your project directory:
[type]
S = string
List = class my_list_class
Name = string
FileHandle = real
The line S = string defines a series of macros:
local SS string scalar
local SV string vector
local SR string rowvector
local SC string colvector
local SM string matrix
In keeping with StataCorp convention, all Project Mata type macros end in a capital letter that designates the orgtype.
The line List = class my_list_class, defining a class type, results in a similar set of macros:
local List my_list_class
local ListS class my_list_class scalar
local ListV class my_list_class vector
local ListR class my_list_class rowvector
local ListC class my_list_class colvector
local ListM class my_list_class matrix
Here the class name is stored in a macro that can be used in the class definition:
class `List' {
public void concat()
...
}
void `List'::concat(`ListS' list)
Class macros help overcome the lack of namespaces. With them, no project code need refer to actual class names, so those names can be long without degrading readability.
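The expansion of a [type] line into its family of macros is mechanical, so it can be modeled in a few lines. The Python sketch below reproduces the macro lists shown above. It is an illustration only, not matamac's code: the expand_type function and the ORGTYPES table are hypothetical names, and the sketch assumes that a definition beginning with "class" is the only case that produces the extra bare-name macro.

```python
# Orgtype suffixes and the Mata orgtypes they stand for, as listed above.
ORGTYPES = {
    "S": "scalar",
    "V": "vector",
    "R": "rowvector",
    "C": "colvector",
    "M": "matrix",
}


def expand_type(line):
    """Expand one [type] line such as "S = string" into a dict that
    maps macro names to macro contents."""
    name, _, definition = line.partition("=")
    name, definition = name.strip(), definition.strip()
    macros = {}
    if definition.startswith("class "):
        # Assumption: class types also get a macro holding the bare class name.
        macros[name] = definition[len("class "):].strip()
    for suffix, orgtype in ORGTYPES.items():
        macros[name + suffix] = f"{definition} {orgtype}"
    return macros


for macro, contents in expand_type("List = class my_list_class").items():
    print(f"local {macro} {contents}")
```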
A .matamac file may include simple constants in addition to types:
[type]
S = string
List = class my_list_class
[cons]
ImportantProjectValue = something
EssentialProjectSetting = something else
The cons section of this .matamac file defines these macros:
local ImportantProjectValue something
local EssentialProjectSetting something else
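To make the sectioned layout concrete, here is a minimal Python sketch that groups the lines of a .matamac file by section and prints the constants as local macros. The parse_matamac function is hypothetical, and the parsing rules (bracketed section headers, one "name = value" pair per line, blank lines ignored) are assumptions inferred from the examples above rather than a specification of the format.

```python
SAMPLE = """\
[type]
S = string
List = class my_list_class
[cons]
ImportantProjectValue = something
EssentialProjectSetting = something else
"""


def parse_matamac(text):
    """Group the "name = value" lines of a .matamac file by section."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif current is not None and "=" in line:
            name, _, value = line.partition("=")
            sections[current][name.strip()] = value.strip()
    return sections


sections = parse_matamac(SAMPLE)
for name, value in sections["cons"].items():
    print(f"local {name} {value}")
```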
Enumerations allow you to define a Mata type that equals one of a fixed set of values. (Bill Gould discusses these in his Stata Journal article.)
For example, say a function yields a return code that indicates the result of a file write. The operation could be successful or it could fail because the file already exists or because its parent directory does not exist. For such an enumeration, matamac defines macros like these:
local WriteResultS real scalar
...
local WriteResultM real matrix
local ResultSuccess 0
local ResultFileExists 1
local ResultNoParent 2
This leads to client code like this:
`WriteResultS' result
result = write_file(some_file)
if (result == `ResultSuccess')
    display("We did it!")
else if (result == `ResultFileExists')
    ...
else if (result == `ResultNoParent')
    ...
This enumeration's definition in .matamac appears as follows:
[type]
S = string
List = class my_list_class
[enum]
WriteResult = (Result) Success FileExists NoParent
WriteResult designates a name for the enumeration's type macros, while (Result) prefixes the names of the enumeration's value macros.
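The mapping from an [enum] line to its macros can be sketched in the same spirit. The Python below is a hypothetical model, not matamac's implementation. It assumes that the type macros are real-valued with the usual five orgtype suffixes and that values are numbered from 0 in the order listed, which matches the WriteResult example; expand_enum is an invented name.

```python
ORGTYPES = (
    ("S", "scalar"),
    ("V", "vector"),
    ("R", "rowvector"),
    ("C", "colvector"),
    ("M", "matrix"),
)


def expand_enum(line):
    """Expand an [enum] line into type macros and value macros."""
    name, _, rest = line.partition("=")
    name = name.strip()
    tokens = rest.split()
    prefix = tokens[0].strip("()")  # "(Result)" -> "Result"
    values = tokens[1:]
    macros = {}
    for suffix, orgtype in ORGTYPES:
        # Assumption: enumeration values are stored as reals.
        macros[name + suffix] = f"real {orgtype}"
    for code, value in enumerate(values):
        # Assumption: codes count up from 0 in the order listed.
        macros[prefix + value] = str(code)
    return macros


line = "WriteResult = (Result) Success FileExists NoParent"
for macro, contents in expand_enum(line).items():
    print(f"local {macro} {contents}")
```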
matamac defines the macros specified in the project's .matamac file, as well as the .matamac files of the project's external dependencies. To automatically add the core Project Mata macros, add Project Mata as a dependency.
Project Mata requires your project to follow a defined structure.
At the project's root directory, define .matamac for project macros. In fact, the project root is defined as the directory that contains .matamac.
Store your source files, both Stata and Mata, in directory src. Project Mata ado-files do not use other directories, but we recommend that you create a directory named doc for project documentation. For example, doc/help could contain Stata help files while doc/develop stores documentation for project developers.
src can use whatever directory structure you prefer. We recommend a directory named main for primary files and one named cscript for the certification script.
Every source file must be uniquely identified by its base name. For example, a file named something.mata cannot exist in two directories. Base names must be valid Stata names.
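A project can be checked against these two naming rules before anything is compiled. The Python sketch below walks a src directory and reports base names that occur more than once or that are not valid names. It is not part of Project Mata: check_base_names is a hypothetical helper, the regular expression is a simplified version of Stata's naming rule (it ignores reserved words), and treating the base name as the file name without its extension is an assumption.

```python
import os
import re
import tempfile

# Simplified Stata naming rule: letters, digits, and underscores, not
# starting with a digit, at most 32 characters. Reserved words are ignored.
STATA_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,31}$")


def check_base_names(src_dir):
    """Return (duplicates, invalid): base names found more than once under
    src_dir, and base names that fail the simplified naming rule."""
    locations = {}
    for folder, _, files in os.walk(src_dir):
        for filename in files:
            if filename.startswith("."):
                continue  # skip hidden files
            base, _ = os.path.splitext(filename)
            locations.setdefault(base, []).append(os.path.join(folder, filename))
    duplicates = sorted(b for b, paths in locations.items() if len(paths) > 1)
    invalid = sorted(b for b in locations if not STATA_NAME.match(b))
    return duplicates, invalid


# Demonstration on a throwaway directory with one duplicate and one bad name.
with tempfile.TemporaryDirectory() as src:
    for rel in ("main/something.mata", "cscript/something.mata", "main/2fast.mata"):
        path = os.path.join(src, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    print(check_base_names(src))
```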
Classes follow their own structure. Each class should have its own type macro. Every class must be defined in its own file whose name is the same as the class type macro. For example, we may use the macro List for class my_list_class:
.matamac
[type]
List = class my_list_class
Then the source file that contains the definition of my_list_class must be named List.mata.
Use write_ado to assemble the source files of a Project Mata project into an ado-file. Specify three sets of source files:
- A Stata file that defines the program and calls Mata. Specify the file's relative path from the src directory.
- Mata files. Just specify the sources' names.
- (Optional) Mata files whose class declarations must be compiled before other classes' definitions. Specify source names.
For example:
#delimit ;
write_ado using my_ado.ado,
    stata(main/stata.do)
    mata(
        Container
        Member
        set
        graph
        algorithm
    )
    class_declarations(
        Container
        Member
    )
;
#delimit cr
To use Project Mata, clone this repository and add src/ado to your ado-path.
Project Mata requires the SSC package fastcd to manage external dependencies.