I started this project knowing nothing about C++ and thought it would be fun to learn the language by creating my own programming language, Theta. The language has no real purpose and was built purely for educational purposes.
Theta is a strongly typed, functional, compiled programming language designed to be data-driven and composable, with built-in support for pattern matching and modular organization through capsules. The language aims to provide a clean and expressive syntax while ensuring type safety and functional programming paradigms.
The formal language grammar is available in BNF format.
Thank you for your interest in contributing to the Theta programming language! Here are some guidelines to help you get started:
- Git Submodules: We use submodules to manage some dependencies. Make sure to initialize and update the submodules by running:
git submodule update --init --recursive
- C++ Compiler: Ensure you have a compiler that supports C++17.
- Clone the Repository: If you haven't already, clone the repository:
git clone https://github.com/alexdovzhanyn/ThetaLang.git
cd ThetaLang
- Initialize Submodules: Initialize and update the submodules for dependencies like Binaryen:
git submodule update --init --recursive
- Build the Project: Run the build script to compile Theta:
./build.sh
To verify that Theta has been installed correctly, run the following command:
theta --version
This should display the current version of Theta.
- Fork the Repository: Create a fork of this repository on GitHub.
- Create a Branch: Create a new branch for your feature or bug fix:
git checkout -b some-feature-branch
- Make Changes: Implement your changes and commit them to your branch.
- Run the Tests: Run ./build/LexerTest and ./build/ParserTest to make sure your changes didn't break any existing functionality.
- Submit a Pull Request: Push your changes to your fork and submit a pull request to this repository.
Theta has an Interactive Theta (ITH) REPL that can be accessed by just typing theta into the terminal. Right now all expressions must fit on one line, because the REPL treats a newline as the signal to submit and compile the code. The REPL doesn't yet interpret the code; it only shows you the AST that is generated.
To test and run Theta code, you can use the Theta Browser Playground, which runs Theta code in the browser.
If you encounter any issues or have suggestions for improvements, please use the Issues page to report them. Thank you for contributing to Theta!
Identifiers are used to name variables, functions, capsules, and structs. They must start with a letter or underscore and can contain letters, digits, and underscores.
Identifier = Letter (Letter | Digit | "_")*
Letter = "A".."Z" | "a".."z" | "_"
Digit = "0".."9"
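The grammar above is small enough to check by hand. As a minimal sketch in C++ (the language Theta itself is built with), the three rules could be mirrored as shown here. The function names are hypothetical and are not taken from the Theta lexer, and reserved words are deliberately not rejected.

```cpp
#include <cassert>
#include <string>

// Letter = "A".."Z" | "a".."z" | "_"
inline bool isLetter(char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

// Digit = "0".."9"
inline bool isDigit(char c) {
    return c >= '0' && c <= '9';
}

// Identifier = Letter (Letter | Digit | "_")*
// Hypothetical helper: it checks the shape only, not the reserved-word list.
inline bool isIdentifier(const std::string& text) {
    if (text.empty() || !isLetter(text[0])) {
        return false;
    }
    for (std::size_t i = 1; i < text.size(); ++i) {
        if (!isLetter(text[i]) && !isDigit(text[i])) {
            return false;
        }
    }
    return true;
}

// Usage example: valid and invalid names under the grammar.
inline void checkIdentifierExamples() {
    assert(isIdentifier("greeting"));
    assert(isIdentifier("_private1"));
    assert(!isIdentifier("1abc"));  // must not start with a digit
}
```

Because Letter already includes the underscore, the separate "_" alternative in the Identifier rule does not change which names are accepted.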
Reserved words that have special meaning in Theta:
link, capsule, struct, true, false, void, Number, String, Boolean, Symbol, Enum
Single-line comments start with // and extend to the end of the line. Multi-line comments start with /- and extend until reaching a -/.
Example:
// This is a single line comment
// This is another
/-
This is a multiline comment.
It extends over multiple lines.
-/
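To make the two comment forms concrete, here is a rough C++ sketch of how a tool might strip them from source text. The helper name is hypothetical and this is not how the Theta lexer is implemented. It assumes that multi-line comments do not nest, and it ignores string literals for brevity, so a comment marker inside a quoted string would also be removed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the source text with Theta comments removed.
inline std::string stripComments(const std::string& source) {
    std::string result;
    std::size_t i = 0;
    while (i < source.size()) {
        if (source.compare(i, 2, "//") == 0) {
            // Single-line comment: skip up to, but not including, the newline.
            while (i < source.size() && source[i] != '\n') {
                ++i;
            }
        } else if (source.compare(i, 2, "/-") == 0) {
            // Multi-line comment: skip through the closing "-/".
            // An unterminated comment swallows the rest of the input.
            std::size_t close = source.find("-/", i + 2);
            i = (close == std::string::npos) ? source.size() : close + 2;
        } else {
            result += source[i];
            ++i;
        }
    }
    return result;
}

// Usage example: both comment styles disappear, the code around them stays.
inline void checkStripComments() {
    assert(stripComments("a // note\nb") == "a \nb");
    assert(stripComments("x /- hidden -/ y") == "x  y");
}
```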
- String: Represented by single quotes 'example'.
- Boolean: Represented by true or false.
- Number: Represents both integers and floating-point numbers.
- List: Represented by square brackets [ ].
- Dict: Represented by curly braces { }.
- Symbol: Represented by a colon followed by an identifier, e.g., :symbol.
User-defined data structures composed of primitives or other structs. Structs can only be defined within capsules. A struct can be referenced by name within the capsule in which it is defined, but must be prefixed with its containing capsule when used in another capsule.
struct StructName {
fieldName<Type>
...
}
Example:
capsule Messaging {
struct MessageRequest {
hostname<String>
port<Number>
path<String>
method<String>
headers<MessageRequestHeaders>
}
}
// If referenced in another capsule
Messaging.MessageRequest
Enumerated types with custom values represented as symbols. Enum names must be in Pascal case. Enums are scoped the same way as variables: an enum defined in a capsule is accessible from outside the capsule, while an enum defined within a function is scoped to that function.
enum EnumName {
:ENUM_1
:ENUM_2
...
}
Within a capsule:
capsule Networking {
enum Status {
:SUCCESS
:FAILURE
:PENDING
}
}
// Used like so:
myVar == Networking.Status.SUCCESS
// Or like so, if being referenced from within the capsule:
myVar == Status.SUCCESS
Within a function:
capsule Networking {
isNetworkRequestSuccess<Boolean> = request<NetworkRequest> -> {
enum PassingStatuses {
:SUCCESS
:REDIRECT
}
return Enumerable.includes(PassingStatuses, request.status)
}
isNetworkRequestFailure<Boolean> = request<NetworkRequest> -> {
// PassingStatuses is not available in here
}
}
Variables are declared by their name, suffixed with their type, followed by an equal sign and their value. Variables are immutable.
variableName<Type> = value
Example:
greeting<String> = 'Hello, World'
Functions are defined as variables pointing to a block. The return type is specified after the function name.
functionName<ReturnType> = (param1<Type1>, param2<Type2>, ...) -> {
// function body
}
Example:
add<Number> = (a<Number>, b<Number>) -> a + b
Functions can be composed using the => operator, where the value on the left is passed as the first argument to the function on the right.
value => function
Example:
requestParams => Json.encodeStruct() => Http.request()
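Read with the rule above, the chain in the example is equivalent to the nested call Http.request(Json.encodeStruct(requestParams)). The same "left value becomes the first argument" idea can be sketched in C++ as an analogy. The pipe helper and the add and multiply functions are hypothetical names made up for illustration, and this says nothing about how the Theta compiler lowers the operator.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper modelling `value => function(rest...)`:
// the piped value is placed in front of any remaining arguments.
template <typename Value, typename Function, typename... Rest>
auto pipe(Value&& value, Function&& function, Rest&&... rest) {
    return std::forward<Function>(function)(std::forward<Value>(value),
                                            std::forward<Rest>(rest)...);
}

inline int add(int a, int b) { return a + b; }
inline int multiply(int a, int b) { return a * b; }

// Each step feeds its result into the next call as the first argument,
// so this evaluates multiply(add(3, 4), 2).
inline int pipelineExample() {
    return pipe(pipe(3, add, 4), multiply, 2);
}

// Usage example: the chained form and the nested form agree.
inline void checkPipeline() {
    assert(pipelineExample() == multiply(add(3, 4), 2));
}
```

The chained form reads left to right in the order the steps run, which is the main readability benefit over nesting the calls.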
Capsules are static groupings of related functions, variables, and structs, providing modularity and namespace management. All code in Theta must be contained within capsules.
capsule CapsuleName {
// variable, function, and struct definitions
}
Example:
capsule Math {
add<Number> = (a<Number>, b<Number>) -> a + b
subtract<Number> = (a<Number>, b<Number>) -> a - b
struct Point {
x<Number>
y<Number>
}
origin<Point> = @Point { x: 0, y: 0 }
}
Capsules are imported using the link keyword.
link CapsuleName
Example:
link Http
link Json
Pattern matching allows for intuitive matching of data structures.
match value {
pattern1 >> result1
pattern2 >> result2
...
_ => defaultResult
}
Example:
matchStatus<String> = status<Enum> -> {
match status {
:SUCCESS >> 'Operation was successful'
:FAILURE >> 'Operation failed'
_ => 'Unknown status'
}
}
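For readers coming from C++, the matchStatus example behaves roughly like a switch with a default branch. The sketch below is only a comparison: the Status enum stands in for Theta symbols, and its name and members are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-in for Theta symbols such as :SUCCESS (hypothetical enum).
enum class Status { Success, Failure, Pending };

// Mirrors the matchStatus example: one result per pattern, plus a fallback.
inline std::string matchStatus(Status status) {
    switch (status) {
        case Status::Success:
            return "Operation was successful";
        case Status::Failure:
            return "Operation failed";
        default:
            // Plays the role of the `_` wildcard arm.
            return "Unknown status";
    }
}

// Usage example: a value without its own arm falls through to the default.
inline void checkMatchStatus() {
    assert(matchStatus(Status::Pending) == "Unknown status");
}
```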
Theta supports list, dictionary, and struct destructuring during variable assignment. It is a powerful way to pattern match values out of variables, based on the shape of the data:
myList<List<Number>> = [ 1, 2, 3 ]
// a is 1, b is 2, and c is 3. Notice that you don't have to specify the types of
// a, b, and c here. That is because the compiler can infer the types, because it
// knows we are destructuring a List<Number>, so its values must all be of type Number
[ a, b, c ] = myList
myDict<Dict<Number>> = { x: 1, y: 2, z: 3 }
// x is 1, y is 2, z is 3
{x, y, z} = myDict
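C++17 structured bindings offer a loose point of comparison for readers who know that language. This is only an analogy under stated assumptions: the Point3 struct and the function names are hypothetical, and C++ binds by position where Theta's dictionary and struct form binds by key name.

```cpp
#include <array>
#include <cassert>

// Illustrative struct; the name Point3 is hypothetical.
struct Point3 {
    int x;
    int y;
    int z;
};

// Comparable to destructuring a List<Number> into a, b and c.
inline int digitsFromList() {
    std::array<int, 3> myList{1, 2, 3};
    // The element type is deduced, so a, b and c need no annotations.
    auto [a, b, c] = myList;
    return a * 100 + b * 10 + c;
}

// Comparable to destructuring fields out of a struct.
inline int digitsFromStruct() {
    Point3 point{4, 5, 6};
    // Bound by declaration order of the fields, not by their names.
    auto [x, y, z] = point;
    return x * 100 + y * 10 + z;
}

// Usage example: each binding receives the value at its position.
inline void checkDestructuring() {
    assert(digitsFromList() == 123);
    assert(digitsFromStruct() == 456);
}
```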
Theta code files are saved with the extension .th.
Putting it all together, here is a complete example using the discussed features:
// in Math.th
capsule Math {
struct Point {
x<Number>
y<Number>
}
distance<Number> = (point1<Point>, point2<Point>) -> {
// Calculate distance...
}
dimensionalDistanceX<Number> = (point1<Point>, point2<Point>) -> {
{ x: point1X } = point1
{ x: point2X } = point2
return point2X - point1X
}
}
// in Main.th
link Math
capsule Main {
point1<Math.Point> = @Math.Point { x: 0, y: 0 }
point2<Math.Point> = @Math.Point { x: 3, y: 4 }
distance<Number> = Math.distance(point1, point2)
}
Theta is designed to be a modern, strongly typed, functional programming language that emphasizes modularity through capsules and clarity through its syntax and structure. This specification outlines the core features and syntax of Theta, providing a foundation for further development and refinement.