Generic arrays in Rust – ebvalaim.log

ebvalaim 2015-09-25 0 Comments

Recently, I decided to try and "translate" the black hole simulator into Rust as an exercise. Initially I wanted to create a library implementing differential geometry, like tensor calculus, metric etc. I quickly encountered a problem.

Rust and arrays

Tensors are objects with some constant number of coordinates, depending on the dimension of the space they function in. For example, in an n-dimensional space, vectors have $n$ coordinates, rank-2 tensors have $n^2$ etc. Arrays are perfect for representing such objects.
Rust handles arrays without problems. An array of N elements of type T is written in Rust as [T; N]. Everything would be fine, except for one tiny detail - I would like my code not to depend on the dimension of the space.

The problem

It is possible to define so-called generic types in Rust. They are datatypes that have internal details parametrized by some other type. For example, we could easily create a generic 3-dimensional vector:

struct Vector3<T> {
    coords: [T; 3]
}

struct Vector3<T> {

coords: [T; 3]

}

What if we wanted to create an n-dimensional vector, though?

struct Vector<T,N> {
    coords: [T; N]  // won't work
}

struct Vector<T,N> {

coords: [T; N] // won't work

}

Nope. You can't express an integer type parameter in Rust.
It looked hopeless, but I accidentally stumbled upon some code that suggested the existence of a solution.

dimensioned and Peano types

In the dimensioned library the author has implemented natural numbers arithmetic... on datatypes. It basically looks like this:
Firstly, we define zero:

struct Zero;

1	struct Zero;

Then we say that zero is a natural number:

trait Natural {}
impl Natural for Zero {}

1 2	trait Natural {} impl Natural for Zero {}

Next, we define the successor of a natural number:

struct Succ<N: Natural> {
    _marker: PhantomData<N>  // suppresses compiler warnings
}

struct Succ<N: Natural> {

_marker: PhantomData<N> // suppresses compiler warnings

}

Fourth: the successor of a natural number is also a natural number:

impl<N: Natural> Natural for Succ<N> {}

1	impl<N: Natural> Natural for Succ<N> {}

And that would be it. We can also name some types:

type One = Succ<Zero>;
type Two = Succ<One>;
// etc...

type One = Succ<Zero>;

type Two = Succ<One>;

// etc...

A function returning an actual number can also prove useful:

trait ToInt : Natural {
    fn to_int() -> i32;
}

impl ToInt for Zero {
    fn to_int() -> i32 { 0 }
}

impl<N: ToInt> ToInt for Succ<N> {
    fn to_int() -> i32 { 1 + N::to_int() }
}

trait ToInt : Natural {

fn to_int() -> i32;

}

impl ToInt for Zero {

fn to_int() -> i32 { 0 }

}

impl<N: ToInt> ToInt for Succ<N> {

fn to_int() -> i32 { 1 + N::to_int() }

}

This way we defined datatypes representing the natural numbers. They are actual types - so they can be used as parameters in generics. Unfortunately, [T; Two] still won't work.

However, we can do this: for the type representing zero we define an empty structure, and for every successor we create a structure with two fields - the predecessor structure and one value of type T. This way, the structure defined for a type representing, for example, number 5 will have space for 5 elements of type T in it - so we will be able to treat it as a 5-element array! Let's get to work.

First, we have to somehow define an association between a structure and a given type:

trait ArrayType<T> : ToInt {
    type Array;
}

trait ArrayType<T> : ToInt {

type Array;

}

Now - structures for zero and the successors:

struct EmptyArray<T> {
    _marker: PhantomData<T>   // suppresses warnings
}

// EmptyArray is an array for zero
impl<T> ArrayType<T> for Zero {
    type Array = EmptyArray<T>;
}

struct SuccArray<T, N: ArrayType<T>> {
    previous_data: N::Array, // the predecessor array
    data: T                  // ... and the additional element
}

// associating arrays with successors
impl<T, N: ArrayType<T>> ArrayType<T> for Succ<N> {
    type Array = SuccArray<T, N>;
}

struct EmptyArray<T> {

_marker: PhantomData<T> // suppresses warnings

}

// EmptyArray is an array for zero

impl<T> ArrayType<T> for Zero {

type Array = EmptyArray<T>;

}

struct SuccArray<T, N: ArrayType<T>> {

previous_data: N::Array, // the predecessor array

data: T // ... and the additional element

}

// associating arrays with successors

impl<T, N: ArrayType<T>> ArrayType<T> for Succ<N> {

type Array = SuccArray<T, N>;

}

Ok, so we defined arrays associated with the number types, which should be enough. It will be nice to have syntax resembling [T; N], though:

struct GenericArray<T, N: ArrayType<T>> {
    data: N::Array
}

struct GenericArray<T, N: ArrayType<T>> {

data: N::Array

}

And so now GenericArray will work. We could now define Vector>, too. Those are real generic arrays :)

A problem again

It turns out that such a system has one serious limitation. The number types are defined recursively and so they hit the limits of the compiler pretty quickly - that is, around 64. It can be increased, but still not even to numbers around 1000, and it is too much for the compiler anyway. It would be wise to change it somehow.

In the thread on reddit the user llogiq suggested using a binary system, similar to that used in shoggoth.rs. After some looking at this way, I decided to try it.

The beginning is very similar, except that we define not one, but two types:

struct _0;
struct _1;

trait Natural {}
impl Natural for _0 {}
impl Natural for _1 {}

struct _0;

struct _1;

trait Natural {}

impl Natural for _0 {}

impl Natural for _1 {}

Then, instead of successors, we define binary numbers:

impl<N: Natural> Natural for (N, _0) {}
impl<N: Natural> Natural for (N, _1) {}

1 2	impl<N: Natural> Natural for (N, _0) {} impl<N: Natural> Natural for (N, _1) {}

This way we have _0, _1, (_1, _0), (_1, _1), ((_1, _0), _0), ... as the natural numbers - that is, we implemented the binary system! Of course, again we could use a conversion to an actual number:

trait ToInt : Natural {
    fn to_int() -> i32;
}

impl ToInt for _0 {
    fn to_int() -> i32 { 0 }
}

impl ToInt for _1 {
    fn to_int() -> i32 { 1 }
}

impl<N: ToInt> ToInt for (N, _0) {
    fn to_int() -> i32 { 2 * N::to_int() }
}

impl<N: ToInt> ToInt for (N, _1) {
    fn to_int() -> i32 { 2 * N::to_int() + 1 }
}

trait ToInt : Natural {

fn to_int() -> i32;

}

impl ToInt for _0 {

fn to_int() -> i32 { 0 }

}

impl ToInt for _1 {

fn to_int() -> i32 { 1 }

}

impl<N: ToInt> ToInt for (N, _0) {

fn to_int() -> i32 { 2 * N::to_int() }

}

impl<N: ToInt> ToInt for (N, _1) {

fn to_int() -> i32 { 2 * N::to_int() + 1 }

}

The rest is again very similar. We assign an empty structure to zero and a structure with one field to one. Then, to each tuple (N, _0) we assign a structure with two fields of type N::Array, and to each tuple (N, _1) - with two fields N::Array and one additional element. And so we have generic arrays described in binary.

The actual implementation in the repository differs a bit from this presented here (mainly in the names of the structures) - I encourage you to look through it! :)

Post Views: 2,185