2. Neural Networks
- Input x_i, output y_j, weight matrix W_ij
- Supervised learning: adjust W to minimize the error on a dataset (a gradient-step sketch follows the code below)
y_j = W_ij x_i                  (matrix multiplication)
y_j = W_ij x_i + c_j            (+ bias)
y_bj = W_ij x_bi + c_j          (+ batch dimension)
y_bj = f(W_ij x_bi + c_j)       (+ activation function)
z_bj = W_ij x_bi + c_j
y_bj = f(z_bj) = relu(z_bj)
```python
import tensorflow as tf

def dense(x, w, c):
    # x: [batch, in_units], w: [in_units, units], c: [units]
    z = tf.matmul(x, w) + c
    # or: z = tf.einsum("ij,bi->bj", w, x) + c
    y = tf.nn.relu(z)
    return y

# x has shape [batch, in_units]
w = tf.Variable(tf.random.normal([x.shape[1], 10]))
c = tf.Variable(tf.zeros([10]))

with tf.GradientTape() as tape:
    y = dense(x, w, c)
dy_dw = tape.gradient(y, w)
```
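The bullet above says to adjust W to minimize error; here is a minimal sketch of one such update (the SGD optimizer, learning rate, and `targets` tensor are illustrative assumptions, not part of the original snippet):

```python
# One gradient-descent step on a mean-squared error.
# Assumes x (inputs) and targets (desired outputs) are given.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

with tf.GradientTape() as tape:
    y = dense(x, w, c)
    loss = tf.reduce_mean(tf.square(y - targets))

grads = tape.gradient(loss, [w, c])
optimizer.apply_gradients(zip(grads, [w, c]))
```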
Keras - Layers as objects
```python
# A layer object holds its own weights, created on first call.
dense_layer = tf.keras.layers.Dense(
    units=10, activation="relu"
)

# First call builds the weights from x's shape.
y = dense_layer(x)

# The same layer (and thus the same weights) can be reused on another input.
v = dense_layer(u)
```
Point-cloud Classification
```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

network = Sequential([
    Dense(units=64, activation="relu"),
    Dense(units=64, activation="relu"),
    Dense(units=40, activation="softmax")
])

network.compile(loss="categorical_crossentropy",
                optimizer="Adam")

network.fit(train_inputs, train_outputs,
            epochs=10, validation_split=0.1)

predictions = network.predict(test_inputs)
```
4. GA in TensorFlow
Geometric product: multiplication table
For e_1^2 = e_2^2 = 1:
| ×   | 1   | e1   | e2  | e12 |
|-----|-----|------|-----|-----|
| 1   | 1   | e1   | e2  | e12 |
| e1  | e1  | 1    | e12 | e2  |
| e2  | e2  | -e12 | 1   | -e1 |
| e12 | e12 | -e2  | e1  | -1  |

(rows: left factor, columns: right factor)
Example: table[e1, e12] = e2
or with blade indices (scalar: 0, e1: 1, e2: 2, e12: 3): table[1, 3] = 2
The table can be precomputed for a given algebra
Multivectors: basis blade coefficients as a 1-tensor
E.g.: a = 1 + 2e_x + 3e_y + 4e_xy → [1, 2, 3, 4] = a_i
Dense representation of the multiplication table as a 3-tensor C_ijk:
ab = y → a_i b_j C_ijk = y_k
```python
a = tf.constant([1, 2, 3, 4], dtype=tf.float32)  # shape [4]
b = tf.constant([3, 9, 7, 5], dtype=tf.float32)  # shape [4]
# c is precomputed and has shape [4, 4, 4]
y = tf.einsum("i,j,ijk->k", a, b, c)  # y has shape [4]
```
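The precomputed `c` can be built directly from the multiplication table. A minimal sketch for the 2D algebra with e_1^2 = e_2^2 = 1 above (the blade order [1, e1, e2, e12] and the (index, sign) encoding are my own illustration, not tfga's internal representation):

```python
import numpy as np
import tensorflow as tf

# Each table cell (row i = left blade, column j = right blade) is encoded
# as (k, sign), meaning blade_i * blade_j = sign * blade_k.
table = [
    [(0, 1), (1, 1),  (2, 1), (3, 1)],   # 1   * {1, e1, e2, e12}
    [(1, 1), (0, 1),  (3, 1), (2, 1)],   # e1  * {1, e1, e2, e12}
    [(2, 1), (3, -1), (0, 1), (1, -1)],  # e2  * {1, e1, e2, e12}
    [(3, 1), (2, -1), (1, 1), (0, -1)],  # e12 * {1, e1, e2, e12}
]

c_np = np.zeros([4, 4, 4], dtype=np.float32)
for i, row in enumerate(table):
    for j, (k, sign) in enumerate(row):
        c_np[i, j, k] = sign  # coefficient of blade k in blade_i * blade_j
c = tf.constant(c_np)  # shape [4, 4, 4]
```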
The einsum above implements the geometric product; multivector addition is simply elementwise tensor addition (a + b).
Reminder: Automatic differentiation
y = ab,   ∂y(a, b) / ∂a
```python
a = tf.constant([1, 2, 3, 4], dtype=tf.float32)  # shape [4]
b = tf.constant([3, 9, 7, 5], dtype=tf.float32)  # shape [4]

# persistent=True because the tape is queried twice below;
# constants must be watched explicitly to record gradients.
with tf.GradientTape(persistent=True) as tape:
    tape.watch(a)
    # c is precomputed and has shape [4, 4, 4]
    y = tf.einsum("i,j,ijk->k", a, b, c)

print(tape.gradient(y, a))  # = d/da_i sum_k(y_k), has shape [4]
print(tape.jacobian(y, a))  # = d/da_i y_k, has shape [4, 4]
```
```python
import tfga

ga = tfga.GeometricAlgebra([1, 1, 1])

# Create geometric algebra tf.Tensor for vector blades
# (ie. e_0 + e_1 + e_2).
# Result: tf.Tensor [0, 1, 1, 1, 0, 0, 0, 0]
vector = ga.from_tensor_with_kind(
    tf.ones(3), kind="vector"
)

# 5 + 5 e_01 + 5 e_02 + 5 e_12
# Result: tf.Tensor [5, 0, 0, 0, 5, 5, 5, 0]
quaternion = ga.from_tensor_with_kind(
    tf.fill(dims=[4], value=5.0),
    kind="even"
)
```
```python
# Geometric product of the two multivectors.
ga.print(ga.geom_prod(vector, quaternion))

# Reversion (~): reverses the basis vectors within each blade,
# flipping the sign of the bivector parts here.
ga.print(ga.reversion(quaternion))

# Select the coefficient of blade e_10.
ga.print(ga.select_blades(quaternion, "10"))

# Keep only blade e_10, zeroing all other coefficients.
ga.print(ga.keep_blades(quaternion, "10"))

# Exterior (wedge) product: e_01 ^ e_2 = e_012.
ga.print(ga.ext_prod(ga.e01, ga.e2))
```
```python
ga = tfga.GeometricAlgebra([1, 1, 1])
a = ga.from_tensor(...)
b = ga.from_tensor(...)

# Wrap the tf.Tensors as MultiVector objects.
mv_a = ga(a)
mv_b = ga(b)

# Reversion via operator syntax.
print(~mv_a)

# Geometric, inner and exterior products.
print(mv_a * mv_b, mv_a | mv_b, mv_a ^ mv_b)

# Access the underlying tf.Tensor again.
print(mv_a.tensor)
```
Tensor vs MultiVector API
- Tensors: easy TensorFlow interop
- MultiVector: less verbose, no overhead
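For illustration, the same geometric product in both styles (continuing the snippet above, so `ga`, `a` and `b` are assumed to exist):

```python
y1 = ga.geom_prod(a, b)       # tensor API: tf.Tensor in, tf.Tensor out
y2 = (ga(a) * ga(b)).tensor   # MultiVector API, unwrapped at the end
```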
5. GA Neural Nets
Standard Dense layer: y_bj = f(W_ij x_bi + c_j)
W_ij ∈ R^(M×N), c_j ∈ R^N
```python
dense_layer = Dense(
    units=64, activation="relu"
)
```
GA Dense layer: make the parameters multivector-valued
W_ij ∈ Cl^(M×N)(p, q, r), c_j ∈ Cl^N(p, q, r)
```python
ga = tfga.GeometricAlgebra([1, 1, 1])

# Blade indices of the even subalgebra (scalar + bivectors).
even_indices = ga.get_kind_blade_indices("even")
# Blade indices of the odd part (vectors + trivector).
odd_indices = ga.get_kind_blade_indices("odd")

# Dense layer with multivector-valued kernel and bias:
# the kernel uses even blades, the bias odd blades.
ga_dense_layer = tfga.layers.GeometricProductDense(
    algebra=ga,
    units=64, activation="relu",
    kernel_blade_indices=even_indices,
    bias_blade_indices=odd_indices
)
```
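A hedged sketch of how such a layer could sit in a complete model, converting ordinary tensors to multivectors and back (TensorToGeometric and GeometricToTensor also appear in the LieNet example below; the blade choices and sizes here are illustrative):

```python
from tensorflow.keras import Sequential
from tfga.layers import (TensorToGeometric, GeometricToTensor,
                         GeometricProductDense)

vector_indices = ga.get_kind_blade_indices("vector")

model = Sequential([
    # Embed each input 3-vector at the vector blade indices.
    TensorToGeometric(ga, blade_indices=vector_indices),
    GeometricProductDense(
        algebra=ga, units=64, activation="relu",
        kernel_blade_indices=even_indices,
        bias_blade_indices=odd_indices
    ),
    # Read the vector coefficients back out as an ordinary tf.Tensor.
    GeometricToTensor(ga, blade_indices=vector_indices)
])
```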
Choices
- Activation function f: elementwise, or acting on the whole multivector (e.g. e^(a e_12) = cos(a) + e_12 sin(a))?
- Geometric product W_ij x_i or sandwich product W_ij x_i ~W_ij?
- Matrix multiplication W_ij x_i or elementwise multiplication W_i x_i?
- Add a bias c_j?
- Algebra / signature
- Subalgebra for W_ij and c_j
No activation, sandwich product, elementwise multiplication, no bias, quaternion weights:
y^(1)_i = W_i x_i ~W_i
Two layers:
y^(2)_i = U_i y^(1)_i ~U_i = U_i W_i x_i ~W_i ~U_i
Composition of transforms: since ~W_i ~U_i = ~(U_i W_i), two sandwich layers are just the single sandwich product (U_i W_i) x_i ~(U_i W_i)
So a multi-layer network reduces to one layer
Log input, Exp on final output, ReLU activation, geometric product, matrix multiplication, bias, quaternion weights:
x^(1)_i = log(x^(0)_i)
x^(n+1)_j = relu(W^(n)_ij x^(n)_i + c^(n)_j)
y_j = exp(W^(N-1)_ij x^(N-1)_i + c^(N-1)_j)
log maps from the Lie group to its Lie algebra
exp maps from the Lie algebra back to the group (e.g. log(cos(a) + e_12 sin(a)) = a e_12)
6.1 LieNet
Deep Learning on Lie Groups for Skeleton-based Action Recognition
Implementation in TFGA by Hugo Hadfield
```python
ga = GeometricAlgebra([1, 1, 1])

model = Sequential([
    # Embed the 4 input components at the even-grade blade
    # indices (e_12, e_02, e_01, scalar).
    TensorToGeometric(ga, blade_indices=[6, 5, 4, 0]),
    RotMap(ga, use_bias=False),  # learned rotor map (LieNet)
    LogMap(ga),                  # Lie group -> Lie algebra
    Flatten(),
    ReLU(),
    Dense(units=20),
    Softmax(axis=-1)
])
```
Train model on dataset (Pose-sequence → Action)
```python
model.compile(
    optimizer="Adam",
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"]
)

model.fit(
    x=inputs_train, y=labels_train,
    validation_data=(inputs_test, labels_test),
    epochs=100
)
```
Results
- Relatively easy and quick to implement using TFGA
- Accuracy verified
- But: outperformed by a simple Dense ReLU NN (~90% accuracy)
6.3 Lattice QFT
L_QED(X) = ⟨ ħ (∇ψ(X)) γ_2 γ_1 γ_3 ~ψ(X) − e A(X) ψ(X) γ_0 ~ψ(X) − m ψ(X) ~ψ(X) ⟩_0
Action: S = ∫_X L(X) dX
- Many cells, many parallel calculations: perfect for TensorFlow
- TensorFlow Probability provides the MCMC samplers (or variational inference) needed for lattice QFT (see the sampling sketch at the end of this section)
Mass term: m ψ(X) ~ψ(X)
```python
sta = tfga.GeometricAlgebra([1, -1, -1, -1])

def get_mass_term(psi, electron_mass):
    return (electron_mass *
            sta.geom_prod(psi, sta.reversion(psi)))
```
Interaction term: e A(X) ψ(X) γ_0 ~ψ(X)
```python
def get_interaction_term(psi, a, electron_charge):
    return sta.geom_prod(
        electron_charge * a,
        sta.geom_prod(
            psi,
            sta.geom_prod(sta.e0, sta.reversion(psi))
        )
    )
```
Momentum term: ħ (∇ψ(X)) γ_2 γ_1 γ_3 ~ψ(X)
```python
def get_momentum_term(psi, spacing, hbar):
    dt_psi = finite_differences(psi, axis=0, spacing=spacing)
    dx_psi = finite_differences(psi, axis=1, spacing=spacing)
    dy_psi = finite_differences(psi, axis=2, spacing=spacing)
    dz_psi = finite_differences(psi, axis=3, spacing=spacing)
    d_psi = dt_psi + dx_psi + dy_psi + dz_psi

    return sta.geom_prod(
        hbar * d_psi,
        sta.geom_prod(sta.e213, sta.reversion(psi))
    )
```
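The `finite_differences` helper is not shown in the snippet above; here is a minimal sketch of what it might look like (a forward difference with periodic boundary conditions is my assumption):

```python
def finite_differences(field, axis, spacing):
    # (field(x + spacing) - field(x)) / spacing along one lattice axis,
    # with periodic boundaries via tf.roll; the blade index stays last.
    return (tf.roll(field, shift=-1, axis=axis) - field) / spacing
```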
Sum the cells' Lagrangian densities to get the action S:
```python
def get_action(psi, a, electron_mass, electron_charge,
               spacing, hbar):
    mass_term = get_mass_term(psi=psi,
                              electron_mass=electron_mass)
    int_term = get_interaction_term(psi=psi,
                                    a=a, electron_charge=electron_charge)
    mom_term = get_momentum_term(psi=psi,
                                 spacing=spacing, hbar=hbar)

    # Sum the terms and take each cell's scalar part (blade index 0).
    lagrangians = (mom_term - mass_term - int_term)[..., 0]

    return tf.reduce_sum(lagrangians)
```
| return tf.reduce_sum(lagrangians) |