2024/06/28
I’ve recently seen a trend of implementing Andrej Karpathy’s popular autograd engine, micrograd, in a variety of languages. I’d attribute this to the growing interest in understanding the fundamentals of neural networks, which goes hand in hand with the ongoing AI hype train.
Now, there are probably already a couple of micrograd implementations in Go, but as a relearning exercise, I’ve decided to write one from memory instead of replicating everything Karpathy did, based on what I remember to be the significant logical portion of the original library.
Why Go? Because it’s simple, readable, and gets the job done. I could have used Rust for a challenge, but the objective here is to refresh on backpropagation as quickly and simply as possible, without any qualms with the borrow checker (Neuron is a recursive type, so I would probably end up Box<>-ing everything onto the heap).
Anyway, this article isn’t going to be a long one, since the whole thing is extremely simple.
We have our Neuron struct:
type Neuron struct {
	Value        float64
	Grad         float64
	Partner      *Neuron
	Constituents [2]*Neuron // index 0 is always the first operand and index 1 is always the second
	Operation    Op         // Operation used to create this Neuron
}
It holds:
=> Value (float64), the value of the Neuron,
=> Grad (float64), the gradient of the Neuron,
=> Partner, a pointer to the other Neuron involved in the mathematical operation,
=> Constituents, an array of pointers to the two Neurons which created the current Neuron,
=> Operation, the operation used to create the current Neuron.
The Operation field is just an enum of the type Op, defined as follows:
type Op uint

const (
	NoneOp Op = iota
	AddOp
	MulOp
	PowOp
)
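A Neuron that wasn’t produced by any operation (i.e. an input) simply keeps the zero values for the bookkeeping fields, which is also why NoneOp comes first in the iota. A quick illustration (my own, not from the repository):

n := Neuron{Value: 5.0}
// n.Grad == 0, n.Partner == nil
// n.Constituents == [2]*Neuron{nil, nil}
// n.Operation == NoneOp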
Now all we have to do is simply implement each of these operations:
// Adds two neurons
func (n *Neuron) Add(to *Neuron) Neuron {
	n.Partner = to
	to.Partner = n
	res := Neuron{Value: n.Value + to.Value}
	res.Constituents = [2]*Neuron{n, to}
	res.Operation = AddOp
	return res
}
// Multiplies two neurons
func (n *Neuron) Mul(to *Neuron) Neuron {
	n.Partner = to
	to.Partner = n
	res := Neuron{Value: n.Value * to.Value}
	res.Constituents = [2]*Neuron{n, to}
	res.Operation = MulOp
	return res
}
// Pow() function for two neurons
// Here, the constituents have index 0 as the base and index 1 as the power
func (n *Neuron) Pow(to *Neuron) Neuron {
	n.Partner = to
	to.Partner = n
	res := Neuron{Value: math.Pow(n.Value, to.Value)}
	res.Constituents = [2]*Neuron{n, to}
	res.Operation = PowOp
	return res
}
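To make the bookkeeping concrete, here’s a minimal usage sketch (the variable names are my own, not from the repository):

a := Neuron{Value: 2.0}
b := Neuron{Value: 3.0}
c := a.Mul(&b)
// c.Value == 6
// c.Constituents == [2]*Neuron{&a, &b}, c.Operation == MulOp
// a.Partner == &b, b.Partner == &a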
Apart from the usual boilerplate, all that’s left is a simple backpropagation method called Gradient():
// Backprop function
func (n *Neuron) Gradient() {
	n.Grad = 1.0
	traverse(n.Constituents[0], n.Grad, n.Operation, true)
	traverse(n.Constituents[1], n.Grad, n.Operation, false)
}
The traverse function walks through every constituent Neuron of each Neuron and computes the gradients for the different operations using partial derivatives and the chain rule. For example, if res = n * to, then ∂res/∂n = to.Value, so n’s gradient is the incoming gradient multiplied by to.Value. This continues until there are no constituents left in the Neuron chain:
// first^second
func traverse(n *Neuron, prevgrad float64, operation Op, first bool) {
	head := n
	if head == nil {
		return
	}
	switch operation {
	case AddOp:
		head.Grad = prevgrad
	case MulOp:
		head.Grad = prevgrad * (head.Partner.Value)
	case PowOp:
		if first {
			// d(x^y)/dx = y * x^(y-1) = x^y * (y/x)
			head.Grad = prevgrad * (math.Pow(head.Value, head.Partner.Value) * (head.Partner.Value / head.Value))
		} else {
			// d(x^y)/dy = x^y * ln(x)
			head.Grad = prevgrad * (math.Pow(head.Partner.Value, head.Value) * (math.Log(head.Partner.Value)))
		}
	}
	// Keep walking down the graph until a Neuron has no constituents left
	traverse(head.Constituents[0], head.Grad, head.Operation, true)
	traverse(head.Constituents[1], head.Grad, head.Operation, false)
}
Note: Notice how the order of the constituents matters when computing more complex functions like Pow().
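To check the chain rule end to end, we can extend the earlier sketch with an addition and run Gradient() on the result (the values and expected gradients below are my own worked example):

e := Neuron{Value: 4.0}
d := c.Add(&e) // d = a*b + e = 10
d.Gradient()
// d.Grad == 1, seeded by Gradient()
// c.Grad == 1 and e.Grad == 1, since addition passes the gradient straight through
// a.Grad == b.Value == 3 and b.Grad == a.Value == 2, via the chain rule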
And that is all there is to it.
Before ending, I would like to point out that YES, there are design decisions here made purely for the sake of simplicity, such as the use of pointers for everything (which makes reusing Neurons across operations impossible).
The repository (linked below) will be updated sooner or later with more data structures like Nets, Layers, and Multi-Layer Perceptrons.
Source code: gograd