Building My First Perceptron


The perceptron is the “hello world” of neural networks: a single artificial neuron that takes a few numbers in, multiplies them by weights, sums them up, and fires a 1 or a 0. In this post I build one from scratch in NumPy and teach it the logical OR function.

The idea

A perceptron computes one simple thing:

output = activation(inputs · weights + bias)

It starts knowing nothing: all weights at zero. We show it examples, measure how wrong it is, and nudge the weights in the right direction. Repeat enough times and, as long as a single straight line can separate the two classes, the weights settle into values that get every example right. That nudging rule is the learning.
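
To make the formula concrete, here's a minimal sketch of a single forward pass. The weights and bias are hand-picked for illustration (they are assumptions, not learned values), and the cutoff at 1 matches the step function I use further down.

```python
import numpy as np

# Hand-picked, hypothetical parameters (not learned) just to show the arithmetic.
example_weights = np.array([0.5, 0.5])
example_bias = 0.75


def step(z: float) -> int:
    # Hard threshold: fire (1) only when z is strictly above 1.
    return int(z > 1)


x = np.array([1.0, 0.0])

# inputs . weights + bias = 1*0.5 + 0*0.5 + 0.75
z = float(x @ example_weights + example_bias)
y = step(z)

print(z, y)  # -> 1.25 1
```

With both inputs at 0, the same parameters give z = 0.75, which stays below the cutoff, so the neuron outputs 0.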

The data

I’m teaching it logical OR. Two binary inputs, and the output is 1 whenever at least one input is 1:

import numpy as np
import numpy.typing as npt

inputs: npt.NDArray[np.float64] = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
output: npt.NDArray[np.int_] = np.array([0, 1, 1, 1])

learning_rate = 0.1
epochs = 10

learning_rate controls how big each nudge is, and epochs is how many times we loop over the whole dataset.

The perceptron

input_size = inputs[0].size
weights = np.zeros(input_size)
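bias = 0.0

def activation(x: npt.NDArray[np.float64]) -> npt.NDArray[np.int_]:
    # Hard step: 1 where the value is strictly above 1, otherwise 0.
    return np.where(x > 1, 1, 0)

def predict(x: npt.NDArray[np.float64]) -> npt.NDArray[np.int_]:
    # Weighted sum of the inputs plus the bias, passed through the step.
    return activation(x @ weights + bias)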

def train() -> None:
    global weights, bias
    for _ in range(epochs):
        for xi, y in zip(inputs, output):
            y_hat = predict(xi)
            error = y - y_hat
            weights += error * learning_rate * xi
            bias += error * learning_rate

train()

The whole learning algorithm is those four lines inside the loop:

  1. y_hat = predict(xi) — make a guess.
  2. error = y - y_hat — how far off were we? This is +1, 0, or -1.
  3. weights += error * learning_rate * xi — if we undershot, push weights up for the inputs that were active; if we overshot, push them down.
  4. bias += error * learning_rate — shift the threshold the same way.

When the guess is already correct, error is 0 and nothing changes. The perceptron only learns from its mistakes.
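
To watch that happen, here's a self-contained sketch of the same update rule that also counts how many mistakes it makes on each pass. The function name train_with_log and the mistake log are my additions for illustration; the data, learning rate, epoch count and threshold are the ones used in this post.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
y_true = np.array([0, 1, 1, 1])  # logical OR


def train_with_log(lr: float = 0.1, n_epochs: int = 10):
    """Hypothetical helper: same update rule, plus a per-epoch mistake count."""
    w = np.zeros(X.shape[1])
    b = 0.0
    mistakes_per_epoch = []
    for _ in range(n_epochs):
        mistakes = 0
        for xi, yi in zip(X, y_true):
            y_hat = int(xi @ w + b > 1)   # guess
            error = int(yi) - y_hat       # +1, 0, or -1
            if error != 0:
                mistakes += 1
            w += error * lr * xi          # only changes w when error != 0
            b += error * lr
        mistakes_per_epoch.append(mistakes)
    return w, b, mistakes_per_epoch


w, b, log = train_with_log()
print(log)  # -> [3, 2, 2, 0, 0, 0, 0, 0, 0, 0]
```

Once a pass finishes with zero mistakes, every error is 0, so the weights and bias stay frozen for all remaining passes.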

Trying it out

Once trained, ask it to classify any of the four input pairs:

predict(np.array([0, 0]))   # -> 0
predict(np.array([1, 0]))   # -> 1

After ten passes over four examples, this single neuron reliably reproduces the OR truth table.
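
If you'd rather check the whole truth table in one go, the same arithmetic works on all four rows at once. This is a sketch with stand-in parameters: trained_weights and trained_bias are assumed values of roughly the size this kind of run produces, so swap in your own trained weights and bias.

```python
import numpy as np

# Stand-in values (an assumption): replace with the trained weights and bias.
trained_weights = np.array([0.4, 0.4])
trained_bias = 0.7

truth_table_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
expected = np.array([0, 1, 1, 1])  # logical OR

# One matrix-vector product scores all four rows at once.
scores = truth_table_inputs @ trained_weights + trained_bias
predictions = np.where(scores > 1, 1, 0)

for row, guess, target in zip(truth_table_inputs.astype(int), predictions, expected):
    print(row, "predicted:", guess, "expected:", target)
```

Passing a 2-D array works because the @ operator treats each row as one example and returns one score per row.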

A note on the activation

I used np.where(x > 1, 1, 0) as a hard threshold. That’s a deliberately blunt step function: it fires only when the weighted sum plus the bias clears 1. It works here because OR is linearly separable: you can draw a single straight line that puts the one 0 case on one side and the three 1 cases on the other.
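
One thing worth checking about that cutoff of 1: since the bias is learned, it should behave just like the more common cutoff of 0 with the bias lowered by 1. Here's a small sketch of that comparison. The values of w and b are illustrative assumptions, not output copied from a run.

```python
import numpy as np

points = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)

# Illustrative parameters (an assumption), not copied from a training run.
w = np.array([0.4, 0.4])
b = 0.7

fires_above_one = points @ w + b > 1          # cutoff at 1, as in this post
fires_above_zero = points @ w + (b - 1) > 0   # cutoff at 0, bias shifted by 1

print(fires_above_one.astype(int))   # -> [0 1 1 1]
print(fires_above_zero.astype(int))  # -> [0 1 1 1]
```

With these values the dividing line is 0.4·x1 + 0.4·x2 + 0.7 = 1, that is x1 + x2 = 0.75, which leaves (0, 0) on one side and the other three points on the other.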

That’s also the perceptron’s famous limitation. Try swapping the targets to XOR ([0, 1, 1, 0]) and no amount of training will get it right — XOR can’t be split by a single line. Solving that is exactly what stacking neurons into layers unlocks, which is where things start getting interesting. But that’s a post for another day.
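
If you want to see the failure for yourself, here's a self-contained sketch that runs the same update rule on both target sets. The helper train_and_score is a name I made up for this example, and the 1000 epochs for XOR is an arbitrary choice just to show that extra training doesn't help.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
or_targets = np.array([0, 1, 1, 1])
xor_targets = np.array([0, 1, 1, 0])


def train_and_score(targets, lr: float = 0.1, n_epochs: int = 10):
    """Hypothetical helper: train from zero, return predictions and hit count."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        for xi, yi in zip(X, targets):
            error = int(yi) - int(xi @ w + b > 1)
            w += error * lr * xi
            b += error * lr
    predictions = (X @ w + b > 1).astype(int)
    return predictions, int((predictions == targets).sum())


or_preds, or_correct = train_and_score(or_targets)
xor_preds, xor_correct = train_and_score(xor_targets, n_epochs=1000)

print("OR correct: ", or_correct, "of 4")
print("XOR correct:", xor_correct, "of 4")
```

OR comes out 4 of 4, while XOR always leaves at least one row wrong, no matter how long the loop runs.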