You walk into a cafe, looking for your friend. Seems like an easy task, until you see it’s so packed that you can’t see through the crowd at all, and everyone’s talking so loud that you can barely hear anything. The only things you know are your movements, and how far you are from your friend (through the special psychic bond you two share). How will you find each other?

I wanted to solve the exact same problem, but with devices instead of people (so no psychic connection for me), existing in a space of hundreds of other devices. Working the problem taught me a lot of really interesting science relating to robotics and state estimation, and I wrote this post so you can learn, too.

The Formal Problem: Range-Only Relative Localization

I have two microcontrollers (the large blue boards). Each has an inertial measurement unit (IMU, small board on top right), which gives me data including acceleration and compass heading. They also have an ultra-wideband unit (UWB, small board in slot on left), which gives the distance to the other unit in the pair.

The Beacons in question.

Each of these packages will be contained within a wearable device, and can therefore prompt its wearer to move around. This is an important freedom, because it means we can use the change in distance over time to determine location during the localization process, rather than purely statically.

Potential Solutions

To get us in the mindset of doing a technical implementation, let’s understand some options we have and see why they are or aren’t a good fit for what we’re trying to do.

Candidate 1: Multi-Antenna Ultra-Wideband

Ultra-wideband localization, found in location-aware products including AirTags, uses a simple call-and-response system known as time of flight. The UWB measures the time it takes for a roundtrip exchange of information, which is then turned into a distance $d = c \cdot (T_{\text{loop}} - T_{\text{reply}})/2$.

The initiator sends out a timestamped pulse to the responder, which then responds in $T_{\text{reply}}$ time (reported by the responder) with its own timestamp packet. The whole exchange takes $T_{\text{loop}}$ time, calculated at the initiator:

With the demo values $T_{\text{loop}} = 220.07\ \text{ns}$ and $T_{\text{reply}} = 120.00\ \text{ns}$:

$$\text{ToF} = \frac{T_{\text{loop}} - T_{\text{reply}}}{2} = \frac{220.07 - 120.00}{2} \approx 50.03\ \text{ns}$$

$$d = c \cdot \text{ToF} \approx 15.00\ \text{m}$$
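The recovery step is tiny in code. Here's a minimal Python sketch (the function name is mine; the numbers are the demo values from above):

```python
C = 299_792_458.0  # speed of light, m/s

def tof_distance(t_loop_ns: float, t_reply_ns: float) -> float:
    """Distance in meters recovered from a round-trip exchange, times in nanoseconds."""
    tof_s = (t_loop_ns - t_reply_ns) / 2.0 * 1e-9  # one-way time of flight, seconds
    return C * tof_s

tof_distance(220.07, 120.00)  # ~15.0 m
```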

The time of flight system works fine for getting the distance to the other device, but alone is insufficient for getting the bearing to the other device. This limitation is where phase difference of arrival (PDoA) ultra-wideband comes in.

PDoA uses a property of radio waves called the phase. Light and radio waves oscillate up and down as they travel; where they are in that cycle is the phase. As the wave passes any fixed point in space, its phase at that point cycles:

[Interactive demo: phase readout, e.g. $\varphi = 3.14\ \text{rad}$]

We use this property to determine the heading relative to the other device. By positioning two antennas a known distance $d_a$ apart, we can derive the angle of arrival of the other device's messages by comparing the phase at the two antennas at each timestep:

Drag the point to move the wave source.

$$\theta = \arcsin\!\left(\frac{\Delta\varphi \cdot \lambda}{2\pi \cdot d_a}\right)$$
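The angle-of-arrival formula is easy to sanity-check in code. A minimal sketch (the function name and example values are mine, not from a UWB datasheet):

```python
import math

def angle_of_arrival(delta_phi: float, wavelength: float, d_a: float) -> float:
    """Angle of arrival (radians) from the phase difference measured at two
    antennas spaced d_a apart (same units as wavelength)."""
    return math.asin(delta_phi * wavelength / (2 * math.pi * d_a))

# A wave arriving broadside reaches both antennas in phase:
angle_of_arrival(0.0, 80.0, 100.0)  # 0.0 rad
```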

Failure mode: PDoA is a great system, but it has a caveat: it requires two antennas. This design restriction makes it very difficult to integrate into a wearable, where you have to deal with skin substantially attenuating signals. One antenna (what I chose for Beacons) can maybe get through with clever placement, but two is much more difficult.

Candidate 2: External Trilateration System

Another potential way to do things is to involve external devices with known fixed positions, usually known as anchors. With at least three of them, a Beacon can measure its distance to each and solve for its location. This strategy is the same as how GPS works, using multiple circles (or spheres in 3D) of distance to derive an exact location for the device:

Drag the beacon.

$$\begin{aligned} r_1^2 &= (x - x_1)^2 + (y - y_1)^2 \\ r_2^2 &= (x - x_2)^2 + (y - y_2)^2 \\ r_3^2 &= (x - x_3)^2 + (y - y_3)^2 \end{aligned}$$

Subtracting the first equation from the other two cancels the quadratic terms, leaving a linear system we can solve directly:

$$\begin{bmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \tfrac{1}{2}\begin{bmatrix} k_2 - k_1 \\ k_3 - k_1 \end{bmatrix},\quad k_i = x_i^2 + y_i^2 - r_i^2$$
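That linear solve is only a few lines with NumPy. A sketch, with made-up anchor positions and ranges for illustration:

```python
import numpy as np

def trilaterate(anchors, ranges):
    """Solve the linearized trilateration system for (x, y)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    # k_i = x_i^2 + y_i^2 - r_i^2, as in the system above
    k = [x * x + y * y - r * r for (x, y), r in zip(anchors, ranges)]
    A = np.array([[x2 - x1, y2 - y1],
                  [x3 - x1, y3 - y1]], dtype=float)
    b = 0.5 * np.array([k[1] - k[0], k[2] - k[0]])
    return np.linalg.solve(A, b)

# A beacon at (3, 4) with three anchors at known positions:
trilaterate([(0, 0), (10, 0), (0, 10)], [5.0, 65 ** 0.5, 45 ** 0.5])
```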

Failure mode: Trilateration is easy, once you have those three anchors. In practice, this means permanently installing devices to always broadcast their locations within a space. The localization needs to work without specially-installed equipment, so I don’t see trilateration as a valid solution. However, if I were to deploy these in a known location that has a lot of traffic (like a conference venue), this strategy could act as a refining system to keep the main system grounded.

Candidate 3: Kalman Filters

Kalman filters are the gold standard for state-space estimation in physical spaces, and are exactly what we need to solve our problem. They operate on the principle of having an internal estimation of the unobservable true state — in this case, the relative position — which is updated with new information we get from measurements in a looping process. This loop eventually converges on a reasonable estimate that we can use to guide the user. Our true state is unobservable because we can’t directly measure bearing from the sensors alone; the bearing comes from the combined information from all of our sensors.

I specifically chose an extended Kalman filter (EKF), which gives support for the nonlinear functions needed to predict the expected range measurement and, from it, the bearing. The reason an EKF is our best solution is that it only requires one UWB antenna (unlike PDoA) and only two devices (unlike trilateration).

Implementing an EKF

An EKF is a loop composed of two parts: a prediction step and a measurement step. You have a state vector $\hat{\textbf{x}}$ and covariance matrix $P$ you want to keep as consistent as possible with the real unobservable state $\textbf{x}$ (for example, the $x$ and $y$ distance to a target). We use the “hat” $\hat{\cdot}$ to say that this state vector is an estimate of the true state. $P$ doesn’t get a hat since it isn’t an estimate of anything; it just helps inform the estimate $\hat{\textbf{x}}$. To update $\hat{\textbf{x}}$ and $P$, you will ask two questions:

  • “What do I think my next state will be?”: Answered by the state transition function $f(\textbf{x}, \textbf{u})$ (also known as the prediction function), this takes the current state along with the control input and projects it to the next timestep. In the case of Beacons, it asks: “Given how I’m moving and how I’m accelerating (represented as $[a_x, a_y]$), what does physics think about where I’ll end up next?” Acceleration is a control input and not a measurement because it’s driving the state transition, not a result of it.
  • “What measurements do I expect to receive?”: Answered by the measurement function $h(\textbf{x})$, this also takes the current state, but it instead asks what the probable sensor readings are. For Beacons: “Given how I’m moving, what do I expect the UWB reading to be at the next timestep?”

The state $\textbf{x}$ represents what the system is keeping track of, but it’s also very important to know the relations of each of the state values to each other. This is where the covariance matrix $P \in \mathbb{R}^{|\textbf{x}|\times|\textbf{x}|}$ comes in. It’s a square matrix with row/column counts equal to the size of $\textbf{x}$, denoted $|\textbf{x}|$. Each entry $P_{i,j}$ in that matrix answers the question “how do $\textbf{x}_i$ and $\textbf{x}_j$ impact each other’s distributions?” For entries along the diagonal, this reduces to the variance of $\textbf{x}_i$. We use the entries in $P$ to shape our belief of the distributions of the variables in $\textbf{x}$ during the prediction and measurement steps.

Sidenote: Linearization

Linearization is the process of taking a nonlinear function (like a parabola) and “zooming in” really close, so close that the curved line actually looks straight, or in other words, linear. It’s an approximation so it’s not guaranteed to be optimal, but it’s good enough for our use case (and others, including GPS).

$f(x) = x^2$ (solid line)
$L(x) = 2x - 1$ (dashed line, linearization at $x_0 = 1$)

EKFs have an additional layer of complexity in return for supporting nonlinear functions like square root, which you will see are necessary for our task: they require differentiability of $f(\textbf{x}, \textbf{u})$ and $h(\textbf{x})$ to linearize around $\hat{\textbf{x}}_{k+1|k}$ using the Jacobians of $f$ and $h$, $F$ and $H$ respectively. Doing so lets us reuse the regular Kalman update equations, making calculations much easier. The notation $k+1|k$ expands to “we condition our guess of $k+1$ on the information we have at timestep $k$”.

The Flow of Data

We now have all of the pieces to describe how data flows through the EKF loop. We start with the current state $\hat{\textbf{x}}_{k|k}$ and covariance matrix $P_{k|k}$, along with our control input for the timestep, $\textbf{u}_k$.

Our first step is to predict what the next state and covariance will be.

$$\begin{align*}\hat{\textbf{x}}_{k+1|k} &= f_k(\hat{\textbf{x}}_{k|k}, \textbf{u}_k)\\P_{k+1|k}&=F_k P_{k|k} F_k^T + Q_k\end{align*}$$

Notice $F_k P_{k|k} F_k^T$: this is the linearization I was talking about earlier. The $Q_k$ term is the prediction noise covariance for timestep $k$. The noisier the prediction is, the less the filter trusts it. Next, we update $\hat{\textbf{x}}_{k+1|k}$ and $P_{k+1|k}$ with our measurements:

$$\begin{align*} K_{k+1}&=P_{k+1|k}H_{k+1}^T(H_{k+1}P_{k+1|k}H_{k+1}^T + R_{k+1})^{-1}\\ \hat{\textbf{x}}_{k+1|k+1} &= \hat{\textbf{x}}_{k+1|k} + K_{k+1}(z_{k+1} - h_{k+1}(\hat{\textbf{x}}_{k+1|k}))\\ P_{k+1|k+1} &= (I - K_{k+1}H_{k+1})P_{k+1|k} \end{align*}$$

$z_{k+1}$ is our measurement for the timestep. $K_{k+1}$ is the Kalman gain, which weights the measurement against the prediction by their relative variances. A higher gain means we trust the measurement more; lower means we trust our prediction more. The linearization $H_{k+1}P_{k+1|k}H_{k+1}^T$ comes back again, this time for the measurement step. $R_{k+1}$ plays the same role as $Q_k$ but for measurement noise covariance.

And that is a single step of an EKF! Not too bad once you understand how everything fits together.
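The two steps translate almost line-for-line into code. Here's a minimal, generic sketch in Python with NumPy (the function name and argument layout are my own; the Beacons-specific models come in the following steps):

```python
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R):
    """One predict + update cycle, following the equations above.
    f/h are the (possibly nonlinear) models; F/H return their Jacobians."""
    # Predict: project the state forward and inflate the covariance.
    x_pred = f(x, u)
    Fk = F(x, u)
    P_pred = Fk @ P @ Fk.T + Q
    # Update: weight the measurement against the prediction.
    Hk = H(x_pred)
    S = Hk @ P_pred @ Hk.T + R            # innovation covariance
    K = P_pred @ Hk.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ Hk) @ P_pred
    return x_new, P_new
```

With equal prediction and measurement variance, a single step should land the estimate halfway between the two, which is a quick way to check the gain math.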

Now that we have an understanding of the underlying logic of an EKF, it’s time to define our variables.

Step 1: The State Vector $\textbf{x}$ and Covariance Matrix $P$

Arguably the most important part of designing a filter is what state to keep track of. There are a million things going on in any nonlinear system, especially in the case of Beacons, so being selective with what gets stored saves both processing time and later headaches when things inevitably don’t work first try. For Beacons, we will store four values:

  • $\Delta x$ and $\Delta y$: The estimated difference in $x$ and $y$ (measured in meters) from the current device to the other in world frame. Can be turned into a triangle with Pythagoras’ theorem to calculate the bearing and distance.
  • $\Delta\dot{x}$ and $\Delta\dot{y}$: The estimated velocities in each component direction, important in the physics calculations.

These four values (packed as $[\Delta x, \Delta y, \Delta \dot{x}, \Delta \dot{y}]^T$) allow us to approximate the physical system of two devices moving in a 2D plane. We initialize with the UWB distance $d_0$, assuming the initial relative position lies entirely along $x$:

$$\hat{\textbf{x}}_0 = [d_0, 0, 0, 0]^T$$

It’s important to note that it’s impossible to tell the direction to the other device at this point, since at $k=0$ we only have one distance measurement. The filter will accumulate a belief of the direction as the user moves the device around. We initialize the covariance matrix $P_0$ using tuned values:

$$\textrm{diag}(P_0) = [0.04,\ d_0^2,\ 1,\ 1]$$

The $0.04\ \textrm{m}^2$ (20 cm standard deviation) informs the EKF that the radial covariance is small and constrained by the UWB’s relatively low error, with the starting assumption that we are directly on the $x$ axis. The $d_0^2$ variance informs the EKF that the larger the distance, the more unsure it should be about the $y$ axis (since we assume we are only on the $x$ axis at $k=0$). We initialize the variances for $\Delta\dot{x}$ and $\Delta\dot{y}$ to $1\ \textrm{(m/s)}^2$ as a tuned starting value. The rest of $P_0$ is $0$. The filter will start changing all of its values as it learns more about the system.
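In code, the initialization is just a couple of lines (a sketch; the helper name is mine):

```python
import numpy as np

def init_filter(d0: float):
    """Initial state and covariance: all of the measured range d0 is assumed
    to lie along the x axis, with tuned starting variances."""
    x0 = np.array([d0, 0.0, 0.0, 0.0])     # [dx, dy, vx, vy]
    P0 = np.diag([0.04, d0 ** 2, 1.0, 1.0])  # off-diagonal entries start at 0
    return x0, P0
```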

Now that we know what we’ll be storing, we need to think about how we project that state prediction forward.

Step 2: The State Transition Function $f(\textbf{x}, \textbf{u})$

We have our state vector layout, now we need to project it forward each timestep using information from the control vector. Luckily this is pretty easy, using physics equations you likely learned in high school:

$$\begin{align*} \Delta\dot{x}_{k+1|k}&=\Delta\dot{x}_{k|k} + a_{x,k} \cdot \Delta t\\ \Delta\dot{y}_{k+1|k}&=\Delta\dot{y}_{k|k} + a_{y,k} \cdot \Delta t\\ \Delta x_{k+1|k} &= \Delta x_{k|k} + \Delta\dot{x}_{k|k}\cdot \Delta t + \tfrac{1}{2}\,a_{x,k}\cdot \Delta t^2 \\ \Delta y_{k+1|k} &= \Delta y_{k|k} + \Delta\dot{y}_{k|k}\cdot \Delta t + \tfrac{1}{2}\,a_{y,k}\cdot \Delta t^2 \end{align*}$$

These equations give our predicted state update $\hat{\textbf{x}}_{k+1|k}$.

Note: The world-frame acceleration values $a_{x,k}$ and $a_{y,k}$ are calculated using an AHRS filter combining data from the IMU. I’ve omitted the AHRS step for brevity.

Next, we update $P$. To do so we need our prediction Jacobian, $F$, calculated by taking the partial derivatives of the above equations:

$$F=\frac{\partial f}{\partial \textbf{x}}=\begin{bmatrix}1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}$$

Diagonal entries of $1$ mean each value carries forward. The two off-diagonal $\Delta t$s integrate $\Delta x$ via $\Delta\dot{x}$ and $\Delta y$ via $\Delta\dot{y}$ over a single step. We assume constant acceleration between steps.

We then follow the prediction equations defined above to get $\hat{\textbf{x}}_{k+1|k}$ and $P_{k+1|k}$. $Q$ is a tuned matrix of constants.
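The prediction model and its Jacobian might look like this in code (a sketch; the state layout matches $[\Delta x, \Delta y, \Delta\dot{x}, \Delta\dot{y}]$, and acceleration is the control input):

```python
import numpy as np

def f(x, a, dt):
    """Constant-acceleration state transition for x = [dx, dy, vx, vy]."""
    dx, dy, vx, vy = x
    ax, ay = a
    return np.array([
        dx + vx * dt + 0.5 * ax * dt ** 2,
        dy + vy * dt + 0.5 * ay * dt ** 2,
        vx + ax * dt,
        vy + ay * dt,
    ])

def F(dt):
    """Jacobian of f with respect to the state (the control input a
    contributes no state-derivative terms)."""
    return np.array([
        [1.0, 0.0,  dt, 0.0],
        [0.0, 1.0, 0.0,  dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])
```

With zero acceleration the model is linear, so `F(dt) @ x` and `f(x, a, dt)` should agree exactly, which makes a handy unit test.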

Step 3: The Measurement Function $h(\textbf{x})$

Now that we have our estimates $\hat{\textbf{x}}_{k+1|k}$ and $P_{k+1|k}$, we need to check our work. Our measurement function is pretty simple:

$$h(\textbf{x}) = \sqrt{\Delta x^2 + \Delta y^2} = r$$

All this does is take our deltas and convert them to a distance comparable to the UWB reading. Taking the partial derivative:

$$H = \frac{\partial h}{\partial \textbf{x}} = \begin{bmatrix}\displaystyle\frac{\Delta x}{r} & \displaystyle\frac{\Delta y}{r} & 0 & 0\end{bmatrix}$$

The pair $(\Delta x/r,\ \Delta y/r)$ is the unit vector from this device to the other. Movement along that direction changes the distance one-for-one; movement perpendicular to it doesn’t change the distance at all. Setting $z_{k+1}$ to our UWB reading, we update with the above equations. Similarly, $R$ is a tuned scalar constant. $R$ isn’t a matrix because the measurement function returns a scalar.
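The measurement model is equally short in code. A sketch, with a guard against division by zero that the math above glosses over:

```python
import numpy as np

def h(x):
    """Predicted UWB range from the state x = [dx, dy, vx, vy]."""
    return np.hypot(x[0], x[1])

def H(x):
    """Jacobian of h: the unit vector toward the other device,
    with zeros for the velocity components."""
    r = max(np.hypot(x[0], x[1]), 1e-9)  # avoid dividing by zero
    return np.array([[x[0] / r, x[1] / r, 0.0, 0.0]])
```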

Step 4: Extracting the Results

Whenever we want to get what the filter thinks is our current bearing, we run a simple function to extract the values we need, the bearing $\beta$ and the $1\sigma$ uncertainty $\sigma_\beta$, both in radians:

$$\begin{align*} r^2 &= \Delta x^2 + \Delta y^2 \\ \beta &= \textrm{atan2}(\Delta y, \Delta x) \\ \frac{\partial \beta}{\partial \Delta x} = \beta_{\Delta x} &= \frac{-\Delta y}{r^2} \\ \frac{\partial \beta}{\partial \Delta y} = \beta_{\Delta y} &= \frac{\Delta x}{r^2} \\ v &= P_{(0, 0)} \beta_{\Delta x}^2 + 2 P_{(0, 1)} \beta_{\Delta x} \beta_{\Delta y} + P_{(1, 1)} \beta_{\Delta y}^2 \\ \sigma_\beta &= \begin{cases} \pi &\textrm{if $r^2 < 10^{-6}$} \\ \sqrt{\max(0, v)} &\text{otherwise} \end{cases} \end{align*}$$

We set $\sigma_\beta = \pi$ (no knowledge of bearing) if the distance is so small that it’s basically impossible to tell, i.e. the filter thinks $r$ is less than a millimeter. Otherwise, we take $\sigma_\beta = \sqrt{\max(0, v)}$, with $v$ being the variance.
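Translated into code (a sketch following the equations above; `P` is the filter's covariance matrix):

```python
import math

def bearing_and_sigma(x, P):
    """Bearing beta and its 1-sigma uncertainty, both in radians."""
    dx, dy = x[0], x[1]
    r2 = dx * dx + dy * dy
    beta = math.atan2(dy, dx)
    if r2 < 1e-6:
        return beta, math.pi  # too close to know the direction at all
    b_dx = -dy / r2           # d(beta)/d(dx)
    b_dy = dx / r2            # d(beta)/d(dy)
    # First-order propagation of the position covariance into bearing variance.
    v = P[0][0] * b_dx ** 2 + 2 * P[0][1] * b_dx * b_dy + P[1][1] * b_dy ** 2
    return beta, math.sqrt(max(0.0, v))
```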

Demo

Now we have the entire pipeline! Let’s see what it can do:

Drag the Beacons (multitouch supported), or click the canvas and use WASD for 1 / IJKL for 2 / R to reset.

[Interactive demo readouts: the state estimate $\hat{\mathbf{x}}$, bearing $\beta$, uncertainty $\sigma_\beta$, estimated range $\hat{r}$, raw UWB measurement $z$, and a NaN reset counter.]

The colored region is the filter’s belief about where the other Beacon is.


There are tons of applications for EKFs, ranging from robotics to digital signal processing to navigation systems, and it all becomes so much more interesting when you can see how they work through all of the complicated matrix math. They are the perfect solution for Beacons, and I hope you someday experience the joy of using one, too.

Further Reading

I used this explanation of regular and extended Kalman filters extensively when writing this post; I highly recommend reading it yourself if you want to dive deeper!

If you want to read up on a more recent innovation, look into unscented Kalman filters! They turned out to be a bit overkill for my use case, but I would love to use them in the future.