Ever wondered how Monzo calculates balances (the amount of money) within accounts and pots? We calculate plenty of balances beyond the main one you see in the app, which makes things more complicated than they seem.
In June 2021 we started designing and implementing a new way to calculate balances that is more reliable and consistent.
In this post, we’ll explain why the old approach needed improving and the changes we made.
Useful terms
To understand how we calculate balances and why it’s important, here’s an overview of the key concepts.
Ledger
The ledger is the source of truth of all the money moving around the bank on behalf of all of our customers.
The ledger is built as a double-entry bookkeeping system, where every movement of money has an equal and opposite movement. The sum of all credit entries equals the sum of all debit entries. We refer to the collection of all entries for a single movement of money as an EntrySet.
We have a microservice called service.ledger that handles these operations in the backend across the bank.
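To make the double-entry invariant concrete, here is a minimal sketch in Python. The `Entry` and `EntrySet` classes, the sign convention (positive for credit, negative for debit) and the `com.monzo.pot:main` address are assumptions made for illustration; they are not the real service.ledger types.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class Entry:
    address: str     # ledger address the money moves on
    amount: Decimal  # assumed convention: positive = credit, negative = debit


@dataclass(frozen=True)
class EntrySet:
    entries: Tuple[Entry, ...]

    def is_balanced(self) -> bool:
        # Double-entry invariant: credits and debits cancel out exactly.
        return sum((e.amount for e in self.entries), Decimal("0")) == 0


# Moving 10.00 from a main account into a (hypothetical) pot address:
# one debit and one equal and opposite credit.
transfer = EntrySet((
    Entry("com.monzo.account:main", Decimal("-10.00")),
    Entry("com.monzo.pot:main", Decimal("10.00")),
))

# A movement with no opposite entry breaks the invariant.
one_sided = EntrySet((Entry("com.monzo.account:main", Decimal("-10.00")),))
```

Here `transfer.is_balanced()` holds and `one_sided.is_balanced()` does not, which is the check a double-entry ledger relies on for every movement of money.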
Address
An address is an entity that lives in the ledger service. You can think of it as a URL in web terms. It’s a way to represent money movement from one entity to another. For example, the user’s main account is represented as com.monzo.account:main.
The diagram above illustrates the address representation within the ledger service.
An address consists of five main components:
Namespace is a unique domain name (such as com.monzo.account) which tells us that the address is a Monzo account. We would use a namespace like com.monzo.mastercard to tell us an address is a Mastercard account.
Name is the type of the account, for example a main account, an overdraft, or Monzo Flex.
Legal entity tells us which Monzo legal entity the account belongs to, for example monzo_uk for our UK customers or monzo_us for our American customers.
Currency is the currency used by the account the address belongs to, such as USD ($) or GBP (£).
Account ID is a unique identifier that distinguishes every account in Monzo. The address carries it so we know which account the address belongs to.
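As a rough illustration, the five components could be modelled like this in Python. The `Address` class, its `short_form` helper and the `acc_123` account ID are hypothetical; only the component names and the `com.monzo.account:main` shape come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    namespace: str     # e.g. com.monzo.account
    name: str          # e.g. main, overdraft
    legal_entity: str  # e.g. monzo_uk
    currency: str      # e.g. GBP
    account_id: str    # identifies the account the address belongs to

    def short_form(self) -> str:
        # Mirrors the "namespace:name" shape of com.monzo.account:main.
        return f"{self.namespace}:{self.name}"


# "acc_123" is a made-up account ID for the example.
main_account = Address("com.monzo.account", "main", "monzo_uk", "GBP", "acc_123")
```

Because the dataclass is frozen, two addresses with the same five components compare equal and can be used as dictionary keys, which is convenient when grouping entries by address.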
Balance
We have a few types of balances at Monzo.
Customer-facing balance
This is also known as “available balance”. This is what you see when you open the Monzo app and it reflects the real-time transactions you’ve made using your account.
Interest chargeable balance
This reflects the money in a customer’s account excluding pending transactions yet to settle. In the ledger’s EntrySet we store a committed timestamp, which is when the entry was stored in the ledger. Crucially, we use the committed timestamp from the ledger to calculate the interest chargeable balance, to reflect what happens in the backend when balances are calculated (for example to accrue overdraft fees).
The primary purpose of this balance is to calculate overdraft charges. We don’t charge overdraft interest until pending transactions have been settled. For example, if I spent £50 and went -£30 overdrawn but then I topped up my account with £30 before the transaction settled, I wouldn't be charged any overdraft interest as my actual settled balance would have never gone below zero.
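The example above can be walked through with two running totals. This is a simplified sketch: it assumes an opening balance of £20 (implied by spending £50 and ending up £30 overdrawn) and that the top-up counts as settled straight away. It is not how the ledger actually models pending and settled entries.

```python
from decimal import Decimal

available = Decimal("20")  # what the app shows
settled = Decimal("20")    # excludes pending transactions
lowest_available = available
lowest_settled = settled

# (description, change to available balance, change to settled balance)
steps = [
    ("spend 50, still pending", Decimal("-50"), Decimal("0")),
    ("top up 30", Decimal("30"), Decimal("30")),
    ("card payment settles", Decimal("0"), Decimal("-50")),
]

for _description, available_delta, settled_delta in steps:
    available += available_delta
    settled += settled_delta
    lowest_available = min(lowest_available, available)
    lowest_settled = min(lowest_settled, settled)

# Interest would only apply if the settled balance ever dropped below zero.
overdraft_interest_due = lowest_settled < 0
```

The available balance dips to -£30 along the way, but the settled balance never goes below zero, so under these assumptions no overdraft interest is due.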
Time axis in service.ledger
We capture several timestamps in service.ledger:
committed represents the time the entry was stored
reporting represents when the entry has an accounting impact
flake is a unique identifier that encodes a timestamp component for lexicographical time ordering for the overall set of entries
We use a flake string implementation for IDs across all systems at Monzo. Every EntrySet has an ID. The ledger stores entries sorted lexicographically by this flake ID, which in turn means they are sorted by the timestamp encoded within it.
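The property that lexicographic order matches time order can be shown with a toy ID scheme. This sketch is not Monzo's flake format: the layout (a fixed-width hex timestamp followed by a random suffix) and the `make_time_ordered_id` helper are assumptions chosen to illustrate the idea.

```python
import os


def make_time_ordered_id(timestamp_ms: int) -> str:
    # Fixed-width, zero-padded hex timestamp first, random suffix second.
    # The fixed width is what makes string order agree with numeric order.
    return f"{timestamp_ms:012x}{os.urandom(4).hex()}"


# IDs created out of chronological order...
ids = [
    make_time_ordered_id(ts)
    for ts in (1_700_000_000_000, 1_600_000_000_000, 1_650_000_000_000)
]

# ...come back in chronological order after a plain string sort.
sorted_ids = sorted(ids)
timestamps = [int(identifier[:12], 16) for identifier in sorted_ids]
```

After sorting, `timestamps` is in ascending order even though the IDs were generated out of order, which is why storing entries sorted by ID also stores them sorted by time.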
How we historically calculated balances
A library or microservice could call the /balances endpoint in service.ledger to get a balance. It was down to individual teams or engineers to decide the balance type they wanted to calculate. For example, to calculate “Interest chargeable balance” one team could call the /balances endpoint with the committed time axis and a list of addresses.
This endpoint would retrieve the list of individual amounts related to those addresses within the given time axis boundaries, then calculate the balance by summing them up.
Another team could call the endpoint with the same address list but the reporting time axis, which means the list of amounts, and therefore the calculated balance, could be different.
This brought inconsistency between our data warehouse and our backend. This made reconciliation reports incorrect as they weren’t guaranteed to be based on the same balance. 🤦♂️
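A small sketch shows how the same addresses can yield two different balances depending on the time axis. The `Entry` class, the `balance` function, the inclusive cutoff and the sample amounts are all assumptions for illustration; the real /balances endpoint is not shown in this post.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Entry:
    address: str
    amount: Decimal
    committed: datetime  # when the entry was stored
    reporting: datetime  # when the entry has an accounting impact


def balance(entries, addresses, time_axis, end):
    # Sum the amounts on the given addresses up to `end` on the chosen axis.
    return sum(
        (e.amount for e in entries
         if e.address in addresses and getattr(e, time_axis) <= end),
        Decimal("0"),
    )


entries = [
    Entry("com.monzo.account:main", Decimal("100.00"),
          committed=datetime(2021, 6, 1, 9), reporting=datetime(2021, 6, 1, 9)),
    # Stored on 2 June, but with accounting impact on 1 June.
    Entry("com.monzo.account:main", Decimal("-30.00"),
          committed=datetime(2021, 6, 2, 8), reporting=datetime(2021, 6, 1, 23)),
]

cutoff = datetime(2021, 6, 1, 23, 59, 59)
addresses = {"com.monzo.account:main"}
by_committed = balance(entries, addresses, "committed", cutoff)
by_reporting = balance(entries, addresses, "reporting", cutoff)
```

With this data, the end-of-day balance is 100.00 on the committed axis and 70.00 on the reporting axis. Two teams asking what looks like the same question would get different answers.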
Why we needed to change how we calculated balances
1. We need one source of truth for balance definitions and reporting
Balance definitions are spread across the backend codebase and are not synced automatically with the data warehouse. It’s difficult to talk about balances company-wide and we may report balances to our regulators that aren’t consistent across reports.
2. We often want to provide an overall account balance
We wanted to introduce the concept of a named balance (given the name of a balance as text we return the total balance amount). We currently have tens of types of named balances such as customer-facing-balance and interest-chargeable-balance.
service.ledger offers a balance endpoint to calculate balances on ledger addresses, but it uses flake as a time axis, which is not what someone would normally expect. It would be more natural to have an endpoint that calculates a balance given just a balance name, based on one of the expected time axes (committed or reporting).
When someone wants to calculate a customer balance, they often want an overall account balance, not the balance of each ledger address. We couldn’t provide this abstraction.
3. We need to easily reconcile amounts in different systems
With the old setup, amounts may fail to reconcile because systems use different time axes for the calculation, or even different ledger addresses.
4. Deprecate flake as time axis for balance calculations
Using the flake timestamp exposes an internal implementation detail (it depends on how flake IDs are generated), and usage of the service should be independent of its implementation. This is why we believe that using flake IDs to define the point up to which balances are calculated is not the best option, and either committed (when locally persisted) or reporting (accounting impact) should be used instead.
How we calculate balances today
The diagram above showcases the configuration of the ledger’s address list, balance definitions, and the relationship between them.
We introduced what we call “Balance definitions”, which is a hard-coded list of definitions stored in service.ledger.
As discussed before, an account balance is the sum of the entries in a defined set of ledger addresses. We already have a ledger configuration file for all the addresses the ledger uses. So, to introduce the concept of “balance definition” we:
Define balance names and time axes together in a new file (the BalanceDefinitions config). This lists all the balance names and maps each one to its description and the timestamp to use for the calculation.
Re-use the ledger's address configuration (the Address config file) to link the addresses to the defined balances.
The final complete balance definition is then statically generated, using the BalanceDefinitions config and the Address config. The generated file contains the balance name, timestamp for balance calculation, and ledger addresses.
Ledger address configuration
Code example of what the addresses config looks like
Ledger balances configuration
Code example of what the balances definitions config looks like
Output: The statically generated file of Balance definitions
One example “BalanceDefinition” from the statically generated file
The input balance definitions only associate ledger addresses and a time axis (any legal_entity, currency) to a balance name. The output gives us the five elements we need to construct an address (account_id, namespace, name, legal_entity, currency) to fetch the data and calculate a balance consistently.
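The generation step can be sketched as a small function that joins the two configs. Everything about the shapes here is an assumption: the dictionary layouts, the `balances` key on each address, the `com.monzo.account:overdraft` address and the `generate_balance_definitions` name are hypothetical, and the sketch leaves out legal_entity, currency and account_id for brevity.

```python
# Hypothetical BalanceDefinitions config: balance name -> time axis and description.
BALANCE_DEFINITIONS = {
    "interest-chargeable-balance": {
        "time_axis": "committed",
        "description": "Settled money, used to calculate overdraft charges",
    },
}

# Hypothetical Address config: each address lists the balances it feeds into.
ADDRESS_CONFIG = [
    {"namespace": "com.monzo.account", "name": "main",
     "balances": ["interest-chargeable-balance"]},
    {"namespace": "com.monzo.account", "name": "overdraft",
     "balances": ["interest-chargeable-balance"]},
    {"namespace": "com.monzo.mastercard", "name": "main", "balances": []},
]


def generate_balance_definitions(definitions, address_config):
    generated = {
        name: {"time_axis": definition["time_axis"], "addresses": []}
        for name, definition in definitions.items()
    }
    for address in address_config:
        for balance_name in address["balances"]:
            if balance_name not in generated:
                # Fail at generation time rather than at calculation time.
                raise ValueError(f"address links to undefined balance: {balance_name}")
            generated[balance_name]["addresses"].append(
                f'{address["namespace"]}:{address["name"]}'
            )
    return generated


GENERATED = generate_balance_definitions(BALANCE_DEFINITIONS, ADDRESS_CONFIG)
```

Generating the combined file ahead of time means a typo in a balance name surfaces as a build failure, and every caller reads the same name, time axis and address list.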
Simple steps to calculate balances
So to calculate a balance we now:
Obtain the balance definition by looking up the provided balance identifier (balance_name). If it exists, we retrieve its list of ledger addresses, as shown in the code samples above.
Using parallelism, we do the rest of the work:
Get the list of EntrySet for the balance calculation, using the list of addresses retrieved from the balance definition.
Sum up the entries' amounts (filtering out the ones that aren’t between the start and end timestamps).
The time axis we use is dictated by the balance definition, and there are only two options: committed or reporting.
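The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the service.ledger implementation: the in-memory `LEDGER`, the definition layout, the inclusive start and end bounds, the sample amounts and the use of a thread pool for the parallel part are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from decimal import Decimal

# Hypothetical generated balance definitions.
BALANCE_DEFINITIONS = {
    "interest-chargeable-balance": {
        "time_axis": "committed",
        "addresses": ["com.monzo.account:main", "com.monzo.account:overdraft"],
    },
}

# Hypothetical in-memory stand-in for the ledger, keyed by address.
LEDGER = {
    "com.monzo.account:main": [
        {"amount": Decimal("20.00"),
         "committed": datetime(2021, 6, 1), "reporting": datetime(2021, 6, 1)},
        {"amount": Decimal("-50.00"),
         "committed": datetime(2021, 6, 3), "reporting": datetime(2021, 6, 2)},
    ],
    "com.monzo.account:overdraft": [
        {"amount": Decimal("5.00"),
         "committed": datetime(2021, 6, 1), "reporting": datetime(2021, 6, 1)},
    ],
}


def sum_address(address, time_axis, start, end):
    # Keep only entries whose timestamp on the chosen axis is within bounds.
    return sum(
        (e["amount"] for e in LEDGER.get(address, []) if start <= e[time_axis] <= end),
        Decimal("0"),
    )


def calculate_balance(balance_name, start, end):
    definition = BALANCE_DEFINITIONS.get(balance_name)
    if definition is None:
        raise KeyError(f"unknown balance name: {balance_name}")
    # Fetch and sum each address in parallel, then combine the partial sums.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(
            lambda address: sum_address(address, definition["time_axis"], start, end),
            definition["addresses"],
        )
        return sum(partials, Decimal("0"))


result = calculate_balance(
    "interest-chargeable-balance", datetime(2021, 1, 1), datetime(2021, 6, 2)
)
```

The caller supplies only a balance name and a time window. The time axis and address list come from the definition, so two callers asking for the same named balance cannot end up using different axes.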
Achievements so far
This new approach has helped improve the accuracy, consistency, and abstraction of our balances.
Accuracy: we have metrics in place and system tests that compare the results of the old and the new way. This makes us confident of the accuracy of the new way.
Consistency: we now have far better consistency between the Data Warehouse and the backend. Our reconciliation tests have also shown that we can reconcile successfully. This makes us more confident with our reconciliation reports.
Abstraction on a system design level: we now have one interface which just needs a balance name and it hides away the implementation details of the balance calculations. Besides having a better system in place, this also protects our ledger interface from misuse.
If you’re interested in working on the finance systems that power the bank, we'd love to have you on our team!