How Abacus Permissions Data Through Its API (Part 2)
In part two of our permissioner series, our engineer explains why Abacus decided to solve a query overload by implementing synchronous permissioning functions.
In the first post in this series, we discussed the original system Abacus devised for sending a user only the data they are allowed to see. It allowed us to limit the data we send to clients by specifying a filtering function for each field on a model. The permissioner iterated over all the fields and removed information a user wasn’t permitted to see in the given context. These functions were asynchronous, and weren’t based solely on user role or company settings.
This worked until our permissioning functions started to grow in complexity. Because a single helper ran all of these functions, we passed each one a small, fixed set of data and let it query the database for anything more it needed. We built it this way so the system would be extensible, and the more complex functions did end up querying the database.
We didn’t account for the performance implications this would have at scale. Scaling meant not only working with larger sets of items more frequently, but also writing more, and more complex, permissioning functions. The number of queries we made while permissioning started to strain our infrastructure.
Let’s say we were permissioning a set of data, and one of our permissioning functions made a database request. The helper would run the permissioning function for each item in the array, and in doing so would make a query for each item in the array. As we grew, it became necessary to be able to permission hundreds of thousands of items, each of which could make a separate call to the database. We had a problem on our hands: query overload.
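To make the query overload concrete, here is a minimal sketch of the per-item pattern. The `findUserById` function, the `Expense` and `User` shapes, and the in-memory `users` map are assumptions standing in for a real database and are not taken from the Abacus codebase. The only point is that permissioning N items costs N queries.

```typescript
// Hypothetical shapes; the real models are not shown in this post.
type Expense = { id: number; ownerId: number };
type User = { id: number; isAdmin: boolean };

// In-memory stand-in for the users table.
const users = new Map<number, User>([
  [1, { id: 1, isAdmin: true }],
  [2, { id: 2, isAdmin: false }],
]);

// Counts how many "queries" have been issued.
let queryCount = 0;

// Simulates `SELECT * FROM users WHERE id = ?` as an asynchronous call.
async function findUserById(id: number): Promise<User | undefined> {
  queryCount += 1;
  return users.get(id);
}

// An old-style permissioning function: it queries for whatever it needs.
async function canSeeAmount(expense: Expense): Promise<boolean> {
  const owner = await findUserById(expense.ownerId);
  return owner !== undefined && owner.isAdmin;
}

// The helper runs the function once per item, so N items cost N queries.
async function permissionAll(expenses: Expense[]): Promise<boolean[]> {
  return Promise.all(expenses.map(canSeeAmount));
}
```

With three expenses this issues three queries; with hundreds of thousands of items, and several such functions per model, the count grows accordingly.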
Making Fewer Queries
Our first ideas for solving this issue centered around caching. I realized that different permissioners might need the same set of data. I came up with a “model” cache: we could cache model instances keyed by their IDs. Future functions could then use this cache to get data without querying. However, the cache would only be able to fetch by ID. ‘Where’ clauses simply wouldn’t work.
Jan, the senior engineer who authored the first post in this series, suggested a “query” cache. This involved caching the results of a given query by the SQL query string. We weren’t sure of the efficacy of this approach; for this to significantly improve performance we would need to be making many identical queries. Both of these ideas were relatively unobtrusive and could be implemented gradually, which we liked, but in both cases we were dubious of their effectiveness.
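The two caching ideas can be sketched as follows. This is an illustration under stated assumptions rather than what we actually built: `makeModelCache`, `makeQueryCache`, `fetchUserById`, and `runSql` are hypothetical names, the fake database is synchronous to keep the sketch short, and `runSql` ignores the SQL text entirely because only the hit count matters here.

```typescript
// Hypothetical stand-in for the database: two rows and a hit counter.
type UserRow = { id: number; name: string };
const userRows: UserRow[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
let dbHits = 0;

function fetchUserById(id: number): UserRow | undefined {
  dbHits += 1;
  for (const row of userRows) {
    if (row.id === id) return row;
  }
  return undefined;
}

function runSql(sql: string): UserRow[] {
  if (sql.length === 0) throw new Error("empty query");
  dbHits += 1;
  return userRows.slice(); // the fake returns every row, whatever the SQL says
}

// "Model" cache: instances keyed by ID. It can only answer lookups by ID.
function makeModelCache<T>(fetchById: (id: number) => T | undefined) {
  const byId = new Map<number, T>();
  return (id: number): T | undefined => {
    const cached = byId.get(id);
    if (cached !== undefined) return cached;
    const fresh = fetchById(id);
    if (fresh !== undefined) byId.set(id, fresh);
    return fresh;
  };
}

// "Query" cache: results keyed by the exact SQL string.
// It only helps when the same string is issued more than once.
function makeQueryCache<T>(run: (sql: string) => T[]) {
  const bySql = new Map<string, T[]>();
  return (sql: string): T[] => {
    const cached = bySql.get(sql);
    if (cached !== undefined) return cached;
    const rows = run(sql);
    bySql.set(sql, rows);
    return rows;
  };
}
```

The sketch also shows the weakness of each idea: the model cache has no way to serve a lookup by anything other than ID, and the query cache misses whenever two queries differ by even one character.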
Our third idea was quite a bit different in scope, and would require almost a wholesale rewrite of the system. The idea was to ensure that all the permissioning functions were synchronous. We would make any asynchronous calls before permissioning and pass the data into the synchronous function. This would allow us to batch the queries, making a few large ones instead of millions of small ones.
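A minimal sketch of that third idea, assuming a hypothetical batched lookup named `findAccountsByIds` and made-up `Txn` and `Account` shapes: all asynchronous work happens once, up front, and the permissioning function itself only reads data it was handed.

```typescript
// Hypothetical shapes and data; not taken from the Abacus codebase.
type Txn = { id: number; ownerId: number };
type Account = { id: number; isAdmin: boolean };

const accounts: Account[] = [
  { id: 1, isAdmin: true },
  { id: 2, isAdmin: false },
];

let batchQueryCount = 0;

// Simulates `SELECT * FROM accounts WHERE id IN (...)`: one query for many IDs.
async function findAccountsByIds(ids: number[]): Promise<Map<number, Account>> {
  batchQueryCount += 1;
  const wanted = new Set(ids);
  const found = new Map<number, Account>();
  for (const account of accounts) {
    if (wanted.has(account.id)) found.set(account.id, account);
  }
  return found;
}

// Synchronous permissioning function: all data is passed in, no queries.
function canSeeTxn(txn: Txn, owners: Map<number, Account>): boolean {
  const owner = owners.get(txn.ownerId);
  return owner !== undefined && owner.isAdmin;
}

// Fetch first (one batched query), then permission every item synchronously.
async function permissionBatch(txns: Txn[]): Promise<boolean[]> {
  const owners = await findAccountsByIds(txns.map((txn) => txn.ownerId));
  return txns.map((txn) => canSeeTxn(txn, owners));
}
```

However many items are passed to `permissionBatch`, the sketch issues a single query, which is the property we were after.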
It was clear that synchronous permissioning functions were the most likely candidate to boost performance. We selected this approach because, while intrusive, it was the only solution we were confident would have a significant impact and continue to work at scale. We also had a head start on this approach, as we had already built another tool to fetch data in batches.
The Relations Helper
It is common for Object Relational Mappers to have some way of getting related data via SQL joins. We wanted to avoid having our ORM make joins, so we decided to home-brew a system to collect related data. In addition to defining the fields on each model, we also define “relations.” A simple relation describes a foreign key. As an example, our expense table has a card_transaction_id column, which references the card transaction table. These are expressed in our models like so:
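A relation definition in this style might look roughly like the sketch below. Only the `card_transaction_id` column and the `card_transaction` relation key come from the description above; the `ModelDef` and `RelationDef` shapes, the `ExpenseModel` name, and the `amount` field are assumptions for illustration.

```typescript
// Hypothetical shapes for a model definition with relations.
type RelationDef = {
  model: string; // the related table
  foreignKey: string; // the column on this model that points at it
};

type ModelDef = {
  table: string;
  fields: string[];
  relations: Record<string, RelationDef>;
};

// The expense model declares its fields and, separately, its relations.
const ExpenseModel: ModelDef = {
  table: "expense",
  fields: ["id", "amount", "card_transaction_id"],
  relations: {
    card_transaction: {
      model: "card_transaction",
      foreignKey: "card_transaction_id",
    },
  },
};
```

The relation key (`card_transaction` here) is what callers pass around; the definition holds everything needed to turn that key into a query.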
The relations helper accepts an array of model instances (of a single type) and an array of relations (e.g. ‘card_transaction’) to attach. For each relation, it makes one query to the database for the whole batch of related rows and attaches the results to the instances. Making these queries in batches allows us to fetch the same data with fewer queries. If we wanted to get five relations for one thousand models, this would allow us to make five large queries instead of five thousand smaller queries. We realized that this helper solved exactly the same problem we were facing in the permissioner.
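The batching behaviour described above can be sketched as follows. This is not the real helper: `attachRelations`, `selectByIds`, the in-memory `tables`, and the `merchant` column are all hypothetical, and the fake database is synchronous for brevity where a real helper would presumably await its queries.

```typescript
// Hypothetical instance shape: an ID plus arbitrary columns and relations.
type Instance = { id: number; [key: string]: unknown };
type Relation = { table: string; foreignKey: string };

// In-memory stand-in for the database, keyed by table name.
const tables: Record<string, Instance[]> = {
  card_transaction: [
    { id: 10, merchant: "Cafe" },
    { id: 11, merchant: "Taxi" },
  ],
};
let relationQueries = 0;

// Simulates `SELECT * FROM <table> WHERE id IN (...)`.
function selectByIds(table: string, ids: number[]): Instance[] {
  relationQueries += 1;
  const rows = tables[table] ?? [];
  return rows.filter((row) => ids.indexOf(row.id) !== -1);
}

const expenseRelations: Record<string, Relation> = {
  card_transaction: { table: "card_transaction", foreignKey: "card_transaction_id" },
};

// Attaches each named relation to every instance using one query per relation.
function attachRelations(
  instances: Instance[],
  relationDefs: Record<string, Relation>,
  names: string[],
): void {
  for (const name of names) {
    const def = relationDefs[name];
    if (def === undefined) throw new Error(`Unknown relation: ${name}`);
    // Collect each distinct foreign key once, so a shared row is fetched once.
    const ids: number[] = [];
    for (const instance of instances) {
      const fk = instance[def.foreignKey];
      if (typeof fk === "number" && ids.indexOf(fk) === -1) ids.push(fk);
    }
    const byId = new Map<number, Instance>();
    for (const row of selectByIds(def.table, ids)) byId.set(row.id, row);
    // Attach the fetched row (or undefined) under the relation's key.
    for (const instance of instances) {
      const fk = instance[def.foreignKey];
      instance[name] = typeof fk === "number" ? byId.get(fk) : undefined;
    }
  }
}
```

The number of queries depends only on the number of relations requested, not on the number of instances, which is where the five-versus-five-thousand difference comes from.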
Separation of Concerns
The major hurdle of implementing synchronous permissioning is that one must know which data to fetch prior to execution. This means that each permissioner must define the extra data it requires to execute. Knowing we wanted to use the relations helper, we restructured how our permissioners were defined:
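A permissioner in this style might look roughly like the following sketch. The `relations` and `execute` names come from this post; everything else (`Permissioner`, `runPermissioner`, the `Context` shape, the `owner_id` column, and the rule itself) is an assumption for illustration rather than Abacus's actual code.

```typescript
// Hypothetical shapes for a model instance and the request context.
type Model = { [key: string]: unknown };
type Context = { userId: number };

type Permissioner = {
  // Keys of the relations that must be attached before execute runs.
  relations: string[];
  // Typed as returning unknown so the runner can verify the result at runtime.
  execute: (instance: Model, context: Context) => unknown;
};

// Example rule: the amount is visible only to the owner of the card transaction.
const amountPermissioner: Permissioner = {
  relations: ["card_transaction"],
  execute: (expense, context) => {
    const txn = expense["card_transaction"];
    if (typeof txn !== "object" || txn === null) return false;
    return (txn as Model)["owner_id"] === context.userId;
  },
};

// Runs a permissioner and rejects anything that is not a plain boolean,
// which catches execute functions that still return a promise.
function runPermissioner(
  permissioner: Permissioner,
  instance: Model,
  context: Context,
): boolean {
  const result = permissioner.execute(instance, context);
  if (typeof result !== "boolean") {
    throw new Error("execute must synchronously return a boolean");
  }
  return result;
}
```

Because `execute` never touches the database, it can be called for every item in a large batch without adding a single query.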
Although the examples in our last post appeared to be synchronous, this was a simplification. In reality, they returned promises and often undertook complex operations. By contrast, the new permissioner will throw an error if an execute function returns anything but a boolean.
The real insight here is the relations field. This is what lets us know in advance what additional data the permissioner will need. It is represented as an array of strings, which are the keys of the relations on the models. These keys will be passed to the relations helper to specify what data to attach to the models.
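Putting the pieces together, the flow might be sketched like this: gather the relation keys every permissioner declares, attach them once for the whole batch, then run the synchronous functions. The names `collectRelations` and `permissionInstances`, and the `attach` callback standing in for the relations helper, are hypothetical.

```typescript
type Rec = { [key: string]: unknown };

type FieldPermissioner = {
  relations: string[];
  execute: (instance: Rec) => boolean;
};

// Union of relation keys across every field's permissioner, without duplicates.
function collectRelations(permissioners: Record<string, FieldPermissioner>): string[] {
  const seen = new Set<string>();
  for (const field of Object.keys(permissioners)) {
    const permissioner = permissioners[field];
    if (permissioner === undefined) continue;
    for (const relation of permissioner.relations) seen.add(relation);
  }
  return Array.from(seen);
}

function permissionInstances(
  instances: Rec[],
  permissioners: Record<string, FieldPermissioner>,
  attach: (instances: Rec[], relations: string[]) => void,
): Rec[] {
  // Step 1: data acquisition, done once for the whole batch.
  attach(instances, collectRelations(permissioners));
  // Step 2: synchronous permissioning; no queries happen from here on.
  return instances.map((instance) => {
    const visible: Rec = {};
    for (const field of Object.keys(permissioners)) {
      const permissioner = permissioners[field];
      // Only fields whose permissioner approves are copied to the output.
      if (permissioner !== undefined && permissioner.execute(instance)) {
        visible[field] = instance[field];
      }
    }
    return visible;
  });
}
```

In this sketch a field with no permissioner is simply omitted from the output; a real system would need to decide that policy explicitly.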
This is a huge advance as it separates the responsibilities of data acquisition and permissioning. Both of these frameworks can be independently improved and optimized. For example, the relations helper has been optimized to only fetch a given datum once if two different models require it. This also increases our leverage: if we optimize the relations helper, it will also speed up the permissioners, which makes that optimization work more appealing from a planning perspective.
Coding Is Less Than Half The Battle
With my goal set, I sat down at my desk to write the code that would implement the agreed-upon design. To allow for gradual adoption, the system would fall back on the old helper wherever a new one wasn’t defined yet.
One late night I threw caution to the wind and created all the new helpers in one fell swoop. Ah, untested code: is there any greater monument to the hubris and naïveté of humanity?
As I would find, testing and rolling out all the code I had written was more difficult than writing it in the first place. That story will come in the next installment of the permissioner series.