How Abacus Permissions Data Through Its API (Part 1)
In the first post of a three-part series, we’ll share how we built a powerful and extensible data-permissioning layer into our API.
Welcome to the first in a series of technical blog posts explaining how Abacus secures information exposed to consumers of our API. This series will get fairly close to the metal, but if you’re familiar with the basics of web servers and security, it should make sense.
In this installment, we will introduce the requirements, concepts, and initial version of our framework. Subsequent posts will address the evolution of this work, specifically how we addressed performance issues when they arose, and how we deployed an upgraded version of the framework in production.
First, some background.
Just send data
Almost all web applications request and display information, and APIs handle the transfer of that information. APIs are also responsible for ensuring that the client has access rights to the data they are requesting. Are they who they say they are? Are they a system administrator? And do they belong to the company for which they are requesting data? This is the problem of permissioning.
Like many applications in their early days, we used to “just send data.” Each endpoint authenticated requests and determined what data the client should see. We used a few shared functions for common scenarios (e.g. restricting access to admins only), but by and large, each route’s handler was responsible for its own security.
This setup expected the route handlers to correctly limit or prune data. If a query omitted a restriction on, say, the company to which the requesting user belonged, that user might then be able to see sensitive information they were not supposed to see. Client-side logic would prevent this in day-to-day use, but that was only a superficial protection. To be truly secure, we had to manually double-check assumptions about the caller in every request handler.
We were eager to lower the chances of human error. As the number of our endpoints grew, the complexity grew too. It was no longer reasonable to expect our engineers to be aware of, not to mention account for, all our permissioning rules. It was one thing to implement an obvious rule such as “a company’s expenses should never be sent to any user who does not belong to that company.” It was another challenge entirely to articulate that “only admins and user-delegates should be able to see the receipts on a card transaction.”
Consistently enforcing all rules across all endpoints, on data that might be deeply nested within another structure, became risky. Someone, somewhere, was bound to miss something eventually. To relieve the individual route handlers of their responsibility to permission data correctly, we needed a solution that would take the onus off the engineer and programmatically prune data as necessary.
A secure API wish list
Our wish list, then, looked like this:
- The solution would ideally require as little developer awareness and action as possible.
- As alluded to above, it would have to work at the level of individual fields. Per-instance solutions in the style of access control lists wouldn’t work. Employees need to see each other’s names, but other user data might only be visible to the admin.
- The logic that decides whether a particular property is visible couldn’t be based solely on user privilege or role. Theoretically speaking, the visibility of a property could depend on any number of factors.
We built a framework that works like this:
Fields on a given data model are each coupled tightly with a permissioning function. Every field sent down to the client requires a visibility determiner (visibleToUser), as the absence of one defaults the field to hidden.
These ‘visibility determiners’ came to be called permissioners. The function signature for any given permissioner is standardized, and can be used to model any conceivable logic for evaluating a permission. For a given data field (or property), this function expects the full object to which that property belongs, as well as the user who initiated the request and to whom we are about to send data. This function is expected to return a boolean indicating whether to include the property key/value pair in the response.
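To make the signature concrete, here is a minimal TypeScript sketch. The `User` shape, the `Permissioner` type name, and `visibleToCompanyAdmin` are assumptions made for illustration, not Abacus’s actual code. The return type also allows a promise, on the assumption that a check may need to do asynchronous work.

```typescript
// Hypothetical requester shape; the real user model is not shown in this post.
interface User {
  id: string;
  companyId: string;
  isAdmin: boolean;
}

// A permissioner receives the full object that owns the field, plus the
// requesting user, and answers whether the field may be sent to that user.
type Permissioner<T> = (item: T, requester: User) => boolean | Promise<boolean>;

// A named, reusable check: visible only to admins of the item's company.
const visibleToCompanyAdmin: Permissioner<{ companyId: string }> = (item, requester) =>
  requester.isAdmin && requester.companyId === item.companyId;
```

Because the check only needs a `companyId` on the item, the same function could be attached to fields on any model that carries one.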
In practice, most permissioners are given names that describe the permission they evaluate. They can then be shared across fields on a model, or even across models, and make up a set of common visibility checks which we reuse whenever possible.
Let’s get more specific. Our user model has a unique id property. We can easily implement a function that evaluates to true in the event that the user is requesting data from their own model instance. In the event of a more complicated permission, the permissioners could get additional required information from the database, such as a receipt upload or a delegate record. Since this process is always a function of a requester and a requested item, any dependency of the permissioner function is guaranteed to be findable based on its original parameters.
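As a rough sketch of both cases, the TypeScript below shows a synchronous “is this my own record?” check and an asynchronous check that looks up a delegate record. The names `isSelf`, `isDelegateOf`, and `findDelegate`, and the in-memory array standing in for the database, are all hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface User {
  id: string;
  companyId: string;
  isAdmin: boolean;
}
type Permissioner<T> = (item: T, requester: User) => boolean | Promise<boolean>;

// True when the requester is reading their own user record.
const isSelf: Permissioner<User> = (item, requester) => item.id === requester.id;

// In-memory stand-in for a table of delegate records.
interface DelegateRecord {
  ownerId: string;
  delegateId: string;
}
const delegateRecords: DelegateRecord[] = [{ ownerId: "u1", delegateId: "u2" }];

// Simulated asynchronous database lookup.
async function findDelegate(ownerId: string, delegateId: string): Promise<DelegateRecord | undefined> {
  return delegateRecords.find(r => r.ownerId === ownerId && r.delegateId === delegateId);
}

// True when the requester is a delegate of the user who owns the record.
// Everything it needs is derived from its two parameters.
const isDelegateOf: Permissioner<User> = async (item, requester) =>
  (await findDelegate(item.id, requester.id)) !== undefined;
```

Callers can `await` the result of either function, so the framework does not need to know which checks hit the database.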
It’s also trivial to compose these functions into more complex logic.
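One way such composition might look is sketched below in TypeScript. The combinators `anyOf` and `allOf`, and the sample checks, are assumed names for illustration rather than part of the framework described here.

```typescript
// Hypothetical shapes for illustration only.
interface User {
  id: string;
  companyId: string;
  isAdmin: boolean;
}
type Permissioner<T> = (item: T, requester: User) => boolean | Promise<boolean>;

// Passes when at least one of the given permissioners passes.
function anyOf<T>(...checks: Permissioner<T>[]): Permissioner<T> {
  return async (item, requester) => {
    for (const check of checks) {
      if (await check(item, requester)) return true;
    }
    return false;
  };
}

// Passes only when every given permissioner passes.
function allOf<T>(...checks: Permissioner<T>[]): Permissioner<T> {
  return async (item, requester) => {
    for (const check of checks) {
      if (!(await check(item, requester))) return false;
    }
    return true;
  };
}

const isSelf: Permissioner<User> = (item, requester) => item.id === requester.id;
const isCompanyAdmin: Permissioner<User> = (item, requester) =>
  requester.isAdmin && requester.companyId === item.companyId;

// A composed check: the record's owner, or an admin of the same company.
const isSelfOrCompanyAdmin = anyOf(isSelf, isCompanyAdmin);
```

Both combinators evaluate checks in order and stop at the first decisive result, so cheap checks are best listed first.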
When we receive a request, we execute the handler as we always have. But before we send the results, we recursively evaluate each property given its permission function. The results are then reconstructed from the properties confirmed to be visible.
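The pruning step can be sketched roughly as follows. This is a simplified TypeScript illustration under assumed names (`Schema`, `FieldRule`, `prune`, and the sample rules), not the actual results sender. Each field rule carries a `visibleToUser` function, and a field with no rule is dropped.

```typescript
// Hypothetical shapes for illustration only.
interface User {
  id: string;
  companyId: string;
  isAdmin: boolean;
}
type Permissioner = (item: any, requester: User) => boolean | Promise<boolean>;

interface FieldRule {
  visibleToUser: Permissioner;
  nested?: Schema; // rules for a nested object, or for each element of an array
}
type Schema = { [field: string]: FieldRule };

// Rebuilds `item` from only the fields the requester is allowed to see.
async function prune(item: any, schema: Schema, requester: User): Promise<any> {
  if (Array.isArray(item)) {
    return Promise.all(item.map(element => prune(element, schema, requester)));
  }
  const result: { [key: string]: unknown } = {};
  for (const key of Object.keys(item)) {
    const rule = schema[key];
    // A field without a rule is hidden by default.
    if (!rule || !(await rule.visibleToUser(item, requester))) continue;
    const value = item[key];
    result[key] =
      rule.nested && value !== null && typeof value === "object"
        ? await prune(value, rule.nested, requester)
        : value;
  }
  return result;
}

// Sample rules: names are visible to everyone, salary only to admins.
const always: Permissioner = () => true;
const adminOnly: Permissioner = (_item, requester) => requester.isAdmin;

const userSchema: Schema = {
  id: { visibleToUser: always },
  name: { visibleToUser: always },
  salary: { visibleToUser: adminOnly },
};
const teamSchema: Schema = {
  title: { visibleToUser: always },
  members: { visibleToUser: always, nested: userSchema },
};
```

With these rules, a field such as a password hash that has no entry in `userSchema` never reaches the client, whichever user makes the request.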
We end up with request handlers that need far less developer work than before. The process is consistent across all endpoints, and automated within the results sender. In addition, newly added model properties must be explicitly permissioned before they can be received by a client.
Because we allowed the permissioning functions to execute arbitrary logic, including asynchronous operations (like looking up additional information), we were able to model any conceivable permission. However, this meant our response time was subject to the slowest-executing permissioner. This wasn’t a problem… until it was.
We’ll dive into what changed and what we did to adapt in Part 2.