Solving global and local records in SaaS database design

icon

Lately, we've been seriously considering developing a new version of our framework. This one is already a few years old, and besides other problems it's facing, it lacks one very important thing - it's not suited for SaaS (Software as a Service) applications. Often, we would like to host a few simple projects (like multiple web pages) in a single database, but we are also thinking about developing a product / service, which we could offer to multiple clients. Making a product for different clients that would live in the same database is not simple, and requires an architecture that is both rigid and flexible, micro-useful and scalable.

The database model we currently use is doing quite well. It's able to support complex business-oriented systems, social portals and web sites, the core tables being the same across all applications. Sure, there are many redundant things, but we are aware where we should scale-down. More importantly, we have to decide how to upgrade this database model which is used in different applications to be able to work in a single database. I'm not that fond of using multiple databases or schemas in SaaS architecture, since this is a similar situation to which we have now, and it doesn't work that well. I'm certain having the same database for different clients is most suited for our business.

This means we will be having a ClientID (ApplicationID) in all the tables that will be used by all clients / applications. But what is the best way to take care of global and local records - records that are used by everyone versus those in only a few applications?

Let's assume we are building a CMS system for many sites, for which we will be needing two global Roles (Administrator, Moderator), but one instance will have its own Role as well (News editor). There are many ways we can go.

1. Local roles

This architecture means each client having it's own Roles. It's a simple solution, very flexible, but has many flaws - the most obvious being the benefits of global functionalities. E.g. if you're and Administrator, you can restart the site. Since we now have multiple Administrators (ID = 1,3,…), features like these get a bit harder to implement, and you can end up developing a totally different application for each client.


ApplicationIDRoleID (PK)Role
11Administrator
12Moderator
23Administrator
24Moderator
25News editor

2. Global Roles with a M:N table

The other option is to have global Roles, which are attached to a single application using a M:N table. This is an elegant and very flexible solution, but from my experience, you should avoid using M:N tables if possible. Of course, there are logical cases in which you can't, but you should always consider other options. Adding another table to the equation complicates queries, makes direct data browsing less understandable, besides raising the possibilities of errors.


RoleID (PK)Role
1Administrator
2Moderator
3News editor
ApplicationID (PK)RoleID (PK)
11
12
21
22
23

3. Composite key

Another possibility that comes to mind is to use a composite key, which in reality, makes a single table from the above two (Roles, ApplicationsRoles). This allows using the same IDs for the same Roles in different systems. But it also requires having a composite key, which makes other things more complicated. It prevents making simple joins (you need to make a join by both primary keys), which means all the tables joined to Roles would need the ApplicationID field as well, even if they wouldn't really require it. Not to mention this model simply calls for trouble.


ApplicationID (PK)RoleID (PK)Role
11Administrator
12Moderator
21Administrator
22Moderator
23News editor

4. Fake composite key

Sometimes, it makes sense using an additional primary key in a M:N relationship, so you can easily join the table further. We could do this as well, making the previous case a bit more easy-going. But this situation makes all the magic that we could do across the system a bit harder - since all the joins would be made using new local IDs, we are getting too close to the first solution (you can't make joins with the RoleID field, since it's not unique).


ID (PK)ApplicationIDRoleIDRole
111Administrator
212Moderator
321Administrator
422Moderator
523News editor

5. NULL-able ApplicationID

A different architecture enables using global Roles that have no ApplicationID, and local Roles that have it. Each application is set to use both sets of Roles, allowing flexibility and understandability. But this model has a problem as well - we can't prevent an application from seeing a global Role - those are intended for everybody, which means that in the long run, the system would start gravitating towards not having any global roles at all, to assure flexibility. To simplify - this architecture is great, but needs something that takes care of Denying.


ApplicationIDRoleID (PK)Role
NULL1Administrator
NULL2Moderator
23News editor

5. a) Creating a table for Deny

Let's put another application (ApplicationID = 3) into the equation, which, for some reason, doesn't want to have the global Role named Moderator (RoleID = 2). Adding another table to care of Denied Roles would solve the problem perfectly, but do we really need another table? We want to keep our database with as little tables as possible, so it's easier to use. That's why we should consider extending the Roles table to handle denying as well.


ApplicationIDRoleID (PK)Role
NULL1Administrator
NULL2Moderator
23News editor
ApplicationID (PK)DeniedRoleID (PK)
32

5. b) Adding two fields for Deny

In the most normalized way, we need two fields to take care of Deny - a boolean to mark we are working with a Deny record type, and an integer to tell which record we are denying. But what if we want to optimize even further?


ApplicationIDRoleID (PK)RoleIsDeniedDeniedRoleID
NULL1AdministratorfalseNULL
NULL2ModeratorfalseNULL
23News editorfalseNULL
34NULLtrue2

5. c) Using an existing field for Deny

We could put information about which record we are denying into the existing Roles.Role field, since we don't really need it in these cases. But there's a major setback; if a non-numeric field would be accidentally stored in the denied Role, the SQL engine would throw an error.


ApplicationIDRoleID (PK)RoleIsDenied
NULL1Administratorfalse
NULL2Moderatorfalse
23News editorfalse
342true

5. d) Using a single field for Deny

It's much easier to combine the first two fields into one, marking the record implicitly. If the DeniedID is NULL, it means we are working with a regular record, otherwise, we are dealing with a denying record. Information about which Role we are denying is stored in the same field, which works both as a boolean, as an integer.


ApplicationIDRoleID (PK)RoleDeniedRoleID
NULL1AdministratorNULL
NULL2ModeratorNULL
23News editorNULL
34NULL2

The decision

The last architecture (5. d) is the one I think it's best. It's very transparent and flexible, and it has only one major problem I can currently think of; if only a few (of many) applications would require to share a Role, this wouldn't be possible to solve with this model, but I can live with that.


ApplicationIDRoleID (PK)RoleDeniedRoleID
NULL1AdministratorNULL
NULL2ModeratorNULL
23News editorNULL
34NULL2

The SQL statement

I have played around a bit with the 5. d) and it works very well. Here's the SELECT statement that retrieves the records for each specific application, supporting the mentioned scenarios.

SELECT Roles.*
FROM Roles
WHERE
(Roles.ApplicationID = 3 OR Roles.ApplicationID IS NULL) --retrieve local and global roles
AND
Roles.DeniedRoleID IS NULL --use only regular records
AND
Roles.RoleID NOT IN --remove denied roles
(
SELECT DeniedRoles.DeniedRoleID FROM Roles AS DeniedRoles
WHERE DeniedRoles.ApplicationID = 3
AND DeniedRoles.DeniedRoleID IS NOT NULL
)

The same query, solved with a JOIN instead of WHERE IN, which is faster, but probably less understandable.

SELECT RealRoles.*
FROM Roles AS RealRoles --regular records
LEFT OUTER JOIN
Roles AS DeniedRoles --denying records
ON RealRoles.RoleID = DeniedRoles.DeniedRoleID --join by the ID of the role
AND RealRoles.DeniedRoleID IS NULL --but only those records that are real
AND DeniedRoles.DeniedRoleID IS NOT NULL --vs those that are denying
AND DeniedRoles.ApplicationID = 3
WHERE
(RealRoles.ApplicationID = 3 OR RealRoles.ApplicationID IS NULL) --retrieve my and global roles
AND
RealRoles.DeniedRoleID IS NULL -- take all regular roles
AND
DeniedRoles.RoleID IS NULL --that don't have a deny

Conclusion

There are many ways to solve a database model in a SaaS architecture, but because of many reasons, we've decided to use a single database with an additional foreign key that defines the client in all the required tables. One of the biggest concerns in this architecture is solving global and local records, but the model presented here takes care of most requirements a typical Saas system might stumble upon, by using a NULL-able ApplicationID, with another field to take care of exceptions that require denying.

This architecture solves most of the problems, and if you're leaning towards a single database architecture, it's probably the best way to go.

Like what you read? Now tell the world!

A few more things you might find interesting:


Comment
written 21.7.2013 0:50 CET on chronolog
1721 views   •   2 comments  •   Like   •   
Hi Gregor, thanks for the comment / question. First, you have to understand this is meant fror product solutions, where instances aren't very different from one-another (or non-crucial systems such as CMS). Of course, you have to be careful with such an architecture, so the core should, once it's done, barely change. On the other hand, custom things should be solved using 1:1 or separate tables. As for managing multiple tables / databases, we have this kind of a situation now, and starting a new project, not no mention updates, are a pain in the ass. It's much easier if you solve this on the database level - adding a new client record, everything works. But yes, you should be very careful, the security is crucial, so it will probably require around two times as much effort to make the features bullet-proof, compared to a single application.
commented 21.7.2013 19:51 CET by Stritar
I barely have a slight idea about your data model, so please forgive my ignorance. But I can't help to comment anyway :) So, if you evolve your schema due to one client's requirements, you have a chance to bring each and every customer's site down? I'm always very careful with one db fits all thinking. Second thing is, if you have a common set of data, used by (almost) all customers, wouldn't it be safer and easier to manage by just puting it in a separate table or even a separate DB? I don't even dare to go into the security considerations :)
commented 21.7.2013 19:16 CET by Gregor Borosa
date
date
date
date

Connect with :