informatique -solutions series: February 2009

Saturday, February 14, 2009

A Comprehensive Database Security Model

This week I am taking a bit of a departure. Normally I write
about things I have already done, but this week I want to
speculate a bit on a security model I am thinking of coding
up. Basically I have been asking myself how to create a
security model for database apps that never requires elevated
privileges for code, but still allows for hosts sharing multiple
applications, full table security including row level and
column level security, and structural immunity to SQL injection.

The Functional Requirements

Let's consider a developer who will be hosting multiple
database applications on a server, sometimes instances of the
same application for different customers. The applications
themselves will have different needs, but they all boil down
to this:

- Some applications will allow surfers to join the site
  and create accounts for themselves, while others will be
  private sites where an administrator must make user accounts.
- Some applications will not contain sensitive data, and
  so the site owner wants to send forgotten passwords in email
  -- which means the passwords must be stored in plaintext. Other
  site owners will need heightened security that disallows
  storing of passwords in plaintext.
- In both cases, administrators must of course be able to
  manage accounts themselves.
- The system should be structurally immune
  to SQL injection.
- It must be possible to have users with the same user id
  ("Sheilia", "John", etc.) on multiple applications who are
  actually totally different people.
- The application code must never need to run at an
  elevated privilege level for any reason -- not
  even to create accounts on public sites where
  users can join up and conduct transactions.
- It must be possible for the site owners or their
  agents to connect directly to the database, at the very
  least for querying and possibly to do database writes,
  without going through our application.
- Users with accounts on one app must never be able to
  sign on to another app on the same server.

These requirements represent the most flexible possible
combination of demands that I have so far seen in real life.
The question is, can they be met while still providing
security? The model I'd like to speculate on today says
yes.

Informed Paranoia Versus Frightened Ignorance

Even the most naive programmer knows that the internet
is not a safe place, but all too often a lot of security
advice you find is based on frightened ignorance
and takes the form, "never do x, you don't know what might
happen." If we are to create a strong security model,
we have to do better than this.

Much better is to strive to be like a strong system architect,
whose approach is based on informed paranoia.
This hypothetical architect knows everybody is out
to compromise his system, but he seeks a thorough knowledge
of the inner workings of his tools so that he can
engineer the vulnerabilities out as much as possible.
He is not looking to write rules for the programmer
that say "never do this", he is rather looking to make it
impossible for the user or programmer to compromise
the system.

Two Examples

Let us consider a server hosting two applications, which
are called "social" and "finance".

The "social" application is a social networking site with
minimal security needs. Most important is that the site
owners want members of the general public to sign up, and
they want to be able to email forgotten passwords
(and we can't talk them out of it) -- so we
have to store passwords in plaintext.

The "finance" application is a private site used by employees
of a corporation around the world. The general public is
absolutely not welcome. To make matters worse however, the
corporation's IT department demands to be able to directly
connect to the database and write to the database without
going through the web app. This means the server will have
an open port to the database. Sure it will be protected with
SSL and passwords, but we must make sure that only users
of "finance" can connect, and only to their own application.

Dispensing With Single Sign-On

There are two ways to handle connections to a database. One
model is to give users real database accounts, the other is
to use a single account to sign on to the database. Prior to
the web coming along, there were proponents of both models in
the client/server world, but amongst web developers the single
sign-on method is so prevalent that I often wonder if they
know there is any other way to do it.

Nevertheless, we must dispense with the single sign-on method
at the start, regardless of how many people think that Moses
carved it on the third tablet, because it just has too many
problems:

- Single sign-on is the primary architectural flaw that makes
  SQL injection possible. As we will see later, using real
  database accounts makes your site (almost) completely immune
  to SQL injection.
- Single sign-on requires a connection at the maximum privilege
  level that any system user might have, where the code then decides
  what it will let a particular user do. This is a complete
  violation of the requirement that code always run at the lowest
  possible privilege level.
- Single sign-on makes it impossible to meet the requirement that
  authorized agents be allowed to connect to the database and
  directly read and write values.

So single sign-on just won't work with the requirements listed.
This leads us to creating real accounts on the database server.

Real Accounts and Basic Security

When you use a real database account, your code connects
to the database using the username and password provided
by the user. Anything he is allowed to do your code will
be allowed to do, and anything he is not allowed to do will
throw an error if your code tries to do it.

This approach meets quite a few of our requirements nicely.
A site owner's IT department can connect with the same
accounts they use on the web interface -- they have
the same privileges in both cases. Also, there is no
need to ever have application code elevate its privilege
level during normal operations, since no regular users should ever be
doing that. This still leaves the issue of how to create
accounts, but we will see that below.

A programmer who thinks of security in terms of what code
can run will have a very hard time wrapping his head around
using real database accounts for public users. The trick to
understanding this approach
is to forget about code for a minute and to
think about tables. The basic fact of database application
security is that all security
resolves to table permissions. In other words, our security
model is all about who can read or write to what tables, it is
not about who can run which program.

If we grant public users real database accounts, and they
connect with those accounts, the security must be handled
within the database itself, and it comes down to:

- Defining "groups" as collections of users who share
  permissions at the table level.
- Deciding which groups are allowed select, insert, update,
  and delete privileges on which tables.
- Granting and revoking those privileges on the server itself
  when the database is built.
- At the very least, row-level security will be required, wherein
  a user can only see and manipulate certain rows in a table.
  This is how you keep users from using SQL injection to mess
  with each other's order history or member profiles.
- Column security is also very nice to finish off the
  picture, but we will not be talking about that today as it
  does not play into the requirements.
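
The first three steps in the list above can be sketched as a small build step. This is only an illustration, assuming PostgreSQL-style REVOKE and GRANT statements; the group names, table names, and the build_grants helper are all hypothetical, and row-level security is not covered here.

```python
# Hypothetical permission spec: group -> table -> privileges.
# Anything not listed stays denied, because the build starts from REVOKE ALL.
SPEC = {
    "social_members": {"profiles": ["select", "update"], "items": ["select"]},
    "social_admins": {"profiles": ["select", "insert", "update", "delete"]},
}

ALLOWED_PRIVILEGES = {"select", "insert", "update", "delete"}


def build_grants(spec, all_tables):
    """Return the SQL statements a database build would run, in order."""
    statements = []
    # Deny by default: strip every table of any public privileges first.
    for table in sorted(all_tables):
        statements.append(f"REVOKE ALL ON {table} FROM PUBLIC;")
    # Then grant back only what the spec names explicitly.
    for group in sorted(spec):
        for table in sorted(spec[group]):
            if table not in all_tables:
                raise ValueError(f"unknown table: {table}")
            privileges = spec[group][table]
            unknown = set(privileges) - ALLOWED_PRIVILEGES
            if unknown:
                raise ValueError(f"unknown privileges: {sorted(unknown)}")
            privilege_list = ", ".join(p.upper() for p in privileges)
            statements.append(f"GRANT {privilege_list} ON {table} TO {group};")
    return statements


if __name__ == "__main__":
    for line in build_grants(SPEC, {"profiles", "items", "supersecrets"}):
        print(line)
```

Note that no group is ever granted anything on the supersecrets table, simply because the spec never mentions it.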

Now we can spend a moment and see why this approach eliminates
most SQL Injection vulnerabilities. We will imagine a table of
important information called SUPERSECRETS. If somebody could
slip in a SQL injection exploit and wipe out this table we'd all
go to jail, so we absolutely cannot allow this.
Naturally, most users would have no privileges on
this table -- even though they are directly connected to the
database they cannot even see the table exists, let alone
delete from it. So if our hypothetical black hat
somehow slips in ";delete from supersecrets"
and our code fails to trap for it, nothing happens. They have
no privilege on that table. On the other side of things, consider
the user who is privileged to delete from that table. If this
user slips in a ";delete from supersecrets" he is only going to
the trouble of SQL injection to do something he is perfectly
welcome to do anyway through the user interface. So much
for SQL injection.
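
The argument above can be tried out in miniature. The sketch below is a simulation, not the real thing: SQLite has no user accounts, so an authorizer callback from Python's sqlite3 module stands in for the server's table privileges. The items table, the function names, and the injected string are all made up for illustration.

```python
import sqlite3

# Tables this simulated "user" has no write privileges on.
NO_WRITE = {"supersecrets"}
WRITE_ACTIONS = {
    sqlite3.SQLITE_INSERT,
    sqlite3.SQLITE_UPDATE,
    sqlite3.SQLITE_DELETE,
}


def table_privileges(action, arg1, arg2, db_name, source):
    """Authorizer callback: for write actions, arg1 is the table name."""
    if action in WRITE_ACTIONS and arg1 in NO_WRITE:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


def vulnerable_search(conn, name):
    """Deliberately bad code: user input is pasted straight into the SQL."""
    conn.executescript("select * from items where name = '%s';" % name)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("create table items (name text)")
    conn.execute("create table supersecrets (secret text)")
    conn.execute("insert into supersecrets values ('launch codes')")
    conn.commit()

    # From here on, the connection behaves like a low-privilege account.
    conn.set_authorizer(table_privileges)

    blocked = False
    try:
        vulnerable_search(conn, "x'; delete from supersecrets; --")
    except sqlite3.Error:
        blocked = True  # the injected DELETE was refused, not executed

    remaining = conn.execute("select count(*) from supersecrets").fetchone()[0]
    return blocked, remaining
```

The application code here is as careless as it can be, yet the row in supersecrets survives, because the refusal happens below the code.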

To repeat a point made above: row-level security is a must.
If you grant members of a social site global UPDATE privileges
on the PROFILES table, and you fail to prevent a SQL Injection,
all hell could break loose. Much better is the ability to
limit the user to seeing only his own row in the PROFILES table,
so that once again you have created a structural immunity
to SQL injection.

Anonymous Access

Many public sites allow users to see all kinds of information
when they are not logged on. The most obvious example would
be an eCommerce site that needs read access to the ITEMS table,
among others. Some type of anonymous access must be allowed
by our hypothetical framework.

For our two examples, the "social" site might allow limited
viewing of member profiles, while the "finance" application
must show absolutely nothing to the general public.

If we want a general solution that fits both cases, we opt
for a deny-by-default model and allow each application
to optionally have an anonymous account.

First we consider deny-by-default. This means simply that
our databases are always built so that no group has any
permissions on any tables. The programmer of the "social"
site now has to grant certain permissions to the anonymous
account, while the programmer of the "finance" application
does nothing - he already has a secure system.

But still the "finance" site is not quite so simple. An anonymous
user account with no privileges can still log in, and
that should make any informed paranoid architect nervous.
We should extend
the deny-by-default philosophy so the framework will
not create an anonymous
account unless requested. This way the programmer of the
"finance" application still basically does nothing, while
the programmer of the "social" site must flip a flag to create
the anonymous account.

Virtualizing Users

If we are having real database accounts, there is one small
detail that has to be addressed. If the "social" site has
a user "johnsmith" and the finance application has a user
of the same name, but they are totally different people,
we have to let both accounts exist but be totally separate.

The answer here is to alias the accounts. The database
server would actually have accounts "finance_johnsmith" and
"social_johnsmith". Our login process would simply take
the username provided and prepend the application code to it
when authenticating on the server. 'nuf said on that.
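
A minimal sketch of that aliasing step follows. The alias_username function is hypothetical, and the restriction on allowed characters is an assumption added here so that the aliased name is always a safe identifier and the underscore separator stays unambiguous.

```python
import re

# Assumption: application codes and user ids are limited to lowercase letters
# and digits, so the only underscore in an alias is the separator.
VALID_NAME = re.compile(r"[a-z][a-z0-9]*")


def alias_username(app_code, username):
    """Map an application-level user id to its database account name."""
    app_code = app_code.lower()
    username = username.lower()
    for part in (app_code, username):
        if not VALID_NAME.fullmatch(part):
            raise ValueError(f"invalid name: {part!r}")
    return f"{app_code}_{username}"


if __name__ == "__main__":
    print(alias_username("social", "johnsmith"))
    print(alias_username("finance", "johnsmith"))
```

The login code would then authenticate against the database with the aliased name and the password exactly as the user typed it.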

Allowing Public Users To Join

The "social" site allows anybody to join up and create
an account. This means that somehow the web application
must be able to create accounts on the database server.
Yet it must do this without allowing the web code to
elevate its privileges, and while preventing the disaster
that would ensue if a user on the "social" site somehow
got himself an account on the "finance" site.

Believe it or not, this is the easy part! Here is how it
works for the "social" site:

- Create a table of users. The primary key is the user_id
  which prevents duplication.
- For the social site, there is a column called
  PASSWORD that stores the password in plaintext.
- Allow the anonymous account to INSERT into this table!
  (Remember though that deny-by-default means that so far
  this account has no other privileges.)
- Put an INSERT trigger on the table that automatically creates
  an aliased user account, so that "johnsmith" becomes
  "social_johnsmith". The trigger also sets the password.
- A DELETE trigger on the table would delete the account if
  the row is deleted.
- An UPDATE trigger on the table would update the password
  if the user UPDATES the table.
- Row level security is an absolute must.
  Users must be able to
  SELECT from and UPDATE the table, but only their own row. If your
  database server or framework cannot support row-level
  security, it's all out the window.
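
The trigger mechanics in the list above can be sketched with SQLite, with one big caveat: SQLite has no login accounts, so a server_accounts table stands in for the database server's own account list. On a real server the triggers would create, alter, and drop actual accounts instead. All table and trigger names are hypothetical, and neither privileges nor row-level security are modeled here.

```python
import sqlite3

SCHEMA = """
create table users (
    user_id  text primary key,   -- primary key prevents duplicate user ids
    password text not null       -- plaintext, as the "social" site requires
);
-- Stand-in for the database server's own list of login accounts.
create table server_accounts (
    account  text primary key,
    password text not null
);
create trigger users_ins after insert on users begin
    insert into server_accounts
    values ('social_' || new.user_id, new.password);
end;
create trigger users_upd after update of password on users begin
    update server_accounts set password = new.password
     where account = 'social_' || new.user_id;
end;
create trigger users_del after delete on users begin
    delete from server_accounts where account = 'social_' || old.user_id;
end;
"""


def build_database():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def accounts(conn):
    """Return the simulated server accounts as {account: password}."""
    return dict(conn.execute("select account, password from server_accounts"))


if __name__ == "__main__":
    conn = build_database()
    conn.execute("insert into users values (?, ?)", ("johnsmith", "secret1"))
    print(accounts(conn))
```

The web code only ever issues plain INSERT, UPDATE, and DELETE statements against the users table; everything that touches accounts happens inside the triggers.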

This gives us a system that almost gets us where we need
to be: the general public can create accounts,
the web application does not need to elevate its privileges,
users can set and change their passwords, and no user can
see or set anything for any other user. However, this leaves
the issue of password recovery.

In order to recover passwords and email them to members of
the "social" site, it is tempting to think that
the anonymous account must be able to
somehow read the users table, but that is no good because
then we have a structural flaw where a successful
SQL injection would expose user accounts. However, this
also turns out to be easy. There are two options:

- Write a stored procedure that the anonymous user is
  free to execute, which does not return a password but
  actually emails it directly from within the database
  server. This requires that your database server be able to
  send emails. (Postgres can, and I assume SQL Server
  can, and I don't really know about MySQL.)
- Create a table for password requests, allow inserts
  to it but nothing else. A trigger sends the email.
  In this approach you can track email recovery requests.
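
The second option might look something like the sketch below, again using SQLite as a stand-in. Since SQLite cannot send mail, the trigger queues the message in a hypothetical email_outbox table; the table names, the email column, and the message text are all assumptions made for illustration, and the insert-only privilege itself is not modeled.

```python
import sqlite3

SCHEMA = """
create table users (user_id text primary key, email text, password text);
create table password_requests (
    user_id      text not null,
    requested_at text default current_timestamp
);
-- Stand-in for the server sending mail: queued messages land here.
create table email_outbox (recipient text, body text);
create trigger password_requests_ins after insert on password_requests begin
    insert into email_outbox
    select email, 'Your password is: ' || password
      from users where user_id = new.user_id;
end;
"""


def build_database():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "insert into users values (?, ?, ?)",
        ("johnsmith", "john@example.com", "secret1"),
    )
    return conn


def request_password(conn, user_id):
    """The only statement the anonymous account would need to run."""
    conn.execute("insert into password_requests (user_id) values (?)", (user_id,))


def outbox(conn):
    return conn.execute("select recipient, body from email_outbox").fetchall()
```

The caller never reads the users table and never sees a password. A request for an unknown user id queues nothing, but it is still recorded in password_requests, which is what makes the requests trackable.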

For the "finance" application we cannot allow any of this
to happen, so again we go to the deny-by-default idea. All
of the behaviors above will not happen unless the programmer
sets a flag to turn them on when the database is built.

This does leave the detail of how users of the "finance"
application will reset their passwords.
For details on how a secure app can still allow password
resets, see my posting of Sept 7 2008, "Secure Password Resets":
http://database-programmer.blogspot.com/2008/09/advanced-table-design-secure-password.html

One More Detail on Public Users

We still have one more detail to handle for public users.
Presumably a user, having joined up, has more privileges than
the anonymous account. So the web application must be able
to join them into a group without elevating its privileges.
The solution here is the same as for creating the account:
there will be a table that the anonymous user can make
inserts into (but nothing else), and a trigger will join
the user to whatever group is named.

Except for one more detail. We cannot let the user join
whatever group they want, only the special group for members.
This requirement can be met by defining the idea of a "freejoin"
group and also a "solo" group. If the anonymous user inserts
into a user-group table, and the requested group is flagged
as allowing anybody to join, the trigger will allow it, but
for any other group the trigger will reject the insert.
The "solo" idea is similar, it means that if a user is in
the "members" group, and that group is a "solo" group, they
may not join any other groups. This further jails in
members of the general public.
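
The decision the trigger has to make can be written down as a small function. This is a sketch of the rule only, not of a real trigger; the group names are hypothetical, and the last check (refusing to add a solo group to a user who already has other groups) is an assumption that goes slightly beyond what is described above.

```python
# Hypothetical group definitions; the flags mirror the "freejoin" and "solo"
# ideas described above.
GROUPS = {
    "members": {"freejoin": True, "solo": True},
    "moderators": {"freejoin": False, "solo": False},
    "newsletter": {"freejoin": True, "solo": False},
}


def check_join(current_groups, requested, groups=GROUPS):
    """Return None if the insert is allowed, else the reason for rejecting it."""
    if requested not in groups:
        return "no such group"
    if not groups[requested]["freejoin"]:
        return "group is not open to self-service joins"
    if any(groups[g]["solo"] for g in current_groups):
        return "user already belongs to a solo group"
    # Assumption: a solo group cannot be added on top of other memberships.
    if groups[requested]["solo"] and current_groups:
        return "solo group cannot be combined with other groups"
    return None


if __name__ == "__main__":
    print(check_join(set(), "members"))
    print(check_join({"members"}, "newsletter"))
    print(check_join(set(), "moderators"))
```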

Almost Done: User Administration

In the last two sections we saw the idea of a table of users
and a cross-reference of users to groups. This turns out to
solve another issue we will have: letting administrators
manage groups. If we define a group called "user_administrators"
and give them total
power on these tables, and also give them CRUD screens
for them, then we have a user administrator system.
This works for both the "social" and the "finance" application.

The triggers on the table have to be slightly different
for the two cases, but that is a small exercise to code
them up accordingly.

Cross-Database Access

Believe it or not, the system outlined above has met all of
our requirements except one. So far we have a system that never
requires the web server to have any elevated privileges within
the database, allows members of the public to join some sites
while barring them from others, is structurally immune from
SQL injection, allows different people on different sites to
have the same user id, and allows administrators
of both sites to directly manage accounts. Moreover, we
can handle both plaintext passwords and more serious
reset-only situations.

This leaves only one very thorny issue: cross-database
access. The specific database server I use most is PostgreSQL,
and for this scenario it has a problem: out of the box, a
database account can connect to any database. This does not
mean the account has any privileges on the database, but we
very seriously do not want
this to happen at all. If a member of the "social" site can
connect to the "finance" app, we have a potential vulnerability
even if he has zero privileges in that database. We would be
much happier if he could not connect at all.

In Postgres there is a solution to this, but I've grown to
not like it. In Postgres you can specify that a user can only
connect to a database if they are in a group that has the
same name as the database. This is easy to set up, but it
requires changing the default configuration of Postgres.
For the sheer challenge of it, though, I'd like to work out
how to do it without requiring that change. So far I'm
still puzzling this out. I'd also like to know that the
approach would work at the very least on MS SQL Server and
MySQL.

Conclusion

Most of what is in this week's essay is not that radical to
any informed database veteran. But to web programmers
who were unfortunate enough to grow up in the world
of relational-databases-must-die nonsense, it is probably
hard or impossible to imagine a system where users are
connecting with real database accounts. The ironic thing
is that the approach described here is far more secure
than any single sign-on system, but it requires the programmer
to shift thinking away from action-based code-centric models
to what is really going on: table-based privileges. Once
that hurdle is past, the rest of it comes easy.

Sunday, February 1, 2009

This Application Has Unique Business Rule Needs

No it does not. If it did, then your customer/employer
would be
doing something no other human being has ever done, which
is unlikely in the extreme. The application may be
unique in its particulars, but it is almost certainly
extremely common in its patterns. This week we will see
how "unique" needs are in fact nothing but common ordinary
development projects.

Beginning With the Conclusion

I have had this conversation with many programmers over
the past few years, and it always follows the same
patterns. The easy part of the argument is showing the
programmer that what he thinks is special or unique
is in fact common. The much harder part, because it
involves the delicate human ego, is showing the programmer
that he has not seen this because he is ignorant. This
is not fun to do and I myself usually skip it; it is
usually not worth the trouble.

Path 1: Details

Occasionally I speak to a programmer who thinks he has
a unique situation. His claim begins with
the mountain of details he must handle, details which appear
to be contradictory, subtle, and overall perplexing. He
wonders if some new approach is required to handle them.

In answering this claim, we begin with the easy part,
showing that the situation is itself not unique. In short,
all worthwhile projects involve mountains of detail, so
there is nothing special there. When it comes to the
subtleties and the maze of exceptions and special cases,
these are common in mature businesses that have evolved
this complexity in response to business needs over the years.
So again there is nothing unique here, the programmer's
situation is again common.

At this point we have to ask how the programmer will deal
with this perplexing mountain of detail. If he knows
what he is doing, he will give the general answer that he
is going to break it down as much as possible into
independent smaller problems that can be solved on their
own. Since this is nothing more than how all programmers
solve complex problems, the entire "uniqueness" claim
has completely collapsed. His project is utterly common.

The much harder part of the conversation comes if the
programmer does not know how to break down
the problem. For instance, if the problem is all about
a fiendishly complex pricing system with lots of discounts
and pricing levels, and the programmer does not know that
he needs to begin with the database, and he further does not
want to hear that, well, there is not much I can do for
him. He will end up working a lot harder than he needs
to, and will probably remain convinced he is dealing with
something "unique".

But let's go a little deeper into that example of the
complicated pricing system. Why do I claim that he must
start with the tables, and that he is wasting time
if he does not? Well, a complete answer is much more than
will fit here, and in fact I hit that theme over and over
in these essays, but it comes down to:

- He must have an accurate and precise description of
  the details that govern the pricing scheme. That is
  what tables are for.
- In working out the mechanics of the tables, particularly
  their primary and foreign keys, he will come to
  his most complete understanding of the mechanisms
  involved.
- When the tables completely reflect the details he
  must work with, the code will just about write itself.
- Lastly, but probably most importantly, the customer
  will expect to control the pricing system by adjusting
  the parameters at all levels. Again, that is what tables
  are for. The user is in control of the pricing system
  if he can edit the tables (because of course he cannot
  edit the code).
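
To make the last two points concrete, here is a deliberately tiny sketch of a table-driven pricing scheme, using SQLite only because it is easy to run. The tables, columns, and the single discount-per-level rule are all invented for illustration; a real pricing system would have many more tables.

```python
import sqlite3

SCHEMA = """
create table items (sku text primary key, list_price real not null);
create table price_levels (level text primary key, discount_pct real not null);
create table customers (
    customer text primary key,
    level    text not null references price_levels(level)
);
"""

# Once the tables describe the scheme, the "pricing code" is one query.
PRICE_QUERY = """
select round(i.list_price * (100 - p.discount_pct) / 100.0, 2)
  from customers c
  join price_levels p on p.level = c.level
  join items i on i.sku = ?
 where c.customer = ?
"""


def build_database():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("insert into items values ('WIDGET', 200.0)")
    conn.execute("insert into price_levels values ('wholesale', 25)")
    conn.execute("insert into customers values ('acme', 'wholesale')")
    return conn


def price(conn, sku, customer):
    """Return the customer's price for an item, or None if either is unknown."""
    row = conn.execute(PRICE_QUERY, (sku, customer)).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = build_database()
    print(price(conn, "WIDGET", "acme"))
    # The customer changes the scheme by editing a row, not the code.
    conn.execute("update price_levels set discount_pct = 10 where level = 'wholesale'")
    print(price(conn, "WIDGET", "acme"))
```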

Path 2: Combinations

Once upon a time we had simple desktop business applications,
games, and then this weird new thing, "the web". Now they
are all mixed together, as we play games on the internet that
are tied into huge databases. Modern applications often
combine technologies that used to be comfortably separate.
On any particular project,
some of the requirements look like they
can be met with an RDBMS, some require management and
delivery of media such as MP3 or video, and the programmer is
told as well that he must provide RSS feeds and import data
arriving in XML format. Perhaps as well there will be stone tablets
and papyrus scrolls.

This programmer may believe he is in a unique situation
because of this combination of needs. Because no single
toolset out there can meet the entire project, perhaps this
is something never before seen? But this does
not hold up. Just like the argument about complexity,
he must break the problem up correctly, and when he has done
so he will have a perfectly ordinary project. Though I might
add it will also be a very interesting project and
probably a lot of fun.

In The End It Is All About Patterns

I have given two examples above taken from my own experience
where programmers have claimed to me that they faced some
unique situation. There are many other cases, and they always
make perfect sense to the person who thinks he has discovered
something new. The biggest flaw in the programmer's thinking
is failing to distinguish between particulars and
patterns.

My claim in this essay is that the patterns of all problems
are the same. Somebody has seen it before, somebody has done
it before, the answer is out there. The process of analysis
and programming is about slotting your particulars in the
patterns that have already been established.

In the broadest sense all programs process data, and
particular programs break down into broad patterns of data
access and manipulation. Sometimes you have a broad range
of users putting in data with very little massaging
(think twitter) and sometimes you have one group controlling
much of the data while others make use of it (think
Amazon), and sometimes your data is mostly relational
and table based (think any ecommerce or biz app) and
sometimes it's mostly media (think youtube).

Once you have these broad patterns identified, you can then
proceed to make use of established practices within
each particular area. What is the best way to provide
sensitive data on the web and protect it from unauthorized
eyes? Somebody has done it before. What is the best way
to track large amounts of media? Somebody has done it
before. What is the best way to set up a complex pricing
system with lots of discounts and pricing levels? Somebody
has done it before. In all cases, your particulars may
be different, but the patterns will be the same.

Conclusion: Find the Patterns

Whenever I find myself looking at a situation that appears
to be new, I try to tell myself that it may be new to me,
but it is not likely to be new to the human race. If it
does not appear to follow a well-known pattern then I
proceed as if I have not yet recognized the pattern
and continue to analyze and break it apart until the pattern
emerges. So far it always has.