Saturday, January 22, 2011

Maintaining One Code Base with Possibly Conflicting Custom Features

Today's essay deals with the tricky issue of custom features
for individual customers who are running instances of your
software.



The question comes by way of a regular reader who prefers to
remain anonymous, but asks this:




... I work on a large (to me, anyway) application that serves as a client database, ticket system, time-tracking, billing, asset-tracking system. We have some customers using their own instances of the software. Often, those customers want additional fields put in different places (e.g., a priority column on tickets). This results in having multiple branches to account for versions with slight changes in code and in the database. This makes things painful and time-consuming in the long run: applying commits from master to the other branches requires testing on every branch; same with database migrate scripts, which frequently have to be modified.




Is there an easier way? I have thought about the possibility of making things "optional" in the database, such as a column on a table, and hiding its existence in the code when it's not "enabled." This would have the benefit of a single code set and a single database schema, but I think it might lead to more dependence on the code and less on the database -- for example, it might mean constraints and keys couldn't be used in certain cases.




Restating the Question



Our reader asks, is it better to have different code branches
or to try to keep a lot of potentially conflicting and optional
items mixed in together?



Well, the wisdom of the ages is to maintain a single code branch,
including the database schema. I tried exactly once, very early
in my career, to fork my own code, and gave up almost within days.
When I went to work in larger shops I always arrived in a situation
where the decision had already been made to maintain a single
branch. Funny thing, since most programmers cannot agree on the
color of the sky when they're staring out the window, this is
the only decision I have ever seen maintained with absolute
unanimity no matter how many difficulties came out of it.



There is some simple arithmetic as to why this is so. If you have
a single feature for a customer that is giving you a headache, and
you fork the code, you now have to update both code branches for
every change plus regression test them both, including the feature
that caused the headache. But if you keep them combined you only
have the one headache feature to deal with. That's why people
keep them together.



Two Steps



Making custom features work smoothly is a two-step process.
The first step is arguably more difficult than the second,
but the second step is absolutely crucial if you have
business logic tied to the feature.



Most programmers, when confronted with this situation,
will attempt to make various features optional. I
consider this to be a mistake because it complicates
code, especially when we get to step 2. By far the
better solution is to make features ignorable
by anybody who does not want them: always present,
but with no effect on those who leave them alone.



The wonderful thing about ignorable features is
they tend to eliminate the problems with apparently
conflicting features. If you can rig the features
so anybody can use either or both, you've eliminated
the conflict.



Step 1: The Schema



As mentioned above, the first step is arguably more
difficult than the second, because it may involve
casting requirements differently than they are
presented.



For example,
our reader asks about a priority column on tickets,
asked for by only one customer. This may seem like
a conflict because nobody else wants it, but we
can dissolve the conflict when we make the feature
ignorable. The first step involves doing this at
the database or schema level.



But first we should mention that the UI is easy.
We might have a control panel
where we can make fields invisible, or maybe our
users just ignore the fields they are not interested
in. Either way works.



The problem is in the database.
If the values for priority come
from a lookup table, which they should,
then we have a foreign key, and
we have a problem if we try to ignore it:



  • We can allow nulls in the foreign key, which is
    fine for the people ignoring it, but
  • This means the people who require it can end
    up with tickets that have no priority because it does
    not prevent a user from leaving it blank.


A simple answer here is to pre-populate your priority
lookup table with a value of "Not applicable", perhaps
with a hardcoded id of zero. Then we set the default
value for the TICKET.priority to zero. This means people
can safely ignore it because it will always be valid.



Then, for the customer who paid for it, we just go in
after the install and delete the "Not applicable" row
from the lookup table. It's a one-time operation, not
even worth writing a script
for, and it forces them to create a set of priorities
before using the system. Further, by leaving the
column default of zero in place, we force valid answers:
zero no longer matches any row in the lookup table, so
users will be dinged with an FK violation if
they do not provide a real priority.
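
Here is a minimal sketch of the idea in Python with SQLite, offered as an illustration and not as our reader's actual schema. The table names (priorities, tickets), the install() helper and the sample rows are all made up for the example. One connection plays a customer who ignores priorities, the other plays the customer who paid for them. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

# Hypothetical schema: every name here is an assumption for the example.
SCHEMA = """
CREATE TABLE priorities (
    id          INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
-- The pre-populated entry with a hardcoded id of zero.
INSERT INTO priorities (id, description) VALUES (0, 'Not applicable');

CREATE TABLE tickets (
    id       INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    -- Required column, but the default always points at a valid row.
    priority INTEGER NOT NULL DEFAULT 0 REFERENCES priorities (id)
);
"""

def install():
    """Create a fresh instance of the one common schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn

# Customer who ignores the feature: never mentions priority at all.
ignoring = install()
ignoring.execute("INSERT INTO tickets (title) VALUES ('Printer jam')")
priority = ignoring.execute("SELECT priority FROM tickets").fetchone()[0]

# Customer who paid for the feature: the one-time change after install
# is to delete the 'Not applicable' row, then enter real priorities.
paying = install()
paying.execute("DELETE FROM priorities WHERE id = 0")
paying.execute("INSERT INTO priorities (id, description) VALUES (1, 'High')")

try:
    # No priority given, so the default of zero is used, and zero no
    # longer exists in the lookup table.
    paying.execute("INSERT INTO tickets (title) VALUES ('Printer jam')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A ticket with a real priority goes in without complaint.
paying.execute("INSERT INTO tickets (title, priority) VALUES ('Server down', 1)")
accepted = paying.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
```

Both customers run exactly the same schema and the same code. The only difference between them is one row of data in the lookup table.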



For this particular example, there is no step 2, because
the problem is completely solved at the schema level.
To see how to work with step 2, I will make up an
example of my own.



Step 2: Unconditional Business Logic



To illustrate step 2, I'm going to make up an
example that is not really appropriate to our
reader's question, frankly because I cannot think
of one for that situation.



Let's say we have an eCommerce system, and one
of our sites wants customer-level discounts based
on customer groups, while another wants discounts
based on volume of order -- the more you buy, the
deeper the discount. At this point most programmers
start shouting in the meeting, "We'll make them
optional!" Big mistake, because it makes for lots
of work. Instead we will make them ignorable.



Step 1 is to make ignorable features in the schema.
Our common code base contains a table of customer
groups with a discount percent, and in the customers
table we make a nullable foreign key to the customer
groups table. If anybody wants to use it, great, and
if they want to ignore it, that's also fine. We do
the same thing with a table of discount amounts:
we make an empty table that lists threshold amounts
and discount percents. Anybody who wants to use it
fills it in, and everybody else leaves it empty.



Now for the business logic, the calculations of
these two discounts. The crucial idea here is
not to make up conditional logic that tries to
figure out whether or not to apply the discounts.

It is vastly easier to always apply both
discounts, with the discounts coming out zero for
those users who have ignored the features.



So for the customer discount, if the customer's
entry for customer group is null, it will not match
any discount, and you treat this as zero.
The same goes for the sale amount discount: the lookup to
see which discount the sale amount qualifies for doesn't find
anything because the table is empty, so you treat
it as zero.
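
To make this concrete, here is a rough sketch in Python with SQLite. It is one possible way to do it, not a definitive implementation: the table and column names, the install() and discounts() helpers, and the sample data are all assumptions made up for the example. The function always looks up both discounts and never asks which feature a site has "turned on". How the two percentages get combined into a final price is left out, since that is a separate business decision.

```python
import sqlite3

# Hypothetical schema shared by every site; all names are assumptions.
SCHEMA = """
CREATE TABLE customer_groups (
    id           INTEGER PRIMARY KEY,
    discount_pct INTEGER NOT NULL
);
CREATE TABLE customers (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    group_id INTEGER REFERENCES customer_groups (id)  -- nullable: ignorable
);
CREATE TABLE volume_discounts (
    threshold    INTEGER PRIMARY KEY,  -- minimum order amount for the tier
    discount_pct INTEGER NOT NULL
);
"""

def install():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def discounts(conn, customer_id, order_total):
    """Return (group_pct, volume_pct). Both lookups always run, and a
    lookup that finds nothing comes back as zero."""
    return conn.execute(
        """
        SELECT
            COALESCE((SELECT g.discount_pct
                        FROM customers c
                        JOIN customer_groups g ON g.id = c.group_id
                       WHERE c.id = ?), 0),
            COALESCE((SELECT discount_pct
                        FROM volume_discounts
                       WHERE threshold <= ?
                       ORDER BY threshold DESC
                       LIMIT 1), 0)
        """,
        (customer_id, order_total),
    ).fetchone()

# Site A uses customer groups and leaves volume_discounts empty.
site_a = install()
site_a.execute("INSERT INTO customer_groups VALUES (1, 10)")
site_a.execute("INSERT INTO customers VALUES (1, 'Ann', 1)")
site_a.execute("INSERT INTO customers VALUES (2, 'Bob', NULL)")

# Site B uses volume discounts and never assigns a customer group.
site_b = install()
site_b.execute("INSERT INTO customers VALUES (1, 'Cy', NULL)")
site_b.executemany("INSERT INTO volume_discounts VALUES (?, ?)",
                   [(100, 5), (1000, 15)])

a_grouped = discounts(site_a, 1, 500)    # Ann is in a group
a_ungrouped = discounts(site_a, 2, 500)  # Bob has a null group
b_small = discounts(site_b, 1, 50)       # below every threshold
b_medium = discounts(site_b, 1, 500)     # reaches the 100 tier
b_large = discounts(site_b, 1, 2000)     # reaches the 1000 tier
```

There is not a single if statement about which site we are on. The null foreign key and the empty table do the work, and COALESCE turns "nothing found" into a discount of zero.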



So the real trick at the business logic level is
not to figure out which feature to use, which leads
to complicated conditionals that always end up
conflicting with each other, but to always use
all features and code them so they have no effect
when they are being ignored.



Conclusion



Once upon a time almost everybody coding for a living
dealt with these situations -- we all wrote code that
was going to ship off to live at our customer's site.
Nowadays this is less common, but for those of us
who still deal with it, it is a big deal.



The wisdom of the ages is to maintain a common code
base. The method suggested here takes that idea
to its most complete implementation, a totally common
code base in which all features are active all of
the time, with no conditionals or optional features
(except perhaps in the UI and on printed reports),
and with schema and business logic set up so that
features that are being ignored simply have no
effect on the user.
