In yesterday's Rigorous Definition of Business Logic, we saw that
   business logic can be defined in four orders:
- First Order Business Logic is entities and
attributes that users (or other agents) can save,
and the security rules that govern read/write
access to those entities and attributes.
 - Second Order Business Logic is entities
and attributes derived by rules and formulas,
such as calculated values and history tables.
 - Third Order Business Logic is non-algorithmic
compound operations (no control structure or looping
is required to express the solution), such as
month-end batch billing or, for the old-timers
out there, a year-end general ledger
roll-up.
 - Fourth Order Business Logic is algorithmic
compound operations. These occur when the action
of one step affects the input to future steps.
One example is ERP Allocation.
A Case Study
The best way to see if these have any value is to
   cook up some theorems and examine them with an
   example.  We will take
   a vastly simplified time billing system, in which
   employees enter time which is billed once/month to
   customers.  We'll work out some details a little below.
Theorem 1: 1st and 2nd Order, Analysis
The first theorem we can derive from these definitions
   is that we should look at First and Second Order Schemas
   together during analysis.  This is because:
- First Order Business Logic is about entities and attributes
 - Second Order Business Logic is about entities and attributes
 - Second Order Business Logic is about values
generated from First Order values and, possibly,
other Second Order values
 - Therefore, Second Order values are always 
expressed ultimately in terms of First Order
values
 - Therefore, they should be analyzed together
 
To give the devil his due, ORM does this easily, because
   it ignores so much database theory (paying a large price
   in performance for doing so) and 
   considers an entire row, with its first order and
   second order values together, as being part of one class.
   This is likely the foundation for the claims of ORM
   users that they experience productivity gains when
   using ORM.  Since I usually do nothing but bash ORM,
   I hope this statement will be taken as utterly sincere.
Going the other way, database theorists and evangelists
   who adhere to full normalization can hobble an
   analysis effort by refusing to consider
   2nd order values because they denormalize the database;
   sometimes the worst of my own crowd will prevent
   analysis by trying to keep these out of the conversation.
   So, assuming I have not pissed off my own friends,
   let's keep going.
So let's look at our case study of the time billing
   system.  By theorem 1, our analysis of entities and
   attributes should include both 1st and 2nd order
   schema, something like this:
INVOICES
-----------
invoiceid     2nd Order, a generated unique value
date          2nd Order, if it always takes the date of the batch run
customer      2nd Order, a consequence of this being an
              aggregation of INVOICE_LINES
total_amount  2nd Order, a sum from INVOICE_LINES

INVOICE_LINES
---------------
invoiceid   2nd Order, copied from INVOICES
customer    +- All three are 2nd Order, a consequence
employee    |  of this being an aggregation of
activity    +- employee time entries
rate        2nd Order, taken from ACTIVITIES table
            (not depicted)
hours       2nd Order, summed from time entries
amount      2nd Order, rate * hours

TIME_ENTRIES
--------------
employeeid  2nd Order, assuming the system forces this
            value to be the employee making the entry
date        1st Order, entered by employee
customer    1st Order, entered by employee
activity    1st Order, entered by employee
hours       1st Order, entered by employee
Now, considering how much of that is 2nd order, which
   is almost all of it, the theorem is not only supported
   by the definition, but ought to line up squarely
   with our experience.  Who would want to try to analyze
   this and claim that all the 2nd order stuff should
   not be there?
Theorem 2: 1st and 2nd Order, Implementation
The second theorem we can derive from these definitions
   is that First and Second Order Business logic require
   separate implementation techniques.  This is because:
- First Order Business Logic is about user-supplied values
 - Second Order Business Logic is about generated values
 - Therefore, unlike things cannot be implemented with
like tools. 
Going back to the time entry example, let's zoom in on
   the lowest table, TIME_ENTRIES.  The employee
   entering her time must supply customer, date, activity, and
   hours, while the system forces the value of employeeid.
   This means that customer and activity must be validated
   against their respective tables, and hours must be checked
   for something like <= 24.  But for employeeid the
   system provides the value from its own context.
   So the two kinds of values are processed in very
   unlike ways.  It seems reasonable that our code would
   be simpler if it did not try to force both kinds of
   values down the same validation pipe.
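That split can be sketched as two small routines rather than one shared validation pipe. This is only an illustration with invented names: an in-memory SQLite database stands in for the real lookup tables, and the <= 24 check and the context lookup follow the text above.

```python
import sqlite3

# Stand-ins for the real lookup tables (names invented for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers  (customer TEXT PRIMARY KEY);
CREATE TABLE activities (activity TEXT PRIMARY KEY);
INSERT INTO customers  VALUES ('acme');
INSERT INTO activities VALUES ('consulting');
""")

def validate_first_order(row):
    """1st Order values are user-supplied, so they must be checked."""
    if conn.execute("SELECT 1 FROM customers WHERE customer = ?",
                    (row["customer"],)).fetchone() is None:
        raise ValueError("unknown customer")
    if conn.execute("SELECT 1 FROM activities WHERE activity = ?",
                    (row["activity"],)).fetchone() is None:
        raise ValueError("unknown activity")
    if not 0 < row["hours"] <= 24:
        raise ValueError("hours out of range")

def supply_second_order(row, context):
    """2nd Order values are generated, so they are supplied, not validated."""
    row["employeeid"] = context["current_user"]
    return row

row = {"customer": "acme", "activity": "consulting",
       "date": "2024-01-05", "hours": 3}
validate_first_order(row)
row = supply_second_order(row, {"current_user": "alice"})
print(row["employeeid"])  # alice
```

Neither routine ever inspects the other kind of value, which is the point: two unlike pipelines, each simpler than one pipe trying to do both.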
Theorem 3: 2nd and 3rd Order, Conservation of Action
This theorem states that
   the sum of Second and Third Order
   Business Logic is fixed:
- Second Order Business Logic is about generating
entities and attributes by rules or formulas
 - Third Order Business Logic is coded
compound creation of entities and attributes
 - Given that a particular set of requirements
resolves to a finite set of actions that generate
entities and values, then
 - The sum of Second Order and Third Order Business
Logic is fixed. 
In plain English, this means that the more Business
   Logic you can implement through 2nd Order
   declarative rules and formulas, the fewer
   processing routines you have to code.  Or, if you
   prefer, the more processes you code, the fewer 
declarative rules about entities and 
   attributes you will have.
This theorem may be hard to verify against experience
   because most of us are so used to thinking of
   batch billing as a process that we cannot imagine it
   being implemented any other way: how exactly am I
   supposed to implement batch billing declaratively?
Let's go back to the schema above, where we can
   see upon examination that the entirety of the batch
   billing "process" has already been detailed in a 2nd Order
   schema.  If we could somehow add these facts to our
   CREATE TABLE commands the way we add keys, types,
   and constraints, batch billing would occur
   without the batch part.
Consider this.  Imagine that a user enters
   a TIME_ENTRY.  The system
   checks for a matching EMPLOYEE/CUSTOMER/ACTIVITY
   row in INVOICE_DETAIL, and when it finds the row
   it updates the totals.  But if it does not find 
   one then it creates one!  Creation
   of the INVOICE_DETAIL record causes the system to
   check for the existence of an invoice for that
   customer, and when it does not find one it creates
   it and initializes the totals.  Subsequent time entries
   not only update the INVOICE_DETAIL rows but the
   INVOICE rows as well.  If this were happening, there would be no
   batch billing at the end of the month because the
   invoices would all be sitting there ready to go
   when the last time entry was made.
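A toy version of that behavior can be sketched with ordinary triggers. To be clear, this is not the author's actual mechanism, just an illustration in SQLite; the flat rate of 100 is an invented stand-in for the ACTIVITIES lookup, and the invoiceid columns are omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_entries (
    employeeid TEXT, date TEXT, customer TEXT, activity TEXT, hours REAL
);
CREATE TABLE invoice_lines (
    customer TEXT, employee TEXT, activity TEXT,
    rate REAL, hours REAL, amount REAL,
    PRIMARY KEY (customer, employee, activity)
);
CREATE TABLE invoices (customer TEXT PRIMARY KEY, total_amount REAL);

-- A time entry creates its detail row on first sight, then rolls up into it.
CREATE TRIGGER te_rollup AFTER INSERT ON time_entries
BEGIN
    INSERT INTO invoice_lines (customer, employee, activity, rate, hours, amount)
    SELECT NEW.customer, NEW.employeeid, NEW.activity, 100.0, 0, 0
    WHERE NOT EXISTS (SELECT 1 FROM invoice_lines
                      WHERE customer = NEW.customer
                        AND employee = NEW.employeeid
                        AND activity = NEW.activity);
    UPDATE invoice_lines
       SET hours  = hours + NEW.hours,
           amount = (hours + NEW.hours) * rate
     WHERE customer = NEW.customer
       AND employee = NEW.employeeid
       AND activity = NEW.activity;
END;

-- Creating a detail row creates the invoice header if it is missing...
CREATE TRIGGER il_create AFTER INSERT ON invoice_lines
BEGIN
    INSERT INTO invoices (customer, total_amount)
    SELECT NEW.customer, 0
    WHERE NOT EXISTS (SELECT 1 FROM invoices WHERE customer = NEW.customer);
END;

-- ...and every change to a detail amount rolls up into the header.
CREATE TRIGGER il_rollup AFTER UPDATE OF amount ON invoice_lines
BEGIN
    UPDATE invoices
       SET total_amount = total_amount + (NEW.amount - OLD.amount)
     WHERE customer = NEW.customer;
END;
""")

conn.execute("INSERT INTO time_entries VALUES "
             "('alice','2024-01-05','acme','consulting',3)")
conn.execute("INSERT INTO time_entries VALUES "
             "('alice','2024-01-06','acme','consulting',2)")
total = conn.execute(
    "SELECT total_amount FROM invoices WHERE customer = 'acme'").fetchone()[0]
print(total)  # 500.0
```

After the second entry the invoice already totals 500: the "process" has dissolved into the schema, and there is nothing left for a month-end batch to do.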
By the way, I coded something that does this in a
   pretty straightforward way a few years ago, meaning
   you could skip the batch billing process and add a few
   details to a schema that would cause the database to
   behave exactly as described above.  Although the
   format for specifying these extra features
   was easy enough (so it seemed to me as the author),
   the conceptual shift of thinking
   that it required of people was far larger than I
   initially and naively imagined.  Nevertheless,
   I toil forward, and that is
   the core idea behind my Triangulum project.
   
   
Observation: There Will Be Code
This is not so much a theorem as an observation:
   if your application requires Fourth Order Business
   Logic, then somebody is going to code something
   somewhere.
An anonymous reader pointed out in the comments
   to Part 2 that Oracle's MODEL clause may work
   in some cases.  I would assume so, but I would also
   assume that reality can create complicated Fourth
   Order cases faster than SQL can evolve.  Maybe.
But anyway, the real observation here is that
   no modern language, either app 
   level or SQL flavor, can express an algorithm
   declaratively.  In other words, no combination
   of keys, constraints, calculations and derivations,
   and no known combination of advanced SQL functions
   and clauses
   will express an ERP Allocation routine or a
   Magazine Regulation routine.  So you have to code it.
   This may not always be true, but I think it is
   true now.
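To make the distinction concrete, here is a drastically simplified allocation sketch (invoice IDs and amounts invented): a payment is applied to open invoices oldest-first, and the running balance means each step's input depends on the previous step's action, which is exactly what keys, constraints, and derivations cannot express.

```python
def allocate(payment, open_invoices):
    """Apply a payment across open invoices, oldest first.

    Fourth Order trait: 'remaining' is consumed step by step,
    so the portion applied at step n+1 depends on what was
    applied at step n.
    """
    remaining = payment
    applied = []
    for inv_id, balance in open_invoices:
        if remaining <= 0:
            break
        portion = min(balance, remaining)  # step consumes the running balance
        applied.append((inv_id, portion))
        remaining -= portion
    return applied, remaining

applied, remaining = allocate(250, [("i1", 100), ("i2", 100), ("i3", 100)])
print(applied, remaining)  # [('i1', 100), ('i2', 100), ('i3', 50)] 0
```

Trivial as a loop, yet there is no declarative statement of it: the loop *is* the specification.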
This is in contrast to the example given in the
   previous section about the fixed total of
   2nd and 3rd Order Logic.  Unlike that example,
   you cannot provide enough
   2nd order wizardry to eliminate fourth order.
(Well, OK, maybe you can,
    but I haven't figured it
    out yet myself and have never heard that anybody
    else is even trying.  The trick would be to have
    a table that you truncate and insert a single row
    into; a trigger would fire that would know how
    to generate the
    next INSERT, generating a cascade.  Of course, since
    this happens in a transaction, if you end up
    generating 100,000 inserts this might be a bad
    idea ha ha.)
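For what it is worth, the cascade half of that trick can be demonstrated in SQLite, which lets a trigger re-fire itself once recursive triggers are switched on. This is a toy countdown table, nothing like a real ERP Allocation, and the whole cascade does indeed run inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA recursive_triggers = ON")  # off by default in SQLite
conn.execute("CREATE TABLE steps (n INTEGER)")
conn.execute("""
CREATE TRIGGER next_step AFTER INSERT ON steps
WHEN NEW.n > 0
BEGIN
    -- each insert knows how to generate the next one
    INSERT INTO steps VALUES (NEW.n - 1);
END
""")

# one seed row cascades into the whole series
conn.execute("INSERT INTO steps VALUES (10)")
count = conn.execute("SELECT COUNT(*) FROM steps").fetchone()[0]
print(count)  # 11
```

SQLite caps the recursion depth (1000 by default), which is the polite version of the 100,000-inserts problem mentioned above.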
Theorem 5: Second Order Tools Reduce Code
This theorem rests on the acceptance of an observation,
   that using meta-data repositories, or data dictionaries,
   is easier than coding.  If that does not hold true,
   then this theorem does not hold true.  But if that 
   observation (my own observation, admittedly) does
   hold true, then:
- By Theorem 3, the sum of 2nd and 3rd order
logic is fixed
 - By observation, using meta-data that manages
schema requires less time than coding,
 - By Theorem 1, 2nd order is analyzed and specified
as schema
 - Then it is desirable to specify as much business
logic as possible as 2nd order schema, reducing
and possibly eliminating manual coding of Third
Order programs. 
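As an illustration of that observation (the dictionary format below is invented for the sketch, not Triangulum's): one generic routine driven by meta-data replaces a hand-coded per-table program, so adding a rule means adding an entry, not writing code.

```python
# Hypothetical data dictionary: declarative entries instead of routines.
DICTIONARY = {
    "time_entries": {
        "customer":   {"order": 1, "lookup": {"acme", "zenith"}},
        "activity":   {"order": 1, "lookup": {"consulting", "travel"}},
        "hours":      {"order": 1, "max": 24},
        "employeeid": {"order": 2, "context": "current_user"},
    },
}

def process(table, row, context):
    """One generic engine: checks 1st Order values, supplies
    2nd Order values, driven entirely by the dictionary."""
    for col, rule in DICTIONARY[table].items():
        if rule["order"] == 2:
            row[col] = context[rule["context"]]   # generated, not validated
            continue
        if "lookup" in rule and row[col] not in rule["lookup"]:
            raise ValueError(f"bad {col}: {row[col]}")
        if "max" in rule and row[col] > rule["max"]:
            raise ValueError(f"{col} exceeds {rule['max']}")
    return row

row = process("time_entries",
              {"customer": "acme", "activity": "travel", "hours": 6},
              {"current_user": "alice"})
print(row["employeeid"])  # alice
```

Every rule moved into the dictionary is a rule nobody has to code, test, or debug as a routine, which is the trade Theorem 3 says is always available.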
Again we go back to the batch billing example.
   Is it possible to convert it all to 2nd Order as
   described above?  Well, yes it is, because I've done
   it.  The trick is an extremely counter-intuitive
   modification to a foreign key that causes a
   failure to actually generate the parent row that
   would let the key succeed.  To find out more about
   this, check out Triangulum (not ready for prime time as of this
   writing).
Conclusions
The major conclusion in all of this is that analysis
   and design should begin with First and Second Order
   Business Logic, which means working out schemas, both
   the user-supplied values and the system-supplied
   values.
When that is done, what we often call "processes" 
   are layered on top of this.
Tomorrow we will see part 4 of 4, examining the
   business logic layer, asking, is it possible to
   create a pure business logic layer that gathers
   all business logic unto itself?