I wish I could recall the number of times I find myself caught in a seemingly endless Cut-Paste-Query loop.

Sometimes it seems the lot of a SQL or MDX Analyst, DBA or Architect is not pushing the boundaries of the information space, but merely repeating the same code patterns over and again.

In applying these patterns to our SQL queries we are only dealing with the proximal cause (the Now), not the root cause of the problem.

As a result, many IT departments are tying themselves in knots trying to overcome the restrictions placed on them by database design.

The Root Cause

It’s always (Ok, mostly always) down to uncaptured business rules.

There’s something the business does regularly that we haven’t captured and abstracted correctly.

By that, I mean that a great deal of effort (and brain power) is being wasted dealing with these rules in the SQL ‘Front End’ queries, rather than being safely (and reliably) extracted away in the SQL ‘Back End’

We are dealing with them in the database query as opposed to it being held in the Data Warehouse (Relational or Star Schema) or Analysis cubes.

The Symptoms

Here are a few of my favourite clues to watch for:

Event X followed by Event Y always has an outcome Z.

SQL queries that are of the form ‘If you have this event, then look for a following event’ to make sense of it. For example, an order record followed by a payment event gives the business Order Status of ‘Paid’.

Measure A * Measure B always equals Measure C.

A pretty simple one this, but often overlooked. If Total Stock is always the sum of the stock in all shops, then it should be included in the Data warehouse as Total Stock.

Large Case Statements 

If your SQL starts to look pretty raggedy with long CASE..WHEN statements to interpret the output from the database. ‘If the data value is 2, then this really means Ordered’.

This logic should really be in fields with business meaning.

Named Sets.                                                                                                                                              Named Sets are often used where, for instance a common grouping is applied to data points, For example ‘England, Scotland, Wales and Northern Ireland’ represent a named set called ‘UK’.

I should point out here I like the elegance and power of writing queries using MDX, and occasionally DAX query language.

However, my experience has taught me that Named Sets are often used to mask the fact MDX is being used as a mechanism to hastily cobble together the groupings that should really reside within the attributes of the cube.

The Resolution

Data Abstraction. Design these into the database.

For a relational database it can take the form of including calculated fields in tables, creating views to encapsulate the logic, or creating functions to handle calculations.

For an Analysis Services cube, try to include more attributes or groupings into your dimension designs, or add additional fields into the fact table. Data cubes are one of the most powerful ways to perform aggregation and translate data into business information so it’s important to capture the business logic.

Reports and Reporting Services can also help to abstract away the nitty gritty of complex logic (but be sure that you don’t end up replicating the problems of Cut and Paste that you had in the first place).

And yes, as with all business rules they may be subject to change. But at least you know where to make the change, and to do it only once or twice rather than several times.

The Benefits

Abstraction makes things simpler, easier to change, easier to implement, easier to document.

It can avoid writing tricky code multiple times, simplify design, optimisation and indexing, and ultimately increase the maintainability of the solution.

The code is no longer in the analysts head and spread across lots of SQL files. It’s available to the business and increases business confidence that people know in their own language what is happening.

So, as my first blog post for my thoughts on Technology, Strategy and Architecture I’m putting down what I think many experienced programmers and business people are telling me.

When it comes to SQL. Less is more.

A following post will look at the implications on data stewardship that the transition from IT to Business stewardship brings.