4 Commonsense Strategies for Data Testing

Syed Haider
Insurance Experts' Forum, September 5, 2012

Without thorough testing, databases often have hidden problems that eventually give insurers big headaches. In a previous blog, I laid out the reasons that data applications don’t get the same level of testing as software. Here are some steps IT managers and data architects can take to remedy that situation.


1. Get smart testers

IT leadership should be wary of staffing data projects solely with "black-box" testers, whose core strength is comparing the output of a batch process or report against a given spec or legacy report. While such mechanical analysis is inevitably needed, we should also look for testers who have more advanced skills, such as:

• White-box testing. This requires the ability to understand complex SQL code—keep in mind that SQL and its variants (PL/SQL, T-SQL, etc.) are notoriously hard to document—and to create test data and scenarios that go above and beyond the “happy” path.

• Ability to independently write code against the specification that should match the output of the code being tested. This puts the testers in pretty much the same league as the developers. IT managers should not underestimate this core competency, because it makes the task of organizing the testing group all the more complex.

• Data analysis. In BI projects, end users’ confidence in IT gets a real boost when testers can perform rudimentary data analysis, such as basic trending, before the users get their hands on the data.


2. Invest in configuration management

Since configuration management is still not as cut and dried in data projects as it is in the application development arena, it merits special attention upfront. One shouldn’t be deterred by the general lack of built-in features like revision management in most modern database platforms, as there are some decent third-party tools emerging in this space.


3. Automate routine testing

The idea here is to recycle your test-cases into automated “smoke-tests” that can be incorporated into production jobs. This way, we can eliminate the need for manual testing of routine production processes, such as incoming/outgoing feeds and batch processes.

It’s a win-win situation for all because smart testers wouldn’t typically enjoy the mundanity of such testing. Smoke tests can be written at a purely database level and don’t require any specialized tool to execute. All we really need is some sort of a framework that orchestrates these tests. Consider the following scenarios:

• As an incoming file is loaded into staging, the ETL invokes automated tests to do basic balancing of control totals against the loaded data. If it fails, an email goes out to the IT admins reporting a problem in the feed.

• As a number-crunching process finishes, it kicks off an automated test that runs a number of sniff tests using ‘acceptable’ thresholds (as specified by the business), and reports if it finds any data failing to meet them.
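The first scenario above—balancing a feed’s control totals against the staging table—can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the table and column names (`stg_claims`, `amount`) are invented, and SQLite stands in for whatever database platform the shop actually runs, purely so the example is self-contained.

```python
import sqlite3

def balance_control_totals(conn, table, amount_col, expected_count, expected_total):
    """Basic balancing: compare the feed's control totals (record count and
    amount total, typically carried in the file's trailer record) against
    what actually landed in the staging table."""
    loaded_count, loaded_total = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM({amount_col}), 0) FROM {table}"
    ).fetchone()
    failures = []
    if loaded_count != expected_count:
        failures.append(f"record count: expected {expected_count}, loaded {loaded_count}")
    if abs(loaded_total - expected_total) > 0.005:  # tolerate rounding to the cent
        failures.append(f"amount total: expected {expected_total}, loaded {loaded_total}")
    return failures  # an empty list means the feed balanced

# Demo with an in-memory staging table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_claims (claim_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO stg_claims VALUES (?, ?)",
                 [(1, 100.0), (2, 250.5), (3, 75.25)])

print(balance_control_totals(conn, "stg_claims", "amount", 3, 425.75))  # prints []
print(balance_control_totals(conn, "stg_claims", "amount", 4, 500.00))  # reports both mismatches
```

The ETL job would call a check like this right after the load step and, on a non-empty result, hand the failure list to whatever alerting mechanism e-mails the IT admins.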

As we can see, it doesn’t matter which ETL or process-automation tool we employ. As long as the testing framework offers the following broad-based features, we can automate our testing. The framework should:

• enable developers/testers to define, activate/deactivate test cases and suites;

• enable a variety of test types: boundary cases, control totals, expected outcome, negative tests, exception scenarios, user-defined, etc.;

• withstand failures caused by incorrectly written or obsolete tests;

• communicate the results using some collaboration mechanism such as e-mail.
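The feature list above doesn’t demand much machinery. Here is one hypothetical shape such a framework could take—every name in it (`TestCase`, `TestSuite`, the sample checks) is invented for illustration, and the `report` function merely stands in for the e-mail or other collaboration mechanism:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class TestCase:
    name: str
    check: Callable[[], bool]   # returns True on pass
    active: bool = True         # deactivate obsolete tests without deleting them

@dataclass
class TestSuite:
    name: str
    cases: List[TestCase] = field(default_factory=list)

    def run(self):
        results = {}
        for case in self.cases:
            if not case.active:
                results[case.name] = "skipped"
                continue
            try:
                results[case.name] = "pass" if case.check() else "fail"
            except Exception as exc:
                # withstand failures caused by incorrectly written tests
                results[case.name] = f"error: {exc}"
        return results

def report(suite_name, results):
    # stand-in for the collaboration mechanism (e.g., e-mail to IT admins)
    body = "\n".join(f"{name}: {outcome}" for name, outcome in results.items())
    return f"[{suite_name}] smoke test results\n{body}"

suite = TestSuite("nightly_feed", [
    TestCase("row_count_positive", lambda: True),
    TestCase("obsolete_check", lambda: True, active=False),
    TestCase("badly_written_test", lambda: 1 / 0),  # a buggy test must not crash the run
])
results = suite.run()
print(report(suite.name, results))
```

Note how a deactivated case is skipped rather than deleted, and a buggy test is caught and reported instead of bringing down the production job—exactly the resilience the bullet list calls for.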


4. Focus on regression testing

As new features and improvements get introduced into the database, consider regression testing. The best way to avoid last-minute surprises is to keep your regression-test suite up-to-date as development continues. The main decision we often have to make in this regard is whether to kick off a full regression every time or make it more targeted.

The answer lies in our ability to track dependencies. We cannot fully count on the innate ability of relational database platforms to report dependencies; too often, a broken dependency is reported by end users, in the most unexpected of places, after we have promoted new features.

There are some third-party solutions to track dependencies more effectively from outside the database, and if they can be employed, we can run a partial regression test on potentially affected objects. In the absence of such solutions, our best bet is to run a full regression.
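Whether the dependency map comes from a third-party tool or is maintained by hand, the targeted-regression decision reduces to a transitive-closure walk over it. The sketch below assumes an externally maintained map from each object to the objects that depend on it; the object names (`stg_claims`, `vw_claims`, and so on) are illustrative:

```python
from collections import deque

def affected_objects(dependents, changed):
    """Given a map of object -> set of objects that depend on it (maintained
    outside the database), return everything transitively affected by a change."""
    affected = set(changed)
    queue = deque(changed)
    while queue:
        obj = queue.popleft()
        for dep in dependents.get(obj, ()):
            if dep not in affected:
                affected.add(dep)
                queue.append(dep)
    return affected

# Illustrative dependency map: views and reports built on base tables
dependents = {
    "stg_claims": {"vw_claims"},
    "vw_claims":  {"rpt_loss_ratio", "vw_claims_summary"},
    "dim_policy": {"rpt_loss_ratio"},
}

targets = affected_objects(dependents, {"stg_claims"})
print(sorted(targets))
# Run the regression suite only against `targets`; if the map can't be
# trusted or kept current, fall back to a full regression.
```

The value of the partial run depends entirely on how current the dependency map is kept—which is why, absent reliable tooling, the full regression remains the safer default.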

The above list of mitigating approaches is by no means exhaustive, but it does try to capture the big-ticket items. Let’s hope that some of these strategies become irrelevant in the future as new developments make life simpler for database developers and testers.

Syed Haider is an architect with X by 2, a technology company in Farmington Hills, Mich., specializing in software and data architecture and transformation projects for the insurance industry.


This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.

The opinions of bloggers do not necessarily reflect those of Insurance Networking News.

Comments (1)

Very wisely stated. Unfortunately, application implementations often fail to treat data migration as a project in its own right, thereby underestimating the time, skills and effort involved.

Posted by: Sridevi R | September 5, 2012 10:05 PM

