The Hype and the Reality of Business Intelligence Software: Part 1

By Jim Wheaton and Boris Gendelev
Co-Founders, Daystar Wheaton Group

This is Part 1 of a two-part article that appeared in the May 17, 2005 issue of
“Business Intelligence Network” (See part 2 of this article)

You've heard the pitch:  for you, the seeker of Customer Relationship Management (“CRM”) heights, there is business intelligence software that will take you there.  With a click of a mouse, you will count and profile your customers, select your names for a marketing promotion, and then analyze your results on the back-end.  Wow!

But after the initial thrill is over, you might be disappointed for one or more reasons:

  • The software’s data manipulation capability is not powerful enough.
  • Data-driven, statistics-based predictive modeling is only superficially supported.
  • No training is provided for navigation in the dangerous waters of data analysis.
  • It does not address the issue of data integrity.
  • Simple queries are – well – simple, but complicated ones are nearly impossible.
  • Even if you manage to express your complicated questions in the language of the software, getting answers in a reasonable amount of time may require a significant investment in hardware.
  • There is no framework to translate answers – particularly the more complex ones such as customer behavior models – into the optimization of business decisions.

Of course, nothing can be perfect.  However, to minimize disappointment, remember to ask eight fundamental questions:

Question #1:  Is Important Functionality Missing?

Although automated counts and selects are very important, they are insufficient for sophisticated CRM.  That is because CRM is first and foremost the process of using customer history to anticipate (“predict”) future productivity under different scenarios.  Defined as such, the practice of CRM must center on developing an understanding of the relationships between what was known about a given customer at one point of time and that customer’s subsequent behavior.   

Data-driven, statistics-based models to predict customer behavior are core to CRM.  Therefore, counting and profiling only qualifies as the first stage of CRM – getting familiar with a business and its data.  However, marketers new to CRM are likely to stop at counts and profiles.  Does this mean they do not use models to predict customer behavior?  Of course they do!  However, their models are based on judgment, and not methodically driven by data using the engine of rigorous quantitative analysis.

Judgmental models reflect one's intuitions – the subconscious sum of one's experiences.  There is nothing wrong with that.  But, when there is hard data to provide or deny credence to a hunch, it makes sense to use it.  Software that doesn't go much beyond counts and profiles doesn't unlock the full potential of CRM.

Question #2:  Can Software Really Build Models?

Many business intelligence software packages claim to perform modeling.  Supposedly, you feed them your promotional results, crank away with logistic regression, neural networks, or some other quantitative wizardry, and out pops the result.  But all they are offering you is model calibration.  For parametric models, this is referred to as “estimating the parameters.”  For non-parametric models, it is “discovering the structure.”

But, who decides what variables to toss into the magician's hat?  You do!  But how?  If you are using intuition alone, then you are not being sufficiently data-driven, and you are not practicing sophisticated CRM.   

The process of predictive modeling is foremost the process of deciding, by business and data analysis, what data to use and how to transform it to illuminate patterns.  For example:

  • Defining the modeling subset.  What time frames and business segments are relevant?  Last fall?  This spring?  All spring seasons?  Last five years?  General media?  Specialty media?
  • Defining the dependent (“target”) variables.  What should you try to predict and how should it be measured?  Response Rate?  Average Purchase Size?  Demand per media?  Per marketing dollar?  Long term value?  Gross or net sales?  And, within a multi-channel marketing environment that contains a mix of direct response and brand building efforts, how is the incremental effect of each effort to be determined?
  • Defining the independent (“predictor”) variables.  What variables should be tried as independent variables?  Here, the list of possibilities is endless, considering the potential combinations of variables including differences, sums, ratios and percentages.  It is in the creation and testing of predictors that interesting data analysis truly takes place.
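
As a rough illustration of the derived predictors mentioned in the last bullet, the sketch below builds a difference, a ratio and a percentage from a handful of raw customer summary fields.  The field names (orders_12m, orders_prior_12m, demand_12m, returns_12m) and the sample values are hypothetical; they are not taken from any particular marketing database.

```python
from dataclasses import dataclass


@dataclass
class CustomerSummary:
    # Hypothetical raw fields, frozen as of a point in time before the promotion.
    orders_12m: int         # orders in the most recent 12 months
    orders_prior_12m: int   # orders in the 12 months before that
    demand_12m: float       # gross dollars, most recent 12 months
    returns_12m: float      # returned dollars, most recent 12 months


def safe_ratio(numerator, denominator, default=0.0):
    """Ratio that falls back to a default when the denominator is zero."""
    return numerator / denominator if denominator else default


def derive_predictors(c):
    """Build candidate predictors from differences, ratios and percentages."""
    return {
        "order_trend": c.orders_12m - c.orders_prior_12m,           # difference
        "avg_order_size": safe_ratio(c.demand_12m, c.orders_12m),   # ratio
        "return_pct": 100.0 * safe_ratio(c.returns_12m, c.demand_12m),
        "net_demand_12m": c.demand_12m - c.returns_12m,
    }


example = derive_predictors(CustomerSummary(4, 1, 320.0, 40.0))
print(example)
```

Each derived field is only a candidate.  Whether it earns a place in a model still has to be decided by the data analysis described above.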

Once you are past the first round of these questions, you will be ready to calibrate the model.  Then you have to validate its performance.  The results might suggest fine-tuning and send you back to data analysis in search of new ideas.
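
One common way to validate performance, sketched here under simplifying assumptions, is to score a holdout sample that was not used for calibration, rank it by score, and compare actual response rates across groups.  The function name, the five-group split and the ten holdout records are made up for illustration.

```python
def response_rate_by_group(scored, n_groups=5):
    """scored: list of (model_score, responded) pairs from a holdout sample.

    Ranks the sample from highest to lowest score, cuts it into n_groups
    roughly equal groups, and returns the actual response rate of each group.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    size = len(ranked) / n_groups
    rates = []
    for g in range(n_groups):
        chunk = ranked[round(g * size):round((g + 1) * size)]
        responders = sum(1 for _, responded in chunk if responded)
        rates.append(responders / len(chunk) if chunk else 0.0)
    return rates


# Hypothetical holdout records: (score, 1 if the customer responded else 0).
holdout = [
    (0.95, 1), (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 0),
    (0.50, 1), (0.40, 0), (0.30, 0), (0.20, 0), (0.10, 0),
]
rates = response_rate_by_group(holdout)
print(rates)
```

If the top-ranked groups do not respond at clearly higher rates than the bottom ones, that is a signal to return to data analysis rather than to re-run the calibration.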

Model calibration does not develop any new concepts, nor does it provide theories about what is driving your customers.  That is your job!  Software that restricts you to variables you had the foresight to record previous to a marketing effort hinders the practice of CRM.

Also, after you have constructed a predictive model – or a network of several – you still have to incorporate it into an overall decision model.  Hopefully, an outline of the decision model – how the results of the predictive model(s) will be translated into decisions – was informing your efforts all along.  But, in the end, you have to put together the nuts and bolts in such a way that your CRM and marketing resources are optimally allocated.  (See “CRM Growth Simulator:  Extending the Data Warehouse,” Jim Wheaton, Business Intelligence Network, April 21, 2005,
www.b-eye-network.com/view/788.)

Question #3:  Will Standard Reports Be Enough?

Perhaps the majority of a CRM professional’s daily needs can be satisfied with a stack of standard reports and "fill-in the blanks" queries.  But just as surely, the remainder – reports produced ad hoc in search of penetrating insight – is what will provide your company with a true competitive edge.

Standard reports help you monitor your business through the lens of your existing predictive models as well as monitor the robustness of the models.  They serve to trigger new questions.  The ad hoc, never-anticipated queries help answer the questions and move both the models and your business forward.  Therefore, while prepackaged report templates are often useful, good CRM software should excel in ad hoc reporting of any depth and complexity.  If the answer can be found in the data, the tool should give you power to formulate the question.

Question #4:  How Do You Avoid Discovering Useless Things?

The barrier to building robust, actionable, customer behavior models cannot be overcome by software alone.  Data analysis expertise is equally essential.

The data, not the software, interacting with an analyst's logical faculties and imagination, drive the course of analysis.  Decisions based on the analysis, once set in motion, may have a profound, irreversible and long lasting impact on your business. 

An entire field of research is devoted to human fallibility as it pertains to data, probabilities and statistics.  (An excellent resource is “Judgment under Uncertainty:  Heuristics and Biases,” by Daniel Kahneman, Paul Slovic and Amos Tversky.)  Training and experience in data analysis and interpretation are all that stand between you and disaster.  For example, the predictive modeling process is full of potential pitfalls such as:

Example #1:  In the exploratory phase of modeling, there is a danger of selecting predictor variables that are contaminated by the “target”; that is, what it is that you are trying to predict.

Suppose you have a hunch that customers with children are your best buyers.  You decide to add a question to your order entry script and a new field – "presence of children" – to your marketing database.  The field is initialized to "no."  After a while, you start analyzing if those who answered "yes" bought more frequently.  And sure enough, "yes" customers are more frequent buyers than "no" customers. 

Did you just find an important key to your business?  Before you start paying for demographic overlays and over-circulating households with children, consider that those who buy more frequently were more likely to have been asked and provided an answer.  Therefore, being a frequent buyer makes a "yes" more likely, and not the other way around.

The specific lesson is that a single code should not mean two different things.  In this example, "unknown" should be a separate code.  Moreover, a new segmentation variable must be evaluated while holding constant other variables that are already known to be good segmenters; for example, rules-driven customer segments or your current scoring model.
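
The effect can be reproduced with a few made-up records.  In the sketch below, six customers were never asked the question and bought once each, while the four who were asked bought four or five times regardless of their answer.  The record layout and the two coding functions are hypothetical.

```python
from collections import defaultdict


def mean_orders_by_code(customers, code_of):
    """Average number of orders for each value returned by code_of."""
    totals = defaultdict(lambda: [0, 0])  # code -> [sum of orders, customers]
    for c in customers:
        entry = totals[code_of(c)]
        entry[0] += c["orders"]
        entry[1] += 1
    return {code: total / n for code, (total, n) in totals.items()}


def flawed_code(c):
    # Field initialized to "no": never-asked customers look like "no" answers.
    return "yes" if c["children"] is True else "no"


def corrected_code(c):
    # "unknown" is kept as a separate code.
    if c["children"] is None:
        return "unknown"
    return "yes" if c["children"] else "no"


# Made-up data: children is None when the customer was never asked.
customers = (
    [{"orders": 1, "children": None} for _ in range(6)]
    + [{"orders": 4, "children": True}, {"orders": 5, "children": True}]
    + [{"orders": 4, "children": False}, {"orders": 5, "children": False}]
)

flawed = mean_orders_by_code(customers, flawed_code)
corrected = mean_orders_by_code(customers, corrected_code)
print(flawed)
print(corrected)
```

Under the flawed coding, "yes" customers appear to buy more than twice as often as "no" customers.  With "unknown" separated out, the "yes" and "no" groups are identical, and the apparent effect turns out to reflect only who was asked.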

A more general lesson is that, without being keenly aware of how your business is reflected in the imperfect mirror of your data, and how to evaluate the incremental value of a new idea, it is easy to "discover" useless things.

Part 2 will provide two more examples as well as a discussion of the following questions:

  • Who is verifying what and how? 
  • Who is in charge of data integrity? 
  • How does the interface deal with query complexity? 
  • Was the testing ground the same as your battle ground?

 

 

The Hype and the Reality of Business Intelligence Software: Part 2

By Jim Wheaton and Boris Gendelev
Principals, Wheaton Group

This is Part 2 of a two-part article that appeared in the May 20, 2005 issue of
“Business Intelligence Network”

Part 1 discussed the following fundamental questions that must be asked in order to avoid disappointment with business intelligence software employed as part of your efforts to achieve sophisticated Customer Relationship Management (“CRM”):

  • Is important functionality missing? 
  • Can software really build models? 
  • Will standard reports be enough? 

Part 1 ended part way through a discussion of the fourth question:  How do you avoid discovering useless things?  We pick up with the second and third examples of useless discoveries:

Example #2:  It is easy to select predictor variables that correlated with the desired customer behavior during the “time slice” of the analysis, but no longer do.  For example:

During the construction of a model to predict upcoming purchase volume from a national hard-goods retailer, a strong positive relationship was found between purchase activity and ownership of the retailer’s private-label credit card.  Further analysis determined that, during the “time slice” of customer activity being used to build the model, the private label card had just been introduced.  “Early adopters” of the card generally were the retailer’s most loyal customers.  Hence, incentives for sign-up were modest, and card ownership was a surrogate for loyalty.

However, by the time the model was to be moved into production, the rate of card sign-up had slowed considerably.  The retailer had responded by becoming more aggressive in the provision of incentives.  Hence, the relationship between card ownership and customer loyalty had changed dramatically.  The prudent decision was to eliminate card ownership from further consideration in the modeling process.
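
A simple guard against this pitfall, sketched below, is to re-measure a candidate predictor's relationship with the target on a more recent slice of data before the model goes into production.  The purchase amounts, the lift measure and the 25 percent tolerance are assumptions chosen for illustration, not figures from the retailer in the example.

```python
def mean(values):
    return sum(values) / len(values)


def owner_lift(records):
    """records: (owns_card, purchase_amount) pairs.

    Returns average purchases of card owners divided by average purchases
    of non-owners. Assumes both groups are non-empty.
    """
    owners = [amount for owns, amount in records if owns]
    others = [amount for owns, amount in records if not owns]
    return mean(owners) / mean(others)


def relationship_shifted(build_lift, recent_lift, tolerance=0.25):
    """True when the lift moved by more than the (hypothetical) tolerance."""
    return abs(recent_lift - build_lift) / build_lift > tolerance


# Made-up slices: the model-building period and a more recent period.
build_slice = [(True, 300.0), (True, 500.0), (False, 100.0), (False, 100.0)]
recent_slice = [(True, 150.0), (True, 90.0), (False, 100.0), (False, 100.0)]

build_lift = owner_lift(build_slice)
recent_lift = owner_lift(recent_slice)
print(build_lift, recent_lift, relationship_shifted(build_lift, recent_lift))
```

A flagged shift does not say why the relationship changed.  That still requires the kind of business investigation described in the example.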

Example #3:  In the calibration phase of predictive modeling, exposing the quantitative procedure to the "best" cross section of data is complicated. 

Techniques that produce “black boxes” are particularly troublesome.  A black box might fit the data used to build it, but as your business evolves and produces new combinations of variable values the black box may start to generate nonsense.

Suppose your new software presides over a newly built marketing database.  While your analyst knows to be careful because of having access to only eighteen months of data, did anyone mention this to the software?  How do you tell the software that some customers are three, four, or ten years old and that only a portion of their history is reflected in the database?  Will there be problems a year from now when the maximum elapsed time since the most recent purchase has increased to thirty months?
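
One partial safeguard, sketched here with hypothetical field names, is to record the range of each predictor seen during calibration and to flag any record at scoring time that falls outside it.  This does not fix the model, but it does tell the analyst when the black box is being asked to extrapolate.

```python
def fit_ranges(rows):
    """Record the minimum and maximum of every field in the calibration data."""
    ranges = {}
    for row in rows:
        for name, value in row.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def out_of_range(row, ranges):
    """Names of fields whose value lies outside the calibration range."""
    return sorted(
        name
        for name, value in row.items()
        if name in ranges and not ranges[name][0] <= value <= ranges[name][1]
    )


# Made-up calibration data drawn from an eighteen-month database.
build_rows = [
    {"months_since_last_purchase": 1, "orders": 3},
    {"months_since_last_purchase": 18, "orders": 1},
]
ranges = fit_ranges(build_rows)

# A year later, a customer whose last purchase was thirty months ago.
flags = out_of_range({"months_since_last_purchase": 30, "orders": 2}, ranges)
print(flags)
```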

Question #5:  Who Is Verifying What and How?

As previously discussed, it is easy to be misled by poorly conceived and carelessly executed data analysis.  But, even full awareness of the business, data and analytical issues does not protect you from simple mistakes.  Because you will have to translate your thoughts into organized instructions that a computer can execute, there is plenty of room for error.

The art of results verification is a specialized branch of data analysis.  Multi-million dollar mistakes await those who believe that computers print nothing but gospel.  Just ask Dick Sabot, Chairman and Co-Founder of Eziba.com, an online retailer and catalog company.  According to the January 24, 2005 issue of The New York Times:

[T]he company had sent out tens of thousands of catalogs in late September and early October and waited for the phones to ring.  After a couple of "grim, quiet" days, Mr. Sabot said, company executives checked with the business that mailed the catalogs on Eziba's behalf.  They hoped to find that the mailing had simply been delayed, but instead discovered that the catalogs had been sent to the wrong addresses.  Because of a computer error, the catalogs had reached the members of Eziba's mailing list who showed the lowest likelihood to respond to the catalog...

The revenue shortfall created by that event put the company in such a tenuous financial position that it was forced to halt operations temporarily on Jan. 14 while it sought cash to pay off creditors. Bill Miller, Eziba's chief executive, resigned amid the problems.

The moral to this sad story is that, although certain software features may prove helpful to minimize the occurrence of error, training is the real answer.
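
Automated checks can supplement that training.  As a minimal sketch, assuming a model score where higher means more likely to respond, the function below tests a mailing selection before it is released: the quantity must match, every selected name must have a score, and no passed-over name may outscore a selected one.  The function name, the messages and the sample scores are hypothetical.

```python
def verify_selection(scores, selected_ids, quantity):
    """Return a list of problems found in a mailing selection.

    scores: dict of customer_id -> model score (higher = better prospect).
    selected_ids: customer ids chosen for the mailing.
    quantity: number of names the mailing was supposed to contain.
    """
    problems = []
    selected = set(selected_ids)
    if len(selected) != quantity:
        problems.append(f"expected {quantity} names, got {len(selected)}")
    missing = selected - set(scores)
    if missing:
        problems.append(f"{len(missing)} selected names have no score")
    picked = [scores[i] for i in selected if i in scores]
    passed_over = [s for i, s in scores.items() if i not in selected]
    if picked and passed_over and min(picked) < max(passed_over):
        problems.append("a passed-over name outscores a selected name")
    return problems


scores = {"a": 0.9, "b": 0.7, "c": 0.2, "d": 0.1}
good = verify_selection(scores, ["a", "b"], 2)
bad = verify_selection(scores, ["c", "d"], 2)   # bottom of the ranking
print(good, bad)
```

A check like this would flag a selection drawn from the wrong end of the ranking.  It would not catch a score that was itself computed incorrectly, which is why the analyst's judgment remains essential.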

Question #6:  Who Is In Charge of Data Integrity?

A CRM package without data is an empty shell.  When stocked with data that is inaccurate, incomplete or inconsistent, it is a time bomb ready to explode. 

In checking data integrity, there is no substitute for a human expert.  But beware!  Often, people who can make sense of system and file structures have no clue how to analyze and interpret data.  Assuring data integrity is a data analysis task rather than a programming/IT job.

Should you believe IT when they say the data is clean and that all you need to do is load the files?  If they are not familiar with the tools and techniques needed to answer your business questions, how were they able to query the integrity of the data?  How much has been learned from an audit of the data?  If many basic facts about the business await the running of the "standard" reports, then on what did IT base its clean bill of health?

Investigation of data integrity should go far beyond the standard analysis of “valid” values, key counts and referential integrity.  Instead, it should interrogate the completeness, consistency and usability of the data for robust CRM applications.  For example, the following should be examined:

  • Trends and seasonality patterns.  Does the data capture the trends and annual patterns of activity accurately, and do the patterns hold steady from year to year and from marketing promotion to promotion?  Are there gaps in the data?
  • Longitudinal stability of field value distributions.  Has the same coding scheme been used over time?
  • Core relationships.  Examples include the analysis of promotional depth by past segmentation variables, examination of purchases by product type, and counts by source characteristics.
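
The longitudinal stability check in the second bullet can be sketched as follows: tabulate the share of each code by year, then list the codes that do not appear in every year.  The order-source codes and years shown are made up, and a real audit would also compare the size of the shares, not just their presence.

```python
from collections import Counter, defaultdict


def code_shares_by_year(records):
    """records: (year, code) pairs. Returns year -> {code: share of records}."""
    counts = defaultdict(Counter)
    for year, code in records:
        counts[year][code] += 1
    return {
        year: {code: n / sum(c.values()) for code, n in c.items()}
        for year, c in counts.items()
    }


def codes_not_in_every_year(shares):
    """Codes that are missing from at least one year, in sorted order."""
    all_codes = set().union(*(s.keys() for s in shares.values()))
    return sorted(
        code for code in all_codes
        if any(code not in s for s in shares.values())
    )


# Made-up order-source field whose coding scheme changed between years.
records = [
    (2003, "catalog"), (2003, "web"),
    (2004, "catalog"), (2004, "WEB"),
]
shares = code_shares_by_year(records)
suspects = codes_not_in_every_year(shares)
print(shares)
print(suspects)
```

Here "web" and "WEB" are flagged because each appears in only one year, which suggests a change in coding scheme rather than a change in customer behavior.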

When it comes to the construction of a marketing database with accurate, complete and consistent data – and, just as importantly, the development of a process to maintain that integrity – it is hard to imagine a cookie cutter solution.  Nor is it advisable to entrust anyone with the task but business-savvy, computer-literate, seasoned data analysts.

Question #7:  How Does the Interface Deal with Query Complexity?

GUIs (graphical user interfaces) can be wonderful.  They allow you to construct queries by pointing and clicking instead of typing.  And, for a two-finger typist, they are a relief.

The problem is that most GUIs make it easier for you to construct SQL or 4GL (4th generation language) statements, but still require you to know the technical constructs of their systems.  They offer step automation by demanding less effort for mindless steps.  Nevertheless, although entering a query by dragging and dropping saves keystrokes, it does not eliminate the responsibility for learning the effect of what you are dragging and where you are dropping!

It is far preferable for the interface to achieve a conceptual shift:  the elimination of entire groups of steps that, if it were not for the need to spoon-feed the computer, would not even be part of the way that you think.  Such a conceptual shift is vastly more powerful.  It is the automation or even complete hiding of an entire set of steps that, together, are conceptualized as a single task.  This saves not only physical but also mental energy. 

An automobile analogy is the automatic transmission.  With an automatic transmission, you do not have to think about which gear is appropriate for your current driving conditions.  Without having to think about when and how to shift, a human being can focus entirely on avoiding other automobiles and getting to the desired destination in a timely fashion.

Data analytics, which arguably is one of the more complex computer-assisted endeavors, is challenging enough in its own right.  It is preferable for the software to allow you to "show" or "lead" it to a desired final result, and in a manner that is natural to you.  Software that requires you to tell it how to apply its own – foreign to you – operators lengthens the learning curve.  It also increases the likelihood of mistakes and rework, and reduces the available time for more creative activities.

Question #8:  Was the Testing Ground the Same as Your Battle Ground?

Look at the past and present clientele of the business intelligence software vendor.  Ask which businesses were used as test beds for prototyping and stress testing the software.  If the industries or companies mentioned are not known for their analytical CRM expertise and the use of accountable marketing media, what makes you believe the software is appropriate for you?

Final Thoughts

These eight questions are applicable to most potential users of business intelligence software.  However, the specific circumstances of your business will dictate just how important each question is and generate other specific questions.  The important thing is not to be overtaken by the hype!

Jim Wheaton and Boris Gendelev are Co-Founders of Daystar Wheaton Group (www.DaystarWheaton.com), a Chicago-area data management, data mining and decision sciences practice that focuses on strategic CRM.  The firm also offers full list processing capabilities and campaign management software.  For additional information, please contact Jim at 847-202-0101. 
