Best Practices


Text Mining

Customer relationship data is available not only in quantitative form, such as sales data, returns, pricing, inventory, and regional dispersion of demand. A great deal of even more valuable data lies in conversations with customer service representatives, the notes sales representatives make, chat sessions, and contracts or patent documents. This kind of qualitative data, along with news articles, can help assess the pain points, feedback, suggestions and preferences that customers express. Until recently it was hard to mine textual data, but vendors are increasingly able to include this capability alongside statistical analysis functions. Correlating textual information with quantitative data helps extract insights that would otherwise have been elusive.

Hewlett-Packard has been one of the pioneers in using textual and quantitative information to understand the product needs of customers. When HP combined its information on customer segments with the textual feedback it was receiving, it realized that the feedback differed from segment to segment. The hot-button issues concerned product configuration and pricing. Subsequently, HP was able to tailor solutions for each of these segments based on the analysis. In addition, this information was used to construct predictive models, which were applied to prospect databases to target new customers. HP then extended the scope of its text mining to include the notes taken by its sales staff on Siebel note pads, and later added articles from newspapers and magazines.

Predictive modeling can help companies take pre-emptive action to avoid harm to their brand equity. J.D. Power and Associates, a California-based customer research firm, is testing models that sift through survey comments to predict warranty problems for automobile manufacturers before large numbers of vehicles have been shipped. Nextel Communications faces the problem of customer churn as acutely as other telecom companies. It ferrets out key phrases in customer interactions to predict which customers are likely to leave, and makes offers to retain them.
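
The key-phrase idea can be illustrated with a minimal Python sketch. It is not the method used by any of the companies named here; the phrase list, the function names and the threshold are all hypothetical, and a production system would learn its phrases from labeled interactions rather than hard-code them.

```python
import re

# Hypothetical churn-related phrases, chosen only for illustration.
CHURN_PHRASES = [
    "cancel my service",
    "switch to another carrier",
    "too expensive",
    "dropped calls",
]

def churn_phrase_hits(text, phrases=CHURN_PHRASES):
    """Return the phrases found in the text (case-insensitive, whitespace-normalized)."""
    normalized = re.sub(r"\s+", " ", text.lower())
    return [phrase for phrase in phrases if phrase in normalized]

def flag_at_risk(interactions, min_hits=1):
    """Map customer id -> matched phrases, for customers with at least min_hits matches."""
    flagged = {}
    for customer_id, text in interactions.items():
        hits = churn_phrase_hits(text)
        if len(hits) >= min_hits:
            flagged[customer_id] = hits
    return flagged
```

Calling `flag_at_risk` on a dictionary of customer ids and interaction notes returns only the customers whose notes contain one of the listed phrases, which is the group a retention offer would target.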


Geographic Location of Online Customers

Growing e-commerce poses a challenge to companies that need to identify their customers before they can place ads, make offers and provide service. Shoppers on websites are known only by their IP addresses, which could be located anywhere in the world. Without knowing which country a customer is in, companies do not even know which language to use.

Digital Island of San Francisco has developed an Internet atlas application called TraceWare, which maps IP addresses to the countries they originate from. As an international Internet backbone provider, Digital Island is in a position to gather the geographic information needed to determine where IP addresses originate.

HighWire Press, a publisher of more than 150 life science journals at Stanford University, finds TraceWare useful in placing pharmaceutical ads. Regulations on pharmaceutical advertising vary by country; some countries prohibit it and others do not. TraceWare helps in deciding where to place ads.
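
The lookup-then-decide flow described above can be sketched in Python. This is not TraceWare's interface, which the article does not describe. The address blocks are reserved documentation ranges, the country codes `XA` and `XB` are placeholders, and the policy table is invented for illustration; none of it reflects real allocations or real regulations.

```python
import ipaddress

# Hypothetical table: documentation-only address blocks mapped to placeholder country codes.
IP_BLOCKS = [
    (ipaddress.ip_network("192.0.2.0/24"), "XA"),
    (ipaddress.ip_network("198.51.100.0/24"), "XB"),
]

# Hypothetical policy table; it does not describe any real country's rules.
PHARMA_ADS_ALLOWED = {"XA": True, "XB": False}

def country_for_ip(ip):
    """Return the country code of the first block containing the address, or None."""
    address = ipaddress.ip_address(ip)
    for network, country in IP_BLOCKS:
        if address in network:
            return country
    return None

def may_show_pharma_ad(ip):
    """Allow the ad only when the country is known and its policy permits it."""
    return PHARMA_ADS_ALLOWED.get(country_for_ip(ip), False)
```

Defaulting to "do not show" when the country is unknown is a design choice made here for caution; a publisher could equally route such visitors to a generic ad.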


Customer Lifetime Value

Economic Value Added was an esoteric concept, long ignored by management, until Coca-Cola used it with spectacular success. The intuition underlying the concept is simple: a company adds value only when its profits exceed the cost of capital. Coca-Cola refocused its business by divesting units that did not meet this criterion or whose economic value was lower than that of other units. Customer Lifetime Value has similar implications: acquiring a customer does not add value to the company unless the costs of servicing the customer are less than the profits earned from sales to that customer. In practice, the value added by each customer is hard to determine. Over a lifetime, a customer buys several products and services, and the initial sales create a beachhead for further promotions. A customer for broadband services, for example, can become a customer for Internet services, unified messaging and a home network. On the other hand, a customer could be acquired at a high cost, by offering DSL service free or at a discounted price, without yielding a benefit in the future.
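
The arithmetic behind this comparison can be sketched with a common textbook simplification. The article gives no formula, so the one below is an assumption: a constant annual margin, a constant retention rate, and a discount rate standing in for the cost of capital. All parameter names are hypothetical.

```python
def customer_lifetime_value(annual_margin, retention_rate, discount_rate,
                            years, acquisition_cost):
    """Discounted expected margin over `years`, net of the acquisition cost.

    Assumes the margin is earned at the end of each year and that the customer
    is still active in year t with probability retention_rate ** (t - 1).
    """
    value = 0.0
    for year in range(1, years + 1):
        still_active = retention_rate ** (year - 1)
        value += annual_margin * still_active / (1 + discount_rate) ** year
    return value - acquisition_cost
```

Under these assumptions, a heavily subsidized acquisition shows up as a negative result: the customer adds value only if the discounted margins outweigh what it cost to win the account.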

Companies need to segment their customers so that the value offered to each segment is in line with the cost of serving it. One company that has adopted this approach is Best Buy, which has concluded that 20 percent of its customers are unprofitable. Its profitable customers fall into groups such as suburban mothers and upper-income men. Salespeople in fifteen percent of its stores are trained to tailor their service to these groups' needs. The pilot stores are posting same-store sales gains at twice the rate of conventional stores, along with higher close rates. Best Buy expects to roll out the customization program to the rest of its stores over the next three years. Predictive analytics helps identify the characteristics of customers who are more likely to be profitable.


Progress in Techniques

Traditionally, marketing managers have been content to segment their prospects by the RFM (Recency, Frequency and Monetary Value) of purchases. This provides a rudimentary and intuitively appealing way of classifying customers for promotional purposes. The advantage of this method is that it requires only sales data to segment customers. It does not, however, correlate sales data with geo-demographics, psychographics or economic conditions to estimate the likely purchases of a particular group, a step that can improve the accuracy of predictive models.
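
A minimal sketch of RFM scoring in Python follows, using nothing but sales records, as the method requires. The transaction layout, the thresholds and the three segment names are assumptions made for illustration; practitioners often rank customers into quintiles instead of using fixed cut-offs.

```python
from datetime import date

def rfm_table(transactions, today):
    """Summarize (customer_id, purchase_date, amount) records per customer."""
    summary = {}
    for customer_id, purchase_date, amount in transactions:
        row = summary.setdefault(
            customer_id, {"last": purchase_date, "frequency": 0, "monetary": 0.0}
        )
        row["last"] = max(row["last"], purchase_date)
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        customer_id: {
            "recency_days": (today - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for customer_id, row in summary.items()
    }

def rfm_segment(row, max_recency_days=90, min_frequency=3, min_monetary=200.0):
    """Count how many of the three (hypothetical) thresholds the customer meets."""
    met = (
        (row["recency_days"] <= max_recency_days)
        + (row["frequency"] >= min_frequency)
        + (row["monetary"] >= min_monetary)
    )
    return ["low", "medium", "medium", "high"][met]
```

A customer who bought recently, often and in quantity lands in the "high" segment; one who meets none of the thresholds lands in "low".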

Classical statistics is a proven method of data analysis, especially for testing hypotheses about causation between variables. The methods most commonly used in marketing are linear regression and logistic regression: the former estimates a continuous numeric value, while the latter estimates the odds of an event happening. These methods presume a probability distribution in the data when tests of significance or validity are conducted; a normal distribution, or bell curve, is the typical assumption. However, this assumption is hard to sustain when the data sets are extremely large and countless variables interact with each other in complex ways.
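
The difference between the two outputs can be sketched in a few lines of Python. The example uses a single predictor and hypothetical function names; the linear fit is ordinary least squares, and the logistic part only converts already-known coefficients into a probability and its odds rather than fitting them.

```python
import math

def fit_simple_linear(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logistic_probability(intercept, slope, x):
    """Probability of the event under a logistic model with the given coefficients."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def odds(probability):
    """Odds in favor of the event: p / (1 - p)."""
    return probability / (1.0 - probability)
```

The linear model answers "how much", for instance expected spend, while the logistic model answers "how likely", for instance the chance that a prospect responds to a mailing.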

Data mining methods are an alternative means of parsing the data without making any assumption about its probability distribution. A common denominator of these methods is that they look for patterns in large data sets without attempting to find causal interactions among the variables. These methods use artificial intelligence algorithms, neural networks being a common example, to cluster the data into segments in which data points with common characteristics are separated from the others.

One simple data mining method is the nearest neighbor method, which groups a customer with the most similar known customers and is akin to predicting a person’s buying behavior from their place of residence. People living in expensive neighborhoods tend to have a higher level of education and are more likely to shop in stores such as Williams-Sonoma. Similarly, people living in retirement communities are more likely to travel. Such methods require far less mathematics than classical statistical techniques and are intelligible to most people. The downside is that the inferences are not tested as rigorously as they are with classical statistical methods.
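
A minimal sketch of a k-nearest-neighbor prediction in Python follows. The features, labels and the choice of k = 3 are hypothetical. The sketch assumes the features are already on comparable scales, since otherwise the feature with the largest numeric range dominates the distance.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Predict a label by majority vote among the k closest training examples.

    `training` is a list of (features, label) pairs, where features are
    equal-length numeric tuples; distance is Euclidean.
    """
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

For example, with features such as (home value in hundreds of thousands, age), a new customer is assigned the buying behavior most common among the three known customers who resemble them most closely.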

Similarly, the customer database can be segmented with a simple method like decision trees, without using complex mathematical techniques. For example, a company may divide its customer base into two groups: the first contains customers it is able to retain for the first year only, and the second contains the rest, who defect after that. It could further divide those who are prone to defect into two groups: one that responds to lower prices and another that responds to better products. The splitting can continue to whatever level of granularity is desired.
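
A hand-built tree of this kind can be sketched in Python as nested yes/no questions. The field names `tenure_months` and `price_sensitive`, the 12-month cut-off and the segment labels are hypothetical, and the sketch takes one reading of the splits in the example: short-tenure customers are treated as the defection-prone group. In practice, a tree-learning algorithm would choose the questions from the data.

```python
# Each internal node asks a yes/no question; each leaf is a segment name.
TREE = {
    "question": lambda customer: customer["tenure_months"] <= 12,
    "yes": {
        "question": lambda customer: customer["price_sensitive"],
        "yes": "defector: responds to lower price",
        "no": "defector: responds to better products",
    },
    "no": "retained beyond first year",
}

def classify(customer, node=TREE):
    """Walk from the root to a leaf, following the answer to each question."""
    while isinstance(node, dict):
        node = node["yes"] if node["question"](customer) else node["no"]
    return node
```

Adding granularity means replacing a leaf with another question node, which leaves `classify` unchanged.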

More sophisticated data mining techniques include neural networks, which mimic the human brain’s tendency to learn and improve estimates as data is received. The network makes tentative estimates that increase in precision as more data arrives. As an example, a company might want to estimate the probability of a person clicking on a web advertisement for a sports product. The neural network estimates this from data on the person’s age, location and ethnicity. The estimates are compared with the actual outcomes, and if they differ, the network’s weights are adjusted, and more variables are included if necessary, until a satisfactory result is achieved.
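
The estimate-compare-adjust loop can be sketched in Python with the smallest possible network: a single sigmoid neuron, which is mathematically the same as logistic regression trained by gradient descent. Real networks stack many such units. The feature encoding (age scaled to between 0 and 1, plus a location flag), the learning rate and the number of passes are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_click(weights, bias, features):
    """Estimated click probability for one person."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train_neuron(samples, epochs=2000, learning_rate=0.1):
    """Fit weights on (features, clicked) pairs, where clicked is 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, clicked in samples:
            # Compare the tentative estimate with the actual outcome...
            error = predict_click(weights, bias, features) - clicked
            # ...and nudge each weight in the direction that shrinks the gap.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias
```

Each pass over the data reduces the gap between estimated and observed clicks, which is the sense in which the estimates gain precision as more data is processed.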

