Improving direct mail targeting through customer response modeling


Expert Systems With Applications 42 (2015) 8403–8412

Kristof Coussement a,∗, Paul Harrigan b,1, Dries F. Benoit c,2

a IESEG School of Management – Université Catholique de Lille (LEM, UMR CNRS 9221), Department of Marketing, 3 Rue de la Digue, F-59000 Lille, France
b The University of Western Australia – UWA Business School, M263, 35 Stirling Highway, Crawley, 6009, Australia
c Faculty of Economics and Business Administration, Ghent University, Tweekerkenstraat 2, B-9000 Ghent, Belgium

Article info

Keywords: Direct marketing; Direct mail; Response modeling; Database marketing

Abstract

Direct marketing is an important tool in the promotion mix of companies, and direct mailing is a crucial part of it. One approach to improve direct mail targeting is response modeling, i.e. a predictive modeling approach that assigns future response probabilities to customers based on their history with the company. This paper makes three contributions to the response modeling literature. First, we introduce well-known statistical and data-mining classification techniques (logistic regression, linear and quadratic discriminant analysis, naïve Bayes, neural networks, decision trees, including CHAID, CART and C4.5, and the k-NN algorithm) to the direct marketing community. Second, we run a predictive benchmarking study using the above classifiers on four real-life direct marketing datasets. The 10-fold cross-validated area under the receiver operating characteristics curve is used as the evaluation metric. Third, we give managerial insights that facilitate the choice of classifier based on the trade-off between its interpretability and its predictive performance. The findings of the benchmark study show that data-mining algorithms (CHAID, CART and neural networks) perform well on this test bed, followed by simple statistical classifiers like logistic regression and linear discriminant analysis. Quadratic discriminant analysis, naïve Bayes, C4.5 and the k-NN algorithm yield poor performance.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

The move from mass-marketing to mass-customization is nowhere better reflected than in the area of direct marketing, and in particular direct mail. Marketers no longer distribute their messages to a mass market, nor do they distribute based on basic demographic characteristics; rather, they distribute and optimize different messages to different segments that are developed based on past behavior (Jonker, Piersma, & Van den Poel, 2004; Rowe, 1989; Wierich & Zielke, 2014). Still, the need to improve the effectiveness of direct mail campaigns is a persistent issue in many industries (Guido, Prete, Miraglia, & De Mare, 2011; Mahdiloo, Noorizadeh, & FarzipoorSaen, 2014).

Before sending direct mail, a key dilemma for marketers is which customers to target. In an effort to answer this question, marketers tend to use response modeling. Response modeling identifies the customers who are most likely to respond to the marketing campaign based on their past response behavior.

∗ Corresponding author. Tel.: +33 320545892. E-mail addresses: k.coussement@ieseg.fr (K. Coussement), paul.harrigan@uwa.edu.au (P. Harrigan), dries.benoit@ugent.be (D.F. Benoit).

1 Tel.: +61 8 6488 1979.

2 Tel.: +32 9 264 3552.

This approach fits the philosophy underpinning the one-to-one marketing communications seen in the customer relationship management (CRM) domain (Mahdiloo et al., 2014). CRM is a strategic approach to marketing underpinned by relationship marketing theory (Morgan & Hunt, 1994), which has been defined as “a comprehensive strategy and process that enables an organization to identify, acquire, retain and nurture profitable customers by building and maintaining long-term relationships with them” (Sin, Tse, & Yim, 2005, p. 1266). At the heart of CRM is data on customers. The increasing power of CRM technologies enables more and more sophisticated data collection, storage and analysis techniques. The ability to draw powerful analyses from customer data makes CRM, and thus response modeling, a critical success factor in today’s rapidly changing environment (Danaher & Rossiter, 2011; Kumar, 2008; Ngai, Xiu, & Chau, 2009).

The focus of this paper is on customer response modeling. The contributions of our research study are three-fold. First, we introduce the most popular response modeling methods to the direct marketing community. In particular, we review a range of popular classification algorithms borrowed from the statistical and data-mining community (logistic regression, linear and quadratic discriminant analysis, naïve Bayes, neural networks, decision trees (CHAID, CART and C4.5) and the k-NN algorithm). Second, we complement the existing response modeling literature by integrating and contrasting all classification algorithms in a framework that benchmarks their predictive capabilities in discriminating responders from non-responders at four real-life direct mail companies. Third, managerial insights on classifier choice are given to the direct marketing community, taking into account the comprehensibility and predictive performance of the response model.

This paper is structured as follows. The next section introduces the direct marketing field and its links with customer response modeling. Following that, the range of classification algorithms is introduced and explained. We then describe the evaluation metric used and further explain the characteristics of the datasets and the experimental setting. Finally, we present the results and their implications.

http://dx.doi.org/10.1016/j.eswa.2015.06.054

0957-4174/© 2015 Elsevier Ltd. All rights reserved.

2. Direct marketing and response modeling

Direct marketing is defined as the ‘interactive system of marketing which uses one or more advertising media to affect a measurable response and/or transaction at any location’ (Direct Marketing Association 2009). Direct marketing is big business. It is projected that direct marketing expenditures in the US will grow to $196 billion in 2016, with direct mail forming part of this growth. Direct mail is targeted at customers that are most likely to be enticed by particular offers, as opposed to a traditional mass marketing approach whose promotional activities are addressed to customers and prospects indistinctly (Guido et al., 2011; Mahdiloo et al., 2014; Risselada et al., 2014). Direct mail is not being killed off by the Internet; rather, it is being used as a complementary channel (Danaher & Rossiter, 2011). Winterberry Group confirms that direct mail is still on the rise (Conlon, 2015). In 2014, direct mail spending in the United States grew by 2.7%, compared to the projected 1.1% growth. Moreover, the market analysts project 1% growth in direct mail spending for 2015, bringing it to $45.7 billion of the $156.8 billion projected for total direct and digital spending in 2015. Winterberry Group's reasoning is that direct mail costs will stay steady, so the projected 1% growth is expected to come from volume increases.

Continued growth will be predicated upon the levels of return on investment of direct mail campaigns, which significantly depends on marketers being able to use specialized targeting techniques to come up with the right set of customers to contact (Lamb, Hair, & McDaniel, 1994).

Thus, knowing which customers are more likely to respond to a certain mailing is of paramount importance to marketers. Determining or predicting those customers who have a high probability of responding to a specific mailing based on their past behavior is called customer response modeling (Bose & Chen, 2009; Mahdiloo et al., 2014). Response modeling is part of the classification literature stream. Classification is the procedure by which customers are predicted to belong to predefined groups or target classes based on their historical customer information (Blattberg, Kim, & Neslin, 2008). Typically, a response model is estimated on a training set in which both the independent variables, describing and profiling a particular customer, and the dependent (response) variable, whether the customer responded to a certain mailing, are observed. Then, the model estimated on the training data is applied to a new set of customers that were not used during training (the test set). The result is a response probability for each customer in the test set, dependent on his or her past behavior. Managerially speaking, depending on the direct mail campaign budget, the company is able to target the top x% of customers with the highest response probability given by the response model.
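The targeting step described above can be illustrated with a minimal sketch. It assumes that a fitted response model has already produced a response probability for every customer in the scored set; the function name `select_top_fraction`, the customer ids and the probabilities are hypothetical and serve only as an illustration, not as the procedure used in the benchmark study.

```python
import math


def select_top_fraction(scores, fraction):
    """Return the ids of the customers with the highest response probabilities.

    scores   -- dict mapping a (hypothetical) customer id to a predicted probability
    fraction -- share of scored customers to mail, e.g. 0.25 for the top 25%
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Number of customers the campaign budget allows us to contact (at least one).
    n_target = max(1, math.floor(len(scores) * fraction))
    # Rank customers from most to least likely to respond.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_target]


# Made-up probabilities; in practice they would come from the response model.
test_scores = {
    "c1": 0.05, "c2": 0.40, "c3": 0.12, "c4": 0.73,
    "c5": 0.08, "c6": 0.31, "c7": 0.02, "c8": 0.55,
}
mailing_list = select_top_fraction(test_scores, 0.25)
```

With eight scored customers and a budget covering the top 25%, the two customers with the highest predicted probabilities end up on the mailing list.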

The next section of this paper will introduce and describe the range of statistical and data-mining algorithms that can be used in customer response modeling.

3. Classification algorithms

The essence of one-to-one marketing communication is providing the right customers with marketing messages that they can easily act on (Ryals, 2005). This means that ‘prediction and targeting are both key to decision making underlying direct marketing campaigns’ (Zahavi & Levin, 1997, p. 35). Therefore, understanding which techniques yield the best predictive capabilities is vital for direct marketers (Bose & Chen, 2009; Rada, 2005). With increased efficiencies and effectiveness, marketers could reduce mailing costs (Barwise & Farley, 2005), increase conversion rates (Kaefer, Heilman, & Ramenofsky, 2005), and increase customer retention (Watjatrakul & Drennan, 2005).

Our literature review reveals that existing literature utilizes several statistical and data-mining classification algorithms in various research setups to separate responders from non-responders. However, we complement the academic literature by presenting and integrating the most popular classifiers into one predictive benchmark study over multiple response datasets, while summarizing the managerial implications. Several statistical classification methods to predict customer responses have been proposed and utilized, such as logistic regression, discriminant analysis and naïve Bayes (Baesens, Viaene, Van den Poel, Vanthienen, & Dedene, 2002; Berger & Magliozzi, 1992; Coussement, Van den Bossche, & De Bock, 2014; Cui, Wong, & Zhang, 2010; Deichmann, Eshghi, Haughton, Sayek, & Teebagy, 2002; Kang, Cho, & MacLachlan, 2012; Lee, Shin, Hwang, Cho, & MacLachlan, 2010). These techniques can be very powerful, but each algorithm also makes several stringent, but different, assumptions about the underlying distribution between the independent variables and the dependent variable. To counter this, more advanced data-mining algorithms have been proposed for discriminating between responders and non-responders, such as artificial neural networks (Baesens et al., 2002; Chen, Hsu, & Hsu, 2011; Curry & Moutinho, 1993; Zahavi & Levin, 1997), decision tree-generating techniques (Buckinx, Moons, Van den Poel, & Wets, 2004; Chen, Hsu, & Chu, 2012; Haughton & Oulabi, 1997; McCarty & Hastak, 2007; Rada, 2005) and k-NN learners (Govindarajan & Chandrasekaran, 2010; Kang et al., 2012).

The following sections review the most popular response models by describing their functioning, and by discussing their merits and drawbacks.

3.1. Logistic regression

Logistic regression (LOG) is a well-known and industry-standard classification technique for predicting a dichotomous dependent variable such as respond/do not respond to a mailing (Coussement et al., 2014; Suh, Noh, & Suh, 1999). Besides applications in direct marketing, it is a frequently used technique in a variety of predictive business settings like customer segmentation (McCarty & Hastak, 2007), churn prediction (Neslin, Gupta, Kamakura, Lu, & Mason, 2006), customer choice modeling (West, Brockett, & Golden, 1997) and many others. Moreover, logistic regression has several advantages (Hosmer & Lemeshow, 2000).

For a given training set with $N$ labeled training examples $\{(x_i, y_i)\}_{i=1}^{N}$, with input data $x_i \in \mathbb{R}^n$ and corresponding binary target labels $y_i \in \{0, 1\}$, logistic regression estimates the probability $P(y = 1 \mid x)$ given by

$$P(y = 1 \mid x) = \frac{1}{1 + \exp\left(-(w_0 + w^{\top} x)\right)} \qquad (1)$$

with $x \in \mathbb{R}^n$ being an $n$-dimensional input vector, $w$ the parameter vector and $w_0$ the intercept. The parameters $w_0$ and $w$ are usually estimated using a maximum likelihood procedure (Hosmer & Lemeshow, 2000).
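As an illustration of Eq. (1) and of maximum likelihood estimation, the sketch below computes $P(y = 1 \mid x)$ and fits $w_0$ and $w$ on a tiny made-up dataset. The text only states that a maximum likelihood procedure is used; plain gradient ascent on the average log-likelihood is an assumption made here for simplicity, and the function names, learning rate, iteration count and toy data are all hypothetical.

```python
import math


def predict_proba(w0, w, x):
    """Eq. (1): P(y = 1 | x) = 1 / (1 + exp(-(w0 + w.x)))."""
    z = w0 + sum(wj * xj for wj, xj in zip(w, x))
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, learning_rate=0.1, n_iter=2000):
    """Estimate (w0, w) by gradient ascent on the average log-likelihood.

    Illustrative optimizer only; it is not the estimation routine of any
    particular statistical package.
    """
    n_features = len(X[0])
    w0, w = 0.0, [0.0] * n_features
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * n_features
        for xi, yi in zip(X, y):
            # Derivative of the log-likelihood with respect to w0 + w.x
            err = yi - predict_proba(w0, w, xi)
            grad0 += err
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        w0 += learning_rate * grad0 / len(X)
        w = [wj + learning_rate * gj / len(X) for wj, gj in zip(w, grad)]
    return w0, w


# Toy data: one hypothetical predictor (e.g. number of past purchases)
# and a binary response label (1 = responded to the mailing).
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [0, 0, 1, 0, 1, 1]

w0_hat, w_hat = fit_logistic(X_train, y_train)
p_low = predict_proba(w0_hat, w_hat, [0.0])
p_high = predict_proba(w0_hat, w_hat, [5.0])
```

On this toy data the fitted slope is positive, so the customer with more past purchases receives the higher predicted response probability.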

