www.emeraldinsight.com/0263-4503.htm
Identifying cross-selling opportunities, using lifestyle segmentation and survival analysis
Jake Ansell, Tina Harrison and Tom Archibald
The Management School, University of Edinburgh, Edinburgh, UK
Received March 2006
Revised February 2007, March 2007
Accepted March 2007
                                     Abstract
                                     Purpose – To demonstrate the successful use of lifestage segmentation and survival analysis to
                                     identify cross-selling opportunities.
                                     Design/methodology/approach – The study applies lifestyle analysis and Cox’s regression
                                     analysis model to behavioural and demographic data describing 10,979 UK customers of a large
                                     international insurance company.
                                     Findings – There are clear differences between the lifestage segments identified with respect to
                                     customer characteristics affecting the likelihood of a second purchase from the company and the
                                     timeframes within which that is likely to take place. The “mature” segments appear to offer greater
                                     opportunities for retention and cross-selling than the “younger” segments.
                                     Research limitations/implications – The study was limited by the type of data available for
                                     analysis, which related mainly to life insurance and pension products characterised by low transaction
                                     frequency. Different results might be expected for banking or credit-and-loan products. The findings
                                     could be enhanced by incorporating a wider range of customer characteristics into the analysis.
                                     Practical implications – The findings show clear differences in behaviour across the segments
                                     identified, providing a basis on which marketing planners might differentiate marketing and
                                     communication strategies for particular product-market segments.
                                     Originality/value – The paper illustrates the adaptation of survival analysis methodology, familiar
                                     in other disciplines but comparatively rare in marketing, to the cross-selling of financial services. It
                                     shows how planners can not only identify customers most likely to repurchase but also predict the
                                     timeframe in which that will take place.
                                     Keywords Market segmentation, Selling methods, Customer retention, Services, Financial services
                                     Paper type Research paper
Marketing Intelligence & Planning, Vol. 25 No. 4, 2007, pp. 394-410
© Emerald Group Publishing Limited, 0263-4503
DOI 10.1108/02634500710754619
Introduction
In mature markets, the need to adopt efficient marketing strategies becomes more
critical. The financial services sector in the UK is an example. Following successive
deregulation initiatives throughout the 1970s and 1980s (for example, abolition of the
“corset” in 1979, changes to the Building Societies’ Act in 1986 and the enactment of the
Financial Services Act 1986), competition has intensified from both traditional players
and new entrants. In this now saturated market, new customers are hard to find, and
their acquisition tends to be at the expense of competitors. Such aggressive marketing
activity is costly, and does not always lead to long-term gains since customers can
switch easily to other competitors. Consequently, such competition for customers in
mature markets leads to the phenomenon of “churn”, which has also been described as a
“revolving door” (Kamakura et al., 2003) and a “leaking bucket” (Stewart, 1998): a
constant and varying flow of customers both into and out of the business.
   In order to reduce the rate of churn, financial institutions are considering ways
of strengthening the relationship with their customers. One way of forging
stronger ties with customers is through “cross-selling”: the strategy of selling other
products to a customer who has already purchased a product from the vendor,
designed to increase the customer’s reliance on the company and decrease the
likelihood of switching to another provider. Another is “up-selling”: inducing the
customer to buy enhanced products, upgrades and add-ons. Given that the costs of
acquiring new customers are increasing, cross-selling represents a cost-effective
means of generating further business from existing customers. According to
Felvey (1982), it is easier for businesses to grow in this way than by attempting to
attract new customers.
   However, cross-selling can be a double-edged sword. Too much can upset customers
and make them less responsive, effectively weakening the relationship. Effective use of
the tactic requires that the basic objective of marketing is fulfilled: to offer the right
product to the right person at the right time. The customer database is instrumental in
this process, allowing the financial institution’s marketing planners to learn about its
customers from historic and current behaviour and to make predictions about future
needs and requirements. While most financial institutions gather and retain similar
data on their customers, the value derived from them varies across institutions. This is
due, in part, to the lack of suitable techniques to analyse the data as Kamakura et al.
(2003, p. 3) note:
   The development of techniques for the extraction of relevant information from the database
   for strategic marketing purposes, often referred to as data mining, has lagged behind the
   development of tools for collecting and storing the data.
Thus, many financial institutions possess very large amounts of data from which
limited marketing information is derived.
   Cross-selling can allow a firm to develop a continuing relationship with the
customer and hence the potential for further opportunities. In order to identify optimal
cross-selling opportunities, its marketing strategists first need to be able to identify
whether an existing customer is likely to make a subsequent purchase and/or which
product they will buy. In addition to this, it is also important to establish when the
purchase is likely to occur. It may be worthwhile maintaining the relationship if
the likelihood of a purchase is high and the timeframe short, but less rewarding if the
likelihood is low and at some distance in the future. While a number of good techniques
exist to ascertain the probability of subsequent purchases occurring, the timeframe in
which individuals are likely to act remains largely unexplored. In this paper, we apply
“survival analysis” to the task of estimating the timeframe to the next purchase.
   Survival analysis is a set of statistical techniques used to determine quantitatively
the impact of a set of variables (such as customer characteristics) on the time to the
occurrence of an event (such as a subsequent purchase). It has been applied
successfully in such areas as medicine (Collett, 1994) and industry (Ansell and Philips,
1994), but has not been used widely in predicting customer behaviours despite clear
potential for successful application in that domain. This paper reports the application
of survival analysis to customer data supplied by a large international insurance
company, to predict which customers are most likely to make a repeat purchase from
the company and when.
           Unlike many other retail customers, those buying financial services generally do not
       purchase new products frequently. Purchases tend to be at specific times in the
       lifecycle, reflecting the needs and circumstances of particular lifestages. Thus, in order
       to achieve better understanding and prediction of customer behaviour in this context,
       our analysis also takes account of lifestage.
       Cross-selling
       The rationale for cross-selling, defined in the introduction as “the strategy of selling
       other products to a customer who has already purchased a product from the vendor” is
       not only to “increase the customer’s reliance on the company and decrease the
       likelihood of switching to another provider” but also to exert a generally positive
       influence on the relationship with the customer, strengthening the link between
       provider and user (Kamakura et al., 2003). Increasing product holding leads to an
       increased number of connection points with customers, as well as increasing the
       switching costs they would face if they decided to take their custom elsewhere
       (Srivastava and Shocker, 1987). The likelihood of defection is thereby reduced.
       Increased product holding also creates a situation in which the company can get to
       know its customers better through a greater understanding of buying patterns and
       preferences. This, in turn, puts it in a better position to develop offerings that meet
       customer needs. Consequently, it is argued (Kamakura et al., 2003) that cross-selling
       increases the total value of a customer over the lifetime of the relationship.
          Despite the apparent importance of cross-selling for relationship development and
       profitability, it seems that many financial marketers rely on intuition and experience
       for their related strategic decisions (Prinzie and Van den Poel, 2006). Indeed, Evans
       (2002) notes that cross-selling rates remain low among banks in Europe. Furthermore,
       the subject has received limited attention in the marketing literature. When it is dealt
       with, the focus is on methodologies for identifying common patterns in the acquisition
       of products by customers, based on ownership or usage data. While a substantial amount
       of research exists on the acquisition sequence for consumer durables, such as studies
       by Hebden and Pickering (1974), Kasulis et al. (1979) and Prinzie and Van den Poel
       (2006), there is little in the context of financial services.
          Stafford et al. (1982) followed their research into acquisition of consumer durables
       with a study that found evidence for the existence of a common acquisition pattern
       with respect to financial services. They describe an acquisition sequence from cheque
       accounts to simple savings accounts, to insurance, stocks, bonds and mutual funds,
       which was found to be relatively constant across three cohort groups. A study by
       Kamakura et al. (1991) both describes and predicts purchase sequences. It investigated
       the influence of the financial maturity of the customer (linked to lifestage) and the
       acquisition difficulty of the service (such as resources required, level of risk and
       liquidity, information costs) on the acquisition sequence. Financial services and
       consumers are positioned along a continuum of “latent” difficulty/ability, expressing a
       hierarchy of investment objectives. The probability that an investor owns a particular
       financial product is a function of that person’s position on the continuum relative to
       that of the financial product. The authors hypothesise that the more “difficult” services
       are acquired in the later stages of the family life cycle. The research clearly illustrates
the link between the acquisition sequence and stages of the lifecycle. According to
Prinzie and Van den Poel (2006), these were the first researchers to make explicit use of
purchase sequence for cross-sell purposes.
   Similar studies have been conducted, focusing on financial products used for asset
accumulation (Paas, 1998; Soutar and Cornish-Ward, 1997) and financial products
facilitating financial transactions (Paas, 2001). Kamakura et al. (2003) also extended
their original study to incorporate a data-augmentation tool that combines information
from the customer database and from surveys. They argue that the use of single-source
data is not sufficient for understanding the complete buying needs of the customer:
such data take no account of the possibility that the customer may already hold competitors’ products.
Marketing planners need to know about each customer’s usage of their own and
competing products, but this depth of market intelligence is not readily available,
unless in survey format, and typically provides only snapshots in time.
   The studies summarised here have employed various statistical models and
methods. Broadly speaking, purchase sequences have been described as either a
hierarchical process (such as the Guttman scalogram analysis and latent-trait analysis
used by Kamakura et al.) or a succession of purchases (such as the Markov models
used by Prinzie and Van den Poel). Hierarchical models assume that purchases are
consecutive; the same assumption does not apply to sequence models (Agrawal and
Srikant, 1995). While these models have achieved the marketing objective of knowing
which product to offer next and, in some cases, to whom, they do not address the
question of when to make it available. To do so requires a sequence model with a focus
on time, such as survival analysis. This paper attempts to fill that gap and, in doing so,
to fully address the marketing objective of making the right product available to the
right person at the right time.
Survival analysis
Survival analysis is a set of non-parametric, semi-parametric and parametric statistical
techniques used to determine quantitatively the impact of a set of potentially
influential variables on the elapsed time to the occurrence of an event, such as death or
the failure of a component (Prentice et al., 1981; Ansell and Ansell, 1987; Collett, 1994;
Ansell and Philips, 1994). The techniques are well established, widely accepted and
extensively used in biometrics (where survival analysis was first developed),
engineering and event history analysis. Survival analysis is being used increasingly in
the organisational behaviour and strategy fields (Blodgett, 1992; Chen and Lee, 1993;
Staber, 1992). It has also been applied to credit scoring, to predict the time to default
(Stepanova and Thomas, 2002).
    This analytical technique can also be used in the context of cross-selling (and
up-selling), to predict when the next purchase will be made. In other words, the aim is to
predict when an existing customer might carry out one of the actions shown in Figure 1,
which illustrates the behaviour of a customer who has already made two purchases at
time intervals t1 and t2. The objective is to predict the next action and when it will occur
(t3). There are a number of future behaviours that the customer might exhibit, including
taking up further products, surrendering existing ones or defecting from the company
altogether. Figure 1 shows the separate events or behaviours with the passage of time
and also the customer’s progress through lifestages.
[Figure 1. Predicting customer behaviour over time. Vertical axis: client actions, from positive to negative (labels shown: Saving Plan, Extend Mortgage, Surrender Policy, Defection). Horizontal axis: time, with past actions at t1 and t2, the time of observation, and future actions at t3. Lifestages along the time axis: Young Single, Setting Up Home, Middle Age, Retired.]
                      In the study reported here, survival analysis is used to estimate when customers are
                      likely to make their next financial services purchase. The possibility also exists of
                      providing an indication of the timeframe beyond which subsequent purchases are
                      unlikely to be made.
                         Being able to predict when a certain behaviour is likely to occur is of particular
                      importance to the timing element of marketing campaigns. For example, survival
                      analysis can assist planners in understanding the time-points at which customers are
                      most likely to be receptive to marketing communications initiatives, and also those
                      beyond which further effort is likely to be ineffective, thereby reducing the amount of
                      wasted marketing effort.
                         Survival analysis can also be useful in forming a judgement of the value of a
                      customer. If the likelihood of repurchase in the near future can be established,
                      assumptions can be made about the future profitability of a customer’s business.
                      Moreover, being able to predict the specific type of purchase can provide an even more
                      precise indication of future worth. Though these inputs are clearly useful in the
                      targeting of potentially profitable customers, the details of profitability analysis are
                      outside the scope of this paper. For an overview, see Zeithaml (2000).
                         Survival analysis is particularly suited to the study of cross-selling, in allowing the
                      time element to be analysed. It does not necessarily make strong assumptions about
                      the underlying distributions, such as the frequently made assumption of normality.
                      The approach can deal efficiently with cases when “censoring” occurs: for example, if
                      the event occurs beyond the observation period or if cases have been removed before
                      the end of the observation period and before the event occurs. It can also be applied to
                      longitudinal data.
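As a minimal sketch of how censoring might be encoded before any survival model is fitted, each customer can be reduced to a duration and an indicator of whether the event was observed. The function name, the dates and the choice of days as the time unit are illustrative assumptions, not details of the study.

```python
from datetime import date


def to_survival_record(first_purchase, second_purchase, observation_end):
    """Return (duration_in_days, event_observed) for one customer.

    event_observed is True when the second purchase falls inside the
    observation window, and False when the record is right-censored.
    """
    if second_purchase is not None and second_purchase <= observation_end:
        return (second_purchase - first_purchase).days, True
    # No second purchase seen by the end of observation: censored record.
    return (observation_end - first_purchase).days, False


# Hypothetical customers, for illustration only.
end = date(2005, 12, 31)
records = [
    to_survival_record(date(2001, 3, 1), date(2003, 3, 1), end),  # repurchased
    to_survival_record(date(2005, 1, 1), None, end),              # censored
]
```

A censored record still contributes information: the customer is known not to have repurchased for at least the recorded duration.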
                         In using survival analysis to predict the time to the next event, or lifetime, the
                      objective is to identify the customer characteristics that affect survival. For example, in
                      the context of financial services, these “covariates” are to be found in customer
                      information held by the financial institution in question, such as age, gender, marital
                      status and other demographic information.
Application
Survival analysis comprises two elements: time effect and individual effect. The first of
these can be described as an underlying survivor function shared among the whole
population. The second describes the difference of the individual from a base point, in
terms of the covariates. These two functions together describe quantitatively the
probability of an event taking place for a specific individual. A useful feature is the
ability to generate separate survivor functions or hazard functions for
different groups within a population. The estimation can be either non-parametric,
semi-parametric or parametric. The first two categories are best suited to exploratory
investigations, since they make limited assumptions about the distribution of the data.
Parametric regression models require the nature of that distribution to be known
(exponential, Weibull, log-normal), and the regression model is chosen
accordingly.
   In the study reported here, the underlying population distribution was not known
prior to modelling. The decision was therefore made to proceed on an exploratory
basis, making the weakest assumptions about its nature. The analysis was based on
Cox’s (1972) Proportional Hazards Model, a semi-parametric model which allows the
data to determine the underlying distribution. If it had in fact been known a priori, then
a parametric model based on the known distribution would have been more
appropriate.
   The analysis is divided into two parts: a description of the time structure of the
population and an assessment of the impact of the covariates on the likelihood of the
event occurring. The model assumes that the covariates have a proportional effect on
the time structure, but this can be relaxed. Thus, Cox’s model considers the probability
of an event occurring in the small interval (t, t + dt) assuming the event has not
occurred before t. This measure is sometimes called the “hazard” and can be written as:
                                      $\lambda_0(t) \exp(x'\beta)$
where: $\lambda_0(t)$ is the time structure; $\exp(x'\beta)$ represents the effect of the individual
characteristics or covariates, $x$; and $\beta$ is a set of parameters to be estimated.
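The standard consequences of this form of the hazard can be written out as follows. The symbols $h$ (hazard), $S$ (survivor function) and $S_0$ (baseline survivor function) are notation introduced here for illustration, with the baseline hazard written as $\lambda_0(t)$ and the parameters as $\beta$; they are not taken from the study itself.

```latex
\begin{align}
h(t \mid x) &= \lambda_0(t)\exp(x'\beta), \\
S(t \mid x) &= \exp\!\left(-\int_0^t h(u \mid x)\,du\right) = S_0(t)^{\exp(x'\beta)},
\qquad S_0(t) = \exp\!\left(-\int_0^t \lambda_0(u)\,du\right), \\
\frac{h(t \mid x_1)}{h(t \mid x_2)} &= \exp\!\big((x_1 - x_2)'\beta\big).
\end{align}
```

The last line shows why the model is described as proportional: the ratio of the hazards of two customers depends on their covariates but not on time.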
   Cox’s model is thus similar to other regression models for estimation of the specific
customer likelihood to repurchase relative to other customers, such as logistic
regression. It also provides a ranking of customers according to the likelihood of
repurchase. This information is of particular value in pinpointing the most suitable
targets for marketing campaigns.
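The ranking described above can be sketched as follows, assuming the coefficients have already been estimated. The coefficient values, covariate names and customers are hypothetical and serve only to illustrate the calculation of the relative hazard.

```python
import math

# Hypothetical coefficients (beta), chosen only to illustrate the calculation.
beta = {"age": -0.02, "married": 0.3}


def relative_hazard(covariates, beta):
    """Return exp(x'beta), the multiplier applied to the baseline hazard."""
    return math.exp(sum(beta[name] * value for name, value in covariates.items()))


# Hypothetical customers described by the same covariates.
customers = {
    "A": {"age": 30, "married": 1},
    "B": {"age": 60, "married": 0},
    "C": {"age": 45, "married": 1},
}

scores = {name: relative_hazard(x, beta) for name, x in customers.items()}
# Highest relative hazard first: these customers are expected to repurchase soonest.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the baseline hazard is common to all customers, the ordering depends only on the linear predictor and does not require the time structure to be estimated.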
   However, the real advantage of survival analysis over other regression models is the
information provided on the time to next purchase. In Cox’s model, time is continuous,
whereas the application of logistic regression (in credit scoring, for example) assumes
discrete episodes of time. Continuous treatment of time is of particular value in
assessing the potential for a continued customer relationship. Assessment of the
quality of fit can be in terms of the likelihood, which can be compared to $\chi^2$ statistics on
specific degrees of freedom, or prediction in terms of such measures as the Area Under the
Receiver Operating Characteristic curve, AUROC (Hand, 1997; Thomas et al., 2002), which measures
the ability of the model to determine the correct outcome.
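One way to compute AUROC is by direct pairwise comparison, sketched below. The function and the example scores are illustrative assumptions; the measure treats the outcome as a simple yes or no within a fixed window and so ignores censoring and timing.

```python
def auroc(scores, outcomes):
    """Area under the ROC curve by pairwise comparison.

    scores: model risk scores (higher means more likely to repurchase).
    outcomes: 1 if the customer repurchased, 0 otherwise.
    Ties in score count as half a correct ordering.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and outcomes for four customers.
example = auroc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

A value of 0.5 corresponds to a ranking no better than chance, and 1.0 to a ranking that places every repurchaser above every non-repurchaser.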
   Figure 2 shows an “idealised” survival curve, typical of some customer behaviour
we have found in the financial services sector. It plots the probability of not
repurchasing against time. The curve slopes downward, but then seems to level out at
a certain point. The proportion below the asymptote represents those who are unlikely
to repurchase; in this case, the asymptotic value is 0.25, indicating a quarter of the total
number under study. Depending on how the data are gathered, the time can be
represented in days, months or years, to show the timeframe in which repurchase is
most likely for the remaining three quarters. Combined with the repurchase likelihood
ranking mentioned above, the possibility arises of identifying the prime targets among
those likely re-buyers.
                         As with all forms of statistical analysis, there are limitations with this procedure. If
                     there are relatively few observations, the survival curves will not be as smooth as in
                     Figure 2 and will take the shape of a descending staircase. This makes it more difficult
                     to ascertain when a certain percentage of the population under investigation will have
                     repurchased. Equally, if the proportion of censored data (unobserved purchases) is
                     high, it may again be difficult to determine the timescale to a given percentage of
                     repurchases. Moreover, the precision of the estimates of the coefficients associated with
                     the covariates will be affected in both these cases.
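The staircase shape of an empirical survival curve can be illustrated with the Kaplan-Meier estimator, a standard non-parametric method. It is shown here as a sketch under the assumption of a small hypothetical sample; the study does not state that this estimator was used to produce its curves.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time."""
    # Distinct times at which at least one event (repurchase) was observed.
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still under observation and without a repurchase just before t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical durations (in years) and event indicators (False = censored).
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
```

With only six customers, the estimate drops in three visible steps and then stays flat, because the remaining customers are censored; this is the small-sample and high-censoring behaviour described above.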
                         Clearly, care needs to be taken with the selection of the time variable employed. In
                     some contexts, calendar time may be less appropriate than some other timeframe, such
                     as time available to purchase. As with other regression methods, the choice of variables
                     to be included or excluded will have an impact on the outcome. The issue of co-linearity
                     arises where covariates are highly correlated. If an important measure is missing from
                     the analysis, the model might be poorly specified.
                         Since Cox’s model is based on an iterative approach, there are occasions when the
                     estimation procedure does not converge. This may be due to over-representation in the
                     sample of one of the groups within the older repurchase times.
                     Data
                     The data analysed were collected from a randomly generated sample of customer
                     records in the recently established data warehouse of a large international insurance
                     company. To protect confidentiality, the sample was limited to 10 percent of the entire
                     database, amounting to 10,976 customers in total. Although determination of the
                     sample size was beyond our control, it was considered to be a large enough basis for
                     the analysis and sufficiently representative. The demographic profile of the entire
                     customer database was not known, but is assumed to be similar to that of other large
                     financial institutions, which in turn reflects the demographic profile of the market for
                     retail financial services.
[Figure 2. Idealised survivor function. Vertical axis: probability of survival, from 0 to 1, with the curve levelling out at 0.25. Horizontal axis: time.]
Data relating to both customer characteristics and product purchases were available
within the database. In terms of customer characteristics, they were: current age of the
individual; age of the individual at the time of the first purchase from the company;
gender; marital status; and “Financial ACORN” classification. Standing for A
Classification of Residential Neighbourhoods, ACORN is a geodemographic
classification based on census data. It classifies the population of the UK, by
household, into 17 “groups” and 54 “types”. Financial ACORN focuses exclusively on
the consumption of financial services.
   The company studied derives the information in its database primarily from
application forms and policy details captured at the time of purchase. The information
available was accurate and complete for most variables,
except that marital status suffered from a large proportion of omissions and the
possibility of inaccuracy. This is a challenge that all financial institutions face in
capturing such data, unless they can be updated regularly. Financial ACORN
classification is externally provided information, added to the customer files by postcode
matching. (In the UK, a unique domestic postcode is shared by about 15 households.)
   The decision was made to capture behavioural data on the first five purchases
from the company only. While some customers would have made more than that
number, the vast majority within the database had made only two; just less than
half (44 percent) of the 10,976 customers in the sample had bought at least twice.
Owing to the relatively small proportion of customers holding multiple products,
the analysis focused only on the prediction of the second purchase.
   Table I shows the ratio, ordinal and nominal variables used in the subsequent
analysis and their properties. The variables were not coarse-classified, as is often the
case in credit scoring.
   The two-stage analysis procedure used consisted of an initial segmentation of the
population into lifestage segments, followed by the application of Cox’s Proportional
Hazards Model. Owing to the link identified by Kamakura et al. (1991) between the financial
product acquisition sequence and the life cycle, the decision was taken to segment the
sample first into lifestage segments and to look within the segments. That was
achieved by K-means clustering, chosen as a rapid method for obtaining the clusters
(Punj and Stewart, 1983). Two criteria were used to determine the most appropriate
cluster solution: inspection of the squared mean error value, and the size of the clusters.
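The clustering step can be sketched with a plain implementation of Lloyd's algorithm for K-means. The two numeric variables, the starting centroids and the data points are hypothetical; the study also used nominal variables, whose encoding is not described and is not attempted here.

```python
def k_means(points, centroids, iterations=20):
    """Lloyd's algorithm for K-means on tuples of floats.

    Returns the final centroids, the members of each cluster and the
    within-cluster sum of squared errors.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    # Squared error of every member from its cluster centroid, summed over clusters.
    sse = sum(
        sum((a - b) ** 2 for a, b in zip(p, c))
        for members, c in zip(clusters, centroids)
        for p in members
    )
    return centroids, clusters, sse


# Hypothetical (current age, age at first purchase) pairs.
points = [(25, 22), (27, 24), (30, 26), (60, 40), (65, 45), (70, 50)]
centroids, clusters, sse = k_means(points, [(25, 22), (70, 50)])
```

Running the function for several values of K and comparing the error value and the cluster sizes is one way to apply the two criteria mentioned above.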
Table I. Customer information available for analysis
Variable   Description                                              Type
Age        The current age of the customer, measured in years       Ratio, discrete
Age.Sdt    The age at which the customer made their first           Ratio, discrete
           purchase from the company, measured in years
Cur.Mrtl   Current marital status: single, married, divorced,       Nominal, discrete
           separated or unclassified
ACORN      Financial ACORN classification: A (financially           Ordinal, discrete
           sophisticated), B (financially involved), C
           (financially moderate), D (financially inactive) and a
           final group to represent those unclassified
Gender     Male or female                                           Nominal, discrete
Product    A range of product groups including, among others,       Nominal, discrete
           personal pensions and investments
The segments were defined according to the customer characteristics available from
the database, as presented in Table I. Cluster descriptors were chosen to reflect the
behaviour and characteristics of the groups, and were influenced by ACORN
terminology. For example, use of the term “moderate” relates to a basic engagement
with financial products often referred to as “foundation products” (Kamakura et al.,
1991). “Financially involved” individuals have bought a broader range, and
“sophisticated” defines those who have purchased more complex and riskier
                         products. The next stage of the analysis considered the propensity of each cluster to
                         make a subsequent purchase and the estimation of survival functions.
                         Findings
                         Cluster analysis
                         Cluster analysis established six clusters with distinct features, based on the
                         comparison of within and between cluster variability. Figure 3 shows a graphical
                         representation of the cluster profiles. The position of the clusters in the
                         two-dimensional space represents the relative age and degree of financial
                         sophistication (according to Financial ACORN) of the individuals they contain. The
                         size of the circle represents the relative size of the cluster, and the gender distributions
                         are highlighted within each circle. The Appendix shows the relative percentages
                         within each cluster exhibiting the characteristics used in the analysis.
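The comparison of within-cluster and between-cluster variability can be illustrated with a minimal sketch. The data points, the labels and the function name `within_between_ss` are made up for illustration; the clustering software and criterion actually used in the study are not specified in this section.

```python
from statistics import mean

def within_between_ss(points, labels):
    """Return (within-cluster, between-cluster) sums of squares."""
    dims = range(len(points[0]))
    grand = [mean(p[d] for p in points) for d in dims]
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    within = between = 0.0
    for members in clusters.values():
        centroid = [mean(p[d] for p in members) for d in dims]
        # Spread of members around their own cluster centroid.
        within += sum((p[d] - centroid[d]) ** 2 for p in members for d in dims)
        # Distance of the centroid from the grand mean, weighted by cluster size.
        between += len(members) * sum((centroid[d] - grand[d]) ** 2 for d in dims)
    return within, between

# Hypothetical toy data: (current age, age at first purchase).
points = [(44, 27), (46, 33), (45, 32), (65, 61), (65, 63), (64, 60)]
labels = ["younger", "younger", "younger", "older", "older", "older"]
within, between = within_between_ss(points, labels)
print(f"within={within:.1f} between={between:.1f}")
```

A solution in which the between-cluster figure is large relative to the within-cluster figure separates the groups well; comparing that ratio across candidate numbers of clusters is one common way of choosing among them.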
                            Cluster 2 is the largest, accounting for 31 percent of the sample but containing the
                         least financially sophisticated customers; Clusters 3, 5 and 6 contain the most
                         financially sophisticated customers, but Cluster 6 is the smallest, with just 3 percent of
                         the total.
                            In terms of age, Clusters 1-3 have the lowest average ages: 46, 45 and 44,
                         respectively. Cluster 5 is in the middle of the age range, with an average age of 51,
                         leaving the oldest customers in Clusters 4 and 6, the average ages being 65 in
                         both. The pattern with respect to age at first purchase from the company is similar.
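[Figure 3. Six lifestage clusters. Vertical axis: age (marked at 50 and 65 years). Horizontal axis: financial sophistication (low to high). Circle size shows relative cluster size, with the male/female split marked in each circle. Cluster labels surviving extraction: 4. Financially moderate seniors; 5. Sophisticated middle agers; 6. Sophisticated late starters.]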
Cluster 3 contains the individuals who were youngest at first purchase, with an average age of 27, followed by
Clusters 2, 1 and 5, which leaves Clusters 4 and 6 with those who were the oldest at first
purchase. It is notable that the difference between age at start and current age is higher
for the younger clusters than the older. For example, the difference is 12 years in
Cluster 1 but only four in Cluster 4.
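The differences quoted above follow directly from the cluster means reported in the Appendix (Table AI). As a short worked check, the sketch below subtracts mean age at first purchase from mean current age for each cluster; the variable names are illustrative only.

```python
# Mean current age and mean age at first purchase per cluster (Table AI).
mean_age = {1: 45.7, 2: 45.1, 3: 43.9, 4: 64.8, 5: 50.9, 6: 65.0}
mean_age_at_start = {1: 33.4, 2: 31.7, 3: 26.7, 4: 60.8, 5: 42.3, 6: 62.6}

# The difference of the two means is the mean relationship duration in years.
duration = {c: round(mean_age[c] - mean_age_at_start[c], 1) for c in mean_age}

# List clusters from longest to shortest average relationship.
for cluster, years in sorted(duration.items(), key=lambda kv: -kv[1]):
    print(f"Cluster {cluster}: {years} years")
```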
   Turning to gender representation and marital status, the distinctions between the
clusters are less obvious. In terms of gender, Clusters 1-3 contain approximately
two-thirds men, whereas Clusters 4-6 show a more even distribution of male and female
individuals. With respect to marital status, a large proportion of the sample remained
unclassified, due to the difficulty in maintaining accurate records of a status that could
change over time. Of the customers for whom a description was at least available,
whether accurate or not, most were married. There is a slightly higher proportion of
single individuals in Clusters 1-3, compared with 4 and 5, and especially with Cluster 6.
   In terms of the relative degree of financial involvement and purchase activity,
Cluster 3 contains the most financially active: almost two-thirds of those customers had
made at least two purchases from the company, whereas only a third of those in Clusters 4 and
6 had done so. Looking at multiple product holding, 19 percent of the individuals in
Cluster 3 had bought four products from the company, compared with only 2 percent in
Cluster 6. The rank order of financial activity seems to suggest that Cluster 3 is the
most active, Clusters 4 and 6 the least active, while Clusters 1, 2 and 5 occupy the
middle ground.
Survival analysis
The purpose of the survival analysis was to ascertain the re-purchasing propensity of
each Cluster and to provide an estimate of the likely timeframe. A generic model was
produced, based on the whole sample, along with separate models for each of the
clusters 1-5. Cluster 6 was excluded from the survival analysis on account of its small
size.
   The variables used were the same as in the cluster analysis. The difference between
age at start of the relationship and current age represents the duration of the
relationship with the company, which is captured in the difference between the two
measures in several of the models. Age is thus reflected by either age at the start of the
relationship or current age. These two variables were found to be highly correlated,
and it may have been sufficient to use age at the start of the relationship alone. The
decision was taken to retain current age in the analysis for three reasons. First, it had
already been used in the segmentation analysis to identify age-based lifestage
segments; second, age is a useful descriptor for the subsequent profiling of the
segments; third, there were few other customer characteristics available in the
database, and it seemed desirable to retain as much customer information as possible.
   An extra variable in the analysis of Clusters 2-5 was the particular product
purchased. It could not be used in either the generic model or the model for Cluster 1
because of instability in the estimation of the parameters involved. The fitting
approach was forward stepwise selection: at each stage one variable was added to the
model, until no further variable proved to be significant.
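The logic of forward stepwise selection can be sketched as follows. This is an illustration only: `fit_statistic` is a hypothetical caller-supplied function standing in for a real model fit (for example, twice the maximised log partial likelihood), the toy gains are invented, and the 3.841 threshold assumes a likelihood ratio test at the 5 percent level in which each variable adds a single parameter. The paper does not state which test was applied.

```python
CHI2_1DF_5PCT = 3.841  # chi-square critical value, 1 degree of freedom, 5 percent level

def forward_stepwise(candidates, fit_statistic, threshold=CHI2_1DF_5PCT):
    """Add the best remaining variable until none improves the fit significantly.

    fit_statistic(variables) is a hypothetical callable returning a
    goodness-of-fit value for a model containing those variables.
    """
    selected = []
    remaining = list(candidates)
    current = fit_statistic(selected)
    while remaining:
        # Improvement in fit from adding each remaining variable on its own.
        gains = {v: fit_statistic(selected + [v]) - current for v in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < threshold:
            break  # no further variable is significant
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected

# Toy stand-in for a model fit: each variable contributes a fixed, made-up amount.
toy_gain = {"Age": 25.0, "Age.Sdt": 12.0, "Gender": 5.0, "Cur.Mrtl": 1.5}

def toy_fit(variables):
    return sum(toy_gain[v] for v in variables)

print(forward_stepwise(toy_gain, toy_fit))
```

A categorical variable such as marital status would add several parameters at once, so a real implementation would compare its gain against a critical value with the matching degrees of freedom.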
   Table II provides a summary of the analysis of the fitting. All models produced a
significant fit to the data at the 5 percent level of significance, with Cluster 4 the
weakest at 3.2 percent. This reflects the size of the sample and hence of the clusters:
                            Cluster 4 was the smallest after Cluster 6 had been withdrawn from the analysis. Using
                            the AUROC criterion, all provide reasonable prediction, with Cluster 3 weakest and the
                            general model performing best.
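For readers unfamiliar with the AUROC criterion, a minimal sketch of its calculation is given here. It treats the outcome as binary (second purchase made or not) and uses invented scores; how the authors derived scores and outcomes from the fitted survival models is not described in this section.

```python
def auroc(scores, outcomes):
    """Probability that a randomly chosen positive case outscores a negative one."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Made-up model scores and outcomes (1 = made a second purchase).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
repurchased = [1, 1, 0, 1, 0, 0]
print(auroc(scores, repurchased))
```

A value of 0.5 corresponds to a model that ranks customers no better than chance, and 1.0 to a model that ranks every repurchaser above every non-repurchaser.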
                               The variables included in each model are shown in Table III. The analysis clearly
                            indicates that cluster behaviour differs in terms of the effects of variables on the
                            likelihood of a second purchase being made. The results for Clusters 1-3 are similar.
                            Current age has a positive effect on likely subsequent purchase, while age at first
                            purchase decreases the potential. This is explained partly by the collinearity of these
                            two variables. The finding means that the older the current age of a customer, the
                            greater the likelihood that a second purchase will be made. However, the older the
                            customer at the time of the first purchase from the company, the lower the likelihood of
                            a second purchase. This reinforces the belief that it will be beneficial to make efforts to
                            attract younger customers, with the aim of developing longer term relationships with
                            them.
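The opposite signs of the two age effects can be made concrete with a small sketch of how a Cox model turns coefficients into relative hazards. The coefficient values below are hypothetical, chosen only to match the signs reported above; they are not the estimates in Table III.

```python
import math

# Hypothetical coefficients: positive for current age, negative for age at first purchase.
beta = {"current_age": 0.05, "age_at_first_purchase": -0.04}

def relative_hazard(customer, reference):
    """Hazard of a second purchase for `customer` relative to `reference`."""
    linear = sum(beta[k] * (customer[k] - reference[k]) for k in beta)
    return math.exp(linear)

reference = {"current_age": 45, "age_at_first_purchase": 30}
older_now = {"current_age": 55, "age_at_first_purchase": 30}
older_at_start = {"current_age": 45, "age_at_first_purchase": 40}

print(relative_hazard(older_now, reference))       # above 1: more likely to repurchase
print(relative_hazard(older_at_start, reference))  # below 1: less likely to repurchase
```

Because the baseline hazard cancels in the ratio, the comparison between two customers depends only on the coefficients and the differences in their characteristics.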
                               The analysis suggests that being married increases the likelihood of a second
                            purchase, and being female decreases it. This is consistent with previous research,
                            which has found men to be more financially involved than women, in general
                            (Harrison, 1997). Indeed, research commissioned by the Financial Services Authority
                            (2001) found that, while the take-up of financial products by women had increased
                            throughout the 1990s, their level of involvement generally was still below that of their
                            male counterparts. This is more pronounced for married and co-habiting women, due
                            to the devolution of financial responsibility to the husband or partner when part of a
                            couple.
                               Financial ACORN exhibits an effect on Clusters 1 and 2, indicating that the greater
                            the financial sophistication, the greater the likelihood of repurchase. The results are the
[Figure 4. Survival curve for the generic model. Vertical axis: survival probability. Horizontal axis: time in days (0 to 15,000).]
[Figure 5. Survival curve for cluster 5. Vertical axis: survival probability. Horizontal axis: time in days (0 to 8,000).]
                         Discussion
                         The survival analysis has shown similarities in behaviour among “Moderately
                         Financially Active Adults”, “Financially Involved Adults” and “Sophisticated Early
                         Starters” in that a proportion of those segments are unlikely to return to make another
                         purchase. For those who do return, the timeframe within which repurchase is likely to
                         take place is estimated to be about six years. While this may seem to be a long interval
                         at first sight, financial products are not normally purchased on a frequent basis. One
                         exception would be general insurance cover renewed yearly, but mortgages are
                         renewed or revised much less frequently and pensions may be purchased only once in a
                         lifetime.
                             “Sophisticated Middle Agers” and “Financially Moderate Seniors” displayed
                         convergent behaviour patterns, but differed markedly from the other three
                         segments. The apparent lack of asymptotic behaviour in the survival curves for
these two suggests that these individuals are still likely to make a further
purchase. The retention opportunity of these segments extends beyond the
timeframe of the others.
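What asymptotic behaviour in a survival curve looks like can be shown with a simple Kaplan-Meier estimate. This is a swapped-in, non-parametric technique used purely for illustration (the curves in the paper come from the fitted Cox models), and the durations below are invented.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with an observed event.

    events[i] is 1 if customer i made a second purchase at durations[i],
    and 0 if the customer had not yet repurchased (censored) at that time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        purchases = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        leaving = sum(1 for u in durations if u == t)
        if purchases:
            survival *= 1.0 - purchases / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # purchasers and censored customers leave the risk set
    return curve

# Made-up durations in days; 1 = second purchase observed, 0 = censored.
durations = [200, 500, 500, 900, 1500, 2200, 3000, 3000]
events = [1, 1, 0, 1, 1, 0, 0, 0]
for t, s in kaplan_meier(durations, events):
    print(t, round(s, 3))
```

In this toy data no purchase occurs after day 1,500, so the curve levels off well above zero: the pattern described for the three younger segments. A curve that is still falling at the end of the observation period, as reported for the two mature segments, indicates that further purchases remain likely.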
    Thus, the findings indicate that two broader groups of segments exist, in terms
of retention propensity. The “mature” segments exhibit a greater likelihood of
retention than the “younger” segments, which is both interesting and consistent
with work by Moschis et al. (1997, 2002) suggesting that mature consumers like to
build relationships with companies. Moreover, relationships are more likely to be
maintained with mature consumers because, while they may be cynical at times,
they are more likely to trust companies and their employees than younger
customers are.
    This is a significant finding in that the UK, in common with a number of other
industrialised nations, is experiencing an ageing of the population and a consequent
increase in the proportion of older people. Those over 55 are particularly attractive in
financial terms, many of them being comparatively wealthy: income rich, asset rich
and the recipients of substantial windfalls in the form of inheritance (Silman and
Poustie, 1994). Not surprisingly, they have been found to account for more than half of
all discretionary spending (The Economist, 2002).
    However, the two “mature” segments identified in this study are the smallest,
accounting for only 10 percent of the sample collectively, compared with the 31 percent
in the “Moderately Financially Active Adults” segment alone. This has clear
implications for marketing strategy.
Managerial implications
The highly competitive nature of the financial services sector requires its marketing
planners to find effective approaches for maintaining relationships with customers.
Unlike other areas of consumption, the time lag between purchases is often relatively
long and the needs of customers tend to vary with lifestage. In such a context, the
importance of maintaining a relationship with customers is paramount, since the cost
of “cross-selling” and “up-selling” to existing customers is likely to be less than that of
acquiring new customers.
   This paper has shown that, by using a technique such as survival analysis, it is
possible to ascertain not only the likelihood of subsequent purchases being made but
also the timeframe in which that is likely to occur. In the first instance, it is important
for marketers in financial institutions to understand the relative likelihood that a
customer or set of customers will or will not re-purchase. This can be achieved at the level
of the entire customer base, or for specific customer groups if it is known that segment
differences exist. Such knowledge forms a sound basis for decisions about the
allocation of targeting effort.
   Survival analysis can further estimate the timeframe in which re-purchasing may
take place. For marketing planners, this is especially important in identifying windows
of opportunity for effective marketing communications. The consequences of being
“too early” or “too late” in this respect are well understood by marketers. If initiatives
can be timed to reach customers or prospects when they are likely to be in their “ready
to buy” phase, the impact will be enhanced and wastage will be reduced.
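One way of turning an estimated survival curve into a communication window is to read off the times at which chosen proportions of customers have repurchased. The sketch below assumes a curve supplied as (time, survival probability) pairs; the figures and the 10 and 50 percent cut-offs are invented for illustration and are not taken from the study.

```python
def time_to_reach(curve, level):
    """First time at which the survival probability falls to `level` or below.

    `curve` is a list of (time_in_days, survival_probability) pairs in time order.
    Returns None if the curve never falls that far.
    """
    for t, s in curve:
        if s <= level:
            return t
    return None

# Made-up survival curve for one segment.
curve = [(0, 1.00), (365, 0.92), (1095, 0.78), (2190, 0.55), (2920, 0.48), (3650, 0.47)]

window_start = time_to_reach(curve, 0.90)  # 10 percent have repurchased
window_end = time_to_reach(curve, 0.50)    # half have repurchased
print(window_start, window_end)
```

A result of `None` for a given level signals a segment in which that proportion of customers is never expected to return within the observed period.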
   Owing to the relatively small number of customers holding three, four or five of one
company’s products among the almost 11,000 whose data were analysed in this study,
it was not possible to conduct a meaningful analysis of transactions after the second
purchase. In other circumstances, given the availability of the necessary data, it would
be feasible to estimate the time to subsequent repurchasing.
    The current study has not considered what the second purchase might be, but that
is the subject of ongoing work by the authors. This analysis could be dealt with in
several ways, for example, by applying a competing risk model or by treating it as a
semi-Markovian problem, with times between states modelled using parametric and
non-parametric distributions with an estimated transition matrix.
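The transition-matrix element of such an approach could be estimated, in its simplest form, by counting observed moves from one product to the next. The sketch below is illustrative only: the purchase histories and product names are invented, and it covers the transition probabilities alone, not the distribution of times between states that a semi-Markov model would also require.

```python
from collections import Counter, defaultdict

def transition_matrix(sequences):
    """Estimate P(next product | current product) from purchase sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each consecutive pair of purchases in a customer's history.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }

# Made-up purchase histories; product names are illustrative only.
histories = [
    ["pension", "investment"],
    ["pension", "life"],
    ["pension", "investment", "life"],
    ["life"],
]
matrix = transition_matrix(histories)
print(matrix["pension"])
```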
       Conclusion
       The objective of the study reported here was to use information gathered from a data
       warehouse to develop insights into a customer base and to explore the marketing
       opportunities that might arise. Based on a large sample of the customers of an
       international insurance company, the paper illustrates the application of the Survival
       analysis method to the study of cross-selling. Cox’s proportional hazard regression has
       several advantages over other similar regression techniques, chief among which is the
       treatment of time as a continuum rather than as discrete episodes, which reflects the
       reality of a customer relationship.
          The first stage of the analysis used standard clustering approaches to segment the
       sample into identifiable sub-groups, and found that those exhibited different buying
       behaviours, which were potentially the basis for decisions about the most appropriate
       marketing campaign strategy for each one. The study also identifies the variables or
       characteristics within the identified market segments that have the greatest effect on
       the likelihood of repurchasing, thus allowing closer targeting of likely prospects for
       cross-selling initiatives.
       References
       Agrawal, R. and Srikant, R. (1995), “Mining sequential patterns”, Proceedings of the 11th
              International Conference of Data Engineering (ICDE).
       Ansell, R.O. and Ansell, J.I. (1987), “Modelling the reliability of sulphur sodium batteries”,
              Reliability Engineering, Vol. 17, pp. 127-37.
       Ansell, J.I. and Philips, M.J. (1994), Practical Methods for Reliability Data Analysis, Clarendon
              Press, Oxford.
       Blodgett, L.L. (1992), “Research notes and communications factors in the instability of
              international joint ventures: an event history analysis”, Strategic Management Journal,
              Vol. 13 No. 6, pp. 475-81.
       Chen, K.C.W. and Lee, C-H.J. (1993), “Financial ratios and corporate endurance: a case of the oil
              and gas industry”, Contemporary Accounting Research, Vol. 9 No. 2, pp. 667-94.
       Collett, D. (1994), Modelling Survival Data in Medical Research, Chapman and Hall, London.
       Cox, D.R. (1972), “Regression models and life tables”, Journal of the Royal Statistical Society, Series B,
              Vol. 34, pp. 187-220.
       Evans, M. (2002), “Prevention is better than cure: redoubling the focus on customer retention”,
              Journal of Financial Services Marketing, Vol. 7 No. 2, pp. 186-98.
       Felvey, J. (1982), “Cross-selling by computer”, Bank Marketing, pp. 25-7.
       Financial Services Authority (2001), “Women and personal finance: the reality of the gender
              gap”, Consumer Research, Vol. 7, April.
Hand, D.J. (1997), Construction and Assessment of Classification Rules, Wiley, Chichester.
Harrison, T. (1997), “Mapping customer segments for personal financial services: replication and
      validation”, Journal of Financial Services Marketing, Vol. 2 No. 1, pp. 39-54.
Hebden, J.J. and Pickering, J.F. (1974), “Patterns of acquisition of consumer durables”, Oxford
      Bulletin of Economics and Statistics, Vol. 36, pp. 67-94.
Kamakura, W.A., Ramaswami, S.N. and Srivastava, R.K. (1991), “Applying latent trait analysis
      in the evaluation of prospects for cross-selling of financial services”, International Journal
      of Research in Marketing, Vol. 8, pp. 329-49.
Kamakura, W.A., Wedel, M., de Rossa, F. and Mazzon, J.A. (2003), “Cross-selling through
      database marketing: a mixed data factor analyzer for data augmentation and prediction”,
      International Journal of Research in Marketing, Vol. 20 No. 1, pp. 45-65.
Kasulis, J.L., Lusch, R.F. and Stafford, E.F. Jr (1979), “Consumer acquisition patterns for durable
      goods”, Journal of Consumer Research, Vol. 6, pp. 47-57.
Moschis, G.P., Lee, E. and Mathur, A. (1997), “Targeting the mature market: opportunities and
      challenges”, Journal of Consumer Marketing, Vol. 14 No. 4, pp. 282-93.
Moschis, G., Bellenger, D. and Curasi, C. (2002), “Financial service preferences and patronage
      motives of older consumers”, Journal of Financial Services Marketing, Vol. 7 No. 4.
Paas, L.J. (1998), “Mokken scaling characteristic sets and acquisition patterns of durable and
      financial products”, Journal of Economic Psychology, Vol. 19 No. 3, pp. 353-76.
Paas, L.J. (2001), “Acquisition patterns of products facilitating financial transactions: a
      cross-national investigation”, International Journal of Bank Marketing, Vol. 19 No. 7,
      pp. 266-75.
Prentice, R.L., Williams, B.J. and Peterson, A.V. (1981), “On regression analysis of multivariate
      failure data”, Biometrika, Vol. 68, pp. 373-9.
Prinzie, A. and Van den Poel, D. (2006), “Investigating purchasing-sequence patterns for financial
      services using Markov, MTD and MTDg models”, European Journal of Operational
      Research, Vol. 170 No. 3.
Punj, G. and Stewart, D.W. (1983), “Cluster analysis in marketing research: review and
      suggestions for application”, Journal of Marketing Research, Vol. 20, pp. 134-48.
Silman, R. and Poustie, R. (1994), “What they eat, buy, read and watch”, Admap, July/August,
      pp. 25-8.
Soutar, G.N. and Cornish-Ward, S.T. (1997), “Ownership patterns for durable goods and financial
      assets: a Rasch analysis”, Applied Economics, Vol. 29 No. 11, pp. 903-11.
Srivastava, R. and Shocker, A.D. (1987), “Strategic challenges in the financial services industry”,
      in Pettigrew, A. (Ed.), The Management of Strategic Change, Basil Blackwell, Oxford.
Staber, U.H. (1992), “Organizational interdependence and organizational mortality in the
      cooperative sector: a community ecology perspective”, Human Relations, Vol. 45 No. 11,
      pp. 1191-212.
Stafford, E.F., Kasulis, J.J. and Lusch, R.F. (1982), “Consumer behaviour in accumulating
      household financial assets”, Journal of Business Research, Vol. 10, pp. 397-417.
Stepanova, M. and Thomas, L. (2002), “Survival analysis methods for personal loan data”,
      Operations Research, Vol. 50, pp. 277-89.
Stewart, K. (1998), “An exploration of customer exit in retail banking”, International Journal of
      Bank Marketing, Vol. 16 No. 1, pp. 6-14.
The Economist (2002), “Over 60 and overlooked”, The Economist, US Edition, 10 August.
Thomas, L.C., Edelman, D.B. and Crook, J.N. (2002), Credit Scoring and its Applications, SIAM
      (Society for Industrial and Applied Mathematics), Philadelphia, PA.
Zeithaml, V.A. (2000), “Service quality, profitability and the economic worth of customers: what
      we know and what we need to learn”, Journal of the Academy of Marketing Science, Vol. 28
      No. 1, pp. 67-85.
Appendix
Table AI. (percentages within each cluster, except mean ages in years)

                                                      Cluster number
Characteristic                               1       2       3       4       5       6
Cluster size
Proportion of sample                        29      31      18       7      11       3
Financial ACORN
A – Financially sophisticated                            100.0           100.0   100.0
B – Financially active                   100.0                    37.5
C – Financially moderate                          40.6            44.9
D – Financially inactive                          20.6             5.6
Unclassified                                      38.8            12.0
Current age
Mean age                                  45.7    45.1    43.9    64.8    50.9    65.0
Age at first purchase
Mean age at first purchase                33.4    31.7    26.7    60.8    42.3    62.6
Gender representation
Male                                      61.5    63.9    66.7    57.2    58.3    57.5
Female                                    38.5    36.1    33.3    42.8    41.7    42.5
Current marital status
Single                                     7.0     8.1     7.1     1.3     3.3     0.2
Married                                   18.4    15.4    16.6    12.1    24.0     7.6
Separated                                  0.4     0.4     0.4     0.1     0.5     0.3
Divorced                                   0.9     0.9     0.8     1.5     1.4     0.3
Widowed                                    0.2     0.1     0.1     1.2     0.2     0.8
Unclassified                              73.0    75.1    75.0    83.8    70.6    90.7
No of purchases with the same company
1 product                                 53.5    58.4    36.2    74.9    60.3    80.5
2 products                                11.2    10.6    10.1    13.4    13.2    10.3
3 products                                12.2     9.9    14.6     5.9    11.3     6.2
4 products                                11.6     9.9    19.0     3.4     8.6     1.6
5 or more products                        10.6    10.4    19.1     2.0     6.2     1.0
            Corresponding author
            Tina Harrison can be contacted at: tina.harrison@ed.ac.uk