
Food Quality and Preference 67 (2018) 49–58

journal homepage: www.elsevier.com/locate/foodqual
Comparison of rate-all-that-apply (RATA) and check-all-that-apply (CATA) questions across seven consumer studies

Leticia Vidal a,⇑, Gastón Ares a, Duncan I. Hedderley b, Michael Meyners c, Sara R. Jaeger b

a Sensometrics & Consumer Science, Instituto Polo Tecnológico de Pando, Facultad de Química, Universidad de la República, By Pass de Rutas 8 y 101 s/n, C.P. 91000 Pando, Canelones, Uruguay
b The New Zealand Institute for Plant & Food Research Ltd., 120 Mt Albert Road, Private Bag 92169, Auckland, New Zealand
c Procter & Gamble Service GmbH, 65824 Schwalbach am Taunus, Germany
⇑ Corresponding author. E-mail address: lvidal@fq.edu.uy (L. Vidal).

Article info
Article history: Received 22 July 2016; Received in revised form 10 October 2016; Accepted 17 December 2016; Available online 21 December 2016.
Keywords: Research methodology; Sensory characterization; Consumer profiling

Abstract
Rate-all-that-apply (RATA) questions are a variation of check-all-that-apply (CATA) questions in which consumers are asked to indicate whether terms from a list apply to describe a given product, and if they do so, to rate their intensity. RATA questions have been argued to provide more insights than CATA questions for sensory characterization with consumers. The present research is, to date, the most exhaustive comparison of CATA and RATA with regard to term usage, sample discrimination and sample configurations. A total of seven studies with 860 consumers were conducted with different product categories. A between-subjects design was used in all studies to compare the two methodologies. Confirming past studies, results from RATA and CATA were very similar. Minor differences between RATA and CATA were found, but were study and term specific, and general superiority of one methodology over the other was not established, as opposed to what previous studies had suggested. Instead, results indicate that each method might have advantages over the other for certain product characteristics. A strong linear relationship was established between mean RATA scores and CATA term citation frequencies, demonstrating clearly that CATA questions differentiate among samples based on relative strength/weakness of sample characteristics. Collecting data as RATA but analysing them as CATA was inferior to the use of mean RATA scores, and is not recommended. The comparison of RATA data using mean scores and Dravnieks' scores showed no advantage of the latter and it is recommended that simple mean scores are used. Overall, results from the present work show that RATA is not necessarily an improvement over CATA questions and that for consumer research the decision to add an attribute intensity rating step depends on the aim of the study and the specific characteristics of the sample set.
© 2016 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.foodqual.2016.12.013

1. Introduction

Sensory product characterisation is a cornerstone activity in sensory and consumer science (Lawless & Heymann, 2010). Interest in alternative methodologies for sensory characterisation with consumers has substantially increased in the last decade as the line between trained assessors and consumers has blurred (Ares, 2015; Meiselman, 2013; Varela & Ares, 2012). For many products, consumers have been reported to provide sensory spaces that are highly similar to those obtained using descriptive analysis with trained assessors (Ares et al., 2015; Cadena et al., 2014; Dehlholm, Brockhoff, Mejnert, Aaslyng, & Bredie, 2012; Dooley, Lee, & Meullenet, 2010; Dos Santos et al., 2015; Lelièvre, Chollet, Abdi, & Valentin, 2008; Moussaoui & Varela, 2010).

Check-all-that-apply (CATA) questions, a methodology in which consumers are presented with a list of terms and asked to select all those that apply to the focal sample, have become one of the most popular approaches for sensory product characterization with consumers (Ares & Jaeger, 2015). The structured format of CATA questions enables data collection and analysis from large consumer samples easily and quickly (Ares & Varela, 2014). Research has shown that data from CATA questions are valid and repeatable (Ares et al., 2014; Ares et al., 2015; Jaeger et al., 2013) and that they are not likely to bias hedonic scores (Jaeger & Ares, 2014; Jaeger, Giacalone, et al., 2013).

The simplicity of CATA questions is also a potential limitation. The binary response format does not allow direct measurement of the intensity of sensory attributes, which could potentially hinder discrimination of samples with subtle sensory differences (Meyners, Jaeger, & Ares, 2016). In order to overcome this limitation, approaches that combine CATA data and intensity measurements are emerging (e.g., Reinbach, Giacalone, Ribeiro, Bredie, & Frøst, 2014). In this research focus is directed to rate-all-that-apply (RATA) questions, in which consumers are asked to indicate whether terms from a list apply to describe the focal sample, and if they do so, to rate their intensity using a 3- or 5-point scale (Ares, Bruzzone, et al., 2014).

Comparison of sensory characterizations performed using CATA and RATA questions has shown that both methodologies provide similar information about samples (Ares, Bruzzone, et al., 2014; Reinbach et al., 2014). The validity and reproducibility of RATA data has also been confirmed and recommendations for data analysis are emerging (Giacalone & Hedelund, 2016; Meyners et al., 2016).

However, RATA questions are still an under-explored variant of CATA questions and their potential to improve sample discrimination remains unconfirmed. Ares, Bruzzone et al. (2014) reported evidence of greater discriminative capacity for RATA questions compared to the simple CATA questions: the percentage of terms for which significant differences among samples were identified was higher for the RATA question variant compared to the CATA question variant in three of four consumer studies. However, reported comparisons are limited so far and the potential superiority of RATA questions still needs to be proven, particularly considering the increasing popularity of the methodology (Franco-Luesma et al., 2016; Oppermann, de Graaf, Scholten, Stieger, & Piqueras-Fiszman, 2017; Waehrens, Zhang, Hedelund, Petersen, & Byrne, 2016). In this context, the present research expands the methodological comparison of CATA and RATA questions, focusing on term use, sample discrimination and sample configurations. Different approaches to the analysis of RATA data are also considered in order to provide recommendations for practitioners on this topic. Overall, the present research contributes further insight on the differences between CATA and RATA questions, as well as guidelines for the implementation of RATA questions.

2. Materials and methods

Seven studies involving different product categories were conducted with a total of 860 consumers. A between-subjects design was used in all studies to compare product sensory characterizations obtained using RATA and CATA questions. Table 1 provides an overview of the studies.

2.1. Participants

Seven consumer studies were conducted, each involving 102–203 participants, about half of which were presented with a RATA task and the others with a CATA task (Table 1). Studies 1–6 were conducted in Auckland (New Zealand), whereas Study 7 was conducted in Montevideo (Uruguay).

In New Zealand participants were registered on a database maintained by a professional recruitment firm and were screened in accordance with eligibility criteria for each of the studies. In Uruguay participants were recruited from the consumer database of the Sensometrics & Consumer Science research group of Universidad de la República (Uruguay), based on their consumption of the focal products. In all the studies participants gave informed consent and were compensated for their participation.

Participants were aged between 18 and 71 years old and the percentage of female participants ranged from 33% to 71%. The consumer samples comprised varying household compositions, income levels, education levels, etc. but were not representative of the general populations in Auckland and Montevideo.

2.2. Samples

Six product categories were tested (Table 1). In Studies 1–2 samples were advanced selections grown under commercial conditions and commercially available apple cultivars. In Studies 3, 4, 6 and 7 samples were commercially available in New Zealand or Uruguay and were purchased from local supermarkets. In Study 5 samples were raspberry coulis made from frozen berries purchased at a local supplier (Gilmores Wholesale Food and Beverage Supplies, Auckland, NZ). Defrosted berries were pulped in a Cuisinart blender. Samples were created by manipulation of sweetness (added sucrose to 3% or 6%), berry flavour (1% sucrose and 0.2% Boysenberry essence from Blue Pacific), or acidity (added malic and tartaric acids (Sigma Aldrich, Saint Louis, MO), 0.1% and 0.4%, respectively). Coulis samples were measured into 150 g quantities, frozen and defrosted prior to assessment.

Serving sizes were always sufficient to allow 2–3 bites/sips per sample in all studies. Samples were always presented in cups labelled with 3-digit random codes at room temperature, except for Study 7, where the orange-flavoured drink samples were presented at 10 °C.

Table 1
Overview of the seven studies comparing rate-all-that-apply (RATA) and check-all-that-apply (CATA) questions for product sensory characterization by consumers.

Study ID | Consumers completing the RATA task | Consumers completing the CATA task | Product category | Number of samples | Number of sensory terms
1 | 56 | 54 | Apple | 4 | 16
2 | 52 | 53 | Apple | 4 | 16
3 | 56 | 59 | Peanuts | 3 | 12
4 | 60 | 56 | Tinned pineapple | 4 | 12
5 | 101 | 102 | Raspberry coulis | 5 | 12
6 | 54 | 55 | Fruitcake | 5 | 12
7 | 53 | 49 | Orange-flavoured powdered drinks | 4 | 16

2.3. Experimental treatments, sensory terms and data collection

The procedure for data collection in all studies was similar. Between-subjects experiments were always used to compare responses from two experimental treatments: RATA and CATA questions. Approximately half of the participants were randomly assigned to each of the experimental treatments (Table 1). In all studies, no significant differences between the experimental groups were established in terms of age, gender, and frequency of consumption of the focal products (p > 0.10).

The experimental groups using CATA questions for sample evaluation were asked to check all the terms that they considered appropriate to describe each sample. Consumers who used RATA questions were asked to check the terms they considered appropriate for describing samples and then to rate the intensity of the applicable terms using a 3-point structured scale ('low', 'medium' and 'high'). Consumers were told that they had to leave the scale blank in the case of non-applicable terms.

The sensory terms used in each study were selected based on pilot work or previous research using the same product categories. The lists of terms comprised 12 or 16 terms and covered multiple sensory modalities (appearance, aroma, flavour/taste, texture, aftertaste, mouthfeel) (available upon request). Based on recommendations by Ares, Etchemendy, et al. (2014), the order in which the terms were listed for both CATA and RATA questions was different for each product and each participant, following a Williams' Latin Square design.

Products were presented sequentially following Williams' designs. Data collection took place in standard sensory booths, under white lighting, controlled temperature (20–23 °C) and airflow conditions.

2.4. Data analysis

In accordance with the stated aim, and drawing on past methodological research on CATA and related methods, a number of analyses were performed, focusing on sample characterization, discrimination, sample configurations and stability of the results. These analyses were performed using data from CATA and RATA questions.

RATA questions enabled two approaches to analysis: converting RATA data to CATA (RATA-as-CATA) by collapsing responses to two levels (0 if the attribute was not selected as applicable for describing the focal sample or 1 if the attribute was selected as applicable, regardless of its intensity rating), or treating RATA data as continuous (RATA scores) by expanding the scale to 4 points (0–3) (Meyners et al., 2016).

2.4.1. Term usage

For each study, frequency of use of each term for each sample in CATA and RATA-as-CATA questions was calculated by counting the number of participants who selected the term for describing each sample. Fisher's exact test (Fisher, 1954) was used to evaluate the existence of significant differences between CATA and RATA questions at the aggregate level and for each of the terms.

The frequency distribution of RATA scores at the aggregate level was determined. The mean RATA scores for each term and sample were graphed as a function of the percentage of consumers who selected the term using CATA questions. The R² coefficient of the linear correlation was calculated.

2.4.2. Significant differences among samples

For CATA questions and RATA-as-CATA data, Cochran's Q test (Manoukian, 1986) was carried out to identify significant differences among samples for each of the sensory terms. Pairwise comparisons were performed using the sign test, as proposed by Meyners, Castura, and Carr (2013) and Meyners and Castura (2014).

RATA scores were analysed following the recommendations of Meyners et al. (2016). ANOVA was performed considering sample and consumer as fixed effects. Besides, the t statistic for all pairwise comparisons of samples based on the pooled variances was computed.

2.4.3. Sample configurations

Correspondence analysis (CA) was performed on the frequency table of CATA and RATA-as-CATA data. CA was performed considering chi-square distances, as recommended by Vidal, Tárrega, Antúnez, Ares, and Jaeger (2015). Sample configurations were not obtained in Study 3, as only 3 samples were evaluated. For RATA scores, sample configurations were obtained using Principal Component Analysis (PCA) on both the arithmetic mean values and Dravnieks' scores (Dravnieks, 1982). For each attribute and sample, Dravnieks' scores are calculated as the square root of the product of the proportion of consumers who selected the attribute for describing the sample and the average intensity scores of those consumers (ignoring consumers who did not select the attribute). The rationale for considering Dravnieks' scores is that they may provide a better summary of the intensity scores obtained in a RATA question than arithmetic means, as they balance frequency of use and average intensity by only taking into account the scores given by those consumers who considered that the attribute was applicable for describing the focal sample and weighing this with the usage rate.

Confidence ellipses around samples were constructed using a truncated total bootstrapping approach in which only the first two dimensions of the configurations were considered (Cadoret & Husson, 2013). Similarity between the sample and term configurations in the first two dimensions, obtained using data from CATA and RATA questions, was evaluated using the RV coefficient (Robert & Escoufier, 1976). In order to visually compare the configurations, a Procrustes rotation was used considering the configuration obtained using CATA questions as reference.

2.4.4. Stability of the results

The stability of the results was evaluated using a bootstrapping re-sampling approach (Ares, Tárrega, Izquierdo, & Jaeger, 2014). The bootstrapping process consisted of extracting random subsets of different size (m = 5, 10, 15, 20, …, N) from the original data with N consumers, using sampling with replacement. For each m, 1000 random subsets were obtained. Sample configurations were obtained for each subset. The agreement between sample and term configurations in the first two dimensions of the configuration and the reference configuration (obtained with all the consumers) was evaluated by computing the RV coefficient between their coordinates. Average values for the 1000 random subsets of size equal to the total number of consumers in each study (N) were calculated and used as an index of stability.

All statistical analyses were performed using R language (R Core Team, 2015). FactoMineR was used to perform CA, PCA and to calculate RV coefficients (Lê, Josse, & Husson, 2008).
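To make the two RATA codings and the Dravnieks' summary described above concrete, a minimal R sketch is given below. It is an illustration written for this text rather than the authors' original scripts; the object rata_mat and the simulated scores are hypothetical stand-ins for one product evaluated by one consumer group.

# Minimal sketch (not the authors' code): one product, one study.
# rata_mat: consumers x terms matrix of RATA scores,
#           0 = not applicable, 1 = low, 2 = medium, 3 = high.
set.seed(1)
rata_mat <- matrix(sample(0:3, 60 * 12, replace = TRUE,
                          prob = c(0.6, 0.15, 0.15, 0.1)),
                   nrow = 60, ncol = 12,
                   dimnames = list(NULL, paste0("term", 1:12)))

# (1) RATA-as-CATA: collapse to binary applicability (0/1).
rata_as_cata <- (rata_mat > 0) * 1

# (2) RATA scores: arithmetic mean per term, non-selected terms counted as 0.
mean_scores <- colMeans(rata_mat)

# (3) Dravnieks' score per term: square root of the product of the proportion
#     of consumers who selected the term and the mean intensity among those
#     consumers only.
dravnieks <- apply(rata_mat, 2, function(x) {
  p <- mean(x > 0)
  m <- if (any(x > 0)) mean(x[x > 0]) else 0
  sqrt(p * m)
})

round(rbind(citation  = colMeans(rata_as_cata),
            mean_rata = mean_scores,
            dravnieks = dravnieks), 2)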

3. Results

3.1. Frequency of use of sensory terms and RATA scores

In six of the seven studies consumers used a significantly larger number of terms (p < 0.0001) for describing samples when using RATA questions compared to CATA questions (Table 2a). In these studies, the percentage of terms for which frequency of use significantly increased ranged from 38% to 67% (Table 2b). The average increase in the frequency of use of the terms ranged between 17% and 55% (Table 2c). An exception to these trends was Study 3, in which no significant differences in the frequency of use of the terms between CATA and RATA questions were found, both at the aggregate level and for all the individual terms (Table 2a and c). Compared to the rest of the studies, this study included fewer samples with larger differences (3 peanut samples: dry roasted, honey coated and salted).

For completeness, the distribution of RATA intensity scores is shown in Table 2d. In six of the seven studies (except Study 5) the middle point of the intensity scale (2: 'medium') was the most frequently used, reaching an average frequency of use that ranged from 13% to 21%. However, the 'low' intensity anchor was almost as frequently used.

Table 2
Summary of results regarding term usage for sensory characterizations by consumers obtained with CATA and RATA questions across seven studies.

Study | (a) Average % of terms used per sample (CATA / RATA)* | (b) % of terms with significantly increased frequency of use in RATA vs. CATA | (c) Average increase in frequency of use, RATA vs. CATA | (d) Distribution of RATA scores 0/1/2/3 (%)**
1 – Apple | 27%a / 39%b | 63% | 55% | 61 / 12 / 18 / 9
2 – Apple | 26%a / 36%b | 63% | 46% | 64 / 12 / 15 / 9
3 – Peanuts | 38%a / 38%a | 0% | 10% | 62 / 14 / 16 / 8
4 – Tinned pineapple | 41%a / 51%b | 58% | 33% | 51 / 18 / 21 / 10
5 – Raspberry coulis | 30%a / 36%b | 67% | 24% | 64 / 16 / 13 / 7
6 – Fruitcake | 44%a / 51%b | 42% | 17% | 49 / 19 / 20 / 12
7 – Powdered drinks | 26%a / 31%b | 38% | 36% | 69 / 8 / 13 / 11

* Within a study, percentages with different letters are significantly different at p < 0.05, according to Fisher's exact test.
** For RATA scores, 0 = 'not applicable', 1 = 'low', 2 = 'medium' and 3 = 'high'.

In all seven studies mean RATA scores were linearly correlated to the frequency of CATA term use (R = 0.87–0.93, all p < 0.001), indicating that CATA term citation frequency reflected consumers' perception of attribute intensity. This relationship is illustrated in Fig. 1a. The parameters of the linear regressions were similar in the seven studies, suggesting that the relationship between mean RATA scores and frequency of use in CATA questions was generic rather than highly study and product specific. The intercepts of the regressions did not significantly differ from 0 (−0.10 to 0.08), whereas the slopes ranged from 0.019 to 0.025. The aggregate analysis across all seven studies established a similar linear relationship (R = 0.92, p < 0.001; intercept = 0.05 and slope = 0.022). Thus, a 10% increase in CATA citation frequency corresponds to about a 0.2-point increase in attribute intensity in RATA data. Mean RATA scores corresponding to 'high' intensity were not observed in any of the seven studies, and in the aggregate analysis (Fig. 1a) 100% CATA citation frequency corresponded to a mean RATA score of 2.25. In Fig. 1b the relationship between mean RATA score and CATA citation frequency is shown for a single attribute, sweet. It was noteworthy that a linear relationship similar to those established in each study and the aggregate analysis existed (R = 0.90, p < 0.001). The parameters for this regression were similar to the other plot in Fig. 1: intercept = 0.02 and slope = 0.024.

Fig. 1. Mean score obtained for each term and sample using RATA questions as a function of the percentage of consumers who selected the terms for describing each sample using CATA questions, across the seven studies (a) and for the term Sweet across the six studies that included this term (b).
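As a worked example of the aggregate regression reported above (intercept = 0.05, slope = 0.022 per percentage point of CATA citation), the short R snippet below computes predicted mean RATA scores for a few illustrative citation frequencies; the chosen frequencies are arbitrary, and only the 100% case corresponds to a value quoted in the text (2.25).

# Illustrative use of the reported aggregate regression:
# mean RATA score ~ 0.05 + 0.022 * (CATA citation %).
citation_pct <- c(20, 50, 80, 100)           # example citation frequencies
predicted_mean_rata <- 0.05 + 0.022 * citation_pct
round(setNames(predicted_mean_rata, paste0(citation_pct, "%")), 2)
#  20%  50%  80% 100%
# 0.49 1.15 1.81 2.25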
The similarity between the sensory characterizations obtained
using CATA and RATA questions is exemplified in Fig. 2 for ran-
domly selected samples in three of the studies. Sample 1 in Study
2 (apples) was mainly described by consumers using the terms
Crisp/crunchy, Dark coloured skin, Juicy, Sweet, and Tough skin in
CATA questions (Fig. 2a). These same terms received the highest
mean scores in RATA questions. Similarly, the terms Sour and Rasp-
berry showed the highest frequency of use in CATA questions and
the highest average scores in RATA (raspberry coulis / Sample 3:
Fig. 2b), whereas the terms Raisins/sultanas, Fruity, Dense and
Doughy were the most relevant terms for describing Sample 1 in
Study 6 (fruit cake) considering both CATA frequency of use and
mean RATA scores (Fig. 2c).

Fig. 2. Frequency of use of the terms in CATA (full grey line), frequency of use of the terms in RATA (full black line) and mean RATA scores (dotted black line) for exemplar samples: (a) Sample 1 in Study 2, (b) Sample 3 in Study 5 and (c) Sample 1 in Study 6.

3.2. Sample discrimination

As shown in columns 2–4 of Table 3, the percentage of terms for which significant differences among samples were established ranged from 31% to 100% when CATA questions were used for sample evaluation, and from 50% to 100% when RATA questions were used (50–92% for RATA-as-CATA and 50–100% for RATA scores). With the exception of Study 7, the percentage of terms for which significant differences among samples were established was higher in CATA than RATA-as-CATA on a study-by-study basis. A similar pattern was not observed for the comparison of CATA and RATA scores, where the latter in 4 of 7 studies was associated with the same or higher percentage of terms with significant sample differences as CATA.

Across the seven studies, conclusions regarding significant differences among samples were similar for the majority of the terms. For 69–100% of the terms, significant differences among samples were established for both CATA and RATA questions or for neither method (columns 5–6 of Table 3). Nonetheless, some smaller systematic differences were noted. In Studies 2, 4 and 5, significant differences in 17–25% of the terms were only identified using CATA questions (column 7 of Table 3), whereas in Studies 3 and 7 RATA questions tended to identify significant differences among samples for a larger proportion of terms than CATA questions (column 8 of Table 3).

The percentage of significant pairwise comparisons ranged from 27% to 61% for CATA questions and from 30% to 63% for RATA questions (columns 9 and 11 of Table 3). In Studies 3 and 5 the percentage of significant comparisons was higher for CATA than for RATA questions. Conversely, only in Study 7 were the percentages higher for both RATA variants than for CATA (columns 9–11 of Table 3). In the remaining studies one of the RATA variants performed on par with or better than CATA. In the majority of the pairwise comparisons (64–92%) conclusions were identical for both methodologies (columns 12–13 of Table 3). In particular, both methodologies provided the same information regarding differences among samples for terms that did not significantly differ among samples (e.g., Tropical in Studies 1 and 2 (apples), Floral in Study 5 (raspberry coulis) or Bitter and Orange flavour in Study 7 (powdered drinks)) or when marked differences among samples existed (e.g., Tart/sour in Studies 1 and 2 (apples), Honey coated, Sweet and Oily in Study 3 (peanuts), or Dense in Study 6 (fruit cake)).

Despite the similarities between the discriminative ability of CATA and RATA questions, some study and term specific differences in the number of significant pairwise comparisons emerged. Table 4 shows the terms for which differences in the discriminative ability of CATA and RATA questions were found, both in terms of ability to identify significant differences among samples and the number of significant pairwise comparisons. The instances where RATA was more discriminative than CATA tended to be terms that were applicable to describe all the samples in the study, showing minimum frequency of use across samples close to 20%. This was the case for: Sweet in Study 2 (apples), Visible spices/salt in Study 3 (peanuts), Crunchy, Juicy, Pineapple flavour, Ripe, Sweet and Yellow colour in Study 4 (tinned pineapple), Sour in Study 5 (raspberry coulis), Baking spices, Fruity, Golden syrup, Moist, Raisins/sultanas and Sticky in Study 6 (fruit cake), and Sour, Intense flavour and Concentrated in Study 7 (powdered drinks). Conversely, the terms where CATA was more discriminative than RATA tended to be minor sensory characteristics that in some cases mostly applied to a sub-set of the focal samples, and/or were difficult to quantify/ambiguous. This was the case for: Tough skin in Studies 1 and 2 (apples), Bland, Uneven spread of skin colour and Lingering flavour in Study 2 (apples), Bland and Roasted in Study 3 (peanuts), Fibrous, Fresh and Mango flavour in Study 4 (tinned pineapple), Boysenberry, Fruitiness, Green/grape, Green stalks and Plum in Study 5 (raspberry coulis), Dates in Study 6 (fruit cake) and Mandarin flavour in Study 7 (powdered drinks). Most of these terms had low average intensity in the RATA task. For other terms there was no obvious explanation for the differences between RATA and CATA, which may be due to small differences among samples (e.g., Floral in Study 1, Firm in Study 2, Flavoursome in Study 4, Jammy and Strawberry in Study 5, Off-flavour in Studies 4 and 7).

An interesting observation emerged regarding the analysis of RATA data as CATA. Table 3 shows that for five of the seven studies a decrease in the percentage of terms with significant differences among samples was observed. This type of analysis also led to a lower percentage of terms with significant differences among samples compared to CATA questions for six of the seven studies. Besides, treating RATA-as-CATA also led to a decrease in the percentage of significant pairwise comparisons compared to both CATA questions and the analysis of RATA scores. This suggests that using a RATA ballot, but treating the data as CATA, is likely to be detrimental to sample discrimination.
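For readers who want to reproduce the kind of per-term tests summarized in Table 3, a self-contained R sketch of Cochran's Q test (used in Section 2.4.2 for CATA and RATA-as-CATA data) is shown below. It follows the standard definition of the statistic and uses simulated binary data; it is not the authors' code, and the pairwise sign tests and the ANOVA on RATA scores are not shown.

# Cochran's Q test for one sensory term.
# x: binary consumer x sample matrix; 1 = term checked, 0 = not checked.
cochran_q <- function(x) {
  k <- ncol(x)                 # number of samples
  col_tot <- colSums(x)        # citations of the term per sample
  row_tot <- rowSums(x)        # samples for which each consumer checked the term
  q <- k * (k - 1) * sum((col_tot - mean(col_tot))^2) /
       (k * sum(row_tot) - sum(row_tot^2))
  p <- pchisq(q, df = k - 1, lower.tail = FALSE)
  c(Q = q, df = k - 1, p.value = p)
}

# Example with a simulated 100-consumer x 4-sample matrix for one term.
set.seed(2)
term_mat <- matrix(rbinom(100 * 4, 1,
                          prob = rep(c(0.2, 0.3, 0.5, 0.6), each = 100)),
                   nrow = 100, ncol = 4)
cochran_q(term_mat)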

Table 3
Percentage of terms with significant differences among samples and of significant pairwise comparisons for CATA questions, RATA questions (RATA scores), and RATA when treated as CATA (RATA-as-CATA), at a 5% significance level. Comparisons of the results obtained using CATA and RATA questions regarding significant differences among samples and pairwise comparisons are also shown.
Column key: columns 2–4 = percentage of terms with significant differences among samples (CATA, RATA-as-CATA, RATA scores); columns 5–8 = agreement between CATA and RATA scores for these terms (both, none, CATA only, RATA scores only); columns 9–11 = percentage of significant pairwise comparisons (CATA, RATA-as-CATA, RATA scores); columns 12–15 = agreement between CATA and RATA scores for the pairwise comparisons (both, none, CATA only, RATA scores only).

Study | CATA | RATA-as-CATA | RATA scores | Both | None | CATA only | RATA scores only | CATA | RATA-as-CATA | RATA scores | Both | None | CATA only | RATA scores only
1 – Apple | 69 | 63 | 69 | 56 | 19 | 13 | 13 | 33 | 33 | 38 | 26 | 55 | 7 | 11
2 – Apple | 75 | 50 | 63 | 56 | 19 | 19 | 6 | 35 | 22 | 31 | 24 | 68 | 11 | 6
3 – Peanuts | 83 | 75 | 92 | 83 | 8 | 0 | 8 | 61 | 53 | 61 | 50 | 28 | 11 | 11
4 – Tinned pineapple | 100 | 75 | 83 | 83 | 0 | 17 | 0 | 46 | 32 | 51 | 31 | 33 | 15 | 21
5 – Raspberry coulis | 75 | 58 | 50 | 50 | 25 | 25 | 0 | 36 | 29 | 30 | 23 | 57 | 13 | 8
6 – Fruit cake | 100 | 92 | 100 | 100 | 0 | 0 | 0 | 55 | 53 | 63 | 46 | 28 | 9 | 17
7 – Powdered drinks | 31 | 50 | 50 | 25 | 44 | 6 | 25 | 27 | 34 | 34 | 18 | 56 | 9 | 17

Table 4
Overview of differences in sample discrimination by CATA and RATA questions, both in terms of ability to identify significant differences among samples and the number of significant pairwise comparisons, for each of the seven consumer studies. For each term, the maximum and minimum frequency of use across samples (%) for CATA questions and the maximum and minimum mean score across samples for RATA questions are shown between brackets (CATA range / RATA range).

Study 1 – Apple
Greater discrimination in CATA: Astringent/drying (15–42/0.1–0.3), Pink-ish coloured skin (2–22/0.3–0.5), Tough skin (30–65/1.1–1.7)
Greater discrimination in RATA: Firm (7–44/0.4–1.2), Floral (20–28/0.3–0.9), Green/grassy (6–16/0.1–0.4)

Study 2 – Apple
Greater discrimination in CATA: Bland (6–23/0.2–0.5), Firm (17–38/0.4–0.8), Lingering flavour (11–32/0.4–0.5), Tough skin (19–53/0.5–1.1), Uneven spread of skin colours (11–45/0.4–0.9)
Greater discrimination in RATA: Pink-ish coloured skin (9–17/0.3–0.6), Sweet (26–72/0.8–1.8)

Study 3 – Peanuts
Greater discrimination in CATA: Bland (0–58/0.1–1.3), Crunchy (80–98/1.7–2.1), Roasted (58–85/1.0–1.4)
Greater discrimination in RATA: Visible spices/salt (51–66/0.8–1.4)

Study 4 – Tinned pineapple
Greater discrimination in CATA: Fibrous (21–41/0.9–1.2), Fresh (16–36/0.6–0.7), Mango flavour (0–2/0.1–0.3), Off-flavour (14–38/0.4–0.9)
Greater discrimination in RATA: Crunchy (34–63/0.6–1.4), Flavoursome (16–50/0.6–1.3), Juicy (45–71/1.0–1.7), Pineapple flavour (55–79/1.1–1.7), Ripe (29–54/0.7–1.6), Sweet (37–73/0.7–1.8), Yellow colour (48–86/0.8–2.7)

Study 5 – Raspberry coulis
Greater discrimination in CATA: Boysenberry (17–35/0.4–0.7), Fruitiness (19–52/0.9–1.3), Green/grape (9–24/0.2–0.4), Green stalks (9–28/0.1–0.4), Plum (15–35/0.3–0.5), Sweet (9–58/0.2–1.2)
Greater discrimination in RATA: Jammy (10–39/0.4–0.8), Sour (41–92/0.8–2.8), Strawberry (6–23/0.1–0.4)

Study 6 – Fruit cake
Greater discrimination in CATA: Crumbly (3–31/0.1–0.5), Dates (13–53/0.2–0.9), Doughy (22–62/0.6–1.1)
Greater discrimination in RATA: Baking spices (22–75/0.3–2.0), Fruity (18–75/0.5–2.3), Golden syrup (36–54/0.7–1.6), Moist (27–76/0.7–1.9), Raisins/sultanas (36–98/0.5–2.6), Sticky (40–91/0.8–2.0)

Study 7 – Powdered drinks
Greater discrimination in CATA: Mandarin flavour (10–31/0.4–0.8)
Greater discrimination in RATA: Sour (20–88/0.5–2.2), Intense flavour (41–59/0.8–1.5), Off-flavour (12–31/0.3–0.4), Smooth (2–12/0.1–0.6), Concentrated (35–53/0.4–1.3), Diluted (0–10/0.0–0.8)

3.3. Sample and term configurations

Across the six studies for which sample configurations were obtained, the percentage of variance explained by the first two dimensions of Correspondence Analysis (CA) and Principal Component Analysis (PCA) ranged between 75.3% and 95.7%.

As shown in Table 5a, in four of the six studies (Studies 2, 5, 6 and 7) sample configurations from CATA and RATA questions tended to be similar, regardless of the data analysis approach for RATA. In these studies the RV coefficients between sample configurations obtained using CATA and RATA were higher than 0.80. In the remaining two studies (1 and 4) the RV coefficients between sample configurations were lower than 0.80, which indicates potential differences in the conclusions regarding similarities and differences among samples between RATA and CATA questions.

Fig. 3 shows sample configurations for four studies. Sample configurations were almost identical for CATA and RATA questions. As an example, Fig. 3a and d show the high similarity of sample configurations obtained in Studies 1 and 6, respectively. Differences in conclusions regarding similarities and differences were identified in some of the studies, even if the RV coefficients between sample configurations were high. For example, in Study 2, when samples were evaluated using CATA questions samples S2 and S3 were located away from the other two samples (Fig. 3b). This same information was also obtained when RATA data were treated as CATA. However, when RATA scores were considered and analysed using PCA on arithmetic or Dravnieks' means, sample S2 was located apart from sample S3 and close to the other two samples. Similarly, in Study 4 sample S3 was located close to sample S2 when RATA scores were considered, whereas it was located on the opposite side of the first dimension on the CA performed on data from CATA questions or when RATA data were treated as CATA (Fig. 3c).

Regarding sample discrimination in the first two dimensions of sample configurations, no clear superiority of one methodology over the other was observed. In some of the studies sample discrimination was higher for RATA based on intensity scores than for CATA questions. For example, in Studies 4 and 6 (Fig. 3c and d, respectively) the confidence ellipses of some of the samples were overlapped in sample configurations obtained using CATA questions, whereas they did not overlap in the configurations obtained by analysing the scores obtained using RATA questions.
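A sketch of how the agreement between configurations can be computed with FactoMineR (the package named in Section 2.4.4) is given below. The input matrices cata_freq and rata_means are hypothetical stand-ins for a samples x terms table of CATA citation counts and the corresponding matrix of mean RATA scores; the example data are simulated.

# Sketch (assumed inputs): comparing CATA and RATA sample configurations.
library(FactoMineR)

set.seed(3)
terms <- paste0("term", 1:12)
cata_freq  <- matrix(rpois(4 * 12, lambda = 25), nrow = 4,
                     dimnames = list(paste0("S", 1:4), terms))
rata_means <- cata_freq / 60 * 2.2 + rnorm(48, sd = 0.1)  # loosely related scores

ca_res  <- CA(cata_freq, graph = FALSE)     # CATA counts: correspondence analysis
pca_res <- PCA(rata_means, graph = FALSE)   # RATA scores: PCA on mean scores

# RV coefficient between the first two dimensions of the two sample configurations
coeffRV(ca_res$row$coord[, 1:2], pca_res$ind$coord[, 1:2])$rv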

Table 5
Summary of results regarding sample and term configurations for sensory characterizations with consumers obtained with CATA and RATA questions across six of the seven studies. RATA data were analysed as CATA (RATA-as-CATA) and using PCA based on mean RATA scores (RATA PCA) and Dravnieks' means (RATA Dravnieks).

Study: 1 – Apple | 2 – Apple | 4 – Tinned pineapple | 5 – Raspberry coulis | 6 – Fruitcake | 7 – Powdered drinks

a. RV between sample configurations
CATA vs. RATA-as-CATA: 0.75 | 0.99 | 0.80 | 0.96 | 0.99 | 0.98
CATA vs. RATA PCA: 0.71 | 0.84 | 0.63 | 0.96 | 0.88 | 0.94
CATA vs. RATA Dravnieks: 0.61 | 0.81 | 0.64 | 0.96 | 0.88 | 0.95
RATA-as-CATA vs. RATA PCA: 0.90 | 0.84 | 0.76 | 0.99 | 0.85 | 0.97
RATA-as-CATA vs. RATA Dravnieks: 0.89 | 0.85 | 0.73 | 0.99 | 0.86 | 0.97
RATA PCA vs. RATA Dravnieks: 0.98 | 0.94 | 0.99 | 1.00 | 1.00 | 1.00

b. RV between term configurations
CATA vs. RATA-as-CATA: 0.34 | 0.57 | 0.56 | 0.92 | 0.95 | 0.77
CATA vs. RATA PCA: 0.43 | 0.38 | 0.23 | 0.72 | 0.71 | 0.52
CATA vs. RATA Dravnieks: 0.40 | 0.34 | 0.20 | 0.72 | 0.71 | 0.53
RATA-as-CATA vs. RATA PCA: 0.71 | 0.67 | 0.49 | 0.76 | 0.60 | 0.78
RATA-as-CATA vs. RATA Dravnieks: 0.72 | 0.71 | 0.46 | 0.77 | 0.60 | 0.79
RATA PCA vs. RATA Dravnieks: 0.98 | 0.92 | 0.99 | 1.00 | 0.99 | 0.97

c. Average RV coefficient of sample configurations across simulations for the total number of consumers
CATA: 0.97 | 0.94 | 0.93 | 0.98 | 0.98 | 0.96
RATA-as-CATA: 0.95 | 0.93 | 0.90 | 0.97 | 0.98 | 0.98
RATA PCA: 0.92 | 0.86 | 0.94 | 0.94 | 0.95 | 0.95
RATA Dravnieks: 0.91 | 0.82 | 0.93 | 0.94 | 0.99 | 0.96

d. Average RV coefficient of term configurations across simulations for the total number of consumers
CATA: 0.85 | 0.83 | 0.90 | 0.95 | 0.96 | 0.85
RATA-as-CATA: 0.86 | 0.81 | 0.84 | 0.93 | 0.97 | 0.87
RATA PCA: 0.83 | 0.72 | 0.87 | 0.90 | 0.94 | 0.77
RATA Dravnieks: 0.81 | 0.68 | 0.86 | 0.90 | 0.93 | 0.78

Notes. Study 3 was excluded from these analyses because only 3 samples were evaluated.
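The stability indices in Table 5c and d follow the resampling scheme of Section 2.4.4. The R sketch below illustrates the idea for a configuration based on mean RATA scores, using hypothetical data, a single subset size (all N consumers) and 200 rather than 1000 replicates; it is a simplified illustration, not the authors' implementation.

# Sketch of the bootstrap stability index for a RATA-scores configuration.
library(FactoMineR)

set.seed(4)
n_cons <- 100; n_samp <- 5; n_terms <- 12
# rata: array of RATA scores (0-3), consumers x samples x terms (simulated)
rata <- array(sample(0:3, n_cons * n_samp * n_terms, replace = TRUE),
              dim = c(n_cons, n_samp, n_terms),
              dimnames = list(NULL, paste0("S", 1:n_samp), paste0("term", 1:n_terms)))

# samples x terms matrix of mean scores for a given set of consumers
mean_config <- function(consumers) apply(rata[consumers, , , drop = FALSE], c(2, 3), mean)

# reference configuration: all consumers
ref_coord <- PCA(mean_config(1:n_cons), graph = FALSE)$ind$coord[, 1:2]

# bootstrap: resample consumers with replacement, recompute the configuration,
# and measure agreement with the reference via the RV coefficient
rv_boot <- replicate(200, {
  boot_coord <- PCA(mean_config(sample(n_cons, replace = TRUE)),
                    graph = FALSE)$ind$coord[, 1:2]
  coeffRV(boot_coord, ref_coord)$rv
})
mean(rv_boot)  # average RV: the stability index (cf. Table 5c)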

On the contrary, in other studies the opposite trend was found, as exemplified in Fig. 3b for Study 2.

Finally, it can also be seen from Fig. 3b and c that in several studies the confidence ellipses obtained using truncated total bootstrapping tended to be larger in the configurations obtained through the analysis of RATA scores than in the configurations obtained using CATA data or by treating RATA-as-CATA.

The RV coefficients between term configurations in the first and second dimensions obtained from CATA and RATA, analysed using PCA on arithmetic and Dravnieks' means, were lower than those from sample configurations (Table 5a and b). In four of the six studies the RV coefficients between term configurations of CATA and RATA were lower than 0.60, which indicates differences in the way in which consumers used the terms for describing samples using the two methodologies. In Studies 5 and 6 the RV coefficients between term configurations from CATA questions and RATA-as-CATA were higher than 0.90, suggesting good agreement. In these studies the RV coefficients between term configurations of CATA and RATA scores were close to 0.70, which indicated moderate agreement.

Regarding the different statistical approaches used to analyse RATA scores, results did not largely differ. Sample configurations obtained using PCA on arithmetic or Dravnieks' means were highly similar, as evidenced by the RV coefficients being higher than 0.94 in five of the six studies (Table 5a). When RATA data were considered as CATA and analysed using CA, sample configurations were similar to those obtained using PCA on arithmetic or Dravnieks' RATA means (RV > 0.73). However, when RATA data were analysed as CATA, the RV coefficients between term configurations were lower, particularly in Studies 4 and 6 (Table 5b).

As shown in Table 5c and d, the average RV coefficients of sample and term configurations for a sample size equal to the total number of consumers in the studies were similar for CATA and RATA questions in the six studies, regardless of the approach used for RATA data analysis. However, in Study 2, the average RV coefficients of sample and term configurations tended to be lower for RATA questions, analysed using both arithmetic and Dravnieks' means, as compared to the simple CATA questions. A similar trend was observed for term configurations in Study 7. On the contrary, none of the studies showed a clearly superior stability of sample or term configurations for RATA over CATA questions. However, differences in the type of data involved in CATA and RATA should be taken into account. As RATA is analysed considering a 4-point scale, variability can and usually will be larger than for CATA, which consists of binary data. Therefore, the variability of RATA scores encountered in the bootstrapping approach can be expected to be higher than that of CATA data, such that a lower average RV coefficient across simulations is to be expected. This is also supported by the fact that the RV coefficients for sample as well as for term configurations from CA treating RATA-as-CATA are typically larger (similar if RV approaches 1) than those using PCA on RATA data.

4. Discussion

The present work further explored the use of RATA questions, a rating variant of CATA questions, for sensory characterization with consumers by comparing RATA and CATA across seven studies with different product categories. The key findings are discussed below.

4.1. CATA and RATA term use and perceived attribute intensity

With regard to term use, it was found in six of the seven studies that asking consumers to rate the intensity of the terms they selected as applicable (i.e., RATA) led to an increase in the total number of selected terms, confirming results from previous studies (Ares, Bruzzone, et al., 2014). The increase in frequency of use was found for the majority of the terms included in the CATA/RATA question. These results can be attributed to two potential effects. First, the greater cognitive effort necessary to answer RATA questions compared to CATA questions may have discouraged consumers from using satisficing response strategies (Sudman & Bradburn, 1992). Secondly, the use of a rating step may have caused a change in the cognitive strategy used by consumers to complete the task. The possibility of selecting terms and indicating that their intensity is 'low' may induce consumers to select more terms than when they have to indicate if terms are applicable or not to describe samples (i.e., CATA task).
Therefore, CATA and RATA questions may encourage partici-
pants to approach the product sensory characterization task differ-
ently. Due to satisficing response strategies, when answering CATA
questions most consumers might not select ‘‘all” terms that apply
to describe the product, but simply select those terms that are
more important to them to characterize the product. This approach
to the task explains why frequency of use of the terms in CATA
questions correlates with intensity ratings (Ares et al., 2015;
Bruzzone, Ares, & Giménez, 2012; Bruzzone et al., 2015). For this
reason, consumers would be expected to only select the most sali-
ent attributes in CATA questions, whereas in RATA questions con-
sumers would be expected to provide a more detailed
characterization of the samples by selecting a larger number of
attributes and additionally indicating their intensity. In this sense,
consumers are probably encouraged to adopt a more analytical
cognitive strategy when completing a RATA task, compared to
CATA. Although both methodologies should establish the same
information about the main sensory characteristics of samples, dif-
ferences by method are likely to appear, as found in the present
work. The idea that RATA and CATA encourage participants to
use different strategies for sample evaluation is supported by the
fact that in most of the studies analysing RATA data as CATA
responses led to a decrease in sample discrimination compared
to CATA questions.
The citation frequency of CATA terms was strongly and linearly
related to mean RATA scores and showed that the percentage of
consumers selecting a term in a CATA question was directly related
to perceived attribute intensity in RATA. The present data suggest
that this relationship may be largely product and study indepen-
dent, but this should be established across additional studies from
a wider range of products, consumers and cultures. Additional data
to explore attribute specific relationships would also be welcome.
One of the interesting observations in the present data was that a
CATA citation frequency of 80% or higher was only twice associated
with a mean RATA score above 2.5, and hence close to the scale
anchor 'high'. This can most likely be attributed to a
boundary effect: As scores larger than 3 are impossible, very con-
sistently high scores are required to obtain an average of 2.5 or
higher, a consistency which is rarely observed in consumer data;
even if an attribute is clearly present, some consumers may still
not select it or give a lower score in avoidance of the extreme
intensity anchor on the RATA scale. It might be of interest to inves-
tigate samples with very extreme sensory properties to see if an
average value of ‘high’ can be obtained at all. It is worth mention-
ing that such a relationship does not necessarily have to exist for
every attribute: if consumers would indeed check all attributes
that they can perceive (irrespective of intensity), the relationship
might cease, as a checked attribute could go along with any of
the potential RATA scores different from zero. The fact that the
relationship is consistently found suggests that a lower intensity
(and hence lower average RATA score) goes along with fewer
assessors actually checking the attribute in a CATA task. It can
therefore be assumed that assessors only check an attribute if its
intensity exceeds a certain (subject-specific) threshold, and refrain
from checking it if noticeable (as indicated by the RATA scores) but
not intense enough. An example of this behaviour can be seen in
Study 4. Although all tinned pineapple samples were yellow to some degree, not all consumers selected the term yellow colour to describe them; the frequency of use of this term in CATA questions ranged from 48% to 86% (Table 4).

Fig. 3. Exemplar sample configurations obtained using CATA questions and different data analysis approaches for RATA questions: treated as CATA, PCA on arithmetic means and PCA on Dravnieks' means, for Studies 1 (a), 2 (b), 4 (c) and 6 (d).

Establishing a direct relationship between attribute intensity in RATA and the percentage of CATA term use aligns with previous studies that compared frequency of use of CATA terms and attribute intensities measured using structured and unstructured scales (Ares et al., 2015; Bruzzone et al., 2012, 2015). This suggests that the concern that prompted the development of hybrid CATA-rating methods in the first instance, including RATA, may thus be exaggerated. Albeit in an indirect manner, the use of CATA questions by consumers can deliver measurements of perceived intensity of sensory attributes.

4.2. Sample characterisation and discrimination

Across the seven studies, CATA and RATA questions did not differ in the identification of the most salient sensory characteristics of the products, and conclusions regarding significant differences among samples were identical for the majority of the terms and pairwise comparisons. Sample and term configurations obtained using frequency of term use in CATA questions and RATA scores were also similar. Thus, neither method was clearly superior to the other, in agreement with results reported by Reinbach et al. (2014).

As possibly indicated by the results of Study 3, there were test specific situations where RATA did not lead to greater sample discrimination than CATA, such as when a few samples with large differences are tested using a low number of terms. This was a distinguishing feature of Study 3 relative to the other six studies. In such instances, the additional attention to the task that RATA encourages may not be needed.

In instances where differences between CATA and RATA results were found, a few general trends were noted. CATA questions tended to be slightly more discriminative than RATA questions for terms related to minor sensory characteristics, attributes that appeared in low intensity for only few samples and attributes that may be less simple to quantify for consumers. If this result is robust and CATA/RATA method differences systematically exist for attributes that are present in low intensities and possibly only for a subset of samples, are hard for consumers to quantify or possibly ambiguous, then it may also be relevant to consider if a need exists for revision of the terms used in a focal study. Should terms that are ambiguous (e.g., off-flavour) be used at all? What about specific and technical sensory terms such as pungent, mineral or brittle? Are they compatible with the seemingly agreed upon but ill-defined notion that the terms used in CATA/RATA should be "consumer friendly"?

RATA tended to be slightly more discriminative than CATA for terms that were related to salient sensory characteristics that were more or less applicable to describe all samples. These characteristics may be familiar/common to participants, who may find it easy to quantify their intensity using rating scales. Therefore, in terms of sample discrimination, RATA questions may offer a slight improvement over CATA questions for sample sets with subtle perceptual differences, in which samples are not expected to differ in the type of terms that apply but in the intensity of their sensory characteristics. Further research seems necessary to confirm this hypothesis.

Overall, although RATA and CATA tended to perform similarly, they may have strengths for different types of attributes when it comes to identifying differences among samples. Evidence of this had not been reported earlier. In this sense, further research comparing the discrimination of CATA and RATA questions for specific sensory characteristics with other methodologies can contribute to our understanding of the differences between the methods, enabling informed decisions by practitioners. For example, it would be relevant to establish if RATA yields sensory profiles that are more similar to those from descriptive panels, which could point to the method placing consumers in a more analytical mind-set. It would also be relevant to establish if sensory characteristics present in all the samples in a study, say sweet and acid/sour in apple, are always best assessed with RATA. Tentatively, this may be the case when differences in attribute intensities between samples are rather "smaller" than "larger", but how big these magnitudes of differences should be is unknown and likely to be product specific.

Despite somewhat higher sample discrimination for certain attributes being achievable by RATA, such an outcome is not necessarily better in consumer testing. For example, RATA questions may place consumers in a mind-set where they pay greater attention to attributes than they would in natural eating situations, making the method too sensitive. Conversely, CATA questions may allow a more spontaneous evaluation, but be a bit less sensitive because they encourage less attention to samples. Although the use of different testing protocols to enhance detection of sample differences has previously been shown (e.g., for hedonic testing using monadic vs. side-by-side sample presentation: McBride, 1986), it was noted long ago that the practical importance of these differences could be exaggerated (Amerine, Pangborn, & Roessler, 1965).

4.3. Recommendations regarding the analysis of RATA data

Results from the present work also provided recommendations for the analysis of RATA data. First of all, treating RATA-as-CATA led to a consistent decrease in sample discrimination. Although analysing RATA questions as if they were CATA questions is a practice previously reported in the literature (e.g. Oppermann et al., 2017), our results suggest that practitioners should refrain from using this approach.

However, no major differences were found between the sample and term configurations obtained using PCA on arithmetic or Dravnieks' means, in agreement with results reported by Meyners et al. (2016). Therefore, considering that no evidence was obtained justifying the extra computational effort involved in the calculation of Dravnieks' means, sample and term configurations from RATA data can be obtained using arithmetic means.

Further research on how to analyse RATA data is, however, still necessary. One of the specific aspects that should be investigated is how to convert RATA intensity responses to numerical values. In the present work it was decided, somewhat arbitrarily, to assign 'not applicable' to 0, 'low intensity' to 1, 'moderate intensity' to 2 and 'high intensity' to 3. Whether this is appropriate is open for discussion. Labelled magnitude scales (e.g., the Labelled Magnitude Scale (LMS: Green et al., 1996) or the Labelled Affective Magnitude scale (LAM: Cardello & Schutz, 2004)) show that intensity categories are not linearly spaced. It is also not obvious that setting 'not applicable' to 0 is "correct" in the sense that it equates the distance between 'not applicable' and 'low' intensity to the distance between 'low' and 'medium' intensity. Besides, the 0 on this scale does not necessarily mean complete absence of the attribute, but it could mean that the intensity is "below a certain threshold", and that threshold might differ from the one that assessors might apply in a classical descriptive analysis due to the nature of the task. To investigate whether equating a non-ticked attribute to a 0 value is a reasonable assumption would require, for instance, the comparison between RATA and a descriptive analysis with a similar group, on a 0–4 scale. Doing so is beyond the scope of this research.
sensory characteristics of samples, but may have advantages for and its comparison to classical external preference mapping. Food Quality and
Preference, 21, 394–401.
the identification of differences among samples in different types
Dos Santos, B. A., Bastianello Campagnol, P. C., da Cruz, A. G., Galvão, M. T. E. L.,
of attributes. The decision to add a rating step to a CATA question Monteiro, R. A., Wagner, R., et al. (2015). Check all that apply and free listing to
depends on the aim of the study and the specific characteristics of describe the sensory characteristics of low sodium dry fermented sausages:
the sample set. In cases where differences among samples rely on Comparison with trained panel. Food Research International, 76, 725–734.
Dravnieks, A. (1982). Odor quality: Semantically generated multidimensional
the absence or presence of the attributes on the list, CATA ques- profiles are stable. Science, 218, 799–801.
tions should be preferred, as it is a less analytical and thus more Fisher, R. A. (1954). Statistical methods for research workers. Edinburgh: Oliver and
natural task for consumers. RATA questions may only be recom- Boyd.
Franco-Luesma, E., Sáenz-Navajas, M.-P., Valentin, D., Ballester, J., Rodrigues, H., &
mended when the aim of the study is to assess sets of samples Ferreira, V. (2016). Study of the effect of H2S, MeSH and DMS on the sensory
which differ in the relative intensity of salient sensory characteris- profile of wine model solutions by Rate-All-That-Apply (RATA). Food Research
tics that are familiar to consumers and apply to describe most of International, 87, 152–160.
Giacalone, D., & Hedelund, P. I. (2016). Rate-all-that-apply (RATA) with semi-
the focal samples. The results also indicate that collecting RATA trained assessors: An investigation of the method reproducibility at assessor-,
data but analysing them as CATA data should be avoided. attribute- and panel-level. Food Quality and Preference, 51, 65–71.
Acknowledgements

Staff at Plant & Food Research are thanked for help in planning and collection of data, in particular S.L. Chheang, D. Jin, M.K. Beresford, and K. Kam. Financial support was received from Comisión Sectorial de Investigación Científica (Universidad de la República – Uruguay) and The New Zealand Ministry for Business, Innovation & Employment and Plant & Food Research.

References
Amerine, M. A., Pangborn, R. M., & Roessler, E. B. (1965). Principles of sensory evaluation of food (p. 427). New York: Academic Press.
Ares, G. (2015). Methodological challenges in sensory characterization. Current Opinion in Food Science, 3, 1–5.
Ares, G., Antúnez, L., Bruzzone, F., Vidal, L., Giménez, A., Pineau, B., et al. (2015). Comparison of sensory product profiles generated by trained assessors and consumers using CATA questions: Four case studies with complex and/or similar samples. Food Quality and Preference, 45, 75–86.
Ares, G., Antúnez, L., Giménez, A., Roigard, C. M., Pineau, B., Hunter, D. C., et al. (2014). Further investigations into the reproducibility of check-all-that-apply (CATA) questions for sensory product characterization elicited by consumers. Food Quality and Preference, 36, 111–121.
Ares, G., Bruzzone, F., Vidal, L., Cadena, R. S., Giménez, A., Pineau, B., et al. (2014b). Evaluation of a rating-based variant of Check-All-That-Apply questions: Rate-All-That-Apply (RATA). Food Quality and Preference, 36, 87–95.
Ares, G., Etchemendy, R., Antúnez, L., Vidal, L., Giménez, A., & Jaeger, S. R. (2014). Visual attention by consumers to check-all-that-apply questions: Insights to support methodological development. Food Quality and Preference, 32, 210–220.
Ares, G., & Jaeger, S. R. (2015). Check-all-that-apply (CATA) questions with consumers in practice. Experimental considerations and impact on outcome. In J. Delarue, J. B. Lawlor, & M. Rogeaux (Eds.), Rapid sensory profiling techniques and related methods (pp. 227–245). Sawston, Cambridge: Woodhead Publishing.
Ares, G., Tárrega, A., Izquierdo, L., & Jaeger, S. R. (2014). Investigation of the number of consumers necessary to obtain stable sample and descriptor configurations from check-all-that-apply (CATA) questions. Food Quality and Preference, 31, 135–141.
Ares, G., & Varela, P. (2014). Comparison of novel methodologies for sensory characterization. In P. Varela & G. Ares (Eds.), Novel techniques in sensory characterization and consumer profiling (pp. 365–389). Boca Raton, FL: CRC Press.
Bruzzone, F., Ares, G., & Giménez, A. (2012). Consumers’ texture perception of milk desserts. II - Comparison with trained assessors’ data. Journal of Texture Studies, 43, 214–226.
Bruzzone, F., Vidal, L., Antúnez, L., Giménez, A., Deliza, R., & Ares, G. (2015). Comparison of intensity scales and CATA questions in new product development: Sensory characterisation and directions for product reformulation of milk desserts. Food Quality and Preference, 44, 183–193.
Cadena, R. S., Caimi, D., Jaunarena, I., Lorenzo, I., Vidal, L., Ares, G., et al. (2014). Comparison of rapid sensory characterization methodologies for the development of functional yogurts. Food Research International, 64, 446–455.
Cadoret, M., & Husson, F. (2013). Construction and evaluation of confidence ellipses applied at sensory data. Food Quality and Preference, 28, 106–115.
Cardello, A. V., & Schutz, H. G. (2004). Numerical scale point locations for constructing the LAM (labeled affective magnitude) scale. Journal of Sensory Studies, 19, 341–346.
Dehlholm, C., Brockhoff, P. B., Meinert, L., Aaslyng, M. D., & Bredie, W. L. P. (2012). Rapid descriptive sensory methods – comparison of free multiple sorting, partial napping, napping, flash profiling and conventional profiling. Food Quality and Preference, 26, 267–277.
Dooley, L., Lee, Y. S., & Meullenet, J. F. (2010). The application of check-all-that-apply (CATA) consumer profiling to preference mapping of vanilla ice cream and its comparison to classical external preference mapping. Food Quality and Preference, 21, 394–401.
Dos Santos, B. A., Bastianello Campagnol, P. C., da Cruz, A. G., Galvão, M. T. E. L., Monteiro, R. A., Wagner, R., et al. (2015). Check all that apply and free listing to describe the sensory characteristics of low sodium dry fermented sausages: Comparison with trained panel. Food Research International, 76, 725–734.
Dravnieks, A. (1982). Odor quality: Semantically generated multidimensional profiles are stable. Science, 218, 799–801.
Fisher, R. A. (1954). Statistical methods for research workers. Edinburgh: Oliver and Boyd.
Franco-Luesma, E., Sáenz-Navajas, M.-P., Valentin, D., Ballester, J., Rodrigues, H., & Ferreira, V. (2016). Study of the effect of H2S, MeSH and DMS on the sensory profile of wine model solutions by Rate-All-That-Apply (RATA). Food Research International, 87, 152–160.
Giacalone, D., & Hedelund, P. I. (2016). Rate-all-that-apply (RATA) with semi-trained assessors: An investigation of the method reproducibility at assessor-, attribute- and panel-level. Food Quality and Preference, 51, 65–71.
Green, B., Dalton, P., Cowart, B., Shaffer, G., Rankin, K., & Higgins, J. (1996). Evaluating the ‘Labeled Magnitude Scale’ for measuring sensations of taste and smell. Chemical Senses, 21, 323–334.
Jaeger, S. R., & Ares, G. (2014). Lack of evidence that concurrent sensory product characterisation using CATA questions bias hedonic scores. Food Quality and Preference, 35, 1–5.
Jaeger, S. R., Chheang, S. L., Yin, J., Bava, C. M., Giménez, A., Vidal, L., et al. (2013). Check-all-that-apply (CATA) responses elicited by consumers: Within-assessor reproducibility and stability of sensory product characterizations. Food Quality and Preference, 30, 56–67.
Jaeger, S. R., Giacalone, D., Roigard, C. M., Pineau, B., Vidal, L., Giménez, A., et al. (2013). Investigation of bias of hedonic scores when co-eliciting product attribute information using CATA questions. Food Quality and Preference, 30, 242–249.
Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food. Principles and practices (2nd ed.). New York: Springer.
Lê, S., Josse, J., & Husson, F. (2008). FactoMineR: An R package for multivariate analysis. Journal of Statistical Software, 25(1), 1–18.
Lelièvre, M., Chollet, S., Abdi, H., & Valentin, D. (2008). What is the validity of the sorting task for describing beers? A study using trained and untrained assessors. Food Quality and Preference, 19, 697–703.
Manoukian, E. B. (1986). Mathematical nonparametric statistics. New York, NY: Gordon & Breach.
McBride, R. L. (1986). Hedonic rating of food: Single or side-by-side sample presentation? Journal of Food Technology, 21, 355–363.
Meiselman, H. L. (2013). The future in sensory/consumer research: …evolving to a better science. Food Quality and Preference, 27, 208–214.
Meyners, M., Castura, J. C., & Carr, B. T. (2013). Existing and new approaches for the analysis of CATA data. Food Quality and Preference, 30, 309–319.
Meyners, M., & Castura, J. C. (2014). Check-all-that-apply questions. In P. Varela & G. Ares (Eds.), Novel techniques in sensory characterization and consumer profiling (pp. 271–305). Boca Raton, FL: CRC Press.
Meyners, M., Jaeger, S. R., & Ares, G. (2016). On the analysis of Rate-All-That-Apply (RATA) data. Food Quality and Preference, 49, 1–10.
Moussaoui, K. A., & Varela, P. (2010). Exploring consumer product profiling techniques and their linkage to a quantitative descriptive analysis. Food Quality and Preference, 21, 1088–1099.
Oppermann, A. K. L., de Graaf, C., Scholten, E., Stieger, M., & Piqueras-Fiszman, B. (2017). Comparison of Rate-All-That-Apply (RATA) and Descriptive sensory Analysis (DA) of model double emulsions with subtle perceptual differences. Food Quality and Preference, 56, 55–68.
R Core Team (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Reinbach, H. C., Giacalone, D., Ribeiro, L. M., Bredie, W. L. P., & Frøst, M. B. (2014). Comparison of three sensory profiling methods based on consumer perception: CATA, CATA with intensity and napping®. Food Quality and Preference, 32, 160–166.
Robert, P., & Escoufier, Y. (1976). A unifying tool for linear multivariate statistical methods: The RV coefficient. Applied Statistics, 25, 257–265.
Sudman, S., & Bradburn, N. M. (1992). Asking questions. San Francisco, CA: Jossey-Bass.
Varela, P., & Ares, G. (2012). Sensory profiling, the blurred line between sensory and consumer science. A review of novel methods for product characterization. Food Research International, 48, 893–908.
Vidal, L., Tárrega, A., Antúnez, L., Ares, G., & Jaeger, S. R. (2015). Comparison of Correspondence Analysis based on Hellinger and chi-square distances to obtain sensory spaces from Check-All-That-Apply (CATA) questions. Food Quality and Preference, 43, 106–112.
Waehrens, S. S., Zhang, S., Hedelund, P. I., Petersen, M. A., & Byrne, D. V. (2016). Application of the fast sensory method ‘Rate-All-That-Apply’ in chocolate Quality Control compared with DHS-GC-MS. International Journal of Food Science and Technology, 51, 1877–1887.