Abstract
Any national cuisine is a sum total of its variety of regional cuisines, which are the cultural
and historical identifiers of their respective regions. India is home to a number of regional
cuisines that showcase its culinary diversity. Here, we study recipes from eight different
regional cuisines of India spanning various geographies and climates. We investigate the
phenomenon of food pairing, which examines the compatibility of two ingredients in a recipe in
terms of their shared flavor compounds. Food pairing was enumerated at the level of cuisine,
recipes and ingredient pairs by quantifying flavor sharing between pairs of ingredients.
Our results indicate that each regional cuisine follows a negative food pairing pattern:
the greater the extent of flavor sharing between two ingredients, the lower their co-occurrence in that
cuisine. We find that frequency of ingredient usage is central in rendering the characteristic
food pairing in each of these cuisines. Spice and dairy emerged as the most significant
ingredient classes responsible for the biased pattern of food pairing. Interestingly, while
individual spices contribute to negative food pairing, dairy products on the other hand tend to
shift food pairing towards the positive side. Our data analytical study, highlighting statistical
properties of the regional cuisines, brings out their culinary fingerprints, which could be used to
design algorithms for generating novel recipes and recipe recommender systems. It forms a
basis for exploring a possible causal connection between diet and health as well as prospection
of therapeutic molecules from food ingredients. Our study also provides insights as to
how big data can change the way we look at food.
Citation: Jain A, N K R, Bagler G (2015) Analysis of Food Pairing in Regional Cuisines of India. PLoS
ONE 10(10): e0139539. doi:10.1371/journal.pone.0139539
Editor: Zi-Ke Zhang, Hangzhou Normal University, CHINA
Received: April 24, 2015
Accepted: September 1, 2015
Published: October 2, 2015
Copyright: © 2015 Jain et al. This is an open access article distributed under the terms of the Creative
Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any
medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the paper and its Supporting Information files.
Food perception involving olfactory and gustatory mechanisms is the primary influence for
food preferences in humans. These preferences are also determined by a variety of factors such
as culture, climate, geography and genetics, leading to the emergence of regional cuisines [4, 8–12].
Food pairing is the idea that ingredients having a similar flavor constitution may taste good together
in a recipe. Chef Blumenthal was the first to propose this idea, which in this study we term
positive food pairing [13]. Studies by Ahn et al. found that North American, Latin American and
Southern European recipes follow this food pairing pattern, whereas certain others, such as North
Korean and Eastern European cuisines, do not [14, 15]. Our previous study of food pairing
in Indian cuisine revealed a strong negative food pairing pattern in its recipes [16].
Knowing that each of the regional cuisines has its own identity, the question we seek to
answer in this paper is whether the negative food pairing pattern in Indian cuisine is a consistent
trend observed across all of the regional cuisines or an averaging effect. To answer
this question, we investigated eight geographically and culturally prominent regional cuisines,
viz. Bengali, Gujarati, Jain, Maharashtrian, Mughlai, Punjabi, Rajasthani and South Indian.
The pattern of food pairing was studied at the level of cuisine, recipes and ingredient pairs.
Such a multi-tiered study of these cuisines provided a thorough understanding of their characteristics
in terms of ingredient usage pattern. We further identified the features that contribute to
food pairing, thereby revealing the role of ingredients and ingredient categories in determining
food pairing of the regional cuisines.
Availability of large datasets in the form of cookery blogs and recipe repositories has
prompted the use of big data analytical techniques in food science and has led to the emergence
of computational gastronomy. This field has advanced through many recent studies [14,
15, 17, 18], which are changing the overall outlook of culinary science. Our study
is an offshoot of this approach. We use statistical and computational models to analyse food
pairing in the regional cuisines. Our study reveals the characteristic signature of each Indian
regional cuisine by looking at the recipe and ingredient level statistics of the cuisine.
doi:10.1371/journal.pone.0139539.t001
Fig 1. Recipe size distributions. Plot of probability of finding a recipe of size s in the cuisine. Consistent with other cuisines, the distributions are bounded.
Mughlai and Punjabi cuisines have recipes of large sizes compared to other cuisines.
doi:10.1371/journal.pone.0139539.g001
Fig 2. Frequency-Rank distributions. Ingredients ranked as per their frequency of use in the cuisine. The higher the occurrence, the better the rank of the
ingredient. All the cuisines have a similar ingredient distribution profile, indicating a generic culinary growth mechanism. Inset shows the ingredient
frequency-rank distribution for the whole Indian cuisine.
doi:10.1371/journal.pone.0139539.g002
mechanism, the distributions also show that certain ingredients are excessively used in cuisines
depicting their inherent ‘fitness’ or popularity within the cuisine.
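A frequency-rank profile of this kind can be tabulated in a few lines. The following is a minimal sketch, assuming each recipe is available as a list of aliased ingredient names; the toy recipes and the function name `frequency_rank` are illustrative and not part of the study.

```python
from collections import Counter

def frequency_rank(recipes):
    """Return (rank, ingredient, frequency) tuples, rank 1 = most used.

    Frequency is taken here as the number of recipes containing the ingredient.
    """
    counts = Counter(ing for recipe in recipes for ing in set(recipe))
    # Sort by descending frequency; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, ing, freq) for rank, (ing, freq) in enumerate(ordered, start=1)]

# Hypothetical toy cuisine, for illustration only.
toy_recipes = [
    ["cayenne", "turmeric", "potato"],
    ["cayenne", "milk"],
    ["cayenne", "turmeric", "rice"],
]
ranking = frequency_rank(toy_recipes)
for rank, ingredient, freq in ranking:
    print(rank, ingredient, freq)
```

Plotting frequency against rank on logarithmic axes would give a curve comparable in spirit to Fig 2, although tie-breaking and normalisation choices are assumptions of this sketch.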
ingredients. Flavor profile represents a set of volatile chemical compounds that render the
characteristic taste and smell to the ingredient. Starting with the flavor profiles of each of the
ingredients, average food pairing of a recipe ($N_s^R$) as well as that of the cuisine ($\overline{N_s}$) was
computed as illustrated in Fig 3. The extent of deviation of $\overline{N_s}$ of the cuisine, when compared to
that of a ‘random cuisine’, measures the bias in food pairing. The higher/lower the value of $\overline{N_s}$
from that of its random counterpart, the more positive/negative the food pairing is.
\[
P(N_s^R) = a + \frac{k - a}{1 + e^{\alpha N_s^R}} \tag{1}
\]
Fig 3. Schematic for calculation of ‘average Ns’ ($\overline{N_s}$). Illustration of procedure for calculating the average Ns for a given cuisine. Beginning with an
individual recipe, average Ns of the recipe ($N_s^R$) was calculated. Averaging $N_s^R$ over all the recipes returned $\overline{N_s}$ of the cuisine.
doi:10.1371/journal.pone.0139539.g003
Fig 4. ΔNs and its statistical significance. The variation in ΔNs for regional cuisines and corresponding random controls signifying the extent of bias in food
pairing. Statistical significance of ΔNs is shown in terms of Z-score. ‘Regional cuisine’ refers to each of the eight cuisines analyzed; ‘Ingredient frequency’
refers to the frequency controlled random cuisine; ‘Ingredient category’ refers to the category controlled random cuisine; and ‘Category + Frequency’
refers to the random control preserving both ingredient frequency and category. Among all regional cuisines, Mughlai cuisine showed the least negative food pairing
(ΔNs = −0.758) while Maharashtrian cuisine had the most negative food pairing (ΔNs = −4.523).
doi:10.1371/journal.pone.0139539.g004
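To make the idea of a frequency controlled random cuisine concrete, the sketch below builds one possible control. The sampling scheme is an assumption, since the text here does not spell it out: every recipe keeps its size, and its ingredients are redrawn with probability proportional to how often each ingredient is used in the real cuisine, so frequencies are preserved only approximately. The function name and toy data are hypothetical.

```python
import random
from collections import Counter

def frequency_preserving_random_cuisine(recipes, seed=0):
    """Redraw every recipe, keeping its size and weighting draws by usage."""
    rng = random.Random(seed)
    freq = Counter(ing for recipe in recipes for ing in set(recipe))
    ingredients = sorted(freq)
    weights = [freq[ing] for ing in ingredients]
    randomized = []
    for recipe in recipes:
        size = len(set(recipe))
        chosen = set()
        # Keep drawing until the recipe has `size` distinct ingredients.
        while len(chosen) < size:
            chosen.add(rng.choices(ingredients, weights=weights, k=1)[0])
        randomized.append(sorted(chosen))
    return randomized

# Hypothetical toy cuisine, for illustration only.
toy_cuisine = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
random_cuisine = frequency_preserving_random_cuisine(toy_cuisine, seed=1)
```

Averaging the food pairing of many such random cuisines would give the baseline against which the real cuisine is compared.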
We found that all regional cuisines show a strong bias towards recipes of low $N_s^R$ values as
observed in Fig 6. For each regional cuisine, the bias was accentuated in comparison to corresponding
random cuisines as reflected in the exponents shown in S2 Table. Once again Mughlai
cuisine emerged as an outlier, as the nature of its $N_s^R$ distribution did not indicate a clear
distinction from that of its random control. Consistent with the observations made with $\overline{N_s}$ and
ΔNs statistics (Figs 4 and 5), we found that controlling for frequency of occurrence of ingredients
reproduces the nature of $N_s^R$ distribution across all regional cuisines (barring the Mughlai
cuisine). This further highlights the role of ingredient frequency as a key factor in specifying
food pairing at the level of recipes as well.
Fig 5. Variation in average Ns and its statistical significance. Change in Ns with varying recipe size cut-offs reveals the nature of food pairing across the
spectrum of recipe sizes. The Ns values for regional cuisines were consistently on the lower side compared to their random counterparts. Category controlled
random cuisine displayed average Ns variation close to that of the ‘Random control’. Frequency controlled as well as ‘Category + Frequency’ controlled
random cuisines, on the other hand, displayed average Ns variations close to that of the real-world cuisine.
doi:10.1371/journal.pone.0139539.g005
Fig 6. Cumulative probability distribution of $N_s^R$ values for regional cuisines and their random controls. Cumulative distribution of $N_s^R$ indicates the
probability of finding a recipe having food pairing less than or equal to $N_s^R$. The data of regional cuisines as well as those of their controls were fitted with a
sigmoid equation indicating that the $P(N_s^R)$ values fall exponentially. The exponent α in Eq (1) refers to the rate of decay; the larger the α, the more prominent the
negative food pairing in recipes of a cuisine. As evident from S2 Table, $N_s^R$ distributions of the controls based on ‘Ingredient Frequency’ as well as ‘Category
+ Frequency’ displayed recipe level food pairing similar to real-world cuisines. On the other hand, as also observed at the level of cuisine (Figs 4 and 5), both
the ‘Random Control’ and the ‘Ingredient Category’ control deviate significantly.
doi:10.1371/journal.pone.0139539.g006
χi = 0. Significantly, spices were consistently present towards the negative side, while milk and
certain dairy products were present on the positive side across cuisines. Prominently among
the spices, cayenne consistently contributed to the negative food pairing of all regional cuisines.
Certain ingredients appeared to be ambivalent in their contribution to food pairing. While car-
damom contributed to the positive food pairing in Gujarati, Mughlai, Rajasthani, and South
Indian cuisines, it added to negative food pairing in Maharashtrian cuisine. Green bell pepper
tends to contribute to negative food pairing across the cuisines except in the case of Rajasthani
cuisine. Details of χi values of prominent ingredients for each regional cuisine are presented in
S4 Table.
Fig 7. Co-occurrence of ingredients with increasing extent of flavor profile overlap. Fraction of ingredient pair occurrence (f(N)) with a certain extent of
flavor profile overlap (N) was computed to assess the nature of food pairing at the level of ingredient pairs. Generically across the cuisines it was observed
that, the occurrences of ingredient pairs dropped as a power law with increasing extent of flavor profile sharing. This further ascertained negative food pairing
pattern in regional cuisines, beyond the coarse-grained levels of cuisine and recipes.
doi:10.1371/journal.pone.0139539.g007
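One way to tabulate f(N) is sketched below. It assumes that f(N) is the fraction of all co-occurring ingredient pair occurrences whose flavor profiles share exactly N compounds; the exact normalisation used in the study is not stated here, and the flavor profiles and names are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_overlap_fraction(recipes, flavor_profiles):
    """Map each overlap size N to the fraction of pair occurrences with that N."""
    counts = Counter()
    for recipe in recipes:
        for a, b in combinations(sorted(set(recipe)), 2):
            overlap = len(flavor_profiles[a] & flavor_profiles[b])
            counts[overlap] += 1
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}

# Hypothetical flavor profiles: sets of made-up compound identifiers.
toy_profiles = {
    "milk": {"c1", "c2", "c3"},
    "butter": {"c2", "c3", "c4"},
    "cayenne": {"c5"},
}
toy_recipes = [["milk", "butter"], ["milk", "butter", "cayenne"]]
f_of_n = pair_overlap_fraction(toy_recipes, toy_profiles)
```

A power law exponent such as γ in S3 Table could then be estimated from the decay of these fractions with N, which this sketch does not attempt.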
raises the question whether ingredient category has any role in determining food pairing pattern
of the cuisine. Towards answering this question, we created random cuisines wherein we
randomized ingredients within one category, while preserving the category and frequency distribution
for the rest of the ingredients. The extent of contribution of an ingredient category
towards the observed food pairing in the cuisine is represented by $\Delta N_s^{cat}$. Fig 9 depicts the significance
of ingredient categories towards food pairing of each regional cuisine. Interestingly, the
pattern of category contributions presents itself as a ‘culinary fingerprint’ of the cuisine.
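The category randomization can be sketched as follows. This is only one plausible reading of the procedure: within each recipe, ingredients of the chosen category are replaced by ingredients drawn uniformly from that same category, while every other ingredient is left untouched. The uniform draw, the function name and the toy data are assumptions.

```python
import random

def randomize_category(recipes, category_of, target_category, seed=0):
    """Swap ingredients of one category for random members of that category."""
    rng = random.Random(seed)
    pool = sorted(
        {ing for recipe in recipes for ing in recipe
         if category_of[ing] == target_category}
    )
    randomized = []
    for recipe in recipes:
        unique = sorted(set(recipe))
        fixed = [ing for ing in unique if category_of[ing] != target_category]
        n_swap = len(unique) - len(fixed)
        # Draw replacements without repetition so the recipe size is preserved.
        replacements = rng.sample(pool, n_swap)
        randomized.append(fixed + replacements)
    return randomized

# Hypothetical categories and recipes, for illustration only.
toy_categories = {
    "cayenne": "spice", "turmeric": "spice", "cumin": "spice",
    "milk": "dairy", "potato": "vegetable",
}
toy_recipes = [["cayenne", "milk"], ["turmeric", "cumin", "potato"]]
spice_shuffled = randomize_category(toy_recipes, toy_categories, "spice", seed=42)
```

Comparing the average food pairing of the real cuisine with that of many such randomized cuisines gives a category-level difference analogous to the quantity described above.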
The ‘spice’ category was the most significant contributor to negative food pairing across cui-
sines with the exception of Mughlai cuisine. Another category which consistently contributed
to negative food pairing was ‘dairy’. On the other hand, ‘vegetable’ and ‘fruit’ categories tend to
bias most cuisines towards positive food pairing. Compared to the above-mentioned categories,
‘nut/seed’, ‘cereal/crop’, ‘pulse’ and ‘plant derivative’ did not show any consistent trend. ‘Plant’
and ‘herb’ categories, sparsely represented in cuisines, tend to tilt the food pairing towards pos-
itive side. In Mughlai cuisine all ingredient categories, except ‘dairy’, tend to contribute towards
positive food pairing. This could be a reflection of the meagre negative food pairing observed
for the cuisine (Fig 4). Above observations were found to be consistent across the spectrum of
recipe sizes (Fig 10).
Fig 8. Contribution of ingredients (χi) towards flavor pairing. For all eight regional cuisines we calculated the χi value of ingredients, which indicates their
contribution to the flavor pairing pattern of the cuisine, and plotted them against their frequency of appearance. Sizes of circles are proportional to frequency of
ingredients. Across cuisines, prominent negative contributors were largely spices, whereas a few dairy products consistently appeared on the
positive side.
doi:10.1371/journal.pone.0139539.g008
Conclusions
With the help of data analytical techniques we have shown that food pairing in major Indian
regional cuisines follows a consistent trend. We analyzed the reason behind this characteristic
pattern and found that spices, individually and as a category, play a crucial role in rendering
the negative food pairing to the cuisines. The use of spices as a part of diet dates back to the ancient
Indus civilization of the Indian subcontinent [5–7]. They also find mention in Ayurvedic texts
such as Charaka Samhita and Bhaavprakash Nighantu [20–23]. Trikatu, an Ayurvedic formulation
prescribed routinely for a variety of diseases, is a combination of spices, viz. long pepper,
black pepper and ginger [24]. Historically, spices have served several purposes, such as coloring
and flavoring agents, preservatives and additives. They also serve as anti-oxidant, anti-inflammatory,
chemopreventive, antimutagenic and detoxifying agents [23, 25]. One of the strongest
hypotheses proposed to explain the use of spices is the antimicrobial hypothesis, which suggests
that spices are primarily used due to their activity against food spoilage bacteria [9, 26]. A few
of the most antimicrobial spices [27] are commonly used in Indian cuisines. Our recent studies
have shown the beneficial role of capsaicin, an active component in cayenne, which was
revealed to be the most prominent ingredient in consistently rendering the negative food pairing
in all regional cuisines [28]. The importance of spices in Indian regional cuisines is also
highlighted by the fact that these cuisines have many derived ingredients (such as garam
masala, ginger garlic paste, etc.) that are spice combinations. The key role of spices in rendering
characteristic food pairing in Indian cuisines and the fact that they are known to be of therapeutic
potential provide a basis for exploring a possible causal connection between diet and
health as well as prospection of therapeutic molecules from food ingredients. Flavor pairing
has been used as a basic principle in algorithm design for both recipe recommendation and
novel recipe generation, thereby enabling computational systems to enter the creative domain
of cooking and suggesting recipes [17, 18]. In such algorithms, candidate recipes are generated
based on existing domain knowledge, and flavor pairing plays a crucial role in selecting the
best among these candidates [18].
Fig 9. Contribution of individual categories ($\Delta N_s^{cat}$) towards food pairing bias and its statistical significance. Randomizing ingredients within a certain
category provides an insight into their contribution towards bias in food pairing. Spice and dairy categories showed up as prominent categories contributing to
the negative food pairing of regional cuisines.
doi:10.1371/journal.pone.0139539.g009
Fig 10. Variation in category contribution and its statistical significance. Across the spectrum of recipe sizes, we observed a broadly consistent trend in the
contribution of individual categories towards food pairing bias.
doi:10.1371/journal.pone.0139539.g010
comparison to these sources, Tarladalal.com was identified as the best recipe source of Indian
cuisine.
The data of 3330 recipes and 588 ingredients were curated to remove redundancy in names and to
drop recipes with only one ingredient. These ingredients belonged to 17 categories. Ingredients
of ‘snack’ and ‘additive’ categories, for which no flavor compounds could be determined, were
removed. The ingredients were further aliased to 339 source ingredients, of which we could
determine flavor profiles for 194. Aliasing involves mapping ingredients to their source
ingredient. For example, ‘chopped potato’ and ‘mashed potato’ were aliased to ‘potato’. The
final data comprised 2543 recipes and 194 ingredients belonging to 15 categories. The statistics
of regional cuisines, their recipes and ingredient counts are provided in Table 1.
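The curation steps described above can be illustrated with a small sketch. It assumes that aliasing is a simple lookup from a raw ingredient name to its source ingredient, that ingredients without a flavor profile are removed from the recipe, and that recipes left with fewer than two ingredients are dropped. The alias map, the profiles and the function name `curate` are hypothetical.

```python
def curate(recipes, alias_map, flavor_profiles):
    """Alias ingredient names, drop unprofiled ingredients and tiny recipes."""
    curated = []
    for recipe in recipes:
        # Map each raw name to its source ingredient; unknown names map to themselves.
        aliased = {alias_map.get(name, name) for name in recipe}
        kept = sorted(ing for ing in aliased if ing in flavor_profiles)
        if len(kept) > 1:  # recipes with a single ingredient are dropped
            curated.append(kept)
    return curated

# Hypothetical inputs, for illustration only.
alias_map = {"chopped potato": "potato", "mashed potato": "potato"}
flavor_profiles = {"potato": {"c1", "c2"}, "cayenne": {"c3"}}
raw_recipes = [
    ["chopped potato", "mashed potato"],          # collapses to one ingredient
    ["chopped potato", "cayenne", "papad"],       # 'papad' has no flavor profile here
]
curated_recipes = curate(raw_recipes, alias_map, flavor_profiles)
```

In this toy run the first recipe collapses to a single ingredient after aliasing and is dropped, while the second keeps only the two ingredients that have a flavor profile.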
The data of flavor compounds were obtained from Ahn et al. [15], Fenaroli’s Handbook of
Flavor Ingredients [29] and an extensive literature search. All the flavor profiles were cross-checked
with those in the 6th edition (latest) of Fenaroli’s Handbook of Flavor Ingredients [29]
for consistency of names. Chemical Abstracts Service numbers were used as unique identifiers
of flavor molecules.
Flavor sharing
Flavor sharing was computed for each pair of ingredients that co-occur in recipes in terms of
number of shared compounds $N = |F_i \cap F_j|$. Further, the average number of shared compounds
in a recipe $N_s^R$ having s ingredients was calculated (Eq (2)).
\[
N_s^R = \frac{2}{s(s-1)} \sum_{i,j \in R,\, i \neq j} |F_i \cap F_j| \tag{2}
\]
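The recipe-level average can be computed directly from sets of flavor compounds. The sketch below assumes that the sum runs over unordered ingredient pairs, so that the prefactor 2/(s(s−1)) turns it into a mean over all pairs, and that the cuisine-level value is the plain mean over recipes. The flavor profiles and function names are made up for illustration.

```python
from itertools import combinations

def recipe_pairing(recipe, flavor_profiles):
    """Average number of shared flavor compounds over all ingredient pairs."""
    ingredients = sorted(set(recipe))
    s = len(ingredients)
    if s < 2:
        raise ValueError("a recipe needs at least two ingredients")
    shared = sum(
        len(flavor_profiles[a] & flavor_profiles[b])
        for a, b in combinations(ingredients, 2)
    )
    # Each unordered pair is counted once, so normalise by s(s-1)/2.
    return 2.0 * shared / (s * (s - 1))

def cuisine_pairing(recipes, flavor_profiles):
    """Mean of the recipe-level values over all recipes of a cuisine."""
    values = [recipe_pairing(recipe, flavor_profiles) for recipe in recipes]
    return sum(values) / len(values)

# Hypothetical flavor profiles: sets of made-up compound identifiers.
toy_profiles = {
    "milk": {"c1", "c2", "c3"},
    "butter": {"c2", "c3", "c4"},
    "cayenne": {"c5"},
}
toy_recipes = [["milk", "butter"], ["milk", "butter", "cayenne"]]
print(recipe_pairing(toy_recipes[0], toy_profiles))
print(cuisine_pairing(toy_recipes, toy_profiles))
```

Adding an ingredient that shares no compounds with the others, as cayenne does in this toy example, lowers the recipe average, which is the sense in which such ingredients pull a cuisine towards negative food pairing.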
where $N_{Rand}$ and $\sigma_{Rand}$ represent the number of recipes in randomized cuisine and standard
deviation of $N_s^R$ values for randomized cuisine respectively.
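To show how these two quantities could enter a significance estimate, the sketch below computes the difference in average food pairing and a Z-score. It assumes the common form Z = sqrt(N_Rand) × (average of the cuisine − average of the randomized cuisine) / σ_Rand and a population standard deviation; both are assumptions, and the function name and numbers are illustrative.

```python
import math
import statistics

def delta_and_z(cuisine_mean, random_values):
    """Return (delta, Z) for a cuisine against a randomized control.

    cuisine_mean  : average food pairing of the real cuisine
    random_values : recipe-level food pairing values of the randomized cuisine
    """
    n_rand = len(random_values)
    rand_mean = statistics.fmean(random_values)
    sigma_rand = statistics.pstdev(random_values)  # population std. dev. (assumed)
    delta = cuisine_mean - rand_mean
    z = math.sqrt(n_rand) * delta / sigma_rand
    return delta, z

# Made-up numbers, for illustration only.
delta, z = delta_and_z(2.0, [4.0, 6.0, 4.0, 6.0])
print(delta, z)
```

A negative value of the difference together with a large negative Z would correspond to statistically significant negative food pairing under these assumptions.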
Ingredient contribution
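For every regional cuisine, the contribution (χi) of each ingredient i was calculated [15] using
Eq (4).
\[
\chi_i = \left( \frac{1}{N_R} \sum_{i \in R} \frac{2}{n(n-1)} \sum_{j \neq i\,(j,i \in R)} |F_i \cap F_j| \right) - \left( \frac{2 f_i}{N_R \langle n \rangle} \frac{\sum_{j \in c} f_j\, |F_i \cap F_j|}{\sum_{j \in c} f_j} \right) \tag{4}
\]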
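A possible implementation of the ingredient contribution is sketched below. It assumes that the first term is the observed pairing of ingredient i averaged over recipes, that the second term is the pairing expected from ingredient frequencies alone, that N_R is the number of recipes, n the size of a recipe, ⟨n⟩ the mean recipe size, f_i the number of recipes using ingredient i, and that the sums over j in c run over every ingredient of the cuisine, including i itself. These readings, the function name and the toy data are assumptions.

```python
def ingredient_contribution(target, recipes, flavor_profiles):
    """Observed minus frequency-expected flavor sharing for one ingredient."""
    recipes = [sorted(set(recipe)) for recipe in recipes]
    n_recipes = len(recipes)
    mean_size = sum(len(recipe) for recipe in recipes) / n_recipes
    freq = {}
    for recipe in recipes:
        for ing in recipe:
            freq[ing] = freq.get(ing, 0) + 1
    f_target = flavor_profiles[target]

    # Observed term: pairing of the target with its partners, recipe by recipe.
    observed = 0.0
    for recipe in recipes:
        n = len(recipe)
        if target in recipe and n > 1:
            shared = sum(len(f_target & flavor_profiles[j])
                         for j in recipe if j != target)
            observed += 2.0 * shared / (n * (n - 1))
    observed /= n_recipes

    # Expected term: frequency-weighted overlap with all ingredients (assumed to include the target).
    weighted = sum(f * len(f_target & flavor_profiles[j]) for j, f in freq.items())
    expected = (2.0 * freq[target] / (n_recipes * mean_size)) * weighted / sum(freq.values())
    return observed - expected

# Hypothetical flavor profiles: sets of made-up compound identifiers.
toy_profiles = {
    "milk": {"c1", "c2", "c3"},
    "butter": {"c2", "c3", "c4"},
    "cayenne": {"c5"},
}
toy_recipes = [["milk", "butter"], ["milk", "butter", "cayenne"]]
chi = {ing: ingredient_contribution(ing, toy_recipes, toy_profiles)
       for ing in toy_profiles}
```

Ingredients with a positive value push the cuisine towards positive food pairing and those with a negative value push it towards negative food pairing, with zero marking no contribution.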
Here, cat stands for an ingredient category and s represents recipe size. The statistical signifi-
cance was again calculated using Z-score.
Supporting Information
S1 Table. Distribution of ingredients across categories. Number of ingredients in each cate-
gory for all regional cuisines.
(PDF)
S2 Table. Exponents (α) of Sigmoid fits for $P(N_s^R)$ vs $N_s^R$ distribution. Exponents (α) for
regional cuisines and their random controls.
(PDF)
S3 Table. Power law exponents (γ) for f(N) vs N distribution. Power law exponents (γ) of all
regional cuisines.
(PDF)
S4 Table. Ingredients contributing significantly to food pairing. Details of top 10 ingredi-
ents contributing to positive and negative food pairing in each of the regional cuisines.
(PDF)
S1 Dataset. Recipes in Indian Subcuisine and their corresponding ingredients. Recipe id
and aliased ingredient name.
(XLSX)
S2 Dataset. Flavor compound present in ingredients. Flavor compounds in each ingredient
and their corresponding CAS number.
(XLSX)
S3 Dataset. Contribution of each Ingredient towards food pairing and its frequency of
occurrence. Frequency of occurrence and corresponding χi values of ingredients.
(XLSX)
Acknowledgments
G.B. acknowledges the seed grant support from Indian Institute of Technology Jodhpur (IITJ/
SEED/2014/0003). A.J. and R.N.K. thank the Ministry of Human Resource Development, Gov-
ernment of India as well as Indian Institute of Technology Jodhpur for scholarship and Junior
Research Fellowship, respectively. The funders had no role in study design, data collection and
analysis, decision to publish, or preparation of the manuscript.
Author Contributions
Conceived and designed the experiments: GB. Performed the experiments: AJ RNK GB. Ana-
lyzed the data: GB AJ RNK. Wrote the paper: GB AJ RNK.
References
1. Navarrete A, van Schaik CP, Isler K. Energetics and the evolution of human brain size. Nature. 2011;
480:91–93. doi: 10.1038/nature10629 PMID: 22080949
2. Wrangham R. Catching fire: how cooking made us human. Basic Books; 2009.
3. Fonseca-Azevedo K, Herculano-Houzel S. Metabolic constraint imposes tradeoff between body size
and number of brain neurons in human evolution. Proceedings of the National Academy of Sciences of
the United States of America. 2012 Nov; 109(45):18571–6. doi: 10.1073/pnas.1206390109 PMID:
23090991
4. Pollan M. Cooked: A Natural History of Transformation. Penguin Books; 2014.
5. Weber S, Kashyap A, Mounce L. Archaeobotany at Farmana: new insights into Harappan plant use
strategies. In: Shinde V, Osada T, Kumar M, editors. Excavations at Farmana, District Rohtak, Haryana,
India. Kyoto: Nakanishi Printing; 2011. p. 808–823.
Available: http://anthro.vancouver.wsu.edu/media/PDF/archaeobotany_at_Farmana_2011.pdf
6. Kashyap A, Weber S. Harappan plant use revealed by starch grains from Farmana. Antiquity. 2010;
84(326). Available: http://antiquity.ac.uk/projgall/kashyap326/
7. Lawler A. The Ingredients for a 4000-Year-Old Proto-Curry. Science. 2012; 337(6092):288.
doi: 10.1126/science.337.6092.288-a
8. Appadurai A. How to Make a National Cuisine: Cookbooks in Contemporary India. Comparative Stud-
ies in Society and History. 2009 Jun; 30(01):3–24. doi: 10.1017/S0010417500015024
9. Sherman PW, Billing J. Darwinian Gastronomy: Why we use spices. Spices taste good because they
are good for us. Bioscience. 1999; 49(6):453–463. doi: 10.2307/1313553
10. Zhu YX, Huang J, Zhang ZK, Zhang QM, Zhou T, Ahn YY. Geography and similarity of regional cuisines
in China. PloS one. 2013 Jan; 8(11):e79161. doi: 10.1371/journal.pone.0079161 PMID: 24260166
11. Birch LL. Development of food preferences. Annual Review of Nutrition. 1999; 19:41–62.
doi: 10.1146/annurev.nutr.19.1.41 PMID: 10448516
12. Ventura AK, Worobey J. Early influences on the development of food preferences. Current biology: CB.
2013 May; 23(9):R401–8. doi: 10.1016/j.cub.2013.02.037 PMID: 23660363
13. Blumenthal H. The big fat duck cookbook. Bloomsbury Publishing PLC; 2008.
14. Ahnert SE. Network analysis and data mining in food science: the emergence of computational gastron-
omy. Flavour. 2013; 2(4):1–4. Available: http://www.flavourjournal.com/content/2/1/4
15. Ahn YY, Ahnert SE, Bagrow JP, Barabási AL. Flavor network and the principles of food pairing. Scien-
tific reports. 2011 Jan; 1:196. doi: 10.1038/srep00196
16. Jain A, Rakhi N, Bagler G. Spices form the basis of food pairing in Indian cuisine. arXiv:1502.03815.
2015; p. 1–30. Available: arxiv:1502.03815. Accessed 2 June 2015.
17. Varshney LR, Pinel F, Varshney KR, Bhattacharjya D, Schoergendorfer A, Chee YM. A Big Data
Approach to Computational Creativity. 2013 Nov;(October 2013):1–16. Available: arxiv:1311.1213.
Accessed 2 June 2015.
18. Teng CY, Lin YR, Adamic LA. Recipe recommendation using ingredient networks; 2012. Available:
arxiv:1111.3919v3. Accessed 2 June 2015.
19. Dalal T. Tarladalal.com; 2014. Available: http://www.tarladalal.com/ Accessed 1 February 2015.
20. Atridevji Gupt. Swami Agnivesha’s Charaka Samhita. 2nd ed. Banaras: Bhargav Pustakalay; 1948.
21. Valiathan MS. The Legacy of Caraka. Universities Press; 2010.
22. Pande GS, Chunekar KC. Bhavprakash Nighantu. Varanasi: Chaukhambha Bharti Academy; 2002.
23. Tapsell LC, Hemphill I, Cobiac L, Sullivan DR, Fenech M, Patch CS, et al. Health benefits of herbs and
spices: the past, the present, the future. Medical Journal of Australia. 2006; 185(4):S1–S24.
24. Johri RK, Zutshi U. An Ayurvedic formulation ‘Trikatu’ and its constituents. Journal of Ethnopharmacol-
ogy. 1992 Sep; 37(2):85–91. doi: 10.1016/0378-8741(92)90067-2 PMID: 1434692
25. Krishnaswamy K. Traditional Indian spices and their health significance. Asia Pacific journal of clinical
nutrition. 2008 Jan; 17 Suppl 1:265–268. PMID: 18296352
26. Billing J, Sherman PW. Antimicrobial functions of spices: why some like it hot. The Quarterly review of
biology. 1998 Mar; 73(1):3–49. doi: 10.1086/420058 PMID: 9586227
27. Rakshit M, Ramalingam C. Screening and Comparision of Antibacterial Activity of Indian Spices. Jour-
nal of Experimental Sciences. 2010; 1(7):33–36.
28. Perumal S, Dubey K, Badhwar R, George KJ, Sharma RK, Bagler G, et al. Capsaicin inhibits collagen
fibril formation and increases the stability of collagen fibers. 2014.
29. Burdock GA. Fenaroli’s Handbook of Flavor Ingredients. 6th ed. CRC Press; 2010.