Nini JackTheRipper
Digital Scholarship in the Humanities, Vol. 33, No. 3, 2018. © The Author(s) 2018. Published by Oxford University Press on behalf of EADH.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
doi:10.1093/llc/fqx065 Advance Access published on 25 January 2018
             A. Nini
sociological dimension of the mythology of Jack the Ripper to shed light on 19th century England and the beginning of the modern era (Walkowitz, 1982; Perry Curtis, 2001; Haggard, 2007) or have identified links between Jack the Ripper and Victorian literature (Tropp, 1999; Eighteen-Bisang, 2005; Storey, 2012).

The origin of the mythology of Jack the Ripper lies in the communications that the killer allegedly sent to the police or media during the time of the murders and in the following months and years. Although there is no evidence that the real killer was involved in the production of any of them, the more than 200 Jack the Ripper letters significantly contributed to the creation and popularization of the name and persona of Jack the Ripper. However, despite the large number of texts involved in the case, only a small number of the Jack the Ripper letters received substantial investigative or socio-cultural attention at the time.

Probably the most important text in the case is the ‘Dear Boss’ letter, which was received on 27 September 1888 by the Central News Agency of London. This letter is the first ever signed as ‘Jack the Ripper’ and it is responsible for the creation of the pseudonym. The letter claimed responsibility for the murder of Annie Chapman on 8 September 1888 and mentioned that an ear would be cut off from the next victim and sent to the police. Indeed, the murder of Chapman was followed by another murder in which part of one of the ears of the victim was removed, although this was never sent to the police. Because of this fact and its style and content, the letter was considered to be genuine and it became famous for introducing the persona of Jack the Ripper and for providing a name that the press could use to refer to the killer.

The second most important text is the ‘Saucy Jacky’ postcard, which was received on 1 October 1888 by the Central News Agency of London, signed again as ‘Jack the Ripper’. The postcard claimed responsibility for the double murder of Elizabeth Stride and Catherine Eddowes on the night of 30 September 1888. The postcard did not threaten future murders and presented an apology for not having sent an ear to the police. Together with the ‘Dear Boss’ letter, this postcard has also become iconic in the portrayal of Jack the Ripper and was taken more seriously than other letters because of the short window between the murders and the time the postcard was sent (Begg, 2004).

The police took these two texts seriously enough to produce and post copies outside of police stations on 3 October 1888 (Rumbelow, 1979; Sugden, 2002). Following that, on 4 October, the two texts were also published in many newspapers (Sugden, 2002), even though some newspapers had already obtained the name ‘Jack the Ripper’ and part of the texts by 1 October (Perry Curtis, 2001).

Although much less popular than the other two texts, on 5 October the Central News Agency also received a third text, commonly known by experts as the ‘Moab and Midian’ letter. This text announced a triple event and justified the murders with religious motives. The peculiarity of this letter is that the original had never been sent to the police, as the journalist Tom Bulling of the Central News Agency decided to copy the text and send only the envelope to the police. The reasons behind this choice were not explained and to date they are still unknown.

Besides the three texts delivered to the Central News Agency, a large number of other letters and postcards were sent to several other recipients such as the press or the police between October 1888 and November 1888, that is, after the two iconic texts were made public by the police. During this period, 130 letters allegedly written by the killer were received, and the flow of letters continued for ten more years. Among these letters, another text that has become iconic and that was judged as important during the case is the ‘From Hell’ letter, which was received on 16 October by George Lusk, head of the Whitechapel Vigilance Committee, together with half of a kidney (Rumbelow, 1979).

In most of the letters, the author(s) mimicked the original ‘Dear Boss’ letter and ‘Saucy Jacky’ postcard in terms of taunting the police and using salient stylistic features, such as the laughter ‘ha ha’, or the salutation ‘Dear Boss’. Some of the letters were almost exact copies of ‘Dear Boss’, especially the ones that were received a year or more later, in conjunction with the anniversary of the murders or in conjunction with new murders in Whitechapel.
Since it is quite unlikely that the same person produced hundreds of letters spanning decades and sent from different places across the UK, it is commonly assumed that most of the letters were written by different individuals, who possibly had not been involved with any of the killings. Particularly interesting is the case of Maria Coroner, a 21-year-old girl who was caught sending one of those letters (Evans and Skinner, 2001). When questioned, she explained that she did so as she was fascinated by the case. It is likely that many of the writers of these letters acted for similar reasons, although the motives behind such actions will probably never be established. These hoax letters themselves represent an interesting mirror into the fears and problems of the people who wrote them (Remington, 2004). More importantly, these letters still exercise an impact on modern times. The Yorkshire Ripper hoaxer, for example, sent letters that borrowed several linguistic elements from the ‘Dear Boss’ letter (Ellis, 1994; Lewis, 1994).

Such a collection of letters also represents an invaluable data set for forensic linguistics and for authorship analysis. Linguistic analyses of the letters can be useful to provide new evidence for the Whitechapel murders case, since, as opposed to other sources of evidence nowadays corrupted by time, the language of the letters has reached us unchanged. The question of the authorship of the letters mostly focuses on the early ones, such as the ‘Dear Boss’ and ‘Saucy Jacky’ texts. The most common theory about the authorship of these texts is that journalists fabricated them to increase newspaper sales. The ‘enterprising journalist’ theory, more specifically, suggests that letters such as the ‘Dear Boss’ letter were actually works of fiction skilfully created to generate shock and ‘keep the business alive’ (Begg, 2004; Begg and Bennett, 2013). Evidence for the ‘enterprising journalist’ theory comes from the ‘Littlechild’ letter, in which Detective Chief Inspector John George Littlechild mentions that at Scotland Yard virtually everyone knew that the ‘Dear Boss’ letter was fabricated by Tom Bulling, a journalist of the Central News Agency itself, in collaboration with his manager (Rumbelow, 1979; Begg, 2004). At the time, the Central News Agency had been in a fierce competition with other news agencies and had a reputation for fabricating or embellishing news (Evans and Skinner, 2001; Begg, 2004). Another theory proposed by Cook (2009) suggests that a journalist named Frederick Best from the tabloid newspaper The Star was the actual author of the ‘Dear Boss’ letter.

As a first step to shed light on the authorship question of the Jack the Ripper letters, the present article reports on an authorship analysis of the texts received during and after the Whitechapel murders case that are connected to Jack the Ripper. The available data set lends itself to several authorship questions, such as the profiling of the anonymous author(s), or the comparison between some key letters and Bulling’s and Best’s writings. In the present article an initial exploration of the Jack the Ripper letters is performed with the general aim of finding out for which of the hundreds of texts there is evidence of common authorship, with special attention to the most important texts in the case mentioned above and to those earliest texts received before 1 October 1888, that is, before the ‘Dear Boss’ letter and the ‘Saucy Jacky’ postcard entered the public domain.

Establishing whether some of the Jack the Ripper texts could have been written by the same person is an important preliminary step as any future study, either involving profiling or comparison, would benefit from knowing if a number of questioned texts can be clustered together. In this sense, the authorship question tackled in the present study constitutes a useful starting point for any future authorship study on the Jack the Ripper letters.

2 Data

The data set used in the present study is a corpus that includes the texts connected to the Whitechapel murders: the Jack the Ripper Corpus (JRC) (see Supplementary Material). This corpus consists of the letters or postcards found and transcribed in the Appendix of Evans and Skinner (2001), who claim to have collected all of the texts involved in the Whitechapel murders related to Jack the Ripper from the Metropolitan Police files. These letters
were OCR-scanned from the book and the scans were manually checked for scanning errors. The corpus consists of 209 texts and 17,463 word tokens. The average length of a text in the corpus is eighty-three tokens (min = 7, max = 648, SD = 67.4).

The peculiarity of the JRC is that almost all of the texts in the corpus are comparable in terms of their broad situational parameters (Biber, 1994), as they are almost all written letters or postcards with similar linguistic purposes. For example, in terms of addressee, 67% of the texts were addressed to Scotland Yard; Sir Charles Warren, the head of London Metropolitan Police during that time; Inspector Abberline; or other law enforcement units. The remaining 33% were either of unknown addressee (13%), or were addressed to common citizens or to newspapers, news agencies, schools, or private firms (20%). The vast majority of the letters were postmarked or found in London, although other letters were postmarked or found in places all over the UK, such as Birmingham, Bradford, Dublin, Edinburgh, Liverpool, Manchester, or Plymouth. All of the letters were handwritten and a minority of them (4%) included drawings of various items, such as knives, skulls, or coffins. Finally, a large number of the letters (75%) were indeed signed as ‘Jack the Ripper’ or with variants of the name, such as ‘Jack the Whitechapel Ripper’, or ‘JR’, or ‘jack ripper and son’. Some other letters were not signed (11%) while the remaining letters used other pseudonyms, such as ‘Jim the Cutter’, ‘The Whore Killer’, or ‘Bill the Boweler’.

The corpus ranges from 24 September 1888 to 14 October 1896, thus spanning more than 10 years after the murders. However, the majority of the texts, that is 62% of the corpus, were received during the period between October 1888 and November 1888.

Among the total set of 209 texts, the present analysis will pay special attention to those early texts that were received no later than 1 October 1888, before the content of the ‘Dear Boss’ letter and ‘Saucy Jacky’ postcard was popularized by the police and the media and hoaxers could therefore have had knowledge of it. Before this date, according to Evans and Skinner’s (2001) collection, four texts were received:

Text 1 (24 September, 128 word tokens): In this text the author admits to the killing of Chapman and presents the intention to stop killing. The letter is unsigned;
Text 2 (27 September, 244 word tokens): The ‘Dear Boss’ letter;
Text 3 (1 October, 57 word tokens): The ‘Saucy Jacky’ postcard; and
Text 4 (1 October, 88 word tokens): This text threatens more murders and is signed as ‘Ripper’.

Even though the analysis will include all the JRC texts, these four texts are particularly important because any linguistic similarity that links them cannot be explained by influence from the media, an explanation that cannot be ruled out for the other texts. In the rest of this article, the four texts above will be called the ‘pre-publication’ texts, whereas the remaining 205 texts will be called the ‘post-publication’ texts.

3 Methodology

The authorship question considered for this study concerns finding out which texts in a corpus are likely to have been written by the same author. Recently, this task has been called ‘author clustering’ and it has been tackled using hierarchical cluster analysis on frequencies of features (Gómez-Adorno et al., 2017). This authorship problem could be considered, however, just as a special case of ‘authorship verification’, a problem that has received considerable attention in the literature (Koppel and Schler, 2004; Koppel et al., 2012; Brocardo et al., 2013; Koppel, Schler and Argamon, 2013). The best solutions proposed to solve this type of problem involve the addition of distractor texts belonging to similar registers and the use of similarity metrics applied to feature sets consisting of frequencies of linguistic features.

The problem in applying any of these techniques to the JRC is that the JRC texts are too short to produce reliable frequencies, as the average text length for the corpus is only eighty-three word tokens. For this reason, in this case it is necessary to adopt a method that does not involve the computation of frequencies.

A solution to the problem of analysing short texts within a forensic linguistic context by considering
the presence or absence of features as opposed to their frequencies was initially proposed by Grant (2010) and then further described in Grant (2013) for text messages. Inspired by research on similarity between species in biology and ecology, and already applied to assess similarity in crime types, this approach consists in quantifying the similarity between two texts using the Jaccard coefficient, that is, the number of features shared by the two texts divided by the total number of distinct features found in either text (Jaccard, 1912):

\[ J(A, B) = \frac{|A \cap B|}{|A \cup B|} \]

After being successfully applied to text messages, methods using the Jaccard coefficient have been applied with good results to other registers, including newspaper articles (Juola, 2013), short emails (Johnson and Wright, 2014; Wright, 2017), and elicited personal narratives (Larner, 2014). These studies have analysed the presence/absence of combinations of words, mostly looking at word n-grams, that is, strings of words of length n collected using a moving window.

Within plagiarism detection research, word n-gram techniques based on similar mathematical principles are very common (Oakes, 2014, p. 65) on the grounds that the more shared strings there are in two documents, the more similar their encoding of meanings and therefore the less likely it is that the documents are independent from each other, as explained by Coulthard (2004).

Word n-grams have been extensively adopted as linguistic features in traditional frequency-based stylometric methods for authorship attribution, although they are not deemed the best stylometric features, as they are often surpassed in efficacy by function words, simple word frequency, and, above all, character n-grams (Grieve, 2007; Stamatatos, 2009). Although word n-grams might not be extremely good features when frequency is taken into consideration, for a method involving presence/absence these features are much better than single words or function words because word strings are rarer and the power of a presence/absence method lies in the measurement and comparison of the linguistic uniqueness of each author on rare features. Character n-grams could also be good features but they are less amenable to interpretation, which can be a drawback depending on the ultimate goal of the research.

In addition to these methodological advantages, the use of word n-grams as features has theoretical support. Corpus linguistics (Sinclair, 1991; Biber, Conrad and Cortes, 2004; Hoey, 2005) and psycholinguistics/cognitive linguistics (Langacker, 1987; Barlow and Kemmer, 2000; Schmitt, 2004; Wray, 2005; Schmid, 2016) have long theorized that the combination of words is at the core of language processing and empirical support has been found for these theories (Ellis and Simpson-Vlach, 2009; Tremblay et al., 2009).

Furthermore, there is also empirical support for a strong idiolectal effect in the production and processing of word combinations (Mollin, 2009; Barlow, 2013; Schmid and Mantlik, 2015; Günther, 2016). Wright (2017) reveals the idiolectal nature of certain word n-grams by taking one specific speech act as constant and then analysing how different authors realize this act, uncovering that each author resorts to their own idiosyncratic set of lexical choices to perform the same act.

In the present study, for the reasons explained above, the set of features that is taken into consideration is word n-grams, as the ultimate goal is to discover possible idiolectal encoding in the JRC letters. Because the JRC texts are short, presence or absence of word n-grams is considered, as opposed to their frequency. Among all the possible sizes of n-grams, word 2-grams are chosen as any n-gram of n > 2 is ultimately made up of n-grams of n = 2, meaning that word 2-grams return the most complete picture of the shared word combinations in two sets. Presence or absence of word n-grams is quantified using the Jaccard ‘distance’, as opposed to the coefficient, which can be defined as:

\[ d_J(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|} \]

and which returns values between 0, or absolute identity, and 1, or absolute distance. The Jaccard distance is used so that a hierarchical cluster analysis can then be carried out. In this way, it is possible to first find out the major groups of texts that are more
similar to each other, and then it is possible to zoom in and explore smaller groups of letters, such as the pre-publication letters.

However, evidence of common authorship of two sets of documents can come not only from finding similarity but also from establishing that this similarity is distinctive (Grant, 2010, 2013). Although it is difficult to establish a universal threshold for distinctiveness, it is safe to assume that if a particular n-gram or lexicogrammatical structure does not occur at all or occurs extremely infrequently in a comparable reference corpus then this n-gram or structure is distinctive.

The comparison corpus used to assess distinctiveness should therefore include relevant population data (Turell and Gavaldà, 2013; Wright, 2017). If a smaller sub-sample of its texts is considered, the remainder of the JRC itself is indeed a corpus with relevant population data. However, because of its relatively small size, more data from 19th century English is necessary to find evidence of distinctiveness. Ideally, because of the pervasiveness of register variation, the perfect comparison corpus would be one including a large number of 19th century English letters of comparable communicative situation (Biber, 2012). However, in the absence of an extensive resource of this kind, the most comprehensive available set of general reference corpora was used instead, consisting of the largest available corpora of 19th century English:

The 132 million word 19th century section of the Corpus of Historical American English (COHA);
The 34 million word Corpus of Late Modern English Texts 3 (CLMET3), spanning from 1710 to 1920;
The 19 million word Extended Old Bailey Corpus (EOBC), including the proceedings of the Old Bailey from 1720 to 1913.

In sum, the method adopted in this study involves the comparison of all the texts in the JRC to each other using the Jaccard distance and a set of comparison corpora to find whether there are texts that are similar and distinctive in their linguistic encoding.

In addition, since the analysis involves word n-gram ‘types’, the method faces problems when dealing with texts of different length, as the likelihood of any word or n-gram type being observed is correlated with text length. However, provided that the shared n-grams found are also highly distinctive, the evidence of common authorship is nonetheless valid despite differences in text lengths.

4 Results

Figure 1 reveals that the relationship between the percentage of texts using a 2-gram (occurring in at least two texts) and its frequency rank forms a Zipfian shape, as expected (Zipf, 1935). The graph shows that the top eight 2-grams appear in at least 20% of the corpus. Some of these are very frequent because they reflect common grammatical structures of English, such as ‘I am’, ‘I have’, ‘I will’. Two 2-grams reflect the influence of the signature and salutation of the ‘Dear Boss’ letter on the rest of the corpus: ‘jack the’ and ‘dear boss’. Finally, the high incidence of the 2-grams ‘I shall’ and ‘yours truly’ is probably explained both by the influence of the ‘Dear Boss’ letter and by the register of the letters.

Because of their frequent occurrence and thus reduced discriminatory power, these top eight 2-grams were excluded from further analysis.

The distance between each pair of texts was quantified using the Jaccard distance based on the presence or absence of the remaining 1541 word 2-grams and a distance matrix was thereby generated. Figure 2 shows a histogram and boxplot of the Jaccard distances for all possible pairs of texts in the JRC.

As the histogram of Fig. 2 shows, the most frequent Jaccard distance and also the median distance is approximately 1, which generally speaking means that the texts in the JRC are not very similar to each other. Only 25% of the scores are lower than 0.98, which is marked in Fig. 2 by the leftmost edge of the boxplot, and only 6% of the scores are lower than 0.95, that is, the outliers in the boxplot of Fig. 2 indicated by circles.

The distance matrix was then used for a hierarchical cluster analysis that can be visualized through the radial dendrogram in Fig. 3.
              Fig. 1 Relationship between rank and percentage of occurrence for each word 2-gram in the JRC occurring in at least
              two texts
             Fig. 3 Radial dendrogram displaying the results of a hierarchical cluster analysis of the JRC corpus using the Ward
             method based on Jaccard distances. The name of the texts is a code starting with two letters from the signature and
             followed by the date in which it was received. The texts mentioned in the introduction, including the pre-publication
             texts, contain their name in addition to the code
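The Jaccard distance between two texts, computed on the presence or absence of word 2-gram types, can be illustrated with a minimal sketch. The function names, the whitespace tokenisation, and the convention of returning 1.0 for two empty sets are assumptions made for illustration; the sketch also skips the step of keeping only 2-grams that occur in at least two texts. The two input strings are fragments taken from the concordances in Table 1, not the full texts.

```python
from itertools import combinations


def word_bigrams(text):
    """Return the set of word 2-gram types (lower-cased, whitespace-tokenised)."""
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def jaccard_distance(a, b):
    """Jaccard distance 1 - |A & B| / |A | B| between two sets of n-gram types."""
    union = a | b
    if not union:
        # Assumption: two texts with no 2-grams at all are treated as maximally distant.
        return 1.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(texts):
    """Build a symmetric matrix (dict of dicts) of pairwise Jaccard distances."""
    grams = {name: word_bigrams(body) for name, body in texts.items()}
    names = list(grams)
    matrix = {row: {col: 0.0 for col in names} for row in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(grams[x], grams[y])
        matrix[x][y] = matrix[y][x] = d
    return matrix


# Fragments from Table 1 (hypothetical keys, illustrative only).
texts = {
    "dear_boss_fragment": "keep this letter back till i do a bit more work",
    "saucy_jacky_fragment": "thanks for keeping last letter back till i got to work again",
}
matrix = distance_matrix(texts)
print(round(matrix["dear_boss_fragment"]["saucy_jacky_fragment"], 3))
```

The two fragments share three of eighteen distinct 2-grams, so even a visibly related pair of short texts yields a distance well above 0.5, which is consistent with most JRC pairs scoring close to 1.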
              letter and the ‘Saucy Jacky’ postcard. Additionally,          in reference corpora of 19th century English. The
              these two texts have a Jaccard distance of 0.93,              use of ‘till’ as a variant of ‘until’ is also not very
              which is a degree of dissimilarity that can be found          distinctive as it is the predominant variant in the
              in less than 5% of the pairs of texts in the JRC. The         JRC (80%), CLMET3 (75%), and EOBC (90%) but
              amount of shared language is striking considering the         not in 19th century COHA (28%).
              fact that the ‘Saucy Jacky’ postcard is very short and           The two texts also share the use of infinitive
              does not share any linguistic link with either the 24         clauses to post-modify the noun ‘time’ with a neg-
              September text or the 1 October text. Although the            ation in the matrix clause (6), which occurs in only
              ‘Dear Boss’ letter shares a number of 2-grams with            two other texts in the JRC. The structure is quite
              both Text 1 and Text 4, the Jaccard score for both            rare even at a more general level, as it is found about
              pairs is in the average for the corpus.                       ten to eighteen times per million words across the
                  Excluding the 3-gram ‘Jack the Ripper’, which             reference corpora.
              refers to the signatures of the two texts, Table 1               ‘Dear Boss’ and ‘Saucy Jacky’ also share the use
              below presents the concordances of their overlap-             of the verb ‘work’ to euphemistically indicate the act
              ping 2-grams, with an analysis of their syntactic             of killing (7). This use of ‘work’ is found in some
              structure.                                                    post-publication JRC texts (about 20% of the texts
                  A closer examination reveals that the two texts           in the corpus). It is very difficult to estimate dis-
              share 2-grams of varying distinctiveness. The phrase          tinctiveness for (7) using larger reference corpora,
              ‘a bit’, although with different syntactic function           however, as it would involve the manual analysis of
              (1), the verbs ‘give’ (2) and ‘got’ (3), or the use of        thousands of instances.
              the infinitive verb ‘to get’ (3) are common struc-               Finally, the two texts share the use of a verb
              tures that are frequently found both in the JRC and           phrase headed by the phrasal verb ‘to keep back’
              Table 1 Syntactic analysis of the concordances for the 2-grams in common between Dear Boss and Saucy Jacky
              1               till I do [NP a bit more work] (Dear Boss)
                              number one squealed [ADVP a bit] (Saucy Jacky)
              2               [NP I] [VP gave [NP the lady] [NP no time to squeal]] (Dear Boss)
                              [NP I] [VP gave [NP you] [NP the tip]] (Saucy Jacky)
              3               [NP I] [VP got [NP all the red ink] [Part off]] (Dear Boss)
                              till [NP I] [VP got [INFCL to work again]] (Saucy Jacky)
              4               I want [INFCL to get [INFCL to work]] (Dear Boss)
                              had not time [INFCL to get [NP ears]] (Saucy Jacky)
              5               [SUB till] [CL [NP I] [VP do get buckled]] (Dear Boss)
                              [SUB till] [CL [NP I] [VP do a bit more work]] (Dear Boss)
                              [SUB till] [CL [NP I] [VP got to work again]] (Saucy Jacky)
              6               [NP no time [INFCL to squeal]] (Dear Boss)
                              had not [NP time [INFCL to get ears]] (Saucy Jacky)
              7               I want to get [INFCL to work] (Dear Boss)
                              till I got [INFCL to work again] (Saucy Jacky)
              8               [VP keep [NP this letter] [PART back] [SUBCL till I do]] (Dear Boss)
                              thanks for [VP keeping [NP last letter] [PART back] [SUBCL till I got to work]] (Saucy Jacky)
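The overlapping n-grams listed in Table 1 can be recovered mechanically. The following is a minimal sketch, assuming simple whitespace tokenisation and hypothetical function names; it returns the n-gram types two texts share and the longest such n-gram. The inputs are fragments from rows 1, 3, and 8 of Table 1 rather than the full texts.

```python
def ngrams(tokens, n):
    """Return the set of word n-gram types in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(text_a, text_b, n):
    """Return the n-gram types that occur in both texts."""
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split()
    return ngrams(tokens_a, n) & ngrams(tokens_b, n)


def longest_shared_ngrams(text_a, text_b):
    """Increase n until no n-gram is shared; return the last non-empty overlap."""
    best = set()
    n = 1
    while True:
        common = shared_ngrams(text_a, text_b, n)
        if not common:
            return best
        best = common
        n += 1


dear_boss = "keep this letter back till i do a bit more work"
saucy_jacky = "thanks for keeping last letter back till i got to work again"

print(sorted(shared_ngrams(dear_boss, saucy_jacky, 2)))
print(longest_shared_ngrams(dear_boss, saucy_jacky))
```

On these fragments the longest shared sequence is the 4-gram 'letter back till i'.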
             with the direct object being a noun phrase with                     carried out by searching for occurrences of the
             ‘letter’ as head followed by a subordinate clause                   lemma of these variants accompanied by the
             introduced by the subordinator ‘till’ (8). Indeed,                  lemma LETTER within a span of  seven words. The
             the 4-gram ‘letter back till I’ itself neither occurs               concordances were then manually examined to
             in any other text in the JRC, nor in any other                      count only instances of the meaning of ‘withholding
             corpus of 19th century English listed above.                        a letter, delay a letter to be sent’.
             Because of the rarity of this 4-gram and the absence                    This corpus search revealed that ‘keep back’ is
             of a more relevant large corpus of 19th century                     found 22.5% of the times the meaning of ‘delay
             English letters, a search of this 4-gram was per-                   sending a letter’ is expressed across the reference
             formed on the web, which returned a total of                        corpora. In the 19th century section of COHA, the
             2,640 hits, all from exact copies of either the ‘Dear               majority of the instances (59%) use the variant
             Boss’ letter or of the ‘Saucy Jacky’ postcard.                      ‘withhold’. Out of seven instances of ‘keep back’,
                 A search was then performed using the 19th cen-                 four are from one author, John Townsend
             tury English corpora listed above on the overall                    Trowbridge. In CLMET3 the most common variant
             phrasal verb ‘keep back’ used with the meaning of                   is again ‘withhold’. One judge also uses ‘keep back’
             withholding a letter as opposed to the use of the                   in the EOBC, where the most common variant is
             other synonyms ‘keep’ (without ‘back’), ‘hold                       instead ‘detain’. Finally, in the JRC, only three in-
             back’, ‘hold up’, ‘hold out’, ‘withhold’, ‘delay’ plus              stances of this meaning are found, two of which are
             any verb indicating ‘sending’, ‘refrain’ plus any verb              instances of ‘keep (quiet)’ found in two letters
             indicating ‘sending’, and ‘detain’. The queries were                dated, respectively, 20 October (‘keep this quiret
             Fig. 4 Network graph visualizing the relationships between the pre-publication texts. The size of each node is pro-
             portional to each text’s length. Each edge represents a shared word 2-gram. For each pair of texts the Jaccard distance is
             also reported. Distances are rounded up
              [sic] till I have done one’) and 09 November 1888             whether further links between these two texts and
              (‘keep this letter a bit quiert [sic] till you here of me     other texts can be found.
              again’). The third one is found in the ‘Moab and                 As Fig. 5 indicates, only eight JRC texts have a
              Midian’ letter and it is the only instance across all         Jaccard distance lower than 0.95 with ‘Dear Boss’,
               the corpora to exactly match the syntactic structure          including ‘Saucy Jacky’ (dJ = 0.929) and ‘Moab and
               in (8), having the object in between the main verb            Midian’ (dJ = 0.934), which are both therefore more
              and the particle as well as a subordinate clause              similar to ‘Dear Boss’ than 95% of the JRC. The most
              introduced by the subordinator ‘till’ (‘keep this             similar text to ‘Dear Boss’ is, however, JR_191188,
              back till three are wiped out’).                              with a Jaccard distance of 0.776. This is not reported
                 In conclusion, among the four pre-publication              in Fig. 5 to ease the visualization of the boxplots.
              texts, these results support the hypothesis that the             However, this text can be discounted as its
              ‘Dear Boss’ and ‘Saucy Jacky’ texts were not written          anomalous score is explained by the fact that most
              independently from each other, since these two texts          of it was copied verbatim from ‘Dear Boss’, as the
              are more similar to each other in their use of word           presence of an overlapping 13-gram demonstrates:
              2-grams than 95% of all the other possible pairs of
                                                                                 I want to get to work right away if I get a
              texts in the JRC even though the texts received later
                                                                                 chance and will do another one indoors.
              could have been influenced by them, and since some
                                                                                 (JR_191188)
              of these similarities are also distinctive.
                                                                                 My knife’s so nice and sharp I want to get to
                                                                                 work right away if I get a chance. (Dear
              4.2 The post-publication texts                                     Boss)
              Having established a link between the ‘Dear Boss’             This is somewhat expected in the post-publication
              letter and the ‘Saucy Jacky’ postcard, let us now             texts, as the ‘Dear Boss’ and ‘Saucy Jacky’ were in
              explore the post-publication texts to determine               the public domain.
Fig. 5 Boxplots showing Jaccard scores for Dear Boss (left) and Saucy Jacky (right) and all the other texts in the JRC
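The threshold reading of Fig. 5, that is, counting how many texts fall below a given Jaccard distance from a reference text, can be sketched as follows. The function names are hypothetical, and only the three named distances to 'Dear Boss' come from this article; the entries labelled `hypothetical_*` are invented placeholders so that the example runs, not corpus values.

```python
def texts_closer_than(distances_by_text, threshold):
    """Return (distance, name) pairs below the threshold, closest first."""
    return sorted(
        (dist, name) for name, dist in distances_by_text.items() if dist < threshold
    )


def share_below(distances, threshold):
    """Proportion of distances strictly below the threshold."""
    distances = list(distances)
    if not distances:
        raise ValueError("no distances supplied")
    return sum(d < threshold for d in distances) / len(distances)


# Distances from 'Dear Boss'. The first three values are reported in the text;
# the remaining three are placeholders for illustration only.
distances_to_dear_boss = {
    "JR_191188": 0.776,
    "Saucy Jacky": 0.929,
    "Moab and Midian": 0.934,
    "hypothetical_A": 1.0,
    "hypothetical_B": 0.98,
    "hypothetical_C": 0.99,
}

close = texts_closer_than(distances_to_dear_boss, 0.95)
print(close)
print(share_below(distances_to_dear_boss.values(), 0.95))
```

With the real distance matrix in place of the placeholders, the same two calls would give the list of texts under the 0.95 cut-off and the proportion of the corpus they represent.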
                For ‘Saucy Jacky’, Fig. 5 indicates that the median              ‘Saucy Jacky’ of all the other texts in the JRC, but it
             score is 1 and that 50% of the texts in the JRC there-              is also almost as close as ‘Dear Boss’ is to ‘Saucy
             fore have almost no linguistic link with it. Only twelve            Jacky’ and, more importantly, it is the only text
             JRC texts have a Jaccard distance lower than 0.96,                  that is very close to both ‘Saucy Jacky’ and ‘Dear
             and, among these, the ‘Moab and Midian’ letter is                   Boss’ (with the exclusion of the JR_191188 that con-
             even more striking, as its Jaccard score with ‘Saucy                tains a 13-gram copied word-by-word).
             Jacky’ is 0.90, which is 0.03 points smaller than the                  Table 2 presents the 2-grams and their underlying
             second most similar text, the ‘Dear Boss’ letter.                   syntactic structures shared by ‘Moab and Midian’
                From this analysis, it is evident that the ‘Moab                 and either ‘Dear Boss’ or ‘Saucy Jacky’. ‘Midian’
             and Midian’ letter not only is the most similar to                  shares with the two pre-publication texts as well as
             Table 2 Syntactic analysis of the concordances for the n-grams in common between Dear Boss and Saucy Jacky and the
             Moab and Midian letters
             1                  till I do [NP a bit more work] (Dear Boss)
                                number one squealed [ADVP a bit] (Saucy Jacky)
                                will send you [NP a bit of face] by post (Midian)
             2                  I love [NP my work] (Dear Boss)
                                The police now reckon [NP my work] a practical joke (Midian)
             3                  you ll hear about [NP [NP saucy Jacky] [Gen s] [N work]] (Saucy Jacky)
                                well well [NP Jacky] [VP ’s [NP a very practical joker]] (Midian)
             4                  ripping them till [NP I] [VP do [VP get buckled.]] (Dear Boss)
                                The next job [NP I] [VP do] I shall clip (Dear Boss)
                                [NP I] [VP do] a bit more work (Dear Boss)
                                Do as [NP I] [VP do] and the light of glory (Midian)
             5                  I keep on hearing [NP the police] have caught me (Dear Boss)
                                and send to [NP the police officers] (Dear Boss)
                                [NP The police] now reckon (Midian)
             6                  [NP [ADJ Grand] [N work]] the last job was (Dear Boss)
                                helps me in my [NP [ADJ grand] [N work]] (Midian)
             7                  is fit enough I hope [INTJ ha. ha.] (Dear Boss)
                                They say I’m a doctor now [INTJ ha ha] (Dear Boss)
                                Jacky’s a very practical joker [INTJ ha ha ha] (Midian)
             8                  I wasnt codding [NP dear old Boss] (Saucy Jacky)
                                I promise this [NP dear old Boss] (Midian)
             9                  [VP keep [NP this letter] [PART back] [SUBCL till I do]] (Dear Boss)
                                thanks for [VP keeping [NP last letter] [PART back] [SUBCL till I got to work]] (Saucy Jacky)
                                [VP Keep [NP this] [PART back] [SUBCL till three are wiped out]] (Midian)
             10                 [CL . . . [NP saucy Jacky s work] [ADVP tomorrow]] [NP double event] [NP this time]
                                (Saucy Jacky)
                                [CL I must get [INFCL to work] [ADVP tomorrow]] [NP treble event] [NP this time]
                                (Midian)
              with several other JRC texts the use of the phrase ‘a         was not made public before the ‘Saucy Jacky’ post-
              bit’ (1) and the verb ‘work’ to euphemistically mean          card was sent, the degree of their shared linguistic
              ‘kill’ (2). ‘Midian’ and ‘Saucy Jacky’ also share the use     encoding is highly suggestive of the two documents
              of the pseudonym ‘Jacky’, although the 2-gram ‘Jacky          not being produced independently. Although it is
              s’ is only a surface similarity, as its underlying syn-       entirely possible that one author was responsible
              tactic structure is very different (3). ‘Midian’ also         for all of the earlier texts, the linguistic evidence
              presents the use of a pro-verb ‘do’ (4) and it men-           found so far can only suggest a link between the
              tions the police (5), similarly to ‘Dear Boss’. The           ‘Dear Boss’ letter and the ‘Saucy Jacky’ postcard
              adjective ‘grand’ to modify ‘work’ (6), the interjec-         while no strong links can be found between these
              tion ‘ha ha’ (7), and the vocative ‘dear old boss’ (8)        two texts and the other two pre-publication texts.
              are features that have been copied by other authors of            Among the evidence of a link between the ‘Dear
              the JRC texts, as they appear in, respectively, three,        Boss’ letter and the ‘Saucy Jacky’ postcard, the
              eight, and fifty-five other JRC texts.                        strongest piece of evidence is the presence of a
                  The two most distinctive structures are the verb          shared distinctive 4-gram, ‘letter back till I’. The
              phrase headed by ‘keep back’ (9), already discussed           syntactic structure underlying this 4-gram is a verb
              above, and the use of a verbless clause, ADJ ‘–ble            phrase headed by a phrasal verb that, used within
              event this time’, elaborating the previous clause             that particular structure underlying that particular
              ending with the adverb ‘tomorrow’ (10). This last             unit of meaning, is also rare and distinctive overall.
              syntactic structure is underlying the 2-gram ‘work            The presence of this 4-gram and of this structure
              tomorrow’ and the 3-gram ‘event this time’, which             thus supports the hypothesis that the two texts were
              do not appear in any other JRC text.                          written by the same person. This conclusion is sub-
                  The 2-gram ‘work tomorrow’ is surprisingly in-            stantiated by the fact that despite the presence of
              frequent in the reference corpora (0.03–0.05 per              about 200 texts trying to imitate the style of the
              million words) while the 3-gram ‘event this time’             ‘Dear Boss’ letter or ‘Saucy Jacky’ postcard, no
              cannot be found at all. Although the 3-gram can               other text has managed to reproduce this structure
              be found on the web (617,000 hits), a search of               or 4-gram, which indeed this analysis has proved to
              the two n-grams together returns almost only in-              be the real distinctive feature of these two texts.
              stances of either ‘Saucy Jacky’ or ‘Moab and                      The only exception is the ‘Moab and Midian’
              Midian’.                                                      letter, which does not use the 4-gram but contains
                  In conclusion, there is linguistic evidence in sup-       an instance of ‘keep back’ meaning ‘to withhold’,
              port of the hypothesis that the ‘Moab and Midian’             including the co-selection of the position of the
              letter has an authorship link with the other two pre-         object and of the adverbial clause introduced by
              publication texts, even accounting for the fact that          ‘till’. Furthermore, the ‘Moab and Midian’ letter
              ‘Dear Boss’ and ‘Saucy Jacky’ were publicly available         also shares another distinctive lexicogrammatical
              at the time ‘Midian’ was received.                            structure with ‘Saucy Jacky’, the verbless clause
                                                                            ADJ ‘–ble event this time’ which elaborates the pre-
                                                                            vious clause ending with the adverb ‘tomorrow’. It
              5 Discussion                                                  is not possible to discount that the author of this
                                                                            letter was simply more skilled in copying the style of
              The analysis of the n-gram types reported above sug-          ‘Dear Boss’ than others, as by the time the ‘Moab
              gests that the ‘Dear Boss’ letter and the ‘Saucy Jacky’       and Midian’ letter was received all the earliest texts
              postcard share distinctive linguistic similarities.           were publicly available. However, the ‘Moab and
               Because authorship analysis studies have demonstrated         Midian’ letter is striking in also being the most simi-
               that common strings or rare collocations shared by            lar letter in terms of the number of shared word 2-
              documents are indicative of a common authorial                grams, even despite the fact that probably hundreds
              source (Coulthard, 2004; Mollin, 2009; Johnson                of other authors tried to imitate the style of ‘Dear
              and Wright, 2014), given that the ‘Dear Boss’ letter          Boss’ and ‘Saucy Jacky’.
                 The analysis also points out that there is no link              style of the ‘Dear Boss’ letter and of the ‘Saucy Jacky’
             between the ‘From Hell’ letter and the other histor-                postcard. However, it is evident that none of the
             ically important texts in the case. Although this lack              authors of these texts successfully managed to indi-
             of link does not constitute evidence that they were                 viduate that the real linguistic distinctiveness con-
             not written by the same person, this finding does                   sisted in a seemingly common string such as ‘letter
             lend some support to the initial presuppositions of                 back till I’, or in the phrasal verb ‘keep back’ and its
             other scholars that ‘Dear Boss’ and ‘Saucy Jacky’ are               underlying structure, or even in simply the presence
             independent from the ‘From Hell’ letter                             of the meaning of ‘withhold this letter’, found in
             (Rumbelow, 1979). This and many other letters in                    only two other Jack the Ripper texts but encoded
             the JRC texts can be analysed in more detail in the                 differently.
             future.                                                                 Instead, impostors imitated structures such as
                 Historically speaking, the comparison presented                 the salutation ‘Dear Boss’. Quantitatively speaking,
             between the earliest letters ever received in the                   despite the presence of these letters in full in the
             Whitechapel murders case provides linguistic evi-                   public domain, only a very limited percentage of
             dence supporting the hypothesis that the two most                   them presents substantial linguistic similarities,
             iconic texts sent during the case were written by the               implying that techniques such as the analysis of
             same person. Although several scholars have already                 short texts using similarity measures such as the
             commented on the similarity of the handwriting of                   Jaccard coefficient are quite effective in filtering
             the ‘Dear Boss’ letter and the ‘Saucy Jacky’ postcard,              this type of noise.
             the common authorship of these two texts has never                      Theoretically, the results presented in this article
             been established with certainty. The present analysis,              also contribute to the understanding of idiolect. A
             however, found linguistic evidence that supports the                superficial reading of most of the JRC letters would
             common authorship of these two texts. Future ana-                   only reveal their similarities in terms of meanings,
             lyses focused on their profiling or on the compari-                 themes, purposes, and some phraseology. However,
             son with known writings of suspect authors can thus                 this analysis has revealed that by investigating the
             take as point of departure a link between these two                 way these meanings, themes, and purposes are
             texts.                                                              encoded linguistically, uniqueness emerges, as
                 Additionally, of great historical importance is                 demonstrated by the relatively high average Jaccard
             also the link found between the two earlier iconic                  distances between the letters. As shown by Wright
             texts and the ‘Moab and Midian’ letter, since this                  (2017) for short emails, although meanings and
             text is one of the most controversial in the JRC.                   speech acts can be shared, it is the way they are
             Besides being the third and last letter that was ever               encoded in words and syntactic structures that
             sent to the Central News Agency, after ‘Dear Boss’                  tends to be idiosyncratic or unique.
             and ‘Saucy Jacky’, Bulling’s decision to send a
             copy of the ‘Moab and Midian’ letter instead of
             the original was never justified by the journalist                  6 Conclusions
             and still remains suspiciously unexplained (Evans
             and Skinner, 2001). The linguistic link found be-                   In this article, an analysis of the texts sent during the
             tween these three texts is therefore far from coinci-               Whitechapel murders case was presented. This ana-
             dental in the light of the other non-linguistic                     lysis found linguistic evidence that supports the hy-
             evidence and significantly contributes to the                       pothesis that the two most iconic texts signed as
             debate on the origin of the letter.                                 ‘Jack the Ripper’, the ‘Dear Boss’ letter and the
                 The present analysis is also successful in present-             ‘Saucy Jacky’ postcard, have been written by the
             ing serious implications for modern research in fo-                 same person. Because of the number and the dis-
             rensic linguistics and authorship analysis. The JRC                 tinctiveness of the linguistic similarities, it is likely
             is a corpus made up of texts the majority of which                  that an authorial link also exists between these two
             was fabricated by individuals that were imitating the               texts and a third letter sent to the same recipient, the
              ‘Moab and Midian’ letter. These results constitute              Eighteen-Bisang, R. (2005). Dracula, Jack the Ripper and
              new forensic evidence in the Jack the Ripper case                 a Thirst for Blood. Ripperologist, 60: 3–12.
              after more than 100 years, even though they do not
                                                                              Ellis, N. C. and Simpson-Vlach, R. (2009). Formulaic
              reveal information about the identity of the killer(s).
                                                                                 language in native speakers: triangulating psycholin-
                 Besides the historical and forensic implications,               guistics, corpus linguistics, and education. Corpus
              the results presented in this article also have inter-             Linguistics and Linguistic Theory, 5: 61–78.
              esting consequences for modern research in author-
                                                                              Ellis, S. (1994). The Yorkshire Ripper enquiry: part I.
              ship analysis, forensic linguistics, and research on               Forensic Linguistics, 1: 197–206.
              idiolect. The results in this article present additional
                                                                              Evans, S. P. and Skinner, K. (2001). Jack the Ripper:
              evidence that uniqueness in linguistic production
                                                                                Letters from Hell. Stroud: Sutton.
              can be found in the way meaning is encoded and
              that this encoding of meaning can be difficult to               Gómez-Adorno, H., Aleman, Y., Vilariño, D., Sanchez-
                                                                                Perez, M. A., Pinto, D., and Sidorov, G. (2017).
              imitate.
                                                                                Author clustering using hierarchical clustering analysis
                                                                                – notebook for PAN at CLEF 2017. In Cappellato, L.,
                                                                                Ferro, N., Goeuriot, L., and Mandl, T. (eds), CLEF 2017
              Supplementary Data                                                Working Notes. CEUR Workshop Proceedings. Dublin,
                                                                                Ireland: CLEF and CEUR-WS.org.
              Supplementary data are available at LLC online.
                                                                              Grant, T. (2010). Txt 4n6: idiolect free authorship ana-
                                                                                lysis. In Coulthard, M. (ed.), Routledge Handbook of
                                                                                Forensic Linguistics. London: Routledge, pp. 508–23.
              References
              Barlow, M. (2013). Individual differences and usage-based grammar.
                International Journal of Corpus Linguistics, 18: 443–78.
              Barlow, M. and Kemmer, S. (2000). Usage-Based Models of Language.
                Cambridge: Cambridge University Press.
              Begg, P. (2004). Jack the Ripper: The Definitive History. Harlow: Longman.
              Begg, P. and Bennett, J. G. (2013). The Complete and Essential Jack the
                Ripper. London: Penguin Books.
              Biber, D. (1994). Register and social dialect variation: an integrated
                approach. In Biber, D. and Finegan, E. (eds), Sociolinguistic
                Perspectives on Register. Oxford: Oxford University Press, pp. 315–47.
              Biber, D. (2012). Register as a predictor of linguistic variation. Corpus
                Linguistics and Linguistic Theory, 8: 9–37.
              Biber, D., Conrad, S., and Cortes, V. (2004). If you look at . . .: lexical
                bundles in university teaching and textbooks. Applied Linguistics, 25:
                371–405.
              Brocardo, M. L., Traore, I., Saad, S., and Woungang, I. (2013). Authorship
                verification for short messages using stylometry. In 2013 International
                Conference on Computer, Information and Telecommunication Systems
                (CITS), IEEE, pp. 1–6.
              Cook, A. (2009). Jack the Ripper. Stroud: Amberley.
              Coulthard, M. (2004). Author identification, idiolect, and linguistic
                uniqueness. Applied Linguistics, 25: 431–47.
              Grant, T. (2013). TXT 4N6: method, consistency, and distinctiveness in the
                analysis of SMS text messages. Journal of Law and Policy, 21: 467–94.
              Grieve, J. (2007). Quantitative authorship attribution: an evaluation of
                techniques. Literary and Linguistic Computing, 22: 251–70.
              Günther, F. (2016). Constructions in Cognitive Contexts: Why Individuals
                Matter in Linguistic Relativity Research. Berlin; Boston: Walter de
                Gruyter.
              Haggard, R. F. (2007). Jack the Ripper as the threat of outcast London. In
                Warwick, A. and Willis, M. (eds), Jack the Ripper: Media, Culture,
                History. Manchester; New York, NY: Manchester University Press.
              Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language.
                London: Routledge.
              Jaccard, P. (1912). The distribution of the Flora in the Alpine Zone. New
                Phytologist, 11: 37–50.
              Johnson, A. and Wright, D. (2014). Identifying idiolect in forensic
                authorship attribution: an N-gram textbite approach. Language and
                Law/Linguagem E Direito, 1: 37–69.
              Juola, P. (2013). Stylometry and immigration: a case study. Journal of Law
                and Policy, 21: 287–98.
              Keppel, R., Weis, J., Brown, K., and Welch, K. (2005). The Jack the Ripper
                murders: a modus operandi and signature analysis of the 1888–1891
                Whitechapel murders. Journal of Investigative Psychology and Offender
                Profiling, 2: 1–21.
              Koppel, M. and Schler, J. (2004). Authorship verification as a one-class
                classification problem. In Proceedings of the 21st International
                Conference on Machine Learning. ACM, Banff, Alberta, Canada, pp. 62–7.
              Koppel, M., Schler, J., and Argamon, S. (2013). Authorship attribution:
                what’s easy and what’s hard? Journal of Law and Policy, 21: 317–31.
              Koppel, M., Schler, J., Argamon, S., and Winter, Y. (2012). The
                ‘fundamental problem’ of authorship attribution. English Studies, 93:
                284–91.
              Langacker, R. W. (1987). Foundations of Cognitive Grammar. Stanford, CA:
                Stanford University Press.
              Larner, S. (2014). A preliminary investigation into the use of fixed
                formulaic sequences as a marker of authorship. International Journal of
                Speech, Language and the Law, 21: 1–22.
              Lewis, J. W. (1994). The Yorkshire Ripper enquiry: part I. Forensic
                Linguistics, 1: 207–16.
              Mollin, S. (2009). ‘I entirely understand’ is a Blairism: the methodology
                of identifying idiolectal collocations. International Journal of Corpus
                Linguistics, 14: 367–92.
              Oakes, M. P. (2014). Literary Detective Work on the Computer. Amsterdam:
                John Benjamins Publishing Company.
              Perry Curtis, L. (2001). Jack the Ripper and the London Press. New Haven;
                London: Yale University Press.
              Remington, T. (2004). Dear boss: hoax as popular communal narrative in the
                case of the Jack the Ripper letters. Journal of Criminal Justice and
                Popular Culture, 10: 199–222.
              Rumbelow, D. (1979). The Complete Jack the Ripper. London: W. H. Allen.
              Schmid, H.-J. (2016). A framework for understanding linguistic
                entrenchment and its psychological foundations. In Entrenchment and the
                Psychology of Language Learning: How We Reorganize and Adapt Linguistic
                Knowledge. Berlin: De Gruyter Mouton, pp. 9–36.
              Schmid, H.-J. and Mantlik, A. (2015). Entrenchment in historical corpora?
                Reconstructing dead authors’ minds from their usage profiles. Anglia,
                133: 583–623.
              Schmitt, N. (2004). Formulaic Sequences: Acquisition, Processing, and Use.
                Amsterdam; Philadelphia: John Benjamins.
              Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford: Oxford
                University Press.
              Stamatatos, E. (2009). A survey of modern authorship attribution methods.
                Journal of the American Society for Information Science and Technology,
                60: 538–56.
              Storey, N. (2012). The Dracula Secrets: Jack the Ripper and the Darkest
                Sources of Bram Stoker. Stroud: History Press.
              Sugden, P. (2002). The Complete History of Jack the Ripper. London:
                Robinson.
              Tremblay, A., Derwing, B., and Libben, G. (2009). Are lexical bundles
                stored and processed as single units? Working Papers of the Linguistics
                Circle. University of Victoria, vol. 19, pp. 258–79.
              Tropp, M. (1999). Images of Fear: How Horror Stories Helped Shape Modern
                Culture (1818–1918). Jefferson, NC: McFarland & Co.
              Turell, M. T. and Gavaldà, N. (2013). Towards an index of idiolectal
                similitude (or distance) in forensic authorship analysis. Journal of
                Law and Policy, 21: 495–514.
              Walkowitz, J. (1982). Jack the Ripper and the myth of male violence.
                Feminist Studies, 8: 542–74.
              Wray, A. (2005). Formulaic Language and the Lexicon. Cambridge: Cambridge
                University Press.
              Wright, D. (2017). Using word N-grams to identify authors and idiolects: a
                corpus approach to a forensic linguistic problem. International Journal
                of Corpus Linguistics, 22: 212–41.
              Zipf, G. (1935). The Psycho-Biology of Language: An Introduction to
                Dynamic Philology. Boston: Houghton Mifflin.