
Bringing Contextual Information to Google Speech Recognition

Petar Aleksic, Mohammadreza Ghodsi, Assaf Michaely, Cyril Allauzen,
Keith Hall, Brian Roark, David Rybach, Pedro Moreno

Google Inc.
{apetar,ghodsi,amichaely,allauzen,kbhall,roark,rybach,pedro}@google.com

Abstract

In automatic speech recognition on mobile devices, very often what a user says strongly depends on the particular context he or she is in. The n-grams relevant to the context are often not known in advance. The context can depend on, for example, the particular dialog state, the options presented to the user, the conversation topic, the location, etc. Speech recognition of sentences that include these n-grams can be challenging, as they are often not well represented in a language model (LM) or may even include out-of-vocabulary (OOV) words.

In this paper, we propose a solution for using contextual information to improve speech recognition accuracy. We utilize an on-the-fly rescoring mechanism to adjust the LM weights of a small set of n-grams relevant to the particular context during speech decoding.

Our solution handles out-of-vocabulary words. It also addresses efficient combination of multiple sources of context, and it even allows biasing class-based language models. We show significant speech recognition accuracy improvements on several datasets, using various types of contexts, without negatively impacting the overall system. The improvements are obtained in both offline and live experiments.

1. Introduction

The impact of speech recognition quality on the user experience on mobile devices has been increasing significantly with the growth of voice input usage. Voice input is used to perform search by voice, give specific voice commands, or ask general questions. Users expect their phones to keep getting smarter and to take into account various signals that would improve the quality of communication with the device and the overall user experience.

In this effort, utilizing contextual information plays a major role. The context can be defined in a number of ways. It can depend on the location the user is in, the time of day, the user's search history, the particular dialog state the user is in, the conversation topic, the content on the screen the user is looking at, etc. Very often the amount of information about the context is very small, consisting of only a few words or sentences. However, if the context is relevant, it can significantly improve speech recognition accuracy, provided it is consumed appropriately by the speech recognition system.

In this paper we present a system that uses contextual information to improve speech recognition accuracy. Our solution works well both for large contexts and for contexts consisting of only several words or phrases. We use a framework for biasing language models (LMs) using n-grams as the biasing context [1]. The n-grams and corresponding weights, calculated based on the reliability of the context, are represented as a weighted finite-state transducer [2, 3], the contextual model. We introduce several approaches for creating contextual models from the context, as well as methods for combining the score from the main language model and the contextual model. In addition, we address the issue of handling out-of-vocabulary (OOV) words present in the provided context by using a class-specific language model, as described in section 2.

One can view this approach as a generalization of cache models [4, 5, 6], which have been used to personalize language models based on recent language produced by the individual whose utterance is being recognized. Our approach derives the biasing n-grams from varied sources beyond an individual's prior utterances and makes use of more complex methods for mixing with the baseline model than the fixed interpolation or decay functions typically used with recency cache models [4, 5]. See also the discussion of related work in [1].

We organize the paper as follows. In section 2 we present the approach we used to perform on-the-fly n-gram biasing of the language model towards context present in the contextual model. In section 3 we present various approaches for creating a contextual model from the provided context. Finally, in section 4, we describe the test sets used in our experiments and present all of our experimental results.

2. Contextual language model biasing

In this section we describe the language model biasing framework we use and how it handles class-based language models and OOVs.

2.1. General approach

We used the framework for biasing language models using n-grams introduced in [1]. In this framework, a small set of n-grams is compactly represented as a weighted finite-state transducer [7]. An on-the-fly rescoring algorithm allows biasing the recognition towards these n-grams. The cost from the main language model G is combined with the cost from the contextual model B as follows:

    s(w|H) = sG(w|H)                     if (w|H) ∉ B
             C(sG(w|H), sB(w|H))         if (w|H) ∈ B        (1)

where sG(w|H) is the raw score from the main model G for the word w leaving history state H, and sB(w|H) is the raw score from the biasing model. Observe that this approach only modifies the LM scores of n-grams, Hw, for which the biasing model provides an explicit score. This differs from regular language model interpolation and is motivated by the fact that the support of the biasing model is much sparser than that of the main language model.
Figure 1: Example of a class grammar with decorators for the "$CONTACTS" class.

Figure 2: Transducer T maps decorator-delimited phrases back to the corresponding class label.
[1] offers the following alternatives for the operation C used to combine the scores. The first approach corresponds to using log-linear interpolation:

    C'(sG(w|H), sB(w|H)) = α·sG(w|H) + β·sB(w|H).        (2)

Since our costs are negative-log conditional probabilities, this simply corresponds to linear interpolation in the log domain.

Finally, [1] also provides a mechanism that restricts the biasing to be applied only if it reduces the cost. In equation (3) we define the positive biasing function, which applies this restriction:

    C(sG(w|H), sB(w|H)) = min(sG(w|H), C'(sG(w|H), sB(w|H))).        (3)
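The combination defined by equations (1)-(3) can be sketched in a few lines of code. The following is a minimal Python illustration, not the production implementation: the dictionary stands in for the WFST representation of the biasing model B, and all names and example values are illustrative.

def biased_cost(history, word, s_G, biasing_model, alpha=0.0, beta=1.0):
    """Combine the main LM cost sG(w|H) with the biasing cost sB(w|H).

    biasing_model maps (history, word) pairs to raw biasing costs
    (negative log probabilities). Per equation (1), n-grams the biasing
    model does not score keep their original LM cost.
    """
    key = (history, word)
    if key not in biasing_model:         # (w|H) not in B: first case of equation (1)
        return s_G
    s_B = biasing_model[key]
    c_prime = alpha * s_G + beta * s_B   # log-linear interpolation, equation (2)
    return min(s_G, c_prime)             # positive biasing restriction, equation (3)

# With (alpha, beta) = (0, 1) this reduces to min(sG, sB), the setting used
# for "bias 2" in section 4.2.
print(biased_cost(("cheap", "flights"), "to", 4.2,
                  {(("cheap", "flights"), "to"): 1.5}))   # -> 1.5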
Dynamic decoding of input speech is performed similarly to what is described in [8]. Specifically, given a vocabulary V, we generate a lattice from the alphabet Σ = V ∪ {ε}. Given a CLG (a composition of the context-dependent phone model, lexicon, and main language model), we perform time-synchronous decoding via beam search. As in [8], a pseudo-deterministic word lattice is built during decoding. It is at this point that we apply the on-the-fly rescoring [9] with the contextual biasing model as described in [1].
2.2. Biasing class-based language models

Our main language model is class-based [10, 11, 12]. Examples of classes are address numbers, street names, dates, and contact names, the last being an example of an utterance-dependent, user-specific class.

We might want to bias towards the whole class in some contexts. For instance, we might want to bias towards "call $CONTACTS" or "directions to $ADDRESSNUM $STREETNAME" instead of being limited to biasing towards some instantiations of the classes (e.g. "call Michael" or "directions to 111 Eight Avenue").

Our language model consists of (a) a top-level n-gram language model over regular words and class labels and (b), for each class c, a class grammar Gc over regular words that might be utterance-dependent. All components are represented as weighted automata. At run-time, this model is expanded on demand into a weighted automaton G using the replacement operation as described in [10]. In this approach, class-based biasing is achieved by: (1) modifying each class grammar to insert decorators that allow us to keep track of whether words in the hypothesis word lattice were generated by the top-level LM or by one of the class grammars; this corresponds to using G'c = <c> Gc </c> as the class grammar for class c, where (<c>, </c>) is the decorator pair for c (see Figure 1); (2) allowing n-grams containing class labels in the contextual biasing model; and (3) treating decorator-delimited phrases as their corresponding class label during rescoring, which is achieved by composing the word lattice on the fly with the transducer T described in Figure 2 and then applying the biasing model as described in the previous section.
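To make step (3) above concrete, the following Python sketch mimics, over a plain token sequence rather than a word lattice, the effect of composing with the transducer T: decorator-delimited spans are collapsed to their class label before biasing scores are looked up. The helper name and example strings are illustrative only; the actual system performs this as an on-the-fly FST composition.

def collapse_class_spans(tokens, class_labels):
    """Replace a decorator-delimited span such as "<contact> Michael Riley </contact>"
    by its class label, e.g. "$CONTACTS", mimicking the mapping of Figure 2."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("<") and not tok.startswith("</") and tok.endswith(">"):
            name = tok[1:-1]                               # e.g. "contact"
            j = tokens.index("</" + name + ">", i + 1)     # matching closing decorator
            out.append(class_labels[name])                 # e.g. "$CONTACTS"
            i = j + 1
        else:
            out.append(tok)
            i += 1
    return out

# "call <contact> Michael Riley </contact>" -> ["call", "$CONTACTS"], which can
# then be matched against biasing n-grams such as "call $CONTACTS".
print(collapse_class_spans(["call", "<contact>", "Michael", "Riley", "</contact>"],
                           {"contact": "$CONTACTS"}))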
might be utterance-dependent. All components are represented
as weighted automata. At run-time, this model is expanded on- We want to bias more heavily towards higher order (longer) n-
demand into a weighted automata G using the replacement op- grams. This is because of two related reasons: The first is that
eration as described in [10]. In this approach, class-based bi- we want to reward longer exact matches between the context
asing is achieved by: (1) Modifying each class grammar to in- and the recognition result. The second is that biasing towards
sert decorators allowing us to keep track of whether words in shorter n-grams has a larger negative effect on the recognition
the hypothesis word lattice were generated by the top level LM of general (out of context) queries.
or by one of the class grammars. This corresponds to using One simple scoring function that satisfies the above require-
G0c =< c > Gc < /c > as class grammar for class c where ments is the length-linear function, where n is the length of
(<c>, </c>) is the decorator pair for c (see Figure 1). (2) Al- Hw. That is:
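The first step of this procedure, collecting the OOV words that populate the per-utterance "$UNKNOWN" class grammar, amounts to a set difference between the words of the context and the base LM vocabulary. A minimal sketch; the function name and example words are illustrative.

def extract_oov_words(context_phrases, base_vocabulary):
    """Return context words missing from the base LM vocabulary; these become
    the members of the per-utterance "$UNKNOWN" class grammar."""
    context_words = {w for phrase in context_phrases for w in phrase.split()}
    return sorted(context_words - set(base_vocabulary))

# With a context phrase containing an unseen name, only that word is returned.
print(extract_oov_words(["call Zubizarreta"], {"call", "michael", "riley"}))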
3. Constructing the contextual model

The context we use for biasing can consist of hundreds of phrases or only a handful of phrases. Each phrase is a sequence of one or more words. For example, the following phrases may be used as the context for an utterance: "Hotels in Manhattan", "Holiday Inn", "Cheap flights to New York City".

When biasing, we want to allow partial matching to the context. For example, given the context above, we might also want to bias towards "Cheap hotels in New York".

In general, if the size of the context is large enough that a regular language model can be constructed from it, then one can use the LM costs as biasing scores. (In that case, the interpolation would be a standard interpolation between two LM costs.) However, the available context is often too small for this approach. We developed methods that address this case.

In this section, we discuss how we select biasing n-grams and their scores, given a set of context phrases such as the above.

3.1. Extracting and scoring n-grams

We want to bias more heavily towards higher-order (longer) n-grams, for two related reasons. The first is that we want to reward longer exact matches between the context and the recognition result. The second is that biasing towards shorter n-grams has a larger negative effect on the recognition of general (out-of-context) queries.

One simple scoring function that satisfies the above requirements is the length-linear function, where n is the length of Hw:

    sB(w|H) = f1(length(Hw)) = (n − 1)·p2 + p1        (4)

where p1 and p2 are parameters that control the strength of biasing, their values depending on the quality of the context. These parameters can be learned on a transcribed development data set with context.
The length-linear function would assign the same score to all n-grams of the same order. However, because the final cost used by the recognizer is an interpolation of the biasing score and the original LM cost, the effect of the biasing score depends on the interpolation function.

Since we want to bias more heavily towards longer n-grams, we would want sB(n) to be a decreasing function of n, i.e. p2 < 0.

The main limitation of the length-linear function is that the costs of the various n-gram orders are interdependent. A slightly more general function would assign independent scores to each of the n-gram orders. In our system, we observed diminishing gains beyond specifying scores for unigrams and bigrams only. (Note that, similar to the back-off mechanism in LMs, the biasing model will use the score of the lower-order n-gram if the longer one is absent.)

We define the unigram-and-bigram function as:

    sB(w|H) = f2(length(Hw)) = p1   if n = 1
                               p2   if n ≥ 2        (5)

The unigram-and-bigram function is more robust and easier to interpret than the length-linear function. We therefore used the unigram-and-bigram function in most of the experiments presented in this paper.
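Both scoring functions follow directly from equations (4) and (5); the short Python sketch below treats p1 and p2 as externally tuned parameters (the function names are illustrative).

def length_linear_score(ngram_length, p1, p2):
    """Equation (4): sB = (n - 1) * p2 + p1 for an n-gram Hw of length n."""
    return (ngram_length - 1) * p2 + p1

def unigram_and_bigram_score(ngram_length, p1, p2):
    """Equation (5): p1 for unigrams, p2 for bigrams and longer n-grams."""
    return p1 if ngram_length == 1 else p2

# Scores are costs (negative log probabilities), so smaller values bias more
# strongly. At the conservative operating point (p1, p2) = (7, 3) from
# section 4.2, unigrams get cost 7 and longer n-grams get cost 3.
print(unigram_and_bigram_score(1, 7, 3), unigram_and_bigram_score(3, 7, 3))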
3.2. Sentence boundaries

As mentioned, biasing towards unigrams can be detrimental to the general query recognition performance. But what if some or all of the context phrases contain only one word? For example, in one of our test sets (confirmation) the context consists of the phrases "yes", "no", and "cancel". If we were to bias towards these unigrams heavily, we may get recognition results that contain repetitions of these words, such as "no no no ...".

We can avoid this outcome by appending sentence boundary tokens ("<S>" and "</S>" in our case) to each phrase in the context before extracting the biasing n-grams. Then, in the above example, we would bias towards bigrams such as "<S> no" and "no </S>" much more than we bias towards the unigrams.
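A sketch of this preprocessing step, assuming whitespace tokenization and extraction of all n-grams up to the bigram order (the helper is illustrative, not the production pipeline):

def biasing_ngrams(context_phrases, max_order=2):
    """Append sentence boundary tokens to each context phrase and extract all
    n-grams up to max_order for the biasing model."""
    ngrams = set()
    for phrase in context_phrases:
        tokens = ["<S>"] + phrase.split() + ["</S>"]
        for order in range(1, max_order + 1):
            for i in range(len(tokens) - order + 1):
                ngrams.add(tuple(tokens[i:i + order]))
    return ngrams

# For the confirmation context this yields bigrams such as ("<S>", "no") and
# ("no", "</S>"), which can be biased more heavily than the bare unigram ("no",).
print(sorted(biasing_ngrams(["yes", "no", "cancel"])))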

4. Experimental results

In this section we describe our test sets and experimental setup, and analyze the experimental results. All test sets used have been anonymized.

4.1. Corpora

The experiments described below use various test sets in American English. All of the test sets were manually transcribed. The context for some test sets is defined per utterance (e.g. test set "Entities and location"), whereas for others the context is constant for the whole test set (e.g. test set "Confirmation"). Several experimental setups were used to evaluate the positive effect of relevant context and the negative effect (overtriggering) of irrelevant context. In the baseline setup, experiments are run with no context provided. In order to evaluate the positive effect of relevant context we use the following setups: (1) in sets with per-utterance context, we attach to each utterance its relevant context; (2) in sets with fixed context, the same context is attached to every utterance.

In order to evaluate the negative effect of irrelevant context we use the following setups: (1) in sets with per-utterance context, we attach to each utterance 100 irrelevant contexts randomly selected from other utterances; we call this a negative set. (2) In sets with fixed context, we attach the fixed context to a set of utterances for which the context is irrelevant; we call this an anti-set.

4.1.1. Entities and location

This test set contains 876 utterances. Each utterance contains the name of an entity and/or the name of a location, e.g. "Directions to Sky Song in Phoenix, Arizona". The context is defined per utterance, and is a list of locations and entities, e.g. {"Sky Song", "Phoenix, Arizona"}.
Test set variants: entities pos, entities neg, entities baseline.

4.1.2. Confirmation

This test set contains 1000 utterances. All queries correspond to a state where the user is provided with the choice to confirm or cancel some action. The context is the same for all utterances, and it consists of the words {"yes", "no", "cancel"}.
Test set variants: ync pos, ync baseline, and anti ync. anti ync is an anti-set consisting of 22k utterances not related to confirmation/cancellation states.

4.1.3. Hard n-grams

This test set consists of 2,704 utterances. All utterances in this test set contain n-grams with high LM costs, for n ∈ [2, 7]. The context, defined per utterance, is a list of high-cost n-grams.
Test set variants: costly pos, costly neg, costly baseline.

4.1.4. Class based (numeric)

This test set contains 816 utterances, each containing some type of number in the transcript, e.g. "Set alarm for 5:30 p.m. today". The context is defined per utterance and consists of a list of the transcripts with class members replaced by their class symbol (e.g. "Set alarm for $TIME p.m. today"). The context for each utterance contains the utterance's modified transcript.
Test set variants: numeric, numeric baseline.

4.1.5. Class based (contacts)

This test set contains 10,670 utterances. All utterances correspond to contact calling voice commands, e.g. "Call James Brown". Similar to the numeric test set, the context is created by replacing contact-name class members with "$CONTACTS" in the transcripts.

4.2. Recognition accuracy with biasing

We measured the effect of biasing on our test sets using both of the functions introduced in section 3.1. We then measured the effects of each of the features that our biasing implementation supports. Finally, we show how we can control the strength of the biasing by varying a range of parameters of our biasing score function.

Table 1 shows the effect of biasing versus the baseline (i.e. no biasing). "bias 1" uses the length-linear scoring function, and "bias 2" uses the unigram-and-bigram function. Both biasing tests use the positive biasing interpolation function in equation (3); however, "bias 1" uses (α, β) = (0.25, 1), whereas "bias 2" uses (α, β) = (0, 1), which is effectively the same as using min(sG(w|H), sB(w|H)) for interpolation. The values of α and β control the interpolation of main LM costs and biasing scores based on equation (3).
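This equivalence can be verified directly from equations (2) and (3): with (α, β) = (0, 1),

    C'(sG(w|H), sB(w|H)) = 0·sG(w|H) + 1·sB(w|H) = sB(w|H),

so equation (3) reduces to C(sG(w|H), sB(w|H)) = min(sG(w|H), sB(w|H)).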
Test set       Baseline   bias 1   bias 2
entities pos   8.9        7.2      7.2
entities neg   8.9        9.0      9.0
ync pos        18.8       10.4     11.0
anti ync       10.9       10.9     10.9
costly pos     12.9       4.2      6.1
costly neg     12.9       13.8     13.8
numeric        11.0       4.7      5.7
contacts       15.0       2.8      3.2

Table 1: WER(%) for baseline vs. two biasing methods. bias 1: length-linear, (α, β) = (0.25, 1) and (p1, p2) = (0, −0.4). bias 2: unigram-and-bigram, (α, β) = (0, 1) and (p1, p2) = (7, 3).

Test set       bias 2   bias 2.a   bias 2.b   bias 2.c
entities pos   7.2      7.2        7.3        7.4
entities neg   9.0      8.9        9.0        9.0
ync pos        11.0     15.0       11.6       11.0
anti ync       10.9     10.9       10.9       10.9
costly pos     6.1      6.5        6.7        6.1
costly neg     13.8     13.6       13.8       13.8
numeric        5.7      6.1        6.0        5.9
contacts       3.2      5.1        3.2        3.2

Table 2: The effect of biasing features on WER(%). bias 2: with all features, same as in Table 1. bias 2.a: without sentence boundaries. bias 2.b: without case variants. bias 2.c: without OOV support.

Table 2 compares the effect of having each of the following features disabled:
bias 2.a. Add sentence boundaries to the context.
bias 2.b. Include upper/lower case variants of the context.
bias 2.c. Support OOV words in the context.

Disabling sentence boundaries has a particularly detrimental effect on our ync pos test set. It also negatively affects the contacts, numeric, and costly pos test sets. As mentioned in section 3.2, the reason is that in the ync pos test set, the context and most of the expected transcripts are unigrams. Adding sentence boundaries allows us to bias these contexts at the bigram level, which can be safely biased more heavily. In contrast, trying to achieve the same result by increasing the unigram bias would result in a sharp increase in insertion errors.

Disabling case variants has a negative effect on the recognition results for the test sets for which the context includes the case variant that is less probable in the LM. The worst effect is on the costly pos and numeric test sets. This feature does not have a noticeable effect on the negative test sets, so it can be safely turned on by default. The third feature, support for OOV words in the context, reduces WER on test sets that include OOVs and has no effect otherwise.

Finally, we present WER for our entities and location test sets at a range of operating points (sets of values for the parameters in the scoring function). This test set contains context phrases of various lengths, case variants, and OOVs. In Table 3 the WERs for the positive and negative tests are shown side by side, for various pairs of values of (p1, p2) in equation (5). The point (0, 0) corresponds to baseline WERs, as the context is ignored.

p1 \ p2    -1            0             1             5
-2         30.8 / 43.3   30.5 / 42.8   32.7 / 42.0   35.9 / 42.1
 0         7.2 / 9.3     8.9 / 8.9     7.3 / 9.0     7.7 / 8.9
 6         6.6 / 9.6     7.3 / 9.1     6.6 / 9.2     7.3 / 9.0
10         6.7 / 9.4     8.2 / 8.9     7.0 / 9.0     7.5 / 8.9

Table 3: WER(%) for entities pos (first value in each cell) and entities neg (second value) over a range of (p1, p2) values for the unigram-and-bigram scoring function (equation (5)).

At the lowest level of biasing, (p1, p2) = (10, 5), the negative test is not affected (the WER is equal to baseline); however, the positive WER is already better than the baseline. As the strength of bias is increased, the WER for the negative test increases monotonically, but the WER for the positive test set decreases, up to a certain minimum, after which it also starts increasing. This is because the context starts to cause errors in the parts of the utterance that are not supposed to be biased. At the extremely high biasing level of (−2, −1), both the positive and negative tests are significantly worse than baseline.

The operating point of (7, 3) used in Table 1 and Table 2 is a relatively conservative operating point, which has minimal effect on the negative test. This operating point was chosen to balance positive and negative performance on several different test sets. For the test set used in Table 3, a more aggressive operating point of (6, 1) results in WERs of 6.6% and 9.2%, respectively, on the positive and negative tests (baseline is 8.9%).

4.3. Live biasing experiments

In order to further validate that our system improvements are beneficial, we ran a live experiment. In this experiment, a percentage of the production traffic is cloned and sent to two speech recognition systems. We focused only on the traffic corresponding to the confirmation dialog state, that is, the state in which a user is asked to respond with one of the words "yes", "no", "cancel". The first system was used as the baseline, while the second used the biasing methodology described in this paper. In the biasing system, for each of the utterances we used the fixed biasing context consisting of the three words described above.

During our experiment approximately 30,000 utterances were processed by each system. This was done anonymously and on the fly. We compared the performance of the two systems using sentence accuracy as the metric. Using the biasing methodology and the optimal operating point described in section 4.2 resulted in a relative sentence accuracy increase of 8%. This was significant with p < 0.1.

5. Conclusion

In this paper, we describe an approach for biasing speech recognition towards provided contextual information. We analyze various types of context, describe context preprocessing techniques, and provide a solution for OOVs present in the context. We also present biasing functions used to adjust LM scores based on the provided context. We conducted experiments using several datasets with various types of contextual information. The results show that the proposed methodology can significantly improve speech recognition accuracy when reliable contextual information is available. For example, on our confirmation (ync) test set, a relative WER reduction of 44% is achieved on the positive (ync pos) test set without any WER change on the negative test set (anti ync). Furthermore, we show that these speech recognition gains are achieved without causing overtriggering on queries not related to the context.
6. References
[1] K. B. Hall, E. Cho, C. Allauzen, F. Beaufays, N. Coc-
caro, K. Nakajima, M. Riley, B. Roark, D. Rybach, and
L. Zhang, “Composition-based on-the-fly rescoring for
salient n-gram biasing,” in Interspeech 2015, 2015.
[2] C. Allauzen, M. Riley, J. Schalkwyk, W. Skut, and
M. Mohri, “OpenFst: A general and efficient weighted
finite-state transducer library,” in CIAA 2007, ser. LNCS,
vol. 4783, 2007, pp. 11–23, http://www.openfst.org.
[3] M. Mohri, F. Pereira, and M. Riley, “Speech recognition
with weighted finite-state transducers,” in Handbook of
Speech Processing, J. Benesty, M. M. Sondhi, and Y. Huang,
Eds. Springer, 2008, pp. 559–582.
[4] R. Kuhn and R. De Mori, “A cache-based natural language
model for speech recognition,” Pattern Analysis and Ma-
chine Intelligence, IEEE Transactions on, vol. 12, no. 6,
pp. 570–583, 1990.
[5] P. R. Clarkson and A. J. Robinson, “Language model
adaptation using mixtures and an exponentially decay-
ing cache,” in Acoustics, Speech, and Signal Processing,
1997. ICASSP-97., 1997 IEEE International Conference
on, vol. 2. IEEE, 1997, pp. 799–802.
[6] S. Besling and H.-G. Meier, “Language model speaker
adaptation,” in Fourth European Conference on Speech
Communication and Technology, 1995.
[7] M. Mohri, F. Pereira, and M. Riley, “Weighted finite-state
transducers in speech recognition,” Computer Speech and
Language, vol. 16, pp. 69–88, 2002.
[8] G. Saon, D. Povey, and G. Zweig, “Anatomy of an ex-
tremely fast LVCSR decoder,” in Proc. Interspeech,
2005, pp. 549–552.
[9] T. Hori, C. Hori, Y. Minami, and A. Nakamura, “Efficient
WFST-based one-pass decoding with on-the-fly hypoth-
esis rescoring in extremely large vocabulary continuous
speech recognition,” Audio, Speech, and Language Pro-
cessing, IEEE Transactions on, vol. 15, no. 4, pp. 1352–
1365, 2007.
[10] P. Aleksic, C. Allauzen, D. Elson, A. Kracun, D. M. Casado,
and P. J. Moreno, “Improved recognition of contact names
in voice commands,” in ICASSP 2015, 2015.
[11] L. Vasserman, V. Schogol, and K. Hall, “Sequence-based
class tagging for robust transcription in ASR,” in Submit-
ted to Interspeech, 2015.
[12] P. F. Brown, V. J. D. Pietra, P. V. deSouza, J. C. Lai,
and R. L. Mercer, “Class-based n-gram models of natu-
ral language,” Computational Linguistics, vol. 18, no. 4,
pp. 467–479, 1992.
