23rd NoDaLiDa 2021: Reykjavik, Iceland (Online)
- Simon Dobnik, Lilja Øvrelid:
Proceedings of the 23rd Nordic Conference on Computational Linguistics, NoDaLiDa 2021, Reykjavik, Iceland (Online), May 31 - June 2, 2021. Linköping University Electronic Press, Sweden 2021, ISBN 978-91-7929-614-8
- Sampo Pyysalo, Jenna Kanerva, Antti Virtanen, Filip Ginter:
WikiBERT Models: Deep Transfer Learning for Many Languages. 1-10
- Hasan Tanvir, Claudia Kittask, Sandra Eiche, Kairit Sirts:
EstBERT: A Pretrained Language-Specific BERT for Estonian. 11-19
- Per Egil Kummervold, Javier de la Rosa, Freddy Wetjen, Svein Arne Brygfjeld:
Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model. 20-29
- Andrey Kutuzov, Jeremy Barnes, Erik Velldal, Lilja Øvrelid, Stephan Oepen:
Large-Scale Contextualised Language Modelling for Norwegian. 30-40
- Maali Tars, Andre Tättar, Mark Fishel:
Extremely low-resource machine translation for closely related languages. 41-52
- Yuri Bizzoni, Ekaterina Lapshinova-Koltunski:
Measuring Translationese across Levels of Expertise: Are Professionals more Surprising than Students? 53-63
- Steinþór Steingrímsson, Hrafn Loftsson, Andy Way:
CombAlign: a Tool for Obtaining High-Quality Word Alignments. 64-73
- Prajit Dhar, Arianna Bisazza:
Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks. 74-85
- Tuomas Kaseva, Hemant Kumar Kathania, Aku Rouhe, Mikko Kurimo:
Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces. 86-93
- Hemant Kumar Kathania, Sudarsana Reddy Kadiri, Paavo Alku, Mikko Kurimo:
Spectral modification for recognition of children's speech under mismatched conditions. 94-100
- Leo Leppänen, Hannu Toivonen:
A Baseline Document Planning Method for Automated Journalism. 101-111
- Joakim Olsen, Arild Brandrud Næss, Pierre Lison:
Assessing the Quality of Human-Generated Summaries with Weakly Supervised Learning. 112-123
- Lovisa Hagström, Richard Johansson:
Knowledge Distillation for Swedish NER models: A Search for Performance and Efficiency. 124-134
- Jouni Luoma, Li-Hsin Chang, Filip Ginter, Sampo Pyysalo:
Fine-grained Named Entity Annotation for Finnish. 135-144
- Sidsel Boldsen, Fredrik Wahlberg:
Survey and reproduction of computational approaches to dating of historical texts. 145-156
- Samuel Rönnqvist, Valtteri Skantsi, Miika Oinonen, Veronika Laippala:
Multilingual and Zero-Shot is Closing in on Monolingual Web Register Classification. 157-165
- Mika Hämäläinen, Niko Partanen, Jack Rueter, Khalid Al-Najjar:
Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. 166-177
- Elena Volodina, Yousuf Ali Mohammed, Therese Lindström Tiedemann:
CoDeRooMor: A new dataset for non-inflectional morphology studies of Swedish. 178-189
- Katrin Ortmann:
Chunking Historical German. 190-199
- Yvonne Adesam, Aleksandrs Berdicevskis:
Part-of-speech tagging of Swedish texts in the neural era. 200-209
- Kristian Nørgaard Jensen, Mike Zhang, Barbara Plank:
De-identification of Privacy-related Entities in Job Postings. 210-221
- Synnøve Bråthen, Wilhelm Wie, Hercules Dalianis:
Creating and Evaluating a Synthetic Norwegian Clinical Corpus for De-Identification. 222-230
- Mila Grancharova, Hercules Dalianis:
Applying and Sharing pre-trained BERT-models for Named Entity Recognition and Classification in Swedish Electronic Patient Records. 231-239
- Quan Duong, Mika Hämäläinen, Simon Hengchen:
An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish. 240-248
- Jarkko Lagus, Arto Klami:
Learning to Lemmatize in the Word Representation Space. 249-258
- Evelina Rennes, Arne Jönsson:
Synonym Replacement based on a Study of Basic-level Nouns in Swedish Texts of Different Complexity. 259-267
- Simon Hengchen, Nina Tahmasebi:
SuperSim: a test set for word similarity and relatedness in Swedish. 268-275
- Aarne Talman, Marianna Apidianaki, Stergios Chatzikyriakidis, Jörg Tiedemann:
NLI Data Sanity Check: Assessing the Effect of Data Corruption on Model Performance. 276-287
- Jenna Kanerva, Filip Ginter, Li-Hsin Chang, Iiro Rastas, Valtteri Skantsi, Jemina Kilpeläinen, Hanna-Mari Kupari, Jenna Saarni, Maija Sevón, Otto Tarkka:
Finnish Paraphrase Corpus. 288-298
- Petter Mæhlum, Jeremy Barnes, Robin Kurtz, Lilja Øvrelid, Erik Velldal:
Negation in Norwegian: an annotated dataset. 299-308
- Mark Anderson, Carlos Gómez-Rodríguez:
What Taggers Fail to Learn, Parsers Need the Most. 309-314
- Antonia Karamolegkou, Sara Stymne:
Investigation of Transfer Languages for Parsing Latin: Italic Branch vs. Hellenic Branch. 315-320
- Hinrik Hafsteinsson, Anton Karl Ingason:
Towards cross-lingual application of language-specific PoS tagging schemes. 321-325
- Chaojun Wang, Christian Hardmeier, Rico Sennrich:
Exploring the Importance of Source Text in Automatic Post-Editing for Context-Aware Machine Translation. 326-335
- Lifeng Han, Gareth J. F. Jones, Alan F. Smeaton, Paolo Bolzoni:
Chinese Character Decomposition for Neural MT with Multi-Word Expressions. 336-344
- Juho Leinonen, Sami Virpioja, Mikko Kurimo:
Grapheme-Based Cross-Language Forced Alignment: Results with Uralic Languages. 345-350
- Mikko Aulamo, Sami Virpioja, Yves Scherrer, Jörg Tiedemann:
Boosting Neural Machine Translation from Finnish to Northern Sámi with Rule-Based Backtranslation. 351-356
- Tobias Norlund, Agnes Stenbom:
Building a Swedish Open-Domain Conversational Language Model. 357-366
- Magnus Sahlgren, Fredrik Carlsson, Fredrik Olsson, Love Börjeson:
It's Basically the Same Language Anyway: the Case for a Nordic Language Model. 367-372
- Abdul Aziz Alkathiri, Lodovico Giaretta, Sarunas Girdzijauskas, Magnus Sahlgren:
Decentralized Word2Vec Using Gossip Learning. 373-377
- Vinit Ravishankar, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal:
Multilingual ELMo and the Effects of Corpus Sampling. 378-384
- Tim Isbister, Fredrik Carlsson, Magnus Sahlgren:
Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead? 385-390
- Timo Johner, Abhik Jana, Chris Biemann:
Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language. 391-397
- Eetu Sjöblom, Mathias Creutz, Teemu Vahtola:
Grammatical Error Generation Based on Translated Fragments. 398-403
- Helga Svala Sigurðardóttir, Anna Björk Nikulásdóttir, Jón Guðnason:
Creating Data in Icelandic for Text Normalization. 404-412
- Leon Derczynski, Manuel R. Ciosici, Rebekah Baglini, Morten H. Christiansen, Jacob Aarup Dalsgaard, Riccardo Fusaroli, Peter Juel Henrichsen, Rasmus Hvingelby, Andreas Kirkedal, Alex Speed Kjeldsen, Claus Ladefoged, Finn Årup Nielsen, Jens Madsen, Malte Lau Petersen, Jonathan Hvithamar Rystrøm, Daniel Varab:
The Danish Gigaword Corpus. 413-421
- Jeppe Nørregaard, Leon Derczynski:
DanFEVER: claim verification dataset for Danish. 422-428
- Hjalti Daníelsson, Jón Hilmar Jónsson, Thordur Arnar Árnason, Alec Shaw, Einar Freyr Sigurðsson, Steinthór Steingrímsson:
The Icelandic Word Web: A language technology-focused redesign of a lexicosemantic database. 429-434
- Manfred Klenner, Anne Göhring, Sophia Conrad:
Getting Hold of Villains and other Rogues. 435-439
- Atli Sigurgeirsson, Thorsteinn Gunnarsson, Gunnar Örnólfsson, Eydís Magnúsdóttir, Ragnheiður Thórhallsdóttir, Stefán Jónsson, Jón Guðnason:
Talrómur: A large Icelandic TTS corpus. 440-444
- Jeremy Barnes, Petter Mæhlum, Samia Touileb:
NorDial: A Preliminary Corpus of Written Norwegian Dialect Use. 445-451
- Saga Hansson, Konstantinos Mavromatakis, Yvonne Adesam, Gerlof Bouma, Dana Dannélls:
The Swedish Winogender Dataset. 452-459
- Amalie Brogaard Pauli, Maria Barrett, Ophélie Lacroix, Rasmus Hvingelby:
DaNLP: An open-source toolkit for Danish Natural Language Processing. 460-466
- Hanna Berg, Hercules Dalianis:
HB Deid - HB De-identification tool demonstrator. 467-471