Siddharth Dalmia
2020 – today
- 2024
- [c33]Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling for Spoken Language Identification. ICASSP 2024: 11526-11530
- [c32]Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar:
LLM Augmented LLMs: Expanding Capabilities through Composition. ICLR 2024
- [i33]Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar:
LLM Augmented LLMs: Expanding Capabilities through Composition. CoRR abs/2401.02412 (2024)
- [i32]Frank Palma Gomez, Ramon Sanabria, Yun-Hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernández Ábrego:
Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems. CoRR abs/2404.01616 (2024)
- [i31]Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu:
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? CoRR abs/2406.13121 (2024)
- 2023
- [j2]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3112-3126 (2023)
- [c31]Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polak, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe:
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit. ACL (demo) 2023: 400-411
- [c30]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. EACL 2023: 1615-1631
- [c29]Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe:
Align, Write, Re-Order: Explainable End-to-End Speech Translation via Operation Sequence Generation. ICASSP 2023: 1-5
- [i30]Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling For Spoken Language Identification. CoRR abs/2309.10567 (2023)
- 2022
- [c28]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. EMNLP (Findings) 2022: 5419-5429
- [c27]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. ICASSP 2022: 6412-6416
- [c26]Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W. Black, Shinji Watanabe:
ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet. ICASSP 2022: 7167-7171
- [c25]Yifan Peng, Siddharth Dalmia, Ian R. Lane, Shinji Watanabe:
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding. ICML 2022: 17627-17643
- [c24]Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan W. Black, Shinji Watanabe:
Two-Pass Low Latency End-to-End Spoken Language Understanding. INTERSPEECH 2022: 3478-3482
- [c23]Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe:
CMU's IWSLT 2022 Dialect Speech Translation System. IWSLT@ACL 2022: 298-307
- [c22]Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe:
A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding. SLT 2022: 406-413
- [c21]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: FEW-Shot Learning Evaluation of Universal Representations of Speech. SLT 2022: 798-805
- [i29]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech. CoRR abs/2205.12446 (2022)
- [i28]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. CoRR abs/2206.03318 (2022)
- [i27]Yifan Peng, Siddharth Dalmia, Ian R. Lane, Shinji Watanabe:
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding. CoRR abs/2207.02971 (2022)
- [i26]Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan W. Black, Shinji Watanabe:
Two-Pass Low Latency End-to-End Spoken Language Understanding. CoRR abs/2207.06670 (2022)
- [i25]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. CoRR abs/2210.05200 (2022)
- [i24]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. CoRR abs/2210.15734 (2022)
- [i23]Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe:
A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding. CoRR abs/2211.05869 (2022)
- [i22]Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe:
Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation. CoRR abs/2211.05967 (2022)
- 2021
- [c20]Hirofumi Inaguma, Siddharth Dalmia, Brian Yan, Shinji Watanabe:
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates. ASRU 2021: 922-929
- [c19]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. EACL 2021: 2976-2992
- [c18]Siddharth Dalmia, Yuzong Liu, Srikanth Ronanki, Katrin Kirchhoff:
Transformer-Transducers for Code-Switched Speech Recognition. ICASSP 2021: 5859-5863
- [c17]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. Interspeech 2021: 1264-1268
- [c16]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. Interspeech 2021: 2471-2475
- [c15]Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, Jiatong Shi, Kevin Duh, Shinji Watanabe:
ESPnet-ST IWSLT 2021 Offline Speech Translation System. IWSLT 2021: 100-109
- [c14]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. NAACL-HLT 2021: 1882-1896
- [i21]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. CoRR abs/2102.08345 (2021)
- [i20]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. CoRR abs/2105.00573 (2021)
- [i19]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. CoRR abs/2106.15065 (2021)
- [i18]Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, Jiatong Shi, Kevin Duh, Shinji Watanabe:
ESPnet-ST IWSLT 2021 Offline Speech Translation System. CoRR abs/2107.00636 (2021)
- [i17]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. CoRR abs/2107.11628 (2021)
- [i16]Hirofumi Inaguma, Siddharth Dalmia, Brian Yan, Shinji Watanabe:
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates. CoRR abs/2109.12804 (2021)
- [i15]Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W. Black, Shinji Watanabe:
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet. CoRR abs/2111.14706 (2021)
- [i14]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. CoRR abs/2111.15016 (2021)
- 2020
- [c13]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-Shot Learning for Automatic Phonemic Transcription. AAAI 2020: 8261-8268
- [c12]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. EMNLP (Findings) 2020: 3088-3095
- [c11]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. ICASSP 2020: 8249-8253
- [i13]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-shot Learning for Automatic Phonemic Transcription. CoRR abs/2002.11781 (2020)
- [i12]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. CoRR abs/2002.11800 (2020)
- [i11]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. CoRR abs/2010.04924 (2020)
- [i10]Siddharth Dalmia, Yuzong Liu, Srikanth Ronanki, Katrin Kirchhoff:
Transformer-Transducers for Code-Switched Speech Recognition. CoRR abs/2011.15023 (2020)
2010 – 2019
- 2019
- [c10]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. ACL (1) 2019: 1131-1141
- [c9]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. ICASSP 2019: 6091-6095
- [c8]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. INTERSPEECH 2019: 2120-2124
- [c7]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. INTERSPEECH 2019: 3681-3682
- [c6]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. INTERSPEECH 2019: 4380-4384
- [i9]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. CoRR abs/1902.07613 (2019)
- [i8]Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori S. Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard H. Hovy, Alan W. Black, Jaime G. Carbonell, Graham Horwood, Shabnam Tafreshi, Mona T. Diab, Efsun Sarioglu Kayi, Noura Farra, Kathleen R. McKeown:
The ARIEL-CMU Systems for LoReHLT18. CoRR abs/1902.08899 (2019)
- [i7]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. CoRR abs/1906.11604 (2019)
- [i6]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. CoRR abs/1907.10726 (2019)
- [i5]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. CoRR abs/1908.01060 (2019)
- [i4]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. CoRR abs/1908.01067 (2019)
- [i3]Siddharth Dalmia, Abdelrahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer:
Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models. CoRR abs/1911.03782 (2019)
- 2018
- [c5]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-Based Multi-Lingual Low Resource Speech Recognition. ICASSP 2018: 4909-4913
- [c4]David R. Mortensen, Siddharth Dalmia, Patrick Littell:
Epitran: Precision G2P for Many Languages. LREC 2018
- [c3]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. SLT 2018: 258-265
- [i2]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-based Multi-lingual Low Resource Speech Recognition. CoRR abs/1802.07420 (2018)
- [i1]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. CoRR abs/1807.10984 (2018)
- 2017
- [j1]Prafulla Kalapatapu, N. N. Tejas, Siddharth Dalmia, Prakhar Gupta, Bhaswant Inguva, Aruna Malapati:
A novel similarity measure: Voronoi audio similarity for genre classification. Int. J. Intell. Syst. Technol. Appl. 16(4): 309-318 (2017)
- [c2]Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian R. Lane:
An approach for self-training audio event detectors using web data. EUSIPCO 2017: 1863-1867
- 2015
- [c1]Sunit Sivasankaran, Aditya Arie Nugraha, Emmanuel Vincent, Juan Andres Morales-Cordovilla, Siddharth Dalmia, Irina Illina, Antoine Liutkus:
Robust ASR using neural network based speech enhancement and feature simulation. ASRU 2015: 482-489
last updated on 2024-10-17 20:34 CEST by the dblp team
all metadata released as open data under CC0 1.0 license