Takaaki Saeki
2020 – today
- 2024
- [j5] Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1829-1844 (2024)
- [c18] Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
  Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. ICASSP 2024: 11546-11550
- [c17] Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari:
  Diversity-Based Core-Set Selection for Text-to-Speech with Linguistic and Acoustic Features. ICASSP 2024: 12351-12355
- [i21] Takaaki Saeki, Soumi Maiti, Shinnosuke Takamichi, Shinji Watanabe, Hiroshi Saruwatari:
  SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics. CoRR abs/2401.16812 (2024)
- [i20] Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
  Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. CoRR abs/2402.18932 (2024)
- [i19] Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe:
  YODAS: Youtube-Oriented Dataset for Audio and Speech. CoRR abs/2406.00899 (2024)
- 2023
- [j4] Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari:
  SelfRemaster: Self-Supervised Speech Restoration for Historical Audio Resources. IEEE Access 11: 144831-144843 (2023)
- [c16] Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe:
  Yodas: Youtube-Oriented Dataset for Audio and Speech. ASRU 2023: 1-8
- [c15] Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe:
  Speechlmscore: Evaluating Speech Generation Using Speech Language Model. ICASSP 2023: 1-5
- [c14] Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
  Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5
- [c13] Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
  Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech. ICASSP 2023: 1-5
- [c12] Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining. IJCAI 2023: 5179-5187
- [c11] Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion. SSW 2023: 62-68
- [i18] Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining. CoRR abs/2301.12596 (2023)
- [i17] Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
  Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech. CoRR abs/2302.13652 (2023)
- [i16] Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari:
  Diversity-based core-set selection for text-to-speech with linguistic and acoustic features. CoRR abs/2309.08127 (2023)
- 2022
- [c10] Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto:
  DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning. INTERSPEECH 2022: 793-797
- [c9] Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari:
  SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling. INTERSPEECH 2022: 4406-4410
- [c8] Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari:
  UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022. INTERSPEECH 2022: 4521-4525
- [c7] Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Personalized Filled-pause Generation with Group-wise Prediction Models. LREC 2022: 385-392
- [c6] Naoki Kimura, Zixiong Su, Takaaki Saeki, Jun Rekimoto:
  SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition. LREC 2022: 6866-6873
- [c5] Yoshifumi Nakano, Takaaki Saeki, Shinnosuke Takamichi, Katsuhito Sudoh, Hiroshi Saruwatari:
  VTTS: Visual-Text To Speech. SLT 2022: 936-942
- [i15] Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Personalized filled-pause generation with group-wise prediction models. CoRR abs/2203.09961 (2022)
- [i14] Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari:
  SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling. CoRR abs/2203.12937 (2022)
- [i13] Yoshifumi Nakano, Takaaki Saeki, Shinnosuke Takamichi, Katsuhito Sudoh, Hiroshi Saruwatari:
  vTTS: visual-text to speech. CoRR abs/2203.14725 (2022)
- [i12] Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto:
  DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning. CoRR abs/2203.15683 (2022)
- [i11] Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari:
  UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022. CoRR abs/2204.02152 (2022)
- [i10] Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis. CoRR abs/2210.07559 (2022)
- [i9] Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Spontaneous speech synthesis with linguistic-speech consistency training using pseudo-filled pauses. CoRR abs/2210.09815 (2022)
- [i8] Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari:
  Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection. CoRR abs/2210.14850 (2022)
- [i7] Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
  Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022)
- [i6] Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe:
  SpeechLMScore: Evaluating speech generation using speech language model. CoRR abs/2212.04559 (2022)
- 2021
- [j3] Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials. IEICE Trans. Inf. Syst. 104-D(7): 1002-1016 (2021)
- [j2] Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Incremental Text-to-Speech Synthesis Using Pseudo Lookahead With Large Pretrained Language Model. IEEE Signal Process. Lett. 28: 857-861 (2021)
- [c4] Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network. ASRU 2021: 749-756
- [i5] Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network. CoRR abs/2109.10724 (2021)
- [i4] Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe:
  ESPnet2-TTS: Extending the Edge of TTS Research. CoRR abs/2110.07840 (2021)
- [i3] Shinnosuke Takamichi, Ludwig Kürzinger, Takaaki Saeki, Sayaka Shiota, Shinji Watanabe:
  JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification. CoRR abs/2112.09323 (2021)
- 2020
- [c3] Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Lifter Training and Sub-Band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials. ICASSP 2020: 7784-7788
- [c2] Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU. INTERSPEECH 2020: 1021-1022
- [c1] Naoki Kimura, Zixiong Su, Takaaki Saeki:
  End-to-End Deep Learning Speech Recognition Model for Silent Speech Challenge. INTERSPEECH 2020: 1025-1026
- [i2] Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Lifter Training and Sub-band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials. CoRR abs/2002.06778 (2020)
- [i1] Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari:
  Incremental Text-to-Speech Synthesis Using Pseudo Lookahead with Large Pretrained Language Model. CoRR abs/2012.12612 (2020)
2010 – 2019
- 2010
- [j1] Takaaki Saeki, Koji Yamamoto, Hidekazu Murata, Susumu Yoshida:
  Impact and Use of the Asymmetric Property in Bi-directional Cooperative Relaying under Asymmetric Traffic Conditions. IEICE Trans. Commun. 93-B(8): 2126-2134 (2010)
last updated on 2024-10-07 22:07 CEST by the dblp team
all metadata released as open data under CC0 1.0 license