default search action
Detai Xin
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [j2]Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech With Verbal Content and Nonverbal Expressions. IEEE Access 12: 19752-19764 (2024) - [j1]Detai Xin, Shinnosuke Takamichi, Hiroshi Saruwatari:
JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions. Speech Commun. 156: 103004 (2024) - [c11]Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Eric Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiangyang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. ICML 2024 - [i15]Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao:
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models. CoRR abs/2403.03100 (2024) - [i14]Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
Building speech corpus with diverse voice characteristics for its prompt-based representation. CoRR abs/2403.13353 (2024) - [i13]Detai Xin, Xu Tan, Kai Shen, Zeqian Ju, Dongchao Yang, Yuancheng Wang, Shinnosuke Takamichi, Hiroshi Saruwatari, Shujie Liu, Jinyu Li, Sheng Zhao:
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis. CoRR abs/2404.03204 (2024) - [i12]Detai Xin, Xu Tan, Shinnosuke Takamichi, Hiroshi Saruwatari:
BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec. CoRR abs/2409.05377 (2024) - 2023
- [c10]Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control. ASRU 2023: 1-8 - [c9]Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari:
MID-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models. ICASSP 2023: 1-5 - [c8]Detai Xin, Sharath Adavanne, Federico Ang, Ashish Kulkarni, Shinnosuke Takamichi, Hiroshi Saruwatari:
Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts. ICASSP 2023: 1-5 - [c7]Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech. ICASSP 2023: 1-5 - [c6]Detai Xin, Shinnosuke Takamichi, Ai Morimatsu, Hiroshi Saruwatari:
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus. INTERSPEECH 2023: 17-21 - [c5]Joonyong Park, Shinnosuke Takamichi, Tomohiko Nakamura, Kentaro Seki, Detai Xin, Hiroshi Saruwatari:
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics. INTERSPEECH 2023: 1085-1089 - [d1]Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions. IEEE DataPort, 2023 - [i11]Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech. CoRR abs/2302.13652 (2023) - [i10]Detai Xin, Shinnosuke Takamichi, Ai Morimatsu, Hiroshi Saruwatari:
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus. CoRR abs/2305.12442 (2023) - [i9]Detai Xin, Shinnosuke Takamichi, Hiroshi Saruwatari:
JNV Corpus: A Corpus of Japanese Nonverbal Vocalizations with Diverse Phrases and Emotions. CoRR abs/2305.12445 (2023) - [i8]Joonyong Park, Shinnosuke Takamichi, Tomohiko Nakamura, Kentaro Seki, Detai Xin, Hiroshi Saruwatari:
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics. CoRR abs/2306.00697 (2023) - [i7]Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari:
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control. CoRR abs/2309.13509 (2023) - [i6]Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari:
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions. CoRR abs/2310.06072 (2023) - 2022
- [c4]Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari:
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022. INTERSPEECH 2022: 4521-4525 - [i5]Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari:
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022. CoRR abs/2204.02152 (2022) - [i4]Detai Xin, Shinnosuke Takamichi, Takuma Okamoto, Hisashi Kawai, Hiroshi Saruwatari:
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation. CoRR abs/2204.10561 (2022) - [i3]Detai Xin, Shinnosuke Takamichi, Hiroshi Saruwatari:
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations. CoRR abs/2206.10695 (2022) - [i2]Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari:
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models. CoRR abs/2210.09916 (2022) - [i1]Detai Xin, Sharath Adavanne, Federico Ang, Ashish Kulkarni, Shinnosuke Takamichi, Hiroshi Saruwatari:
Improving Speech Prosody of Audiobook Text-to-Speech Synthesis with Acoustic and Textual Contexts. CoRR abs/2211.02336 (2022) - 2021
- [c3]Detai Xin, Tatsuya Komatsu, Shinnosuke Takamichi, Hiroshi Saruwatari:
Disentangled Speaker and Language Representations Using Mutual Information Minimization and Domain Adaptation for Cross-Lingual TTS. ICASSP 2021: 6608-6612 - [c2]Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis. Interspeech 2021: 1614-1618 - 2020
- [c1]Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari:
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space. INTERSPEECH 2020: 2947-2951
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-22 20:14 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint