Naoya Takahashi
2020 – today
- 2024
- [j8]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The whole is greater than the sum of its parts: improving music source separation by bridging networks. EURASIP J. Audio Speech Music. Process. 2024(1): 39 (2024)
- [i33]Mayank Kumar Singh, Naoya Takahashi, Wei-Hsiang Liao, Yuki Mitsufuji:
SilentCipher: Deep Audio Watermarking. CoRR abs/2406.03822 (2024)
- [i32]Mayank Kumar Singh, Naoya Takahashi, Wei-Hsiang Liao, Yuki Mitsufuji:
LOCKEY: A Novel Approach to Model Authentication and Deepfake Tracking. CoRR abs/2409.07743 (2024)
- 2023
- [c34]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP 2023: 1-5
- [c33]Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion for Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. ICASSP 2023: 1-5
- [c32]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. ICASSP 2023: 1-5
- [c31]Naoya Takahashi, Ritsuka Gomi, Ayato Takii, Masashi Yamakawa, Shinichi Asao, Seiichi Takeuchi:
Numerical Simulation of the Octorotor Flying Car in Sudden Rotor Stop. ICCS (2) 2023: 33-46
- [c30]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. ICLR 2023
- [c29]Naoya Takahashi, Tatsuya Amano, Hirozumi Yamaguchi:
Multi-Person Tracking Method Robust to Dynamic Viewport Changes for AR apps. IE 2023: 1-4
- [c28]Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Iteratively Improving Speech Recognition and Voice Conversion. INTERSPEECH 2023: 206-210
- [c27]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023
- [d4]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.0.0. Zenodo, 2023
- [d3]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.1.0. Zenodo, 2023
- [i31]Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. CoRR abs/2302.10536 (2023)
- [i30]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Cross-modal Face- and Voice-style Transfer. CoRR abs/2302.13838 (2023)
- [i29]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation. CoRR abs/2305.07855 (2023)
- [i28]Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Iteratively Improving Speech Recognition and Voice Conversion. CoRR abs/2305.15055 (2023)
- [i27]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023)
- 2022
- [c26]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022
- [c25]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 241-245
- [c24]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. ICASSP 2022: 316-320
- [c23]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 4368-4372
- [c22]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. ICASSP 2022: 8872-8876
- [c21]Shrutina Agarwal, Naoya Takahashi, Sriram Ganapathy:
Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer. INTERSPEECH 2022: 3013-3017
- [d2]Adavanne Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.0.0. Zenodo, 2022
- [d1]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.1.0. Zenodo, 2022
- [i26]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022)
- [i25]Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer. CoRR abs/2208.12410 (2022)
- [i24]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. CoRR abs/2210.05148 (2022)
- [i23]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. CoRR abs/2210.07508 (2022)
- [i22]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Robust One-Shot Singing Voice Conversion. CoRR abs/2210.11096 (2022)
- [i21]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. CoRR abs/2212.07065 (2022)
- 2021
- [c20]Naoya Takahashi, Yuki Mitsufuji:
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks. CVPR 2021: 993-1002
- [c19]Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
End-to-End Lyrics Recognition with Voice to Singing Style Transfer. ICASSP 2021: 266-270
- [c18]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial Attacks on Audio Source Separation. ICASSP 2021: 521-525
- [c17]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Accdoa: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection. ICASSP 2021: 915-919
- [c16]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. IJCNN 2021: 1-7
- [i20]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. CoRR abs/2101.06842 (2021)
- [i19]Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
End-to-end lyrics Recognition with Voice to Singing Style Transfer. CoRR abs/2102.08575 (2021)
- [i18]Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji:
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection. CoRR abs/2106.10806 (2021)
- [i17]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Source Mixing and Separation Robust Audio Steganography. CoRR abs/2110.05054 (2021)
- [i16]Naoya Takahashi, Yuki Mitsufuji:
Amicable examples for informed source separation. CoRR abs/2110.05059 (2021)
- [i15]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. CoRR abs/2110.06501 (2021)
- [i14]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training. CoRR abs/2110.07124 (2021)
- 2020
- [c15]Naoya Takahashi, Yoshiki Matsui, Sotaro Sawa, Daichi Kawamoto, Mitsuru Shinagawa, Kohei Hamamura, Hiroshi Nakamura, Naohiro Shimizu, Masaya Sugino:
Electric Field Communication using a Wide Metal Plate as the Transmission Path. GCCE 2020: 735-738
- [c14]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-To-End Speech Recognition. ICASSP 2020: 41-45
- [i13]Kazuki Shimada, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net. CoRR abs/2006.12014 (2020)
- [i12]Naoya Takahashi, Yuki Mitsufuji:
D3Net: Densely connected multidilated DenseNet for music source separation. CoRR abs/2010.01733 (2020)
- [i11]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial attacks on audio source separation. CoRR abs/2010.03164 (2020)
- [i10]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. CoRR abs/2010.15306 (2020)
- [i9]Naoya Takahashi, Yuki Mitsufuji:
Densely connected multidilated convolutional networks for dense prediction tasks. CoRR abs/2011.11844 (2020)
2010 – 2019
- 2019
- [c13]Takeshi Morita, Naoya Takahashi, Mizuki Kosuda, Takahira Yamaguchi:
A Teaching Assistant Robot Design Tool Based on Knowledge Chunks Reuse. COMPSAC (2) 2019: 68-73
- [c12]Takeshi Morita, Naoya Takahashi, Mizuki Kosuda, Takahira Yamaguchi:
A Knowledge Chunk Reuse Support Tool based on Heterogeneous Ontologies. KEOD 2019: 217-224
- [c11]Yoshiki Matsui, Kenta Nezu, Naoya Takahashi, Koki Yoshioka, Mitsuru Shinagawa, Kohei Hamamura, Hiroshi Nakamura, Naohiro Shimizu:
Electric Field Communication using a Car Body as a Transmission Medium. ICST 2019: 1-5
- [c10]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive Speech Separation for Unknown Number of Speakers. INTERSPEECH 2019: 1348-1352
- [i8]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive speech separation for unknown number of speakers. CoRR abs/1904.03065 (2019)
- [i7]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-to-end Speech Recognition. CoRR abs/1911.12928 (2019)
- 2018
- [j7]Takeshi Morita, Shunsuke Akashiba, Chihiro Nishimoto, Naoya Takahashi, Reiji Kukihara, Misae Kuwayama, Takahira Yamaguchi:
A Practical Teacher-Robot Collaboration Lesson Application Based on PRINTEPS. Rev. Socionetwork Strateg. 12(1): 97-126 (2018)
- [j6]Naoya Takahashi, Michael Gygli, Luc Van Gool:
AENet: Learning Deep Audio Features for Video Analysis. IEEE Trans. Multim. 20(3): 513-524 (2018)
- [c9]Naoya Takahashi, Purvi Agrawal, Nabarun Goswami, Yuki Mitsufuji:
PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation. INTERSPEECH 2018: 2713-2717
- [c8]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
Mmdenselstm: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation. IWAENC 2018: 106-110
- [i6]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
MMDenseLSTM: An efficient combination of convolutional and recurrent neural networks for audio source separation. CoRR abs/1805.02410 (2018)
- 2017
- [c7]Stefan Uhlich, Marcello Porcu, Franck Giron, Michael Enenkl, Thomas Kemp, Naoya Takahashi, Yuki Mitsufuji:
Improving music source separation based on deep neural networks through data augmentation and network blending. ICASSP 2017: 261-265
- [c6]Shunsuke Akashiba, Chihiro Nishimoto, Naoya Takahashi, Takeshi Morita, Reiji Kukihara, Misae Kuwayama, Takahira Yamaguchi:
Implementation of Teacher-Robot Collaboration Lesson Application in PRINTEPS. KES 2017: 2299-2308
- [c5]Naoya Takahashi, Yuki Mitsufuji:
Multi-Scale multi-band densenets for audio source separation. WASPAA 2017: 21-25
- [c4]Shunsuke Akashiba, Chihiro Nishimoto, Naoya Takahashi, Takeshi Morita, Reiji Kukihara, Misae Kuwayama, Takahira Yamaguchi:
Development of applications for teaching assistant robots with teachers in PRINTEPS. WI 2017: 1184-1190
- [i5]Naoya Takahashi, Michael Gygli, Luc Van Gool:
AENet: Learning Deep Audio Features for Video Analysis. CoRR abs/1701.00599 (2017)
- [i4]Naoya Takahashi, Yuki Mitsufuji:
Multi-scale Multi-band DenseNets for Audio Source Separation. CoRR abs/1706.09588 (2017)
- 2016
- [c3]Naoya Takahashi, Tofigh Naghibi, Beat Pfister:
Automatic Pronunciation Generation by Utilizing a Semi-Supervised Deep Neural Networks. INTERSPEECH 2016: 1141-1145
- [c2]Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool:
Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition. INTERSPEECH 2016: 2982-2986
- [i3]Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool:
Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection. CoRR abs/1604.07160 (2016)
- [i2]Naoya Takahashi, Tofigh Naghibi, Beat Pfister:
Automatic Pronunciation Generation by Utilizing a Semi-supervised Deep Neural Networks. CoRR abs/1606.05007 (2016)
- [i1]Naoya Takahashi, Mitsuharu Matsumoto, Shuji Hashimoto:
Noise reduction combining microphone and piezoelectric device. CoRR abs/1611.03178 (2016)
- 2014
- [j5]Minoru Nakayama, Naoya Takahashi:
Chronological states of viewer's intentions using hidden Markov models and features of eye movement. EAI Endorsed Trans. Context aware Syst. Appl. 1(1): e5 (2014)
- 2010
- [j4]Daisuke Ishikawa, Naoya Takahashi, Takuya Sasaki, Atsushi Usami, Norio Matsuki, Yuji Ikegaya:
Fluorescent pipettes for optically targeted patch-clamp recordings. Neural Networks 23(6): 669-672 (2010)
2000 – 2009
- 2007
- [c1]Naoya Takahashi, Mitsuharu Matsumoto, Shuji Hashimoto:
Electric Koto by vibrating Body. ICMC 2007
1990 – 1999
- 1995
- [j3]Tetsuhiko Fujii, Akira Yamamoto, Naoya Takahashi, Minoru Yoshida:
Masked Transferring Method of Discontinuous Sectors in Disk Cache System. IEICE Trans. Inf. Syst. 78-D(10): 1239-1247 (1995)
- 1994
- [j2]Naoya Takahashi:
Performance improvement of jukebox-type optical disk file system. Syst. Comput. Jpn. 25(4): 13-21 (1994)
1980 – 1989
- 1987
- [j1]Naoya Takahashi, Masao Mukaidono:
Disjoint disjunctive form of boolean functions and its applications. Syst. Comput. Jpn. 18(11): 1-11 (1987)
last updated on 2024-10-15 00:26 CEST by the dblp team
all metadata released as open data under CC0 1.0 license