Adrián Castelló 0001
Person information
- affiliation: Universitat Jaume I de Castello, Spain
2020 – today
- 2024
- [j23] Rafael Rodríguez-Sánchez, Adrián Castelló, Sandra Catalán, Francisco D. Igual, Enrique S. Quintana-Ortí:
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors. Int. J. High Perform. Comput. Appl. 38(2): 55-68 (2024)
- [j22] Héctor Martínez, Sandra Catalán, Adrián Castelló, Enrique S. Quintana-Ortí:
Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures. J. Syst. Archit. 153: 103186 (2024)
- [j21] María Engracia Gómez, Julio Sahuquillo, Andrea Biagioni, Nikos Chrysos, Damien Berton, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Francesco Simula, Matteo Turisini, Piero Vicini, Roberto Ammendola, Carlotta Chiarini, Chiara De Luca, Fabrizio Capuani, Adrián Castelló, Jose Duro, Eugenio Stabile, Enrique S. Quintana-Ortí, Pascale Bernier-Bruna, Claire Chen, Pierre-Axel Lagadec, Gregoire Pichon, Etienne Walter, Manolis Katevenis, Sokratis Bartzis, Orestis Mousouros, Pantelis Xirouchakis, Vangelis Mageiropoulos, Michalis Gianioudis, Harisis Loukas, Aggelos Ioannou, Nikos Kallimanis, Miguel Sánchez de la Rosa, Gabriel Gomez-Lopez, Francisco Alfaro-Cortés, Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles, José L. Sánchez, Gaetan De Gassowski, Matthieu Hautreaux, Stephane Mathieu, Gilles Moreau, Marc Pérache, Hugo Taboada, Torsten Hoefler, Timo Schneider, Matteo Barnaba, Giuseppe Piero Brandino, Francesco De Giorgi, Matteo Poggi, Iakovos Mavroidis, Yannis Papaefstathiou, Nikolaos Tampouratzis, Benjamin Kalisch, Ulrich Krackhardt, Mondrian Nuessle, Wolfgang Frings, Dominik Gottwald, Felime Guimaraes, Max Holicki, Volker Marx, Yannik Müller, Carsten Clauss, Hugo Falter, Xu Huang, Jennifer Lopez Barillao, Thomas Moschny, Simon Pickartz:
RED-SEA Project: Towards a new-generation European interconnect. Microprocess. Microsystems 110: 105102 (2024)
- [j20] Cristián Ramírez, Adrián Castelló, Héctor Martínez, Enrique S. Quintana-Ortí:
Parallel GEMM-based convolution for deep learning on multicore RISC-V processors. J. Supercomput. 80(9): 12623-12643 (2024)
- [j19] Guillermo Alaejos, Héctor Martínez, Adrián Castelló, Manuel F. Dolz, Francisco D. Igual, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí:
Automatic generation of ARM NEON micro-kernels for matrix multiplication. J. Supercomput. 80(10): 13873-13899 (2024)
- [j18] Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Francisco D. Igual, Héctor Martínez, Enrique S. Quintana-Ortí:
Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM. ACM Trans. Math. Softw. 50(1): 6:1-6:34 (2024)
- [c23] Adrián Castelló, Julian Bellavita, Grace Dinh, Yuka Ikarashi, Héctor Martínez:
Tackling the Matrix Multiplication Micro-Kernel Generation with Exo. CGO 2024: 182-193
- [c22] Piotr Kluska, Adrián Castelló, Florian Scheidegger, A. Cristiano I. Malossi, Enrique S. Quintana-Ortí:
QAttn: Efficient GPU Kernels for mixed-precision Vision Transformers. CVPR Workshops 2024: 3648-3657
- [c21] Héctor Martínez, Francisco D. Igual, Rafael Rodríguez-Sánchez, Sandra Catalán, Adrián Castelló, Enrique S. Quintana-Ortí:
Inference with Transformer Encoders on ARM and RISC-V Multicore Processors. Euro-Par (2) 2024: 377-392
- [i7] Cristián Ramírez, Adrián Castelló, Héctor Martínez, Enrique S. Quintana-Ortí:
Performance Analysis of Matrix Multiplication for Deep Learning on the Edge. CoRR abs/2403.07731 (2024)
- 2023
- [j17] Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre:
Using machine learning to model the training scalability of convolutional neural networks on clusters of GPUs. Computing 105(5): 915-934 (2023)
- [j16] Adrián Castelló, Mar Catalán, Manuel F. Dolz, Enrique S. Quintana-Ortí, José Duato:
Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks. Computing 105(5): 1101-1119 (2023)
- [j15] Sergio Barrachina, Adrián Castelló, Manuel F. Dolz, Tze Meng Low, Héctor Martínez, Enrique S. Quintana-Ortí, Upasana Sridhar, Andrés E. Tomás:
Reformulating the direct convolution for high-performance deep learning inference on ARM processors. J. Syst. Archit. 135: 102806 (2023)
- [j14] Guillermo Alaejos, Adrián Castelló, Héctor Martínez, Pedro Alonso-Jordá, Francisco D. Igual, Enrique S. Quintana-Ortí:
Micro-kernels for portable and efficient matrix multiplication in deep learning. J. Supercomput. 79(7): 8124-8147 (2023)
- [j13] Manuel F. Dolz, Sergio Barrachina, Héctor Martínez, Adrián Castelló, Antonio-Manuel Vidal-Maciá, Germán Fabregat, Andrés E. Tomás:
Performance-energy trade-offs of deep learning convolution algorithms on ARM processors. J. Supercomput. 79(9): 9819-9836 (2023)
- [j12] Manuel F. Dolz, Héctor Martínez, Adrián Castelló, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí:
Efficient and portable Winograd convolutions for multi-core processors. J. Supercomput. 79(10): 10589-10610 (2023)
- [c20] Francisco D. Igual, Luis Piñuel, Sandra Catalán, Héctor Martínez, Adrián Castelló, Enrique S. Quintana-Ortí:
Automatic Generation of Micro-kernels for Performance Portability of Matrix Multiplication on RISC-V Vector Processors. SC Workshops 2023: 1521-1532
- [i6] Adrián Castelló, Julian Bellavita, Grace Dinh, Yuka Ikarashi, Héctor Martínez:
Tackling the Matrix Multiplication Micro-kernel Generation with Exo. CoRR abs/2310.17408 (2023)
- [i5] Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Francisco D. Igual, Héctor Martínez, Enrique S. Quintana-Ortí:
Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM. CoRR abs/2310.20347 (2023)
- 2022
- [j11] Adrián Castelló, Sergio Barrachina, Manuel F. Dolz, Enrique S. Quintana-Ortí, Pau San Juan, Andrés E. Tomás:
High performance and energy efficient inference for deep learning on multicore ARM processors using general optimization techniques and BLIS. J. Syst. Archit. 125: 102459 (2022)
- [j10] Sergio Barrachina, Adrián Castelló, Manuel F. Dolz, Andrés E. Tomás:
BestOf: an online implementation selector for the training and inference of deep neural networks. J. Supercomput. 78(16): 17543-17558 (2022)
- [j9] Cristián Ramírez, Adrián Castelló, Enrique S. Quintana-Ortí:
A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor. J. Supercomput. 78(16): 18051-18060 (2022)
- [c19] Andrea Biagioni, Paolo Cretaro, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Francesco Simula, Matteo Turisini, Piero Vicini, Roberto Ammendola, Pascale Bernier-Bruna, Claire Chen, Said Derradji, Stéphane Guez, Pierre-Axel Lagadec, Gregoire Pichon, Etienne Walter, Gaetan De Gassowski, Matthieu Hautreaux, Stephane Mathieu, Gilles Moreau, Marc Pérache, Hugo Taboada, Torsten Hoefler, Timo Schneider, Matteo Barnaba, Giuseppe Piero Brandino, Francesco De Giorgi, Matteo Poggi, Iakovos Mavroidis, Yannis Papaefstathiou, Nikolaos Tampouratzis, Benjamin Kalisch, Ulrich Krackhardt, Mondrian Nuessle, Pantelis Xirouchakis, Vangelis Mageiropoulos, Michalis Gianioudis, Harisis Loukas, Aggelos Ioannou, Nikos Kallimanis, Nikos Chrysos, Manolis Katevenis, Wolfgang Frings, Dominik Gottwald, Felime Guimaraes, Max Holicki, Volker Marx, Yannik Müller, Carsten Clauss, Hugo Falter, Xu Huang, Jennifer Lopez Barillao, Thomas Moschny, Simon Pickartz, Francisco J. Alfaro, Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles, José L. Sánchez, Adrián Castelló, Jose Duro, María Engracia Gómez, Enrique S. Quintana-Ortí, Julio Sahuquillo, Eugenio Stabile:
RED-SEA: Network Solution for Exascale Architectures. DSD 2022: 712-719
- [c18] Manuel F. Dolz, Adrián Castelló, Enrique S. Quintana-Ortí:
Towards Portable Realizations of Winograd-based Convolution with Vector Intrinsics and OpenMP. PDP 2022: 39-46
- [c17] Adrián Castelló, Enrique S. Quintana-Ortí, Francisco D. Igual:
Anatomy of the BLIS Family of Algorithms for Matrix Multiplication. PDP 2022: 92-99
- [c16] Cristián Ramírez, Adrián Castelló, Héctor Martínez, Enrique S. Quintana-Ortí:
Performance Analysis of Matrix Multiplication for Deep Learning on the Edge. ISC Workshops 2022: 65-76
- [c15] Adrián Castelló, Sandra Catalán, Francisco D. Igual, Enrique S. Quintana-Ortí, Rafael Rodríguez-Sánchez:
QR Factorization Using Malleable BLAS on Multicore Processors. ISC Workshops 2022: 176-189
- 2021
- [j8] Adrián Castelló, Enrique S. Quintana-Ortí, José Duato:
Accelerating distributed deep neural network training with pipelined MPI allreduce. Clust. Comput. 24(4): 3797-3813 (2021)
- [j7] Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre:
PyDTNN: A user-friendly and extensible framework for distributed deep learning. J. Supercomput. 77(9): 9971-9987 (2021)
- [c14] Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre:
A Flexible Research-Oriented Framework for Distributed Training of Deep Neural Networks. IPDPS Workshops 2021: 730-739
- [c13] Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre, Enrique S. Quintana-Ortí, José Duato:
Performance Modeling for Distributed Training of Convolutional Neural Networks. PDP 2021: 99-108
- [c12] Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre, Enrique S. Quintana-Ortí, José Duato:
Evaluation of MPI Allreduce for Distributed Training of Convolutional Neural Networks. PDP 2021: 109-116
- [i4] Adrián Castelló, Sergio Barrachina, Manuel F. Dolz, Enrique S. Quintana-Ortí, Pau San Juan:
High performance and energy efficient inference for deep learning on ARM processors. CoRR abs/2105.09187 (2021)
- [i3] Julio Silva-Rodríguez, Manuel F. Dolz, Miguel Ferrer, Adrián Castelló, Valery Naranjo, Gema Piñero:
Acoustic Echo Cancellation using Residual U-Nets. CoRR abs/2109.09686 (2021)
- 2020
- [j6] Sandra Catalán, Adrián Castelló, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí:
Programming parallel dense matrix factorizations with look-ahead and OpenMP. Clust. Comput. 23(1): 359-375 (2020)
- [j5] Adrián Castelló, Rafael Mayo Gual, Sangmin Seo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
Analysis of Threading Libraries for High Performance Computing. IEEE Trans. Computers 69(9): 1279-1292 (2020)
- [c11] Pablo San Juan, Adrián Castelló, Manuel F. Dolz, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí:
High Performance and Portable Convolution Operators for Multicore Processors. SBAC-PAD 2020: 91-98
- [i2] Pablo San Juan, Adrián Castelló, Manuel F. Dolz, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí:
High Performance and Portable Convolution Operators for ARM-based Multicore Processors. CoRR abs/2005.06410 (2020)
2010 – 2019
- 2019
- [c10] Adrián Castelló, Manuel F. Dolz, Enrique S. Quintana-Ortí, José Duato:
Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks. CCGRID 2019: 534-541
- [c9] Adrián Castelló, Manuel F. Dolz, Enrique S. Quintana-Ortí, José Duato:
Analysis of model parallelism for distributed neural networks. EuroMPI 2019: 7:1-7:10
- 2018
- [b1] Adrián Castelló:
Unification of Lightweight Thread Solutions and their Application in High Performance Programming. Jaume I University, Spain, 2018
- [j4] Adrián Castelló, Rafael Mayo, Kevin Sala, Vicenç Beltran, Pavan Balaji, Antonio J. Peña:
On the adequacy of lightweight thread approaches for high-level parallel programming models. Future Gener. Comput. Syst. 84: 22-31 (2018)
- [j3] Adrián Castelló, Antonio J. Peña, Rafael Mayo, Judit Planas, Enrique S. Quintana-Ortí, Pavan Balaji:
Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models. J. Supercomput. 74(11): 5628-5642 (2018)
- [j2] Sangmin Seo, Abdelhalim Amer, Pavan Balaji, Cyril Bordage, George Bosilca, Alex Brooks, Philip H. Carns, Adrián Castelló, Damien Genet, Thomas Hérault, Shintaro Iwasaki, Prateek Jindal, Laxmikant V. Kalé, Sriram Krishnamoorthy, Jonathan Lifflander, Huiwei Lu, Esteban Meneses, Marc Snir, Yanhua Sun, Kenjiro Taura, Peter H. Beckman:
Argobots: A Lightweight Low-Level Threading and Tasking Framework. IEEE Trans. Parallel Distributed Syst. 29(3): 512-526 (2018)
- [i1] Sandra Catalán, Adrián Castelló, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí:
Programming Parallel Dense Matrix Factorizations with Look-Ahead and OpenMP. CoRR abs/1804.07017 (2018)
- 2017
- [c8] Adrián Castelló, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
GLT: A Unified API for Lightweight Thread Libraries. Euro-Par 2017: 470-481
- [c7] Adrián Castelló, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations. ICPP 2017: 60-69
- 2016
- [c6] Sergio Iserte, Francisco J. Clemente-Castelló, Adrián Castelló, Rafael Mayo, Enrique S. Quintana-Ortí:
Enabling GPU Virtualization in Cloud Environments. CLOSER (2) 2016: 249-256
- [c5] Adrián Castelló, Antonio J. Peña, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí:
A Review of Lightweight Thread Approaches for High Performance Computing. CLUSTER 2016: 471-480
- 2015
- [j1] Carlos Reaño, Federico Silla, Adrián Castelló, Antonio J. Peña, Rafael Mayo, Enrique S. Quintana-Ortí, José Duato:
Improving the user experience of the rCUDA remote GPU virtualization framework. Concurr. Comput. Pract. Exp. 27(14): 3746-3770 (2015)
- [c4] Adrián Castelló, Antonio J. Peña, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí:
Exploring the Suitability of Remote GPGPU Virtualization for the OpenACC Programming Model Using rCUDA. CLUSTER 2015: 92-95
- [c3] Adrián Castelló, Rafael Mayo, Judit Planas, Enrique S. Quintana-Ortí:
Exploiting Task-Parallelism on GPU Clusters via OmpSs and rCUDA Virtualization. TrustCom/BigDataSE/ISPA (3) 2015: 160-165
- 2014
- [c2] Carlos Reaño, Federico Silla, Antonio J. Peña, Gilad Shainer, Scot Schultz, Adrián Castelló, Enrique S. Quintana-Ortí, José Duato:
Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0. CLUSTER 2014: 266-267
- [c1] Sergio Iserte, Adrián Castelló, Rafael Mayo, Enrique S. Quintana-Ortí, Federico Silla, José Duato, Carlos Reaño, Javier Prades:
SLURM Support for Remote GPU Virtualization: Implementation and Performance Study. SBAC-PAD 2014: 318-325
last updated on 2024-10-23 20:33 CEST by the dblp team
all metadata released as open data under CC0 1.0 license