Abstract
Text classification assigns text to one or more predefined categories based on its characteristics. Existing methods improve performance by fine-tuning pre-trained language models and introducing label embeddings, but because medical data is sensitive and therefore scarce, models often overfit: they focus excessively on the details and noise of the few available samples and fail to generalize to new data, which undermines robustness. To address this issue, we propose a novel approach that integrates prefix label embedding with pre-trained language models. We further introduce a scoring mechanism that assesses the similarity between labels and text at the classification level. By leveraging predictive score-guided Mixup, our method mines features closely related to classification, alleviating overfitting and enhancing robustness. In addition, a multi-head mechanism enriches feature representation and improves interpretability. Experimental results demonstrate that our framework significantly improves accuracy on medical datasets.
Supported by the National Natural Science Foundation of China (No. 62266028) and Yunnan Major Science and Technology Project (No. 202202AD080003).
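The exact predictive score-guided Mixup formulation is not reproduced here, but the underlying Mixup operation interpolates pairs of training examples and their labels with a coefficient drawn from a Beta distribution. Below is a minimal sketch, assuming pooled BERT-style sentence embeddings of shape (batch, hidden) and one-hot label vectors; the score-guided weighting shown is an illustrative assumption of how prediction scores could bias the interpolation, not the authors' published method.

```python
import torch

def mixup(x, y, alpha=0.2, scores=None):
    """Mixup over a batch of pooled sentence embeddings.

    x: (batch, hidden) text representations
    y: (batch, num_classes) one-hot or soft label vectors
    scores: optional (batch,) per-example prediction scores; if given,
        the interpolation coefficient is biased toward the higher-scoring
        example of each pair (illustrative variant, assumed here).
    """
    # Standard Mixup coefficient drawn from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    if scores is not None:
        # Re-weight lambda per pair using the two examples' scores.
        s_i, s_j = scores, scores[perm]
        lam = lam * s_i / (lam * s_i + (1.0 - lam) * s_j + 1e-8)
        lam = lam.unsqueeze(-1)  # broadcast over hidden/label dims
    mixed_x = lam * x + (1.0 - lam) * x[perm]
    mixed_y = lam * y + (1.0 - lam) * y[perm]
    return mixed_x, mixed_y
```

Mixing at the embedding level (rather than on raw tokens) follows the common practice for applying Mixup to text, where discrete inputs cannot be interpolated directly.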