{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T13:37:47Z","timestamp":1776951467998,"version":"3.51.4"},"reference-count":80,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2024,7,24]],"date-time":"2024-07-24T00:00:00Z","timestamp":1721779200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2022ZD0115004"],"award-info":[{"award-number":["2022ZD0115004"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Quant. Biol."],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Transformer\u2010based foundation models such as ChatGPTs have revolutionized our daily life and affected many fields including bioinformatics. In this perspective, we first discuss about the direct application of textual foundation models on bioinformatics tasks, focusing on how to make the most out of canonical large language models and mitigate their inherent flaws. Meanwhile, we go through the transformer\u2010based, bioinformatics\u2010tailored foundation models for both sequence and non\u2010sequence data. In particular, we envision the further development directions as well as challenges for bioinformatics foundation models.<\/jats:p>","DOI":"10.1002\/qub2.69","type":"journal-article","created":{"date-parts":[[2024,7,24]],"date-time":"2024-07-24T10:58:31Z","timestamp":1721818711000},"page":"339-344","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["Foundation models for bioinformatics"],"prefix":"10.1002","volume":"12","author":[{"given":"Ziyu","family":"Chen","sequence":"first","affiliation":[{"name":"State Key Laboratory of Protein and Plant Gene Research School of Life Sciences Biomedical Pioneering Innovative Center (BIOPIC) &amp; Beijing Advanced Innovation Center for Genomics (ICG) Center for Bioinformatics (CBI) Peking University  Beijing China"},{"name":"Changping Laboratory  Beijing China"}]},{"given":"Lin","family":"Wei","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Protein and Plant Gene Research School of Life Sciences Biomedical Pioneering Innovative Center (BIOPIC) &amp; Beijing Advanced Innovation Center for Genomics (ICG) Center for Bioinformatics (CBI) Peking University  Beijing China"},{"name":"Changping Laboratory  Beijing China"}]},{"given":"Ge","family":"Gao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Protein and Plant Gene Research School of Life Sciences Biomedical Pioneering Innovative Center (BIOPIC) &amp; Beijing Advanced Innovation Center for Genomics (ICG) Center for Bioinformatics (CBI) Peking University  Beijing China"},{"name":"Changping Laboratory  Beijing China"}]}],"member":"311","published-online":{"date-parts":[[2024,7,24]]},"reference":[{"key":"e_1_2_9_2_1","unstructured":"BommasaniR HudsonDA AdeliE AltmanR AroraS vonArxS et\u00a0al.On the opportunities and risks of foundation models.2021. Preprint at arXiv: 2108.07258."},{"key":"e_1_2_9_3_1","unstructured":"ZhaoWX ZhouK LiJ TangT WangX HouY et\u00a0al.A survey of large language models.2023. Preprint at arXiv: 2303.18223."},{"key":"e_1_2_9_4_1","unstructured":"VaswaniA ShazeerN ParmarN UszkoreitJ JonesL GomezAN et\u00a0al.Attention is all you need.2017. Preprint at arXiv: 1706.03762."},{"key":"e_1_2_9_5_1","unstructured":"UszkoreitJ.Transformer: a novel neural network architecture for language understanding. Google Research Blog.2017."},{"key":"e_1_2_9_6_1","unstructured":"BahdanauD ChoK BengioY.Neural machine translation by jointly learning to align and translate.2014. Preprint at arXiv: 1409.0473."},{"key":"e_1_2_9_7_1","unstructured":"DevlinJ ChangM\u2010W LeeK ToutanovaK.BERT: pre\u2010training of deep bidirectional transformers for language understanding.2018. Preprint at arXiv: 1810.04805."},{"key":"e_1_2_9_8_1","unstructured":"LiuY OttM GoyalN DuJ JoshiM ChenD et\u00a0al.RoBERTa: a robustly optimized bert pretraining approach.2019. Preprint at arXiv: 1907.11692."},{"key":"e_1_2_9_9_1","unstructured":"BrownTB MannB RyderN SubbiahM KaplanJ DhariwalP et\u00a0al.Language models are few\u2010shot learners.2020. Preprint at arXiv: 2005.14165."},{"key":"e_1_2_9_10_1","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford A","year":"2019","journal-title":"OpenAI blog"},{"key":"e_1_2_9_11_1","unstructured":"WeiJ BosmaM ZhaoVY GuuK YuAW LesterB et\u00a0al.Finetuned language models are zero\u2010shot learners.2021. Preprint at arXiv: 2109.01652."},{"key":"e_1_2_9_12_1","unstructured":"OuyangL WuJ JiangX AlmeidaD WainwrightCL MishkinP et\u00a0al.Training language models to follow instructions with human\u00a0feedback.2022. Preprint at arXiv: 2203.02155."},{"key":"e_1_2_9_13_1","unstructured":"TouvronH LavrilT IzacardG MartinetX LachauxM\u2010A LacroixT et\u00a0al.LLaMA: open and efficient foundation language models.2023. Preprint at arXiv: 2302.13971."},{"key":"e_1_2_9_14_1","unstructured":"WorkshopB Le ScaoT FanA AkikiC PavlickE Ili\u0107S et\u00a0al.BLOOM: a 176b\u2010parameter open\u2010access multilingual language model.2022. Preprint at arXiv: 2211.05100."},{"key":"e_1_2_9_15_1","unstructured":"LiuH NingR TengZ LiuJ ZhouQ ZhangY.Evaluating the logical reasoning ability of ChatGPT and GPT\u20104.2023. Preprint at arXiv: 2304.03439."},{"key":"e_1_2_9_16_1","doi-asserted-by":"crossref","unstructured":"RogersA KovalevaO RumshiskyA.A primer in BERTology: what we know about how BERT works.2020. Preprint at arXiv: 2002.12327.","DOI":"10.1162\/tacl_a_00349"},{"key":"e_1_2_9_17_1","unstructured":"Elicit.Elicit: the AI research assistant.2023."},{"key":"e_1_2_9_18_1","doi-asserted-by":"crossref","unstructured":"XiaoS LiuZ ShaoY CaoZ.RetroMAE: pre\u2010training retrieval\u2010oriented language models via masked auto\u2010encoder.2022. Preprint at arXiv: 2205.12035.","DOI":"10.18653\/v1\/2022.emnlp-main.35"},{"key":"e_1_2_9_19_1","unstructured":"XiaoS LiuZ ZhangP MuennighoffN LianD NieJY.C\u2010pack: packaged resources to advance general Chinese embedding.2023. Preprint at arXiv: 2309.07597."},{"key":"e_1_2_9_20_1","unstructured":"OpenAI.OpenAI embeddings guides.2024."},{"key":"e_1_2_9_21_1","first-page":"1","article-title":"Bioinformatics and biomedical informatics with ChatGPT: year one review","author":"Wang J","year":"2024","journal-title":"Quantitative Biology"},{"key":"e_1_2_9_22_1","first-page":"1","article-title":"A comprehensive evaluation of large language models in mining gene relations and pathway knowledge","author":"Azam M","year":"2024","journal-title":"Quantitative Biology"},{"key":"e_1_2_9_23_1","doi-asserted-by":"crossref","unstructured":"HouW JiZ.Geneturing tests GPT models in genomics.2023. Preprint at bioRxiv: 2023.03.11.532238.","DOI":"10.1101\/2023.03.11.532238"},{"key":"e_1_2_9_24_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-024-02235-4"},{"key":"e_1_2_9_25_1","unstructured":"LeeJ YoonW KimS KimD KimS SoCH et\u00a0al.BioBERT: a pre\u2010trained biomedical language representation model for biomedical text mining.2019. Preprint at arXiv: 1901.08746."},{"key":"e_1_2_9_26_1","doi-asserted-by":"crossref","unstructured":"LuoR SunL XiaY QinT ZhangS PoonH et\u00a0al.BioGPT: generative pre\u2010trained transformer for biomedical text generation and mining.2022. Preprint at arXiv: 2210.10341.","DOI":"10.1093\/bib\/bbac409"},{"key":"e_1_2_9_27_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-023-06291-2"},{"key":"e_1_2_9_28_1","unstructured":"JiZ LeeN FrieskeR YuT SuD XuY et\u00a0al.Survey of hallucination in natural language generation.2022. Preprint at arXiv: 2202.03629."},{"key":"e_1_2_9_29_1","doi-asserted-by":"crossref","unstructured":"TiwariK MatthewsL MayB ShamovskyV Orlic\u2010MilacicM RothfelsK et\u00a0al.ChatGPT usage in the reactome curation process.2023. Preprint at bioRxiv: 2023.11.08.566195.","DOI":"10.1101\/2023.11.08.566195"},{"key":"e_1_2_9_30_1","doi-asserted-by":"crossref","unstructured":"ChenY GaoJ PetrucM HammerRD PopescuM XuD.Iterative prompt refinement for mining gene relationships from ChatGPT.2023. Preprint at bioRxiv: 2023.12.23.573201.","DOI":"10.1101\/2023.12.23.573201"},{"key":"e_1_2_9_31_1","unstructured":"BorgeaudS MenschA HoffmannJ CaiT RutherfordE MillicanK et\u00a0al.Improving language models by retrieving from trillions of tokens.2021. Preprint at arXiv: 2112.04426."},{"key":"e_1_2_9_32_1","unstructured":"GaoL MaX LinJ CallanJ.Precise zero\u2010shot dense retrieval without relevance labels.2022. Preprint at arXiv: 2212.10496."},{"key":"e_1_2_9_33_1","unstructured":"ChaseH.Langchain.2022."},{"key":"e_1_2_9_34_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41587-022-01432-w"},{"key":"e_1_2_9_35_1","doi-asserted-by":"crossref","unstructured":"MeierJ RaoR VerkuilR LiuJ SercuT RivesA.Language models enable zero\u2010shot prediction of the effects of mutations on protein function.2021. Preprint at bioRxiv: 2021.07.09.450648.","DOI":"10.1101\/2021.07.09.450648"},{"key":"e_1_2_9_36_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2016239118"},{"key":"e_1_2_9_37_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.ade2574"},{"key":"e_1_2_9_38_1","doi-asserted-by":"crossref","unstructured":"HsuC VerkuilR LiuJ LinZ HieB SercuT et\u00a0al.Learning inverse folding from millions of predicted structures.2022. Preprint at bioRxiv: 2022.04.10.487779.","DOI":"10.1101\/2022.04.10.487779"},{"key":"e_1_2_9_39_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btab083"},{"key":"e_1_2_9_40_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2311219120"},{"key":"e_1_2_9_41_1","unstructured":"Dalla\u2010TorreH GonzalezL Mendoza\u2010RevillaJ CarranzaNL GrzywaczewskiAH OteriF et\u00a0al.The nucleotide transformer: building and evaluating robust foundation models for human genomics.2023. Preprint at bioRxiv: 2023.01.11.523679."},{"key":"e_1_2_9_42_1","unstructured":"ZhouZ JiY LiW DuttaP DavuluriR LiuH.DNABERT\u20102: efficient foundation model and benchmark for multi\u2010species genome.2023. Preprint at arXiv: 2306.15006."},{"key":"e_1_2_9_43_1","unstructured":"NguyenE PoliM DurrantMG ThomasAW KangB SullivanJ et\u00a0al.Sequence modeling and design from molecular to genome scale with Evo.2024. Preprint at bioRxiv: 2024.02.27.582234."},{"key":"e_1_2_9_44_1","doi-asserted-by":"crossref","unstructured":"ChenJ HuZ SunS TanQ WangY YuQ et\u00a0al.Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions.2022. Preprint at arXiv: 2204.00300.","DOI":"10.1101\/2022.08.06.503062"},{"key":"e_1_2_9_45_1","doi-asserted-by":"crossref","unstructured":"WangX GuR ChenZ LiY JiX KeG et\u00a0al.UNI\u2010RNA: universal pre\u2010trained models revolutionize RNA research.2023. Preprint at bioRxiv. 2023.2007.2011.548588.","DOI":"10.1101\/2023.07.11.548588"},{"key":"e_1_2_9_46_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41587-022-01618-2"},{"key":"e_1_2_9_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cels.2023.10.002"},{"key":"e_1_2_9_48_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-022-32007-7"},{"key":"e_1_2_9_49_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-022-00499-z"},{"key":"e_1_2_9_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sbi.2024.102794"},{"key":"e_1_2_9_51_1","unstructured":"RaoR LiuJ VerkuilR MeierJ CannyJF AbbeelP et\u00a0al.MSA transformer.2021. Preprint at bioRxiv: 2021.02.12.430858."},{"key":"e_1_2_9_52_1","doi-asserted-by":"crossref","unstructured":"ZhengK LongS LuT YangJ DaiX ZhangM et\u00a0al.ESM all\u2010atom: multi\u2010scale protein language model for unified molecular modeling.2024. Preprint at arXiv: 2403.12995.","DOI":"10.1101\/2024.03.04.583284"},{"key":"e_1_2_9_53_1","unstructured":"DosovitskiyA BeyerL KolesnikovA WeissenbornD ZhaiX UnterthinerT et\u00a0al.An image is worth 16x16 words: transformers for image recognition at scale.2020. Preprint at arXiv: 2010.11929."},{"key":"e_1_2_9_54_1","first-page":"1691","volume-title":"Proceedings of the 37th international conference on machine learning","author":"Chen M","year":"2020"},{"key":"e_1_2_9_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3307339.3342186"},{"key":"e_1_2_9_56_1","volume-title":"Proceedings of the 34th international conference on neural information processing systems","author":"Rong Y","year":"2020"},{"key":"e_1_2_9_57_1","doi-asserted-by":"publisher","DOI":"10.1021\/acsomega.1c05203"},{"key":"e_1_2_9_58_1","first-page":"1","article-title":"Current opinions on large cellular models","author":"Hao M","year":"2024","journal-title":"Quantitative Biology"},{"key":"e_1_2_9_59_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-023-06139-9"},{"key":"e_1_2_9_60_1","unstructured":"YangX LiuG FengG BuD WangP JiangJ et\u00a0al.Genecompass: deciphering universal gene regulatory mechanisms with knowledge\u2010informed cross\u2010species foundation model.2023. Preprint at bioRxiv: 2023.09.26.559542."},{"key":"e_1_2_9_61_1","doi-asserted-by":"crossref","unstructured":"SchaarAC Tejada\u2010LapuertaA PallaG GutgesellR HalleL MinaevaM et\u00a0al.Nicheformer: a foundation model for single\u2010cell and spatial omics.2024. Preprint at bioRxiv: 2024.04.15.589472.","DOI":"10.2139\/ssrn.4803291"},{"key":"e_1_2_9_62_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-022-00534-z"},{"key":"e_1_2_9_63_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-024-02201-0"},{"key":"e_1_2_9_64_1","doi-asserted-by":"crossref","unstructured":"HaoM GongJ ZengX LiuC GuoY ChengX et\u00a0al.Large scale foundation model on single\u2010cell transcriptomics.2023. Preprint at bioRxiv: 2023.05.29.542705.","DOI":"10.1101\/2023.05.29.542705"},{"key":"e_1_2_9_65_1","doi-asserted-by":"crossref","unstructured":"GongJ HaoM ZengX LiuC MaJ ChengX et\u00a0al.xTrimoGene: an efficient and scalable representation learner for single\u2010cell RNA\u2010seq data.2023. Preprint at bioRxiv: 2023.03.24.534055.","DOI":"10.1101\/2023.03.24.534055"},{"key":"e_1_2_9_66_1","doi-asserted-by":"crossref","unstructured":"ChenY Zou J.GenePT: a simple but effective foundation model for genes and cells built from ChatGPT.2024. Preprint at bioRxiv: 2023.10.16.562533.","DOI":"10.1101\/2023.10.16.562533"},{"key":"e_1_2_9_67_1","doi-asserted-by":"crossref","unstructured":"LiuT ChenT ZhengW LuoX ZhaoH.scELMo: embeddings from language models are good learners for single\u2010cell data analysis.2023. Preprint at bioRxiv: 2023.12.07.569910.","DOI":"10.1101\/2023.12.07.569910"},{"key":"e_1_2_9_68_1","unstructured":"JainS WallaceBC.Attention is not explanation.2019. Preprint at arXiv: 1902.10186."},{"key":"e_1_2_9_69_1","doi-asserted-by":"crossref","unstructured":"AbnarS ZuidemaW.Quantifying attention flow in transformers.2020. Preprint at arXiv: 2005.00928.","DOI":"10.18653\/v1\/2020.acl-main.385"},{"key":"e_1_2_9_70_1","unstructured":"DaoT FuDY ErmonS RudraA R\u00e9C.Flashattention: fast and memory\u2010efficient exact attention with IO\u2010awareness.2022. Preprint at arXiv: 2205.14135."},{"key":"e_1_2_9_71_1","unstructured":"ChildR GrayS RadfordA SutskeverI.Generating long sequences with sparse transformers.2019. Preprint at arXiv: 1904.10509."},{"key":"e_1_2_9_72_1","unstructured":"ZaheerM GuruganeshG DubeyA AinslieJ AlbertiC OntanonS et\u00a0al.Big bird: transformers for longer sequences.2020. Preprint at arXiv: 2007.14062."},{"key":"e_1_2_9_73_1","unstructured":"ChoromanskiK LikhosherstovV DohanD SongX GaneA SarlosT et\u00a0al.Rethinking attention with performers.2020. Preprint at arXiv: 2009.14794."},{"key":"e_1_2_9_74_1","doi-asserted-by":"crossref","unstructured":"PengB AlcaideE AnthonyQ AlbalakA ArcadinhoS BidermanS et\u00a0al.Rwkv: reinventing RNNs for the transformer era.2023. Preprint at arXiv: 2305.13048.","DOI":"10.18653\/v1\/2023.findings-emnlp.936"},{"key":"e_1_2_9_75_1","unstructured":"PoliM MassaroliS NguyenE FuDY DaoT BaccusS et\u00a0al.Hyena hierarchy: towards larger convolutional language models.2023. Preprint at arXiv: 2302.10866."},{"key":"e_1_2_9_76_1","unstructured":"GuA DaoT.Mamba: linear\u2010time sequence modeling with selective state spaces.2023. Preprint at arXiv: 2312.00752."},{"key":"e_1_2_9_77_1","unstructured":"NguyenE PoliM FaiziM ThomasA Birch\u2010SykesC WornowM et\u00a0al.HyenaDNA: long\u2010range genomic sequence modeling at single nucleotide resolution.2023. Preprint at arXiv: 2306.15794."},{"key":"e_1_2_9_78_1","unstructured":"SuttonR.The bitter lesson.2019."},{"key":"e_1_2_9_79_1","unstructured":"KaplanJ McCandlishS HenighanT BrownTB ChessB ChildR et\u00a0al.Scaling laws for neural language models.2020. Preprint at arXiv: 2001.08361."},{"key":"e_1_2_9_80_1","unstructured":"HoffmannJ BorgeaudS MenschA BuchatskayaE CaiT RutherfordE et\u00a0al.Training compute\u2010optimal large language models.2022. Preprint at arXiv: 2203.15556."},{"key":"e_1_2_9_81_1","unstructured":"WeiJ TayY BommasaniR RaffelC ZophB BorgeaudS et\u00a0al.Emergent abilities of large language models.2022. Preprint at arXiv: 2206.07682."}],"container-title":["Quantitative Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/qub2.69","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,23]],"date-time":"2024-10-23T11:31:01Z","timestamp":1729683061000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/qub2.69"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,24]]},"references-count":80,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10.1002\/qub2.69"],"URL":"https:\/\/doi.org\/10.1002\/qub2.69","archive":["Portico"],"relation":{},"ISSN":["2095-4689","2095-4697"],"issn-type":[{"value":"2095-4689","type":"print"},{"value":"2095-4697","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,24]]},"assertion":[{"value":"2024-05-18","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-06-25","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-07-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}