Foundational models, or large language models (LLMs), are trained at the scale of the entire internet to acquire the vocabulary, syntax, and narrative structure of human language. Fine-tuning adapts a pre-trained foundational model, one that has already learned patterns and features from its massive training data, to a narrower domain using a smaller, domain-specific dataset. By leveraging the knowledge already embedded in a pre-trained model, we unlock high-performance language understanding; fine-tuning then helps the model give the right answer - not just an answer that sounds right - with a minimum of hallucinations or incorrect information.
In healthcare, out-of-the-box foundational models are often too generic to answer queries with the precision the domain demands. These models are trained on data at massive scale, which includes jokes, incorrect information, misunderstandings, and misinformation alongside the sum total of human knowledge. The problem is that if you ask a large language model for factually correct information when it has been trained on both accurate and inaccurate information, you're playing a bit of a dangerous game.
Enter fine-tuning. In the old days, we built a single model for a single task - for example, classifying procedures as evidence-based or not, per clinical practice guidelines, for patients with particular disease markers.
Each disease would need its own model, which took a long time to build and didn't scale particularly well, and we struggled with the sparsity and inherent high dimensionality of healthcare data. By fine-tuning LLMs, however, we can specialize a foundational model to achieve greater domain specificity, leverage its unique predictive and planning capabilities, and cut down on the incorrect information such models inevitably produce, all while preserving the underlying language prowess of the LLM.
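To make the idea concrete, here is a deliberately toy sketch of the core mechanic behind fine-tuning: keep a pre-trained "backbone" frozen and train only a small task-specific head on top of its representations. Everything below is a hypothetical stand-in - the random projection is not a real pretrained model, and the labels are fabricated - real fine-tuning would start from an actual pretrained LLM and a curated clinical dataset.

```python
import numpy as np

# Toy sketch: frozen pretrained "backbone" + small trainable head.
# The random projection below is a stand-in for a real pretrained
# encoder; its weights are never updated during "fine-tuning".
rng = np.random.default_rng(0)
W_frozen = rng.normal(size=(4, 8))

def frozen_encoder(x):
    # Stand-in for the pretrained model (weights stay frozen).
    return np.tanh(x @ W_frozen)

# Tiny synthetic "domain" dataset (imagine features of a procedure,
# labeled guideline-concordant or not -- purely illustrative).
X = rng.normal(size=(64, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

feats = frozen_encoder(X)      # representations from the frozen backbone
w = np.zeros(feats.shape[1])   # only this small head is "fine-tuned"
b = 0.0
lr = 0.5
for _ in range(300):           # plain logistic-regression gradient steps
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad = p - y
    w -= lr * (feats.T @ grad) / len(y)
    b -= lr * grad.mean()

acc = float(((feats @ w + b > 0) == (y > 0.5)).mean())
print(f"training accuracy of head-only fine-tune: {acc:.2f}")
```

The design point this illustrates is why fine-tuning is cheap relative to pre-training: the expensive general-purpose representation is reused as-is, and only a comparatively tiny number of parameters are updated on the domain data.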
If improved accuracy and specificity weren't enough of a selling point, there's another notable upside to fine-tuning foundational models for clinical use: computational efficiency. Using a general-purpose LLM out of the box on a narrow task can require massive compute (and spend) that amounts to overkill for the problem. Some LLM providers charge per query, others by compute load or time - either way, for a business running LLMs in a production environment, minimizing the number of LLM calls means optimizing your compute expenditures, too.
When LLMs fail at scale in healthcare, the failure is usually one of specificity. Fine-tuning large language models along domain-specific lines is one way to effectively productize genAI for healthcare use cases.
#genAI #LLM #ai #healthcareAI #healthtech #largelanguagemodels #generativeAI #chatGPT