Day 5

The document outlines a summer program on AI and Machine Learning, focusing on Large Language Models (LLMs). It covers key aspects such as data curation, model architecture, training techniques, evaluation, prompt engineering, and fine-tuning of LLMs. The document emphasizes the importance of quality data and provides practical steps for building and utilizing LLMs effectively.

Uploaded by

Hari Patel

NUS ACE SUMMER PROGRAMME

AI & MACHINE LEARNING

Manoranjan Dash
Professor and Dean
School of Computing and Data Science
FLAME University, Pune

Ex-Senior Research Fellow

Singapore Data Science Consortium
National University of Singapore
Outline
• Large Language Model
• Introduction
Introduction
1. What is an LLM?
a. An LLM is an instance of a foundation model
i. Foundation models are pretrained on large amounts of unlabeled data using self-supervised learning
b. "Large" is really large: training data can run from terabytes up to petabytes
c. "Large" also refers to the parameter count
i. GPT-3: 175 billion parameters, trained on text drawn from about 45 TB of raw data

2. How do they work?

a. LLM = Data + Architecture (Transformer) + Training
b. Training: generation starts out essentially random, and the model learns through backpropagation (BP) until it generates coherent text

3. Business Applications
a. Chatbots: customer queries
b. Content creation: articles, emails, social media posts, etc.
c. Code generation
Introduction
• Two years ago, implementing an LLM was considered cutting edge and esoteric
• Now it can be implemented and applied quite easily
• Many businesses and large organizations have now built their own LLMs
• Example: BloombergGPT (50 billion parameters)
• Built to work with financial data

• No need to build an LLM from scratch

• More efficient: prompt engineering, fine-tuning

https://www.youtube.com/watch?v=ZLbVdvOoTKM
So, we won’t be training an LLM soon.
Let’s discuss the technical aspects of building one of these models.
4 Key Steps
• Data Curation
• Model Architecture
• Training at Scale
• Evaluation
Step 1: Data Curation
• The most important and most time-consuming part of the process
• Remember:
• Garbage in, garbage out
• The quality of your model is driven by the quality of your data
• But LLMs require very large training datasets
• So where can we get quality training data?

1 trillion words = 1,000,000,000,000 words

(approx.) 10,000,000 novels, at roughly 100,000 words each
(approx.) 1,000,000,000 news articles, at roughly 1,000 words each
Step 1: Data Curation
• Where do we get all this data?
• Internet
• Web pages, Wikipedia, forums, books, scientific articles, code bases, etc.
• Post-ChatGPT, copyright has come under much closer scrutiny
• You may grab data that you are not supposed to grab (web scraping)
• Or you may use the data commercially without the right to do so
• Public datasets
• Common Crawl and its filtered derivatives (Colossal Clean Crawled Corpus (C4), Falcon RefinedWeb)
• The Pile – brings together a wide variety of training datasets
• Hugging Face Datasets – Hugging Face has emerged as a major player in AI and LLMs
• Private data sources
• FinPile – used to train BloombergGPT
• Advantage: not available to others
• Using an LLM
• Alpaca (Stanford) – an LLM trained on structured text generated by GPT-3
[Figure: Alpaca (Stanford) pipeline, from prompt to training data]
While Common Crawl is derived from raw web data, it is
aggregated and standardized in a way that differentiates it
from a random collection of web pages. This added
structure and cleanliness make it more valuable for training
LLMs.
Step 1: Data Curation

- Higher data diversity yields models that perform well on a wide variety of tasks, making them general purpose
- GPT-3: web pages + some books
- Gopher: web pages + some books and some code
- Llama: web pages + some books, some code, and some scientific articles
- PaLM: mainly conversational data + web pages, books, and code
- Knowing these data mixes guides how we query these models
Quality of a model is driven by the quality of the training data
Step 1: Data Curation
• How to prepare the data?
• Four steps
1. Quality filtering – remove low-quality text from the dataset
• Toxic language, hate speech, objectively false statements (e.g., 2+2=5), …
• Two types of filtering
• Classifier based
• Classify text as high or low quality
• Heuristic based
• Remove specific words or repeated words, or drop text based on its statistical properties
• Or combine the two approaches
2. De-duplication – several instances of the same or similar text can bias the model
• If the same web page appears in both training and test data, evaluation results are inflated
3. Privacy redaction – removal of sensitive and confidential information
4. Tokenization – translate text into numbers
• A neural network operates on numerical values, not raw text
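The four preparation steps can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not how any production pipeline works: the blocklist, the thresholds, the email pattern, and the word-level vocabulary are all hypothetical choices made for the example. Real LLM pipelines typically use trained quality classifiers, near-duplicate detection, and subword tokenizers such as byte-pair encoding.

```python
import hashlib
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BLOCKLIST = {"spamword"}  # hypothetical list of unwanted words


def passes_heuristics(text, min_words=3, max_repeat_ratio=0.5):
    """Step 1 (heuristic filter): drop very short, blocklisted, or repetitive text."""
    words = text.lower().split()
    if len(words) < min_words:
        return False
    if any(w in BLOCKLIST for w in words):
        return False
    most_common = max(words.count(w) for w in set(words))
    return most_common / len(words) <= max_repeat_ratio


def deduplicate(docs):
    """Step 2: keep the first copy of each document (exact match after normalization)."""
    seen, unique = set(), []
    for doc in docs:
        normalized = " ".join(doc.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def redact(text):
    """Step 3: replace email addresses with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def build_vocab(docs):
    """Step 4a: assign an integer id to every distinct word."""
    vocab = {"<unk>": 0}
    for doc in docs:
        for word in doc.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def tokenize(text, vocab):
    """Step 4b: map text to a list of integer ids (unknown words map to 0)."""
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]


def curate(raw_docs):
    kept = [d for d in raw_docs if passes_heuristics(d)]
    kept = deduplicate(kept)
    return [redact(d) for d in kept]


raw = [
    "The cat sat on the mat",
    "the cat  sat on the mat",                      # duplicate after normalization
    "buy buy buy buy now",                          # dominated by one repeated word
    "Contact me at alice@example.com for details",  # contains an email address
    "too short",
]
clean = curate(raw)
vocab = build_vocab(clean)
ids = tokenize(clean[0], vocab)
```

Of the five raw documents, two survive: the first sentence and the contact line with its email address replaced by a placeholder.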
Step 2: Model Architecture
• Transformers
• A neural network architecture built on attention mechanisms
• Attention mechanism – learns dependencies between different elements of a sequence based on position and content
• E.g., "I like samosa. In general I like Indian cuisine."
• Here the word "Indian" depends on "samosa"
• 3 Types of Transformers
1. Encoder-only – encoder translates tokens into a semantically meaningful
representation | tasks: text classification
2. Decoder-only – similar to encoder but does not allow self-attention with future
elements | tasks: text generation
3. Encoder-decoder – combines and allows cross-attention | tasks: translation
• Most popular: decoder-only architecture
There is a lot more detail to model architecture, but considering the interest and level of the students, I do not discuss it here.
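The difference between encoder-style and decoder-style attention can be shown with a small sketch of scaled dot-product attention. This is a simplified illustration: it omits the learned query, key, and value projections and the multiple heads that a real Transformer uses, and the toy vectors are made up.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        if causal:
            # Decoder-style mask: position i must not attend to later positions.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy sequence of 3 tokens, each a 2-dimensional vector (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, encoder_weights = attention(x, x, x, causal=False)
_, decoder_weights = attention(x, x, x, causal=True)
```

In `encoder_weights` every token places some weight on every other token, while in `decoder_weights` the first token can attend only to itself and the second only to the first two. That mask is what lets a decoder-only model be trained to predict the next token.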
Step 2: Model Architecture
• How big do I make it?
• If a model is too big or trained too long, it can overfit
• If a model is too small or not trained long enough, it can underperform
Step 3: Training at Scale
• The central challenge of LLMs is their scale – training on trillions of tokens with billions of parameters carries a very high computational cost
• So, some computational tricks/techniques are necessary
• 3 Training Techniques
1. Mixed Precision Training – uses both 32-bit and 16-bit floating-point data types
• Use 16-bit precision whenever possible; fall back to 32-bit where higher precision is needed
2. 3D Parallelism – combination of pipeline, model, and data parallelism
• Pipeline Parallelism – distributes layers across multiple GPUs
• Model Parallelism – decomposes a parameter matrix operation into multiple smaller matrix operations and distributes them across multiple GPUs
• Data Parallelism – distributes training data across multiple GPUs
3. Zero Redundancy Optimizer (ZeRO) – reduces memory redundancy by partitioning the optimizer state, gradients, or parameters across GPUs
• Example: DeepSpeed
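Why mixed precision keeps some values in 32-bit can be seen with a tiny experiment. The sketch below uses Python's `struct` module to round numbers to 16-bit and 32-bit floats. The weight and update values are hypothetical, chosen only to show the effect.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest 16-bit (half-precision) float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest 32-bit (single-precision) float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


weight = 1.0
update = 1e-4  # hypothetical small step: learning rate times gradient

# Apply the same small update while storing the result in each precision.
fp16_result = to_fp16(to_fp16(weight) + to_fp16(update))
fp32_result = to_fp32(to_fp32(weight) + to_fp32(update))

print("16-bit:", fp16_result)
print("32-bit:", fp32_result)
```

Near 1.0 the gap between neighbouring 16-bit floats is about 0.001, so the 16-bit result rounds back to 1.0 and the update is lost, while the 32-bit result keeps it. This is one reason mixed-precision setups commonly run most arithmetic in 16-bit but keep a 32-bit copy of the weights for the update step.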
Step 3: Training at Scale
• Training Stability
• Checkpointing – takes a snapshot of model artifacts so training can
resume from that point
• E.g., suppose training is going well and the error is falling, but suddenly the error spikes. With checkpointing, it is possible to go back to a point where training was still going well
• Weight Decay – regularization strategy that penalizes large parameter
values by adding a term (e.g., L2 norm of weights) to the loss function or
changing the parameter update rule
• Gradient Clipping – rescales the gradient of the objective function if its
norm exceeds a pre-specified value
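Weight decay and gradient clipping are both small changes to the parameter update. The sketch below shows them on plain Python lists. The function names, the learning rate, and the decay coefficient are assumptions for illustration, and plain gradient descent stands in for the Adam-style optimizers used in practice.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def sgd_step(weights, grads, lr=0.1, weight_decay=0.01, max_norm=1.0):
    """One gradient descent step with clipping and L2 weight decay."""
    clipped = clip_by_global_norm(grads, max_norm)
    # An L2 penalty of (weight_decay / 2) * ||w||^2 in the loss contributes
    # weight_decay * w to the gradient, pulling large weights toward zero.
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, clipped)]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # norm 5 is rescaled to norm 1
new_weights = sgd_step([1.0, -2.0], [3.0, 4.0])
```

Clipping keeps the direction of the gradient and limits only its length, so a single bad batch cannot move the weights arbitrarily far.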
Step 3: Training at Scale
• Hyperparameters
• Batch Size: (Static) typically ~16M tokens; (Dynamic) GPT-3 increased it gradually from 32K to 3.2M tokens
• Learning Rate (LR): (Dynamic) the LR increases linearly until reaching a maximum value and then decreases via cosine decay until it is about 10% of its max value
• Optimizer: Adam-based optimizers are most commonly used for LLMs
• Dropout: typical values between 0.2 and 0.5
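The dynamic learning-rate schedule described above (linear warmup, then cosine decay to about 10% of the maximum) can be written as a single function. The maximum LR, the warmup length, and the total step count below are placeholder values, not figures from any particular model.

```python
import math


def learning_rate(step, max_lr=3e-4, warmup_steps=100, total_steps=1000, min_ratio=0.1):
    """Linear warmup to max_lr, then cosine decay down to min_ratio * max_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 down to 0
    min_lr = max_lr * min_ratio
    return min_lr + (max_lr - min_lr) * cosine


for step in (0, 50, 99, 500, 1000):
    print(step, learning_rate(step))
```

The warmup avoids large, unstable updates while the weights are still essentially random, and the decay lets training settle as it approaches the end.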
Step 4: Evaluation
• Having a model is not enough
• You need to evaluate it and find the cases in which it works well
• For this there are many benchmark datasets
• Benchmark Dataset (Open LLM Leaderboard)
Step 4: Evaluation
• Multiple-choice Tasks
• ARC, HellaSwag, MMLU
• ARC: grade-school science questions; MMLU: questions across many subjects, from humanities to STEM
• E.g., which is the latest technology developed: (a) cellphone, (b) airplane, (c) microwave, (d) refrigerator
• HellaSwag is different – it tests commonsense reasoning by asking for the most plausible continuation of a sentence
• Open-ended Tasks
• TruthfulQA
• Human Evaluation – a person scores the completion based on ground truth, guidelines, or both
• NLP Metrics – quantify completion quality via metrics such as perplexity, BLEU, or ROUGE scores
• Auxiliary Fine-tuned LLM – use an LLM to compare completions to ground truth
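Two of the measurements above are simple enough to compute by hand. The sketch below scores multiple-choice answers by accuracy and computes perplexity from the probabilities a model assigned to each correct next token. The probabilities and answers are made-up inputs, and a real benchmark harness would obtain them from the model.

```python
import math


def multiple_choice_accuracy(predictions, answers):
    """Fraction of questions where the predicted option matches the answer key."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def perplexity(token_probs):
    """exp of the average negative log-probability of the correct tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


accuracy = multiple_choice_accuracy(["a", "b", "c", "d"], ["a", "b", "d", "d"])
uniform_ppl = perplexity([0.25, 0.25, 0.25, 0.25])  # model guessing among 4 options
```

A perplexity of 4 means the model is, on average, as uncertain as if it were choosing uniformly among four tokens, so lower is better.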
What’s Next?
• Base models are typically a starting point, not the final solution
Prompt Engineering
• Prompt engineering is a crucial technique in the use of large
language models (LLMs) like GPT-4, designed to optimize the way
prompts are crafted to elicit the best possible responses from the
model.
• As LLMs are driven by the context provided to them, the way questions or
tasks are framed can significantly impact the quality and relevance of the
output.
What is Prompt Engineering?
• Prompt engineering involves designing and refining the input
prompts given to an LLM to achieve specific, desired responses.
• This includes choosing the right words, structure, and context to
maximize the effectiveness of the model's output.
Contextual Information
• Contextual Prompts
• Use context-rich prompts to help the model understand the scenario
better. This might include examples, definitions, or prior conversation
history.
• Multi-turn Prompts
• In a conversational setting, use a series of prompts that build on previous
responses to maintain context and coherence.
Format and Structure
• Question Format
• Frame questions or tasks in a way that directs the model towards the type of response
needed.

• For example, asking "List three benefits of exercise" rather than "What are the benefits of
exercise?"

• Templates
• Use consistent templates for repetitive tasks to standardize responses and ensure
reliability.
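A consistent template can be as simple as a string with named slots. The sketch below uses Python's standard `string.Template`. The field names and the wording of the template are assumptions made for illustration.

```python
from string import Template

# Hypothetical template: a fixed frame with three slots to fill in per request.
TASK_TEMPLATE = Template(
    "You are a helpful assistant.\n"
    "Task: $task\n"
    "Output format: $output_format\n"
    "Input: $text"
)


def build_prompt(task, output_format, text):
    """Fill the template so that every request has the same structure."""
    return TASK_TEMPLATE.substitute(task=task, output_format=output_format, text=text)


prompt = build_prompt(
    task="List three benefits of exercise",
    output_format="a numbered list with one short sentence per item",
    text="(none)",
)
print(prompt)
```

Because only the slots change between requests, differences in the output can be traced to the input rather than to accidental rewording of the instruction.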
Prompt Length
• Brevity vs. Detail
• Balance between being concise and providing enough detail.
• Prompts that are too short may lack context, while overly long prompts can be confusing and dilute the main instruction.
Examples and Demonstrations
• Few-shot Learning
• Provide examples within the prompt (few-shot learning) to illustrate the
desired response format.
• For example, “Translate English to French. Example: ‘Hello, how are you?’ -> ‘Bonjour, comment ça va ?’ Now translate: ‘Good morning.’”
• Role-Playing
• Use role-playing to set the scene, such as "You are a helpful assistant.
How would you explain photosynthesis to a 5-year-old?"
Iterative Refinement
• Feedback Loop
• Continuously refine prompts based on the outputs received. If the
model’s response is not as expected, tweak the prompt and try again.
• Experimentation
• Experiment with different phrasings and structures to see which
variations yield the best results.
Examples of Prompt Engineering
• Example 1: Simple Query
• Before
• "Tell me about climate change."

• After
• "Explain the causes and effects of climate change in detail, and provide examples of
how it impacts different regions around the world."
Examples of Prompt Engineering
• Example 2: Summarization
• Before
• "Generate a summary."

• After
• "Read the following paragraph and generate a summary that highlights the main
points: [Insert paragraph here]."
Examples of Prompt Engineering
• Example 3: Few-shot Learning
• Before
• "Translate to Spanish: 'Good night.'"

• After
• "Translate the following English sentences to Spanish. Example: 'Hello, how are
you?' -> 'Hola, ¿cómo estás?'. Now translate: 'Good night.'"
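A few-shot prompt like the one in Example 3 can be assembled programmatically from a list of example pairs. The helper name and the exact layout below are illustrative choices, not a required format.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Combine an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction]
    for source, target in examples:
        lines.append(f"'{source}' -> '{target}'")
    # End with the new input and an open arrow for the model to complete.
    lines.append(f"'{query}' ->")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate the following English sentences to Spanish.",
    [("Hello, how are you?", "Hola, ¿cómo estás?")],
    "Good night.",
)
print(prompt)
```

Adding more pairs to the examples list turns the same helper into a two-shot or three-shot prompt without changing the format the model sees.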
Best Practices
• Be Explicit
• Avoid vague language and be as explicit as possible about what you want
the model to do.
• Test Variations
• Try different versions of your prompt to see which one works best.
• Leverage Examples
• Use in-context examples to guide the model.
• Maintain Consistency
• Keep a consistent format for similar types of prompts to ensure reliable
outputs.
Model Fine-Tuning
• Introduction to Model Fine-Tuning
• Definition: Fine-tuning is the process of taking a pre-trained language
model and training it further on a specific dataset to adapt it to a
particular task.
• Purpose: It helps the model perform better on specific tasks by leveraging
the general knowledge it has already learned during pre-training.
Why Is Fine-Tuning Important?
• Specialization
• Adapts a general-purpose model to perform well on specific tasks (e.g.,
summarization, translation, sentiment analysis).
• Improved Performance
• Enhances the model's accuracy and effectiveness for the given task.
• Resource Efficiency
• Saves time and computational resources compared to training a model
from scratch.
Steps in Fine-Tuning a Language Model
1. Select a Pre-trained Model
• Choose a base model that has been pre-trained on a large corpus of text
(e.g., GPT-3, BERT).
2. Prepare the Dataset
• Collect and preprocess a dataset relevant to the specific task you want
the model to perform.
• Ensure the dataset is clean, well-labeled, and representative of the task.
3. Adjust Model Architecture (if needed)
• Sometimes, minor modifications to the model architecture are made to
better suit the task.
Steps in Fine-Tuning a Language Model
4. Set Hyperparameters
• Define hyperparameters such as learning rate, batch size, and number of
training epochs.
5. Training
• Train the model on the specific dataset by updating its weights through
backpropagation.
• Use techniques like gradient descent to minimize the loss function, which
measures the difference between the model’s predictions and the actual
results.
6. Validation
• Evaluate the model’s performance on a validation set to monitor overfitting and
adjust hyperparameters if necessary.
7. Testing
• Test the fine-tuned model on a separate test set to assess its final performance.
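The seven steps can be traced on a deliberately tiny model. In the sketch below, a two-parameter linear model stands in for the pre-trained network, and its starting weights, the task data, and all hyperparameters are made up. Real fine-tuning applies the same loop to millions or billions of weights using a deep learning framework.

```python
def predict(params, x):
    w, b = params
    return w * x + b


def mse(params, data):
    """Loss: mean squared difference between predictions and actual values."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def gradient_step(params, data, lr):
    """One gradient descent update of both weights."""
    w, b = params
    n = len(data)
    grad_w = sum(2 * (predict(params, x) - y) * x for x, y in data) / n
    grad_b = sum(2 * (predict(params, x) - y) for x, y in data) / n
    return (w - lr * grad_w, b - lr * grad_b)


def fine_tune(pretrained, train, val, lr=0.05, epochs=200, patience=5):
    """Train from the pre-trained weights, keeping the best weights on validation."""
    best, best_val = pretrained, mse(pretrained, val)
    params, bad_epochs = pretrained, 0
    for _ in range(epochs):
        params = gradient_step(params, train, lr)
        val_loss = mse(params, val)
        if val_loss < best_val:
            best, best_val, bad_epochs = params, val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # early stopping guards against overfitting
                break
    return best


# Step 2: task dataset following y = 2x + 1 (made up), split three ways.
task = [(i / 10, 2 * (i / 10) + 1) for i in range(20)]
train = [p for i, p in enumerate(task) if i % 5 < 3]
val = [p for i, p in enumerate(task) if i % 5 == 3]
test = [p for i, p in enumerate(task) if i % 5 == 4]

pretrained = (1.5, 0.0)                     # Step 1: hypothetical starting weights
tuned = fine_tune(pretrained, train, val)   # Steps 4 to 6
print("test loss before:", mse(pretrained, test))
print("test loss after: ", mse(tuned, test))  # Step 7
```

The test set is touched only once, at the end, so the final number is an honest estimate of performance on unseen data.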
Key Concepts to Understand
• Pre-training vs. Fine-tuning
• Pre-training involves training on a large, diverse dataset to learn general
language patterns, while fine-tuning adapts the model to a specific task.
• Overfitting
• A situation where the model performs well on the training data but poorly on unseen data. Fine-tuning on a small dataset is prone to this, so monitor a validation set and stop training or adjust hyperparameters when validation performance degrades.
• Transfer Learning
• The concept of transferring knowledge from one task (pre-training) to
another (fine-tuning).
LLM Hands On
• LangChain
• Framework for building applications that use LLMs
• FAISS
• Library for efficient similarity search over dense vectors

https://github.com/codebasics/langchain/blob/main/3_project_codebasics_q_and_a/google_palm_codebasics_q_and_a.ipynb
LLM Hands On Using Google PaLM
• Basic working of the Google PaLM LLM in LangChain
• Loading and retrieving data from a CSV file using LangChain and FAISS
• Creating a RetrievalQA chain with a prompt template
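The idea behind the retrieval step can be shown without any external library. The sketch below is not the notebook's code and does not use the LangChain or FAISS APIs. It is a toy stand-in in which word counts play the role of embeddings, a sorted list plays the role of the vector index, and the FAQ rows are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag of lowercase words (a real system uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


PROMPT = "Answer using only this context.\nContext: {context}\nQuestion: {question}"

# Hypothetical FAQ rows, standing in for the answer column of a CSV file.
faq = [
    "The course fee can be paid in monthly installments",
    "Certificates are issued after completing all assignments",
    "Lectures are recorded and available for one year",
]

question = "Are the lectures recorded"
context = retrieve(question, faq)[0]
prompt = PROMPT.format(context=context, question=question)
print(prompt)
```

The final prompt is what would be sent to the LLM: the model answers from the retrieved context rather than from memory, which is the purpose of a retrieval question-answering chain.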

