Certificate of Completion in
Programme for the Use of Generative Artificial Intelligence Tools at the Workplace
Application Course: Using GPT to Solve Real-Life Problem
Introduction to Chatbot and How It Works
江紹祥教授 KONG, Siu-Cheung
數學與資訊科技學系電子學習與數碼能力研究講座教授 Research Chair Professor of E-Learning and Digital Competency of
人工智能及數碼能力教育中心總監 Department of Mathematics and Information Technology
香港教育大學 Director of Artificial Intelligence and Digital Competency Education Centre
The Education University of Hong Kong
Introduction to chatbot – How to use next word
prediction to create Chatbot?
2
Introduction to chatbot – How to use next word prediction to
create Chatbot?
● Using special tokens to inform (during fine-tuning/training) the chatbot that it
needs to generate a “chat-like” text response
● <s> : Beginning of sentence; </s>: End of sentence
3
Introduction to chatbot
● Types of chatbot:
○ AI Chatbot: based on learning and training
○ Rule-based chatbot: based on definite rules and provide
predefined replies
○ Hybrid / Sequential Agent
4
Introduction to chatbot: Ways to create chatbot
● No-code platform
● Low-code platform
○ Drag and drop
○ Some programming knowledge
● Self-deploy server
5
Introduction to chatbot – Chatbot infrastructure
Server
Text
User LLM
Text
6
Introduction to chatbot – Chatbot infrastructure
Server Internal
source
Other External
source source
Text
User LLM
Text
7
Introduction to chatbot and how it works: Chatbot building
steps
● Define the purpose
○ FAQ
○ Learning
○ Training
○ Acting as an agent
■ Sequential
● Understand your users
○ Solve a particular problem
○ Learning
● Pick a type
○ NLP base / rule base
8
Introduction to chatbot and how it works: Chatbot building
steps
● Pick a platform/deployment scheme
○ Price
○ Demand
○ Your preference
○ Limitations
● Test your chatbot
○ Performance
○ Concurrent request
● Chatbot Deployment
● Monitor usage and improvement
9
What is an agent?
Icons are from : https://www.flaticon.com/ 10
What is an agent?
11
What is an agent?
12
What is an agent?
13
Sequential Agent - FlowiseAI
https://docs.flowiseai.com/using-flowise/agentflows/sequential-agents 14
Sequential Agent - FlowiseAI
https://docs.flowiseai.com/using-flowise/agentflows/sequential-agents 15
Chatbot building platforms - POE
16
Chatbot building platforms - Botpress
17
Chatbot building platforms - Botpress
18
Copilot Whatsapp
https://support.microsoft.com/en-us/topic/copilot-for-social-apps-43eb625d-eb25-4c72-a458-19842bf42212
19
Copilot Whatsapp
https://support.microsoft.com/en-us/topic/copilot-for-social-apps-43eb625d-eb25-4c72-a458-19842bf42212
20
Chatbot building platforms - Defy: https://dify.ai/
21
Chatbot building platforms - Botsonic: https://botsonic.com/
22
Chatbot building platforms - https://tryfastgpt.ai/
23
Chatbot building platforms - https://www.coze.com/
24
Chatbot building platforms - coze - https://www.coze.com/
25
Introduction to chatbot: No-code platform - POE
POE
Server Internal
source
Other External
source source
Text
User LLM
Text
26
Introduction to chatbot: Low-code platform – Flowise
AI FLOWISE AI
API
Server Internal
source
API API
Other External
source source
API
Text
User LLM
Text
API (Application Programming Interface) 27
Introduction to chatbot: Low-code platform – Flowise
AI FLOWISE AI
API
Server Internal
source
API API
Other External
source source
API
Text
User LLM
Text
API (Application Programming Interface) 28
Introduction to chatbot: Self-hosting platform
customized
Server Internal
source
customized API
Other External
source source
local
Text
User /
LLM
API
Text
https://docs.vllm.ai/en/latest/getting_started/quickstart.html
29
Introduction to chatbot: Personal Chatbot
customized
Your own PC Internal
source
customized
Other External
source source
local
Text
User /
LLM
API
Text
https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/
30
Pick a platform/deployment scheme
“Freemium” Server price
Price (Free / Premium) (EduHK) + OpenAI Free Hardware cost Hardware cost
API key
Deployment on POE.com
Local Local
Location
Some Server Some Server Server + UI
Skill Set No Code No
Understanding Understanding knowledge
Building chatbot UI Simple Drag and drop Drag and drop No Need No Need
Limited (Free)
LLMs / other Mistral and other Mistral and other
e.g. GPT3.5, Claude
models opensource LLM opensource LLM
etc
You – server You – server You – server You – PC
Ownership POE
API – Model API – Model local – Model local – Model
Public (some
Target User Public ($) / private Public / private Public / private Private
config) / private
Stability Good Good Fair Good Good
31
GPT4All - https://www.nomic.ai/gpt4all
32
GPT4All - MAC (OS 14.1)
33
GPT4All - MAC (OS 14.1)
34
GPT4All - MAC (OS 14.1)
35
Chatbot building API OpenRouter https://openrouter.ai/
36
Chatbot building API OpenRouter https://openrouter.ai/
37
Common chatbot parameters:
Reference: Reference:
https://docs.vllm.ai/en/latest/ https://platform.openai.com/docs
dev/sampling_params.html /api-reference/chat/create
38
Common chatbot parameters: Temperature
Lower values for temperature result in more consistent outputs (e.g. 0.2), while
higher values generate more diverse and creative results (e.g. 1.0).
Select a temperature value based on the desired trade-off between coherence and
creativity for your specific application.
The temperature can range is from 0 to 2.
Reference: https://platform.openai.com/docs/guides/text-generation/faq
39
Common chatbot parameters: Temperature
Reference: https://clickup.com/blog/llm-temperature/
40
Common chatbot parameters: Low Temperature
Reference: https://medium.com/@harshit158/softmax-temperature-5492e4007f71
41
Common chatbot parameters: High Temperature
Reference: https://medium.com/@harshit158/softmax-temperature-5492e4007f71
42
Common chatbot parameters
Temperature:
lower temperatures make the output more coherence/consistent
Max tokens:
Max tokens limit the number of tokens that can be generated in a single request.
Repetition penalty:
The parameter for repetition penalty. 1.0 means no penalty
https://arxiv.org/pdf/1909.05858.pdf
Frequency Penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing
frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Reference: https://platform.openai.com/docs/api-reference/chat/create
43
Common chatbot parameters
Presence Penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they
appear in the text so far, increasing the model's likelihood to talk about new topics.
Top P
An alternative to sampling with temperature, called nucleus sampling, where the model
considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens
comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both. Must be in (0, 1]. Set to 1 to
consider all tokens.
Reference: https://platform.openai.com/docs/api-reference/chat/create
https://docs.vllm.ai/en/latest/dev/sampling_params.html
44
Common chatbot parameters
Top K
top_k – Integer that controls the number of top tokens to consider. Set to -1 to
consider all tokens.
Reference: https://docs.vllm.ai/en/latest/dev/sampling_params.html
45
Certificate of Completion in
Programme for the Use of Generative Artificial Intelligence Tools at the Workplace
Application Course: Using GPT to Solve Real-Life Problem
Introduction to Chatbot and How It Works
江紹祥教授 KONG, Siu-Cheung
數學與資訊科技學系電子學習與數碼能力研究講座教授 Research Chair Professor of E-Learning and Digital Competency of
人工智能及數碼能力教育中心總監 Department of Mathematics and Information Technology
香港教育大學 Director of Artificial Intelligence and Digital Competency Education Centre
The Education University of Hong Kong
Thank You!