Unit 2: AI Foundations
Artificial Intelligence = ability of machines to mimic the capabilities of
humans.
Human intelligence:
- Learn new skills
- Abstract thinking
- Non verbal cues
- Handle complex situations
- Plan short/long term
- Creativity
Artificial General Intelligence (AGI) – replicate ANY of those capabilities.
AI – applying AGI capabilities to specific problems
AI enhances SPEED and EFFECTIVENESS of human tasks.
Uses of AI:
- Automation and Decision Making
- Creative Support
AI Domains:
- Language, Vision, Speech, Product Recommendations, Anomaly
Detection, Learn by Reward, Forecasting, Generating Content
Language Tasks:
- Tokenization – converting words to numbers
- Padding – handling varying sentence lengths by padding shorter sequences to a common length
- Embedding – vector representation of text; dot product and cosine similarity measure how close two embeddings are (sketch below)
- NLP – natural language processing.
- Recurrent Neural Networks – Process data SEQUENTIALLY, maintain a hidden state
- Long Short-Term Memory – SEQUENTIAL, Uses gates
- Transformers – PARALLEL, uses SELF ATTENTION to understand context
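A minimal sketch of tokenization, embeddings, and cosine similarity, using a made-up vocabulary and random vectors (nothing here is a specific library's API; it only illustrates the idea):

```python
import numpy as np

# Toy vocabulary: tokenization maps words to integer IDs (illustrative only).
vocab = {"the": 0, "cat": 1, "sat": 2, "dog": 3}
sentence = "the cat sat".split()
token_ids = [vocab[w] for w in sentence]          # [0, 1, 2]

# Embedding: each token ID indexes a row of a (here random) embedding matrix.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 8))     # 4 tokens x 8-dim vectors
cat_vec, dog_vec = embeddings[vocab["cat"]], embeddings[vocab["dog"]]

# Cosine similarity: dot product of the vectors divided by their norms.
cos_sim = cat_vec @ dog_vec / (np.linalg.norm(cat_vec) * np.linalg.norm(dog_vec))
print(token_ids, round(float(cos_sim), 3))
```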
Speech Tasks:
- Sample Rate – how often the audio is sampled (e.g., 44.1 kHz)
- Bit depth – # of bits used to store each sample
- Goal: Find correlations across multiple samples. (worked example below)
- Models: the previous three (RNNs, LSTMs, Transformers) plus Variational Autoencoders, Waveform models, Siamese Networks
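A quick worked example of sample rate and bit depth (the numbers are assumed for illustration):

```python
# 3 seconds of mono audio at 44.1 kHz with 16-bit samples (illustrative numbers).
sample_rate_hz = 44_100   # samples per second
bit_depth = 16            # bits per sample
duration_s = 3

num_samples = sample_rate_hz * duration_s        # 132,300 samples
size_bytes = num_samples * bit_depth // 8        # 264,600 bytes (~258 KB)
print(num_samples, size_bytes)
```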
Vision Tasks:
- Convolutional Neural Network – detect patterns in images, learning
hierarchical representations of visual features
- YOLO (You Only Look Once) – processes the image and detects objects within it
- Generative Adversarial Network (GAN) – Generates REALISTIC-looking images.
Artificial Intelligence:
- Machines imitate human intelligence
- Machine Learning:
o Algorithms that learn from past data and predict outcomes or
identify trends.
o Deep Learning:
Learn from complex data using neural networks to predict
outcomes or generate NEW data.
Supervised Machine Learning:
- Learns from labeled data
- Extract rules.
Unsupervised Machine Learning:
- Extracting trends from unlabeled data
- Clustering, dimensionality reduction, etc.
Reinforcement Learning:
- Agent learns to perform actions in an environment.
- Reward or punishment
- Solve by trial/error
Deep Learning:
- Training neural networks with multiple layers to learn features and
rules.
- Can automatically learn features by themselves.
- Example of a supervised ML algorithm.
Function Approximation – Technique of estimating an unknown underlying function using historical observations.
Unit 3: Machine Learning Foundations
Machine Learning – SUBSET of AI that learns and improves from experience.
- Analyze, visualize, and make predictions from data
- Input Features → Output Labels
Supervised ML – Classify data or make predictions
Unsupervised ML – No labels, understand RELATIONSHIPS
Reinforcement ML – make decisions or choices
Continuous Output – Regression. Categorical Output – Classification.
Classification: Binary – one or the other. Otherwise multiclass.
Logistic Regression – Predicts something as true or false. (think logistic
function)
- Y axis of logistic regression is probability of true. (0-1)
Independent vs Dependent Features → Use a scatter plot and a linear regression line
Y-intercept of line can also be called “bias”.
Loss = error. The goal is to minimize the Squared Error.
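A small numpy sketch of those two ideas: the logistic function squashes a linear score (slope plus the y-intercept/bias) into a 0-1 probability, and squared error is the loss to minimize. The data is made up:

```python
import numpy as np

def logistic(z):
    # Squashes any real number into a probability between 0 and 1.
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: x = hours studied, y = passed (1) or failed (0).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

weight, bias = 1.5, -3.5                   # slope and y-intercept ("bias")
probs = logistic(weight * x + bias)        # predicted probability of "true"
squared_error = np.mean((probs - y) ** 2)  # loss to minimize during training
print(probs.round(2), round(float(squared_error), 4))
```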
Anaconda – used for data science; supports Python, R, and Jupyter Notebooks
Unsupervised ML Use cases: Market Segmentation, Outlier Analysis
Similarity – how close 2 data points are to each other (value 0-1)
Unsupervised Workflow –
- Prepare (normalize + scale) → Create similarity metric → Run clustering algorithm → Interpret and adjust clustering (sketch below)
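A minimal sketch of that workflow using scikit-learn, assuming it is available (e.g., via the Anaconda distribution mentioned above); the data and cluster count are invented:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler   # prepare: normalize/scale
from sklearn.cluster import KMeans                  # clustering algorithm

# Toy data: annual income (k$) and spending score per customer (illustrative).
X = np.array([[15, 39], [16, 81], [17, 6], [18, 77], [19, 40], [20, 76]])

X_scaled = StandardScaler().fit_transform(X)        # step 1: prepare the data
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)               # steps 2-3: similarity metric (Euclidean) + clustering
print(labels)                                       # step 4: interpret and adjust
```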
Reinforcement Learning Examples: Autonomous vehicles, Smart Devices,
Industrial Automation, Gaming/Entertainment
Unit 4: Deep Learning Foundations
DL = training Artificial Neural Networks (ANNs) with multiple layers.
- Learn and extract intricate representations from data.
- EXTRACT FEATURES from raw and complex data (no need to manually specify features)
- Internal representation of data built using extracted features
- Parallel processing of data
- Better scalability and performance.
Applications of DL:
- Image classification
- NLP
- Language generation, summary, generative AI, etc.
Select the right DL Algorithm:
- Images/Videos → Convolutional Neural Network (CNN)
- Sequential, Time Series, Natural Language → Transformers, Long Short-Term Memory (LSTM), or Recurrent Neural Network (RNN)
- Images, Text, Audio Generation → Transformers, Diffusion Models, GANs
ANNs are inspired by the human brain. Use neurons (a single computation unit that maps inputs to an output).
Input, Hidden, and Output layers.
Weights – determine the strength of connection between neurons
Activation Function – Works on the weighted sum of inputs to a neuron
and produces an output.
Bias – Additional input to neurons for flexibility.
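A single-neuron sketch tying together weights, bias, and the activation function (numbers made up):

```python
import numpy as np

inputs = np.array([0.5, -1.0, 2.0])   # values arriving from the previous layer
weights = np.array([0.8, 0.2, -0.5])  # strength of each connection
bias = 0.1                            # extra input that shifts the weighted sum

weighted_sum = inputs @ weights + bias
output = max(0.0, weighted_sum)       # ReLU activation applied to the weighted sum
print(weighted_sum, output)
```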
Backpropagation Algorithm – Training ANNs
- Guess and Compare
- Measure the Error
- Adjust the Guess
- Update the weights (then repeat)
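A toy guess/measure/adjust loop for a single weight, using plain gradient descent on squared error; this is only a sketch of the idea, not a full backpropagation implementation:

```python
import numpy as np

# Toy task: learn y = 2*x from a few examples.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
w, lr = 0.0, 0.05                      # initial guess for the weight, learning rate

for step in range(100):
    y_pred = w * x                     # 1. guess (forward pass)
    error = y_pred - y                 # 2. measure the error
    grad = np.mean(2 * error * x)      # 3. gradient of squared error w.r.t. w
    w -= lr * grad                     # 4. update the weight, then repeat
print(round(w, 3))                     # approaches 2.0
```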
Sequence Models – input data are sequences. Goal is to find patterns and make predictions.
- NLP, Speech Recognition, Music Generation, Gesture Recognition, Time
Series Analysis
Recurrent Neural Network (RNN) – Handle Sequential Data
- Allow info to persist using a feedback loop
- Maintains a hidden state or memory
- Capture dependencies
- Types of RNN Architecture
o 1 to 1 – Standard non-sequential data like FNN
o 1 to many – music generation or sequence generation
o Many to 1 – sentiment analysis
o Many to Many – machine translation / named entity recognition
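A bare-bones sketch of one RNN step, showing the feedback loop: the hidden state produced at each step is fed back in at the next (random weights, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))      # input-to-hidden weights
W_h = rng.normal(size=(4, 4))      # hidden-to-hidden weights (the feedback loop)
b = np.zeros(4)

h = np.zeros(4)                    # hidden state ("memory"), carried across steps
sequence = rng.normal(size=(5, 3)) # 5 time steps, 3 features each
for x_t in sequence:
    h = np.tanh(W_x @ x_t + W_h @ h + b)   # new state depends on input AND old state
print(h.round(3))
```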
LSTM – uses a specialized memory cell and gating mechanisms to capture long-term dependencies in data
- Selectively remembers/forgets information over time.
Steps of LSTM: input processing → previous memory → gating mechanism (input gate, forget gate, output gate) → update memory (cell state) → output generation (for current time step)
Feed Forward Neural Networks (FNN)
- Also called Multi Layer Perceptron (MLP) (the simplest form)
Convolutional Neural Network (CNN) – learn patterns from image or video
RNN – Handle sequential data and use feedback loop
Autoencoders – Unsupervised models for feature extraction, dimensionality
reduction, employed in data compression, anomaly identification.
LSTM – (type of RNN) Find long term dependencies of data
GAN – Generates realistic synthetic data
Transformers – used in NLP
CNN – Used for grid-like data.
- Input layers – accept 3D images with height, width, and depth.
- Feature Extraction Layers – repeating pattern of convolution layer, ReLU activation function, and pooling layer
- Classification Layer (output)
Convolution Layer – applies convolutions to images using small filters
(kernels)
Activation Layer – learn complex and NON-linear relationships
Pooling Layer – Reduce computational complexity and dimensions of the
feature maps
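A compact sketch of the input → (convolution, ReLU, pooling) → classification pattern, written with PyTorch (assumed to be installed; layer sizes are arbitrary placeholders):

```python
import torch
from torch import nn

# Repeating feature-extraction pattern followed by a classification layer.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution: small 3x3 filters (kernels)
    nn.ReLU(),                                   # non-linear activation
    nn.MaxPool2d(2),                             # pooling: shrink the feature maps
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # classification layer (10 classes)
)

x = torch.randn(1, 3, 32, 32)    # one 32x32 RGB image (height, width, depth)
print(model(x).shape)            # torch.Size([1, 10])
```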
Limitation of CNN:
- Computationally expensive
- Overfits with limited training data
- Hard to interpret (black box)
- Sensitive to input variations
Applications of CNN – image classification, object detection, image
segmentation, face recognition, medical imaging, autonomous vehicles
(understand what they see), remote sensing.
Unit 5: Generative AI and LLM Foundations
Generative AI: Creates NEW content, part of deep learning
- Learns the underlying patterns to create NEW data matching these
patterns.
- Does not require labeled data in pre-train stage
- Text-Based Gen-AI vs Multimodal (images, audio, video, text) Gen-AI.
Language Model (LM) – probabilistic model of text.
- Uses probabilities to decide what the next word is.
- Large in LLM just means # of parameters. (A LOT)
- EOS = end of sentence/sequence.
LLMs can answer questions, write text, and translate text
- Based on DL architecture (Transformer)
- Enhanced contextual understanding. NLP
- Trained on vast language data to recognize patterns
- 100s of millions to billions of parameters.
Parameters – adjustable weights in the neural network. (too many
parameters = overfit)
Model Size – memory to store the parameters.
RNNs maintain a hidden state to allow persistence of information.
RNNs have feedback loops and capture dependencies.
Vanishing Gradient – gradients shrink as they propagate back through many steps, so long-range dependencies are harder to capture.
Transformers – look at ALL words in the sentence and understand how all
words relate to each other.
Attention Mechanism – adds context to the text.
- Helps the transformer capture long range dependencies.
Encoder processes input and makes vectors using attention mechanism.
Decoder generates output.
Tokens – part of word, a word, or punctuation.
- # tokens = complexity
Embeddings – NUMERICAL representation of a piece of text converted to
number sequences.
- Piece of text can be from part of a word to a lot of text.
Vector Database – used to do similarity searches. Helps LLMs provide
informed answers.
Retrieval-Augmented Generation (RAG) – the architecture of LLM and vector
database.
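A toy version of the embedding plus vector-store lookup behind RAG; the "database" is just a numpy array and the vectors are random stand-ins for real text embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)
documents = ["refund policy", "shipping times", "warranty terms"]
doc_vectors = rng.normal(size=(3, 16))        # stand-ins for real text embeddings
query_vector = doc_vectors[1] + 0.1 * rng.normal(size=16)  # query close to doc #1

# Cosine similarity between the query and every stored vector.
norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
scores = doc_vectors @ query_vector / norms
best = int(np.argmax(scores))                 # most similar document
print(documents[best], scores.round(3))
```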
Decoder = models that take a sequence and output the next word.
- Decoder generates token and sends it back to itself until the whole
output is generated.
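A schematic of that decoder loop; `next_token` is a hypothetical stand-in for a real model call, not an actual API:

```python
def next_token(tokens):
    # Stand-in for a real decoder: returns a hard-coded continuation.
    canned = {"How": "are", "are": "you", "you": "?", "?": "<EOS>"}
    return canned[tokens[-1]]

tokens = ["How"]                       # the prompt
while tokens[-1] != "<EOS>":           # keep generating until end-of-sequence
    tokens.append(next_token(tokens))  # feed the output back in as input
print(" ".join(tokens))                # How are you ? <EOS>
```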
Encoder-Decoder Architecture: Encoder + decoder (from above)
Prompt – input or initial text provided to model
Prompt Engineering – refining a prompt to get a particular style of
response
Completion LLMs follow the dataset, which may not always be what the user
wants.
- Instruction tuning is a CRITICAL STEP in LLM alignment.
- Reinforcement Learning from Human Feedback (RLHF) = used to fine
tune LLMs to follow human instructions.
In-context Learning – conditioning an LLM with instructions or demos of the
ideal task.
k-shot prompting – provide k examples of the intended task in the prompt.
- 0-shot prompting = no examples.
Chain-of-Thought Prompting – using reasoning steps and calculation logic
before the final answer.
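An illustrative 2-shot prompt whose examples include chain-of-thought style reasoning before the final answer (the wording is invented):

```python
prompt = """Answer the question, showing your reasoning.

Q: A pack has 3 pens and I buy 4 packs. How many pens?
A: 4 packs x 3 pens = 12 pens. Answer: 12

Q: A box holds 6 eggs and I have 5 boxes. How many eggs?
A: 5 boxes x 6 eggs = 30 eggs. Answer: 30

Q: A crate holds 8 bottles and I have 7 crates. How many bottles?
A:"""
# Two worked examples (k = 2), each with reasoning steps before the final answer.
print(prompt)
```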
Hallucination – model generated text that is made up.
- RAG claims to reduce hallucination but there is no known methodology
to reliably reduce hallucination.
Customize LLMs with your data: Prompt Engineering → RAG → Fine-Tuning
RAG – language model queries enterprise knowledge bases (DBs, wikis,
vector DBs, etc)
- RAG doesn’t require fine tuning
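A sketch of the RAG flow, where retrieved passages are pasted into the prompt before calling the model; `retrieve` and `ask_llm` are hypothetical placeholders, not real APIs:

```python
def retrieve(question, k=2):
    # Placeholder for a vector-database similarity search over enterprise docs.
    return ["Refunds are issued within 14 days.", "Shipping takes 3-5 business days."][:k]

def ask_llm(prompt):
    # Placeholder for a call to a hosted LLM endpoint.
    return "(model answer would appear here)"

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Use only the context below to answer.\n\nContext:\n{context}\n\nQuestion: {question}"
print(ask_llm(prompt))
```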
Fine-Tuning – take pretrained foundational model and provide add’l training
using CUSTOM DATA.
Inference – model receives NEW TEXT as input and generates output based on what it learned from pretraining and fine-tuning.
Fine Tuning:
- Optimize on a domain-specific dataset.
- Useful when the model doesn't perform a task well or when teaching it new things
- Adapt to a specific style and tone, and learn domain-specific words/phrases.
Fine Tuning Benefits:
- Improve model performance on specific
tasks
- Improve model efficiency
Unit 6: OCI AI Portfolio
AI stack layers, each consuming the one to its right: SaaS Apps → AI Services → Infrastructure → Data
OCI console – browser-based interface, access to notebook and service features
Rest API – access to service functionality, requires programming
Language SDKs – provides programming language SDKs
CLI – quick access and full functionality, no scripting
Pretrained Models – Language Detection, Sentiment Analysis, Key Phrase
Extraction
Custom Models – Named entity recognition, Text Classification
Speech – convert media files into text (JSON and SRT format)
Digital Assistant – AI-driven interfaces that help users achieve tasks with NLP
Services (7x): Generative AI, AI Agent Platform, Digital Assistant,
Language, Speech, Vision, Document Understanding
OCI Data Science:
- Accelerated: automated workflow, open-source libraries, streamlined
approach to building models, Collaborative, Enterprise-Grade
- Build, train, deploy ML models. Use Jupyter
Notebook
- Notebook session – contains Jupyter notebooks, libraries, etc
- Conda environment – code environment
- Accelerated Data Science (ADS) SDK
- Model Deployments – deploy models as HTTP/API endpoints
- Jobs – Define and run repeatable ML tasks and workflow.
GPU – Graphics Processing Unit:
- Parallel computing for large datasets. Optimize for DL
Remote Direct Memory Access (RDMA) – data transfer, bypass CPU (low
latency)
Supercluster – many GPUs with RDMA.
OCI uses a non-blocking network fabric for the OCI RDMA Supercluster
Clos Fabric - multistage circuit-switching network
OCI RDMA Supercluster is lossless and low latency. (because networks are
very local)
Control Plane – deploys customer workloads only where needed and optimizes their distribution for efficiency.
Flow Collision – two flows collide on a single link.
AI guiding principles: Legal, ethical, robust (technical and social)
AI must be regulated by policy, national, and international law
Human ethics: respect for human dignity, freedom for individuals, respect for
democracy, justice, and law, and have equality.
AI Ethics: respect human autonomy, prevent harm, fairness, explicability
Responsible AI Requirements:
- Human centric and human oversight
- Technical robustness and safety
- Privacy and data protection
- Transparency, diversity, nondiscrimination, fairness +
Accountability
Steps: Set up governance → Develop policies and procedures → Ensure compliance
AI is only as good as the data it is trained on.
Unit 7: OCI Generative AI Service
Gen-AI Service:
- Fully managed service, provides customizable LLMs (single API)
- Choice of pretrained models
- Flexible fine-tuning
- Dedicated AI clusters
o GPU and RDMA to host resources (separate from other GPUs)
Foundational Models:
- Chat – questions, conversational response
o Instruction-following models
- Embedding – text vector embeddings
o Semantic Search Multilingual Model
Fine-tuning: optimization on a SMALLER, domain-SPECIFIC dataset
T-Few Fine Tuning – fast and efficient customizations
Preamble Override – replaces the default initial context given to the chat model (defines its "self-identity")
Temperature = how random the output is (higher temperature → more random)
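A small sketch of how temperature rescales next-token probabilities before sampling: low temperature sharpens the distribution, high temperature flattens it (logits are made up):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    z = logits / temperature                 # temperature rescales the scores
    z = z - z.max()                          # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([2.0, 1.0, 0.1])           # made-up scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5).round(3))  # low temp: more deterministic
print(softmax_with_temperature(logits, 2.0).round(3))  # high temp: more random
```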
Endpoint – used to host and serve fine-tuned models.
Playground – visually explore and test pre-trained and fine-tuned models
(no code).
Oracle Database 23ai
- SQL support for vector generation
- Vector Data Type
- Similarity search
- Approx. search indices
Used with Gen-AI Pipelines
Distance Functions – used to determine similarity
Vector Search finds top K closest matches to a query item
Vector index organization: In-memory or Neighbor Partitions
Target Accuracy – specify how much accuracy needed for result
Similarity Search Over Joins:
- Go through multiple tables
Select AI:
- Use human language to query data
- Just ask a question.
- The AI will find the data you need.
- Can view data in different ways
- Can access the AI generated SQL used to retrieve data.
Select AI is: Simple, Future-enabled, Secure
You can choose which AI model to use. Then specify schemas, tables, or
views for processing.
Unit 8: OCI AI Services
Language:
- Detect language
- Identify entities (like date, time, currency)
- Identify sentiment of parts of the text
- Identify key phrases
- Classify general topic from list of 600 categories and subcategories
Speech:
- Transcription (DL) (no data science experience needed)
o With time stamps
- Processes data in object storage
- Multiple languages (Spanish, Portuguese, English)
- Batching support (many files at once)
- Fast processing. (10 hr in <10 min)
- Confidence scores
- Punctuates transcription
- SRT close captions
- Normalization – make transcribed text more readable (e.g., "one hundred" → "100")
- Profanity filtering (hide, mask, tag)
Vision:
- Image Analysis:
o Object Detection (detect objects inside image)
o Image Classification (label scene)
- Document AI:
o Works with document images
o Text recognition
o Document Classification (10 different possible types)
o Language Detection (analyze visual features of text)
o Table Extraction
o Key Value Extraction