#5-6: Bayesian Machine Learning

Created

Sep 21, 2021 05:49 PM

Topics

Validation Formula Choosing Validation Set Size Fold-Back-In Validation Procedure: Make Most Use of Data Model Selection Cross Validation Leave One Out Analysis Pros & Cons Model Selection with Cross Validation Bayesian Decision Theory Bayes Rule Given To Find: Posterior Probability Error Conditional Risk Bayes Theorem Evidence Conditional Risk Bayes Decision Rule Frequentist vs Bayesian Approach Bayesian Parameter Estimation Example: Logistic Regression How to Choose a Prior ?? Learning Naive Bayes Classifier Discriminative vs Generative Model

Validation

For any hypothesis

Overfitting Penalty: estimated by regularization

: estimated by validation

Formula

Dataset

Training set samples

Validation set samples

the hypothesis selected after training on

Expected value of validation error

This is because

Thus we have

Expected value of validation error closely matches

Choosing Validation Set Size

⬆️ ⇒ ⬇️

too large ⇒ training set size too small

💡

Practical Rule: use 20% of as

Fold-Back-In Validation

Procedure: Make Most Use of Data

Separate the training & validation set

Tune model to find best hyper-parameters

Put validation set back to train the last time

Model Selection

💡

Use the same validation set multiple times without loosing guarantees

Assume you have models (hypothesis sets):

Cross Validation

Small K:

Large K:

Leave One Out Analysis

Leave 1 sample out, use the rest to train the model

Compute model error with the 1 sample

Repeat step 1 & 2 for times

Compute the Cross Validation Error

Pros & Cons

Pro: Far more accurate

Con: Very expansive

The Model is trained times

Model Selection with Cross Validation

Define models by choosing different values of :

For each model :

Run the cross validation module to get an estimate of the cross validation error
Pick the model with the smallest error

Train the model on the entire training set to obtain the final hypothesis

Bayesian Decision Theory

Bayes Rule

Given

State of nature :

Prior probabilities

Class conditional probability: ,

To Find: Posterior Probability

Error

if we decide on

Evidence is a scaling factor ⇒ Pick if otherwise

Conditional Risk

multi-dimensional observations represented as a vector

Bayes Theorem

Evidence

Conditional Risk

Expected loss associated with taking an action when the true state of nature is

Bayes Decision Rule

🎯

To minimize the overall risk, compute & take action to minimize conditional risk for

Frequentist vs Bayesian Approach

Frequentist

Probability = relative frequency of events over the long run

Repeat & count # of event occurrence

Information incomplete; with uncertainty

Parameter is fixed but unknown

Goal: Find params to best explain data

Implementation: MLE

Bayesian

Probability = measure of confidence of belief in the occurrence of event

Cannot repeat events

Information complete; no uncertainty

Goal: find dist over params to quantify uncertainty

Bayesian Parameter Estimation

Start with prior distribution

= prior knowledge about parameters before looking at the data

Given a data set

Use the Bayesian Theorem to find the posterior distribution of given

Denominator = dist of data

Example: Logistic Regression

⬆️ The product term

Maximum A Posteriori Estimation

Bayesian Linear Regression

Generative vs Discriminative Models

How to Choose a Prior

Objective

Subjective

Conjugate

?? Learning

Prior = posterior of the previous iteration

Naive Bayes Classifier

Assumes all features are conditionally independent of other features

⇒ only depends on y

⇒

Reduce params to

Generative Model

Discriminative vs Generative Model

Discriminative

Models

Models the decision boundary itself

Simpler

Cannot derive the generative model by Bayes Theorem

Because required but impossible to find
Contains a lot less information

Generative

Models

Harder to model the data itself, but more useful

Find by Bayes Theorem

Since is the same for , P(x) is considered a constant normalizer and can be disregarded in computation