ML-As-1
Vector Calculus Review (15 pts)
When
Bayes’ Rule (10 pts)
Assume the probability of a certain disease is
(a) Calculate the probability of test positive.
: The probability that a person has the disease. : The probability of testing positive given that the person has the disease. : The probability of testing positive given that the person does not have the disease. : The probability that a person does not have the disease.
(b) Use Bayes’ Rule to calculate the probability of being infected with the disease given that the test is positive.
Gradient Descent Mechanics (20 pts)
Gradient descent is the primary algorithm to search optimal parameters for our models. Typically, we want to solve optimization problems stated as
where
a
The simplest parametric model entails learning a single-parameter constant function, = 𝜃. where we set
i. What is the gradient of with respect to ?
ii. What is the optimal value of ?
when
iii. Write the gradient descent update.
Where
iv. Stochastic Gradient Descent (SGD) is an alternative optimization algorithm, where instead of using all N samples, we use single sample per optimization step to update the model. What is the contribution of each data-point to the full gradient update?
Thus, the gradient descent update for a single data point is:
In SGD, this single sample gradient update is used to update \theta after each data point.
b
Instead of constant functions, we now consider a single-parameter linear model
i. What is the gradient of with respect to ?
ii. What is the optimal value of ?
when
iii. Write the gradient descent update.
Where
iv. Do all points get the same vote in the update? Why or why not?
No, not all points get the same vote in the gradient update.
Each data point is weighted by
MAP Interpretation of Ridge Regression (20 pts)
Consider the Ridge Regression estimator
We know this is solved by
One interpretation of Ridge Regression is to find the Maximum A Posteriori (MAP) estimate on
Note that each entry of vector
The MAP estimate maximizes the posterior distribution:
From Bayes' rule:
Substituting the likelihood and prior:
Maximizing this expression with respect to
Programming (35 pts)
Task 1
# You should return your result.
import numpy as np
def insertSecond(a, b):
return np.insert(a, 1, b)
assert np.array_equal(insertSecond(np.array([-5,-10,-12,-6]),5), np.array([-5, 5, -10, -12, -6]))
assert np.array_equal(insertSecond(np.array([1,2,3]),7), np.array([1, 7, 2, 3]))
assert np.array_equal(insertSecond(np.array([-5,-10,-12,-6]),8), np.array([ -5, 8, -10,-12, -6]))
assert np.array_equal(insertSecond(np.array([1,2,3]),12), np.array([1, 12, 2, 3]))
Task 2
import numpy as np
def mergeArrays(a,b):
return np.sort(np.unique(np.concatenate((a,b))))
# Test cases
assert np.array_equal(mergeArrays(np.array([1,1,4,8,1]), np.array([2, 3])), np.array([1, 2, 3, 4, 8]))
assert np.array_equal(mergeArrays(np.array([-5,-10,-10,-6]), np.array([-5, 8, -10, -12,-6])),np.array([-12, -10, -6, -5, 8]) )
assert np.array_equal(mergeArrays(np.array([1,1,6,8,1]), np.array([2, 3])), np.array([1, 2, 3, 6, 8]))
Task 3
import numpy as np
import matplotlib.pyplot as plt
# data to plot
n_groups = 5
men_means = (22, 30, 33, 30, 26)
women_means = (25, 32, 30, 35, 29)
alpha = 0.5
fig, ax = plt.subplots()
index = np.arange(n_groups)
bar_width = 0.4
opacity = 0.8
rects1 = plt.bar(index, men_means, bar_width,
alpha=0.5,
color='g',
label='Men')
rects2 = plt.bar(index + bar_width, women_means, bar_width,
alpha=0.5,
color='r',
label='Women')
plt.xlabel('Person')
plt.ylabel('Scores')
plt.title('Scores by person')
plt.xticks(index + bar_width / 2, ('G1', 'G2', 'G3', 'G4', 'G5'))
plt.legend()
plt.tight_layout()
plt.show()
Task 4
import pandas as pd
def setDataFrameZeros(df):
rows = df.isin([0]).any(axis=1)
cols = df.isin([0]).any(axis=0)
df.loc[rows, :] = 0
df.loc[:, cols] = 0
return df
df1 = pd.DataFrame({'c1': [1, 4, 7], 'c2': [2, 0, 8], 'c3': [3, 6, 9]})
df2 = pd.DataFrame({'c1': [1, 0, 7], 'c2': [0, 0, 0], 'c3': [3, 0, 9]})
(df2.equals(setDataFrameZeros(df1)))
df1 = pd.DataFrame({'c1': [0, 3, 1], 'c2': [1, 4, 3], 'c3': [2, 5, 1], 'c4': [0, 2, 5]})
df2 = pd.DataFrame({'c1': [0, 0, 0], 'c2': [0, 4, 3], 'c3': [0, 5, 1], 'c4': [0, 0, 0]})
assert (df2.equals(setDataFrameZeros(df1)))
df1 = pd.DataFrame({'c1': [1, 4, 7], 'c2': [2, 0, 8], 'c3': [3, 6, 9]})
df2 = pd.DataFrame({'c1': [1, 0, 7], 'c2': [0, 0, 0], 'c3': [3, 0, 9]})
assert (df2.equals(setDataFrameZeros(df1)))
df1 = pd.DataFrame({'c1': [0, 3, 1], 'c2': [1, 4, 3], 'c3': [2, 5, 1], 'c4': [0, 2, 5]})
df2 = pd.DataFrame({
'c1': [0, 0, 0],
'c2': [0, 4, 3],
'c3': [0, 5, 1],
'c4': [0, 0, 0]
})
assert (df2.equals(setDataFrameZeros(df1)))