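These are study notes for Andrew Ng's Deep Learning course.
1. Logistic regression
Despite its name, logistic regression is not a regression algorithm; it is an algorithm for binary classification.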
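The logistic regression model is $\hat{y} = \sigma(w^T x + b)$, where $\sigma(z) = \frac{1}{1 + e^{-z}}$. When $z$ is very large, $e^{-z}$ approaches 0, so $\sigma(z)$ is approximately 1. Conversely, when $z$ is very negative, $e^{-z}$ grows toward infinity, so $\sigma(z)$ is approximately 0. The parameters to be trained are $w$ and $b$.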
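The saturating behaviour of the sigmoid can be checked numerically. The following is a minimal sketch; the sample inputs -10, 0, and 10 are arbitrary values chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Illustrative inputs: very negative, zero, and very positive.
z_values = np.array([-10.0, 0.0, 10.0])
probs = sigmoid(z_values)
print(probs)  # first entry is close to 0, middle is 0.5, last is close to 1
```

Because the output always lies between 0 and 1, it can be read as the predicted probability of the positive class.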
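2. Logistic regression loss function
The loss function measures how well the algorithm is doing: it quantifies the gap between the predicted value and the true value for a training example.
The cost function is the average of the loss over all training examples. The goal of training is to find the parameters $w$ and $b$ that minimize the cost.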
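For reference, the loss and cost can be written out explicitly. The cross-entropy form below is inferred from the cost computed in the `propagate` function later in these notes; the superscript $(i)$ for the $i$-th of $m$ training examples is notation assumed here.

```latex
\mathcal{L}(\hat{y}, y) = -\bigl( y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \bigr)

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\bigl(\hat{y}^{(i)}, y^{(i)}\bigr)
```

When $y = 1$ the loss reduces to $-\log \hat{y}$, which is small only if $\hat{y}$ is close to 1; when $y = 0$ it reduces to $-\log(1 - \hat{y})$, which is small only if $\hat{y}$ is close to 0.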
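3. Gradient descent
The parameters $w$ and $b$ are trained with gradient descent. The cost function of logistic regression is convex, so it has no local minima other than the global one, and gradient descent with a suitable learning rate moves toward that global minimum. The update rules are $w := w - \alpha \frac{\partial J}{\partial w}$ and $b := b - \alpha \frac{\partial J}{\partial b}$, where $\alpha$ is the learning rate, which controls the size of each gradient descent step.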
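The update rule $w := w - \alpha \, \frac{dJ}{dw}$ can be illustrated on a one-dimensional toy problem. This is only a sketch: the cost $J(w) = (w - 3)^2$, the helper name `gradient_descent`, and the values of `alpha` and `num_steps` are assumptions made for the example, not the logistic regression cost itself.

```python
def gradient_descent(grad, w0, alpha, num_steps):
    """Repeat the update w := w - alpha * dJ/dw for a fixed number of steps."""
    w = w0
    for _ in range(num_steps):
        w = w - alpha * grad(w)
    return w


# Toy convex cost J(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0, alpha=0.1, num_steps=100)
print(w_final)  # very close to 3
```

With a learning rate that is too large (for this toy cost, anything above 1.0), the same loop diverges instead of converging, which is why the step size matters.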
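4. Vectorization
Vectorization avoids explicit for loops, and it is easy to follow with some background in linear algebra. The training examples are stacked into a single matrix, one example per column, so that the result for all examples is obtained with one matrix multiplication.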
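As a small illustration, the sketch below computes $z = w^T x + b$ for every example twice: once with explicit loops and once with a single matrix product. The sizes and the random data are hypothetical and chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, m = 4, 5                      # hypothetical sizes
X = rng.normal(size=(n_features, m))      # one training example per column
w = rng.normal(size=(n_features, 1))
b = 0.1

# Explicit loops: one pass over examples, one pass over features.
z_loop = np.zeros((1, m))
for i in range(m):
    for j in range(n_features):
        z_loop[0, i] += w[j, 0] * X[j, i]
    z_loop[0, i] += b

# Vectorized: a single matrix product; b is broadcast across all columns.
z_vec = np.dot(w.T, X) + b

print(np.allclose(z_loop, z_vec))
```

Both versions give the same values, but the vectorized form delegates the loops to NumPy's compiled routines and is usually much faster on large inputs.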
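Python code
The core logistic regression code is shown below. To use logistic regression for image classification, each image of shape (n, n, 3) must first be flattened into a one-dimensional vector.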
import numpy as np


def sigmoid(z):
    # Logistic function, applied element-wise to scalars or arrays.
    return 1.0 / (1.0 + np.exp(-1.0 * z))

def initialize_with_zeros(dim):
    # Start from a zero weight vector of shape (dim, 1) and a zero bias.
    w = np.zeros((dim, 1))
    b = 0
    assert w.shape == (dim, 1)
    return w, b

def propagate(w, b, X, Y):
    # Forward and backward propagation: compute the cost and its gradients.
    m = X.shape[1]  # number of examples, one per column of X
    A = sigmoid(np.dot(w.T, X) + b)  # predictions, shape (1, m)
    # Cross-entropy cost; np.log is the natural logarithm.
    cost = -(1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    cost = np.squeeze(cost)
    dw = 1.0 / m * np.dot(X, (A - Y).T)
    db = 1.0 / m * np.sum(A - Y)
    grads = {"dw": dw,
             "db": db}
    return cost, grads

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []
    for i in range(num_iterations):
        cost, grads = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        # Gradient descent update.
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # Record (and optionally print) the cost every 100 iterations.
        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print("cost after iteration %i:%f" % (i, cost))
    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}
    return params, grads, costs

def predict(w, b, X):
    m = X.shape[1]
    y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)
    # Threshold each predicted probability at 0.5.
    for i in range(A.shape[1]):
        if A[0, i] > 0.5:
            y_prediction[0, i] = 1
        else:
            y_prediction[0, i] = 0
    return y_prediction

def model(train_x, train_y, test_x, test_y, num_iterations=2000, learning_rate=0.5, print_cost=False):
    w, b = initialize_with_zeros(train_x.shape[0])
    params, grads, costs = optimize(w, b, train_x, train_y, num_iterations, learning_rate, print_cost)
    w = params["w"]
    b = params["b"]
    Y_prediction_test = predict(w, b, test_x)
    Y_prediction_train = predict(w, b, train_x)
    # Accuracy is 100% minus the mean absolute error between 0/1 labels.
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - train_y)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - test_y)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
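
The image flattening step mentioned before the code can be sketched as follows. This is an illustration under assumptions: the batch layout `(m, n, n, 3)`, the sizes, and the random pixel values are hypothetical, and the result is arranged with one example per column to match the `X.shape[1]` convention used in `propagate`.

```python
import numpy as np

m, n = 6, 8  # hypothetical: 6 images of 8 x 8 pixels with 3 colour channels
images = np.random.default_rng(0).integers(0, 256, size=(m, n, n, 3))

# Flatten each image into a vector of length n * n * 3, then transpose
# so that every column of X holds one image.
X = images.reshape(m, -1).T

print(X.shape)  # (n * n * 3, m), here (192, 6)
```

A matrix built this way could be passed as `train_x` or `test_x` to `model`, together with label arrays of shape `(1, m)`.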