DeepLearning - Part1 Week 2 (1)

These are study notes for Andrew Ng's Deep Learning course.

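1. Logistic regression
Despite its name, logistic regression is not a regression algorithm; it is an algorithm for binary classification.
The model is $\hat{y}=\mathrm{sigmoid}(w^TX+b)$, where $\mathrm{sigmoid}(z) = 1/(1+e^{-z})$. When $z$ is very large, $e^{-z}$ approaches 0 and $\hat{y}$ is approximately 1. Conversely, when $z$ is very small (a large negative number), $e^{-z}$ approaches infinity and $\hat{y}$ is approximately 0. The parameters that logistic regression has to learn are $w$ and $b$.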

Figure: sigmoid function
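The saturation behaviour of the sigmoid can be checked numerically. The snippet below is a minimal sketch using only the standard library; the sample inputs are arbitrary values chosen for illustration and do not come from the course.

```python
import math


def sigmoid(z):
    """Scalar sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative inputs (arbitrary, not from the course material).
for z in (-10.0, 0.0, 10.0):
    # Large negative z gives a value near 0, z = 0 gives 0.5,
    # and large positive z gives a value near 1.
    print(z, sigmoid(z))
```

This scalar version is only meant for intuition; the NumPy version later in these notes applies the same formula element-wise to arrays.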

2. Loss function of logistic regression
The loss function measures how well the algorithm is doing: it quantifies the gap between the predicted value and the true value.

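Loss function of logistic regression: $L(\hat{y}, y) = -(y\log\hat{y} + (1-y)\log(1-\hat{y}))$, where $\hat{y}$ is the predicted value and $y$ is the true label (0 or 1).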

The cost function is the average of the loss function over all training examples. Training aims to find the parameters $w$ and $b$ that minimize the cost function.

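Cost function: $J(w, b) = \frac{1}{m}\sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)})$, where $m$ is the number of training examples.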
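As a rough illustration of how the loss and the cost relate, the sketch below assumes the standard cross-entropy loss for logistic regression, which is the same expression the `propagate` function later in these notes computes. The helper names `loss` and `cost`, and the sample predictions and labels, are hypothetical and exist only for this example.

```python
import math


def loss(y_hat, y):
    """Cross-entropy loss for one example; y is 0 or 1 and 0 < y_hat < 1."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


def cost(y_hats, ys):
    """Cost: the average of the per-example losses."""
    return sum(loss(p, y) for p, y in zip(y_hats, ys)) / len(ys)


# Hypothetical predictions and true labels, for illustration only.
predictions = [0.9, 0.2, 0.6]
labels = [1, 0, 1]

print(loss(0.9, 1))  # small loss: confident and correct
print(loss(0.1, 1))  # large loss: confident and wrong
print(cost(predictions, labels))
```

The loss grows quickly as a confident prediction moves away from the true label, which is what pushes training toward well-calibrated outputs.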

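3. Gradient descent
We use gradient descent to train the parameters $w$ and $b$. The cost function of logistic regression is convex, so it has no local minima other than the global one, and gradient descent with a suitable learning rate moves toward that minimum from any starting point. The update rules for $w$ and $b$ are shown below, where $\alpha$ is the learning rate, which sets the size of each step.

$w := w - \alpha \frac{\partial J(w,b)}{\partial w}$, $b := b - \alpha \frac{\partial J(w,b)}{\partial b}$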
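The update step can be sketched on a tiny problem. The example below is a minimal illustration, not the course's implementation: it runs gradient descent for logistic regression with a single feature, using a made-up four-point dataset and an arbitrary learning rate `alpha`. The gradient expressions match the `dw` and `db` formulas in the `propagate` function later in these notes.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical 1-D dataset: the label is 1 when x is positive.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
m = len(xs)


def cost_and_grads(w, b):
    """Return the cost J(w, b) and its partial derivatives dJ/dw, dJ/db."""
    a = [sigmoid(w * x + b) for x in xs]
    cost = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(a, ys)) / m
    dw = sum((p - y) * x for p, y, x in zip(a, ys, xs)) / m
    db = sum(p - y for p, y in zip(a, ys)) / m
    return cost, dw, db


w, b = 0.0, 0.0
alpha = 0.1  # learning rate (illustrative value)
history = []
for step in range(50):
    cost, dw, db = cost_and_grads(w, b)
    history.append(cost)
    # Gradient descent update: move against the gradient, scaled by alpha.
    w -= alpha * dw
    b -= alpha * db

print(history[0], history[-1])  # the cost decreases over the iterations
```

With a learning rate this small the cost falls at every step; a much larger `alpha` can overshoot the minimum, which is why the step size matters.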

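4. Vectorization
Vectorization avoids explicit for loops, and is easy to follow with some background in linear algebra. The training examples are stacked into a single matrix, and the result for all examples is obtained with matrix multiplication.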
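To make the idea concrete, the sketch below computes $w^TX+b$ twice, once with explicit loops and once with a single `np.dot` call, and checks that the results agree. The array sizes and random values are arbitrary assumptions for illustration; the layout (one column of `X` per example) follows the code later in these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, m = 4, 5  # illustrative sizes: 4 features, 5 examples
X = rng.standard_normal((n_features, m))  # one column per example
w = rng.standard_normal((n_features, 1))
b = 0.5

# Explicit loops: one multiply-add per feature per example.
z_loop = np.zeros((1, m))
for i in range(m):
    for j in range(n_features):
        z_loop[0, i] += w[j, 0] * X[j, i]
    z_loop[0, i] += b

# Vectorized: a single matrix product handles every example at once.
z_vec = np.dot(w.T, X) + b

print(np.allclose(z_loop, z_vec))
```

Besides being shorter, the vectorized form lets NumPy run the arithmetic in optimized native code, which is usually much faster on large datasets.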

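Python code
The core logistic regression code is shown below. When logistic regression is used for image classification, each image of shape $(n, n, 3)$ must first be flattened into a one-dimensional vector.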

Figure: simple flowchart of logistic regression


import numpy as np


def sigmoid(z):
    # Element-wise sigmoid; works on scalars and NumPy arrays.
    return 1.0 / (1.0 + np.exp(-z))


def initialize_with_zeros(dim):
    w = np.zeros((dim, 1))
    b = 0

    assert (w.shape == (dim, 1))
    return w, b


def propagate(w, b, X, Y):
    # Forward pass computes the cost; backward pass computes the gradients.
    m = X.shape[1]  # number of examples (one per column of X)
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -(1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))  # np.log is the natural logarithm
    cost = np.squeeze(cost)

    dw = 1.0 / m * (np.dot(X, (A - Y).T))
    db = 1.0 / m * np.sum(A - Y)
    grads = {"dw": dw,
             "db": db}
    return cost, grads


def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []

    for i in range(num_iterations):
        cost, grads = propagate(w, b, X, Y)

        dw = grads["dw"]
        db = grads["db"]

        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(cost)

        if print_cost and i % 100 == 0:
            print("cost after iteration %i:%f" % (i, cost))

    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}

    return params, grads, costs


def predict(w, b, X):
    w = w.reshape(X.shape[0], 1)

    # Probability that each example (column of X) belongs to class 1.
    A = sigmoid(np.dot(w.T, X) + b)

    # Threshold at 0.5 to get 0/1 labels; the result has shape (1, m).
    y_prediction = (A > 0.5).astype(float)

    return y_prediction


def model(train_x, train_y, test_x, test_y, num_iterations=2000, learning_rate=0.5, print_cost=False):
    w, b = initialize_with_zeros(train_x.shape[0])
    params, grads, costs = optimize(w, b, train_x, train_y, num_iterations, learning_rate, print_cost)
    w = params["w"]
    b = params["b"]

    Y_prediction_test = predict(w, b, test_x)
    Y_prediction_train = predict(w, b, train_x)

    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - train_y)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - test_y)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d
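
The flattening step mentioned before the code can be sketched as follows. This is a minimal example under an assumed input layout: the images arrive as one array of shape `(m, n, n, 3)`, and the names `images`, `m`, and `n` are hypothetical. The result has one column per image, which is the layout `propagate` and `predict` expect for `X`.

```python
import numpy as np

# Hypothetical batch: m images, each n x n pixels with 3 colour channels.
m, n = 4, 8
images = np.arange(m * n * n * 3, dtype=np.float64).reshape(m, n, n, 3)

# Flatten each image into a vector of length n*n*3, then transpose
# so that every column of X holds one image.
X = images.reshape(m, -1).T

print(X.shape)  # (n*n*3, m)
```

Reshaping to `(m, -1)` first and then transposing keeps the pixels of each image together; reshaping directly to `(n*n*3, m)` would interleave pixels from different images.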

Last edited: 2020.09.16 11:04:43