1. Plot vectors
Use the following call to draw the plot:
import matplotlib.pyplot as plt
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
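Parameters:
- X: the x-coordinates of the start points of all vectors to be drawn, in order
- Y: the y-coordinates of the start points of all vectors to be drawn, in the same order
- U: the number of units each vector spans horizontally
- V: the number of units each vector spans vertically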
import numpy as np
import matplotlib.pyplot as plt
# We're going to plot three vectors
# The first starts at the origin (0,0), then goes over 1 and up 2
# The second starts at (1,2), then goes over 3 and up 2
# The third starts at the origin (0,0), then goes over 4 and up 4
X = [0,1,0]
Y = [0,2,0]
U = [1,3,4]
V = [2,2,4]
# Create the plot
plt.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
# Set the x-axis limits
plt.xlim([0,6])
# Set the y-axis limits
plt.ylim([0,6])
# Show the plot
plt.show()
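In the example above, the second vector starts where the first one ends, and the third runs from the origin to the tip of the second, which is the usual picture of vector addition. A minimal NumPy check of that reading (the names v1, v2, v3 and total are illustrative and not part of the original notes):

```python
import numpy as np

# Displacements (U, V) of the three vectors plotted above
v1 = np.array([1, 2])
v2 = np.array([3, 2])
v3 = np.array([4, 4])

# Placing v2 at the tip of v1 ends at the same point as v3 drawn from the origin
total = v1 + v2
print(total)
print(np.array_equal(total, v3))
```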
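2. np.dot
Matrix dot product: for 2-D arrays np.dot performs matrix multiplication; for 1-D arrays it returns the inner product.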
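A small sketch of both cases, using made-up arrays (a, b, A, B are illustrative names only):

```python
import numpy as np

# 1-D arrays: np.dot returns the inner product (a scalar)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
inner = np.dot(a, b)  # 1*4 + 2*5 + 3*6 = 32

# 2-D arrays: np.dot performs matrix multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
product = np.dot(A, B)  # [[19, 22], [43, 50]]

print(inner)
print(product)
```

For 2-D arrays, A @ B gives the same result as np.dot(A, B).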
3. LinearRegression
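from sklearn.linear_model import LinearRegression
lr = LinearRegression()
# Features must be 2-D (n_samples, n_features), hence the double brackets
x = cars[["weight"]].values
# The target can be 1-D (n_samples,)
y = cars["mpg"].values
lr.fit(x, y)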
Prediction
import sklearn
from sklearn.linear_model import LinearRegression
lr = LinearRegression(fit_intercept=True)
lr.fit(cars[["weight"]], cars["mpg"])
predictions = lr.predict(cars[["weight"]])
print (predictions[0:5])
print (cars["mpg"][0:5])
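To see what predict does for a single feature, the sketch below compares it with the fitted line computed by hand from intercept_ and coef_. The weight and mpg arrays are hypothetical stand-ins for cars[["weight"]] and cars["mpg"], not values from the real dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data for cars[["weight"]] and cars["mpg"]
weight = np.array([[2000.0], [2500.0], [3000.0], [3500.0], [4000.0]])
mpg = np.array([32.0, 27.5, 24.0, 19.5, 16.0])

lr = LinearRegression(fit_intercept=True)
lr.fit(weight, mpg)

# predict() evaluates the fitted line: intercept_ + coef_[0] * weight
manual = lr.intercept_ + lr.coef_[0] * weight[:, 0]
predictions = lr.predict(weight)

print(np.allclose(manual, predictions))
```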
Scatter plots
fig = plt.figure()
ax1 = fig.add_subplot(2,1,1)
ax2 = fig.add_subplot(2,1,2)
cars.plot("weight", "mpg", kind='scatter', ax=ax1)
cars.plot("acceleration", "mpg", kind='scatter', ax=ax2)
plt.show()
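Plot predicted and actual points
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# Store the predictions alongside the original data
cars['predictions'] = predictions
# Actual values in red, predicted values in blue
ax.scatter(cars["weight"], cars["mpg"], c='red')
ax.scatter(cars["weight"], cars["predictions"], c='blue')
plt.show()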
Compute the MSE: mean_squared_error
import math
from sklearn.metrics import mean_squared_error
lr = LinearRegression()
lr.fit(cars[["weight"]], cars["mpg"])
predictions = lr.predict(cars[["weight"]])
mse = mean_squared_error(cars["mpg"], predictions)
print(mse)
# RMSE is the square root of the MSE, in the same units as mpg
rmse = math.sqrt(mse)
print(rmse)
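As a sanity check on what these metrics measure, MSE is the mean of the squared differences between actual and predicted values, and RMSE is its square root. A hand-computed sketch with NumPy, using hypothetical numbers rather than the cars data:

```python
import numpy as np

# Hypothetical actual and predicted values (not from the cars dataset)
actual = np.array([18.0, 15.0, 16.0, 17.0])
predicted = np.array([20.0, 14.0, 16.0, 15.0])

# MSE: mean of the squared differences
mse = np.mean((actual - predicted) ** 2)  # (4 + 1 + 0 + 4) / 4 = 2.25
# RMSE: square root of the MSE
rmse = np.sqrt(mse)  # 1.5

print(mse, rmse)
```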
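Clean the data: drop rows containing "?" and convert the column to float
import pandas as pd
columns = ["mpg", "cylinders", "displacement", "horsepower", "weight", "acceleration", "model year", "origin", "car name"]
# The file is whitespace-delimited and has no header row
cars = pd.read_table("auto-mpg.data", sep=r"\s+", names=columns)
# Missing horsepower values are recorded as "?"; copy to avoid SettingWithCopyWarning
filtered_cars = cars[cars['horsepower'] != '?'].copy()
filtered_cars['horsepower'] = filtered_cars['horsepower'].astype('float')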
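An alternative way to do the same cleaning, shown here as a sketch rather than the approach used in these notes, is pd.to_numeric with errors="coerce" followed by dropna. The cars_demo frame below is a hypothetical miniature so the snippet runs without auto-mpg.data:

```python
import pandas as pd

# Hypothetical miniature of the data, with "?" marking a missing horsepower value
cars_demo = pd.DataFrame({
    "mpg": [18.0, 25.0, 21.0],
    "horsepower": ["130.0", "?", "95.0"],
})

# errors="coerce" turns unparseable entries such as "?" into NaN
cars_demo["horsepower"] = pd.to_numeric(cars_demo["horsepower"], errors="coerce")
# Drop the rows where horsepower could not be parsed
cleaned = cars_demo.dropna(subset=["horsepower"])

print(cleaned)
print(cleaned["horsepower"].dtype)
```

This variant also catches any other non-numeric placeholder, not only "?".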
Logistic Regression
The fit method requires the first input to be two-dimensional, with shape (num_samples, num_features). We therefore need admissions[["gpa"]], which is a DataFrame, instead of admissions["gpa"], which is a one-dimensional Series. Compare print(admissions[["gpa"]].shape) with print(admissions["gpa"].shape) to see the difference.
print(admissions[["gpa"]].shape)
# returns (644,1)
print(admissions[["gpa", "actual_label"]].shape)
# returns (644,2)
print(admissions["gpa"].shape)
# returns (644,)
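from sklearn.linear_model import LogisticRegression
logistic_model = LogisticRegression()
logistic_model.fit(admissions[["gpa"]], admissions["admit"])
# predict_proba returns one column per class; column 1 is the probability
# of the second class in logistic_model.classes_ (admit = 1 for 0/1 labels)
pred_probs = logistic_model.predict_proba(admissions[["gpa"]])
plt.scatter(admissions["gpa"], pred_probs[:,1])
plt.show()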
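For a binary target, the probabilities plotted above are the sigmoid of a linear score. The sketch below checks this by hand; the gpa and admit arrays are hypothetical stand-ins for admissions[["gpa"]] and admissions["admit"]:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in data for admissions[["gpa"]] and admissions["admit"]
gpa = np.array([[2.0], [2.4], [2.8], [3.0], [3.2], [3.4], [3.6], [3.9]])
admit = np.array([0, 0, 0, 1, 0, 1, 1, 1])

logistic_model = LogisticRegression()
logistic_model.fit(gpa, admit)

# Linear score for each row: intercept + coefficient * gpa
z = logistic_model.intercept_[0] + logistic_model.coef_[0, 0] * gpa[:, 0]
# The sigmoid squashes the score into a probability between 0 and 1
manual_probs = 1.0 / (1.0 + np.exp(-z))

pred_probs = logistic_model.predict_proba(gpa)
print(np.allclose(manual_probs, pred_probs[:, 1]))
```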
Alternatively, .predict returns the predicted class labels directly instead of probabilities
logistic_model = LogisticRegression()
logistic_model.fit(admissions[["gpa"]], admissions["admit"])
fitted_labels = logistic_model.predict(admissions[["gpa"]])
print (fitted_labels[0])
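For a 0/1 target, the labels from predict should match thresholding the class-1 probability at 0.5. A small check on hypothetical data (gpa and admit are made-up stand-ins for the admissions columns):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in data for admissions[["gpa"]] and admissions["admit"]
gpa = np.array([[2.0], [2.4], [2.8], [3.0], [3.2], [3.4], [3.6], [3.9]])
admit = np.array([0, 0, 0, 1, 0, 1, 1, 1])

logistic_model = LogisticRegression()
logistic_model.fit(gpa, admit)

fitted_labels = logistic_model.predict(gpa)
pred_probs = logistic_model.predict_proba(gpa)

# predict picks class 1 when its probability exceeds 0.5
thresholded = (pred_probs[:, 1] > 0.5).astype(int)
print(np.array_equal(fitted_labels, thresholded))
```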