什么是非线性模型
如果回归模型的因变量是自变量的一次以上函数形式,回归规律在图形上表现为形态各异的各种曲线,称为非线性回归。
- 可线性化问题
处理可线性化处理的非线性回归的基本方法是,通过变量变换,将非线性回归化为线性回归,然后用线性回归方法处理。假定根据理论或经验,已获得输出变量与输入变量之间的非线性表达式,但表达式的系数是未知的,要根据输入输出的n次观察结果来确定系数的值。按最小二乘法原理来求出系数值,所得到的模型为非线性回归模型(nonlinear regression model)。
- 不可线性化问题
对实际科学研究中常遇到不可线性处理的非线性回归问题,提出了一种新的解决方法。该方法是基于回归问题的最小二乘法,在求误差平方和最小的极值问题上,应用了最优化方法中对无约束极值问题的一种数学解法——单纯形法。应用结果证明,这种非线性回归的方法算法比较简单,收敛效果和收敛速度都比较理想。
线性模型与非线性模型的区别
与线性模型不一样的是,这些非线性模型的特征因子对应的参数不止一个。
案例
- 指数模型
rm(list=ls())
source("D:\\Users\\Administrator\\Desktop\\RStudio\\Environmental_and_Ecological_Statistics_with_R\\DataCode\\Code\\FrontMatter.R")
## Chapter 6
## nonlinear regression
plotDIRch6 <- paste(plotDIR, "chapter6", "figures", sep="/")
if(file_test("-d", plotDIRch6)){cat("plotDIRch6 already exists")}else{dir.create(plotDIRch6,recursive = TRUE)}
setwd(plotDIR)
getwd()
laketrout <- read.csv(paste(dataDIR,"laketrout2.csv", sep="/"),
header=T)
head(laketrout)
## dividing by 2.54 converts 60 cm to inches
## single predictor
laketrout$length <- laketrout$length*2.54
laketrout$size<- "small"
laketrout$size[laketrout$length>60] <- "large"
laketrout$large<- 0
laketrout$large[laketrout$length>60] <- 1
laketrout$len.c <- laketrout$length-mean(laketrout$length, na.rm=T)
laketrout$inches<-laketrout$length/2.54
laketrout <- laketrout[laketrout$pcb>exp(-2)&laketrout$length>0,]
head(laketrout)
# the PCB in fish example
> pcb.exp <- nls(pcb ~ pcb0 * exp(-k * (year-1974)), data=laketrout,
+ start=list(pcb0=10, k=0.08))
> summary(pcb.exp)
Formula: pcb ~ pcb0 * exp(-k * (year - 1974))
Parameters:
Estimate Std. Error t value Pr(>|t|)
pcb0 11.76215 0.64432 18.25 <2e-16 ***
k 0.11488 0.00885 12.98 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.143 on 629 degrees of freedom
Number of iterations to convergence: 6
Achieved convergence tolerance: 2.173e-06
(15 observations deleted due to missingness)
nlm.coef <- coef(pcb.exp)
# tikz(file=paste(plotDIRch6, "pcbnls1.tex", sep="/"),
# width=4.5, height=3, standAlone=F)
par(mar=c(3,3,1,1), mgp=c(1.25,0.125,0), las=1, tck=0.01)
plot(pcb~year, data=laketrout, col="gray", xlab="Year", ylab="PCB (mg/kg)", cex=0.5)
curve(nlm.coef[1]*exp(-nlm.coef[2]*(x-1974)), lwd=2, add=T)
QQ 图
qqmath(~resid(pcb.exp),
panel = function(x,...) {
panel.grid()
panel.qqmath(x,...)
panel.qqmathline(x,...)
}, ylab="Residuals", xlab="Standard Normal Quantile",
# scales=list(y=list(alternating=2)),
)
残差与拟合值作图:
xyplot(resid(pcb.exp)~fitted(pcb.exp), panel=function(x,y,...){
panel.grid()
panel.xyplot(x, y,...)
panel.abline(0, 0)
panel.loess(x, y, span=1, col="red",...)
}, ylab="Residuals", xlab="Fitted")
残差S-L图
xyplot(sqrt(abs(resid(pcb.exp)))~fitted(pcb.exp), panel=function(x,y,...){
panel.grid()
panel.xyplot(x, y,...)
panel.loess(x, y, span=1, col="red",...)
}, ylab="Sqrt. Abs. Residuals", xlab="Fitted"
# scales=list(y=list(alternating=2)),
)
par(mar=c(3,3,1,1), mgp=c(1.5,0.125,0), las=1, tck=0.01)
hist(resid(pcb.exp), xlab="Residuals", main="")
以上,可见残差并不符合正态分布。
- 具有非零渐近线的指数衰减模型
> pcb.exp2 <- nls(log(pcb) ~ log(pcb0*exp(-k*(year-1974))+pcba),
+ data=laketrout, start=list(pcb0=10, k=0.08, pcba=1))
> nlm.coef2 <- coef(pcb.exp2)
> summary(pcb.exp2)
Formula: log(pcb) ~ log(pcb0 * exp(-k * (year - 1974)) + pcba)
Parameters:
Estimate Std. Error t value Pr(>|t|)
pcb0 6.5916 0.8612 7.654 7.39e-14 ***
k 0.2450 0.0378 6.482 1.83e-10 ***
pcba 1.6877 0.1367 12.348 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.847 on 628 degrees of freedom
Number of iterations to convergence: 6
Achieved convergence tolerance: 2.828e-07
(15 observations deleted due to missingness)
- 双指数衰减模型
pcb.exp3 <- nls(log(pcb) ~ log(pcb01*exp(-k1*(year-1974))+pcb02*exp(-k2*(year-1974))),
data=laketrout, start=list(pcb01=10, pcb02=2, k1=0.24, k2=0.00002),
algorithm="port", lower = rep(0, 4))
nlm.coef3 <- coef(pcb.exp3)
summary(pcb.exp3)
Formula: log(pcb) ~ log(pcb01 * exp(-k1 * (year - 1974)) + pcb02 * exp(-k2 *
(year - 1974)))
Parameters:
Estimate Std. Error t value Pr(>|t|)
pcb01 6.59158 1.01307 6.507 1.57e-10 ***
pcb02 1.68769 0.95763 1.762 0.07850 .
k1 0.24501 0.09095 2.694 0.00725 **
k2 0.00000 0.02567 0.000 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8477 on 627 degrees of freedom
Algorithm "port", convergence message: relative convergence (4)
(15 observations deleted due to missingness)
- 混合级数模型
mixedorder <- function(x, b0, k, theta){
LP1 <- LP2 <- 0
if(theta==1){
LP1 <- log(b0) - k*x
} else {
LP2 <- log(b0^(1-theta) - k*x*(1-theta))/(1-theta)
}
return( LP1 + LP2)
}
pcb.exp4 <- nls(log(pcb) ~ mixedorder(x=year-1974, pcb0, k, phi), data=laketrout, start=list(pcb0=10, k=0.0024, phi=3.5))
nlm.coef4 <- coef(pcb.exp4)
summary(pcb.exp4)
Formula: log(pcb) ~ mixedorder(x = year - 1974, pcb0, k, phi)
Parameters:
Estimate Std. Error t value Pr(>|t|)
pcb0 10.698228 1.689648 6.332 4.62e-10 ***
k 0.007854 0.003116 2.521 0.012 *
phi 3.087139 0.323447 9.544 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8471 on 628 degrees of freedom
Number of iterations to convergence: 5
Achieved convergence tolerance: 4.959e-06
(15 observations deleted due to missingness)
par(mar=c(3,3,1,1), mgp=c(1.25,0.125,0), las=1, tck=0.01)
plot(pcb~year, data=laketrout, col=grey(0.5), xlab="Year", ylab="PCB (mg/kg)", cex=0.5)
curve(nlm.coef [1]*exp(-nlm.coef[2]*(x-1974)), lwd=2, add=T, lty=1)
curve(nlm.coef2[1]*exp(-nlm.coef2[2]*(x-1974))+nlm.coef2[3], lwd=2, add=T, lty=2)
curve(nlm.coef3[1]*exp(-nlm.coef3[3]*(x-1974))+nlm.coef3[2]*exp(-nlm.coef3[4]*(x-1974)), lwd=2, add=T, lty=3)
curve((nlm.coef4[1]^(1-nlm.coef4[3]) - nlm.coef4[2]*(x-1974)*(1-nlm.coef4[3]))^(1/(1-nlm.coef4[3])), lwd=2, add=T, lty=4)
legend (x=1995, y=40, legend=1:4, lty=1:4, lwd=2, cex=0.5, bty="n")
exp1.pred <- exp(predict(pcb.exp, new=data.frame(year=c(2000, 2007))))
exp2.pred <- exp(predict(pcb.exp2, new=data.frame(year=c(2000, 2007))))
exp3.pred <- exp(predict(pcb.exp3, new=data.frame(year=c(2000, 2007))))
exp4.pred <- exp(predict(pcb.exp4, new=data.frame(year=c(2000, 2007))))
logdiff1 <- diff(exp1.pred)
exp1.sim <- sim.nls(pcb.exp, 1000)
betas <- exp1.sim$beta
pred.00<-betas[,1]*exp(-betas[,2]*(2000-1974))
pred.07<-betas[,1]*exp(-betas[,2]*(2007-1974))
percent1 <- 1-pred.07/pred.00
precent1 <- percent1[!is.na(percent1)]
exp2.sim <- sim.nls (pcb.exp2, 1000)
betas <- exp2.sim$beta
pred.00<-betas[,1]*exp(-betas[,2]*(2000-1974))+betas[,3]
pred.07<-betas[,1]*exp(-betas[,2]*(2007-1974))+betas[,3]
percent2 <- 1-pred.07/pred.00
precent2 <- percent2[!is.na(percent2)]
exp3.sim <- sim.nls (pcb.exp3, 1000)
betas <- exp3.sim$beta
pred.00<-betas[,1]*exp(-ifelse(betas[,3]<0, 0, betas[,3])*(2000-1974))+betas[,2]*exp(-ifelse(betas[,4]<0, 0, betas[,4])*(2000-1974))
pred.07<-betas[,1]*exp(-ifelse(betas[,3]<0, 0, betas[,3])*(2007-1974))+betas[,2]*exp(-ifelse(betas[,4]<0, 0, betas[,4])*(2007-1974))
percent3 <- 1-pred.07/pred.00
precent3 <- percent3[!is.na(percent3)]
mixedorder2 <- function(x, b0, k, theta){
ifelse(theta==1, exp(log(b0) - k*x), exp(log(b0^(1-theta) - k*x*(1-theta))/(1-theta)))
}
exp4.sim <- sim.nls (pcb.exp4, 1000)
betas <- exp4.sim$beta
pred.00<-mixedorder2 (2000, betas[,1], betas[,2], betas[,3])
pred.07<-mixedorder2 (2007, betas[,1], betas[,2], betas[,3])
percent4 <- 1-pred.07/pred.00
precent4 <- percent4[!is.nan(percent4)]
# tikz(file=paste(plotDIRch6, "pcb00to07.tex", sep="/"),
# width=4, height=2.5, standAlone=F)
par(mar=c(3,3,1,1), mgp=c(1.25,0.125,0), las=1, tck=0.01)
plot(y=c(1-0.25, 4+0.25), x=c(0,0.5)*100, type="n",
xlab="\\% change, 2000 to 2007", ylab="Models", axes=F)
segments(y0=1:4, y1=1:4,
x0=100*c(quantile(percent1, prob=0.025, na.rm=T), quantile(percent2, prob=0.025, na.rm=T),
quantile(percent3, prob=0.025, na.rm=T), quantile(percent4, prob=0.025, na.rm=T)),
x1=100*c(quantile(percent1, prob=0.975, na.rm=T), quantile(percent2, prob=0.975, na.rm=T),
quantile(percent3, prob=0.975, na.rm=T), quantile(percent4, prob=0.975, na.rm=T)))
segments(y0=1:4, y1=1:4, x0=100*c(quantile(percent1, prob=0.25, na.rm=T), quantile(percent2, prob=0.25, na.rm=T),
quantile(percent3, prob=0.25, na.rm=T), quantile(percent4, prob=0.25, na.rm=T)),
x1=100*c(quantile(percent1, prob=0.75, na.rm=T), quantile(percent2, prob=0.75, na.rm=T),
quantile(percent3, prob=0.75, na.rm=T), quantile(percent4, prob=0.75, na.rm=T)),
lwd=2.5)
points(y=1:4, x=100*c(quantile(percent1, prob=0.5, na.rm=T), quantile(percent2, prob=0.5, na.rm=T),
quantile(percent3, prob=0.5, na.rm=T), quantile(percent4, prob=0.5, na.rm=T)))
abline(v=25)
axis(1)
axis(2, at=1:4)
box()
> lake.lm7 <- lm(log(pcb) ~ len.c*I(year-1974), data=laketrout)
> display(lake.lm7, 4)
lm(formula = log(pcb) ~ len.c * I(year - 1974), data = laketrout)
coef.est coef.se
(Intercept) 1.8907 0.0465
len.c 0.0510 0.0038
I(year - 1974) -0.0874 0.0036
len.c:I(year - 1974) 0.0008 0.0003
---
n = 631, k = 4
residual sd = 0.5520, R-Squared = 0.67
分段线性模型
分段线性模型常用来评估阈值效应。
由于分段模型的一阶导数是不连续的,该模型在很多常用的数值优化程序中会出问题。要避免这个问题,分段模型要做微小调整,要在阈值点处加上一小段二次曲线以保证一阶导数连续。可以让曲线两端的斜率分别与两条线段的斜率相同来估计二次曲线。
cewise linear model
beta1 <- 0.75
beta2 <- 2.5
eps <- 5
alpha1 <- 2
x1 <- -eps
x2 <- +eps
b <- (x2*beta1-x1*beta2)/(x2-x1)
cc <- (beta2-b)/(2*x2)
a <- alpha1+beta1*x1-b*x1-cc*x1^2
alpha2 <- - beta2*x2 +(a + b*x2 + cc*x2^2)
## postscript(file="hockeystick.eps", height=3.5, width=4, horizontal=F)
# tikz(file=paste(plotDIRch6, "hockeystick.tex", sep="/"),
# height=3.25, width=4.125, standAlone=F)
plot(y=c(2.5*(35-20)+2, 0.75*(5-20)+2), x=c(35, 5), type="n", xlab="x", ylab="y")
segments(x0=c(5, 25), x1=c(15, 35), y0=c(0.75*(5-20)+2, 2.5*(25-20)+2), y1=c(0.75*(15-20)+2, 2.5*(35-20)+2))
lines(seq(15,25,,100), a + b*(seq(15,25,,100)-20) + cc*(seq(15,25,,100)-20)^2)
points(x=c(15, 25), y=c(0.75*(15-20)+2, 2.5*(25-20)+2))
segments(x0=c(15, 25), x1=c(20, 20), y0=c(0.75*(15-20)+2, 2.5*(25-20)+2), y1=c(2, 2), lty=4)
dev.off()
lake.nlm1 <- nls(log(pcb) ~ hockey(len.c, beta0, beta1, delta, theta),
start=list(beta0=1.6, beta1=0.07, delta=0.03, theta=0),
data=laketrout,
na.action=na.omit)
summary(lake.nlm1)
Formula: log(pcb) ~ hockey(len.c, beta0, beta1, delta, theta)
Parameters:
Estimate Std. Error t value Pr(>|t|)
beta0 0.550600 0.131652 4.182 3.30e-05 ***
beta1 0.025259 0.006197 4.076 5.17e-05 ***
delta 0.047032 0.008595 5.472 6.44e-08 ***
theta -2.523846 2.324076 -1.086 0.278
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.7508 on 627 degrees of freedom
Number of iterations to convergence: 24
Achieved convergence tolerance: 8.931e-06
(15 observations deleted due to missingness)
lake.nlm1 <- nls(log(pcb) ~ hockey(length, beta0, beta1, delta, theta),
start=list(beta0=.6, beta1=0.07, delta=0.03, theta=60), data=laketrout,
na.action=na.omit)
summary(lake.nlm1)
lake1.coef<-coef(lake.nlm1)
lake1.sim <- sim.nls (lake.nlm1, 1000)
betas <- lake1.sim$beta
logPCB.mean <- betas[,1] +
(betas[,2] + betas[,3]*(60>betas[,4]))*(60-betas[,4])
pred.PCB<-exp(rnorm(1000, logPCB.mean, lake1.sim$sigma))
pred.log.95CI <- quantile(log(pred.PCB), prob=c(0.025, 0.975))
theta <- quantile(lake1.sim$beta[,4], prob=c(0.025, 0.975))
##postscript(paste(base, "lake1sim.eps", sep="/"), height=3.5, width=4.75, horizontal=F)
# tikz(file=paste(plotDIRch6, "lake1sim.tex", sep="/"),
# height=3.5, width=4.75, standAlone=F)
par(mar=c(3,3,1,1), mgp=c(1.25,.125,0), las=1, tck=0.01)
plot(log(pcb)~length, data=laketrout, xlab="Length (cm)",ylab="log PCB", type="n")
for (i in 1:200)
curve(hockey(x, betas[i,1], betas[i,2], betas[i,3], betas[i,4]), col=grey(0.6) , add=T)
curve(hockey(x, lake1.coef[1], lake1.coef[2], lake1.coef[3], lake1.coef[4]),lwd=3, add=T)
points(laketrout$length, log(laketrout$pcb), col=grey(0.4), cex=0.5)
segments(x0=60, x1=60, y0=pred.log.95CI[1], y1=pred.log.95CI[2])
segments(y0=-1.5, y1=-1.5, x0=theta[1], x1=theta[2], col=grey(0.5))
lake.nlm2 <- nls(log(pcb) ~ beta1*(year-1974) + hockey(length, beta0, beta2, delta, theta),
start=list(beta0=.6, beta1= - 0.08, beta2=0.07, delta=0.03, theta=60), data=laketrout,
na.action=na.omit)
summary(lake.nlm2)
Formula: log(pcb) ~ beta1 * (year - 1974) + hockey(length, beta0, beta2,
delta, theta)
Parameters:
Estimate Std. Error t value Pr(>|t|)
beta0 1.598569 0.153380 10.422 < 2e-16 ***
beta1 -0.084594 0.003527 -23.983 < 2e-16 ***
beta2 0.043091 0.004361 9.881 < 2e-16 ***
delta 0.034569 0.006224 5.554 4.14e-08 ***
theta 60.716807 2.262821 26.832 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5422 on 626 degrees of freedom
Number of iterations to convergence: 3
Achieved convergence tolerance: 2.829e-06
(15 observations deleted due to missingness)
lake2.coef<- coef(lake.nlm2)
##postscript(paste(base, "lake2sim.eps", sep="/"), height=3.5, width=4.75, horizontal=F)
#pdf(paste(base, "lake2sim.pdf", sep="/"), height=3.5, width=4.75)
# tikz(file=paste(plotDIRch6, "lake2sim.tex", sep="/"),
# height=3.5, width=4.75, standAlone=F)
par(mar=c(3,3,1,1), mgp=c(1.25,.125,0), las=1, tck=0.01)
plot(log(pcb)~length, data=laketrout, xlab="Length (cm)",ylab="log PCB", type="n")
curve(lake2.coef[2]*(0)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]), col=1 , add=T)
points(laketrout$length[laketrout$year<1984], log(laketrout$pcb[laketrout$year<1984]), col=1, cex=0.5)
curve(lake2.coef[2]*(10)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]), col=2 , add=T)
points(laketrout$length[laketrout$year>=1984&laketrout$year<1994],
log(laketrout$pcb[laketrout$year>=1984&laketrout$year<1994]), col=2, cex=0.5)
curve(lake2.coef[2]*(20)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]), col=3 , add=T)
points(laketrout$length[laketrout$year>=1994],
log(laketrout$pcb[laketrout$year>=1994]), col=3, cex=0.5)
curve(lake2.coef[2]*(30)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]), col=4 , add=T)
legend(x=30, y=3.5, col=1:4, legend=seq(1974,2004, 10), lty=1, cex=0.5)
par(mar=c(3,3,1,1), mgp=c(1.25,.125,0), las=1, tck=0.01)
plot(log(pcb)~length, data=laketrout, xlab="Length (cm)",ylab="log PCB", type="n")
curve(lake2.coef[2]*(0)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]),
col=1, lty=1, add=T, lwd=2)
points(laketrout$length[laketrout$year<1984], log(laketrout$pcb[laketrout$year<1984]),
col=1, pch="1", cex=0.5)
curve(lake2.coef[2]*(10)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]),
col=grey(0.5), lty=2, add=T, lwd=2)
points(laketrout$length[laketrout$year>=1984&laketrout$year<1994],
log(laketrout$pcb[laketrout$year>=1984&laketrout$year<1994]),
col=grey(0.5), pch="2", cex=0.5)
curve(lake2.coef[2]*(20)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]),
col=1, lty=3, add=T, lwd=2)
points(laketrout$length[laketrout$year>=1994],
log(laketrout$pcb[laketrout$year>=1994]),
col=1, pch="3", cex=0.5)
curve(lake2.coef[2]*(30)+hockey(x, lake2.coef[1], lake2.coef[3], lake2.coef[4], lake2.coef[5]),
col=1, lty=4, add=T, lwd=2)
legend(x=30, y=3.5, col=c(1, grey(0.5), 1, 1), legend=seq(1974,2004,
10), lty=1:4,
pch=c(1:3, ""), cex=0.75, bty="n")
数据平滑
局部加权回归散点平滑法(locally weighted scatterplot smoothing,LOWESS或LOESS)是查看二维变量之间关系的一种有力工具。
LOWESS主要思想是取一定比例的局部数据,在这部分子集中拟合多项式回归曲线,这样我们便可以观察到数据在局部展现出来的规律和趋势;而通常的回归分析往往是根据全体数据建模,这样可以描述整体趋势,但现实生活中规律不总是(或者很少是)教科书上告诉我们的一条直线。我们将局部范围从左往右依次推进,最终一条连续的曲线就被计算出来了。显然,曲线的光滑程度与我们选取数据比例有关:比例越少,拟合越不光滑(因为过于看重局部性质),反之越光滑。
回归算法之非线性回归
常见的非线性回归模型
机器学习中线性模型和非线性的区别
分段回归的拐点连续性
根据Econometrics in R一书,将回归方法总结一下
分段混合效果输出的解释
题 用R分段回归:绘制段
非线性回归模型与平滑回归
用局部加权回归散点平滑法观察二维变量之间的关系
R语言实战第八章:回归