Learning R Packages: broom

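# The broom package takes the messy output of built-in R functions, such as lm, nls, or t.test, and converts it into tidy data frames.

# In other words, it turns messy results that are not data frames into data frames.

# broom works well in combination with dplyr.

# It provides three functions: tidy, augment, and glance.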

# Example 1

```

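library(broom)

# Fit a simple linear regression of fuel economy (mpg) on weight (wt)
lmfit <- lm(mpg ~ wt, mtcars)

# The default print and summary() are readable but are not data frames
lmfit

summary(lmfit)

# tidy() returns the coefficient table as a data frame, one row per term
tidy(lmfit)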

```

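# tidy returns a data frame; the row names have become a column named term.

# Instead of the coefficients, you may be interested in the fitted value and residual for each original data point.

# For that, use augment, which extends the original data with information from the model.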

augment(lmfit)

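# The added columns are prefixed with a dot (.) to avoid overwriting the original columns.

# Several summary statistics are computed for the regression as a whole; the glance function returns them.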

glance(lmfit)
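
# Because tidy returns an ordinary data frame, it pairs naturally with dplyr. The following is a minimal sketch, assuming a reasonably recent dplyr that provides group_modify is installed; the names by_cyl and slopes are illustrative only. It fits one regression per cylinder group and then filters the stacked results with regular dplyr verbs.

```r
library(broom)
library(dplyr)

# Fit mpg ~ wt separately for each cylinder count and stack the tidy outputs
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  group_modify(~ tidy(lm(mpg ~ wt, data = .x))) %>%
  ungroup()

# The result is a plain data frame, so dplyr verbs apply directly
slopes <- by_cyl %>%
  filter(term == "wt") %>%
  select(cyl, estimate, std.error, p.value) %>%
  arrange(estimate)

print(slopes)
```

# Each cylinder group contributes two rows to by_cyl (intercept and slope); slopes keeps only the wt rows.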

# Example 2

```

#Generalized linear and non-linear models

glmfit <- glm(am ~ wt, mtcars, family="binomial")

tidy(glmfit)

augment(glmfit)

glance(glmfit)

# These functions work just as well for non-linear models

nlsfit <- nls(mpg ~ k / wt + b, mtcars, start=list(k=1, b=0))

tidy(nlsfit)

augment(nlsfit, mtcars)

glance(nlsfit)

#The tidy function can also be applied to htest objects,

#such as those output by popular built-in functions like

#t.test, cor.test, and wilcox.test.

tt <- t.test(wt ~ am, mtcars)

tidy(tt)

# Named wt_test rather than wt so it is not confused with the wt column of mtcars
wt_test <- wilcox.test(wt ~ am, mtcars)

tidy(wt_test)

glance(tt)

glance(wt_test)

# Among htest objects, an augment method is defined only for chi-squared tests

chit <- chisq.test(xtabs(Freq ~ Sex + Class, data = as.data.frame(Titanic)))

tidy(chit)

augment(chit)

```

# All functions

# The output of the tidy, augment and glance functions is always a data frame.

# The output never has rownames. This ensures that you can combine it with other tidy outputs without

# fear of losing information (since rownames in R cannot contain duplicates).

# Some column names are kept consistent, so that they can be combined across different models and so

# that you know what to expect (in contrast to asking “is it pval or PValue?” every time). The examples

# below are not all the possible column names, nor will all tidy output contain all or even any of these

# columns.
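
# To illustrate why consistent column names matter, here is a small sketch that stacks the tidy output of two different model types. It assumes dplyr is installed for bind_rows; the name combined and the "model" label column are illustrative choices.

```r
library(broom)
library(dplyr)

lmfit  <- lm(mpg ~ wt, mtcars)
glmfit <- glm(am ~ wt, mtcars, family = "binomial")

# Both outputs share term, estimate, std.error, statistic, and p.value,
# so they stack cleanly; .id records which model each row came from
combined <- bind_rows(
  list(lm = tidy(lmfit), glm = tidy(glmfit)),
  .id = "model"
)

print(combined)
```

# Note that the statistic column holds a t value for the lm rows and a z value for the glm rows, so the shared name does not imply a shared distribution.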

# tidy functions

# Each row in a tidy output typically represents some well-defined concept, such as one term in a

# regression, one test, or one cluster/class. This meaning varies across models but is usually self-evident.

# The one thing each row cannot represent is a point in the initial data (for that, use the augment method).

# Common column names include:

# term: the term in a regression or model that is being estimated.

# p.value: this spelling was chosen (over common alternatives such as pvalue, PValue, or pval) to be consistent with functions in R’s built-in stats package.

# statistic: a test statistic, usually the one used to compute the p-value. Combining these across many sub-groups is a reliable way to perform (e.g.) bootstrap hypothesis testing.

# estimate: the estimated value of the term.

# conf.low: the low end of a confidence interval on the estimate.

# conf.high: the high end of a confidence interval on the estimate.

# df: degrees of freedom.
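
# The interval columns are not part of the default output for every model. As a sketch for a linear model, the conf.int and conf.level arguments of tidy request them; the variable name ci is illustrative.

```r
library(broom)

lmfit <- lm(mpg ~ wt, mtcars)

# conf.int = TRUE appends conf.low and conf.high; conf.level sets the coverage
ci <- tidy(lmfit, conf.int = TRUE, conf.level = 0.95)
print(ci)

# The interval bounds should agree with base R's confint()
max(abs(ci$conf.low  - unname(confint(lmfit)[, 1])))
max(abs(ci$conf.high - unname(confint(lmfit)[, 2])))
```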

# augment functions

# augment(model, data) adds columns to the original data.

# If the data argument is missing, augment attempts to reconstruct the data from the model (note that this may not always be possible, and usually won’t contain columns not used in the model).

# Each row in an augment output matches the corresponding row in the original data.

# If the original data contained rownames, augment turns them into a column called .rownames.

# Newly added column names begin with . to avoid overwriting columns in the original data.

# Common column names include:

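# .fitted: the predicted values, on the same scale as the data (for glm fits, augment defaults to the link scale unless type.predict = "response" is given).

# .resid: the residuals, i.e. the actual y values minus the fitted values.

# .cluster: cluster assignments.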
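
# The properties listed above can be checked directly. This is a minimal sketch for a linear model; the variable names aug and added are illustrative. Passing data = mtcars keeps the columns the model did not use.

```r
library(broom)

lmfit <- lm(mpg ~ wt, mtcars)

# Supplying the original data keeps every column, not only mpg and wt
aug <- augment(lmfit, data = mtcars)

# One output row per input row
nrow(aug) == nrow(mtcars)

# Columns that augment added on top of the original ones
added <- setdiff(names(aug), names(mtcars))
print(added)

# .resid is the observed response minus .fitted (difference should be ~0)
max(abs(aug$.resid - (aug$mpg - aug$.fitted)))
```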

# glance functions

# glance always returns a one-row data frame.

# The only exception is that glance(NULL) returns an empty data frame.

# We avoid including arguments that were given to the modeling function. For example, a glm glance

# output does not need to contain a field for family, since that is decided by the user calling glm rather

# than the modeling function itself.

# Common column names include:

# r.squared: the fraction of variance explained by the model.

# adj.r.squared: R^2 adjusted based on the degrees of freedom.

# sigma: the square root of the estimated variance of the residuals.
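
# As a quick check of these columns, the sketch below compares glance output for a linear model against the values that summary() reports. The variable names gl and s are illustrative.

```r
library(broom)

lmfit <- lm(mpg ~ wt, mtcars)
gl <- glance(lmfit)

# A single row describing the whole model
nrow(gl)

# The same quantities are available from summary(), but only as list elements
s <- summary(lmfit)
abs(gl$r.squared     - s$r.squared)
abs(gl$adj.r.squared - s$adj.r.squared)
abs(gl$sigma         - s$sigma)
```

# All three differences should be zero up to floating-point error.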
