Explanation: Stat5401, R, data, RMatlab|Java

Stat5401 Midterm Exam II - Due April 8th, 2020

Exam Instructions:
• There are 3 questions in total (25 points). Please check that you have answered all the questions.
• Please attach all R code in your solution. You can use an R notebook to organize your results.
• Please organize your answers in a single pdf file and submit it through Canvas.
• The exam is due at 11:59 pm on April 8, 2020 (CDT). Please try to submit your work at least a few minutes before the deadline to avoid delays due to technical issues.
• This is an open book exam. You are allowed to use your notes, book, R help files, academic papers, online tutorials, etc.
• You are NOT allowed to discuss the exam with anyone else.
• General questions about the exam should be asked on the Canvas discussion board 'Midterm 2 related questions and clarifications'. Please ask questions as early as possible. I may not be able to answer last-minute questions.
• Other questions should be directed to me at lixx1766@umn.edu. If you are writing emails to me, please use your university email, i.e., ending with @umn.edu, etc. You can also send email directly through Canvas.
• All answers must be written in your own words.
• Updates are colored in red.

Question 1 (5 points)

Consider the following underlying linear regression model
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i,$$
with the standard assumptions that $E(\epsilon_i) = 0$, $Var(\epsilon_i) = \sigma^2$, and $Cov(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$.
Note that we don't include the intercept in this question.

Suppose that you observe [data not preserved in the source].

Answer the following questions using math, i.e., by hand:

(a) (1 point) Suppose that we fit the model
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i.$$
Write down the design matrix $X$ with the given $X_1$ and $X_2$.

(b) (2 points) Fit the model $y_i = \beta_1 X_{i1} + \epsilon_i$, and let $\hat{\beta}_1$ be the least squares estimator for $\beta_1$.
• Derive the mean and variance of $\hat{\beta}_1$.
• Is $\hat{\beta}_1$ an unbiased estimator for $\beta_1$?

(c) (2 points) Suppose now that the true underlying model is
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i3} + \epsilon_i,$$
and you observe [data not preserved in the source].
Suppose that you fit the model
$$y_i = \beta_1 X_{i1} + \epsilon_i$$
to estimate $\beta_1$.
• Derive the mean and variance of the least squares estimator for $\beta_1$.
• Is it an unbiased estimator?

Question 2 (Total 8 points)

Download the data Q2.csv from Canvas. This is a simulated dataset motivated by an example in the book Machine Learning with R by Brett Lantz.

The response variable in the dataset is
• charges = medical cost (in dollars) billed by the insurance company

The 6 covariates are
• gender = Gender of the primary beneficiary, 'f' if female, 'm' if male.
• age = Age of the primary beneficiary.
• bmi = Body mass index.
• smoker = Smoker or not, 'yes' for smoker, 'no' for non-smoker.
• children = Number of dependents, treated as a continuous/numeric covariate in this problem.
• region = Residential area in the US.

(a) Build a linear model by regressing charges on all 6 covariates. Answer the following questions.
(i) Which effects are significant at $\alpha = 0.05$, and what is the direction of the effects? Is there a relationship between age and charges?
(ii) Find a 95% confidence interval for the linear coefficient for bmi.
(iii) What are the $R^2$ and adjusted $R^2$?

(b) Build a reduced model by regressing charges on age, bmi and smoker. Compare this model with the full model fitted in part (a) using an F test.
According to the F test, does the model in part (a) fit the data significantly better than that in part (b)?

Question 3 (Total 12 points)

In class, we mentioned that there are many variable selection methods available. In this example, we study additional performance metrics, and use simulation to verify their effectiveness. We will also study and try stepwise variable selection for multiple linear regression.

(a) We have learned the adjusted $R^2$ as a metric for the model fit. In this question, we compare different models using the adjusted $R^2$.
Suppose the predictor is generated by

    set.seed(2020)
    n=200
    x=rnorm(n)

Remark: To make our results comparable, please use set.seed(2020) when generating x.

(i) Let the response be generated by

    eps=rnorm(n)
    y=x+x^2+x^3+eps

What is the underlying model? How many covariates are there in the underlying model? Please specify the covariates and true linear coefficients.

(ii) Fit 6 different models: $y_i = \beta_0 + \sum_{j=1}^{p} \beta_j X_{ji} + \epsilon_i$ for $p = 1, 2, 3, 4, 5, 6$. These models are polynomials of different orders. Calculate the adjusted $R^2$ for each of them, and draw a plot showing the adjusted $R^2$ (x-axis: p, y-axis: adjusted $R^2$). Does the correct model have the largest adjusted $R^2$?
(Hint: You can first create a data matrix

    X = cbind(x,x^2,x^3,x^4,x^5,x^6)

and then use a for loop to run the 6 regression models, in order to simplify the code. Also, try summary(model)$adj.r.squared to extract the adjusted $R^2$ for a fitted model.)

(iii) Instead of using the adjusted $R^2$, there are other performance criteria. Here, we consider the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC).
Read the document at the link https://daviddalpiaz.github.io/appliedstats/variable-selection-and-model-building.html. Alternatively, you can also read pages 385-386 of the textbook on 'selecting predictor variable from a large set' and page 705 for the definition of AIC and BIC.
For this question, write down the definition of AIC and BIC in terms of the Residual Sum of Squares (RSS), n and p.

(iv) In R, AIC and BIC can be computed using the functions AIC and BIC, respectively. Replace the adjusted $R^2$ by AIC and BIC in part (ii) and plot the results. Does the correct model have the smallest AIC and BIC?

(v) Repeat the simulation in (i), (ii) and (iv) 100 times. You will need to keep the same x while generating new random eps each time.
Each time, use the adjusted $R^2$ (the largest one), AIC and BIC (the smallest one) to select the model. Record the model selected in each simulation (i.e., record the selected p).
Report how often the adjusted $R^2$, AIC and BIC correctly select the true model among the 100 simulations. For this problem, which metric selects the model best? For the other metrics, do they tend to select more covariates or fewer covariates than the correct one?

(b) For multiple linear regression, stepwise selection (including forward search, backward search, and both directions) is usually used for model selection. Read Section 16.2 of the document https://daviddalpiaz.github.io/appliedstats/variable-selection-and-model-building.html and answer the following questions.
(i) Use about two or three sentences to describe stepwise variable selection methods.
(ii) Download the data Q3.csv and regress Y on X1 - X20. (Hint: try lm(Y~.,data=Q3)). Use the function step to select variables (use the default arguments, without changing arguments like k or direction). Report the selected variables.

Reposted from: http://www.6daixie.com/contents/18/5068.html
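The hint in Question 3(a)(ii) describes a for loop over the columns of X. A minimal sketch of that loop follows, extended to also record AIC and BIC as in part (iv). The names max_p, adj_r2, aic, bic and results are illustrative assumptions and do not come from the exam; this is one possible layout, not the instructor's reference solution.

```r
# Sketch only: follows the hint in Question 3(a)(ii).
# Names other than x, eps, y and X are illustrative assumptions.
set.seed(2020)
n <- 200
x <- rnorm(n)
eps <- rnorm(n)
y <- x + x^2 + x^3 + eps

# Columns are x, x^2, ..., x^6, as in the hint.
X <- cbind(x, x^2, x^3, x^4, x^5, x^6)

max_p <- ncol(X)
adj_r2 <- numeric(max_p)
aic <- numeric(max_p)
bic <- numeric(max_p)

for (p in seq_len(max_p)) {
  # Use the first p columns: a polynomial of order p plus an intercept.
  fit <- lm(y ~ X[, 1:p, drop = FALSE])
  adj_r2[p] <- summary(fit)$adj.r.squared
  aic[p] <- AIC(fit)
  bic[p] <- BIC(fit)
}

results <- data.frame(p = seq_len(max_p), adj_r2 = adj_r2, AIC = aic, BIC = bic)
print(results)

# Plot the adjusted R2 against the polynomial order p.
plot(results$p, results$adj_r2, type = "b", xlab = "p", ylab = "adjusted R2")
```

Using drop = FALSE keeps X[, 1:p] a matrix even when p = 1, so the same lm call works for every order. The same plot call can be reused with results$AIC or results$BIC on the y-axis.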
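For Question 3(b)(ii), the call pattern lm(Y~.,data=Q3) followed by step can be tried out on simulated data first. The sketch below does not use Q3.csv: the data frame demo, its five predictors and the coefficients 2 and -1.5 are hypothetical stand-ins chosen only so the example runs on its own. The argument trace = 0 is added solely to silence the step-by-step printout; it does not change which model is selected.

```r
# Sketch only: demo is a hypothetical stand-in for Q3.csv.
set.seed(1)
n <- 100
demo <- data.frame(matrix(rnorm(n * 5), nrow = n))
names(demo) <- paste0("X", 1:5)

# Only X1 and X2 enter the simulated response (an assumption for illustration).
demo$Y <- 2 * demo$X1 - 1.5 * demo$X2 + rnorm(n)

# Full model with all predictors, as in the exam's hint.
full <- lm(Y ~ ., data = demo)

# With no scope argument, step() searches backward from the full model using AIC.
selected <- step(full, trace = 0)

# Names of the predictors kept in the selected model.
selected_vars <- attr(terms(selected), "term.labels")
print(formula(selected))
print(selected_vars)
```

Because selection is AIC based, noise predictors can occasionally survive, so the selected set may be larger than the set used to simulate Y.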
