
R Language Programming Tutoring: STATG/M003 STATISTICAL COMPUTING | IN-COURSE ASSESSMENT 2 (with answers)

Author: 拓端
Published: 2022-12-14 22:14:54
Included in the column: 拓端tecdat

Full text link: http://tecdat.cn/?p=30885

  • Your solutions should be your own work and are to be handed in by yourself to the Statistical Science Departmental office by 1600hrs on MONDAY, 23rd FEBRUARY.

Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.

G3 Assessment 2 | Instructions

  1. Answer both questions.
  2. For Question 1 you should hand in a paper copy of your report for part (d), and you should also upload an electronic copy of your report to the course Moodle page. You are NOT required to submit your R script for this question.
  3. For Question 2 you should submit:
       • an electronic copy of your Surname.r file containing your R script (see below),
       • a printout of your Surname.r file containing your R script (see below),
       • a printout of the graph in part (b) that is produced by your script (black and white copy is sufficient), and
       • a printout of your Surname_out.txt file (see below) containing output to parts (d), (e) and (f) produced by your R script.
     Your R script should be saved in a file named Surname.r, where Surname is your surname. For example, if your name is Karl Pearson, your R script should be saved in the file Pearson.r. Only one file is required/allowed. Do not submit a zip file.
     Your code should create an output file called Surname_out.txt, where again Surname is your surname. This file should contain the output from parts (d), (e) and (f) and it should include text and comments indicating what the results are. This should be done by inserting remarks in the source file using the cat() function.
     Your program should be well commented and have a header section summarising the logical structure, followed by the main body of the script, which should also contain comments. You should clearly indicate the question and part numbers in both your code and output. It should execute without the need for user intervention when the R command source() is run on it.
  4. Paper copies of your answers and printouts should be handed in to the Statistical Science departmental office. Your answers to both questions should be submitted in a single document (not in loose pages) that is clearly identified with your name. Remember to complete a plagiarism declaration, and to attach it to your work.
  5. Electronic copies of your script (i.e., just the file Surname.r) should be submitted via the Moodle page for the course: look for the heading "Submission of in-course assessment 2" and follow the link beneath it to obtain instructions on how to do this.
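
The output requirements in item 3 could be met with a script layout along the following lines. This is only a sketch: the file name Pearson_out.txt follows the Karl Pearson example in the instructions, while the helper report() and the numbers passed to it are hypothetical placeholders, not results from the assessment data.

```r
# Pearson.r: sketch of a script layout (all reported values are placeholders)
# Structure: 1. create the output file; 2. Question 2(d); 3. Question 2(e)

out_file <- "Pearson_out.txt"   # Surname_out.txt with Surname = Pearson

# Start a fresh output file with a header line
cat("In-course assessment 2: output\n", file = out_file)

# Hypothetical helper: append one labelled result to the output file
report <- function(label, value) {
  cat(label, format(value, digits = 4), "\n", file = out_file, append = TRUE)
}

# Question 2(d): placeholder standing in for a negative log-likelihood value
report("Q2(d) negative log-likelihood at (1, 1):", 123.456)

# Question 2(e): placeholders standing in for maximum likelihood estimates
report("Q2(e) MLE of (mu, sigma):", c(1.5, 0.8))

# Read the file back to check what was written
out_lines <- readLines(out_file)
print(out_lines)
```

Because every result is written with cat(), running source("Pearson.r") produces the output file without any user intervention.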

STATG003/M003 Assessment 2 | Questions

  1. The file lungfunction.dat contains data from 50 people. Lung function can be measured by a forced out breath of air into a device called a spirometer and is used as a marker for lung health. There are four quantitative variables: forced expiratory volume in 1 second (FEV1) measurement (to be denoted by fev and measured in litres), the person’s height in metres (to be denoted by height), age in years (to be denoted by age) and average time spent exercising in hours per week (to be denoted by exercise). In addition there is a variable indicating if the person has used an inhaler (to be denoted by inhaler) within the last 24 hours (1=no, 2=yes). Clinicians are interested in how a person’s lung health depends on height, age, exercise and inhaler usage.

(a) Download the file lungfunction.dat from the G3 Moodle page. Read the data into R using read.table and then name the columns as fev, height, inhaler, age, exercise.

(b) Obtain summary statistics for each quantitative variable and make useful plots of the data, i.e., plots that are relevant to the objectives of the study. Such plots may include, but are not necessarily restricted to, pairwise scatter plots with different plotting symbols for those who have or haven’t used an inhaler recently. Put plots together in a single figure where appropriate and consider possibly using log scales for the quantitative variables.

(c) Find a linear model that enables fev to be predicted from the other variables and that is not more complicated than necessary. You may wish to consider using log transformations of one or more of the explanatory variables. All your models should be fitted using the lm function, and a wide range of models should be considered to make your choice of model convincing, with the use of appropriate diagnostics to assess them. Ultimately you are required to recommend a single model that is suitable for interpretation and to justify your recommendation.

(d) Write a brief report on your analysis in three sections:
  I. Describe briefly what you found in your exploratory analysis in part (b).
  II. Describe briefly (without too many technical details) what models you considered in part (c) and why you chose the model you did.
  III. State your final model clearly and describe it in words. Remember to include an estimate of the error standard deviation and say what this means. Give an estimate of what would be the effect on the average FEV1 of being older (e.g., by 1 year of age). Give an appropriate assessment of the uncertainty in your estimate.
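
One way the steps in parts (b) to (d) might be approached is sketched below. Since lungfunction.dat is not included here, the sketch runs on simulated stand-in data; the object name lung, the coefficients used to generate the data, and the particular pair of models compared are all assumptions made for illustration, not the assessment's data or a recommended final model.

```r
# Sketch for Question 1 on SIMULATED stand-in data (lungfunction.dat is not available here).
# Every number used to generate the data is invented for illustration.
set.seed(1)
n <- 50
lung <- data.frame(
  height   = runif(n, 1.5, 1.9),              # metres
  inhaler  = sample(1:2, n, replace = TRUE),  # 1 = no, 2 = yes
  age      = runif(n, 20, 70),                # years
  exercise = runif(n, 0, 10)                  # hours per week
)
lung$fev <- 4 * lung$height - 0.03 * lung$age + 0.05 * lung$exercise -
  0.3 * (lung$inhaler == 2) + rnorm(n, sd = 0.3)
# With the real file one would instead use something like:
# lung <- read.table("lungfunction.dat")
# colnames(lung) <- c("fev", "height", "inhaler", "age", "exercise")

# (b) Summary statistics and pairwise scatter plots,
#     with a different plotting symbol for each inhaler group
quant <- c("fev", "height", "age", "exercise")
print(summary(lung[, quant]))
pairs(lung[, quant], pch = c(1, 17)[lung$inhaler],
      main = "circle: no inhaler, triangle: inhaler")

# (c) Treat inhaler as a factor, fit a full main-effects model and one simpler
#     candidate, then compare them with an F test and inspect residual diagnostics
lung$inhaler <- factor(lung$inhaler, labels = c("no", "yes"))
full    <- lm(fev ~ height + inhaler + age + exercise, data = lung)
reduced <- update(full, . ~ . - exercise)
print(anova(reduced, full))
par(mfrow = c(2, 2))
plot(full)

# (d) Error standard deviation, estimated age effect and its 95% confidence interval
sigma_hat <- summary(full)$sigma
age_est   <- coef(full)["age"]
age_ci    <- confint(full)["age", ]
cat("Estimated error SD:", sigma_hat, "\n")
cat("Change in mean fev per extra year of age:", age_est,
    "95% CI:", age_ci, "\n")
```

Here sigma_hat estimates the typical size of the deviation of an individual's fev from the mean predicted by the model, and the confidence interval for the age coefficient is one appropriate way to express the uncertainty asked for in part (d).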

  2. The file trnormal.dat contains observations from a truncated normal distribution. Each observation was originally drawn from a normal N(μ, σ²) distribution, and any values less than u are replaced by u. In this instance, u is taken to be 1. So if X ~ N(μ, σ²) then the observations are from the distribution of W = max(u, X) where u = 1. Hence:
image.png

and

image.png

The log-likelihood of μ and σ given a set of observations w_1, ..., w_n is

image.png
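
The formulas in this question appear only as images. Under the stated definition W = max(u, X) with X ~ N(μ, σ²), the distribution of W and the log-likelihood can be reconstructed as below. This is a derivation from that definition rather than a transcription of the images, and the symbols Φ and φ (the standard normal distribution and density functions) are notation assumed here.

```latex
% Point mass at u, and density above u
P(W = u) = P(X \le u) = \Phi\!\left(\frac{u-\mu}{\sigma}\right),
\qquad
f_W(w) = \frac{1}{\sigma}\,\phi\!\left(\frac{w-\mu}{\sigma}\right)
\quad \text{for } w > u.

% Log-likelihood for observations w_1, \dots, w_n
\ell(\mu, \sigma \mid w_1, \dots, w_n)
  = \sum_{i=1}^{n} \left\{
      I(w_i = u)\, \log \Phi\!\left(\frac{u-\mu}{\sigma}\right)
    + I(w_i > u) \left[ \log \phi\!\left(\frac{w_i-\mu}{\sigma}\right) - \log \sigma \right]
    \right\}.
```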

The function I(C) is the indicator function, taking the value 1 if the condition C is true and 0 if the condition C is false.

(a) Download the data trnormal.dat from the G3 Moodle page. Read it into R using scan.

(b) Obtain summary statistics for the data and plot a histogram.

(c) Write a function called negll that takes two arguments, (i) params, a vector containing the values of the two parameters (μ, σ), and (ii) dat, a vector w of the data, and returns the negative log-likelihood, -l(μ, σ | w). (Hint: the R functions pnorm and dnorm may be useful in computing the negative log-likelihood.)

(d) Use your function negll to evaluate and print out the negative log-likelihood for the data in trnormal.dat for a few sensible values of μ and σ.

(e) Use the R function nlm to find and print out the maximum likelihood estimates of μ and σ for the data in trnormal.dat by minimising the negative log-likelihood.

(f) Obtain and print out approximate standard errors for these estimates.
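
Parts (c) to (f) can be sketched as follows. Because trnormal.dat is not included here, the sketch uses simulated data with invented true values (μ = 1.5, σ = 1); the form of the likelihood inside negll is derived from W = max(u, X) and is an assumption about what the missing formula images contain, and the starting values passed to nlm are likewise a choice made for illustration.

```r
# Sketch for Question 2 on SIMULATED data (trnormal.dat is not available here).
# The true values mu = 1.5 and sigma = 1 are invented; u = 1 as in the question.
set.seed(42)
w <- pmax(1, rnorm(500, mean = 1.5, sd = 1))   # W = max(u, X) with u = 1
# With the real file one would instead use: w <- scan("trnormal.dat")

# (c) Negative log-likelihood; params = c(mu, sigma), dat = data vector w
negll <- function(params, dat) {
  u     <- 1
  mu    <- params[1]
  sigma <- params[2]
  if (sigma <= 0) return(Inf)          # sigma must be positive
  at_u <- dat == u                     # observations replaced by u
  ll <- sum(at_u) * pnorm(u, mu, sigma, log.p = TRUE) +   # point mass at u
    sum(dnorm(dat[!at_u], mu, sigma, log = TRUE))         # density above u
  -ll
}

# (d) Evaluate the negative log-likelihood at a few parameter values
for (p in list(c(1, 1), c(2, 2), c(1, 3))) {
  cat("negll at mu =", p[1], ", sigma =", p[2], ":", negll(p, w), "\n")
}

# (e) Minimise the negative log-likelihood with nlm, asking for the Hessian
fit <- nlm(negll, p = c(mean(w), sd(w)), dat = w, hessian = TRUE)
mle <- fit$estimate
cat("MLE of (mu, sigma):", mle, "\n")

# (f) Approximate standard errors: square roots of the diagonal of the
#     inverse Hessian of the negative log-likelihood at the minimum
se <- sqrt(diag(solve(fit$hessian)))
cat("Approximate standard errors:", se, "\n")
```

The Hessian returned by nlm is a numerical approximation to the observed information, so the standard errors are approximate in two senses: they rely on large-sample theory and on finite differences.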

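# Question 1

# Read the data (assumes lungfunction.dat is in the working directory) and name the columns
data <- read.table("lungfunction.dat")
colnames(data) <- c("fev", "height", "inhaler", "age", "exercise")
summary(data)
cor(data)   # correlations between the variables
plot(data)  # scatter-plot matrix

# Box plot of fev by inhaler use (1 = no, 2 = yes)
boxplot(fev ~ inhaler, data = data,
        col = "yellow",
        main = "Box plot of fev by inhaler",
        xlab = "inhaler",
        ylab = "fev",
        xlim = c(0, 3), ylim = c(5, 9), yaxs = "i")

# Box plot of fev by height (any further arguments of this call are cut off in the source)
boxplot(fev ~ height, data = data,
        col = "red")

# The definition of lm1 is not shown in the source;
# the full main-effects model below is an assumption.
lm1 <- lm(fev ~ height + inhaler + age + exercise, data = data)
summary(lm1)

# Stepwise selection in both directions to drop unneeded variables
lm2 <- step(lm1, direction = "both")
summary(lm2)

# Model with a log-transformed term. Note: inhaler takes only the values 1 and 2,
# so log(inhaler) is collinear with inhaler and its coefficient is reported as NA.
lm3 <- lm(fev ~ height + inhaler + age + log(inhaler), data = data)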



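# Question 2

# Read the data (assumes trnormal.dat is in the working directory)
data <- scan("trnormal.dat")

# Evaluate the negative log-likelihood at a few parameter values
# (negll is the function from part (c); its definition is not shown here)
print(negll(c(1, 1), data))
print(negll(c(2, 2), data))
print(negll(c(1, 3), data))

# Estimate the parameters by maximum likelihood by minimising negll with nlm
# (the call is not shown in the source; the starting values c(1, 1) are an assumption)
fit <- nlm(negll, c(1, 1), dat = data)
print(fit$estimate)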
Figures: model summaries for lm1, lm2, lm3, lm4 and lm5; variable relationships; variable relationship plot; scatter plots 1 to 3; data overview; box plots 1 to 4.

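Original statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com for removal.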
