Ethan Witkowski Spring 2019
Define Variables:
## [1] 1.242294
## [1] 0.2149149
Because the p-value (0.215) is greater than .05, we fail to reject the null hypothesis: these control subjects do not differ significantly from a normal population.
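The decision rule above can be sketched in R as follows. This is a minimal sketch assuming a two-sided test against a standard normal reference. The helper name `two_sided_p` and the variable `alpha` are illustrative, and since the original computation is not shown, it may differ in detail from what produced the reported values.

```r
# Hypothetical helper: two-sided p-value for a z statistic,
# assuming a standard normal reference distribution.
two_sided_p <- function(z) {
  2 * pnorm(-abs(z))
}

alpha <- 0.05      # conventional significance level
z <- 1.242294      # test statistic reported above
p <- two_sided_p(z)

# Fail to reject the null whenever p exceeds alpha.
if (p > alpha) {
  print("fail to reject the null")
} else {
  print("reject the null")
}
```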
## [1] 3.928478
## [1] 8.704141e-05
Effect size is the magnitude of the difference between means; statistical significance reflects how unlikely a difference at least that large would be if it were due to chance alone.
Holding the effect size constant, an increase in n shrinks the standard error and therefore increases the statistical significance (the p-value gets smaller).
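The relationship between n and significance can be illustrated with a short sketch. It assumes a one-sample z-test with known standard deviation; the values of `effect`, `sigma`, and `n_small` are hypothetical and not taken from the problem set. Note that the two statistics reported above (1.242294 and 3.928478) differ by a factor of about sqrt(10), which is consistent with a tenfold increase in n.

```r
# Hypothetical inputs (not taken from the problem set).
effect <- 2      # difference between the sample mean and the null mean
sigma  <- 15     # assumed known population standard deviation

# z statistic for a one-sample z-test with known sigma.
z_stat <- function(n) effect / (sigma / sqrt(n))

n_small <- 50
n_large <- 10 * n_small

z_small <- z_stat(n_small)
z_large <- z_stat(n_large)

# Same effect size, ten times the observations:
# z grows by a factor of sqrt(10), so the two-sided p-value shrinks.
print(z_large / z_small)
print(c(2 * pnorm(-abs(z_small)), 2 * pnorm(-abs(z_large))))
```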
## [1] 119.4662
## [1] 120.5338
This confidence interval (119.47 to 120.53) contains 120.
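A confidence interval of this kind can be computed along the following lines. This is a minimal sketch assuming a 95% interval for a mean with known standard deviation; `xbar`, `sigma`, `n`, and `mu0` are hypothetical values, not the ones behind the output above.

```r
# Hypothetical inputs (not taken from the problem set).
xbar  <- 120     # sample mean
sigma <- 15      # assumed known population standard deviation
n     <- 100     # sample size

# 95% confidence interval: xbar plus or minus z* times the standard error.
z_star <- qnorm(0.975)
se     <- sigma / sqrt(n)
ci     <- c(xbar - z_star * se, xbar + z_star * se)
print(ci)

# Check whether a hypothesized mean falls inside the interval.
mu0 <- 120
print(ci[1] <= mu0 && mu0 <= ci[2])
```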
Generate random samples, create 4 histograms.
The samples do not appear to be normal.
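The code for the histograms is not shown, so the following is only a sketch of how they might be produced. It assumes 10 draws per sample from a standard normal, mirroring the Q-Q plot code; the seed and the plot titles are illustrative additions.

```r
set.seed(1)                    # assumed seed, only for reproducibility
par(mfrow = c(2, 2))           # 2-by-2 grid of plots
for (i in 1:4) {
  dataset <- rnorm(10, 0, 1)   # assumed: 10 draws from a standard normal
  hist(dataset, main = paste("Sample", i))
}
```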
Generate random samples, create 4 Q-Q plots.
par(mfrow=c(2,2))                # 2-by-2 grid of plots
for (i in 1:4) {
  dataset <- rnorm(10,0,1)       # 10 draws from a standard normal
  qqnorm(dataset)
  abline(0,1,col=gray(.7))       # reference line y = x
}
With only 10 observations each, the samples do not appear to be normal, even though they were drawn from a standard normal distribution.
Repeat sampling with 100 observations.
par(mfrow=c(2,2))                # 2-by-2 grid of plots
for (i in 1:4) {
  dataset <- rnorm(100,0,1)      # 100 draws from a standard normal
  qqnorm(dataset)
  abline(0,1,col=gray(.7))       # reference line y = x
}
With 100 observations each, the samples resemble a normal distribution more closely.
datasetbmi <- read.csv("C:\\Users\\ethan\\Desktop\\Swarthmore\\Spring 2019\\Statistics II\\Problem Sets\\Problem Set 1\\bmi.csv", header=TRUE)
bmi <- datasetbmi[,"bmi"]
status <- datasetbmi[,"status"]
## [1] 3.431874
numshuffles <- 1000
diffs <- rep(NA, numshuffles)    # differences in means under shuffled labels
for(i in 1:numshuffles) {
  newstatus <- sample(status)    # randomly permute the group labels
  diffs[i] <- mean(bmi[newstatus==1]) - mean(bmi[newstatus==99])
}
## [1] 0.003431874
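Because the reported p-value (0.0034) is less than .05, the null hypothesis of no difference in mean BMI between the two status groups is rejected.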
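The step from the shuffled differences to a p-value is not shown, so here is a hedged sketch of one common way to do it. It uses simulated stand-in data rather than bmi.csv, and it assumes a two-sided p-value defined as the share of shuffled differences at least as extreme as the observed one. The names `observed` and `pvalue`, the seed, and the group sizes are illustrative.

```r
set.seed(1)  # assumed seed, only for reproducibility

# Simulated stand-in data (hypothetical, not the bmi.csv values).
bmi    <- c(rnorm(20, mean = 27, sd = 3), rnorm(20, mean = 24, sd = 3))
status <- rep(c(1, 99), each = 20)

# Observed difference in group means.
observed <- mean(bmi[status == 1]) - mean(bmi[status == 99])

# Null distribution: shuffle the labels and recompute the difference.
numshuffles <- 1000
diffs <- rep(NA, numshuffles)
for (i in 1:numshuffles) {
  newstatus <- sample(status)
  diffs[i] <- mean(bmi[newstatus == 1]) - mean(bmi[newstatus == 99])
}

# Two-sided p-value: share of shuffles at least as extreme as observed.
pvalue <- mean(abs(diffs) >= abs(observed))
print(pvalue)
```

Taking absolute values makes the test two-sided; for a one-sided alternative, `mean(diffs >= observed)` would be used instead.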
marathon <- scan("C:\\Users\\ethan\\Desktop\\Swarthmore\\Spring 2019\\Statistics II\\Problem Sets\\Problem Set 1\\nycmarathon.csv")
par(mfrow=c(2,1))                # two stacked plots for comparison
hist(marathon, col="darkgrey", border="white", nclass=200)   # many narrow bins
hist(marathon, col="darkgrey", border="white", nclass=30)    # fewer, wider bins