# Statistics Crash Course Day 2

library(help="datasets")

# t-test

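A t-test is a hypothesis test that checks whether two means are reliably different from each other. Called with two samples and no other arguments, `t.test()` runs Welch's two-sample test, which does not assume equal variances.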

x <- c(1.2,3.4,1.3,-2.1,5.6,2.3,3.2,2.4,2.1,1.8,1.7,2.2)
y <- c(2.4,5.7,2.0,-3,13,5,6.2,4.8,4.2,3.5,3.7,5.2)
t.test(x,y)


##
##  Welch Two Sample t-test
##
## data:  x and y
## t = -1.9667, df = 15.943, p-value = 0.06688
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.7799367  0.1799367
## sample estimates:
## mean of x mean of y
##  2.091667  4.391667
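
To see where the numbers in this output come from, the Welch statistic can be recomputed by hand. This is a minimal sketch using the same `x` and `y`; the names `vx`, `vy`, `se`, `t_stat`, `df_welch`, and `p_value` are illustrative and not part of the original tutorial.

```r
x <- c(1.2, 3.4, 1.3, -2.1, 5.6, 2.3, 3.2, 2.4, 2.1, 1.8, 1.7, 2.2)
y <- c(2.4, 5.7, 2.0, -3, 13, 5, 6.2, 4.8, 4.2, 3.5, 3.7, 5.2)

# Variance of each sample mean (s^2 / n)
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Standard error of the difference in means
se <- sqrt(vx + vy)

# Welch t statistic
t_stat <- (mean(x) - mean(y)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_welch)

c(t = t_stat, df = df_welch, p = p_value)
```

These values should agree with the `t`, `df`, and `p-value` line printed by `t.test(x, y)`. Since the p-value of 0.06688 is above 0.05, the difference in means is not significant at the 5% level, which is also why the confidence interval includes 0.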

# Z-test

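A one-sample z-test checks whether a sample mean differs from a hypothesized value. Here the hypothesized mean is 0 and the statistic is sqrt(n) * (mean - 0) / sd, computed from the first 100 observations in the file. The sample standard deviation stands in for the population value, a common approximation at this sample size.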

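x <- read.csv("/home/ashim888/Dropbox/Madan Bhandari/Stat Tutorial/Day 2/ztest.txt", header = FALSE)  # file has no header row
x <- x[1:100, ]  # first 100 rows; a one-column data frame drops to a numeric vector
z <- sqrt(100) * (mean(x) - 0) / sd(x)  # z statistic for H0: mean = 0, n = 100
z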
## [1] -0.2334861
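
The z statistic alone does not say whether to reject the null hypothesis. A minimal sketch of the usual next step, converting it to a two-sided p-value with `pnorm()`, assuming the standard normal approximation is adequate for 100 observations (the name `p_value` is illustrative):

```r
z <- -0.2334861  # value printed above

# Two-sided p-value: probability of a statistic at least this extreme under H0
p_value <- 2 * pnorm(-abs(z))
p_value
```

The p-value comes out well above 0.05, so this sample gives no evidence that the mean differs from 0.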

# Correlation

?cor

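Correlation ranges from -1 to +1. The sign gives the direction of the linear relationship and the magnitude gives its strength: values near -1 or +1 are strong, values near 0 are weak. Let's compute the correlation for our BullRiders dataset.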

bull <- read.csv("~/Desktop/edx/Foundations of Data Analysis/BullRiders.csv")
cor(bull$YearsPro,bull$BuckOuts)
## [1] -0.1670275
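
By default `cor()` returns the Pearson coefficient, which is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch of that definition follows. Because BullRiders.csv is not bundled with R, the built-in `mtcars` dataset stands in for it here, and the names `a`, `b`, and `r_manual` are illustrative.

```r
# mtcars ships with R's datasets package; it stands in for BullRiders here
a <- mtcars$mpg
b <- mtcars$wt

# Pearson correlation: covariance scaled by both standard deviations
r_manual <- cov(a, b) / (sd(a) * sd(b))

# Should agree with the built-in function
all.equal(r_manual, cor(a, b))
```

The same check would work on `bull$YearsPro` and `bull$BuckOuts` once the CSV is loaded.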

Now I want to check the correlations among three variables at once. That takes two steps:

- create a character vector holding the column names
- compute the correlations of those columns in the dataset

myVars <- c("YearsPro", "Events", "BuckOuts")
cor(bull[, myVars])
##            YearsPro     Events   BuckOuts
## YearsPro  1.0000000 -0.1597916 -0.1670275
## Events   -0.1597916  1.0000000  0.9803737
## BuckOuts -0.1670275  0.9803737  1.0000000
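
A correlation matrix like this one has a fixed structure: ones on the diagonal and mirrored values on either side of it. The sketch below illustrates this with the built-in `mtcars` dataset as a stand-in, since BullRiders.csv is not bundled with R; the names `vars` and `m` are illustrative.

```r
vars <- c("mpg", "hp", "wt")
m <- cor(mtcars[, vars])

# Every variable correlates perfectly with itself
diag(m)

# cor(a, b) equals cor(b, a), so the matrix is symmetric
isSymmetric(m)

# Pull one pairwise value out by row and column name
m["mpg", "wt"]
```

Read the same way, the BullRiders matrix shows that Events and BuckOuts are very strongly related (about 0.98), while YearsPro is only weakly and negatively related to both.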