The mosaic package has the following 4 functions, which will give us (most of) the random simulation tools we need. We've seen some of these already.
rflip(): Flip a coin
shuffle(): Shuffle a set of values
do(): Do the same thing many, many, many times
resample(): the Swiss Army knife of functions

shuffle() and resample()
A huge distinction in types of sampling:
In the Powerball analogy, this translates to:
Shuffling AKA re-ordering AKA permuting are all synonyms. I'm going to use all three terms interchangeably.
Run the following in your console:
library(mosaic)

# Define a vector fruit
fruit <- c("apple", "orange", "mango")

# Do this multiple times:
shuffle(fruit)
This works with the do() operator…
do(5) * shuffle(fruit)
… as well as within a mutate():
example_data <- tibble(
  name = c("Ilana", "Abbi", "Hannibal"),
  fruit = c("apple", "orange", "mango")
)

# Run this multiple times:
example_data %>%
  mutate(fruit = shuffle(fruit))
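For readers without mosaic installed, the same two ideas can be sketched in base R. This is only a stand-in under stated assumptions: base sample() with no extra arguments returns a random permutation, and replicate() plays the role of do(). The output shapes differ from what mosaic returns, and the names people and five_shuffles are made up for illustration.

```r
# Base R stand-ins (assumptions, not mosaic's own functions):
# sample(x) permutes x; replicate(n, expr) evaluates expr n times.
fruit <- c("apple", "orange", "mango")

# Five independent shuffles, one per column of a 3 x 5 character matrix
five_shuffles <- replicate(5, sample(fruit))

# Shuffle one column of a data frame while leaving the other in place
people <- data.frame(
  name  = c("Ilana", "Abbi", "Hannibal"),
  fruit = fruit,
  stringsAsFactors = FALSE
)
people$fruit <- sample(people$fruit)

print(five_shuffles)
print(people)
```

Each column of five_shuffles, and the fruit column of people, holds the same three fruits as before, only in a different (possibly identical) order; the name column is untouched.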
At its most basic, resample() resamples the input vector with replacement. Run this in the console multiple times:
resample(fruit)
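To see what "with replacement" means in practice, here is a small sketch in base R. It assumes that base sample() with replace = TRUE behaves like the basic resample() call above, which is my understanding of mosaic but is not stated in these notes. The seed and the names draws and has_repeat are arbitrary choices for illustration.

```r
# Base R sketch; sample(..., replace = TRUE) is used as a stand-in for resample()
set.seed(76)  # arbitrary seed so the run is repeatable
fruit <- c("apple", "orange", "mango")

# 1000 resamples of size 3, drawn with replacement (one resample per column)
draws <- replicate(1000, sample(fruit, size = length(fruit), replace = TRUE))

# A resample "has a repeat" when fewer than 3 distinct fruits appear in it
has_repeat <- apply(draws, 2, function(d) length(unique(d)) < length(fruit))

# Proportion of resamples containing a repeat; in theory 1 - 6/27, about 0.78
print(mean(has_repeat))
```

Because each draw puts the fruit back before the next one, most resamples contain at least one repeated fruit, which a shuffle never does.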
resample() has default settings that we can set to fit our needs; it is a Swiss Army knife.
# The default settings, written out explicitly for x = fruit:
resample(
  x = fruit,
  size = length(fruit),
  replace = TRUE,
  prob = rep(1 / length(fruit), length(fruit))
)
x is the input. In this case fruit.
size: size of output vector. By default the same size as x.
replace: Sample with or without replacement. By default with replacement.
prob: Probability of sampling each input value. By default, equal probability.
Run rep(1/length(fruit), length(fruit)) in your console. In the case of fruit, this vector is rep(1/3, 3), i.e. repeat 1/3 three times.

Rewrite rflip(10) using the resample() command.
Hint: coin <- c("H", "T")
Rewrite the shuffle() command by changing the minimal number of default settings of resample(). Test this on fruit.
What's the fastest way to do the above (in Question 4) 5 times? Write it out.
resample()ing.
Describe the process of bootstrapping. (You can use your notes if you like, but you can't use the textbook.)