There are two main panes in RStudio where we write and execute R commands, or R statements. We work with scripts in the script editor, to document our work, and with individual statements in the console.
As an example, we’ll look at a one-sample t-test. We’ll simulate data, so we know what the result should be, then perform the test.
You do most of your work in RStudio by writing, running, and saving scripts, files with sequences of R commands.
Start a new script by going to the File menu and clicking New File - R Script. You can do the same thing by clicking the New File icon on the toolbar.
You’ll notice you have the usual options for opening existing files and for saving script files in the menu and on the toolbar.
Yet another way you’ll open existing scripts is to click on them in the Files pane, in the lower right corner of the RStudio workspace.
Now, type the following commands into a new script:
x <- rnorm(25)
t.test(x)
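This generates 25 random observations from a normal distribution with mean zero and a standard deviation of one. Then we perform a one-sample t-test of the null hypothesis that the true mean is zero. Because the data really do come from a distribution with mean zero, we expect we won’t reject the null hypothesis.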
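If you want the same numbers each time you run the script, one option is to set a random seed first. The sketch below is a variation on the script above, not part of the original example: set.seed(), the seed value 42, and the name result are additions for illustration. It also shows that the object returned by t.test() can be stored and inspected piece by piece.

```r
# set.seed() is an addition for reproducibility; it is not in the original script
set.seed(42)          # hypothetical seed; any integer works

x <- rnorm(25)        # 25 draws from a normal distribution, mean 0 and sd 1
result <- t.test(x)   # one-sample t-test of H0: true mean equals 0

length(x)             # number of observations
result$parameter      # degrees of freedom, n - 1
result$p.value        # p-value of the test
```

Storing the test in an object is optional; running t.test(x) on its own prints the full summary, as in the example.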
Each line in our script is a statement, a command to be run. To run these one at a time, move your cursor anywhere in the first line and key Ctrl-Enter. You can also click on the Run button at the top of the script editor. This not only runs the current command, but also moves the cursor to the next command, making it easy to walk step-by-step through your script.
To run more than one statement at a time, highlight all the code you want to run and then key Ctrl-Enter or click Run. (Be careful: you can highlight less than a full command, and R will try to execute only what you highlighted!)
One Sample t-test
data: x
t = -0.85712, df = 24, p-value = 0.3999
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.5343228 0.2207482
sample estimates:
mean of x
-0.1567873
If you try this example, you will certainly get different numbers, because we are generating random data. You might even get a result where you reject the null hypothesis! Why?
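One way to explore that question is a small simulation. This is a sketch rather than part of the original example: replicate(), the seed value, the 2000 repetitions, and the name p_values are all additions. It repeats the simulate-then-test step many times and counts how often the p-value falls below 0.05, even though the true mean really is zero.

```r
set.seed(1)  # hypothetical seed, only so the sketch is repeatable

# Repeat the simulate-then-test step many times and keep each p-value
p_values <- replicate(2000, t.test(rnorm(25))$p.value)

# Share of tests that reject at the 5% level even though the true mean is 0
mean(p_values < 0.05)
```

The share should come out somewhere near 0.05, which is what a 5% significance level means: about one test in twenty rejects a true null hypothesis by chance.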
When you are in the midst of thinking your way through a problem, you will also find it useful to issue commands in the Console. The Console is the programmer’s scratch sheet of paper.
For example, suppose you have done the first step above, generating random numbers in a vector called x. As a check, you decide you’d like to see the numbers generated, and calculate their mean, which should be close to zero.
In the Console, at the > prompt, you type
x
[1] -1.09718699 0.04301967 -0.59282359 0.30116491 0.44059642 2.20407748
[7] -0.17392492 -0.12615702 -1.00326870 -0.96294769 -1.61037435 -0.27502236
[13] 0.07006282 -1.50444702 0.95271667 0.64445942 -1.56483325 -0.10622609
[19] 0.17223694 -1.00402994 0.34221631 0.87284991 0.88050756 -0.77238078
[25] -0.04996753
Typing an object’s name at the prompt calls an implicit print() function. The numbers look reasonable (most of the values are between -3 and 3, right?), so you follow up in the Console with
mean(x)
[1] -0.1567873
which is statistically close to zero. So if you now follow up with the t.test() in your script, the result should not be statistically significant.
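To put a number on "close to zero", you can compare the sample mean to its standard error. The sketch below assumes, as in the example, 25 observations with a true standard deviation of 1; the seed value and the name se are additions for illustration.

```r
set.seed(7)   # hypothetical seed for a repeatable illustration
x <- rnorm(25)

# Typing a name at the prompt auto-prints it; print() does the same explicitly
print(mean(x))

# Standard error of the mean when the true sd is 1 and n = 25
se <- 1 / sqrt(length(x))
se

# Is the sample mean within about two standard errors of zero?
abs(mean(x)) < 2 * se
```

With a standard error of 0.2, the sample mean of -0.157 shown above is less than one standard error from zero, which fits with the non-significant t-test.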
R keeps track of your command history, whether the command was run from a script or from the Console. In the Console you can scroll through this history with the up and down arrows on your keyboard (the ones you use to move your cursor in a document).
For example, you could use this to quickly generate a new sample, and run a new t-test.
You can also access your command history in RStudio’s History pane (upper right, tabbed with Environment). Here you can highlight one or more commands and send them to the Console. You can also send commands “to Source” (your script), turning your scratch work into something you can save.
Create a script where you generate random observations, and test the hypothesis that the mean is zero with a t.test.
You can generate n random observations from a normal distribution with mean m and a standard deviation of 1 with rnorm(n, mean=m).
After you have run the data-generating command, use the Console to check the mean and standard deviation (sd()) of your data. Look about right?
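One possible way to approach this exercise is sketched below. The values of n and m, the seed, and the name y are hypothetical choices made for illustration; substitute your own.

```r
set.seed(2024)  # hypothetical seed so the sketch is repeatable

n <- 50   # hypothetical sample size
m <- 2    # hypothetical true mean

y <- rnorm(n, mean = m)  # sd is left at its default of 1

mean(y)    # should land near m
sd(y)      # should land near 1
t.test(y)  # tests H0: true mean equals 0
```

With m set to something other than zero, the null hypothesis is false, so you would usually expect the test to reject it. Try m = 0 as well and compare.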
From our working example script, highlight and run just x in the line
t.test(x)
R expressions can be nested. For instance, we might write:
t.test(rnorm(15))
To step through this statement, first highlight and run rnorm(15), to verify that you understand what that piece of code does. Then highlight and run the entire statement. Did you get the answer you expected?
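Keep in mind that each run of rnorm(15) draws fresh random numbers, so the values you see when you run the inner piece are not the ones the full statement then tests. The sketch below shows that the nested form and a two-step form do the same thing when they start from the same random seed. The seed value and the names nested, z, and stepwise are additions for illustration.

```r
set.seed(123)                 # hypothetical seed
nested <- t.test(rnorm(15))   # inner call runs first; its result feeds t.test()

set.seed(123)                 # reset so the same 15 numbers are drawn again
z <- rnorm(15)                # step 1: the inner expression
stepwise <- t.test(z)         # step 2: the outer expression

# Compare the t statistics from the two approaches
all.equal(nested$statistic, stepwise$statistic)
```

The comparison should return TRUE: R evaluates the inner expression first and passes its value to the outer function, which is exactly what the two-step version does by hand.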