R is a language I’ve been curious about for some time. I was learning a bit of Python when I kept reading and hearing about R; the two are the programming languages most commonly used in data science. Out of curiosity and a bit of boredom, I tried out some R syntax. After that experience, I decided to keep learning R using DataQuest’s Data Analyst in R track.
This is the first of a series of posts documenting what I’m learning in the Data Analyst in R track. In this post, I’ll discuss the first lesson of the track, Introduction to Programming in R.
Introduction to Programming in R
217 - 9        # 208
415 + 156      # 571
7 * 18         # 126
(45 - 3) / 6   # 7
(17 * 8) - 4   # 132
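# Assigning values to variables with =
chai = 5
matcha = 4
black = 2
green = 2
white = 3
# The same assignments with <-, the operator R code conventionally uses
chai <- 5
matcha <- 4
black <- 2
green <- 2
white <- 3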
As with other languages, there are rules to follow when naming variables in R. Variable names can contain letters, numbers and underscores; other special characters are not allowed, and a name cannot begin with a number or an underscore.
What was new to me, however, is that in R a dot can be used in a variable name. I believe this is the first time I’ve come across this rule in programming. Variable names can begin with a dot but the dot cannot be followed by a number.
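The naming rules above can be checked directly in R. This is a minimal sketch, not something from the lesson: is_valid_name is a hypothetical helper that relies on base R's make.names(), which rewrites a string into a syntactically valid name. If a string comes back unchanged, it was already a valid name.

```r
# Hypothetical helper (not from the lesson): a string is a syntactically valid
# R name if make.names() leaves it unchanged.
is_valid_name <- function(x) make.names(x) == x

# A mix of names that follow the rules and names that break them
candidates <- c("tea_price", "tea.price", ".tea", "2tea", ".2tea", "_tea", "tea-price")

validity <- is_valid_name(candidates)
names(validity) <- candidates  # label each result with the name it refers to
print(validity)
```

The first three candidates should come back TRUE. The rest break a rule: they start with a number, start with a dot followed by a number, start with an underscore, or contain a hyphen.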
DataQuest provides a takeaway (I’ll discuss what this is in a bit) that links to this awesome resource from R-bloggers about variable naming conventions.
Next, I moved on to vectors. A vector is an object that stores a sequence of values of the same type under a single variable. To create a vector you would write:
tea_prices <- c(5, 4, 2, 2, 3)
tea_flavors <- c("chai", "matcha", "black", "green", "white")
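c() is a function whose name stands for combine (or concatenate). It takes multiple values as input and combines them into a single vector. You can also pass existing variables to c(), for example c(chai, matcha, black, green, white). The tea_flavors vector, by contrast, is built from quoted strings, so it holds text rather than the numeric variables.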
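To make the difference between variables and quoted strings concrete, here is a small sketch that reuses the variables assigned earlier in the post. The names() step at the end is an optional extra I am assuming for illustration; it is not part of the lesson.

```r
# Individual prices, as assigned earlier in the post
chai <- 5
matcha <- 4
black <- 2
green <- 2
white <- 3

# Build the vector from the variables instead of typing the numbers again
tea_prices <- c(chai, matcha, black, green, white)

# Quoted values are character strings, so this is a character vector
tea_flavors <- c("chai", "matcha", "black", "green", "white")

# Optional extra (an assumption, not from the lesson): label each price with its flavor
names(tea_prices) <- tea_flavors

print(tea_prices)
print(class(tea_flavors))
```

Building tea_prices from the variables gives the same numbers as typing c(5, 4, 2, 2, 3) by hand.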
R has built-in functions. Some of them include:
mean() – average of values in vector
sum() – sum of values in vector
length() – total number of elements in vector
min() – smallest value in vector
max() – largest value in vector
These functions let you quickly operate across all the values in a vector. For example, if I wanted to know the average of tea_prices, I would pass the vector to mean() and store the result in a variable, with mean(tea_prices) on the right-hand side of the assignment.
If I wanted to know how many elements were in tea_flavors, I would write length(tea_flavors).
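As a rough sketch of these built-in functions in action, the snippet below applies each of them to the two tea vectors. The result variable names (avg_price, total_price and so on) are hypothetical, chosen only for this example.

```r
tea_prices <- c(5, 4, 2, 2, 3)
tea_flavors <- c("chai", "matcha", "black", "green", "white")

# Result variable names below are hypothetical, picked for this example
avg_price   <- mean(tea_prices)     # average price: 3.2
total_price <- sum(tea_prices)      # total of all prices: 16
n_flavors   <- length(tea_flavors)  # number of elements: 5
cheapest    <- min(tea_prices)      # smallest price: 2
priciest    <- max(tea_prices)      # largest price: 5

print(avg_price)
print(n_flavors)
```

Each function takes the whole vector as its argument, so there is no need to loop over the individual values.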
Quick Note: These screenshots are from RStudio, an IDE for R that I’ll discuss in a future post.
I mentioned takeaways a bit earlier in the post. At the end of each lesson, DataQuest gives you a takeaway, a summary of what you learned in the lesson. It can be downloaded as a PDF, and it’s great for reviewing concepts and syntax. It also gives you links to learn more about topics in the lesson. I can access my takeaways on my computer or my phone, and I read them frequently. They have been really helpful whenever I forget the syntax rules.
And with that said, this wraps up intro to programming in R! I’ll get more into vectors in my next post.