I was talking with [name redacted] about a side project the other day and he said “OMG you’re still using *gnuplot*?” So I figured I’d better get with the program and learn some R.
Luckily for me, Portland has an R users’ group, and they held a hackathon/workshop last week, newbies welcome. They’re a good group of people & I heartily recommend the workshop night. Special thanks to Homer for the personalized help and suggestions.
Prior to the meeting, I installed R (using the instructions here).
I wanted to graph some old running data since I had a good idea of what it was supposed to look like. Here’s how I ended up doing it. (NOTE: these are the results of me reading online tutorials, floundering around, & then asking for help from the workshop mentors; there are more efficient ways to accomplish some of these tasks.)
Excerpt of my data file (date, distance, pace, kCal):
Fire up the R shell:
Then I loaded my file:
> mydata <- scan("miles_individual.dat", sep=",", what=list(date="", distance=0, pace="0:00", kcal=0))
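To try that call without the original file, here is a minimal, self-contained sketch. The four rows are reconstructed from the first values in the str() output further down; the exact layout of the real file (spacing, number formatting, no header row) is an assumption, and sample_file is just a temporary stand-in for miles_individual.dat.

```r
# Stand-in for miles_individual.dat: four rows rebuilt from the str() output
# in this post. The real file's exact formatting is an assumption.
sample_file <- tempfile(fileext = ".dat")
writeLines(c(
  "01-Jan-2009,2.36,9:51,200",
  "04-Jan-2009,6.4,10:45,200",
  "06-Jan-2009,2.77,11:00,200",
  "08-Jan-2009,3.59,10:30,200"
), sample_file)

# 'what' is a template: each element's name becomes a field name, and its
# type ("" or "0:00" = character, 0 = numeric) tells scan() how to read it.
mydata <- scan(sample_file, sep = ",",
               what = list(date = "", distance = 0, pace = "0:00", kcal = 0))

length(mydata)    # one list element per column
mydata$distance   # numeric vector of distances
```

Because the template is a list, scan() hands back a list with one vector per field rather than a table of rows.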
And messed around with these commands to see what I had:
 "date" "distance" "pace" "kcal"
List of 4
$ date : chr [1:110] "01-Jan-2009" "04-Jan-2009" "06-Jan-2009" "08-Jan-2009" ...
$ distance: num [1:110] 2.36 6.4 2.77 3.59 4.49 1.94 2.16 2.39 0.94 2.51 ...
$ pace : chr [1:110] "9:51" "10:45" "11:00" "10:30" ...
$ kcal : num [1:110] 200 200 200 200 200 182 203 225 89 236 ...
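Output like this is what names() and str() print for a list, so those are presumably the commands in question. A minimal sketch on a four-row stand-in built from the values shown above (the real list has 110 entries per field):

```r
# Four-row stand-in for the scanned list (values taken from the output above)
mydata <- list(
  date     = c("01-Jan-2009", "04-Jan-2009", "06-Jan-2009", "08-Jan-2009"),
  distance = c(2.36, 6.4, 2.77, 3.59),
  pace     = c("9:51", "10:45", "11:00", "10:30"),
  kcal     = c(200, 200, 200, 200)
)

names(mydata)   # the field names
str(mydata)     # compact summary: type, length, and first few values per field
class(mydata)   # "list", not yet a data frame
```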
Typing mydata by itself prints every value I scanned in (the output is too long to show here).
Note that mydata$date is “chr”, which I’m guessing is “character”. Let’s convert those to actual dates:
> mydata$date = strptime(mydata$date, "%d-%b-%Y")
and compare that to what I had before:
POSIXlt[1:110], format: "2009-01-01" "2009-01-04" "2009-01-06" "2009-01-08" ...
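That line is what str() prints for the converted field. A small sketch of the conversion on four of the dates. Note that %b matches month abbreviations in the current locale, so this assumes an English (or C) locale; the tz = "UTC" argument is an addition here, to keep the result independent of the machine's time zone.

```r
dates_chr <- c("01-Jan-2009", "04-Jan-2009", "06-Jan-2009", "08-Jan-2009")

# %d = day of month, %b = abbreviated month name (locale-dependent),
# %Y = four-digit year
dates <- strptime(dates_chr, "%d-%b-%Y", tz = "UTC")

str(dates_chr)        # character vector before conversion
str(dates)            # POSIXlt date-times after conversion
dates[4] - dates[1]   # real dates support arithmetic: a 7-day difference
```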
Unfortunately, using scan the way I did left me with my values in a list (which is, of course, just what I told it to do); I needed to convert them to a “data frame” so I could graph them.
> mydata = as.data.frame(mydata)
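As noted earlier, this is not the most efficient route. One more direct option, sketched here as an assumption rather than what was done at the workshop, is read.csv(), which returns a data frame straight away. The sample file, its four rows, and the lack of a header row are stand-ins reconstructed from the output above.

```r
# Temporary stand-in for the real data file (layout is an assumption)
sample_file <- tempfile(fileext = ".dat")
writeLines(c(
  "01-Jan-2009,2.36,9:51,200",
  "04-Jan-2009,6.4,10:45,200",
  "06-Jan-2009,2.77,11:00,200",
  "08-Jan-2009,3.59,10:30,200"
), sample_file)

# header = FALSE because the sample has no header row; col.names names the columns.
runs <- read.csv(sample_file, header = FALSE,
                 col.names = c("date", "distance", "pace", "kcal"),
                 stringsAsFactors = FALSE)

# as.Date() gives a plain calendar date, enough when there is no time of day.
runs$date <- as.Date(runs$date, format = "%d-%b-%Y")

str(runs)
```

The result can go straight into ggplot() with no list-to-data-frame conversion.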
Two more steps & I’m ready to graph:
Load the ggplot2 library:
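Attaching an installed package is done with library(). A minimal sketch, assuming ggplot2 still needs to be installed once from CRAN:

```r
# One-time install, if ggplot2 is not on this machine yet:
# install.packages("ggplot2")

# Attach the package so ggplot(), aes(), geom_line(), etc. are available
library(ggplot2)
```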
Prep the graph:
This pulled up a separate window with an empty graph.
I graphed distance first:
> ggplot(mydata, aes(x = date, y = distance)) + geom_line()
“aes” is short for “aesthetic”: it tells ggplot which columns of the data drive which visual properties of the graph. I assigned date to the x-axis and distance to the y, and then specified a line to join the points. (Try leaving off the geom_line() specification and you’ll get a “No layers in plot” error; newer versions of ggplot2 draw an empty plot instead.)
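To see why the geom matters, it can help to look at the plot object itself. This is a rough sketch on a four-row stand-in for the running data. It assumes the plot object exposes its layers as `$layers`, which holds in the ggplot2 versions I know of but is an internal detail rather than a documented guarantee.

```r
library(ggplot2)

# Four-row stand-in for the real data frame
runs <- data.frame(
  date     = as.Date(c("2009-01-01", "2009-01-04", "2009-01-06", "2009-01-08")),
  distance = c(2.36, 6.4, 2.77, 3.59)
)

# The base object holds the data and the aesthetic mappings, but nothing to draw yet.
base <- ggplot(runs, aes(x = date, y = distance))
length(base$layers)        # 0 layers

# Each geom_*() call adds one layer that inherits the mappings from the base plot.
with_line <- base + geom_line()
length(with_line$layers)   # 1 layer

both <- base + geom_line() + geom_point()
length(both$layers)        # 2 layers
```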
Then I tried a scatter representation:
> ggplot(mydata, aes(x = date, y = distance)) + geom_point()
Then, at Homer’s suggestion, I got fancy with it. I added a point to the graph indicating kcal, and made the size of that point reflect the value of the kcal:
> ggplot(mydata, aes(x = date, y = distance)) + geom_line() + geom_point(aes(size = kcal))
Save the graph like so:
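ggplot2’s ggsave() writes a plot to a file and picks the format from the file extension. A minimal sketch; the file name, the dimensions, and the four-row stand-in data are assumptions, not the original command:

```r
library(ggplot2)

# Four-row stand-in for the real data frame
runs <- data.frame(
  date     = as.Date(c("2009-01-01", "2009-01-04", "2009-01-06", "2009-01-08")),
  distance = c(2.36, 6.4, 2.77, 3.59),
  kcal     = c(200, 200, 200, 200)
)

p <- ggplot(runs, aes(x = date, y = distance)) +
  geom_line() +
  geom_point(aes(size = kcal))

# Hypothetical output path; width and height are in inches by default.
out_file <- file.path(tempdir(), "distance.png")
ggsave(out_file, plot = p, width = 8, height = 5)
```

Passing the plot explicitly avoids relying on ggsave()’s default of saving whichever plot was displayed last.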
1 – comment: “Oh wow, I’ve never used scan that way before.” So, not the recommended method.
2 – This made me so excited.