
Quick tip for R: How to save your dataset in a native R format for future work

This is a note to self more than anything else, but maybe someone out there who is learning R will find it useful, too.

I lost some time recently because I kept running analyses in R and saved the results only as plots and CSV files. As I’m on a budget MacBook with limited memory, I can’t keep many results loaded in R (it keeps all objects in memory). So if I want to go back and change a plot, for example to adjust its dimensions, add a title, or filter the data that goes into a subplot, I have to rerun the analysis.

Saving the results in a CSV file is good for future reference, but it doesn’t solve the problem: as far as I can tell, the original R object can’t easily be recreated from it. It seems far easier to save the actual R object in an R format.
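To illustrate why a CSV is not a full substitute, here is a minimal sketch. The data frame, its column names and the temporary file paths are all made up for the example; the point is only that a CSV round trip stores plain text, so column types such as dates and a custom factor level order have to be rebuilt by hand, whereas an RDS round trip gives back the same object.

```r
# Hypothetical results table, for illustration only
results <- data.frame(
  group = factor(c("a", "b", "a"), levels = c("b", "a")),  # custom level order
  date  = as.Date(c("2017-01-01", "2017-01-02", "2017-01-03")),
  score = c(1.5, 2.5, 3.5)
)

# Temporary files keep the example self-contained
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

write.csv(results, csv_path, row.names = FALSE)
saveRDS(results, rds_path)

from_csv <- read.csv(csv_path)
from_rds <- readRDS(rds_path)

# The CSV version has lost the Date class and the custom level order
date_survived_csv   <- inherits(from_csv$date, "Date")
levels_survived_csv <- identical(levels(from_csv$group), levels(results$group))

# The RDS version is the same object as before
rds_is_identical <- identical(results, from_rds)
```

You can of course convert the columns back after reading the CSV, but that is exactly the kind of manual step the RDS file spares you.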

In fact, my ‘statistics and programming colleague’ has been providing such R files in the ‘RDS format’ for our project. That saves me the time of running the analyses myself while still giving me the chance to select my own subsets for plots. I’m a bit gutted that I didn’t realise the potential of this function for my own work until today. (I am now having to rerun the results in order to create nicer plots. Then again, archiving the results in an R format is better than keeping them only as CSV, I suppose, because things do change or I might find mistakes in my methodology later.)

In order to create an RDS file you use this function (the signature is from the R documentation; for me it’s usually sufficient to simply name the object and the file path):

saveRDS(object, file = "", ascii = FALSE, version = NULL,
        compress = TRUE, refhook = NULL)
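In everyday use, the call can be as short as the object plus a file path, and ‘readRDS()’ is the counterpart that reads the file back in. The sketch below assumes a small made-up data frame called ‘results’ and a hypothetical file name in a temporary directory; in real work you would use your own object and a path inside your project.

```r
# 'results' is a stand-in for whatever object you want to keep
results <- data.frame(
  word = c("fog", "fire", "street"),
  freq = c(12L, 7L, 30L)
)

# Hypothetical file name; tempdir() keeps the example self-contained
path <- file.path(tempdir(), "results.rds")

# Save the object: only the object and the file path are needed
saveRDS(results, file = path)

# Later, e.g. in a fresh R session, read it back and assign it to a name
restored <- readRDS(path)

# The restored object can be subset for a new plot without rerunning anything
frequent <- subset(restored, freq > 10)
```

The remaining arguments (‘ascii’, ‘version’, ‘compress’, ‘refhook’) all have defaults, so they can be left alone unless you have a specific reason to change them.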

For the technical details you can refer to the R documentation linked above or this post that explains the difference between ‘saveRDS()’ and ‘save()’ in more detail. In a nutshell, ‘save()’ stores the object together with its name, and ‘load()’ restores it under that same name. So, if my original results were called ‘results’ and I had meanwhile created another object called ‘results’, loading the saved version would overwrite the newer one. ‘saveRDS()’ stores only the object itself, and ‘readRDS()’ returns it so that I can assign it to any name I like. With ‘saveRDS()’ we don’t have this problem.
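The name clash can be reproduced in a few lines. This is only a sketch with toy vectors and temporary files; the object names ‘old_results’, ‘still_new’ and ‘overwritten’ are made up for the example.

```r
results <- c(1, 2, 3)                  # the original results

rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")

save(results, file = rdata_path)       # stores the object AND its name
saveRDS(results, file = rds_path)      # stores only the object

results <- c(10, 20, 30)               # a newer object that reuses the name

# readRDS() returns the object, so it can go under any name
old_results <- readRDS(rds_path)
still_new   <- identical(results, c(10, 20, 30))   # newer 'results' untouched

# load() restores the object under its saved name and replaces the newer one
load(rdata_path)
overwritten <- identical(results, c(1, 2, 3))
```

If you do need ‘save()’, for instance to store several objects in one file, loading into a separate environment is one way around the clash, but for a single results object ‘saveRDS()’ avoids the question altogether.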

Hopefully, this post can be of use to some of you (obviously check what’s most helpful for your work). I’ll start saving all my important R results in this format 🙂


					

Author:

I am a research fellow on the CLiC Dickens project at the Centre for Corpus Research, University of Birmingham. My research interests focus on the use of corpus linguistic tools to identify meaning in texts. In the CLiC Dickens project we develop and use methods to study the language of literary texts, particularly in Dickens’s and other 19th century fiction. My PhD research seeks to understand connections in discourse through a corpus linguistic approach. Specifically, I study how the concept of surveillance is represented in different types of texts. This blog reflects my personal opinions and not those of my employers.
