A practical one: Steps for installing WordSmith Tools on a Mac

I wrote this post 1.5 months ago, in late September 2015. Now that some time has passed and I have played around with WordSmith and Windows on my Mac I think I’m ready to post it.


I have decided to put something relatively practical down today – compared to my previous posts, which were more generally about feelings related to the PhD. I’m about to start the 2nd year of my PhD (until 1 October I like to take advantage of the ‘1st year status’, though) and therefore things must get more practical. There’s still reason to talk about feelings, the nature of academia and a PhD. Yet, at the moment my feelings are actually somewhat dominated by the need to get something practical done. In corpus linguistics practical tasks often have a technical aspect.

My kitschy mac decoration; sorry for the imprecise application!

In early 2013, at the beginning of my final BA semester, I bought a Macbook, because … my relatively cheap Asus laptop had badly crashed twice, requiring a new hard disk (ok, I poured coffee over it…), was generally getting slow and had some pink and turquoise stripes on the display. At that point I was mainly thinking about my final year project, which I would have to submit in May. Back then I didn’t realise that corpus linguistics, which I had already studied in a BA module, would also become the major focus of my MA and my PhD, and that a Macbook might not be the greatest choice for that. [Please feel free to criticise this idea].

The reason that having a Mac is tricky for corpus linguistics is that one of the most popular software packages, WordSmith Tools (WS), does not natively run on a Mac. There are many other options, specifically the freeware AntConc which runs on basically any operating system. [I recently learned about a new tool called corpkit which so far seems a Mac/Linux exclusive though!] Many corpora are also accessible from the web – such as the COCA, the BNC, … If you want to build your own corpus, however, you likely need to have a tool on your own computer (unless you can convince the developers of a system like CQPweb to host it for you). Of course there are more techy options like using programming environments such as R or Python for corpus linguistic analysis. Because of some of the functions available in WS and the fact that my undergraduate and postgraduate corpus linguistics modules were based on this software I still like to use it for some tasks.

Since I had regular access to a campus-based Windows desktop in the first year of the PhD, I avoided the issue of installing WS on my Mac. Now I might need to do more work from my home office, so the question has popped up. I had heard that you need to install Windows in a virtual environment on your Mac by installing either Parallels or VMware. Each of them costs approximately £70, I believe. Add that to the cost of a Windows licence and the effort of installing it all, and I wasn’t too excited. When I did some research I learned about Oracle’s VirtualBox, which seems to work as well but is free. Disclaimer: I don’t know what the potential disadvantages are of installing WS via the free VirtualBox rather than a paid-for virtual environment! (Anyone?) I also once tried to circumvent the step of installing the Windows OS by using the tool WineBottler, which allows you to pretend to your Mac that the Windows programme you want to use is actually in a Mac format. This didn’t work for WS and there wasn’t any support available for this case, probably because corpus tools are not very widely used in comparison to other software (I suppose only linguists, other academics, and some language teachers know about them…).

So here are the steps that I followed for installing WS in VirtualBox on my Mac:

  1. Download VirtualBox (Oracle, available for free) + its extension pack (I thought this was what allows shared folders between your Mac and the virtual OS, but as far as I know that feature actually comes from the Guest Additions you install inside Windows – see this video at 22.30 for a guide to setting up a shared folder)
  2. Install VirtualBox + extension pack
  3. Buy a Windows licence (I decided on Windows 7, because that’s the last one I’m familiar with) from a software website & download the operating system (iso file) from there – I used a German site, but I’m sure there are English options available
  4. Install Windows inside a new virtual machine in VirtualBox. I basically followed the directions in video 1 and video 2. (I settled on 2GB RAM because I have 4GB, 2 CPUs because I have 4, and 20GB of dynamically allocated disk space.)
    The option of setting up shared folders to access the same files from the Mac OS and the new Windows OS is explained in video 3 (minute…)
  5. Install the latest version of WS from Mike Scott’s website – you will need a valid licence key, which you can purchase from the same site (but if you are a research student it might be worth checking with your university whether they can provide you with one)
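
For anyone who prefers the Terminal to clicking through the videos, the settings from step 4 can presumably also be applied with VirtualBox's command-line tool, VBoxManage. The following is only a rough sketch that I have not run on this setup: it builds the commands and prints them instead of executing anything. The VM name, file names and shared-folder path are made up for illustration, and the `Windows7_64` OS type assumes a 64-bit Windows 7 image (on older VirtualBox versions `createmedium disk` was called `createhd`).

```python
import shlex


def vbox_commands(vm_name, disk_path, iso_path, share_path,
                  memory_mb=2048, cpus=2, disk_mb=20480):
    """Return VBoxManage invocations as argument lists; nothing is executed."""
    return [
        # Create and register an empty VM (OS type is an assumption: 64-bit Windows 7).
        ["VBoxManage", "createvm", "--name", vm_name,
         "--ostype", "Windows7_64", "--register"],
        # 2GB RAM and 2 CPUs, as in step 4.
        ["VBoxManage", "modifyvm", vm_name,
         "--memory", str(memory_mb), "--cpus", str(cpus)],
        # 20GB disk; the "Standard" variant is dynamically allocated.
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path,
         "--size", str(disk_mb), "--variant", "Standard"],
        # Add a SATA controller, then attach the disk and the Windows installer iso.
        ["VBoxManage", "storagectl", vm_name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", vm_name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", disk_path],
        ["VBoxManage", "storageattach", vm_name, "--storagectl", "SATA",
         "--port", "1", "--device", "0", "--type", "dvddrive", "--medium", iso_path],
        # Shared folder between the Mac and the Windows guest.
        ["VBoxManage", "sharedfolder", "add", vm_name, "--name", "shared",
         "--hostpath", share_path, "--automount"],
    ]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    for cmd in vbox_commands("Win7-WordSmith", "Win7-WordSmith.vdi",
                             "windows7.iso", "/Users/me/corpora"):
        print(shlex.join(cmd))
```

Printing the commands first means you can read them through (and compare them with what the videos do in the graphical interface) before pasting any of them into a Terminal.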

The software runs a bit more hesitantly than on my previous university PC, but it does show results. What are people’s experiences with Parallels/VMware? For those, do you also need to allocate a certain percentage of your Macbook’s RAM, CPU and storage to the virtual machine? How much?
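
The allocation I chose in step 4 boils down to giving the virtual machine half of the Mac's RAM and half of its CPUs. As a tiny sketch of that rule of thumb (it is just my own heuristic, not an official recommendation from any of the tools, and the function name is made up):

```python
def half_of_host(host_ram_gb, host_cpus):
    """Suggest VM settings as half of the host's RAM (in MB) and CPUs, at least 1 each."""
    return {
        "memory_mb": max(1, host_ram_gb * 1024 // 2),
        "cpus": max(1, host_cpus // 2),
    }


if __name__ == "__main__":
    # My Macbook: 4GB RAM, 4 CPUs.
    print(half_of_host(4, 4))
```

Whether half is the right share probably depends on what else needs to run on the Mac side at the same time.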


November update:

Having used WS on my Mac multiple times now over the course of 1.5 months, I’d say it works alright. I can open files and also create keyword lists or concordances without major problems. However, I always have to be careful that I don’t select items or click on buttons too quickly. For example, when I ‘choose texts’ for one of the tools it’s dangerous to hold down the shift key and the downward arrow – usually this makes the whole application freeze and I have to kill it. It’s also better not to have too many other programmes running at the same time (including on the Mac OS side). This might be a problem of my own computer, though. It was bought on a student budget and is therefore one of the slowest Macbook options from 2012.

One issue that came up regarding Windows is that I forgot to activate it at the beginning (although I had a key! – it didn’t force me to, though…). So last week the Windows screen turned black and I got all blamed and shamed by the operating system (this copy is not genuine!). Unfortunately, when I tried activating it, this didn’t work – the system said I was trying to use a key for the wrong computer. I think this is probably due to confusion caused by the virtual environment. After many stressful attempts at getting through to the Microsoft UK customer service hotline I finally got to talk to a human (!) customer service operator who helped me to manually activate my Windows 7…



I am a research fellow on the CLiC Dickens project at the Centre for Corpus Research, University of Birmingham. My research interests focus on the use of corpus linguistic tools to identify meaning in texts. In the CLiC Dickens project we develop and use methods to study the language of literary texts, particularly in Dickens’s and other 19th century fiction. My PhD research seeks to understand connections in discourse through a corpus linguistic approach. Specifically, I study how the concept of surveillance is represented in different types of texts. This blog reflects my personal opinions and not those of my employers.
