Posted in academia, Conferences/events, corpus linguistics

Free teacher workshop: corpus stylistics for the English classroom

I have recently started working as a Research Fellow on the CLiC Dickens project at the Centre for Corpus Research, University of Birmingham. The main focus of this project is the custom-developed CLiC web app, which allows users to apply corpus tools – i.e. search, concordance, find clusters (repeated phrases) etc. – to Dickens’s novels and other 19th century fiction.
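For readers who haven't met these tools before, here is a minimal Python sketch of what 'concordance' and 'clusters' mean in practice. It is not how CLiC itself is implemented; the function names (`tokenize`, `concordance`, `clusters`), the very simple tokenisation rule and the sample sentences are all my own assumptions for illustration (the sample is made up, not a quotation from Dickens).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def clusters(tokens, n=3, min_freq=2):
    """Count n-word sequences and keep those that occur at least `min_freq` times."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_freq}


# Made-up sample text (an assumption for illustration, not corpus data).
sample = ("He put his hands in his pockets. "
          "She put her hands in her lap, and he kept his hands in his pockets.")
tokens = tokenize(sample)

# Print the hits in keyword-in-context (KWIC) layout.
for left, node, right in concordance(tokens, "hands"):
    print(f"{left:>15} | {node} | {right}")

# Repeated three-word phrases in the sample.
print(clusters(tokens))
```

Each concordance line shows the search word with a few words of context on either side, and the cluster function simply counts word sequences that recur. A real tool would need a more careful tokeniser and would work on whole novels rather than two sentences.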

Next week the CLiC Dickens project is hosting a free workshop for English teachers (and those interested in/researching teaching methods for literature): ‘Corpus stylistics for the English classroom’ at the University of Birmingham on June 16, 2017. If you’re interested, please do check the event link. Registration is easy and free via email (to me) and refreshments will be provided :).

You can also check out some of the CLiC functionality in this recent video tutorial that introduces the CLiC KWICGrouper: a new approach to sorting concordances! (Read my previous blog post for more information on reading, sorting and analysing concordances.)

As the CLiC Dickens project is about corpus linguistics and meaning, the work is pretty ‘close to home’ (it’s also physically in the same department) in terms of my previous work. At the same time, there are some new directions in it for me: corpus stylistics is concerned with meaningful patterns in literature (mainly, anyway) and this is quite different from my PhD research, which looks at non-fiction (academic writing, blog posts and newspaper articles). Moreover, the CLiC project combines its corpus stylistic approach with ‘cognitive poetics’, which is another really exciting direction.

Posted in academia, Conferences/events

University of Birmingham Corpus Linguistics Summer School

This week (20 – 24 June 2016) a corpus linguistics summer school took place at the University of Birmingham Centre for Corpus Research. I was fortunate to be involved in the event.
The schedule was tight, but it seems to have been well worth it, as these tweets from participants suggest:
The full virtual Twitter conversation from throughout the week can be found under the hashtag #ccrss16.
Topics ranged from multiple facets of corpus statistics and their applications in R to Sinclairian lexical items, corpus stylistics and translation studies, specialised corpora and an introduction to Python for corpus linguists. The workshops and talks were held by Johan de Joode, Stefan Evert, Chris Fallaize, Matt Gee, Stefan Th. Gries, Nicholas Groom, Susan Hunston, Andrew Kehoe, Michaela Mahlberg, Lorenzo Mastropierro, Florent Perek, Simon Preston, Pablo Ruano, Adam Schembri, Paul Thompson and me. While most of us are based at UoB, it was great to have colleagues from other institutions and even from abroad join us to share their expertise.
My own session was inspired by a talk from Mark Davies at the ICAME 37 conference (Chinese University of Hong Kong, May 2016), where he demoed the new ‘virtual corpus’ feature on the BYU corpus interface. [Click on the links for the PDF versions of my presentation slides and the handout of my session.]
Personally I enjoyed this week of intense exposure to different aspects of corpus linguistics. Full-week events like conferences and summer schools can be quite draining as you have to be ‘always on’, responding to new content and people. However, the learning hopefully makes up for that.
Posted in academia, academic writing, PhD

The joy of moving on to the next chapter

I’m very happy to share the news about moving on from my first analysis chapter (Chapter 4 in the thesis). On January 31 I was already sharing my frustration about writing this chapter and now, exactly two months later, I finally have a full draft. Actually, I’ve been sitting on this draft for a while with only a few paragraphs that needed reworking or were still in the shape of bullet points. In the meantime the text has been part of various different documents/files. The screenshot here displays the metadata of the current file. I know that at ~17,000 words it’s too long for the final chapter. Now this number includes tables that I might shorten/delete/move to the appendix in the final thesis. The document also has a rather long background and methodological section which I might have to move to the background and methodology chapters of my thesis at a later stage.

[Screenshot, 2016-03-31: metadata of the current draft file]

For now, though, I’m just really happy that I was psychologically able to call it a ‘full draft’. This means I sent it to a friend today who will have a look at it and give me some comments. She’s also a linguist, but works in a different subfield. I need some distancing from this text and – as I’ve been feeling quite insecure – either some confirmation that it is an okay text or some advice on what is needed to clarify things a bit. I won’t go back to this until late April or early May, though.

I think that having worked on this chapter or preparatory stages for it since September has been too long an intense period of thinking about this particular aspect of my PhD. My supervisor has been urging me to move on and today I finally felt ready to let it go. I know that it’s nowhere near the shape that I need it in for my final thesis. Some references probably aren’t as relevant as I first thought and others are lacking. The argumentation may not be clear enough. But I am moving on to the next stage of my analysis where I’m applying the same method to a different dataset. I am sure this will also give me more ideas for the analysis of the first corpus.

Best of all, I can feel some enthusiasm again! Have you felt tired about any of your chapters? Did it help to move on to something new and return to the work after a couple of weeks? Or have you found it most useful to fully finish one chapter/study before starting something else?

Posted in academic writing, PhD, Uncategorized

Little cartoon sharing at the end of the leap day (just for fun)

Everyone loves phdcomics, right? They even get included in Grad School workshop presentations…

Lately I’ve come to admire another source of grad student/ academic comments though: Have a look at A Prolific Source by Belle Kim, will you? I think you might enjoy it 🙂

Belle Kim’s cartoons are just lovely and they often strike a chord with me. I also like her approach that drawing can help you stay sane. It made me want to start, too. Now here is a very poor first draft. (I HAVE drawn other stuff recently but it’s too cute and non-academic; Chinese-style stickers from WeChat… and I have also jumped on that colouring book bandwagon). Anyway, not trying to do anything professional here, hence also just a cellphone picture, no scan. A really quick drawing to share an anecdote from the end of my leap day.


[By the way, it’s March now! oO *ahhhhhh* *heeeeeelp*]


Posted in academia, academic writing

Flying (and floating) like a kite


Just some quick sharing today. First of all I’d like to thank everyone who read and commented on yesterday’s post on my feelings related to writing the first analysis chapter. It really feels great to hear back from people who have been through this already or are going through the same sort of thing.

So far I still feel a bit lost – and today some other annoying bits like problems with technology and bureaucracy were added to my plate. It doesn’t help, either, that I’ve some other deadline coming up … in theory it’s all very exciting, only right now it doesn’t seem to be working quite ideally just yet. But I’ll try to hang in there and follow everyone’s advice to just try and get something ‘down’.

For now I just wanted to share this silly little drawing. I mentioned this simile to a friend recently (who is also a PhD student) and we got some fun out of it. We sometimes really feel like we’re flying (or floating) in the wind, sometimes way too far in one direction (or so it seems). Then at some point our supervisors may try to pull us back. At the moment I can feel lots of forces pulling on my line. But I do hope that something will pull me back to more familiar heights or grounds so that I’ll feel more comfortable soon. If you can relate, I hope you’ll feel that soon as well. Or perhaps you’ve already gotten into this kite thing – in that case happy flying :)!!!

Posted in academia, Misc

Of(f) time.

In this post I make use of the highly praised (at least in a writing seminar I attended last year) method of ‘free writing’. I’m afraid that if I start with a proper draft that needs to be (re- and re-) edited, the momentum for this post will get lost in procrastination.

As you may have noticed, there has been a real down-time on this blog. Over summer. In fact I don’t even know how it’s possible that this summer has already passed.

HOW has it passed? Personally, there were events that I attended: a 4-day summer school, and a 1-week conference, both in July. After my June post I thought I’d write a really cool post about the exciting month of July talking about these two. Then August came, I moved to another city, and then flew to my home country to visit family. Now it’s September (actually approaching mid-September…). I had made writing plans for the time after the conference. Now I don’t want to say much about it, only that it didn’t go quite like I planned. But maybe that’s okay?!

The lovely decoration at my relative’s wedding. Perhaps a little break from the PhD was good for my PhD love story?

One of the main reasons for me to go home for quite a while was my relative’s wedding and the accompanying hen do. They were three weeks apart and I wanted to attend both. I really enjoyed being back in the social circle of my childhood and teenage years, as I have now lived abroad for 5 years. Sometimes I feel a bit lost as to where I belong and what I’m going to do/ where I’m going to be. Visiting my family is then like a grounding experience because many of my relatives enjoy stable jobs/careers/families and places to live (i.e. they haven’t really moved ever or perhaps only once or twice).

It feels good that I gained some distance from my every-day PhD experience by being reintegrated in this family setting. [And friends! I met many of them, spent an entire 4 days at my friend’s place in Hamburg! I also went to the gym! :)] Inevitably, perhaps, some of the ambitious plans for the break got stuck at the procrastination stage. At the same time, procrastinating during my planned tasks led to progress in other areas: I completed my switch from EndNote to Zotero due to the need to collaborate with colleagues. Browsing the web for suggestions, I learned about the possibility of storing my journal articles in a local Dropbox folder and linking them to the Zotero entries. How cool is that? When I double-click on a Zotero item, the PDF pops up, but rather than being stored in some weird, automatically generated, individual Zotero folder, the PDFs are all in my ‘PhD reading’ folder.

I also finished a MOOC on R (thanks for the tip by my friend @kimsuekreischer): Microsoft: DAT204x Introduction to R (I think it will re-run in a few weeks’ time; not sure whether the URL stays the same or not). It will be too basic for the pros among you, but I was quite happy with its interactive assignments, which I was actually able to follow. I have also had a try at another R book, this time not by Stefan Gries (see my previous post), but by Matthew Jockers. The writing style is really refreshing and there are many very interesting ideas for analysing texts, with a focus on their meaning. I still have to get used to the perspective/terminology, though, which seems more text mining or NLP-minded than corpus linguistic. While following the MOOC and reading Jockers’ book I have often felt that R is something I can actually learn (apart from all the moments when I mistyped the code and couldn’t find the typo >< !) But what is still a huge challenge to me is how to move from following the instructions to actually designing my own code! This is where some more procrastination developed because I had been hoping to use R for cleaning my data… but… that’s still a little tricky.

In other news, I have sort of followed the #survivephd15 MOOC by @thesiswhisperer and team. Are some of you also participating? I haven’t gotten as involved as many of the other participants, but I am quite happy about this opportunity of dialogue among PhD students. I’m looking forward to topics in the later weeks, as it’s still at the introductory stage (history of the doctorate this past week).

Now there are just 3-4 more days until I fly back to my PhD life. I’ll miss my home, family and friends when I’m back abroad. But it’s good that I’m almost feeling like I have an itch to start again and continue my PhD relationship ;). Incidentally, today is the anniversary of my MA dissertation submission. Time is a strange phenomenon.

Posted in academia, programming

Trying to take up a coding mindset (as a linguist)

I originally tweeted this photo on the morning when my second R book by Stefan Gries arrived at the same time as the Bloomberg code issue… was that a sign? I haven’t gotten around to starting the book yet though, as I am still working through the first one!
I am currently attempting to learn something about the programming language R. Why? Is that even a good idea?

At a few points during the past few years I have considered whether I chose the wrong degree(s). My BA degree was called “English Studies for the Professions (BAESP)” and I really enjoyed it and found everything interesting. At the same time I wanted to get more involved with research and see how linguistics can get really useful. So I moved on to an MA in Applied Linguistics and finally a PhD in the same field. I am really interested in linguistics and think it is a worthwhile area. BUT at times I wonder “Why didn’t I study computational linguistics?” Since my research deals with corpus linguistics this is actually not such a stretch. The problem is that I don’t seem to have a computational mindset… So far the only type of computational stuff that I can more or less deal with is interactive. During the MA we did some work with the statistical package SPSS which used to be command-driven but now has an interactive interface. For corpus linguistic analyses I have used WordSmith Tools, AntConc and SketchEngine, which are all more or less user-friendly. If anything I get confused by too many buttons and settings that are offered.

When and how did I decide to do something about my non-computational situation?
I have been playing with the thought of getting a little bit more tech-savvy (and at the same time brushing up on my understanding of statistics) for a year or so. Throughout my studies I have simply come across so many studies where people do more interesting stuff than I seem to be able to do because I don’t know how to make something like that happen. An example is a Twitter study that I already quoted in my BA project (which was also about Twitter). For my own project I used an online tool (at the time it was called TAGS v3; now there is TAGS v6) to collect a limited number of Tweets, leading to a small corpus. Michele Zappavigna (@SMLinguist), in her book Discourse of Twitter and Social Media, however, had access to the infrastructure and support necessary for downloading and compiling a large Twitter corpus containing over 100 million Tweets. She used a Python script and the Twitter API. At that time I thought that I was never going to be able to either do this myself or have the required technical support. While I still don’t know how to do this, my attitude has changed slightly. I’m lucky to be cooperating with people from statistics and programming for a project coordinated by my supervisor. This regular interdisciplinary contact has taught me there are things that seem infinitely difficult to me but can easily be done by others in a short amount of time with a few lines of code. Moreover, the cooperation is gradually showing what kind of things are actually possible with programming. In the meantime I have been wondering whether or not it is worth investing time and energy (and money I guess) in learning some baby steps in programming when there are so many experts out there? Well, I don’t know, but I am trying to regain some control over my work…

Here are some interesting view points on coders and coding expressed by Paul Ford in that recent Bloomberg code issue:

Coders are people who are willing to work backward to that key press. It takes a certain temperament to page through standards documents, manuals, and documentation and read things like “data fields are transmitted least significant bit first” in the interest of understanding why, when you expected “ü,” you keep getting “?”

[Paul Ford, What Is Code?, Bloomberg Special Double Issue June 15-28, 2015, print p. 24 (digital – free & with really cool animated visualisations! – Section 2.1)]

Regarding the question whether or not to learn coding, Ford says:

There’s likely to be work. But it’s a global industry, and there are thousands of people in India with great degrees. […] I’m happy to have lived through the greatest capital expansion in history, an era in which the entirety of our species began to speak, awkwardly, in digital abstractions, as venture capitalists waddle around like mama birds, dropping blog posts and seed rounds into the mouths of waiting baby bird developers, all of them certain they will grow up to be billionaires. It’s a comedy of ego, made possible by logic gates. I am not smart enough to be rich, but I’m always entertained. I hope you will be, too. Hello, world!

[print pp. 109-112, digital Section 7.5]

Personally, I don’t think I can now start to become ‘a real coder’ and ‘compete’ with all those computer science graduates and other professional coders. BUT, the whole thing seems fascinating and if I know a little bit, some light might be shed on so many areas that are still dark for me.

Why R?
I saw info about the ‘Regression modelling for corpus linguistics’ workshop by the linguist Stefan Gries (held in Lancaster, 20 July) and knew about his books (Quantitative Corpus Linguistics with R – QCLWR – and Statistics for Linguistics with R) so I finally decided to buy them. That’s really the main point for me. [By the way, in the book, Gries argues that R is particularly well-suited for corpus linguistics…] While I know other resources are available, such as MOOCs (I even attempted a MOOC on R but dropped out), I need to see something that’s relevant to my own research (the R MOOC I attempted used data from biology, I believe). Having said that, the MOOC introduced a neat little learning environment called ‘Swirl’ which allows you to “learn R, in R”. I might go back to that at some point. Actually, it’s even hard for me to get through the first 100 pages of Gries’ QCLWR because it’s about the basics with few linguistic applications. But I try to motivate myself to continue by flipping beyond the 100 pages now and then because I can see that soon I’ll (hopefully) be able to apply those basics to linguistic problems (I’m almost at page 96 now – yay!). So if someone had made a book about Python for corpus linguistics (is there one?), I might have gone for that, because I didn’t really know anything about which language is best to know. However, I am looking forward to a session at the Nottingham Summer School in Corpus Linguistics entitled ‘Essential python for corpus linguists’ run by Johan de Joode.

My main problems so far
Unfortunately, I am still lacking the coding mindset, but I hope that will change after working through the second, more applied linguistics part of QCLWR. I haven’t done proper math since high school and this step-wise logical thinking about embedding logical/ regular expressions and loops and variables and whatnot all feels a bit foreign to me. More often than not I can’t follow the examples at first sight (usually because I have missed a parenthesis somewhere…). Just have a look at an example of the lines that I have been trying to work through… (Gries, 2009: 89):

gsub("(\\w+?)(\\W+\\w*?)\\1(\\W)", "\\1\\2\\1\\3", text, perl=T)
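To take a pattern like the one above apart step by step, here is a small Python sketch. As far as I can tell, the pattern looks for a character sequence (group 1) that is followed by non-word characters, optionally by the start of another word (group 2), and then by the same sequence again (the backreference `\1`), ending in a non-word character (group 3). Python's `re` module uses the same Perl-style syntax for these pieces. As printed here, the replacement puts the match back unchanged, so the sketch wraps the repeated sequence in square brackets to make the hits visible. The brackets, the function name `mark_repeats` and the test sentences are my own assumptions, not part of the book's example.

```python
import re

# Same pattern as in the R line; in R the backslashes are doubled inside the string.
#   (\w+?)     group 1: one or more word characters, as few as possible
#   (\W+\w*?)  group 2: non-word characters, then optionally some word characters
#   \1         backreference: the exact text captured by group 1, again
#   (\W)       group 3: a single non-word character
REPEAT = re.compile(r"(\w+?)(\W+\w*?)\1(\W)")


def mark_repeats(text):
    """Wrap both occurrences of a repeated sequence in square brackets."""
    return REPEAT.sub(r"[\1]\2[\1]\3", text)


print(mark_repeats("the the cat"))      # a genuinely repeated word
print(mark_repeats("This is a test."))  # also matches: 'is' at the end of 'This' + 'is'
print(mark_repeats("no repeats here"))  # nothing to mark, text stays as it was
```

The second sentence shows why these patterns need careful reading: group 1 is not anchored to the start of a word, so the 'is' inside 'This' counts as the first half of a repetition.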

Trying to keep track of everything that could be potentially useful in my copy of QCLWR with sticky tags.

I also have difficulties with remembering function names and their argument structures and, worse still, I can’t really follow the R/ RStudio help entries about the functions. The biggest problem is that it takes me ages to go through the tutorial in Gries’ QCLWR. There are still more than a hundred pages left, including masses of exercises and assignments, and the second book (Statistics for Linguistics with R) is still waiting for me… Obviously this is not even the only task I’m supposed to be doing for my PhD at the moment…
On the bright side, though, I am slowly starting to feel more comfortable staring at condensed strings of digits and characters and slowly picking up the ability to analyse a command string step by step. Once something does work it really delights me.

What are your experiences with starting to code? Do you think it’s worthwhile to invest in these skills? Which programming language are you learning and why? [And sorry for turning this into such a long post…]

Posted in academia, academic writing

Nail your colours to the mast!


“Nail your colours to the mast”

Meaning: “To defiantly display one’s opinions and beliefs. Also, to show one’s intention to hold on to those beliefs until the end” (

This blog post is inspired by my recent first year confirmation review. The review was actually a positive, encouraging and also refreshing experience. I received numerous pieces of valuable advice. One point, though, stuck with me most and this was the motto “nail your colours to the mast”. This seemed to be the examiner’s main concern (and I shamelessly quote his saying about the colours here). What’s the actual theoretical approach that the project is based on?

As the author of the report, having spent like half a year on it, I can testify that this has really been the main struggle. In writing my literature review I had spent plenty of time on locating different positions in the literature and identifying potential differences between the different approaches. This was at times a frustrating endeavour, as ever so similar terms were used for frameworks with only very subtle differences. Coping with this diverse, overlapping terminology has been a key issue. Mapping out the different terms and their usage had already cost quite a bit of energy. I understand that I fell short on the next step – evaluating them and picking one for my work or even making up another term (and of course justifying it!).

Whose side am I on? To me as a first year PhD student it is just a scary thought to have to make such a decision. If I take up a specific term (such as, in my case, ‘corpus-assisted discourse analysis’ rather than ‘corpus-based/driven’), do I need to then follow the scholars associated with this term for the rest of my work? Will I contradict myself if I choose one term and at a later stage take a turn with my work that doesn’t really harmonise with the work of the related people? How do I know the implications? Due to some of these daunting questions, I attempted to stay on ‘friendly terms’ with various approaches. This, however, is problematic in itself.

Is it possible to decide on a different set of colours at a later point? (The definition for the colour metaphor quoted above seems to suggest this, but maybe in academia we can allow for more flexibility? Clearly we’re all evolving and learning?) The saying was new to me, but in the context of the confirmation review I understood what was meant – clearly express what you’re trying to achieve and how you are doing that. I think I did that fairly well in my methods section (although at times it needs simplification as I was told) – in the parts that are very practically oriented. I have noticed that I struggle more with attempting to explain the theoretical implications of my work. And I believe this problem is rooted in the fear that I misuse theoretical claims, for instance by combining incompatible approaches.

At the end of the review meeting both my examiner and supervisor emphasised, however, that it is okay (and probably even good!) to keep an open mind throughout the PhD and refine your theoretical standpoint through continuous writing. That made me feel more relieved. Still, I have had to go back to more reading on the terms I wasn’t sure about (this includes the definition of ‘discourse’ – a real can of worms…).

What are your views on this topic? Have you encountered similar difficulties?

Posted in academia, careers

PhD – career in academia?


Whether or not to seek a job in academia is probably a question that most if not all PhD students will consider at some point. I imagine that many actually start their PhD with the motivation to work in academia (whether this motivation stays is probably another question). I’m choosing this career topic today, because I just attended a related training session at the University of Nottingham: ‘Academic careers in Higher Education’. This panel session, hosted by @UoNgradschool, featured four academics and was specifically organised for arts and social science graduates.

The course had the following objectives:

“By the end of the session you will:
1. have an understanding of possible modes of entry into academia
2. know what your next step should be if you are considering working in this sector
3. have had the opportunity to ask questions of professionals working in the sector”

The academics from across the Faculties of Arts and Social Science (Dr Sarah Davison, Dr Andrew Fisher, Dr Cathy Johnson and Dr Andrew Mumford) all shared their interesting career paths in academia along with their top dos and don’ts for PhD students seeking an academic job. What I really liked about the event was the personal touch as it seems that there’s never the one and only way to do something. However, some patterns emerged and I have summarised the main points that I gained from the session in the diagram below.

Typical paths from PhD into academia (my summary of the panel’s discussion, specifically inspired by Dr Mumford’s comments on the temporary lectureship vs. post-doc routes)
The Hong Kong Polytechnic University

Personally, as I am just about to finish my first PhD year (confirmation review is tomorrow… oO), I don’t have any clear plan yet as to which of these routes I will take. However, I am quite sure that I would like to attempt an academic career. In fact, I signed up for the panel session today because I enjoy working in an academic environment and would like to stay in this sector. Having completed my undergraduate at Hong Kong PolyU and my MA as well as the first year of my PhD at the University of Nottingham, I can say that I have felt very comfortable in both of these institutions.

I am also conscious that I am now at the first stage of the diagram, the PhD, and that means that I have to work on meeting requirements listed in ‘person specifications’ of potential future jobs. (The panelists emphasised throughout how important it is to stay on top of what’s required by the market – many of them still regularly check! That’s a habit I need to start.) So now, in the first stage, I have to do the extra stuff. Luckily, I have had the chance to be involved in organising last year’s ICAME 35 conference in Nottingham and am now in the process of co-organising the Nottingham Summer School in Corpus Linguistics as well as our Symposium ‘Corpus Linguistics beyond Boundaries’. I’m very grateful for these opportunities and feel that they help me learn more about the field, procedures of admin work and of course myself. I have just started attending conferences as a participant as well and that’s something I have to further work on. Right now I am still a bit nervous about publications and haven’t submitted anything yet, but that will hopefully change in this coming PhD year. I am also hoping for the opportunity to do part-time undergraduate teaching during the second year of my PhD, because I think teaching is an important aspect of academia. As the panelists pointed out today, getting a post-doc position, though extremely attractive, is rather unlikely in the current funding situation. Therefore, a teaching position (with possibly some research elements) seems the most likely job opportunity after the PhD…

What are your thoughts or experiences regarding the academic career path?

University of Nottingham
Photo by my friend @HunterZhou