Earlier this month I defended my PhD thesis. The bulk of the work of a PhD consists of producing peer-reviewed publications, and in my field, three first-author publications on a coherent topic in top-tier venues (EMNLP, ACL, NAACL, TACL, etc) is typically sufficient to earn a PhD.
Reflecting on my process for producing my three papers, I noticed that all of them took roughly 8-10 months from the time of the initial idea, until submitting the 8-page manuscript to a conference. Furthermore, each paper followed a similar trajectory from how it evolved from a vague idea into a concrete research contribution.
This is definitely not the only way to write papers, but it has worked well for me. All three of my papers were accepted into main conferences on the first attempt, with reviewer scores consistently between 3.5 and 4 (on a 1-5 rating scale). Therefore, I think it’s a good method to iterate on research projects and reliably produce strong NLP papers.
Month 1: Literature review
When I begin a project, I typically only have a vague idea of the direction or some phenomenon I want to explore. Since I don’t have a good understanding yet of the field, it helps to spend some time reading the prior literature at this stage, instead of diving into experiments right away. This is my focus for the first 3-4 weeks of a project.
See my blog post here for a guide on how to find and read research papers.
By the end of the first month, I would’ve read about 50 papers, and have a good understanding of:
- The theoretical frameworks and assumptions related to the problem.
- Recent work in this area, and what datasets and methodologies they use to solve it.
- Major challenges and open problems in this area, and why they remain difficult to solve.
At this point, after familiarizing myself with the recent work, I can usually identify some gaps that have not yet been addressed in the literature and have some ideas of how to begin solving them. This is when I begin running experiments – these initial experiments almost never make it into the final paper, but allow me to start building an intuition for the problem and become familiar with commonly used dataset and techniques.
Months 2-4: Exploration
The next phase of the project is iterative exploration and experimentation. For the next two or three months, I work on experiments, building on top of each other and using lessons learned from one experiment to guide the design of the next. Most of these experiments will be “failures” – inconclusive for various reasons:
- I discover that some theoretical assumptions turn out to be invalid, rendering the experiment pointless.
- After running the experiment, I find that the results are not very interesting: they are explainable by something obvious, or there are no consistent patterns.
- I try to run the experiment, but find that it’s impossible because the dataset is missing some crucial feature, or my tools are not powerful enough.
One thing you should never do is decide beforehand what evidence you want to find, and then run experiments until you find it. That would be bad science. So in this context, an experiment failure means it didn’t produce a result that’s interesting enough to include in a paper. An experiment might produce results that are different from what I expected, while being a very interesting and successful experiment.
During this phase, I read papers in a different way from the first month. Rather than casting a wide net, my reading is more focused on understanding details so that I can execute a specific experiment correctly.
After a few months of iteration and doing about 15-20 experiments, I have at least 2 or 3 with sufficiently interesting or cool results that I want to share with the community. These experiments will form the core of my new paper, but it’s not enough, since I still have to tie them together into a single coherent narrative, and strengthen all the weaknesses that would be mercilessly attacked during peer review.
Month 5: Telling a story
Before you can write a paper, you have to decide on a framing or narrative that aligns with your experiments. If this is not done correctly, the reader will be confused; your experiments will feel incoherent and unmotivated.
The same data and experiments can be framed in many different ways. Is our focus on evaluating several NLP models on how well they represent some linguistic property? Or are we using evidence from NLP models to argue for some theory of human language learnability? Or perhaps our main contribution is releasing a novel dataset and annotation schema?
To decide on a framing, we must consider several possible narratives and pick the one that best aligns holistically with our core experiments. We’ll need to provide a justification for it, which is usually not the original reason we did the experiment (since the exploration phase is so haphazard).
The product of this narrative brainstorming is a draft of an abstract of the final paper, containing the main results and motivation for them. By writing the abstract first, the overall scientific goal and structure of the paper is clarified. This also gives everyone an idea of gaps in the narrative and what experiments are still needed to fill in these gaps. Around this time is when I decide on a conference submission date to aim for, around 3-4 months down the road.
Months 6-7: Make it stronger
Now we are on the home stretch of the project: we have decided on the core contributions, we now just have to patiently take the time to make it as strong as possible. I make a list of experiments to be done to strengthen the result, like running it with different models, different datasets in other languages, ablation studies, controlling for potential confounding variables, etc.
Essentially I look at my own paper from the perspective of a reviewer, asking myself: “why would I reject this paper?” My co-authors will help out by pointing out flaws in my reasoning and methodology, anticipating problems in advance of the official peer review and giving me ample time to fix them. The paper probably has a decent chance of acceptance without all this extra work, but it is worth it because it lowers the risk of having the paper rejected and needing to resubmit, which would waste several valuable months for everyone.
Month 8: Paper writing
It takes me about 3 weeks to actually write the paper. I like to freeze all the code and experimental results one month before the deadline, so that during the last month, I can focus on presentation and writing. When all the tables and figures are in place, it is a lot easier to write the paper without having to worry about which parts will need to be updated when new results materialize.
The experiment methodology and results sections are the easiest to write since that’s what’s been on my mind for the past few months. The introduction is the most difficult since I have to take a step back and think about how to present the work for someone who is seeing it for the first time, but it is the first thing the reader sees and it’s perhaps the most important part of the whole paper.
A week before the deadline, I have a reasonably good first draft. After sending it out to my co-authors one last time to improve the final presentation, I’m ready to press the submit button. Now we cross our fingers and wait eagerly for the acceptance email!
There were two things that helped me tremendously during my PhD: reading lots of NLP papers, and having a good committee.
Reading a lot of NLP papers is really useful because it helps you build an intuition of what good and bad papers look like. Early in my research career, I participated in a lot of paper reading groups, where we discuss recent papers (both published and arXiV preprints) and talk about which parts are strong and weak, and why. I notice recurring trends of common problems and how strong papers manage them, so that I can incorporate the same solutions in my own papers.
This is sort of like training a GAN (generative adversarial network). I trained myself to be a good discriminator of good vs bad papers, and this is useful for my generator as well: when my paper passes my own discriminator, it is usually able to pass peer review as well.
Another thing that helped me was having a solid committee of experts from different academic backgrounds. This turned out to be very useful because they often pointed out weaknesses and faulty assumptions that I did not realize, even if they didn’t have a solution of how to fix these problems. This way I have no surprises when the peer reviews come out: all the weaknesses have already been pointed out.
For the PhD students reading this, I have two pieces of advice. First, read lots of papers to fine-tune your discriminator. Second, get feedback on your papers as often and as early as possible. It is a lot less painful at this stage when you’re still in the exploratory phase of the project, rather than after you’ve submitted the paper to get the same feedback from reviewers.
I am looking for a position as an NLP research scientist or machine learning engineer. Here is my CV. I can work in-person from Vancouver, Canada or remotely. If your company is hiring, please leave me a message!