Last week, I attended my first NLP conference, NAACL, which was held in Minneapolis. My paper was selected for a short talk of 12 minutes in length, plus 3 minutes for questions. I presented my research on dementia detection in Mandarin Chinese, which I did during my master’s.
Here’s a video of my talk:
As a grad student, going to conferences is a good way to travel for free. Some of my friends balked at the idea of going to Minneapolis rather than somewhere more “interesting”. However, I had never been there before, and in the summer, Minneapolis was quite nice.
Minneapolis is very flat and good for biking: you can rent a bike for $2 per 30 minutes. I took the light rail to Minnehaha Falls and biked along the Mississippi River to the city center. The downside is that compared to Toronto, the food choices are quite limited. The majority of restaurants serve American food (burgers, sandwiches, pasta, etc.).
It’s often said that most of the value of a conference happens in the hallways, not in the scheduled talks (which you can often find on YouTube for free). For me, this was a good opportunity to finally meet some of my previous collaborators in person. Previously, we had only communicated via Skype and email. I also ran into people whose names I recognized from reading their papers, but whom I had never seen in person.
Despite all the advances in video conferencing technology, nothing beats face-to-face interaction over lunch. There’s a reason why businesses spend so much money to send employees abroad to conduct their meetings.
Talks and posters
The accepted papers were split roughly 50-50 into talks and poster presentations. I preferred the poster format, because you get to have a 1-on-1 discussion with the author about their work, and ask clarifying questions.
Talks were a mixed bag: some were great, but many were difficult to follow. The most common problem was that speakers tended to dive into complex technical details and lose sight of the “big picture”. The better talks spent a good chunk of time covering the background and motivation, with lots of examples, before describing their own contribution.
It’s difficult to give a coherent talk in only 12 minutes. A research paper is inherently a very narrow and focused contribution, while the audience comes from all areas of NLP and has probably never seen your problem before. The organizers tried to group talks into related topics like “Speech” or “Multilingual NLP”, but even then, the subfields of NLP are so diverse that two random papers often had very little in common.
Research trends in NLP
Academia is notorious for inventing impractically complex models to squeeze out a 0.2% improvement on a benchmark. This may be true in some areas of ML, but it certainly wasn’t the case here. There was a lot of variety in the problems people were solving. Many papers worked with new datasets, and even those using existing datasets often proposed new tasks that hadn’t been considered before.
A lot of papers used similar model architectures, like some sort of Bi-LSTM with attention, perhaps with a CRF on top. None of the results are directly comparable with one another, because everybody is solving a different problem. I guess it shows how flexible Bi-LSTMs are that they can be applied so widely. For me, the papers that did something different (like applying quantum physics to NLP) really stood out.
Interestingly, many papers did experiments with BERT, which was itself presented at this conference! Last October, the BERT authors bypassed the usual conventions and announced their results without peer review, so the NLP community has known about it for a long time, but only now is it being officially presented at a conference.