How to read research papers for fun and profit

One skill that I’ve learned after a year in grad school is how to effectively read research papers. Previously I had found them impenetrable, but now I find them a great source of information about cutting-edge science while it is being done and before it’s made its way into textbooks. Now I read about 4-5 of them every week.

My research area is natural language processing and machine learning, but I read papers in lots of fields, not just in AI and computer science. Papers are my go-to source for a myriad of scientific inquiries, for example: does drinking alcohol cause cancer? Are women more talkative than men? Was winter in Toronto abnormally cold this year? Etc.

Why read scientific papers?

If you try to Google questions like these, you typically end up on Wikipedia or some random article on the internet. Research papers are an underutilized resource that has several advantages over other common sources of information on the internet.

Advantages over articles on the internet: no matter what topic, you will undoubtedly find articles on it on the internet. Some of these articles are excellent, but others are opinionated nonsense. Without being an expert yourself, it can be difficult to decide what information to trust. Peer-reviewed research papers are held to a much higher minimum quality standard, and for every claim they make, they have to clearly state their evidence, assumptions, how they arrived at the conclusion, and their degree of confidence in their result. You can examine the paper for yourself and decide if the assumptions are reasonable and the conclusions follow logically, rather than trust someone else’s word for it. With some digging deeper and some critical thinking, you can avoid a lot of misinformation on the internet.

Advantages over Wikipedia: Wikipedia is a pretty reliable source of truth; in fact, it often cites scientific papers as its sources. However, Wikipedia is written to be concise, so a 30-page research paper is often summarized in 1-2 sentences. If you only read Wikipedia, you will miss a lot of the nuances contained in the original paper, and only develop a cursory understanding compared to going directly to the source.

Finding the right paper to read

If your professor or colleague has assigned you a specific paper to read, then you can skip this section.

A big part of the challenge of reading papers is deciding which ones to read. There are a lot of papers out there, and only a few will be relevant to you. Therefore, deciding what to read is a nontrivial skill in itself.

Research papers are the most useful when you have a specific problem or question in mind. When I first started out reading papers, I approached this the wrong way. One day, I’d suddenly decide “hmm, complexity theory is pretty interesting, let’s go on arXiv and look at some recent complexity theory papers”. Then, I’d open a few, attempt to read them, get confused, and conclude I’m not smart enough to read complexity theory papers. Why is this a bad idea? A research paper exists to answer a very specific question, so it makes no sense to pick up a random paper without the background context. What is the problem? What approaches have been tried in the past, and how have they failed? Without understanding background information like this, it’s impossible to appreciate the contribution of a specific paper.

Figure: Use the forward citation and related article buttons on Google Scholar to explore relevant papers.

It’s helpful to think of each research paper as a node in a massive, interconnected graph. Rather than each paper existing as a standalone item, a paper is deeply connected to the research that came before and after it.

Google Scholar is your best friend for exploring this graph. Begin by entering a few keywords and picking a few promising hits from the first 2-3 pages. Good, this is your starting point. Here are some heuristics for traversing the paper graph:

  • To go forward in time, look at works that cited this paper. A paper being cited usually means one of two things: (1) the future paper uses some technique or result developed in the current paper for some other purpose, or (2) the future paper improves on the techniques in the current paper. Citations of the second type are more useful.
  • To go backward in time, look at the paper’s introduction and related work. This puts the paper in context of previous work. Occasionally, you find a survey paper that doesn’t contribute anything novel of its own, but summarizes a bunch of previous related work; these are really helpful when you’re beginning your research in a topic.
  • Citation count is a good indicator of a paper’s importance and merit. If the paper has under 10 citations, take its claims with a grain of salt (even more so if it’s an arXiv preprint and not a peer-reviewed paper). Over 100 citations means the paper has made a significant contribution; over 1000 citations indicates a landmark paper in the field and is probably worth reading. Citation count is not a perfect metric, especially for very recent work, but it’s a useful heuristic that’s applicable across disciplines.
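The heuristics above can be sketched in a few lines of Python. This is only a rough illustration, not a real Google Scholar client: the function names, the tier labels, the "no strong signal" bucket for 10 to 100 citations, and the toy graph are all my own assumptions, and the graph is just a plain dictionary mapping each paper to the papers you would follow next (for example, the works that cite it).

```python
from collections import deque


def citation_tier(citations, peer_reviewed=True):
    """Map a citation count to the rough tiers from the heuristic above.

    The labels, and the "no strong signal" bucket for 10-100 citations,
    are assumptions made for this sketch.
    """
    if citations > 1000:
        return "landmark"
    if citations > 100:
        return "significant"
    if citations < 10:
        # Preprints with few citations deserve even more skepticism.
        return "grain of salt" if peer_reviewed else "extra grain of salt"
    return "no strong signal"


def explore(graph, start, max_hops=2):
    """Breadth-first walk over a paper graph, at most max_hops from start.

    graph is a hypothetical dict: paper -> list of neighbouring papers.
    Returns the papers in the order they were visited.
    """
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        paper, hops = queue.popleft()
        order.append(paper)
        if hops == max_hops:
            continue  # do not expand papers at the hop limit
        for neighbour in graph.get(paper, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return order


# Toy example: "A" is the starting paper, arrows point to citing works.
toy_graph = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
reading_list = explore(toy_graph, "A", max_hops=2)
```

Capping the number of hops keeps the reading list manageable, since the real graph is far too large to explore exhaustively.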

The first pass: High level overview

Great, you’ve decided on a paper to read. Now how to read it effectively?

Reading a paper is not like reading a novel. When you read a novel, you start at the beginning and read linearly until you reach the end. A paper, however, is read most efficiently by hopping between sections as appropriate, rather than reading linearly from beginning to end.

The goal of your first reading is to get a high-level overview of the paper before diving into the details. As you go through the paper, here are some good questions to ask yourself:

  • What is the problem being solved?
  • What approaches have been tried before, and what are their limitations?
  • What is this paper’s novel contribution?
  • What experiments were done, using what dataset? How successful were the results?
  • Can the method in this paper be applied to my problem?
  • If not, what assumptions are needed for this method to work?

Figure: Treat each paper as a node in a massive graph of research, rather than a standalone item in a vacuum.

When I read a paper, I usually proceed in the following order:

  1. Abstract: a long paragraph that summarizes the entire paper. Read this to decide if the rest of the paper is worth reading or not.
  2. Introduction, diagrams, tables, and conclusion. Often, reading the diagrams and captions gives you a good idea of what’s going on with minimal effort.
  3. If the field is unfamiliar to you, then note down any interesting references in the introduction and related works sections to explore later. If the field is familiar, then just skim these sections.
  4. Read the main body of the paper: model, experiment, and discussion, without getting too bogged down in the details. If a section is confusing, skip it for now and come back to it on a second reading.

That’s it — you’ve finished reading a paper! Now you can either go back and read it again, focusing on the details you skimmed over the first pass, or move on to a different paper that you’ve added to your backlog.

When reading a paper, you should not expect to understand every aspect of the paper by the time you’re done. You can always refer back to the paper at a later time, as needed. Generally, you don’t need to understand all the details, unless you’re trying to replicate or extend the paper.

Help, I’m stuck!

Sometimes, despite your best efforts, you find that a paper is impenetrable. It’s not necessarily your fault — some papers are hastily written hours before a conference deadline. What do you do now?

Look for a video or blog post explaining the paper. If you’re lucky, someone may have recorded a lecture where the author presents the paper at a conference. Maybe somebody wrote a blog post summarizing the paper (Colah’s blog has great summaries of machine learning research). These are often better at explaining things than the actual paper.

If there’s a lot of background terminology that doesn’t make sense, it may be better to consult other sources like textbooks and course lectures rather than papers. This is especially true if the research is not new (>10 years old). Research papers are not always the best at explaining a concept clearly: by their nature, they document research as it’s being done. Sometimes, the paper paints an incomplete picture of something that’s better understood later. Textbook writers can look back on research after it’s already done, and thereby benefit from hindsight knowledge that didn’t exist when the paper was written.

Basic statistics is useful in many experimental fields: concepts like linear / logistic regression, p-values, hypothesis testing, and common statistical distributions. Any paper that deals with experimental data will use at least some statistics, so it’s worthwhile to be comfortable with basic stats.


That’s it for my advice. The densely packed two-column pages of text may appear daunting to the uninitiated reader, but they can be conquered with a bit of practice. Whether it’s for work or for fun, you definitely don’t need a PhD to read papers.

How to succeed in your first tech internship

Congratulations, you’ve just landed your first software engineer internship! You’ve passed a round or two of interviews, signed an offer letter, and you’re slated to start next month. What now? You might be a bit excited, a bit apprehensive, wondering what startup life is like and whether you’re even smart enough to do the work they give you…

I felt all these things when I started my first internship three years ago. Now, I’ve completed four internships and I’m halfway through my fifth one; I’m sort of a veteran intern by now. In these five internships, I’ve learned a good deal about what it takes to succeed in an internship, things that are not obvious to those just starting out. Hopefully by sharing this, others can avoid some of the mistakes I made.

Your first week at [startup]

Chances are that you’ve coded in assignments for schoolwork, and maybe you’ve coded a few side projects for fun. Work is a bit different: you’re working with a massive codebase that you didn’t write, and probably no single engineer in the company understands it all. Facing a codebase of this complexity, you might feel overwhelmed, struggling to find the right file to start. You feel uneasy that a small change is taking you hours, afraid that your boss thinks you’re underperforming.

Relax, you’re doing fine. If you got the job, it means they have faith in your abilities to learn and to succeed. I’ve talked to hundreds of Waterloo interns, and I’ve never heard of anyone getting dismissed for underperforming. The first few weeks will be rough as you come to terms with the codebase and technology stack, but trust me, it gets much, much easier afterwards.

Asking for help

As an intern, you’re not expected to know everything, and often you will be asking for help from more experienced, full-time engineers.

Before asking for help, you should spend a minute or so searching Google, or Stack Overflow, or the company wiki. Most general questions (not relating to company specific code) can be answered with Google, and you save everyone’s time this way.

When you do ask for help, be aware that they might be working on a completely different project, so they don’t have the same mental context as you. Rather than jump straight into the intricate technical details of your problem, you should describe at a high level what you’re trying to accomplish, and what you tried, and only then delve into the exact technical details.

An example of a poorly phrased question would be: “hey, how do I invalidate a FooBarWindow object if its parent is not visible?” You’re likely to get some confused stares. This might make perfect sense to you, but they’re wondering what FooBarWindow is and why you’re trying to invalidate it at all.

A better way to phrase it would be something like: “hey, I’m working on X feature, and I’m encountering a problem where the buttons stop working after you press the back button. After looking a bit, I discovered my component should have been invalidated when its parent is no longer visible, but that’s not happening…” This time, you’ve done a much better job of describing your problem.

It’s always helpful to take notes, so you never ask the same question again. How do you commit your code to Git? How do you deploy the app to stage? If you don’t write it down, you’re going to forget.

At the start, you’re going to be asking 5 questions an hour, which is okay. Soon you will find yourself needing to ask less and less, and eventually you’ll only ask a handful per day.

Taking charge of your own learning

Like it or not, software engineering is a rapidly shifting field, where a new JavaScript framework comes out every six months. You have to be continuously learning things, or your skills will become obsolete. Learning is even more important when you’re an intern, still learning the ropes. Fortunately, a tech internship is a great opportunity to learn quickly.

Not all software engineers are equal — at some point, you get to choose what you want to do: frontend, backend, or full stack? Web, iOS, or Android? Become an expert in Django or Ruby on Rails? Depending on the company, you often get considerable say on what team you’re on, and what project you work on within your team. Use this as an opportunity to get paid to learn new, interesting stuff!

Good technologies to learn should satisfy two criteria: it should be something you’re interested in, and it should also be widely used in the industry. That is to say, it’s more useful to know a popular web framework than an internal company-specific framework that does the same thing.

When you get to pick what project to take next, it might be tempting to pick something familiar, where you already know how to do everything. But you learn a lot more by working on something new; in my experience, employers have always been accommodating to my desire to work on a variety of different things.

You will overhear people talk passionately, with phrases like, “oh, it’s running Nginx inside Docker and fetches the data from a Cassandra cluster…” If you’ve never heard of these technologies, this sentence would be nonsensical to you. It’s well worth spending 10 minutes reading about each technology that you hear mentioned, not to become an expert, but just to have a passing understanding of what each of these things does. With a few minutes of research, you’d be able to answer: “when should you use Cassandra over MySQL?”

Learning is valuable even when it’s not immediately relevant to you. Occasionally, you’ll find yourself in meetings where you don’t have a clue what’s going on, say with business managers or projects you’re not involved in. Rather than zone out and browse Reddit for the duration of the meeting, listen in and learn as much as you can, and take notes if you begin to fall asleep! The human brain has near infinite capacity for learning new things, and at no point will it reach “capacity” like a hard drive.

Take responsibility and deliver results

A common misconception is that programmers are paid to write code. Wrong: as a programmer, your job is to deliver results and provide value to your company; part of this job involves writing code, but a lot of the work is communicating with managers, designers, and other engineers to figure out what code to write.

When you’re assigned a project, you own it and you’re in charge of any tasks required to push it through to completion. What if something is broken in an API owned by another team? You might be tempted to hand in your code and proclaim, “my code works fine, so my job here is done, I can show you that their API is broken, so it’s their fault.” No, if your feature is broken then you need to fix it one way or another. So go and ping the engineer responsible, schedule a meeting with him, anything to get your project completed.

Sometimes you run into problems that seem insurmountable, so complex that you feel compelled to put down your sword and give up, and tell yourself, “this is too hard for an intern”. This is a bad idea: you should never expect a full-time engineer to come in, take over, and bail you out of the situation. Your mentors are not superhuman. They can’t instantly conjure a solution; they have to work through the problem one piece at a time, just like you. There’s no reason you can’t do the same.

The product you deliver is what ultimately matters, so don’t worry about secondary measures of productivity, like how many lines of code you commit, or how many story points you rack up on Jira. There’s an apocryphal tale of a programmer who disagreed with management measuring productivity by lines of code, and wrote “-2000” because he had made the code simpler. Likewise, you aren’t being judged if you come in 30 minutes after your manager does, or if you leave 30 minutes before he does, or if you just feel like taking a mid-day stroll in the park, as long as you’re consistently delivering quality features.

Many interns suffer from “intern mentality” and consider themselves fundamentally different from full-timers in some way. This is an irrational belief: your skills are probably on par with those of a junior engineer (or will be in a few weeks). This means you should behave like any other full-time engineer (albeit minus interview and on-call duties); the only difference is you’re leaving in a few months. Don’t be afraid to contribute your insights and ideas, and don’t consider them less valuable because you’re “just an intern”.

Other tips

What should you learn to prepare for an internship if you have spare time? Learn Git! Git is a version control system used in most companies; it is non-trivial to pick up and is used more or less the same way everywhere. Other tools are less useful to learn in advance because they’re either easy to pick up, or can be used in so many different ways that it’s more efficient to learn them on the job.

Internships are a great way to travel places, if that interests you. I picked 5 internships in 4 different cities for this reason. Unlike school, you don’t have to think about work during weekends, which leaves you lots of time to travel to nearby destinations.

I’ve only talked about what happens during work. If your internship is in the USA, the Unofficial Waterloo USA Intern Guide was super helpful in answering all my logistical questions. Also, some of my friends have written about crafting a resume, and how to ace the coding interview.