I’d like to share a side project I’ve been working on for the past few weeks. Roboroast is an app that automatically generates humorous insults for you or a friend based on how you look. It was written in collaboration with my friend Andrei Danciulescu.
The basic operation is as follows. There’s a subreddit called /r/RoastMe where random people post a picture of themselves, and other people proceed to “roast” the person with funny comments making fun of their appearance.
Our app takes your photo and uses a face recognition algorithm to find a poster in /r/RoastMe who looks like you. Then we display the comments for your closest matches.
You can try it at roboroast.tk.
Here’s some roasts for myself:
Here’s some for Andrei:
High Level Overview
The project consists of roughly three parts:
Part 1 is the Reddit scraper. We use the PRAW API to go through all posts on the /r/RoastMe subreddit, saving comments to MongoDB and saving images to the filesystem.
Part 2 is the Face++ uploader. Face++ is a cloud service with a REST API that handles our face matching. To use it, we upload all the images from part 1 into a “faceset” which we can query later.
The first two components only need to be run periodically, maybe once a month, to update the faceset with new posts from Reddit. Part 3 is the web app, which is the user-facing component. It accepts user uploads, searches for matches using the Face++ API, and renders a list of insults to the user.
As mentioned before, we rely on two third-party APIs: PRAW for scraping Reddit posts and Face++ for face recognition.
All the backend code is written in Python. The web app uses the Flask web framework and runs behind NGINX and Gunicorn, which handle connections and serve static files. We use MongoDB for the database.
The whole thing is hosted on a single AWS EC2 instance.
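To make Part 1 a bit more concrete, here is a rough sketch of what a scraper along these lines could look like. It is not our actual code: the function names, the `roasts`/`posts` database and collection names, the `images` directory, and the placeholder credentials are all made up for illustration, and it assumes the `praw` and `pymongo` packages are installed.

```python
import os
import urllib.request

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def is_direct_image(url):
    """Return True if the URL looks like a direct link to an image file."""
    path = url.split("?", 1)[0].lower()
    return path.endswith(IMAGE_EXTENSIONS)


def submission_to_document(submission):
    """Flatten a PRAW submission into a dict that can be stored in MongoDB."""
    # Drop the "load more comments" placeholders so only real comments remain.
    submission.comments.replace_more(limit=0)
    return {
        "_id": submission.id,
        "title": submission.title,
        "image_url": submission.url,
        # Iterating over the comment forest yields top-level comments only.
        "comments": [
            {"body": c.body, "score": c.score} for c in submission.comments
        ],
    }


def scrape(limit=100, image_dir="images"):
    """Save recent /r/RoastMe posts to MongoDB and their images to disk."""
    # Third-party imports live here so the helpers above work without them.
    import praw
    from pymongo import MongoClient

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder credentials
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="roast-scraper-sketch",
    )
    posts = MongoClient()["roasts"]["posts"]  # hypothetical db/collection names
    os.makedirs(image_dir, exist_ok=True)

    for submission in reddit.subreddit("RoastMe").new(limit=limit):
        if not is_direct_image(submission.url):
            continue
        doc = submission_to_document(submission)
        # Upsert so that re-running the scraper refreshes existing posts.
        posts.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        extension = os.path.splitext(submission.url.split("?", 1)[0])[1]
        urllib.request.urlretrieve(
            submission.url, os.path.join(image_dir, submission.id + extension)
        )
```

Keying each document on the Reddit post ID means the monthly re-run can overwrite old entries instead of creating duplicates.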
How good is the face matching?
The face matching is actually decent. Face++ produces reasonable matches most of the time.
To see the matching results for yourself, you can append ?r=1 to the end of the URL (on the results page). This is hidden by default.
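For the curious, a flag like this is just a query-string parameter that the server checks before rendering the page. The snippet below is a minimal standalone sketch of that check using only the Python standard library; the function name and the example URLs are hypothetical, not taken from our code.

```python
from urllib.parse import urlparse, parse_qs


def wants_match_details(url):
    """Return True if the URL carries the flag r=1."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; treat a missing key as off.
    return query.get("r", ["0"])[0] == "1"


print(wants_match_details("http://example.com/results?r=1"))
print(wants_match_details("http://example.com/results"))
```

Inside a Flask view, the equivalent check would typically be a one-liner along the lines of `request.args.get("r") == "1"`.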
Do the insults make sense?
Although the face matching does a decent job, we found that the quality of the results was somewhat hit-or-miss.
When we envisioned the concept for this app, we assumed that most insults were going to make fun of the subject’s face. However, many insults refer to the subject’s non-facial appearance, clothing, or objects in the background. Since we only do face matching, these comments make no sense when shown next to someone else’s photo.
Other times, comments will refer to the title of the post — in other words, an insult depends on both the submission title and the picture. Again, these make no sense with only the picture.
We attempt to mitigate this with heuristics that analyze the comment, in order to exclude roasts which refer to the title or articles of clothing. This approach had limited success because natural language processing is hard.
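To give a flavor of what such a heuristic can look like, here is a deliberately simple keyword-based sketch. It is an illustration rather than our actual filter: the word lists, the function names, and the minimum word length are all assumptions picked for the example.

```python
import re

# Hypothetical word lists; a real filter would need far larger ones.
CLOTHING_WORDS = {"shirt", "hat", "hoodie", "jacket", "sweater", "tie", "dress", "shoes"}
STOPWORDS = {"roast", "roastme", "please", "your", "worst", "with", "this", "that", "have"}


def tokenize(text):
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def title_keywords(title, min_length=4):
    """Pick out the distinctive words of a post title."""
    return {
        word
        for word in tokenize(title)
        if len(word) >= min_length and word not in STOPWORDS
    }


def is_face_only_roast(comment, title):
    """Return False if the comment seems to rely on clothing or on the title."""
    words = tokenize(comment)
    if words & CLOTHING_WORDS:
        return False
    if words & title_keywords(title):
        return False
    return True


title = "I dropped out of law school, roast me"
for comment in [
    "Nice shirt, did your mom pick it?",
    "Law school dodged a bullet.",
    "Your forehead has its own zip code.",
]:
    print(is_face_only_roast(comment, title), comment)
```

Exact word matching like this misses plurals, synonyms, and anything phrased indirectly, which is a good illustration of why this kind of filtering only gets you so far.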
When Andrei initially proposed this idea for an app, I thought the concept was pretty cool and unique. In a month or so we had a prototype, and I spent a few more weeks polishing the project for release. The quality of results you get is still highly variable, but we’re working on improving our algorithms.
In any case, it’s my first time with a lot of these technologies, and I had fun and learned a lot building it.