Using Waveform Plots to Improve your Accent, and a Dive into English Phonology

I was born in China and immigrated to Canada when I was 4 years old. After living in Canada for 18 years, I consider myself a native speaker for most purposes, but I still retain a noticeable non-native accent when speaking.

This post has a video that contains me speaking, if you want to hear what my accent sounds like.

It’s often considered very difficult or impossible to change your accent once you reach adulthood. I don’t know if this is true or not, but it sounds like a self-fulfilling prophecy — the more you think it’s impossible, the less you try, so of course your accent will not get any better. Impossible or not, it’s worth it to give it a try.

The first step is identifying what errors you’re making. This can be quite difficult if you’re not a trained linguist — native English speakers will detect that you have an accent, but they can’t really pinpoint exactly what’s wrong with your speech — it just sounds wrong to them.

One accent reduction strategy is the following: listen to a native speaker saying a sentence (for example, in a movie or on the radio), and repeat the same sentence, mimicking the intonation as closely as possible. Record both sentences, and play them side by side. This way, with all the other confounding factors gone, it’s much easier to identify the differences between your pronunciation and the native one.

When I tried doing this using Audacity, I noticed something interesting. Oftentimes, it was easier to spot differences in the waveform plot (that Audacity shows automatically) than to hear the differences between the audio samples. When you’re used to speaking a certain way all your life, your ears “tune out” the differences.

Here’s an example. The phrase is “figure out how to sell it for less” (Soundcloud):

[Figure: waveform plots of my recording and the native speaker's recording of the phrase]

The difference is clear in the waveform plot. In my audio sample, there are two spikes corresponding to the “t” sound that don’t appear in the native speaker’s sample.
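To make "spotting spikes" a bit less subjective, here is a rough sketch of how one might count short loud bursts in a list of audio samples. The function name, the window size, and the threshold are all my own assumptions for illustration; this is not what Audacity does internally, and real speech would need more care, since vowels are loud too.

```python
def count_bursts(samples, window=4, threshold=0.5):
    """Count runs of consecutive windows whose peak amplitude exceeds threshold.

    samples: sequence of floats in [-1.0, 1.0] (assumed to be decoded audio).
    window and threshold are hypothetical tuning values, not calibrated ones.
    """
    bursts = 0
    in_burst = False
    for start in range(0, len(samples), window):
        peak = max(abs(s) for s in samples[start:start + window])
        if peak > threshold and not in_burst:
            bursts += 1          # rising edge: a new burst begins
        in_burst = peak > threshold
    return bursts


# Toy "recordings": a quiet background, with and without two sharp spikes.
quiet = [0.05, -0.04] * 20
spiky = list(quiet)
spiky[8] = 0.9    # stands in for the first released "t"
spiky[30] = -0.8  # stands in for the second released "t"

print(count_bursts(quiet))  # 0
print(count_bursts(spiky))  # 2
```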

For vowels, the spectrogram works better than the waveform plot. Here’s the words “said” and “sad”, which differ in only the vowel:

[Figure: spectrograms of the words “said” and “sad”]

Again, if you find it difficult to hear the difference, it helps to have a visual representation to look at.
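For the curious, a spectrogram is just a sequence of frequency spectra, one per short slice of audio. Below is a minimal sketch using a naive discrete Fourier transform in plain Python. The function names and frame size are my own choices; real tools use a fast Fourier transform with overlapping, windowed frames, so treat this as an illustration of the idea rather than how Audacity computes it.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT: return the magnitude of bins 0..len(frame)//2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags


def spectrogram(samples, frame_size=64):
    """Split samples into non-overlapping frames and take each frame's spectrum."""
    return [magnitude_spectrum(samples[i:i + frame_size])
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


# Toy signal: one frame of a low tone followed by one frame of a higher tone.
low = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
high = [math.sin(2 * math.pi * 10 * t / 64) for t in range(64)]
columns = spectrogram(low + high)

# The strongest frequency bin in each column shows the change in pitch.
peaks = [col.index(max(col)) for col in columns]
print(peaks)  # [4, 10]
```

Two vowels show up as different patterns of strong bins across the columns, which is what makes the difference visible even when it is hard to hear.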


I was surprised to find out that I’d been pronouncing the “t” consonant incorrectly all my life. In English, the letter “t” represents an aspirated alveolar stop (IPA /tʰ/), which is what I’m doing, right? Well, no. The letter “t” does produce the sound /tʰ/ at the beginning of a word, but in American English, the “t” at the final position of a word can get de-aspirated so that there’s no audible release. It can also turn into a glottal stop (IPA /ʔ/) in some dialects, but native speakers rarely pronounce a fully released /tʰ/ at the end of a word, except in careful speech.

This is a phonological rule, and there are many instances of this. Here’s a simple experiment: put your hand in front of your mouth and say the word “pin”. You should feel a puff of air in your palm. Now say the word “spin”, and there is no puff of air. This is because in English, the /p/ sound loses its aspiration after /s/: the aspirated [pʰ] of “pin” becomes a plain, unaspirated [p] in “spin” (which can sound a bit like /b/, but is not the same sound).

Now this got me curious and I wondered: exactly what are the rules governing sound changes in English consonants? Can I learn them so I don’t make this mistake again? Native English speakers don’t know these rules (consciously at least), and even ESL materials don’t go into much detail about subtle aspects of pronunciation. The best resources for this would be linguistics textbooks on English phonology.

I consulted a textbook called “Gimson’s Pronunciation of English” [1]. For just the rules regarding sound changes of the /t/ sound at the word-final position, the book lists 6 rules. Here’s a summary of the first 3:

  • No audible release in syllable-final positions, especially before a pause. Examples: mat, map, robe, road. To distinguish /t/ from /d/, the preceding vowel is lengthened for /d/ and shortened for /t/.
  • In stop clusters like “white post” (t + p) or “good boy” (d + b), there is no audible release for the first consonant.
  • When a plosive consonant is followed by a nasal consonant that is homorganic (articulated in the same place), the air is released through the nose instead of the mouth (e.g. topmost, submerge). However, this doesn’t happen if the nasal consonant is articulated in a different place (e.g. big man, cheap nuts).
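The three rules above can be sketched as a small lookup. This is only a toy encoding of my summary, not the textbook's formalism: the function name and the return labels are made up for illustration, and anything the three rules don't mention is simply reported as "not covered".

```python
# Place of articulation for a few English consonants (IPA symbols).
PLACE = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
    "k": "velar", "g": "velar", "ŋ": "velar",
}
PLOSIVES = set("ptkbdg")
NASALS = {"m", "n", "ŋ"}


def release_type(plosive, next_sound=None):
    """Classify the release of a syllable-final plosive.

    next_sound is the following consonant, or None for a pause.
    """
    if plosive not in PLOSIVES:
        raise ValueError(f"not a plosive: {plosive!r}")
    if next_sound is None or next_sound in PLOSIVES:
        return "no audible release"   # rules 1 and 2
    if next_sound in NASALS and PLACE[next_sound] == PLACE[plosive]:
        return "nasal release"        # rule 3 (homorganic nasal)
    return "not covered"              # outside the three rules summarized


print(release_type("t"))        # "mat" before a pause
print(release_type("t", "p"))   # "white post"
print(release_type("p", "m"))   # "topmost"
print(release_type("g", "m"))   # "big man"
```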

As you can see, the rules are quite complicated. The book is somewhat challenging for non-linguists — these are just the rules for /t/ at the word-final position; the book goes on to spend hundreds of pages to cover all kinds of vowel changes that occur in stressed and unstressed syllables, when combined with other words, and so on. For a summary, take a look at the Wikipedia article on English Phonology.

What’s really amazing is how native speakers learn all these patterns, perfectly, as babies. Native speakers may make orthographic mistakes like mixing up “their, they’re, there”, but they never make phonological mistakes like forgetting to de-aspirate the /p/ in “spin” — they simply get it right every time, without even realizing it!


Some of my friends immigrated to Canada at a similar or later age than me, and learned English with no noticeable accent. Therefore, people sometimes found it strange that I still have an accent. Even more interesting is the fact that although my pronunciation is non-native, I don’t make non-native grammatical mistakes. In other words, I can intuitively judge which sentences are grammatical or ungrammatical just as well as a native speaker. Does that make me a linguistic anomaly? Intrigued, I dug deeper into academic research.

In 1999, Flege et al. conducted a study of Korean-American immigrants who moved to the USA at an early age [2]. Each participant was given two tasks. In the first task, the participant was asked to speak a series of English sentences, and native speakers judged how much of a foreign accent was present on a scale from 1 to 9. In the second task, the participant was given a list of English sentences, some grammatical and some not, and picked which ones were grammatical.

Linguists hypothesize that during first language acquisition, babies learn the phonology of their language long before they start to speak; grammatical structure is acquired much later. The Korean-American study seems to support this hypothesis. For the phonological task, immigrants who arrived as young as age 3 sometimes retained a non-native accent into adulthood.

[Figure: phonological task scores by age of arrival]
Above: Scores for phonological task decrease as age of arrival increases, but even very early arrivals retain a non-native accent.

Basically, arriving before age 6 or so increases the chance of the child developing a native-like accent, but by no means does it guarantee it.

On the other hand, the window for learning grammar is much longer:

[Figure: grammatical task scores by age of arrival]
Above: Scores for grammatical task only start to decrease after about age 7.

Age of arrival is a large factor, but does not explain everything. Some people are just naturally better at acquiring languages than others. The study also looked at the effect of other factors like musical ability and perceived importance of English on the phonological score, but the connection is a lot weaker.

Language is so easy that every baby picks it up, yet so complex that linguists write hundreds of pages to describe it. Even today, language acquisition is poorly understood, and there are many unresolved questions about how it works.


References

  1. Cruttenden, Alan. “Gimson’s Pronunciation of English, 8th Edition”. Routledge, 2014.
  2. Flege, James Emil et al. “Age Constraints on Second Language Acquisition”. Journal of Memory and Language, Volume 41, 1999.

I have a YouTube channel!

Here’s something I’ve been working on recently: a YouTube channel of my guitar covers. I’ve been playing guitar for a few years now (I started in first year university) and I thought it would be fun to record myself playing my favorite songs.

At time of writing, I have 11 videos. Here are a few of them:

I’m going to upload more as I have time. Please subscribe!

Waterloo’s Jobmine process and my first co-op internship

I just finished my first internship — since it’s my first ever “real” full-time job, I feel it’s a rite of passage of some sort.

The internship, or co-op work term, lasted 4 months from January to April. My position was titled “Software Developer”, and the company I worked for was TutorJam, a small educational startup in Kitchener.

The Jobmine Process

Like most students at Waterloo, I found my job through Jobmine. The process was intimidating at first: the whole slew of resumes, interviews, Jobmine cycles, ranking systems, etc., was a lot to take in. But as I brushed up my resume and tentatively submitted a few cover letters, I began to relax a little.

In the end, I applied to 25 jobs (the limit is 50 applications). Most of these were in the Kitchener-Waterloo area, mainly because I leased a house here and didn’t want to relocate. Out of these 25 positions, 5 of them were cancelled before the interview stage. Out of the 20 jobs remaining, I got interviewed for 10 of them.

The interviews came and went, and in the end, 4 of the 10 companies that interviewed me gave me an offer. So I had the good fortune to take my pick between 4 jobs, any one of which I’d be happy working for. I ended up simply picking the job that looked the most interesting.

The Internship

During the 4 months, I worked on a site called YuJa. It’s an “online video collaboration platform”, but I like to describe it to my friends as “kind of like D2L but with lots of videos”. Here’s a picture of the login page of the website:

The team was very small — there were 2 co-op students and 2 full time developers, so essentially we had 4 programmers and 1 manager working on the entire project. As a result, I was entrusted with developing whole features by myself, both the frontend and backend — something rather unusual for a first time co-op.

The project was built with the standard HTML/CSS/JavaScript/jQuery stack on the frontend, and used WildFly on the backend (basically a Java-based server). When I started, I was proficient with the Java programming language, but had very little experience with web development (like HTML/CSS/JS). Initially the learning curve was quite steep, but I quickly picked up the skills I was missing.

In the first week, I fixed minor bugs and implemented small improvements, in order to “learn the ropes”. In the second week, I was assigned my first major feature. Essentially it allowed a professor to quickly send a group message to everyone in a class, and the students would receive it by email and SMS. Before the end of the month, my feature was complete.

Here’s a picture of my office (my computer is on the right, the guy on the left is Samson, another co-op student):

There were only the two of us physically present in this room in Kitchener, since the company is spread out between several cities across North America. Thus all of our communications were done remotely, via Google Talk. Another consequence of this was that in order to keep everyone on the same schedule across time zones, we were required to work from noon to 8pm.

Conclusion

All in all, my first internship was a positive experience, as I learned a lot and worked with very smart people. I learned how to work my way around a large codebase, and also got a taste of what a startup is like. I suppose the only downside was that there was almost no social activity.

Hopefully I haven’t violated any company NDA by writing this post.

This sums up my co-op experience. Starting this week, I will be doing another 4 month study term (2B Computer Science) until August.

A Simple Shorthand Musical Notation

Anyone who’s played piano, or any other musical instrument, would be familiar with the “standard” musical notation. It’s clear, unambiguous, accepted worldwide, and has been basically unchanged since Bach. It looks like this:

Now there’s a reason this notation has survived this long — it’s good. It’s easy to read, and allows a musician to read and play a piece he’s never heard before.

But when you try to write music, you find that the notation is actually quite cumbersome to write. The notes are positioned on groups of 5 lines, so you’d better either have sheets of these lines printed, or be prepared to tediously draw these lines with a ruler. The timing of notes is very precise, so if you slightly exceed the allowed time for a bar, sorry, your notation is not valid anymore.

Principles of Shorthand Notation

To solve these frustrations, I created an alternate system of recording music, with the primary goal of being easy to write. It’s possible to jot down a melody in 30 seconds, with just a pencil and normal (not printed sheet) paper.

I do not claim my notation to be better than the standard notation. Rather, I achieve a different goal, sacrificing information for the ease of writing.

Standard notation is good for recording a song so that a musician can play it without having heard it before.

My notation is good for reminding a musician how to play a song he has heard before.

A common use case would be reminding yourself the notes of a song you’re playing, or accompanying a recording of the song. In a way, its purpose is similar to that of guitar tablature.

Here’s my justification for doing this. Most people can reproduce rhythm intuitively: after hearing a passage a few times, they can clap back the rhythm. It’s much more difficult to find the correct notes after hearing the passage; I usually stumble upon them by trial and error.

So if you write down the notes but leave out the rhythm, it would often be enough information to play the song.

The tradeoff should become clear if you compare the same passage written side by side (from Bach’s Minuet in G Major):

Rules of Writing Shorthand Notation

Start by writing the notes in a line, and separate bars with a vertical | line. Indicate the key signature at the beginning of the page, if needed. Feel free to liberally clump notes together or space them apart based on rhythm.

Next is the rule for jumps. When the melody goes upwards by a perfect fourth or more (like from C->F), write the jumped note on an elevated line.

Remain on the elevated line as long as the melody is still increasing or stays the same. But as soon as the melody descends, immediately drop back down to the neutral line.

Here’s an example:

As long as the melody consists of small intervals (like C->E->C), we stay on the neutral line. Only when the jump is large (C->F) do we go to the elevated line.

Typically in music, a large jump in one direction is followed by a small step backwards. This means that we spend most of our time on the neutral line. It’s very rare for a melody to have multiple jumps in the same direction.

Here’s another example (Twinkle twinkle little star):

The melody does a large jump on the third note (C->G), so the third note (G) is on the elevated line. On the seventh note, the melody descends one note from A->G, so we immediately drop back to the neutral line. It does not matter that the same G was on the elevated line before.
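For readers who think in code, the jump rule can be sketched as a small function over MIDI note numbers (where middle C is 60 and a perfect fourth is 5 semitones). The function name and the `start` parameter are hypothetical. The source only spells out the rule for upward jumps, so the mirrored handling of downward jumps onto a depressed line is my assumption, as is staying on the same line after a second jump in the same direction.

```python
PERFECT_FOURTH = 5  # semitones


def line_levels(midi_notes, start=0, threshold=PERFECT_FOURTH):
    """Return the line (+1 elevated, 0 neutral, -1 depressed) for each note."""
    if not midi_notes:
        return []
    levels = [start]
    for prev, cur in zip(midi_notes, midi_notes[1:]):
        step = cur - prev
        level = levels[-1]
        if step >= threshold:
            level = 1            # large upward jump: elevated line
        elif step <= -threshold:
            level = -1           # assumed mirror rule: large downward jump
        elif step < 0 and level > 0:
            level = 0            # melody descends: drop back to neutral
        elif step > 0 and level < 0:
            level = 0            # assumed mirror rule: melody ascends again
        levels.append(level)
    return levels


# Opening of "Twinkle Twinkle Little Star": C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]
print(line_levels(twinkle))       # [0, 0, 1, 1, 1, 1, 0]
print(line_levels([60, 64, 60]))  # [0, 0, 0]  (C->E->C: no jump)
```

The third note lands on the elevated line and the seventh drops back to neutral, matching the Twinkle example described above.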

You do not always have to start on the neutral line. It might be useful to start on an elevated or depressed line. Here’s an example (Harry Potter):

Reasoning behind the Jump Rule

You might be wondering, why make this jump rule so complicated? Why have a jump rule at all?

Well, we need some way of indicating octaves. Otherwise, an interval like C->F would be ambiguous: are we going up a perfect fourth, or going down a perfect fifth?

On the other hand, if we decreased the jump threshold, say a major third (C->E) is a jump, then the melody would be littered with jumps up and down, which would be a nightmare to handle. Setting the threshold to the perfect fourth is a good balance.
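The C->F ambiguity comes down to simple arithmetic on semitones: the distance going up and the distance going down always add up to an octave (12 semitones). Here is a small sketch, with a helper name of my own invention, limited to the natural notes.

```python
# Semitone offsets of the natural notes within one octave.
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def up_and_down(start, end):
    """Semitones from `start` to the nearest `end` above, and to the nearest below."""
    up = (PITCH_CLASS[end] - PITCH_CLASS[start]) % 12
    down = (PITCH_CLASS[start] - PITCH_CLASS[end]) % 12
    return up, down


print(up_and_down("C", "F"))  # (5, 7): up a perfect fourth, or down a perfect fifth
print(up_and_down("C", "E"))  # (4, 8): up a major third, or down a minor sixth
```

For C->E the upward reading is much the shorter one, so a reader will naturally assume it; for C->F the two readings are close enough in size that the notation needs to say which one is meant.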

The complexities of the jump rule ensure that when you’re shifting upwards, the melody is actually going upwards. It would be confusing to the reader if there was a situation where we return from the elevated line down to the neutral line, while the melody is going upwards!

Another alternative to the jump rule is to divide all the notes into fixed octaves: for instance, put any notes between C4 (middle C) and C5 on the neutral line, everything between C5 and C6 on the elevated line, and so on. I experimented with this, but found it very awkward when the melody straddles the boundary between two octaves.
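To see why the fixed-octave scheme gets awkward, here is a quick sketch of it using MIDI note numbers (middle C is 60). The function name is hypothetical, and I'm assuming that a boundary note like C5 belongs to the upper octave.

```python
def octave_levels(midi_notes, neutral_octave_start=60):
    """Line level = how many octaves a note sits above (or below) the C4 octave."""
    return [(note - neutral_octave_start) // 12 for note in midi_notes]


# A melody rocking between B4 (71) and C5 (72), right on the octave boundary.
print(octave_levels([71, 72, 71, 72]))  # [0, 1, 0, 1]
```

The melody only moves by a semitone each time, yet the notation flips between lines on every note.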

And that’s how the jump rule was created. So please experiment with this system, see if you like it!

Improving the (physical) Bookmark

If you’re an avid reader like me, you might have experienced this frustration with bookmarks.

You open up your book to the bookmarked page, but you aren’t sure where on the page you left off. So you go to the beginning of the page and start reading. But soon you realize that you’ve already read this paragraph, and the next…

A minor annoyance, fair enough. But I’d like to share a trick that neatly solves this problem.

Take any bookmark. (This doesn’t work as well if the bookmark has lots of contrasting colors)

Draw a line across the bookmark somewhere around the 2/3 or 3/4 mark. Do this only on one side.

We’re done.

Now every time you stop reading, orienting and aligning the bookmark stores enough information that you can start exactly where you left off the next time you start reading. Examples:

I’m not sure whether I’m the first to come up with this or if it’s common knowledge elsewhere, but this trick has saved me a great deal of time and frustration. Hopefully you will find it useful!

Fix for Digsby’s Facebook authentication error and broken Facebook support

To all Digsby users (ignore this post if you don’t use Digsby):

If you use Digsby with Facebook, you might have noticed that things behave strangely — the program pops up a window looking like this when it tries to connect to Facebook:

Then after you give it your credentials, Digsby still thinks you’re not logged in, and so on.

If you found this page via a Google search, there’s a simple workaround you can use to patch up this problem. Basically, instead of using the Facebook protocol to connect, we let Digsby use the Jabber protocol as a ‘proxy’ to connect to Facebook:

  1. Go to Digsby -> My Accounts and in the Add Accounts section at the top, select the Jabber icon.
  2. You should get a window that looks like this:
  3. In the Jabber ID box, put your.id@chat.facebook.com, and in the password field, put your Facebook password. For example, if your Facebook page is at facebook.com/yourname, your Jabber ID is yourname@chat.facebook.com.
  4. Remove the Facebook account from Digsby.

At this point, you’re done: Digsby should give you no more problems about Facebook.
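If you're unsure what your Jabber ID should be, the mapping in step 3 is mechanical enough to write down. This is just a sketch of that mapping with a helper name I made up; it assumes your profile address has the simple facebook.com/yourname form used in the example.

```python
from urllib.parse import urlparse


def jabber_id_from_profile(profile_url):
    """Build the Jabber ID from a profile address like facebook.com/yourname."""
    if "//" not in profile_url:
        profile_url = "//" + profile_url   # lets urlparse recognize the host part
    username = urlparse(profile_url).path.strip("/")
    if not username or "/" in username:
        raise ValueError(f"cannot find a username in {profile_url!r}")
    return f"{username}@chat.facebook.com"


print(jabber_id_from_profile("facebook.com/yourname"))  # yourname@chat.facebook.com
```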

Warning: the following is unnecessary and experimental! It might screw up the entire Digsby installation, forcing you to reinstall!

However, you can replace the Jabber icon with the Facebook one (this is for purely cosmetic purposes):

  1. Go to C:\Program Files (x86)\Digsby\res\skins\default\serviceicons (that’s the default installation path on my machine, yours may be different)
  2. Delete jabber.png, duplicate facebook.png, and rename it jabber.png
  3. Restart Digsby
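If you'd rather script the icon swap, something like the following should have the same effect as step 2. It's a sketch, not a tested installer: the function name is mine, you pass in whatever your serviceicons folder is, and the backup copy of jabber.png is my own addition so the change can be undone.

```python
import shutil
from pathlib import Path


def swap_jabber_icon(icon_dir):
    """Overwrite jabber.png with a copy of facebook.png, keeping a backup."""
    icon_dir = Path(icon_dir)
    jabber = icon_dir / "jabber.png"
    facebook = icon_dir / "facebook.png"
    if not facebook.is_file():
        raise FileNotFoundError(facebook)
    if jabber.is_file():
        # My addition: keep the original icon so the swap is reversible.
        shutil.copy2(jabber, icon_dir / "jabber.png.bak")
    # Same end result as delete + duplicate + rename.
    shutil.copy2(facebook, jabber)


# Example call (adjust the path to your own installation):
# swap_jabber_icon(r"C:\Program Files (x86)\Digsby\res\skins\default\serviceicons")
```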

There you have it — hack accomplished: