Representation Learning for Discovering Phonemic Tone Contours

My paper titled “Representation Learning for Discovering Phonemic Tone Contours” was recently presented at the SIGMORPHON workshop, held concurrently with ACL 2020. This is joint work with Jing Yi Xie and Frank Rudzicz.

Problem: Can an algorithm learn the shapes of phonemic tones in a tonal language, given a list of spoken words?

Answer: We train a convolutional autoencoder to learn a representation for each contour, then use the mean shift algorithm to find clusters in the latent space.
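To make the clustering step concrete, here is a minimal sketch of mean shift with a flat kernel, written with only the Python standard library. It is not the code from the paper. It assumes the autoencoder has already mapped each contour to a latent vector, and the toy 2-D points, the `bandwidth` value, and the rule for merging nearby modes are all illustrative assumptions.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns (centers, labels)."""
    modes = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            # Average all points within `bandwidth` of the current estimate.
            # Fall back to the estimate itself if rounding leaves no neighbours.
            near = [q for q in points if math.dist(x, q) <= bandwidth] or [x]
            new_x = [sum(c) / len(near) for c in zip(*near)]
            shift = math.dist(x, new_x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)

    # Merge modes that converged to (nearly) the same location.
    # The bandwidth / 2 threshold is an assumption made for this sketch.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels

# Hypothetical latent vectors: two well-separated groups of contours.
latent = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0),
          (5.0, 5.1), (5.1, 5.0), (4.9, 5.0)]
centers, labels = mean_shift(latent, bandwidth=1.0)
print(len(centers), labels)
```

Unlike k-means, mean shift does not need the number of clusters in advance, which matters here because the number of tones is exactly what we are trying to discover.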


By feeding the center of each cluster into the decoder, we produce a prototypical contour that represents that cluster. Here are the results for Mandarin and Cantonese.

[Figure: prototypical tone contours decoded from each cluster center]

We evaluate the clusters by their mutual information with the ground-truth tones. The method is partially successful, but contextual effects and allophonic variation present considerable difficulties.
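For readers unfamiliar with the metric, here is a minimal sketch of how mutual information between cluster assignments and true tone labels can be estimated from counts. This is the plain textbook definition in bits. The paper may use a normalized variant, and the label lists below are made up for illustration.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Estimate I(A;B) in bits from two paired lists of labels."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # Each term is p(a,b) * log2( p(a,b) / (p(a) * p(b)) ).
        mi += (n_ab / n) * math.log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi

# Hypothetical data: true tones versus two possible clusterings.
true_tones = [1, 1, 2, 2]
perfect = mutual_information(true_tones, [0, 0, 1, 1])    # clusters match tones
unrelated = mutual_information(true_tones, [0, 1, 0, 1])  # clusters ignore tones
print(perfect, unrelated)
```

The score does not depend on how the clusters are numbered, so a clustering that recovers the tones under different labels still scores well.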

For the full details, read my paper here!
