Week 12

Working with MATLAB

This week I started playing around with MATLAB because I thought it would be useful for visualizing some of our summary statistics. MATLAB also has a suite of really nice audio functions that I thought might be helpful. I ended up writing a script that calculates the total duration of a given audio corpus. I tested it on some of the downloaded corpora and it worked! This is nice because we now have a way of calculating how many hours of audio we have. For example, the Davidson corpus contains approximately 569 hours of audio. I think this is useful for a number of reasons, but primarily for making sure our data collection and analysis are publication quality.
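For anyone who wants to try the same idea outside MATLAB, here is a minimal sketch in Python. It is not my MATLAB script, just the same calculation under a few assumptions: the corpus is a folder of uncompressed PCM .wav files, and the function names and the folder path are made up for illustration.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def corpus_duration_hours(corpus_dir):
    """Sum the durations of every .wav file under corpus_dir, in hours."""
    total_seconds = sum(
        wav_duration_seconds(p) for p in sorted(Path(corpus_dir).rglob("*.wav"))
    )
    return total_seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical path; replace with wherever the corpus actually lives.
    print(f"{corpus_duration_hours('corpora/davidson'):.1f} hours")
```

The standard-library wave module only reads uncompressed WAV, so a corpus stored as MP3 or FLAC would need a different reader.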

CREU Group meeting

We had a group meeting this week. We talked about drafting the outlines and papers we want to submit to the CogSci conference in February. Prof. Beckage and Prof. Brumberg advised us to prepare an outline to present at the next meeting, so that we have a better picture of how our papers will be structured, and to have at least one presentation completed. Prof. Beckage also mentioned some things about storytelling, which was cool to hear. I think there are a lot of compelling stories we can tell with this project.

Kaldi

I didn’t do very much this week because of Thanksgiving, but I have been thinking a lot about how to get WER output for single words. Being able to get error rates per word and track how the accuracy changes over training instances will be a major portion of the work we present, as it’s one of the primary metaphors for how children gain proficiency with recognizing and using words.
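To make the per-word idea concrete, here is a rough Python sketch of one way it could work. This is not Kaldi code and not something I have running yet. It assumes we can export (reference, hypothesis) transcript pairs as plain text, aligns each pair with a standard word-level edit distance, and charges every substitution or deletion to the reference word involved. Insertions are skipped because there is no reference word to charge them to. All function names are made up for illustration.

```python
from collections import defaultdict


def align(ref, hyp):
    """Edit-distance align two word lists; None marks a gap on either side."""
    n, m = len(ref), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Walk back from the bottom-right corner to recover one best alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))  # deletion
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # insertion
            j -= 1
    pairs.reverse()
    return pairs


def per_word_error_rates(transcript_pairs):
    """Map each reference word to (times misrecognized) / (times it occurred)."""
    counts = defaultdict(lambda: [0, 0])  # word -> [errors, occurrences]
    for ref_text, hyp_text in transcript_pairs:
        for ref_word, hyp_word in align(ref_text.split(), hyp_text.split()):
            if ref_word is None:
                continue  # insertion: no reference word to charge it to
            counts[ref_word][1] += 1
            if ref_word != hyp_word:
                counts[ref_word][0] += 1
    return {word: errs / total for word, (errs, total) in counts.items()}
```

Running this once per saved training checkpoint would give one error rate per word per checkpoint, which is the accuracy-over-training curve I am after.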

Best,
EO