This is a workflow for processing a corpus of ultrasound recordings in Articulate Assistant Advanced (AAA) to analyse tongue splines (thanks to George Bailey for some of the initial AAA workflow steps). You can download AAA here, but you (or your lab/grant) will need to purchase the full version of the software for full functionality. This guide was written using AAA version 220.04.2 (21st Feb 2023). We will also make use of Praat and R scripts written by Pat Strycharczuk and Sam Kirkham – these scripts are available on request. In this guide, ultrasound splines are fitted using the version of DeepLabCut (DLC) that is packaged with the latest versions of AAA. This runs on Anaconda, so be sure to install the full suite of DLC and Miniconda (if you don’t already have it) when prompted during your AAA installation.

You’ll also need to install the Montreal Forced Aligner (MFA) Anaconda environment. Full installation instructions can be found here; you should be able to create MFA as an environment in the same Miniconda build that AAA created when you installed it.

I expect you will have these already, but you’ll also need working installations of both R (e.g. RStudio download) and Praat (Windows download, Mac download).

(Optional) Download Bulk Rename Utility here. This is optional because you might have a preferred method of renaming batches of files, e.g. various command-line processes, depending on your experience and your operating system. I included Bulk Rename Utility in this guide because it’s very user-friendly for those who don’t often use command line operations for this sort of thing.

Fitting tongue splines using DeepLabCut

In AAA, check that the only prompts you have left for your speaker are the ones you want to use in your analysis (this can include configuration recordings such as bite plate and water swallow prompts). In particular, make sure there are none that generate AAA errors when you play/open them, or this will cause the batch DLC processing to crash. Now check that the videos are correctly synchronised with the audio. Here is Alan Wrench’s video on how to do this, but generally speaking, if the first pulse tick mark in the audio channel lines up with the first tick mark in the pulse channel, the synchronisation is fine:

To begin fitting splines, right-click on the ultrasound display, and click Edit splines… to bring up the dialog box below. Click the Batch tab, then the Process Batch tab:

For future analysis you’ll probably need fiducial lines (i.e. reference lines for comparison) for your tongue splines, so tick all three DLC options to create hyoid-mandible and short-tendon-mandible splines as well. Also check that you’ve selected the neural-net model that you want to use for the splining. To change the model, go to the Fit Spline tab, then click the DeepLabCut Settings button. This is also the place to check that everything is running correctly on your computer, and to update the models to the latest versions:

I’ve found that ResNet50 is the most accurate model (in our project we’re interested in retroflexes), but do read up on the advantages of the different available models here.

Before you start processing your batch, you’ll need to make sure the AAA filtering is set up to allow you to spline batches of recordings. Click on the various settings in the Choose recordings tab to select which recordings you want to process:

If you’re selecting lots of files, and you’re confident that your processing speed is up to the task (or you’re happy to wait), then you’re good to go. Here is more information on the processing requirements. (I found that my laptop started to overheat during processing, so if this happens to you, you might want to invest in a cooling pad! Here is the one I got.)

When you’re happy with the settings, click the Process button, and the splines will start to be generated. This will take either minutes or hours, depending on your hardware setup and how many files you’re processing at once. You can monitor the progress if you left that option ticked.

Exporting from AAA for automatic alignment in MFA

Assuming each speaker has their own project, extract the project’s .wav and .txt files to their own folder. You’ll want to make sure you have good folder management from this point, so create folders with meaningful names, like Output_files_for_MFA\**SPEAKER_FOLDER**\. Feel free to use a different method of folder management than what I’m showing in this guide – whatever works best for you!

To extract files in AAA, select File>Export>files…, then in the Files tab, select Use prompt and the correct directory. Make sure you select the client (and make sure all the recordings are selected) by pressing the Select Clients/Sessions to Export button. Then in the WAV tab, select Export WAV as individual channels, selecting channel 1, and removing the Track1 suffix. Ensure everything else is unchecked in the other tabs:

Now, run Bulk Rename Utility (or use your preferred command-line batch-renaming method) to rename the .txt files in both directories to an MFA-friendly format, e.g. ‘I_could_hear_again.txt’, without spaces, because spaces cause problems in the commands you’ll be running in Anaconda. In fact, your entire filepath must contain no spaces, so it’s a good idea to check at this stage that all your parent folder names use something like underscores or hyphens instead.
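If you’d rather script the rename than use Bulk Rename Utility, here’s a minimal Python sketch of the same operation (the function name and the folder path in the usage comment are placeholders of my own, not part of the project scripts):

```python
from pathlib import Path

def despace_names(folder: Path) -> None:
    """Rename every file in `folder`, replacing spaces with underscores."""
    for f in folder.iterdir():
        if f.is_file() and " " in f.name:
            # e.g. 'I could hear again.txt' -> 'I_could_hear_again.txt'
            f.rename(f.with_name(f.name.replace(" ", "_")))

# Point this at your own speaker folder, e.g.:
# despace_names(Path(r"Output_files_for_MFA\SPEAKER_FOLDER"))
```

Run it once per directory that contains exported files. Note that it only fixes filenames, not folder names, so you’ll still need to check the rest of the filepath by hand.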

Next, make a copy of all the renamed txt files and put them here (you’ll need to create these sub-folders): Output_files_for_MFA\Output_TextGrids_from_MFA\Input_annotations_MFA_to_AAA\**SPEAKER_FOLDER**\. This is so that you still have the AAA metafiles for importing the annotations back into AAA later. Now put all the wavs here: Output_files_for_MFA\Output_TextGrids_from_MFA\**SPEAKER_FOLDER**\. Put the browse_save_clear.praat script in this same directory.
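If you’d like to create the whole folder tree in one go rather than clicking through a file manager, a few lines of Python will do it (the folder names follow the layout used in this guide; substitute your own speaker name for SPEAKER_FOLDER):

```python
from pathlib import Path

# Folder layout as used in this guide -- adjust names to suit your project.
base = Path("Output_files_for_MFA")
speaker = "SPEAKER_FOLDER"

for sub in [
    base / speaker,
    base / "Output_TextGrids_from_MFA" / speaker,
    base / "Output_TextGrids_from_MFA" / "Input_annotations_MFA_to_AAA" / speaker,
]:
    # parents=True builds intermediate folders; exist_ok avoids errors on re-runs
    sub.mkdir(parents=True, exist_ok=True)
```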

Back inside the first directory containing the files you want to convert to MFA-friendly format (Output_files_for_MFA\**SPEAKER_FOLDER**\), run the R script remove_AAA_metadata.R, changing the directory to match the speaker name etc. This will remove the metadata from the files, leaving only the prompts. It will also convert some words, e.g. ‘Mr’ -> ‘mister’: the abbreviation ‘Mr’ is not in the MFA dictionary we’re using in our project. This script has conversions that are specific to the prompts in our project – you’ll need to edit this to suit your own data.
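To give a sense of what that script does, here’s a rough Python equivalent. It’s a sketch, not the actual remove_AAA_metadata.R: it assumes the prompt sits on the first line of each exported .txt file, and the conversion table below is illustrative only – the real list of conversions lives in the R script and is specific to our project’s prompts:

```python
from pathlib import Path

# Illustrative conversions only -- extend to match your own prompts.
CONVERSIONS = {"Mr": "mister"}

def clean_prompt_file(path: Path) -> None:
    """Keep only the prompt (assumed here to be the first line of the
    AAA .txt export) and expand abbreviations MFA's dictionary lacks."""
    prompt = path.read_text(encoding="utf-8").splitlines()[0]
    words = [CONVERSIONS.get(w, w) for w in prompt.split()]
    path.write_text(" ".join(words), encoding="utf-8")

# Usage (path is a placeholder):
# for f in Path(r"Output_files_for_MFA\SPEAKER_FOLDER").glob("*.txt"):
#     clean_prompt_file(f)
```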

Processing files in MFA

Assuming you’ve successfully installed your MFA environment on your computer, open Anaconda Prompt from the start menu and run this command to activate it:

conda activate aligner

(to deactivate when you’re finished, run `conda deactivate`)

As per Eleanor Chodroff’s MFA tutorial, the structure for running an alignment command is:

mfa align --clean input_files_location pretrained_model_location acoustic_model output_files_location

You should be able to see now why spaces in filenames are dispreferred in Python/Anaconda! Again, since this issue also applies to the whole filepath, use e.g. underscores instead. So for my local version, the command looks like this:

mfa align --clean C:\Users\rober\Dropbox\Danielle-Robert-shared\ultrasound\BB_2023\Output_files_for_MFA\**SPEAKER_FOLDER**\ C:\Users\rober\Documents\MFA\pretrained_models\dictionary\english_us_arpa.dict english_us_arpa C:\Users\rober\Dropbox\Danielle-Robert-shared\ultrasound\BB_2023\Output_files_for_MFA\Output_TextGrids_from_MFA\**SPEAKER_FOLDER**\

Here’s what that looks like in the Miniconda window:

When you’re writing these commands, pay close attention to the location of the spaces, as they separate the command arguments. You might want to write it in a text editor (e.g. Notepad++) first, then copy it into the Miniconda window.

Once this has finished processing, open Praat and run browse_save_clear.praat to inspect the results and to clean up any MFA errors, such as misaligned boundaries. That said, I’ve found MFA to be very accurate, especially with ultrasound recordings, which tend to be produced in a careful lab speech style.

Importing back into AAA

Once you’re happy that the boundaries for your segments of interest all line up with the correct part of the audio, copy the TextGrids to the folder you created earlier, which already has the AAA metadata files we put there: Output_files_for_MFA\Output_TextGrids_from_MFA\Input_annotations_MFA_to_AAA\**SPEAKER_FOLDER**\. Unless you’ve changed something along the way, the TextGrid filenames should all still match the txt filenames (including underscores instead of spaces): it’s important that they match exactly.
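If you’d like to double-check that match programmatically before importing, here’s a small Python sketch (the function name and the folder path in the usage comment are placeholders of my own):

```python
from pathlib import Path

def unmatched_files(folder: Path) -> set[str]:
    """Return the stems of .TextGrid and .txt files in `folder` that
    lack a partner file with the other extension."""
    grids = {p.stem for p in folder.glob("*.TextGrid")}
    txts = {p.stem for p in folder.glob("*.txt")}
    return grids ^ txts  # symmetric difference: stems missing a partner

# Usage (path is a placeholder):
# leftovers = unmatched_files(
#     Path(r"Output_files_for_MFA\Output_TextGrids_from_MFA"
#          r"\Input_annotations_MFA_to_AAA\SPEAKER_FOLDER"))
# if leftovers:
#     print("Fix these before importing:", sorted(leftovers))
```

An empty set means every TextGrid has a matching txt file (and vice versa), so you’re safe to import.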

In AAA, go to File>Import>TextGrid… to load the annotations into your project, selecting the folder which contains the TextGrids and txt files. Importing will take a few minutes, and you can watch the progress at the bottom of the AAA window.

Exporting data from AAA

When you’re happy that everything looks OK, go to File>Export>Data…. There are many options here, and your project will have its own requirements, but see the following screenshots for what we exported in our project:

This one is a bit fiddly, as you have to press the Add button to create new column variables, then change that variable using the buttons on the right hand side:

Again, this is just a small selection of the values you can export: there are many more options, so have a look around and see what’s useful for you. Press the Export button to generate your data file in any location you wish.

Happy analysing!