1 Objective
2 Preliminaries and starting up R
3 Converting Praat TextGrids
4 Calculating pitch with wrassp
5 Adding the calculated pitch files to the database
6 Displaying the pitch files in the webapp
7 Adding an event tier
8 Labelling some tones
9 Automatically linking event and segment times

1 Objective

The aim is to get from a Praat .TextGrid to an Emu database format as exemplified by Fig. 1.1:

Figure 1.1: An utterance fragment in Praat and in Emu.

2 Preliminaries and starting up R

The assumption is that you have a project called ips and that it contains, among others, the directories testsample and emu_databases, both of which are used below.

If not, please see preliminaries here.

Start up R in the project you are using for this course.

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(emuR)

## 
## Attaching package: 'emuR'
## 
## The following object is masked from 'package:base':
## 
##     norm

library(wrassp)

In R, store the path to the directory testsample as sourceDir in exactly the following way:

sourceDir = "./testsample"

And also store in R the path to emu_databases as targetDir:

targetDir = "./emu_databases"
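Both paths are relative, so they only resolve if R's working directory is the project directory. The following is a minimal sketch of a check in base R; the helper name check_dirs is hypothetical and not part of emuR.

```r
# Hypothetical helper: report which of the given directories exist.
check_dirs = function(paths) {
  found = dir.exists(paths)
  names(found) = paths
  found
}

# Example with the two paths used in this tutorial;
# both entries should be TRUE before continuing.
check_dirs(c("./testsample", "./emu_databases"))
```

If either entry is FALSE, check the working directory with getwd() before going on.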

3 Converting Praat TextGrids

The directory testsample/praat on your computer contains a Praat-style database with .wav files and .TextGrid files. Define the path to this database in R and check that you can see these files with the list.files() function:

path.praat = file.path(sourceDir, "praat")
list.files(path.praat)

##  [1] "wetter1.pit"       "wetter1.TextGrid"  "wetter1.wav"      
##  [4] "wetter10.pit"      "wetter10.TextGrid" "wetter10.wav"     
##  [7] "wetter11.pit"      "wetter11.TextGrid" "wetter11.wav"     
## [10] "wetter12.pit"      "wetter12.TextGrid" "wetter12.wav"     
## [13] "wetter13.pit"      "wetter13.TextGrid" "wetter13.wav"     
## [16] "wetter14.pit"      "wetter14.TextGrid" "wetter14.wav"     
## [19] "wetter15.pit"      "wetter15.TextGrid" "wetter15.wav"     
## [22] "wetter16.pit"      "wetter16.TextGrid" "wetter16.wav"     
## [25] "wetter17.pit"      "wetter17.TextGrid" "wetter17.wav"     
## [28] "wetter2.pit"       "wetter2.TextGrid"  "wetter2.wav"      
## [31] "wetter3.pit"       "wetter3.TextGrid"  "wetter3.wav"      
## [34] "wetter4.pit"       "wetter4.TextGrid"  "wetter4.wav"      
## [37] "wetter6.pit"       "wetter6.TextGrid"  "wetter6.wav"      
## [40] "wetter7.pit"       "wetter7.TextGrid"  "wetter7.wav"
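As far as the emuR documentation describes it, the conversion pairs each .TextGrid file with the audio file of the same base name, so it can be worth checking for unpaired files first. The following base R sketch does this; the helper name wav_without_textgrid is hypothetical.

```r
# Hypothetical helper: return the base names of .wav files in `dir`
# that have no .TextGrid file of the same name.
wav_without_textgrid = function(dir) {
  wav = list.files(dir, pattern = "\\.wav$")
  tg = list.files(dir, pattern = "\\.TextGrid$")
  setdiff(tools::file_path_sans_ext(wav), tools::file_path_sans_ext(tg))
}

# An empty result, character(0), means every .wav file has a partner.
wav_without_textgrid(file.path("./testsample", "praat"))
```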

The emuR function for converting the TextGridCollection to an Emu database and then storing the latter in targetDir (defined above) is convert_TextGridCollection(). It works like this:

convert_TextGridCollection(path.praat, 
                           dbName = "praat",
                           targetDir = targetDir)

The converted Praat database can now be loaded:

praat_DB = load_emuDB(file.path(targetDir, "praat_emuDB"))

## INFO: Loading EMU database from ./emu_databases/praat_emuDB... (14 bundles found)

and its properties examined as before:

summary(praat_DB)

##

## ── Summary of emuDB ────────────────────────────────────────────────────────────

## Name:     praat 
## UUID:     c56238e2-8b60-4aab-b306-7ab6da2d8713 
## Directory:    /Users/jmh/Desktop/ipsR/emu_databases/praat_emuDB 
## Session count: 1 
## Bundle count: 14 
## Annotation item count:  214 
## Label count:  214 
## Link count:  0

##

## ── Database configuration ──────────────────────────────────────────────────────

##

## ── SSFF track definitions ──

##

## data frame with 0 columns and 0 rows

## ── Level definitions ──

##  name type    nrOfAttrDefs attrDefNames
##  ORT  SEGMENT 1            ORT;

## ── Link definitions ──

## data frame with 0 columns and 0 rows

And it can of course be viewed:

serve(praat_DB, useViewer = F)

4 Calculating pitch with `wrassp`

The task is to calculate the pitch from the waveform of each utterance in the praat_emuDB database created above. First, find the full path names of all of the .wav files. They are here:

praat_wav_paths = list.files(path.praat, 
                             pattern = ".*wav$", 
                             recursive = T, 
                             full.names = T)
praat_wav_paths

##  [1] "./testsample/praat/wetter1.wav"  "./testsample/praat/wetter10.wav"
##  [3] "./testsample/praat/wetter11.wav" "./testsample/praat/wetter12.wav"
##  [5] "./testsample/praat/wetter13.wav" "./testsample/praat/wetter14.wav"
##  [7] "./testsample/praat/wetter15.wav" "./testsample/praat/wetter16.wav"
##  [9] "./testsample/praat/wetter17.wav" "./testsample/praat/wetter2.wav" 
## [11] "./testsample/praat/wetter3.wav"  "./testsample/praat/wetter4.wav" 
## [13] "./testsample/praat/wetter6.wav"  "./testsample/praat/wetter7.wav"

The signal processing package wrassp will now be used to calculate the pitch for each of these .wav files. To see the full range of signal processing routines available, enter:

?wrassp

Two routines are available for calculating pitch: ksvF0 and mhsF0.
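For comparison, ksvF0 can be tried on a single file without writing anything to disk. This is only a sketch and is not used further in this tutorial: the wrapper name try_ksvF0 is hypothetical, and it assumes the praat_wav_paths vector defined above. With toFile = FALSE the result is returned as an object in R instead of being saved next to the .wav file.

```r
# Hypothetical wrapper around wrassp::ksvF0() for a quick look at one file.
try_ksvF0 = function(wav_path) {
  # toFile = FALSE returns the result instead of writing a file
  f0 = wrassp::ksvF0(wav_path, toFile = FALSE, verbose = FALSE)
  # names() lists the track(s) contained in the returned object
  list(tracks = names(f0), data = f0)
}

# Usage, assuming praat_wav_paths as defined above:
# res = try_ksvF0(praat_wav_paths[1])
# res$tracks
```

The file extension and track name that ksvF0 uses when it does write files can be looked up with wrasspOutputInfos$ksvF0.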

Here’s how to use mhsF0 with the default settings. The output is going to be stored in path.praat (i.e. in testsample/praat on your computer).

mhsF0(praat_wav_paths, outputDirectory = path.praat)

## 
##   INFO: applying mhspitch to 14 files

As the figure below shows, the pitch files should now all have been written to path.praat, i.e. to testsample/praat.
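To look inside one of the new files from R, it can be read back with wrassp's read.AsspDataObj(). The sketch below assumes the track is called pitch, which is the track name that mhsF0 writes; the helper name inspect_pitch_file is hypothetical.

```r
# Hypothetical helper: summarise one SSFF pitch file written by mhsF0.
inspect_pitch_file = function(pit_path) {
  # read the SSFF file into an in-memory AsspDataObj
  pit = wrassp::read.AsspDataObj(pit_path)
  list(
    tracks = names(pit),                  # track names stored in the file
    frames = nrow(pit$pitch),             # number of pitch frames
    sampleRate = attr(pit, "sampleRate")  # frames per second
  )
}

# Usage, assuming the files of this tutorial exist:
# inspect_pitch_file("./testsample/praat/wetter1.pit")
```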

5 Adding the calculated pitch files to the database

These calculated pitch files now need to be added to praat_emuDB. This is done with the add_files() function. The parameter targetSessionName can be omitted in this case, because all of the bundles are stored in the session directory 0000, which is also the default value of this parameter. This can be verified with:

list_bundles(praat_DB)

## # A tibble: 14 × 2
##    session name    
##    <chr>   <chr>   
##  1 0000    wetter1 
##  2 0000    wetter10
##  3 0000    wetter11
##  4 0000    wetter12
##  5 0000    wetter13
##  6 0000    wetter14
##  7 0000    wetter15
##  8 0000    wetter16
##  9 0000    wetter17
## 10 0000    wetter2 
## 11 0000    wetter3 
## 12 0000    wetter4 
## 13 0000    wetter6 
## 14 0000    wetter7

Now add the pitch files to praat_DB:

add_files(praat_DB, 
          dir = path.praat, 
          fileExtension = "pit", 
          targetSessionName = "0000")

Having added the files, they need to be defined. The information required is:

a track name. This can be anything and it is needed when referring to these signal files in R.
the file extension. This is pit as already established above.
the columnName. This is the name of the column in the .pit files in which the fundamental frequency data is stored. This type of information (as well as information about the extension) is given by wrasspOutputInfos. In this case, append $mhsF0 since this was the name of the signal processing routine that has been used to calculate the pitch data:

wrasspOutputInfos$mhsF0

## $ext
## [1] "pit"
## 
## $tracks
## [1] "pitch"
## 
## $outputType
## [1] "SSFF"

The column name is given by $tracks which in this case is pitch. Putting all this together, and using "pitch" for the name of the track gives:

add_ssffTrackDefinition(praat_DB,
                        name = "pitch",
                        columnName = "pitch",
                        fileExtension = "pit")

summary(praat_DB)

## ── Summary of emuDB ────────────────────────────────────────────────────────────

## Name:     praat 
## UUID:     c56238e2-8b60-4aab-b306-7ab6da2d8713 
## Directory:    /Users/jmh/Desktop/ipsR/emu_databases/praat_emuDB 
## Session count: 1 
## Bundle count: 14 
## Annotation item count:  214 
## Label count:  214 
## Link count:  0

##

## ── Database configuration ──────────────────────────────────────────────────────

##

## ── SSFF track definitions ──

##

##  name  columnName fileExtension
##  pitch pitch      pit

## ── Level definitions ──

##  name type    nrOfAttrDefs attrDefNames
##  ORT  SEGMENT 1            ORT;

## ── Link definitions ──

## data frame with 0 columns and 0 rows
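Once the track is defined, the pitch values can be read back into R by track name. The following is a sketch under the assumptions of this tutorial (a level ORT and a track pitch); the helper name pitch_for_level is hypothetical, while query() and get_trackdata() are emuR functions.

```r
# Hypothetical helper: pitch data for every segment of a level.
pitch_for_level = function(db, level = "ORT", track = "pitch") {
  # regular-expression query that matches every label on the level
  segs = emuR::query(db, paste0(level, " =~ .*"))
  # extract the track values that fall within each segment
  emuR::get_trackdata(db, seglist = segs, ssffTrackName = track)
}

# Usage, assuming praat_DB as loaded above:
# td = pitch_for_level(praat_DB)
# head(td)
```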

6 Displaying the pitch files in the webapp

The signals that are currently displayed for this praat_DB database can be seen with the function get_signalCanvasesOrder() as follows:

get_signalCanvasesOrder(praat_DB, perspectiveName = "default")

## [1] "OSCI" "SPEC"

which confirms that what is seen when viewing the database with the serve() function is the waveform (OSCI) and the spectrogram (SPEC). The pitch data created above now needs to be added using the function set_signalCanvasesOrder(). The perspectiveName argument is "default" here, as in the call to get_signalCanvasesOrder() above, thus:

set_signalCanvasesOrder(praat_DB, 
                        perspectiveName = "default",
                        order = c("OSCI", "SPEC", "pitch"))
serve(praat_DB, useViewer = F)

7 Adding an event tier

The next task is to add an event tier that can be used for labelling tones. Here the tier is called “Tone”. So far, the only existing time tier is ORT as confirmed by:

list_levelDefinitions(praat_DB)

##   name    type nrOfAttrDefs attrDefNames
## 1  ORT SEGMENT            1         ORT;

In order to add a new tier called Tone as an EVENT tier:

add_levelDefinition(praat_DB, "Tone", "EVENT")

##   INFO: Rewriting 14 _annot.json files to file system...

Display Tone so that it is above the ORT tier and so directly underneath the signals:

get_levelCanvasesOrder(praat_DB, perspectiveName = "default")

## [1] "ORT"

set_levelCanvasesOrder(praat_DB, 
                       perspectiveName = "default", 
                       order = c("Tone", "ORT"))

8 Labelling some tones

Add two H* tone labels at the pitch peaks of morgens and ruhig in wetter1, as in Fig. 1.1, and save the result.

serve(praat_DB, useViewer=F)

The tones are to be linked to the words within which they occur in time. To do this, define a hierarchical relationship such that ORT dominates Tone:

list_linkDefinitions(praat_DB)

## NULL

add_linkDefinition(praat_DB, 
                   type = "ONE_TO_MANY", 
                   superlevelName = "ORT", 
                   sublevelName = "Tone")

list_linkDefinitions(praat_DB)

##          type superlevelName sublevelName
## 1 ONE_TO_MANY            ORT         Tone

Inspect the hierarchy:

summary(praat_DB)

## ── Summary of emuDB ────────────────────────────────────────────────────────────

## Name:     praat 
## UUID:     c56238e2-8b60-4aab-b306-7ab6da2d8713 
## Directory:    /Users/jmh/Desktop/ipsR/emu_databases/praat_emuDB 
## Session count: 1 
## Bundle count: 14 
## Annotation item count:  214 
## Label count:  214 
## Link count:  0

##

## ── Database configuration ──────────────────────────────────────────────────────

##

## ── SSFF track definitions ──

##

##  name  columnName fileExtension
##  pitch pitch      pit

## ── Level definitions ──

##  name type    nrOfAttrDefs attrDefNames
##  ORT  SEGMENT 1            ORT;        
##  Tone EVENT   1            Tone;

## ── Link definitions ──

##  type        superlevelName sublevelName
##  ONE_TO_MANY ORT            Tone

# switch to hierarchy view
serve(praat_DB, useViewer = F)

9 Automatically linking event and segment times

This makes use of the autobuild_linkFromTimes() function in order to link the tones to the corresponding words:

autobuild_linkFromTimes(praat_DB,
                        superlevelName = "ORT",
                        sublevelName = "Tone")

##   INFO: Rewriting 14 _annot.json files to file system...

# switch to hierarchy view
serve(praat_DB, useViewer = F)
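The links can also be checked in R rather than in the hierarchy view. This sketch assumes that at least one tone has been labelled and saved as described in section 8; the helper name words_for_tones is hypothetical, while query() and requery_hier() are emuR functions.

```r
# Hypothetical helper: find the ORT items that the Tone events are linked to.
words_for_tones = function(db) {
  # all events on the Tone level
  tones = emuR::query(db, "Tone =~ .*")
  # follow the links upwards to the dominating ORT level
  words = emuR::requery_hier(db, seglist = tones, level = "ORT")
  list(tones = tones, words = words)
}

# Usage, assuming praat_DB after autobuild_linkFromTimes():
# res = words_for_tones(praat_DB)
# res$words$labels
```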

Converting a Praat TextGrid collection

Jonathan Harrington

WiSe 2023
