For historical reasons, emuR has some specialized objects and specialized methods for working with them. Such specialized objects and methods are sometimes unavoidable, but they should be avoided whenever possible; instead, we can use standardized procedures that are very common in R.

To see the advantages of a more standardized procedure, let us once again create a temporary EMU-SDMS database first (using, by the way, some specialized but unavoidable commands from the package emuR):

# load package
library(emuR)
# create demo data in directory
# provided by tempdir()
create_emuRdemoData(dir = tempdir())
# create path to demo database
path2ae = file.path(tempdir(), "emuR_demoData", "ae_emuDB")
# load database
ae = load_emuDB(path2ae, verbose = FALSE)

Specialized objects containing data need specialized commands to plot data

As we have seen in chapter 06, the default result of a call to get_trackdata() is an object of class trackdata, a very special class that exists only in the package emuR (and its predecessors). The emuR package provides several specialized routines such as dcut(), trapply(), eplot() and dplot() for processing and visually inspecting objects of this type (see Harrington, 2010, for the use of these functions).

vowels = query(ae,query="Phonetic==i:|u:|E")
vowels_fm = get_trackdata(ae,
                        seglist = vowels,
                        ssffTrackName =  "fm",
                        verbose = FALSE)
# show class of vowels_fm
class(vowels_fm)
## [1] "trackdata"

The following command then extracts the formant values at the temporal midpoint of each segment (each vowel, in this case):

vowels_fm05=dcut(vowels_fm,.5,prop = TRUE)

We can then use this object to plot the data and 95%-confidence ellipses.

eplot(vowels_fm05[,1:2],label(vowels),centroid=TRUE,formant = TRUE)

The original trackdata object can be used to plot formant trajectories (here: F2 only) as a function of time (first example), or a mean F2 trajectory for each vowel category as a function of normalized time (second example):

dplot(vowels_fm[,2],label(vowels))

dplot(vowels_fm[,2],label(vowels),normalise=TRUE,average=TRUE)

These commands (and many other commands in the predecessors of emuR) are specialized to work with, and only with, trackdata objects.

Use of standard data.frames instead of specialized objects

In most cases, however, an R user will store their data in data.frames. Data.frames are required by most commands in most packages concerned with plotting and/or statistical analysis.

As the trackdata object is a fairly complex nested matrix object with internal reference matrices, which can be cumbersome to work with, the emuR package introduces a new equivalent object type called emuRtrackdata that is essentially a flat data.frame or data.table object. This object type can be retrieved by setting the resultType parameter of the get_trackdata() function to "emuRtrackdata":

vowels_fm_new = get_trackdata(ae,
                        seglist = vowels,
                        ssffTrackName =  "fm",
                        resultType="emuRtrackdata",
                        verbose = FALSE)
# show class of vowels_fm_new
class(vowels_fm_new)
## [1] "emuRtrackdata" "data.table"    "data.frame"
vowels_fm_new
##      sl_rowIdx labels    start      end          utts
##   1:         1      E  949.925 1031.925 0000:msajc003
##   2:         1      E  949.925 1031.925 0000:msajc003
##   3:         1      E  949.925 1031.925 0000:msajc003
##   4:         1      E  949.925 1031.925 0000:msajc003
##   5:         1      E  949.925 1031.925 0000:msajc003
##  ---                                                 
## 295:        18      E 2480.425 2587.675 0000:msajc057
## 296:        18      E 2480.425 2587.675 0000:msajc057
## 297:        18      E 2480.425 2587.675 0000:msajc057
## 298:        18      E 2480.425 2587.675 0000:msajc057
## 299:        18      E 2480.425 2587.675 0000:msajc057
##                                   db_uuid session   bundle start_item_id
##   1: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc003           157
##   2: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc003           157
##   3: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc003           157
##   4: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc003           157
##   5: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc003           157
##  ---                                                                    
## 295: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc057           200
## 296: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc057           200
## 297: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc057           200
## 298: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc057           200
## 299: 0fc618dc-8980-414d-8c7a-144a649ce199    0000 msajc057           200
##      end_item_id    level start_item_seq_idx end_item_seq_idx    type
##   1:         157 Phonetic                 11               11 SEGMENT
##   2:         157 Phonetic                 11               11 SEGMENT
##   3:         157 Phonetic                 11               11 SEGMENT
##   4:         157 Phonetic                 11               11 SEGMENT
##   5:         157 Phonetic                 11               11 SEGMENT
##  ---                                                                 
## 295:         200 Phonetic                 39               39 SEGMENT
## 296:         200 Phonetic                 39               39 SEGMENT
## 297:         200 Phonetic                 39               39 SEGMENT
## 298:         200 Phonetic                 39               39 SEGMENT
## 299:         200 Phonetic                 39               39 SEGMENT
##      sample_start sample_end sample_rate times_rel times_orig  T1   T2
##   1:        18999      20638       20000         0      952.5 422 1613
##   2:        18999      20638       20000         5      957.5 434 1651
##   3:        18999      20638       20000        10      962.5 447 1686
##   4:        18999      20638       20000        15      967.5 449 1703
##   5:        18999      20638       20000        20      972.5 445 1712
##  ---                                                                  
## 295:        49609      51753       20000        85     2567.5 440 1564
## 296:        49609      51753       20000        90     2572.5 428 1515
## 297:        49609      51753       20000        95     2577.5 400 1470
## 298:        49609      51753       20000       100     2582.5 348 1422
## 299:        49609      51753       20000       105     2587.5 278 1376
##        T3   T4
##   1: 2118 2750
##   2: 2195 2824
##   3: 2229 3536
##   4: 2245 3536
##   5: 2275 3224
##  ---          
## 295: 2345 3275
## 296: 2308 3217
## 297: 2286 3203
## 298: 2260 3214
## 299: 2232 3274
names(vowels_fm_new)
##  [1] "sl_rowIdx"          "labels"             "start"             
##  [4] "end"                "utts"               "db_uuid"           
##  [7] "session"            "bundle"             "start_item_id"     
## [10] "end_item_id"        "level"              "start_item_seq_idx"
## [13] "end_item_seq_idx"   "type"               "sample_start"      
## [16] "sample_end"         "sample_rate"        "times_rel"         
## [19] "times_orig"         "T1"                 "T2"                
## [22] "T3"                 "T4"

The emuRtrackdata object is an amalgamation of both a segment list and a trackdata object. The first column, sl_rowIdx, indicates the row index of the segment list that the current row belongs to. The times_rel and times_orig columns (and times_norm in the forthcoming emuR version) contain the relative time and the original time of the samples in the current row, and T1 (up to Tn in n-dimensional trackdata) contains the actual signal sample values. It is also worth noting that the emuR package provides a function called create_emuRtrackdata(), which allows users to create an emuRtrackdata object from a segment list and a trackdata object. This is beneficial because trackdata objects can first be processed with functions provided by the emuR package (e.g., dcut() and trapply()) and then converted into a standardized data.table object for further processing (e.g., with R packages such as lme4 or ggplot2, which were written for use with data.frame or data.table objects).
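
Because the emuRtrackdata object is flat, a midpoint extraction in the spirit of dcut(..., .5, prop = TRUE) can be sketched with base R alone. The data frame below is a small hypothetical stand-in with made-up values that only mimics the sl_rowIdx, labels, times_rel, T1 and T2 columns; pick_mid() is a helper introduced here for illustration and is not part of emuR.

```r
# Toy stand-in for an emuRtrackdata object (hypothetical values):
# two segments, sampled every 5 ms
td <- data.frame(
  sl_rowIdx = c(1, 1, 1, 2, 2, 2, 2, 2),
  labels    = c("E", "E", "E", "i:", "i:", "i:", "i:", "i:"),
  times_rel = c(0, 5, 10, 0, 5, 10, 15, 20),
  T1        = c(422, 434, 447, 280, 275, 270, 268, 272),
  T2        = c(1613, 1651, 1686, 2200, 2250, 2290, 2300, 2280)
)

# hypothetical helper: for one segment, keep the row whose relative
# time is closest to the temporal midpoint of that segment
pick_mid <- function(seg) {
  mid <- max(seg$times_rel) / 2
  seg[which.min(abs(seg$times_rel - mid)), ]
}

# split by segment, pick one row per segment, and bind the rows again
midpoints <- do.call(rbind, lapply(split(td, td$sl_rowIdx), pick_mid))
midpoints
```

The result is an ordinary data.frame with one row per segment, which can be passed directly to plotting or modelling functions.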

Introduction to the package ggplot2

The goal of this chapter is to allow the reader to plot any numeric data from data.frames, whatever their source may be, including the new emuRtrackdata object. In order to do so, we sometimes have to manipulate the data.frame. We therefore will repeat some standard methods that manipulate data.frames.

The plots above can be done with ggplot2 and will look like:

Figure 1: Equivalent to the eplot

Figure 2: Equivalent to the dplot

Figure 3: Equivalent to the normalized dplot
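
As a rough idea of how such figures can be built, the following sketch uses made-up data rather than the ae database, so it is not the code behind the figures above. The data frames formants and tracks and all their values are hypothetical; only the column names follow the emuRtrackdata layout. Note that stat_ellipse() draws a data ellipse (here under a normal assumption), which may differ in detail from what eplot() computes.

```r
library(ggplot2)

# hypothetical midpoint formant data (made-up values), 20 tokens per vowel
set.seed(1)
formants <- data.frame(
  labels = rep(c("E", "i:", "u:"), each = 20),
  T1 = c(rnorm(20, 550, 40), rnorm(20, 300, 30), rnorm(20, 320, 30)),
  T2 = c(rnorm(20, 1800, 100), rnorm(20, 2200, 120), rnorm(20, 1000, 120))
)

# F2 against F1 with one 95% ellipse per vowel;
# both axes are reversed, as is common for formant plots
p_ellipse <- ggplot(formants, aes(x = T2, y = T1, colour = labels)) +
  geom_point() +
  stat_ellipse(type = "norm", level = 0.95) +
  scale_x_reverse() +
  scale_y_reverse()
p_ellipse

# hypothetical F2 tracks of two segments (made-up values)
tracks <- data.frame(
  sl_rowIdx = rep(1:2, each = 4),
  labels    = rep(c("E", "i:"), each = 4),
  times_rel = rep(c(0, 5, 10, 15), times = 2),
  T2        = c(1613, 1651, 1686, 1703, 2200, 2250, 2290, 2300)
)

# one line per segment (group), coloured by vowel label
p_traj <- ggplot(tracks, aes(x = times_rel, y = T2,
                             group = sl_rowIdx, colour = labels)) +
  geom_line()
p_traj
```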

Why ggplot2?

Advantages of ggplot2

  • consistent underlying grammar of graphics (Wilkinson, 2005)
  • plot specification at a high level of abstraction
  • very flexible
  • theme system for polishing plot appearance
  • mature and complete graphics system
  • many users, active mailing list

That said, there are some things you cannot (or should not) do with ggplot2:

  • 3-dimensional graphics (see the rgl package)
  • Graph-theory type graphs (nodes/edges layout; see the igraph package)
  • Interactive graphics (see the ggvis package)

What Is The Grammar Of Graphics?

The basic idea: independently specify plot building blocks and combine them to create just about any kind of graphical display you want. Building blocks of a graph include:

  • data
  • aesthetic mapping
  • geometric object
  • statistical transformations
  • scales
  • coordinate system
  • position adjustments
  • faceting

The structure of a ggplot

The ggplot() function is used to initialize the basic graph structure, then we add to it. The structure of a ggplot looks like this:

  ggplot(data = <default data set>,
         aes(x = <default x axis variable>,
             y = <default y axis variable>,
             ... <other default aesthetic mappings>),
         ... <other plot defaults>) +

    geom_<geom type>(aes(size = <size variable for this geom>,
                         ... <other aesthetic mappings>),
                     data = <data for this geom>,
                     stat = <statistic string or function>,
                     position = <position string or function>,
                     color = <"fixed color specification">,
                     <other arguments, possibly passed to the _stat_ function>) +

    scale_<aesthetic>_<type>(name = <"scale label">,
                             breaks = <where to put tick marks>,
                             labels = <labels for tick marks>,
                             ... <other options for the scale>) +

    theme(plot.background = element_rect(fill = "gray"),
          ... <other theme elements>)

The basic idea is that you specify different parts of the plot, and add them together using the + operator.
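
To make the template more concrete, here is a minimal sketch that fills each slot with real values. It uses the mtcars data set that ships with R, which is an assumption made only for illustration and has nothing to do with the phonetic data used in this chapter.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                       # default data set
            aes(x = wt, y = mpg)) +              # default aesthetic mappings
  geom_point(aes(size = hp),                     # mapping for this geom only
             color = "steelblue") +              # fixed color, not a mapping
  scale_x_continuous(name = "Weight (1000 lbs)", # scale label
                     breaks = 2:5) +             # where to put tick marks
  theme(plot.background = element_rect(fill = "gray"))
p
```

Each term of the sum can be removed or swapped independently, for example replacing geom_point() with another geom, without touching the rest of the specification.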

Examples

See, e.g., the Cookbook for R, a handbook on R and figures (http://www.cookbook-r.com/), and the introduction to ggplot2 (gg = grammar of graphics) at http://docs.ggplot2.org/current/

Let’s try with a few datasets from Jonathan Harrington’s statistics seminar:

# if necessary, install.packages("ggplot2")
library(ggplot2)

pfadu = "http://www.phonetik.uni-muenchen.de/~jmh/lehre/Rdf"
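# stringsAsFactors = TRUE: since R 4.0.0, text columns are no longer
# converted to factors by default
asp = read.table(file.path(pfadu, "asp.txt"), stringsAsFactors = TRUE)
coronal = read.table(file.path(pfadu, "coronal.txt"), stringsAsFactors = TRUE)
int.df = read.table(file.path(pfadu, "intdauer.txt"), stringsAsFactors = TRUE)
v.df = read.table(file.path(pfadu, "vdata.txt"), stringsAsFactors = TRUE)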


# check class (data.frame or not):
class(asp)
## [1] "data.frame"
# the first few lines:
head(coronal)
##   Fr Region Vpn Socialclass
## 1 sh     R2  S1           W
## 2  s     R2  S2           W
## 3 sh     R1  S3           W
## 4  s     R3  S4           W
## 5  s     R2  S5           W
## 6 sh     R3  S6           W
# 'coronal[m, ]' = row m
# 'coronal[, m]' = column m
# You can use '$Name' to access column "Name", e.g. 'coronal$Region'

#############################################################################
# 1. Numerical and categorical variables
############################################################################
# In a data.frame, columns can consist of numerical or categorical variables.
# In a matrix, you can only have one or the other class of variables.

# Numerical variables: continuous
#
class(asp$d)
## [1] "numeric"
# or
with(asp, class(d))
## [1] "numeric"

class(int.df$Dauer)
## [1] "integer"

# Categorical variables will be treated as factors (that have two or more levels, or categories; this is different to objects of the class "character"):
class(coronal$Socialclass)
## [1] "factor"

# first 10
coronal$Socialclass[1:10]
##  [1] W W W W W W W W W W
## Levels: LM UM W
# asks which levels are given
levels(coronal$Socialclass)
## [1] "LM" "UM" "W"
##########################################################
# 2. Typical example in phonetics
##########################################################
# Is there an influence of x on y?
# 
# 1. y = numerical, x = categorical
# 1.1 difference in duration in /i, e, a/ ?
# 1.2 = influence of x (=vowel) on y (=duration)?
# 1.3 possible geoms: geom_boxplot() 
# or: geom_histogram()  or stat_density()

# 2. y = categorical, x = categorical
# 2.1 words like Sohn, Sonne... can be produced either with /s/ or /z/.
# /s/ more likely in Bavaria or in Hamburg?
# 2.2 possible geom: geom_bar()

# 3. y = numerical, x = numerical 
# 3.1 bigger mouth opening related to a longer duration?
# 3.2 possible geom: geom_point(), geom_line()
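
The indexing and class checks from the beginning of this code block can be tried without downloading anything. The following sketch uses a small hypothetical data frame, toy, with made-up values; only the column names d and Kons are borrowed from asp.

```r
# hypothetical data frame with one numerical and one categorical column
toy <- data.frame(
  d    = c(26.2, 23.1, 42.4, 21.6),
  Kons = c("t", "t", "k", "k"),
  stringsAsFactors = TRUE
)

toy[2, ]          # row 2
toy[, 1]          # column 1
toy$Kons          # column "Kons"

class(toy$d)      # numerical variable
class(toy$Kons)   # categorical variable, stored as a factor
levels(toy$Kons)  # levels are sorted alphabetically by default

# a matrix can hold only one type: everything is converted to character
m <- as.matrix(toy)
class(m[, "d"])
```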

Summary:

  • y = numerical, x = categorical: geom_boxplot(), geom_histogram(), stat_density()
  • y = categorical, x = categorical: geom_bar()
  • y = numerical, x = numerical: geom_point(), geom_line()
  • y = categorical, x = numerical: geom_point()
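
Histograms and density plots from the first bullet are slightly different from boxplots in that the numerical variable goes on the x-axis and the category is mapped to colour or fill. A minimal sketch, assuming a hypothetical data frame dur with simulated vowel durations (made-up values); geom_density() is used here, which draws the result of stat_density():

```r
library(ggplot2)

# simulated durations (ms) for three vowels, 50 tokens each
set.seed(1)
dur <- data.frame(
  vowel    = rep(c("i", "e", "a"), each = 50),
  duration = c(rnorm(50, 80, 10), rnorm(50, 95, 10), rnorm(50, 120, 15))
)

# overlapping histograms, one fill colour per vowel
p_hist <- ggplot(dur, aes(x = duration, fill = vowel)) +
  geom_histogram(binwidth = 5, position = "identity", alpha = 0.5)
p_hist

# smoothed density estimate, one line per vowel
p_dens <- ggplot(dur, aes(x = duration, colour = vowel)) +
  geom_density()
p_dens
```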

Boxplots

############################################################################
# 3. geom_boxplot(): y = numerical, x = categorical
############################################################################

head(asp)
##        d             Wort Vpn Kons Bet
## 1 26.180 Fruehlingswetter k01    t  un
## 2 23.063          Gestern k01    t  un
## 3 26.812           Montag k01    t  un
## 4 14.750            Vater k01    t  un
## 5 42.380            Tisch k01    t  be
## 6 21.560           Mutter k01    t  un
# Influence of place of articulation (Kons) on duration of aspiration (d)?
# y: d    (numerical)
# x: Kons (categorical)

# Syntax in ggplot()
# A + B + C + D + ...
# A, B, C... are modules.
# Here:
# A. data-frame + B. Variables + C. kind of plot
ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot()

# or
# A
p1 = ggplot(asp)
# B
p2 = aes(y = d, x = Kons)
# C
p3 = geom_boxplot()
# A + B + C
p1 + p2 + p3

# or store A + B + C
erg = p1 + p2 + p3
# show the plot
erg

# boxplot:
# thick line = median; box = interquartile range
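
The quantities that a boxplot displays can also be computed directly, which helps when reading the plot. The sketch below uses a hypothetical data frame toy with made-up durations; with the real data, the same calls would be applied to asp$d and asp$Kons.

```r
# hypothetical durations for two places of articulation
toy <- data.frame(
  d    = c(10, 20, 30, 40, 50, 15, 25, 35, 45, 100),
  Kons = rep(c("k", "t"), each = 5)
)

# median per category (the thick line in each box)
med <- tapply(toy$d, toy$Kons, median)
med

# interquartile range per category (the height of each box)
iqr <- tapply(toy$d, toy$Kons, IQR)
iqr
```

In this toy example the value 100 lies more than 1.5 times the interquartile range above the upper quartile of "t", so geom_boxplot() would draw it as a separate outlier point rather than extend the whisker to it.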

Barplots

############################################################################
# 4. geom_bar(): y is categorical, x is categorical
############################################################################
head(coronal)
##   Fr Region Vpn Socialclass
## 1 sh     R2  S1           W
## 2  s     R2  S2           W
## 3 sh     R1  S3           W
## 4  s     R3  S4           W
## 5  s     R2  S5           W
## 6 sh     R3  S6           W
# Influence of region (Region) on place of articulation (Fr)?
# y: Fr (categorical)
# x: Region (categorical)

p1 = ggplot(coronal)
p2 = aes(fill = Fr, x = Region)
# to print frequencies of occurrence
p3 = geom_bar()
p1 + p2 + p3

# place bars side by side
p4 = geom_bar(position="dodge")
p1 + p2 + p4

# print proportions
p5 = geom_bar(position="fill")
p1 + p2 + p5
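
The numbers behind these three bar charts are a contingency table of counts and, for position = "fill", the proportions within each region. A minimal base R sketch, assuming a hypothetical data frame toy with made-up observations that reuses the column names Fr and Region:

```r
# hypothetical observations: fricative produced, by region
toy <- data.frame(
  Fr     = c("s", "s", "sh", "sh", "sh", "s", "s", "s"),
  Region = c("R1", "R1", "R1", "R2", "R2", "R2", "R3", "R3")
)

# counts: what geom_bar() stacks (default) or places side by side ("dodge")
counts <- table(toy$Fr, toy$Region)
counts

# proportions within each region (columns sum to 1):
# what geom_bar(position = "fill") displays
props <- prop.table(counts, margin = 2)
props
```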

Scatterplots

############################################################################
# 5. geom_point(), geom_line(): y is numerical, x is numerical
############################################################################
# To what extent is duration (Dauer) influenced by intensity (dB) in the data frame int.df?
# y: Dauer (numerical)
# x: dB (numerical)
head(int.df)
##   Vpn    dB Dauer
## 1  S1 24.50   162
## 2  S2 32.54   120
## 3  S2 38.02   223
## 4  S2 28.38   131
## 5  S1 23.47    67
## 6  S2 37.82   169
# line only
ggplot(int.df) +  aes(x = dB, y = Dauer) + geom_line() 

# points only
ggplot(int.df, aes(x = dB, y = Dauer)) + geom_point() 

# both
ggplot(int.df, aes(x = dB, y = Dauer)) + geom_line() + geom_point()
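
One detail that is easy to miss in the line plots above: geom_line() connects the observations in the order of the x variable, not in the order of the rows. If the row order matters, geom_path() can be used instead. A small sketch with a hypothetical, deliberately unsorted data frame toy (made-up values):

```r
library(ggplot2)

# hypothetical observations, not sorted by dB
toy <- data.frame(
  dB    = c(30, 20, 40, 25),
  Dauer = c(150, 90, 210, 120)
)

# connects points from the lowest to the highest dB value
p_line <- ggplot(toy, aes(x = dB, y = Dauer)) + geom_line() + geom_point()
p_line

# connects points in the order in which they appear in the data frame
p_path <- ggplot(toy, aes(x = dB, y = Dauer)) + geom_path() + geom_point()
p_path
```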

Titles and axes

############################################################################
# 6. + xlab() + ylab() + ggtitle()
############################################################################  
# same boxplot as above
p1 = ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot()
# label for x-axis
p2 = xlab("Place of Articulation")
# label for y-axis
p3 = ylab("Duration (ms)")
# title
p4 = ggtitle("Boxplot")
p1 + p2 + p3 + p4

# same barchart as above
bar.p = ggplot(coronal) + aes(x = Region, fill = Fr) + geom_bar(position = "fill")
x.p = xlab("Region")
y.p = ylab("Proportion")
t.p = ggtitle("Proportional Distribution of Fricatives")
bar.p + x.p + y.p + t.p

############################################################################
# 7. Limits on axes +xlim() + ylim()
############################################################################

# same scatterplot (geom_point()) as above
p1 = ggplot(int.df, aes(dB, Dauer)) + geom_point() 
# xlim
p2 = xlim(c(10, 60))
# ylim
p3 = ylim(c(30, 280))
p1 + p2 + p3

# reverse axes:
p4 = scale_x_reverse()
p5 = scale_y_reverse()

p1 + p4 + p5
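
A caveat about xlim() and ylim(): observations outside the limits are dropped (with a warning) before anything is drawn, which also changes summaries such as boxplots or smoothed lines. To zoom in without discarding data, coord_cartesian() can be used instead. A minimal sketch with a hypothetical data frame toy (made-up values):

```r
library(ggplot2)

# hypothetical observations
toy <- data.frame(
  dB    = c(15, 25, 35, 45, 55),
  Dauer = c(60, 110, 150, 200, 260)
)

p <- ggplot(toy, aes(x = dB, y = Dauer)) + geom_point()

# the points at 15 dB and 55 dB are removed from the plot data
p_limits <- p + xlim(c(20, 50))
p_limits

# all points are kept; only the visible region is restricted
p_zoom <- p + coord_cartesian(xlim = c(20, 50))
p_zoom
```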

Colors

(see http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf)

colors()
##   [1] "white"                "aliceblue"            "antiquewhite"        
##   [4] "antiquewhite1"        "antiquewhite2"        "antiquewhite3"       
##   [7] "antiquewhite4"        "aquamarine"           "aquamarine1"         
##  [10] "aquamarine2"          "aquamarine3"          "aquamarine4"         
##  [13] "azure"                "azure1"               "azure2"              
##  [16] "azure3"               "azure4"               "beige"               
##  [19] "bisque"               "bisque1"              "bisque2"             
##  [22] "bisque3"              "bisque4"              "black"               
##  [25] "blanchedalmond"       "blue"                 "blue1"               
##  [28] "blue2"                "blue3"                "blue4"               
##  [31] "blueviolet"           "brown"                "brown1"              
##  [34] "brown2"               "brown3"               "brown4"              
##  [37] "burlywood"            "burlywood1"           "burlywood2"          
##  [40] "burlywood3"           "burlywood4"           "cadetblue"           
##  [43] "cadetblue1"           "cadetblue2"           "cadetblue3"          
##  [46] "cadetblue4"           "chartreuse"           "chartreuse1"         
##  [49] "chartreuse2"          "chartreuse3"          "chartreuse4"         
##  [52] "chocolate"            "chocolate1"           "chocolate2"          
##  [55] "chocolate3"           "chocolate4"           "coral"               
##  [58] "coral1"               "coral2"               "coral3"              
##  [61] "coral4"               "cornflowerblue"       "cornsilk"            
##  [64] "cornsilk1"            "cornsilk2"            "cornsilk3"           
##  [67] "cornsilk4"            "cyan"                 "cyan1"               
##  [70] "cyan2"                "cyan3"                "cyan4"               
##  [73] "darkblue"             "darkcyan"             "darkgoldenrod"       
##  [76] "darkgoldenrod1"       "darkgoldenrod2"       "darkgoldenrod3"      
##  [79] "darkgoldenrod4"       "darkgray"             "darkgreen"           
##  [82] "darkgrey"             "darkkhaki"            "darkmagenta"         
##  [85] "darkolivegreen"       "darkolivegreen1"      "darkolivegreen2"     
##  [88] "darkolivegreen3"      "darkolivegreen4"      "darkorange"          
##  [91] "darkorange1"          "darkorange2"          "darkorange3"         
##  [94] "darkorange4"          "darkorchid"           "darkorchid1"         
##  [97] "darkorchid2"          "darkorchid3"          "darkorchid4"         
## [100] "darkred"              "darksalmon"           "darkseagreen"        
## [103] "darkseagreen1"        "darkseagreen2"        "darkseagreen3"       
## [106] "darkseagreen4"        "darkslateblue"        "darkslategray"       
## [109] "darkslategray1"       "darkslategray2"       "darkslategray3"      
## [112] "darkslategray4"       "darkslategrey"        "darkturquoise"       
## [115] "darkviolet"           "deeppink"             "deeppink1"           
## [118] "deeppink2"            "deeppink3"            "deeppink4"           
## [121] "deepskyblue"          "deepskyblue1"         "deepskyblue2"        
## [124] "deepskyblue3"         "deepskyblue4"         "dimgray"             
## [127] "dimgrey"              "dodgerblue"           "dodgerblue1"         
## [130] "dodgerblue2"          "dodgerblue3"          "dodgerblue4"         
## [133] "firebrick"            "firebrick1"           "firebrick2"          
## [136] "firebrick3"           "firebrick4"           "floralwhite"         
## [139] "forestgreen"          "gainsboro"            "ghostwhite"          
## [142] "gold"                 "gold1"                "gold2"               
## [145] "gold3"                "gold4"                "goldenrod"           
## [148] "goldenrod1"           "goldenrod2"           "goldenrod3"          
## [151] "goldenrod4"           "gray"                 "gray0"               
## [154] "gray1"                "gray2"                "gray3"               
## [157] "gray4"                "gray5"                "gray6"               
## [160] "gray7"                "gray8"                "gray9"               
## [163] "gray10"               "gray11"               "gray12"              
## [166] "gray13"               "gray14"               "gray15"              
## [169] "gray16"               "gray17"               "gray18"              
## [172] "gray19"               "gray20"               "gray21"              
## [175] "gray22"               "gray23"               "gray24"              
## [178] "gray25"               "gray26"               "gray27"              
## [181] "gray28"               "gray29"               "gray30"              
## [184] "gray31"               "gray32"               "gray33"              
## [187] "gray34"               "gray35"               "gray36"              
## [190] "gray37"               "gray38"               "gray39"              
## [193] "gray40"               "gray41"               "gray42"              
## [196] "gray43"               "gray44"               "gray45"              
## [199] "gray46"               "gray47"               "gray48"              
## [202] "gray49"               "gray50"               "gray51"              
## [205] "gray52"               "gray53"               "gray54"              
## [208] "gray55"               "gray56"               "gray57"              
## [211] "gray58"               "gray59"               "gray60"              
## [214] "gray61"               "gray62"               "gray63"              
## [217] "gray64"               "gray65"               "gray66"              
## [220] "gray67"               "gray68"               "gray69"              
## [223] "gray70"               "gray71"               "gray72"              
## [226] "gray73"               "gray74"               "gray75"              
## [229] "gray76"               "gray77"               "gray78"              
## [232] "gray79"               "gray80"               "gray81"              
## [235] "gray82"               "gray83"               "gray84"              
## [238] "gray85"               "gray86"               "gray87"              
## [241] "gray88"               "gray89"               "gray90"              
## [244] "gray91"               "gray92"               "gray93"              
## [247] "gray94"               "gray95"               "gray96"              
## [250] "gray97"               "gray98"               "gray99"              
## [253] "gray100"              "green"                "green1"              
## [256] "green2"               "green3"               "green4"              
## [259] "greenyellow"          "grey"                 "grey0"               
## [262] "grey1"                "grey2"                "grey3"               
## [265] "grey4"                "grey5"                "grey6"               
## [268] "grey7"                "grey8"                "grey9"               
## [271] "grey10"               "grey11"               "grey12"              
## [274] "grey13"               "grey14"               "grey15"              
## [277] "grey16"               "grey17"               "grey18"              
## [280] "grey19"               "grey20"               "grey21"              
## [283] "grey22"               "grey23"               "grey24"              
## [286] "grey25"               "grey26"               "grey27"              
## [289] "grey28"               "grey29"               "grey30"              
## [292] "grey31"               "grey32"               "grey33"              
## [295] "grey34"               "grey35"               "grey36"              
## [298] "grey37"               "grey38"               "grey39"              
## [301] "grey40"               "grey41"               "grey42"              
## [304] "grey43"               "grey44"               "grey45"              
## [307] "grey46"               "grey47"               "grey48"              
## [310] "grey49"               "grey50"               "grey51"              
## [313] "grey52"               "grey53"               "grey54"              
## [316] "grey55"               "grey56"               "grey57"              
## [319] "grey58"               "grey59"               "grey60"              
## [322] "grey61"               "grey62"               "grey63"              
## [325] "grey64"               "grey65"               "grey66"              
## [328] "grey67"               "grey68"               "grey69"              
## [331] "grey70"               "grey71"               "grey72"              
## [334] "grey73"               "grey74"               "grey75"              
## [337] "grey76"               "grey77"               "grey78"              
## [340] "grey79"               "grey80"               "grey81"              
## [343] "grey82"               "grey83"               "grey84"              
## [346] "grey85"               "grey86"               "grey87"              
## [349] "grey88"               "grey89"               "grey90"              
## [352] "grey91"               "grey92"               "grey93"              
## [355] "grey94"               "grey95"               "grey96"              
## [358] "grey97"               "grey98"               "grey99"              
## [361] "grey100"              "honeydew"             "honeydew1"           
## [364] "honeydew2"            "honeydew3"            "honeydew4"           
## [367] "hotpink"              "hotpink1"             "hotpink2"            
## [370] "hotpink3"             "hotpink4"             "indianred"           
## [373] "indianred1"           "indianred2"           "indianred3"          
## [376] "indianred4"           "ivory"                "ivory1"              
## [379] "ivory2"               "ivory3"               "ivory4"              
## [382] "khaki"                "khaki1"               "khaki2"              
## [385] "khaki3"               "khaki4"               "lavender"            
## [388] "lavenderblush"        "lavenderblush1"       "lavenderblush2"      
## [391] "lavenderblush3"       "lavenderblush4"       "lawngreen"           
## [394] "lemonchiffon"         "lemonchiffon1"        "lemonchiffon2"       
## [397] "lemonchiffon3"        "lemonchiffon4"        "lightblue"           
## [400] "lightblue1"           "lightblue2"           "lightblue3"          
## [403] "lightblue4"           "lightcoral"           "lightcyan"           
## [406] "lightcyan1"           "lightcyan2"           "lightcyan3"          
## [409] "lightcyan4"           "lightgoldenrod"       "lightgoldenrod1"     
## [412] "lightgoldenrod2"      "lightgoldenrod3"      "lightgoldenrod4"     
## [415] "lightgoldenrodyellow" "lightgray"            "lightgreen"          
## [418] "lightgrey"            "lightpink"            "lightpink1"          
## [421] "lightpink2"           "lightpink3"           "lightpink4"          
## [424] "lightsalmon"          "lightsalmon1"         "lightsalmon2"        
## [427] "lightsalmon3"         "lightsalmon4"         "lightseagreen"       
## [430] "lightskyblue"         "lightskyblue1"        "lightskyblue2"       
## [433] "lightskyblue3"        "lightskyblue4"        "lightslateblue"      
## [436] "lightslategray"       "lightslategrey"       "lightsteelblue"      
## [439] "lightsteelblue1"      "lightsteelblue2"      "lightsteelblue3"     
## [442] "lightsteelblue4"      "lightyellow"          "lightyellow1"        
## [445] "lightyellow2"         "lightyellow3"         "lightyellow4"        
## [448] "limegreen"            "linen"                "magenta"             
## [451] "magenta1"             "magenta2"             "magenta3"            
## [454] "magenta4"             "maroon"               "maroon1"             
## [457] "maroon2"              "maroon3"              "maroon4"             
## [460] "mediumaquamarine"     "mediumblue"           "mediumorchid"        
## [463] "mediumorchid1"        "mediumorchid2"        "mediumorchid3"       
## [466] "mediumorchid4"        "mediumpurple"         "mediumpurple1"       
## [469] "mediumpurple2"        "mediumpurple3"        "mediumpurple4"       
## [472] "mediumseagreen"       "mediumslateblue"      "mediumspringgreen"   
## [475] "mediumturquoise"      "mediumvioletred"      "midnightblue"        
## [478] "mintcream"            "mistyrose"            "mistyrose1"          
## [481] "mistyrose2"           "mistyrose3"           "mistyrose4"          
## [484] "moccasin"             "navajowhite"          "navajowhite1"        
## [487] "navajowhite2"         "navajowhite3"         "navajowhite4"        
## [490] "navy"                 "navyblue"             "oldlace"             
## [493] "olivedrab"            "olivedrab1"           "olivedrab2"          
## [496] "olivedrab3"           "olivedrab4"           "orange"              
## [499] "orange1"              "orange2"              "orange3"             
## [502] "orange4"              "orangered"            "orangered1"          
## [505] "orangered2"           "orangered3"           "orangered4"          
## [508] "orchid"               "orchid1"              "orchid2"             
## [511] "orchid3"              "orchid4"              "palegoldenrod"       
## [514] "palegreen"            "palegreen1"           "palegreen2"          
## [517] "palegreen3"           "palegreen4"           "paleturquoise"       
## [520] "paleturquoise1"       "paleturquoise2"       "paleturquoise3"      
## [523] "paleturquoise4"       "palevioletred"        "palevioletred1"      
## [526] "palevioletred2"       "palevioletred3"       "palevioletred4"      
## [529] "papayawhip"           "peachpuff"            "peachpuff1"          
## [532] "peachpuff2"           "peachpuff3"           "peachpuff4"          
## [535] "peru"                 "pink"                 "pink1"               
## [538] "pink2"                "pink3"                "pink4"               
## [541] "plum"                 "plum1"                "plum2"               
## [544] "plum3"                "plum4"                "powderblue"          
## [547] "purple"               "purple1"              "purple2"             
## [550] "purple3"              "purple4"              "red"                 
## [553] "red1"                 "red2"                 "red3"                
## [556] "red4"                 "rosybrown"            "rosybrown1"          
## [559] "rosybrown2"           "rosybrown3"           "rosybrown4"          
## [562] "royalblue"            "royalblue1"           "royalblue2"          
## [565] "royalblue3"           "royalblue4"           "saddlebrown"         
## [568] "salmon"               "salmon1"              "salmon2"             
## [571] "salmon3"              "salmon4"              "sandybrown"          
## [574] "seagreen"             "seagreen1"            "seagreen2"           
## [577] "seagreen3"            "seagreen4"            "seashell"            
## [580] "seashell1"            "seashell2"            "seashell3"           
## [583] "seashell4"            "sienna"               "sienna1"             
## [586] "sienna2"              "sienna3"              "sienna4"             
## [589] "skyblue"              "skyblue1"             "skyblue2"            
## [592] "skyblue3"             "skyblue4"             "slateblue"           
## [595] "slateblue1"           "slateblue2"           "slateblue3"          
## [598] "slateblue4"           "slategray"            "slategray1"          
## [601] "slategray2"           "slategray3"           "slategray4"          
## [604] "slategrey"            "snow"                 "snow1"               
## [607] "snow2"                "snow3"                "snow4"               
## [610] "springgreen"          "springgreen1"         "springgreen2"        
## [613] "springgreen3"         "springgreen4"         "steelblue"           
## [616] "steelblue1"           "steelblue2"           "steelblue3"          
## [619] "steelblue4"           "tan"                  "tan1"                
## [622] "tan2"                 "tan3"                 "tan4"                
## [625] "thistle"              "thistle1"             "thistle2"            
## [628] "thistle3"             "thistle4"             "tomato"              
## [631] "tomato1"              "tomato2"              "tomato3"             
## [634] "tomato4"              "turquoise"            "turquoise1"          
## [637] "turquoise2"           "turquoise3"           "turquoise4"          
## [640] "violet"               "violetred"            "violetred1"          
## [643] "violetred2"           "violetred3"           "violetred4"          
## [646] "wheat"                "wheat1"               "wheat2"              
## [649] "wheat3"               "wheat4"               "whitesmoke"          
## [652] "yellow"               "yellow1"              "yellow2"             
## [655] "yellow3"              "yellow4"              "yellowgreen"
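
With a list this long, it is often easier to search for a color name than to scroll through it. A minimal sketch using base R only (the search term "purple" is just an example):

```r
# all built-in color names that contain "purple"
purples = grep("purple", colors(), value = TRUE)
purples

# the red/green/blue values behind one color name
col2rgb("mediumpurple")
```

Any of the names returned this way can be passed to col = or fill = in the plots that follow.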
############################ geom_boxplot()
ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot()

# Default colors
# filled with different colors
ggplot(asp) + aes(y = d, x = Kons, fill = Kons) + geom_boxplot()

# different line colors
ggplot(asp) + aes(y = d, x = Kons, col = Kons) + geom_boxplot()

# or choose your own colors
farben = c("green", "red")
# filled
ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot(fill = farben)

# line colors
ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot(col = farben)
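
Passing a plain vector to fill = or col = only works as long as the number and order of the colors match the boxes. A slightly more robust sketch maps fill to the variable and assigns one color per factor level with scale_fill_manual(). The data frame toy and its values are invented for this illustration; it is not the asp data set:

```r
library(ggplot2)

# toy data (invented values), two consonant categories
toy = data.frame(Kons = rep(c("k", "t"), each = 4),
                 d    = c(40, 55, 48, 60, 30, 42, 35, 38))

# named vector: level name = color
farben = c(k = "green", t = "red")

p = ggplot(toy) +
  aes(y = d, x = Kons, fill = Kons) +
  geom_boxplot() +
  scale_fill_manual(values = farben)
p
```

Because the colors are matched by name, the assignment stays correct even if the order of the factor levels changes.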

############################ geom_bar()
########## 
p1 = ggplot(coronal) + aes(x = Region, fill = Fr) + geom_bar()
p1

# choose your own colors
farben = c("yellow", "green")
p2 = scale_fill_manual(values = farben) 
p1 + p2

Plotting characters (pch) and character sizes (cex); line width (lwd)

(see http://www.endmemo.com/program/R/pchsymbols.php)

########## 
ggplot(int.df, aes(x = dB, y = Dauer)) +  geom_point() + geom_line()

# col: color. 
# pch: plotting character. 
# cex: character expansion; ggplot2 treats it like size (the default point size is 1.5)
ggplot(int.df, aes(x = dB, y = Dauer)) +  geom_point(col="purple", pch=0, cex=2) + geom_line(col = "pink")

# lwd: line width
ggplot(int.df, aes(x = dB, y = Dauer)) +  geom_point(col="purple", pch=0, cex=2) + geom_line(col = "pink", lwd=2)
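
To get an overview of the available plotting characters without leaving R, a small sketch like the following draws the symbols 0 to 25 together with their numbers. The data frame pch_df and its column names are made up for this illustration; scale_shape_identity() tells ggplot2 to use the numbers in the column directly as shapes:

```r
library(ggplot2)

# one row per plotting character, arranged in a small grid
pch_df = data.frame(pch = 0:25,
                    x   = 0:25 %% 7,
                    y   = 0:25 %/% 7)

p_pch = ggplot(pch_df) +
  aes(x = x, y = y, shape = pch, label = pch) +
  geom_point(size = 4) +
  geom_text(nudge_y = 0.3) +
  scale_shape_identity() +
  scale_y_reverse()
p_pch
```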

Size of Labels

# The default base size is 11; some elements (e.g. axis and legend text) are scaled relative to it

p1 = ggplot(asp) + aes(y = d, x = Kons) + geom_boxplot() + xlab("Artikulationsstelle") + ylab("Dauer (ms)") + ggtitle("Boxplot-Daten")
p1

# size 16
p16 = theme(text = element_text(size=16))
p1 + p16

# change only the tick labels on the axes
q24 = theme(axis.text = element_text(size=24))
p1 + q24

# different sizes for the axis tick labels (24) and the remaining text such as the titles (base size 30)
p30 = theme(text = element_text(size=30))
p1 + q24 + p30
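
The text element changes all text at once. If the plot title, the axis titles and the tick labels should each get their own size, the theme elements plot.title, axis.title and axis.text can be set separately. A sketch with invented toy data (the object names are chosen for this example only):

```r
library(ggplot2)

# toy data (invented values)
toy = data.frame(Kons = rep(c("k", "t"), each = 4),
                 d    = c(40, 55, 48, 60, 30, 42, 35, 38))

p = ggplot(toy) + aes(y = d, x = Kons) + geom_boxplot() +
  ggtitle("Toy data")

# one size per kind of text
sizes = theme(plot.title = element_text(size = 30),
              axis.title = element_text(size = 20),
              axis.text  = element_text(size = 14))
p_sized = p + sizes
p_sized
```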

Two (or three) independent variables: Facets and (once again) colors

#create one boxplot per stress pattern (Bet: levels "be" and "un")
pf = facet_grid(~Bet)
p1 + pf

# or add col to aes():
pc = ggplot(asp) + aes(y = d, x = Kons,col=Bet) + geom_boxplot() + xlab("Artikulationsstelle") + ylab("Dauer (ms)") + ggtitle("Boxplot-Daten")
pc

You can, of course, combine facets and colors and thus plot the influence of up to three independent variables.
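
A minimal sketch of such a combination, using invented data with three independent variables (Kons, Bet and a hypothetical speaker group G that is not part of the asp data set): one variable goes on the x-axis, one is mapped to color, and one defines the facets.

```r
library(ggplot2)

# invented data: duration d by consonant (Kons), stress (Bet) and group (G)
toy = expand.grid(Kons = c("k", "t"), Bet = c("be", "un"),
                  G = c("f", "m"), token = 1:5)
toy$d = 40 + 10 * (toy$Kons == "k") + 8 * (toy$Bet == "be") +
        5 * (toy$G == "m") + toy$token

p3v = ggplot(toy) +
  aes(y = d, x = Kons, col = Bet) +
  geom_boxplot() +
  facet_grid(~G)
p3v
```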

Arrange several figures on one plane

# if necessary, install.packages("gridExtra")
library(gridExtra)

p1 = ggplot(asp, aes(y = d, x = Kons))  + geom_boxplot()
p2 = ggplot(coronal) + aes(x = Region, fill = Fr) + geom_bar()
p3 = ggplot(int.df, aes(dB, Dauer)) + geom_line() + geom_point()
grid.arrange(p1, p2, p3,  ncol=3, nrow =1)

More things to change with theme

# see
help(theme)

Adding statistical measures to a plot

p1 = ggplot(int.df, aes(dB, Dauer)) + geom_point()
int.lm =  geom_smooth(method="lm",se=FALSE)
p1 + int.lm

#by default, geom_smooth shows the standard error:
int.lmse =  geom_smooth(method="lm")
p1 + int.lmse

# you can calculate this stat (here lm() ) for each facet (e.g. for each subject (Vpn)) separately

p1 + int.lmse + facet_grid(~Vpn)

Instead of geom_smooth(), you could also add lines with geom_abline(intercept = ..., slope = ...), and horizontal and vertical lines with geom_hline() and geom_vline().
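
As a sketch of how geom_abline() relates to geom_smooth(method = "lm"): the intercept and slope can be taken from a model fitted with lm(), and the resulting line coincides with the smoothed one. The data frame toy below contains invented values and only borrows the column names dB and Dauer:

```r
library(ggplot2)

# invented data: duration (Dauer) as a function of intensity (dB)
toy = data.frame(dB    = c(20, 25, 30, 35, 40, 45),
                 Dauer = c(110, 124, 128, 141, 152, 158))

# fit the regression line ourselves ...
fit = lm(Dauer ~ dB, data = toy)

# ... and draw it with geom_abline(); geom_smooth() is added for comparison
p_fit = ggplot(toy, aes(x = dB, y = Dauer)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_abline(intercept = coef(fit)[["(Intercept)"]],
              slope = coef(fit)[["dB"]],
              col = "red", linetype = "dashed")
p_fit
```

Note that geom_abline() extends over the whole panel, whereas geom_smooth() only covers the range of the data.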

geom_smooth() can be used with several smoothing methods: lm, but also glm (e.g. for fitting sigmoid curves to binary perceptual data), and some others (e.g. loess for locally fitted smooth curves). With method glm, you have to add the information that the data are binomial. One example would be:

bat.df = read.table("Rgraphics/dataSets/bat.df.txt")
bat.plot = ggplot(bat.df) + aes(y = p, x = steps) + geom_point(col = "red") +  facet_wrap(~participant)  + ggtitle("bat")

#add listener-specific sigmoids
bat.plot + geom_smooth(method = "glm",se=FALSE,method.args = list(family=binomial))
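
What geom_smooth() does here can be reproduced with glm() directly. The following sketch uses invented response counts (not the bat.df data) and assumes that each proportion p is based on n = 10 trials; the weights tell glm() how many trials stand behind each proportion. From the coefficients one can, for example, derive the step at which the fitted probability reaches 0.5:

```r
library(ggplot2)

# invented data: number of "yes" responses out of n = 10 per continuum step
toy = data.frame(steps = 1:7,
                 yes   = c(0, 1, 2, 5, 8, 9, 10),
                 n     = 10)
toy$p = toy$yes / toy$n

# logistic regression on proportions, weighted by the number of trials
fit = glm(p ~ steps, family = binomial, weights = n, data = toy)

# the step at which the fitted probability is 0.5
boundary = -coef(fit)[["(Intercept)"]] / coef(fit)[["steps"]]
boundary

# the same sigmoid drawn by geom_smooth(); weight = n passes the trial counts on
p_sig = ggplot(toy) +
  aes(x = steps, y = p, weight = n) +
  geom_point(col = "red") +
  geom_smooth(method = "glm", formula = y ~ x, se = FALSE,
              method.args = list(family = binomial)) +
  geom_vline(xintercept = boundary, linetype = "dashed")
p_sig
```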

In phonetics, we often draw ellipses around two-dimensional data points, representing F2 and F1 values of vowels. We can add an ellipse by stat_ellipse().

ell = stat_ellipse()
p1 + ell

By default, this adds an ellipse at the 95% confidence level (under the assumption of a multivariate t-distribution). While it is not very useful for the data at hand, it is useful for separating vowel categories. Be careful, however: with a low number of tokens, one or two outliers can produce rather “silly” ellipses:

td_mid = read.table("Rgraphics/dataSets/td_mid.txt")
p1 = ggplot(td_mid, aes(y = T1, x  = T2, col = labels, label=labels)) 

#add data.points as text labels, defined by their value
p2 = geom_text()
p1 + p2

p3 = stat_ellipse() 
p4 = scale_y_reverse() 
p5 = scale_x_reverse() 
p6 =labs(x = "F2(Hz)", y = "F1(Hz)") 
p7 = theme(legend.position="none")

p1 + p2 + p3 + p4 + p5 + p6 + p7

# only ellipses (do NOT plot data.points)
p1  + p3 + p4 + p5 + p6 + p7

#plot the label-specific means of F1 and F2 (here: T1 and T2)
p2_centroid = geom_text(data = aggregate(cbind(T1,T2)~labels,data=td_mid,FUN=mean))
p1 + p2_centroid + p3 + p4 + p5 + p6 + p7

#btw, we could also vary the linetype
p1_alt = ggplot(td_mid, aes(y = T1, x  = T2, col = labels, label=labels,linetype=labels)) 
p1_alt + p2_centroid + p3 + p4 + p5 + p6
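
The ellipses above use the defaults of stat_ellipse(). Both the assumed distribution and the confidence level can be changed with the arguments type and level. A sketch with randomly generated values (the data frame toy is invented and only borrows the column names T1, T2 and labels):

```r
library(ggplot2)

# invented F1/F2-like values for two vowel categories
set.seed(1)
toy = data.frame(labels = rep(c("a", "i"), each = 30),
                 T1 = c(rnorm(30, 700, 60), rnorm(30, 300, 40)),
                 T2 = c(rnorm(30, 1300, 100), rnorm(30, 2200, 150)))

p_base = ggplot(toy, aes(y = T1, x = T2, col = labels)) +
  geom_point() +
  scale_y_reverse() +
  scale_x_reverse()

# 95% ellipse vs. a narrower 50% ellipse, both assuming a normal distribution
p_95 = p_base + stat_ellipse(type = "norm", level = 0.95)
p_50 = p_base + stat_ellipse(type = "norm", level = 0.50)
p_95
p_50
```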

It is also very easy to replace the dplot() figure shown at the beginning of this document:

ggplot(vowels_fm_new) +
  aes(x=times_rel,y=T2,col=labels,group=sl_rowIdx) +
  geom_line() +
  labs(x = "vowel duration (ms)", y = "F2 (Hz)")

However, it is much more difficult to produce the time-normalized, by-vowel averaged version. We will need the function normalizeLength() (which will be available with the next release of emuR). We can, however, use a prepared version of a length-normalized emuRtrackdata object that contains normalized times:

td_norm = read.table("Rgraphics/dataSets/td_norm.txt")
ggplot(aggregate(T2~times_norm+labels, data = td_norm,FUN=mean)) +
  aes(x=times_norm,y=T2,col=labels) +
  geom_line() +
  labs(x = "vowel duration (normalized)", y = "F2 (Hz)")
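
To give an idea of what length normalization involves, here is a rough sketch in base R. It is not the emuR implementation: it simply assumes that every segment starts at times_rel = 0, rescales the time axis of each segment to the range 0 to 1, and interpolates T2 linearly onto the same number of points with approx(). The data frame toy, the helper normalize_one() and the number of points (11) are invented for this illustration:

```r
# toy long-format track data: two segments of different length (invented values)
toy = data.frame(
  sl_rowIdx = c(1, 1, 1, 2, 2, 2, 2, 2),
  labels    = c("i:", "i:", "i:", "u:", "u:", "u:", "u:", "u:"),
  times_rel = c(0, 10, 20, 0, 5, 10, 15, 20),
  T2        = c(2000, 2200, 2100, 900, 950, 1000, 950, 900)
)

# hypothetical helper: normalize one segment to n equally spaced time points
normalize_one = function(seg, n = 11) {
  t_norm = seg$times_rel / max(seg$times_rel)
  res = approx(x = t_norm, y = seg$T2, xout = seq(0, 1, length.out = n))
  data.frame(sl_rowIdx  = seg$sl_rowIdx[1],
             labels     = seg$labels[1],
             times_norm = res$x,
             T2         = res$y)
}

# apply the helper to each segment and stack the results
toy_norm = do.call(rbind, lapply(split(toy, toy$sl_rowIdx), normalize_one))

# by-vowel mean per normalized time point, as in the plot above
toy_mean = aggregate(T2 ~ times_norm + labels, data = toy_norm, FUN = mean)
```

Because all segments now share the same normalized time points, averaging per label and time point becomes a simple aggregate() call.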

Conclusion

This chapter gave a very short introduction to the package ggplot2. More information can be found, e.g., at http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html or https://opr.princeton.edu/workshops/Downloads/2015Jan_ggplot2Koffman.pdf, or on any other website you may find (there are numerous introductions to ggplot2).