Q & A’s

Q1: Compute the median of Rating in the data frame rating, separately for each participant and each language.

rating %>%
  group_by(Vpn, Lang) %>%
  summarise(x = median(Rating)) %>%
  ungroup()

## `summarise()` has grouped output by 'Vpn'. You can override using the `.groups`
## argument.

## # A tibble: 26 × 3
##    Vpn   Lang      x
##    <fct> <fct> <dbl>
##  1 S1    E      6.5 
##  2 S10   E      5.75
##  3 S11   E      5.83
##  4 S12   E      5.25
##  5 S13   E      6.2 
##  6 S14   S      4.75
##  7 S15   S      5.58
##  8 S16   S      5.92
##  9 S17   S      6.75
## 10 S18   S      5.58
## # ℹ 16 more rows
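The message about grouped output appears because `summarise()` only drops the last grouping level. As a minimal sketch, assuming dplyr 1.0.0 or later, the `.groups = "drop"` argument returns an ungrouped result directly, so the trailing `ungroup()` is no longer needed. The data frame `toy` and its values are invented for illustration and are not the `rating` data.

```r
library(dplyr)

# Hypothetical toy data standing in for `rating` (values are invented)
toy <- tibble(
  Vpn    = c("S1", "S1", "S2", "S2", "S2"),
  Lang   = c("E", "E", "S", "S", "S"),
  Rating = c(6, 7, 4, 5, 9)
)

# .groups = "drop" removes all grouping from the result
med <- toy %>%
  group_by(Vpn, Lang) %>%
  summarise(x = median(Rating), .groups = "drop")

med
```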

Q2: Add a new column called Lrating to the data frame rating that contains the logarithm of Rating. Save the result as a new object r2.

r2 = rating %>%
  mutate(Lrating = log(Rating))

Q3: In the data frame rating, tabulate the frequency of the level combinations of the factors Gram, Type, and Fam.

rating %>%
  select(Gram, Type, Fam) %>%
  table()

## , , Fam = New
## 
##           Type
## Gram       Identical Structural
##   High            26         26
##   Moderate        26         26
## 
## , , Fam = Old
## 
##           Type
## Gram       Identical Structural
##   High            26         26
##   Moderate        26         26

# Alternatively:
rating %>%
  group_by(Gram, Type, Fam) %>%
  summarise(anzahl = n()) %>%
  ungroup()

## `summarise()` has grouped output by 'Gram', 'Type'. You can override using the
## `.groups` argument.

## # A tibble: 8 × 4
##   Gram     Type       Fam   anzahl
##   <fct>    <fct>      <fct>  <int>
## 1 High     Identical  New       26
## 2 High     Identical  Old       26
## 3 High     Structural New       26
## 4 High     Structural Old       26
## 5 Moderate Identical  New       26
## 6 Moderate Identical  Old       26
## 7 Moderate Structural New       26
## 8 Moderate Structural Old       26
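A further shorthand is `count()` from dplyr, which groups by the given columns, counts the rows and returns an ungrouped result in one call. The sketch below runs on invented toy data (`toy` is an assumption, not the `rating` data) and keeps the column name `anzahl` (German for "count") from the solution above.

```r
library(dplyr)

# Hypothetical toy data (invented factor combinations)
toy <- tibble(
  Gram = c("High", "High", "Moderate", "High"),
  Type = c("Identical", "Identical", "Structural", "Structural"),
  Fam  = c("New", "New", "Old", "New")
)

# count() = group_by() + summarise(n()) + ungroup()
freq <- toy %>%
  count(Gram, Type, Fam, name = "anzahl")

freq
```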

Q4: In the data frame rating, compute the mean of Rating for the two levels of the factor Fam, separately for participants S1 and S10.

rating %>%
  filter(Vpn == "S1" | Vpn == "S10") %>%
  group_by(Fam, Vpn) %>%
  summarise(mean(Rating)) %>%
  ungroup()

## `summarise()` has grouped output by 'Fam'. You can override using the `.groups`
## argument.

## # A tibble: 4 × 3
##   Fam   Vpn   `mean(Rating)`
##   <fct> <fct>          <dbl>
## 1 New   S1              6.37
## 2 New   S10             5.84
## 3 Old   S1              6.19
## 4 Old   S10             6.08

# Alternatively:
rating %>%
  filter(Vpn %in% c("S1", "S10")) %>%
  group_by(Fam, Vpn) %>%
  summarise(mean(Rating)) %>%
  ungroup()

## `summarise()` has grouped output by 'Fam'. You can override using the `.groups`
## argument.

## # A tibble: 4 × 3
##   Fam   Vpn   `mean(Rating)`
##   <fct> <fct>          <dbl>
## 1 New   S1              6.37
## 2 New   S10             5.84
## 3 Old   S1              6.19
## 4 Old   S10             6.08
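Both variants give the same result, but it is worth seeing why `==` must not be combined with a vector of several values: `==` compares element by element and recycles the shorter vector, whereas `%in%` tests set membership. A small sketch on an invented column (`toy` is hypothetical):

```r
library(dplyr)

# Hypothetical participant column (invented)
toy <- tibble(Vpn = c("S10", "S1", "S2", "S1"))

# Set membership: keeps every row whose Vpn is S1 or S10
ok <- toy %>% filter(Vpn %in% c("S1", "S10"))

# Element-wise comparison with recycling: row 1 is compared with "S1",
# row 2 with "S10", row 3 with "S1", row 4 with "S10"
wrong <- toy %>% filter(Vpn == c("S1", "S10"))

nrow(ok)
nrow(wrong)
```

Here the recycled comparison happens to match no row at all, although three rows belong to S1 or S10.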

Q5: In the data frame vdata, compute log(F2/F1) separately for all speakers and for the vowels (factor V) Y and U.

vdata %>%
  filter(V %in% c("Y", "U")) %>%
  group_by(Subj, V) %>%
  reframe(logf2f1 = log(F2/F1))

## # A tibble: 847 × 3
##    Subj  V     logf2f1
##    <fct> <fct>   <dbl>
##  1 bk    U       1.03 
##  2 bk    U       0.948
##  3 bk    U       0.840
##  4 bk    U       0.986
##  5 bk    U       0.748
##  6 bk    U       1.00 
##  7 bk    U       1.02 
##  8 bk    U       0.979
##  9 bk    U       0.855
## 10 bk    U       0.940
## # ℹ 837 more rows
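Since log(F2/F1) is computed for every row rather than aggregated per group, `mutate()` is another way to obtain the same values without any grouping. A minimal sketch, assuming invented formant values in Hz (`toy` is hypothetical and not taken from `vdata`):

```r
library(dplyr)

# Hypothetical formant values in Hz (invented)
toy <- tibble(
  Subj = c("a", "a", "b", "b"),
  V    = c("Y", "A", "U", "Y"),
  F1   = c(300, 700, 250, 500),
  F2   = c(1500, 1400, 750, 1000)
)

# One log ratio per row; no group_by() needed for a row-wise computation
res <- toy %>%
  filter(V %in% c("Y", "U")) %>%
  mutate(logf2f1 = log(F2 / F1)) %>%
  select(Subj, V, logf2f1)

res
```

Unlike the grouped solution, this keeps the rows in their original order instead of sorting them by speaker and vowel.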

Q6: In the data frame vdata, compute the mean F1 for the vowel A separately for all three places of articulation in Cons.

vdata %>%
  filter(V == "A") %>%
  group_by(Cons) %>%
  summarise(mF1 = mean(F1)) %>%
  ungroup()

## # A tibble: 3 × 2
##   Cons    mF1
##   <fct> <dbl>
## 1 K      637.
## 2 P      646.
## 3 T      652.

Q7: In the data frame vdata, compute the mean F1 and F2 separately for all vowels and for lax (factor Tense, level -) and tense (factor Tense, level +) vowels.

vdata %>%
  group_by(V, Tense) %>%
  summarise(mF1 = mean(F1), 
            mF2 = mean(F2)) %>%
  ungroup()

## `summarise()` has grouped output by 'V'. You can override using the `.groups`
## argument.

## # A tibble: 14 × 4
##    V     Tense   mF1   mF2
##    <fct> <fct> <dbl> <dbl>
##  1 %     -      479. 1458.
##  2 %     +      368. 1493.
##  3 A     -      622. 1326.
##  4 A     +      668. 1249.
##  5 E     -      488. 1729.
##  6 E     +      363. 2024.
##  7 I     -      346. 1811.
##  8 I     +      276. 2127.
##  9 O     -      520. 1009.
## 10 O     +      348.  686.
## 11 U     -      348.  938.
## 12 U     +      259.  681 
## 13 Y     -      338. 1476.
## 14 Y     +      266. 1673.

Q8: In the data frame vdata, create a new column D with three levels: low if the duration (dur) is less than 75 ms, high if the duration is greater than 200 ms, otherwise mid. Save the result as the data frame v2.

v2 = vdata %>%
  mutate(D = case_when(dur < 75 ~ "low",
                       dur > 200 ~ "high",
                       TRUE ~ "mid"))
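`case_when()` checks its conditions in order and uses the first one that is TRUE; `TRUE ~ "mid"` is the catch-all for everything else. The sketch below applies the same conditions to invented durations (`toy` is hypothetical) to show how the boundary values 75 and 200 and a missing value are classified.

```r
library(dplyr)

# Hypothetical durations in ms, including both boundaries and an NA
toy <- tibble(dur = c(50, 75, 120, 200, 250, NA))

toy2 <- toy %>%
  mutate(D = case_when(dur < 75  ~ "low",
                       dur > 200 ~ "high",
                       TRUE      ~ "mid"))

toy2$D
# → "low" "mid" "mid" "mid" "high" "mid"
```

Exactly 75 and exactly 200 count as mid because the comparisons are strict. A missing duration also ends up as mid, since neither comparison is TRUE for NA and the catch-all applies.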

Q9: For the newly created data frame v2, determine how often lax vowels (factor Tense, level -) occur in the levels low, mid, and high (factor D).

v2 %>%
  filter(Tense == "-") %>%
  select(D) %>%
  table()

## D
## high  low  mid 
##    1   27 1475

# or:
v2 %>%
  filter(Tense == "-") %>%
  group_by(D) %>%
  summarise(n())

## # A tibble: 3 × 2
##   D     `n()`
##   <chr> <int>
## 1 high      1
## 2 low      27
## 3 mid    1475

Q10: For the data frame preasp, determine which city (city) has the longest vowel duration (vdur) for the vowel o (factor vtype) and the place of articulation kk (factor cplace).

preasp %>%
  filter(vtype == "o" & cplace == "kk") %>%
  slice_max(vdur) %>%
  pull(city)

## [1] milano
## 15 Levels: bari bergamo cagliari Catanzaro firenze genova lecce ... venezia
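By default `slice_max()` returns one row, but it keeps all rows that tie for the maximum. A short sketch on invented durations (`toy` is hypothetical, not the `preasp` data) shows the difference to `with_ties = FALSE`:

```r
library(dplyr)

# Hypothetical vowel durations with a tie for the maximum (invented)
toy <- tibble(
  city = c("milano", "bari", "genova"),
  vdur = c(0.20, 0.15, 0.20)
)

# Default (with_ties = TRUE): every row sharing the maximum is returned
tied <- toy %>% slice_max(vdur) %>% pull(city)

# with_ties = FALSE: exactly one row is returned
single <- toy %>% slice_max(vdur, with_ties = FALSE) %>% pull(city)

tied
single
```

If the answer has to be a single city, it is therefore worth checking the length of the result.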

Q11: For the data frame preasp, create a new column CV containing the sum of clodur and vdur. Save this data frame as p2. Compute the mean of CV separately for the different words (word), but only in the northern region (region: N).

p2 = preasp %>%
  mutate(CV = clodur + vdur)

p2 %>%
  filter(region == "N") %>%
  group_by(word) %>%
  summarise(mean(CV))

## # A tibble: 8 × 2
##   word         `mean(CV)`
##   <fct>             <dbl>
## 1 bocca             0.301
## 2 bottoni           0.249
## 3 cappello          0.240
## 4 macchina          0.217
## 5 occhi             0.315
## 6 specchiettok      0.188
## 7 specchiettot      0.318
## 8 tetto             0.340

Solutions to Exercise 3b

Jonathan Harrington / Johanna Cronenberg
