1 Motivation

Experiment 1 was designed to test the hypothesis that, in transitive object contexts, the definiteness of a DP containing an RC affects the acceptability of extraction from that RC.

2 Design

This experiment uses the “length by complexity” design found in Sprouse, Wagers, and Phillips (2012) and others. On top of the necessary length and complexity (embedded clause structure) factors, this experiment adds a definiteness factor so that the strength of the island can be gauged in both definite and indefinite environments.

  1. Factors
    1. Length (of dependency)
      1. short (matrix subject extraction)
      2. long (embedded object extraction)
    2. Structure (of embedded clause)
      1. non-island (embedded that-clause)
      2. island (embedded RC or CP complement to N)
    3. Definiteness of the embedding/intervening DP
      1. definite (the-DP)
      2. indefinite (a-DP or bare plural)

Two sample item sets are given below. Half of the item sets used an RC as the island, and the other half used a CP complement to N as the island. The two kinds of item sets differ slightly in design, as the samples show.

  1. Sample item (island = RC)
    1. Who appreciated that the students finished the optional assignment?
      (non-island|short|definite)
    2. Who appreciated that students finished the optional assignment?
      (non-island|short|indefinite)
    3. What did Patty appreciate that the students finished?
      (non-island|long|definite)
    4. What did Patty appreciate that students finished?
      (non-island|long|indefinite)
    5. Who appreciated the students who finished the optional assignment?
      (island|short|definite)
    6. Who appreciated students who finished the optional assignment?
      (island|short|indefinite)
    7. What did Patty appreciate the students who finished?
      (island|long|definite)
    8. What did Patty appreciate students who finished?
      (island|long|indefinite)
  2. Sample item (island = CP complement to N)
    1. Who claimed that the university wants to hire Stanley?
      (non-island|short|definite)
    2. Who claimed that a university wants to hire Stanley?
      (non-island|short|indefinite)
    3. Who did Salazar claim that the university wants to hire?
      (non-island|long|definite)
    4. Who did Salazar claim that a university wants to hire?
      (non-island|long|indefinite)
    5. Who heard the claim that the university wants to hire Stanley?
      (island|short|definite)
    6. Who heard a claim that the university wants to hire Stanley?
      (island|short|indefinite)
    7. Who did Salazar hear the claim that the university wants to hire?
      (island|long|definite)
    8. Who did Salazar hear a claim that the university wants to hire?
      (island|long|indefinite)

3 Analysis

The following code chunk reads the data into the R environment.

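# Read in the formatted results; stringsAsFactors = TRUE keeps the condition
# columns as factors under R >= 4.0, which relevel() and contrasts() below require
raw_data <- read.csv("results/results_12-14(3).csv", stringsAsFactors = TRUE)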

# Fix first column name issue from excel
colnames(raw_data)[1] <- "list"

# Remove columns that won't be used
raw_data %<>% subset(select=c(list:gender, q6.response.categorized:rating))

# Sort the data into experimental data and filler data
raw_data %>% subset(item.type=="experimental") %>% droplevels -> experiment_data
raw_data %>% subset(item.type=="filler") %>% droplevels -> filler_data

The following table summarizes the ratings by condition. The lowest-rated conditions are those with an embedded island and a long extraction (island|long). This is the expected island effect. The indefinite conditions are also rated lower across the board than the corresponding definite conditions.

# Make sure the ratings are numeric so they can be averaged
experiment_data$rating %<>% as.numeric

# Reorder factor levels so that the baseline levels come first
experiment_data$ec.type <- relevel(experiment_data$ec.type, "non.isl")
experiment_data$dependency.length <- relevel(experiment_data$dependency.length, "short")

# Group the data in long format according to these properties
experiment_data %>%
  group_by(ec.type, dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary

# Save the data for use in other scripts
saveRDS(descriptive_summary, file = "expt1_descriptive_summary")

# Present in a table
print(descriptive_summary)

It is worth directly comparing the two kinds of islands used (RC islands and CP-complement-to-N islands). Although both clause types traditionally fall under the Complex NP Constraint, it is generally agreed that extraction from adjuncts is less acceptable than extraction from complements. As shown in the table below, every condition with a CP complement to N (N-CP in the table) is rated slightly higher than its corresponding RC condition.

The following plot shows the mean ratings by condition. The left panel shows the definite conditions and the right panel the indefinite conditions. The indefinite conditions are rated lower overall than the definite conditions: the ratings are shifted down but display the same general pattern.

The following plot represents the island conditions, comparing the ratings across the two island types.

3.1 Calculating the DD scores (island strength)

# Copy raw ratings to another data frame I can modify
raw_data -> raw_data_shrt

# Paste the length and structure conditions together
raw_data_shrt$structureXlength <- paste(raw_data_shrt$ec.type, "x", raw_data_shrt$dependency.length)

# Calculate z-scores
raw_data_shrt %>%
  group_by(subject) %>% # Group raw results by subject
  mutate(z_rating = scale(rating)) %>% # Get z-scores for each subject's ratings
  ungroup %>% # Undo group by subject
  subset(item.type == "experimental") %>% droplevels %>% # Select only experimental conditions, drop unused levels
  group_by(definiteness, structureXlength, item.set) %>% # Group by definiteness, pasted structureXlength, and item set
  summarize(mean_z_rating = mean(z_rating)) %>% # Average the z-scores per condition per item
  group_by(definiteness, structureXlength) %>% # Group this summary by condition only so that means of each condition per item set can be averaged
  summarize(mean_z_ratings = mean(mean_z_rating)) -> summary_zscores # Get the mean of the mean z-scores
print(summary_zscores)

# Get table to start making DD scores
z_DDs <- summary_zscores %>% spread(structureXlength, mean_z_ratings)

# Calculate D1, add to table
z_DDs$D1 <- z_DDs$`non.isl x long` - z_DDs$`isl x long`

# Calculate D2, add to table
z_DDs$D2 <- z_DDs$`non.isl x short` - z_DDs$`isl x short`

# Calculate DD, add to table
z_DDs$DD <- z_DDs$D1 - z_DDs$D2
print(z_DDs)

3.2 Ordinal regression analyses

# Reassign contrasts
contrasts(experiment_data$definiteness) <- c(-0.5, 0.5)
contrasts(experiment_data$ec.type) <- c(0.5, -0.5)
contrasts(experiment_data$dependency.length) <- c(0.5, -0.5)
# Make sure the ratings are a factor, rather than numbers
experiment_data$rating %<>% as.factor

# Run simple effects analysis w/ rating as dependent variable, and definiteness, structure, and length and their interactions as fixed effects
clm(rating ~ definiteness * ec.type * dependency.length, data = experiment_data) -> clm.def_ectype
summary(clm.def_ectype)
formula: rating ~ definiteness * ec.type * dependency.length
data:    experiment_data

Coefficients:
                                           Estimate Std. Error z value Pr(>|z|)    
definiteness1                             -0.485471   0.115190  -4.215 2.50e-05 ***
ec.type1                                   1.331746   0.121241  10.984  < 2e-16 ***
dependency.length1                         2.153736   0.129317  16.655  < 2e-16 ***
definiteness1:ec.type1                    -0.005638   0.229327  -0.025    0.980    
definiteness1:dependency.length1          -0.126322   0.229682  -0.550    0.582    
ec.type1:dependency.length1               -1.584530   0.238251  -6.651 2.92e-11 ***
definiteness1:ec.type1:dependency.length1  0.008292   0.458784   0.018    0.986    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -2.46784    0.11344 -21.755
2|3 -1.22807    0.08523 -14.409
3|4 -0.27012    0.07424  -3.638
4|5  0.54304    0.07367   7.371
5|6  1.48061    0.08265  17.915

Does the type of island have any effect on transparency to extraction? Let’s include island type as a fixed effect.

## Make sure ratings are a factor
experiment_data$rating %<>% factor

## Reassign contrasts
contrasts(experiment_data$island.type) <- c(0.5, -0.5)

## Run clm analysis
clm(rating ~ definiteness * ec.type * dependency.length * island.type, data = experiment_data) -> clm.w.island.type
summary(clm.w.island.type)
formula: rating ~ definiteness * ec.type * dependency.length * island.type
data:    experiment_data

Coefficients:
                                                        Estimate Std. Error z value Pr(>|z|)    
definiteness1                                          -0.515398   0.115868  -4.448 8.66e-06 ***
ec.type1                                                1.364246   0.121955  11.186  < 2e-16 ***
dependency.length1                                      2.188875   0.129987  16.839  < 2e-16 ***
island.type1                                            0.588257   0.116146   5.065 4.09e-07 ***
definiteness1:ec.type1                                  0.007104   0.230461   0.031    0.975    
definiteness1:dependency.length1                       -0.119952   0.230835  -0.520    0.603    
ec.type1:dependency.length1                            -1.628827   0.239307  -6.806 1.00e-11 ***
definiteness1:island.type1                              0.271805   0.230526   1.179    0.238    
ec.type1:island.type1                                  -0.290002   0.230782  -1.257    0.209    
dependency.length1:island.type1                        -0.300368   0.231074  -1.300    0.194    
definiteness1:ec.type1:dependency.length1               0.021228   0.461083   0.046    0.963    
definiteness1:ec.type1:island.type1                    -0.266892   0.460756  -0.579    0.562    
definiteness1:dependency.length1:island.type1           0.268622   0.461020   0.583    0.560    
ec.type1:dependency.length1:island.type1                0.278444   0.461669   0.603    0.546    
definiteness1:ec.type1:dependency.length1:island.type1  0.459113   0.921796   0.498    0.618    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -2.52152    0.11628 -21.685
2|3 -1.23754    0.08623 -14.352
3|4 -0.25760    0.07497  -3.436
4|5  0.56949    0.07449   7.645
5|6  1.51554    0.08358  18.134

Island type had a significant main effect (p < 0.001), but an effect of island type on the island effect itself would have shown up as a three-way interaction between structure (ec.type), length (dependency.length), and island type, and this interaction was not significant (p = 0.546).

Mixed effects model by subjects and by items

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: rating ~ definiteness * ec.type * dependency.length + (1 + definiteness *  
    ec.type * dependency.length | subject) + (1 + definiteness *  
    ec.type * dependency.length | item.set)
data:    experiment_data

Random effects:
 Groups   Name                                                  Variance Std.Dev. Corr
 item.set (Intercept)                                           1.7862   1.3365
          definitenessind                                       0.1050   0.3241    0.073
          ec.typenon.isl                                        2.0225   1.4221   -0.850 -0.504
          dependency.lengthshort                                2.8047   1.6747   -0.613 -0.228  0.785
          definitenessind:ec.typenon.isl                        0.3558   0.5965    0.552 -0.004 -0.625 -0.478
          definitenessind:dependency.lengthshort                0.4556   0.6750   -0.579 -0.480  0.731  0.252 -0.770
          ec.typenon.isl:dependency.lengthshort                 0.4607   0.6788    0.816  0.413 -0.852 -0.760  0.189 -0.325
          definitenessind:ec.typenon.isl:dependency.lengthshort 0.9804   0.9901    0.901  0.122 -0.898 -0.758  0.784 -0.692  0.692
 subject  (Intercept)                                           0.5943   0.7709
          definitenessind                                       0.9241   0.9613    0.053
          ec.typenon.isl                                        3.0084   1.7345   -0.563  0.523
          dependency.lengthshort                                0.7174   0.8470    0.097 -0.357  0.100
          definitenessind:ec.typenon.isl                        2.0440   1.4297    0.292 -0.516 -0.736  0.182
          definitenessind:dependency.lengthshort                1.1113   1.0542   -0.480 -0.240  0.471  0.515 -0.544
          ec.typenon.isl:dependency.lengthshort                 1.9321   1.3900    0.104 -0.073 -0.684 -0.543  0.323 -0.362
          definitenessind:ec.typenon.isl:dependency.lengthshort 2.5198   1.5874    0.232  0.067 -0.209 -0.777 -0.277 -0.334  0.281
Number of groups:  subject 32,  item.set 32 

Coefficients:
                                          Estimate Std. Error z value Pr(>|z|)    
definiteness1                              -0.7634     0.1757  -4.345 1.39e-05 ***
ec.type1                                   -1.9688     0.2523  -7.802 6.10e-15 ***
dependency.length1                         -3.2657     0.3306  -9.877  < 2e-16 ***
definiteness1:ec.type1                      0.1490     0.4126   0.361    0.718    
definiteness1:dependency.length1            0.1955     0.3421   0.572    0.568    
ec.type1:dependency.length1                -2.2987     0.4679  -4.913 8.97e-07 ***
definiteness1:ec.type1:dependency.length1  -0.1072     0.6335  -0.169    0.866    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -3.5965     0.2744 -13.108
2|3  -1.8514     0.2482  -7.459
3|4  -0.4478     0.2391  -1.873
4|5   0.7636     0.2386   3.200
5|6   2.1879     0.2480   8.824

Sprouse, Jon, Matthew W. Wagers, and Colin Phillips. 2012. “A test of the relation between working memory capacity and syntactic island effects.” Language 88 (1): 82–123.

---
title: "RC subextraction in English: Experiment 1 notebook"
author: Jake W. Vincent (&#106;&#119;&#118;&#105;&#110;&#99;&#101;n&#64;&#117;c&#115;c.e&#100;u)
output:
  html_notebook: 
    fig_caption: yes
    number_sections: yes
    theme: flatly
    code_folding: show
    css: style.css
  pdf_document: 
    keep_tex: yes
bibliography: ../../../../../../Documents/library.bib
---

```{r setup, include = FALSE}
library(magrittr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(ordinal)
library(knitr)
## Set working directory
setwd("C:/Users/jakew/OneDrive/Documents/School/UCSC Grad school/QP2/Experiment 1/")
## Load saved clmm analysis
readRDS("clmm_def_ectype.rds") -> clmm.def_ectype
```

# Motivation

Experiment 1 was designed to test the hypothesis that, in transitive object contexts, the definiteness of a DP containing an RC affects the acceptability of extraction from that RC.

# Design

This experiment uses the "length by complexity" design found in @Sprouse2012 and others. On top of the necessary length and complexity (embedded clause structure) factors, this experiment adds a definiteness factor so that the strength of the island can be gauged in both definite and indefinite environments.

1. **Factors**
    a. <span class="smallcaps">Length</span> (of dependency)
        i. <span class="smallcaps">short</span> (matrix subject extraction)
        ii. <span class="smallcaps">long</span> (embedded object extraction)
    b. <span class="smallcaps">Structure</span> (of embedded clause)
        i. <span class="smallcaps">non-island</span> (embedded *that*-clause)
        ii. <span class="smallcaps">island</span> (embedded RC or CP complement to N)
    c. <span class="smallcaps">Definiteness</span> of the embedding/intervening DP
        i. <span class="smallcaps">definite</span> (*the*-DP)
        ii. <span class="smallcaps">indefinite</span> (*a*-DP or bare plural)

Two sample item sets are given below. Half of the item sets used an RC as the island, and the other half used a CP complement to N as the island. The two kinds of item sets differ slightly in design, as the samples show.

2. **Sample item** (island = RC)
    a. Who appreciated that the students finished the optional assignment?<div class="alignright">(<span class="smallcaps">non-island|short|definite</span>)</div>
    b. Who appreciated that students finished the optional assignment? <div class="alignright">(<span class="smallcaps">non-island|short|indefinite</span>)</div>
    c. What did Patty appreciate that the students finished? <div class="alignright">(<span class="smallcaps">non-island|long|definite</span>)</div>
    d. What did Patty appreciate that students finished? <div class="alignright">(<span class="smallcaps">non-island|long|indefinite</span>)</div>
    e. Who appreciated the students who finished the optional assignment? <div class="alignright">(<span class="smallcaps">island|short|definite</span>)</div>
    f. Who appreciated students who finished the optional assignment? <div class="alignright">(<span class="smallcaps">island|short|indefinite</span>)</div>
    g. What did Patty appreciate the students who finished? <div class="alignright">(<span class="smallcaps">island|long|definite</span>)</div>
    h. What did Patty appreciate students who finished? <div class="alignright">(<span class="smallcaps">island|long|indefinite</span>)</div>

3. **Sample item** (island = CP complement to N)
    a. Who claimed that the university wants to hire Stanley?<div class="alignright">(<span class="smallcaps">non-island|short|definite</span>)</div>
    b. Who claimed that a university wants to hire Stanley?<div class="alignright">(<span class="smallcaps">non-island|short|indefinite</span>)</div>
    c. Who did Salazar claim that the university wants to hire?<div class="alignright">(<span class="smallcaps">non-island|long|definite</span>)</div>
    d. Who did Salazar claim that a university wants to hire?<div class="alignright">(<span class="smallcaps">non-island|long|indefinite</span>)</div>
    e. Who heard the claim that the university wants to hire Stanley?<div class="alignright">(<span class="smallcaps">island|short|definite</span>)</div>
    f. Who heard a claim that the university wants to hire Stanley?<div class="alignright">(<span class="smallcaps">island|short|indefinite</span>)</div>
    g. Who did Salazar hear the claim that the university wants to hire?<div class="alignright">(<span class="smallcaps">island|long|definite</span>)</div>
    h. Who did Salazar hear a claim that the university wants to hire?<div class="alignright">(<span class="smallcaps">island|long|indefinite</span>)</div>
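
The eight conditions in each sample item set are the full crossing of the three two-level factors listed in (1). As a minimal sketch, the crossing can be enumerated in base R. The level labels below are the ones the analysis code uses (`def`/`ind`, `short`/`long`, `non.isl`/`isl`); the `design` data frame and its `condition` column are hypothetical names introduced only for illustration.

```r
# Enumerate the 2 x 2 x 2 design; expand.grid() varies the first factor fastest,
# so the rows come out in the same order as items (a)-(h) in the samples above
design <- expand.grid(
  definiteness      = c("def", "ind"),
  dependency.length = c("short", "long"),
  ec.type           = c("non.isl", "isl"),
  stringsAsFactors  = TRUE
)

# Build a label per condition in the structure|length|definiteness format
design$condition <- with(design,
                         paste(ec.type, dependency.length, definiteness, sep = "|"))

print(design)
```

Each row corresponds to one sentence of an item set, so a participant who sees every item set in exactly one condition contributes ratings to all eight cells.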

# Analysis

The following code chunk reads the data into the R environment.

```{r getData}
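# Read in the formatted results; stringsAsFactors = TRUE keeps the condition
# columns as factors under R >= 4.0, which relevel() and contrasts() below require
raw_data <- read.csv("results/results_12-14(3).csv", stringsAsFactors = TRUE)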

# Fix first column name issue from excel
colnames(raw_data)[1] <- "list"

# Remove columns that won't be used
raw_data %<>% subset(select=c(list:gender, q6.response.categorized:rating))

# Sort the data into experimental data and filler data
raw_data %>% subset(item.type=="experimental") %>% droplevels -> experiment_data
raw_data %>% subset(item.type=="filler") %>% droplevels -> filler_data
```

The following table summarizes the ratings by condition. The lowest-rated conditions are those with an embedded island and a long extraction (<span class="smallcaps">island|long</span>). This is the expected island effect. The indefinite conditions are also rated lower across the board than the corresponding definite conditions.

``` {r descriptiveSummary}
# Make sure the ratings are numeric so they can be averaged
experiment_data$rating %<>% as.numeric

# Reorder factor levels so that the baseline levels come first
experiment_data$ec.type <- relevel(experiment_data$ec.type, "non.isl")
experiment_data$dependency.length <- relevel(experiment_data$dependency.length, "short")

# Group the data in long format according to these properties
experiment_data %>%
  group_by(ec.type, dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary

# Save the data for use in other scripts
saveRDS(descriptive_summary, file = "expt1_descriptive_summary")

# Present in a table
print(descriptive_summary)
```

It is worth directly comparing the two kinds of islands used (RC islands and CP-complement-to-N islands). Although both clause types traditionally fall under the Complex NP Constraint, it is generally agreed that extraction from adjuncts is less acceptable than extraction from complements. As shown in the table below, every condition with a CP complement to N (N-CP in the table) is rated slightly higher than its corresponding RC condition.

```{r descriptiveSummaryIslandsByType, echo = FALSE}
# Subset the data so it only includes the island conditions
experiment_data %>% subset(ec.type == "isl") -> experiment_data_isl

# Organize the data by island type and condition
experiment_data_isl %>%
  group_by(island.type, dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary_isl

# Present in a table
print(descriptive_summary_isl)
```

The following plot shows the mean ratings by condition. The left panel shows the definite conditions and the right panel the indefinite conditions. The indefinite conditions are rated lower overall than the definite conditions: the ratings are shifted down but display the same general pattern.

``` {r descriptivePlot, echo=FALSE}
# Plot the ratings averages
descriptive_summary %>%
  ggplot(aes(x = dependency.length,
             y = mean.rating,
             color = ec.type,
             group = ec.type)) -> descriptive_plot

# Facet by definiteness
descriptive_plot + facet_grid(.~definiteness) +
  theme_minimal() +
  # Change the default labels
  labs(x = "Length",
       y = "Mean rating",
       color = "Structure") +
  scale_color_discrete(labels = c("Non-island", "Island")) +
  # Add error bars based on the standard errors in descriptive_summary
  geom_errorbar(aes(ymin = mean.rating - se.rating,
                    ymax = mean.rating + se.rating),
                width = 0.15) +
  geom_point(aes(col = ec.type),
             size = 3) +
  # Change font sizes
  theme(legend.text = element_text(size = 12),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15),
        axis.title.x = element_text(margin = margin(0.5, NA, 0.5, NA, "cm")),
        axis.title.y = element_text(margin = margin(NA, 0.5, NA, 0.5, "cm"))) -> descriptive_plot

# Print the plot
print(descriptive_plot)
```

The following plot represents the island conditions, comparing the ratings across the two island types.

```{r descriptivePlotIsl, echo = FALSE}
descriptive_summary_isl %>%
  ggplot(aes(x = dependency.length,
             y = mean.rating,
             color = island.type,
             group = island.type)) -> descriptive_plot_isl

descriptive_plot_isl + facet_grid(.~definiteness) +
  theme_minimal() +
  labs(x = "Length",
       y = "Mean rating",
       colour = "Island type") +
  geom_point(aes(col = island.type),
             size = 3) +
  geom_errorbar(aes(ymin = mean.rating - se.rating,
                    ymax = mean.rating + se.rating,
                    col = island.type),
                width = 0.15) +
  theme(legend.text = element_text(size = 12),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15),
        axis.text.x = element_text(margin = margin(0.5, NA, 0.5, NA, "cm")),
        axis.text.y = element_text(margin = margin(NA, 0.5, NA, 0.5, "cm"))) -> descriptive_plot_isl
descriptive_plot_isl
```

<!-- The two following plots are based on the subset of the data that has the RC item sets separated from the N-CP item sets, respectively. -->
```{r ratingsPlotRC, echo = FALSE, include = FALSE}
## Subset RC item sets
experiment_data %>% subset(island.type == "RC") -> experiment_data_RC
## Organize
experiment_data_RC %>%
  group_by(dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary_RC
## Make a plot
descriptive_summary_RC %>%
  ggplot(aes(x = dependency.length, y = mean.rating)) -> descriptive_plot_RC
descriptive_plot_RC + geom_pointrange(aes(ymin = mean.rating - se.rating,
                                          ymax = mean.rating + se.rating,
                                          col = definiteness))
```

```{r ratingsPlotNCP, echo= FALSE, include = FALSE}
experiment_data %>% subset(island.type == "N-CP")-> experiment_data_NCP
experiment_data_NCP %>%
  group_by(dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary_NCP
## Make a plot
descriptive_summary_NCP %>%
  ggplot(aes(x = dependency.length, y = mean.rating)) -> descriptive_plot_NCP
descriptive_plot_NCP + geom_pointrange(aes(ymin = mean.rating - se.rating,
                                           ymax = mean.rating + se.rating,
                                           col = definiteness))
```

```{r ratingsPlot, echo = FALSE, include = FALSE}
## This produces a ratings plot that abstracts away from definiteness
experiment_data %>% ggplot(aes(x = rating)) -> ratings_plot

ratings_plot + facet_grid(ec.type ~ dependency.length) +
  geom_histogram(bins=6)
```

<!-- #### Now looking only at islands... -->

```{r ratingsPlotIsland, echo = FALSE, include = FALSE}
experiment_data %>% subset(ec.type=="isl") %>% droplevels %>% ggplot(aes(x = rating)) -> isl_ratings_plot

isl_ratings_plot + facet_grid(definiteness ~ dependency.length) +
  geom_histogram(bins=6)
```

```{r ratingsPlotIslandFillByDef, echo = FALSE, include = FALSE}
## Make sure the ratings column is numeric
experiment_data$rating %<>% as.numeric

## The following line is the same as above, just with a different variable assigned to the graph
experiment_data %>% subset(ec.type=="isl") %>% droplevels %>% ggplot(aes(x = rating)) -> isl_ratings_plot_defFill

## Reduce the faceting so that there are only two blocks in the graph: long and short
## Fill by definiteness.
isl_ratings_plot_defFill +
  theme_minimal() +
  facet_grid(. ~ dependency.length) +
  geom_histogram(bins=6,
                 aes(fill = definiteness=="def"),
                 position = "identity",
                 alpha = 0.5) +
  ## Edit the labels
  labs(x = "Rating",
       y = "Count (of rating)") +
  ## Edit the legend title and labels
  ## Using fill here b/c the histogram was generated using 'fill'
  scale_fill_discrete("Definiteness", labels = c("Indefinite", "Definite")) +
  theme(legend.text = element_text(size = 12),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 15),
        strip.text.x = element_text(size = 15),
        strip.text.y = element_text(size = 15)) -> isl_ratings_plot_defFill
isl_ratings_plot_defFill
```

## Calculating the DD scores (island strength)
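
Island strength is measured here with a differences-in-differences (DD) score over z-scored ratings, following @Sprouse2012. The chunk in this section computes the score separately for each level of definiteness. Writing $\bar{z}$ for the mean z-scored rating of a condition (this notation is introduced here for illustration and is not used elsewhere in the notebook), the three quantities the chunk calculates are:

$$
\begin{aligned}
D_1 &= \bar{z}_{\text{non-island, long}} - \bar{z}_{\text{island, long}} \\
D_2 &= \bar{z}_{\text{non-island, short}} - \bar{z}_{\text{island, short}} \\
DD &= D_1 - D_2
\end{aligned}
$$

$D_2$ captures the cost of the island structure when nothing is extracted from it, so subtracting it from $D_1$ leaves the part of the penalty that arises only when a long dependency reaches into the island. A DD score near zero indicates no such extra penalty, and a larger positive DD score indicates a stronger island effect.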

```{r zscores_for_DDs}
# Copy raw ratings to another data frame I can modify
raw_data -> raw_data_shrt

# Paste the length and structure conditions together
raw_data_shrt$structureXlength <- paste(raw_data_shrt$ec.type, "x", raw_data_shrt$dependency.length)

# Calculate z-scores
raw_data_shrt %>%
  group_by(subject) %>% # Group raw results by subject
  mutate(z_rating = scale(rating)) %>% # Get z-scores for each subject's ratings
  ungroup %>% # Undo group by subject
  subset(item.type == "experimental") %>% droplevels %>% # Select only experimental conditions, drop unused levels
  group_by(definiteness, structureXlength, item.set) %>% # Group by definiteness, pasted structureXlength, and item set
  summarize(mean_z_rating = mean(z_rating)) %>% # Average the z-scores per condition per item
  group_by(definiteness, structureXlength) %>% # Group this summary by condition only so that means of each condition per item set can be averaged
  summarize(mean_z_ratings = mean(mean_z_rating)) -> summary_zscores # Get the mean of the mean z-scores
print(summary_zscores)

# Get table to start making DD scores
z_DDs <- summary_zscores %>% spread(structureXlength, mean_z_ratings)

# Calculate D1, add to table
z_DDs$D1 <- z_DDs$`non.isl x long` - z_DDs$`isl x long`

# Calculate D2, add to table
z_DDs$D2 <- z_DDs$`non.isl x short` - z_DDs$`isl x short`

# Calculate DD, add to table
z_DDs$DD <- z_DDs$D1 - z_DDs$D2
print(z_DDs)
```
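
As a rough illustration of what the chunk above does, the following base-R sketch runs the same steps on made-up data: z-score each subject's ratings, average per structure-by-length cell, and take the difference of differences. The `toy` data frame and all of its values are hypothetical. Unlike the real analysis, it has no fillers, no item sets, and no definiteness factor.

```r
# Hypothetical toy ratings: two subjects who use the scale differently
toy <- data.frame(
  subject   = rep(c("s1", "s2"), each = 4),
  structure = rep(c("non.isl", "non.isl", "isl", "isl"), times = 2),
  length    = rep(c("short", "long"), times = 4),
  rating    = c(6, 5, 6, 2,   # s1 uses the upper part of the scale
                4, 3, 4, 1)   # s2 uses the lower part of the scale
)

# z-score each subject's ratings against that subject's own mean and sd
toy$z <- ave(toy$rating, toy$subject,
             FUN = function(x) (x - mean(x)) / sd(x))

# Mean z-score per structure-by-length cell
cell <- tapply(toy$z, list(toy$structure, toy$length), mean)

# Differences-in-differences, mirroring D1, D2, and DD in the chunk above
D1 <- cell["non.isl", "long"]  - cell["isl", "long"]
D2 <- cell["non.isl", "short"] - cell["isl", "short"]
DD <- D1 - D2
print(c(D1 = D1, D2 = D2, DD = DD))
```

Z-scoring within subject removes differences in how individual participants use the rating scale, which is why the two toy subjects contribute comparably even though their raw ratings sit in different ranges.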

```{r DDs_byitem, include = FALSE, echo = FALSE}
# Paste item.set and definiteness together
raw_data_shrt -> raw_data_shrt2
raw_data_shrt2$item_definiteness <- paste(raw_data_shrt2$item.set,"_", raw_data_shrt2$definiteness)

# Get z-scores for each item
raw_data_shrt2 %>%
  group_by(subject) %>% # Group raw results by subject
  mutate(z_rating = scale(rating)) %>% # Get z-scores for each subject's ratings
  ungroup %>% # Undo group by subject
  subset(item.type == "experimental") %>% droplevels %>% # Select only experimental conditions, drop unused levels
  group_by(item_definiteness, structureXlength) %>% # Group by pasted item set/definiteness and pasted structureXlength
  summarize(mean_z_rating = mean(z_rating)) -> summary_zscores_byitem # Average the z-scores per condition per item
print(summary_zscores_byitem)
# Calculate DD scores
# Get table to start making DD scores
z_DDs_byitem <- summary_zscores_byitem %>% spread(structureXlength, mean_z_rating)

# Calculate D1, add to table
z_DDs_byitem$D1 <- z_DDs_byitem$`non.isl x long` - z_DDs_byitem$`isl x long`

# Calculate D2, add to table
z_DDs_byitem$D2 <- z_DDs_byitem$`non.isl x short` - z_DDs_byitem$`isl x short`

# Calculate DD, add to table
z_DDs_byitem$DD <- z_DDs_byitem$D1 - z_DDs_byitem$D2
print(z_DDs_byitem)

# Get items whose DDs are less than 0.25
subset(z_DDs_byitem[1], z_DDs_byitem$DD < 0.25)
```


## Ordinal regression analyses

``` {r contrasts}
# Reassign contrasts
contrasts(experiment_data$definiteness) <- c(-0.5, 0.5)
contrasts(experiment_data$ec.type) <- c(0.5, -0.5)
contrasts(experiment_data$dependency.length) <- c(0.5, -0.5)
```
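
The effect of this recoding can be seen on a small example. The sketch below uses a hypothetical two-level factor `f` standing in for a column such as `definiteness`; it is not part of the analysis. It shows the default treatment coding, the ±0.5 coding assigned above, and the resulting design-matrix column, whose name ends in `1` in the same way as the coefficient names in the model summaries.

```r
# Hypothetical two-level factor standing in for a column like definiteness
f <- factor(c("def", "ind", "def", "ind"))

# Default treatment coding: the first level is the baseline (0), the second is 1
default_coding <- contrasts(f)

# Replace it with +/-0.5 coding, as the chunk above does
contrasts(f) <- c(-0.5, 0.5)
centered_coding <- contrasts(f)

# The design-matrix column is now named "f1" and takes the values -0.5 and 0.5
X <- model.matrix(~ f)
print(default_coding)
print(centered_coding)
print(X)
```

With `c(-0.5, 0.5)` the coefficient estimates the second level minus the first, and with `c(0.5, -0.5)` the first minus the second, so the sign of each estimate has to be read against the level order of the factor in question. Under this coding, and in a balanced design, each main effect is evaluated at the midpoint of the other factors rather than at their baseline levels.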

```{r clmAnalysis}
# Make sure the ratings are a factor, rather than numbers
experiment_data$rating %<>% as.factor

# Run simple effects analysis w/ rating as dependent variable, and definiteness, structure, and length and their interactions as fixed effects
clm(rating ~ definiteness * ec.type * dependency.length, data = experiment_data) -> clm.def_ectype
summary(clm.def_ectype)
```
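
To help read the threshold coefficients in this summary, the sketch below converts them into predicted rating probabilities. It assumes the default logit link of `clm()` (the call above does not set `link`), under which logit(P(rating ≤ j)) = θ_j − η, where η is the linear predictor. The threshold and definiteness values are copied from the rendered summary of `clm.def_ectype`; the helper `rating_probs()` is a hypothetical name introduced for illustration.

```r
# Threshold estimates as reported in the summary of clm.def_ectype
theta <- c(`1|2` = -2.46784, `2|3` = -1.22807, `3|4` = -0.27012,
           `4|5` = 0.54304, `5|6` = 1.48061)

# Category probabilities for a given linear predictor eta under the logit link:
# P(rating <= j) = plogis(theta_j - eta)
rating_probs <- function(eta, theta) {
  cum <- c(plogis(theta - eta), 1)   # append P(rating <= 6) = 1
  probs <- diff(c(0, cum))           # turn cumulative into per-category probabilities
  names(probs) <- seq_along(probs)
  probs
}

# eta = 0 is the (hypothetical) midpoint of all three +/-0.5-coded factors
p_mid <- rating_probs(0, theta)

# Setting the definiteness contrast to +0.5, with the other factors held at the
# midpoint, shifts eta by half the definiteness estimate
p_shift <- rating_probs(-0.485471 * 0.5, theta)

print(round(rbind(midpoint = p_mid, shifted = p_shift), 3))
```

A negative shift in η moves probability mass toward the lower rating categories. Which definiteness level carries the +0.5 value depends on the level order of the factor in the data, so the sketch does not name it.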

Does the type of island have any effect on transparency to extraction? Let's include island type as a fixed effect.

```{r withIslandType}
## Make sure ratings are a factor
experiment_data$rating %<>% factor

## Reassign contrasts
contrasts(experiment_data$island.type) <- c(0.5, -0.5)

## Run clm analysis
clm(rating ~ definiteness * ec.type * dependency.length * island.type, data = experiment_data) -> clm.w.island.type
summary(clm.w.island.type)
```

<span class="smallcaps">Island type</span> had a significant main effect (p < 0.001), but an effect of <span class="smallcaps">island type</span> on the island effect itself would have shown up as a three-way interaction between <span class="smallcaps">structure</span> (ec.type), <span class="smallcaps">length</span> (dependency.length), and <span class="smallcaps">island type</span>, and this interaction was not significant (p = 0.546).

**Mixed effects model by subjects and by items**

```{r echo = FALSE}
summary(clmm.def_ectype)
```
