1 Motivation

Experiment 1 was designed to test the hypothesis that, in transitive object contexts, the definiteness of a DP containing an RC affects the acceptability of extraction from that RC.

2 Design

This experiment uses the “length by complexity” design found in Sprouse, Wagers, and Phillips (2012) and others. On top of the necessary length and complexity (embedded clause structure) factors, this experiment adds a definiteness factor so that the strength of the island can be gauged in both definite and indefinite environments.

  1. Factors
    1. Length (of dependency)
      1. short (matrix subject extraction)
      2. long (embedded object extraction)
    2. Structure (of embedded clause)
      1. non-island (embedded that-clause)
      2. island (embedded RC or CP complement to N)
    3. Definiteness of the embedding/intervening DP
      1. definite (the-DP)
      2. indefinite (a-DP or bare plural)
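
Crossing the three two-level factors above yields eight conditions per item set. As a minimal sketch (the labels here are illustrative and are not the column names or level codes used in the analysis code), the full crossing can be enumerated in R:

```r
# Hypothetical labels for illustration; the results file uses its own coding
design <- expand.grid(
  definiteness = c("definite", "indefinite"),
  length       = c("short", "long"),
  structure    = c("non-island", "island"),
  stringsAsFactors = TRUE
)

# Build a condition label in the structure|length|definiteness format
design$condition <- paste(design$structure, design$length,
                          design$definiteness, sep = "|")
print(design$condition)
```

Because expand.grid() varies its first argument fastest, the rows come out in the same order as the eight sentences in each sample item set.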

Two sample item sets are given below. Half of the items used an RC as the island, and the other half used a CP complement to N. The CP-complement items had a slightly different design, as the second sample set shows.

  1. Sample item (island = RC)
    1. Who appreciated that the students finished the optional assignment?
      (non-island|short|definite)
    2. Who appreciated that students finished the optional assignment?
      (non-island|short|indefinite)
    3. What did Patty appreciate that the students finished?
      (non-island|long|definite)
    4. What did Patty appreciate that students finished?
      (non-island|long|indefinite)
    5. Who appreciated the students who finished the optional assignment?
      (island|short|definite)
    6. Who appreciated students who finished the optional assignment?
      (island|short|indefinite)
    7. What did Patty appreciate the students who finished?
      (island|long|definite)
    8. What did Patty appreciate students who finished?
      (island|long|indefinite)
  2. Sample item (island = CP complement to N)
    1. Who claimed that the university wants to hire Stanley?
      (non-island|short|definite)
    2. Who claimed that a university wants to hire Stanley?
      (non-island|short|indefinite)
    3. Who did Salazar claim that the university wants to hire?
      (non-island|long|definite)
    4. Who did Salazar claim that a university wants to hire?
      (non-island|long|indefinite)
    5. Who heard the claim that the university wants to hire Stanley?
      (island|short|definite)
    6. Who heard a claim that the university wants to hire Stanley?
      (island|short|indefinite)
    7. Who did Salazar hear the claim that the university wants to hire?
      (island|long|definite)
    8. Who did Salazar hear a claim that the university wants to hire?
      (island|long|indefinite)

3 Analysis

The following code chunk reads the data into the R environment.

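# Load the packages used in this analysis
library(magrittr) # %>% and %<>% pipes
library(dplyr)    # group_by(), summarize(), mutate()
library(tidyr)    # spread()
library(ordinal)  # clm()

# Read in the formatted results; stringsAsFactors = TRUE is needed in R >= 4.0
# so that relevel() and contrasts() below receive factors
raw_data <- read.csv("results/results_12-14(3).csv", stringsAsFactors = TRUE)

# Fix first column name issue from Excel
colnames(raw_data)[1] <- "list"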

# Remove columns that won't be used
raw_data %<>% subset(select=c(list:gender, q6.response.categorized:rating))

# Sort the data into experimental data and filler data
raw_data %>% subset(item.type=="experimental") %>% droplevels -> experiment_data
raw_data %>% subset(item.type=="filler") %>% droplevels -> filler_data

The following table summarizes the ratings by condition. The lowest-rated conditions are those with an embedded island and a long extraction (island|long). This is the expected island effect. Another readily observable pattern is that each indefinite condition is rated lower than its corresponding definite condition.

# Make sure the ratings are numeric so they can be averaged
experiment_data$rating %<>% as.numeric

# Reorder factor levels so that the baseline levels come first
experiment_data$ec.type <- relevel(experiment_data$ec.type, "non.isl")
experiment_data$dependency.length <- relevel(experiment_data$dependency.length, "short")

# Group the data in long format according to these properties
experiment_data %>%
  group_by(ec.type, dependency.length, definiteness) %>%
  summarize(mean.rating = mean(rating),
            sd.rating = sd(rating),
            n = n(),
            se.rating = sd.rating/sqrt(n)) -> descriptive_summary

# Save the data for use in other scripts
saveRDS(descriptive_summary, file = "expt1_descriptive_summary")

# Present in a table
print(descriptive_summary)

It is worth directly comparing the two kinds of islands used (RC islands and CP-complement-to-N islands). Although both clause types traditionally fall under the Complex NP Constraint, it is generally agreed that extraction from adjuncts (such as RCs) is less acceptable than extraction from complements. As shown in the table below, every condition with a CP complement to N (N-CP in the table) is rated slightly higher than its corresponding RC condition.

The following plot represents the overall ratings. The left side shows the definite conditions and the right side the indefinite conditions. The indefinite conditions are rated lower overall than the definite conditions: the ratings are shifted down but display the same general pattern.

The following plot represents the island conditions, comparing the ratings across the two island types.

3.1 Calculating the DD scores (island strength)

# Copy raw ratings to another data frame I can modify
raw_data -> raw_data_shrt

# Paste the length and structure conditions together
raw_data_shrt$structureXlength <- paste(raw_data_shrt$ec.type, "x", raw_data_shrt$dependency.length)

# Calculate z-scores
raw_data_shrt %>%
  group_by(subject) %>% # Group raw results by subject
  mutate(z_rating = as.numeric(scale(rating))) %>% # Get z-scores for each subject's ratings (as.numeric drops the matrix that scale() returns)
  ungroup %>% # Undo group by subject
  subset(item.type == "experimental") %>% droplevels %>% # Select only experimental conditions, drop unused levels
  group_by(definiteness, structureXlength, item.set) %>% # Group by definiteness, pasted structureXlength, and item set
  summarize(mean_z_rating = mean(z_rating)) %>% # Average the z-scores per condition per item
  group_by(definiteness, structureXlength) %>% # Group this summary by condition only so that means of each condition per item set can be averaged
  summarize(mean_z_ratings = mean(mean_z_rating)) -> summary_zscores # Get the mean of the mean z-scores
print(summary_zscores)

# Get table to start making DD scores
z_DDs <- summary_zscores %>% spread(structureXlength, mean_z_ratings)

# Calculate D1, add to table
z_DDs$D1 <- z_DDs$`non.isl x long` - z_DDs$`isl x long`

# Calculate D2, add to table
z_DDs$D2 <- z_DDs$`non.isl x short` - z_DDs$`isl x short`

# Calculate DD, add to table
z_DDs$DD <- z_DDs$D1 - z_DDs$D2
print(z_DDs)
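
The differences-in-differences (DD) score computed above can be written out as follows. This only restates the code; the notation is introduced here for illustration, with $\bar{z}$ standing for the mean z-scored rating of a structure-by-length condition within one level of definiteness.

```latex
\begin{align*}
D_1 &= \bar{z}_{\text{non-island, long}} - \bar{z}_{\text{island, long}} \\
D_2 &= \bar{z}_{\text{non-island, short}} - \bar{z}_{\text{island, short}} \\
DD  &= D_1 - D_2
\end{align*}
```

A DD score above zero means that the cost of island structure is larger in long dependencies than in short ones. Comparing the DD values in the definite and indefinite rows of z_DDs is then one way to gauge island strength in each environment.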

3.2 Ordinal regression analyses

# Reassign contrasts
contrasts(experiment_data$definiteness) <- c(-0.5, 0.5)
contrasts(experiment_data$ec.type) <- c(0.5, -0.5)
contrasts(experiment_data$dependency.length) <- c(0.5, -0.5)
# Make sure the ratings are a factor, rather than numbers
experiment_data$rating %<>% as.factor

# Fit a cumulative link (ordinal regression) model w/ rating as dependent variable, and definiteness, structure, and length and their interactions as fixed effects
clm(rating ~ definiteness * ec.type * dependency.length, data = experiment_data) -> clm.def_ectype
summary(clm.def_ectype)
formula: rating ~ definiteness * ec.type * dependency.length
data:    experiment_data

Coefficients:
                                           Estimate Std. Error z value Pr(>|z|)    
definiteness1                             -0.485471   0.115190  -4.215 2.50e-05 ***
ec.type1                                   1.331746   0.121241  10.984  < 2e-16 ***
dependency.length1                         2.153736   0.129317  16.655  < 2e-16 ***
definiteness1:ec.type1                    -0.005638   0.229327  -0.025    0.980    
definiteness1:dependency.length1          -0.126322   0.229682  -0.550    0.582    
ec.type1:dependency.length1               -1.584530   0.238251  -6.651 2.92e-11 ***
definiteness1:ec.type1:dependency.length1  0.008292   0.458784   0.018    0.986    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -2.46784    0.11344 -21.755
2|3 -1.22807    0.08523 -14.409
3|4 -0.27012    0.07424  -3.638
4|5  0.54304    0.07367   7.371
5|6  1.48061    0.08265  17.915
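
To read the coefficient and threshold tables above, it may help to write out the model being fit. The following is a sketch that assumes the default logit link of clm(); the symbols $Y_i$, $\theta_j$, $x_i$, and $\beta$ are introduced here for illustration.

```latex
\[
\operatorname{logit} P(Y_i \le j) = \theta_j - x_i^{\top}\beta, \qquad j = 1, \dots, 5
\]
```

Here the $\theta_j$ are the five threshold coefficients (1|2 through 5|6) and $\beta$ collects the predictor coefficients, so a positive estimate shifts ratings upward. With the ±0.5 contrast coding used above, the two levels of a factor are one unit apart. The definiteness1 estimate of −0.485, for example, corresponds to multiplying the odds of a rating above any given threshold by roughly $\exp(-0.485) \approx 0.62$.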

Does the type of island have any effect on transparency to extraction? Let’s include island type as a fixed effect.

## Make sure ratings are a factor
experiment_data$rating %<>% factor

## Reassign contrasts
contrasts(experiment_data$island.type) <- c(0.5, -0.5)

## Run clm analysis
clm(rating ~ definiteness * ec.type * dependency.length * island.type, data = experiment_data) -> clm.w.island.type
summary(clm.w.island.type)
formula: rating ~ definiteness * ec.type * dependency.length * island.type
data:    experiment_data

Coefficients:
                                                        Estimate Std. Error z value Pr(>|z|)    
definiteness1                                          -0.515398   0.115868  -4.448 8.66e-06 ***
ec.type1                                                1.364246   0.121955  11.186  < 2e-16 ***
dependency.length1                                      2.188875   0.129987  16.839  < 2e-16 ***
island.type1                                            0.588257   0.116146   5.065 4.09e-07 ***
definiteness1:ec.type1                                  0.007104   0.230461   0.031    0.975    
definiteness1:dependency.length1                       -0.119952   0.230835  -0.520    0.603    
ec.type1:dependency.length1                            -1.628827   0.239307  -6.806 1.00e-11 ***
definiteness1:island.type1                              0.271805   0.230526   1.179    0.238    
ec.type1:island.type1                                  -0.290002   0.230782  -1.257    0.209    
dependency.length1:island.type1                        -0.300368   0.231074  -1.300    0.194    
definiteness1:ec.type1:dependency.length1               0.021228   0.461083   0.046    0.963    
definiteness1:ec.type1:island.type1                    -0.266892   0.460756  -0.579    0.562    
definiteness1:dependency.length1:island.type1           0.268622   0.461020   0.583    0.560    
ec.type1:dependency.length1:island.type1                0.278444   0.461669   0.603    0.546    
definiteness1:ec.type1:dependency.length1:island.type1  0.459113   0.921796   0.498    0.618    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2 -2.52152    0.11628 -21.685
2|3 -1.23754    0.08623 -14.352
3|4 -0.25760    0.07497  -3.436
4|5  0.56949    0.07449   7.645
5|6  1.51554    0.08358  18.134

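An effect of island type on the size of the island effect would have shown up as a three-way interaction between structure (ec.type), length (dependency.length), and island type, but this interaction was not significant (p = 0.546). The main effect of island type was significant (p < 0.001), which indicates that the two item types differ in overall rating rather than in island strength.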

Mixed effects model by subjects and by items

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: rating ~ definiteness * ec.type * dependency.length + (1 + definiteness *  
    ec.type * dependency.length | subject) + (1 + definiteness *  
    ec.type * dependency.length | item.set)
data:    experiment_data

Random effects:
 Groups   Name                                                  Variance Std.Dev. Corr
 item.set (Intercept)                                           1.7862   1.3365
          definitenessind                                       0.1050   0.3241    0.073
          ec.typenon.isl                                        2.0225   1.4221   -0.850 -0.504
          dependency.lengthshort                                2.8047   1.6747   -0.613 -0.228  0.785
          definitenessind:ec.typenon.isl                        0.3558   0.5965    0.552 -0.004 -0.625 -0.478
          definitenessind:dependency.lengthshort                0.4556   0.6750   -0.579 -0.480  0.731  0.252 -0.770
          ec.typenon.isl:dependency.lengthshort                 0.4607   0.6788    0.816  0.413 -0.852 -0.760  0.189 -0.325
          definitenessind:ec.typenon.isl:dependency.lengthshort 0.9804   0.9901    0.901  0.122 -0.898 -0.758  0.784 -0.692  0.692
 subject  (Intercept)                                           0.5943   0.7709
          definitenessind                                       0.9241   0.9613    0.053
          ec.typenon.isl                                        3.0084   1.7345   -0.563  0.523
          dependency.lengthshort                                0.7174   0.8470    0.097 -0.357  0.100
          definitenessind:ec.typenon.isl                        2.0440   1.4297    0.292 -0.516 -0.736  0.182
          definitenessind:dependency.lengthshort                1.1113   1.0542   -0.480 -0.240  0.471  0.515 -0.544
          ec.typenon.isl:dependency.lengthshort                 1.9321   1.3900    0.104 -0.073 -0.684 -0.543  0.323 -0.362
          definitenessind:ec.typenon.isl:dependency.lengthshort 2.5198   1.5874    0.232  0.067 -0.209 -0.777 -0.277 -0.334  0.281
Number of groups:  subject 32,  item.set 32 

Coefficients:
                                          Estimate Std. Error z value Pr(>|z|)    
definiteness1                              -0.7634     0.1757  -4.345 1.39e-05 ***
ec.type1                                   -1.9688     0.2523  -7.802 6.10e-15 ***
dependency.length1                         -3.2657     0.3306  -9.877  < 2e-16 ***
definiteness1:ec.type1                      0.1490     0.4126   0.361    0.718    
definiteness1:dependency.length1            0.1955     0.3421   0.572    0.568    
ec.type1:dependency.length1                -2.2987     0.4679  -4.913 8.97e-07 ***
definiteness1:ec.type1:dependency.length1  -0.1072     0.6335  -0.169    0.866    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -3.5965     0.2744 -13.108
2|3  -1.8514     0.2482  -7.459
3|4  -0.4478     0.2391  -1.873
4|5   0.7636     0.2386   3.200
5|6   2.1879     0.2480   8.824

Sprouse, Jon, Matthew W. Wagers, and Colin Phillips. 2012. “A test of the relation between working memory capacity and syntactic island effects.” Language 88 (1): 82–123.

