vboyce/natural-stories-maze

natural-stories-maze

This contains the materials, data, code, and write-ups associated with the Natural Stories Maze paper.

Contents

Analysis

  • read_results.R takes the raw data in Data/raw_data and produces Data/cleaned.rds
  • nat_stories.Rmd contains the non-modelling analyses of accuracy, comprehension, and participant feedback; it takes in Data/cleaned.rds and produces Analysis/models/comp.rds
  • models.Rmd contains the modelling code
  • models/ has saved summaries of models and other pre-processed data objects for inclusion in the paper (paper should build without needing to run any of the models oneself)
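
As a minimal sketch of how the saved objects listed above might be inspected without rerunning any models: the helper `load_if_present` is hypothetical (it is not part of the repository), and the snippet assumes the working directory is the repository root. The file paths are the ones named in this README.

```r
# Hypothetical helper (not part of the repository):
# read an .rds file if it exists, otherwise return NULL with a message.
load_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Not found: ", path)
    return(NULL)
  }
  readRDS(path)
}

# Paths as listed in this README; assumes the working directory is the repository root.
cleaned <- load_if_present(file.path("Data", "cleaned.rds"))
comp <- load_if_present(file.path("Analysis", "models", "comp.rds"))
```

If Data/cleaned.rds is missing, running Analysis/read_results.R should regenerate it, per the description above.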

Data

  • raw_data is the raw output that Ibex produces
  • cleaned.rds (generated by Analysis/read_results.R)
  • maze_pre_error.Rds is a cleaned up version of only pre-mistake data used for modelling, created by models.Rmd
  • SPR/ contains raw data from Futrell et al.; first.rds is a cleaned-up version containing first stories only, created by models.Rmd

Materials

  • for_ns.js is the code to run the experiment (insert into Ibex maze framework)
  • for_ibex.txt is the natural stories text split into sentences, with distractors
  • ibex_questions.txt is the natural stories questions
  • natural_stories_sentences.tsv is the text split into sentences
  • raw_questions.txt is the raw natural stories questions
  • practice.txt is the text and questions of practice items
  • practice_post_maze.txt is the practice items with distractors in Ibex maze format

Prep_code

  • nat_stories_prep.Rmd takes the raw Natural Stories materials and processes them for labels, Maze, and model surprisals; it also takes in tokenizations and surprisals and combines them into a table. This generates some of the files in Materials/
  • useful.py handles formatting before and after computing surprisals (Note: ngram, txl, and grnn were run on a cluster with a precursor to lm-zoo; GPT was run with lm-zoo. For replicating or altering the pipeline, I recommend using lm-zoo. TXL is not currently on lm-zoo.)
  • natural_stories_surprisals.rds is used in models.Rmd
  • ns_pre_maze.txt is the natural stories sentences ready to get Maze distractors
  • other files are inputs or intermediate outputs to reformatting the natural stories materials for the experiment
  • predictors/ contains all surprisal and frequency predictions and model tokenization patterns

Papers

  • Papers/Paper has the actual manuscript
  • Amlap_2020_talk contains the abstract and slides for the presentation given at AMLaP 2020
  • UCI_2021 contains slides for a lab meeting presentation
  • Images/ and many loose image files are just that
  • lab_meeting_2020 (.tex and .pdf) is from a pre-AMLaP lab meeting presentation
