The Seidenberg and Waters (1989) Mega-study

A long time ago (the 1980s) in a land far, far away (Montreal), I did a lot of studies of properties of words that affect how hard they are to read aloud. In that era, the research strategy was to compare groups of words that differed along some dimension of interest (such as frequency, consistency, or ambiguity) but were comparable (“equated”) in other respects. If you did a really careful job, you might end up with a very small number of very nicely equated words per condition. I became dissatisfied with this approach because words differ in so many ways that there was no way to properly equate them; someone could always find another dimension that hadn’t been “controlled.” Moreover, the method assumed that these properties are independent, but they’re not. And let’s not talk about statistical power (because no one did in those days).

Around the time McClelland and I were working on our reading model, I had a thought. The model was trained on about 2900 monosyllabic words. Why not do an experiment in which the subjects (as they were known) would read all of them aloud? It didn’t take that long to acquire the data, even giving subjects breaks after every few hundred words. People would never have to run another word naming experiment: they could just choose their stimuli from our big list. I proceeded to do the experiment with Gloria Waters, who was then my post-doc (and is now Vice President of Boston University). We ran 30 subjects on 2980 words using Apple II computers with floppy disks.

The experiment worked just fine. We presented the results in a talk, “Word Recognition and Naming: A Mega Study,” at the Psychonomic Society meeting in 1989 (I remember Dave Balota, who later did the English Lexicon Project, being there). Only the abstract was ever published, however (Bulletin of the Psychonomic Society, 1989, 27, 489). Working with this big (for the era) data set on floppy disks was a nightmare. We managed to calculate the item-wise data but we didn’t have the by-subjects data. I later tried and failed to get the data transferred to another format. The conventions of the era required computing statistical tests both by subjects and by items, and that seemed to preclude publication in a journal. There was nothing like PsyArXiv to post them to; the modern Internet didn’t exist yet. I gave the item-wise data to people who were interested, and later posted them on the web. They were used in several published studies. We showed that several important studies replicated using the means for the same items. Eventually the results were superseded by the ELP and other “mega-studies”.

Here, once more, are the original data. Have fun. Caveat emptor.

Simple means [.docx]

Means, SDs, and other word statistics (length, neighbors, etc.) [.xls]