JF Ptak Science Books Post 2132
We're talking about JJ's Ulysses, not the older one.
Miles L. Hanley performed an enormous and exacting (!) task of providing a word census for this first-among-the-Modernists novel. Joyce used 250,000-odd words to take Mr. Bloom around Dublin that day, 30,000 or so of them different [1].
Hanley's very precise pre-spreadsheet undertaking was done in a year, somehow, and published in 1937.
What he and his team of 20 or so did was this: working from the Random House edition (the most democratic of the editions at that time and the most accessible to the largest reading audience), he/they typed out each word on a miniature index card (above) and recorded the page and line in the text in which the word occurred. The cards were specially designed and pre-pasted so that they could be linked together in long lines and then stored afterwards in six long wooden trays. At the end of the day there were some 220,000 of these little cards, each of which was checked and double-checked and edited and re-edited and then everything proof-read and then so again. There was a LOT of small work in this procedure, but it is in the small work that the greatness of this undertaking lies.
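For a sense of what those trays of cards amount to, here is a minimal modern sketch of the same idea: one "card" per word occurrence, each recording page and line. This is not Hanley's method in any literal sense. The function name `build_word_index`, the toy text, and the rule for what counts as a word (lowercased runs of letters, with internal apostrophes allowed) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def build_word_index(pages):
    """Map each word to a list of (page, line) occurrences.

    pages: dict of page number -> list of text lines (a hypothetical layout).
    """
    index = defaultdict(list)
    for page_no in sorted(pages):
        for line_no, line in enumerate(pages[page_no], start=1):
            # Assumed word rule: runs of letters, lowercased, apostrophes kept.
            for word in re.findall(r"[a-z]+(?:'[a-z]+)*", line.lower()):
                index[word].append((page_no, line_no))  # one "card" per occurrence
    return index

# Toy text standing in for the novel; not a quotation from Ulysses.
sample_pages = {
    1: ["The cat sat on the mat.", "The dog sat too."],
    2: ["A cat and a dog."],
}

index = build_word_index(sample_pages)
total_words = sum(len(cards) for cards in index.values())
distinct_words = len(index)

print(total_words, distinct_words)
print(index["cat"])
```

The total corresponds to the running word count and the number of keys to the count of different words. Note that this sketch treats "cat" and "cats" as separate entries, which is exactly the sort of decision that makes the count of different words slippery.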
Among the many other things Hanley recorded was the equipment and production expense, which totaled $148 for the project, most of which was spent on 250,000 1.5x2.5" cards (which cost $100). There are two dozen pages of introduction and explanation of process, as well as about the same number in the appendix (which is interesting and useful). Basically nothing is contributed in the way of interpretation, or of what any of the word usage/occurrence/frequency/rarity and so on might mean. That was certainly another project.
I must say that this was an extremely sharp operation, well-planned and directed, and done in sort of no time at all. Simply impressive.
The first edition of the work is relatively rare (it went through a number of editions and iterations and reprintings), and the one I have in front of me now is from the Library of Congress via the Copyright Office. It is always troubling to say that something is "rare" and then have two of them, which is the case here. Not only that, but both are from the Copyright Office, and (excitingly!) they have the original carbon copies of their LC catalog cards tucked in. Pretty cool.
Notes:
1. Word count is complicated, and includes plurals and other variations of the same word. Suffice to say that it is around 30,000. Or 33,000.