You are looking at content from Sapping Attention, which was my primary blog from 2010 to 2015; I am republishing all items from there on this page, but for the foreseeable future you should be able to read them in their original form at sappingattention.blogspot.com. For current posts, see here.

Back to Darwin

Nov 13 2010

Henry asks in the comments whether the decline in evolutionary thought in the 1890s is the "Eclipse of Darwinism, rise or prominence of neo-Lamarckians and saltationism and kooky discussions of hereditary mechanisms"? Let's take a look, with our new and improved data (and better charts, too, compared to earlier in the week; any suggestions on design?). First, three words very closely tied to the theory of natural selection.

Three rises from around 1859, Origin's publication date (obviously the numbers for Spencer are inflated by other Spencers in the world, but the trend seems like it might be driven by Herbert); and three peaks at different points from 1885 to 1900, followed by a fall and perhaps a recovery. The question is: how significant are those falls, and how can we interpret them? First, let's look at the bookcounts: are those falls a result of less intensive discussion of the subjects, or of a waning in interest across all books?

Interesting. This makes it clear that evolution continued to diffuse across the language as late as 1910; any fall in wordcount could reflect the fact that it's simply not contentious anymore. Darwin, likewise, still rises but doesn't fall quite so dramatically. So we could provisionally read these as saying that the extensiveness of discussion of evolution increased, even as its intensiveness declined. It would be great to actually put some sort of numbers on these: occurrences of a word per book it appears in is the obvious metric of intensivity of discussion. But that won't scale well for rare or common words. We can create a better one, but first I'll need to fit a curve to that bookcounts chart from earlier.
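To make the extensive/intensive distinction concrete, here is a minimal Python sketch of the three measures in play: wordcount share, bookcount share, and the naive "occurrences per book the word appears in" metric. The toy `books` list and the `yearly_measures` helper are made up for illustration; this is not the code behind the charts, and shares are scaled to per mil.

```python
from collections import Counter

# Hypothetical toy corpus: (year, tokenized text) pairs standing in for the library.
books = [
    (1880, "evolution and heredity in evolution".split()),
    (1880, "a treatise on sermons".split()),
    (1880, "the evolution of species".split()),
    (1890, "evolution is accepted".split()),
    (1890, "notes on heredity".split()),
]

def yearly_measures(books, word):
    """Return {year: (wordcount per mil, bookcount per mil, hits per containing book)}."""
    total_words = Counter()      # all tokens published in a year
    total_books = Counter()      # all books published in a year
    word_hits = Counter()        # occurrences of `word` in a year
    books_with_word = Counter()  # books containing `word` at least once
    for year, tokens in books:
        hits = tokens.count(word)
        total_words[year] += len(tokens)
        total_books[year] += 1
        word_hits[year] += hits
        if hits:
            books_with_word[year] += 1
    out = {}
    for year in total_books:
        containing = books_with_word[year]
        out[year] = (
            1000 * word_hits[year] / total_words[year],   # intensity and extent mixed
            1000 * containing / total_books[year],        # extensiveness
            word_hits[year] / containing if containing else 0.0,  # naive intensiveness
        )
    return out

measures = yearly_measures(books, "evolution")
```

In this toy data the third number falls between 1880 and 1890 while the word still appears in a large share of books, which is the pattern described above. The naive ratio has a floor of 1 for any word that appears at all, which is one reason it behaves badly for rare words.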

(I'm not even going to try to read those Spencer results, given my uncertainty about how much of that involves the ancestors of Winston Churchill and Princess Diana. We could write some just-so stories about it, but that's already too much of a temptation. There's probably an Erasmus Darwin spike in there too, but under my current system it takes too long, maybe 45 minutes, to get results for a firstname lastname pair.)

Also interesting is how closely aligned the rises of evolution and Darwin have become, at least initially; but then in the 1880s, evolution rises again without a concurrent interest in Darwin. What accompanies evolution's second wind? Well, I could just pull some words out of a hat. I'm going to do bookcounts only (got to make the code change the titles to make that clear):

Henry's neo-Lamarckian explanation may have some currency, although his curve is so shaky it's hard to impute much meaning to its 1880-1890 spike. Fiske rises right in the period we're talking about, but Haeckel is a little early. Saltation, which I didn't plot, peaks at 25‰ of books in 1845, and is generally below 10‰ after 1870. (First and only warning: I'm generally going to use per mil instead of per cent to describe numbers on this blog.)

Heredity, though, is a big winner: it starts rising along with our other two words, but keeps climbing as they peak out. That's a helpful correlation, because it helps us think about what sort of discourse of evolution was rising. Henry's "kooky discussions of hereditary mechanisms" seem like a likely candidate: after Darwin is somewhat digested, other theories of evolution keep coming forth.

Now, instead of pulling names out of a hat to analyze, I could have actually done a full analysis of word and book collocation as I did yesterday. We could even do it for different periods: words associated with evolution in the 1870s as opposed to the 1890s. That probably would have prompted some intuitions about better words to search for in these charts. (Even now, I'm realizing I should have tried to get Lester Frank Ward in there, although his names are a little common.) Currently those collocations take too much time. I might try to implement a database solution that would let us do it much faster sometime later, though.
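For a sense of what a period-by-period book collocation might look like, here is a small Python sketch. It is only an illustration under stated assumptions: the `books` data and the `book_collocates` function are hypothetical, and the score (share of target-containing books that also contain a word, divided by that word's share of all books in the period) is one simple choice, not necessarily the measure used in yesterday's post.

```python
from collections import Counter

# Hypothetical toy data: (year, set of distinct words appearing in the book).
books = [
    (1872, {"evolution", "darwin", "selection"}),
    (1875, {"evolution", "darwin", "species"}),
    (1878, {"sermons", "species"}),
    (1893, {"evolution", "heredity", "germ"}),
    (1895, {"evolution", "heredity", "society"}),
    (1897, {"sermons", "society"}),
]

def book_collocates(books, target, start, end):
    """Rank words by how over-represented they are in books containing `target`.

    Only books with start <= year < end are considered. A score above 1 means
    the word turns up more often alongside `target` than in the period overall.
    """
    period = [words for year, words in books if start <= year < end]
    with_target = [words for words in period if target in words]
    if not with_target:
        return []
    all_counts = Counter(w for words in period for w in words)
    target_counts = Counter(
        w for words in with_target for w in words if w != target
    )
    scores = {
        w: (n / len(with_target)) / (all_counts[w] / len(period))
        for w, n in target_counts.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

seventies = book_collocates(books, "evolution", 1870, 1880)
nineties = book_collocates(books, "evolution", 1890, 1900)
```

Comparing `seventies` with `nineties` gives two ranked word lists for the same target, which is the kind of contrast that could suggest better words to chart. On a real library the cost is in building the per-book word sets, which is where a database would help.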

Lastly, a caveat (I always bury these at the bottom): these bookcounts are even more vulnerable to changes in the genre composition of my library over the years than are the wordcounts. In terms of data cleaning and segmentation, that's the most important thing I can do. Some sort of text-based classification would be fun to implement, although I obviously wouldn't do it as well as some others. Talking to Ben Gross the other day, I realized that I might be able to batch search the Library of Congress or WorldCat and end up with call numbers for at least some of these books, which would be amazing. But if they do allow batch queries, getting the ugly title and author data from Internet Archive to be accepted by their system is probably going to be a pain. So we'll see if I get it done.

Comments:

Hank - Nov 6, 2010

Good stuff again, Ben - to be more specific about "kooky discussions of hereditary mechanisms," it bears noting that, insofar as heredity (or the lack of an explanation for a plausible mechanism thereof) was taken to be the chief failing of the natural selection hypothesis, you might consider plotting heredity and evolution on the same graph and/or (probably or) finding an associated word that might explain around what, exactly, debates hinged after evolution was accepted but Darwin was denied (1880s/1890s). Something similar to your work yesterday on scientific method might suffice, yes?