A couple of coincidences have turned my thoughts toward the topic of Grumbines. A few weeks ago I heard from R. Edward Grumbine. It turns out we are indeed related, though perhaps no more closely than through my 5-great grandfather (Leonhart Krumbein). Also, judging by recent visits, it seems some people looking for Dr. Francis Grumbine found themselves here. Francis is a medical doctor, I'm a PhD in geoscience, Ed (R. Edward) is a PhD and professor of ecology. One more, if not recent, coincidence: I was an undergraduate at Northwestern University and a graduate student at the University of Chicago. While at Northwestern, I worked in what was once the office of William Krumbein, another relative, who was a noted geologist. (He's another 3 generations back from me, and I've never talked to Francis though he is in the area here.)
If you look some more, you'll find more Grumbines (also Krumbein, Crumbine, ...) in science and medicine. Rather a surprising number, at least to me.
Ok, a surprise. What should a scientist do with a surprise? Start thinking about how surprising it really is, of course. Keeping it to current Grumbines (throw in a Richard who is also involved in natural science), we've got at least 4 scientific researchers (Francis publishes in the scientific literature, though I've also run into a patient or two of his, so he must double in clinical practice).
Now, Grumbine is a very uncommon name. Among the most uncommon, in fact, in the US. So maybe there's a science gene we Grumbines carry? After all, here we've got 4 of the name doing science and probably almost none of you had ever heard of the name before stopping by here.
How would we test an idea about there being particularly many Grumbines doing science? We really want the same kind of numbers that I looked at in dismissing the bogus petition -- how many people are there, and how many have the characteristic we're interested in? It would not be terribly hard to come up with good numbers on how many Grumbines are publishing in science: Just do a scientific literature search, or even a Google Scholar search, and start counting. There'll be some fuzziness, as apparently different names (R vs. Robert vs. Robert W. vs. R. W., for instance) might all be me, or maybe not. Same for the others.
But how to get a sense of how many Grumbines there are in the US? That's a problem. The listing I saw of name frequency only gave the rank order, not how many there were. You might be tempted to do a general web search on the name. But then you run into a very strong selection effect. The ease of finding people on the web depends heavily on what it is they do. Scientists are typically extremely easy to find, so they are well represented. On the other hand, carpenters are probably relatively hard to find (aside from Bill Grumbine, who seems extremely well-known in the world of bowl turning; some lovely pictures out there of his work). By this sort of thing you could come up with 4 scientists, 1 bowl-turner, and 1 stand-up comedian (Peter; another profession likely to be overrepresented). Now, if the majority of Grumbines were scientists, that would clearly be different from the general population.
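To put rough numbers on that selection effect, here is a minimal Python sketch. Everything numeric in it is made up for illustration: the chance of finding a scientist on the web (0.9), the chance of finding anyone else (0.001), and a population of 4000 with 4 scientists. The function name is hypothetical too. It only computes the expected makeup of the people a web search turns up.

```python
def apparent_fraction(n_total, n_scientists, p_find_scientist, p_find_other):
    """Expected fraction of scientists among the people a web search finds.

    All inputs are illustrative assumptions, not measured values.
    """
    found_scientists = n_scientists * p_find_scientist
    found_others = (n_total - n_scientists) * p_find_other
    return found_scientists / (found_scientists + found_others)


# Hypothetical numbers: 4000 people with the name, 4 of them scientists.
n_total = 4000
n_scientists = 4

true_fraction = n_scientists / n_total
seen_fraction = apparent_fraction(n_total, n_scientists,
                                  p_find_scientist=0.9,
                                  p_find_other=0.001)

print(f"true fraction of scientists:     {true_fraction:.4f}")
print(f"fraction among those found online: {seen_fraction:.4f}")
```

With these invented detection rates, scientists are 1 in 1000 of the population but close to half of the people found, which is about what the 4 scientists, 1 bowl-turner, 1 comedian tally looks like.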
This is what makes selection effects a problem. With a small sample that is biased toward finding the sort of person we're trying to test a hypothesis about, you're in trouble. In the US population as a whole, scientists are something like 1 in 1000. If there were actually about 4000 Grumbines in the US, then the 4 I've named would be about par for the course. If there were 40,000, then the easily found 4 would actually show that it is uncommon for Grumbines to be scientists. But most people* don't leave very much web trace, so the additional 4000, or 40,000, would be much harder to find than the 4. (Well, at least 5 -- there's a David Grumbine in physics.)
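The par-for-the-course arithmetic can be sketched in a few lines of Python. The 1 in 1000 rate and the populations of 4000 and 40,000 come from the paragraph above. Treating the count of scientists as Poisson-distributed is my added assumption, a common rough model for counts of rare things, and the function names are hypothetical.

```python
from math import exp, factorial

SCIENTIST_RATE = 1 / 1000   # rough figure quoted above
OBSERVED = 4                # scientists found so far


def expected_scientists(population, rate=SCIENTIST_RATE):
    """Expected number of scientists if the name carries no special tendency."""
    return population * rate


def prob_at_least(k, mean):
    """P(X >= k) for a Poisson count with the given mean (assumed model)."""
    below = sum(exp(-mean) * mean**i / factorial(i) for i in range(k))
    return 1.0 - below


for population in (4000, 40000):
    mean = expected_scientists(population)
    p = prob_at_least(OBSERVED, mean)
    print(f"{population:6d} people: expect {mean:5.1f} scientists, "
          f"P(at least {OBSERVED}) = {p:.3f}")
```

Under that assumption, finding 4 or more scientists among 4000 people is better than an even bet, so it would be no surprise at all. Among 40,000 it is a near certainty, and finding only 4 would point the other way.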
So, anyone have clever ways of putting some limits on just how many Grumbines there are? Er, that may not have come out right. Finding ways of estimating with some confidence that there are between X and Y Grumbines?
I'm not proposing any genetic link, nor that if there is one, my family has it+. Rather, the idea is to illustrate an early step or two in doing science. Have some notion, from whatever unlikely source, and then start looking at what kind of data you would need to test the notion. If there can't be data to test the idea, it can be good and interesting, but not science. In this case, there clearly can be such data. We then move on to the next step -- how can I get hold of it? If I can't get hold of what I really want for data (accurate counts of how many Grumbines there are, and how many are publishing in science), can I find something close to it that will let me test the idea anyhow?
*Ok, maybe I should say most people of my generation and older.
+ Unlike for teaching, where, if there can be a genetic disposition to teaching, I'll definitely submit my genealogy in candidacy for illustrating it.