Anti-Science Humping in the Dog Park
#SokalSquaredDebunked Vol. 1 | Guest post by Néstor de Buen
Introducing a new series: #SokalSquaredDebunked. The first installment is a guest post written by my colleague Néstor de Buen. You can follow him on Twitter here.
One of the most famous—or infamous—papers written, submitted, and, in this case, published during the Grievance Studies Hoax (also known as the Sokal Squared Affair) was the Portland dog park paper. The article was officially titled “Human reactions to rape culture and queer performativity at urban dog parks in Portland, Oregon” and published in the journal Gender, Place & Culture in 2018. It has since been retracted. The Canadian right-wing outlet The National Post described it as “the jewel in the crown of Sokal 2.” It is not difficult to see why it might be considered as such. On the face of it, the premise sounds absolutely ridiculous. The fictional author, Helen Wilson, purports to show that dogs humping other dogs demonstrates the existence of rape culture and exhibits the toxic themes intrinsic to gender binaries.
Now, the stated goal of the Grievance Studies Hoax was to show that identity-centered disciplines in the humanities are broken beyond repair. One thing worth emphasizing, as many have before, is that the way the hoax was carried out was, ironically, completely unscientific. As noted by Professor Matthew Blackwell of Harvard’s Institute for Quantitative Social Science, the hoaxers did not use a control group from other disciplines to establish a baseline of effectiveness. This was also brought up when James Lindsay appeared on the Very Bad Wizards podcast with David Pizarro and Tamler Sommers. While that appearance discussed Lindsay’s original “Conceptual Penis” hoax, it seems the same issues have persisted since then. Lindsay did not have a convincing answer.
In what may have been the most low-effort authors-meet-critics session in history, pseudojournalist Andy Ngo served the Hoaxers some soft-serve objections for Quillette magazine—including the observation that there was no control group. Here is Lindsay’s rebuttal, verbatim:
What on earth would a control group be for if our test is to test “grievance studies” and not all of academic scholarship? The fact that some of our papers were rejected and we could see why they were rejected gives you the kind of data that serves in place of a control.
We’re not doing a comparative study. We’re trying to do an investigation. Somebody asked me why I didn’t do this to math. Why the hell would I do it to math? Yes, people make mistakes and stuff gets by, but I don’t see systematic error going on there.
The way in which the Hoaxers wanted to diagnose “grievance studies” was very specific. The point was not to show that, for instance, the current research output in these fields was simply low-quality with respect to its full potential. No, what they set out to prove was that the conceptual frameworks on which they stand are fundamentally flawed. In their own words, as they wrote in The New Statesman describing some of their papers, including this one:
Our papers claim that dog parks are rape-condoning spaces and that by observing the reactions of dog-owners to “unwanted humping” among dogs, we can determine that a human rape culture is deeply ingrained in men who could benefit from being trained like dogs. They investigate why heterosexual men enjoy the company of attractive and scantily-clad female servers in a restaurant and conclude it’s so they can live out fantasies of patriarchal domination.
[…]
Do you see a pattern? Although the papers we wrote scanned many subdisciplines of identity-based studies, by far the greatest uptake was of the ones which argued (on purely theoretical, subjective, and unfalsifiable grounds) that heterosexual masculinity is toxic, abusive, and thoroughly problematic.
I have never read the one about (presumably) Hooters’ female servers, so we will leave it aside for now. However, using the dog park example, they decidedly do not show what they, in their own words, claim to show. The theme of the paper is admittedly ridiculous, so it is easy to let the silliness of writing about humping dogs do a lot of the heavy lifting. But again, the idea is not simply to argue that there are problems (in general) in identity-centered humanities disciplines, but that the whole research programme (I use this term intentionally, more on this later) is broken. Moreover, as laid out in their New Statesman piece, the argument is that these programmes are broken for very specific reasons, namely, they are purely theoretical, subjective, and unfalsifiable. But is this true?
Now, before I continue, I want to address the issue of a journal having published a paper based on false data. This is, of course, a bad thing, but it does not—in any way—prove what the authors set out to prove. And while this aspect of the paper does show that the peer review process in this particular journal could have been more thorough, several points are worth making. First, this issue is present in every field. (This is discussed further near the end of this post.) Second, while there could be ways to filter false data, the kind of observational research described in the paper is inherently difficult to review. It is practically impossible to do that kind of review with, for example, anthropological field research, since often the only available evidence is the researcher’s own notes. The dog park paper is very similar. A dataset could have been constructed, but then the reviewers would simply have to trust its accuracy. So the idea is not to suggest that nothing should be done to prevent this kind of fraud, but to show that there is no straightforward solution to the problem.
Their claim, in a nutshell, is that gender studies is not scientific. This is why they emphasize words like ‘subjective’ and ‘unfalsifiable’. Karl Popper’s idea that falsifiability is a litmus test for whether a theory is scientific is one of the most ubiquitous claims in online discourse. Of course, Popper’s own position was never this simple, but that is beside the point.
Before discussing the actual paper further, allow me a little detour through the philosophy of science, since that is essentially what is at stake in the Grievance Hoaxers’ claim. Let us turn to Imre Lakatos who expanded greatly on Popper’s ideas, and, crucially, introduced the idea of a scientific research programme in his philosophy of science.
This is important, because it acknowledges the fact that science is more than doing experiments and proving some theories wrong. The following framework is set out in Lakatos’ 1968 article, “Criticism and the Methodology of Scientific Research Programmes.” Each research programme, in addition to the observations that support it, is built on a number of theoretical statements, which in turn provide empirical content to be verified via experiment or observation. What defines a research programme is its capacity for growth. It starts with a few core theoretical claims and explanations for a limited number of phenomena. But if it is successful, it will be able to expand and explain additional phenomena. This, I think, is a perfectly reasonable view of science.
Now, back to the dog park paper. It is very strange that anyone could describe its argument as either purely theoretical, subjective, or unfalsifiable. For starters, the whole paper is centered around a supposed year of empirical observation. What about it being purely theoretical and unfalsifiable? In describing the implications of the observations for the (let’s call it) research programme of gender studies, fake author Helen Wilson writes:
Wherein it concerns rape/humping behavior, the social structural reach of oppressive patriarchal norms reach a zenith in dog parks, rendering them not only gendered spaces but spaces that exhibit and magnify toxic and violent themes intrinsic to gender binaries. There is little male tolerance for queering acts while rape/humping of female dogs is often permitted, condoned, not stopped, or in some cases laughed at. In all cases the (species-centric) mechanisms to keep oppressive, masculinist systems in place are enforced by shouting or hitting.
Yes, it is true that there is a lot of unnecessary, obfuscating jargon. But anyone should see that there is very clear empirical content in this paragraph. This, of course, means it is a claim that can be tested and is therefore falsifiable by definition.
But not only that, if we go by Lakatos’ more sophisticated concept of science, we could even say that this potentially represents an example of the growth that research programmes should exhibit. How so? Gender studies was presumably devised to explain how gender operates in human interactions. The hypothetical honest version of the dog park paper tries to show that gender could also explain some human-animal interactions, in particular, human-dog interactions. Now, if we assume, for the sake of argument, that cultural gender norms influence human-only interactions, I think asking whether they influence human-animal interactions is at least a valid and interesting question. Especially given that pets are often the kinds of animals we tend to anthropomorphize. In addition, if it turns out that gender plays a significant role in interactions with our dogs, would this not also say something about the explanatory power behind the core conceptual framework of gender studies?
We can reduce the core empirical content of the jargon-riddled extract quoted above to the following: currently existing gender norms are much more lenient toward unprompted male-on-female sexual advances than male-on-male ones. The reason is that males are supposed to be dominant while females are supposed to be submissive. Thus unwanted advances by males targeting females are seen as conforming to these gender norms, while those made by males on other males conflict with them. Moreover, it is primarily men who have an interest in keeping this system in place.
This can very obviously be tested when it comes to dogs. If we wanted to do it, how might we go about it? Presumably, we could design an experiment that would look exactly like the one described in the dog park paper. We’d look for interactions between dogs and observe their owners’ reactions, and see if there are any patterns related to the gender of the owner, the dog, or both.
The thing is, the paper does all of that, and it does so remarkably well. Yes, the data is false, but what we are interested in at the moment is not the limitations of the peer-review process but what the paper’s structure, claims, and empirical content indicate about the conceptual apparatus of its intended field.
What is even more striking is that if the research had actually been conducted and the results showed what the paper says they show, there would be absolutely no reason why it should not have been published. Moreover, what the paper proves is the opposite of what it intended: it shows that one can make scientifically testable claims based on the conceptual framework of gender studies, and that the field has all the markings of a perfectly functional research programme.
The authors of the hoax have tried to address the fact that the dog park paper uses falsified data, as part of a reply to a critique of the hoax that touched on this particular issue. Note that this somewhat contradicts their original defense in The New Statesman, which, as quoted above, completely ignores the fact that they used false data. In the new reply they write:
In our award-winning “dog park” paper that Cole references (Wilson 2018), the data weren’t just fabricated; they were preposterous. Among other absurdities, our fictional researcher claimed to have examined 10,000 sets of dog genitals over 1,000 hours spent in just one year and in just three parks. She then drew ludicrous conclusions from the data—that dog parks and nightclubs are “rape-condoning spaces,” that dog parks are “petri dishes for canine rape culture,” and therefore that men should be trained like dogs.
Maybe the last bit about training men like dogs should have been a red flag for the reviewers. However, in their writings in the press, the Hoaxers consistently exaggerate the degree to which the paper advocates this. That section of the paper reads as follows:
It is also not politically feasible to leash men, yank their leashes when they ‘misbehave,’ or strike men with leashes (or other objects) in an attempt to help them desist from sexual aggression and other predatory behaviors (as previously, this human behavior as directed at dogs, though a sadly common anthropocentric mistreatment of animals, is not ethically warranted on dogs). The reining in or ‘leashing’ of men in society, however, can again be understood pragmatically on a metaphorical level with clear parallels to dog training ‘pedagogical’ methodologies. By properly educating human men (and re-educating them, when necessary) to respect women (both human and canine), denounce rape culture, refuse to rape or stand by while sexual assault occurs, de-masculinize spaces, and espouse feminist ideals – say through mandatory diversity and harassment training, bystander training, rape culture awareness training, and so on, in any institutions that can adopt them (e.g. workplaces, university campuses, and government agencies) – human men could be ‘leashed’ by a culture that refuses to victimize women, perpetuate rape culture, or permit rape-condoning spaces (cf. Adams [1990] 2010, 68, 81–84).
I leave it to readers to judge by themselves whether this paragraph really advocates “training men like dogs” or whether the explicit statement that it is a metaphor (and a visibly tongue-in-cheek one at that) waters it down sufficiently.
But more importantly, the fact remains that they time and again ignore the possibility that their fake data might actually support some of their less silly conclusions, had it been real. Their defense basically rests on the implausibility of collecting the data, which is not at all relevant to their argument. As a side note, it is also not that implausible: a year has 8,760 hours, or roughly 6,000 waking hours, so the claimed 1,000 hours of observation is demanding but hardly impossible. Stripped of all its silliness, the hypothesis is fairly straightforward: men take a much more lenient view of unwanted sexual contact. And their fake data does not just support the hypothesis, it does so to a remarkable degree, at least within the bounds of the conclusions that can be reached through statistical methods.
Since the data does not exist, I decided to simulate the observations using randomly generated variables to which I applied certain delimiting parameters intended to make them consistent with some of the figures provided in the paper. For example, the paper says the author examined the gender of just under 10,000 dogs. I took this to mean just under 5,000 interactions happened, each involving two dogs and their two owners. The code used to generate the dataset and the final dataset are available on request. Note that the variables are randomly generated. This means that, every time the code runs, the resulting dataset will be slightly different. However, all statistical parameters and relations between variables remain consistent.
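Since the author’s code is only available on request, here is a minimal sketch of what such a simulation might look like in Python. The column names, probabilities, and seed are illustrative assumptions rather than the values used to produce the results discussed below; the only figure carried over from the paper is the number of interactions.

```python
# A minimal sketch of the simulation, not the author's actual code.
# All probabilities below are illustrative assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
N = 4_981  # one row per simulated dog-dog interaction

df = pd.DataFrame({
    "owner_gender": rng.integers(0, 2, size=N),       # 0 = male, 1 = female
    "target_dog_gender": rng.integers(0, 2, size=N),  # 0 = male, 1 = female
    "interaction": rng.choice(["uneventful", "fight", "sexual"],
                              size=N, p=[0.70, 0.10, 0.20]),
})

def draw_reaction(owner_gender: int, target_gender: int) -> str:
    """Draw an owner reaction whose distribution depends on gender.

    By construction, leniency increases when the owner is male or the
    humped dog is female, mirroring the relationships the paper reports.
    """
    lenient = owner_gender == 0 or target_gender == 1
    probs = [0.20, 0.55, 0.25] if lenient else [0.65, 0.30, 0.05]
    return rng.choice(["intervene", "do nothing", "encourage"], p=probs)

df["reaction"] = [draw_reaction(o, t)
                  for o, t in zip(df["owner_gender"], df["target_dog_gender"])]
```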
The full simulated dataset contains 4,981 entries, each corresponding to one interaction. Each entry, in turn, contains several variables: the gender of each of the dogs and of the owners, the kind of interaction, and the owners’ reactions. In line with the paper, the three possible interactions are “uneventful,” “fight,” and “sexual,” while the possible reactions are “intervene,” “do nothing,” and “encourage.”
The dataset is large enough to perform all the regular statistical tests used on data of this type in a scientific study. Now, it should be noted that the paper’s claims are causal (i.e. such and such norms cause such and such behavior) and statistical methods cannot establish causation. But this is a universal problem. It is by no means exclusive to gender studies. It is, in fact, endemic to all science, even the natural sciences. Even the statistical analyses of experiments like those at the Large Hadron Collider do not establish causation by themselves. All science can do is propose an explanation based on the theoretical framework, make a prediction about causal mechanisms, and then look at how well the data fits the proposed explanation. This is where statistics comes in. To put it colloquially, statistical tests tell us how likely it is that results at least as extreme as the ones observed would occur by sheer chance.
The dataset constructed from the paper’s figures consists only of categorical variables. That means each value is simply a label, rather than a numerical value that can be ordered from low to high. With data like that, the appropriate tool is the chi-squared test of independence.
Allow me another detour through some basic statistics. A chi-squared test looks at the distribution of observations classified according to two variables and compares the observed distribution to the one we would expect if the variables were unrelated. Imagine, for example, a hundred books, half in Spanish and half in English, and half hardcover and half paperback. We classify them by language and cover type. If language and cover have no relation, then we would expect four groups of books, each containing approximately 25. So we count how many books actually fall in each group and compare the counts to the expected value of 25. If it turns out that the group of English hardcovers contains 46 books, that might be consistent with a theory that English-language books are more likely to be hardcover. The test then yields a p-value between 0 and 1: the smaller it is, the less plausible it is that the deviation from the expected values is due to random chance. In the social sciences, a value below 0.05 (corresponding to a 95% confidence level) is generally considered acceptable and fit to be published. Obviously, smaller values are better.
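To make this concrete, here is how the books example might be checked in practice, assuming SciPy is available. The counts are invented to match the example, with the remaining books placed so that the margins stay even at 50/50:

```python
# The hypothetical 100-book example: rows are languages, columns are
# cover types. With even margins, independence predicts 25 books per cell.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([
    #  hardcover  paperback
    [46,  4],   # English
    [ 4, 46],   # Spanish
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-squared = {chi2:.1f}, p-value = {p_value:.2g}")
print("expected counts under independence:\n", expected)
# A p-value this small says the 46 English hardcovers are very unlikely
# to be a chance deviation from the expected 25.
```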
The following two tables show the results of two chi-squared tests performed on the dataset, considering only sexual interactions (humping, to use the paper’s term). The first looks at the reaction of the owner (guardian) of the dog initiating the contact, based on the owners’ gender. The second one looks at the reaction but based on the gender of the dog “receiving” the action. The gender of the dog that initiated the action is not relevant, since, in line with the paper, the simulated data contains only male dogs initiating sexual contact.
As the tables show, the probability that results like these are due to chance is exceedingly small: in both cases it is lower than 0.001. If the data were real, this would be very strong evidence that the conceptual framework of gender studies accurately describes reality. One can even plot the probabilities of the different outcomes based on these same data to make the pattern easier to see. The following charts show the probability of each reaction as a function of the same parameters, namely the gender of the owner and the gender of the dog being humped.
In the dataset, “male” is coded as 0 and “female” as 1, so the left-hand side of the x-axis of each chart represents the probability when the object of analysis is male, while the right-hand side represents female. The y-axis is simply the probability, from 0 to 1. It should be clear that, according to this, the probability of encouraging or of doing nothing increases when the owner is a man, or when the dog being humped is female, while the opposite holds for intervening to stop the interaction.
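For readers who want to replicate this kind of analysis, the two tests described above take only a few lines. The sketch below assumes the simulated `df` from the earlier snippet, restricts it to sexual interactions as the paper does, and uses `pd.crosstab` plus SciPy’s chi-squared test:

```python
# Sketch: the two chi-squared tests described above, run on the simulated
# dataset `df` from the earlier snippet, keeping only sexual interactions.
from scipy.stats import chi2_contingency

sexual = df[df["interaction"] == "sexual"]

# Test 1: owner reaction vs. the owner's gender.
owner_table = pd.crosstab(sexual["owner_gender"], sexual["reaction"])
chi2, p, dof, _ = chi2_contingency(owner_table)
print(f"owner gender vs. reaction:      chi2 = {chi2:.1f}, p = {p:.3g}")

# Test 2: owner reaction vs. the gender of the dog being humped.
dog_table = pd.crosstab(sexual["target_dog_gender"], sexual["reaction"])
chi2, p, dof, _ = chi2_contingency(dog_table)
print(f"target dog gender vs. reaction: chi2 = {chi2:.1f}, p = {p:.3g}")
```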
The relationship between gender and attitudes toward unwanted sexual interactions is clear as day. Again, do not be fooled by the silliness of it being about dogs humping other dogs. To illustrate, suppose that rather than observing dogs and their owners, the data came from interviewing parents about how they would react to certain behaviors by their teenage sons and daughters. Suppose also that instead of “humping,” the questions were about groping or making lewd comments. Now imagine that, in such a scenario, fathers were as lenient with their sons as the simulated data presented here suggests. The statistical and probabilistic relations are absolutely striking. Would anyone be willing to say that gender is not an issue in this case? The fact that we are dealing with dogs might be a mitigating factor, but I do not see why it should rule out the conclusion that gender is playing a role in these attitudes, especially given the strength of the statistical associations.
Yes, the data is fake, but that really does not matter for the purpose of what the hoaxers intended to achieve. All this shows is that the framework of gender studies can generate specific predictions about the world, and that one can design experiments to test those predictions using the most mainstream scientific tools available.
So one certainly cannot conclude, from a paper like this, that the field of gender studies is “purely theoretical, subjective, and unfalsifiable.” One could even go ahead and do the observation, and it might turn out that the data shows no relation between gender and reactions to how dogs interact. That would be fine. It would mean the hypothesis has been falsified, but of course that would prove that it is falsifiable.
All of this also shows that this paper is nothing like the original Sokal hoax. Alan Sokal was trying to show that he could publish nonsense if he used the right jargon. Whatever one thinks is the right conclusion from the original Sokal hoax, he at least achieved this. The Grievance Hoaxers present themselves as doing the same thing, but they totally fail.
Yes, the dog park paper is based on false data and, like Sokal’s, contains a lot of unnecessary jargon, but it is not nonsense, and the distinction is far from trivial. Nonsense implies one cannot even obtain a truth value from a proposition. In fact, the paper being false, if anything, proves that it is not nonsense, yet the Grievance Hoaxers try to pass off falsity as nonsense. Nonsense is something like Chomsky’s famous sentence “colorless green ideas sleep furiously”: it is nonsense because it is impossible to say how one might even evaluate whether it is true. A false sentence would be “the moon is cubical.” It has a definite meaning; it just happens not to be true.
So, if the original Sokal Hoax is like Chomsky’s sentence, the dog park paper is much more like “the moon is cubical.” And in fact, a more accurate analogy would be “the moon is cubical and here is a picture that proves it,” and an attached doctored picture of the cubical moon.
It would also be like falsifying the data of a physics experiment purporting to prove, say, that gravitons exist, and then, after the paper gets published, claiming that since it was published, all of physics must be nonsense. Furthermore, this sort of problem is far more common than one might think.
For a long time, it was orthodoxy in economics that debt slows growth. This idea was actually bolstered by empirical data from Harvard professors Carmen Reinhart and Kenneth Rogoff, not just economic theory. As it turns out, the supposed evidence rested on a mistake in a Microsoft Excel formula, which miscalculated an average by excluding a few countries. When included, the effect disappeared. Conceptually, this is no different than the dog park paper. Reinhart and Rogoff’s paper could have avoided publication if the reviewers had double-checked the data, but they did not.
No reasonable person would argue that this shows that the entire study of economics is nonsense. All it shows is that a claim was made, data was gathered, and the claim was shown to be false. The only difference between the two is that one was intentional. But every criticism that can be applied to one applies to the other. Even if one were to argue that the gender studies reviewers were ideologically motivated, the same ought to be said for economics.
And even if we look at fields where ideology is not a factor, and which are presumably much more scientific, like medicine, publications with false data are not some kind of extreme anomaly. A report published in 2018 by Science Magazine found that retractions due to false data had increased tenfold, and that fraud (that is, not honest mistakes) accounted for about 60% of them. A single researcher in anesthesiology accounted for almost 90 retracted papers by himself. Does this mean that medicine is nonsense? Probably not.
So, if the Grievance Hoaxers want to be consistent, they have to claim that all of economics is bullshit. Of course, they are not claiming that. But if they are serious about taking their contentions to their logical conclusions, then they ought to condemn it, along with many other fields of study.