That's a relief:
It turns out I'm a male. So are all of my co-conspirators, at least according to this test.
On the other hand, Maggie Gallagher is also male. (Paste this in the gender testing box.) And David Blankenhorn is a closet female (Paste this in the box; start with "At the Volokh Conspiracy..." and end with "23 good reasons to favor the change").
(Thanks to Andrew Sullivan, who's very male.)
Any advice on how to break this news to my Bride?
DC: You could move to Massachusetts. If you’re willing to settle for a civil union, there are a handful of other states now that you could settle in.
And Dickens? Based on the first few pages of Great Expectations, he was a woman. This could open up whole new avenues of literary research and historical speculation.
I'm going back to work now, but this has the potential for ENDLESS fun.
It would be interesting to compare different passages of Left Hand of Darkness to see if the changes in voice make any difference.
Otherwise, I can't wait for the Opine Editorials crowd to show up, complaining about the Natural Purpose of Language and neutered paragraphs.
Do the developers of this algorithm plan to apply it to the vexed "Who wrote Shakespeare" question? How about Harold Bloom's thesis that the "J" (co)author of Genesis was a woman?
My recent opus, Send Knitting Needles Haiku, was written by a male. (Though it was close.)
And my directions on how to knit Half Twisted Rib are very, very male.
Stockinette ad nauseam: Part I (which is actually quite a popular article with knitters) was also written when I assumed male form.
A few of my blog entries are written in my female incarnation. This happens when I mention friends or my outings with friends. (I guess women just don't write how-to articles?)
Well, you already knew it... (just a hint, the result doesn't start with an 'F').
Women's letters were often classified by the test as male.
Men's letters were almost always classified by the test as male.
Both my own letters and my mother's letters to the editor came out male.
It seems that the test may have a bias in favor of categorizing authors as male.
Or, it may be that people who do stereotypically "male" things like writing letters to the editor tend to have "male" writing styles: more than three-quarters of all letters sent to newspapers are from male would-be authors, and most would-be editorialists are male.
(Many of my wife's informal messages, by contrast, do come out as female.)
Should, for example, is not something you'd want to use to get attention. You should brush your teeth twice a day, but you must look at data.
Likewise, if you want, we could go to your mother's and see her view on things and come back here with her and think... but that's not particularly strong data.
And "to be", of course, is passive tense and should be avoided unless delivering bad news.
There's a slight male bias (focusing on singular events or items is considered a more powerful writing tool than focusing on collective ones, and you kinda need "a" and "the" to discuss single things), but it's more that 'female' wordings aren't particularly powerful.
The system works not by looking at the topic of the writing, but rather at its writing style, as measured by the occurrences of various grammatical words, such as pronouns, auxiliaries, prepositions, and the like. Different individuals, and the two sexes, empirically tend to use these various words with different frequencies. One important caveat, however, is that different genres of text (such as fiction, academic nonfiction, biographies, etc.) also have characteristic preferences for different such words (e.g., fiction tends to contain many more pronouns than non-fiction).
So, interestingly, fiction shares many of the characteristics of female writing, and nonfiction essays and articles share many of the characteristics of male writing. Thus, if the system thinks it is looking at fiction, if you give it a female-authored nonfiction piece, it is more likely to appear male, and vice versa. Thus the need to know what kind of text is to be analyzed. This is, of course, only the tip of the iceberg of the complexities involved, and we still have only a murky handle on them.
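To make the idea above concrete, here is a minimal sketch of scoring a text by its function words rather than its topic. The word lists borrow a few keywords mentioned in this thread ("a", "the", "are", "with", "not"), but the weights are made up for illustration and the names `MALE_WEIGHTS`, `FEMALE_WEIGHTS`, `score` and `guess` are hypothetical. This is not the actual model, which would also need a separate weight table for each genre.

```python
import re
from collections import Counter

# Hypothetical weights, for illustration only. The real system's
# word lists and weights are not given here and differ by genre.
MALE_WEIGHTS = {"a": 1, "the": 2, "are": 3}
FEMALE_WEIGHTS = {"with": 3, "not": 2}


def score(text):
    """Return (male_score, female_score) from weighted function-word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    male = sum(weight * counts[word] for word, weight in MALE_WEIGHTS.items())
    female = sum(weight * counts[word] for word, weight in FEMALE_WEIGHTS.items())
    return male, female


def guess(text):
    """Pick the label with the higher score; ties are left undecided."""
    male, female = score(text)
    if male == female:
        return "undecided"
    return "male" if male > female else "female"
```

Note that nothing here looks at what the text is about: a knitting how-to full of "the" and "a" would lean "male" under these toy weights, which fits the results reported in the comments.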
My suspicion about the letters-to-the-editor is that they are edited for a more formal "newspaper" style, which will make them skewed towards a more "masculine style".
NB: We are always on the lookout for interesting applications of these techniques, so if you think you may have some, email me...
It has been submitted to a CS journal. Draw your own conclusions.
I'm not sayin', I'm just sayin'.
I am not saying that women can't write forcefully; I am saying that most have adopted a less effective style.
According to the keyword breakdown, the two biggest keywords that make me male are "are" and "the". "With" and "not" made a valiant effort in favor of my actual gender, but to no avail. What, if I may use another masculine keyword, is up with that?
"Less effective"? For what purpose? One of the things that you quickly realize when studying language, and particularly language variation, is that each language variety typically is the way it is for a good reason. So, for example, the more narrative style typical of women is better for building rapport and relationships, while the more thing-oriented, "informational" style more typical of men is better for quickly conveying precise information. Both are important goals in different circumstances. Being "more effective" in language means, in my opinion, having more tools in your language toolbox (and some research suggests that women are better at this than men... :-).
This would explain why genie thinks my fairly popular "how to" knitting blog articles are written by a man while my articles describing meeting with my friends for a yarn dyeing party in my garage are diagnosed as written by a female. :-)
This post is rated male, 55:0
Since texts from different times will themselves differ in their word-usage frequencies, a model of male/female writing differences based on 20th-century texts (as the Gender Genie is) will not be directly applicable to texts from a different century. That is likely a cause of the phenomenon you saw. (The other point to keep in mind is that the method is only about 60-70% accurate, so by random chance many texts will be wrongly attributed...)
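As a rough back-of-the-envelope check on that last point, the sketch below works out how many wrong answers to expect at a given accuracy. It assumes each text is classified independently with the same accuracy, which is a simplification, and the function names are hypothetical.

```python
def expected_errors(n_texts, accuracy):
    """Expected number of misattributed texts out of n_texts."""
    return n_texts * (1.0 - accuracy)


def prob_at_least_one_error(n_texts, accuracy):
    """Chance that at least one of n_texts is misattributed,
    assuming independent classifications (a simplifying assumption)."""
    return 1.0 - accuracy ** n_texts


if __name__ == "__main__":
    # 0.65 is the midpoint of the quoted 60-70% range.
    print(expected_errors(10, 0.65))
    print(prob_at_least_one_error(5, 0.65))
```

At 65% accuracy, about 3.5 of every 10 texts would be expected to come out wrong, so a handful of surprising results in a thread like this one is not unusual.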