Quotes Analysis

Analysing topics by gender

The female voice in media is underrepresented not only quantitatively but also qualitatively: female expert opinions are reportedly underrepresented in certain male-dominated fields such as politics, government and economy [8]. Let’s see if we can confirm these claims using the data at hand. To do so, we preprocessed the quotes extracted from Quotebank in a standard NLP pipeline and ran topic analysis separately on quotes from male and from female speakers, to elucidate diverging trends in the topics mentioned by each gender.
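As a rough illustration of the preprocessing and per-gender split described above, here is a minimal sketch using only the Python standard library. The stop-word list, the helper names (`preprocess`, `split_by_gender`) and the example records are all hypothetical; the pipeline actually used for this analysis is not spelled out here and may well differ (for example, it may include lemmatization).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "at", "is", "in", "of", "on", "the", "to", "we", "our"}


def preprocess(quote):
    """Lowercase, tokenize on alphabetic runs, drop stop words and 1-letter tokens."""
    tokens = re.findall(r"[a-z]+", quote.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def split_by_gender(records):
    """Group preprocessed quotes by the speaker's gender label."""
    corpora = {}
    for gender, quote in records:
        corpora.setdefault(gender, []).append(preprocess(quote))
    return corpora


# Made-up example records in the form (speaker gender, quote text).
records = [
    ("female", "The community is at the heart of our policy."),
    ("male", "We are investing in the economy and the markets."),
]

corpora = split_by_gender(records)

# Per-gender term counts; a topic model would be fitted on each corpus separately.
term_counts = {
    gender: Counter(token for doc in docs for token in doc)
    for gender, docs in corpora.items()
}
```

Each per-gender corpus in `corpora` could then be handed to a topic model of choice, fitted once on the female and once on the male quotes.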

Topics of Quotes by Women and Men

Some expected, classically gender-associated topics appeared in the topic analysis. In the quotes of women, lifestyle and fashion appears as a common topic (third topic) that is absent in men’s quotes, whereas sports (seventh topic) and business (sixth topic) are prominent with men but do not crystallize in female quotes.

Social issues (ethnicity, sexuality, family) are likewise a prominent topic with female speakers but do not clearly show in male quotes.

Even for very similar topics that appear in both datasets, such as business and government, the topics seem to be approached with a different spin. Female quotes have a community and social aspect, whereas male quotes seem to focus on business or politics in the pure sense.


Proportion of quotes from each topic, for male and female quotes:

The number of quotes in the common topics still shows that women are quoted far less in classically male-dominated fields like politics, economy, business, and sports. There are also very cliché-heavy topics, such as beauty and lifestyle, which appear among the leading topics for female quotes but are very rare for male speakers, and vice versa for male-heavy topics like sports.
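To make the notion of per-topic proportions concrete, the following minimal sketch computes the share of quotes per topic for each gender. The topic labels and counts are made up for illustration, and `topic_shares` is a hypothetical helper, not the code behind the plot.

```python
from collections import Counter


def topic_shares(topic_labels):
    """Return the fraction of quotes assigned to each topic."""
    counts = Counter(topic_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {topic: n / total for topic, n in counts.items()}


# Made-up topic assignments, one label per quote.
female_topics = ["politics", "lifestyle", "lifestyle", "social issues"]
male_topics = ["politics", "sports", "sports", "business", "politics"]

female_shares = topic_shares(female_topics)
male_shares = topic_shares(male_topics)

# Side-by-side view; topics missing for one gender are shown as 0.
for topic in sorted(set(female_shares) | set(male_shares)):
    f = female_shares.get(topic, 0.0)
    m = male_shares.get(topic, 0.0)
    print(f"{topic:14s} female={f:.2f} male={m:.2f}")
```

Note that these shares are normalized within each gender, so they describe what each group is quoted about; the gap in absolute quote numbers between women and men has to be read from the raw counts.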

Distribution of topics by gender and bias of journal:

We further investigate how the number of quotes is distributed over the common topics found in the whole subset. One interesting question is whether the topic distribution is strikingly different in right- vs. left-leaning media outlets, which would indicate a different perception of women in the political parties.

Another potentially interesting aspect is the gender of the direct producer of the text. One might expect the gender of the text author to influence their choice of person to quote. As discussed in the section about the construction of the dataset, however, we could only get hold of the editors-in-chief of the journals in question, as the closest accessible approximation of the person deciding on the content of the media outlet. The plots show that the distribution of topics is very similar across the different subsets of the overall dataset. It seems that the topics for which both women and men are quoted do not strongly depend on the editor’s gender.

Most frequent words in Quotes by Women and Men in a pretty Slideshow

For those interested in all the topic clusters we have found, you can find the slideshows for both female and male speaker topics below!



Quotes about Women and Men

Now we have an idea of what female and male speakers talk about when quoted in mainstream media. However, we are still missing a further important aspect of female presence in media: In what context are women or men mentioned? Is there an overlap between what women talk about and how they are talked about?

Looking at this aspect of the quotes can give us a feeling for how women are viewed in society, or rather what image of women (and men) is propagated by US mainstream media. To extract this information, we created new datasets from the US mainstream media set, consisting of quotes that mention words indicative of either gender, like ‘man’, ‘boy’, ‘woman’, ‘lady’, ‘him’, ‘she’, etc., and categorized the quotes as either talking-about-men or talking-about-women quotes.

We grouped both the quotes about men and the quotes about women into seven different topics. While there is of course quite some overlap between the topics, such as general & lifestyle, family and politics, there are clearly topics reserved to the respective genders. Women appear second most frequently (after the less interpretable cluster termed “general life”) in the context of family & work, whereas the most common topic cluster for men is service (to the country, military,…) paired with work. Surprisingly, for men the second most important cluster centers almost exclusively around family, in contrast to the joint thematization of work and family for women, a topic we also know well as the typical societal problem of the family-work balance that apparently only women face …
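The keyword-based split into quotes about women and quotes about men can be sketched as follows. The keyword sets are hypothetical and incomplete (only a few example words are given in the text), and the handling of quotes that mention both genders is an assumption made for this sketch, not a description of the actual dataset construction.

```python
import re

# Hypothetical, incomplete keyword lists; only a few of these appear in the text.
FEMALE_WORDS = {"woman", "women", "lady", "girl", "she", "her"}
MALE_WORDS = {"man", "men", "boy", "he", "him", "his"}


def mentioned_gender(quote):
    """Classify a quote by the gendered words it contains.

    Matching is done on whole tokens, so 'man' does not match 'humans'
    and 'men' does not match 'government'.
    """
    tokens = set(re.findall(r"[a-z]+", quote.lower()))
    about_women = bool(tokens & FEMALE_WORDS)
    about_men = bool(tokens & MALE_WORDS)
    if about_women and about_men:
        return "both"  # assumption: such quotes are kept apart rather than double-counted
    if about_women:
        return "women"
    if about_men:
        return "men"
    return "none"


# Made-up example quotes.
examples = [
    "She runs the company.",
    "I told him the plan.",
    "The woman thanked him.",
    "Humans are kind.",
]
labels = [mentioned_gender(q) for q in examples]
```

Whole-token matching is the main design choice here: a plain substring search would wrongly flag words such as "government" or "humans" as mentions of men.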

Furthermore, very stereotypically, sports is omnipresent as a “men’s topic” and appears for women only as a minor topic paired with a culture and leisure connotation, somewhat reflecting the importance given to professional sports for male teams as opposed to the limited interest in female sport.

Quite expected is the important topic of reproductive rights for women in the USA, which, in all its specificity, is the 4th most dominant topic! That says quite a bit about what moves the USA of today…

What is generally observable is that, although the overarching topics are often comparable between genders, the connotations and pairings take interesting twists: the devil lies in the detail. An additional example is politics: for men, this is heavily paired with crime and violence, whereas for women it has a heavy focus on elections and party politics. In the time period portrayed in the dataset, US elections were heavily intertwined with allegations of sexual misconduct, primarily against women, as well as with the first female candidate of a major party…

We could keep going forever, since there are so many interesting details to discover, but browse through the slideshows below yourself!



Quotes about Women and Men in the context of Editor’s Gender and Political Bias

From this analysis, we can see that gender representation in media is still not ideally balanced in many aspects. So far, we have considered the content of the quotes only across the whole set of common US media outlets. But do these observations apply equally to journals of different political bias? Although the general trend is not overwhelmingly different, there are a few oddities. We need to be careful, first of all, because of the imbalance between right-leaning and left-leaning media in the most frequented subset we present here. Looking back at the bubble plot representing all the media used in our analysis, one can see that the right-leaning portion of media contains no lifestyle magazines such as Vanity Fair, Refinery29, etc., which potentially cover different types of stories far away from political issues, so this may skew the data. Nevertheless, it is quite remarkable how heavily politics and health & reproductive rights issues are overrepresented in the right-leaning media compared to the overall and left-leaning datasets. The same trend could be recovered when looking at quotes issued by women!

For men, on the other hand, one cannot observe any such big shift in topic distributions. The gender of the editors, however, has so far no determinable influence on the story choice, and hence on the representation of men and women. Likely, other factors such as the type of magazine and the author have a higher impact, but this could not be determined in the course of this study.