Submit your report (FirstName_LastName_hw3.pdf) and a folder containing your code. Code will not be graded.

Online data has become an essential source of training data for natural language processing and machine learning tools; however, the use of this type of data has raised concerns about privacy. Furthermore, the detection of demographic characteristics is a common component of microtargeting. In this assignment, you will explore how to obfuscate demographic traits, specifically gender. The primary goals are to (1) develop a method for obfuscating an author's gender and (2) explore the trade-off between obfuscating an author's identity and preserving useful information in the data.
The data for this assignment is available here. Your primary dataset consists of posts from Reddit. Each post is annotated with the gender of the post's author (op_gender) and the subreddit where the post was made (subreddit). The main text of the post is in the column post_text. The contents of the provided data include:
- classify.py: a classifier that predicts the author's gender and the subreddit for a post (example run: python classify.py --test_file dataset.csv). Note that this file also uses the two provided pickle files.
- dataset.csv: your primary data.
- background.csv: additional Reddit posts that you may optionally use for training an obfuscation model. A larger version is available here.
- female.txt: a list of words commonly used by women.
- male.txt: a list of words commonly used by men.

The provided classifier achieves an accuracy of 64.95% at identifying the gender of the poster and an accuracy of 85.85% at identifying a post's subreddit when tested on dataset.csv. Your goal in this assignment is to obfuscate the data in dataset.csv so that the provided classifier is unable to determine the gender of authors, while still being able to determine the subreddit of each post. Note that in this set-up we treat the provided classifier as a black-box adversary (please do not try to hack it). This assignment was largely inspired by the paper Obfuscating Gender in Social Media Writing (Reddy & Knight, 2016), which may be a useful reference. Scenarios where this obfuscation model might be useful include social media users who want to preserve their privacy by hiding their gender from the adversary without losing the meaning of their posts; you could also imagine a dataset of health records or other sensitive information that needs to be anonymized before it is provided to NLP researchers.
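Before building anything, it may help to load the provided files and confirm the columns described above. A minimal sketch (the lowercasing of lexicon entries is an assumption about the file format, which is expected here to be one word per line):

```python
import csv

def load_lexicon(path):
    """Read a one-word-per-line lexicon (male.txt / female.txt) into a set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def load_posts(path):
    """Read the annotated posts; each row carries op_gender, subreddit, post_text."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Usage:
#   posts = load_posts("dataset.csv")
#   male_words, female_words = load_lexicon("male.txt"), load_lexicon("female.txt")
#   print(posts[0]["op_gender"], posts[0]["subreddit"], posts[0]["post_text"][:60])
```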
Completing the basic requirements will earn a passing (B-range) grade.
First, build a baseline obfuscation model:
- For each post in dataset.csv, if the post was written by a man (M) and it contains words from male.txt, replace these words with random words from female.txt.
- Treat posts written by a woman (W) in the same way (i.e. by replacing words from female.txt with random words from male.txt).
- Run classify.py on your obfuscated data and analyze the results.

Second, improve your obfuscation model:
- Instead of replacing words from male.txt with randomly chosen words from female.txt, choose a semantically similar word from female.txt (use the same metric for replacing words from female.txt with words from male.txt). You may use any metric you choose for identifying semantically similar words. We recommend using cosine distance between pre-trained word embeddings (available here). You can also use spaCy-based similarity here (example 1, example 2).
- Run classify.py on data obfuscated using your improved model and analyze the results. The classifier should perform close to random at identifying gender (e.g. <53.5%) and should obtain at least 79% accuracy on classifying the subreddit.

Third, experiment with some basic modifications to your obfuscation models. For example, what if you randomly decide whether or not to replace words instead of replacing every lexicon word? What if you only replace words that have semantically similar enough counterparts?
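A lexicon-swap model along these lines might look like the sketch below. The tokenization regex and the plain-Python cosine are illustrative choices, not a prescribed implementation; in practice you would load pre-trained embeddings (e.g. GloVe) into the vectors dict.

```python
import random
import re

def obfuscate_random(text, source_lex, target_lex, rng=random):
    """Baseline: replace every word found in source_lex with a random word from target_lex."""
    targets = sorted(target_lex)
    def swap(match):
        word = match.group(0)
        return rng.choice(targets) if word.lower() in source_lex else word
    return re.sub(r"[A-Za-z']+", swap, text)

def obfuscate_similar(text, source_lex, target_lex, vectors):
    """Improved: replace each source_lex word with the target_lex word whose
    embedding in `vectors` is most cosine-similar to it."""
    candidates = [t for t in sorted(target_lex) if t in vectors]
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = sum(a * a for a in u) ** 0.5
        nv = sum(b * b for b in v) ** 0.5
        return dot / (nu * nv) if nu and nv else 0.0
    def swap(match):
        word = match.group(0)
        key = word.lower()
        # Leave words alone when they are not in the lexicon or have no embedding.
        if key not in source_lex or key not in vectors or not candidates:
            return word
        return max(candidates, key=lambda t: cosine(vectors[key], vectors[t]))
    return re.sub(r"[A-Za-z']+", swap, text)
```

For a post labeled M you would call obfuscate_similar(post, male_words, female_words, vectors); for W, swap the two lexicons.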
Develop your own obfuscation model. We provide background.csv, a large data set of Reddit posts tagged with gender and subreddit information that you may use to train your obfuscation model. A larger version of the background corpus is available here. Your ultimate goal should be to obfuscate text so that the classifier is unable to determine the gender of an author (no better than random guessing) without compromising the accuracy of the subreddit classification task. However, creative or thorough approaches will receive full credit, even if they do not significantly improve results. Some ideas you may consider:
In your report, include a description of your model and results.
Write a 2-3 page report (ACL format) named FirstName_LastName_hw3.pdf. Please do not write more than 4 pages. The report should include:
- Results obtained by obfuscating dataset.csv and running classify.py over your obfuscated test data.
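To generate the file that classify.py evaluates, the obfuscated posts have to be written back in the original CSV layout. A minimal sketch, assuming the column names listed above; the obfuscate callback is a stand-in for whichever model you built:

```python
import csv

def write_obfuscated(in_path, out_path, obfuscate):
    """Copy in_path to out_path, replacing each post_text with its obfuscated
    version; op_gender, subreddit, and any other columns pass through unchanged."""
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fields = reader.fieldnames
        rows = list(reader)
    for row in rows:
        row["post_text"] = obfuscate(row["post_text"], row["op_gender"])
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)

# Usage (identity model shown as a stand-in for your own):
#   write_obfuscated("dataset.csv", "obfuscated.csv", lambda text, gender: text)
# Then evaluate: python classify.py --test_file obfuscated.csv
```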