11-830: Computational Ethics for NLP

Spring 2022


HW 3: Privacy and Obfuscation


Goals

Online data has become an essential source of training data for natural language processing and machine learning tools; however, the use of this type of data has raised concerns about privacy. Furthermore, the detection of demographic characteristics is a common component of microtargeting. In this assignment, you will explore how to obfuscate demographic traits, specifically gender. The primary goals are to (1) develop a method for obfuscating an author’s gender and (2) explore the trade-off between obfuscating an author’s identity and preserving useful information in the data.


Overview

The data for this assignment is available here. Your primary dataset consists of posts from Reddit. Each post is annotated with the gender of the post’s author (op_gender) and the subreddit where the post was made (subreddit). The main text of the post is in the column post_text. The contents of the provided data include:

The provided classifier achieves an accuracy of 64.95% at identifying the gender of the poster and an accuracy of 85.85% at identifying a post’s subreddit when tested on dataset.csv. Your goal in this assignment is to obfuscate the data in dataset.csv so that the provided classifier is unable to determine the gender of authors, while still being able to determine the subreddit of the post. Note that in this set-up, we treat the provided classifier as a black-box adversary (please do not try to hack it). This assignment was largely inspired by the paper Obfuscating Gender in Social Media Writing (Reddy & Knight, 2016), which may be a useful reference. One scenario where this kind of obfuscation model might be useful is social media users who want to preserve their privacy by hiding their gender from an adversary without losing the meaning of their posts. You could also imagine that this is a dataset of health records or other sensitive information that needs to be anonymized before it is provided to NLP researchers.
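To make the two-sided goal concrete, the following is a minimal sketch of how the trade-off could be tracked once you have the classifier's predictions for an obfuscated copy of the data. It does not call the provided classifier; the function names, the toy labels "M" and "W", and the 50% chance level (which assumes two roughly balanced gender classes) are illustrative assumptions, not part of the assignment materials. The 85.85% default is the subreddit accuracy reported above for the unmodified data.

```python
from typing import Dict, Sequence


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty list")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def tradeoff_summary(
    gender_acc: float,
    subreddit_acc: float,
    chance: float = 0.5,                    # assumption: two balanced gender classes
    baseline_subreddit_acc: float = 0.8585,  # reported accuracy on the original data
) -> Dict[str, float]:
    """Summarize how far the obfuscated data is from the two targets.

    gender_above_chance: how much better than guessing the adversary still is
                         (ideally close to 0).
    subreddit_drop:      how much subreddit accuracy was lost to obfuscation
                         (ideally close to 0).
    """
    return {
        "gender_above_chance": gender_acc - chance,
        "subreddit_drop": baseline_subreddit_acc - subreddit_acc,
    }


if __name__ == "__main__":
    # Toy labels only; the real label values come from dataset.csv.
    gold_gender = ["M", "W", "M", "W"]
    pred_gender = ["M", "M", "M", "W"]
    print(accuracy(gold_gender, pred_gender))
    print(tradeoff_summary(gender_acc=0.52, subreddit_acc=0.84))
```

Reporting both numbers side by side for each obfuscation variant makes it easier to see whether a drop in gender accuracy was bought with a drop in subreddit accuracy.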


Basic Requirements

Completing the basic requirements will earn a passing (B-range) grade.

First, build a baseline obfuscation model:

Second, improve your obfuscation model:

Third, experiment with some basic modifications to your obfuscation models. For example, what if you randomly decide whether or not to replace words instead of replacing every lexicon word? What if you only replace words that have semantically similar enough counterparts?
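The two modifications mentioned above (random replacement and a similarity cut-off) can be sketched as follows. This is an illustration under stated assumptions, not the required implementation: the lexicon is assumed to be a dictionary mapping a lowercase word to its replacement, the similarity function is a hypothetical callable you would supply (for example, cosine similarity over word embeddings), and tokenization is plain whitespace splitting, so words with attached punctuation are left untouched.

```python
import random
from typing import Callable, Dict, Optional


def obfuscate(
    text: str,
    lexicon: Dict[str, str],
    replace_prob: float = 1.0,
    similarity: Optional[Callable[[str, str], float]] = None,
    min_similarity: float = 0.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Replace lexicon words in `text` with their mapped counterparts.

    replace_prob:   probability of replacing each eligible word
                    (1.0 replaces every lexicon word, 0.0 replaces none).
    similarity:     optional scoring function; a word is eligible only if
                    similarity(word, counterpart) >= min_similarity.
    """
    rng = rng or random.Random()
    output = []
    for token in text.split():
        word = token.lower()
        counterpart = lexicon.get(word)
        if counterpart is None:
            output.append(token)  # not a lexicon word
        elif similarity is not None and similarity(word, counterpart) < min_similarity:
            output.append(token)  # counterpart is not similar enough
        elif rng.random() >= replace_prob:
            output.append(token)  # random draw says keep the original
        else:
            output.append(counterpart)
    return " ".join(output)


if __name__ == "__main__":
    toy_lexicon = {"wife": "spouse", "husband": "spouse"}  # hypothetical entries
    print(obfuscate("My wife likes hiking", toy_lexicon))
    print(obfuscate("My wife likes hiking", toy_lexicon,
                    replace_prob=0.5, rng=random.Random(0)))
```

Passing a seeded `random.Random` keeps runs reproducible, which helps when comparing classifier accuracy across different values of `replace_prob` and `min_similarity`.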

Advanced Analysis

Develop your own obfuscation model. We provide background.csv, a large dataset of Reddit posts tagged with gender and subreddit information that you may use to train your obfuscation model. A larger version of the background corpus is available here. Your ultimate goal should be to obfuscate text so that the classifier is unable to determine the gender of an author (no better than random guessing) without compromising the accuracy of the subreddit classification task. However, creative or thorough approaches will receive full credit, even if they do not significantly improve results. Some ideas you may consider:

In your report, include a description of your model and results.
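As one possible way to put the background corpus to use (an illustration only, not a method prescribed by this handout), the sketch below scores each word by the smoothed log-ratio of its relative frequency in posts from two gender labels. Words with large positive or negative scores are candidates for replacement. The function name, the add-one smoothing, the whitespace tokenization, and the toy labels "M" and "W" are all assumptions; the actual label values should be read from background.csv.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def gender_log_ratio(
    posts: Iterable[Tuple[str, str]],
    label_a: str,
    label_b: str,
    smoothing: float = 1.0,
) -> Dict[str, float]:
    """Score words by log(P(word | label_a) / P(word | label_b)).

    posts: (post_text, gender_label) pairs.
    Positive scores mean the word is relatively more frequent under label_a,
    negative scores under label_b. Additive smoothing avoids division by zero.
    """
    counts = {label_a: Counter(), label_b: Counter()}
    for text, label in posts:
        if label in counts:
            counts[label].update(text.lower().split())

    vocab = set(counts[label_a]) | set(counts[label_b])
    total_a = sum(counts[label_a].values()) + smoothing * len(vocab)
    total_b = sum(counts[label_b].values()) + smoothing * len(vocab)

    scores = {}
    for word in vocab:
        p_a = (counts[label_a][word] + smoothing) / total_a
        p_b = (counts[label_b][word] + smoothing) / total_b
        scores[word] = math.log(p_a / p_b)
    return scores


if __name__ == "__main__":
    toy_posts = [("my wife and i", "M"), ("my husband and i", "W")]  # toy data
    scores = gender_log_ratio(toy_posts, "M", "W")
    for word, score in sorted(scores.items(), key=lambda kv: kv[1]):
        print(f"{word}\t{score:+.3f}")
```

On real data you would likely also want a minimum frequency cut-off, since rare words produce noisy ratios.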

Extra Credit!


Write-up

Write a 2-3 page report (ACL format) named FirstName_LastName_hw3.pdf. Please do not write more than 4 pages. The report should include:


Grading (100 points + up to 10 extra credit)