Harvard College, 2014
Lead Data Scientist at BlueLabs in Washington, D.C.
PhD student in Statistics
University of Oxford, funded by the Clarendon Scholarship
Favourite thing to do in research: Listen to talks that are about something entirely different from what I study!
Lover of pasta, statistics, and politics (in that order)
I moved to Oxford last year to start my DPhil in Statistics. Before that, I lived in Washington, D.C. for 4 years, where I worked helping Democratic campaigns use data to build campaign strategy. I’m still obsessed with American politics 🇺🇸. I studied statistics at university, but it wasn’t until I binge-watched the TV show The West Wing in the fall of my last year at school that I decided to try to find a way to use statistics/data science in politics. I love good food and cooking for friends – my specialty is homemade pasta (I even brought my pasta maker to Oxford from the US!) 🍝 I had bone cancer when I was 14 (which is thankfully all gone now), so I can’t run any more, but I have developed a love for cycling and swimming!
Developing methods for prediction with messy, aggregated or biased data
During the 2016 American presidential election, I worked for Hillary Clinton’s campaign, mostly analyzing polling data. I was devastated when we lost the election. I decided that I was going to go back to grad school to try to develop better methods for collecting and analyzing the polling data that campaigns typically collect. I thought the analytics team on the campaign did an amazing job, but that there was a lot of work being done in the machine learning world that hadn’t quite made its way to the world of political analytics.
My research goal is to innovate on cutting-edge machine learning methods and adapt them to solve problems typically seen in campaign analytics. For example, I want to figure out how to build really accurate models to predict how likely an individual is to support a candidate I’m working for when we only observe data at the post code or even county level.
My Typical Day
Coding, listening to talks and reading.
First (as I wait for the coffee to kick in), I check my email and respond to messages from friends and family back in the US who texted me overnight. I usually try to go into the office to do this, but sometimes I work from home (we have a really cute garden!).
Once I’m more awake, I usually start my day by reading an academic paper (I keep a pile on my desk of papers that I print off or bookmark when I find them interesting). I love knowing that part of my job is learning about things simply because I find them interesting!
Then, I’ll try to dive back into the research that I’m working on – sometimes it takes a bit to remember where I left off the night before. My research involves a lot of coding in R and (for my current work) yelling at census data or asking people to give me more data.
After working on that for a bit, I usually have lunch with friends in the department and then go to a seminar where I hear about what other people in the department or in my smaller research group are working on or thinking about.
Then, I go back to my desk and dive back into yelling at census data…I mean, coding in R and running simulation experiments. It helps to have an uninterrupted block of a few hours to really dig into the problem that I’m working on. I definitely occasionally lose track of time.
After that (or when I get hungry), I leave the department to go have dinner with friends, swim laps at the pool or go to another talk. Oxford has A ZILLION talks, and I try to go to one about something completely unrelated to statistics 1-2 times a week. This week it’s Oxford’s Black History Month lecture and next week it’s a talk by the former director of MI5!
What I'd do with the money
Approachable explanations of statistical methods applied to a wide range of applications
When I tell people I study statistics, I generally get one of two reactions:
- “Wow, that’s really cool, I love statistics”
- [the more common] “Oh, I could never do that”
The second response always makes me really sad, because what I love about statistics is that it’s intuitive and highly applicable to almost any other field of study. For example, I heard of natural language processing being used to identify the authors of the anonymous Federalist Papers (analysis like this: https://blog.jonlu.ca/posts/the-federalist-papers-author-identification-through-k-means-clustering), which are important documents in US history. I realized that studying statistics would give me a set of tools to better understand other fields that I found really interesting – like history, art, language and politics.
Statistics can be highly intuitive and impactful, but is often found to be too “mathy” or esoteric. If I win the prize money, I would use it to compile videos of researchers explaining applications of statistics to a wide array of other disciplines (art! sports! food! literature!), and the math behind those applications, in really basic and approachable ways. The goal would be to convince students like you that you can use statistics to understand whatever else may interest you!
How would you describe yourself in 3 words?
adventurous, curious, foodie
What's the best thing you've done as a researcher?
Helped design and run experiments to get more people in the US enrolled in health insurance
What or who inspired you to become a researcher?
Having too many questions and not enough time in my old job to answer all of them!
What was your favourite subject at school?
math! and maybe art history
What did you want to be after you left school?
I don't think I had any idea haha
Were you ever in trouble at school?
no (really)...I liked following rules
If you weren't a researcher, what would you be?
a food blogger
Who is your favourite singer or band?
What's your favourite food?
What is the most fun thing you've done?
rode a camel through the Sahara!
If you had 3 wishes for yourself what would they be? - be honest!
unlimited travel budget (or the ability to teleport), to be able to run again, a time turner
Tell us a joke.
There are two kinds of statisticians: 1) Those who can extrapolate from incomplete data.