Glassdoor asks employees whether they’d recommend a company to a friend and whether they approve of the CEO. For the most part, you would expect those two numbers to match closely. And for the most part they do, but there are some interesting ways in which they don’t. I looked at data from Glassdoor for […]
Analysis with R
Exploring Trends in the Coffee Industry through Text Analysis
I spend a lot of time in coffee shops, and this has led me to want to know more about coffee and the industry around it. To this end, I analyzed text from posts to Reddit’s coffee forum and looked for trends, following the same process I outlined in this post on nutrition trends. I […]
Buy the (avocado) dip: Analyzing nutritional buzzwords on Reddit
Big CPG companies are struggling to move quickly enough to meet changing tastes. Kraft Heinz’s recent write-downs are a particularly salient example of this. One way incumbents have sought to keep up is by acquiring small brands that have a more direct connection with consumers and are perceived as healthier, fresher, or more authentic. Prominent […]
Game Theory, Blockchain Tech and Predictive Modeling at Numerai
Numerai is a hedge fund that crowdsources its model-building through data science competitions. It does this by publishing a dataset with blinded and encrypted variables so that data scientists can build predictive models without having full access to Numerai’s proprietary data. Competitors can stake Numeraire, a token built on Ethereum, to signal confidence in their […]
Learning Spanish with R and Tableau: The Most Common Spanish Words Visualized
I am trying to learn a bit of Spanish very quickly. I’m going to Peru in a few months and I think it would be fun to see how much I can learn for that trip. I have never formally studied Spanish but am familiar with Romance languages and so I feel like I’m learning […]
Growth in Government Requests for Facebook Data
The relationship between government and Facebook is increasingly complicated. I recently worked on a project with Big Finish Digital, a new advertising agency, looking at the rise in government requests for Facebook data around the world. The founders of Big Finish are interested in data privacy and the ethics around data in advertising, and gave […]
Sentiment Analysis of Media Coverage of Facebook
I recently worked on data for a story with Charlie Warzel at BuzzFeed about media coverage of Facebook over time. I am really interested in the decline in public opinion of Big Tech, and wanted to quantify it. Getting Data from the NYT API: I started by using the New York Times API to pull […]
Taylor Swift’s Newfound Infatuation with Alcohol
Update: A few people reached out asking me to re-run this analysis for Ms. Swift’s latest album, Lover. I did and was somewhat surprised to see that the alcohol references weren’t a passing phase. I followed the same methodology as I did for the analysis below, but this time put the data into Tableau Public for […]
Topic Analysis of Tim Ferriss’ Podcast Using Rvest
March 28, 2018. Text Analysis of Tim Ferriss Podcast Episodes Using R: I recently came across a great repository of transcripts from Tim Ferriss’ podcast, courtesy of transcripts.io. I don’t actually listen to Ferriss’ podcast (too heavily monetized for my taste), but I know that many do and he gets great guests. I thought I’d […]
Analyzing Facebook Messages in R
Facebook Profile Data Analysis: It’s fun to see data about yourself. So upon learning that you can download all of your Facebook data, I think it’s natural to want to analyze it. Facebook makes it quite easy to download all of your data, but analyzing it is harder. You get a .zip file with a […]