How We Pulled Data for the Global Email Benchmark Report

Warren Duff
November 1, 2016
Best Practices, Email Marketing

In September, we published our first data report that measured email engagement benchmarks across the varying industries of SendGrid customers. The Global Email Benchmark Report is designed to help email program managers determine how their programs compare to other senders in their industry.

To create the report, I worked extensively with in-house data scientists, Victor Amin and Aaron Beach. They pulled email engagement statistics from about 30 billion different emails that were sent from more than 100,000 different senders. Below are a few questions I had for Victor about how the data for the report was pulled and what some of the more interesting findings of the report were.

What were some of the difficulties that you faced when pulling and analyzing the data for this report?

Since we’re dealing with billions of emails, and the events associated with each one, the size of the data made everything take longer. We had to make sure that everything was 100% right before running any of the analyses because having to re-run would have been costly. It took months to plan and weeks to run the data before we had anything to work with.

Since the number of combinations of slices multiplies depending on what you want to look at (like open rates and click rates across all devices or gender and industry), we had to pick and choose what combinations would be most interesting.

Are there any benchmarks or stats within this report that surprised you or were unexpected?

One thing that surprised me is how common names are in email addresses. It was a pleasant surprise because it allowed us to approximate gender breakdowns by cross-referencing the U.S. Social Security Administration public names database to infer gender likelihoods. That led to another surprising finding: health and fitness emails are female-dominated.
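To make the name-matching idea concrete, here is a minimal sketch of how a gender likelihood could be estimated from an email address. It is not the code used for the report. The NAME_COUNTS table, its numbers, and both function names are hypothetical placeholders; a real analysis would load per-name counts from the Social Security Administration names data instead.

```python
import re

# Placeholder counts, NOT real SSA figures: name -> (female count, male count).
NAME_COUNTS = {
    "mary": (990, 10),
    "james": (5, 995),
    "jordan": (300, 700),
}


def female_likelihood(email, name_counts=NAME_COUNTS):
    """Return an estimated P(female) for the address, or None if no known name is found."""
    local_part = email.split("@", 1)[0].lower()
    # Split on anything that is not a letter: dots, digits, underscores, hyphens, plus signs.
    tokens = [t for t in re.split(r"[^a-z]+", local_part) if t]
    for token in tokens:
        if token in name_counts:
            female, male = name_counts[token]
            return female / (female + male)
    return None


def female_share(emails, name_counts=NAME_COUNTS):
    """Average the likelihoods over the addresses where a name was recognized."""
    scores = [female_likelihood(e, name_counts) for e in emails]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None
```

Calling female_share on the recipient list of one industry gives a rough estimate of that audience's gender split. Addresses without a recognizable name are left out of the average rather than guessed.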
There are a lot of reports about email out there. How does the Global Email Benchmark Report differ from others?

I think the size of our data and the level of rigor we applied to the analysis set us apart. These aren’t casual numbers; they’re statistically valid measurements that you can use with confidence.

How accurate are these benchmarks? Why should this report be trusted?

Because of our size, we see a big piece of certain industries. We were careful to only include elements in this report for which we had enough data to be confident in our numbers; if we didn’t have enough data, we didn’t put it in the report. You can trust the bird’s-eye view this report presents. However, every sender is different; it’s never a good idea to rely too heavily on averages, even if they’re highly representative.

What are some of the other types of data SendGrid can look into? What could we expect to see in future iterations of this report?

We have lots of opportunities that I’d be excited to present in the future. A few that I’m looking forward to right now include:

- In addition to current snapshots, looking at trends over time
- International trends and relationships
- The effects of email subject lines

How would you like to use SendGrid’s data in the future?

There are many opportunities to build machine learning and artificial intelligence (AI) into our products using our data. I’d like to help our customers take the guesswork out of which message should be sent to which recipient at which time. I think the future of email will be a lot more data-focused.

Are there any ways email senders could leverage their own data to improve their email programs?

The short answer is: run experiments. Experiments and A/B tests are the gold standard for email marketing. If you can’t run experiments, make sure you understand how your recipient engagement breaks down on as many axes as you have data on (demographics, device, etc.).
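As an illustration of what reading an A/B test can look like, the sketch below compares the click rates of two email variants with a two-proportion z-test. This is one common way to check whether a difference is likely to be more than noise, not a method described in the report. The function name and the example counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_a, sent_a, clicks_b, sent_b):
    """Return (z, two-sided p-value) for the difference in click rate, B minus A."""
    rate_a = clicks_a / sent_a
    rate_b = clicks_b / sent_b
    # Pooled click rate under the assumption that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (sent_a + sent_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if std_err == 0:
        # No clicks at all, or every email clicked: no measurable difference.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts only: variant A got 100 clicks from 1,000 sends, variant B got 150.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value (0.05 is a common cutoff) suggests the gap between the variants is unlikely to be chance alone. The same comparison can be run between two segments, such as devices, when a controlled experiment isn't possible, though a gap found that way shows a difference, not its cause.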
There are two ways that data is valuable:

1. It can bring problems to light. For instance, you may ask yourself, why are my click rates so terrible on Android? Maybe we have a rendering issue that we can fix to get more engagement.
2. It can help you find your best customers. You could find specific segments and audiences that engage more than others. For instance, you might look at your data and find that men aged 35-50 engage way more than other groups. Why is that? Can you target them more? Can you advertise to them in other channels?

For other big data insights from Victor, you can read the blog post "Email Click Delays, Unsubscribes, and Engagement Rates: How Are They Affected By the Holidays?" To see all of the data mentioned above put to use, check out the full Global Email Benchmark Report.