I often reach for sports analogies when I think about how we at SendGrid anticipate a big event. While we deliver more than a billion messages every day of the year, our big event comes when our customers rely on us to deliver critical messages to their customers on Black Friday and Cyber Monday.
You could say that Black Friday and Cyber Monday are our championship games of the year. It often looks easy to people watching from the outside. However, it takes years of dedicated work and preparation to succeed.
This year, I’m excited to report that SendGrid processed 2.8 billion emails on Black Friday (nearly 1 billion more than in 2017). And on Cyber Monday, SendGrid processed 2.9 billion emails, making it our largest sending day ever. This post explains the preparations that we make at SendGrid to ensure that every last email is delivered on behalf of our customers.
How SendGrid prepares for Black Friday and Cyber Monday
We have a saying here, “Black Friday is Coming,” that helps prepare our organization for the awesome responsibility that our customers have entrusted to us. Whether they are small businesses, large enterprises, or non-profits, they rely on us to deliver their messages when they are needed most.
Each year in December we complete a retro on the past holiday sending period to see what went well, what could have gone better, and what we are going to implement to ensure we win the next year as well.
Yes, you heard that right: only days after Cyber Monday, we start planning how we are going to meet the needs of our customers in 12 months’ time.
Within this retrospective, we analyze our data and update our projects based on the most recent findings. We review system and service performance to identify areas to tune, and we look at any new products we have coming up to ensure they will handle the scale that we need.
In January, we begin preventative maintenance sprints, in which our engineering teams focus on cleaning up small nuisances that can add up to big problems if neglected throughout the year. We keep a running list that anyone can add to, and we review that list weekly to decide the priority and timing of each fix.
Think of it like oil changes every 5,000 miles instead of replacing the engine after 100,000 miles. Doing the little things makes the big things easier (and avoids potential disasters).
Everything we build through our blueprint process has to support the high availability (HA), disaster recovery (DR), and scalability needs of our customers. So throughout the year, everything new being built has to be verified to meet the scalability needs of our business. This is an ongoing activity that our whole company understands: our customers’ trust rests on our reliability, scalability, and deliverability.
We run disaster recovery scenarios throughout the year, and when incidents or near misses occur, we review each of those events to identify learnings that we can apply to better our service and to improve our customer experience.
We take advantage of every opportunity to continuously evolve our services and prepare them for the holidays. Think of these as the practice sessions that hone our skills, like mid-week adjustments during a football season.
As we enter the summer months, our preparations ramp up even more. We continue to refine our projections. We also work closely with our customers to understand their growth and do joint planning and testing to ensure that we have a better understanding of their needs and how they see their business evolving.
Stress testing in controlled environments
At the same time, we start systematically stressing our production environments to find potential breakpoints. We inform our customers of these activities through maintenance notifications, because we want to uncover any fragility in our systems in a controlled manner rather than during the most critical time for our customers’ businesses.
These stress tests don’t always go as we expect, and that is where we learn the most.
We run simulations and tests before these production tests, but there is nothing like stressing your production environments to identify issues.
As a result, by the time we transition into fall and playoff time, we have learned a great deal, we have better intelligence about what we expect our customers’ needs to be, and we have identified and shored up our weaknesses.
During the fall, we:
- Increase our stress testing frequency
- Run more simulations
- Plan for any failures and the resulting communications procedures
- Make sure we have a few contingencies in place to ensure success
We are prepared for the big game and prepared for what happens during that game. I have always loved Mike Tyson’s comment, “everyone has a plan until they get punched in the mouth.” We plan so that we can adapt when we get punched in the mouth, because we owe that to our customers.
I gave this post the title “Riding the Tidal Wave of Email” because you don’t simply paddle out to a monster wave without years of preparation, without constant practice and learning, and without building up to it over time.
We are excited to meet the challenge of the big wave, the championship game, and the final shot because that is our goal each year. We prepare for it in everything we do so our customers can count on us.
Interested in leveraging SendGrid’s scalability and dependability in your own email program? Check out our various email plans.