How We Automatized Custom Email Reports Using Python, Apache Airflow, and SendGrid API (Part 1)

By Kamil Jankowski | Business Intelligence Engineer

A modern business needs data-driven decision-making. According to statista.com, in 2020 there had been a significant 12% increase in implementing data-driven decision-making approaches in global organizations. We also observe this trend based on our experiences with partners. The business that we work with wants to make conscious decisions, based on solid data, so we are thrilled and proud of the opportunity to deliver them exactly what they need in a fully automated way.

How do we do it? Which programmatic languages and tools do we use? If you would like to gain a decent understanding of sending custom email reports to your colleagues or business partners, this two parts article will help guide you through the process. In this part, we are focusing on an explanation of the key project requirements and tools.

Custom E-mail Reports – What Should Be The Effect?

Imagine that your company owns a marketplace platform like Amazon that allows users to sell their products on it. Retailers should be able to check their sales efficiency to decide which products fit buyers needs and which one should be withdrawn from the online store due to potential losses. Depending on the subscription type (daily, weekly, and/or monthly) a user should receive one or more emails including a CSV file with Product ID, price, and quantity sold, for his further analysis. Of course, e-mails should have a nice template and be sent from an internal domain that matches other company emails such as newsletters or customer support messages. 

Why Did We Choose Phyton, Apache Airflow, and Sendgrid API To Send E-mail Reports?

Our Data Engineering Team was using an external, but popular, visualization tool for data gathering, chart building, and reports sending to internal and external partners. Unfortunately, that particular BI solution has been taken over and closed which forced our team to search for a new, more advanced reporting system. Of course, we also considered using one of the well-known BI tools, such as Tableau, but they did not fit our top 5 key requirements for this project that were:

  1. The tool should be able to load the data from the database, process, and attach it as a CSV file.
  2. The tool should give us the possibility to validate the data before sending it to partners and alert the Data Engineering Team about all issues or emails that could not be sent.
  3. The tool should be flexible enough to give us the possibility to play with the dataset and email formats. 
  4. The tool should send emails from our internal domain.
  5. The tool should fully automate the process and work independently 24/7.

We analyzed many potential solutions and decided that the best way is to combine the Python language, Airflow server, and SendGrid service for building our custom reporting system. 

Why Python?

Python is a well-known programming language supported by an enormous and engaged community. It offers and supports multiple useful libraries and API services that give almost unlimited possibilities. It is also a language we use every day for creating data pipelines, and everyone in the Data Engineering Team knows it. Based on that, for us, it was an obvious choice for this project.

Why Apache Airflow?

Python was just a solution for the script-building process. However, we still needed a separate tool that would give us the option to set everything once and forget about it. At least until a problem arises that requires our team to be informed. 
One of our objectives was to find a tool that allows full automation. Since we used Airflow for other ETL processes, we decided to apply it to this project as well. If you have not heard about Airflow or would you like to learn more about it, e.g. how to configure your own server, click here for more information.

Why SendGrid?

SendGrid is the world’s largest cloud-based email delivery platform that provides API Keys and programmatic language integrations (including Python). SendGrid is user-friendly and has a nice interface, so even without knowing any of the programming languages, users are able to send custom emails directly from the web platform and analyze real-time data there, which is great for teams such as marketing or customer service. However, for our Data Engineering Team, the most important was an API Key that could be loaded into Python to unlock the full potential of automated email sending through SendGrid. If you would like to know more about it, I recommend the official documentation of it.

Summary

Choosing a combination of the Python language, Airflow server, and SendGrid service gave us the flexibility we needed to customize the reports according to our specific needs and also allowed us to automate the process of generating and sending them out on a regular basis.

From today’s perspective, our solution turned out better than the default emails sent by our previous reporting system. As a confirmation, we received great feedback from our partners. What we really love about that solution is that we connected existing blocks (Airflow, Sendgrid) into a full-fledged, robust & configurable custom solution. We strongly believe this is how the future IT world will look like: not building everything from scratch but combining existing building blocks into bigger systems.

In part 2 of this article, we will share the technical implementation of our idea and also the final outcome.