Backup solution for Hebe's BigQuery

Scheduled backup solution for Hebe’s BigQuery Warehouse

About

Hebe needed a secure backup solution for its data warehouse. The company wanted to be protected against a catastrophic failure and prepared in case disaster recovery is needed.

01.

Launch

2023

02.

Scope

EDW backup

About the client

Hebe is a specialised health and beauty chain with more than 300 stores in Poland and is part of Jeronimo Martins. Its business concept is based on offering high-quality services at very competitive prices. In 2022, the company consolidated its omnichannel approach, strengthening the integration between the digital channel and the physical store network. Hebe’s online shop also sells to the Czech Republic and Slovakia.

Hebe needed a more secure and longer-lived backup for its data warehouse than the 7-day backup that is native to BigQuery. The company also wanted to be protected against a catastrophic failure, such as the physical destruction of Google data centers, to be prepared in case disaster recovery is needed, and to keep a backup not only in a different zone but also in a different region.

A backup runs automatically every 7 days, with no manual trigger. Hebe is protected against a catastrophic failure, and because each backup is automatically removed after 30 days, the solution remains economical.

01.

Solution

To back up the data from BigQuery automatically, we used Google Cloud services: Cloud Composer with Apache Airflow, and Storage Transfer. We also relied on Google Cloud’s division into regions and zones to store the data safely across Europe.

02.

Backups

To store the backup and make sure that a restore point is never more than 7 days old, we keep the data as .parquet files in a bucket in a different region. To run the backup without manual triggers, Cloud Composer with Apache Airflow first exports the tables to a bucket in the same region as the warehouse, and Storage Transfer then syncs that bucket with a bucket in a different region.
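The export step can be illustrated with a minimal sketch. It is not the code used in the project: the bucket, project, dataset, and table names are hypothetical, and the dated folder layout is an assumption. The dictionary follows the shape of the extract configuration in the BigQuery jobs REST API; in an Airflow DAG, a dictionary like this could plausibly be passed as the `configuration` argument of `BigQueryInsertJobOperator`.

```python
from datetime import date


def export_uri(bucket: str, dataset: str, table: str, run_date: date) -> str:
    """Build the destination URI for one table export.

    The single '*' wildcard lets BigQuery split a large table into
    several files. The dated folder layout is an assumption.
    """
    return f"gs://{bucket}/{run_date.isoformat()}/{dataset}/{table}/part-*.parquet"


def extract_job_config(project: str, dataset: str, table: str, uri: str) -> dict:
    """Build a BigQuery extract job configuration that writes Parquet files."""
    return {
        "extract": {
            "sourceTable": {
                "projectId": project,
                "datasetId": dataset,
                "tableId": table,
            },
            "destinationUris": [uri],
            "destinationFormat": "PARQUET",
        }
    }


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    uri = export_uri("hypothetical-staging-bucket", "sales", "orders", date(2023, 5, 1))
    print(extract_job_config("hypothetical-project", "sales", "orders", uri))
```

The sync to the second region is a separate Storage Transfer job and is not shown here.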

03.

Results

As a result, the backup is stored safely 1,500 km away from the location of Hebe’s BigQuery warehouse, and Hebe is protected in case of a natural disaster or an error. If anything goes wrong in BigQuery, a previous version can be restored for up to 30 days. After 30 days the backup is automatically removed, which keeps the cost of the solution manageable.
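The retention arithmetic can be sketched as follows. The case study does not say how the 30-day removal is implemented; this sketch assumes a Cloud Storage lifecycle rule on the backup bucket, and the dates and function names are hypothetical. It shows the lifecycle configuration in the JSON shape Cloud Storage accepts, and which weekly backups would still be restorable on a given day.

```python
from datetime import date, timedelta
from typing import List

RETENTION_DAYS = 30       # backups are removed after 30 days
BACKUP_INTERVAL_DAYS = 7  # a backup runs every 7 days


def lifecycle_config(age_days: int = RETENTION_DAYS) -> dict:
    """Lifecycle rule that deletes objects once they reach the given age.

    Assumption: the removal is done by a bucket lifecycle rule.
    """
    return {"rule": [{"action": {"type": "Delete"}, "condition": {"age": age_days}}]}


def restorable_backups(first_backup: date, today: date) -> List[date]:
    """List the weekly backup dates that are younger than the retention period."""
    backups = []
    current = first_backup
    while current <= today:
        if (today - current).days < RETENTION_DAYS:
            backups.append(current)
        current += timedelta(days=BACKUP_INTERVAL_DAYS)
    return backups


if __name__ == "__main__":
    # Hypothetical schedule: first backup on 1 January 2023, checked on 1 March 2023.
    for backup_date in restorable_backups(date(2023, 1, 1), date(2023, 3, 1)):
        print(backup_date.isoformat())
```

With a weekly schedule and 30-day retention, four or five backups are available at any time, which bounds the storage cost.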