Disaster Recovery: How To Protect Your Database and Business with SkySQL (Part 1)
October 24, 2024
The Critical Importance of Database Disaster Recovery
In today’s digital-first world, database downtime can lead to catastrophic consequences for businesses. Natural disasters, hardware failures, or cyber-attacks can disrupt your database operations, resulting in revenue loss, reputational damage, and compromised data integrity. Implementing a robust database disaster recovery (DR) plan is essential to ensure business continuity and minimize data loss in the face of unexpected events.
We will cover two key scenarios for setting up a DR site using SkySQL in a two-part blog series. In this blog, we address setting up a DR site for databases running in the cloud with SkySQL. In the second, forthcoming blog, we will address setting up a SkySQL cloud DR site for on-premises production databases. Together, these blogs will give you a comprehensive understanding of how to build a resilient disaster recovery plan, whether your databases are hosted in the cloud or on-premises.
Key Components of an Effective Disaster Recovery Plan
Automated Backups: Ensure that your DBaaS provider offers automated and frequent backups. These backups should be stored in geographically dispersed locations so that if one region experiences downtime, your data can still be restored from another.
Replication Across Regions: Implement database replication across multiple regions. This crucial step in cloud database protection allows seamless failover to a secondary region in case of a disaster, maintaining business continuity.
Automated Failover Mechanisms: A strong DR plan includes automated failover mechanisms. If the primary database becomes unavailable, the system should immediately switch to the standby replica in another region without manual intervention. This reduces downtime significantly.
Define Recovery Objectives: Establish clear Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO). RTO is the amount of time your database can be down before it impacts your business, and RPO is how much data you can afford to lose. This will enable you to align your replication and backup strategies with these critical business continuity metrics.
Regular Testing: Regularly test your disaster recovery setup by simulating failover scenarios. Utilize native database tools to monitor and validate your DR site’s readiness, ensuring your business continuity plan remains effective.
Advantages of a Well-Designed Disaster Recovery Solution
Minimized Downtime: Automated failover and rapid recovery capabilities ensure your database quickly returns to operation, preserving revenue and customer satisfaction in mission-critical applications.
Data Integrity: Unplanned outages can cause irreparable data loss if backups are not timely or sufficiently distributed. With frequent backups and replication, your data remains safe and intact, even during unexpected outages.
Scalability and Flexibility: A well-designed disaster recovery solution allows for right-sizing your DR site. You can scale resources as needed without matching your primary site’s configuration, optimizing costs while maintaining robust cloud database protection.
Uninterrupted Business Continuity: Your applications continue running with minimal disruption. Without a DR solution, businesses face extended downtime and possibly permanent data loss.
Why Choose SkySQL for Database Disaster Recovery?
SkySQL stands out as a fully managed, multi-cloud database service that empowers MariaDB and MySQL users with unparalleled scalability and reliability. Trusted globally for managing mission-critical workloads, SkySQL delivers exceptional performance, flexibility, and resilience across more than 40 global regions on major cloud platforms including Amazon AWS, Google Cloud, and Microsoft Azure.
SkySQL offers distinct advantages for implementing database disaster recovery strategies for both cloud-native and on-premises databases:
Cost-Effective: Using SkySQL for disaster recovery allows you to pay only for what you use. You can stand up smaller instances for DR and scale them up only when needed, reducing costs.
Auto-Scaling: In the event of a failure, SkySQL’s auto-scaling capabilities allow your DR instance to scale rapidly, ensuring minimal downtime and uninterrupted service for your applications.
Geographic Redundancy: By setting up DR in a different region, you can protect your data from regional disasters, such as natural calamities or network outages.
Multi-Cloud Redundancy: By setting up DR with a different cloud provider in the same region, you can protect your data from specific cloud provider outages with equivalent low latency.
Ease of Management: SkySQL’s API and intuitive UI simplify the entire process, from backups to failover, making it easier for teams to manage their DR setup.
SkySQL provides a comprehensive suite of tools and capabilities to help organizations implement a cost-effective and high-performance database disaster recovery strategy. Whether your database is hosted on SkySQL or on-premises, SkySQL’s flexible, cloud-based architecture facilitates the creation of a resilient system, ensuring robust business continuity and data protection.
Setting Up a DR Site for SkySQL in a Different Region
If your primary database is already running in SkySQL, setting up a disaster recovery site in a different region offers a powerful way to ensure business continuity. The cloud’s distributed nature allows you to easily replicate data across multiple regions, protecting your business from regional outages or disasters. Let’s dive into how you can set up a DR site for a SkySQL-hosted database.
The following steps can be completed via SkySQL’s UI or REST API. Before proceeding, please make sure you have the following information available to use with the SkySQL Backup Service API.
Collect the following information from the SkySQL portal:
API_KEY: Please obtain your API_KEY from the portal: https://app.skysql.com/user-profile/api-keys
Service_ID: This can be obtained from the “Connect” window in the portal. Take the first part of the “Fully Qualified Domain Name” (FQDN) shown in that window. For example, the FQDN of “acme-production” is “dbtwf12345678.syst0000.test1.skysql.com”. The first part of the FQDN is the Service_ID, hence: Service_ID=dbtwf12345678
For this exercise we will use the following values, but please be sure to use appropriate values from your environment.
API_KEY: skysql.1zzz.z88ne2qm.pxxxxyyyyzzzzxxxxxagWscolHJ8uw9Q2Tcle.xxx
Service_ID: dbtwf12345678
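As a small illustration of the rule above, the Service_ID can be derived from the FQDN in the shell instead of being copied by hand. This is only a sketch: the function name service_id_from_fqdn and the variable names are hypothetical, and the FQDN is the sample value used in this post.

```shell
# Sketch: derive the Service_ID (the first label of the FQDN).
# service_id_from_fqdn is a hypothetical helper name, not part of any SkySQL tooling.
service_id_from_fqdn() {
  # Strip everything from the first dot onward.
  printf '%s\n' "${1%%.*}"
}

# Sample FQDN from this post; replace it with the one from your "Connect" window.
FQDN="dbtwf12345678.syst0000.test1.skysql.com"
SERVICE_ID="$(service_id_from_fqdn "$FQDN")"
export SERVICE_ID
echo "Service_ID=${SERVICE_ID}"
```

The API key is better supplied through the environment (for example, an exported API_KEY variable) than written into scripts.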
1. Take a One-time Snapshot Backup
Using the SkySQL API: The SkySQL API can be used to create a snapshot backup of your database. A snapshot captures the state of the database at the moment it is taken. For more information, please refer to the documentation here: https://api.skysql.com/public/services/dbs/docs/swagger/index.html
In the following example, we have a production database “acme-production” in region us-east-1 that will act as the source for a new remote DR replica we are going to set up.
Use the following API call to create a one-time snapshot backup. Make sure to replace the sample API key and Service_ID with your own values.
bash#: curl --location 'https://api.skysql.com/skybackup/v1/backups/schedules' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --header "X-API-Key: skysql.1zzz.z88ne2qm.pxxxxyyyyzzzzxxxxxagWscolHJ8uw9Q2Tcle.xxx" \
  --data '{
    "backup_type": "snapshot",
    "schedule": "once",
    "service_id": "dbtwf12345678"
  }'
If you prefer the portal, click on “Backup Now”, choose your database service name, and select “Snapshot” for the “Type of backup” option.
2. Wait for Snapshot to Complete
Ensure that the backup has successfully completed before proceeding. SkySQL’s API can be used to monitor the status of ongoing backups.
Example API call to check backup status:
bash#: curl --location 'https://api.skysql.com/skybackup/v1/backups?service_id=dbtwf12345678' \
  --header 'Accept: application/json' \
  --header "X-API-Key: skysql.1zzz.z88ne2qm.pxxxxyyyyzzzzxxxxxagWscolHJ8uw9Q2Tcle.xxx"
You can also use the “Backups” tab in the portal to observe the status of the snapshot.
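Rather than re-running the status call by hand, the check can be wrapped in a small retry loop. The sketch below is illustrative only: wait_until and backup_succeeded are hypothetical helper names, and the "status" field and "Succeeded" value matched by grep are assumptions about the response body, so inspect a real response and adjust the pattern before relying on it. SERVICE_ID and API_KEY are assumed to be set in the environment.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0            # the check passed
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"      # wait before the next attempt
    fi
  done
  return 1                # gave up
}

# Hypothetical check built on the status call shown above.
# ASSUMPTION: the JSON response contains "status": "Succeeded" once the
# backup has finished; verify this against an actual response.
backup_succeeded() {
  curl --silent --location \
    "https://api.skysql.com/skybackup/v1/backups?service_id=${SERVICE_ID}" \
    --header 'Accept: application/json' \
    --header "X-API-Key: ${API_KEY}" | grep -q '"status": *"Succeeded"'
}
```

With both helpers defined, a call such as `wait_until 60 30 backup_succeeded` would poll every 30 seconds for up to 60 attempts and return non-zero if the backup never reports success.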
3. Set Up a New, Smaller Database in a Different Region
Create a new SkySQL instance in the desired region where you want to establish your DR site. To save costs, this instance can be smaller than your production database and can later be scaled up if a failover occurs.
Example API call to create a new DR instance. Replace “acme-drsite” and the other values with ones appropriate for your environment.
bash#: curl --location --request POST 'https://api.skysql.com/provisioning/v1/services' \
  --header "X-API-Key: ${API_KEY}" --header "Content-type: application/json" \
  --data '{
    "service_type": "transactional",
    "topology": "standalone",
    "provider": "aws",
    "region": "us-west-2",
    "architecture": "amd64",
    "size": "sky-2x8",
    "storage": 100,
    "nodes": 1,
    "name": "acme-drsite",
    "ssl_enabled": true
  }'
Alternatively, you can use the portal to deploy the new service by selecting appropriate options.
4. Restore the Snapshot Backup
Once the new database service “acme-drsite” is ready, restore the snapshot from the original database into the new instance by following the instructions below.
Navigate to the “Backups” tab and click on the “Actions” button in the same row as the snapshot backup you just created for the “acme-production” database.
Pick the new DR site database as the target.
5. Track Restore Job
Just like with backups, it’s important to track the restoration process to ensure it completes successfully.
6. Set Up Replication
Configure replication between the original database (acme-production) and the new DR instance (acme-drsite). This ensures that the DR database stays up to date with real-time changes from the production database.
6.1. Allowlist Outbound IP
First, you need to add the outbound IP address of the DR service (acme-drsite) to the allowlist of your primary service (acme-production). This ensures the two databases can communicate securely. The outbound IP can be obtained from the “Service Details” page in the SkySQL portal.
In our example, Outbound IP is 52.13.163.153.
Obtain the GTID position from which to start the replication by using the following stored procedure from the source database. Please replace (service_id) with the service id of the source.
Note down the “gtid_binlog_pos” value from the above output.
gtid_binlog_pos = 0-1-7062,572700-572700-210
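Because the GTID position is copied by hand into a later step, a quick format check can catch a truncated paste. MariaDB GTIDs take the form domain-server-sequence, and a position can be a comma-separated list of them. The sketch below checks only that shape; is_valid_gtid_list is a hypothetical helper name, and passing the check does not prove the position is the right one for your snapshot.

```shell
# Sketch: check that a string looks like a MariaDB GTID position, i.e. one or
# more domain-server-sequence triples separated by commas.
# is_valid_gtid_list is a hypothetical helper name.
is_valid_gtid_list() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]+-[0-9]+-[0-9]+(,[0-9]+-[0-9]+-[0-9]+)*$'
}

# Sample value from this post; replace it with your own gtid_binlog_pos.
GTID_POS="0-1-7062,572700-572700-210"
if is_valid_gtid_list "$GTID_POS"; then
  echo "GTID position looks well-formed"
else
  echo "GTID position is malformed" >&2
fi
```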
6.2. Configure GTID Position
Now configure the acme-drsite database by calling the stored procedure below. Log in to the acme-drsite database service and run the following command from the mariadb command line. Replace the host and port with the source (acme-production) hostname and port, and replace the GTID with the gtid_binlog_pos value obtained in the previous step. Note that the hostname and the GTID position are string values and must be quoted.
MariaDB [(none)]> CALL sky.change_external_primary_gtid('dbtwf12345678.syst0000.test1.skysql.com', 3306, '0-1-7062,572700-572700-210', true);
When you run the above command, you’ll see output similar to the following.
+----------------------------------------------------------------------------------------------------------------------------+
| Run_this_grant_on_your_external_primary |
+----------------------------------------------------------------------------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'skysql_replication_dbpgf28771827'@'%' IDENTIFIED BY 'Vs?wr^A86NNijlh4,-v57o?W&PGoQDa1'; |
+----------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.799 sec)
Copy the “GRANT” text (entire line) and run it in the source database “acme-production”. Log in to the database service and run the following command:
MariaDB [sky]> GRANT REPLICATION SLAVE ON *.* TO 'skysql_replication_dbpgf28771827'@'%' IDENTIFIED BY 'Vs?wr^A86NNijlh4,-v57o?W&PGoQDa1';
Query OK, 0 rows affected (0.044 sec)
6.3. Start Replication
Once the GTID position is configured, start replication on the target database service (“acme-drsite”).
MariaDB [sky]> CALL sky.start_replication();
6.4. Verify Replication Status
Finally, verify that the replication is working correctly by checking the replication status on the standby service.
Example stored procedure to check replication status:
MariaDB [sky]> CALL sky.replication_status();
7. Monitor Replication and Configure Alerts
SkySQL provides monitoring tools to check for replication lag or failures. Use these tools to ensure that the DR instance is consistently updated.
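As a rough complement to the built-in monitoring, replication lag can also be compared against your RPO from a script. The sketch below assumes you have captured replication status text in MariaDB's vertical format, as produced by SHOW REPLICA STATUS\G, containing a Seconds_Behind_Master line; whether the output of sky.replication_status() has exactly this shape is an assumption to verify. The helper names replication_lag_seconds and lag_within_rpo are hypothetical.

```shell
# Sketch: read vertical-format replication status text on stdin and print the
# value of Seconds_Behind_Master (a number, or NULL when replication is stopped).
replication_lag_seconds() {
  awk -F': *' '/Seconds_Behind_Master/ { gsub(/[[:space:]]/, "", $2); print $2; exit }'
}

# Sketch: succeed only if the lag is numeric and within the allowed seconds.
# usage: lag_within_rpo <lag> <max_seconds>
lag_within_rpo() {
  local lag="$1" max="$2"
  case "$lag" in
    ''|*[!0-9]*) return 1 ;;  # empty or NULL: treat as a failure worth alerting on
  esac
  [ "$lag" -le "$max" ]
}
```

For example, with the status output saved to a file named status.txt (a hypothetical file name), `lag_within_rpo "$(replication_lag_seconds < status.txt)" 60` returns non-zero when the replica is more than 60 seconds behind or replication is not running, which a cron job or alerting hook could act on.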
Conclusion
By following the steps outlined above, you can ensure that your SkySQL-hosted database is protected from unexpected outages or disasters with minimal effort. A robust DR plan not only reduces downtime but also protects your data integrity and ensures that your business operations remain uninterrupted.
Stay tuned for “Part 2” of this series, where we will explore how to set up a DR site for on-premises production databases using SkySQL. This will allow businesses to leverage the cloud for reliable, scalable, and cost-effective disaster recovery.
Ready to Get Started?
Sign up for SkySQL and launch your database in any of the three major clouds. There is no credit card required to start and new users receive $100 in free credits.
For more information, sign up for a free trial or, to request a demo, contact a SkySQL expert.