Understanding RTO & RPO….

Disaster recovery is one of the most critical aspects to consider when you are designing a system, and it becomes even more prominent when architecting a cloud solution. Two key concepts that must be taken into account are Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Let’s understand these in this post.

 

RTO is how much downtime your business can tolerate. RPO is how much data loss your business can tolerate. Let’s say that your RTO and RPO are both set to 24 hours. That means that if a disaster happens, you have agreed that your business can tolerate up to 24 hours of downtime. Now suppose the disaster has happened and you recovered from it in 8 hours. You are well under your RTO of 24 hours, so this is not an issue: you had already agreed that 24 hours of downtime is acceptable. But if your recovery took 48 or 72 hours, that is not acceptable, because it is far beyond what you, as a business, can tolerate.
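The RTO comparison above can be sketched in a few lines of Python. This is only an illustration using the numbers from this post; the function name `within_rto` is made up for the example and is not part of any tool.

```python
from datetime import timedelta


def within_rto(actual_downtime: timedelta, rto: timedelta) -> bool:
    """Return True when the recovery finished within the agreed RTO."""
    return actual_downtime <= rto


# Hypothetical values taken from the example in the text.
rto = timedelta(hours=24)

for hours in (8, 48, 72):
    ok = within_rto(timedelta(hours=hours), rto)
    print(f"Recovered in {hours} hours, within RTO: {ok}")
```

Only the 8-hour recovery passes the check; the 48-hour and 72-hour recoveries exceed the 24-hour RTO.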

Now, let’s come to RPO. RPO is how much data loss your business can tolerate, measured as a window of time. Since RPO concerns lost data or lost transactions, a miss here usually has an even more severe impact. Data loss is rarely acceptable, so an organization will normally have a backup and restore plan (read: disaster recovery) already in place. Let’s assume that you take a backup every day from 10 PM to 2 AM, which is a 4-hour window. Now the disaster happens at 2 AM. You have lost the last 4 hours of generated data, because that is how long your backup window was. Since your RPO is 24 hours, you are still going to be okay, as the 4 hours of transactional data you lost is under your RPO. Now assume that you run a mission-critical application, like a bank or a shopping portal, and your RPO is just 1 hour. Losing 4 hours’ worth of transactional data is then not acceptable, simply because you have lost more than your business can tolerate.
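To make the RPO arithmetic concrete, here is a small Python sketch. It assumes that the last recoverable point is 10 PM, when the backup started, and that the disaster hits at 2 AM the next morning. The calendar dates and the names `data_loss_window` and `within_rpo` are hypothetical, chosen only for this illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, disaster_time: datetime) -> timedelta:
    """Time span of data generated after the last recoverable point."""
    return disaster_time - last_recovery_point


def within_rpo(loss: timedelta, rpo: timedelta) -> bool:
    """Return True when the lost data window fits inside the agreed RPO."""
    return loss <= rpo


# Assumed for illustration: data is recoverable up to 10 PM, disaster at 2 AM.
last_recovery_point = datetime(2024, 1, 1, 22, 0)
disaster_time = datetime(2024, 1, 2, 2, 0)

loss = data_loss_window(last_recovery_point, disaster_time)
print("Data lost:", loss)
print("OK with a 24 hour RPO:", within_rpo(loss, timedelta(hours=24)))
print("OK with a 1 hour RPO:", within_rpo(loss, timedelta(hours=1)))
```

The same 4-hour loss is fine against a 24-hour RPO but fails a 1-hour RPO, which is the difference between the two businesses described above.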

Why are these two concepts important? Well, when you are designing a cloud solution, you need to pay attention to these concepts and to the values agreed by your organization. Based on them, you decide how to take advantage of features like different Regions, Availability Domains (ADs), and Fault Domains. This also includes deciding whether to use a single-instance database or a RAC database, and whether to run your application in one Fault Domain or more. In short, it’s important to keep RTO and RPO in mind while architecting a cloud solution.

Hope this helped.

Aman….