Most Common RAID Failures And Data Recovery Insights


RAID (Redundant Array of Independent Disks) configurations are common in large corporations, and the way the system works has made it increasingly popular with small and medium-sized businesses as well. Essentially, RAID is a method of accessing multiple individual disks in a set-up that acts as if one larger disk is being used. Depending on the RAID level, data is spread across several disks with redundancy, which reduces the risk of losing all data if one disk fails. A RAID system can also allow quicker access to stored data.

Although RAID is considered a safe choice for massive data storage, mostly due to the redundancy of the stored data, it can still fail. If this happens to you and you have not been habitually backing up your data, the result could be devastating.
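To make the redundancy idea concrete, here is a minimal sketch of XOR parity, the scheme used by parity-based levels such as RAID 5. The block contents and the helper name xor_blocks are made up for illustration, and a real array stripes data and rotates parity across disks, which this toy example does not attempt.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per disk, plus a parity block on a fourth disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The same property explains the limit of single parity: with two blocks missing, there is no longer enough information to reconstruct either one.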

Here are the most common RAID failure types:

Controller Failure

Every RAID configuration includes a controller of some kind. When that controller dies or malfunctions, it can result in data loss. This is a rather common scenario, and if the controller is beyond repair, it must be emulated or replaced with hardware that is capable of rebuilding the array. A power surge is most often the source of a controller failure. Investing in a good-quality, reliable UPS can save you a huge headache.

Running Too Long In Degraded Mode

When a hard drive in a RAID config fails, and if the configuration supports it, the RAID will continue to run in degraded mode. This lets you keep operating while you locate a replacement drive to hot swap. The most common mistake here is putting the RAID into read-only mode and then continuing to operate that way for a long period of time, for example because access to a database is required frequently. While putting the RAID in read-only mode is a good idea in general, it is only a stopgap, because a degraded array has already lost some or all of its redundancy. A second drive often fails fairly quickly in this scenario, which will result in total data loss.
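One way to avoid lingering in degraded mode is to check for it automatically. The sketch below assumes Linux software RAID (md), which reports array state in /proc/mdstat. Hardware controllers expose this through their own vendor tools instead, so treat this as an illustration only. The function name degraded_arrays is hypothetical.

```python
import re

def degraded_arrays(mdstat_text):
    """Return the names of md arrays with fewer working members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like: "md0 : active raid1 sdb1[1] sda1[0]"
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status lines end with e.g. "[3/2] [U_U]": 3 members expected, 2 working.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, working = int(status.group(1)), int(status.group(2))
            if working < expected:
                degraded.append(current)
            current = None
    return degraded
```

On a Linux host this could be called as degraded_arrays(open("/proc/mdstat").read()) from a scheduled job that raises an alert whenever the returned list is not empty.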


Mistaking Logical Corruption For Physical Failure

This is another common occurrence. A user mistakes logical corruption for a physical failure and starts swapping drives, which can result in catastrophic data loss. A general rule of thumb is to first run a test on all drives to check their SMART status before trying to repair a suspected physical fault.
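As a rough illustration of that rule of thumb, the sketch below reads the overall health verdict from smartctl (part of smartmontools) using its -H option. It assumes smartctl is installed and that the drives are directly visible to the operating system. Drives behind a hardware RAID controller often need extra device-type options that are not shown here. The function names are hypothetical, and a passing verdict does not prove a drive is healthy.

```python
import subprocess

def smart_health_passed(smartctl_output: str) -> bool:
    """Return True if the text output of `smartctl -H` reports a passing verdict."""
    for line in smartctl_output.splitlines():
        # ATA/SATA drives report: "... self-assessment test result: PASSED"
        if "overall-health self-assessment test result" in line:
            return line.strip().endswith("PASSED")
        # SCSI/SAS drives report: "SMART Health Status: OK"
        if "SMART Health Status" in line:
            return line.strip().endswith("OK")
    # No verdict found: treat the drive as unverified rather than healthy.
    return False

def check_drive(device: str) -> bool:
    """Run `smartctl -H` on one device path (for example /dev/sda)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True, text=True, check=False,
    )
    return smart_health_passed(result.stdout)
```

Running check_drive on every member of the array before touching any hardware gives a quick first signal about whether the problem is likely physical at all.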

Rebuilding With A Wrong/Bad Drive

Every RAID level has its own requirements, including the number of drives it needs in order to function properly. A common mistake that typically results in catastrophic data loss, and one that is almost always unrecoverable, is trying to rebuild the RAID config with a bad drive, such as one with a known Unrecoverable Read Error (URE). This could render all data on the remaining drives of the RAID unrecoverable.
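UREs matter during a rebuild even when every drive looks healthy, because the rebuild has to read the surviving drives from end to end. The back-of-the-envelope sketch below estimates the chance of hitting at least one URE. The rate of one error per 10^14 bits is an assumption (a figure commonly quoted on consumer drive datasheets, not a number from this article), the drive sizes are hypothetical, and the model treats errors as independent, which real drives do not strictly obey.

```python
import math

def rebuild_ure_probability(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading bytes_read."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Hypothetical 4-drive RAID 5 with 4 TB drives: one has failed,
# so the rebuild must read the 3 surviving drives in full.
surviving_drives = 3
drive_size_bytes = 4e12
p = rebuild_ure_probability(surviving_drives * drive_size_bytes)
print(f"Chance of at least one URE during rebuild: {p:.0%}")
```

Under these assumptions the estimate comes out a little over 60%, which is why starting a rebuild with a questionable drive, or without a backup, is such a gamble.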

Although I have discussed the four most common RAID failure types, there are others that can impact a RAID system.

Okay, moving on. Now that you know the common failure types, here are the data recovery insights, facts, and warnings to help you decide what your next steps toward recovery will be.

RAID Data Recovery Insights

Recovery Costs Can Be Very High

RAID recovery is often a very expensive service. In North America, a RAID 5 recovery typically costs $5,000 and up, depending, of course, on the complexity and the number of drives used. RAID 6+ can cost upwards of $10,000, also depending on specifics.

Bad Diagnosis

Some companies will misdiagnose the situation, which makes it look worse than it actually is. This is a very common error with RAID configs as well as single hard drives. A user who is not too savvy in an emergency situation can end up paying more than they should. Sadly, this is a practice that plagues this industry. If you have your doubts, get a second opinion.

Not Sending In The Controller

Frequently, clients will send in just the hard drives for recovery/repair and not include the controller. It is a good idea to include both, because about 60% of the time the controller can be repaired, which will enable your RAID config to return to its normal state. If the controller is not repairable, the engineers will then work directly with the drives, rebuilding the array with the assistance of an emulated controller.


Remote Sessions

Half the time, remote access to your environment will permit engineers to determine the issue with the RAID/server configuration. Although remote sessions are not free, they are the best ‘first step’ toward recovery by a professional RAID data recovery expert. In some cases, RAIDs can be rebuilt remotely as well, depending on whether physical damage is present.

In Conclusion

It doesn’t matter if you are an individual, a business owner or a corporate entity: a RAID failure can cripple you and bring all your work to a halt. When you start searching for a data recovery provider, there are many things to watch out for so that you don’t get scammed or mistreated for not being tech-savvy.

If you are experiencing issues with a RAID setup, or you wish to ask an industry specialist a question or require remote assistance, you can send us a message on our Contact Us page. We will be happy to provide you with the assistance you may require.