Meta outlines its approach to silent data corruption

Meta recently published its approach to detecting silent data corruptions (SDCs).

SDCs are data errors that leave no trace in system logs but can affect memory, storage, and networking, causing data loss and corruption. Meta began testing for them three years ago after having difficulty detecting SDCs across its data center fleet.

Meta now uses both out-of-production and ripple testing to detect these hardware issues. The company recommends that other large organizations use both approaches as well, in order to detect data corruption at scale as quickly as possible.

Meta also announced that it will award five grants, worth around $50,000, to academic researchers for research proposals in this field.

 
