
INSIDETECHNOLOGY | Friday 24 September 2010

Facebook Re-Boots To Fix System Error

If you are a Facebook junkie, yesterday was a stressful day. The social media network suffered a worldwide connection problem: users could not connect or communicate with friends, and pages would hang for what felt like an eternity in an age of high-speed internet access.

Facebook blamed Thursday's 2.5-hour downtime on a change it made to its system, resulting in the worst outage the social-networking company had seen in four years.

"The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition," Facebook's Robert Johnson wrote in a blog post. "This is the worst outage we've had in over four years, and we wanted to first of all apologize."

Thursday's outage was the second in as many days for Facebook, which was hit with sporadic downtime on Wednesday because of an "issue with a third-party provider."

Facebook has an automated system that checks for invalid configuration values throughout the site. If it finds an invalid value, it replaces that value with an updated one from its persistent store.

"This works well for a transient problem with the cache, but it doesn't work when the persistent store is invalid," Johnson wrote.

Unfortunately, Facebook made a change to its persistent store on Thursday that turned out to be invalid. As a result, the automated system tried to replace the values it flagged as errors with values from the persistent store, which was itself invalid.

"Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second," Johnson said.

"To make matters worse, every time a client got an error attempting to query one of the databases it interpreted it as an invalid value, and deleted the corresponding cache key," Johnson continued. "This meant that even after the original problem had been fixed, the stream of queries continued."

The result was a "feedback loop" that didn't allow for database recovery time, he said.
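The loop Johnson describes can be illustrated with a small simulation. The Python sketch below is not Facebook's code: the `Database` class, the `run_tick` function, the key name `config`, and the capacity figure are all assumptions made up for illustration. It models many readers hitting one cached value at the same instant, with every database error handled by deleting the cache key, as in the behavior described above.

```python
VALID = "valid"  # hypothetical marker for an acceptable configuration value


class OverloadedError(Exception):
    """Raised by the stand-in database when it receives too many queries."""


class Database:
    """Stand-in for the database cluster behind the persistent store."""

    def __init__(self, value, capacity_per_tick):
        self.value = value                  # what the persistent store holds
        self.capacity = capacity_per_tick   # queries it can answer per tick
        self.load = 0

    def start_tick(self):
        self.load = 0

    def query(self, key):
        self.load += 1
        if self.load > self.capacity:
            raise OverloadedError(key)
        return self.value


def run_tick(cache, db, clients, key="config"):
    """Simulate `clients` readers checking the same key at the same instant.

    Returns the number of queries sent to the database during the tick.
    """
    db.start_tick()
    # Phase 1: every reader inspects the cache before any repair lands.
    needs_repair = [cache.get(key) != VALID for _ in range(clients)]
    # Phase 2: each reader that saw a bad or missing value queries the database.
    queries = 0
    for bad in needs_repair:
        if not bad:
            continue
        queries += 1
        try:
            cache[key] = db.query(key)
        except OverloadedError:
            # The flaw: an error is treated like an invalid value,
            # so the cache key is deleted and the next reader queries again.
            cache.pop(key, None)
    return queries
```

In this toy model, 1,000 readers against a database that can answer 10 queries per tick keep sending 1,000 queries every tick, even after `db.value` is corrected, because the failed queries keep deleting the key. Only when traffic drops below the database's capacity does a valid value stay in the cache, which mirrors why the site had to be turned off.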

How did Facebook fix it? Re-booting, essentially. "We had to stop all traffic to this database cluster, which meant turning off the site," Johnson said.

For now, the system that corrects configuration values has been shut down, and Facebook is "exploring new designs for this configuration system following design patterns of other systems at Facebook that deal more gracefully with feedback loops and transient spikes," he said.
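Facebook has not said what the new design will look like. As a generic illustration only, the Python sketch below shows one common way to keep a repair mechanism from feeding on itself: treat a failed query as a failure rather than as an invalid value, leave the cache untouched, and wait longer before each retry. The `refresh` function, the `StoreUnavailable` exception, and the delay parameters are hypothetical names and values, not anything from Facebook's systems.

```python
import random


class StoreUnavailable(Exception):
    """Hypothetical error raised when the persistent store cannot be reached."""


def refresh(cache, key, fetch, is_valid, attempt, base_delay=0.1, max_delay=30.0):
    """Try to repair cache[key] from the persistent store.

    Returns (repaired, delay): whether the cache was updated, and how many
    seconds the caller should wait before trying again.
    """
    # Exponential backoff: the ceiling doubles with each failed attempt.
    ceiling = min(max_delay, base_delay * 2 ** attempt)
    try:
        fresh = fetch(key)
    except StoreUnavailable:
        # A failed query says nothing about the cached value, so keep it.
        # Random jitter spreads retries out instead of sending them in waves.
        return False, random.uniform(0, ceiling)
    if not is_valid(fresh):
        # The store itself holds a bad value; overwriting the cache would not help.
        return False, ceiling
    cache[key] = fresh
    return True, 0.0
```

A caller would pass its own `fetch` and `is_valid` functions and increase `attempt` after each failure, so that a struggling database sees fewer queries over time rather than more.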

All users should now have access, Facebook said.

 

 
 
 
 


 

 
Costa Rica's Daily English News Source
Apdo. 2133-1000, San José, Costa Rica
Tel: (506) 2231 3205 / (506) 8399 9642 
Fax: (506) 2232 6337

 
 

 

 

Insidecostarica is an independent news media portal featuring news of Costa Rica, Central America, Latin America and other wonderful and weird stuff. External links are provided for reference purposes. Insidecostarica.com is not responsible for the content of the external sites.

If you need more information or want to make recommendations, write to editor@insidecostarica.com
 
