Google, Gmail and YouTube down: the company explains what caused the outages

Source: HW Upgrade added 21st Dec 2020

Google revisits last week's outages, explaining what caused the malfunctions

by Nino Grasso, published at 10:41 in the Web channel

Last week, users around the world experienced severe disruptions across all Google services. The company has now returned to the subject, officially stating that the outage was caused by a bug in the automated system that manages storage quotas for the Google User ID Service. The bug affected the Google account login system, preventing users from authenticating to all online and Cloud services.

In short, anyone who tried to access the company's services, such as Gmail, YouTube, Google Drive, Google Maps, Google Calendar and many others, on 14 December was greeted by an error screen for about an hour. During the outage, users could not send email through the Gmail mobile apps or receive email via POP3 on the desktop, while YouTube users received error messages (503) inviting them to try again (in vain).

Google describes the causes and impact of the outage

“On Monday 14 December 2020, from 03:46 to 04:33 US/Pacific, all credential issuance operations and account metadata lookups failed”, the company explained on an incident report page. “As a result, we were unable to verify user authentication and returned 5xx errors on all authenticated traffic. Most authenticated services were affected, with high error rates across the entire Google Cloud Platform and the Google Workspace APIs and consoles”.

At the root of the problem was reduced storage capacity for the central system behind Google's authentication, caused by a bug in the quota management system. Because of the error, authentication requests from users failed, with errors on “virtually” all attempts.

To authenticate users across its services, Google uses the Google User ID Service, a system that stores a unique identifier for each Google account and manages the authentication credentials for both OAuth tokens and cookies. It also stores user data in a distributed database that uses the Paxos protocol to coordinate updates. The service automatically rejects any access attempt that relies on outdated data, which is why all authentication attempts requiring Google OAuth access were rejected during the disruption.
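The rejection of outdated data described above can be pictured with a minimal Python sketch. It is purely illustrative and is not Google's implementation: the `AccountRecord` type, the `authenticate` function and the 60-second `MAX_STALENESS_SECONDS` threshold are all hypothetical names and values chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: how old account data may be before it is refused.
MAX_STALENESS_SECONDS = 60.0


@dataclass
class AccountRecord:
    """Illustrative account entry with the time it was last refreshed."""
    user_id: str
    last_updated: float  # seconds since the epoch


def authenticate(record: AccountRecord, now: Optional[float] = None) -> int:
    """Return an HTTP-like status: 200 if the data is fresh, 503 if it is stale."""
    current = time.time() if now is None else now
    if current - record.last_updated > MAX_STALENESS_SECONDS:
        # Refuse to authenticate against data that may be out of date.
        return 503
    return 200
```

Under these assumptions, a record that can no longer be refreshed starts failing every lookup once the threshold passes, which mirrors the blanket 5xx errors described in the report.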

The bug was introduced in October as part of a “migration of the User ID Service to a new quota management system”, Google explains. Parts of the previous system remained in place and incorrectly reported the service's storage usage as zero. Once the new limits were enforced last week, writes to the Paxos-based database were immediately blocked, and shortly afterwards read operations became outdated, which caused all access attempts on the platform to fail.
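The failure chain described above (a wrong usage figure, a reduced quota, then blocked writes) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Google's quota system: `QuotaManager`, `headroom` and `can_write` are hypothetical names, and the real mechanism is certainly more involved.

```python
from dataclasses import dataclass


@dataclass
class QuotaManager:
    """Illustrative quota manager that sizes the quota from reported usage."""
    reported_usage: int  # usage figure supplied by a monitoring component
    headroom: int = 10   # hypothetical safety margin added on top

    def allowed_quota(self) -> int:
        # The quota follows the reported figure, not the real one.
        return self.reported_usage + self.headroom


def can_write(actual_usage: int, manager: QuotaManager) -> bool:
    """Writes are accepted only while real usage fits within the quota."""
    return actual_usage <= manager.allowed_quota()


# Correct report: the quota covers the data actually stored.
healthy = QuotaManager(reported_usage=500)
# Faulty report of zero usage: the quota shrinks below what is stored.
faulty = QuotaManager(reported_usage=0)
```

With these illustrative numbers, `can_write(500, healthy)` is true while `can_write(500, faulty)` is false: once the quota is derived from a usage figure of zero, every write is refused even though nothing about the stored data has changed.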

Two outages on Gmail on the same day

On Gmail, problems continued for about seven hours after the authentication failure was resolved, with several users unable to deliver their emails. Google addresses this in a separate report: “The error message indicated that the email address did not exist and, therefore, some emails were never delivered. Affected users should have received a bounce email generated by an intermediary SMTP service. In some cases the SMTP error message was quoted in the email, depending on the configuration of the external SMTP clients connecting to the Google SMTP service”.

Google also explained the cause of the second disruption, which likewise stemmed from a migration in progress, in this case of the configuration system underlying Gmail's inbound SMTP service.