Tech:Incidents: Difference between revisions

From Orain Meta
imported>Addshore
No edit summary
imported>Addshore
(oh please learn to type ....)
 
(29 intermediate revisions by 5 users not shown)
Line 1: Line 1:
This page lists all incidents on Orain.
This page lists all incidents on Orain. Newest incidents are listed at the top. Tracking started in April 2014. If anything is missing, please add it!


==Manual List==
=== May 2015 ===
* [[Incidents/2014-06-14]] - 20 minutes downtime due to bad DB list
* [[Tech:Incidents/2015-05-ddos]] - over a week of consistent downtime on IPv4 due to a DDoS attack
* [[Incidents/2014-04-Downtimes]] - 16 hours + 80 hours downtimes due to SSL cert and i18n / fpm issues


==All Subpages==
=== April 2015 ===
* [[Tech:Incidents/2015-04-04-prod7-resize]] - 15 mins downtime due to an oversight when resizing prod7
{{Special:PrefixIndex/Incidents/|hideredirects=1}}

=== March 2015 ===
* [[Tech:Incidents/2015-03-16]] - 1 hour of downtime due to Linux OOM killer on prod8 (this report includes information about some long-term issues too)

=== February 2015 ===
* [[Tech:Incidents/2015-02-23]] - 1 hour, 35 minutes downtime caused by runaway overload on MediaWiki application servers
* [[Tech:Incidents/2015-02-15]] - 2 hours of frequent 502 Bad Gateway errors due to prod9 HHVM
* [[Tech:Incidents/2015-02-07]] - 20 minutes of downtime due to corrupt extension testing and bad DB list

=== January 2015 ===
* [[Tech:Incidents/2015-01-27]] - 15-20 minutes of downtime due to extension test gone wrong
* [[Tech:Incidents/2015-01-23]] - 4 hours of downtime due to DNS resolution issues

=== December 2014 ===
* [[Tech:Incidents/2014-12-prod3]] - 2 hours of downtime from a prod3 failure; prod3 was removed from the cluster
* [[Tech:Incidents/2014-12-hhvm]] - Several days of slow loading times and 504s

=== July 2014 ===
* [[Tech:Incidents/2014-07-prod3Reinstall]] - 3 days of downtime after a forced reinstall on prod3
* [[Tech:Incidents/2014-07-07]] - 4 hours of issues with the databases (cause unknown)

=== June 2014 ===
* [[Tech:Incidents/2014-06-ExtensionIssues]] - 12 days of issues with some extensions due to an issue with the Echo extension
* [[Tech:Incidents/2014-06-14]] - 20 minutes downtime due to bad DB list
* [[Tech:Incidents/2014-06-12]] - 14 hours downtime due to prod4 suspension by RamNode

=== May 2014 ===
* [[Tech:Incidents/2014-05-12]] - 18 hours downtime due to prod4 suspension by RamNode

=== April 2014 ===
* [[Tech:Incidents/2014-04-Downtimes]] - 16 hours + 80 hours downtimes due to SSL cert and i18n / fpm issues

==All Subpages of Incidents==
{{Special:PrefixIndex/Tech:Incidents/|hideredirects=1}}

Latest revision as of 11:30, 30 May 2015

This page lists all incidents on Orain. Newest incidents are listed at the top. Tracking started in April 2014. If anything is missing, please add it!

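May 2015

  • Tech:Incidents/2015-05-ddos - over a week of consistent downtime on IPv4 due to a DDoS attack

April 2015

  • Tech:Incidents/2015-04-04-prod7-resize - 15 mins downtime due to an oversight when resizing prod7

March 2015

  • Tech:Incidents/2015-03-16 - 1 hour of downtime due to Linux OOM killer on prod8 (this report includes information about some long-term issues too)

February 2015

  • Tech:Incidents/2015-02-23 - 1 hour, 35 minutes downtime caused by runaway overload on MediaWiki application servers
  • Tech:Incidents/2015-02-15 - 2 hours of frequent 502 Bad Gateway errors due to prod9 HHVM
  • Tech:Incidents/2015-02-07 - 20 minutes of downtime due to corrupt extension testing and bad DB list

January 2015

  • Tech:Incidents/2015-01-27 - 15-20 minutes of downtime due to extension test gone wrong
  • Tech:Incidents/2015-01-23 - 4 hours of downtime due to DNS resolution issues

December 2014

  • Tech:Incidents/2014-12-prod3 - 2 hours of downtime from a prod3 failure; prod3 was removed from the cluster
  • Tech:Incidents/2014-12-hhvm - Several days of slow loading times and 504s

July 2014

  • Tech:Incidents/2014-07-prod3Reinstall - 3 days of downtime after a forced reinstall on prod3
  • Tech:Incidents/2014-07-07 - 4 hours of issues with the databases (cause unknown)

June 2014

  • Tech:Incidents/2014-06-ExtensionIssues - 12 days of issues with some extensions due to an issue with the Echo extension
  • Tech:Incidents/2014-06-14 - 20 minutes downtime due to bad DB list
  • Tech:Incidents/2014-06-12 - 14 hours downtime due to prod4 suspension by RamNode

May 2014

  • Tech:Incidents/2014-05-12 - 18 hours downtime due to prod4 suspension by RamNode

April 2014

  • Tech:Incidents/2014-04-Downtimes - 16 hours + 80 hours downtimes due to SSL cert and i18n / fpm issues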

All Subpages of Incidents