Tech:Incidents

This page lists all incidents on Orain. Newest incidents are listed at the top. Tracking started in April 2014. If anything is missing, please add it!
 
=== May 2015 ===
* [[Tech:Incidents/2015-05-ddos]] - over a week of consistent downtime on IPv4 due to a DDoS
 
=== April 2015 ===
* [[Tech:Incidents/2015-04-04-prod7-resize]] - 15 mins downtime due to an oversight when resizing prod7
 
=== March 2015 ===
* [[Tech:Incidents/2015-03-16]] - 1 hour of downtime due to Linux OOM killer on prod8 (this report includes information about some long-term issues too)
 
=== February 2015 ===
* [[Tech:Incidents/2015-02-23]] - 1 hour, 35 minutes downtime caused by runaway overload on MediaWiki application servers
* [[Tech:Incidents/2015-02-15]] - 2 hours of frequent 502 Bad Gateway errors due to prod9 HHVM
* [[Tech:Incidents/2015-02-07]] - 20 minutes of downtime due to corrupt extension testing and bad DB list
 
=== January 2015 ===
* [[Tech:Incidents/2015-01-27]] - 15-20 minutes of downtime due to extension test gone wrong
* [[Tech:Incidents/2015-01-23]] - 4 hours of downtime due to DNS solving issues
 
=== December 2014 ===
* [[Tech:Incidents/2014-12-prod3]] - 2 hours of downtime from a prod3 failure; prod3 was removed from the cluster
* [[Tech:Incidents/2014-12-hhvm]] - Several days of slow loading times and 504s
 
=== July 2014 ===
* [[Tech:Incidents/2014-07-prod3Reinstall]] - 3 days of downtime after a forced reinstall on prod3
* [[Tech:Incidents/2014-07-07]] - 4 hours of issues with the databases (cause not known)
 
=== June 2014 ===
* [[Tech:Incidents/2014-06-ExtensionIssues]] - 12 days of issues with some extensions due to an issue with the Echo extension
* [[Tech:Incidents/2014-06-14]] - 20 minutes downtime due to bad DB list
* [[Tech:Incidents/2014-06-12]] - 14 hours downtime due to prod4 suspension by RamNode
 
=== May 2014 ===
* [[Tech:Incidents/2014-05-12]] - 18 hours downtime due to prod4 suspension by RamNode
 
=== April 2014 ===
* [[Tech:Incidents/2014-04-Downtimes]] - 16 hours + 80 hours of downtime due to SSL cert and i18n / fpm issues
 
==All Subpages of Incidents==
{{Special:PrefixIndex/Tech:Incidents/|hideredirects=1}}