Server Admin Log

From Wikitech

2024-09-19

  • 21:15 Dreamy_Jazz: Evening UTC backport window done
  • 21:15 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Re-order arguments to DataAccess::addTrackingCategory, Add a "duplicate-ids" lint category (T200517) (duration: 29m 20s)
  • 21:02 dreamyjazz@deploy1003: dreamyjazz, cscott: Continuing with sync
  • 21:00 dreamyjazz@deploy1003: dreamyjazz, cscott: Backport for Re-order arguments to DataAccess::addTrackingCategory, Add a "duplicate-ids" lint category (T200517) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:45 dreamyjazz@deploy1003: Started scap sync-world: Backport for Re-order arguments to DataAccess::addTrackingCategory, Add a "duplicate-ids" lint category (T200517)
  • 20:36 Dreamy_Jazz: Running `foreachwikiindblist group1.dblist extensions/CheckUser/maintenance/populateCentralCheckUserIndexTables.php` on a tmux session for T375203
  • 20:34 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Add ::caller to queries in populateCentralCheckUserIndexTables.php (T375221) (duration: 06m 50s)
  • 20:29 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with sync
  • 20:29 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for Add ::caller to queries in populateCentralCheckUserIndexTables.php (T375221) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for Add ::caller to queries in populateCentralCheckUserIndexTables.php (T375221)
  • 20:27 aqu@deploy1003: Finished deploy [airflow-dags/analytics@e0d8d78]: Fix canary events generation schedule [airflow-dags/analytics@e0d8d78a] (duration: 00m 42s)
  • 20:26 aqu@deploy1003: Started deploy [airflow-dags/analytics@e0d8d78]: Fix canary events generation schedule [airflow-dags/analytics@e0d8d78a]
  • 20:12 toyofuku@deploy1003: Finished scap sync-world: Backport for Deploy new donate link location to pilot wikis (take 2) (T373585) (duration: 08m 35s)
  • 20:08 toyofuku@deploy1003: toyofuku: Continuing with sync
  • 20:06 toyofuku@deploy1003: toyofuku: Backport for Deploy new donate link location to pilot wikis (take 2) (T373585) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:04 toyofuku@deploy1003: Started scap sync-world: Backport for Deploy new donate link location to pilot wikis (take 2) (T373585)
  • 18:38 jnuche@deploy1003: Finished scap sync-world: Backport for DiscussionParser: Do not create User objects from subpages (T375212) (duration: 06m 38s)
  • 18:33 jnuche@deploy1003: jnuche, kharlan: Continuing with sync
  • 18:33 jnuche@deploy1003: jnuche, kharlan: Backport for DiscussionParser: Do not create User objects from subpages (T375212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:31 jnuche@deploy1003: Started scap sync-world: Backport for DiscussionParser: Do not create User objects from subpages (T375212)
  • 18:23 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:23 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove entries for sretest2002 - cmooney@cumin1002"
  • 18:23 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove entries for sretest2002 - cmooney@cumin1002"
  • 18:20 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:02 swfrench-wmf: scaling down mw-api-ext in eqiad after pre-switchover testing - T371273
  • 18:02 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 18:02 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 18:02 swfrench-wmf: scaling down mw-web in eqiad after pre-switchover testing - T371273
  • 18:01 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 18:01 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 18:01 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on vrts1001.eqiad.wmnet with reason: Migration
  • 18:01 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on vrts1001.eqiad.wmnet with reason: Migration
  • 17:53 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-web-ro,name=codfw [reason: Reverting pre-switchover capacity validation - T371273]
  • 17:51 brett@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet
  • 17:49 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-ro,name=codfw [reason: Reverting pre-switchover capacity validation - T371273]
  • 17:47 dancy@deploy1003: Installation of scap version "4.104.0" completed for 211 hosts
  • 17:46 swfrench@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=mw-api-int-ro,name=codfw [reason: Reverting pre-switchover capacity validation - T371273]
  • 17:43 dancy@deploy1003: Installing scap version "4.104.0" for 211 hosts
  • 17:38 brett@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet
  • 17:27 Dreamy_Jazz: Finished running script for T375203 on `group0`
  • 17:21 mforns@deploy1003: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
  • 17:21 mforns@deploy1003: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
  • 17:17 swfrench@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=mw-web-ro,name=codfw [reason: Pre-switchover capacity validation - T371273]
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69376 and previous config saved to /var/cache/conftool/dbconfig/20240919-171122-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69375 and previous config saved to /var/cache/conftool/dbconfig/20240919-171117-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2219 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69374 and previous config saved to /var/cache/conftool/dbconfig/20240919-171112-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69373 and previous config saved to /var/cache/conftool/dbconfig/20240919-171107-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2182 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69372 and previous config saved to /var/cache/conftool/dbconfig/20240919-171101-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69371 and previous config saved to /var/cache/conftool/dbconfig/20240919-171057-arnaudb.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69370 and previous config saved to /var/cache/conftool/dbconfig/20240919-171052-arnaudb.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69369 and previous config saved to /var/cache/conftool/dbconfig/20240919-171047-arnaudb.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69368 and previous config saved to /var/cache/conftool/dbconfig/20240919-171042-arnaudb.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: T373105', diff saved to https://phabricator.wikimedia.org/P69367 and previous config saved to /var/cache/conftool/dbconfig/20240919-171037-arnaudb.json
  • 17:08 swfrench@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=mw-api-ext-ro,name=codfw [reason: Pre-switchover capacity validation - T371273]
  • 17:02 swfrench@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=mw-api-int-ro,name=codfw [reason: Pre-switchover capacity validation - T371273]
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69366 and previous config saved to /var/cache/conftool/dbconfig/20240919-165617-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69365 and previous config saved to /var/cache/conftool/dbconfig/20240919-165611-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2219 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69364 and previous config saved to /var/cache/conftool/dbconfig/20240919-165606-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69363 and previous config saved to /var/cache/conftool/dbconfig/20240919-165602-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2182 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69362 and previous config saved to /var/cache/conftool/dbconfig/20240919-165556-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69361 and previous config saved to /var/cache/conftool/dbconfig/20240919-165551-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69360 and previous config saved to /var/cache/conftool/dbconfig/20240919-165546-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69359 and previous config saved to /var/cache/conftool/dbconfig/20240919-165542-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69358 and previous config saved to /var/cache/conftool/dbconfig/20240919-165537-arnaudb.json
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: T373105', diff saved to https://phabricator.wikimedia.org/P69357 and previous config saved to /var/cache/conftool/dbconfig/20240919-165531-arnaudb.json
  • 16:52 sukhe: ulsfo was depooled between 15:55 and 16:12 for sre.dns.admin test, current state is pooled
  • 16:50 swfrench-wmf: scaling up mw-web in eqiad for pre-switchover testing - T371273
  • 16:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 16:50 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 16:48 swfrench-wmf: scaling up mw-api-ext in eqiad for pre-switchover testing - T371273
  • 16:48 vgutierrez: updated to purged 0.24 in codfw, ulsfo and eqsin - T334078
  • 16:47 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 16:47 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 16:44 vgutierrez: uploaded purged 0.24 to apt.wm.o (bullseye-wikimedia)
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69356 and previous config saved to /var/cache/conftool/dbconfig/20240919-164111-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69355 and previous config saved to /var/cache/conftool/dbconfig/20240919-164106-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2219 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69354 and previous config saved to /var/cache/conftool/dbconfig/20240919-164101-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69353 and previous config saved to /var/cache/conftool/dbconfig/20240919-164056-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2182 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69352 and previous config saved to /var/cache/conftool/dbconfig/20240919-164051-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69351 and previous config saved to /var/cache/conftool/dbconfig/20240919-164046-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69350 and previous config saved to /var/cache/conftool/dbconfig/20240919-164041-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69349 and previous config saved to /var/cache/conftool/dbconfig/20240919-164036-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69348 and previous config saved to /var/cache/conftool/dbconfig/20240919-164031-arnaudb.json
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 50%: T373105', diff saved to https://phabricator.wikimedia.org/P69347 and previous config saved to /var/cache/conftool/dbconfig/20240919-164026-arnaudb.json
  • 16:38 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2020.codfw.wmnet
  • 16:38 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2020.codfw.wmnet
  • 16:38 topranks: disable LAG interface from asw-d-codfw to ssw1-dX-codfw
  • 16:38 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2019.codfw.wmnet
  • 16:38 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2019.codfw.wmnet
  • 16:37 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2018.codfw.wmnet
  • 16:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2018.codfw.wmnet
  • 16:37 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2282.codfw.wmnet
  • 16:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2282.codfw.wmnet
  • 16:37 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2053.codfw.wmnet
  • 16:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2053.codfw.wmnet
  • 16:37 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2052.codfw.wmnet
  • 16:36 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2052.codfw.wmnet
  • 16:34 effie: restarting confd on all servers on: codfw, ulsfo, eqsin - T373105
  • 16:34 vgutierrez: testing purged 0.24 in cp2037 - T334078
  • 16:27 Emperor: restart swift-object-replicator on thanos-be2002
  • 16:27 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=(cp2041|cp2042).codfw.wmnet [reason: T373105 is done]
  • 16:26 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2042.codfw.wmnet
  • 16:26 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2042.codfw.wmnet
  • 16:26 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2041.codfw.wmnet
  • 16:26 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2041.codfw.wmnet
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69346 and previous config saved to /var/cache/conftool/dbconfig/20240919-162606-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69345 and previous config saved to /var/cache/conftool/dbconfig/20240919-162601-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2219 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69344 and previous config saved to /var/cache/conftool/dbconfig/20240919-162556-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69343 and previous config saved to /var/cache/conftool/dbconfig/20240919-162551-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2182 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69342 and previous config saved to /var/cache/conftool/dbconfig/20240919-162546-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69341 and previous config saved to /var/cache/conftool/dbconfig/20240919-162541-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69340 and previous config saved to /var/cache/conftool/dbconfig/20240919-162536-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69339 and previous config saved to /var/cache/conftool/dbconfig/20240919-162531-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69338 and previous config saved to /var/cache/conftool/dbconfig/20240919-162526-arnaudb.json
  • 16:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2131 (re)pooling @ 25%: T373105', diff saved to https://phabricator.wikimedia.org/P69337 and previous config saved to /var/cache/conftool/dbconfig/20240919-162521-arnaudb.json
  • 16:17 topranks: migrating server uplinks in codfw rack D8 to new top-of-rack switch T373105
  • 16:16 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on 25 hosts with reason: Move server uplinks in codfw rack D8
  • 16:16 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on 25 hosts with reason: Move server uplinks in codfw rack D8
  • 16:13 Dreamy_Jazz: Running `foreachwikiindblist group0.dblist extensions/CheckUser/maintenance/populateCentralCheckUserIndexTables.php` on a tmux session for T375203
  • 16:12 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site ulsfo [reason: testing done, no task ID specified]
  • 16:12 sukhe@cumin1002: START - Cookbook sre.dns.admin DNS admin: pool site ulsfo [reason: testing done, no task ID specified]
  • 16:11 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Call require_once on CheckUserQueryInterface in population script (T375203) (duration: 06m 47s)
  • 16:11 topranks: migrating server uplinks in codfw rack D7 to new top-of-rack switch T373105
  • 16:09 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 22 hosts with reason: Move server uplinks in codfw rack D7
  • 16:08 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 22 hosts with reason: Move server uplinks in codfw rack D7
  • 16:07 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 16:06 dreamyjazz@deploy1003: dreamyjazz: Backport for Call require_once on CheckUserQueryInterface in population script (T375203) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:04 dreamyjazz@deploy1003: Started scap sync-world: Backport for Call require_once on CheckUserQueryInterface in population script (T375203)
  • 16:04 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqiad
  • 15:59 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device cr2-eqiad
  • 15:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cloudsw1-d5-eqiad
  • 15:56 mforns@deploy1003: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
  • 15:56 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device cloudsw1-d5-eqiad
  • 15:55 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool site ulsfo [reason: testing cookbook for actual change, no task ID specified]
  • 15:55 sukhe@cumin1002: START - Cookbook sre.dns.admin DNS admin: depool site ulsfo [reason: testing cookbook for actual change, no task ID specified]
  • 15:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr1-eqiad
  • 15:51 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device cr1-eqiad
  • 15:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cloudsw1-c8-eqiad
  • 15:47 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device cloudsw1-c8-eqiad
  • 15:46 mforns@deploy1003: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
  • 15:45 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 15:45 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 15:45 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 15:45 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 15:44 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 15:44 elukey@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 15:41 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2020.codfw.wmnet
  • 15:40 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2020.codfw.wmnet
  • 15:40 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2019.codfw.wmnet
  • 15:39 mforns@deploy1003: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
  • 15:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 10 hosts with reason: network maintenance T373105
  • 15:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 10 hosts with reason: network maintenance T373105
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2131 db2152 db2173 db2174 db2181 db2182 db2195 db2219 db2220 es2040 - T373105', diff saved to https://phabricator.wikimedia.org/P69336 and previous config saved to /var/cache/conftool/dbconfig/20240919-153815-arnaudb.json
  • 15:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2019.codfw.wmnet
  • 15:37 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2018.codfw.wmnet
  • 15:36 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2018.codfw.wmnet
  • 15:36 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2282.codfw.wmnet
  • 15:33 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2282.codfw.wmnet
  • 15:33 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2053.codfw.wmnet
  • 15:32 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2053.codfw.wmnet
  • 15:32 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2052.codfw.wmnet
  • 15:31 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2052.codfw.wmnet
  • 15:29 mforns@deploy1003: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
  • 15:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-eqiad
  • 15:24 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 15:22 sukhe@cumin1002: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool site ulsfo [reason: testing cookbook for actual change, no task ID specified]
  • 15:22 sukhe@cumin1002: START - Cookbook sre.dns.admin DNS admin: depool site ulsfo [reason: testing cookbook for actual change, no task ID specified]
  • 15:09 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=(cp2041|cp2042).codfw.wmnet [reason: depool for T373105]
  • 15:08 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Add CheckUserQueryInterface to autoload classes (T375203), Add CheckUserQueryInterface to autoload classes (T375203) (duration: 07m 18s)
  • 15:04 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 15:04 dreamyjazz@deploy1003: dreamyjazz: Backport for Add CheckUserQueryInterface to autoload classes (T375203), Add CheckUserQueryInterface to autoload classes (T375203) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:02 elukey@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f1-eqiad
  • 15:02 elukey@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 15:01 dreamyjazz@deploy1003: Started scap sync-world: Backport for Add CheckUserQueryInterface to autoload classes (T375203), Add CheckUserQueryInterface to autoload classes (T375203)
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 depool T373579', diff saved to https://phabricator.wikimedia.org/P69333 and previous config saved to /var/cache/conftool/dbconfig/20240919-145626-arnaudb.json
  • 14:56 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:55 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2129.codfw.wmnet
  • 14:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2129.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 14:53 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2129.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 14:50 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 14:46 moritzm: installing expat security updates on Bookworm
  • 14:45 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2129.codfw.wmnet
  • 14:40 tappof: manual replace of mtail binary on centrallog2002 (3.0.0-rc50 to 3.0.8) T375085
  • 14:25 sukhe: sudo cumin -b11 "A:cp" 'run-puppet-agent --enable "merging CR 1054918"'
  • 14:24 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:24 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Revert^2 "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315) (duration: 18m 31s)
  • 14:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 depool T373579', diff saved to https://phabricator.wikimedia.org/P69332 and previous config saved to /var/cache/conftool/dbconfig/20240919-142046-arnaudb.json
  • 14:19 lucaswerkmeister-wmde@deploy1003: seanleong-wmde, lucaswerkmeister-wmde: Continuing with sync
  • 14:11 lucaswerkmeister-wmde@deploy1003: seanleong-wmde, lucaswerkmeister-wmde: Backport for Revert^2 "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Revert^2 "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315)
  • 14:03 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Bring back quality colors before dark mode fixes (T375114) (duration: 12m 39s)
  • 13:59 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1054918"'
  • 13:58 elukey@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f1-eqiad
  • 13:58 elukey@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 13:58 lucaswerkmeister-wmde@deploy1003: soda, lucaswerkmeister-wmde: Continuing with sync
  • 13:57 elukey@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f1-eqiad
  • 13:57 elukey@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 13:57 lucaswerkmeister-wmde@deploy1003: soda, lucaswerkmeister-wmde: Backport for Bring back quality colors before dark mode fixes (T375114) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:50 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Bring back quality colors before dark mode fixes (T375114)
  • 13:42 XioNoX: update pfw codfw syslog target - T374658
  • 13:28 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 13:28 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:34 ayounsi@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f1-eqiad
  • 11:34 ayounsi@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 11:31 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f1-eqiad
  • 11:31 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f1-eqiad
  • 10:55 xSavitar: T375078 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=bhwiki --logwiki=metawiki 'SikAnderAhmedas' 'Renamed user ab8e0a47aa0e5d456f28ee3977f8c682'
  • 10:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:30 elukey@deploy1003: Finished scap sync-world: Remove network policies for old poolcounter nodes. (duration: 08m 55s)
  • 10:26 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 10:26 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 10:26 XioNoX: enable gNMI on cloudsw
  • 10:26 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 10:26 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 10:23 moritzm: rebalance ganeti group C following the various switch maintenances T370630
  • 10:21 elukey@deploy1003: Started scap sync-world: Remove network policies for old poolcounter nodes.
  • 09:48 volans: uploaded python3-wmflib_1.2.7 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
  • 09:44 vgutierrez: testing purged 0.24 in cp4038 - T334078
  • 09:44 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 09:44 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 09:42 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 09:42 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 09:42 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 09:41 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 09:23 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm
  • 09:23 btullis@cumin1002: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
  • 09:10 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 09:10 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm
  • 08:51 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 08:50 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1246.eqiad.wmnet with OS bookworm
  • 08:45 hashar: Restarting CI Jenkins with Java 17 # T359795
  • 08:31 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 08:31 _joe_: deployed conftool 3.2.4 T375059 T373449
  • 08:30 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 08:29 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 08:16 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.23 refs T373642
  • 08:04 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 08:04 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2018.codfw.wmnet
  • 07:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2018.codfw.wmnet
  • 07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16509
  • 07:38 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 16509
  • 07:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 100%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69318 and previous config saved to /var/cache/conftool/dbconfig/20240919-073543-arnaudb.json
  • 07:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 75%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69317 and previous config saved to /var/cache/conftool/dbconfig/20240919-072037-arnaudb.json
  • 07:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 16509
  • 07:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 50%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69316 and previous config saved to /var/cache/conftool/dbconfig/20240919-070532-arnaudb.json
  • 06:53 moritzm: adding Tiziano to pwstore
  • 06:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 25%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69315 and previous config saved to /var/cache/conftool/dbconfig/20240919-065026-arnaudb.json
  • 06:47 moritzm: cleanup some old Bacula restores (4G) on seaborgium
  • 06:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 15%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69314 and previous config saved to /var/cache/conftool/dbconfig/20240919-063521-arnaudb.json
  • 06:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 10%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69313 and previous config saved to /var/cache/conftool/dbconfig/20240919-062016-arnaudb.json
  • 06:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2229 (re)pooling @ 5%: post maintenance', diff saved to https://phabricator.wikimedia.org/P69312 and previous config saved to /var/cache/conftool/dbconfig/20240919-060510-arnaudb.json
  • 05:01 eileen: civicrm upgraded from ac29ff45 to 8af371aa
  • 01:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:25 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns name for frack new switches - pt1979@cumin2002"
  • 01:24 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns name for frack new switches - pt1979@cumin2002"
  • 01:21 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:46 sukhe: sudo cumin 'puppetserver1003* or puppetserver2003*' 'systemctl start sync-puppet-volatile.service'
  • 00:45 sukhe: sukhe@puppetserver1002:~$ sudo systemctl start sync-puppet-volatile.service
  • 00:41 swfrench-wmf: force-reboot of puppetserver1001 via ipmitool (unresponsive for over 30m)

2024-09-18

  • 22:43 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
  • 22:43 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
  • 22:19 jynus: inserting without binlog missing heartbeat record on x1 codfw hosts
  • 22:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from eqiad to codfw
  • 21:55 mutante: seaborgium - apt-get clean (disk space before: 98% used, now: 76% used, was alerting)
  • 20:59 ladsgroup@cumin1002: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw
  • 20:45 toyofuku@deploy1003: Finished scap sync-world: Backport for Enable dark mode for all logged in users on all projects (T370099), Deploy Vector 2022 on several Wikimedia wikis (T374255), Limit quick surveys to wikis with messages defined (T374654) (duration: 12m 52s)
  • 20:40 toyofuku@deploy1003: toyofuku, jdlrobson: Continuing with sync
  • 20:35 toyofuku@deploy1003: toyofuku, jdlrobson: Backport for Enable dark mode for all logged in users on all projects (T370099), Deploy Vector 2022 on several Wikimedia wikis (T374255), Limit quick surveys to wikis with messages defined (T374654) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:32 toyofuku@deploy1003: Started scap sync-world: Backport for Enable dark mode for all logged in users on all projects (T370099), Deploy Vector 2022 on several Wikimedia wikis (T374255), Limit quick surveys to wikis with messages defined (T374654)
  • 18:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 100%: T375050', diff saved to https://phabricator.wikimedia.org/P69310 and previous config saved to /var/cache/conftool/dbconfig/20240918-180800-arnaudb.json
  • 17:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert^2 "Create group for assigning checkuser-temporary-account right" (T369187) (duration: 08m 18s)
  • 17:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 75%: T375050', diff saved to https://phabricator.wikimedia.org/P69309 and previous config saved to /var/cache/conftool/dbconfig/20240918-175255-arnaudb.json
  • 17:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 17:49 dreamyjazz@deploy1003: dreamyjazz: Backport for Revert^2 "Create group for assigning checkuser-temporary-account right" (T369187) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:47 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert^2 "Create group for assigning checkuser-temporary-account right" (T369187)
  • 17:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 50%: T375050', diff saved to https://phabricator.wikimedia.org/P69308 and previous config saved to /var/cache/conftool/dbconfig/20240918-173749-arnaudb.json
  • 17:29 sukhe: re-enable puppet on A:cp to finish rolling out T347114
  • 17:29 sukhe: re-enable puppet on A:cp to finish rolling out T368755
  • 17:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 25%: T375050', diff saved to https://phabricator.wikimedia.org/P69306 and previous config saved to /var/cache/conftool/dbconfig/20240918-172243-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69305 and previous config saved to /var/cache/conftool/dbconfig/20240918-170918-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69304 and previous config saved to /var/cache/conftool/dbconfig/20240918-170913-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2215 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69303 and previous config saved to /var/cache/conftool/dbconfig/20240918-170909-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69302 and previous config saved to /var/cache/conftool/dbconfig/20240918-170903-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69301 and previous config saved to /var/cache/conftool/dbconfig/20240918-170858-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69300 and previous config saved to /var/cache/conftool/dbconfig/20240918-170849-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2140 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69299 and previous config saved to /var/cache/conftool/dbconfig/20240918-170843-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69298 and previous config saved to /var/cache/conftool/dbconfig/20240918-170838-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: T373104', diff saved to https://phabricator.wikimedia.org/P69297 and previous config saved to /var/cache/conftool/dbconfig/20240918-170833-arnaudb.json
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 15%: T375050', diff saved to https://phabricator.wikimedia.org/P69296 and previous config saved to /var/cache/conftool/dbconfig/20240918-170738-arnaudb.json
  • 17:02 mutante: copied vopsbot.db from alert1001 to alert1002; restarted vopsbot
  • 16:57 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:56 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:56 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:56 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69294 and previous config saved to /var/cache/conftool/dbconfig/20240918-165412-arnaudb.json
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69293 and previous config saved to /var/cache/conftool/dbconfig/20240918-165407-arnaudb.json
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2215 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69292 and previous config saved to /var/cache/conftool/dbconfig/20240918-165403-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69291 and previous config saved to /var/cache/conftool/dbconfig/20240918-165357-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69290 and previous config saved to /var/cache/conftool/dbconfig/20240918-165352-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69289 and previous config saved to /var/cache/conftool/dbconfig/20240918-165344-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2140 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69288 and previous config saved to /var/cache/conftool/dbconfig/20240918-165337-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69287 and previous config saved to /var/cache/conftool/dbconfig/20240918-165332-arnaudb.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: T373104', diff saved to https://phabricator.wikimedia.org/P69286 and previous config saved to /var/cache/conftool/dbconfig/20240918-165327-arnaudb.json
  • 16:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 10%: T375050', diff saved to https://phabricator.wikimedia.org/P69285 and previous config saved to /var/cache/conftool/dbconfig/20240918-165232-arnaudb.json
  • 16:42 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1073453"': T347114
  • 16:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Autopromote users into checkuser-temporary-account-viewer (T369187 T327913) (duration: 14m 06s)
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69284 and previous config saved to /var/cache/conftool/dbconfig/20240918-163907-arnaudb.json
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69283 and previous config saved to /var/cache/conftool/dbconfig/20240918-163902-arnaudb.json
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2215 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69282 and previous config saved to /var/cache/conftool/dbconfig/20240918-163857-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69281 and previous config saved to /var/cache/conftool/dbconfig/20240918-163852-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69280 and previous config saved to /var/cache/conftool/dbconfig/20240918-163847-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69279 and previous config saved to /var/cache/conftool/dbconfig/20240918-163837-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2140 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69278 and previous config saved to /var/cache/conftool/dbconfig/20240918-163832-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69277 and previous config saved to /var/cache/conftool/dbconfig/20240918-163827-arnaudb.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: T373104', diff saved to https://phabricator.wikimedia.org/P69276 and previous config saved to /var/cache/conftool/dbconfig/20240918-163822-arnaudb.json
  • 16:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 5%: T375050', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240918-163721-arnaudb.json
  • 16:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 T375050', diff saved to https://phabricator.wikimedia.org/P69275 and previous config saved to /var/cache/conftool/dbconfig/20240918-163637-arnaudb.json
  • 16:35 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2218 to s7 primary T375050', diff saved to https://phabricator.wikimedia.org/P69274 and previous config saved to /var/cache/conftool/dbconfig/20240918-163404-arnaudb.json
  • 16:33 arnaudb: Starting s7 codfw failover from db2220 to db2218 - T375050
  • 16:30 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2017.codfw.wmnet
  • 16:30 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2017.codfw.wmnet
  • 16:29 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2016.codfw.wmnet
  • 16:29 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2016.codfw.wmnet
  • 16:29 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2451.codfw.wmnet
  • 16:29 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2451.codfw.wmnet
  • 16:29 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2450.codfw.wmnet
  • 16:29 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2450.codfw.wmnet
  • 16:28 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2449.codfw.wmnet
  • 16:28 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2449.codfw.wmnet
  • 16:28 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2448.codfw.wmnet
  • 16:28 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2448.codfw.wmnet
  • 16:28 dreamyjazz@deploy1003: dreamyjazz: Backport for Autopromote users into checkuser-temporary-account-viewer (T369187 T327913) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2218 from API/vslow/dump T375050', diff saved to https://phabricator.wikimedia.org/P69273 and previous config saved to /var/cache/conftool/dbconfig/20240918-162822-arnaudb.json
  • 16:28 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2447.codfw.wmnet
  • 16:28 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2447.codfw.wmnet
  • 16:28 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2446.codfw.wmnet
  • 16:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2446.codfw.wmnet
  • 16:27 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2445.codfw.wmnet
  • 16:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2445.codfw.wmnet
  • 16:27 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2444.codfw.wmnet
  • 16:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2444.codfw.wmnet
  • 16:27 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2051.codfw.wmnet
  • 16:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2051.codfw.wmnet
  • 16:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2218 with weight 0 T375050', diff saved to https://phabricator.wikimedia.org/P69272 and previous config saved to /var/cache/conftool/dbconfig/20240918-162703-arnaudb.json
  • 16:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2050.codfw.wmnet
  • 16:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T375050
  • 16:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2050.codfw.wmnet
  • 16:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2049.codfw.wmnet
  • 16:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2049.codfw.wmnet
  • 16:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T375050
  • 16:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2048.codfw.wmnet
  • 16:26 dreamyjazz@deploy1003: Started scap sync-world: Backport for Autopromote users into checkuser-temporary-account-viewer (T369187 T327913)
  • 16:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2048.codfw.wmnet
  • 16:25 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2024.codfw.wmnet
  • 16:25 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2024.codfw.wmnet
  • 16:25 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2014.codfw.wmnet
  • 16:25 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2014.codfw.wmnet
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69271 and previous config saved to /var/cache/conftool/dbconfig/20240918-162406-arnaudb.json
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2217 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69270 and previous config saved to /var/cache/conftool/dbconfig/20240918-162401-arnaudb.json
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2216 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69269 and previous config saved to /var/cache/conftool/dbconfig/20240918-162357-arnaudb.json
  • 16:23 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2013.codfw.wmnet
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2215 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69268 and previous config saved to /var/cache/conftool/dbconfig/20240918-162351-arnaudb.json
  • 16:23 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2013.codfw.wmnet
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69267 and previous config saved to /var/cache/conftool/dbconfig/20240918-162346-arnaudb.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69266 and previous config saved to /var/cache/conftool/dbconfig/20240918-162341-arnaudb.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2172 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69265 and previous config saved to /var/cache/conftool/dbconfig/20240918-162331-arnaudb.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2140 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69264 and previous config saved to /var/cache/conftool/dbconfig/20240918-162326-arnaudb.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69263 and previous config saved to /var/cache/conftool/dbconfig/20240918-162321-arnaudb.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: T373104', diff saved to https://phabricator.wikimedia.org/P69262 and previous config saved to /var/cache/conftool/dbconfig/20240918-162316-arnaudb.json
  • 16:07 topranks: moving servers in codfw rack D6 from asw-d6-codfw to lsw1-d6-codfw T373104
  • 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on 24 hosts with reason: Move servers in codfw rack D6
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on 24 hosts with reason: Move servers in codfw rack D6
  • 16:01 hashar@deploy1003: Finished scap sync-world: Update termbox (mul support) - T373088 (duration: 06m 48s)
  • 16:00 topranks: moving servers in codfw rack D5 from asw-d5-codfw to lsw1-d5-codfw T373104
  • 15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 23 hosts with reason: Move servers in codfw rack D5
  • 15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 23 hosts with reason: Move servers in codfw rack D5
  • 15:55 elukey: deploy python3-setuptools upgrades fleetwide
  • 15:54 hashar@deploy1003: Started scap sync-world: Update termbox (mul support) - T373088
  • 15:40 sukhe: finished updating TLS1.3 ciphers for cp hosts: T365327
  • 15:40 bking@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 12 hosts with reason: network maintenance T373104
  • 15:40 bking@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 12 hosts with reason: network maintenance T373104
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2129 db2130 db2140 db2172 db2187 db2193 db2194 db2215 db2216 db2217 db2218 - T373104', diff saved to https://phabricator.wikimedia.org/P69261 and previous config saved to /var/cache/conftool/dbconfig/20240918-153922-arnaudb.json
  • 15:34 bking@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:34 bking@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:32 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2017.codfw.wmnet
  • 15:31 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2017.codfw.wmnet
  • 15:31 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2016.codfw.wmnet
  • 15:30 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2016.codfw.wmnet
  • 15:30 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2451.codfw.wmnet
  • 15:30 bking@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:30 bking@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:30 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2451.codfw.wmnet
  • 15:30 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2450.codfw.wmnet
  • 15:29 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2450.codfw.wmnet
  • 15:29 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2449.codfw.wmnet
  • 15:28 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2449.codfw.wmnet
  • 15:28 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2448.codfw.wmnet
  • 15:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2448.codfw.wmnet
  • 15:27 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2447.codfw.wmnet
  • 15:27 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2447.codfw.wmnet
  • 15:27 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2446.codfw.wmnet
  • 15:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2446.codfw.wmnet
  • 15:26 bking@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:25 bking@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:22 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:22 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:21 denisse: Enable metamonitoring for the alert1002, and alert2002 hosts - T372418
  • 15:20 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Migration
  • 15:20 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Migration
  • 15:20 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2445.codfw.wmnet
  • 15:20 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for vrts2002.codfw.wmnet
  • 15:20 aokoth@cumin1002: START - Cookbook sre.hosts.remove-downtime for vrts2002.codfw.wmnet
  • 15:19 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2445.codfw.wmnet
  • 15:19 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2444.codfw.wmnet
  • 15:19 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:19 sukhe: rolling out TLS1.3 cipher suite priority order change CR 1073798 to all cp hosts
  • 15:19 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2444.codfw.wmnet
  • 15:19 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:19 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2051.codfw.wmnet
  • 15:19 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on vrts2002.codfw.wmnet with reason: Migration
  • 15:18 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on vrts2002.codfw.wmnet with reason: Migration
  • 15:18 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2051.codfw.wmnet
  • 15:18 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2050.codfw.wmnet
  • 15:17 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2050.codfw.wmnet
  • 15:17 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2049.codfw.wmnet
  • 15:17 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2049.codfw.wmnet
  • 15:16 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2048.codfw.wmnet
  • 15:16 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2048.codfw.wmnet
  • 15:16 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2024.codfw.wmnet
  • 15:15 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2024.codfw.wmnet
  • 15:15 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2014.codfw.wmnet
  • 15:14 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2014.codfw.wmnet
  • 15:14 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2013.codfw.wmnet
  • 15:14 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2013.codfw.wmnet
  • 15:08 denisse: Resolve alerts DNS queries to alert1002 - T372418
  • 15:03 _joe_: uploading conftool 3.2.4 to apt T375059
  • 15:02 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1073798"': T365327
  • 15:01 denisse: Make alert1002 the active host - T372418
  • 15:00 denisse: Disable meta-monitoring for the alert hosts - T372418
  • 14:55 elukey: restart poolcounter on poolcounter100[4,5] (depooled nodes) to clear old/stale TCP conns for port 7531
  • 14:54 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:54 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:54 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:54 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 55655
  • 14:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 55655
  • 14:50 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:47 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:46 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:42 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:40 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough
  • 14:36 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:26 sukhe@cumin1002: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough
  • 14:25 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:24 sukhe: run puppet agent on A:wikidough
  • 14:23 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:19 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:19 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:07 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:07 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 13:53 elukey@deploy1003: Finished scap sync-world: Backport for Swap poolcounter1005 with poolcounter1007 (T332015) (duration: 07m 23s)
  • 13:49 elukey@deploy1003: elukey: Continuing with sync
  • 13:48 elukey@deploy1003: elukey: Backport for Swap poolcounter1005 with poolcounter1007 (T332015) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:46 elukey@deploy1003: Started scap sync-world: Backport for Swap poolcounter1005 with poolcounter1007 (T332015)
  • 13:38 elukey@deploy1003: Finished scap sync-world: Backport for Swap poolcounter1004 with poolcounter1006 (T332015) (duration: 07m 15s)
  • 13:34 elukey@deploy1003: elukey: Continuing with sync
  • 13:33 elukey@deploy1003: elukey: Backport for Swap poolcounter1004 with poolcounter1006 (T332015) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:31 elukey@deploy1003: Started scap sync-world: Backport for Swap poolcounter1004 with poolcounter1006 (T332015)
  • 13:25 Dreamy_Jazz: Afternoon UTC backport window done
  • 13:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for GrowthExperiments: enable Community Updates module in testwiki (T374577), Check that throttling exceptions use valid public IP addresses (T374980), Hide temp account IP address viewing right from non-temp account wikis (T369187), Lift IP cap on 2024-10-07/08 for edit-a-thon (T374964)
  • 13:18 elukey: restart puppetserver on puppetserver1002 - trashing - T373527
  • 13:15 dreamyjazz@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, cscott, hnowlan, dreamyjazz: Continuing with sync
  • 13:11 dreamyjazz@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, cscott, hnowlan, dreamyjazz: Backport for GrowthExperiments: enable Community Updates module in testwiki (T374577), Check that throttling exceptions use valid public IP addresses (T374980), Hide temp account IP address viewing right from non-temp account wikis (T369187), Lift IP cap on
  • 13:09 dreamyjazz@deploy1003: Started scap sync-world: Backport for GrowthExperiments: enable Community Updates module in testwiki (T374577), Check that throttling exceptions use valid public IP addresses (T374980), Hide temp account IP address viewing right from non-temp account wikis (T369187), Lift IP cap on 2024-10-07/08 for edit-a-thon (T374964)
  • 12:46 vgutierrez: rolling upgrade to purged 0.23 in A:cp-ulsfo - T334078
  • 12:44 vgutierrez: uploaded purged 0.23 to bullseye-wikimedia (apt.wm.o) - T334078
  • 12:33 moritzm: uploaded cas 7.0.4.1+wmf12u3 T367487
  • 12:28 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.23 refs T373642
  • 12:21 tchin@deploy1003: Finished deploy [airflow-dags/analytics_test@e6cc31a]: Regular analytics weekly train (duration: 00m 20s)
  • 12:21 tchin@deploy1003: Started deploy [airflow-dags/analytics_test@e6cc31a]: Regular analytics weekly train
  • 12:18 tchin@deploy1003: Finished deploy [airflow-dags/analytics@e6cc31a]: Regular analytics weekly train (duration: 01m 18s)
  • 12:18 tchin@deploy1003: Started deploy [airflow-dags/analytics@e6cc31a]: Regular analytics weekly train
  • 12:12 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Allow IP ranges in CentralAuth::getInstanceByName() (T375061), Allow IP ranges in CentralAuth::getInstanceByName() (T375061) (duration: 07m 00s)
  • 12:10 tchin: Deployed refinery using scap, then deployed onto hdfs
  • 12:08 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye
  • 12:07 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 12:07 dreamyjazz@deploy1003: dreamyjazz: Backport for Allow IP ranges in CentralAuth::getInstanceByName() (T375061), Allow IP ranges in CentralAuth::getInstanceByName() (T375061) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for Allow IP ranges in CentralAuth::getInstanceByName() (T375061), Allow IP ranges in CentralAuth::getInstanceByName() (T375061)
  • 11:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Hooks: Re-order checks to verify that request user is same as Special:Contributions user (T375061) (duration: 09m 03s)
  • 11:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 11:48 dreamyjazz@deploy1003: dreamyjazz: Backport for Hooks: Re-order checks to verify that request user is same as Special:Contributions user (T375061) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:46 dreamyjazz@deploy1003: Started scap sync-world: Backport for Hooks: Re-order checks to verify that request user is same as Special:Contributions user (T375061)
  • 11:43 XioNoX: update pfw3-codfw dhcp-relay target 0 T375011
  • 11:43 tchin@deploy1003: Finished deploy [analytics/refinery@bc0be94] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bc0be94a] (duration: 03m 57s)
  • 11:39 tchin@deploy1003: Started deploy [analytics/refinery@bc0be94] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bc0be94a]
  • 11:39 tchin@deploy1003: Finished deploy [analytics/refinery@bc0be94] (thin): Regular analytics weekly train THIN [analytics/refinery@bc0be94a] (duration: 05m 50s)
  • 11:33 tchin@deploy1003: Started deploy [analytics/refinery@bc0be94] (thin): Regular analytics weekly train THIN [analytics/refinery@bc0be94a]
  • 11:32 tchin@deploy1003: Finished deploy [analytics/refinery@bc0be94]: Regular analytics weekly train [analytics/refinery@bc0be94a] (duration: 09m 06s)
  • 11:23 tchin@deploy1003: Started deploy [analytics/refinery@bc0be94]: Regular analytics weekly train [analytics/refinery@bc0be94a]
  • 11:23 tchin: Deploying refinery
  • 11:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:54 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye
  • 10:25 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1176.eqiad.wmnet with OS bullseye
  • 10:20 elukey: restart poolcounterd on poolcounter2003 (not serving any traffic atm, tried to clear old/stale conns)
  • 10:14 elukey@deploy1003: Finished scap sync-world: Backport for Swap poolcounter2004 with poolcounter2006 (T332015) (duration: 07m 08s)
  • 10:09 elukey@deploy1003: elukey: Continuing with sync
  • 10:09 elukey@deploy1003: elukey: Backport for Swap poolcounter2004 with poolcounter2006 (T332015) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:07 elukey@deploy1003: Started scap sync-world: Backport for Swap poolcounter2004 with poolcounter2006 (T332015)
  • 09:26 tappof@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
  • 09:11 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye
  • 09:11 tappof@cumin2002: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
  • 09:01 moritzm: drain ganeti2026 T373104
  • 08:41 tappof: centrallog2002 upgrade to bookworm in progress https://phabricator.wikimedia.org/T353912
  • 08:32 elukey: install openjdk-17-jdk on puppetserver1002 to get some useful tools like jmap - T373527
  • 08:30 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.23 refs T373642
  • 08:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2017.codfw.wmnet
  • 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2017.codfw.wmnet
  • 08:15 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.23 refs T373642
  • 07:45 volans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:45 volans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fixed asset tag for db1179 - volans@cumin1002"
  • 07:43 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fixed asset tag for db1179 - volans@cumin1002"
  • 07:33 volans@cumin1002: START - Cookbook sre.dns.netbox
  • 06:39 moritzm: installing curl security updates
  • 06:05 arnaudb@cumin1002: dbctl commit (dc=all): 'T374807', diff saved to https://phabricator.wikimedia.org/P69250 and previous config saved to /var/cache/conftool/dbconfig/20240918-060549-arnaudb.json
  • 06:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2220 to s7 primary T374807', diff saved to https://phabricator.wikimedia.org/P69249 and previous config saved to /var/cache/conftool/dbconfig/20240918-060332-arnaudb.json
  • 06:02 arnaudb: Starting s7 codfw failover from db2218 to db2220 - T374807
  • 05:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T374807
  • 05:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2220 from API/vslow/dump T374807', diff saved to https://phabricator.wikimedia.org/P69248 and previous config saved to /var/cache/conftool/dbconfig/20240918-054921-arnaudb.json
  • 05:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2220 with weight 0 T374807', diff saved to https://phabricator.wikimedia.org/P69247 and previous config saved to /var/cache/conftool/dbconfig/20240918-054909-arnaudb.json
  • 05:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T374807
  • 05:47 arnaudb@cumin1002: dbctl commit (dc=all): 'T374804', diff saved to https://phabricator.wikimedia.org/P69246 and previous config saved to /var/cache/conftool/dbconfig/20240918-054729-arnaudb.json
  • 05:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2179 to s4 primary T374804', diff saved to https://phabricator.wikimedia.org/P69245 and previous config saved to /var/cache/conftool/dbconfig/20240918-054515-arnaudb.json
  • 05:43 arnaudb: Starting s4 codfw failover from db2140 to db2179 - T374804
  • 05:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2179 from API/vslow/dump T374804', diff saved to https://phabricator.wikimedia.org/P69244 and previous config saved to /var/cache/conftool/dbconfig/20240918-053807-arnaudb.json
  • 05:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s4 T374804
  • 05:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2179 with weight 0 T374804', diff saved to https://phabricator.wikimedia.org/P69243 and previous config saved to /var/cache/conftool/dbconfig/20240918-053633-arnaudb.json
  • 05:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: Primary switchover s4 T374804
  • 05:33 arnaudb@cumin1002: dbctl commit (dc=all): 'T375047', diff saved to https://phabricator.wikimedia.org/P69242 and previous config saved to /var/cache/conftool/dbconfig/20240918-053357-arnaudb.json
  • 05:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2214 to s6 primary T375047', diff saved to https://phabricator.wikimedia.org/P69241 and previous config saved to /var/cache/conftool/dbconfig/20240918-053115-arnaudb.json
  • 05:30 arnaudb: Starting s6 codfw failover from db2129 to db2214 - T375047
  • 05:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2214 with weight 0 T375047', diff saved to https://phabricator.wikimedia.org/P69240 and previous config saved to /var/cache/conftool/dbconfig/20240918-052446-arnaudb.json
  • 05:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s6 T375047
  • 05:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s6 T375047
  • 00:11 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: version upgrade
  • 00:11 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: version upgrade

2024-09-17

  • 23:56 dzahn@cumin2002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: security release 20240917
  • 21:25 sbassett: Deployed mitigation for T374438
  • 21:21 hashar: UTC late backport window is completed
  • 21:18 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: security release 20240917
  • 21:07 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release 20240917
  • 20:56 hashar@deploy1003: Finished scap sync-world: Backport for logging: rm per channel 'error' logging (T228838) (duration: 08m 22s)
  • 20:52 hashar@deploy1003: hashar: Continuing with sync
  • 20:50 hashar@deploy1003: hashar: Backport for logging: rm per channel 'error' logging (T228838) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:48 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release 20240917
  • 20:48 hashar@deploy1003: Started scap sync-world: Backport for logging: rm per channel 'error' logging (T228838)
  • 20:46 hashar@deploy1003: Finished scap sync-world: Backport for Improve $wgFooterIcons override, simplify $wmgWikimediaIcon (duration: 08m 05s)
  • 20:45 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 20:44 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 20:43 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 20:42 dzahn@cumin2002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: security release 20240917
  • 20:42 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 20:42 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 20:42 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 20:41 bd808@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 20:40 hashar@deploy1003: matmarex, hashar: Backport for Improve $wgFooterIcons override, simplify $wmgWikimediaIcon synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:38 hashar@deploy1003: Started scap sync-world: Backport for Improve $wgFooterIcons override, simplify $wmgWikimediaIcon
  • 20:37 hashar@deploy1003: Finished scap sync-world: Backport for Revert "Enter deprecation trial for third-party cookie blocking" (T359957) (duration: 06m 37s)
  • 20:32 hashar@deploy1003: tgr, hashar: Continuing with sync
  • 20:32 hashar@deploy1003: tgr, hashar: Backport for Revert "Enter deprecation trial for third-party cookie blocking" (T359957) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:32 dzahn@cumin2002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release 20240917
  • 20:30 hashar@deploy1003: Started scap sync-world: Backport for Revert "Enter deprecation trial for third-party cookie blocking" (T359957)
  • 20:27 hashar@deploy1003: Finished scap sync-world: Backport for Assign the API portal to the Wikimedia group for CentralNotice (T270308) (duration: 12m 11s)
  • 20:22 hashar@deploy1003: ejegg, hashar: Continuing with sync
  • 20:17 hashar@deploy1003: ejegg, hashar: Backport for Assign the API portal to the Wikimedia group for CentralNotice (T270308) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:15 hashar@deploy1003: Started scap sync-world: Backport for Assign the API portal to the Wikimedia group for CentralNotice (T270308)
  • 20:07 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:07 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:07 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 herron: pyrra upgraded to 0.7.7-1
  • 18:27 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:16 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:16 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs1023.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:05 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs1023.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:04 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs2024.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup
  • 17:56 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on gerrit2003.wikimedia.org with reason: in setup
  • 17:54 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T370754, transfer categories jnl) xfer categories from wdqs2023.codfw.wmnet -> wdqs2024.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 100%: T373103', diff saved to https://phabricator.wikimedia.org/P69237 and previous config saved to /var/cache/conftool/dbconfig/20240917-175321-arnaudb.json
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: T373103', diff saved to https://phabricator.wikimedia.org/P69236 and previous config saved to /var/cache/conftool/dbconfig/20240917-175311-arnaudb.json
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2214 (re)pooling @ 100%: T373103', diff saved to https://phabricator.wikimedia.org/P69235 and previous config saved to /var/cache/conftool/dbconfig/20240917-175306-arnaudb.json
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: T373103', diff saved to https://phabricator.wikimedia.org/P69234 and previous config saved to /var/cache/conftool/dbconfig/20240917-175302-arnaudb.json
  • 17:44 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@d93f2c7] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/83 (duration: 01m 13s)
  • 17:43 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@d93f2c7] (releasing): Deploying https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/83
  • 17:39 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host registry2005.codfw.wmnet
  • 17:39 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host registry2005.codfw.wmnet with OS bookworm
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 75%: T373103', diff saved to https://phabricator.wikimedia.org/P69233 and previous config saved to /var/cache/conftool/dbconfig/20240917-173816-arnaudb.json
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: T373103', diff saved to https://phabricator.wikimedia.org/P69232 and previous config saved to /var/cache/conftool/dbconfig/20240917-173806-arnaudb.json
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2214 (re)pooling @ 75%: T373103', diff saved to https://phabricator.wikimedia.org/P69231 and previous config saved to /var/cache/conftool/dbconfig/20240917-173801-arnaudb.json
  • 17:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: T373103', diff saved to https://phabricator.wikimedia.org/P69230 and previous config saved to /var/cache/conftool/dbconfig/20240917-173756-arnaudb.json
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 50%: T373103', diff saved to https://phabricator.wikimedia.org/P69229 and previous config saved to /var/cache/conftool/dbconfig/20240917-172310-arnaudb.json
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: T373103', diff saved to https://phabricator.wikimedia.org/P69228 and previous config saved to /var/cache/conftool/dbconfig/20240917-172300-arnaudb.json
  • 17:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2214 (re)pooling @ 50%: T373103', diff saved to https://phabricator.wikimedia.org/P69227 and previous config saved to /var/cache/conftool/dbconfig/20240917-172255-arnaudb.json
  • 17:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: T373103', diff saved to https://phabricator.wikimedia.org/P69226 and previous config saved to /var/cache/conftool/dbconfig/20240917-172250-arnaudb.json
  • 17:22 dduvall@deploy1003: Installation of scap version "4.103.0" completed for 211 hosts
  • 17:18 dduvall@deploy1003: Installing scap version "4.103.0" for 211 hosts
  • 17:10 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=(cp2039|cp2040).codfw.wmnet [reason: [maint done] depool for T373103]
  • 17:08 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2376.codfw.wmnet
  • 17:08 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2376.codfw.wmnet
  • 17:08 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2375.codfw.wmnet
  • 17:08 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2375.codfw.wmnet
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 25%: T373103', diff saved to https://phabricator.wikimedia.org/P69225 and previous config saved to /var/cache/conftool/dbconfig/20240917-170805-arnaudb.json
  • 17:08 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2374.codfw.wmnet
  • 17:08 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2374.codfw.wmnet
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: T373103', diff saved to https://phabricator.wikimedia.org/P69224 and previous config saved to /var/cache/conftool/dbconfig/20240917-170755-arnaudb.json
  • 17:07 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2373.codfw.wmnet
  • 17:07 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2373.codfw.wmnet
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2214 (re)pooling @ 25%: T373103', diff saved to https://phabricator.wikimedia.org/P69223 and previous config saved to /var/cache/conftool/dbconfig/20240917-170749-arnaudb.json
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 25%: T373103', diff saved to https://phabricator.wikimedia.org/P69222 and previous config saved to /var/cache/conftool/dbconfig/20240917-170745-arnaudb.json
  • 17:07 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2372.codfw.wmnet
  • 17:07 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2372.codfw.wmnet
  • 17:07 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2371.codfw.wmnet
  • 17:07 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2371.codfw.wmnet
  • 17:07 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2370.codfw.wmnet
  • 17:07 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2370.codfw.wmnet
  • 17:07 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2369.codfw.wmnet
  • 17:07 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2369.codfw.wmnet
  • 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2368.codfw.wmnet
  • 17:06 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2368.codfw.wmnet
  • 17:06 swfrench@cumin2002: conftool action : set/pooled=yes; selector: dc=codfw,name=mw2279.codfw.wmnet
  • 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2367.codfw.wmnet
  • 17:06 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2367.codfw.wmnet
  • 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2366.codfw.wmnet
  • 17:06 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2366.codfw.wmnet
  • 17:06 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2008.codfw.wmnet
  • 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2056.codfw.wmnet
  • 17:06 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2056.codfw.wmnet
  • 17:06 swfrench@cumin2002: conftool action : set/pooled=yes; selector: dc=codfw,name=mw2278.codfw.wmnet
  • 17:05 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2047.codfw.wmnet
  • 17:05 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2047.codfw.wmnet
  • 17:05 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2046.codfw.wmnet
  • 17:05 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2046.codfw.wmnet
  • 17:05 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2022.codfw.wmnet
  • 17:05 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2022.codfw.wmnet
  • 16:58 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2376.codfw.wmnet
  • 16:54 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2376.codfw.wmnet
  • 16:54 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2375.codfw.wmnet
  • 16:54 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2375.codfw.wmnet
  • 16:54 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2374.codfw.wmnet
  • 16:53 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2374.codfw.wmnet
  • 16:53 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2373.codfw.wmnet
  • 16:52 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2373.codfw.wmnet
  • 16:52 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2372.codfw.wmnet
  • 16:51 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2372.codfw.wmnet
  • 16:51 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2371.codfw.wmnet
  • 16:51 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2371.codfw.wmnet
  • 16:51 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2370.codfw.wmnet
  • 16:50 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2370.codfw.wmnet
  • 16:50 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2369.codfw.wmnet
  • 16:49 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: reboot cloudsw1-c8-eqiad
  • 16:48 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 24 hosts with reason: reboot cloudsw1-c8-eqiad
  • 16:47 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2369.codfw.wmnet
  • 16:47 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2368.codfw.wmnet
  • 16:46 swfrench@cumin2002: conftool action : set/pooled=no; selector: dc=codfw,name=mw2279.codfw.wmnet
  • 16:46 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2368.codfw.wmnet
  • 16:46 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2367.codfw.wmnet
  • 16:45 swfrench@cumin2002: conftool action : set/pooled=no; selector: dc=codfw,name=mw2278.codfw.wmnet
  • 16:45 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2367.codfw.wmnet
  • 16:45 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2366.codfw.wmnet
  • 16:45 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2366.codfw.wmnet
  • 16:44 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2056.codfw.wmnet
  • 16:44 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2056.codfw.wmnet
  • 16:44 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2047.codfw.wmnet
  • 16:43 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2047.codfw.wmnet
  • 16:43 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2046.codfw.wmnet
  • 16:42 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2046.codfw.wmnet
  • 16:42 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2022.codfw.wmnet
  • 16:42 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2022.codfw.wmnet
  • 16:30 hashar@deploy1003: Finished scap sync-world: Backport for Lift IP cap on this dates 17/09,24/09 for edit-a-thon for eswiki, commons and wikidata (T373468) (duration: 07m 25s)
  • 16:26 hashar@deploy1003: hashar, gergesshamon: Continuing with sync
  • 16:26 hashar@deploy1003: hashar, gergesshamon: Backport for Lift IP cap on this dates 17/09,24/09 for edit-a-thon for eswiki, commons and wikidata (T373468) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:23 hashar@deploy1003: Started scap sync-world: Backport for Lift IP cap on this dates 17/09,24/09 for edit-a-thon for eswiki, commons and wikidata (T373468)
  • 16:08 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2209.codfw.wmnet with reason: move to new switch
  • 16:08 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2209.codfw.wmnet with reason: move to new switch
  • 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cloudsw1-c8-eqiad,cloudsw1-c8-eqiad IPv6,cloudsw1-c8-eqiad.mgmt,cr1-eqiad
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for cloudsw1-c8-eqiad,cloudsw1-c8-eqiad IPv6,cloudsw1-c8-eqiad.mgmt,cr1-eqiad
  • 15:50 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2008.codfw.wmnet
  • 15:50 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 15:47 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 15:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:45:00 on 6 hosts with reason: network maintenance T373103
  • 15:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:45:00 on 6 hosts with reason: network maintenance T373103
  • 15:44 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 15:43 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2213 db2214 es2023 pc2016 db2209 - T373103', diff saved to https://phabricator.wikimedia.org/P69221 and previous config saved to /var/cache/conftool/dbconfig/20240917-154355-arnaudb.json
  • 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
  • 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
  • 15:33 dancy: dancy@deploy1003 Installation of scap version "4.102.1" completed for 211 hosts
  • 15:28 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=(cp2039|cp2040).codfw.wmnet [reason: depool for T373103]
  • 15:28 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@d8093b9] (releasing): (no justification provided) (duration: 00m 43s)
  • 15:28 dancy@deploy1003: Started deploy [releng/jenkins-deploy@d8093b9] (releasing): (no justification provided)
  • 15:19 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on registry2005.codfw.wmnet with reason: WIP - working on puppet runs
  • 15:19 moritzm: installing postgresql-13 security updates
  • 15:18 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on registry2005.codfw.wmnet with reason: WIP - working on puppet runs
  • 15:17 logmsgbot: dreamyjazz Deployed security patch for T372998
  • 15:09 logmsgbot: dreamyjazz Deployed security patch for T372998
  • 15:05 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 24 hosts with reason: reboot cloudsw1-c8-eqiad
  • 15:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on 24 hosts with reason: reboot cloudsw1-c8-eqiad
  • 15:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cr1-eqiad with reason: reboot cloudsw1-c8-eqiad
  • 15:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cr1-eqiad with reason: reboot cloudsw1-c8-eqiad
  • 15:03 Dreamy_Jazz: Starting security deploy
  • 14:52 dancy@deploy1003: Finished deploy [releng/phatality@84c7283]: T374880 (duration: 00m 06s)
  • 14:52 dancy@deploy1003: Started deploy [releng/phatality@84c7283]: T374880
  • 14:47 dancy@deploy1003: Finished deploy [releng/phatality@84c7283]: T374880 (duration: 00m 09s)
  • 14:46 dancy@deploy1003: Started deploy [releng/phatality@84c7283]: T374880
  • 14:45 dancy@deploy1003: Finished deploy [releng/phatality@84c7283]: (no justification provided) (duration: 00m 18s)
  • 14:45 dancy@deploy1003: Started deploy [releng/phatality@84c7283]: (no justification provided)
  • 14:29 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cloudsw1-c8-eqiad,cloudsw1-c8-eqiad IPv6,cloudsw1-c8-eqiad.mgmt with reason: Reboot cloudsw1-c8-eqiad and upgrade JunOS
  • 14:29 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cloudsw1-c8-eqiad,cloudsw1-c8-eqiad IPv6,cloudsw1-c8-eqiad.mgmt with reason: Reboot cloudsw1-c8-eqiad and upgrade JunOS
  • 13:57 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2016.codfw.wmnet
  • 13:54 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2016.codfw.wmnet
  • 13:53 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 13:51 elukey: copy jwt-authorizer from bullseye-wikimedia to bookworm-wikimedia
  • 13:48 hashar: UTC afternoon backport window completed (again!)
  • 13:47 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
  • 13:46 hashar@deploy1003: Finished scap sync-world: Backport for Enable ContactPage extension on zhwiki (T359998) (duration: 08m 09s)
  • 13:43 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
  • 13:41 hashar@deploy1003: hashar: Continuing with sync
  • 13:40 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kafka-stretch1002 to an-worker1177
  • 13:40 hashar@deploy1003: hashar: Backport for Enable ContactPage extension on zhwiki (T359998) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1177
  • 13:39 stevemunene@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1177
  • 13:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:38 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kafka-stretch1002 to an-worker1177 - stevemunene@cumin1002"
  • 13:38 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kafka-stretch1002 to an-worker1177 - stevemunene@cumin1002"
  • 13:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T367781)', diff saved to https://phabricator.wikimedia.org/P69219 and previous config saved to /var/cache/conftool/dbconfig/20240917-133805-arnaudb.json
  • 13:37 hashar@deploy1003: Started scap sync-world: Backport for Enable ContactPage extension on zhwiki (T359998)
  • 13:36 Lucas_WMDE: (for the record, refreshing the WDQS GUI cache five minutes ago seems to have worked well enough… that, or the cache just happened to expire around the same time ^^)
  • 13:36 hashar@deploy1003: Finished scap sync-world: Backport for Configure ContactPage and IPBE contact form on zhwiki (T359998) (duration: 10m 00s)
  • 13:35 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
  • 13:34 stevemunene@cumin1002: START - Cookbook sre.hosts.rename from kafka-stretch1002 to an-worker1177
  • 13:31 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ for domain in query{,-{main,scholarly}}; do for path in / /index.html /i18n/en.json /{default,custom}-config.json; do printf 'https://%s.wikidata.org%s\n' "$domain" "$path"; done; done | mwscript purgeList enwiki # try to refresh WDQS GUI cache, don’t know if it’ll work
  • 13:31 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kafka-stretch1001 to an-worker1176
  • 13:31 hashar@deploy1003: hashar, hamishz: Continuing with sync
  • 13:30 stevemunene@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-worker1176
  • 13:30 stevemunene@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host an-worker1176
  • 13:30 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:30 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kafka-stretch1001 to an-worker1176 - stevemunene@cumin1002"
  • 13:28 hashar@deploy1003: hashar, hamishz: Backport for Configure ContactPage and IPBE contact form on zhwiki (T359998) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:26 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kafka-stretch1001 to an-worker1176 - stevemunene@cumin1002"
  • 13:26 hashar@deploy1003: Started scap sync-world: Backport for Configure ContactPage and IPBE contact form on zhwiki (T359998)
  • 13:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P69218 and previous config saved to /var/cache/conftool/dbconfig/20240917-132257-arnaudb.json
  • 13:22 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
  • 13:22 stevemunene@cumin1002: START - Cookbook sre.hosts.rename from kafka-stretch1001 to an-worker1176
  • 13:21 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
  • 13:20 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 13:20 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
  • 13:18 hashar: UTC afternoon backport window completed
  • 13:17 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
  • 13:16 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on registry2005.codfw.wmnet with reason: host reimage
  • 13:14 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 13:13 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on registry2005.codfw.wmnet with reason: host reimage
  • 13:09 hashar@deploy1003: Finished scap sync-world: Backport for logging: Default to log any error (all wikis) (T228838) (duration: 07m 06s)
  • 13:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P69217 and previous config saved to /var/cache/conftool/dbconfig/20240917-130750-arnaudb.json
  • 13:04 hashar@deploy1003: hashar: Continuing with sync
  • 13:04 hashar@deploy1003: hashar: Backport for logging: Default to log any error (all wikis) (T228838) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:02 hashar@deploy1003: Started scap sync-world: Backport for logging: Default to log any error (all wikis) (T228838)
  • 12:57 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
  • 12:57 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host registry2005.codfw.wmnet with OS bookworm
  • 12:57 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 12:56 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 12:54 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 12:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T367781)', diff saved to https://phabricator.wikimedia.org/P69216 and previous config saved to /var/cache/conftool/dbconfig/20240917-125242-arnaudb.json
  • 12:51 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM registry2005.codfw.wmnet - elukey@cumin1002"
  • 12:51 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM registry2005.codfw.wmnet - elukey@cumin1002"
  • 12:51 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) registry2005.codfw.wmnet on all recursors
  • 12:50 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache registry2005.codfw.wmnet on all recursors
  • 12:50 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:50 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM registry2005.codfw.wmnet - elukey@cumin1002"
  • 12:50 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM registry2005.codfw.wmnet - elukey@cumin1002"
  • 12:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T367781)', diff saved to https://phabricator.wikimedia.org/P69215 and previous config saved to /var/cache/conftool/dbconfig/20240917-125032-arnaudb.json
  • 12:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 12:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 12:49 arnaudb@cumin1002: dbctl commit (dc=all): 'codfw/db2161 weight', diff saved to https://phabricator.wikimedia.org/P69214 and previous config saved to /var/cache/conftool/dbconfig/20240917-124927-arnaudb.json
  • 12:47 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 12:47 elukey@cumin1002: START - Cookbook sre.ganeti.makevm for new host registry2005.codfw.wmnet
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2165 to s8 primary T374946', diff saved to https://phabricator.wikimedia.org/P69213 and previous config saved to /var/cache/conftool/dbconfig/20240917-124638-arnaudb.json
  • 12:46 arnaudb: Starting s8 codfw failover from db2161 to db2165 - T374946
  • 12:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2165 with weight 0 T374946', diff saved to https://phabricator.wikimedia.org/P69212 and previous config saved to /var/cache/conftool/dbconfig/20240917-124022-arnaudb.json
  • 12:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s8 T374946
  • 12:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: Primary switchover s8 T374946
  • 12:38 topranks: disable Equinix IXP BGP peers on cr2-eqiad before reconfiguring port as LAG T370696
  • 12:30 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on cr1-eqiad with reason: enable ixp port cr1-eqiad
  • 12:30 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on cr1-eqiad with reason: enable ixp port cr1-eqiad
  • 12:23 topranks: reconfigure cr1-eqiad xe-3/0/6 into LAG group ae6 for Equinix IXP peering T370696
  • 12:15 topranks: disable Equinix IXP peering on cr1-eqiad in advance of port move
  • 12:15 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on cr1-eqiad with reason: enable ixp port cr1-eqiad
  • 12:15 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on cr1-eqiad with reason: enable ixp port cr1-eqiad
  • 12:02 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 11:44 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
  • 11:39 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 11:38 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
  • 11:38 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 11:37 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 11:36 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 11:19 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 11:18 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 11:15 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
  • 11:07 volans: installed spicerack_8.13.1 on the cumin hosts
  • 11:05 volans: uploaded spicerack_8.13.1 to apt.wikimedia.org bullseye-wikimedia
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host deploy1003.eqiad.wmnet
  • 10:51 elukey@deploy1003: Finished scap sync-world: Backport for Swap poolcounter2003 with poolcounter2005 (T332015) (duration: 13m 19s)
  • 10:45 elukey@deploy1003: elukey: Continuing with sync
  • 10:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host deploy1003.eqiad.wmnet
  • 10:45 elukey@deploy1003: elukey: Backport for Swap poolcounter2003 with poolcounter2005 (T332015) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:38 elukey@deploy1003: Started scap sync-world: Backport for Swap poolcounter2003 with poolcounter2005 (T332015)
  • 10:37 elukey@deploy1003: Finished scap sync-world: Swap poolcounter2003 with poolcounter2005 (duration: 30m 00s)
  • 10:07 elukey@deploy1003: Started scap sync-world: Swap poolcounter2003 with poolcounter2005
  • 10:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2138.codfw.wmnet
  • 10:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2138.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 10:05 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2138.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 10:01 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 09:58 jayme: re-enable puppet on all kafka brokers
  • 09:57 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2006.codfw.wmnet
  • 09:57 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2006.codfw.wmnet
  • 09:57 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2138.codfw.wmnet
  • 09:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 T374852', diff saved to https://phabricator.wikimedia.org/P69209 and previous config saved to /var/cache/conftool/dbconfig/20240917-095625-arnaudb.json
  • 09:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 T374852', diff saved to https://phabricator.wikimedia.org/P69208 and previous config saved to /var/cache/conftool/dbconfig/20240917-095230-arnaudb.json
  • 09:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on kafka-main2006.codfw.wmnet with reason: Rollout of 1073402
  • 09:51 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 0:10:00 on kafka-main2006.codfw.wmnet with reason: Rollout of 1073402
  • 09:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2137.codfw.wmnet
  • 09:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2137.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 09:51 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2137.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 09:47 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 09:45 jayme: disabling puppet on all kafka brokers for rollout of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1073402
  • 09:43 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2137.codfw.wmnet
  • 09:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 T374851', diff saved to https://phabricator.wikimedia.org/P69207 and previous config saved to /var/cache/conftool/dbconfig/20240917-094241-arnaudb.json
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 T374851', diff saved to https://phabricator.wikimedia.org/P69206 and previous config saved to /var/cache/conftool/dbconfig/20240917-093850-arnaudb.json
  • 09:32 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
  • 09:31 moritzm: installing python-jwcrypto security updates
  • 09:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2127.codfw.wmnet
  • 09:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2127.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 09:26 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host chartmuseum1001.eqiad.wmnet with OS bookworm
  • 09:25 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2127.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 09:22 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 09:17 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2127.codfw.wmnet
  • 09:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 T374849', diff saved to https://phabricator.wikimedia.org/P69205 and previous config saved to /var/cache/conftool/dbconfig/20240917-091706-arnaudb.json
  • 09:10 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on chartmuseum1001.eqiad.wmnet with reason: host reimage
  • 09:07 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on chartmuseum1001.eqiad.wmnet with reason: host reimage
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 T374849', diff saved to https://phabricator.wikimedia.org/P69204 and previous config saved to /var/cache/conftool/dbconfig/20240917-090733-arnaudb.json
  • 08:56 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum1001.eqiad.wmnet with OS bookworm
  • 08:55 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2001.codfw.wmnet with OS bookworm
  • 08:55 elukey@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2125.codfw.wmnet
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2125.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 08:50 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
  • 08:48 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2125.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 08:40 arnaudb@cumin1002: dbctl commit (dc=all): 'T374848', diff saved to https://phabricator.wikimedia.org/P69203 and previous config saved to /var/cache/conftool/dbconfig/20240917-084036-arnaudb.json
  • 08:40 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2125.codfw.wmnet
  • 08:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 T374848', diff saved to https://phabricator.wikimedia.org/P69202 and previous config saved to /var/cache/conftool/dbconfig/20240917-083652-arnaudb.json
  • 08:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2124.codfw.wmnet
  • 08:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2124.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 08:31 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2124.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 08:26 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 08:18 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2124.codfw.wmnet
  • 08:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 T374847', diff saved to https://phabricator.wikimedia.org/P69201 and previous config saved to /var/cache/conftool/dbconfig/20240917-081642-arnaudb.json
  • 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.23 refs T373642
  • 08:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 T374846', diff saved to https://phabricator.wikimedia.org/P69200 and previous config saved to /var/cache/conftool/dbconfig/20240917-080453-arnaudb.json
  • 08:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2122.codfw.wmnet
  • 08:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2122.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 08:02 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2122.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 07:58 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 07:54 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2122.codfw.wmnet
  • 07:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 T374846', diff saved to https://phabricator.wikimedia.org/P69199 and previous config saved to /var/cache/conftool/dbconfig/20240917-074918-arnaudb.json
  • 07:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2121.codfw.wmnet
  • 07:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2121.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 07:38 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2121.codfw.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 07:35 hashar@deploy1003: Finished scap sync-world: Backport for logging: Default to log any error (on group1) (T228838) (duration: 15m 36s)
  • 07:33 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 07:29 hashar@deploy1003: hashar: Continuing with sync
  • 07:29 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db2121.codfw.wmnet
  • 07:28 hashar@deploy1003: hashar: Backport for logging: Default to log any error (on group1) (T228838) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:26 vgutierrez: testing purged 0.23 in cp4038 - T334078
  • 07:20 hashar@deploy1003: Started scap sync-world: Backport for logging: Default to log any error (on group1) (T228838)
  • 07:11 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2121 T374845', diff saved to https://phabricator.wikimedia.org/P69198 and previous config saved to /var/cache/conftool/dbconfig/20240917-071120-arnaudb.json
  • 04:00 mwpresync@deploy1003: Pruned MediaWiki: 1.43.0-wmf.20 (duration: 00m 58s)
  • 03:53 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.43.0-wmf.23 refs T373642 (duration: 50m 47s)
  • 03:02 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.43.0-wmf.23 refs T373642

2024-09-16

  • 22:49 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 20:53 jhathaway: reloading puppetserver to enable strict mode
  • 20:47 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:47 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add frack new switches - pt1979@cumin2002"
  • 20:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add frack new switches - pt1979@cumin2002"
  • 20:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add frack new switches - pt1979@cumin2002"
  • 20:28 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add frack new switches - pt1979@cumin2002"
  • 20:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 20:25 toyofuku@deploy1003: Finished scap sync-world: Backport for Deploy Vector 2022 on small wikis (T374255), Disable quick surveys (T374743) (duration: 08m 53s)
  • 20:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 20:21 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 20:20 toyofuku@deploy1003: jdlrobson, toyofuku: Continuing with sync
  • 20:18 toyofuku@deploy1003: jdlrobson, toyofuku: Backport for Deploy Vector 2022 on small wikis (T374255), Disable quick surveys (T374743) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:16 toyofuku@deploy1003: Started scap sync-world: Backport for Deploy Vector 2022 on small wikis (T374255), Disable quick surveys (T374743)
  • 18:27 dancy@deploy1003: Finished deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836 (duration: 00m 06s)
  • 18:27 dancy@deploy1003: Started deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836
  • 18:27 dancy@deploy1003: Finished deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836 (duration: 00m 04s)
  • 18:27 dancy@deploy1003: Started deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836
  • 18:24 rzl: rzl@cumin1002:~$ sudo cumin O:logging::opensearch::collector 'chown -R opensearch-dashboards: /usr/share/opensearch-dashboards/plugins/phatality'
  • 18:07 dancy@deploy1003: Finished deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836 (duration: 00m 04s)
  • 18:06 dancy@deploy1003: Started deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836
  • 18:06 dancy@deploy1003: Finished deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836 (duration: 00m 46s)
  • 18:05 dancy@deploy1003: Started deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836
  • 17:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69196 and previous config saved to /var/cache/conftool/dbconfig/20240916-172712-arnaudb.json
  • 17:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69195 and previous config saved to /var/cache/conftool/dbconfig/20240916-172706-arnaudb.json
  • 17:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69194 and previous config saved to /var/cache/conftool/dbconfig/20240916-172701-arnaudb.json
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69193 and previous config saved to /var/cache/conftool/dbconfig/20240916-172656-arnaudb.json
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69192 and previous config saved to /var/cache/conftool/dbconfig/20240916-172651-arnaudb.json
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69191 and previous config saved to /var/cache/conftool/dbconfig/20240916-172646-arnaudb.json
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 100%: T374623', diff saved to https://phabricator.wikimedia.org/P69190 and previous config saved to /var/cache/conftool/dbconfig/20240916-172641-arnaudb.json
  • 17:22 mfossati@deploy1003: Finished deploy [airflow-dags/platform_eng@5ad6710]: (no justification provided) (duration: 00m 44s)
  • 17:21 mfossati@deploy1003: Started deploy [airflow-dags/platform_eng@5ad6710]: (no justification provided)
  • 17:16 dancy@deploy1003: Finished deploy [releng/phatality@b1a2a70]: testing (duration: 00m 05s)
  • 17:16 dancy@deploy1003: Started deploy [releng/phatality@b1a2a70]: testing
  • 17:14 dancy@deploy1003: Installation of scap version "4.101.3" completed for 2 hosts
  • 17:12 dancy@deploy1003: Installing scap version "4.101.3" for 2 hosts
  • 17:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69189 and previous config saved to /var/cache/conftool/dbconfig/20240916-171206-arnaudb.json
  • 17:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69188 and previous config saved to /var/cache/conftool/dbconfig/20240916-171201-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69187 and previous config saved to /var/cache/conftool/dbconfig/20240916-171155-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69186 and previous config saved to /var/cache/conftool/dbconfig/20240916-171150-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69185 and previous config saved to /var/cache/conftool/dbconfig/20240916-171146-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69184 and previous config saved to /var/cache/conftool/dbconfig/20240916-171140-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 75%: T374623', diff saved to https://phabricator.wikimedia.org/P69183 and previous config saved to /var/cache/conftool/dbconfig/20240916-171136-arnaudb.json
  • 17:08 dancy@deploy1003: Installing scap version "4.101.3" for 1 hosts
  • 17:03 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 17:03 dancy@deploy1003: Installing scap version "4.101.3" for 211 hosts
  • 16:58 dancy@deploy1003: Installation of scap version "4.102.0" completed for 211 hosts
  • 16:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69182 and previous config saved to /var/cache/conftool/dbconfig/20240916-165700-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69181 and previous config saved to /var/cache/conftool/dbconfig/20240916-165655-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69180 and previous config saved to /var/cache/conftool/dbconfig/20240916-165650-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69179 and previous config saved to /var/cache/conftool/dbconfig/20240916-165645-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69178 and previous config saved to /var/cache/conftool/dbconfig/20240916-165640-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69177 and previous config saved to /var/cache/conftool/dbconfig/20240916-165635-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 50%: T374623', diff saved to https://phabricator.wikimedia.org/P69176 and previous config saved to /var/cache/conftool/dbconfig/20240916-165630-arnaudb.json
  • 16:54 dancy@deploy1003: Installing scap version "4.102.0" for 211 hosts
  • 16:48 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 16:45 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69175 and previous config saved to /var/cache/conftool/dbconfig/20240916-164154-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69174 and previous config saved to /var/cache/conftool/dbconfig/20240916-164149-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69173 and previous config saved to /var/cache/conftool/dbconfig/20240916-164144-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69172 and previous config saved to /var/cache/conftool/dbconfig/20240916-164139-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69171 and previous config saved to /var/cache/conftool/dbconfig/20240916-164134-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69170 and previous config saved to /var/cache/conftool/dbconfig/20240916-164129-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 25%: T374623', diff saved to https://phabricator.wikimedia.org/P69169 and previous config saved to /var/cache/conftool/dbconfig/20240916-164124-arnaudb.json
  • 16:33 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69168 and previous config saved to /var/cache/conftool/dbconfig/20240916-162649-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69167 and previous config saved to /var/cache/conftool/dbconfig/20240916-162644-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69166 and previous config saved to /var/cache/conftool/dbconfig/20240916-162638-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69165 and previous config saved to /var/cache/conftool/dbconfig/20240916-162633-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69164 and previous config saved to /var/cache/conftool/dbconfig/20240916-162629-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69163 and previous config saved to /var/cache/conftool/dbconfig/20240916-162623-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 16%: T374623', diff saved to https://phabricator.wikimedia.org/P69162 and previous config saved to /var/cache/conftool/dbconfig/20240916-162618-arnaudb.json
  • 16:22 ebernhardson@deploy1003: Finished deploy [airflow-dags/search@5ad6710]: standardize created file permissions (duration: 00m 22s)
  • 16:22 ebernhardson@deploy1003: Started deploy [airflow-dags/search@5ad6710]: standardize created file permissions
  • 16:12 rzl: rzl@cumin1002:~$ sudo cumin logstash[2023,2025,2030-2032].codfw.wmnet,logstash[1025,1030,1032].eqiad.wmnet 'systemctl restart opensearch-dashboards' # only hosts where status is failed
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69161 and previous config saved to /var/cache/conftool/dbconfig/20240916-161143-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69160 and previous config saved to /var/cache/conftool/dbconfig/20240916-161138-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69159 and previous config saved to /var/cache/conftool/dbconfig/20240916-161133-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69158 and previous config saved to /var/cache/conftool/dbconfig/20240916-161128-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69157 and previous config saved to /var/cache/conftool/dbconfig/20240916-161123-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69156 and previous config saved to /var/cache/conftool/dbconfig/20240916-161117-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 8%: T374623', diff saved to https://phabricator.wikimedia.org/P69155 and previous config saved to /var/cache/conftool/dbconfig/20240916-161113-arnaudb.json
  • 16:09 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:07 jhathaway: testing strict mode on puppetservers
  • 16:06 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:00 rzl: rzl@cumin1002:~$ sudo cumin 'O:logging::opensearch::collector and not logstash1032.eqiad.wmnet' '/usr/share/opensearch-dashboards/bin/opensearch-dashboards-plugin --allow-root remove phatality'
  • 15:59 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 15:59 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 15:59 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns4003.wikimedia.org,service=recdns
  • 15:59 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns4003.wikimedia.org,service=recdns
  • 15:57 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69154 and previous config saved to /var/cache/conftool/dbconfig/20240916-155637-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69153 and previous config saved to /var/cache/conftool/dbconfig/20240916-155632-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69152 and previous config saved to /var/cache/conftool/dbconfig/20240916-155627-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69151 and previous config saved to /var/cache/conftool/dbconfig/20240916-155622-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69150 and previous config saved to /var/cache/conftool/dbconfig/20240916-155617-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69149 and previous config saved to /var/cache/conftool/dbconfig/20240916-155612-arnaudb.json
  • 15:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 4%: T374623', diff saved to https://phabricator.wikimedia.org/P69148 and previous config saved to /var/cache/conftool/dbconfig/20240916-155607-arnaudb.json
  • 15:55 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1020.eqiad.wmnet
  • 15:54 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 15:52 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1020.eqiad.wmnet
  • 15:49 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Upgrading mariadb on clouddb1020 T365424
  • 15:49 herron: logstash1023:/usr/share/opensearch-dashboards/bin# /usr/share/opensearch-dashboards/bin/opensearch-dashboards-plugin remove phatality
  • 15:48 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Upgrading mariadb on clouddb1020 T365424
  • 15:48 rzl: rzl@logstash1032:~$ sudo systemctl restart opensearch-dashboards
  • 15:48 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 15:47 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 15:46 dancy@deploy1003: Finished deploy [releng/phatality@b1a2a70]: Attempting to revert (duration: 00m 06s)
  • 15:46 dancy@deploy1003: Started deploy [releng/phatality@b1a2a70]: Attempting to revert
  • 15:44 rzl: rzl@logstash1032:~$ sudo systemctl restart opensearch-dashboards
  • 15:42 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 15:42 dancy@deploy1003: Finished deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836 (duration: 00m 21s)
  • 15:41 dancy@deploy1003: Started deploy [releng/phatality@c2cb594]: Deploying https://gerrit.wikimedia.org/r/c/releng/phatality/+/1071836
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69147 and previous config saved to /var/cache/conftool/dbconfig/20240916-154132-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69146 and previous config saved to /var/cache/conftool/dbconfig/20240916-154127-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69145 and previous config saved to /var/cache/conftool/dbconfig/20240916-154121-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69144 and previous config saved to /var/cache/conftool/dbconfig/20240916-154116-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69143 and previous config saved to /var/cache/conftool/dbconfig/20240916-154112-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69142 and previous config saved to /var/cache/conftool/dbconfig/20240916-154106-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 2%: T374623', diff saved to https://phabricator.wikimedia.org/P69141 and previous config saved to /var/cache/conftool/dbconfig/20240916-154101-arnaudb.json
  • 15:39 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 15:34 dancy@deploy1003: Finished deploy [releng/phatality@8ddb2fa]: (no justification provided) (duration: 00m 15s)
  • 15:34 dancy@deploy1003: Started deploy [releng/phatality@8ddb2fa]: (no justification provided)
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2238 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69140 and previous config saved to /var/cache/conftool/dbconfig/20240916-152626-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2237 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69139 and previous config saved to /var/cache/conftool/dbconfig/20240916-152621-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69138 and previous config saved to /var/cache/conftool/dbconfig/20240916-152616-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2225 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69137 and previous config saved to /var/cache/conftool/dbconfig/20240916-152611-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2224 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69136 and previous config saved to /var/cache/conftool/dbconfig/20240916-152606-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69135 and previous config saved to /var/cache/conftool/dbconfig/20240916-152601-arnaudb.json
  • 15:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 1%: T374623', diff saved to https://phabricator.wikimedia.org/P69134 and previous config saved to /var/cache/conftool/dbconfig/20240916-152556-arnaudb.json
  • 15:13 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
  • 15:03 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
  • 15:02 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
  • 15:01 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
  • 15:01 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:59 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
  • 14:58 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:58 root@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
  • 14:57 root@deploy1003: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
  • 14:57 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
  • 14:57 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:57 swfrench@cumin1002: [DRY-RUN] MediaWiki read-only period ends at: 2024-09-16 14:57:30.267664
  • 14:57 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from codfw to eqiad
  • 14:57 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:57 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from codfw to eqiad
  • 14:56 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:56 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from codfw to eqiad
  • 14:55 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:54 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from codfw to eqiad
  • 14:54 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:54 moritzm: installing gdk-pixbuf security updates
  • 14:54 swfrench@cumin1002: [DRY-RUN] MediaWiki read-only period starts at: 2024-09-16 14:54:20.136310
  • 14:54 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from codfw to eqiad
  • 14:47 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:47 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
  • 14:47 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 14:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 14:41 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
  • 14:39 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:39 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
  • 14:37 swfrench@cumin1002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) for datacenter switchover from codfw to eqiad
  • 14:37 swfrench@cumin1002: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet for datacenter switchover from codfw to eqiad
  • 14:31 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s4
  • 14:31 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s6
  • 14:27 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1019.eqiad.wmnet
  • 14:24 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1019.eqiad.wmnet
  • 14:15 hashar: Afternoon backport window is complete
  • 14:12 hashar@deploy1003: Finished scap sync-world: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374621) (duration: 06m 41s)
  • 14:10 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Upgrading mariadb on clouddb1019 T365424
  • 14:09 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Upgrading mariadb on clouddb1019 T365424
  • 14:07 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s6
  • 14:07 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s4
  • 14:07 hashar@deploy1003: hamishz, hashar: Continuing with sync
  • 14:07 hashar@deploy1003: hamishz, hashar: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374621) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 14:06 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 14:05 hashar@deploy1003: Started scap sync-world: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374621)
  • 14:04 hashar@deploy1003: Finished scap sync-world: Backport for logging: Default to log any error (on group0) (T228838) (duration: 07m 59s)
  • 14:00 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1018.eqiad.wmnet
  • 13:59 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 13:58 hashar@deploy1003: matmarex, hashar: Backport for logging: Default to log any error (on group0) (T228838) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:57 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1018.eqiad.wmnet
  • 13:56 hashar@deploy1003: Started scap sync-world: Backport for logging: Default to log any error (on group0) (T228838)
  • 13:50 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Upgrading mariadb on clouddb1018 T365424
  • 13:50 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Upgrading mariadb on clouddb1018 T365424
  • 13:49 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 13:49 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 13:49 hashar@deploy1003: Finished scap sync-world: Backport for Define MW_ENTRY_POINT in static.php (T374286) (duration: 10m 03s)
  • 13:48 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 13:48 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 13:48 sukhe: sudo cumin -b11 "A:cp" 'run-puppet-agent --enable "merging CR 1072566"'
  • 13:47 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4052.ulsfo.wmnet [reason: [done] testing NOOP CR but depooling to be extra sure]
  • 13:43 hashar@deploy1003: hashar, matmarex: Continuing with sync
  • 13:43 hashar@deploy1003: hashar, matmarex: Backport for Define MW_ENTRY_POINT in static.php (T374286) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4052.ulsfo.wmnet [reason: testing NOOP CR but depooling to be extra sure]
  • 13:40 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1017.eqiad.wmnet
  • 13:39 hashar@deploy1003: Started scap sync-world: Backport for Define MW_ENTRY_POINT in static.php (T374286)
  • 13:37 hashar: mwmaint: mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki=sewikimedia --current --namespace 2 # T374089
  • 13:37 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1017.eqiad.wmnet
  • 13:36 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CR 1072566"'
  • 13:36 hashar@deploy1003: Finished scap sync-world: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089) (duration: 15m 35s)
  • 13:31 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Upgrading mariadb on clouddb1017 T365424
  • 13:31 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Upgrading mariadb on clouddb1017 T365424
  • 13:31 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Upgrading mariadb on clouddb1017 T365424
  • 13:30 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Upgrading mariadb on clouddb1017 T365424
  • 13:30 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 13:30 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 13:25 hashar@deploy1003: gergesshamon, hashar: Continuing with sync
  • 13:25 hashar@deploy1003: gergesshamon, hashar: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:21 hashar@deploy1003: Started scap sync-world: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089)
  • 13:20 hashar@deploy1003: Sync cancelled.
  • 13:20 hashar@deploy1003: hashar, gergesshamon: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:14 hashar@deploy1003: Started scap sync-world: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089)
  • 13:13 hashar@deploy1003: Sync cancelled.
  • 13:13 hashar@deploy1003: gergesshamon, hashar: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:01 hashar@deploy1003: Started scap sync-world: Backport for [sewikimedia] Enable signatures in the User-namespace (T374089)
  • 12:51 moritzm: installing node-undici security updates
  • 12:30 kevinbazira@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:23 moritzm: installing glibc bugfix updates from bookworm 12.7 point release
  • 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1004.wikimedia.org
  • 12:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1004.wikimedia.org
  • 12:18 kevinbazira@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 11:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM chartmuseum1001.eqiad.wmnet
  • 10:56 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1349.eqiad.wmnet
  • 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM chartmuseum1001.eqiad.wmnet
  • 10:50 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1349.eqiad.wmnet
  • 10:33 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:33 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:29 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1016.eqiad.wmnet
  • 10:25 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1016.eqiad.wmnet
  • 10:06 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Upgrading mariadb on clouddb1016 T365424
  • 10:06 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Upgrading mariadb on clouddb1016 T365424
  • 10:05 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:05 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:03 elukey@deploy1003: Finished scap sync-world: Update network policies to allow the new poolcounter vms. (duration: 04m 35s)
  • 10:03 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet,service=s4
  • 10:03 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet,service=s6
  • 10:00 elukey@deploy1003: Started scap sync-world: Update network policies to allow the new poolcounter vms.
  • 09:56 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
  • 09:50 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1015.eqiad.wmnet
  • 09:49 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 09:47 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1015.eqiad.wmnet
  • 09:42 elukey: upload helm3 3.11.3-2 to bookworm-wikimedia
  • 09:36 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet,service=s6
  • 09:36 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet,service=s4
  • 09:35 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Upgrading mariadb on clouddb1015 T365424
  • 09:35 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Upgrading mariadb on clouddb1015 T365424
  • 09:34 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 09:34 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 09:32 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1002"
  • 09:29 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1014.eqiad.wmnet
  • 09:26 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1014.eqiad.wmnet
  • 09:22 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Upgrading mariadb on clouddb1014 T365424
  • 09:22 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Upgrading mariadb on clouddb1014 T365424
  • 09:21 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 09:21 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 09:21 elukey: copy python3-docker-report from bullseye-wikimedia to bookworm-wikimedia
  • 09:19 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
  • 09:19 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 09:17 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 09:14 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2001.codfw.wmnet with reason: host reimage
  • 09:12 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1013.eqiad.wmnet
  • 09:09 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1013.eqiad.wmnet
  • 09:04 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 09:04 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s1
  • 09:03 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Upgrading mariadb on clouddb1013 T365424
  • 09:03 fnegri@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Upgrading mariadb on clouddb1013 T365424
  • 09:03 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 08:55 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on chartmuseum2001.codfw.wmnet with reason: host reimage
  • 08:53 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on chartmuseum2001.codfw.wmnet with reason: host reimage
  • 08:38 moritzm: bump memory allocation of chartmuseum1001/2001 to 2G (Bookworm fails to install with just 1G) T331969
  • 08:37 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:33 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:24 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:19 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:15 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:14 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:09 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:08 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 08:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2214 T374806', diff saved to https://phabricator.wikimedia.org/P69131 and previous config saved to /var/cache/conftool/dbconfig/20240916-080342-arnaudb.json
  • 08:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2129 to s6 primary T374806', diff saved to https://phabricator.wikimedia.org/P69130 and previous config saved to /var/cache/conftool/dbconfig/20240916-080132-arnaudb.json
  • 07:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db2213 T374805', diff saved to https://phabricator.wikimedia.org/P69129 and previous config saved to /var/cache/conftool/dbconfig/20240916-075059-arnaudb.json
  • 07:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s6 T374806
  • 07:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s6 T374806
  • 07:45 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host chartmuseum2001.codfw.wmnet with OS bookworm
  • 07:43 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 07:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2123 to s5 primary T374805', diff saved to https://phabricator.wikimedia.org/P69128 and previous config saved to /var/cache/conftool/dbconfig/20240916-074312-arnaudb.json
  • 07:42 arnaudb: Starting s5 codfw failover from db2213 to db2123 - T374805
  • 07:39 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 07:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2123 from API/vslow/dump T374805', diff saved to https://phabricator.wikimedia.org/P69126 and previous config saved to /var/cache/conftool/dbconfig/20240916-073521-arnaudb.json
  • 07:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T374805
  • 07:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s5 T374805
  • 07:33 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 07:33 elukey@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 07:23 elukey@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
  • 06:24 moritzm: installing git security updates
  • 06:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1125.eqiad.wmnet with reason: testing node
  • 06:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1125.eqiad.wmnet with reason: testing node

2024-09-15

  • 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 12:03 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw

2024-09-14

  • 11:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P69122 and previous config saved to /var/cache/conftool/dbconfig/20240914-111434-ladsgroup.json
  • 11:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P69121 and previous config saved to /var/cache/conftool/dbconfig/20240914-110416-ladsgroup.json
  • 10:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P69120 and previous config saved to /var/cache/conftool/dbconfig/20240914-104909-ladsgroup.json
  • 10:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T367856)', diff saved to https://phabricator.wikimedia.org/P69119 and previous config saved to /var/cache/conftool/dbconfig/20240914-103402-ladsgroup.json

2024-09-13

  • 21:24 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on vrts2001.codfw.wmnet with reason: nftables migration
  • 21:24 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on vrts2001.codfw.wmnet with reason: nftables migration
  • 21:14 dwisehaupt@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:14 dwisehaupt@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decommissioning frban2001 - dwisehaupt@cumin1002"
  • 21:14 dwisehaupt@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decommissioning frban2001 - dwisehaupt@cumin1002"
  • 21:11 dwisehaupt@cumin1002: START - Cookbook sre.dns.netbox
  • 19:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs[2021-2024].codfw.wmnet
  • 19:52 bking@cumin2002: START - Cookbook sre.hosts.remove-downtime for wdqs[2021-2024].codfw.wmnet
  • 19:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs[1021-1024].eqiad.wmnet
  • 19:52 bking@cumin2002: START - Cookbook sre.hosts.remove-downtime for wdqs[1021-1024].eqiad.wmnet
  • 17:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.transfer-purged-positions (exit_code=0) rolling custom on P{cp[2028-2034,2038-2042].codfw.wmnet,cp[5017,5019-5020,5023,5027-5028,5030].eqsin.wmnet,cp[4038-4052].ulsfo.wmnet} and A:cp
  • 16:59 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2123.codfw.wmnet
  • 16:59 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2123.codfw.wmnet
  • 16:59 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2123.codfw.wmnet
  • 16:57 swfrench-wmf: running homer cr*codfw* commit 'T372878'
  • 16:56 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on wdqs[1021-1024].eqiad.wmnet with reason: T373935
  • 16:56 bking@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on wdqs[1021-1024].eqiad.wmnet with reason: T373935
  • 16:55 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on wdqs[2021-2024].codfw.wmnet with reason: T373791
  • 16:55 bking@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on wdqs[2021-2024].codfw.wmnet with reason: T373791
  • 16:52 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2122.codfw.wmnet
  • 16:52 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2122.codfw.wmnet
  • 16:52 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2122.codfw.wmnet
  • 16:50 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2123.codfw.wmnet with OS bullseye
  • 16:46 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2121.codfw.wmnet
  • 16:46 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2121.codfw.wmnet
  • 16:46 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2121.codfw.wmnet
  • 16:43 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2122.codfw.wmnet with OS bullseye
  • 16:38 swfrench-wmf: running homer lsw1-b3-codfw* commit 'T372878'
  • 16:35 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2121.codfw.wmnet with OS bullseye
  • 16:31 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 16:27 swfrench@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2123.codfw.wmnet with reason: host reimage
  • 16:23 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 16:18 swfrench@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2122.codfw.wmnet with reason: host reimage
  • 16:18 vgutierrez@cumin1002: START - Cookbook sre.cdn.transfer-purged-positions rolling custom on P{cp[2028-2034,2038-2042].codfw.wmnet,cp[5017,5019-5020,5023,5027-5028,5030].eqsin.wmnet,cp[4038-4052].ulsfo.wmnet} and A:cp
  • 16:16 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 16:15 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.transfer-purged-positions (exit_code=0) rolling custom on P{cp2027*} and A:cp
  • 16:13 vgutierrez@cumin1002: START - Cookbook sre.cdn.transfer-purged-positions rolling custom on P{cp2027*} and A:cp
  • 16:12 swfrench@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2121.codfw.wmnet with reason: host reimage
  • 16:10 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2123
  • 16:10 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2123
  • 16:10 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2123
  • 16:10 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2123.codfw.wmnet 164.16.192.10.in-addr.arpa 4.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:09 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2123.codfw.wmnet 164.16.192.10.in-addr.arpa 4.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:09 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:09 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2123 - swfrench@cumin2002"
  • 16:09 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2123 - swfrench@cumin2002"
  • 16:08 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@e8b4e0b] (releasing): (no justification provided) (duration: 00m 43s)
  • 16:07 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@e8b4e0b] (releasing): (no justification provided)
  • 16:07 dduvall: performing Friday deployment of jenkins-deploy (releases server) to fix broken job (see https://gitlab.wikimedia.org/repos/releng/jenkins-deploy/-/merge_requests/81)
  • 16:06 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:05 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 16:03 swfrench@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2123
  • 16:03 swfrench@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2123.codfw.wmnet with OS bullseye
  • 16:03 swfrench@cumin2002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2123.codfw.wmnet
  • 16:02 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2122
  • 16:02 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2122
  • 16:02 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2122
  • 16:01 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2122.codfw.wmnet 163.16.192.10.in-addr.arpa 3.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:01 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2122.codfw.wmnet 163.16.192.10.in-addr.arpa 3.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:01 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:01 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2122 - swfrench@cumin2002"
  • 16:01 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2122 - swfrench@cumin2002"
  • 15:57 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 15:57 swfrench@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2122
  • 15:57 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 swfrench@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2122.codfw.wmnet with OS bullseye
  • 15:56 swfrench@cumin2002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2122.codfw.wmnet
  • 15:56 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:55 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2121
  • 15:55 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2121
  • 15:55 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2121
  • 15:55 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2121.codfw.wmnet 162.16.192.10.in-addr.arpa 2.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:55 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2121.codfw.wmnet 162.16.192.10.in-addr.arpa 2.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:55 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:55 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2121 - swfrench@cumin2002"
  • 15:55 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2121 - swfrench@cumin2002"
  • 15:51 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 15:50 swfrench@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2121
  • 15:50 swfrench@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2121.codfw.wmnet with OS bullseye
  • 15:50 swfrench@cumin2002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2121.codfw.wmnet
  • 15:46 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2121.codfw.wmnet wikikube-worker2122.codfw.wmnet wikikube-worker2123.codfw.wmnet on all recursors
  • 15:46 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2121.codfw.wmnet wikikube-worker2122.codfw.wmnet wikikube-worker2123.codfw.wmnet on all recursors
  • 15:39 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2315 to wikikube-worker2123
  • 15:38 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2123
  • 15:38 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2123
  • 15:38 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:38 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2315 to wikikube-worker2123 - swfrench@cumin2002"
  • 15:38 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2315 to wikikube-worker2123 - swfrench@cumin2002"
  • 15:34 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 15:34 swfrench@cumin2002: START - Cookbook sre.hosts.rename from mw2315 to wikikube-worker2123
  • 15:33 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2314 to wikikube-worker2122
  • 15:32 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2122
  • 15:32 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2122
  • 15:32 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:32 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2314 to wikikube-worker2122 - swfrench@cumin2002"
  • 15:32 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2314 to wikikube-worker2122 - swfrench@cumin2002"
  • 15:28 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 15:28 swfrench@cumin2002: START - Cookbook sre.hosts.rename from mw2314 to wikikube-worker2122
  • 15:27 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2313 to wikikube-worker2121
  • 15:26 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2121
  • 15:26 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2121
  • 15:26 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:26 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2313 to wikikube-worker2121 - swfrench@cumin2002"
  • 15:26 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2313 to wikikube-worker2121 - swfrench@cumin2002"
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2120.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2120.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2119.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2119.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2118.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2118.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2117.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2117.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2116.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2116.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2115.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2115.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2114.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2114.codfw.wmnet
  • 15:23 akosiaris@cumin1002: END (ERROR) - Cookbook sre.k8s.pool-depool-node (exit_code=97) pool for host wikikube-worker2111.codfw.wmnet
  • 15:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2111.codfw.wmnet
  • 15:22 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 15:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
  • 15:22 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
  • 15:22 swfrench@cumin2002: START - Cookbook sre.hosts.rename from mw2313 to wikikube-worker2121
  • 15:17 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2120.codfw.wmnet
  • 15:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2120.codfw.wmnet
  • 15:17 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2120.codfw.wmnet
  • 15:16 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2315.codfw.wmnet
  • 15:16 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2315.codfw.wmnet
  • 15:15 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2314.codfw.wmnet
  • 15:15 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2314.codfw.wmnet
  • 15:14 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2313.codfw.wmnet
  • 15:14 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2313.codfw.wmnet
  • 15:12 akosiaris: homer lsw1-a6-codfw* commit T372878
  • 14:48 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 14:48 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 14:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2120.codfw.wmnet with OS bullseye
  • 14:41 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2118.codfw.wmnet
  • 14:41 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker2118.codfw.wmnet
  • 14:40 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2118.codfw.wmnet
  • 14:39 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2117.codfw.wmnet
  • 14:39 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2117.codfw.wmnet with OS bullseye
  • 14:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.transfer-purged-positions (exit_code=0) rolling custom on P{cp2035*} and A:cp
  • 14:33 akosiaris: homer lsw1-a6-codfw* commit 'T372878'
  • 14:33 akosiaris: homer cr*codfw* commit 'T372878'
  • 14:32 vgutierrez@cumin1002: START - Cookbook sre.cdn.transfer-purged-positions rolling custom on P{cp2035*} and A:cp
  • 14:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2118.codfw.wmnet with OS bullseye
  • 14:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1172.eqiad.wmnet with reason: Schema change (T367856)
  • 14:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1172.eqiad.wmnet with reason: Schema change (T367856)
  • 14:19 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.transfer-purged-positions (exit_code=0) rolling custom on P{cp2036*} and A:cp
  • 14:18 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2119.codfw.wmnet
  • 14:18 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2119.codfw.wmnet with OS bullseye
  • 14:17 vgutierrez@cumin1002: START - Cookbook sre.cdn.transfer-purged-positions rolling custom on P{cp2036*} and A:cp
  • 14:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 14:09 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2118.codfw.wmnet with reason: host reimage
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 14:08 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2116.codfw.wmnet
  • 14:08 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2116.codfw.wmnet with OS bullseye
  • 14:05 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2120.codfw.wmnet with reason: host reimage
  • 14:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2115.codfw.wmnet
  • 14:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2115.codfw.wmnet with OS bullseye
  • 14:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2114.codfw.wmnet
  • 14:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 14:01 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2115.codfw.wmnet with OS bullseye
  • 14:00 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2115.codfw.wmnet
  • 14:00 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2115.codfw.wmnet
  • 14:00 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2115.codfw.wmnet
  • 13:59 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 13:58 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2114.codfw.wmnet
  • 13:55 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2115.codfw.wmnet
  • 13:55 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2115.codfw.wmnet with OS bullseye
  • 13:53 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2114.codfw.wmnet
  • 13:53 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 13:53 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 13:52 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2114.codfw.wmnet
  • 13:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2118
  • 13:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2118
  • 13:42 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2114.codfw.wmnet
  • 13:42 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 13:40 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2118
  • 13:40 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2118.codfw.wmnet 173.0.192.10.in-addr.arpa 3.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:40 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2118.codfw.wmnet 173.0.192.10.in-addr.arpa 3.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:40 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2118 - akosiaris@cumin1002"
  • 13:40 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2118 - akosiaris@cumin1002"
  • 13:37 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:27 hashar@deploy1003: Finished scap sync-world: Backport for rdbms: only count replication sources toward "masterConns" in getServerConnection() (T374534) (duration: 10m 34s)
  • 13:22 hashar@deploy1003: hashar: Continuing with sync
  • 13:22 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2118
  • 13:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2117
  • 13:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2117
  • 13:21 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2117
  • 13:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2117.codfw.wmnet 172.0.192.10.in-addr.arpa 2.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:21 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2117.codfw.wmnet 172.0.192.10.in-addr.arpa 2.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2117 - akosiaris@cumin1002"
  • 13:21 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2117 - akosiaris@cumin1002"
  • 13:20 hashar@deploy1003: hashar: Backport for rdbms: only count replication sources toward "masterConns" in getServerConnection() (T374534) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:17 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:16 hashar@deploy1003: Started scap sync-world: Backport for rdbms: only count replication sources toward "masterConns" in getServerConnection() (T374534)
  • 13:12 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2117
  • 13:12 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2118.codfw.wmnet with OS bullseye
  • 13:11 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2118.codfw.wmnet
  • 13:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2120
  • 13:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2120
  • 13:10 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2118.codfw.wmnet
  • 13:10 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2118.codfw.wmnet with OS bullseye
  • 13:05 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2120
  • 13:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2120.codfw.wmnet 175.0.192.10.in-addr.arpa 5.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:05 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2120.codfw.wmnet 175.0.192.10.in-addr.arpa 5.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2120 - akosiaris@cumin1002"
  • 13:05 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2120 - akosiaris@cumin1002"
  • 13:01 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:01 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2120
  • 13:01 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2119
  • 13:01 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2119
  • 13:00 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 31s)
  • 13:00 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2119
  • 12:57 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2119.codfw.wmnet 174.0.192.10.in-addr.arpa 4.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2119.codfw.wmnet 174.0.192.10.in-addr.arpa 4.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:57 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:57 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2119 - akosiaris@cumin1002"
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2119 - akosiaris@cumin1002"
  • 12:54 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2117.codfw.wmnet with OS bullseye
  • 12:54 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2117.codfw.wmnet
  • 12:54 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2117.codfw.wmnet on all recursors
  • 12:53 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2117.codfw.wmnet on all recursors
  • 12:53 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2117.codfw.wmnet
  • 12:53 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2117.codfw.wmnet
  • 12:52 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:52 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2120.codfw.wmnet with OS bullseye
  • 12:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2302 to wikikube-worker2117
  • 12:52 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2120.codfw.wmnet
  • 12:52 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2119
  • 12:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2116
  • 12:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2116
  • 12:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2117
  • 12:49 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2117
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2302 to wikikube-worker2117 - akosiaris@cumin1002"
  • 12:49 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2302 to wikikube-worker2117 - akosiaris@cumin1002"
  • 12:49 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2116
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2116.codfw.wmnet 171.0.192.10.in-addr.arpa 1.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:49 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2116.codfw.wmnet 171.0.192.10.in-addr.arpa 1.7.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2116 - akosiaris@cumin1002"
  • 12:45 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2119.codfw.wmnet with OS bullseye
  • 12:45 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2116 - akosiaris@cumin1002"
  • 12:45 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2119.codfw.wmnet
  • 12:45 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:44 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2302 to wikikube-worker2117
  • 12:41 godog: bounce thanos-query-frontend on titan eqiad
  • 12:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2118.codfw.wmnet with OS bullseye
  • 12:41 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2305 to wikikube-worker2120
  • 12:40 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2120
  • 12:40 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2118.codfw.wmnet
  • 12:39 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2118.codfw.wmnet on all recursors
  • 12:39 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2120
  • 12:39 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2118.codfw.wmnet on all recursors
  • 12:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2305 to wikikube-worker2120 - akosiaris@cumin1002"
  • 12:39 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2305 to wikikube-worker2120 - akosiaris@cumin1002"
  • 12:38 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2116
  • 12:37 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2115
  • 12:37 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2115
  • 12:36 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2115
  • 12:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2115.codfw.wmnet 124.0.192.10.in-addr.arpa 4.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2115.codfw.wmnet 124.0.192.10.in-addr.arpa 4.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2115 - akosiaris@cumin1002"
  • 12:35 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:35 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2115 - akosiaris@cumin1002"
  • 12:32 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2117.codfw.wmnet on all recursors
  • 12:31 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2117.codfw.wmnet on all recursors
  • 12:31 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2305 to wikikube-worker2120
  • 12:30 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2304 to wikikube-worker2119
  • 12:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2119
  • 12:29 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2119
  • 12:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2304 to wikikube-worker2119 - akosiaris@cumin1002"
  • 12:29 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2304 to wikikube-worker2119 - akosiaris@cumin1002"
  • 12:26 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2116.codfw.wmnet with OS bullseye
  • 12:26 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2116.codfw.wmnet
  • 12:25 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:25 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2304 to wikikube-worker2119
  • 12:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2303 to wikikube-worker2118
  • 12:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2118
  • 12:24 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2115
  • 12:24 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2118
  • 12:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2303 to wikikube-worker2118 - akosiaris@cumin1002"
  • 12:24 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2303 to wikikube-worker2118 - akosiaris@cumin1002"
  • 12:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2114
  • 12:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2114
  • 12:22 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2114
  • 12:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2114.codfw.wmnet 102.0.192.10.in-addr.arpa 2.0.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:22 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2114.codfw.wmnet 102.0.192.10.in-addr.arpa 2.0.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2114 - akosiaris@cumin1002"
  • 12:20 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:18 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2114 - akosiaris@cumin1002"
  • 12:15 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2303 to wikikube-worker2118
  • 12:15 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2301 to wikikube-worker2116
  • 12:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2116
  • 12:12 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2116
  • 12:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2301 to wikikube-worker2116 - akosiaris@cumin1002"
  • 12:12 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2114
  • 12:12 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2115.codfw.wmnet with OS bullseye
  • 12:12 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2114.codfw.wmnet with OS bullseye
  • 12:11 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2301 to wikikube-worker2116 - akosiaris@cumin1002"
  • 12:11 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2115.codfw.wmnet
  • 12:11 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2114.codfw.wmnet
  • 12:07 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:07 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2301 to wikikube-worker2116
  • 12:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2060 to wikikube-worker2115
  • 12:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2115
  • 12:05 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2115
  • 12:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2060 to wikikube-worker2115 - akosiaris@cumin1002"
  • 11:59 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2060 to wikikube-worker2115 - akosiaris@cumin1002"
  • 11:54 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 11:54 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2060 to wikikube-worker2115
  • 11:53 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2059 to wikikube-worker2114
  • 11:53 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2114
  • 11:52 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2114
  • 11:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:52 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2059 to wikikube-worker2114 - akosiaris@cumin1002"
  • 11:50 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2059 to wikikube-worker2114 - akosiaris@cumin1002"
  • 11:47 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 11:47 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2059 to wikikube-worker2114
  • 11:18 damilare: SmashPig upgraded from 08c79f4f to ac85ad1d
  • 10:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1172.eqiad.wmnet with reason: Depooled recovering replag
  • 10:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1172.eqiad.wmnet with reason: Depooled recovering replag
  • 10:37 xSavitar: T374684 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwiki --logwiki=metawiki 'Monty.ch' 'MajorFault'
  • 10:33 xSavitar: T374684 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwiki --logwiki=metawiki 'Mohamadanisahmad5' 'Vanished user a53a2dd4f79a7bde25cf2ea2b2a309cb'
  • 10:29 xSavitar: T374684 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Iosonopony' 'L.Sala'
  • 10:29 xSavitar: T374684 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'IloveuFlyTek' 'Theology1937' --ignorestatus
  • 10:25 xSavitar: T12345 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'IloveuFlyTek' 'Theology1937' --ignorestatus
  • 10:18 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1172.eqiad.wmnet with reason: ongoing schema change
  • 10:18 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1172.eqiad.wmnet with reason: ongoing schema change
  • 09:41 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 09:41 klausman@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - klausman@cumin1002"
  • 09:41 klausman@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - klausman@cumin1002"
  • 09:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kafka-main2005.codfw.wmnet
  • 09:34 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:34 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:34 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:28 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
  • 09:27 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2305.codfw.wmnet
  • 09:24 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2305.codfw.wmnet
  • 09:24 klausman@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
  • 09:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2304.codfw.wmnet
  • 09:24 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2304.codfw.wmnet
  • 09:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2303.codfw.wmnet
  • 09:23 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2303.codfw.wmnet
  • 09:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2302.codfw.wmnet
  • 09:23 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:23 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:23 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 09:22 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2302.codfw.wmnet
  • 09:22 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 09:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2301.codfw.wmnet
  • 09:22 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 09:22 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2301.codfw.wmnet
  • 09:22 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2060.codfw.wmnet
  • 09:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 09:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 09:21 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2060.codfw.wmnet
  • 09:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2059.codfw.wmnet
  • 09:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 09:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2059.codfw.wmnet
  • 09:20 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 09:20 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kafka-main2005.codfw.wmnet
  • 09:20 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 09:20 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 09:19 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 09:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2010.codfw.wmnet
  • 09:15 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2010.codfw.wmnet
  • 09:14 jayme: restoring leadership for all partitions assigned to broker id 2005 on kafka-main-codfw - T363210
  • 09:12 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 09:09 klausman@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 09:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:01 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host poolcounter1007.eqiad.wmnet
  • 09:01 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host poolcounter1007.eqiad.wmnet with OS bookworm
  • 08:48 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 08:47 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 08:46 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on poolcounter1007.eqiad.wmnet with reason: host reimage
  • 08:45 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 08:43 klausman@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 08:42 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on poolcounter1007.eqiad.wmnet with reason: host reimage
  • 08:39 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 08:35 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 08:32 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host poolcounter1007.eqiad.wmnet with OS bookworm
  • 08:30 klausman@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 08:29 moritzm: rolling out djangorestframework update from Bookworm point release (replacing our previous bespoke build)
  • 08:28 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter1007.eqiad.wmnet - elukey@cumin1002"
  • 08:28 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter1007.eqiad.wmnet - elukey@cumin1002"
  • 08:28 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) poolcounter1007.eqiad.wmnet on all recursors
  • 08:28 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache poolcounter1007.eqiad.wmnet on all recursors
  • 08:28 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:28 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter1007.eqiad.wmnet - elukey@cumin1002"
  • 08:27 moritzm: remove djangorestframework 3.14.0-2+wmf12u1 from apt.wikimedia.org; the bug fixed in that custom build has been integrated into Debian Bookworm via a point update, so the custom build is no longer needed
  • 08:25 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter1007.eqiad.wmnet - elukey@cumin1002"
  • 08:19 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 08:19 elukey@cumin1002: START - Cookbook sre.ganeti.makevm for new host poolcounter1007.eqiad.wmnet
  • 08:18 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host poolcounter1006.eqiad.wmnet
  • 08:18 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host poolcounter1006.eqiad.wmnet with OS bookworm
  • 08:16 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: gerrit1004.wikimedia.org
  • 08:16 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: gerrit1004.wikimedia.org
  • 08:05 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on poolcounter1006.eqiad.wmnet with reason: host reimage
  • 08:02 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on poolcounter1006.eqiad.wmnet with reason: host reimage
  • 07:53 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host poolcounter1006.eqiad.wmnet with OS bookworm
  • 07:52 moritzm: installing nano updates from Bookworm point release
  • 07:51 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter1006.eqiad.wmnet - elukey@cumin1002"
  • 07:51 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter1006.eqiad.wmnet - elukey@cumin1002"
  • 07:50 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) poolcounter1006.eqiad.wmnet on all recursors
  • 07:50 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache poolcounter1006.eqiad.wmnet on all recursors
  • 07:50 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:50 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter1006.eqiad.wmnet - elukey@cumin1002"
  • 07:50 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter1006.eqiad.wmnet - elukey@cumin1002"
  • 07:47 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 07:47 elukey@cumin1002: START - Cookbook sre.ganeti.makevm for new host poolcounter1006.eqiad.wmnet
  • 07:46 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 07:35 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:35 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 07:35 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:35 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 07:35 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 07:34 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:33 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 07:33 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:33 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 07:33 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 07:32 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 07:32 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 07:32 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 07:27 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 06:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2005,2010].codfw.wmnet with reason: Hardware refresh
  • 06:56 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2005,2010].codfw.wmnet with reason: Hardware refresh
  • 06:54 jayme: evacuating leadership for all partitions assigned to broker id 2005 on kafka-main-codfw - T363210

2024-09-12

  • 23:00 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@35befba] (releasing): (no justification provided) (duration: 00m 38s)
  • 22:59 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@35befba] (releasing): (no justification provided)
  • 22:34 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@6e810dc] (releasing): (no justification provided) (duration: 00m 34s)
  • 22:33 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@6e810dc] (releasing): (no justification provided)
  • 22:30 dduvall@deploy1003: deploy aborted: (no justification provided) (duration: 01m 43s)
  • 22:28 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@6e810dc] (releasing): (no justification provided)
  • 22:11 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 22:11 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 22:00 tzatziki: removing 1 file for legal compliance
  • 21:56 tzatziki: removing 6 files for legal compliance
  • 21:54 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 21:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for Fix night mode excepted Wikidata namespaces (duration: 07m 09s)
  • 21:52 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 21:48 ladsgroup@deploy1003: ladsgroup, ebrahim: Continuing with sync
  • 21:47 ladsgroup@deploy1003: ladsgroup, ebrahim: Backport for Fix night mode excepted Wikidata namespaces synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:45 ladsgroup@deploy1003: Started scap sync-world: Backport for Fix night mode excepted Wikidata namespaces
  • 21:43 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 21:40 cjming: end of UTC late backport window
  • 21:37 bd808@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 21:37 cjming@deploy1003: Finished scap sync-world: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata" (duration: 06m 37s)
  • 21:33 cjming@deploy1003: cjming, gergesshamon: Continuing with sync
  • 21:32 cjming@deploy1003: cjming, gergesshamon: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:30 cjming@deploy1003: Started scap sync-world: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata"
  • 21:20 tzatziki: removing 1 file for legal compliance
  • 21:16 cjming@deploy1003: Finished scap sync-world: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata" (duration: 08m 57s)
  • 21:13 tzatziki: removing 1 file for legal compliance
  • 21:12 cjming@deploy1003: cjming, trainbranchbot: Continuing with sync
  • 21:09 cjming@deploy1003: cjming, trainbranchbot: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:07 cjming@deploy1003: Started scap sync-world: Backport for Revert "Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata"
  • 21:05 cjming@deploy1003: Sync cancelled.
  • 21:01 cjming@deploy1003: cjming, gergesshamon: Backport for Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:59 cjming@deploy1003: Started scap sync-world: Backport for Lift IP cap on this dates 10/09, 17/09, 24/09 for edit-a-thon for eswiki, commons and wikidata
  • 20:53 cjming@deploy1003: Finished scap sync-world: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374484) (duration: 09m 21s)
  • 20:49 cjming@deploy1003: cjming, superzerocool: Continuing with sync
  • 20:46 cjming@deploy1003: cjming, superzerocool: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374484) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:44 cjming@deploy1003: Started scap sync-world: Backport for eswiki, commonswiki, wikidata: lift IP cap for edit-a-thon (T374484)
  • 20:43 cjming@deploy1003: Finished scap sync-world: Backport for Enable the dark mode in Portal namespace (T366380) (duration: 08m 57s)
  • 20:38 cjming@deploy1003: ebrahim, cjming: Continuing with sync
  • 20:36 cjming@deploy1003: ebrahim, cjming: Backport for Enable the dark mode in Portal namespace (T366380) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:34 cjming@deploy1003: Started scap sync-world: Backport for Enable the dark mode in Portal namespace (T366380)
  • 20:33 cjming@deploy1003: Finished scap sync-world: Backport for Remove unused $wgAllowRequiringEmailForResets (T242406), Remove unused $wmgPoweredByMediaWikiIcon, Remove unused settings removed in T339959 (duration: 07m 19s)
  • 20:28 cjming@deploy1003: matmarex, cjming: Continuing with sync
  • 20:28 cjming@deploy1003: matmarex, cjming: Backport for Remove unused $wgAllowRequiringEmailForResets (T242406), Remove unused $wmgPoweredByMediaWikiIcon, Remove unused settings removed in T339959 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:25 cjming@deploy1003: Started scap sync-world: Backport for Remove unused $wgAllowRequiringEmailForResets (T242406), Remove unused $wmgPoweredByMediaWikiIcon, Remove unused settings removed in T339959
  • 20:19 cjming@deploy1003: Finished scap sync-world: Backport for Enable AutoModerator on ukwiki (T373823) (duration: 07m 01s)
  • 20:14 cjming@deploy1003: kgraessle, cjming: Continuing with sync
  • 20:14 cjming@deploy1003: kgraessle, cjming: Backport for Enable AutoModerator on ukwiki (T373823) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:12 cjming@deploy1003: Started scap sync-world: Backport for Enable AutoModerator on ukwiki (T373823)
  • 20:11 cjming@deploy1003: Finished scap sync-world: Backport for u4cwiki: create case and case_talk namespaces (T374439) (duration: 07m 36s)
  • 20:06 cjming@deploy1003: hamishz, cjming: Continuing with sync
  • 20:06 cjming@deploy1003: hamishz, cjming: Backport for u4cwiki: create case and case_talk namespaces (T374439) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:03 cjming@deploy1003: Started scap sync-world: Backport for u4cwiki: create case and case_talk namespaces (T374439)
  • 19:57 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on vrts2002.codfw.wmnet with reason: Migration
  • 19:57 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on vrts2002.codfw.wmnet with reason: Migration
  • 19:03 swfrench-wmf: rebuilt php8.1 production images to pick up php-uuid - T372602
  • 18:34 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Migration
  • 18:33 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on vrts2001.codfw.wmnet with reason: Migration
  • 18:20 swfrench-wmf: ran systemctl reset-failed mediawiki_job_MachineVision_prioritize_uncategorized.service on mwmaint1002 to clear failed state for turned down job - T352884
  • 18:18 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.22 refs T373641
  • 17:28 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on aqs1014.eqiad.wmnet with reason: SSD device troubleshooting
  • 17:27 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on aqs1014.eqiad.wmnet with reason: SSD device troubleshooting
  • 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:14 jynus: restarting db1171:s8 mysql process T374610
  • 17:11 isaranto@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69106 and previous config saved to /var/cache/conftool/dbconfig/20240912-170524-arnaudb.json
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'pc2014 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69105 and previous config saved to /var/cache/conftool/dbconfig/20240912-170524-arnaudb.json
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69104 and previous config saved to /var/cache/conftool/dbconfig/20240912-170514-arnaudb.json
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69103 and previous config saved to /var/cache/conftool/dbconfig/20240912-170509-arnaudb.json
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69102 and previous config saved to /var/cache/conftool/dbconfig/20240912-170504-arnaudb.json
  • 17:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69101 and previous config saved to /var/cache/conftool/dbconfig/20240912-170459-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69100 and previous config saved to /var/cache/conftool/dbconfig/20240912-170453-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69099 and previous config saved to /var/cache/conftool/dbconfig/20240912-170449-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69098 and previous config saved to /var/cache/conftool/dbconfig/20240912-170444-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69097 and previous config saved to /var/cache/conftool/dbconfig/20240912-170439-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P69096 and previous config saved to /var/cache/conftool/dbconfig/20240912-170433-arnaudb.json
  • 16:52 kcvelaga@deploy1003: Finished deploy [airflow-dags/analytics_product@d045bb2]: (no justification provided) (duration: 00m 30s)
  • 16:51 kcvelaga@deploy1003: Started deploy [airflow-dags/analytics_product@d045bb2]: (no justification provided)
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69094 and previous config saved to /var/cache/conftool/dbconfig/20240912-165018-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69093 and previous config saved to /var/cache/conftool/dbconfig/20240912-165009-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69092 and previous config saved to /var/cache/conftool/dbconfig/20240912-165003-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69091 and previous config saved to /var/cache/conftool/dbconfig/20240912-164959-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69090 and previous config saved to /var/cache/conftool/dbconfig/20240912-164953-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69089 and previous config saved to /var/cache/conftool/dbconfig/20240912-164948-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69088 and previous config saved to /var/cache/conftool/dbconfig/20240912-164943-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69087 and previous config saved to /var/cache/conftool/dbconfig/20240912-164938-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69086 and previous config saved to /var/cache/conftool/dbconfig/20240912-164933-arnaudb.json
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P69085 and previous config saved to /var/cache/conftool/dbconfig/20240912-164927-arnaudb.json
  • 16:36 topranks: disable now-unused ports on asw-d1-codfw and asw-d2-codfw T373102
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69084 and previous config saved to /var/cache/conftool/dbconfig/20240912-163513-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69083 and previous config saved to /var/cache/conftool/dbconfig/20240912-163503-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69082 and previous config saved to /var/cache/conftool/dbconfig/20240912-163458-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69081 and previous config saved to /var/cache/conftool/dbconfig/20240912-163453-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69080 and previous config saved to /var/cache/conftool/dbconfig/20240912-163448-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69079 and previous config saved to /var/cache/conftool/dbconfig/20240912-163443-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69078 and previous config saved to /var/cache/conftool/dbconfig/20240912-163438-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69077 and previous config saved to /var/cache/conftool/dbconfig/20240912-163433-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69076 and previous config saved to /var/cache/conftool/dbconfig/20240912-163427-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P69075 and previous config saved to /var/cache/conftool/dbconfig/20240912-163422-arnaudb.json
  • 16:32 urandom: pooling ms-fe2012 moss-fe2002 & thanos-fe2003 — T373102
  • 16:31 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2045.codfw.wmnet
  • 16:31 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2045.codfw.wmnet
  • 16:31 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2044.codfw.wmnet
  • 16:31 claime: Repooling kubernetes2044.codfw.wmnet kubernetes2045.codfw.wmnet - T373102
  • 16:31 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2044.codfw.wmnet
  • 16:24 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:23 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:21 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69073 and previous config saved to /var/cache/conftool/dbconfig/20240912-162007-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69072 and previous config saved to /var/cache/conftool/dbconfig/20240912-161957-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69071 and previous config saved to /var/cache/conftool/dbconfig/20240912-161952-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69070 and previous config saved to /var/cache/conftool/dbconfig/20240912-161947-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2212 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69069 and previous config saved to /var/cache/conftool/dbconfig/20240912-161942-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2211 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69068 and previous config saved to /var/cache/conftool/dbconfig/20240912-161937-arnaudb.json
  • 16:19 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69067 and previous config saved to /var/cache/conftool/dbconfig/20240912-161932-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69066 and previous config saved to /var/cache/conftool/dbconfig/20240912-161927-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69065 and previous config saved to /var/cache/conftool/dbconfig/20240912-161922-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2128 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P69064 and previous config saved to /var/cache/conftool/dbconfig/20240912-161916-arnaudb.json
  • 16:19 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:18 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns2006.wikimedia.org [reason: [end] T373102 codfw maintenance]
  • 16:14 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 16:13 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 16:12 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 16:12 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 16:09 jynus: restart ms-backup200[12] after maintenance and upgrade
  • 16:08 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 16:08 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 16:07 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns2006.wikimedia.org
  • 16:07 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for dns2006.wikimedia.org
  • 16:01 topranks: move server uplinks in codfw rack D1 from asw-d1-codfw to lsw1-d1-codfw T373102
  • 16:01 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2110.codfw.wmnet
  • 16:00 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2110.codfw.wmnet
  • 16:00 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2110.codfw.wmnet
  • 16:00 swfrench@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 16:00 swfrench@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 15:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2113.codfw.wmnet
  • 15:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2113.codfw.wmnet
  • 15:59 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2113.codfw.wmnet
  • 15:58 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2112.codfw.wmnet
  • 15:58 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2112.codfw.wmnet
  • 15:58 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2112.codfw.wmnet
  • 15:56 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2111.codfw.wmnet
  • 15:56 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2111.codfw.wmnet
  • 15:56 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2111.codfw.wmnet
  • 15:51 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on 21 hosts with reason: Move server uplinks codfw racks D2
  • 15:50 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:40:00 on 21 hosts with reason: Move server uplinks codfw racks D2
  • 15:50 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on 21 hosts with reason: Move server uplinks codfw racks D1
  • 15:50 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:40:00 on 21 hosts with reason: Move server uplinks codfw racks D1
  • 15:49 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on 21 hosts with reason: Move server uplinks codfw racks D1
  • 15:49 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on 21 hosts with reason: Move server uplinks codfw racks D1
  • 15:48 swfrench@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 15:48 swfrench@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 15:48 swfrench@cumin1002: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
  • 15:47 swfrench@cumin1002: START - Cookbook sre.discovery.datacenter status all services in all: None - None
  • 15:42 urandom: depooling ms-fe2012 moss-fe2002 & thanos-fe2003 — T373102
  • 15:40 arnaudb@cumin1002: dbctl commit (dc=all): 'depool es2034 which was perceived master for es3 - T370852', diff saved to https://phabricator.wikimedia.org/P69063 and previous config saved to /var/cache/conftool/dbconfig/20240912-154008-arnaudb.json
  • 15:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:45:00 on 11 hosts with reason: network maintenance T373101
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:45:00 on 11 hosts with reason: network maintenance T373101
  • 15:37 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2128 db2151 db2170 db2171 db2211 db2212 es2033 es2034 es2039 pc2014 db2209 - T370852', diff saved to https://phabricator.wikimedia.org/P69062 and previous config saved to /var/cache/conftool/dbconfig/20240912-153720-arnaudb.json
  • 15:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2045.codfw.wmnet
  • 15:30 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2045.codfw.wmnet
  • 15:30 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2044.codfw.wmnet
  • 15:29 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2044.codfw.wmnet
  • 15:29 claime: Depooling kubernetes2044.codfw.wmnet kubernetes2045.codfw.wmnet - T373102
  • 15:26 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns2006.wikimedia.org [reason: T373102 codfw maintenance]
  • 15:20 zabe: zabe@mwmaint1002:~$ mwscript extensions/WikimediaMaintenance/migrateESRefToContentTable.php {fawikiquote,fawikisource,fawiktionary} --skip /home/zabe/text_table_cleanup/{fawikiquote,fawikisource,fawiktionary} --dump /home/zabe/text_table_dump/{fawikiquote,fawikisource,fawiktionary} --sleep 1 # T183490
  • 15:11 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:11 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 15:00 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 15:00 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:59 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:59 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:52 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:52 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:47 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2113.codfw.wmnet with OS bullseye
  • 14:38 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:38 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T367781)', diff saved to https://phabricator.wikimedia.org/P69060 and previous config saved to /var/cache/conftool/dbconfig/20240912-143813-arnaudb.json
  • 14:30 cscott: cleanupTitles on enwiki complete (T363538)
  • 14:28 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 14:25 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2113.codfw.wmnet with reason: host reimage
  • 14:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2112.codfw.wmnet with OS bullseye
  • 14:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P69059 and previous config saved to /var/cache/conftool/dbconfig/20240912-142306-arnaudb.json
  • 14:18 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2109.codfw.wmnet
  • 14:18 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2109.codfw.wmnet
  • 14:18 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2109.codfw.wmnet
  • 14:17 sukhe: sudo cumin "A:dnsbox" "rm /etc/ntp.conf": cleaning up ntpd configuration file to avoid confusion with ntpsec.conf
  • 14:10 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2110.codfw.wmnet with OS bullseye
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2113
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2113
  • 14:09 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2113
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2113.codfw.wmnet 63.0.192.10.in-addr.arpa 3.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:09 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2113.codfw.wmnet 63.0.192.10.in-addr.arpa 3.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2113 - akosiaris@cumin1002"
  • 14:09 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2113 - akosiaris@cumin1002"
  • 14:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P69058 and previous config saved to /var/cache/conftool/dbconfig/20240912-140758-arnaudb.json
  • 14:04 denisse: Make alert2002 the active host - T372418
  • 14:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 14:02 denisse: Disable meta-monitoring for the alert hosts - T372418
  • 14:02 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2109.codfw.wmnet with OS bullseye
  • 14:01 denisse: Enable the alert[12]002 hosts as alertmanagers - T372418
  • 14:01 denisse: Enable the alert[12]002 hosts as alertmanagers
  • 14:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2112.codfw.wmnet with reason: host reimage
  • 13:56 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2108.codfw.wmnet with OS bullseye
  • 13:56 cscott: mwscript cleanupTitles enwiki 2>&1 | tee ~/T363538-enwiki-cleanupTitles
  • 13:56 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2107.codfw.wmnet
  • 13:56 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2107.codfw.wmnet
  • 13:56 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2107.codfw.wmnet
  • 13:55 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 13:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T367781)', diff saved to https://phabricator.wikimedia.org/P69057 and previous config saved to /var/cache/conftool/dbconfig/20240912-135251-arnaudb.json
  • 13:52 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:52 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2113
  • 13:52 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2111.codfw.wmnet with reason: host reimage
  • 13:52 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2113.codfw.wmnet with OS bullseye
  • 13:51 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2113.codfw.wmnet
  • 13:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1209 (T367781)', diff saved to https://phabricator.wikimedia.org/P69056 and previous config saved to /var/cache/conftool/dbconfig/20240912-135142-arnaudb.json
  • 13:51 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2113.codfw.wmnet on all recursors
  • 13:51 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2113.codfw.wmnet on all recursors
  • 13:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 13:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 13:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T367781)', diff saved to https://phabricator.wikimedia.org/P69055 and previous config saved to /var/cache/conftool/dbconfig/20240912-135131-arnaudb.json
  • 13:50 akosiaris: homer cr*codfw* commit 'T372878'
  • 13:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T370903)', diff saved to https://phabricator.wikimedia.org/P69054 and previous config saved to /var/cache/conftool/dbconfig/20240912-135003-ladsgroup.json
  • 13:50 cscott: mwscript namespaceDupes enwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-suffix=/T363538 --fix 2>&1 | tee ~/T363538-enwiki-namespaceDupes.take2
  • 13:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 13:49 cscott: namespaceDupes crashed on MOS:_OVERLINKING, re-running with --add-suffix
  • 13:48 akosiaris: homer lsw1-a3-codfw* commit 'T372878'
  • 13:47 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2107.codfw.wmnet with OS bullseye
  • 13:47 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2113.codfw.wmnet
  • 13:47 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2113.codfw.wmnet
  • 13:46 akosiaris@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2113.codfw.wmnet
  • 13:46 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2113.codfw.wmnet
  • 13:45 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2110.codfw.wmnet with reason: host reimage
  • 13:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2112
  • 13:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2112
  • 13:44 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2112
  • 13:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2112.codfw.wmnet 62.0.192.10.in-addr.arpa 2.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:44 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2112.codfw.wmnet 62.0.192.10.in-addr.arpa 2.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:44 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2399 to wikikube-worker2113
  • 13:43 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2113
  • 13:43 cscott: mwscript namespaceDupes enwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee ~/T363538-enwiki-namespaceDupes
  • 13:42 hashar: Afternoon backport deployments are completed. namespaceDupes is being run on enwiki for T363538#10140642
  • 13:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2109.codfw.wmnet with reason: host reimage
  • 13:42 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2113
  • 13:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2399 to wikikube-worker2113 - akosiaris@cumin1002"
  • 13:42 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2399 to wikikube-worker2113 - akosiaris@cumin1002"
  • 13:42 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:40 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2109.codfw.wmnet with reason: host reimage
  • 13:38 hashar@deploy1003: Finished scap sync-world: Backport for Elevate pseudo-namespace MOS to a real namespace on enwiki (T363538) (duration: 06m 39s)
  • 13:37 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host poolcounter2006.codfw.wmnet
  • 13:37 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host poolcounter2006.codfw.wmnet with OS bookworm
  • 13:36 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2112
  • 13:36 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2111
  • 13:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2111
  • 13:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P69052 and previous config saved to /var/cache/conftool/dbconfig/20240912-133623-arnaudb.json
  • 13:36 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 13:36 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2111
  • 13:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2111.codfw.wmnet 61.0.192.10.in-addr.arpa 1.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:36 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2111.codfw.wmnet 61.0.192.10.in-addr.arpa 1.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:36 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2111 - akosiaris@cumin1002"
  • 13:35 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2111 - akosiaris@cumin1002"
  • 13:35 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2108.codfw.wmnet with reason: host reimage
  • 13:35 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 13:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P69051 and previous config saved to /var/cache/conftool/dbconfig/20240912-133456-ladsgroup.json
  • 13:34 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2112.codfw.wmnet with OS bullseye
  • 13:34 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2112.codfw.wmnet
  • 13:33 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2399 to wikikube-worker2113
  • 13:33 hashar@deploy1003: cscott, hashar: Continuing with sync
  • 13:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2398 to wikikube-worker2112
  • 13:33 hashar@deploy1003: cscott, hashar: Backport for Elevate pseudo-namespace MOS to a real namespace on enwiki (T363538) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:32 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2112
  • 13:32 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:32 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2112
  • 13:32 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2108.codfw.wmnet with reason: host reimage
  • 13:31 hashar@deploy1003: Started scap sync-world: Backport for Elevate pseudo-namespace MOS to a real namespace on enwiki (T363538)
  • 13:30 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2111
  • 13:30 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
  • 13:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2110
  • 13:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2110
  • 13:30 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:29 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2110
  • 13:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2110.codfw.wmnet 60.0.192.10.in-addr.arpa 0.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:29 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2110.codfw.wmnet 60.0.192.10.in-addr.arpa 0.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2110 - akosiaris@cumin1002"
  • 13:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T367781)', diff saved to https://phabricator.wikimedia.org/P69050 and previous config saved to /var/cache/conftool/dbconfig/20240912-132943-arnaudb.json
  • 13:29 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2110 - akosiaris@cumin1002"
  • 13:28 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
  • 13:28 hashar@deploy1003: Finished scap sync-world: Backport for logging: Fix WikimediaDebug "Verbose logging" option (T374583) (duration: 07m 06s)
  • 13:28 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2107.codfw.wmnet with reason: host reimage
  • 13:26 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2111.codfw.wmnet with OS bullseye
  • 13:26 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2111.codfw.wmnet
  • 13:26 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:25 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2398 to wikikube-worker2112
  • 13:25 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2110
  • 13:25 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2110.codfw.wmnet with OS bullseye
  • 13:25 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2110.codfw.wmnet
  • 13:25 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2107.codfw.wmnet with reason: host reimage
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2397 to wikikube-worker2111
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2109
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2109
  • 13:24 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2109
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2109.codfw.wmnet 59.0.192.10.in-addr.arpa 9.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:24 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2109.codfw.wmnet 59.0.192.10.in-addr.arpa 9.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2109 - akosiaris@cumin1002"
  • 13:24 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2109 - akosiaris@cumin1002"
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2111
  • 13:24 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 13:24 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2111
  • 13:24 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:23 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2397 to wikikube-worker2111 - akosiaris@cumin1002"
  • 13:23 hashar@deploy1003: matmarex, hashar: Backport for logging: Fix WikimediaDebug "Verbose logging" option (T374583) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:23 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2397 to wikikube-worker2111 - akosiaris@cumin1002"
  • 13:22 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 13:21 hashar@deploy1003: Started scap sync-world: Backport for logging: Fix WikimediaDebug "Verbose logging" option (T374583)
  • 13:21 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on poolcounter2006.codfw.wmnet with reason: host reimage
  • 13:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P69049 and previous config saved to /var/cache/conftool/dbconfig/20240912-132116-arnaudb.json
  • 13:20 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P69048 and previous config saved to /var/cache/conftool/dbconfig/20240912-131948-ladsgroup.json
  • 13:18 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on poolcounter2006.codfw.wmnet with reason: host reimage
  • 13:15 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2109
  • 13:15 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2108
  • 13:15 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2108
  • 13:15 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2108
  • 13:15 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2108.codfw.wmnet 58.0.192.10.in-addr.arpa 8.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:15 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:15 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2108.codfw.wmnet 58.0.192.10.in-addr.arpa 8.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P69047 and previous config saved to /var/cache/conftool/dbconfig/20240912-131436-arnaudb.json
  • 13:14 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2397 to wikikube-worker2111
  • 13:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2396 to wikikube-worker2110
  • 13:12 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2110
  • 13:12 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2110
  • 13:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2396 to wikikube-worker2110 - akosiaris@cumin1002"
  • 13:11 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2396 to wikikube-worker2110 - akosiaris@cumin1002"
  • 13:10 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2015.codfw.wmnet
  • 13:09 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2109.codfw.wmnet with OS bullseye
  • 13:09 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2109.codfw.wmnet
  • 13:09 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2015.codfw.wmnet
  • 13:08 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2108
  • 13:08 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2107
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2107
  • 13:08 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2107
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2107.codfw.wmnet 53.0.192.10.in-addr.arpa 3.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:08 akosiaris@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2107.codfw.wmnet 53.0.192.10.in-addr.arpa 3.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2107 - akosiaris@cumin1002"
  • 13:08 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2107 - akosiaris@cumin1002"
  • 13:06 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2396 to wikikube-worker2110
  • 13:06 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2108.codfw.wmnet with OS bullseye
  • 13:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2395 to wikikube-worker2109
  • 13:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T367781)', diff saved to https://phabricator.wikimedia.org/P69046 and previous config saved to /var/cache/conftool/dbconfig/20240912-130608-arnaudb.json
  • 13:06 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2108.codfw.wmnet
  • 13:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2109
  • 13:04 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 13:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T370903)', diff saved to https://phabricator.wikimedia.org/P69045 and previous config saved to /var/cache/conftool/dbconfig/20240912-130441-ladsgroup.json
  • 13:04 akosiaris@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2107
  • 13:04 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2107.codfw.wmnet with OS bullseye
  • 13:04 akosiaris@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2107.codfw.wmnet
  • 13:04 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2109
  • 13:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2395 to wikikube-worker2109 - akosiaris@cumin1002"
  • 13:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T367781)', diff saved to https://phabricator.wikimedia.org/P69044 and previous config saved to /var/cache/conftool/dbconfig/20240912-130400-arnaudb.json
  • 13:03 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2395 to wikikube-worker2109 - akosiaris@cumin1002"
  • 13:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:02 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host poolcounter2006.codfw.wmnet with OS bookworm
  • 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P69043 and previous config saved to /var/cache/conftool/dbconfig/20240912-125928-arnaudb.json
  • 12:59 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter2006.codfw.wmnet - elukey@cumin1002"
  • 12:59 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter2006.codfw.wmnet - elukey@cumin1002"
  • 12:58 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) poolcounter2006.codfw.wmnet on all recursors
  • 12:58 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:58 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache poolcounter2006.codfw.wmnet on all recursors
  • 12:58 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:58 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter2006.codfw.wmnet - elukey@cumin1002"
  • 12:57 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter2006.codfw.wmnet - elukey@cumin1002"
  • 12:57 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2395 to wikikube-worker2109
  • 12:55 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2394 to wikikube-worker2108
  • 12:54 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2108
  • 12:54 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2108
  • 12:54 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 12:54 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:54 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2394 to wikikube-worker2108 - akosiaris@cumin1002"
  • 12:53 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2394 to wikikube-worker2108 - akosiaris@cumin1002"
  • 12:53 elukey@cumin1002: START - Cookbook sre.ganeti.makevm for new host poolcounter2006.codfw.wmnet
  • 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host deploy2002.codfw.wmnet
  • 12:50 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:50 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2394 to wikikube-worker2108
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2390 to wikikube-worker2107
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2107
  • 12:49 akosiaris@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2107
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:49 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2390 to wikikube-worker2107 - akosiaris@cumin1002"
  • 12:48 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2390 to wikikube-worker2107 - akosiaris@cumin1002"
  • 12:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T370903)', diff saved to https://phabricator.wikimedia.org/P69042 and previous config saved to /var/cache/conftool/dbconfig/20240912-124626-ladsgroup.json
  • 12:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 12:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 12:44 akosiaris@cumin1002: START - Cookbook sre.dns.netbox
  • 12:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T367781)', diff saved to https://phabricator.wikimedia.org/P69041 and previous config saved to /var/cache/conftool/dbconfig/20240912-124421-arnaudb.json
  • 12:44 akosiaris@cumin1002: START - Cookbook sre.hosts.rename from mw2390 to wikikube-worker2107
  • 12:41 elukey: thumbor codfw on wikikube moved to poolcounter2005 - T332015
  • 12:38 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 12:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T367781)', diff saved to https://phabricator.wikimedia.org/P69040 and previous config saved to /var/cache/conftool/dbconfig/20240912-123631-arnaudb.json
  • 12:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:35 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 12:31 akosiaris: depool mw239[0456789] for re-numbering, renaming and reimaging.
  • 12:29 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2399.codfw.wmnet
  • 12:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host deploy2002.codfw.wmnet
  • 12:28 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2399.codfw.wmnet
  • 12:28 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2398.codfw.wmnet
  • 12:28 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2398.codfw.wmnet
  • 12:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2397.codfw.wmnet
  • 12:27 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2397.codfw.wmnet
  • 12:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2396.codfw.wmnet
  • 12:26 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2396.codfw.wmnet
  • 12:26 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2395.codfw.wmnet
  • 12:26 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2395.codfw.wmnet
  • 12:26 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2394.codfw.wmnet
  • 12:25 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2394.codfw.wmnet
  • 12:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2390.codfw.wmnet
  • 12:24 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2390.codfw.wmnet
  • 12:19 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1085.eqiad.wmnet
  • 12:11 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1085.eqiad.wmnet
  • 11:47 urbanecm@deploy1003: Finished scap sync-world: Backport for Babel: Set BabelUseCommunityConfiguration to false (T374611) (duration: 11m 28s)
  • 11:47 jynus: restarting db1171:s7 mysql process T374610
  • 11:38 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 11:35 urbanecm@deploy1003: Started scap sync-world: Backport for Babel: Set BabelUseCommunityConfiguration to false (T374611)
  • 11:20 damilare: SmashPig upgraded from eb7807f8 to 08c79f4f
  • 11:14 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 11:13 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 11:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2029.codfw.wmnet
  • 11:06 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2029.codfw.wmnet
  • 11:05 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2093.codfw.wmnet
  • 11:05 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2093.codfw.wmnet
  • 11:03 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 11:03 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 10:59 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 10:59 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 10:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T371742)', diff saved to https://phabricator.wikimedia.org/P69039 and previous config saved to /var/cache/conftool/dbconfig/20240912-103434-ladsgroup.json
  • 10:33 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.e.f.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
  • 10:32 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache 2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.e.f.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
  • 10:25 btullis: stopping envoyproxy on cephosd1001
  • 10:25 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:25 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for et-0-0-31-100.ssw1-f1-eqiad.eqiad.wmnet - cmooney@cumin1002"
  • 10:25 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for et-0-0-31-100.ssw1-f1-eqiad.eqiad.wmnet - cmooney@cumin1002"
  • 10:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P69038 and previous config saved to /var/cache/conftool/dbconfig/20240912-101927-ladsgroup.json
  • 10:08 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 10:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 10:08 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 10:08 btullis: restarted envoyproxy on cephosd1001
  • 10:08 claime: Increasing mw-wikifunctions replicas to 6
  • 09:59 btullis: stopping envoyproxy on cephosd1001
  • 09:53 arnaudb@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1246.eqiad.wmnet with OS bookworm
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69034 and previous config saved to /var/cache/conftool/dbconfig/20240912-095335-arnaudb.json
  • 09:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T371742)', diff saved to https://phabricator.wikimedia.org/P69033 and previous config saved to /var/cache/conftool/dbconfig/20240912-094912-ladsgroup.json
  • 09:41 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 09:41 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69032 and previous config saved to /var/cache/conftool/dbconfig/20240912-093829-arnaudb.json
  • 09:35 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 09:32 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69031 and previous config saved to /var/cache/conftool/dbconfig/20240912-092324-arnaudb.json
  • 09:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kafka-main2004.codfw.wmnet
  • 09:23 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:23 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:22 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 09:19 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:16 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 09:14 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kafka-main2004.codfw.wmnet
  • 09:12 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 15%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69030 and previous config saved to /var/cache/conftool/dbconfig/20240912-090818-arnaudb.json
  • 09:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 09:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 09:01 arnaudb@cumin1002: dbctl commit (dc=all): 'T374421', diff saved to https://phabricator.wikimedia.org/P69029 and previous config saved to /var/cache/conftool/dbconfig/20240912-090157-arnaudb.json
  • 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2205 to s3 primary T374421', diff saved to https://phabricator.wikimedia.org/P69028 and previous config saved to /var/cache/conftool/dbconfig/20240912-085859-arnaudb.json
  • 08:54 arnaudb: Starting s3 codfw failover from db2209 to db2205 - T374421
  • 08:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69027 and previous config saved to /var/cache/conftool/dbconfig/20240912-085312-arnaudb.json
  • 08:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T371742)', diff saved to https://phabricator.wikimedia.org/P69026 and previous config saved to /var/cache/conftool/dbconfig/20240912-085232-ladsgroup.json
  • 08:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 08:52 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 08:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T371742)', diff saved to https://phabricator.wikimedia.org/P69025 and previous config saved to /var/cache/conftool/dbconfig/20240912-085209-ladsgroup.json
  • 08:50 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:49 jayme: restoring leadership for all partitions assigned to broker id 2004 on kafka-main-codfw - T363210
  • 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2009.codfw.wmnet
  • 08:47 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2009.codfw.wmnet
  • 08:44 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:42 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69024 and previous config saved to /var/cache/conftool/dbconfig/20240912-083807-arnaudb.json
  • 08:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P69023 and previous config saved to /var/cache/conftool/dbconfig/20240912-083701-ladsgroup.json
  • 08:35 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 4%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69022 and previous config saved to /var/cache/conftool/dbconfig/20240912-082301-arnaudb.json
  • 08:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212', diff saved to https://phabricator.wikimedia.org/P69021 and previous config saved to /var/cache/conftool/dbconfig/20240912-082154-ladsgroup.json
  • 08:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 3%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69020 and previous config saved to /var/cache/conftool/dbconfig/20240912-080756-arnaudb.json
  • 08:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2212 (T371742)', diff saved to https://phabricator.wikimedia.org/P69019 and previous config saved to /var/cache/conftool/dbconfig/20240912-080647-ladsgroup.json
  • 07:58 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:58 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 2%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69018 and previous config saved to /var/cache/conftool/dbconfig/20240912-075250-arnaudb.json
  • 07:46 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:46 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:39 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:38 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 07:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: post db2229 bootstrap', diff saved to https://phabricator.wikimedia.org/P69017 and previous config saved to /var/cache/conftool/dbconfig/20240912-073744-arnaudb.json
  • 07:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2129.codfw.wmnet onto db2229.codfw.wmnet
  • 07:28 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 07:22 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:22 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 07:22 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:21 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 07:21 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:21 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 07:21 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 07:20 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 07:20 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:20 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 07:20 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 07:20 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 07:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 07:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Sandeeps out of all services on: 2298 hosts
  • 07:18 slyngshede@cumin1002: START - Cookbook sre.idm.logout Logging Sandeeps out of all services on: 2298 hosts
  • 07:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2212 (T371742)', diff saved to https://phabricator.wikimedia.org/P69016 and previous config saved to /var/cache/conftool/dbconfig/20240912-071034-ladsgroup.json
  • 07:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 07:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 07:09 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 06:58 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2129.codfw.wmnet onto db2229.codfw.wmnet
  • 06:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2129 in db2229 for T373579', diff saved to https://phabricator.wikimedia.org/P69015 and previous config saved to /var/cache/conftool/dbconfig/20240912-065641-arnaudb.json
  • 06:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisioning db2229.codfw.wmnet - T373579
  • 06:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisioning db2229.codfw.wmnet - T373579
  • 06:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisioning db2229.codfw.wmnet - T373579
  • 06:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisioning db2229.codfw.wmnet - T373579
  • 06:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh
  • 06:34 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh
  • 06:33 jayme: evacuating leadership for all partitions assigned to broker id 2004 on kafka-main-codfw - T363210
  • 06:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421
  • 06:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421
  • 06:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 06:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 06:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69014 and previous config saved to /var/cache/conftool/dbconfig/20240912-061639-ladsgroup.json
  • 06:05 arnaudb@cumin1002: dbctl commit (dc=all): 'T374592', diff saved to https://phabricator.wikimedia.org/P69013 and previous config saved to /var/cache/conftool/dbconfig/20240912-060550-arnaudb.json
  • 06:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote es2038 to es7 primary and set section read-write T374592', diff saved to https://phabricator.wikimedia.org/P69012 and previous config saved to /var/cache/conftool/dbconfig/20240912-060308-arnaudb.json
  • 06:02 arnaudb: Starting es7 codfw failover from es2039 to es2038 - T374592
  • 06:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69011 and previous config saved to /var/cache/conftool/dbconfig/20240912-060131-ladsgroup.json
  • 05:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Set es2038 with weight 0 T374592', diff saved to https://phabricator.wikimedia.org/P69010 and previous config saved to /var/cache/conftool/dbconfig/20240912-055903-arnaudb.json
  • 05:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592
  • 05:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592
  • 05:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69009 and previous config saved to /var/cache/conftool/dbconfig/20240912-054624-ladsgroup.json
  • 05:44 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 05:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69008 and previous config saved to /var/cache/conftool/dbconfig/20240912-053116-ladsgroup.json
  • 04:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69007 and previous config saved to /var/cache/conftool/dbconfig/20240912-043701-ladsgroup.json
  • 04:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 04:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 04:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69006 and previous config saved to /var/cache/conftool/dbconfig/20240912-043628-ladsgroup.json
  • 04:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69005 and previous config saved to /var/cache/conftool/dbconfig/20240912-042121-ladsgroup.json
  • 04:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69004 and previous config saved to /var/cache/conftool/dbconfig/20240912-040613-ladsgroup.json
  • 03:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69003 and previous config saved to /var/cache/conftool/dbconfig/20240912-035105-ladsgroup.json
  • 02:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69002 and previous config saved to /var/cache/conftool/dbconfig/20240912-024635-ladsgroup.json
  • 02:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P69001 and previous config saved to /var/cache/conftool/dbconfig/20240912-024612-ladsgroup.json
  • 02:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P69000 and previous config saved to /var/cache/conftool/dbconfig/20240912-023105-ladsgroup.json
  • 02:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P68999 and previous config saved to /var/cache/conftool/dbconfig/20240912-021557-ladsgroup.json
  • 02:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68998 and previous config saved to /var/cache/conftool/dbconfig/20240912-020050-ladsgroup.json
  • 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68996 and previous config saved to /var/cache/conftool/dbconfig/20240912-005830-ladsgroup.json
  • 00:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 00:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68995 and previous config saved to /var/cache/conftool/dbconfig/20240912-005808-ladsgroup.json
  • 00:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68994 and previous config saved to /var/cache/conftool/dbconfig/20240912-004301-ladsgroup.json
  • 00:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68993 and previous config saved to /var/cache/conftool/dbconfig/20240912-002753-ladsgroup.json
  • 00:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68992 and previous config saved to /var/cache/conftool/dbconfig/20240912-001246-ladsgroup.json
  • 00:04 eileen: civicrm upgraded from 929101dc to ac29ff45

2024-09-11

  • 23:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:13 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1052.eqiad.wmnet with OS bookworm
  • 23:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68991 and previous config saved to /var/cache/conftool/dbconfig/20240911-231311-ladsgroup.json
  • 23:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 23:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 23:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T371742)', diff saved to https://phabricator.wikimedia.org/P68990 and previous config saved to /var/cache/conftool/dbconfig/20240911-231233-ladsgroup.json
  • 22:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P68989 and previous config saved to /var/cache/conftool/dbconfig/20240911-225726-ladsgroup.json
  • 22:46 jforrester@deploy1003: Finished scap sync-world: Backport for SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830), SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830) (duration: 07m 27s)
  • 22:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P68988 and previous config saved to /var/cache/conftool/dbconfig/20240911-224218-ladsgroup.json
  • 22:41 jforrester@deploy1003: jforrester: Continuing with sync
  • 22:41 jforrester@deploy1003: jforrester: Backport for SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830), SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:38 jforrester@deploy1003: Started scap sync-world: Backport for SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830), SpecialExpandTemplates: Replace use of deprecated OutputPage::addCategoryLinks() (T373830)
  • 22:28 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1041.eqiad.wmnet with OS bookworm
  • 22:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T371742)', diff saved to https://phabricator.wikimedia.org/P68987 and previous config saved to /var/cache/conftool/dbconfig/20240911-222711-ladsgroup.json
  • 22:26 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:21 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1052.eqiad.wmnet with OS bookworm
  • 22:20 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:19 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1052.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:19 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:19 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1051.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:19 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1050.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:18 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:17 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1049.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:17 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:16 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1048.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:15 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:14 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1047.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:13 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1046.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:13 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:12 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:11 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1044.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 22:03 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1052
  • 22:02 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1051
  • 22:01 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1052
  • 22:01 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1051
  • 22:01 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1050
  • 22:00 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 22:00 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 22:00 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 22:00 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:59 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1050
  • 21:59 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1049
  • 21:59 jforrester@deploy1003: Finished scap sync-world: Backport for ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241), ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241) (duration: 07m 51s)
  • 21:58 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1049
  • 21:58 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1048
  • 21:57 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1048
  • 21:57 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1047
  • 21:56 inflatador: bking@deploy1003 test deploy of flink operator in staging cancelled with no changes T373195
  • 21:56 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1047
  • 21:56 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1046
  • 21:54 jforrester@deploy1003: jforrester: Continuing with sync
  • 21:54 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1046
  • 21:54 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1045
  • 21:53 jforrester@deploy1003: jforrester: Backport for ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241), ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:53 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1045
  • 21:53 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1043
  • 21:52 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1043
  • 21:51 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1044
  • 21:51 jforrester@deploy1003: Started scap sync-world: Backport for ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241), ZObjectFactory::validatePersistentKeys: Disable use of JsonSchema, at least temporarily (T374241)
  • 21:50 jclark@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host ganeti1043
  • 21:50 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1044
  • 21:50 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1042
  • 21:50 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1043
  • 21:50 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1042
  • 21:50 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1041
  • 21:50 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1039
  • 21:50 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1039
  • 21:49 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1040
  • 21:49 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1040
  • 21:48 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1041
  • 21:48 inflatador: bking@deploy1003 test deploying flink operator in staging T373195
  • 21:45 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:44 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:43 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:43 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:41 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:41 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:41 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:40 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:40 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:40 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1040.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:39 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1039.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 21:34 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:33 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt ganeti10 - jclark@cumin1002"
  • 21:33 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt ganeti10 - jclark@cumin1002"
  • 21:30 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 21:30 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:29 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:29 jclark@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 21:24 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 21:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T371742)', diff saved to https://phabricator.wikimedia.org/P68986 and previous config saved to /var/cache/conftool/dbconfig/20240911-212208-ladsgroup.json
  • 21:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 21:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 21:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T371742)', diff saved to https://phabricator.wikimedia.org/P68985 and previous config saved to /var/cache/conftool/dbconfig/20240911-212145-ladsgroup.json
  • 21:14 cjming: end of UTC late backport window
  • 21:11 cjming@deploy1003: Finished scap sync-world: Backport for Deploy Parsoid Read Views to bn/hi/ps/tr wikivoyage (T373229) (duration: 08m 19s)
  • 21:07 cjming@deploy1003: cjming, cscott: Continuing with sync
  • 21:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P68984 and previous config saved to /var/cache/conftool/dbconfig/20240911-210638-ladsgroup.json
  • 21:05 cjming@deploy1003: cjming, cscott: Backport for Deploy Parsoid Read Views to bn/hi/ps/tr wikivoyage (T373229) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 cjming@deploy1003: Started scap sync-world: Backport for Deploy Parsoid Read Views to bn/hi/ps/tr wikivoyage (T373229)
  • 21:01 cjming@deploy1003: Finished scap sync-world: Backport for Update wgSitename for tlywiki (T367009) (duration: 11m 51s)
  • 20:56 cjming@deploy1003: cjming, nmw03: Continuing with sync
  • 20:51 cjming@deploy1003: cjming, nmw03: Backport for Update wgSitename for tlywiki (T367009) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P68983 and previous config saved to /var/cache/conftool/dbconfig/20240911-205130-ladsgroup.json
  • 20:49 cjming@deploy1003: Started scap sync-world: Backport for Update wgSitename for tlywiki (T367009)
  • 20:47 cjming@deploy1003: Finished scap sync-world: Backport for Ensure that it is possible to override MFNamespacesWithLeadParagraphs (duration: 09m 54s)
  • 20:43 cjming@deploy1003: jdlrobson, cjming: Continuing with sync
  • 20:42 cjming@deploy1003: jdlrobson, cjming: Backport for Ensure that it is possible to override MFNamespacesWithLeadParagraphs synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:37 cjming@deploy1003: Started scap sync-world: Backport for Ensure that it is possible to override MFNamespacesWithLeadParagraphs
  • 20:36 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 20:36 cjming@deploy1003: Finished scap sync-world: Backport for Turn off feature flag to move donate link everywhere (T373585) (duration: 09m 42s)
  • 20:36 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 20:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T371742)', diff saved to https://phabricator.wikimedia.org/P68982 and previous config saved to /var/cache/conftool/dbconfig/20240911-203623-ladsgroup.json
  • 20:32 cjming@deploy1003: cjming, toyofuku: Continuing with sync
  • 20:32 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and not P{dns7001*} and A:dnsbox
  • 20:31 ejegg: payments-wiki upgraded from 672c9fb6 to e191de03
  • 20:30 cjming@deploy1003: cjming, toyofuku: Backport for Turn off feature flag to move donate link everywhere (T373585) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:27 cjming@deploy1003: Started scap sync-world: Backport for Turn off feature flag to move donate link everywhere (T373585)
  • 20:23 cjming@deploy1003: Finished scap sync-world: Backport for Roll out appearance menu and font size change to sister projects (T371020) (duration: 13m 09s)
  • 20:18 cjming@deploy1003: jdlrobson, cjming: Continuing with sync
  • 20:13 cjming@deploy1003: jdlrobson, cjming: Backport for Roll out appearance menu and font size change to sister projects (T371020) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:10 cjming@deploy1003: Started scap sync-world: Backport for Roll out appearance menu and font size change to sister projects (T371020)
  • 20:09 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 20:09 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:56 gmodena@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:56 gmodena@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:47 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:47 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:46 gmodena@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:46 gmodena@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 19:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T371742)', diff saved to https://phabricator.wikimedia.org/P68980 and previous config saved to /var/cache/conftool/dbconfig/20240911-193335-ladsgroup.json
  • 19:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 19:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 19:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T371742)', diff saved to https://phabricator.wikimedia.org/P68979 and previous config saved to /var/cache/conftool/dbconfig/20240911-193312-ladsgroup.json
  • 19:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P68978 and previous config saved to /var/cache/conftool/dbconfig/20240911-191805-ladsgroup.json
  • 19:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P68977 and previous config saved to /var/cache/conftool/dbconfig/20240911-190257-ladsgroup.json
  • 18:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T371742)', diff saved to https://phabricator.wikimedia.org/P68976 and previous config saved to /var/cache/conftool/dbconfig/20240911-184750-ladsgroup.json
  • 18:42 sukhe: running agent on O:alerting_host
  • 18:30 zabe: zabe@mwmaint1002:~$ mwscript extensions/WikimediaMaintenance/migrateESRefToContentTable.php test2wiki --skip text_table_cleanup/test2wiki text_table_dump/test2wiki --sleep 1 # T183490
  • 18:25 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.22 refs T373641
  • 18:25 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 18:02 zabe@deploy1003: Finished scap sync-world: Backport for migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: reference (T183490), migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: re… (gerrit:1072260)
  • 17:58 sukhe@cumin1002: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and not P{dns7001*} and A:dnsbox
  • 17:52 sukhe: re-enable puppet on A:dnsbox and [run] agent
  • 17:52 sukhe: re-enable puppet on A:dnsbox and enable agent
  • 17:51 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 17:51 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 17:50 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 17:50 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 17:49 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 17:48 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 17:48 zabe@deploy1003: zabe: Continuing with sync
  • 17:48 zabe@deploy1003: zabe: Backport for migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: reference (T183490), migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: reference (T183490)
  • 17:48 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 17:48 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 17:47 sukhe: sukhe@dns7001:~$ sudo systemctl restart ntpsec.service
  • 17:45 moritzm: installing postgresql-15 security updates
  • 17:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T371742)', diff saved to https://phabricator.wikimedia.org/P68975 and previous config saved to /var/cache/conftool/dbconfig/20240911-174422-ladsgroup.json
  • 17:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 17:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 17:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T371742)', diff saved to https://phabricator.wikimedia.org/P68974 and previous config saved to /var/cache/conftool/dbconfig/20240911-174400-ladsgroup.json
  • 17:39 swfrench-wmf: imported php-uuid_1.2.0-12+wmf11u1 into component/php81 - T372507
  • 17:30 zabe@deploy1003: Started scap sync-world: Backport for migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: reference (T183490), migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: ref… (gerrit:1072260)
  • 17:30 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=codfw
  • 17:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P68973 and previous config saved to /var/cache/conftool/dbconfig/20240911-172852-ladsgroup.json
  • 17:25 zabe@deploy1003: Started scap sync-world: Backport for migrateESRefToContentTable: Add option for not deleting text row (T183490), migrateESRefToContentTable: Add option to dump tt: -> es: reference (T183490)
  • 17:17 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (None, T373791) xfer wikidata_main from wdqs2022.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'es2038 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68972 and previous config saved to /var/cache/conftool/dbconfig/20240911-171739-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68971 and previous config saved to /var/cache/conftool/dbconfig/20240911-171734-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68970 and previous config saved to /var/cache/conftool/dbconfig/20240911-171729-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68969 and previous config saved to /var/cache/conftool/dbconfig/20240911-171724-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68968 and previous config saved to /var/cache/conftool/dbconfig/20240911-171719-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68967 and previous config saved to /var/cache/conftool/dbconfig/20240911-171714-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68966 and previous config saved to /var/cache/conftool/dbconfig/20240911-171709-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68965 and previous config saved to /var/cache/conftool/dbconfig/20240911-171704-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68964 and previous config saved to /var/cache/conftool/dbconfig/20240911-171700-arnaudb.json
  • 17:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 100%: T373101', diff saved to https://phabricator.wikimedia.org/P68963 and previous config saved to /var/cache/conftool/dbconfig/20240911-171655-arnaudb.json
  • 17:14 moritzm: installing gtk+2.0 security updates on bookworm
  • 17:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P68962 and previous config saved to /var/cache/conftool/dbconfig/20240911-171346-ladsgroup.json
  • 17:05 sukhe: sukhe@dns7001:~$ sudo systemctl restart ntpsec.service
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'es2038 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68961 and previous config saved to /var/cache/conftool/dbconfig/20240911-170233-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68960 and previous config saved to /var/cache/conftool/dbconfig/20240911-170228-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68959 and previous config saved to /var/cache/conftool/dbconfig/20240911-170223-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68958 and previous config saved to /var/cache/conftool/dbconfig/20240911-170218-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68957 and previous config saved to /var/cache/conftool/dbconfig/20240911-170213-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68956 and previous config saved to /var/cache/conftool/dbconfig/20240911-170208-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68955 and previous config saved to /var/cache/conftool/dbconfig/20240911-170203-arnaudb.json
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68954 and previous config saved to /var/cache/conftool/dbconfig/20240911-170158-arnaudb.json
  • 17:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68953 and previous config saved to /var/cache/conftool/dbconfig/20240911-170153-arnaudb.json
  • 17:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 75%: T373101', diff saved to https://phabricator.wikimedia.org/P68952 and previous config saved to /var/cache/conftool/dbconfig/20240911-170149-arnaudb.json
  • 16:59 sukhe: sudo cumin "A:dnsbox" 'disable-puppet "merging CR 1072209"'
  • 16:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T371742)', diff saved to https://phabricator.wikimedia.org/P68951 and previous config saved to /var/cache/conftool/dbconfig/20240911-165838-ladsgroup.json
  • 16:58 rzl@deploy1003: Finished scap sync-world: 1071714, 1071715 (T291192) (duration: 07m 37s)
  • 16:57 rzl@deploy1003: rzl: Continuing with sync
  • 16:57 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=(cp2037|cp2038).codfw.wmnet [reason: done T373101]
  • 16:54 rzl@deploy1003: rzl: 1071714, 1071715 (T291192) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:53 rzl@deploy1003: Started scap sync-world: 1071714, 1071715 (T291192)
  • 16:48 cdanis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 16:47 cdanis@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'es2038 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68950 and previous config saved to /var/cache/conftool/dbconfig/20240911-164728-arnaudb.json
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68949 and previous config saved to /var/cache/conftool/dbconfig/20240911-164723-arnaudb.json
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68948 and previous config saved to /var/cache/conftool/dbconfig/20240911-164718-arnaudb.json
  • 16:47 cdanis@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68947 and previous config saved to /var/cache/conftool/dbconfig/20240911-164713-arnaudb.json
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68946 and previous config saved to /var/cache/conftool/dbconfig/20240911-164708-arnaudb.json
  • 16:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68945 and previous config saved to /var/cache/conftool/dbconfig/20240911-164703-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68944 and previous config saved to /var/cache/conftool/dbconfig/20240911-164657-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68943 and previous config saved to /var/cache/conftool/dbconfig/20240911-164653-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68942 and previous config saved to /var/cache/conftool/dbconfig/20240911-164648-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 50%: T373101', diff saved to https://phabricator.wikimedia.org/P68941 and previous config saved to /var/cache/conftool/dbconfig/20240911-164644-arnaudb.json
  • 16:46 cdanis@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2032.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2032.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2024.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2024.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2023.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2023.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2022.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2022.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2021.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2021.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2020.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2020.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
  • 16:36 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
  • 16:36 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2015.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2015.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2014.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2014.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2359.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2359.codfw.wmnet
  • 16:35 topranks: disable now unused ports on asw-c6-codfw after server move T373101
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2357.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2357.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2356.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2356.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2355.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2355.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2354.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2354.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2353.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2353.codfw.wmnet
  • 16:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2352.codfw.wmnet
  • 16:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2352.codfw.wmnet
  • 16:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2351.codfw.wmnet
  • 16:34 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2351.codfw.wmnet
  • 16:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2350.codfw.wmnet
  • 16:34 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2350.codfw.wmnet
  • 16:34 urandom: pooling thanos-fe2004.codfw.wmnet — T373101
  • 16:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2043.codfw.wmnet
  • 16:34 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2043.codfw.wmnet
  • 16:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2042.codfw.wmnet
  • 16:34 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2042.codfw.wmnet
  • 16:34 claime: Repooling kubernetes2042.codfw.wmnet kubernetes2043.codfw.wmnet mw2350.codfw.wmnet mw2351.codfw.wmnet mw2352.codfw.wmnet mw2353.codfw.wmnet mw2354.codfw.wmnet mw2355.codfw.wmnet mw2356.codfw.wmnet mw2357.codfw.wmnet mw2359.codfw.wmnet parse2014.codfw.wmnet parse2015.codfw.wmnet wikikube-ctrl2002.codfw.wmnet wikikube-worker2020.codfw.wmnet wikikube-worker2021.codfw.wmnet
  • 16:32 arnaudb@cumin1002: dbctl commit (dc=all): 'es2038 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68940 and previous config saved to /var/cache/conftool/dbconfig/20240911-163222-arnaudb.json
  • 16:32 arnaudb@cumin1002: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68939 and previous config saved to /var/cache/conftool/dbconfig/20240911-163217-arnaudb.json
  • 16:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2210 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68938 and previous config saved to /var/cache/conftool/dbconfig/20240911-163212-arnaudb.json
  • 16:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68937 and previous config saved to /var/cache/conftool/dbconfig/20240911-163207-arnaudb.json
  • 16:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68936 and previous config saved to /var/cache/conftool/dbconfig/20240911-163202-arnaudb.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68935 and previous config saved to /var/cache/conftool/dbconfig/20240911-163157-arnaudb.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68934 and previous config saved to /var/cache/conftool/dbconfig/20240911-163152-arnaudb.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68933 and previous config saved to /var/cache/conftool/dbconfig/20240911-163147-arnaudb.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68932 and previous config saved to /var/cache/conftool/dbconfig/20240911-163142-arnaudb.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2115 (re)pooling @ 25%: T373101', diff saved to https://phabricator.wikimedia.org/P68931 and previous config saved to /var/cache/conftool/dbconfig/20240911-163137-arnaudb.json
  • 16:28 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2038.codfw.wmnet
  • 16:28 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2038.codfw.wmnet
  • 16:28 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2037.codfw.wmnet
  • 16:27 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2037.codfw.wmnet
  • 16:25 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (None, T373791) xfer wikidata_main from wdqs2022.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards
  • 16:21 bking@deploy1003: Finished deploy [wdqs/wdqs@316bf7f]: 8 (duration: 00m 12s)
  • 16:21 bking@deploy1003: Started deploy [wdqs/wdqs@316bf7f]: 8
  • 16:20 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 24 hosts with reason: Move server uplinks codfw racks C7
  • 16:20 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on 24 hosts with reason: Move server uplinks codfw racks C7
  • 16:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2135.codfw.wmnet with reason: network maintenance
  • 16:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2135.codfw.wmnet with reason: network maintenance
  • 16:08 topranks: begin server uplink moves from asw-c6-codfw to lsw1-c6-codfw T373101
  • 16:07 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 34 hosts with reason: Move server uplinks codfw racks C6
  • 16:07 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on 34 hosts with reason: Move server uplinks codfw racks C6
  • 15:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T371742)', diff saved to https://phabricator.wikimedia.org/P68930 and previous config saved to /var/cache/conftool/dbconfig/20240911-155608-ladsgroup.json
  • 15:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 15:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 15:55 mutante: moscovium - apt-get upgrade - installing new apache2 version and more package upgrades
  • 15:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs2021.codfw.wmnet with reason: T373791
  • 15:50 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs2021.codfw.wmnet with reason: T373791
  • 15:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2021.codfw.wmnet with OS bullseye
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2115 db2116 db2127 db2167 db2168 db2179 db2180 db2210 es2022 es2038 - T370852', diff saved to https://phabricator.wikimedia.org/P68929 and previous config saved to /var/cache/conftool/dbconfig/20240911-154114-arnaudb.json
  • 15:37 urandom: depooling thanos-fe2004.codfw.wmnet — T373101
  • 15:36 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on moscovium.eqiad.wmnet with reason: nftables migration
  • 15:36 mutante: moscovium - rebooting for nftables migration
  • 15:36 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on moscovium.eqiad.wmnet with reason: nftables migration
  • 15:35 mutante: phab2002 - rebooting for nftables migration
  • 15:35 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on phab2002.codfw.wmnet with reason: nftables migration
  • 15:35 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on phab2002.codfw.wmnet with reason: nftables migration
  • 15:31 topranks: push server and vlan configuration to lsw1-c6-codfw with Homer to prep physical moves T373101
  • 15:28 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@4635fcb] (releasing): (no justification provided) (duration: 00m 35s)
  • 15:27 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@4635fcb] (releasing): (no justification provided)
  • 15:26 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@71141b8] (releasing): (no justification provided) (duration: 00m 41s)
  • 15:26 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@71141b8] (releasing): (no justification provided)
  • 15:21 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=(cp2037|cp2038).codfw.wmnet [reason: depooling for T373101]
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: post db2138 → db2238 repool', diff saved to https://phabricator.wikimedia.org/P68928 and previous config saved to /var/cache/conftool/dbconfig/20240911-151754-arnaudb.json
  • 15:08 cmooney@cumin1002: END (ERROR) - Cookbook sre.ganeti.drain-node (exit_code=97) for draining ganeti node ganeti2014.codfw.wmnet
  • 15:05 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2032.codfw.wmnet
  • 15:05 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2032.codfw.wmnet
  • 15:05 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2024.codfw.wmnet
  • 15:04 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2024.codfw.wmnet
  • 15:04 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2023.codfw.wmnet
  • 15:03 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2023.codfw.wmnet
  • 15:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2022.codfw.wmnet
  • 15:03 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2022.codfw.wmnet
  • 15:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2021.codfw.wmnet
  • 15:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: post db2138 → db2238 repool', diff saved to https://phabricator.wikimedia.org/P68927 and previous config saved to /var/cache/conftool/dbconfig/20240911-150249-arnaudb.json
  • 15:02 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2021.codfw.wmnet
  • 15:02 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2020.codfw.wmnet
  • 15:01 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2020.codfw.wmnet
  • 15:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
  • 15:01 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
  • 15:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2015.codfw.wmnet
  • 15:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 15:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 15:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T371742)', diff saved to https://phabricator.wikimedia.org/P68926 and previous config saved to /var/cache/conftool/dbconfig/20240911-150011-ladsgroup.json
  • 14:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Pool pc5 into production traffic (T374496)', diff saved to https://phabricator.wikimedia.org/P68925 and previous config saved to /var/cache/conftool/dbconfig/20240911-145844-ladsgroup.json
  • 14:58 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2015.codfw.wmnet
  • 14:58 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2014.codfw.wmnet
  • 14:57 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2014.codfw.wmnet
  • 14:57 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2359.codfw.wmnet
  • 14:57 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2359.codfw.wmnet
  • 14:57 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2357.codfw.wmnet
  • 14:56 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2357.codfw.wmnet
  • 14:56 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2356.codfw.wmnet
  • 14:55 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2356.codfw.wmnet
  • 14:55 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2355.codfw.wmnet
  • 14:55 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2355.codfw.wmnet
  • 14:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2354.codfw.wmnet
  • 14:54 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2354.codfw.wmnet
  • 14:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2353.codfw.wmnet
  • 14:53 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2353.codfw.wmnet
  • 14:53 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2352.codfw.wmnet
  • 14:53 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2352.codfw.wmnet
  • 14:52 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2351.codfw.wmnet
  • 14:52 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2351.codfw.wmnet
  • 14:52 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2350.codfw.wmnet
  • 14:51 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2350.codfw.wmnet
  • 14:51 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2043.codfw.wmnet
  • 14:51 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2043.codfw.wmnet
  • 14:50 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2042.codfw.wmnet
  • 14:50 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2042.codfw.wmnet
  • 14:49 claime: Depooling kubernetes2042.codfw.wmnet kubernetes2043.codfw.wmnet mw2350.codfw.wmnet mw2351.codfw.wmnet mw2352.codfw.wmnet mw2353.codfw.wmnet mw2354.codfw.wmnet mw2355.codfw.wmnet mw2356.codfw.wmnet mw2357.codfw.wmnet mw2359.codfw.wmnet parse2014.codfw.wmnet parse2015.codfw.wmnet wikikube-ctrl2002.codfw.wmnet wikikube-worker2020.codfw.wmnet wikikube-worker2021.codfw.wmnet
  • 14:48 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-ctrl2003.codfw.wmnet
  • 14:48 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-ctrl2003.codfw.wmnet
  • 14:48 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-ctrl2001.codfw.wmnet
  • 14:48 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-ctrl2001.codfw.wmnet
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: post db2138 → db2238 repool', diff saved to https://phabricator.wikimedia.org/P68924 and previous config saved to /var/cache/conftool/dbconfig/20240911-144743-arnaudb.json
  • 14:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P68923 and previous config saved to /var/cache/conftool/dbconfig/20240911-144504-ladsgroup.json
  • 14:43 jayme: deployed changeprop-jobqueue changeprop cirrus-streaming-updater eventgate-main eventstreams mw-page-content-change-enrich rdf-streaming-updater for kafka connection string updates - T363210
  • 14:42 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Pool pc5 into production traffic (T374496)', diff saved to https://phabricator.wikimedia.org/P68922 and previous config saved to /var/cache/conftool/dbconfig/20240911-144147-ladsgroup.json
  • 14:41 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:41 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:40 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:40 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 14:39 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 14:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 14:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kafka-main2003.codfw.wmnet
  • 14:38 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:38 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 14:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 14:37 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 14:34 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 14:34 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 14:34 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:33 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:32 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
  • 14:32 cmooney@cumin1002: END (ERROR) - Cookbook sre.ganeti.drain-node (exit_code=97) for draining ganeti node ganeti2014.codfw.wmnet
  • 14:32 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: post db2138 → db2238 repool', diff saved to https://phabricator.wikimedia.org/P68921 and previous config saved to /var/cache/conftool/dbconfig/20240911-143237-arnaudb.json
  • 14:31 jayme: last 7 helmfile deploys did not happen
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 14:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P68920 and previous config saved to /var/cache/conftool/dbconfig/20240911-142956-ladsgroup.json
  • 14:25 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kafka-main2003.codfw.wmnet
  • 14:20 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2014.codfw.wmnet
  • 14:19 cmooney@cumin1002: END (ERROR) - Cookbook sre.ganeti.drain-node (exit_code=97) for draining ganeti node ganeti2013.codfw.wmnet
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 10%: post db2138 → db2238 repool', diff saved to https://phabricator.wikimedia.org/P68919 and previous config saved to /var/cache/conftool/dbconfig/20240911-141732-arnaudb.json
  • 14:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T371742)', diff saved to https://phabricator.wikimedia.org/P68918 and previous config saved to /var/cache/conftool/dbconfig/20240911-141449-ladsgroup.json
  • 14:14 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 14:14 fabfur: reverted 1072172 and repooling cp4037 (T370668)
  • 14:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2008.codfw.wmnet
  • 14:13 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2008.codfw.wmnet
  • 14:09 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2013.codfw.wmnet
  • 14:05 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 11s)
  • 14:05 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 13:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2138.codfw.wmnet onto db2238.codfw.wmnet
  • 13:49 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on ganeti2012.codfw.wmnet with reason: Move ganeti2012 server uplink
  • 13:49 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on ganeti2012.codfw.wmnet with reason: Move ganeti2012 server uplink
  • 13:47 Dreamy_Jazz: Afternoon UTC backport window done
  • 13:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2021.codfw.wmnet with reason: host reimage
  • 13:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Remove ResourceLoaderUseObjectCacheForDeps (T343492), Generate special page name in English for central URLs (T374277), IPInfoLogFormatter: Avoid unnecessary User object creation (T374526), Add arbcom group to zhwiki (T374455), Remove redundant oathauth-enable flag (T374528), Allow ipblock-exempt-grantor to remove ipblock-exempt group flag on zhwiki (T374504), Raise RelatedArticlesCardLimit to 9 in zhwikinews (T374323), Enable Web team search suggestions survey (T373039)
  • 13:40 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2021.codfw.wmnet with reason: host reimage
  • 13:36 dreamyjazz@deploy1003: jdrewniak, hokwelum, dreamyjazz, hamishz: Continuing with sync
  • 13:29 dreamyjazz@deploy1003: jdrewniak, hokwelum, dreamyjazz, hamishz: Backport for Remove ResourceLoaderUseObjectCacheForDeps (T343492), Generate special page name in English for central URLs (T374277), IPInfoLogFormatter: Avoid unnecessary User object creation (T374526), Add arbcom group to zhwiki (T374455), Remove redundant oathauth-enable flag (T374528), Allow ipblock-exempt-grantor to remove ipblock-exempt group flag on zhwiki (T374504), Raise RelatedArticlesCardLimit to 9 in zhwikinews (T374323), Enable Web team search suggestions survey (T373039)
  • 13:29 brouberol@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 13:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for Remove ResourceLoaderUseObjectCacheForDeps (T343492), Generate special page name in English for central URLs (T374277), IPInfoLogFormatter: Avoid unnecessary User object creation (T374526), Add arbcom group to zhwiki (T374455), Remove redundant oathauth-enable flag (T374528), Allow ipblock-exempt-grantor to remove ipblock-exempt group flag on zhwiki (T374504), Raise RelatedArticlesCardLimit to 9 in zhwikinews (T374323), Enable Web team search suggestions survey (T373039)
  • 13:26 elukey@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host poolcounter2005.codfw.wmnet
  • 13:26 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host poolcounter2005.codfw.wmnet with OS bookworm
  • 13:26 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:26 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 13:26 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 13:25 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:24 klausman@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2021.codfw.wmnet with OS bullseye
  • 13:23 klausman@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 13:21 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:21 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 13:11 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on poolcounter2005.codfw.wmnet with reason: host reimage
  • 13:07 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on poolcounter2005.codfw.wmnet with reason: host reimage
  • 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T371742)', diff saved to https://phabricator.wikimedia.org/P68917 and previous config saved to /var/cache/conftool/dbconfig/20240911-130639-ladsgroup.json
  • 13:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 13:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T371742)', diff saved to https://phabricator.wikimedia.org/P68916 and previous config saved to /var/cache/conftool/dbconfig/20240911-130618-ladsgroup.json
  • 12:59 hashar@deploy1003: Finished scap sync-world: Backport for logging: Simplify extra debug logging configuration (duration: 06m 53s)
  • 12:58 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 12:58 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 12:55 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: sync
  • 12:55 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: sync
  • 12:54 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 12:54 hashar@deploy1003: matmarex, hashar: Backport for logging: Simplify extra debug logging configuration synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:54 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host poolcounter2005.codfw.wmnet with OS bookworm
  • 12:53 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter2005.codfw.wmnet - elukey@cumin1002"
  • 12:53 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM poolcounter2005.codfw.wmnet - elukey@cumin1002"
  • 12:52 elukey@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) poolcounter2005.codfw.wmnet on all recursors
  • 12:52 elukey@cumin1002: START - Cookbook sre.dns.wipe-cache poolcounter2005.codfw.wmnet on all recursors
  • 12:52 elukey@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:52 elukey@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter2005.codfw.wmnet - elukey@cumin1002"
  • 12:52 elukey@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM poolcounter2005.codfw.wmnet - elukey@cumin1002"
  • 12:52 hashar@deploy1003: Started scap sync-world: Backport for logging: Simplify extra debug logging configuration
  • 12:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P68915 and previous config saved to /var/cache/conftool/dbconfig/20240911-125110-ladsgroup.json
  • 12:47 elukey@cumin1002: START - Cookbook sre.dns.netbox
  • 12:47 elukey@cumin1002: START - Cookbook sre.ganeti.makevm for new host poolcounter2005.codfw.wmnet
  • 12:42 hashar@deploy1003: Finished scap sync-world: Backport for logging: Replace 'blackhole' handler with no handlers at all (duration: 06m 43s)
  • 12:37 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 12:37 hashar@deploy1003: matmarex, hashar: Backport for logging: Replace 'blackhole' handler with no handlers at all synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P68914 and previous config saved to /var/cache/conftool/dbconfig/20240911-123603-ladsgroup.json
  • 12:35 hashar@deploy1003: Started scap sync-world: Backport for logging: Replace 'blackhole' handler with no handlers at all
  • 12:34 brouberol@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 12:33 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2138.codfw.wmnet onto db2238.codfw.wmnet
  • 12:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2138 in db2238 for T373579', diff saved to https://phabricator.wikimedia.org/P68913 and previous config saved to /var/cache/conftool/dbconfig/20240911-122910-arnaudb.json
  • 12:28 moritzm: installing glibc bugfix updates from bookworm 12.7 point release
  • 12:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: provisionning db2238.codfw.wmnet - T373579
  • 12:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: provisionning db2238.codfw.wmnet - T373579
  • 12:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: provisionning db2238.codfw.wmnet - T373579
  • 12:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: provisionning db2238.codfw.wmnet - T373579
  • 12:26 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:26 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:26 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:26 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:26 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:25 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 12:25 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 12:25 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 12:25 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 12:24 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:23 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:23 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:23 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T371742)', diff saved to https://phabricator.wikimedia.org/P68912 and previous config saved to /var/cache/conftool/dbconfig/20240911-122056-ladsgroup.json
  • 12:19 hashar@deploy1003: Finished scap sync-world: Backport for logging: Fix local variables leaking into global scope (duration: 10m 38s)
  • 12:18 topranks: re-activate Equinix IXP peers on cr1-eqiad T370696
  • 12:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2227.codfw.wmnet onto db2205.codfw.wmnet
  • 12:15 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 12:12 jayme: restoring leadership for partitions assigned to broker id 2003 on kafka-main-codfw - T363210
  • 12:11 hashar@deploy1003: matmarex, hashar: Backport for logging: Fix local variables leaking into global scope synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:09 hashar@deploy1003: Started scap sync-world: Backport for logging: Fix local variables leaking into global scope
  • 12:08 topranks: test bundling xe-3/0/6 into ae6 on cr1-eqiad T370696
  • 12:03 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 12:01 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 11s)
  • 12:00 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 11:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-eqiad with reason: reconfigure equinix port into LAG
  • 11:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on cr1-eqiad with reason: reconfigure equinix port into LAG
  • 11:44 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: dragonfly::supernode
  • 11:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: dragonfly::supernode
  • 11:23 _joe_: uploaded conftool 3.2.3 to apt
  • 11:22 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 11:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T371742)', diff saved to https://phabricator.wikimedia.org/P68909 and previous config saved to /var/cache/conftool/dbconfig/20240911-111549-ladsgroup.json
  • 11:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 11:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 11:14 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2227.codfw.wmnet onto db2205.codfw.wmnet
  • 11:01 mfossati@deploy1003: Finished deploy [airflow-dags/platform_eng@19cd97a]: (no justification provided) (duration: 00m 32s)
  • 11:01 mfossati@deploy1003: Started deploy [airflow-dags/platform_eng@19cd97a]: (no justification provided)
  • 10:59 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 10:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2205.codfw.wmnet
  • 10:51 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 10:50 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2205.codfw.wmnet
  • 10:50 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 10:50 fabfur: repooling cp4037 to test haproxykafka (T374473)
  • 10:40 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 10:38 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 10:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 10:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 10:22 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 10:19 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 09:42 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 09:42 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:42 fabfur: depooling cp4037 to test haproxykafka (T374473)
  • 09:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 09:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 09:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:30 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/benthos-cache-invalidator: apply
  • 09:30 jayme@deploy1003: helmfile [staging] START helmfile.d/services/benthos-cache-invalidator: apply
  • 09:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:12 brouberol@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:11 brouberol@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:11 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: sync
  • 09:10 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: sync
  • 08:53 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet with OS bookworm
  • 08:39 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dragonfly-supernode1001.eqiad.wmnet with reason: host reimage
  • 08:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 08:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 08:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T371742)', diff saved to https://phabricator.wikimedia.org/P68908 and previous config saved to /var/cache/conftool/dbconfig/20240911-083831-ladsgroup.json
  • 08:36 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dragonfly-supernode1001.eqiad.wmnet with reason: host reimage
  • 08:25 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host dragonfly-supernode1001.eqiad.wmnet with OS bookworm
  • 08:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P68907 and previous config saved to /var/cache/conftool/dbconfig/20240911-082324-ladsgroup.json
  • 08:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: post fix', diff saved to https://phabricator.wikimedia.org/P68906 and previous config saved to /var/cache/conftool/dbconfig/20240911-082200-arnaudb.json
  • 08:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P68905 and previous config saved to /var/cache/conftool/dbconfig/20240911-080817-ladsgroup.json
  • 08:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: post fix', diff saved to https://phabricator.wikimedia.org/P68904 and previous config saved to /var/cache/conftool/dbconfig/20240911-080654-arnaudb.json
  • 08:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 100%: post db2137 → db2237 repool', diff saved to https://phabricator.wikimedia.org/P68903 and previous config saved to /var/cache/conftool/dbconfig/20240911-080319-arnaudb.json
  • 07:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T371742)', diff saved to https://phabricator.wikimedia.org/P68899 and previous config saved to /var/cache/conftool/dbconfig/20240911-075310-ladsgroup.json
  • 07:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2003,2008].codfw.wmnet with reason: Hardware refresh
  • 07:52 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2003,2008].codfw.wmnet with reason: Hardware refresh
  • 07:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: post fix', diff saved to https://phabricator.wikimedia.org/P68898 and previous config saved to /var/cache/conftool/dbconfig/20240911-075149-arnaudb.json
  • 07:49 jayme: evacuating leadership for all partitions assigned to broker id 2003 on kafka-main-codfw - T363210
  • 07:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 75%: post db2137 → db2237 repool', diff saved to https://phabricator.wikimedia.org/P68897 and previous config saved to /var/cache/conftool/dbconfig/20240911-074813-arnaudb.json
  • 07:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: post fix', diff saved to https://phabricator.wikimedia.org/P68896 and previous config saved to /var/cache/conftool/dbconfig/20240911-073643-arnaudb.json
  • 07:34 arnaudb@cumin1002: dbctl commit (dc=all): 'prod issue kmwiki.pagelinks', diff saved to https://phabricator.wikimedia.org/P68895 and previous config saved to /var/cache/conftool/dbconfig/20240911-073420-arnaudb.json
  • 07:33 sgimeno@deploy1003: Finished scap sync-world: Backport for EventStreamConfig and stream registration for homepage modules analytics (T370907) (duration: 13m 56s)
  • 07:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 50%: post db2137 → db2237 repool', diff saved to https://phabricator.wikimedia.org/P68894 and previous config saved to /var/cache/conftool/dbconfig/20240911-073307-arnaudb.json
  • 07:29 sgimeno@deploy1003: sgimeno: Continuing with sync
  • 07:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 T374512', diff saved to https://phabricator.wikimedia.org/P68893 and previous config saved to /var/cache/conftool/dbconfig/20240911-072612-arnaudb.json
  • 07:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2179 T374512', diff saved to https://phabricator.wikimedia.org/P68892 and previous config saved to /var/cache/conftool/dbconfig/20240911-072458-arnaudb.json
  • 07:24 sgimeno@deploy1003: sgimeno: Backport for EventStreamConfig and stream registration for homepage modules analytics (T370907) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2140 to s4 primary T374512', diff saved to https://phabricator.wikimedia.org/P68891 and previous config saved to /var/cache/conftool/dbconfig/20240911-072210-arnaudb.json
  • 07:21 arnaudb: Starting s4 codfw failover from db2179 to db2140 - T374512
  • 07:19 sgimeno@deploy1003: Started scap sync-world: Backport for EventStreamConfig and stream registration for homepage modules analytics (T370907)
  • 07:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 25%: post db2137 → db2237 repool', diff saved to https://phabricator.wikimedia.org/P68890 and previous config saved to /var/cache/conftool/dbconfig/20240911-071802-arnaudb.json
  • 07:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2140 from API/vslow/dump T374512', diff saved to https://phabricator.wikimedia.org/P68889 and previous config saved to /var/cache/conftool/dbconfig/20240911-071335-arnaudb.json
  • 07:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s4 T374512
  • 07:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2140 with weight 0 T374512', diff saved to https://phabricator.wikimedia.org/P68888 and previous config saved to /var/cache/conftool/dbconfig/20240911-071205-arnaudb.json
  • 07:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s4 T374512
  • 07:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 10%: post db2137 → db2237 repool', diff saved to https://phabricator.wikimedia.org/P68887 and previous config saved to /var/cache/conftool/dbconfig/20240911-070254-arnaudb.json
  • 06:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P68884 and previous config saved to /var/cache/conftool/dbconfig/20240911-063458-ladsgroup.json
  • 06:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P68883 and previous config saved to /var/cache/conftool/dbconfig/20240911-061951-ladsgroup.json
  • 06:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T371742)', diff saved to https://phabricator.wikimedia.org/P68882 and previous config saved to /var/cache/conftool/dbconfig/20240911-060444-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T371742)', diff saved to https://phabricator.wikimedia.org/P68881 and previous config saved to /var/cache/conftool/dbconfig/20240911-050506-ladsgroup.json
  • 05:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 05:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 05:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T371742)', diff saved to https://phabricator.wikimedia.org/P68880 and previous config saved to /var/cache/conftool/dbconfig/20240911-050444-ladsgroup.json
  • 04:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P68879 and previous config saved to /var/cache/conftool/dbconfig/20240911-044936-ladsgroup.json
  • 04:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P68878 and previous config saved to /var/cache/conftool/dbconfig/20240911-043429-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T371742)', diff saved to https://phabricator.wikimedia.org/P68877 and previous config saved to /var/cache/conftool/dbconfig/20240911-041922-ladsgroup.json
  • 03:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T371742)', diff saved to https://phabricator.wikimedia.org/P68876 and previous config saved to /var/cache/conftool/dbconfig/20240911-031643-ladsgroup.json
  • 03:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 03:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 03:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T371742)', diff saved to https://phabricator.wikimedia.org/P68875 and previous config saved to /var/cache/conftool/dbconfig/20240911-031621-ladsgroup.json
  • 03:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P68874 and previous config saved to /var/cache/conftool/dbconfig/20240911-030112-ladsgroup.json
  • 02:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P68873 and previous config saved to /var/cache/conftool/dbconfig/20240911-024605-ladsgroup.json
  • 02:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T371742)', diff saved to https://phabricator.wikimedia.org/P68872 and previous config saved to /var/cache/conftool/dbconfig/20240911-023058-ladsgroup.json
  • 01:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T371742)', diff saved to https://phabricator.wikimedia.org/P68871 and previous config saved to /var/cache/conftool/dbconfig/20240911-013327-ladsgroup.json
  • 01:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 01:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 01:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T371742)', diff saved to https://phabricator.wikimedia.org/P68870 and previous config saved to /var/cache/conftool/dbconfig/20240911-013305-ladsgroup.json
  • 01:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P68869 and previous config saved to /var/cache/conftool/dbconfig/20240911-011758-ladsgroup.json
  • 01:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P68868 and previous config saved to /var/cache/conftool/dbconfig/20240911-010250-ladsgroup.json
  • 00:53 eileen: civicrm upgraded from e830e526 to 929101dc
  • 00:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T371742)', diff saved to https://phabricator.wikimedia.org/P68867 and previous config saved to /var/cache/conftool/dbconfig/20240911-004743-ladsgroup.json

2024-09-10

  • 23:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T371742)', diff saved to https://phabricator.wikimedia.org/P68866 and previous config saved to /var/cache/conftool/dbconfig/20240910-235102-ladsgroup.json
  • 23:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 23:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68865 and previous config saved to /var/cache/conftool/dbconfig/20240910-235040-ladsgroup.json
  • 23:49 cstone: civicrm upgraded from 5dd4edc1 to e830e526
  • 23:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P68864 and previous config saved to /var/cache/conftool/dbconfig/20240910-233533-ladsgroup.json
  • 23:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P68863 and previous config saved to /var/cache/conftool/dbconfig/20240910-232026-ladsgroup.json
  • 23:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68862 and previous config saved to /var/cache/conftool/dbconfig/20240910-230518-ladsgroup.json
  • 22:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68861 and previous config saved to /var/cache/conftool/dbconfig/20240910-220748-ladsgroup.json
  • 22:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 22:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T371742)', diff saved to https://phabricator.wikimedia.org/P68860 and previous config saved to /var/cache/conftool/dbconfig/20240910-220726-ladsgroup.json
  • 21:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P68859 and previous config saved to /var/cache/conftool/dbconfig/20240910-215219-ladsgroup.json
  • 21:43 tzatziki: removing 1 file for legal compliance
  • 21:37 tzatziki: removing 8 files for legal compliance
  • 21:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P68858 and previous config saved to /var/cache/conftool/dbconfig/20240910-213712-ladsgroup.json
  • 21:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T371742)', diff saved to https://phabricator.wikimedia.org/P68857 and previous config saved to /var/cache/conftool/dbconfig/20240910-212205-ladsgroup.json
  • 21:14 kindrobot: finish UTC late backport window
  • 21:14 kindrobot: purged AR wiki logos and taglines
  • 21:02 kindrobot@deploy1003: Finished scap sync-world: Backport for [arwiki] Change the wordmark and the tagline (T374430), [arwiki] change Wikipedia logo (T374430) (duration: 34m 17s)
  • 20:57 kindrobot@deploy1003: gergesshamon, kindrobot: Continuing with sync
  • 20:31 tzatziki: removing 9 files for legal compliance
  • 20:30 kindrobot@deploy1003: gergesshamon, kindrobot: Backport for [arwiki] Change the wordmark and the tagline (T374430), [arwiki] change Wikipedia logo (T374430) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:28 kindrobot@deploy1003: Started scap sync-world: Backport for [arwiki] Change the wordmark and the tagline (T374430), [arwiki] change Wikipedia logo (T374430)
  • 20:23 kindrobot@deploy1003: Finished scap sync-world: Backport for Enable native MathML by default on group0 (T373703), Configure QuickSurvey for Web empty search state experiments (T373039) (duration: 09m 37s)
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T371742)', diff saved to https://phabricator.wikimedia.org/P68855 and previous config saved to /var/cache/conftool/dbconfig/20240910-202207-ladsgroup.json
  • 20:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 20:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 20:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T371742)', diff saved to https://phabricator.wikimedia.org/P68854 and previous config saved to /var/cache/conftool/dbconfig/20240910-202145-ladsgroup.json
  • 20:18 kindrobot@deploy1003: kindrobot, jdrewniak, physikerwelt: Continuing with sync
  • 20:15 kindrobot@deploy1003: kindrobot, jdrewniak, physikerwelt: Backport for Enable native MathML by default on group0 (T373703), Configure QuickSurvey for Web empty search state experiments (T373039) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:13 kindrobot@deploy1003: Started scap sync-world: Backport for Enable native MathML by default on group0 (T373703), Configure QuickSurvey for Web empty search state experiments (T373039)
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P68853 and previous config saved to /var/cache/conftool/dbconfig/20240910-200637-ladsgroup.json
  • 19:54 tzatziki: removing 6 files for legal compliance
  • 19:51 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P68852 and previous config saved to /var/cache/conftool/dbconfig/20240910-195130-ladsgroup.json
  • 19:51 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:50 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:50 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:47 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:47 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T371742)', diff saved to https://phabricator.wikimedia.org/P68851 and previous config saved to /var/cache/conftool/dbconfig/20240910-193622-ladsgroup.json
  • 19:20 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:20 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:12 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:12 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:09 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:09 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 19:01 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 19:00 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 18:57 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 18:56 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 18:56 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 18:47 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@71141b8] (releasing): (no justification provided) (duration: 00m 35s)
  • 18:47 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@71141b8] (releasing): (no justification provided)
  • 18:38 swfrench-wmf: ran authdns-update on dns1004 (18:25 UTC) for T372604
  • 18:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T371742)', diff saved to https://phabricator.wikimedia.org/P68849 and previous config saved to /var/cache/conftool/dbconfig/20240910-183055-ladsgroup.json
  • 18:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T371742)', diff saved to https://phabricator.wikimedia.org/P68848 and previous config saved to /var/cache/conftool/dbconfig/20240910-183016-ladsgroup.json
  • 18:26 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@8be3b36] (releasing): (no justification provided) (duration: 05m 43s)
  • 18:20 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@8be3b36] (releasing): (no justification provided)
  • 18:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P68847 and previous config saved to /var/cache/conftool/dbconfig/20240910-181508-ladsgroup.json
  • 18:14 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.22 refs T373641
  • 18:04 swfrench-wmf: ran sre.dns.netbox after adding mwdebug-next LVS VIPs for T372604
  • 18:02 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:01 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly allocated LVS VIPs for mwdebug-next - swfrench@cumin2002"
  • 18:01 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly allocated LVS VIPs for mwdebug-next - swfrench@cumin2002"
  • 18:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P68846 and previous config saved to /var/cache/conftool/dbconfig/20240910-180001-ladsgroup.json
  • 17:57 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 17:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T371742)', diff saved to https://phabricator.wikimedia.org/P68845 and previous config saved to /var/cache/conftool/dbconfig/20240910-174454-ladsgroup.json
  • 17:39 tzatziki: removing 4 files for legal compliance
  • 17:27 tzatziki: removing 15 files for legal compliance
  • 17:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2021.codfw.wmnet with OS bullseye
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68844 and previous config saved to /var/cache/conftool/dbconfig/20240910-170753-arnaudb.json
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68843 and previous config saved to /var/cache/conftool/dbconfig/20240910-170734-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68842 and previous config saved to /var/cache/conftool/dbconfig/20240910-170348-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68841 and previous config saved to /var/cache/conftool/dbconfig/20240910-170348-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68840 and previous config saved to /var/cache/conftool/dbconfig/20240910-170347-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: T373097', diff saved to https://phabricator.wikimedia.org/P68839 and previous config saved to /var/cache/conftool/dbconfig/20240910-170347-arnaudb.json
  • 16:59 cgoubert@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2092.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 16:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68838 and previous config saved to /var/cache/conftool/dbconfig/20240910-165248-arnaudb.json
  • 16:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68837 and previous config saved to /var/cache/conftool/dbconfig/20240910-165228-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68836 and previous config saved to /var/cache/conftool/dbconfig/20240910-164843-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68835 and previous config saved to /var/cache/conftool/dbconfig/20240910-164842-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68834 and previous config saved to /var/cache/conftool/dbconfig/20240910-164842-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: T373097', diff saved to https://phabricator.wikimedia.org/P68833 and previous config saved to /var/cache/conftool/dbconfig/20240910-164842-arnaudb.json
  • 16:43 sukhe: running authdns-update
  • 16:42 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns2005.wikimedia.org [reason: end: T373097 codfw maintenance]
  • 16:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1195 (T371742)', diff saved to https://phabricator.wikimedia.org/P68832 and previous config saved to /var/cache/conftool/dbconfig/20240910-163908-ladsgroup.json
  • 16:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1195.eqiad.wmnet with reason: Maintenance
  • 16:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1195.eqiad.wmnet with reason: Maintenance
  • 16:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T371742)', diff saved to https://phabricator.wikimedia.org/P68831 and previous config saved to /var/cache/conftool/dbconfig/20240910-163846-ladsgroup.json
  • 16:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68830 and previous config saved to /var/cache/conftool/dbconfig/20240910-163742-arnaudb.json
  • 16:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2165 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68829 and previous config saved to /var/cache/conftool/dbconfig/20240910-163722-arnaudb.json
  • 16:34 claime: Repooled kubernetes2040.codfw.wmnet kubernetes2041.codfw.wmnet kubernetes2058.codfw.wmnet mw2440.codfw.wmnet mw2442.codfw.wmnet mw2443.codfw.wmnet parse2011.codfw.wmnet parse2012.codfw.wmnet parse2013.codfw.wmnet wikikube-worker2039.codfw.wmnet - T373097
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68828 and previous config saved to /var/cache/conftool/dbconfig/20240910-163338-arnaudb.json
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68827 and previous config saved to /var/cache/conftool/dbconfig/20240910-163338-arnaudb.json
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68826 and previous config saved to /var/cache/conftool/dbconfig/20240910-163337-arnaudb.json
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: T373097', diff saved to https://phabricator.wikimedia.org/P68825 and previous config saved to /var/cache/conftool/dbconfig/20240910-163337-arnaudb.json
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 100%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68824 and previous config saved to /var/cache/conftool/dbconfig/20240910-163300-arnaudb.json
  • 16:30 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [CheckUser] Don't write to central indexes when no CentralAuth (T374462), Don't attempt to interact with central indexes for some wikis (T374462), Don't attempt to interact with central indexes for some wikis (T374462) (duration: 06m 49s)
  • 16:26 cgoubert@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-worker2092.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 16:25 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 16:25 dreamyjazz@deploy1003: dreamyjazz: Backport for [CheckUser] Don't write to central indexes when no CentralAuth (T374462), Don't attempt to interact with central indexes for some wikis (T374462), Don't attempt to interact with central indexes for some wikis (T374462) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P68823 and previous config saved to /var/cache/conftool/dbconfig/20240910-162339-ladsgroup.json
  • 16:23 dreamyjazz@deploy1003: Started scap sync-world: Backport for [CheckUser] Don't write to central indexes when no CentralAuth (T374462), Don't attempt to interact with central indexes for some wikis (T374462), Don't attempt to interact with central indexes for some wikis (T374462)
  • 16:21 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2012.codfw.wmnet
  • 16:19 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2012.codfw.wmnet
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2039.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2039.codfw.wmnet
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 25%: T373097', diff saved to https://phabricator.wikimedia.org/P68822 and previous config saved to /var/cache/conftool/dbconfig/20240910-161832-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: T373097', diff saved to https://phabricator.wikimedia.org/P68821 and previous config saved to /var/cache/conftool/dbconfig/20240910-161832-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 25%: T373097', diff saved to https://phabricator.wikimedia.org/P68820 and previous config saved to /var/cache/conftool/dbconfig/20240910-161832-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: T373097', diff saved to https://phabricator.wikimedia.org/P68819 and previous config saved to /var/cache/conftool/dbconfig/20240910-161832-arnaudb.json
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2013.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2013.codfw.wmnet
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2012.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2012.codfw.wmnet
  • 16:18 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2025.codfw.wmnet
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host parse2011.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host parse2011.codfw.wmnet
  • 16:18 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2033.codfw.wmnet
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2443.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2443.codfw.wmnet
  • 16:18 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2442.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2442.codfw.wmnet
  • 16:18 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 75%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68818 and previous config saved to /var/cache/conftool/dbconfig/20240910-161753-arnaudb.json
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2440.codfw.wmnet
  • 16:17 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2440.codfw.wmnet
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2058.codfw.wmnet
  • 16:17 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2058.codfw.wmnet
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2041.codfw.wmnet
  • 16:17 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2041.codfw.wmnet
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2040.codfw.wmnet
  • 16:17 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2040.codfw.wmnet
  • 16:14 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:13 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2137.codfw.wmnet onto db2237.codfw.wmnet
  • 16:09 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 26 hosts with reason: Move server uplinks codfw racks C5
  • 16:08 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on 26 hosts with reason: Move server uplinks codfw racks C5
  • 16:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P68817 and previous config saved to /var/cache/conftool/dbconfig/20240910-160831-ladsgroup.json
  • 16:03 topranks: commence maintenance - move server uplinks from old to new switch codfw rack C4 T373097
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 50%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68816 and previous config saved to /var/cache/conftool/dbconfig/20240910-160247-arnaudb.json
  • 16:02 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 23 hosts with reason: Move server uplinks codfw racks C4
  • 16:02 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 23 hosts with reason: Move server uplinks codfw racks C4
  • 15:59 topranks: push server and vlan configuration to lsw1-c5-codfw with Homer to prep physical moves T373097
  • 15:58 topranks: push server and vlan configuration to lsw1-c4-codfw with Homer to prep physical moves T373097
  • 15:57 topranks: move server uplinks in Netbox from asw-c4-codfw to lsw1-c4-codfw to prep physical moves T373097
  • 15:56 topranks: move server uplinks in Netbox from asw-c5-codfw to lsw1-c5-codfw to prep physical moves T373097
  • 15:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T371742)', diff saved to https://phabricator.wikimedia.org/P68815 and previous config saved to /var/cache/conftool/dbconfig/20240910-155324-ladsgroup.json
  • 15:48 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2021.codfw.wmnet with OS bullseye
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 25%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68814 and previous config saved to /var/cache/conftool/dbconfig/20240910-154740-arnaudb.json
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: network maintenance T373097
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: network maintenance T373097
  • 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2126 db2165 db2166 db2192 db2208 es2037 - T370852', diff saved to https://phabricator.wikimedia.org/P68813 and previous config saved to /var/cache/conftool/dbconfig/20240910-154540-arnaudb.json
  • 15:39 mutante: enabling throttling on Gerrit hosts - T365259
  • 15:39 jelto: enabling throttling on GitLab hosts - T366882
  • 15:34 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2012.codfw.wmnet
  • 15:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wikikube-worker2092.codfw.wmnet with reason: Degraded RAID
  • 15:34 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wikikube-worker2092.codfw.wmnet with reason: Degraded RAID
  • 15:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
  • 15:33 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
  • 15:32 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2012.codfw.wmnet
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 15%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68811 and previous config saved to /var/cache/conftool/dbconfig/20240910-153234-arnaudb.json
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 5%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68810 and previous config saved to /var/cache/conftool/dbconfig/20240910-151729-arnaudb.json
  • 15:16 ebernhardson@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:16 ebernhardson@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:12 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:12 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns2005.wikimedia.org [reason: T373097 codfw maintenance]
  • 15:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2205.codfw.wmnet with reason: T374425
  • 15:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2205.codfw.wmnet with reason: T374425
  • 15:05 brennen@deploy1003: Finished deploy [phabricator/deployment@84ada67]: deploy phab1004 for T374458 (duration: 00m 50s)
  • 15:04 brennen@deploy1003: Started deploy [phabricator/deployment@84ada67]: deploy phab1004 for T374458
  • 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@84ada67]: deploy phab2002 for T374458 (duration: 00m 36s)
  • 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@84ada67]: deploy phab2002 for T374458
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update
  • 15:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update
  • 15:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 4%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68809 and previous config saved to /var/cache/conftool/dbconfig/20240910-150222-arnaudb.json
  • 15:02 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update
  • 15:02 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2011.codfw.wmnet
  • 15:02 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update
  • 15:01 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update
  • 14:57 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2011.codfw.wmnet
  • 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T371742)', diff saved to https://phabricator.wikimedia.org/P68808 and previous config saved to /var/cache/conftool/dbconfig/20240910-144725-ladsgroup.json
  • 14:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 3%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68807 and previous config saved to /var/cache/conftool/dbconfig/20240910-144716-arnaudb.json
  • 14:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T371742)', diff saved to https://phabricator.wikimedia.org/P68806 and previous config saved to /var/cache/conftool/dbconfig/20240910-144703-ladsgroup.json
  • 14:37 sukhe: sudo cumin -b11 "A:cp" 'run-puppet-agent --enable "merging CRs 1065286"'
  • 14:35 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2033.codfw.wmnet
  • 14:35 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2025.codfw.wmnet
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 2%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68805 and previous config saved to /var/cache/conftool/dbconfig/20240910-143211-arnaudb.json
  • 14:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P68804 and previous config saved to /var/cache/conftool/dbconfig/20240910-143156-ladsgroup.json
  • 14:29 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:28 sukhe: [end] rolling restart of {pdns-recursor,haproxy}.service on A:dnsbox
  • 14:25 jayme: restoring leadership for partitions assigned to broker id 2002 on kafka-main-codfw - T363210
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 1%: post reimage && maintenance repool', diff saved to https://phabricator.wikimedia.org/P68802 and previous config saved to /var/cache/conftool/dbconfig/20240910-141705-arnaudb.json
  • 14:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P68801 and previous config saved to /var/cache/conftool/dbconfig/20240910-141649-ladsgroup.json
  • 14:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2039.codfw.wmnet
  • 14:16 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles zhwiki | tee T363538-zhwiki-cleanupTitles
  • 14:15 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes zhwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-zhwiki-namespaceDupes
  • 14:15 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2039.codfw.wmnet
  • 14:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2013.codfw.wmnet
  • 14:15 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2137.codfw.wmnet onto db2237.codfw.wmnet
  • 14:15 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2013.codfw.wmnet
  • 14:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2012.codfw.wmnet
  • 14:14 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles thwiki | tee T363538-thwiki-cleanupTitles
  • 14:14 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CRs 1065286"'
  • 14:14 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2012.codfw.wmnet
  • 14:14 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host parse2011.codfw.wmnet
  • 14:14 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes thwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-thwiki-namespaceDupes
  • 14:13 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles slwiki | tee T363538-slwiki-cleanupTitles
  • 14:13 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes slwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-slwiki-namespaceDupes
  • 14:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles simplewiki | tee T363538-simplewiki-cleanupTitles
  • 14:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes simplewiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-simplewiki-namespaceDupes
  • 14:11 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host parse2011.codfw.wmnet
  • 14:11 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2443.codfw.wmnet
  • 14:11 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kafka-main2002.codfw.wmnet
  • 14:11 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:10 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 14:10 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kafka-main2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 14:10 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2443.codfw.wmnet
  • 14:10 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2442.codfw.wmnet
  • 14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles mswiki | tee T363538-mswiki-cleanupTitles
  • 14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes mswiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-mswiki-namespaceDupes
  • 14:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2442.codfw.wmnet
  • 14:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2440.codfw.wmnet
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2137 in db2237 for T373579', diff saved to https://phabricator.wikimedia.org/P68792 and previous config saved to /var/cache/conftool/dbconfig/20240910-140918-arnaudb.json
  • 14:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2440.codfw.wmnet
  • 14:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2058.codfw.wmnet
  • 14:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2227.codfw.wmnet onto db2127.codfw.wmnet
  • 14:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2058.codfw.wmnet
  • 14:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2041.codfw.wmnet
  • 14:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: provisioning db2237.codfw.wmnet - T373579
  • 14:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: provisioning db2237.codfw.wmnet - T373579
  • 14:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: provisioning db2237.codfw.wmnet - T373579
  • 14:07 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2041.codfw.wmnet
  • 14:07 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2040.codfw.wmnet
  • 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: provisioning db2237.codfw.wmnet - T373579
  • 14:07 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2040.codfw.wmnet
  • 14:06 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 14:06 claime: Depooling kubernetes2040.codfw.wmnet kubernetes2041.codfw.wmnet kubernetes2058.codfw.wmnet mw2440.codfw.wmnet mw2442.codfw.wmnet mw2443.codfw.wmnet parse2011.codfw.wmnet parse2012.codfw.wmnet parse2013.codfw.wmnet wikikube-worker2039.codfw.wmnet - T373097
  • 14:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles jawiki | tee T363538-jawiki-cleanupTitles
  • 14:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes jawiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-jawiki-namespaceDupes
  • 14:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T371742)', diff saved to https://phabricator.wikimedia.org/P68788 and previous config saved to /var/cache/conftool/dbconfig/20240910-140141-ladsgroup.json
  • 14:01 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles idwiki | tee T363538-idwiki-cleanupTitles
  • 14:01 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kafka-main2002.codfw.wmnet
  • 14:01 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes idwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-idwiki-namespaceDupes
  • 13:59 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles dagwiki | tee T363538-dagwiki-cleanupTitles
  • 13:58 sukhe: sudo cumin -b11 "A:cp" 'run-puppet-agent --enable "merging CRs 1065283"'
  • 13:57 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes dagwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-dagwiki-namespaceDupes
  • 13:52 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles bnwiki | tee T363538-bnwiki-cleanupTitles
  • 13:51 jebe@deploy1003: Finished deploy [analytics/refinery@464c114] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@464c114d] (duration: 03m 43s)
  • 13:51 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes bnwiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-suffix=/T363538 --fix | tee T363538-bnwiki-namespaceDupes
  • 13:50 sukhe: sudo cumin "A:cp" 'disable-puppet "merging CRs 1065283"'
  • 13:48 jebe@deploy1003: Started deploy [analytics/refinery@464c114] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@464c114d]
  • 13:47 jebe@deploy1003: Finished deploy [analytics/refinery@464c114] (thin): Regular analytics weekly train THIN [analytics/refinery@464c114d] (duration: 04m 58s)
  • 13:43 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript cleanupTitles aswiki | tee T363538-aswiki-cleanupTitles
  • 13:42 jebe@deploy1003: Started deploy [analytics/refinery@464c114] (thin): Regular analytics weekly train THIN [analytics/refinery@464c114d]
  • 13:42 jebe@deploy1003: Finished deploy [analytics/refinery@464c114]: Regular analytics weekly train [analytics/refinery@464c114d] (duration: 07m 22s)
  • 13:42 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php aswiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --add-prefix=T363538/ --fix | tee T363538-aswiki-namespaceDupes-prefix
  • 13:38 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php aswiki --source-pseudo-namespace MOS --dest-namespace 126 --move-talk --fix | tee T363538-aswiki-namespaceDupes # crashed, DBQueryError
  • 13:35 jebe@deploy1003: Started deploy [analytics/refinery@464c114]: Regular analytics weekly train [analytics/refinery@464c114d]
  • 13:35 logmsgbot: lucaswerkmeister-wmde@deploy1003 Finished scap sync-world: Backport for Elevate pseudo-namespace MOS to a real namespace on most wikis which use it (T363538) (duration: 13m 45s)
  • 13:30 logmsgbot: lucaswerkmeister-wmde@deploy1003 lucaswerkmeister-wmde, cscott: Continuing with sync
  • 13:28 logmsgbot: lucaswerkmeister-wmde@deploy1003 lucaswerkmeister-wmde, cscott: Backport for Elevate pseudo-namespace MOS to a real namespace on most wikis which use it (T363538) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc2017.codfw.wmnet,pc1017.eqiad.wmnet with reason: T374355
  • 13:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc2017.codfw.wmnet,pc1017.eqiad.wmnet with reason: T374355
  • 13:21 logmsgbot: lucaswerkmeister-wmde@deploy1003 Started scap sync-world: Backport for Elevate pseudo-namespace MOS to a real namespace on most wikis which use it (T363538)
  • 13:18 logmsgbot: lucaswerkmeister-wmde@deploy1003 Finished scap sync-world: Backport for Support new heading layout (T373039 T374377) (duration: 12m 11s)
  • 13:14 logmsgbot: lucaswerkmeister-wmde@deploy1003 lucaswerkmeister-wmde, jdlrobson: Continuing with sync
  • 13:10 logmsgbot: lucaswerkmeister-wmde@deploy1003 lucaswerkmeister-wmde, jdlrobson: Backport for Support new heading layout (T373039 T374377) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:09 sukhe: rolling restart of {pdns-recursor,haproxy}.service on A:dnsbox
  • 13:06 logmsgbot: lucaswerkmeister-wmde@deploy1003 Started scap sync-world: Backport for Support new heading layout (T373039 T374377)
  • 12:58 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2227.codfw.wmnet onto db2127.codfw.wmnet
  • 12:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2205.codfw.wmnet with reason: maintenance
  • 12:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2205.codfw.wmnet with reason: maintenance
  • 12:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1184 (T371742)', diff saved to https://phabricator.wikimedia.org/P68777 and previous config saved to /var/cache/conftool/dbconfig/20240910-125705-ladsgroup.json
  • 12:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 12:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 12:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T371742)', diff saved to https://phabricator.wikimedia.org/P68776 and previous config saved to /var/cache/conftool/dbconfig/20240910-125643-ladsgroup.json
  • 12:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: provisioning db2127.codfw.wmnet - T373579
  • 12:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: provisioning db2127.codfw.wmnet - T373579
  • 12:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: provisioning db2127.codfw.wmnet - T373579
  • 12:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: provisioning db2127.codfw.wmnet - T373579
  • 12:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2127.codfw.wmnet
  • 12:45 Amir1: dropping bv2013_edits table everywhere
  • 12:43 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2127.codfw.wmnet
  • 12:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P68774 and previous config saved to /var/cache/conftool/dbconfig/20240910-124136-ladsgroup.json
  • 12:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P68773 and previous config saved to /var/cache/conftool/dbconfig/20240910-122629-ladsgroup.json
  • 12:24 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 12:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2007.codfw.wmnet
  • 12:23 jayme@cumin1002: START - Cookbook sre.hosts.remove-downtime for kafka-main2007.codfw.wmnet
  • 12:19 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 12:18 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 12:13 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 12:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T371742)', diff saved to https://phabricator.wikimedia.org/P68772 and previous config saved to /var/cache/conftool/dbconfig/20240910-121122-ladsgroup.json
  • 12:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T370903)', diff saved to https://phabricator.wikimedia.org/P68771 and previous config saved to /var/cache/conftool/dbconfig/20240910-120357-ladsgroup.json
  • 11:58 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 11:58 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 11:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P68770 and previous config saved to /var/cache/conftool/dbconfig/20240910-114850-ladsgroup.json
  • 11:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P68769 and previous config saved to /var/cache/conftool/dbconfig/20240910-113342-ladsgroup.json
  • 11:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T370903)', diff saved to https://phabricator.wikimedia.org/P68768 and previous config saved to /var/cache/conftool/dbconfig/20240910-111835-ladsgroup.json
  • 11:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T371742)', diff saved to https://phabricator.wikimedia.org/P68767 and previous config saved to /var/cache/conftool/dbconfig/20240910-110614-ladsgroup.json
  • 11:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 11:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 11:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T370903)', diff saved to https://phabricator.wikimedia.org/P68766 and previous config saved to /var/cache/conftool/dbconfig/20240910-110409-ladsgroup.json
  • 11:04 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 11:04 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 10:51 vgutierrez: switching purged in codfw and ulsfo to use main-eqiad kafka cluster - T373189
  • 10:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:44 moritzm: installing bind9 security updates (client-side tools/libs only)
  • 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:14 hnowlan@deploy1003: Finished scap sync-world: Backport for Enable Copyupload-allowed-domain on testwiki, disable on test2 (T356241) (duration: 09m 39s)
  • 10:09 hnowlan@deploy1003: hnowlan: Continuing with sync
  • 10:08 hnowlan@deploy1003: hnowlan: Backport for Enable Copyupload-allowed-domain on testwiki, disable on test2 (T356241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:05 hnowlan@deploy1003: Started scap sync-world: Backport for Enable Copyupload-allowed-domain on testwiki, disable on test2 (T356241)
  • 09:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2431.codfw.wmnet
  • 09:35 cgoubert@cumin1002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2431.codfw.wmnet
  • 09:11 arnaudb@cumin1002: dbctl commit (dc=all): 'T374421 → replag not catching up on exec, ^C to debug', diff saved to https://phabricator.wikimedia.org/P68763 and previous config saved to /var/cache/conftool/dbconfig/20240910-091114-arnaudb.json
  • 09:10 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 09:10 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 09:03 jayme@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
  • 09:02 hashar: Restarting CI Jenkins
  • 09:02 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:02 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 09:01 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:00 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 08:59 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 08:58 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 08:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2205 with weight 0 T374421', diff saved to https://phabricator.wikimedia.org/P68761 and previous config saved to /var/cache/conftool/dbconfig/20240910-085854-arnaudb.json
  • 08:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421
  • 08:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421
  • 08:51 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 08:47 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 08:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 08:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 08:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 08:44 jayme@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
  • 08:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 08:39 moritzm: installing Java security updates on puppetservers
  • 08:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2002,2007].codfw.wmnet with reason: Hardware refresh
  • 08:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2002,2007].codfw.wmnet with reason: Hardware refresh
  • 08:13 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:12 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: https://phabricator.wikimedia.org/T374215 → server depooled has hardware issues
  • 08:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: https://phabricator.wikimedia.org/T374215 → server depooled has hardware issues
  • 08:08 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:07 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:55 jayme: evacuating leadership for all partitions assigned to broker id 2002 on kafka-main-codfw - T363210
  • 07:20 dcausse@deploy1003: Finished scap sync-world: Backport for search: use the stem field when searching mul labels (T371401) (duration: 17m 22s)
  • 07:15 dcausse@deploy1003: dcausse: Continuing with sync
  • 07:10 dcausse@deploy1003: dcausse: Backport for search: use the stem field when searching mul labels (T371401) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:03 dcausse@deploy1003: Started scap sync-world: Backport for search: use the stem field when searching mul labels (T371401)
  • 06:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2017.codfw.wmnet with OS bookworm
  • 06:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2017.codfw.wmnet with reason: host reimage
  • 06:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 06:31 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1017.eqiad.wmnet with reason: host reimage
  • 06:18 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc2017.codfw.wmnet with OS bookworm
  • 06:16 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host pc1017.eqiad.wmnet with OS bookworm
  • 06:11 kart_: Updated cxserver to 2024-08-28-053620-production
  • 06:11 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:10 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:47 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:46 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:37 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:36 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:01 mwpresync@deploy1003: Pruned MediaWiki: 1.43.0-wmf.19 (duration: 00m 58s)
  • 03:47 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.43.0-wmf.22 refs T373641 (duration: 45m 06s)
  • 03:02 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.43.0-wmf.22 refs T373641

2024-09-09

  • 23:00 toyofuku@deploy1003: Finished scap sync-world: Backport for Enable appearance menu for all logged in users on all projects (T371020) (duration: 12m 40s)
  • 22:56 toyofuku@deploy1003: toyofuku, jdlrobson: Continuing with sync
  • 22:50 toyofuku@deploy1003: toyofuku, jdlrobson: Backport for Enable appearance menu for all logged in users on all projects (T371020) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:48 toyofuku@deploy1003: Started scap sync-world: Backport for Enable appearance menu for all logged in users on all projects (T371020)
  • 21:37 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:37 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:35 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:35 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:35 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:35 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:29 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:29 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:27 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:26 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:24 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:24 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:23 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:19 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:19 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:19 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:19 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:18 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:17 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:15 bking@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 21:15 bking@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 21:02 kindrobot: finished UTC late backport window. BACKPORTED: 1071281, 1071202, NOT backported (ran out of time): 1071037, 1070347, 1070354, 1070975
  • 21:01 kindrobot@deploy1003: Finished scap sync-world: Backport for Release donate link to pilot wikis (French Wikipedia and Wikifunctions) (T373585), Fix typo in browser vendor prefix (T374180) (duration: 12m 39s)
  • 20:57 kindrobot@deploy1003: jforrester, toyofuku, kindrobot: Continuing with sync
  • 20:51 kindrobot@deploy1003: jforrester, toyofuku, kindrobot: Backport for Release donate link to pilot wikis (French Wikipedia and Wikifunctions) (T373585), Fix typo in browser vendor prefix (T374180) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:49 kindrobot@deploy1003: Started scap sync-world: Backport for Release donate link to pilot wikis (French Wikipedia and Wikifunctions) (T373585), Fix typo in browser vendor prefix (T374180)
  • 19:33 jforrester@deploy1003: Finished scap sync-world: Backport for ZObjectStructureValidator::validate: use set_time_limit() to limit in the case of run-away JsonSchema (T374241) (duration: 08m 19s)
  • 19:28 jforrester@deploy1003: jforrester: Continuing with sync
  • 19:27 jforrester@deploy1003: jforrester: Backport for ZObjectStructureValidator::validate: use set_time_limit() to limit in the case of run-away JsonSchema (T374241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:25 jforrester@deploy1003: Started scap sync-world: Backport for ZObjectStructureValidator::validate: use set_time_limit() to limit in the case of run-away JsonSchema (T374241)
  • 18:16 ejegg: payments-wiki upgraded from 4f3500f7 to 672c9fb6
  • 17:44 jforrester@deploy1003: Finished scap sync-world: Backport for ZObjectStore::findZTesterResult: Trim our own error so we don't break logstash (T374241) (duration: 12m 05s)
  • 17:39 jforrester@deploy1003: jforrester: Continuing with sync
  • 17:33 jforrester@deploy1003: jforrester: Backport for ZObjectStore::findZTesterResult: Trim our own error so we don't break logstash (T374241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:31 jforrester@deploy1003: Started scap sync-world: Backport for ZObjectStore::findZTesterResult: Trim our own error so we don't break logstash (T374241)
  • 17:15 ebernhardson@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:15 ebernhardson@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:08 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:07 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add nes frack firewalls - pt1979@cumin2002"
  • 17:05 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add nes frack firewalls - pt1979@cumin2002"
  • 17:02 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:02 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:54 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Manuel Merz (WMDE) out of all services on: 1552 hosts
  • 16:53 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Manuel Merz (WMDE) out of all services on: 1552 hosts
  • 16:52 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Manuel Merz (WMDE) out of all services on: 677 hosts
  • 16:52 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Manuel Merz (WMDE) out of all services on: 677 hosts
  • 16:46 ejegg: payments-wiki upgraded from e47e61cb to 4f3500f7
  • 16:26 hnowlan@deploy1003: Finished scap sync-world: Backport for Enable async uploads on test2wiki (T356241) (duration: 11m 11s)
  • 16:20 hnowlan@deploy1003: hnowlan: Continuing with sync
  • 16:20 cgoubert@cumin1002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 2 hosts: mw[2428-2429].codfw.wmnet
  • 16:20 cgoubert@cumin1002: START - Cookbook sre.debmonitor.remove-hosts for 2 hosts: mw[2428-2429].codfw.wmnet
  • 16:19 hnowlan@deploy1003: hnowlan: Backport for Enable async uploads on test2wiki (T356241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2106.codfw.wmnet
  • 16:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2106.codfw.wmnet
  • 16:17 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2106.codfw.wmnet
  • 16:17 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 16:15 hnowlan@deploy1003: Started scap sync-world: Backport for Enable async uploads on test2wiki (T356241)
  • 16:13 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 16:13 elukey@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2001.codfw.wmnet with OS bookworm
  • 16:10 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2106.codfw.wmnet with OS bullseye
  • 16:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2105.codfw.wmnet
  • 16:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2105.codfw.wmnet
  • 16:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2105.codfw.wmnet
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2095.codfw.wmnet
  • 16:09 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2095.codfw.wmnet
  • 16:05 hnowlan: homer lsw1-b5-codfw* commit
  • 16:05 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 10s)
  • 16:04 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 16:04 claime: homer lsw1-b6-codfw* commit 'T372878'
  • 16:03 claime: homer cr*codfw* commit 'T372878'
  • 16:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2105.codfw.wmnet with OS bullseye
  • 15:53 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.7.0 plugin update for cephosd bgp - cmooney@cumin1002
  • 15:48 hnowlan@cumin1002: END (ERROR) - Cookbook sre.k8s.pool-depool-node (exit_code=97) pool for host wikikube-worker2095.codfw.wmnet
  • 15:48 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2095.codfw.wmnet
  • 15:47 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.7.0 plugin update for cephosd bgp - cmooney@cumin1002
  • 15:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 15:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 15:41 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 15:41 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2106.codfw.wmnet with reason: host reimage
  • 15:41 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm
  • 15:40 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw2-d-eqiad
  • 15:40 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw2-d-eqiad
  • 15:37 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2105.codfw.wmnet with reason: host reimage
  • 15:32 sukhe: restart bird on durum1001
  • 15:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw2-d-eqiad with reason: replace vcp link from d2 port 51 to d4 port 52
  • 15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw2-d-eqiad with reason: replace vcp link from d2 port 51 to d4 port 52
  • 15:26 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on wikikube-worker2095.codfw.wmnet with reason: host reimage
  • 15:26 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2095.codfw.wmnet with reason: host reimage
  • 15:09 moritzm: installing imagemagick security updates
  • 15:08 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 14:55 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2106
  • 14:55 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2106
  • 14:54 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2106
  • 14:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2106.codfw.wmnet 57.16.192.10.in-addr.arpa 7.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:54 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2106.codfw.wmnet 57.16.192.10.in-addr.arpa 7.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2106 - cgoubert@cumin1002"
  • 14:54 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2106 - cgoubert@cumin1002"
  • 14:53 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 14:53 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 14:53 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 14:53 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 14:51 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 14:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T371742)', diff saved to https://phabricator.wikimedia.org/P68759 and previous config saved to /var/cache/conftool/dbconfig/20240909-145145-ladsgroup.json
  • 14:51 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 14:51 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:50 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2106
  • 14:50 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2106.codfw.wmnet with OS bullseye
  • 14:49 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2106.codfw.wmnet
  • 14:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2105
  • 14:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2105
  • 14:49 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2105
  • 14:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2105.codfw.wmnet 56.16.192.10.in-addr.arpa 6.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2105.codfw.wmnet 56.16.192.10.in-addr.arpa 6.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2105 - cgoubert@cumin1002"
  • 14:49 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2105 - cgoubert@cumin1002"
  • 14:49 sukhe: sudo cumin -b1 -s300 'A:dnsbox and A:edges' 'systemctl restart ntpsec.service'
  • 14:45 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:45 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2105
  • 14:45 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2105.codfw.wmnet with OS bullseye
  • 14:44 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2105.codfw.wmnet
  • 14:44 sukhe@cumin1002: END (ERROR) - Cookbook sre.dns.roll-restart-ntp (exit_code=97) rolling restart_daemons on A:dnsbox and not P{dns1004* or dns1005*} and A:dnsbox
  • 14:43 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2105.codfw.wmnet wikikube-worker2106.codfw.wmnet on all recursors
  • 14:43 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2105.codfw.wmnet wikikube-worker2106.codfw.wmnet on all recursors
  • 14:40 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2429 to wikikube-worker2106
  • 14:39 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2106
  • 14:39 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2106
  • 14:39 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:39 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2429 to wikikube-worker2106 - cgoubert@cumin1002"
  • 14:38 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2429 to wikikube-worker2106 - cgoubert@cumin1002"
  • 14:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P68758 and previous config saved to /var/cache/conftool/dbconfig/20240909-143638-ladsgroup.json
  • 14:34 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:34 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2429 to wikikube-worker2106
  • 14:31 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2428 to wikikube-worker2105
  • 14:30 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2105
  • 14:30 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2105
  • 14:30 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:30 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2428 to wikikube-worker2105 - cgoubert@cumin1002"
  • 14:28 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2428 to wikikube-worker2105 - cgoubert@cumin1002"
  • 14:25 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:24 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2428 to wikikube-worker2105
  • 14:22 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2429.codfw.wmnet
  • 14:21 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2429.codfw.wmnet
  • 14:21 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2428.codfw.wmnet
  • 14:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P68757 and previous config saved to /var/cache/conftool/dbconfig/20240909-142131-ladsgroup.json
  • 14:20 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2428.codfw.wmnet
  • 14:17 jforrester@deploy1003: Finished scap sync-world: Backport for Revert "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315), Enable Copyupload-allowed-domains on test2wiki (T356241) (duration: 12m 16s)
  • 14:13 jforrester@deploy1003: seanleong-wmde, jforrester, hnowlan: Continuing with sync
  • 14:08 jforrester@deploy1003: seanleong-wmde, jforrester, hnowlan: Backport for Revert "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315), Enable Copyupload-allowed-domains on test2wiki (T356241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:07 sukhe@cumin1002: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and not P{dns1004* or dns1005*} and A:dnsbox
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T371742)', diff saved to https://phabricator.wikimedia.org/P68756 and previous config saved to /var/cache/conftool/dbconfig/20240909-140623-ladsgroup.json
  • 14:05 jforrester@deploy1003: Started scap sync-world: Backport for Revert "Activate feature flag for moving wikibase item to Other Projects sidebar in pilot wikis." (T66315), Enable Copyupload-allowed-domains on test2wiki (T356241)
  • 14:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:01 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2104.codfw.wmnet
  • 14:01 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2104.codfw.wmnet
  • 14:01 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2104.codfw.wmnet
  • 14:00 sukhe@cumin1002: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough
  • 13:58 jforrester@deploy1003: Finished scap sync-world: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021), tests: Disable all Beta Cluster CI testing, all failing (T374242), Don't pass empty type/returnType to zobject lookup when undefined (T374199), [[gerrit:1071265|Use default width/height on gallery to avoid parser instance (T37414
  • 13:55 kamila_: homer lsw1-b6-codfw* commit 'T372878'
  • 13:55 kamila_: homer cr*codfw* commit 'T372878'
  • 13:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2104.codfw.wmnet with OS bullseye
  • 13:52 jforrester@deploy1003: dreamyjazz, jforrester: Continuing with sync
  • 13:50 jforrester@deploy1003: dreamyjazz, jforrester: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021), tests: Disable all Beta Cluster CI testing, all failing (T374242), Don't pass empty type/returnType to zobject lookup when undefined (T374199), [[gerrit:1071265|Use default width/height on gallery to avoid parser instance (T374146)
  • 13:49 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 11s)
  • 13:49 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 13:47 mszabo@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: sync
  • 13:46 sukhe@cumin1002: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough
  • 13:46 mszabo@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: sync
  • 13:46 jforrester@deploy1003: Started scap sync-world: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021), tests: Disable all Beta Cluster CI testing, all failing (T374242), Don't pass empty type/returnType to zobject lookup when undefined (T374199), [[gerrit:1071265|Use default width/height on gallery to avoid parser instance (T374146
  • 13:45 James_F: kill 4135240 # scap thread with no attached screen
  • 13:41 mszabo@deploy1003: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 13:40 mszabo@deploy1003: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 13:40 jforrester@deploy1003: dreamyjazz, jforrester: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021), tests: Disable all Beta Cluster CI testing, all failing (T374242), Don't pass empty type/returnType to zobject lookup when undefined (T374199), [[gerrit:1071265|Use default width/height on gallery to avoid parser instance (T374146)
  • 13:39 mszabo@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 13:39 mszabo@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 13:38 mszabo@deploy1003: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 13:37 mszabo@deploy1003: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 13:37 jforrester@deploy1003: Started scap sync-world: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021), tests: Disable all Beta Cluster CI testing, all failing (T374242), Don't pass empty type/returnType to zobject lookup when undefined (T374199), [[gerrit:1071265|Use default width/height on gallery to avoid parser instance (T374146
  • 13:34 sukhe: sudo cumin "A:dnsbox" 'disable-puppet "merging CR 1071616"'
  • 13:32 milimetric@deploy1003: Finished deploy [airflow-dags/platform_eng@574f0de]: (no justification provided) (duration: 00m 26s)
  • 13:32 milimetric@deploy1003: Started deploy [airflow-dags/platform_eng@574f0de]: (no justification provided)
  • 13:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 13:13 dreamyjazz@deploy1003: dreamyjazz: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 13:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021)
  • 13:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2104.codfw.wmnet with reason: host reimage
  • 12:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for Define wgCheckUserCentralIndexRangesToExclude to exclude WMCS (T373021)
  • 12:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2104
  • 12:48 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2104
  • 12:48 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2104
  • 12:48 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2104.codfw.wmnet 61.16.192.10.in-addr.arpa 1.6.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:48 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2104.codfw.wmnet 61.16.192.10.in-addr.arpa 1.6.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:48 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:48 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2104 - kamila@cumin1002"
  • 12:48 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2104 - kamila@cumin1002"
  • 12:43 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 12:43 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2104
  • 12:43 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2104.codfw.wmnet with OS bullseye
  • 12:43 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2104.codfw.wmnet
  • 12:41 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2104.codfw.wmnet on all recursors
  • 12:41 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2104.codfw.wmnet on all recursors
  • 12:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:38 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2431 to wikikube-worker2104
  • 12:37 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2104
  • 12:37 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2104
  • 12:37 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:37 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2431 to wikikube-worker2104 - kamila@cumin1002"
  • 12:36 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2431 to wikikube-worker2104 - kamila@cumin1002"
  • 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
  • 12:33 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 12:33 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2431 to wikikube-worker2104
  • 12:29 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2431.codfw.wmnet
  • 12:28 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2431.codfw.wmnet
  • 12:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T370903)', diff saved to https://phabricator.wikimedia.org/P68755 and previous config saved to /var/cache/conftool/dbconfig/20240909-122321-ladsgroup.json
  • 12:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host dragonfly-supernode2001.codfw.wmnet
  • 12:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P68754 and previous config saved to /var/cache/conftool/dbconfig/20240909-120814-ladsgroup.json
  • 12:07 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 11:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 11:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P68753 and previous config saved to /var/cache/conftool/dbconfig/20240909-115306-ladsgroup.json
  • 11:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T371742)', diff saved to https://phabricator.wikimedia.org/P68752 and previous config saved to /var/cache/conftool/dbconfig/20240909-113849-ladsgroup.json
  • 11:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T370903)', diff saved to https://phabricator.wikimedia.org/P68751 and previous config saved to /var/cache/conftool/dbconfig/20240909-113759-ladsgroup.json
  • 11:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 100%: post db2227 clone repool', diff saved to https://phabricator.wikimedia.org/P68750 and previous config saved to /var/cache/conftool/dbconfig/20240909-113613-arnaudb.json
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T370903)', diff saved to https://phabricator.wikimedia.org/P68749 and previous config saved to /var/cache/conftool/dbconfig/20240909-113110-ladsgroup.json
  • 11:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:30 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 11:25 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 11:25 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 11:21 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 11:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 75%: post db2227 clone repool', diff saved to https://phabricator.wikimedia.org/P68748 and previous config saved to /var/cache/conftool/dbconfig/20240909-112107-arnaudb.json
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2379.codfw.wmnet
  • 11:09 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2379.codfw.wmnet
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2435.codfw.wmnet
  • 11:09 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2435.codfw.wmnet
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2434.codfw.wmnet
  • 11:09 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2434.codfw.wmnet
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2430.codfw.wmnet
  • 11:09 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2430.codfw.wmnet
  • 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2423.codfw.wmnet
  • 11:08 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2423.codfw.wmnet
  • 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2422.codfw.wmnet
  • 11:08 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2422.codfw.wmnet
  • 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2420.codfw.wmnet
  • 11:08 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2420.codfw.wmnet
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2407.codfw.wmnet
  • 11:07 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2407.codfw.wmnet
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2406.codfw.wmnet
  • 11:07 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2406.codfw.wmnet
  • 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2402.codfw.wmnet
  • 11:07 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2402.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2389.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2389.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2388.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2388.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2387.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2387.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2386.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2386.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2385.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2385.codfw.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2384.codfw.wmnet
  • 11:06 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2384.codfw.wmnet
  • 11:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 50%: post db2227 clone repool', diff saved to https://phabricator.wikimedia.org/P68747 and previous config saved to /var/cache/conftool/dbconfig/20240909-110601-arnaudb.json
  • 11:05 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2383.codfw.wmnet
  • 11:05 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2383.codfw.wmnet
  • 11:05 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2382.codfw.wmnet
  • 11:05 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2382.codfw.wmnet
  • 11:05 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2381.codfw.wmnet
  • 11:05 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2381.codfw.wmnet
  • 11:05 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2380.codfw.wmnet
  • 11:05 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2380.codfw.wmnet
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2378.codfw.wmnet
  • 11:03 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2378.codfw.wmnet
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2377.codfw.wmnet
  • 11:03 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2377.codfw.wmnet
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2319.codfw.wmnet
  • 11:02 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2319.codfw.wmnet
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2318.codfw.wmnet
  • 11:02 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2318.codfw.wmnet
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2317.codfw.wmnet
  • 11:02 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2317.codfw.wmnet
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2316.codfw.wmnet
  • 11:02 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2316.codfw.wmnet
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2312.codfw.wmnet
  • 11:02 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2312.codfw.wmnet
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2296.codfw.wmnet
  • 11:01 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2296.codfw.wmnet
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2295.codfw.wmnet
  • 11:01 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2295.codfw.wmnet
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2293.codfw.wmnet
  • 11:01 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2293.codfw.wmnet
  • 10:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2127 (re)pooling @ 25%: post db2227 clone repool', diff saved to https://phabricator.wikimedia.org/P68746 and previous config saved to /var/cache/conftool/dbconfig/20240909-105056-arnaudb.json
  • 10:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2127.codfw.wmnet onto db2227.codfw.wmnet
  • 10:34 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:33 jelto@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 10:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:32 jelto@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:31 jelto@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:30 jelto@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 10:19 jelto@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:18 jelto@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:44 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:44 isaranto@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:42 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2127.codfw.wmnet onto db2227.codfw.wmnet
  • 09:25 moritzm: removing libssl1.1 from prometheus hosts which were dist-upgraded from bullseye to bookworm
  • 09:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2127 in db2227 for T373579', diff saved to https://phabricator.wikimedia.org/P68745 and previous config saved to /var/cache/conftool/dbconfig/20240909-092404-arnaudb.json
  • 09:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: provisioning db2227.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: provisioning db2227.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: provisioning db2227.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: provisioning db2227.codfw.wmnet - T373579
  • 09:18 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: gerrit1004.wikimedia.org
  • 09:18 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: gerrit1004.wikimedia.org
  • 09:07 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=mnwiki --add-prefix=BROKEN --fix # T366271
  • 08:57 moritzm: restarting postfix on mx-in/mx-out to pick up openssl updates
  • 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet with OS bookworm
  • 08:51 arnaudb@cumin1002: dbctl commit (dc=all): 'API/vslow/dump T374086', diff saved to https://phabricator.wikimedia.org/P68744 and previous config saved to /var/cache/conftool/dbconfig/20240909-085122-arnaudb.json
  • 08:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2213 to s5 primary T374086', diff saved to https://phabricator.wikimedia.org/P68743 and previous config saved to /var/cache/conftool/dbconfig/20240909-084810-arnaudb.json
  • 08:47 arnaudb: Starting s5 codfw failover from db2123 to db2213 - T374086
  • 08:41 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
  • 08:40 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dragonfly-supernode2001.codfw.wmnet with reason: host reimage
  • 08:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2213 from API/vslow/dump T374086', diff saved to https://phabricator.wikimedia.org/P68742 and previous config saved to /var/cache/conftool/dbconfig/20240909-083910-arnaudb.json
  • 08:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T374086
  • 08:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s5 T374086
  • 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2054.codfw.wmnet
  • 08:37 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2054.codfw.wmnet
  • 08:36 elukey@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dragonfly-supernode2001.codfw.wmnet with reason: host reimage
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2055.codfw.wmnet
  • 08:36 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2055.codfw.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2057.codfw.wmnet
  • 08:35 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2057.codfw.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2055.codfw.wmnet
  • 08:35 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2055.codfw.wmnet
  • 08:35 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2035.codfw.wmnet
  • 08:35 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2035.codfw.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2033.codfw.wmnet
  • 08:35 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2033.codfw.wmnet
  • 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2029.codfw.wmnet
  • 08:34 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2029.codfw.wmnet
  • 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2028.codfw.wmnet
  • 08:34 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2028.codfw.wmnet
  • 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2027.codfw.wmnet
  • 08:34 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2027.codfw.wmnet
  • 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2025.codfw.wmnet
  • 08:33 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2025.codfw.wmnet
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2018.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2018.codfw.wmnet
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2010.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2010.codfw.wmnet
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2008.codfw.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2008.codfw.wmnet
  • 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2034.codfw.wmnet
  • 08:27 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2034.codfw.wmnet
  • 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: kubernetes2031.codfw.wmnet
  • 08:26 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: kubernetes2031.codfw.wmnet
  • 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2332.codfw.wmnet
  • 08:25 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2332.codfw.wmnet
  • 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2322.codfw.wmnet
  • 08:25 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2322.codfw.wmnet
  • 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2321.codfw.wmnet
  • 08:25 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2321.codfw.wmnet
  • 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: mw2320.codfw.wmnet
  • 08:25 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: mw2320.codfw.wmnet
  • 08:23 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
  • 08:23 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
  • 08:22 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2002.codfw.wmnet with OS bookworm
  • 08:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Reconfig db2140 T373330', diff saved to https://phabricator.wikimedia.org/P68741 and previous config saved to /var/cache/conftool/dbconfig/20240909-082053-arnaudb.json
  • 08:20 elukey@cumin2002: START - Cookbook sre.hosts.reimage for host dragonfly-supernode2001.codfw.wmnet with OS bookworm
  • 08:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2179 to s4 primary T373330', diff saved to https://phabricator.wikimedia.org/P68740 and previous config saved to /var/cache/conftool/dbconfig/20240909-081750-arnaudb.json
  • 08:17 arnaudb: Starting s4 codfw failover from db2140 to db2179 - T373330
  • 08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2179 from API/vslow/dump T373330', diff saved to https://phabricator.wikimedia.org/P68739 and previous config saved to /var/cache/conftool/dbconfig/20240909-080956-arnaudb.json
  • 08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2179 with weight 0 T373330', diff saved to https://phabricator.wikimedia.org/P68738 and previous config saved to /var/cache/conftool/dbconfig/20240909-080935-arnaudb.json
  • 08:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s4 T373330
  • 08:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: Primary switchover s4 T373330
  • 08:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 T373175', diff saved to https://phabricator.wikimedia.org/P68737 and previous config saved to /var/cache/conftool/dbconfig/20240909-080558-arnaudb.json
  • 08:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
  • 08:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2220 T373175', diff saved to https://phabricator.wikimedia.org/P68736 and previous config saved to /var/cache/conftool/dbconfig/20240909-080422-arnaudb.json
  • 08:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2218 to s7 primary T373175', diff saved to https://phabricator.wikimedia.org/P68735 and previous config saved to /var/cache/conftool/dbconfig/20240909-080108-arnaudb.json
  • 08:00 arnaudb: Starting s7 codfw failover from db2220 to db2218 - T373175
  • 08:00 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
  • 07:58 moritzm: installing openssl security updates
  • 07:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2218 from API/vslow/dump T373175', diff saved to https://phabricator.wikimedia.org/P68734 and previous config saved to /var/cache/conftool/dbconfig/20240909-075258-arnaudb.json
  • 07:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2218 with weight 0 T373175', diff saved to https://phabricator.wikimedia.org/P68733 and previous config saved to /var/cache/conftool/dbconfig/20240909-075145-arnaudb.json
  • 07:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: Primary switchover s7 T373175
  • 07:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 29 hosts with reason: Primary switchover s7 T373175
  • 07:39 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestage2002.codfw.wmnet with OS bookworm
  • 07:37 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host kubestage2002.codfw.wmnet
  • 07:37 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubestage2002.codfw.wmnet with OS bookworm
  • 07:37 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestage2002.codfw.wmnet with OS bookworm
  • 07:36 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
  • 07:33 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
  • 07:33 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host kubestage2002.codfw.wmnet
  • 07:33 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
  • 07:33 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
  • 07:17 moritzm: installing Linux 5.10.223 on bullseye hosts
  • 07:06 moritzm: roll out debmonitor-client 0.4.0-2+deb11u1 on bullseye hosts
  • 06:56 moritzm: installing aom security updates

2024-09-07

  • 14:44 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:44 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 14:38 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 14:38 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:38 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 14:38 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:38 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:34 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1246.eqiad.wmnet with reason: https://phabricator.wikimedia.org/T374215 → server depooled has hardware issues
  • 10:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1246.eqiad.wmnet with reason: https://phabricator.wikimedia.org/T374215 → server depooled has hardware issues

2024-09-06

  • 22:35 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:35 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:07 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@6ca00a7] (releasing): (no justification provided) (duration: 00m 43s)
  • 19:06 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@6ca00a7] (releasing): (no justification provided)
  • 19:01 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:00 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 ebernhardson@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 ebernhardson@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 7:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 18:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 7:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 18:12 brett: Import ncmonitor 1.2.1-1 into bookworm-wikimedia apt archive
  • 17:55 brett: Import corto 0.3.1-1 into bookworm-wikimedia apt archive
  • 16:46 kamila_: ran homer on cr*codfw* for T372878
  • 16:30 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2103.codfw.wmnet
  • 16:30 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2103.codfw.wmnet
  • 16:30 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2103.codfw.wmnet
  • 16:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2103.codfw.wmnet with OS bullseye
  • 16:06 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
  • 16:04 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 15:59 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2103.codfw.wmnet with reason: host reimage
  • 15:52 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:49 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:44 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@ad2c434] (releasing): (no justification provided) (duration: 00m 41s)
  • 15:43 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@ad2c434] (releasing): (no justification provided)
  • 15:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
  • 15:43 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
  • 15:42 topranks: enabling PyBal on lvs1017 to make primary again after repairing faulty fiber link T374247
  • 15:42 elukey: install spicerack 8.13.0 on cumin1002
  • 15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2103
  • 15:40 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2103
  • 15:39 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2103
  • 15:39 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2103.codfw.wmnet 60.16.192.10.in-addr.arpa 0.6.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:39 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2103.codfw.wmnet 60.16.192.10.in-addr.arpa 0.6.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:39 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:39 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2103 - kamila@cumin1002"
  • 15:39 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2103 - kamila@cumin1002"
  • 15:34 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:34 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:32 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2103
  • 15:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2103.codfw.wmnet with OS bullseye
  • 15:31 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:29 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2103.codfw.wmnet
  • 15:28 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2103.codfw.wmnet on all recursors
  • 15:28 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2103.codfw.wmnet on all recursors
  • 15:27 mutante: rolling restarts on durum machines
  • 15:27 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=93) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:27 dzahn@cumin2002: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
  • 15:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2430 to wikikube-worker2103
  • 15:23 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2103
  • 15:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:22 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2103
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2430 to wikikube-worker2103 - kamila@cumin1002"
  • 15:19 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2430 to wikikube-worker2103 - kamila@cumin1002"
  • 15:16 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:15 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2430 to wikikube-worker2103
  • 15:14 topranks: disabling PyBal on lvs1017 to shift traffic to lvs1020 and allow work to fix faulty fibre link T374247
  • 15:13 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Move traffic off lvs1017 to lvs1020 to troubleshoot faulty link
  • 15:13 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: Move traffic off lvs1017 to lvs1020 to troubleshoot faulty link
  • 15:07 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2430.codfw.wmnet
  • 15:07 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2430.codfw.wmnet
  • 15:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2098.codfw.wmnet
  • 15:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2098.codfw.wmnet
  • 15:02 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2098.codfw.wmnet
  • 14:52 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2098.codfw.wmnet with OS bullseye
  • 14:51 btullis@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database bdrwiki (T371759)
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 14:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2095.codfw.wmnet
  • 14:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 14:42 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 14:42 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2095.codfw.wmnet
  • 14:41 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2001.codfw.wmnet with OS bookworm
  • 14:28 akosiaris: repool kubernetes1059 T365993
  • 14:28 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes1059.eqiad.wmnet
  • 14:27 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes1059.eqiad.wmnet
  • 14:25 btullis@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database bdrwiki (T371759)
  • 14:23 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
  • 14:20 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
  • 14:17 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2102.codfw.wmnet
  • 14:17 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2102.codfw.wmnet
  • 14:17 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2102.codfw.wmnet
  • 14:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2102.codfw.wmnet with OS bullseye
  • 14:10 akosiaris: restart pybal on lvs1019
  • 14:07 akosiaris: silence alerts based on alertname=PHPFPMTooBusy,deployment=mw-wikifunctions,site=codfw T374241
  • 13:59 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestage2001.codfw.wmnet with OS bookworm
  • 13:58 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
  • 13:58 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
  • 13:56 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host kubestage2001.codfw.wmnet
  • 13:56 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
  • 13:56 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
  • 13:52 btullis@cumin1002: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database bdrwiki (T371759)
  • 13:52 jayme: homer lsw1-a6-codfw* commit 'T372878'
  • 13:52 btullis@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database bdrwiki (T371759)
  • 13:48 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2102.codfw.wmnet with reason: host reimage
  • 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
  • 13:44 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2102.codfw.wmnet with reason: host reimage
  • 13:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2001.codfw.wmnet with OS bullseye
  • 13:32 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2101.codfw.wmnet
  • 13:32 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2101.codfw.wmnet
  • 13:32 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2101.codfw.wmnet
  • 13:29 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker2100.codfw.wmnet
  • 13:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2100.codfw.wmnet
  • 13:29 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2100.codfw.wmnet
  • 13:29 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2095.codfw.wmnet
  • 13:28 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 13:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2099.codfw.wmnet
  • 13:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2099.codfw.wmnet
  • 13:28 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2099.codfw.wmnet
  • 13:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2101.codfw.wmnet with OS bullseye
  • 13:26 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2102
  • 13:26 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2102
  • 13:25 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2100.codfw.wmnet with OS bullseye
  • 13:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
  • 13:23 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2102
  • 13:23 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2102.codfw.wmnet 226.16.192.10.in-addr.arpa 6.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:23 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2102.codfw.wmnet 226.16.192.10.in-addr.arpa 6.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:22 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2102 - jayme@cumin1002"
  • 13:22 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2102 - jayme@cumin1002"
  • 13:21 claime: homer lsw1-b6-codfw* commit 'T372878'
  • 13:18 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 13:17 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:16 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 13:13 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
  • 13:10 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
  • 13:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2098.codfw.wmnet with reason: host reimage
  • 13:07 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2098.codfw.wmnet with reason: host reimage
  • 13:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2099.codfw.wmnet with OS bullseye
  • 13:02 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2102
  • 13:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2102.codfw.wmnet with OS bullseye
  • 13:02 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2102.codfw.wmnet
  • 13:01 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2034 to wikikube-worker2102
  • 13:00 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2102
  • 12:58 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2102
  • 12:58 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:58 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2034 to wikikube-worker2102 - jayme@cumin1002"
  • 12:58 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2034 to wikikube-worker2102 - jayme@cumin1002"
  • 12:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2096.codfw.wmnet
  • 12:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2096.codfw.wmnet
  • 12:57 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2096.codfw.wmnet
  • 12:55 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:54 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2034 to wikikube-worker2102
  • 12:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2096.codfw.wmnet with OS bullseye
  • 12:53 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kubestage2001
  • 12:53 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kubestage2001
  • 12:52 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2101.codfw.wmnet with reason: host reimage
  • 12:52 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host kubestage2001
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kubestage2001.codfw.wmnet 195.0.192.10.in-addr.arpa 5.9.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:52 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache kubestage2001.codfw.wmnet 195.0.192.10.in-addr.arpa 5.9.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kubestage2001 - jayme@cumin1002"
  • 12:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kubestage2001 - jayme@cumin1002"
  • 12:50 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2101.codfw.wmnet with reason: host reimage
  • 12:49 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:49 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host kubestage2001
  • 12:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host kubestage2001.codfw.wmnet with OS bullseye
  • 12:48 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
  • 12:48 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2100.codfw.wmnet with reason: host reimage
  • 12:48 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2097.codfw.wmnet
  • 12:48 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2097.codfw.wmnet
  • 12:48 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2097.codfw.wmnet
  • 12:47 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
  • 12:47 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host kubestage2001.codfw.wmnet
  • 12:45 hnowlan: homer lsw1-b3-codfw* commit
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw1486.eqiad.wmnet
  • 12:45 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw1486.eqiad.wmnet
  • 12:44 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2100.codfw.wmnet with reason: host reimage
  • 12:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2097.codfw.wmnet with OS bullseye
  • 12:43 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2099.codfw.wmnet with reason: host reimage
  • 12:39 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2099.codfw.wmnet with reason: host reimage
  • 12:38 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2066.codfw.wmnet
  • 12:38 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2066.codfw.wmnet
  • 12:37 claime: homer cr*codfw* commit 'T372878'
  • 12:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2101
  • 12:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2101
  • 12:33 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2101
  • 12:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2101.codfw.wmnet 203.16.192.10.in-addr.arpa 3.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:33 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2101.codfw.wmnet 203.16.192.10.in-addr.arpa 3.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:33 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2101 - cgoubert@cumin1002"
  • 12:33 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2101 - cgoubert@cumin1002"
  • 12:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2096.codfw.wmnet with reason: host reimage
  • 12:30 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 12:29 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2101
  • 12:29 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2101.codfw.wmnet with OS bullseye
  • 12:29 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2101.codfw.wmnet
  • 12:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2100
  • 12:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2100
  • 12:28 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2100
  • 12:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2100.codfw.wmnet 202.16.192.10.in-addr.arpa 2.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:28 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2100.codfw.wmnet 202.16.192.10.in-addr.arpa 2.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2100 - cgoubert@cumin1002"
  • 12:28 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2100 - cgoubert@cumin1002"
  • 12:27 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2096.codfw.wmnet with reason: host reimage
  • 12:24 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 12:24 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2100
  • 12:24 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2100.codfw.wmnet with OS bullseye
  • 12:23 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2100.codfw.wmnet
  • 12:22 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2099
  • 12:22 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2099
  • 12:22 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2099
  • 12:22 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2099.codfw.wmnet 201.16.192.10.in-addr.arpa 1.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:22 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2099.codfw.wmnet 201.16.192.10.in-addr.arpa 1.0.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:21 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:21 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2099 - cgoubert@cumin1002"
  • 12:21 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2099 - cgoubert@cumin1002"
  • 12:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2097.codfw.wmnet with reason: host reimage
  • 12:18 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 12:18 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2099
  • 12:18 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2099.codfw.wmnet with OS bullseye
  • 12:18 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2099.codfw.wmnet
  • 12:17 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2029.codfw.wmnet
  • 12:17 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2029.codfw.wmnet
  • 12:16 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2097.codfw.wmnet with reason: host reimage
  • 12:16 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2029.codfw.wmnet
  • 12:16 cgoubert@cumin1002: END (ERROR) - Cookbook sre.k8s.pool-depool-node (exit_code=97) depool for host wikikube-worker2029.codfw.wmnet
  • 12:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2098
  • 12:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2098
  • 12:16 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2029.codfw.wmnet
  • 12:16 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2029.codfw.wmnet
  • 12:16 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2098
  • 12:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2098.codfw.wmnet 176.16.192.10.in-addr.arpa 6.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:16 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2098.codfw.wmnet 176.16.192.10.in-addr.arpa 6.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2098 - hnowlan@cumin1002"
  • 12:15 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2098 - hnowlan@cumin1002"
  • 12:15 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2099.codfw.wmnet
  • 12:15 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2099.codfw.wmnet
  • 12:15 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
  • 12:14 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2099.codfw.wmnet wikikube-worker2100.codfw.wmnet wikikube-worker2101.codfw.wmnet on all recursors
  • 12:14 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2099.codfw.wmnet wikikube-worker2100.codfw.wmnet wikikube-worker2101.codfw.wmnet on all recursors
  • 12:13 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2334 to wikikube-worker2101
  • 12:12 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:12 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2101
  • 12:12 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2101
  • 12:12 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2098
  • 12:12 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2096
  • 12:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2096
  • 12:10 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 12:09 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2096
  • 12:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2096.codfw.wmnet 173.16.192.10.in-addr.arpa 3.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:09 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2096.codfw.wmnet 173.16.192.10.in-addr.arpa 3.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2096 - hnowlan@cumin1002"
  • 12:09 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2096 - hnowlan@cumin1002"
  • 12:08 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2095
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2095
  • 12:08 moritzm: upgrade ganeti-test2003 to bookworm for some bullseye->bookworm VM migration tests
  • 12:08 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2095
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2095.codfw.wmnet 222.16.192.10.in-addr.arpa 2.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:08 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2095.codfw.wmnet 222.16.192.10.in-addr.arpa 2.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2095 - hnowlan@cumin1002"
  • 12:06 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2095 - hnowlan@cumin1002"
  • 12:05 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:04 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2334 to wikikube-worker2101
  • 12:04 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: Gerrit reboot
  • 12:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:15:00 on gerrit1003.wikimedia.org with reason: Gerrit reboot
  • 12:03 jelto@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit.wikimedia.org with reason: Gerrit reboot
  • 12:03 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 0:15:00 on gerrit.wikimedia.org with reason: Gerrit reboot
  • 12:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2333 to wikikube-worker2100
  • 12:02 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2100
  • 12:02 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2100
  • 12:02 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:02 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2333 to wikikube-worker2100 - cgoubert@cumin1002"
  • 12:02 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2333 to wikikube-worker2100 - cgoubert@cumin1002"
  • 12:01 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2096
  • 12:00 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2097
  • 12:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2097
  • 11:55 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 11:54 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2097
  • 11:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2097.codfw.wmnet 175.16.192.10.in-addr.arpa 5.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:54 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2097.codfw.wmnet 175.16.192.10.in-addr.arpa 5.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2097 - hnowlan@cumin1002"
  • 11:54 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2097 - hnowlan@cumin1002"
  • 11:52 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2333 to wikikube-worker2100
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2095
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2098.codfw.wmnet with OS bullseye
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2095.codfw.wmnet with OS bullseye
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2098.codfw.wmnet
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2095.codfw.wmnet
  • 11:50 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2332 to wikikube-worker2099
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2096.codfw.wmnet with OS bullseye
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2097
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2096.codfw.wmnet
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2097.codfw.wmnet with OS bullseye
  • 11:50 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2097.codfw.wmnet
  • 11:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2099
  • 11:49 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2099
  • 11:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:49 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2332 to wikikube-worker2099 - cgoubert@cumin1002"
  • 11:49 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2332 to wikikube-worker2099 - cgoubert@cumin1002"
  • 11:48 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2095.codfw.wmnet wikikube-worker2096.codfw.wmnet wikikube-worker2097.codfw.wmnet wikikube-worker2098.codfw.wmnet on all recursors
  • 11:48 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2095.codfw.wmnet wikikube-worker2096.codfw.wmnet wikikube-worker2097.codfw.wmnet wikikube-worker2098.codfw.wmnet on all recursors
  • 11:47 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2322 to wikikube-worker2098
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2098
  • 11:46 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2098
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2322 to wikikube-worker2098 - hnowlan@cumin1002"
  • 11:45 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2322 to wikikube-worker2098 - hnowlan@cumin1002"
  • 11:41 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2332 to wikikube-worker2099
  • 11:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2321 to wikikube-worker2097
  • 11:40 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:40 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2322 to wikikube-worker2098
  • 11:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2097
  • 11:39 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2097
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2321 to wikikube-worker2097 - hnowlan@cumin1002"
  • 11:39 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2321 to wikikube-worker2097 - hnowlan@cumin1002"
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2334.codfw.wmnet
  • 11:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2031 to wikikube-worker2095
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2321 to wikikube-worker2097
  • 11:35 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2334.codfw.wmnet
  • 11:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2333.codfw.wmnet
  • 11:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2095
  • 11:34 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2333.codfw.wmnet
  • 11:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2320 to wikikube-worker2096
  • 11:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2332.codfw.wmnet
  • 11:33 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2095
  • 11:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2031 to wikikube-worker2095 - hnowlan@cumin1002"
  • 11:33 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2332.codfw.wmnet
  • 11:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2096
  • 11:32 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2096
  • 11:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:27 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2031 to wikikube-worker2095 - hnowlan@cumin1002"
  • 11:26 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2093.codfw.wmnet
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2320 to wikikube-worker2096
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2031 to wikikube-worker2095
  • 11:23 jayme: homer lsw1-b6-codfw* commit 'T372878'
  • 11:15 moritzm: installing Linux 5.10.223 on bullseye hosts
  • 11:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2093.codfw.wmnet with OS bullseye
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
  • 11:02 moritzm: rolling out debmonitor-client 0.4.0-2+deb11u1 on bullseye-wikimedia on bullseye hosts T372472
  • 11:00 eoghan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on lists1004.wikimedia.org with reason: T373980
  • 11:00 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
  • 11:00 eoghan@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on lists1004.wikimedia.org with reason: T373980
  • 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
  • 10:40 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 10:39 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 10:39 moritzm: uploaded debmonitor-client 0.4.0-2+deb11u1 on bullseye-wikimedia (didn't rebuild the other suites since the fix is specific to Bullseye) T372472
  • 10:38 elukey: factory reset of sretest2001
  • 10:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db1246.eqiad.wmnet with reason: Server failed, rebooted in emergency/single user mode
  • 10:33 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on db1246.eqiad.wmnet with reason: Server failed, rebooted in emergency/single user mode
  • 10:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1246.eqiad.wmnet with reason: Server failed, rebooted in emergency/single user mode
  • 10:32 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1246.eqiad.wmnet with reason: Server failed, rebooted in emergency/single user mode
  • 10:31 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depool db1246 (It's sad)', diff saved to https://phabricator.wikimedia.org/P68731 and previous config saved to /var/cache/conftool/dbconfig/20240906-102551-ladsgroup.json
  • 10:23 elukey: install spicerack 8.13.0 on cumin2002
  • 10:17 elukey: uploaded spicerack_8.13.0 to apt.wikimedia.org bullseye-wikimedia
  • 10:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2093.codfw.wmnet with reason: host reimage
  • 10:05 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2093.codfw.wmnet with reason: host reimage
  • 09:57 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2094.codfw.wmnet
  • 09:57 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2094.codfw.wmnet
  • 09:50 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2094.codfw.wmnet
  • 09:48 jayme: homer cr*codfw* commit 'T372878'
  • 09:47 jayme: homer lsw1-b6-codfw* commit 'T372878'
  • 09:46 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2007.codfw.wmnet
  • 09:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2094.codfw.wmnet with OS bullseye
  • 09:35 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2005.codfw.wmnet
  • 09:28 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
  • 09:26 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2005.codfw.wmnet
  • 09:20 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
  • 09:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2094.codfw.wmnet with reason: host reimage
  • 09:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:12 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2094.codfw.wmnet with reason: host reimage
  • 09:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:57 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: sync
  • 08:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: sync
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2094
  • 08:54 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2094
  • 08:52 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2094
  • 08:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2094.codfw.wmnet 224.16.192.10.in-addr.arpa 4.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:52 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2094.codfw.wmnet 224.16.192.10.in-addr.arpa 4.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:52 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2094 - jayme@cumin1002"
  • 08:51 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2094 - jayme@cumin1002"
  • 08:49 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: sync
  • 08:48 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:48 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2094
  • 08:48 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2094.codfw.wmnet with OS bullseye
  • 08:48 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/proton: sync
  • 08:48 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2094.codfw.wmnet
  • 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2093
  • 08:47 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2093
  • 08:46 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2093
  • 08:46 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2093.codfw.wmnet 135.16.192.10.in-addr.arpa 5.3.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:46 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2093.codfw.wmnet 135.16.192.10.in-addr.arpa 5.3.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:46 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:46 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2093 - jayme@cumin1002"
  • 08:46 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2093 - jayme@cumin1002"
  • 08:43 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:43 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2093
  • 08:43 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2093.codfw.wmnet with OS bullseye
  • 08:42 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2093.codfw.wmnet
  • 08:41 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: sync
  • 08:40 elukey@deploy1003: helmfile [staging] START helmfile.d/services/proton: sync
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts acmechief2001.codfw.wmnet
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: acmechief2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:36 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: acmechief2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:36 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2087.codfw.wmnet
  • 08:36 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2087.codfw.wmnet
  • 08:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2033 to wikikube-worker2094
  • 08:32 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2094
  • 08:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2020 to wikikube-worker2093
  • 08:31 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2094
  • 08:31 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:31 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2033 to wikikube-worker2094 - jayme@cumin1002"
  • 08:31 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2033 to wikikube-worker2094 - jayme@cumin1002"
  • 08:31 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2093
  • 08:31 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2093
  • 08:31 jayme@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 08:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:24 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts acmechief2001.codfw.wmnet
  • 08:24 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2087.codfw.wmnet with OS bullseye
  • 08:23 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:18 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2033 to wikikube-worker2094
  • 08:18 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:18 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2020 to wikikube-worker2093
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts acmechief1001.eqiad.wmnet
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: acmechief1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:00 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host kubernetes2033.codfw.wmnet
  • 08:00 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2033.codfw.wmnet
  • 08:00 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2020.codfw.wmnet
  • 08:00 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: acmechief1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:59 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2020.codfw.wmnet
  • 07:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:51 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts acmechief1001.eqiad.wmnet
  • 07:49 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 10s)
  • 07:49 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 07:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:31 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:21 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2087.codfw.wmnet with reason: host reimage
  • 07:18 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2087.codfw.wmnet with reason: host reimage
  • 07:00 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2087.codfw.wmnet with OS bullseye
  • 06:57 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2087.codfw.wmnet
  • 06:57 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2087.codfw.wmnet with OS bullseye
  • 06:57 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2087.codfw.wmnet with OS bullseye
  • 06:57 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2087.codfw.wmnet
  • 05:37 vgutierrez: repool cp2041
  • 04:57 vgutierrez: repool cp2038
  • 04:24 vgutierrez: restarting purged in cp2038 && cp2041 - T334078
  • 04:16 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp(2038|2041).codfw.wmnet
  • 04:15 vgutierrez: depooling cp2041 && cp2038 due to high purged lag
  • 02:37 ejegg: restarted donations queue consumer
  • 02:31 ejegg: fundraising civicrm upgraded from 67ee99ce to 5dd4edc1
  • 02:29 ejegg: disabled donations queue consumer for civi deploy

2024-09-05

  • 23:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
  • 23:43 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
  • 23:36 topranks: re-enable PyBal on lvs1019 after fixing faulty link with replacement optic T374155
  • 22:54 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Move traffic off lvs1019 to lvs1029 to troubleshoot faulty link
  • 22:54 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1019.eqiad.wmnet with reason: Move traffic off lvs1019 to lvs1029 to troubleshoot faulty link
  • 22:53 topranks: disable PyBal on lvs1019 to swing traffic to lvs1020 and allow for intrusive work to correct link errors T374155
  • 22:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2002.wikimedia.org with reason: T373980
  • 22:49 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on gerrit2002.wikimedia.org with reason: T373980
  • 22:49 mutante: gerrit-replica.wikimedia.org (gerrit2002) - rebooting T373980
  • 22:03 dancy@deploy1003: Installation of scap version "4.101.3" completed for 211 hosts
  • 21:59 dancy@deploy1003: Installing scap version "4.101.3" for 211 hosts
  • 21:57 inflatador: bking@grafana1002 apply grizzly SLO dashboard updates slo-Search added slo-apigw updated P68729 T328330
  • 21:57 inflatador: bking@grafana1002 apply grizzly SLO dashboard updates slo-Search added slo-apigw updated P68729
  • 21:54 dancy@deploy1003: Installing scap version "4.101.3" for 211 hosts
  • 21:54 dancy@deploy1003: install-world aborted: (duration: 02m 00s)
  • 21:52 dancy@deploy1003: Installing scap version "4.101.3" for 211 hosts
  • 21:33 mutante: gitlab1004 systemctl list-units --state=failed listed wmf_auto_restart_ssh-gitlab.service, but at the same time it reports 'Service ssh-gitlab not present or not running' (?). Did a systemctl reset-failed to clear monitoring, and it doesn't seem to come back. T374106
  • 21:32 dancy@deploy1003: Sync cancelled.
  • 21:31 dancy@deploy1003: dancy: Backport for Revert "Set parser for image gallery in CampaignPageFormatter" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:29 dancy@deploy1003: Started scap sync-world: Backport for Revert "Set parser for image gallery in CampaignPageFormatter"
  • 21:15 topranks: add interface qos scheduler config to remaining switches T339850
  • 21:11 dduvall@deploy1003: Finished deploy [releng/jenkins-deploy@b47c79e] (releasing): (no justification provided) (duration: 00m 39s)
  • 21:10 dduvall@deploy1003: Started deploy [releng/jenkins-deploy@b47c79e] (releasing): (no justification provided)
  • 21:09 dancy@deploy1003: Sync cancelled.
  • 21:02 dancy@deploy1003: dancy: Backport for Set parser for image gallery in CampaignPageFormatter (T374146) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:00 dancy@deploy1003: Started scap sync-world: Backport for Set parser for image gallery in CampaignPageFormatter (T374146)
  • 20:46 dancy@deploy1003: install-world aborted: (duration: 02m 28s)
  • 20:44 dancy@deploy1003: Installing scap version "4.101.2" for 211 hosts
  • 20:41 topranks: add interface qos scheduler config to eqiad switches T339850
  • 20:40 dancy@deploy1003: Finished scap sync-world: Backport for Remove redundant setting of $wgDefaultUserOptions['math'] (T373703) (duration: 10m 36s)
  • 20:35 dancy@deploy1003: dancy, physikerwelt: Continuing with sync
  • 20:31 dancy@deploy1003: dancy, physikerwelt: Backport for Remove redundant setting of $wgDefaultUserOptions['math'] (T373703) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:29 dancy@deploy1003: Started scap sync-world: Backport for Remove redundant setting of $wgDefaultUserOptions['math'] (T373703)
  • 19:51 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1009.eqiad.wmnet with OS bookworm
  • 19:36 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1009.eqiad.wmnet with reason: host reimage
  • 19:33 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1009.eqiad.wmnet with reason: host reimage
  • 19:21 topranks: add interface qos scheduler config to codfw switches T339850
  • 19:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1009.eqiad.wmnet with OS bookworm
  • 18:37 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.21 refs T373640
  • 18:06 topranks: add interface qos scheduler config to remaining CRs T339850
  • 17:59 ebernhardson@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 ebernhardson@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68725 and previous config saved to /var/cache/conftool/dbconfig/20240905-175316-arnaudb.json
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68724 and previous config saved to /var/cache/conftool/dbconfig/20240905-175316-arnaudb.json
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68723 and previous config saved to /var/cache/conftool/dbconfig/20240905-175315-arnaudb.json
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68722 and previous config saved to /var/cache/conftool/dbconfig/20240905-173811-arnaudb.json
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68721 and previous config saved to /var/cache/conftool/dbconfig/20240905-173810-arnaudb.json
  • 17:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68720 and previous config saved to /var/cache/conftool/dbconfig/20240905-173810-arnaudb.json
  • 17:33 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:28 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:28 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68719 and previous config saved to /var/cache/conftool/dbconfig/20240905-172306-arnaudb.json
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68718 and previous config saved to /var/cache/conftool/dbconfig/20240905-172305-arnaudb.json
  • 17:10 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2019.codfw.wmnet
  • 17:10 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2019.codfw.wmnet
  • 17:10 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2419.codfw.wmnet
  • 17:10 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2419.codfw.wmnet
  • 17:10 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2418.codfw.wmnet
  • 17:10 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2418.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2417.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2417.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2416.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2416.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2415.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2415.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2414.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2414.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2413.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2413.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2412.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2412.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2338.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2338.codfw.wmnet
  • 17:09 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2337.codfw.wmnet
  • 17:09 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2337.codfw.wmnet
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2336.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2336.codfw.wmnet
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2335.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2335.codfw.wmnet
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2039.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2039.codfw.wmnet
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2038.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2038.codfw.wmnet
  • 17:08 claime: Repooling kubernetes nodes after T373096 - kubernetes2017 kubernetes2021 kubernetes2038 kubernetes2039 mw2335 mw2336 mw2337 mw2338 mw2412 mw2413 mw2414 mw2415 mw2416 mw2417 mw2418 mw2419 wikikube-worker2019
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2021.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2021.codfw.wmnet
  • 17:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2017.codfw.wmnet
  • 17:08 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2017.codfw.wmnet
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68717 and previous config saved to /var/cache/conftool/dbconfig/20240905-170801-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68716 and previous config saved to /var/cache/conftool/dbconfig/20240905-170800-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68715 and previous config saved to /var/cache/conftool/dbconfig/20240905-170800-arnaudb.json
  • 17:06 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp2036.ulsfo.wmnet
  • 17:06 fabfur@cumin1002: conftool action : set/pooled=yes; selector: name=cp2035.ulsfo.wmnet
  • 17:05 Emperor: pool moss-fe2001 ms-fe2011 T373096
  • 17:04 Emperor: moss-be2003 exit maintenance mode T373096
  • 17:03 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2036.codfw.wmnet
  • 17:03 fabfur@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2036.codfw.wmnet
  • 17:03 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2035.codfw.wmnet
  • 17:03 fabfur@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp2035.codfw.wmnet
  • 16:59 topranks: move server uplinks codfw rack c3 from asw-c3-codfw to lsw1-c3-codfw T373096
  • 16:58 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
  • 16:58 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
  • 16:58 cmooney@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 34 hosts with reason: Move server uplinks codfw racks C3
  • 16:57 cmooney@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on 34 hosts with reason: Move server uplinks codfw racks C3
  • 16:55 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2091.codfw.wmnet
  • 16:55 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
  • 16:55 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
  • 16:51 cmooney@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on 35 hosts with reason: Move server uplinks codfw racks C3
  • 16:51 cmooney@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on 35 hosts with reason: Move server uplinks codfw racks C3
  • 16:51 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2091.codfw.wmnet with OS bullseye
  • 16:49 dancy@deploy1003: Installing scap version "4.101.1" for 211 hosts
  • 16:49 kamila@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2092.codfw.wmnet
  • 16:49 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2092.codfw.wmnet with OS bullseye
  • 16:42 topranks: move server uplinks codfw rack c2 from asw-c2-codfw to lsw1-c2-codfw T373096
  • 16:42 jhathaway: disabling puppet on cp nodes to test puppet change
  • 16:39 cmooney@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 23 hosts with reason: Move server uplinks codfw racks C2
  • 16:38 cmooney@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on 23 hosts with reason: Move server uplinks codfw racks C2
  • 16:38 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.21 refs T373640
  • 16:30 Emperor: depool moss-fe2001 ms-fe2011 T373096
  • 16:30 Emperor: moss-be2003 to maintenance mode T373096
  • 16:27 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2007.codfw.wmnet
  • 16:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: network maintenance T370852
  • 16:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: network maintenance T370852
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2141 db2144 db2150 db2169 db2186 db2191 - T370852', diff saved to https://phabricator.wikimedia.org/P68714 and previous config saved to /var/cache/conftool/dbconfig/20240905-162057-arnaudb.json
  • 16:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 100%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68713 and previous config saved to /var/cache/conftool/dbconfig/20240905-161051-arnaudb.json
  • 16:02 claime: homer cr*-codfw* commit 'T372878'
  • 15:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 75%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68712 and previous config saved to /var/cache/conftool/dbconfig/20240905-155545-arnaudb.json
  • 15:54 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:54 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:48 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on backup2006.codfw.wmnet with reason: Move backup2006 uplink to lsw1-c2-codfw
  • 15:48 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on backup2006.codfw.wmnet with reason: Move backup2006 uplink to lsw1-c2-codfw
  • 15:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2322.codfw.wmnet
  • 15:42 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2322.codfw.wmnet
  • 15:42 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2321.codfw.wmnet
  • 15:42 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2321.codfw.wmnet
  • 15:42 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2320.codfw.wmnet
  • 15:41 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
  • 15:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 50%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68711 and previous config saved to /var/cache/conftool/dbconfig/20240905-154040-arnaudb.json
  • 15:39 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2092.codfw.wmnet with reason: host reimage
  • 15:39 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2320.codfw.wmnet
  • 15:38 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2031.codfw.wmnet
  • 15:38 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2031.codfw.wmnet
  • 15:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 15:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2091.codfw.wmnet with reason: host reimage
  • 15:30 topranks: prep lsw1-c3-codfw for server migration from asw-c3-codfw T373096
  • 15:28 Emperor: restart swift-proxy on ms-fe1014 T360913
  • 15:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 25%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68710 and previous config saved to /var/cache/conftool/dbconfig/20240905-152534-arnaudb.json
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2092
  • 15:22 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2092
  • 15:21 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2092
  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2092.codfw.wmnet 77.0.192.10.in-addr.arpa 7.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:21 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2092.codfw.wmnet 77.0.192.10.in-addr.arpa 7.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:21 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2092 - kamila@cumin1002"
  • 15:21 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2092 - kamila@cumin1002"
  • 15:21 topranks: prep lsw1-c2-codfw for server migration from asw-c2-codfw T373096
  • 15:18 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2125.codfw.wmnet onto db2225.codfw.wmnet
  • 15:17 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2092
  • 15:17 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2092.codfw.wmnet with OS bullseye
  • 15:17 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2092.codfw.wmnet
  • 15:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2091
  • 15:12 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2091
  • 15:09 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp2036.ulsfo.wmnet
  • 15:08 fabfur@cumin1002: conftool action : set/pooled=no; selector: name=cp2035.ulsfo.wmnet
  • 15:08 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp2036.codfw.wmnet with reason: T373096
  • 15:08 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2091
  • 15:08 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2091.codfw.wmnet 9.0.192.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:08 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2091.codfw.wmnet 9.0.192.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:08 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:08 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2091 - kamila@cumin1002"
  • 15:08 fabfur@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp2036.codfw.wmnet with reason: T373096
  • 15:07 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2091 - kamila@cumin1002"
  • 15:07 fabfur@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp2035.codfw.wmnet with reason: T373096
  • 15:07 fabfur@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp2035.codfw.wmnet with reason: T373096
  • 15:07 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2019.codfw.wmnet
  • 15:06 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2019.codfw.wmnet
  • 15:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2419.codfw.wmnet
  • 15:05 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2419.codfw.wmnet
  • 15:05 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2418.codfw.wmnet
  • 15:05 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2418.codfw.wmnet
  • 15:05 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2417.codfw.wmnet
  • 15:04 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2417.codfw.wmnet
  • 15:04 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2416.codfw.wmnet
  • 15:04 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 15:04 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2091
  • 15:03 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2416.codfw.wmnet
  • 15:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2091.codfw.wmnet with OS bullseye
  • 15:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2415.codfw.wmnet
  • 15:03 kamila@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2091.codfw.wmnet
  • 15:03 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2415.codfw.wmnet
  • 15:03 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2414.codfw.wmnet
  • 15:02 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2414.codfw.wmnet
  • 15:02 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2413.codfw.wmnet
  • 15:01 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2413.codfw.wmnet
  • 15:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2412.codfw.wmnet
  • 15:01 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2421 to wikikube-worker2092
  • 15:01 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2412.codfw.wmnet
  • 15:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2338.codfw.wmnet
  • 15:00 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2092
  • 15:00 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2092
  • 15:00 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:00 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2421 to wikikube-worker2092 - kamila@cumin1002"
  • 15:00 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2338.codfw.wmnet
  • 15:00 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2421 to wikikube-worker2092 - kamila@cumin1002"
  • 15:00 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2337.codfw.wmnet
  • 15:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2420 to wikikube-worker2091
  • 15:00 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2337.codfw.wmnet
  • 14:59 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2336.codfw.wmnet
  • 14:59 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2091
  • 14:59 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2336.codfw.wmnet
  • 14:59 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2335.codfw.wmnet
  • 14:58 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2091
  • 14:58 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2420 to wikikube-worker2091 - kamila@cumin1002"
  • 14:58 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2335.codfw.wmnet
  • 14:58 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2039.codfw.wmnet
  • 14:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68709 and previous config saved to /var/cache/conftool/dbconfig/20240905-145755-arnaudb.json
  • 14:57 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2039.codfw.wmnet
  • 14:57 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2038.codfw.wmnet
  • 14:57 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2038.codfw.wmnet
  • 14:57 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2021.codfw.wmnet
  • 14:56 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2021.codfw.wmnet
  • 14:56 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2017.codfw.wmnet
  • 14:56 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 14:55 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2017.codfw.wmnet
  • 14:55 claime: depooling kubernetes nodes for T373096 - kubernetes2017 kubernetes2021 kubernetes2038 kubernetes2039 mw2335 mw2336 mw2337 mw2338 mw2412 mw2413 mw2414 mw2415 mw2416 mw2417 mw2418 mw2419 wikikube-worker2019
  • 14:55 hashar: UTC afternoon backport window completed
  • 14:54 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2420 to wikikube-worker2091 - kamila@cumin1002"
  • 14:52 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2421 to wikikube-worker2092
  • 14:51 volans: deploying python3-wmflib 1.2.6 fleet-wide
  • 14:51 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 14:50 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2420 to wikikube-worker2091
  • 14:50 hashar@deploy1003: Finished scap sync-world: Backport for Revert "Fix missing wikibase link in Minerva sidebar" (T66315) (duration: 09m 27s)
  • 14:46 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2421.codfw.wmnet
  • 14:46 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2420.codfw.wmnet
  • 14:46 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2421.codfw.wmnet
  • 14:46 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2420.codfw.wmnet
  • 14:45 hashar@deploy1003: hashar: Continuing with sync
  • 14:43 hashar@deploy1003: hashar: Backport for Revert "Fix missing wikibase link in Minerva sidebar" (T66315) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68707 and previous config saved to /var/cache/conftool/dbconfig/20240905-144249-arnaudb.json
  • 14:41 hashar@deploy1003: Started scap sync-world: Backport for Revert "Fix missing wikibase link in Minerva sidebar" (T66315)
  • 14:32 hashar: UTC afternoon backport window not completed! https://gerrit.wikimedia.org/r/c/mediawiki/skins/MinervaNeue/+/1070953 pending
  • 14:30 hashar: UTC afternoon backport window completed
  • 14:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68706 and previous config saved to /var/cache/conftool/dbconfig/20240905-142743-arnaudb.json
  • 14:14 hashar@deploy1003: Finished scap sync-world: Backport for Replace confusing uses of $wgDebugLogFile with $wmgExtraLogFile (duration: 10m 37s)
  • 14:14 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:13 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68705 and previous config saved to /var/cache/conftool/dbconfig/20240905-141238-arnaudb.json
  • 14:11 topranks: add interface qos schedulers on cr1-codfw T339850
  • 14:10 hashar@deploy1003: matmarex, hashar: Continuing with sync
  • 14:06 hashar@deploy1003: matmarex, hashar: Backport for Replace confusing uses of $wgDebugLogFile with $wmgExtraLogFile synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 volans: uploaded python3-wmflib_1.2.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
  • 14:04 hashar@deploy1003: Started scap sync-world: Backport for Replace confusing uses of $wgDebugLogFile with $wmgExtraLogFile
  • 13:59 hashar@deploy1003: Finished scap sync-world: Backport for Fix missing wikibase link in Minerva sidebar (T66315) (duration: 19m 21s)
  • 13:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 15%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68704 and previous config saved to /var/cache/conftool/dbconfig/20240905-135731-arnaudb.json
  • 13:55 filippo@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 13:54 filippo@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 13:54 hashar@deploy1003: hashar, joelyrookewmde: Continuing with sync
  • 13:52 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2125.codfw.wmnet onto db2225.codfw.wmnet
  • 13:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2125 in db2225 for T373579', diff saved to https://phabricator.wikimedia.org/P68703 and previous config saved to /var/cache/conftool/dbconfig/20240905-134929-arnaudb.json
  • 13:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: provisioning db2225.codfw.wmnet - T373579
  • 13:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: provisioning db2225.codfw.wmnet - T373579
  • 13:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: provisioning db2225.codfw.wmnet - T373579
  • 13:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: provisioning db2225.codfw.wmnet - T373579
  • 13:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: post db2224 clone repool', diff saved to https://phabricator.wikimedia.org/P68702 and previous config saved to /var/cache/conftool/dbconfig/20240905-134225-arnaudb.json
  • 13:41 hashar@deploy1003: hashar, joelyrookewmde: Backport for Fix missing wikibase link in Minerva sidebar (T66315) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:39 hashar@deploy1003: Started scap sync-world: Backport for Fix missing wikibase link in Minerva sidebar (T66315)
  • 13:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68701 and previous config saved to /var/cache/conftool/dbconfig/20240905-133649-arnaudb.json
  • 13:34 jmm@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling restart_daemons on A:ncredir
  • 13:31 hashar@deploy1003: Finished scap sync-world: Backport for search: use mul fallback for fine-tuned search profiles (T371401) (duration: 06m 28s)
  • 13:29 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 14s)
  • 13:29 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 13:26 hashar@deploy1003: hashar, dcausse: Continuing with sync
  • 13:26 hashar@deploy1003: hashar, dcausse: Backport for search: use mul fallback for fine-tuned search profiles (T371401) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:25 jmm@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir
  • 13:24 hashar@deploy1003: Started scap sync-world: Backport for search: use mul fallback for fine-tuned search profiles (T371401)
  • 13:22 hashar@deploy1003: Finished scap sync-world: Backport for Allow copyuploads on test2wiki (T356241) (duration: 11m 45s)
  • 13:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68700 and previous config saved to /var/cache/conftool/dbconfig/20240905-132144-arnaudb.json
  • 13:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2124.codfw.wmnet onto db2224.codfw.wmnet
  • 13:18 hashar@deploy1003: hnowlan, hashar: Continuing with sync
  • 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling restart_daemons on A:ncredir-eqsin
  • 13:15 jmm@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir-eqsin
  • 13:14 hashar@deploy1003: hnowlan, hashar: Backport for Allow copyuploads on test2wiki (T356241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:11 hashar@deploy1003: Started scap sync-world: Backport for Allow copyuploads on test2wiki (T356241)
  • 13:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68699 and previous config saved to /var/cache/conftool/dbconfig/20240905-130638-arnaudb.json
  • 12:58 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2029.codfw.wmnet
  • 12:58 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2029.codfw.wmnet
  • 12:58 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2029.codfw.wmnet
  • 12:57 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 12:56 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 12:56 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 12:56 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 12:54 claime: homer lsw1-b6-codfw* commit 'T372878'
  • 12:54 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2029.codfw.wmnet with OS bullseye
  • 12:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68698 and previous config saved to /var/cache/conftool/dbconfig/20240905-125133-arnaudb.json
  • 12:38 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2124.codfw.wmnet onto db2224.codfw.wmnet
  • 12:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 15%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68697 and previous config saved to /var/cache/conftool/dbconfig/20240905-123627-arnaudb.json
  • 12:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2124 in db2224 for T373579', diff saved to https://phabricator.wikimedia.org/P68696 and previous config saved to /var/cache/conftool/dbconfig/20240905-123619-arnaudb.json
  • 12:35 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: provisioning db2224.codfw.wmnet - T373579
  • 12:35 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: provisioning db2224.codfw.wmnet - T373579
  • 12:35 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: provisioning db2224.codfw.wmnet - T373579
  • 12:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: provisioning db2224.codfw.wmnet - T373579
  • 12:31 eoghan@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lists1004.wikimedia.org
  • 12:26 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2029.codfw.wmnet with reason: host reimage
  • 12:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P68695 and previous config saved to /var/cache/conftool/dbconfig/20240905-122408-ladsgroup.json
  • 12:23 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2029.codfw.wmnet with reason: host reimage
  • 12:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 5%: post clone repool', diff saved to https://phabricator.wikimedia.org/P68694 and previous config saved to /var/cache/conftool/dbconfig/20240905-122108-arnaudb.json
  • 12:20 eoghan@cumin1002: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
  • 12:12 claime: homer cr*codfw* commit 'T372878'
  • 12:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P68693 and previous config saved to /var/cache/conftool/dbconfig/20240905-120903-ladsgroup.json
  • 12:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2029
  • 12:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2029
  • 12:06 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2029
  • 12:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2029.codfw.wmnet 199.16.192.10.in-addr.arpa 9.9.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:06 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2029.codfw.wmnet 199.16.192.10.in-addr.arpa 9.9.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2029 - cgoubert@cumin1002"
  • 12:05 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2029 - cgoubert@cumin1002"
  • 12:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2029.codfw.wmnet on all recursors
  • 12:04 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2029.codfw.wmnet on all recursors
  • 11:58 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:55 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P68692 and previous config saved to /var/cache/conftool/dbconfig/20240905-115357-ladsgroup.json
  • 11:52 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
  • 11:50 samtar@deploy1003: Finished scap sync-world: Backport for IS: Enable CommunityRequests on Meta (T372527) (duration: 07m 09s)
  • 11:46 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
  • 11:45 samtar@deploy1003: samtar: Continuing with sync
  • 11:45 samtar@deploy1003: samtar: Backport for IS: Enable CommunityRequests on Meta (T372527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:43 samtar@deploy1003: Started scap sync-world: Backport for IS: Enable CommunityRequests on Meta (T372527)
  • 11:42 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 11:41 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 11:41 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
  • 11:41 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:41 samtar@deploy1003: Finished scap sync-world: Backport for CS: Load CommunityRequests (T372527) (duration: 08m 32s)
  • 11:41 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:40 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
  • 11:40 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 11:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
  • 11:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 11:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P68691 and previous config saved to /var/cache/conftool/dbconfig/20240905-113852-ladsgroup.json
  • 11:38 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 11:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 11:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:36 samtar@deploy1003: samtar: Continuing with sync
  • 11:36 samtar@deploy1003: samtar: Backport for CS: Load CommunityRequests (T372527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 11:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 11:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 11:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 11:33 samtar@deploy1003: Started scap sync-world: Backport for CS: Load CommunityRequests (T372527)
  • 11:32 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 11:32 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
  • 11:31 ladsgroup@cumin2002: dbctl commit (dc=all): 'Depool db2129 T374087', diff saved to https://phabricator.wikimedia.org/P68690 and previous config saved to /var/cache/conftool/dbconfig/20240905-113153-ladsgroup.json
  • 11:31 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:30 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
  • 11:29 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
  • 11:29 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 11:29 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
  • 11:28 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 11:28 ladsgroup@cumin2002: dbctl commit (dc=all): 'Promote db2214 to s6 primary T374087', diff saved to https://phabricator.wikimedia.org/P68689 and previous config saved to /var/cache/conftool/dbconfig/20240905-112846-ladsgroup.json
  • 11:28 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
  • 11:28 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 11:27 Amir1: Starting s6 codfw failover from db2129 to db2214 - T374087
  • 11:27 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2029
  • 11:26 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2029.codfw.wmnet with OS bullseye
  • 11:26 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 11:25 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2029.codfw.wmnet
  • 11:25 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 11:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 11:25 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
  • 11:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 11:25 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2029.codfw.wmnet
  • 11:25 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2029.codfw.wmnet
  • 11:24 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 11:23 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 11:23 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 11:22 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 11:22 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 11:22 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2090.codfw.wmnet
  • 11:22 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2090.codfw.wmnet
  • 11:22 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2090.codfw.wmnet
  • 11:21 jayme: homer cr*codfw* commit 'T372878'
  • 11:21 ladsgroup@cumin2002: dbctl commit (dc=all): 'Set db2214 with weight 0 T374087', diff saved to https://phabricator.wikimedia.org/P68688 and previous config saved to /var/cache/conftool/dbconfig/20240905-112121-ladsgroup.json
  • 11:20 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
  • 11:20 ladsgroup@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s6 T374087
  • 11:20 ladsgroup@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s6 T374087
  • 11:19 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2090.codfw.wmnet with OS bullseye
  • 11:19 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:17 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
  • 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
  • 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2084.codfw.wmnet
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2084.codfw.wmnet
  • 11:00 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2084.codfw.wmnet
  • 10:58 hnowlan: homer lsw1-b3-codfw* commit
  • 10:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2084.codfw.wmnet with OS bullseye
  • 10:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 10:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 10:42 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-text_esams for 9.2.5-1wm2
  • 10:38 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2084.codfw.wmnet with reason: host reimage
  • 10:36 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2084.codfw.wmnet with reason: host reimage
  • 10:30 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2168.codfw.wmnet onto db2222.codfw.wmnet
  • 10:29 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2084
  • 10:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2084
  • 10:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:14 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2084
  • 10:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2084.codfw.wmnet 170.16.192.10.in-addr.arpa 0.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:14 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2084.codfw.wmnet 170.16.192.10.in-addr.arpa 0.7.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2084 - hnowlan@cumin1002"
  • 10:14 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2084 - hnowlan@cumin1002"
  • 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on puppetmaster1003.eqiad.wmnet with reason: hardware fix
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on puppetmaster1003.eqiad.wmnet with reason: hardware fix
  • 10:04 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:03 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2084
  • 10:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2084.codfw.wmnet with OS bullseye
  • 10:03 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2084.codfw.wmnet
  • 09:59 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:59 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:59 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2089.codfw.wmnet
  • 09:59 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2089.codfw.wmnet
  • 09:54 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 09:53 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:53 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:43 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
  • 09:42 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-text_esams for 9.2.5-1wm2
  • 09:38 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 09:37 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
  • 09:37 oblivian@deploy1003: Finished scap sync-world: Backport for BounceHandler: add IPs for the new mx servers (T338761) (duration: 11m 16s)
  • 09:36 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
  • 09:36 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2090.codfw.wmnet with reason: host reimage
  • 09:35 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
  • 09:32 oblivian@deploy1003: oblivian: Continuing with sync
  • 09:32 oblivian@deploy1003: oblivian: Backport for BounceHandler: add IPs for the new mx servers (T338761) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:32 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2089.codfw.wmnet with reason: host reimage
  • 09:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
  • 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
  • 09:30 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
  • 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
  • 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
  • 09:26 oblivian@deploy1003: Started scap sync-world: Backport for BounceHandler: add IPs for the new mx servers (T338761)
  • 09:25 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2168.codfw.wmnet onto db2222.codfw.wmnet
  • 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2168 in db2222 for T373579', diff saved to https://phabricator.wikimedia.org/P68686 and previous config saved to /var/cache/conftool/dbconfig/20240905-092339-arnaudb.json
  • 09:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: provisionning db2222.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: provisionning db2222.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisionning db2222.codfw.wmnet - T373579
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisionning db2222.codfw.wmnet - T373579
  • 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2002.codfw.wmnet
  • 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host miscweb1003.eqiad.wmnet
  • 09:19 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2088.codfw.wmnet
  • 09:19 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2088.codfw.wmnet
  • 09:19 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2088.codfw.wmnet
  • 09:18 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_esams for 9.2.5-1wm2
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2090
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2090
  • 09:18 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2090
  • 09:18 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2090
  • 09:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host doc2002.codfw.wmnet
  • 09:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host miscweb1003.eqiad.wmnet
  • 09:16 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2090
  • 09:16 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2090.codfw.wmnet 123.16.192.10.in-addr.arpa 3.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:16 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:16 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2090.codfw.wmnet 123.16.192.10.in-addr.arpa 3.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:16 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:16 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2090 - jayme@cumin1002"
  • 09:16 jayme: homer lsw1-b8-codfw* commit 'T372878'
  • 09:16 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2090 - jayme@cumin1002"
  • 09:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2168.codfw.wmnet onto db2221.codfw.wmnet
  • 09:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2088.codfw.wmnet with OS bookworm
  • 09:12 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:11 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2090
  • 09:10 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2089
  • 09:10 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2089
  • 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet1003.eqiad.wmnet
  • 09:07 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2089
  • 09:07 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2089.codfw.wmnet 122.16.192.10.in-addr.arpa 2.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:07 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2089.codfw.wmnet 122.16.192.10.in-addr.arpa 2.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:07 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:07 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2089 - jayme@cumin1002"
  • 09:07 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2089 - jayme@cumin1002"
  • 09:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1003.eqiad.wmnet
  • 09:05 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2090.codfw.wmnet with OS bullseye
  • 09:04 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2090.codfw.wmnet
  • 09:04 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host planet1003.eqiad.wmnet
  • 09:04 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2089
  • 09:04 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2089.codfw.wmnet with OS bookworm
  • 09:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host doc1003.eqiad.wmnet
  • 09:03 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:03 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:02 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:02 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2089.codfw.wmnet wikikube-worker2090.codfw.wmnet on all recursors
  • 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host miscweb2003.codfw.wmnet
  • 09:02 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2089.codfw.wmnet wikikube-worker2090.codfw.wmnet on all recursors
  • 09:02 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:02 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2089.codfw.wmnet
  • 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet2003.codfw.wmnet
  • 09:01 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2435 to wikikube-worker2090
  • 09:00 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2090
  • 09:00 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2090
  • 09:00 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:00 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2435 to wikikube-worker2090 - jayme@cumin1002"
  • 09:00 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2435 to wikikube-worker2090 - jayme@cumin1002"
  • 09:00 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2434 to wikikube-worker2089
  • 08:59 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2089
  • 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host miscweb2003.codfw.wmnet
  • 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host planet2003.codfw.wmnet
  • 08:57 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2089
  • 08:57 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:57 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2434 to wikikube-worker2089 - jayme@cumin1002"
  • 08:56 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:56 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2434 to wikikube-worker2089 - jayme@cumin1002"
  • 08:55 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2088.codfw.wmnet with reason: host reimage
  • 08:54 vgutierrez: acmechief1002 is now the acme-chief active host
  • 08:52 jayme@cumin1002: START - Cookbook sre.hosts.rename from mw2435 to wikikube-worker2090
  • 08:51 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:51 jayme@cumin1002: START - Cookbook sre.hosts.rename from mw2434 to wikikube-worker2089
  • 08:51 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2088.codfw.wmnet with reason: host reimage
  • 08:50 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2435.codfw.wmnet
  • 08:49 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2435.codfw.wmnet
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2434.codfw.wmnet
  • 08:49 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2434.codfw.wmnet
  • 08:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2088
  • 08:34 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2088
  • 08:34 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2088
  • 08:34 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2088.codfw.wmnet 231.16.192.10.in-addr.arpa 1.3.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:34 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2088.codfw.wmnet 231.16.192.10.in-addr.arpa 1.3.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 08:34 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:34 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2088 - jayme@cumin1002"
  • 08:34 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2088 - jayme@cumin1002"
  • 08:30 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:28 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2088
  • 08:28 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2088.codfw.wmnet with OS bookworm
  • 08:28 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2088.codfw.wmnet
  • 08:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:20 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_esams for 9.2.5-1wm2
  • 08:18 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:09 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2168.codfw.wmnet onto db2221.codfw.wmnet
  • 08:08 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:08 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
  • 08:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2168 in db2221 for T373579', diff saved to https://phabricator.wikimedia.org/P68685 and previous config saved to /var/cache/conftool/dbconfig/20240905-080540-arnaudb.json
  • 08:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: provisionning db2221.codfw.wmnet - T373579
  • 08:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: provisionning db2221.codfw.wmnet - T373579
  • 08:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisionning db2221.codfw.wmnet - T373579
  • 08:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisionning db2221.codfw.wmnet - T373579
  • 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
  • 07:55 elukey: re-enabling puppet on all nodes after puppetdb restarts
  • 07:39 elukey: restart puppetdb on puppetdb[12]003 to pick up the new jvm
  • 07:16 kartik@deploy1003: Finished scap sync-world: Backport for aswiki: Set MT threshold for CX to 80% (T369417) (duration: 09m 37s)
  • 07:15 elukey: disable puppet fleetwide to restart puppetdb jvms (without impacting ongoing puppet runs, getting noise etc..)
  • 07:11 kartik@deploy1003: kartik: Continuing with sync
  • 07:09 kartik@deploy1003: kartik: Backport for aswiki: Set MT threshold for CX to 80% (T369417) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:06 kartik@deploy1003: Started scap sync-world: Backport for aswiki: Set MT threshold for CX to 80% (T369417)
  • 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
  • 07:02 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
  • 06:58 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
  • 06:57 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw

2024-09-04

  • 21:03 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.21 refs T373640
  • 20:54 mutante: gerrit servers: upgraded git package version
  • 20:52 dancy@deploy1003: Finished scap sync-world: Backport for NetworkSession: Only enable for private wikis (T373826) (duration: 06m 34s)
  • 20:48 dancy@deploy1003: ebernhardson, dancy: Continuing with sync
  • 20:48 dancy@deploy1003: ebernhardson, dancy: Backport for NetworkSession: Only enable for private wikis (T373826) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:46 dancy@deploy1003: Started scap sync-world: Backport for NetworkSession: Only enable for private wikis (T373826)
  • 20:44 dancy@deploy1003: Finished scap sync-world: Backport for Turn on Parsoid Read Views for eswikivoyage (T374029) (duration: 09m 46s)
  • 20:40 dancy@deploy1003: cscott, dancy: Continuing with sync
  • 20:36 dancy@deploy1003: cscott, dancy: Backport for Turn on Parsoid Read Views for eswikivoyage (T374029) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:34 dancy@deploy1003: Started scap sync-world: Backport for Turn on Parsoid Read Views for eswikivoyage (T374029)
  • 20:30 dancy@deploy1003: Finished scap sync-world: Backport for ParserOutput: Turn off noisy log - we have the info we need for now (T374046) (duration: 13m 33s)
  • 20:26 dancy@deploy1003: dancy, cscott: Continuing with sync
  • 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:19 dancy@deploy1003: dancy, cscott: Backport for ParserOutput: Turn off noisy log - we have the info we need for now (T374046) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:17 dancy@deploy1003: Started scap sync-world: Backport for ParserOutput: Turn off noisy log - we have the info we need for now (T374046)
  • 20:15 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 20:15 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 19:56 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1005.eqiad.wmnet with OS bookworm
  • 19:56 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 19:54 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 19:49 dancy@deploy1003: Sync cancelled.
  • 19:38 dancy@deploy1003: jdlrobson, dancy: Backport for Fixes: Echo icon not visible after click (T373936) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:37 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1005.eqiad.wmnet with reason: host reimage
  • 19:36 dancy@deploy1003: Started scap sync-world: Backport for Fixes: Echo icon not visible after click (T373936)
  • 19:34 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1005.eqiad.wmnet with reason: host reimage
  • 19:31 eileen: civicrm upgraded from 27b1f673 to 67ee99ce
  • 19:15 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host phab1005.eqiad.wmnet with OS bookworm
  • 19:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1005.eqiad.wmnet with OS bookworm
  • 19:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 17:59 dancy@deploy1003: Finished scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640 (duration: 06m 48s)
  • 17:52 dancy@deploy1003: Started scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640
  • 17:16 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.21 refs T373640
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'es2036 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68684 and previous config saved to /var/cache/conftool/dbconfig/20240904-171447-arnaudb.json
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68683 and previous config saved to /var/cache/conftool/dbconfig/20240904-171431-arnaudb.json
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68682 and previous config saved to /var/cache/conftool/dbconfig/20240904-171415-arnaudb.json
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68681 and previous config saved to /var/cache/conftool/dbconfig/20240904-171402-arnaudb.json
  • 17:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68680 and previous config saved to /var/cache/conftool/dbconfig/20240904-171351-arnaudb.json
  • 17:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68679 and previous config saved to /var/cache/conftool/dbconfig/20240904-171332-arnaudb.json
  • 17:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68678 and previous config saved to /var/cache/conftool/dbconfig/20240904-171317-arnaudb.json
  • 17:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68677 and previous config saved to /var/cache/conftool/dbconfig/20240904-171254-arnaudb.json
  • 17:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 100%: T370852', diff saved to https://phabricator.wikimedia.org/P68676 and previous config saved to /var/cache/conftool/dbconfig/20240904-171232-arnaudb.json
  • 17:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2083.codfw.wmnet
  • 17:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2083.codfw.wmnet
  • 17:09 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2083.codfw.wmnet
  • 17:00 hnowlan: homer lsw1-b3-codfw* commit 'T372878'
  • 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'es2036 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68675 and previous config saved to /var/cache/conftool/dbconfig/20240904-165941-arnaudb.json
  • 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68674 and previous config saved to /var/cache/conftool/dbconfig/20240904-165926-arnaudb.json
  • 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68673 and previous config saved to /var/cache/conftool/dbconfig/20240904-165909-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68672 and previous config saved to /var/cache/conftool/dbconfig/20240904-165857-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68671 and previous config saved to /var/cache/conftool/dbconfig/20240904-165846-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68670 and previous config saved to /var/cache/conftool/dbconfig/20240904-165827-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68669 and previous config saved to /var/cache/conftool/dbconfig/20240904-165811-arnaudb.json
  • 16:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68668 and previous config saved to /var/cache/conftool/dbconfig/20240904-165749-arnaudb.json
  • 16:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 75%: T370852', diff saved to https://phabricator.wikimedia.org/P68667 and previous config saved to /var/cache/conftool/dbconfig/20240904-165727-arnaudb.json
  • 16:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2083.codfw.wmnet with OS bullseye
  • 16:48 dancy@deploy1003: Finished scap sync-world: Backport for Do not consume 'centralauthtoken' on api.php OPTIONS requests (T373925) (duration: 10m 16s)
  • 16:47 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
  • 16:44 arnaudb@cumin1002: dbctl commit (dc=all): 'es2036 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68666 and previous config saved to /var/cache/conftool/dbconfig/20240904-164435-arnaudb.json
  • 16:44 arnaudb@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68665 and previous config saved to /var/cache/conftool/dbconfig/20240904-164421-arnaudb.json
  • 16:44 arnaudb@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68664 and previous config saved to /var/cache/conftool/dbconfig/20240904-164404-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68663 and previous config saved to /var/cache/conftool/dbconfig/20240904-164351-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68662 and previous config saved to /var/cache/conftool/dbconfig/20240904-164340-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68661 and previous config saved to /var/cache/conftool/dbconfig/20240904-164321-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68660 and previous config saved to /var/cache/conftool/dbconfig/20240904-164305-arnaudb.json
  • 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68659 and previous config saved to /var/cache/conftool/dbconfig/20240904-164243-arnaudb.json
  • 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 50%: T370852', diff saved to https://phabricator.wikimedia.org/P68658 and previous config saved to /var/cache/conftool/dbconfig/20240904-164221-arnaudb.json
  • 16:41 dancy@deploy1003: matmarex, dancy: Continuing with sync
  • 16:40 dancy@deploy1003: matmarex, dancy: Backport for Do not consume 'centralauthtoken' on api.php OPTIONS requests (T373925) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:37 dancy@deploy1003: Started scap sync-world: Backport for Do not consume 'centralauthtoken' on api.php OPTIONS requests (T373925)
  • 16:29 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2083.codfw.wmnet with reason: host reimage
  • 16:29 claime: T373095 repool kubernetes2011, kubernetes2012, kubernetes2036, kubernetes2037, wikikube-worker2037, wikikube-worker2038, mw2436, mw2437
  • 16:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2038.codfw.wmnet
  • 16:29 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2038.codfw.wmnet
  • 16:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2037.codfw.wmnet
  • 16:29 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2037.codfw.wmnet
  • 16:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2437.codfw.wmnet
  • 16:28 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2437.codfw.wmnet
  • 16:28 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host mw2436.codfw.wmnet
  • 16:28 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host mw2436.codfw.wmnet
  • 16:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2037.codfw.wmnet
  • 16:27 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2037.codfw.wmnet
  • 16:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2036.codfw.wmnet
  • 16:27 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2036.codfw.wmnet
  • 16:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2012.codfw.wmnet
  • 16:27 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2012.codfw.wmnet
  • 16:27 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubernetes2011.codfw.wmnet
  • 16:27 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host kubernetes2011.codfw.wmnet
  • 16:26 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2083.codfw.wmnet with reason: host reimage
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2036.codfw.wmnet
  • 16:18 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2036.codfw.wmnet
  • 16:18 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2036.codfw.wmnet
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'es2036 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68657 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68656 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2206 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68655 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2207 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68653 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68654 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68652 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2125 (re)pooling @ 25%: T370852', diff saved to https://phabricator.wikimedia.org/P68651 and previous config saved to /var/cache/conftool/dbconfig/20240904-161806-arnaudb.json
  • 16:16 claime: homer lsw1-b8-codfw* commit 'T372878'
  • 16:16 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2036.codfw.wmnet with OS bullseye
  • 16:13 swfrench-wmf: running homer 'cr*codfw*' commit 'T374018'
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2083
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2083
  • 16:09 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2083
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2083.codfw.wmnet 167.16.192.10.in-addr.arpa 7.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:09 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2083.codfw.wmnet 167.16.192.10.in-addr.arpa 7.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2083 - hnowlan@cumin1002"
  • 16:07 effie: restarting mcrouter on codfw
  • 16:06 topranks: migrating servers in codfw rack C1 from asw-c-codfw to lsw1-c1-codfw T373095
  • 16:06 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
  • 16:04 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2260,2267].codfw.wmnet
  • 16:04 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:03 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2083 - hnowlan@cumin1002"
  • 16:02 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:01 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 16:00 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Move server uplinks codfw racks C1
  • 15:59 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Move server uplinks codfw racks C1
  • 15:58 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2083
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2083.codfw.wmnet with OS bullseye
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2083.codfw.wmnet
  • 15:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 10 hosts with reason: network maintenance T370852
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 10 hosts with reason: network maintenance T370852
  • 15:55 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db2125 db2138 db2149 db2190 db2206 db2207 es2031 es2032 es2036 - T370852', diff saved to https://phabricator.wikimedia.org/P68650 and previous config saved to /var/cache/conftool/dbconfig/20240904-155459-arnaudb.json
  • 15:53 ladsgroup@deploy1003: Finished scap sync-world: Backport for Fix bug causing review form to disappear on unreviewed pages (T373582) (duration: 10m 31s)
  • 15:53 swfrench@cumin2002: START - Cookbook sre.hosts.decommission for hosts mw[2260,2267].codfw.wmnet
  • 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with sync
  • 15:47 ladsgroup@deploy1003: ladsgroup: Backport for Fix bug causing review form to disappear on unreviewed pages (T373582) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:46 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 15:43 aqu@deploy1003: Finished deploy [airflow-dags/analytics@3b0d8ba]: Regular analytics weekly train [airflow-dags@3b0d8ba1] (duration: 00m 48s)
  • 15:43 topranks: configure lsw1-c1-codfw interfaces for servers in advance of move T373095
  • 15:43 ladsgroup@deploy1003: Started scap sync-world: Backport for Fix bug causing review form to disappear on unreviewed pages (T373582)
  • 15:42 aqu@deploy1003: Started deploy [airflow-dags/analytics@3b0d8ba]: Regular analytics weekly train [airflow-dags@3b0d8ba1]
  • 15:25 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2267.codfw.wmnet
  • 15:24 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2267.codfw.wmnet
  • 15:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2081.codfw.wmnet
  • 15:20 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2081.codfw.wmnet
  • 15:15 vgutierrez@cumin1002: conftool action : set/pooled=yes; selector: name=cp7015.magru.wmnet
  • 15:14 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:13 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp7015.magru.wmnet
  • 15:13 vgutierrez@cumin1002: START - Cookbook sre.hosts.remove-downtime for cp7015.magru.wmnet
  • 15:12 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host phab1005.eqiad.wmnet with OS bookworm
  • 15:12 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1005.eqiad.wmnet with OS bookworm
  • 15:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:09 hashar@deploy1003: Finished scap sync-world: Backport for ParserOutput::collectMetadata: Log if given value is non-numeric and also non-string, for easier debugging, and don't fatal (T373920) (duration: 08m 37s)
  • 15:06 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:06 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2036.codfw.wmnet with reason: host reimage
  • 15:05 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 15:04 hashar@deploy1003: hashar: Continuing with sync
  • 15:04 hashar@deploy1003: hashar: Backport for ParserOutput::collectMetadata: Log if given value is non-numeric and also non-string, for easier debugging, and don't fatal (T373920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:02 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2036.codfw.wmnet with reason: host reimage
  • 15:02 elukey@cumin2002: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:00 hashar@deploy1003: Started scap sync-world: Backport for ParserOutput::collectMetadata: Log if given value is non-numeric and also non-string, for easier debugging, and don't fatal (T373920)
  • 14:50 claime: homer cr*codfw* commit 'T372878'
  • 14:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2036
  • 14:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2036
  • 14:44 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2036
  • 14:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2036.codfw.wmnet 121.16.192.10.in-addr.arpa 1.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:44 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2036.codfw.wmnet 121.16.192.10.in-addr.arpa 1.2.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2036 - cgoubert@cumin1002"
  • 14:44 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2036 - cgoubert@cumin1002"
  • 14:40 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:40 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:40 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 14:40 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:40 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2036
  • 14:40 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 14:39 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2036.codfw.wmnet with OS bullseye
  • 14:39 arnaudb@cumin1002: dbctl commit (dc=all): 'swap masters for es1 - T373095', diff saved to https://phabricator.wikimedia.org/P68648 and previous config saved to /var/cache/conftool/dbconfig/20240904-143928-arnaudb.json
  • 14:38 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2036.codfw.wmnet
  • 14:38 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:38 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2036.codfw.wmnet
  • 14:38 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2036.codfw.wmnet
  • 14:37 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:37 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 14:35 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 14:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2082.codfw.wmnet
  • 14:33 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2082.codfw.wmnet
  • 14:30 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2088.codfw.wmnet on all recursors
  • 14:30 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2088.codfw.wmnet on all recursors
  • 14:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2054 to wikikube-worker2088
  • 14:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2088
  • 14:29 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2088
  • 14:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:29 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2054 to wikikube-worker2088 - cgoubert@cumin1002"
  • 14:28 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2054 to wikikube-worker2088 - cgoubert@cumin1002"
  • 14:25 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:25 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2054 to wikikube-worker2088
  • 14:22 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:22 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:17 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2082
  • 14:17 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:16 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:16 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2087
  • 14:16 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 14:15 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:15 hnowlan: homer lsw1-b3-codfw* commit
  • 14:13 TheresNoTime: gerrit:1070561 reverted, fwiw
  • 14:11 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:09 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 14:08 TheresNoTime: scap failed, gerrit:1070561 merged but undeployed currently
  • 14:00 samtar@deploy1003: Started scap sync-world: Backport for Allow copyuploads on test2wiki (T356241)
  • 13:57 TheresNoTime: scap failed: Exception K8s Deployment, rolled back
  • 13:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2082.codfw.wmnet with OS bullseye
  • 13:52 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:52 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:50 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 13:48 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:47 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:44 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:44 samtar@deploy1003: Started scap sync-world: Backport for Allow copyuploads on test2wiki (T356241)
  • 13:42 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2086.codfw.wmnet
  • 13:42 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2086.codfw.wmnet
  • 13:41 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2085.codfw.wmnet
  • 13:41 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2085.codfw.wmnet
  • 13:40 samtar@deploy1003: Finished scap sync-world: Backport for Enable CampaignEvents Invitation Lists on igwiki and swwiki (T372582) (duration: 08m 28s)
  • 13:40 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 13:40 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS bookworm
  • 13:38 jayme: homer 'lsw1-b6-codfw*' commit 'T372878'
  • 13:36 samtar@deploy1003: samtar, daimona: Continuing with sync
  • 13:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2082.codfw.wmnet with reason: host reimage
  • 13:34 samtar@deploy1003: samtar, daimona: Backport for Enable CampaignEvents Invitation Lists on igwiki and swwiki (T372582) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:34 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2086
  • 13:34 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2086.codfw.wmnet with OS bookworm
  • 13:33 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host phab1005.eqiad.wmnet with OS bookworm
  • 13:32 samtar@deploy1003: Started scap sync-world: Backport for Enable CampaignEvents Invitation Lists on igwiki and swwiki (T372582)
  • 13:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2082.codfw.wmnet with reason: host reimage
  • 13:28 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2085
  • 13:28 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2085.codfw.wmnet with OS bookworm
  • 13:15 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2086.codfw.wmnet with reason: host reimage
  • 13:14 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-worker2079.codfw.wmnet
  • 13:14 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-worker2079.codfw.wmnet
  • 13:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2082
  • 13:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2082
  • 13:12 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2086.codfw.wmnet with reason: host reimage
  • 13:11 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2082
  • 13:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2082.codfw.wmnet 166.16.192.10.in-addr.arpa 6.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:11 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2082.codfw.wmnet 166.16.192.10.in-addr.arpa 6.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2082 - hnowlan@cumin1002"
  • 13:11 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2082 - hnowlan@cumin1002"
  • 13:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2085.codfw.wmnet with reason: host reimage
  • 13:06 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 13:06 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2082
  • 13:06 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2085.codfw.wmnet with reason: host reimage
  • 13:06 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2082.codfw.wmnet with OS bullseye
  • 13:05 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2082
  • 13:01 hashar@deploy1003: Finished deploy [releng/jenkins-deploy@26ba1ab] (releasing): do not set keepUndefinedParameters - T133737 (duration: 00m 38s)
  • 13:00 hashar@deploy1003: Started deploy [releng/jenkins-deploy@26ba1ab] (releasing): do not set keepUndefinedParameters - T133737
  • 12:57 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-text_eqiad for 9.2.5-1wm2
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2087
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2087
  • 12:55 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2087
  • 12:55 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2087.codfw.wmnet 225.16.192.10.in-addr.arpa 5.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:55 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2087.codfw.wmnet 225.16.192.10.in-addr.arpa 5.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:55 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:55 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2087 - jayme@cumin1002"
  • 12:55 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2087 - jayme@cumin1002"
  • 12:52 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2086
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2086
  • 12:52 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2086
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2086.codfw.wmnet 212.16.192.10.in-addr.arpa 2.1.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:52 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2086.codfw.wmnet 212.16.192.10.in-addr.arpa 2.1.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:52 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2086 - jayme@cumin1002"
  • 12:52 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2086 - jayme@cumin1002"
  • 12:50 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2087
  • 12:50 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 12:50 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2087
  • 12:49 jayme@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2087
  • 12:49 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 12:49 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2087.codfw.wmnet with OS bookworm
  • 12:49 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2087
  • 12:48 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:46 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2086
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2085
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2085
  • 12:45 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2085
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2085.codfw.wmnet 138.16.192.10.in-addr.arpa 8.3.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:45 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2085.codfw.wmnet 138.16.192.10.in-addr.arpa 8.3.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:45 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2085 - jayme@cumin1002"
  • 12:45 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2085 - jayme@cumin1002"
  • 12:42 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2086.codfw.wmnet with OS bookworm
  • 12:42 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:41 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:41 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2086
  • 12:41 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:41 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2085
  • 12:41 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2085.codfw.wmnet with OS bookworm
  • 12:40 jayme@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2085
  • 12:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2035 to wikikube-worker2087
  • 12:35 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2087
  • 12:35 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:35 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2087
  • 12:35 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:35 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2035 to wikikube-worker2087 - jayme@cumin1002"
  • 12:34 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2035 to wikikube-worker2087 - jayme@cumin1002"
  • 12:33 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:32 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2010 to wikikube-worker2086
  • 12:31 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2086
  • 12:31 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:31 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2086
  • 12:31 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:31 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2010 to wikikube-worker2086 - jayme@cumin1002"
  • 12:31 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2010 to wikikube-worker2086 - jayme@cumin1002"
  • 12:29 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2035 to wikikube-worker2087
  • 12:28 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2009 to wikikube-worker2085
  • 12:27 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2085
  • 12:27 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2010 to wikikube-worker2086
  • 12:27 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2085
  • 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:27 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2009 to wikikube-worker2085 - jayme@cumin1002"
  • 12:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2191.codfw.wmnet with reason: provisioning db2221.codfw.wmnet - T373579
  • 12:27 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2009 to wikikube-worker2085 - jayme@cumin1002"
  • 12:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2191.codfw.wmnet with reason: provisioning db2221.codfw.wmnet - T373579
  • 12:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisioning db2221.codfw.wmnet - T373579
  • 12:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: provisioning db2221.codfw.wmnet - T373579
  • 12:20 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:20 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2009 to wikikube-worker2085
  • 12:08 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2081
  • 12:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2081.codfw.wmnet with OS bullseye
  • 12:05 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2054.codfw.wmnet
  • 12:04 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2054.codfw.wmnet
  • 12:04 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2035.codfw.wmnet
  • 12:04 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2035.codfw.wmnet
  • 12:04 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2010.codfw.wmnet
  • 12:03 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2010.codfw.wmnet
  • 12:03 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2009.codfw.wmnet
  • 12:02 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2009.codfw.wmnet
  • 12:00 hashar@deploy1003: Finished deploy [releng/jenkins-deploy@10f309b] (releasing): do not set keepUndefinedParameters - T133737 (duration: 00m 14s)
  • 12:00 hashar@deploy1003: Started deploy [releng/jenkins-deploy@10f309b] (releasing): do not set keepUndefinedParameters - T133737
  • 11:59 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-text_eqiad for 9.2.5-1wm2
  • 11:50 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_eqiad for 9.2.5-1wm2
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2081.codfw.wmnet with reason: host reimage
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2081.codfw.wmnet with reason: host reimage
  • 11:26 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2081
  • 11:26 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2081
  • 11:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T367781)', diff saved to https://phabricator.wikimedia.org/P68643 and previous config saved to /var/cache/conftool/dbconfig/20240904-112507-arnaudb.json
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2081
  • 11:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2081.codfw.wmnet 165.16.192.10.in-addr.arpa 5.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2081.codfw.wmnet 165.16.192.10.in-addr.arpa 5.6.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2081 - hnowlan@cumin1002"
  • 11:24 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2081 - hnowlan@cumin1002"
  • 11:20 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:20 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2081
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2081.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2081
  • 11:12 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2079.codfw.wmnet
  • 11:12 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2079.codfw.wmnet
  • 11:12 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2079.codfw.wmnet
  • 11:11 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:11 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:11 claime: homer lsw1-b6-codfw* commit 'T372878'
  • 11:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P68642 and previous config saved to /var/cache/conftool/dbconfig/20240904-110959-arnaudb.json
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2317 to wikikube-worker2082
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2319 to wikikube-worker2084
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2318 to wikikube-worker2083
  • 11:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2082
  • 11:09 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2082
  • 11:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2084
  • 11:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2083
  • 11:08 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-worker2082
  • 11:07 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2084
  • 11:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:07 topranks: migrating VMs off ganeti2009 in advance of network maintenance codfw rack C1 - T373095
  • 11:06 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2083
  • 11:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2318 to wikikube-worker2083 - hnowlan@cumin1002"
  • 11:06 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2318 to wikikube-worker2083 - hnowlan@cumin1002"
  • 11:05 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2082
  • 11:05 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:05 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2317 to wikikube-worker2082 - hnowlan@cumin1002"
  • 11:05 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:04 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2079.codfw.wmnet with OS bullseye
  • 11:04 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:04 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:00 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:59 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2317 to wikikube-worker2082 - hnowlan@cumin1002"
  • 10:56 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2318 to wikikube-worker2083
  • 10:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2316 to wikikube-worker2081
  • 10:55 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_eqiad for 9.2.5-1wm2
  • 10:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2081
  • 10:55 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2081
  • 10:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2316 to wikikube-worker2081 - hnowlan@cumin1002"
  • 10:55 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:54 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2316 to wikikube-worker2081 - hnowlan@cumin1002"
  • 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P68641 and previous config saved to /var/cache/conftool/dbconfig/20240904-105452-arnaudb.json
  • 10:53 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2319 to wikikube-worker2084
  • 10:50 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2317 to wikikube-worker2082
  • 10:50 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:50 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2316 to wikikube-worker2081
  • 10:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T367781)', diff saved to https://phabricator.wikimedia.org/P68639 and previous config saved to /var/cache/conftool/dbconfig/20240904-103945-arnaudb.json
  • 10:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2218 (T367781)', diff saved to https://phabricator.wikimedia.org/P68638 and previous config saved to /var/cache/conftool/dbconfig/20240904-103718-arnaudb.json
  • 10:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 10:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 10:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T367781)', diff saved to https://phabricator.wikimedia.org/P68637 and previous config saved to /var/cache/conftool/dbconfig/20240904-103656-arnaudb.json
  • 10:34 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2079.codfw.wmnet with reason: host reimage
  • 10:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2319.codfw.wmnet
  • 10:32 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2319.codfw.wmnet
  • 10:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2318.codfw.wmnet
  • 10:32 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2079.codfw.wmnet with reason: host reimage
  • 10:31 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2318.codfw.wmnet
  • 10:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2317.codfw.wmnet
  • 10:31 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2317.codfw.wmnet
  • 10:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2316.codfw.wmnet
  • 10:30 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2316.codfw.wmnet
  • 10:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P68636 and previous config saved to /var/cache/conftool/dbconfig/20240904-102148-arnaudb.json
  • 10:19 claime: homer cr*codfw* commit 'T372878'
  • 10:18 hnowlan: temporarily adding own-work licenses to test2wiki MediaWiki:Licenses to test uploads
  • 10:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2079
  • 10:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2079
  • 10:15 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2079
  • 10:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2079.codfw.wmnet 181.16.192.10.in-addr.arpa 1.8.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:15 cgoubert@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2079.codfw.wmnet 181.16.192.10.in-addr.arpa 1.8.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:15 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2079 - cgoubert@cumin1002"
  • 10:15 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2079 - cgoubert@cumin1002"
  • 10:11 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 10:11 btullis@cumin1002: END (PASS) - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash (exit_code=0) rolling restart_daemons on A:apifeatureusage
  • 10:11 cgoubert@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2079
  • 10:11 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2079.codfw.wmnet with OS bullseye
  • 10:10 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2079.codfw.wmnet
  • 10:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2325 to wikikube-worker2079
  • 10:08 btullis@cumin1002: START - Cookbook sre.apifeatureusage.roll-restart-reboot-logstash rolling restart_daemons on A:apifeatureusage
  • 10:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2079
  • 10:08 cgoubert@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2079
  • 10:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:08 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2325 to wikikube-worker2079 - cgoubert@cumin1002"
  • 10:07 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2325 to wikikube-worker2079 - cgoubert@cumin1002"
  • 10:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P68635 and previous config saved to /var/cache/conftool/dbconfig/20240904-100641-arnaudb.json
  • 10:04 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 10:04 cgoubert@cumin1002: START - Cookbook sre.hosts.rename from mw2325 to wikikube-worker2079
  • 10:03 aqu: Deployed refinery-source using jenkins
  • 10:02 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2325.codfw.wmnet
  • 10:01 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2325.codfw.wmnet
  • 09:58 aqu: Deployed refinery using scap, then deployed onto hdfs
  • 09:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T367781)', diff saved to https://phabricator.wikimedia.org/P68633 and previous config saved to /var/cache/conftool/dbconfig/20240904-095134-arnaudb.json
  • 09:50 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2080.codfw.wmnet
  • 09:50 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2080.codfw.wmnet
  • 09:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T367781)', diff saved to https://phabricator.wikimedia.org/P68632 and previous config saved to /var/cache/conftool/dbconfig/20240904-094824-arnaudb.json
  • 09:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T367781)', diff saved to https://phabricator.wikimedia.org/P68631 and previous config saved to /var/cache/conftool/dbconfig/20240904-094802-arnaudb.json
  • 09:41 aqu@deploy1003: Finished deploy [analytics/refinery@07fd127] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@07fd1275] (duration: 04m 18s)
  • 09:37 aqu@deploy1003: Started deploy [analytics/refinery@07fd127] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@07fd1275]
  • 09:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-text_drmrs for 9.2.5-1wm2
  • 09:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P68630 and previous config saved to /var/cache/conftool/dbconfig/20240904-093255-arnaudb.json
  • 09:23 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:23 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for drmrs switch irb interface IPs - cmooney@cumin1002"
  • 09:23 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for drmrs switch irb interface IPs - cmooney@cumin1002"
  • 09:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 7:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 09:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 7:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T367856)', diff saved to https://phabricator.wikimedia.org/P68629 and previous config saved to /var/cache/conftool/dbconfig/20240904-091937-marostegui.json
  • 09:18 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 09:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P68628 and previous config saved to /var/cache/conftool/dbconfig/20240904-091748-arnaudb.json
  • 09:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:10 isaranto@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:09 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P68627 and previous config saved to /var/cache/conftool/dbconfig/20240904-090429-marostegui.json
  • 09:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T367781)', diff saved to https://phabricator.wikimedia.org/P68626 and previous config saved to /var/cache/conftool/dbconfig/20240904-090240-arnaudb.json
  • 08:59 dcausse@deploy1003: Finished scap sync-world: Backport for Checker: ensure all labels are set (T373086) (duration: 06m 48s)
  • 08:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:55 dcausse@deploy1003: dcausse: Continuing with sync
  • 08:54 dcausse@deploy1003: dcausse: Backport for Checker: ensure all labels are set (T373086) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:52 dcausse@deploy1003: Started scap sync-world: Backport for Checker: ensure all labels are set (T373086)
  • 08:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P68624 and previous config saved to /var/cache/conftool/dbconfig/20240904-084922-marostegui.json
  • 08:48 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2080.codfw.wmnet
  • 08:47 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2080.codfw.wmnet
  • 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T367781)', diff saved to https://phabricator.wikimedia.org/P68623 and previous config saved to /var/cache/conftool/dbconfig/20240904-084435-arnaudb.json
  • 08:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T367781)', diff saved to https://phabricator.wikimedia.org/P68622 and previous config saved to /var/cache/conftool/dbconfig/20240904-083622-arnaudb.json
  • 08:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T367856)', diff saved to https://phabricator.wikimedia.org/P68621 and previous config saved to /var/cache/conftool/dbconfig/20240904-083415-marostegui.json
  • 08:33 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-text_drmrs for 9.2.5-1wm2
  • 08:32 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_drmrs for 9.2.5-1wm2
  • 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P68620 and previous config saved to /var/cache/conftool/dbconfig/20240904-082928-arnaudb.json
  • 08:19 elukey: upgrade ruby-nokogiri on serpens and seaborgium for security upgrades
  • 08:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P68619 and previous config saved to /var/cache/conftool/dbconfig/20240904-081420-arnaudb.json
  • 07:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T367781)', diff saved to https://phabricator.wikimedia.org/P68618 and previous config saved to /var/cache/conftool/dbconfig/20240904-075913-arnaudb.json
  • 07:59 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:58 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T367781)', diff saved to https://phabricator.wikimedia.org/P68617 and previous config saved to /var/cache/conftool/dbconfig/20240904-075659-arnaudb.json
  • 07:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 07:56 kartik@deploy1003: Finished scap sync-world: Backport for TTMServerAid: Tell PHP that we're OK with $services starting out null (T373921) (duration: 09m 23s)
  • 07:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 07:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T367781)', diff saved to https://phabricator.wikimedia.org/P68616 and previous config saved to /var/cache/conftool/dbconfig/20240904-075637-arnaudb.json
  • 07:54 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:54 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:52 kartik@deploy1003: kartik, abi: Continuing with sync
  • 07:49 kartik@deploy1003: kartik, abi: Backport for TTMServerAid: Tell PHP that we're OK with $services starting out null (T373921) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:47 kartik@deploy1003: Started scap sync-world: Backport for TTMServerAid: Tell PHP that we're OK with $services starting out null (T373921)
  • 07:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P68615 and previous config saved to /var/cache/conftool/dbconfig/20240904-074130-arnaudb.json
  • 07:38 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:34 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:33 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_drmrs for 9.2.5-1wm2
  • 07:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P68614 and previous config saved to /var/cache/conftool/dbconfig/20240904-072623-arnaudb.json
  • 07:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2437.codfw.wmnet
  • 07:14 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2437.codfw.wmnet
  • 07:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2436.codfw.wmnet
  • 07:13 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2436.codfw.wmnet
  • 07:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2038.codfw.wmnet
  • 07:12 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2038.codfw.wmnet
  • 07:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2037.codfw.wmnet
  • 07:12 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2037.codfw.wmnet
  • 07:12 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2037.codfw.wmnet
  • 07:11 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2037.codfw.wmnet
  • 07:11 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2036.codfw.wmnet
  • 07:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T367781)', diff saved to https://phabricator.wikimedia.org/P68613 and previous config saved to /var/cache/conftool/dbconfig/20240904-071115-arnaudb.json
  • 07:10 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2036.codfw.wmnet
  • 07:10 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2012.codfw.wmnet
  • 07:10 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2012.codfw.wmnet
  • 07:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T367781)', diff saved to https://phabricator.wikimedia.org/P68612 and previous config saved to /var/cache/conftool/dbconfig/20240904-071007-arnaudb.json
  • 07:10 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:09 akosiaris: T373095 depool kubernetes2011, kubernetes2012, kubernetes2036, kubernetes2037, wikikube-worker2037, wikikube-worker2038, mw2436, mw2437
  • 07:08 akosiaris@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2011.codfw.wmnet
  • 07:07 akosiaris@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2011.codfw.wmnet
  • 06:11 aqu@deploy1003: Finished deploy [analytics/refinery@07fd127] (thin): Regular analytics weekly train THIN [analytics/refinery@07fd1275] (duration: 04m 55s)
  • 06:06 aqu@deploy1003: Started deploy [analytics/refinery@07fd127] (thin): Regular analytics weekly train THIN [analytics/refinery@07fd1275]

2024-09-03

  • 22:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1005.eqiad.wmnet with OS bookworm
  • 21:42 swfrench-wmf: running homer 'cr*codfw*' commit 'T372878'
  • 21:34 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2080.codfw.wmnet
  • 21:34 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2080.codfw.wmnet
  • 21:31 swfrench-wmf: running homer 'lsw1-b3-codfw*' commit 'T372878'
  • 21:13 jdrewniak@deploy1003: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 18s)
  • 21:11 jdrewniak@deploy1003: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 25s)
  • 20:59 jdrewniak@deploy1003: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 19s)
  • 20:56 jdrewniak@deploy1003: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 21s)
  • 20:56 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab1005
  • 20:56 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host phab1005
  • 20:45 toyofuku@deploy1003: Finished scap sync-world: Backport for Turn on donate link in beta (T372757), Disable lead paragraph transform on Wikivoyages (duration: 09m 43s)
  • 20:44 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host phab1005.eqiad.wmnet with OS bookworm
  • 20:44 swfrench-wmf: running homer 'lsw1-b3-codfw*' commit 'T372878'
  • 20:44 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:41 toyofuku@deploy1003: jdlrobson, toyofuku: Continuing with sync
  • 20:39 jclark@cumin1002: START - Cookbook sre.hosts.provision for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:38 toyofuku@deploy1003: jdlrobson, toyofuku: Backport for Turn on donate link in beta (T372757), Disable lead paragraph transform on Wikivoyages synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:37 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:36 toyofuku@deploy1003: Started scap sync-world: Backport for Turn on donate link in beta (T372757), Disable lead paragraph transform on Wikivoyages
  • 20:31 cjming@deploy1003: Finished scap sync-world: Backport for cirrus: Introduce an expensive query pool counter (T369808) (duration: 06m 47s)
  • 20:26 cjming@deploy1003: ebernhardson, cjming: Continuing with sync
  • 20:26 cjming@deploy1003: ebernhardson, cjming: Backport for cirrus: Introduce an expensive query pool counter (T369808) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:24 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2080.codfw.wmnet with OS bullseye
  • 20:24 cjming@deploy1003: Started scap sync-world: Backport for cirrus: Introduce an expensive query pool counter (T369808)
  • 20:08 jclark@cumin1002: START - Cookbook sre.hosts.provision for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:07 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab1005
  • 20:07 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host phab1005
  • 20:07 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:07 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 20:07 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 20:04 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 20:03 jclark@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 20:01 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 19:58 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:57 jclark@cumin1002: START - Cookbook sre.hosts.provision for host phab1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:56 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:56 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 19:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 19:55 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2080.codfw.wmnet with reason: host reimage
  • 19:53 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 19:52 swfrench@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2080.codfw.wmnet with reason: host reimage
  • 19:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:48 jclark@cumin1002: START - Cookbook sre.hosts.provision for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:44 jclark@cumin1002: START - Cookbook sre.hosts.provision for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:42 jclark@cumin1002: START - Cookbook sre.hosts.provision for host gerrit1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:35 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2080
  • 19:35 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2080
  • 19:35 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2080
  • 19:35 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2080.codfw.wmnet 159.16.192.10.in-addr.arpa 9.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 19:35 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2080.codfw.wmnet 159.16.192.10.in-addr.arpa 9.5.1.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 19:35 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:33 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab1005
  • 19:32 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 19:31 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host phab1005
  • 19:31 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:31 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 19:30 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt phab1005 - jclark@cumin1002"
  • 19:27 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 19:23 swfrench@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 19:22 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert "Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki" (T359529) (duration: 10m 50s)
  • 19:21 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 19:21 swfrench@cumin2002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2080
  • 19:21 swfrench@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2080.codfw.wmnet with OS bullseye
  • 19:20 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2080.codfw.wmnet on all recursors
  • 19:20 swfrench@cumin2002: START - Cookbook sre.dns.wipe-cache wikikube-worker2080.codfw.wmnet on all recursors
  • 19:19 swfrench@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2312 to wikikube-worker2080
  • 19:18 swfrench@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2080
  • 19:18 swfrench@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2080
  • 19:18 swfrench@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:18 swfrench@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2312 to wikikube-worker2080 - swfrench@cumin2002"
  • 19:17 swfrench@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2312 to wikikube-worker2080 - swfrench@cumin2002"
  • 19:17 ladsgroup@deploy1003: ladsgroup: Continuing with sync
  • 19:16 ladsgroup@deploy1003: ladsgroup: Backport for Revert "Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki" (T359529) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:13 swfrench@cumin2002: START - Cookbook sre.dns.netbox
  • 19:13 swfrench@cumin2002: START - Cookbook sre.hosts.rename from mw2312 to wikikube-worker2080
  • 19:11 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert "Set wgFlaggedRevsHandleIncludes to FR_INCLUDES_CURRENT on ruwiki" (T359529)
  • 19:08 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
  • 19:08 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
  • 19:07 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
  • 19:07 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
  • 19:07 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
  • 19:07 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
  • 19:06 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2312.codfw.wmnet
  • 19:06 dancy@deploy1003: Finished scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640 (duration: 08m 11s)
  • 19:06 kamila_: ran homer on lsw1-a5-codfw for T372878
  • 19:05 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2312.codfw.wmnet
  • 19:05 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2260.codfw.wmnet
  • 19:05 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2076.codfw.wmnet with OS bullseye
  • 19:04 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2260.codfw.wmnet
  • 18:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2077.codfw.wmnet with OS bullseye
  • 18:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2078.codfw.wmnet with OS bullseye
  • 18:58 dancy@deploy1003: Started scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640
  • 18:13 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.21 refs T373640
  • 17:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
  • 17:53 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
  • 17:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
  • 17:53 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
  • 17:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2073.codfw.wmnet
  • 17:53 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2073.codfw.wmnet
  • 17:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2072.codfw.wmnet
  • 17:53 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2072.codfw.wmnet
  • 17:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2075.codfw.wmnet with OS bullseye
  • 17:48 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2074.codfw.wmnet with OS bullseye
  • 17:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2078.codfw.wmnet with reason: host reimage
  • 17:42 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2078.codfw.wmnet with reason: host reimage
  • 17:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2076.codfw.wmnet with reason: host reimage
  • 17:38 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2076.codfw.wmnet with reason: host reimage
  • 17:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2077.codfw.wmnet with reason: host reimage
  • 17:35 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 17:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 17:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2077.codfw.wmnet with reason: host reimage
  • 17:31 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 17:29 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2075.codfw.wmnet with reason: host reimage
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2078
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2078
  • 17:26 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2078
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2078.codfw.wmnet 75.0.192.10.in-addr.arpa 5.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:26 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2078.codfw.wmnet 75.0.192.10.in-addr.arpa 5.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:26 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2078 - kamila@cumin1002"
  • 17:26 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2078 - kamila@cumin1002"
  • 17:25 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 17:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 17:22 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:22 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2078
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2076
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2076
  • 17:21 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2076
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2076.codfw.wmnet 66.0.192.10.in-addr.arpa 6.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:21 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2076.codfw.wmnet 66.0.192.10.in-addr.arpa 6.6.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:21 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2076 - kamila@cumin1002"
  • 17:21 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2076 - kamila@cumin1002"
  • 17:21 hnowlan: homer 'cr*codfw*' commit
  • 17:21 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 17:19 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2074.codfw.wmnet with reason: host reimage
  • 17:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2078.codfw.wmnet with OS bullseye
  • 17:17 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:17 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2076
  • 17:17 hnowlan: homer 'lsw1-a5-codfw*' && homer 'lsw1-a6-codfw*' commit
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2077
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2077
  • 17:16 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2077
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2077.codfw.wmnet 71.0.192.10.in-addr.arpa 1.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:16 kamila@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2077.codfw.wmnet 71.0.192.10.in-addr.arpa 1.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:16 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2077 - kamila@cumin1002"
  • 17:16 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2077 - kamila@cumin1002"
  • 17:12 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:12 kamila@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2077
  • 17:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2075
  • 17:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2075
  • 17:11 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2075
  • 17:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2075.codfw.wmnet 80.0.192.10.in-addr.arpa 0.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:11 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2075.codfw.wmnet 80.0.192.10.in-addr.arpa 0.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:09 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2407 to wikikube-worker2078
  • 17:08 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2078
  • 17:08 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2078
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:08 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2407 to wikikube-worker2078 - kamila@cumin1002"
  • 17:08 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2407 to wikikube-worker2078 - kamila@cumin1002"
  • 17:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2077.codfw.wmnet with OS bullseye
  • 17:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2406 to wikikube-worker2077
  • 17:05 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2077
  • 17:05 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:04 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2077
  • 17:04 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:04 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2406 to wikikube-worker2077 - kamila@cumin1002"
  • 17:04 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2076.codfw.wmnet with OS bullseye
  • 17:04 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2406 to wikikube-worker2077 - kamila@cumin1002"
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2402 to wikikube-worker2076
  • 17:03 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2407 to wikikube-worker2078
  • 17:03 kamila@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2076
  • 17:00 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2075
  • 17:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2074
  • 17:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2074
  • 17:00 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 17:00 kamila@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2076
  • 17:00 kamila@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:00 kamila@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2402 to wikikube-worker2076 - kamila@cumin1002"
  • 17:00 kamila@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2402 to wikikube-worker2076 - kamila@cumin1002"
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2406 to wikikube-worker2077
  • 16:59 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2074
  • 16:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2074.codfw.wmnet 78.0.192.10.in-addr.arpa 8.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:59 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2074.codfw.wmnet 78.0.192.10.in-addr.arpa 8.7.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2074 - hnowlan@cumin1002"
  • 16:57 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (None, T373791) xfer wikidata_main from wdqs2022.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards
  • 16:56 kamila@cumin1002: START - Cookbook sre.dns.netbox
  • 16:56 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2074 - hnowlan@cumin1002"
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.rename from mw2402 to wikikube-worker2076
  • 16:51 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 16:49 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2074
  • 16:49 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2075.codfw.wmnet with OS bullseye
  • 16:49 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2074.codfw.wmnet with OS bullseye
  • 16:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2073.codfw.wmnet with OS bullseye
  • 16:45 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2072.codfw.wmnet with OS bullseye
  • 16:45 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2407.codfw.wmnet
  • 16:45 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2407.codfw.wmnet
  • 16:45 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2406.codfw.wmnet
  • 16:44 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2402.codfw.wmnet
  • 16:44 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2406.codfw.wmnet
  • 16:44 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2402.codfw.wmnet
  • 16:35 aqu@deploy1003: Finished deploy [analytics/refinery@07fd127]: Regular analytics weekly train [analytics/refinery@07fd1275] (duration: 08m 09s)
  • 16:27 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2072.codfw.wmnet with reason: host reimage
  • 16:27 aqu@deploy1003: Started deploy [analytics/refinery@07fd127]: Regular analytics weekly train [analytics/refinery@07fd1275]
  • 16:27 aqu: About to deploy analytics/refinery
  • 16:26 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply heap settings - bking@cumin2002 - T373895
  • 16:24 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2072.codfw.wmnet with reason: host reimage
  • 16:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2073.codfw.wmnet with reason: host reimage
  • 16:21 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2073.codfw.wmnet with reason: host reimage
  • 16:17 mutante: contint - installed Java JDK 17 packages in parallel to the existing JDK 11; no change yet to java_home or to what CI uses. T359795
  • 16:09 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (None, T373791) xfer wikidata_main from wdqs2022.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards
  • 16:07 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 16:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp7015.magru.wmnet with reason: T371554
  • 16:03 brett@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp7015.magru.wmnet with reason: T371554
  • 16:01 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply heap settings - bking@cumin2002 - T373895
  • 15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:59 denisse: Enabling meta monitoring for alert[12]001 - T372418
  • 15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:58 denisse: Reverting to alert1001 - T372418
  • 15:58 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=codfw
  • 15:45 godog: bounce icinga on alert2002
  • 15:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2073
  • 15:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2073
  • 15:10 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2073
  • 15:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2073.codfw.wmnet 25.0.192.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:10 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2073.codfw.wmnet 25.0.192.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2073 - hnowlan@cumin1002"
  • 15:10 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2073 - hnowlan@cumin1002"
  • 15:07 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:07 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2073
  • 15:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2072
  • 15:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2072
  • 15:06 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2072
  • 15:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2072.codfw.wmnet 89.0.192.10.in-addr.arpa 9.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:06 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2072.codfw.wmnet 89.0.192.10.in-addr.arpa 9.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 15:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2072 - hnowlan@cumin1002"
  • 15:06 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2072 - hnowlan@cumin1002"
  • 15:02 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2073.codfw.wmnet with OS bullseye
  • 15:01 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:01 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2072
  • 15:01 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2072.codfw.wmnet with OS bullseye
  • 15:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2422 to wikikube-worker2074
  • 15:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2074
  • 14:59 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2074
  • 14:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2423 to wikikube-worker2075
  • 14:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2075
  • 14:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2075
  • 14:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2423 to wikikube-worker2075 - hnowlan@cumin1002"
  • 14:57 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:56 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2423 to wikikube-worker2075 - hnowlan@cumin1002"
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2423 to wikikube-worker2075
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2422 to wikikube-worker2074
  • 14:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2055 to wikikube-worker2073
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2073
  • 14:49 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2073
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2055 to wikikube-worker2073 - hnowlan@cumin1002"
  • 14:48 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2055 to wikikube-worker2073 - hnowlan@cumin1002"
  • 14:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2028 to wikikube-worker2072
  • 14:45 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2072
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2072
  • 14:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2028 to wikikube-worker2072 - hnowlan@cumin1002"
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2028 to wikikube-worker2072 - hnowlan@cumin1002"
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2055 to wikikube-worker2073
  • 14:42 hashar: Restarting CI Jenkins
  • 14:40 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:40 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2028 to wikikube-worker2072
  • 14:26 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2423.codfw.wmnet
  • 14:26 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2423.codfw.wmnet
  • 14:25 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2423.codfw.wmnet
  • 14:25 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2422.codfw.wmnet
  • 14:25 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2422.codfw.wmnet
  • 14:25 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host mw2422.codfw.wmnet
  • 14:25 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host kubernetes2055.codfw.wmnet
  • 14:25 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2423.codfw.wmnet
  • 14:25 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2422.codfw.wmnet
  • 14:24 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2055.codfw.wmnet
  • 14:24 hnowlan@cumin1002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host kubernetes2028.codfw.wmnet
  • 14:10 denisse: Resolve DNS queries to alert2002 - T372418
  • 14:06 denisse: Failing over to alert2002 - T372418
  • 14:03 denisse: Stopping services in the alert1001 host - T372418
  • 14:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68609 and previous config saved to /var/cache/conftool/dbconfig/20240903-140226-ladsgroup.json
  • 14:00 denisse: Disabling meta-monitoring for the alert hosts - T372418
  • 14:00 denisse: Disabling meta-monitoring for the alert hosts
  • 14:00 jgleeson: smashpig updated from e7c7d116 to e625eef2
  • 13:59 ejegg: payments-wiki upgraded from 54988ad9 to e47e61cb
  • 13:58 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:55 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:54 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:51 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:49 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 13:49 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 13:48 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P68606 and previous config saved to /var/cache/conftool/dbconfig/20240903-134719-ladsgroup.json
  • 13:45 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:41 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 13:41 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 13:37 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:34 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P68605 and previous config saved to /var/cache/conftool/dbconfig/20240903-133211-ladsgroup.json
  • 13:30 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 13:29 jayme@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 13:29 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:25 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:20 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68604 and previous config saved to /var/cache/conftool/dbconfig/20240903-131704-ladsgroup.json
  • 13:16 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:10 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 13:07 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 12:52 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:51 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:43 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:43 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2207 (T371742)', diff saved to https://phabricator.wikimedia.org/P68602 and previous config saved to /var/cache/conftool/dbconfig/20240903-122647-ladsgroup.json
  • 12:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 12:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 12:24 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:24 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:20 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 11:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T371742)', diff saved to https://phabricator.wikimedia.org/P68601 and previous config saved to /var/cache/conftool/dbconfig/20240903-114232-ladsgroup.json
  • 11:31 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-magru for 9.2.5-1wm2
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P68600 and previous config saved to /var/cache/conftool/dbconfig/20240903-112725-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P68599 and previous config saved to /var/cache/conftool/dbconfig/20240903-111218-ladsgroup.json
  • 10:57 moritzm: installing amd64-microcode security updates
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T371742)', diff saved to https://phabricator.wikimedia.org/P68598 and previous config saved to /var/cache/conftool/dbconfig/20240903-105710-ladsgroup.json
  • 10:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
  • 10:29 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:29 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:27 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
  • 10:12 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 10:11 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 10:09 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:09 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 10:08 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T371742)', diff saved to https://phabricator.wikimedia.org/P68597 and previous config saved to /var/cache/conftool/dbconfig/20240903-100713-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T371742)', diff saved to https://phabricator.wikimedia.org/P68596 and previous config saved to /var/cache/conftool/dbconfig/20240903-100651-ladsgroup.json
  • 10:02 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 10:02 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 09:56 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2001.codfw.wmnet
  • 09:56 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2001.codfw.wmnet
  • 09:55 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
  • 09:55 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
  • 09:53 cgoubert@cumin1002: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker2001.codfw.wmnet
  • 09:53 cgoubert@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 09:52 cgoubert@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 09:52 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
  • 09:52 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
  • 09:51 cgoubert@cumin1002: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker2001.codfw.wmnet
  • 09:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P68594 and previous config saved to /var/cache/conftool/dbconfig/20240903-095144-ladsgroup.json
  • 09:50 cgoubert@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
  • 09:50 cgoubert@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
  • 09:48 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 09:47 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:46 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:46 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:46 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:46 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:46 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:44 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:44 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 09:44 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:43 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:43 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:42 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:41 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:41 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:41 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:40 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:40 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:40 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P68592 and previous config saved to /var/cache/conftool/dbconfig/20240903-093637-ladsgroup.json
  • 09:36 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2066.codfw.wmnet with reason: host reimage
  • 09:33 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2066.codfw.wmnet with reason: host reimage
  • 09:24 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-magru for 9.2.5-1wm2
  • 09:21 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-text_codfw for 9.2.5-1wm2
  • 09:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T371742)', diff saved to https://phabricator.wikimedia.org/P68591 and previous config saved to /var/cache/conftool/dbconfig/20240903-092129-ladsgroup.json
  • 09:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:15 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 09:14 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2066.codfw.wmnet
  • 09:14 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2066.codfw.wmnet
  • 09:14 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2066.codfw.wmnet
  • 09:14 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2066.codfw.wmnet
  • 08:57 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:56 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:56 jnuche@deploy1003: Finished scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640 (duration: 45m 42s)
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
  • 08:45 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:45 elukey@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
  • 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet
  • 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet
  • 08:36 moritzm: remove intel-microcode 3.20240312.1~deb11u1 from apt.wikimedia.org (this was a temporary import for the last round of Bullseye reboots, now superseded by 3.20240813.1~deb11u1 from the 11.1 point release) T373795
  • 08:29 kevinbazira@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 08:28 moritzm: installing intel-microcode security updates
  • 08:26 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T371742)', diff saved to https://phabricator.wikimedia.org/P68589 and previous config saved to /var/cache/conftool/dbconfig/20240903-082639-ladsgroup.json
  • 08:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T371742)', diff saved to https://phabricator.wikimedia.org/P68588 and previous config saved to /var/cache/conftool/dbconfig/20240903-082617-ladsgroup.json
  • 08:23 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-text_codfw for 9.2.5-1wm2
  • 08:18 elukey: upgrade spicerack to 8.12.0 on cumin1002
  • 08:13 elukey: upgrade python3-nbconvert on various DE hosts for security upgrades
  • 08:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P68586 and previous config saved to /var/cache/conftool/dbconfig/20240903-081110-ladsgroup.json
  • 08:10 kevinbazira@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 08:10 jnuche@deploy1003: Started scap sync-world: testwikis to 1.43.0-wmf.21 refs T373640
  • 08:06 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_codfw for 9.2.5-1wm2
  • 07:57 moritzm: move LDAP user cn=ncreasy from cn=nda to cn=wmf
  • 07:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P68585 and previous config saved to /var/cache/conftool/dbconfig/20240903-075602-ladsgroup.json
  • 07:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T371742)', diff saved to https://phabricator.wikimedia.org/P68584 and previous config saved to /var/cache/conftool/dbconfig/20240903-074055-ladsgroup.json
  • 07:31 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 07:16 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@4ca4744] (releasing): (no justification provided) (duration: 00m 39s)
  • 07:15 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@4ca4744] (releasing): (no justification provided)
  • 07:11 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:cp-upload_codfw for 9.2.5-1wm2
  • 06:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P68581 and previous config saved to /var/cache/conftool/dbconfig/20240903-062909-ladsgroup.json
  • 06:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P68580 and previous config saved to /var/cache/conftool/dbconfig/20240903-061402-ladsgroup.json
  • 05:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T371742)', diff saved to https://phabricator.wikimedia.org/P68579 and previous config saved to /var/cache/conftool/dbconfig/20240903-055855-ladsgroup.json
  • 05:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T371742)', diff saved to https://phabricator.wikimedia.org/P68578 and previous config saved to /var/cache/conftool/dbconfig/20240903-050400-ladsgroup.json
  • 05:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T371742)', diff saved to https://phabricator.wikimedia.org/P68577 and previous config saved to /var/cache/conftool/dbconfig/20240903-050338-ladsgroup.json
  • 04:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P68576 and previous config saved to /var/cache/conftool/dbconfig/20240903-044830-ladsgroup.json
  • 04:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P68575 and previous config saved to /var/cache/conftool/dbconfig/20240903-043323-ladsgroup.json
  • 04:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T371742)', diff saved to https://phabricator.wikimedia.org/P68574 and previous config saved to /var/cache/conftool/dbconfig/20240903-041816-ladsgroup.json
  • 04:00 mwpresync@deploy1003: Pruned MediaWiki: 1.43.0-wmf.18 (duration: 00m 47s)
  • 03:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T371742)', diff saved to https://phabricator.wikimedia.org/P68573 and previous config saved to /var/cache/conftool/dbconfig/20240903-035610-ladsgroup.json
  • 03:56 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T371742)', diff saved to https://phabricator.wikimedia.org/P68572 and previous config saved to /var/cache/conftool/dbconfig/20240903-035534-ladsgroup.json
  • 03:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P68571 and previous config saved to /var/cache/conftool/dbconfig/20240903-034026-ladsgroup.json
  • 03:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P68570 and previous config saved to /var/cache/conftool/dbconfig/20240903-032519-ladsgroup.json
  • 03:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T371742)', diff saved to https://phabricator.wikimedia.org/P68569 and previous config saved to /var/cache/conftool/dbconfig/20240903-031012-ladsgroup.json
  • 02:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T371742)', diff saved to https://phabricator.wikimedia.org/P68568 and previous config saved to /var/cache/conftool/dbconfig/20240903-020730-ladsgroup.json
  • 02:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 02:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T371742)', diff saved to https://phabricator.wikimedia.org/P68567 and previous config saved to /var/cache/conftool/dbconfig/20240903-012013-ladsgroup.json
  • 01:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P68566 and previous config saved to /var/cache/conftool/dbconfig/20240903-010506-ladsgroup.json
  • 00:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P68565 and previous config saved to /var/cache/conftool/dbconfig/20240903-004959-ladsgroup.json
  • 00:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T371742)', diff saved to https://phabricator.wikimedia.org/P68564 and previous config saved to /var/cache/conftool/dbconfig/20240903-003452-ladsgroup.json

2024-09-02

  • 23:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T371742)', diff saved to https://phabricator.wikimedia.org/P68563 and previous config saved to /var/cache/conftool/dbconfig/20240902-232222-ladsgroup.json
  • 23:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 23:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 22:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 22:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 22:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T371742)', diff saved to https://phabricator.wikimedia.org/P68562 and previous config saved to /var/cache/conftool/dbconfig/20240902-223324-ladsgroup.json
  • 22:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P68561 and previous config saved to /var/cache/conftool/dbconfig/20240902-221817-ladsgroup.json
  • 22:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P68560 and previous config saved to /var/cache/conftool/dbconfig/20240902-220310-ladsgroup.json
  • 21:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T371742)', diff saved to https://phabricator.wikimedia.org/P68559 and previous config saved to /var/cache/conftool/dbconfig/20240902-214802-ladsgroup.json
  • 21:02 catrope@deploy1003: Finished scap sync-world: Backport for CodexModule: Fix double-flipping in RTL (T373676) (duration: 11m 31s)
  • 20:58 catrope@deploy1003: catrope: Continuing with sync
  • 20:55 catrope@deploy1003: catrope: Backport for CodexModule: Fix double-flipping in RTL (T373676) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T371742)', diff saved to https://phabricator.wikimedia.org/P68558 and previous config saved to /var/cache/conftool/dbconfig/20240902-205149-ladsgroup.json
  • 20:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 20:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 20:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T371742)', diff saved to https://phabricator.wikimedia.org/P68557 and previous config saved to /var/cache/conftool/dbconfig/20240902-205128-ladsgroup.json
  • 20:51 catrope@deploy1003: Started scap sync-world: Backport for CodexModule: Fix double-flipping in RTL (T373676)
  • 20:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P68556 and previous config saved to /var/cache/conftool/dbconfig/20240902-203620-ladsgroup.json
  • 20:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P68555 and previous config saved to /var/cache/conftool/dbconfig/20240902-202113-ladsgroup.json
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T371742)', diff saved to https://phabricator.wikimedia.org/P68554 and previous config saved to /var/cache/conftool/dbconfig/20240902-200606-ladsgroup.json
  • 19:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T371742)', diff saved to https://phabricator.wikimedia.org/P68553 and previous config saved to /var/cache/conftool/dbconfig/20240902-194545-ladsgroup.json
  • 19:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 19:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 19:42 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 09s)
  • 19:42 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 19:35 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 09s)
  • 19:34 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 19:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 19:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 18:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T371742)', diff saved to https://phabricator.wikimedia.org/P68552 and previous config saved to /var/cache/conftool/dbconfig/20240902-185906-ladsgroup.json
  • 18:46 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 09s)
  • 18:45 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 18:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P68551 and previous config saved to /var/cache/conftool/dbconfig/20240902-184359-ladsgroup.json
  • 18:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P68550 and previous config saved to /var/cache/conftool/dbconfig/20240902-182852-ladsgroup.json
  • 18:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T371742)', diff saved to https://phabricator.wikimedia.org/P68549 and previous config saved to /var/cache/conftool/dbconfig/20240902-181345-ladsgroup.json
  • 17:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1206.eqiad.wmnet with reason: Dumps causing issues
  • 17:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1206.eqiad.wmnet with reason: Dumps causing issues
  • 17:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T371742)', diff saved to https://phabricator.wikimedia.org/P68548 and previous config saved to /var/cache/conftool/dbconfig/20240902-175238-ladsgroup.json
  • 17:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T371742)', diff saved to https://phabricator.wikimedia.org/P68547 and previous config saved to /var/cache/conftool/dbconfig/20240902-175216-ladsgroup.json
  • 17:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P68545 and previous config saved to /var/cache/conftool/dbconfig/20240902-173709-ladsgroup.json
  • 17:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P68544 and previous config saved to /var/cache/conftool/dbconfig/20240902-172202-ladsgroup.json
  • 17:21 hnowlan: homer 'cr*codfw*' commit
  • 17:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2071.codfw.wmnet
  • 17:20 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2071.codfw.wmnet
  • 17:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2070.codfw.wmnet
  • 17:19 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2070.codfw.wmnet
  • 17:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2069.codfw.wmnet
  • 17:19 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2069.codfw.wmnet
  • 17:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2068.codfw.wmnet
  • 17:19 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2068.codfw.wmnet
  • 17:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2068.codfw.wmnet with OS bullseye
  • 17:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2070.codfw.wmnet with OS bullseye
  • 17:15 hnowlan: homer 'lsw1-a3-codfw*' commit
  • 17:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2071.codfw.wmnet with OS bullseye
  • 17:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T367856)', diff saved to https://phabricator.wikimedia.org/P68543 and previous config saved to /var/cache/conftool/dbconfig/20240902-171418-marostegui.json
  • 17:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 7:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 17:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 7:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 17:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T367856)', diff saved to https://phabricator.wikimedia.org/P68542 and previous config saved to /var/cache/conftool/dbconfig/20240902-171356-marostegui.json
  • 17:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2069.codfw.wmnet with OS bullseye
  • 17:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T371742)', diff saved to https://phabricator.wikimedia.org/P68541 and previous config saved to /var/cache/conftool/dbconfig/20240902-170654-ladsgroup.json
  • 16:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P68540 and previous config saved to /var/cache/conftool/dbconfig/20240902-165848-marostegui.json
  • 16:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T371742)', diff saved to https://phabricator.wikimedia.org/P68539 and previous config saved to /var/cache/conftool/dbconfig/20240902-164530-ladsgroup.json
  • 16:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T371742)', diff saved to https://phabricator.wikimedia.org/P68538 and previous config saved to /var/cache/conftool/dbconfig/20240902-164508-ladsgroup.json
  • 16:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P68537 and previous config saved to /var/cache/conftool/dbconfig/20240902-164343-marostegui.json
  • 16:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2071.codfw.wmnet with reason: host reimage
  • 16:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2071.codfw.wmnet with reason: host reimage
  • 16:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P68536 and previous config saved to /var/cache/conftool/dbconfig/20240902-163001-ladsgroup.json
  • 16:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T367856)', diff saved to https://phabricator.wikimedia.org/P68535 and previous config saved to /var/cache/conftool/dbconfig/20240902-162836-marostegui.json
  • 16:15 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2071
  • 16:15 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2071
  • 16:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P68534 and previous config saved to /var/cache/conftool/dbconfig/20240902-161454-ladsgroup.json
  • 16:14 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2071
  • 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2071.codfw.wmnet 52.0.192.10.in-addr.arpa 2.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:14 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2071.codfw.wmnet 52.0.192.10.in-addr.arpa 2.5.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2071 - hnowlan@cumin1002"
  • 16:14 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2071 - hnowlan@cumin1002"
  • 16:11 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2071
  • 16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2071.codfw.wmnet with OS bullseye
  • 15:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T371742)', diff saved to https://phabricator.wikimedia.org/P68533 and previous config saved to /var/cache/conftool/dbconfig/20240902-155947-ladsgroup.json
  • 15:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2068.codfw.wmnet with reason: host reimage
  • 15:51 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2068.codfw.wmnet with reason: host reimage
  • 15:51 elukey: spicerack 8.12.0 installed on cumin2002
  • 15:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2069.codfw.wmnet with reason: host reimage
  • 15:50 elukey: uploaded spicerack_8.12.0 to apt.wikimedia.org bullseye-wikimedia
  • 15:46 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2069.codfw.wmnet with reason: host reimage
  • 15:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2070.codfw.wmnet with reason: host reimage
  • 15:40 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2070.codfw.wmnet with reason: host reimage
  • 15:39 claime: Enabling puppet on O:cache::text for 1070032 - T364400
  • 15:36 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2068
  • 15:36 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2068
  • 15:19 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2070
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2068.codfw.wmnet with OS bullseye
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2069.codfw.wmnet with OS bullseye
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2070.codfw.wmnet with OS bullseye
  • 15:12 jayme: running homer 'cr*codfw*' commit 'T372878'
  • 15:11 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2067.codfw.wmnet
  • 15:11 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2067.codfw.wmnet
  • 15:11 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2066.codfw.wmnet
  • 15:11 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2066.codfw.wmnet
  • 15:09 jayme: running homer 'lsw1-a6-codfw*' commit 'T372878'
  • 15:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:04 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:04 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:04 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 15:02 claime: Enabling puppet on cp2027.codfw.wmnet to test 1070032 - T364400
  • 14:58 claime: Disabling puppet on O:cache::text to merge 1070032 - T364400
  • 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T371742)', diff saved to https://phabricator.wikimedia.org/P68531 and previous config saved to /var/cache/conftool/dbconfig/20240902-144751-ladsgroup.json
  • 14:47 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 14:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 14:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 14:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T371742)', diff saved to https://phabricator.wikimedia.org/P68530 and previous config saved to /var/cache/conftool/dbconfig/20240902-144729-ladsgroup.json
  • 14:45 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:44 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P68529 and previous config saved to /var/cache/conftool/dbconfig/20240902-143222-ladsgroup.json
  • 14:22 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) check for host wikikube-ctrl2003.codfw.wmnet
  • 14:22 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node check for host wikikube-ctrl2003.codfw.wmnet
  • 14:22 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
  • 14:22 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
  • 14:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P68528 and previous config saved to /var/cache/conftool/dbconfig/20240902-141715-ladsgroup.json
  • 14:04 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2066.codfw.wmnet with reason: host reimage
  • 14:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T371742)', diff saved to https://phabricator.wikimedia.org/P68527 and previous config saved to /var/cache/conftool/dbconfig/20240902-140208-ladsgroup.json
  • 14:00 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2066.codfw.wmnet with reason: host reimage
  • 14:00 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 10s)
  • 14:00 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 13:51 aqu@deploy1003: Finished deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow (duration: 00m 10s)
  • 13:51 aqu@deploy1003: Started deploy [airflow-dags/analytics_test@5315c8d]: Test Refine through Airflow
  • 13:43 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 13:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1162 (T371742)', diff saved to https://phabricator.wikimedia.org/P68525 and previous config saved to /var/cache/conftool/dbconfig/20240902-133950-ladsgroup.json
  • 13:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T371742)', diff saved to https://phabricator.wikimedia.org/P68524 and previous config saved to /var/cache/conftool/dbconfig/20240902-133928-ladsgroup.json
  • 13:24 TheresNoTime: done UTC afternoon backport window
  • 13:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P68523 and previous config saved to /var/cache/conftool/dbconfig/20240902-132421-ladsgroup.json
  • 13:21 samtar@deploy1003: Finished scap sync-world: Backport for IS: Add CommunityRequests to InitialiseSettings (T372527) (duration: 06m 43s)
  • 13:16 samtar@deploy1003: samtar: Continuing with sync
  • 13:16 samtar@deploy1003: samtar: Backport for IS: Add CommunityRequests to InitialiseSettings (T372527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:14 samtar@deploy1003: Started scap sync-world: Backport for IS: Add CommunityRequests to InitialiseSettings (T372527)
  • 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P68522 and previous config saved to /var/cache/conftool/dbconfig/20240902-130914-ladsgroup.json
  • 13:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Remove wgCheckUserPurgeOldClientHintsData (T359560) (duration: 08m 47s)
  • 13:06 TheresNoTime: `[samtar@mwmaint1002 ~]$ mwscript maintenance/cleanupTitles.php --wiki=ptwiki --prefix=T195546 2>&1 | tee ~/T195546-ptwiki.log` for T195546
  • 13:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
  • 13:01 dreamyjazz@deploy1003: dreamyjazz: Backport for Remove wgCheckUserPurgeOldClientHintsData (T359560) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:58 topranks: apply qos classifiers and schedulers to server interfaces on asw-d-codfw T339850
  • 12:58 dreamyjazz@deploy1003: Started scap sync-world: Backport for Remove wgCheckUserPurgeOldClientHintsData (T359560)
  • 12:56 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 12:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T371742)', diff saved to https://phabricator.wikimedia.org/P68520 and previous config saved to /var/cache/conftool/dbconfig/20240902-125406-ladsgroup.json
  • 12:51 ladsgroup@deploy1003: Finished scap sync-world: Backport for Remove the "powered by mediawiki" override (duration: 09m 05s)
  • 12:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: dump replag
  • 12:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: dump replag
  • 12:47 ladsgroup@deploy1003: ladsgroup: Continuing with sync
  • 12:44 ladsgroup@deploy1003: ladsgroup: Backport for Remove the "powered by mediawiki" override synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:42 ladsgroup@deploy1003: Started scap sync-world: Backport for Remove the "powered by mediawiki" override
  • 12:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for Enable dark mode for Creator: namespace in Commons (duration: 09m 34s)
  • 12:34 ladsgroup@deploy1003: ladsgroup, ebrahim: Continuing with sync
  • 12:32 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1009.eqiad.wmnet
  • 12:30 ladsgroup@deploy1003: ladsgroup, ebrahim: Backport for Enable dark mode for Creator: namespace in Commons synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:28 ladsgroup@deploy1003: Started scap sync-world: Backport for Enable dark mode for Creator: namespace in Commons
  • 12:28 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1009.eqiad.wmnet
  • 12:24 godog: enable oidc for prometheus public web interface - T326657
  • 12:23 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2011.codfw.wmnet
  • 12:22 samtar@deploy1003: Finished scap sync-world: Backport for Add CommunityRequests (T372527) (duration: 28m 14s)
  • 12:21 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 12:19 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2271-2277].codfw.wmnet
  • 12:19 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:19 cgoubert@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2271-2277].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 12:19 cgoubert@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2271-2277].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1002"
  • 12:17 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve2011.codfw.wmnet
  • 12:17 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2009.codfw.wmnet
  • 12:14 samtar@deploy1003: samtar: Continuing with sync
  • 12:13 samtar@deploy1003: samtar: Backport for Add CommunityRequests (T372527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:10 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve2009.codfw.wmnet
  • 12:06 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 12:05 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
  • 12:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2386 to wikikube-worker2068
  • 12:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2068
  • 12:03 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2068
  • 12:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2386 to wikikube-worker2068 - hnowlan@cumin1002"
  • 12:03 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 12:01 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2386 to wikikube-worker2068 - hnowlan@cumin1002"
  • 11:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T371742)', diff saved to https://phabricator.wikimedia.org/P68519 and previous config saved to /var/cache/conftool/dbconfig/20240902-115944-ladsgroup.json
  • 11:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 11:59 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
  • 11:58 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 11:57 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:56 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2386 to wikikube-worker2068
  • 11:55 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
  • 11:54 samtar@deploy1003: Started scap sync-world: Backport for Add CommunityRequests (T372527)
  • 11:50 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
  • 11:49 klausman@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1009.eqiad.wmnet
  • 11:43 klausman@cumin1002: START - Cookbook sre.hosts.reboot-single for host ml-serve1009.eqiad.wmnet
  • 11:43 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2067.codfw.wmnet with reason: host reimage
  • 11:41 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2067.codfw.wmnet with reason: host reimage
  • 11:38 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2386 to wikikube-worker2068
  • 11:38 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2386 to wikikube-worker2068
  • 11:37 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:36 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts mw[2271-2277].codfw.wmnet
  • 11:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2261-2262,2268-2270].codfw.wmnet
  • 11:35 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:32 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 11:32 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from mw2386 to wikikube-worker2068
  • 11:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2389 to wikikube-worker2071
  • 11:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2388 to wikikube-worker2070
  • 11:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2071
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2071
  • 11:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2389 to wikikube-worker2071 - hnowlan@cumin1002"
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2389 to wikikube-worker2071 - hnowlan@cumin1002"
  • 11:30 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2070
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2070
  • 11:29 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:29 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2388 to wikikube-worker2070 - hnowlan@cumin1002"
  • 11:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mw2386
  • 11:27 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2388 to wikikube-worker2070 - hnowlan@cumin1002"
  • 11:25 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2387 to wikikube-worker2069
  • 11:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2069
  • 11:22 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2069
  • 11:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2387 to wikikube-worker2069 - hnowlan@cumin1002"
  • 11:22 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 11:21 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host mw2386
  • 11:21 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 11:20 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2387 to wikikube-worker2069 - hnowlan@cumin1002"
  • 11:20 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 11:20 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:20 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 11:20 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 11:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2068
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2068
  • 11:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:19 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2386 to wikikube-worker2068 - hnowlan@cumin1002"
  • 11:19 cgoubert@cumin1002: START - Cookbook sre.hosts.decommission for hosts mw[2261-2262,2268-2270].codfw.wmnet
  • 11:13 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:11 claime: Manual puppet node deactivate for mw2295 mw2296 mw2377 mw2378 mw2385 - T372878
  • 11:08 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:03 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2386 to wikikube-worker2068 - hnowlan@cumin1002"
  • 11:02 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 11:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 11:02 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2389 to wikikube-worker2071
  • 11:02 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2388 to wikikube-worker2070
  • 11:01 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 11:01 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 10:59 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2387 to wikikube-worker2069
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from mw2386 to wikikube-worker2068
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2389.codfw.wmnet
  • 10:49 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2389.codfw.wmnet
  • 10:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2388.codfw.wmnet
  • 10:44 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2388.codfw.wmnet
  • 10:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2387.codfw.wmnet
  • 10:42 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2387.codfw.wmnet
  • 10:34 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:34 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:31 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:29 hnowlan@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2386.codfw.wmnet
  • 10:28 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:27 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:26 hnowlan@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host mw2386.codfw.wmnet
  • 10:25 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 10:21 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 10:15 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1008.eqiad.wmnet
  • 10:14 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1011.eqiad.wmnet
  • 10:13 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1010.eqiad.wmnet
  • 10:12 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1009.eqiad.wmnet
  • 10:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on an-worker1127.eqiad.wmnet with reason: Cold booting due to RAID controller battery issue
  • 10:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on an-worker1127.eqiad.wmnet with reason: Cold booting due to RAID controller battery issue
  • 10:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 10:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 10:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 10:06 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 10:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 10:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host stat1011.eqiad.wmnet
  • 10:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host stat1010.eqiad.wmnet
  • 10:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host stat1009.eqiad.wmnet
  • 10:04 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host stat1008.eqiad.wmnet
  • 10:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:01 isaranto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
  • 10:00 isaranto@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
  • 10:00 isaranto@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
  • 09:59 isaranto@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: sync
  • 09:55 isaranto@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
  • 09:55 isaranto@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: sync
  • 09:54 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:53 stevemunene@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 09:30 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2067
  • 09:30 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2067
  • 09:30 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2067
  • 09:30 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2067.codfw.wmnet 88.0.192.10.in-addr.arpa 8.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:30 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2067.codfw.wmnet 88.0.192.10.in-addr.arpa 8.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:30 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:30 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2067 - jayme@cumin1002"
  • 09:30 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2067 - jayme@cumin1002"
  • 09:21 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 100%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68517 and previous config saved to /var/cache/conftool/dbconfig/20240902-091844-arnaudb.json
  • 09:17 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2067
  • 09:16 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2066
  • 09:16 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2066
  • 09:16 zabe@deploy1003: Finished scap sync-world: Backport for Do not log failed autocreations on closed wikis as diagnostic errors (T373650) (duration: 06m 34s)
  • 09:14 elukey: update netboot images for Bullseye and Bookworm point releases (11.11 and 12.7) following https://wikitech.wikimedia.org/wiki/SRE/Infrastructure_Foundations/Debian-installer
  • 09:12 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2066
  • 09:12 jayme@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2066.codfw.wmnet 197.0.192.10.in-addr.arpa 7.9.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:12 jayme@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2066.codfw.wmnet 197.0.192.10.in-addr.arpa 7.9.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 09:12 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:12 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2066 - jayme@cumin1002"
  • 09:12 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2066 - jayme@cumin1002"
  • 09:11 zabe@deploy1003: zabe: Continuing with sync
  • 09:11 zabe@deploy1003: zabe: Backport for Do not log failed autocreations on closed wikis as diagnostic errors (T373650) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:09 zabe@deploy1003: Started scap sync-world: Backport for Do not log failed autocreations on closed wikis as diagnostic errors (T373650)
  • 09:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 75%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68516 and previous config saved to /var/cache/conftool/dbconfig/20240902-090339-arnaudb.json
  • 09:03 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2067.codfw.wmnet with OS bullseye
  • 09:02 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.move-vlan for host wikikube-worker2066
  • 09:02 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2066.codfw.wmnet with OS bullseye
  • 09:01 elukey: restart puppetserver on puppetserver1002 to pick up new JVM settings - T373527
  • 08:50 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2027 to wikikube-worker2067
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2067
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2008 to wikikube-worker2066
  • 08:49 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2067
  • 08:49 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:48 jayme@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2066
  • 08:48 jayme@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2066
  • 08:48 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:48 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2008 to wikikube-worker2066 - jayme@cumin1002"
  • 08:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 50%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68515 and previous config saved to /var/cache/conftool/dbconfig/20240902-084833-arnaudb.json
  • 08:48 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2008 to wikikube-worker2066 - jayme@cumin1002"
  • 08:47 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:41 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 08:41 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2027 to wikikube-worker2067
  • 08:41 jayme@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2008 to wikikube-worker2066
  • 08:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 25%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68514 and previous config saved to /var/cache/conftool/dbconfig/20240902-083328-arnaudb.json
  • 08:21 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2027.codfw.wmnet
  • 08:18 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2027.codfw.wmnet
  • 08:18 jayme@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubernetes2008.codfw.wmnet
  • 08:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 15%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68513 and previous config saved to /var/cache/conftool/dbconfig/20240902-081822-arnaudb.json
  • 08:17 jayme@cumin1002: START - Cookbook sre.k8s.pool-depool-node depool for host kubernetes2008.codfw.wmnet
  • 08:09 moritzm: installing Linux 6.1.106 on Bookworm hosts
  • 08:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 6%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68512 and previous config saved to /var/cache/conftool/dbconfig/20240902-080317-arnaudb.json
  • 07:51 Emperor: restart swift-proxy on ms-fe2012 T360913
  • 07:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 5%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68511 and previous config saved to /var/cache/conftool/dbconfig/20240902-074811-arnaudb.json
  • 07:43 kartik@deploy1003: Finished scap sync-world: Backport for Enable Section Translation in bdr, btm, and dtp Wikipedias (T371420) (duration: 07m 35s)
  • 07:39 kartik@deploy1003: kartik: Continuing with sync
  • 07:38 kartik@deploy1003: kartik: Backport for Enable Section Translation in bdr, btm, and dtp Wikipedias (T371420) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:36 kartik@deploy1003: Started scap sync-world: Backport for Enable Section Translation in bdr, btm, and dtp Wikipedias (T371420)
  • 07:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 4%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68510 and previous config saved to /var/cache/conftool/dbconfig/20240902-073306-arnaudb.json
  • 07:32 kartik@deploy1003: Finished scap sync-world: Backport for Enable EditCheck references on plwiki (T373079) (duration: 28m 41s)
  • 07:22 kartik@deploy1003: msz2001, kartik: Continuing with sync
  • 07:19 kartik@deploy1003: msz2001, kartik: Backport for Enable EditCheck references on plwiki (T373079) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 3%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68509 and previous config saved to /var/cache/conftool/dbconfig/20240902-071800-arnaudb.json
  • 07:04 kartik@deploy1003: Started scap sync-world: Backport for Enable EditCheck references on plwiki (T373079)
  • 07:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 2%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68508 and previous config saved to /var/cache/conftool/dbconfig/20240902-070255-arnaudb.json
  • 06:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 1%: post replag depool', diff saved to https://phabricator.wikimedia.org/P68507 and previous config saved to /var/cache/conftool/dbconfig/20240902-064749-arnaudb.json
  • 06:35 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: replag
  • 06:35 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on db1206.eqiad.wmnet with reason: replag
  • 06:34 arnaudb@cumin1002: dbctl commit (dc=all): 'depool db1206 replag', diff saved to https://phabricator.wikimedia.org/P68506 and previous config saved to /var/cache/conftool/dbconfig/20240902-063432-arnaudb.json

2024-09-01

  • 16:50 oblivian@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 100%: Repooling after resolving replag', diff saved to https://phabricator.wikimedia.org/P68505 and previous config saved to /var/cache/conftool/dbconfig/20240901-165009-root.json
  • 16:35 oblivian@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 75%: Repooling after resolving replag', diff saved to https://phabricator.wikimedia.org/P68504 and previous config saved to /var/cache/conftool/dbconfig/20240901-163504-root.json
  • 16:19 oblivian@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 50%: Repooling after resolving replag', diff saved to https://phabricator.wikimedia.org/P68503 and previous config saved to /var/cache/conftool/dbconfig/20240901-161958-root.json
  • 16:04 oblivian@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 25%: Repooling after resolving replag', diff saved to https://phabricator.wikimedia.org/P68502 and previous config saved to /var/cache/conftool/dbconfig/20240901-160453-root.json
  • 15:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1221.eqiad.wmnet
  • 15:50 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for db1221.eqiad.wmnet
  • 15:49 oblivian@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 10%: Repooling after resolving replag', diff saved to https://phabricator.wikimedia.org/P68501 and previous config saved to /var/cache/conftool/dbconfig/20240901-154948-root.json
  • 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1221.eqiad.wmnet with reason: replag
  • 15:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 16:00:00 on db1221.eqiad.wmnet with reason: replag
  • 15:15 oblivian@cumin1002: dbctl commit (dc=all): 'depool db1221, lag', diff saved to https://phabricator.wikimedia.org/P68499 and previous config saved to /var/cache/conftool/dbconfig/20240901-151530-oblivian.json


Archives

See Server Admin Log/Archives.