Server Maintenance 2/21


  • Server Maintenance 2/21

    Originally posted by Shannon via status.skotos.net
    I'll be taking down TEC for maintenance sometime in the 11:45 am - 12:30 pm PST time frame. It'll be down for 1-2 hours. This will allow us to do some maintenance to hopefully resolve some problems that cropped up on Saturday.
    Please plan ahead.

  • #2
    Unfortunately, there was an unexpected and unlucky problem with the maintenance. I just posted this to status.skotos.net:
    Unfortunately, our shutdown and the daily backup smashed into each other in an unexpected way. I'm very sorry to say that this (again) corrupted our current backups and is requiring a reversion to that same early Sunday backup that we used yesterday. This is *not* an ongoing problem, but instead a very unlucky coincidence that was further hampered by the current issues with the TEC machine (which resulted in this morning's backup not being copied off the machine).

    On the bright side, we did discover the problem that caused issues starting Saturday (and maybe going back earlier). One of our RAID disks had collapsed. We've got about half-a-dozen backup drives on hand, but not one of the particular model used by TEC. We've thus put two on order (one to replace the one that died, one as a backup). The new one will be installed as soon as it arrives, which shouldn't require any downtime. There will be some slowdown afterward when our RAID rebuilds, but by the end of the weekend, it should be business as usual.
    Looks like everything just came up while I was typing.



    • #3
      Two final updates:

      * Bad RAID disk has been replaced. It's now rebuilding. This may cause slowdowns today, but it'll be done sometime in the next 4-8 hours.

      * The problem with backups not getting copied to our external disk has now been repaired too. The state of a mount point and a virtual disk had gotten messed up due to the Sunday crash in the middle of a backup and everything needed to be reset by hand.
