This week has been a bad one for region restarts. There was a problem with the update rolled out to the main grid. The update had to be rolled back later in the day. This means most of the upgrades running in the release channels didn’t upgrade. So, what happened?
Roll Fail
Since this release has been through QA and came from a release channel, how did it get past the testing and yet be so bad it had to be rolled back?
Oskar and Steven Linden explained at the recent Beta Server User Group and Maestro in the Deploys thread in the forum what happened. It seems a region would crash and begin its restart, which is what it is supposed to do… well… not the crash part. The new problem triggered the simulator to restart all the regions in the host. In the logs this looked like an estate manager restart of the non-crashed regions. Since no one reported problems during the test week on the release channel and the restarts didn’t register as a warning in the grid monitoring reports, no one noticed the problem. It was too small to see in the release channel.




