#SL News Week 33

Bandwidth

Bandwidth is a problem right now. Those using capped services and those whose ISPs have a strict abuse ToS are reportedly running into problems.

Server & Scripting 8/2012

The Magnum release channel seems to suffer the most. The worst case is when approaching the SW corner of a region that has 3 adjacent regions. A large draw distance aggravates the problem.

This problem is being worked on. See JIRA SVC-8124 – Excessive “ParcelOverlay reliable” messages sent by regions since last rolling restart (2012-08-08). This is not a completely pertinent item, but discussion about the problem is ongoing there and the flood of messages does affect the bandwidth used.

A more on-point JIRA is VWR-29499 – Use of Internet bandwidth exessivo [sic]. But it has been referred back to the previous item. The misspelling makes it hard to find via search.

Some are seeing the rate stick at around 2.5 Mbps.

Traffic Count

Andrew has found… or the Lindens have, which is probably more accurate… the problem, and a fix is currently rolling out. It should be in place before the end of the day. It may take a day or two for the traffic count numbers to correct.

The problem is in a back-end server that calculates dwell time. Fixing that server fixes the traffic counts, and nothing has to roll to the main channel.

There goes Darrius’ theory.

Region Crossings

Now that Havok 2012.1 is running across all channels, region crossings are supposed to be better. To some extent they are, but in general they seem to have degraded for vehicles: planes, cars, boats, whatever.

Most of the Scripting-Server Meeting was consumed by discussion of the topic.

A factor in crossings is server location and which servers adjacent regions are on. Some weeks ago the Lindens made an effort to get adjacent regions onto the same server, or onto physically adjacent servers, to reduce network travel time. That made a significant improvement in region crossing performance.

The consolidation of the three data centers into two has disrupted that organization of regions and servers. Once the consolidation is complete, another organizing pass will be made. The consolidation itself should help some too, or at least so I think.

Main Channel

The roll out completed. This is mostly a configuration-change roll out. None of the Lindens in Scripting/Server know the details of what it is. Simon Linden says, “These are just OS and low-level service settings and no hardware changes are made with this update. We’ve been doing some hardware shuffling recently to balance the colos better, but that’s not part of today’s update.”

So, we should see little if any change from today’s roll out.

Blue Steel

This channel has a pile of bug fixes that have been in the QA queue for a long time.

A few people are seeing SVC-8146 – llRezAtRoot() does not set correct parameters (for sale) on rezzed object in Second Life RC Blue Steel 12.08.03.263047. Kelly Linden said, “I have a fix for that one. It is a rare corner case though: it requires that the last person to rez the object in world be different than the owner of the rezzer.”

I haven’t run into it.
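To put the corner case in context, here is a minimal rezzer sketch; the inventory handling and offset are my own illustration, not anything from the JIRA. Per Kelly’s description, the bad for-sale parameters would only show up on the rezzed copy when the last person to rez that object in world is not the rezzer’s owner.

    // Minimal rezzer sketch (illustrative only).
    // Rezzes the first object in this prim's inventory one meter above the prim.
    default
    {
        touch_start(integer num_detected)
        {
            string item = llGetInventoryName(INVENTORY_OBJECT, 0);
            if (item != "")
            {
                // Position, zero velocity, default rotation, start parameter 0.
                // SVC-8146: the for-sale settings on the rezzed copy can come out
                // wrong when the last person to rez this object in world was not
                // the rezzer's owner.
                llRezAtRoot(item, llGetPos() + <0.0, 0.0, 1.0>, ZERO_VECTOR, ZERO_ROTATION, 0);
            }
        }
    }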

HTTP

Kelly announced two new features that were not in the initial release notes for Blue Steel:

There are a couple of features on Blue Steel that managed to miss the release notes. llHTTPRequest now accepts a new metadata option HTTP_CUSTOM_HEADER to set custom headers on outbound requests.

I’ll see about getting that into release notes and the wiki this week.

Also if a remote server returns an application/json response to an llHTTPRequest, the script will accept it instead of [throwing an] error.
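As a rough sketch of how the two features fit together, assuming HTTP_CUSTOM_HEADER takes a header name and value as the next two list entries; since the release notes have not been updated yet, treat that as my reading of Kelly’s description. The URL and header name below are placeholders.

    // Sketch: outbound llHTTPRequest with a custom header, plus a handler that
    // reads an application/json reply. URL and header name are placeholders.
    key gRequest;

    default
    {
        touch_start(integer num_detected)
        {
            gRequest = llHTTPRequest(
                "https://example.com/api/status",
                [HTTP_METHOD, "GET",
                 HTTP_CUSTOM_HEADER, "X-Example-Token", "abc123"],
                "");
        }

        http_response(key request_id, integer status, list metadata, string body)
        {
            if (request_id != gRequest) return;
            // With the Blue Steel change, an application/json body should arrive
            // here as a normal string instead of triggering an error.
            llOwnerSay("HTTP " + (string)status + ": " + body);
        }
    }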

Group Edit

Baker updated us on the Group Edit problems for large groups:

I moved my group management code completely to WSGI, which will use HTTP to serve the data. This will mean no legacy viewer support for the new version.

On the plus side, things are running splendidly, and I’ll be getting it hooked up to the back-end via a cap later today. Then I’ll be getting the viewer updated with the new data format.

He is hoping to see it move into QA/RC in 2 to 3 weeks. But there is also a viewer component, and it is unpredictable when we will see it hit viewer QA.

Summary

I’m pressed for time, so I’ve only touched on the main points. More later.

7 thoughts on “#SL News Week 33”

  1. I have noticed the Bandwidth issue in lightly loaded main channel sims surrounded by like sims. This needs to be fixed SOON as it has (for some, tho luckily not me – yet anyway) Real World implications (depending on ISP) … once the lawyers get a hold of it, it could get ugly (the utter silence from the Lab and lack of a roll back are heading into negligence territory).

    It’s a little worrying that none of the Lindens in the Server or Script groups (if I read you correctly) knows what’s going on with the servers – consolidation, colocation, OS changes, etc. Perhaps the script ppl may not be totally up on this, but you’d think the server ppl would be.

    I still think the root cause is too many changes; the system barely stabilises before they are on to the next big thing… and of course QA suffers. It’s just impossible to properly test with 4 releases a week (who does that anyway?).

    • The lawyers are a minor consideration. The losses are not enough to pay for their time.

      That the Lindens working on parts of the server code and writing the scripting support code don’t know what is happening with the consolidation is not surprising. I’ve asked Lindens and they have said there is an Operations Team. The OT will handle hardware replacement, consolidation, OS updates, and probably run the actual server software roll outs. They work tightly with the Development Lindens we deal with. It has to be a cooperative process. But neither side needs to know the gory details of what the other is doing, and both are likely too busy to pay that much attention in any event. Programmers do NOT necessarily know how to build a computer. Nor does a hardware jockey have to know how to program.

      The pace of development… the community will never agree on the pace of development. The Lab used to move much slower. It seemed everyone bitched. We had feature requests on the books for years. Fixes took forever. Now people complain it is going too fast.

      I prefer fast. Because fast or slow the system has problems. It always will. A million lines of code is going to have problems and mistakes. It is life in the computer age. At least in a fast paced development environment things get fixed quickly. …and quickly is relative.

      • Past experience of things such as OS changes suggests that OT people just don’t talk to anyone. If they announce they have started region restarts, they don’t announce when they finish. I don’t recall seeing anything about the data center consolidation. We’re just left with something that looks like a slowly collapsing grid.

        • They post each time a roll out completes. They post each time they run a maintenance process that affects SL. OT talks with the other teams in the Lab.

          How many people do you think care if a network switch or router is changed out? I think the OT’s news would generally be tedious and boring. I think the public facing Lindens give us the highlights that have any ‘interest’ value… and they too aren’t that interested. I suspect most users do not care if SL runs on Linux, Amiga, NeXTSTEP or Inferno.

  2. I remember some time ago there was talk of improving the ability to cross into diagonal regions. I wonder if the bandwidth issue and the kitty-corner regions issue don’t both have to do with an attempt along those lines.

    • That is a possibility. But those things were done some time ago without creating a problem…

  3. Pingback: Bandwidth Problem Update
