Pathfinding Meeting Week 32

Thursday we had a meeting with Lorca, Falcon, and Stinson Linden. We got some questions answered. Motor Loon had a good whine about Pathfinding, the keyword being good.

Performance

The Lindens have a couple of days of stats to look at now. The result is the average performance of the grid has not changed. This means that while most regions are NOT optimized, they are not pulling down performance.

Pathfinding User Group

The Process

Falcon pulled up the Pathfinding code and explained what the code is doing in English. What he said follows.

Here’s how it works for the record and let me grab the code for reference.

First, the navmesh is updated to account for dynamic obstacles and the time required to do so is measured. The time used is compared to the maximum which is predetermined (4ms for full regions, 1ms for homesteads). If more than that time has been used, then the navmesh update step will be skipped for enough subsequent frames that the average time per frame is reduced to 4ms or 1ms.

You can see this in the “Num Skipped Frames” (I think?) item in the AI Stats section of the server stats floater.

With an optimized region the time used by the dynamic navmesh update should be near zero and we’ve only seen it become an issue in very very few regions.
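To make the navmesh throttle concrete, here is a minimal Python sketch of the arithmetic as I understand it from Falcon's description. Only the 4ms and 1ms budgets come from the meeting. The function name and the rounding rule are my own assumptions, and this is not the simulator's actual code.

```python
import math

# Budgets quoted by Falcon: 4 ms for full regions, 1 ms for homesteads.
FULL_REGION_BUDGET_MS = 4.0
HOMESTEAD_BUDGET_MS = 1.0


def frames_to_skip(update_ms: float, budget_ms: float) -> int:
    """Hypothetical helper: how many following frames must skip the navmesh
    update so the average cost per frame falls back to the budget."""
    if update_ms <= budget_ms:
        return 0
    # One update costing update_ms, spread over (1 + skipped) frames, must
    # average at most budget_ms: update_ms / (1 + skipped) <= budget_ms.
    return math.ceil(update_ms / budget_ms) - 1
```

Under these assumptions, a 12ms update on a full region would be followed by two skipped frames, so the three frames average 4ms each.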

Next, we perform path searches for some number of characters. There are additional throttles in place for those path searches. If we went over that time, we throttle down how many we try next frame. If we were well under that time, we increase how many we’ll do next frame.

We increase the number of path searches if we aren’t doing all [searches] every frame if <1ms is being spent on them and we decrease the number if >2ms is spent on them.
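The path search throttle can be sketched the same way. The 1ms and 2ms thresholds are the ones Falcon gave. The function name, the parameters, and the step size of one search per frame are assumptions of mine for illustration, since the meeting did not cover how large each adjustment is.

```python
def next_search_count(current: int, pending: int, spent_ms: float) -> int:
    """Hypothetical helper: pick how many path searches to try next frame.

    current  -- searches attempted this frame
    pending  -- characters that wanted a search this frame
    spent_ms -- time the searches took this frame
    """
    if spent_ms > 2.0:
        # Over the upper threshold: back off, but keep at least one search.
        return max(1, current - 1)
    if spent_ms < 1.0 and current < pending:
        # Under the lower threshold and not yet serving every character:
        # ramp up, without exceeding the number of pending searches.
        return min(pending, current + 1)
    # Between the thresholds, or already doing all searches: no change.
    return current
```

The gap between the two thresholds means the count holds steady when the time spent is between 1ms and 2ms, rather than bouncing up and down every frame.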

[Next] Then we handle advancing the characters physics. If the character has spent too long on this step in the past (more than 50us) we will skip updates in the future.

To Optimize or Not

Regarding whether one optimizes or not, Falcon says, “It [optimizing] will never be worse to set objects to one of the walkable/obstacle settings. It may or may not be better depending on the circumstances.

There are a number of future ideas we have, at the Lab, for neat new things we could do someday that will rely on that same data [walkable/obstacle settings].

So, you might as well mark it [optimize] if you can.

[…] Here’s what I would advise:

  1. Look at performance on your region. Is it noticeably worse than before? If not, you should still optimize for reasons of future features that might use that data, but you can do so whenever you get around to it.
  2. If performance is worse, look at the Sim Stats floater. Is AI Time high? If so, are silhouette steps being skipped? If the answer here is yes, then you have two choices:

    a. optimize your content by setting as many objects as possible to one of the static types (walkable, obstacle, etc.) or
    b. disable dynamic pathfinding.”

Lorca Linden said, “However, (b) is not generally necessary.

Again, I want to stress that we have good metrics on average grid-wide performance, and we saw no appreciable impact on this average grid-wide performance when we released pathfinding to the Grid earlier this week.”

Navmesh

Falcon said, “Also, and I don’t think this has been said before at one of these meetings: The view of the physics world you see in the navmesh floater is an extremely accurate representation of how the physics engine sees it (with the exception of a 10cm size difference on boxes…long story for another day).”

Gotcha

Motor Loon was one of the region managers who decided to optimize his region. He found a gotcha in how the PF Tools work. Linksets that include phantom prims to reduce Land Impact are not handled properly. The navmesh/linkset list does not tell the user that it will turn off the phantom property of prims in linksets. I am fuzzy on why that happens. It seems like a bug to me. But, it apparently is a complex fix. So, we are likely to see some other fixes to help mitigate the problem before the phantom change bug gets resolved.

Also, a number of Linden Scripting Language (LSL) functions will not work once an object is set to a STATIC part of the Navmesh. Any function that moves a prim does not work when it is in a static object.

The problem comes when the object begins shouting error messages, like: Object: llSetLocalRot() does not work on objects that contribute to the Navmesh. The viewer is reporting most of these error shouting objects at position <0,0,0>, which is useless as the objects are not there. This is a definite bug.

There are problems in multiple places. When using the PF Tools to optimize, one cannot see which linksets contain scripts. Lorca is looking at having that information added to the tool.

When the linkset has a child object with a script that moves it, the error is reported as coming from an object located at <0,0,0>. Lorca, after talking with Kelly Linden, found that it is possible to change things so the root prim’s real position is reported. These changes will fix most of the problems Motor ran into and will mitigate most of the issues until a more complex fix can be implemented.

The Sort of Fix

The Lindens will improve the tools and eventually get the phantom switching issue resolved. In the meantime, if you have made changes and are trying to find objects shouting errors, there is an easier process than going object by object through the entire region.

Stinson Linden reminded us we can use a ‘halves’ process to find the item. Using this process, one can generally find a specific item out of a million items in 20 tries, worst case.

If you converted your items and got the error, revert half of them back to a dynamic/movable type. If you still get the error, you know which half it is in. Repeat until you find it. It is still tedious, but nowhere near as bad as a one-by-one.
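For anyone who wants to see why the ‘halves’ process is so fast, here is a small Python sketch of it. It assumes the list is not empty and that exactly one item is causing the error. The function name and the `errors_with` callback are hypothetical; the callback stands in for the manual step of leaving only part of your content converted and watching for error shouts.

```python
from typing import Callable, List, Sequence, Tuple


def find_offender(items: Sequence[str],
                  errors_with: Callable[[List[str]], bool]) -> Tuple[str, int]:
    """Hypothetical helper: find the one item causing errors by halving.

    errors_with(subset) should return True when the error still appears with
    only `subset` left converted to a static type.
    Returns the offending item and the number of checks used.
    """
    candidates = list(items)
    checks = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        checks += 1
        if errors_with(half):
            # The error is still there, so the offender is in this half.
            candidates = half
        else:
            # The error went away, so the offender is in the other half.
            candidates = candidates[len(half):]
    return candidates[0], checks
```

Each check halves the number of suspects, so a million items need at most 20 checks, because 2 to the 20th power is 1,048,576. If more than one object is shouting errors, the process has to be repeated for each one.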

The Question

Falcon asked a question that I think is representative of how some of these bugs make it into a release. Falcon asked, “You were on the Magnum RC for weeks, some of you on the Second Life RC PF for months. This process hasn’t changed. Why didn’t you try this during the beta and provide feedback?”

There is no doubt in my mind the question is sincere. The community had weeks to test PF and the PF Tools. But, we did not find the problem. The Lindens did not anticipate these issues because it was not in the range of actions they anticipated. I’ll explain that a bit.

The Lindens are not prolific builders. Nor do they have the incentive to worry about Land Impact cost when they do build. So, their test builds tend to focus on the things they want to test, that they ‘think’ are representative, and they also tend to conform to the Best Practices when building, keyword being ‘tend’.

Residents on the other hand have a primary interest in mitigating Land Impact costs. Conforming to Best Practices is mostly a nice idea rather than a rule. So, we tend to push the envelope on things and use hacks or whatever means will reduce Land Impact costs. Residents can get really creative.

The result is the two groups build differently. What we do often surprises the Lindens. Neither they nor anyone else can anticipate what tens of thousands of creative minds will come up with. So, the Lindens are going to miss things. Beta testing is supposed to catch those surprises. But, as I answered Falcon, “Falcon, as for why… it is just human nature to test just the things we want to do to see if they work. The idea that the phantom switch [problem] was likely to happen is not likely to come up for us.”

I doubt most users would get into testing region optimization until they had a reason to do so. Once it becomes something they want to do they start working with it. The few people that tried optimization did not run into the problem. It was not until people started optimizing everything in sight that it was found and that did not happen until after the release roll out. This is just part of how software development works when done by humans.

No one did anything ‘wrong’. People just did what people do. In hindsight it is an obvious problem and something that could have been tested and caught. As soon as humans develop perfect foresight these types of problems will stop… don’t expect that to happen any time soon.

Summary

So, a considerable portion of the SL population is spooked by the FS/PH post that regions will take an 18% performance hit from PF, if the regions are not optimized. Some people did not catch the ‘optimized’ part, just the 18% hit. Some stopped reading at that point and reacted. While such a hit is technically possible, it is unrealistic in normal scenarios.

If those region owners had bothered to do some testing of their own, they would have had some facts to base the decision on. So, the result is many region owners are depriving themselves and their residents of new features for no reason. But, they are free to do that.

6 thoughts on “Pathfinding Meeting Week 32”

  1. I shall keep saying this.

    What I saw of the distribution of the Magnum RC during the test phase was heavily biased towards water regions, and that straight away minimises building by Residents, and interactions between the Navmesh and vehicles.

    OK, the Lindens don’t build like we do. But the planning for the testing led to a biased sample. There was also the usual Linden failure to communicate.

    Putting out Havok 7 and Pathfinding at the same time might not have been the best plan. I’m not sure they can be practically separated, but a lot of the vehicle problems seem to be down to the Physics engine.

    • …and I keep telling you what you saw in regard to water based regions was not representative of what regions were in beta.

  2. “the two groups build differently”
    Never has this been clearer than with pathfinding. I could not believe it when I saw the suggestion that I should drop a cylinder around every tree in the woods! Have the Lindens never heard of prim count? Obviously this is not going to happen!
    I may, eventually, set the floors and walkways to walkable but not until I am confident that it actually works.

    • Yeah, Lindens are not as sensitive to prim count as residents. As to prims around trees… for prim and sculpty trees that would likely be a more efficient solution than making the sculpty tree static. Mesh trees with a handmade physics layer would probably use fewer prims and work better for PF.

      I’ve been playing with PF and Navmesh. I found it more efficient to add a couple of prims for walkable and one for static obstacle. It hasn’t worked quite as I expected. But, I’m still experimenting.

  3. I think the reason that content creators don’t go to the beta grid to test stuff is we’re doing as much as we can already creating content for things which already work, or repairing or replacing content broken by previous Lab changes. In the years I’ve been here, there are many things that are talked about that they *will* happen someday and maybe even work on beta grid, but work totally differently if and when they ever got to main grid. Perhaps if LL offered any incentive, even including “we will listen to your suggestions – here’s a beta testing email address” folks would be more inclined to test.

    I tested Linden Realms game in the early days and sent feedback to the LL creators. Unfortunately my HUD never showed back up again after the first detaching, and so while I could wander the Realms, nothing worked. I told them about it, was told again “it should,” and of course to not get rude and argue with people, I just never played the Realms again because the HUD never showed up again even after repeated visits.

    So perhaps they thought the problem was solved; it wasn’t – and playing “get chased by rocky monsters” wasn’t all that interesting to do it without being able to do the quests, so I just gave up on it.

    A real place to leave real posts about real problems without LL getting all defensive and upset would probably be a great starting point. I can’t believe they don’t have some kind of posting feed for us that would let them keep their finger on the game pulse. If there were a bulletin board where one could go to leave comments “this sim is down but it’s not mine so I can’t call concierge,” “teleport just crashed me,” “group chat is broken again” and other simple tweet-like comments, the Lab might have a great way to do faster fixes.

    • There are ways to provide feedback. There are places where the Lindens have their finger on the pulse of SL. Finding those is not easy. The Lindens don’t push the information because of the wackadooles that show up or try to game the pulse.

      Today a person wanted the Lindens to add Blender into the viewer so they could edit mesh in-game. I will admit that would be cool. But, I also realize the level of work it would involve and I doubt I can imagine what it would do to the support people. He could not hear the Linden answer that such an addition would take a lot of work and would therefore be a low priority falling behind other more important fixes and features. Regulars at the meeting eventually just started telling him to add the feature request to the JIRA or to build it into his own viewer.

      Because it is not possible to reason with everyone, and because the number of SL users who have no idea what is involved in programming a feature and who are rude and abusive is so high, the Lindens do not stick their heads up very far.
