Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This PR solves a crash with Robert's bodies logged on your bugtracker. #3095

Merged
merged 15 commits into from
Sep 9, 2021

Conversation

bosvensson1
Copy link
Contributor

Hello,

Firstly my apologies for using this, as I understand deprecated github repository. I use Open MW as a means of slacking off at work and sadly, the gitlab site is blocked on our company network.

This PR solves the crash with Robert's bodies logged on your bugtracker.

Explanation: the code assumes that the node to attach is not shared (has at most one parent), but when a non-skinned trishape is used in a place where we expect skinned trishapes, this assumption breaks. Because in SceneUtil::CopyOp, we do not clone non-skinned trishapes. I assume this is intentional to conserve memory.

My solution puts SceneUtil::attach in charge of cloning exactly the node to attach, so we can be sure the node is not shared.

Also this reduces some redundant cloning that was going on before. We were cloning the entire skeleton for each skinned body part only to discard most of it. Now we are cloning only the nodes belonging to the bodypart.

How to test this PR:
[ ] Ensure Roberts bodies doesn't crash.
[ ] Check if the count of Drawables and Nodes in the scene remains roughly the same as before. (I hope I didn't add any unnecessary cloning)

@bosvensson1
Copy link
Contributor Author

The ci fails with an error 404, while succeeding in master. How is this possible? Is it somehow my fault?

@Capostrophic
Copy link
Collaborator

Unfortunately you'll just have to find a way to submit a GitLab MR as GitHub repo CI doesn't quite get the full unit test and platform coverage and really, maintaining the GitHub CI infrastructure just isn't worth it.

And you should give your commits descriptive names.

As for the changes, partially breaking instance preloading and doing a const_cast to properly process skinned body parts seems evil.

@bosvensson1
Copy link
Contributor Author

Thank you for the hints. I really appreciate a review from a more experienced developer as I am not too familiar with all of this.

And you should give your commits descriptive names.

The description is in the PR description. I was under the impression this is the description that will be assigned to the commit once merged. Is this not the case?

If I need another description for the commit, what is the difference between the PR description and commit description?

As for the changes, partially breaking instance preloading and doing a const_cast to properly process skinned body parts seems evil.

The const_cast approach I copied from another place working with getTemplate. It is necessary because osg provides no const version of the NodeVisitor class. vsg provides if you would like to upgrade.

Instance preloading was mostly unused even before this PR. It was turned off for non-animated objects, probably because we hardly have to clone anything for them (As described in the PR, non-skinned drawables are not cloned). For animated objects, the preloader did prepare instances, but these were mostly not used. Animation::setObjectRoot strips the instance of some unneeded nodes and stores the result into its own cache independent of the SceneManager's instance cache.
The only remaining use for instance preloading were the skinned body parts. This PR should reduce the work required for them by about an order of magnitude, now that we only instance the bodypart in question. In any case, the preloading was incomplete because compileGlObjects for RigGeoemetry is unimplemented. I would be happy to work on this in another PR if you like, but I cannot promise it will work out as I am not familiar with Open MW custom osg drawables yet.

In summary, the preloader instancing was mostly redundant. The non-redundant portion of it is not as necessary anymore.

@AnyOldName3
Copy link
Member

The description is in the PR description. I was under the impression this is the description that will be assigned to the commit once merged. Is this not the case?

We get a merge commit, whose name will be something like Merge #3095 This PR solves a crash with Robert's bodies logged on your bugtracker plus all the other commits you've got on this branch.

@AnyOldName3
Copy link
Member

It's possible to squash at our end, but then we've only got one commit, rather than the commit plus the merge commit that adds it. If you squash the commits yourself, then that's preferable.

You might find that you can do more via GitHub's rich editor at https://github.dev/OpenMW/openmw

@bosvensson1
Copy link
Contributor Author

You might find that you can do more via GitHub's rich editor at https://github.dev/OpenMW/openmw

How amazing is that. I was totally unaware. Will be using this next time for sure. There is still no squash capability unfortunately so we are left with the present mess and I will try to do better next time.

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

Hello @bosvensson1 , sadly enough github has been left to languish a bit because we focus on GL; but if there are people willing to put time in to get github pipelines up and running so we don't have to rely on travis or appveyor, that would work too.

I can try to open a MR on GL. Would this work for the time being?

@bosvensson1
Copy link
Contributor Author

Absolutely @psi29a, I appreciate. Is it possible to keep the committer intact in the process of uploading to another platform? I want to build a portfolio for myself in the hopes of applying to a better job in the future. Otherwise, I will be content with a "credit to Bo Svensson" in the title. Hopefully there are no gross compilation errors as I've had to manually copy across the changes in the webedit like an idiot. Thanks a lot.

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

Sure, let's see what https://gitlab.com/OpenMW/openmw/-/merge_requests/1196 says too :)

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

You can click on the ci/gitlab/etc. Details for more info, but so far we have...

../../components/sceneutil/attach.cpp:115:17: error: no matching function for call to 'mergeUserData'
                mergeUserData(toAttach->getUserDataContainer(), newHandle);
                ^~~~~~~~~~~~~
../../components/sceneutil/attach.cpp:89:10: note: candidate function not viable: 1st argument ('const osg::UserDataContainer *') would lose const qualifier
    void mergeUserData(osg::UserDataContainer* source, osg::Object* target)
         ^
../../components/sceneutil/attach.cpp:121:25: error: no matching member function for call to 'setUserDataContainer'
                handle->setUserDataContainer(toAttach->getUserDataContainer());
                ~~~~~~~~^~~~~~~~~~~~~~~~~~~~
/usr/include/osg/Object:238:14: note: candidate function not viable: 1st argument ('const osg::UserDataContainer *') would lose const qualifier
        void setUserDataContainer(osg::UserDataContainer* udc);
             ^
/usr/include/osg/Object:240:32: note: candidate template ignored: could not match 'ref_ptr<type-parameter-0-0>' against 'const osg::UserDataContainer *'
        template<class T> void setUserDataContainer(const ref_ptr<T>& udc) { setUserDataContainer(udc.get()); }
                               ^
In file included from ../../components/sceneutil/attach.cpp:1:
In file included from ../../components/sceneutil/attach.hpp:6:
/usr/include/osg/ref_ptr:39:67: error: cannot initialize a member subobject of type 'osg::Node *' with an lvalue of type 'const osg::Node *const'
        template<class Other> ref_ptr(const ref_ptr<Other>& rp) : _ptr(rp._ptr) { if (_ptr) _ptr->ref(); }
                                                                  ^    ~~~~~~~
../../components/sceneutil/attach.cpp:179:24: note: in instantiation of function template specialization 'osg::ref_ptr<osg::Node>::ref_ptr<const osg::Node>' requested here
                return toAttach;
                       ^
3 errors generated.

@bosvensson1
Copy link
Contributor Author

The android compiler in particular is being more strict with const correctness. I am grateful for the pointer as it pointed to a legitimate mistake. Should be fixed now.

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

I put a lot of effort into getting OpenMW part of the FOSS program at gitlab and they are fully committed in helping us out. When we reached out about a limit, that replied that we are part of their top tier. They have been very good to us and we have provided a lot of feedback to their beta instances/pipelines (like Mac and Windows) runners.

As @AnyOldName3 has pointed out elsewhere, it's thanks to the different types of pipelines for windows (for example) we were able to track down a nasty cmake issue that would have gone unnoticed. So we've had contact with upstream cmake (and other libraries we use) to point out those issues.

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

Seems like GL pipelines are happy, there was a problem with file rename because of an errant / though. Does not effect this PR though.

@psi29a
Copy link
Member

psi29a commented Sep 6, 2021

Now we just need people to test the artefacts, and you need to add a changelog entry please and include yourself as a list of contributors. :)

@AnyOldName3
Copy link
Member

Half of the Windows jobs are because we can't build everything within the one-hour time limit. It was taking an average of 62 minutes to build all our applications, so I had to split the build. Now we spend about forty minutes on duplicated work and ten to fifteen on job-specific stuff. If the free agents were faster, we could do more with each of them.

@psi29a
Copy link
Member

psi29a commented Sep 9, 2021

Well, if you plan on sticking around we can get this into main for further testing. This not only resolves the issue but apparently also conserves memory usage while at it. Cheers!

@psi29a psi29a merged commit 147ed39 into OpenMW:master Sep 9, 2021
@elsid
Copy link
Collaborator

elsid commented Sep 9, 2021

There is a warning https://gitlab.com/OpenMW/openmw/-/jobs/1578121095:

../../../apps/openmw/mwworld/cellpreloader.cpp: In member function 'virtual void MWWorld::PreloadItem::doWork()':
../../../apps/openmw/mwworld/cellpreloader.cpp:98:26: error: variable 'animated' set but not used [-Werror=unused-but-set-variable]
   98 |                     bool animated = false;
      |                          ^~~~~~~~

@elsid
Copy link
Collaborator

elsid commented Sep 9, 2021

This PR causes a crash:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==31504==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x5634c6cc46c7 bp 0x7ffce6597470 sp 0x7ffce6597370 T0)
==31504==The signal is caused by a READ memory access.
==31504==Hint: address points to the zero page.
    #0 0x5634c6cc46c7 in SceneUtil::mergeUserData(osg::UserDataContainer const*, osg::Object*) /home/elsid/dev/openmw/components/sceneutil/attach.cpp:95
    #1 0x5634c6cc54df in SceneUtil::attach(osg::ref_ptr<osg::Node const>, osg::Node*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, osg::Group*) /home/elsid/dev/openmw/components/sceneutil/attach.cpp:115
    #2 0x5634c4f77a78 in MWRender::NpcAnimation::insertBoundedPart(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, osg::Vec4f*) /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:724
    #3 0x5634c4f8645f in MWRender::NpcAnimation::addOrReplaceIndividualPart(ESM::PartReferenceType, int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, osg::Vec4f*) /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:829
    #4 0x5634c4f89897 in MWRender::NpcAnimation::addPartGroup(int, int, std::vector<ESM::PartReference, std::allocator<ESM::PartReference> > const&, bool, osg::Vec4f*) /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:938
    #5 0x5634c4f8bad0 in MWRender::NpcAnimation::updateParts() /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:639
    #6 0x5634c4f901dd in MWRender::NpcAnimation::updateNpcBase() /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:523
    #7 0x5634c4f9389a in MWRender::NpcAnimation::NpcAnimation(MWWorld::Ptr const&, osg::ref_ptr<osg::Group>, Resource::ResourceSystem*, bool, MWRender::NpcAnimation::ViewMode, float) /home/elsid/dev/openmw/apps/openmw/mwrender/npcanimation.cpp:353
    #8 0x5634c5027f53 in MWRender::CharacterPreview::rebuild() /home/elsid/dev/openmw/apps/openmw/mwrender/characterpreview.cpp:350
    #9 0x5634c546e398 in MWGui::InventoryWindow::InventoryWindow(MWGui::DragAndDrop*, osg::Group*, Resource::ResourceSystem*) /home/elsid/dev/openmw/apps/openmw/mwgui/inventorywindow.cpp:74
    #10 0x5634c52c13ee in MWGui::WindowManager::initUI() /home/elsid/dev/openmw/apps/openmw/mwgui/windowmanagerimp.cpp:318
    #11 0x5634c69d8e47 in OMW::Engine::prepareEngine(Settings::Manager&) /home/elsid/dev/openmw/apps/openmw/engine.cpp:793
    #12 0x5634c69dc35e in OMW::Engine::go() /home/elsid/dev/openmw/apps/openmw/engine.cpp:949
    #13 0x5634c698c93d in runApplication(int, char**) /home/elsid/dev/openmw/apps/openmw/main.cpp:316
    #14 0x5634c71414ae in wrapApplication(int (*)(int, char**), int, char**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/elsid/dev/openmw/components/debug/debugging.cpp:205
    #15 0x5634c6983052 in main /home/elsid/dev/openmw/apps/openmw/main.cpp:328
    #16 0x7f61201d9b24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #17 0x5634c4e6b3ed in _start (/home/elsid/dev/openmw/build/gcc/asan/openmw+0x6e63ed)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/elsid/dev/openmw/components/sceneutil/attach.cpp:95 in SceneUtil::mergeUserData(osg::UserDataContainer const*, osg::Object*)
==31504==ABORTING

@psi29a
Copy link
Member

psi29a commented Sep 9, 2021

Now this is interesting, you awake @bosvensson1 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants