skip to main content
Unreal Journal

The Consequence of the Missing References

Missing

Intro

It's been quite a while now since we last used PVS-Studio, an excellent static code analysis tool, on UE4. So we fired it up - not expecting much, to be honest, as the last time we ran the tool, UE4 was looking pretty clean. With a mixture of pulling across our recommended changes and making fixes themselves, the engine was looking much more stable.

Years have passed since then, though, new systems have been added, code has been overhauled, changes have been made across the board. So we were keen to see whether any problems could be found.

This blog post is here.. and we're not about to end it here.. so let's roll on. In this first new look, we're investigating some performance optimizations identified for Sequencer.

PVS Issue V822: Object is Created Where a Reference is Expected

"The analyzer has detected an issue where a new object is created instead of a reference to an already existing object. The creation of an unnecessary object takes up some amount of time and memory."

This is a common problem and one that we've seen before. When a function returns a reference to an object, the callee can either use the reference directly or use it to make a copy of an object - this latter option often being much more expensive and, usually, not necessary.

The first case that we'll look at is in SequencerSectionCategoryNode.cpp. Take a look at the following code:-

FSlateFontInfo FSequencerSectionCategoryNode::GetDisplayNameFont() const
{
	bool bAllAnimated = false;
	for (TSharedRef<FSequencerDisplayNode> ChildNode : GetChildNodes())
	{
		if (ChildNode->GetType() == ESequencerNode::KeyArea)
		{
			const FSequencerSectionKeyAreaNode KeyAreaNode = static_cast<const FSequencerSectionKeyAreaNode&>(ChildNode.Get());
			for (const TSharedRef<IKeyArea>& KeyArea : KeyAreaNode.GetAllKeyAreas())
			{
				FMovieSceneChannel* Channel = KeyArea->ResolveChannel();
				if (!Channel || Channel->GetNumKeys() == 0)
				{
					return FSequencerDisplayNode::GetDisplayNameFont();
				}
				else
				{
					bAllAnimated = true;
				}
			}
		}
	}

Note the highlighted line. We're taking a reference to a FSequencerSectionKeyAreaNode and storing it as a new object, KeyAreaNode. Looking at the following code you can easily see that only a reference is needed. So we change the line to:-

const FSequencerSectionKeyAreaNode& KeyAreaNode = static_cast<const FSequencerSectionKeyAreaNode&>(ChildNode.Get());

Believe it or not but this single character change, on my system at least, reduced that function from 2,567 bytes of machine code to 567 - a saving of around 78%. It also dropped some calls to Realloc() and Free() - so this is a significant improvement for sure. What performance improvement this might give would depend I guess on what's visible in Sequencer.

Next up, let's look at this block of code from SequencerNodeTree.cpp:-

case ESequencerNode::KeyArea:
{
	const FSequencerSectionKeyAreaNode KeyAreaNode = static_cast<const FSequencerSectionKeyAreaNode&>(StartNode.Get());
	for (const TSharedRef<IKeyArea>& KeyArea : KeyAreaNode.GetAllKeyAreas())
	{
		FMovieSceneChannel* Channel = KeyArea->ResolveChannel();
		if (Channel)
		{
			if (Filters->Num() == 0 || Filters->PassesAnyFilters(Channel))
			{
				bPassedAnyFilters = true;
				break;
			}
		}
	}

As in the last case, we could be using a reference here. So we can go ahead and change the line to:-

const FSequencerSectionKeyAreaNode& KeyAreaNode = static_cast<const FSequencerSectionKeyAreaNode&>(StartNode.Get());

While the output code this time is only marginally smaller, it means we're not calling the contructor for FSequencerSectionKeyAreaNode, which in turn isn't copying the KeyAreas TArray, which presumably could be quite large:-

TArray<TSharedRef<IKeyArea>> KeyAreas;

The third and final finding in this theme for today is also in SequencerNodeTree.cpp around line 1140:-

case ESequencerNode::Object:
{
	bIsTrackOrObjectBinding = true;

	const FSequencerObjectBindingNode ObjectNode = static_cast<const FSequencerObjectBindingNode&>(StartNode.Get());
	for (TWeakObjectPtr<>& Object : Sequencer.FindObjectsInCurrentSequence(ObjectNode.GetObjectBinding()))
	{
		if (Object.IsValid() && (Filters->Num() == 0 || Filters->PassesAnyFilters(Object.Get(), StartNode->GetDisplayName()))
			&& LevelTrackFilter->PassesFilter(Object.Get()))
		{
			bPassedAnyFilters = true;
			break;
		}
	}

Exactly the same problem again... so let's go ahead and change that to:-

const FSequencerObjectBindingNode& ObjectNode = static_cast<const FSequencerObjectBindingNode&>(StartNode.Get());

This time let me show you a little of the machine code that's generated from this C++, for Win64 platforms, just so that you can clearly see the difference.

Here's how the code looked before adding the reference:-

const FSequencerObjectBindingNode ObjectNode = static_cast<const FSequencerObjectBindingNode&>(StartNode.Get());

00007FF854331E01  mov         rbx,qword ptr [r15]  
00007FF854331E04  mov         rdx,rbx  
00007FF854331E07  lea         rcx,[rbp+260h]  
00007FF854331E0E  call        FSequencerDisplayNode::FSequencerDisplayNode (07FF8542E3CC0h)  
00007FF854331E13  lea         rax,[FSequencerObjectBindingNode::`vftable' (07FF8545E0118h)]  
00007FF854331E1A  mov         qword ptr [rbp+260h],rax  
00007FF854331E21  movups      xmm0,xmmword ptr [rbx+80h]  
00007FF854331E28  movaps      xmmword ptr [rbp+2E0h],xmm0  
00007FF854331E2F  mov         eax,dword ptr [rbx+90h]  
00007FF854331E35  mov         dword ptr [rbp+2F0h],eax  

And here's how it looked after - you might notice that it's a little bit more compact ;-)

const FSequencerObjectBindingNode& ObjectNode = static_cast<const FSequencerObjectBindingNode&>(StartNode.Get());

00007FF854461F67  mov         r15,qword ptr [rsi]

Summary

The above problems wouldn't be easy to find without the right tools. On this occasion I used PVS-Studio and looked out for some of the standout warnings. Good memory tracking tools can also help with this when the problems are happening in performance-critical code.

The downside to static analysis is hinted at by that first word - static. The analyser looks at the entire codebase and will find problems in code that perhaps is never even called .. or find optimizations in code that's called 2 or 3 times per hour. That shouldn't stop you from using one of these, though - often they'll find a diamond in the rough. We've seen many, many before.

Now go find your own.

Feel free to check out PVS Studio here: www.viva64.com/en/pvs-studio/ .. check out their blog too for some excellent nuggets that they've found in various bits of software (commercial and free).

Credit(s): Robert Troughton (Coconut Lizard)
Help: Josef Gluyas, Gareth Martin (Coconut Lizard)
Status: Fixed in UE4.27
Github Pullhttps://github.com/EpicGames/UnrealEngine/pull/7794

Facebook Messenger Twitter Pinterest Whatsapp Email
Go to Top