Slow node renaming


#1

While working on this thread I noticed a weird performance issue that never noticed before. This happens from Max 2016 to Max 2024.

Why is the performance much better when renaming the nodes after the whole creation?

(
	delete objects
	gc()
	
	st=timestamp(); sh=heapfree
	for j = 1 to 5000 do
	(
		a = box()
		a.name = "b"
	)
	format "time:%ms heap:%\n" (timestamp()-st) (sh-heapfree)
	
	delete objects
	gc()
	
	st=timestamp(); sh=heapfree
	for j = 1 to 5000 do box name:"b"
	format "time:%ms heap:%\n" (timestamp()-st) (sh-heapfree)
	
	delete objects
	gc()
	
	st=timestamp(); sh=heapfree
	for j = 1 to 5000 do box()
	for j in objects do  j.name = "b"
	format "time:%ms heap:%\n" (timestamp()-st) (sh-heapfree)
	
	delete objects
	gc()
)

Max 2016
time:2003ms heap:1354600L
time:2172ms heap:5395528L
time:486ms heap:1354600L

Max 2024
time:2418ms heap:1146616L
time:2776ms heap:5326348L
time:601ms heap:1146584L


#2

add one more test:

(	
	st=timestamp(); sh=heapfree
	for j = 1 to 5000 do box name:(uniquename #box)
	format "time:%ms heap:%\n" (timestamp()-st) (sh-heapfree)
	
	delete objects
	gc()
)

Here’s what happens…

When you create an instance with a name, the system looks for the #unique name and puts that name in the hash table for a quick search next time.
When you don’t specify a name, the system still looks for a unique name, but uses the #default base for that class.
When you apply a name to a node using direct assignment (node.name = #name), the system doesn’t look for a unique name, it just directly adds that name to the hash table.

In my example, I #force to do what the system does by default when no base name is specified.

Does this make sense?

Why the earlier version is faster than the latter, I don’t know, but for some reason I’m not surprised. :wink:


#3

So if you want to name the newly created nodes faster, but give them a #unique name, make up the names yourself and assign them directly.


#4

If it works as you’ve described, then it is a terrible design flaw.

Why would the system look for a unique name when I am already providing one and the system won’t rename it?

But the issue I mention is different, regardless of the name being unique or not. In fact, I am using the same fixed name for all the examples.

If you rename the node just after it was created (inside the same loop), it is slow.

for j = 1 to 5000 do
(
	a = box()
	a.name = "b"
)
time:2003ms

If you create all the nodes, and later, in another loop, you rename all of them, it is 4-5 times faster.

for j = 1 to 5000 do box()
for j in objects do  j.name = "b"

time:486ms

#5

Here are other examples showing the same issue. In this case, the names will be unique (over 6X slower):

for j = 1 to 5000 do
(
	a = box()
	a.name = j as string
)
time:4169ms
for j = 1 to 5000 do box()
for j in objects do  j.name = j as string

time:633ms

Some more using UniqueName

for j = 1 to 5000 do
(
	a = box()
	a.name = uniquename #cube
)
time:4230ms
for j = 1 to 5000 do box()
for j in objects do  j.name = uniquename #cube

time:484ms

Additionally, using the name of the primitive being created as prefix, seems to be much faster than using any other name.:thinking:

for j = 1 to 5000 do box name:(uniquename #cube)
time:4376ms
for j = 1 to 5000 do box name:(uniquename #box)
time:662ms

#6
with redraw off, undo off
(
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box()
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)

	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box()
			node.name = uniquename #box
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)
		
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box name:(uniquename #box)
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)
				
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box name:(uniquename #fox)
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)
						
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box name:#fox
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)
				
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to 1000 do
		(
			node = box()
		)
		for k=1 to 1000 do
		(
			objects[k].name = uniquename #fox
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)

	gc()
)

I have added several combinations to this snippet. Any other?

We also have to take into account the fact of undo… When we separate creation and naming into two actions, we add an “undo” record for naming. This takes some time.


#7

this problem has been bothering me for many years, and if you remember, I once brought it up on this forum…

things get even worse when we create an even larger number of objects (+50000 for example).
In this case, some other additional factors will come into play.


#8

how the “unique name” mechanics work…

The system doesn’t go through all the objects every time to find a new unique name. It puts the #base of the name in some hash table, and every time a new node is created, it updates the highest used index for this #base.
The name #base is the leading part of the name, which is the name trimmed from the digits on the right.
The highest index is the digit part as an integer number. And the pattern of the digit part doesn’t matter.


#9

Yes, these two, where we don’t use uniquename. And with 5000 iterations it’s more noticiable.

	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to iterations do
		(
			node = box()
			node.name = k as string
		)
		
		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)
	
	(
		delete objects
		gc()

		t0 = timestamp()
		h0 = heapfree

		for k=1 to iterations do
		(
			node = box()
		)
		for k=1 to iterations do
		(
			objects[k].name = k as string
		)

		format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
	)

#10

But my point is not about uniquename, it is just about naming a node. Even with Undo off, the difference is 4-5 times, so I don’t understand why naming them inside the same loop is so bad compared with naming them after the creation loop.


#11

I’ve compared dissasemblies of InterfaceImp::CreateObjectNode(InterfaceImp *this, struct Object *a2) with InterfaceImp::CreateObjectNode(InterfaceImp *this, struct Object *a2, const wchar_t *a3) and the second one is waaay larger method.
It could be that mxs interpreter is smart enough to combine some instructions so if the name is provided it calls the second method


#12

Thank you @Serejah, that may explain it.

However, I don’t understand why it works that way, but there must be some good reasons.

To me, it would make more sense if it worked like this:

(
	-- FASTEST
	for j = 1 to 5000 do
	(
		node = box name:#any
	)
	
	-- SLOWER
	for j = 1 to 5000 do
	(
		node = box()
		node.name = #any
	)
	
	-- SLOWEST
	nodes = for j = 1 to 5000 collect box()
	for node in nodes do node.name = #any
)

But it works all the contrary:

time:2179 heap:5047384L
time:1987 heap:1007736L
time:483 heap:1007744L

#13

in max 10 the following…

iterations = 1000

(
	delete objects
	gc()

	t0 = timestamp()
	h0 = heapfree

	for k=1 to iterations do
	(
		node = box()
	)
	
	format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
)

(
	delete objects
	gc()

	t0 = timestamp()
	h0 = heapfree

	for k=1 to iterations do
	(
		node = box()
		node.name = k as string
	)
	
	format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
)

(
	delete objects
	gc()

	t0 = timestamp()
	h0 = heapfree

	for k=1 to iterations do
	(
		node = box()
	)
	for k=1 to iterations do
	(
		objects[k].name = k as string
	)

	format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
)

gives this result

time: 356	 heap: 137712L
OK
time: 111	 heap: 304952L
OK
time: 368	 heap: 304944L
OK

#14
(
	delete objects
	gc()

	t0 = timestamp()
	h0 = heapfree

	for k=1 to iterations do
	(
		node = box()
		str = node.name
		node.name = str	
	)
	
	format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
)

this gives

time: 159 heap: 224464L

in max 20

which is on par with the 1st

also if you replace k as string with “a” speeds it up too

which suggest there’s some kind look up going on, the new name is not found so is added (not at the naming stage but at the creation node = box() so when you set the new name the loop goes to create a new box but it now has to start from scratch finding the next best name. ?


#15

Ha, Max 2010, what a lovely version (in those days).

Those results make a lot more sense to me.

  1. Create a node with no name. The system has to create a unique name, an so it takes longer.
  2. You provide a name, so there is no need to search for a unique name and it is faster.
  3. You assign a name in a second loop, and so it is slower.

All good, from my point of view, in Max 2010. :+1:


#16

my numbers on 2020

1000
time: 65	 heap: 156128L
OK
time: 173	 heap: 348128L
OK
time: 83	 heap: 348128L
OK

… and 2023

1000
time: 58	 heap: 156128L
OK
time: 163	 heap: 348128L
OK
time: 70	 heap: 348128L
OK

#17
iterations = 1000
delete objects

plugin simpleObject XBox
	name:""
	classID:#(0x02feb12, 0x54374)
	category:"TESTS"
	extends:Box
(
)

gc()

(
	delete objects
	gc()

	t0 = timestamp()
	h0 = heapfree
	for k = 1 to iterations do
	(
		node = xbox()
	)
	
	format "time: %\t heap: %\n" (timestamp() - t0) (h0 - heapfree)
)

:stuck_out_tongue_closed_eyes:


#18

Of course it’s a joke!


#19

it’s not a slow renaming issue. it’s a slow

node = box()

when all the current boxes in the scene have the names
"1", “2”, “3”, “4”, “5”, “6”, “7”, “8”, …

but not when named

"Box001", “Box002”, “Box003”, “Box004”, “Box005”, …