
Optimization ideas needed


noontz
05-26-2008, 12:10 PM
undo off
(
    with redraw off
    for i = 1 to total do
    (
        l = copy ny
        l.transform = pos[i]
        attach olleaves l
    )
)
Hi.. This loop is from a kind of scatter thing where "pos" is an array of matrix3 & "ny" is the object I'm scattering ( attaching to "olleaves" )

This works, but is extremely slow.. Any clues on how to speed it up? ( The help file dried out quite quickly on optimization, and this forum houses brain capacity beyond my imagination )

Thx in advance.. Steffen..

davestewart
05-26-2008, 01:20 PM
I did a quick test with spheres, and replaced your loop code:

ny.transform = pos[i]
meshop.attach olleaves ny deleteSourceNode:false

Taking out the copy made it 3 times quicker on average :)
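
For context, the whole loop then looks roughly like this (untested sketch; "olleaves", "ny", "pos" and "total" are the names from your original post, and olleaves is assumed to be an Editable Mesh):

undo off
(
    with redraw off
    for i = 1 to total do
    (
        -- move the single source object into place, then copy its mesh into the target
        ny.transform = pos[i]
        meshop.attach olleaves ny deleteSourceNode:false
    )
)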

noontz
05-26-2008, 01:28 PM
Hi Dave..

Your answer is just as fast as your solution ;) Thanks so much!!!
I thought I would be out in some kind of face-array stuff to avoid loading "ny" all the time..

This one is going public, so I'm not the only one thankful ;)

davestewart
05-26-2008, 01:41 PM
Great stuff :)

noontz
05-26-2008, 02:32 PM
Hi Dave...

For whatever crazy reason I'm afraid your loop doesn't do the job when implemented in the rest of my code.. On a test I actually get a slowdown from 8 minutes to 10 minutes, but this could be due to something else, dunno ??? ( 7,500 iterations ) Anyway, these runtimes are crazy compared to the realtime feedback in, for instance, the Scatter utility, so there must be something hidden under the hood..

davestewart
05-26-2008, 03:10 PM
OK, 7500 sounds like a lot!

Well I don't have much experience with large datasets, but I would suspect that the problem might lie in the fact that as an object's size increases, it becomes increasingly expensive to perform operations on it.

Why don't you try doing, say, 10 iterations of 750 (creating 10 new objects) then a further 10 iterations to attach these 10 new objects together.

It may be that the square root of 7500 is the best batch size to go for (i.e. 86 loops to build batches, then 86 more to attach them)

You could store a timestamp after each process in the loop to prove (or disprove) the point
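
Something along these lines is roughly what I mean (a rough, untested sketch; "ny", "pos", "total" and "olleaves" are the names from your code, olleaves is assumed to be an Editable Mesh, and the batch size is just a guess):

undo off
(
    with redraw off
    (
        batchSize = (sqrt total) as integer -- ~86 for 7500
        batches = #()
        i = 1
        while i <= total do
        (
            -- start a new batch from a fresh copy of the source object
            b = convertToMesh (copy ny)
            b.transform = pos[i]
            i += 1
            j = 1
            -- attach up to (batchSize - 1) more copies onto this small batch
            while j < batchSize and i <= total do
            (
                c = copy ny
                c.transform = pos[i]
                meshop.attach b c -- deletes the source node by default
                i += 1
                j += 1
            )
            append batches b
        )
        -- second pass: attach the (relatively few) big batches onto the final object
        for b in batches do meshop.attach olleaves b
    )
)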

Dave

Bobo
05-26-2008, 03:52 PM
The trouble with attachments is that the Undo buffer gets filled up really fast - each attach operation accumulates the previous version of the mesh, so it grows exponentially.

Thus, if you do not intend to perform an undo after that, you had better wrap the code in undo off (). See the "How To Make It Faster?" chapter in the MAXScript Reference - the second example under "Disable Undo system when possible" (at least in the Max 2008 and 2009 versions) shows the difference. On my ancient 800MHz system, where the original test was performed about 5 years ago, it made the code 20 times faster...
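
To see the difference on your own machine, a quick throwaway comparison along these lines should do (untested sketch; it just attaches dummy boxes rather than your real scene objects):

(
    -- first pass: attaching with the undo system recording every step
    delete objects
    target = convertToMesh (box())
    t0 = timeStamp()
    undo on
    (
        for i = 1 to 500 do meshop.attach target (box pos:[i * 30, 0, 0])
    )
    format "with undo:    % sec\n" ((timeStamp() - t0) / 1000.0)

    -- second pass: same work with the undo system disabled
    -- (don't hit Undo afterwards; see the note about undo off + node creation further down)
    delete objects
    target = convertToMesh (box())
    t0 = timeStamp()
    undo off
    (
        for i = 1 to 500 do meshop.attach target (box pos:[i * 30, 0, 0])
    )
    format "without undo: % sec\n" ((timeStamp() - t0) / 1000.0)
)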

SyncViewS
05-26-2008, 04:07 PM
Hi noontz,
I may have found something that could help you with the speed issue: use "maxOps.CloneNodes" in place of "copy".
Here are two sample scripts and a comparison.

First version with "copy" -> 107.109 seconds

(
    -- init number of scattered objects
    iNumObjects = 10000

    -- init the transform matrix array
    am3Transform = #()
    am3Transform[iNumObjects] = matrix3 0

    -- build the transform matrix array
    for i = 1 to iNumObjects do
    (
        am3Transform[i] = matrixFromNormal (normalize (random [-1, -1, -1] [1, 1, 1]))
        am3Transform[i] = translate am3Transform[i] (random [-100, -100, -100] [100, 100, 100])
    )

    -- create sample object to scatter
    oSample = box()

    -- init the scattered objects array
    aoScatter = #()
    aoScatter[iNumObjects] = oSample

    iTimeStart = timeStamp()

    -- scatter sample object
    undo off
    (
        with redraw off
        for i = 1 to iNumObjects do
        (
            oTemp = copy oSample
            oTemp.transform = am3Transform[i]
            aoScatter[i] = oTemp
        )
    )

    iTimeStop = timeStamp()

    format "Processing took % seconds\n" ((iTimeStop - iTimeStart) / 1000.0)
)

-- Processing took 107.109 seconds



Alternate version with "maxOps.CloneNodes" -> 46.828 seconds

(
    -- init number of scattered objects
    iNumObjects = 10000

    -- init the transform matrix array
    am3Transform = #()
    am3Transform[iNumObjects] = matrix3 0

    -- build the transform matrix array
    for i = 1 to iNumObjects do
    (
        am3Transform[i] = matrixFromNormal (normalize (random [-1, -1, -1] [1, 1, 1]))
        am3Transform[i] = translate am3Transform[i] (random [-100, -100, -100] [100, 100, 100])
    )

    -- create sample object to scatter
    oSample = box()

    -- init the scattered objects array
    aoScatter = #()
    aoScatter[iNumObjects] = oSample

    iTimeStart = timeStamp()

    -- scatter sample object

    -- NOTE: do not use undo off here; according to the MAXScript reference it crashes
    -- Max if an undo is performed after this operation.
    with redraw off
    (
        for i = 1 to iNumObjects do
        (
            maxOps.CloneNodes oSample offset:[0,0,0] expandHierarchy:false cloneType:#copy newNodes:&aoTemp
            aoTemp[1].transform = am3Transform[i]
            aoScatter[i] = aoTemp[1]
        )
    )

    iTimeStop = timeStamp()

    format "Processing took % seconds\n" ((iTimeStop - iTimeStart) / 1000.0)
)

-- Processing took 46.828 seconds



The alternate version required less than half the time.
I hope this helps.

noontz
05-26-2008, 05:11 PM
Hi all..

Thanks for your efforts. I always feel a good spirit in this forum!

@ Bobo: You've saved my ### before, but as you see in the initial code, I already went through the help content ( unless I'm using the wrong syntax/scope of course ) ;)

@ Dave & SyncView: I'll go through both & maybe a combo & see what happens..

I'll be back with some results & code. ( I'm a little busy ATM so please give me a few days )

Thx again to all three of you!

noontz
05-26-2008, 06:39 PM
with redraw off
for i = 1 to total do
(
    maxOps.CloneNodes ny offset:[0,0,0] newNodes:&l
    l = l[1]
    undo off
    (
        l.transform = pos[i]
        attach olleaves l
    )
)

Well, that maxOps call dropped the runtime by half, as pointed out! Great! I moved the undo off into the loop, as there's a known issue with this & maxOps.CloneNodes.

@ Dave: The attach iterations scale linearly: 10,000 iterations take 100 times longer than 100 iterations, so there doesn't seem to be any gain in grouping them.

davestewart
05-26-2008, 10:22 PM
Well, here in London it's a bank holiday (so no work), and it's pissing down with rain (so no play).

I've also been working on a timestamping / benchmarking struct in the past few weeks, so this seemed like a perfect case study opportunity!

The results are actually really interesting but with the concept and data in front of you, perhaps not that surprising.

The good news is that I've managed to speed things up 30 times!

Turns out the slowdown IS caused by the size of the mesh you're attaching to, so when in a loop, this has a rather nasty cumulative effect.

Also, the square root DOES turn out (4 times out of 5, anyway) to be the most efficient batch size to split the loop into.

Anyway - I put the results up on my blog (http://www.keyframesandcode.com/code/development/maxscript/time-stamper-meshopsattach-case-study/), so feel free to take a look.

Cheers,
Dave

erilaz
05-29-2008, 01:44 AM
Hot damn, I love optimised code. :D

Perhaps we should have a general discussion of what makes things faster/less memory intensive.

ZeBoxx2
05-29-2008, 02:17 AM
Perhaps we should have a general discussion of what makes things faster/less memory intensive.
The MaxScript Help file is a good start, then; see the "How to make it faster?" topic %)

erilaz
05-29-2008, 02:19 AM
The MaxScript Help file is a good start, then; see the "How to make it faster?" topic %)

Oh, I'm aware of that. I think it would be interesting to see what other methods have come up in people's scripting journeys, to expand on the manual's recommendations.

ZeBoxx2
05-29-2008, 03:45 AM
Alright... here's a few I recall off the top of my head - haven't dug into my scripts to find old comments on oddball speed-ups.


At one point I needed random point3 integers, lots and lots of them. The problem is, random <point3> <point3> makes the values within those point3s floats.
So...

-- --------------------------------------------------

startTime = timeStamp()
for i = 1 to 3000000 do (
    randP3 = random [0,0,0] [10,10,10]
    randP3.x = randP3.x as integer
    randP3.y = randP3.y as integer
    randP3.z = randP3.z as integer
)
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 16.854

startTime = timeStamp()
for i = 1 to 3000000 do (
    randP3x = random 0 10
    randP3y = random 0 10
    randP3z = random 0 10
    randP3 = [randP3x, randP3y, randP3z]
)
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 12.629

Subtle, but I actually had to generate tens of millions; becomes a little less subtle. This is on a slower machine, though :)

-- --------------------------------------------------

Here's another favorite.. UI updates. UI updates are slow.
ProgressBars are a very popular method of showing, well, progress. But updating them all the time is slow. Update them only periodically. For example, only if the update would actually show in the progress bar (depending on its width).

rollout test "test" (
progressbar pb_test
)
createDialog test
pb = test.pb_test -- pre-initialize, lest we incur a speedhit for getting it from the rollout all the time.

startTime = timeStamp()
for i = 1 to 1000000 do (
pb.value = (i / 1000000.0) * 100
)
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 29.012


startTime = timeStamp()
for i = 1 to 1000000 do (
if (mod i 7500 == 0) then ( -- 7500 in this case depending on progressbar width of 133px
pb.value = (i / 1000000.0) * 100
)
)
pb.value = 100 -- make sure it does read 100% at the end
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 2.043

-- lets' get rid of that mult as well
startTime = timeStamp()
for i = 1 to 1000000 do (
if (mod i 7500 == 0) then ( -- 7500 in this case depending on progressbar width of 133px
pb.value = (i / 10000.0)
)
)
pb.value = 100
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 1.943


-- --------------------------------------------------

Another one, no code for obvious reasons.. If you have an intense loop with a lot of if-tests to validate a value - for example to prevent OOB accesses or so - consider try/catch. Typically try/catch is slower than if-tests, but with enough if-tests, try/catch may just win. Pretty rare, and probably not recommended practice; but if you're going for speed...
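
Just to illustrate the shape of the idea, here's a contrived toy version (whether try/catch actually wins depends entirely on how many tests it replaces and how rarely the bad case occurs):

(
    data = #(10, 20, 30, 40, 50)
    idx = #(2, 9, 4, -1, 3) -- some indices deliberately out of range

    -- guarded version: validate before every access
    total1 = 0
    for i in idx do
    (
        if i >= 1 and i <= data.count do total1 += data[i]
    )

    -- try/catch version: just go for it and swallow the failures
    total2 = 0
    for i in idx do
    (
        try ( total2 += data[i] ) catch ()
    )

    format "guarded: %  try/catch: %\n" total1 total2 -- both print 90
)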

-- --------------------------------------------------

An old favorite... always name the objects you create, as max will otherwise waste time trying to find a unique name for the new object (obviously only do this if you have no fear of creating non-unique names). This assumes cloneNodes cannot be used for whatever reason.



delete objects
startTime = timeStamp()
undo off (
    for i = 1 to 5000 do (
        newSphere = geoSphere()
    )
)
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 60.226

delete objects
startTime = timeStamp()
undo off (
    for i = 1 to 5000 do (
        newSphere = geoSphere name:("mySphere" + i as string)
    )
)
format "time: %\n" ((timeStamp() - startTime) / 1000.0)
time: 34.149

erilaz
05-29-2008, 04:05 AM
Now that's what i'm talking about! :D

Neuro69
05-29-2008, 06:18 AM
Large meshes can get exponentially slower for certain operations. In a script I wrote to export grids from TINs using RayMeshGridIntersect, I resorted to splitting up the TIN before doing intersections, which greatly reduced the running time of the script.

When it comes to copying mesh data, I immediately thought that perhaps it would be faster to use a TriMesh instead of a node, but I'm not sure how that would affect your transforms.

Someone mentioned if-statements; sometimes it is not convenient to structure your code optimally to minimize if-statement overhead (like duplicating a lot of similar, large loops for different outcomes). On one such occasion I used a function variable with great success. Prepare a number of predefined functions, one for each outcome, and assign the appropriate one to a variable before you start the loop. Then your loop can handle different cases without any (unnecessary) if-statements.
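
For illustration, the pattern looks something like this (a minimal sketch with hypothetical function and variable names):

(
    fn jitterPosition obj = ( obj.pos += random [-1, -1, -1] [1, 1, 1] )
    fn jitterRotation obj = ( rotate obj (eulerAngles 0 0 (random -5.0 5.0)) )

    -- decide once, before the loop, instead of branching on every iteration
    mode = #position -- could come from a UI control, a parameter, etc.
    jitterFn = if mode == #position then jitterPosition else jitterRotation

    for obj in geometry do jitterFn obj
)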

noontz
05-29-2008, 02:45 PM
Hi everybody..

Thanks so much for following up! I'm monster busy & won't have time to implement & test before next week, as mentioned to Dave in a PM ( anyone interested in optimization should check his blog ), but please let the ideas come rolling.. I'm all ears ( but no brains ATM )

davestewart
05-29-2008, 02:59 PM
If you haven't already seen from the link above, I've updated my TimeStamper struct. It provides a set of useful methods and properties for all things timing, including:


- basic start / stop, then print / alert the result
- multiple starts, then print / alert the average / total
- differences between different TimeStampers (for comparison tasks)
- full reporting on all timed tasks, including a breakdown of which iterations were fastest, that can be used straight off the bat in Excel (see graph below)

You can run multiple timers (each stored in a variable), and the base messaging functions print stuff in English, i.e. "Processing 'Task Name' took 7.81 seconds"

Here's the dump of the tests run above, copied to Excel and a graph made on the columns:

http://www.keyframesandcode.com/resources/maxscript/For%20Scripters/Time%20Stamper/img/time-stamper-graph.gif
The project home page is here: http://www.keyframesandcode.com/code/development/maxscript/time-stamper/

Ideas and suggestions welcome.

Cheers,
Dave

reForm
01-06-2009, 03:08 PM
Thought I'd bump this thread just to thank the previous poster (Dave) for the meshop-attach optimisation... from 215 seconds down to 1 second... quality! :)

davestewart
01-07-2009, 09:47 AM
Aw, thanks Patrick, that's very nice of you!

Vsai
03-13-2009, 08:41 PM
You rock! I wish I'd seen this earlier today before crashing Max a hundred times trying to speed it up! ;)

CGTalk Moderation
03-13-2009, 08:41 PM
This thread has been automatically closed as it remained inactive for 12 months. If you wish to continue the discussion, please create a new thread in the appropriate forum.