.NET SDK memory management problem


#38

Would you try both functions if you get some time?
It would be great to have your opinion and findings.


#39

the slowdown of the in-place method doesn’t make sense to me. it’s probably some issue specific to the max .net sdk.
if removing the unnecessary multiplications and divisions doesn’t really affect performance, that means the method spends 99% of its time doing something else. but i don’t see anything other than taking a value from one piece of memory and putting it in another. in the c++ sdk that costs almost nothing and doesn’t take any time.


#40

I tried both versions and I get the same result as PolyTools3D: the in-place version is about 60% slower for me.

I believe this is because no value data is actually stored on the IPoint3:

bFlags = (dotnetClass "System.Reflection.BindingFlags")
allFields = dotnet.CombineEnums bFlags.Public bFlags.NonPublic bFlags.Instance
fields = ((((dotnetclass "Autodesk.Max.GlobalInterface").Instance.Point3.Create 1 1 1).GetType()).GetFields allFields)
for p in fields do print (p.ToString())

"Point3* unmanaged_"
"Boolean owning_"

Apart from an ownership flag, the only data is a pointer to a native Point3. Each access requires dereferencing that native pointer, which I assume is slower from a C# assembly.

Nevertheless, the method runs very fast (1-2 ms for 40000 vertices on my machine), and if you need it any faster than this, you might as well start over in C++.


#41

This runs about 18 times faster (yes, this is still C# code :stuck_out_tongue: )


// define this somewhere in your class
struct Point3
{
    public float X;
    public float Y;
    public float Z;
}


Point3* mapP = (Point3*)(mapVerts[0] as Autodesk.Max.Wrappers.Point3).INativeObject__Handle;
Point3* meshP = (Point3*)(mesh.GetVert(0) as Autodesk.Max.Wrappers.Point3).INativeObject__Handle;

for (int i = 0; i < numMapVerts; i++)
{
    mapP->X = 1;
    if (meshP->Z > HeightThreshold)
    {
        mapP->Y = Math.Max(0, mapP->Y - FadeInValue);
    }
    else
    {
        mapP->Y = Math.Min(1, mapP->Y + FadeInValue);
    }
    mapP->Z = mapP->Y;
    mapP++;
    meshP++;
}

for 1000 iterations:
previous fastest version: ~1800ms
this version: ~100ms

It requires you to reference Autodesk.Max.Wrappers in your project and to compile with the unsafe code flag (/unsafe).

I wouldn’t recommend doing this in production code; I don’t know what issues could occur.


#42

The C# takes 3ms to process a 100K mesh (3000ms for 1000 calls)

How much is “almost nothing” in C++? Just curious to know how much faster is the native code.


#43

Great! 0.00016 seconds to process a 100K mesh isn’t bad, or is it? :slight_smile:


#44

This would still be faster in C++, especially if it is SIMDed.


#45

this is almost the same as what the c++ code has to do, except that the mesh verts and map verts are already arrays of Point3.
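To make the comparison concrete, here is a minimal sketch of the same fade loop written against plain contiguous arrays. It is not SDK code: `Vec3` is a stand-in for the SDK's Point3, and `fadeMapVerts`, `heightThreshold` and `fadeValue` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the SDK's Point3: three floats (an assumption for this sketch).
struct Vec3 { float x, y, z; };

// Applies the fade step from the C# loop directly to contiguous arrays.
// Both arrays are assumed to have one entry per vertex, in the same order.
void fadeMapVerts(std::vector<Vec3>& mapVerts, const std::vector<Vec3>& meshVerts,
                  float heightThreshold, float fadeValue)
{
    assert(mapVerts.size() == meshVerts.size());
    for (std::size_t i = 0; i < mapVerts.size(); ++i) {
        Vec3& m = mapVerts[i];
        m.x = 1.0f;
        if (meshVerts[i].z > heightThreshold)
            m.y = std::max(0.0f, m.y - fadeValue);  // fade out above the threshold
        else
            m.y = std::min(1.0f, m.y + fadeValue);  // fade in at or below it
        m.z = m.y;
    }
}
```

Each iteration is just a couple of loads, a compare, and three stores, which is why there is little left to optimize once the wrapper overhead is gone.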


#46

i don’t think c++ code can do it much faster. maybe only getting the mesh would be a little faster.
another thing is that you have to update the render mesh after you set these colors, which forces a rebuild of the cached one. and i’m not sure it’s possible to rebuild only the colors, so the update will take much more time than the calculation.


#47

I was actually expecting the C++ version to be 3-5 times faster, but 20 times faster is still a huge difference.

Although this function involves very common operations, I suppose the difference could be much larger than that in other cases.

Well, I guess a more flexible language comes at a price. What a shame that we can’t maintain and compile a single C++ version.

Thank you Rotem and Denis for your input, and thank you Håvard for sharing this.


#48

c++ version gives me 0.00027 sec for 100K mesh… but my machine is pretty old.
as i said, the function doesn’t really do anything other than get and set values in memory.


#49

Can you post the code? I’ll give it a spin.


#50

def_visible_primitive(getMeshMapVerts, "getMeshMapVerts");
Value* getMeshMapVerts_cf(Value **arg_list, int count)
{
    check_arg_count_with_keys(getMeshMapVerts, 1, count);
    if (!is_mesh(arg_list[0])) return &undefined;

    int vnum = 0;

    Mesh* mesh = arg_list[0]->to_mesh();
    // optional channel: keyword argument, defaults to map channel 1
    int channel = key_arg_or_default(channel, Integer::intern(1))->to_int();
    if (mesh->mapSupport(channel))
    {
        // process only as many verts as both arrays have
        vnum = min(mesh->numVerts, mesh->getNumMapVerts(channel));
        UVVert *mv = mesh->mapVerts(channel);

        for (int k = 0; k < vnum; k++)
        {
            // read the mesh vertex, as the C# version does (the value is unused)
            if (mesh->verts[k].z) { }
            // clamp y to [0, 1] and write it to all three components
            float y = min(1.0f, max(0.0f, mv[k].y));
            mv[k] = Point3(y, y, y);
        }
    }
    // return the number of processed verts
    return Integer::intern(vnum);
}

it’s not exactly the same, but it does the same operations in the way that is most native to the c++ sdk


#51

Hmm, I thought I would try to optimize it with SIMD for the fun of it, but I realize now it’s really not an ideal candidate, because so little work is done inside the loop and because of the AoS (array of structures) layout of the Point3s.
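For anyone unfamiliar with the layout issue, here is a rough sketch of the difference. `Vec3` stands in for Point3, and `Vec3SoA` is a hypothetical structure-of-arrays layout; the SDK does not store verts this way, so this only illustrates why the existing layout is awkward for SIMD.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// AoS: x, y and z are interleaved in memory, as in an array of Point3.
struct Vec3 { float x, y, z; };

// SoA: each component lives in its own contiguous array (hypothetical layout).
struct Vec3SoA { std::vector<float> x, y, z; };

// AoS clamp: consecutive y values are 12 bytes apart, so loading several of
// them into one SIMD register needs strided loads or shuffles.
void clampAoS(std::vector<Vec3>& v)
{
    for (Vec3& p : v) {
        const float y = std::min(1.0f, std::max(0.0f, p.y));
        p = Vec3{y, y, y};
    }
}

// SoA clamp: the y values are adjacent floats, which maps directly onto
// wide loads and stores.
void clampSoA(Vec3SoA& v)
{
    for (std::size_t i = 0; i < v.y.size(); ++i) {
        const float y = std::min(1.0f, std::max(0.0f, v.y[i]));
        v.x[i] = y;
        v.y[i] = y;
        v.z[i] = y;
    }
}
```

Converting between the two layouts costs a full pass over the data, which would likely eat any gain for a loop this small.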


#52

Wouldn’t this be an ideal candidate for parallelization through the TPL?
http://msdn.microsoft.com/en-us/library/dd460717(v=vs.110).aspx


#53

Probably only for meshes with huge amounts of vertices.
I would bet that the overhead of parallelization would be almost as high or higher than the runtime of the single-threaded function, though it’s easy enough to test.
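A minimal way to test that bet on the native side is to split the loop into one contiguous chunk per thread and time it against the plain loop. This is only a sketch with plain `std::thread`; `Vec3`, `clampRange` and `clampParallel` are hypothetical names, and a real test would reuse a thread pool instead of creating threads on every call.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for the SDK's Point3 (an assumption for this sketch).
struct Vec3 { float x, y, z; };

// Clamps y to [0, 1] and copies it to all three components for [begin, end).
void clampRange(std::vector<Vec3>& v, std::size_t begin, std::size_t end)
{
    for (std::size_t i = begin; i < end; ++i) {
        const float y = std::min(1.0f, std::max(0.0f, v[i].y));
        v[i] = Vec3{y, y, y};
    }
}

// Runs clampRange on one contiguous chunk per thread. Creating and joining
// the threads is the overhead that has to be paid back by the speedup.
void clampParallel(std::vector<Vec3>& v, unsigned threadCount)
{
    if (threadCount == 0) threadCount = 1;
    const std::size_t chunk = (v.size() + threadCount - 1) / threadCount;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(v.size(), t * chunk);
        const std::size_t end = std::min(v.size(), begin + chunk);
        if (begin == end) break;  // no work left for further threads
        workers.emplace_back([&v, begin, end] { clampRange(v, begin, end); });
    }
    for (std::thread& w : workers) w.join();
}
```

The chunks do not overlap, so no locking is needed; whether it pays off depends on the vertex count and on how expensive thread startup is on the machine.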


#54

With 40k verts, using Parallel.For I got ~23 seconds for 10k iterations, versus ~37 seconds for the single-threaded version. So that is pretty cool.


#55

With the C# code or the C++ code?


#56

C#.
I replaced the for loop with just this:

Parallel.For(0, numMapVerts, i => ...

#57

thank you for the information