View Full Version : Polygon seam welding

 Kuroyume016112 December 2006, 08:34 AMHere's a topic that you won't find on Google (or anywhere else). Consider this scenario: You have a Wavefront OBJ polygonal mesh with groups. The groups are seamed up against each other - that is, there are duplicate vertices at meeting polygons between groups. These duplicate vertices must be removed. The first thing you think is - vertex welding! Wrong! The vertices must NOT be touched in any way, shape, or form (literally people). Their number and order must remain consistent for deformations and morphs. But the polygons' vertex indices are free to be altered to 'weld' the polygonal seams. I've searched Google (using several dozen term variations), IEEE, and ACM with no results of interest. My library of 3D texts is void on the topic. I do have a veritable 'brute-force' algorithm in place that does this very well - but it gets really, really slow as the vertex/polygon count increases (say, over 40,000). There has to be a better way - and this is evident in the applications where it is done. My feeling is that there is a structural methodology to increase the performance of the task - i.e.: arrange the polygons and vertices (and other working data for the welding process) so that the seam welding is more optimal. But hints to what this better way is are not evident. The closest paper that I've encountered is by a Japanese guy from 1961 ("Welding of polygons and the type of Riemann surfaces") which deals with Conformal Mapping. Except for the fact that there is no algorithmic discussion, the math involves ultra-super-complex maths (Riemannian spaces, triple integrals, and all sorts of other fun stuff), and never really seems to talk specifically about 'welding polygons', it is the best found so far... ;) Anybody have anything? Thank you very much, Robert
Robert Bateman
12 December 2006, 12:16 AM

Consider this scenario: You have a Wavefront OBJ polygonal mesh with groups. The groups are seamed up against each other - that is, there are duplicate vertices at meeting polygons between groups.

The obj format decsribes a model that someone has created. Why can't you just ask the artist who made it to a better job next time?

These duplicate vertices must be removed.

why?

The first thing you think is - vertex welding! Wrong! The vertices must NOT be touched in any way, shape, or form (literally people).

Why?

Their number and order must remain consistent for deformations and morphs. But the polygons' vertex indices are free to be altered to 'weld' the polygonal seams.

Why? Surely just manipulating the indices doesn't remove the vertices at all. So either you want to remove the vertices and do it properly (taking into account all defomations on the mesh), or just don't remove the vertices at all.

I've searched Google (using several dozen term variations), IEEE, and ACM with no results of interest. My library of 3D texts is void on the topic.

I'll be honest and say that i just can't see what the problem is. I doubt you'd find many texts written on a problem that doesn't exist. In all honesty I just can't see the problem...

UrbanFuturistic
12 December 2006, 01:34 AM
Some morph targets have polygons welded that aren't welded in others. Because of the limitations of 3D morphing, all targets must have the same vertices. Because of the limits of smooth-shading algorithms (I'm guessing the renderer isn't raytracing, and even then antialiasing and rounding errors may produce a visible seam), for shading to progress smoothly across two or more polygons they must share the same vertices, just having vertices which match positions in 3D space won't work.

So: for a morph to work between two models, one of which has a seam where the other does not, some switching of vertices is required so that smooth shading can be properly applied when there is no seam but polygons can be appropriately separated during the transformation.

Unfortunately, while I think I understand the problem, I'm not sure what the solution could be. I do, however, think you're approaching it from the wrong direction.

Look up fast search and comparison algorithms, the kind that allow you to compare two files for similarity for example. This isn't a 3D graphics problem as such, it's much more of a standard data analysis problem so trying to look up welding algorithms when it's a data comparison job is going to lead to a dead end. I'm guessing that what you're doing is taking each vertex and comparing it to every vertex in the other list which would be hideously inefficient. If you're not already doing it, each vertex only needs to be compared once, so once a vertex is matched and 'welded' in one group you never need to compare that vertex again. The other thing that springs to mind is using the poly list; if any polygons share a vertex and that vertex is a duplicate, deal with all the polys attached to that vertex. Repeated integer matching, like all integer operations, should be faster than repeated float matching.

Also, while the data may need to be presented to the software that's being used in a particular manner, there's nothing to stop you referencing the data in another manner and adding extra external data in your own format, this might help with comparison in the sense that relational databases make random searches faster and cubic databases make complete searches (for reports and analysis) faster.

One idea that springs to mind is, reference all the vertices through a list that holds their position in the vertex list (as the polys have to reference them) then sort the list based on their X, then Y, then Z positions and do a binary search rather than looking up the value of each and every vertex.

Ooh, it felt good to exercise my brain, haven't done that for a while.

Google:search and comparison algorithms (http://www.google.co.uk/search?q=search+and+comparison+algorithms). The wikipedia link has links (under 'see also') to other types of search algorithms including Binary Search Tree (http://en.wikipedia.org/wiki/Binary_search_tree) which may or may not be suitable. The main key here is sorting the data into a structure which allows for more efficient searches and since you're using references to sort the data rather than altering the data itself you don't break the models by altering the vertex or polygon order.

Kuroyume0161
12 December 2006, 02:01 AM

The obj format decsribes a model that someone has created. Why can't you just ask the artist who made it to a better job next time?

Why? Surely just manipulating the indices doesn't remove the vertices at all. So either you want to remove the vertices and do it properly (taking into account all defomations on the mesh), or just don't remove the vertices at all.

I'll be honest and say that i just can't see what the problem is. I doubt you'd find many texts written on a problem that doesn't exist. In all honesty I just can't see the problem...

Wavefront OBJ has things called 'groups'. These groups are separate sections of a single polygonal mesh. These groups are sometimes (but not always) separated so that vertices that are coincident in space aren't always the same vertex. When loaded into a 3D application, these show up as seams unless welded. This is pretty standard with Wavefront OBJs (and I've seen hundreds of thousands of them!!!).

There isn't a single model from a single artist - there are hundreds and hundreds of thousands of them from many thousands of artists (think "Poser" here - hint).

I don't want to remove the vertices (no way). The order and count MUST REMAIN THE SAME!!! Yep, the standard procedure is to remove all but one concident vertex (weld) - and you still have to update the polygon vertex indices to reflect the change or the affected polygons would end up with invalid indices. And since you'd be removing vertices from the vertex array, you'd have to update almost all of the polygon vertex indices. That sounds worse to me.

Morphs can be added at any later stage and their correlative indexing would be scrambled by that (or the re-referencing information would need to be retained indefinitely and accounted for every time they are loaded). Reindexing the polygon vertex indices is enough to remove the seams between groups and has no impact on weights or morphs - yes, there are 'ghost' vertices hanging around with no polygons associated with them - no problem in any way whatsoever. They serve their purpose for morphing and deformations (whether you believe that or not).

The problem is that my current procedure for removing the seams by changing polygon vertex indices to one of a coincident group of vertices is not optimal in speed. Poser does it, DAZ|Studio does it, other apps do it (PoseRay, PolyTran, and so on). They're not talking.

I'm the developer of interPoser Ltd and interPoser Pro (and a few other plugins). I'm parsing 10 Poser file formats (including Shader Nodes - although not supported by the plugin app without developing my own shader node system) and Wavefront OBJ (including some stuff in there you don't even know about) using my own code. My skinning, deformation, morph, symmetry, conforming, and other code are all mine - some of it proprietary as the procedures they emulate are also proprietary, so I had to develop my own.

This process needs to be speed optimized because the welding occurs during import. And it's not just 'weld every coincident vertex' either. It goes like a-this:

* Weld bodypart A to bodypart B (bodypart == group) - welding is a command between groups, not a ubiquitous, blind process across the polygonal mesh (note that well!).

* Weld A to B only if they both 'bend' - if neither or only one bends, no welding takes place. The joint is allowed to break as that is how it was designed (believe that or not).

* Weld A to B only at the seam between these bodyparts - welding doesn't just reindex any old coincident vertices, it only welds if the coincidence is found between the two groups, but must consider all coincident points on the same group that are also coincident between them (say two points on group A that are coincident with one on group B - then two of the points are reindexed to the other one).

Seems rather particular and restrictive, right? But after several years of research and study (r.e.), a full year of intensive development with several dozen beta-testers, several hundred users, and testing on my own over this time period, I think that I can safely and confidently say that without them, things don't work as expected. This is the procedure and it works on thousands of items.

Sorry if you still can't understand the problem.

Kuroyume0161
12 December 2006, 02:27 AM
Some morph targets have polygons welded that aren't welded in others. Because of the limitations of 3D morphing, all targets must have the same vertices. Because of the limits of smooth-shading algorithms (I'm guessing the renderer isn't raytracing, and even then antialiasing and rounding errors may produce a visible seam), for shading to progress smoothly across two or more polygons they must share the same vertices, just having vertices which match positions in 3D space won't work.

So: for a morph to work between two models, one of which has a seam where the other does not, some switching of vertices is required so that smooth shading can be properly applied when there is no seam but polygons can be appropriately separated during the transformation.

Unfortunately, while I think I understand the problem, I'm not sure what the solution could be. I do, however, think you're approaching it from the wrong direction.

Look up fast search and comparison algorithms, the kind that allow you to compare two files for similarity for example. This isn't a 3D graphics problem as such, it's much more of a standard data analysis problem so trying to look up welding algorithms when it's a data comparison job is going to lead to a dead end. I'm guessing that what you're doing is taking each vertex and comparing it to every vertex in the other list which would be hideously inefficient. If you're not already doing it, each vertex only needs to be compared once, so once a vertex is matched and 'welded' in one group you never need to compare that vertex again. The other thing that springs to mind is using the poly list; if any polygons share a vertex and that vertex is a duplicate, deal with all the polys attached to that vertex. Repeated integer matching, like all integer operations, should be faster than repeated float matching.

Something similar to this - so that each vertex is compared to all other subsequent vertices only.

for (i = 0; i != (n-1); i++)
{
v = vert[i]
for (w = v+1; w != vert[n]; w++)
{
If (vertices equivalent with epsilon)
{
store i index as long in 'substitution' array.
}
}
}

Because 0 is a valid index, I use 0xFFFFFFFFL (-1L) as the 'unset' or 'unsubstituted' value that initializes the array - memset(array, 0xFF, n<<2). Of course, if there ever happens to be a model with 2^31 vertices, I'm in trouble... ;)

Also, while the data may need to be presented to the software that's being used in a particular manner, there's nothing to stop you referencing the data in another manner and adding extra external data in your own format, this might help with comparison in the sense that relational databases make random searches faster and cubic databases make complete searches (for reports and analysis) faster.

One idea that springs to mind is, reference all the vertices through a list that holds their position in the vertex list (as the polys have to reference them) then sort the list based on their X, then Y, then Z positions and do a binary search rather than looking up the value of each and every vertex.

Ooh, it felt good to exercise my brain, haven't done that for a while.

Google:search and comparison algorithms (http://www.google.co.uk/search?q=search+and+comparison+algorithms). The wikipedia link has links (under 'see also') to other types of search algorithms including Binary Search Tree (http://en.wikipedia.org/wiki/Binary_search_tree) which may or may not be suitable. The main key here is sorting the data into a structure which allows for more efficient searches and since you're using references to sort the data rather than altering the data itself you don't break the models by altering the vertex or polygon order.

The worst part isn't determining coincident vertices and marking substitutions; that's the fast part. The worst part is, using the conditions mentioned in my previous reply, reindexing polygon vertex indices only at the seams as noted. This is done so often that one would think it 'academic' (i.e.: in papers, texts, books, websites) to find specific information on the process, generic information even.

I was hoping for some algorithm that shows a data reorganization or a preprocessing form that would accelerate this process (even if it doesn't consider the stricture on which groups are welded or if they are denoted as bending or not).

So, we're both thinking along the same lines, but I was hoping for a more directly related solution. At least your links and thoughts might give me some alternative ideas. :)

Robert

Cronholio
12 December 2006, 08:13 AM
Oudabig is right this probelm has nothing to do with welding vertices, it's all about finding duplicates.

Where are you getting these OBJs from? A lot of programs solve the problem by not breaking the object up by groups in the first place. It sounds like what you are working with is the poor man's method for creating shader groups, the mesh is broken into submeshes by shader or material or texture or whatever. The correct way to store this information would be in a vertex or primitive integer attribute which is an index into an array of shaders or materials or whatever. Often what formats have is a distinction between a point, which would be a point that defines a surface itself (your welded mesh ), and a vertex which would be a point that defines a single face within the final surface. Sometimes rather than calling them points and vertices they are refered to as vertices and face-vertices (such as in Maya). You can store information that varies per polygon (like discontiguous UV sets, or shader assignments ) on the vertecies or the polygons themselves in an attribute. This way you can keep the mesh whole while varying attributes over the mesh.

For formats that do not make a disctinction between a point and a vertex, what is often done is a vertex duplication list is generated and stored in the model file itself. This list is an array whose count is equal to the number of vertices in the unwelded mesh. Each index in this list has an integer value that is the index to the true vertex number. This way you don't have to iterate over the whole mesh looking for duplicates, you just pull the info out of the array.

The reason why your method is so slow is because you are comparing every vertex to every other vertex in the mesh up to the current vertex number. This linear comparison method is going to take exponentially longer to complete as the density of your mesh increases. How do you solve the problem? Simple, use a format that correctly supports the information you want to include. If you really must use these fractured objs, you are going to to have to search your arrays differently. A good quick method would be to pre sort the array in ascending order (you'll need to also build an array of indices so you can later put the array back into the correct order, creating a an array of a custom vertex type containing the value and the original index would be a good way to go ). Anyway, sort the list in ascending order, start searching in the middle, and keep halving the the array until you find a match, if you don't find a match you have a unique vertex, otherwise, record the match. That'll probably the fastest way to get through the list and find duplicates because you'll be doing far fewer comparisons. With your method, if your mesh has 100,000 vertices, you could potentially do up to 100,000 comparisons per vertex. If you use the method I described, you'll do like 16 or 17 comparisons per vertex at most.

I might also recommend that you come up with an animation cache format that stores only point transformations, and not an entire OBJ including unneccessary info like face lists, uvs, groups etc.

Cronholio
12 December 2006, 08:45 AM
I should also add, and trust me on this, you are going to want to weld only once and do it on a rest pose of the model. If you weld after every frame of animation it's highly possible that you are going to weld points that should not be welded, or not weld points that should be welded, no matter how smart you may think your algorithim is. If the point count on your object varies from frame to frame you are likely to end up with artifacts when you render when using features that depend on consistant point counts such as per object deformation blur.

UrbanFuturistic
12 December 2006, 03:35 PM
Who is this Oudabig? :shrug:

Seriously though. Your brute-forcing is based on the data as-is. If you set up a referenced list and sort that, you sort it once and every search after that will be immensely faster, especially with a binary tree, as the shortest route to any possible match will always be the one taken. After that, you can always do some sorting on the polygons and do a search on those to see which ones use that vertex and 'weld' them all at the same time rather than doing a new search for the same vertex for every polygon.

Kuroyume0161
12 December 2006, 06:21 PM
I should also add, and trust me on this, you are going to want to weld only once and do it on a rest pose of the model. If you weld after every frame of animation it's highly possible that you are going to weld points that should not be welded, or not weld points that should be welded, no matter how smart you may think your algorithim is. If the point count on your object varies from frame to frame you are likely to end up with artifacts when you render when using features that depend on consistant point counts such as per object deformation blur.

Oh, it's only welded once during import before poses/animation/morphs/etc. are applied.

I'm still half tempted to do two screen caps to actually show this. It's as if none of you have ever seen it before (it's quite ubiquitous!). Cinema 4D doesn't do anything about it at all with its own OBJ import - it treats each group within the Wavefront OBJ file as a separate Polygon object (which obviously shows the seams as the objects are disconnected).

Since I'm importing rigged figures, that's not possible - one polygon object with a skeletal hierarchy - I use my own Wavefront OBJ import code (even does a more optimal quad-fan n-gon polygon subdivision when required - all quads with a possible final triangle - quads are better for figures and NURBS). Even with a composite polygon object, the way my import code works, the seams still exist - this is from the file data and not any consequence of my import process.

Well here we go. The first image is Cinema 4D's default Wavefront OBJ import (separate objects). Performing 'Connect' will create a single polygon object that looks exactly the same. Even my single-object import without seam-welding is identical. The seams are clearly visible.

http://www.kuroyumes-developmentzone.com/referenced/seams.jpg

This image is after using the built-in Optimize function to remove the 'seams':

http://www.kuroyumes-developmentzone.com/referenced/noseams.jpg

The Optimize function is great at 'welding' vertices within an epsilon - if all you want to do is generally optimize. It does change vertex counts in the process, which is undesirable.

I'm hoping the visuals aid in understanding. :)

Robert

Cronholio
12 December 2006, 09:42 PM
Who is this Oudabig? :shrug:

Heh, sorry, it was 4:45 AM.

Oh, it's only welded once during import before poses/animation/morphs/etc. are applied.

But are you welding the morph targets as well? That is the question. If you are also welding the morph targets seperately it could cause you problems if vertices you don't want welded become coincident on a particular frame of animation.

Anyway, yes I have seen this problem and I think we understand the issue pretty well. What I'm ultimately telling you is the OBJs are a very base format and you shouldn't be using OBJs if you can at all help it. You should be exporting the geometry to a format that has proper support for groups and material assignments so you don't have to deal with segmented meshes at all. That would be the best solution. I mean these meshes aren't segmented in Poser are they? If not, why export to a fomrat that's going to muck up the mesh by segmenting it and omitting important information?

If you are going to continue to use these OBJs your method of comparing vertex to vertex is going to be very slow. You have to use a better method to find matches. If you are still seeing seams after a mesh is welded it's because you aren't properly handling your normals. I don't think OBJ supports normals so C4D must be creating face varying normals. When you weld you need to create a new normal for each point you weld to smooth out the seam.

Kuroyume0161
12 December 2006, 10:53 PM
But are you welding the morph targets as well? That is the question. If you are also welding the morph targets seperately it could cause you problems if vertices you don't want welded become coincident on a particular frame of animation.

No, the morphs aren't being welded. Their indices are correlated to the unaltered geometry. Morphs can be defined in the imported file that defines the figure. But they can also be 'injected' at any later time - thus the reason to keep the vertex array order and count unaltered.

Anyway, yes I have seen this problem and I think we understand the issue pretty well. What I'm ultimately telling you is the OBJs are a very base format and you shouldn't be using OBJs if you can at all help it. You should be exporting the geometry to a format that has proper support for groups and material assignments so you don't have to deal with segmented meshes at all. That would be the best solution. I mean these meshes aren't segmented in Poser are they? If not, why export to a fomrat that's going to muck up the mesh by segmenting it and omitting important information?

If you are going to continue to use these OBJs your method of comparing vertex to vertex is going to be very slow. You have to use a better method to find matches. If you are still seeing seams after a mesh is welded it's because you aren't properly handling your normals. I don't think OBJ supports normals so C4D must be creating face varying normals. When you weld you need to create a new normal for each point you weld to smooth out the seam.

Nothing that I can do about the use of these OBJ files - they are what is used by the format being imported. Poser welds them (these are the weld commands about which I mentioned earlier). I'm working with the same base objects that Poser loads and must weld them as well.

The seams are currently being correctly welded, just very slowly in cases of many vertices/polygons. OBJ does indeed support normals (v = vertices, vt = UV, vn = normals). The seams are only an artifact caused by the discontinuity between the polygons because of the different albethey coincident vertices. I don't think normals are a factor - these are ignored anyway and not included in many figure geometries at all.

One thing that is still confusing me is that you mentioned sorting the array into ascending order. But upon what criteria? They are already sorted by index (0-n) - obviously :). The Cinema 4D vertex array for polygon objects contains everything (indices and actual point coordinates, not two separate lists or whatnot):

point[0] = vertex(x,y,z)
point[1] = vertex'(x,y,z)
...
point[n] = vertex'''(x,y,z)

So that the array is an array of Vector structs with Float x,y,z values and operators and friend methods (a public class, that is). It's easy enough to duplicate the array and sort it in situ, but should it be sorted on matches (equivalent vertices by epsilon) or x,y,z values?

Also, Cinema 4D's polygon structure is a bit of a pita, especially concerning the separate groups. If you don't mind, I can explain this is more detail so that you see what I'm working with. The polygons are stored in an array as well. So there is a Vector array for vertices and a Polygon struct array for polygons. The polygon struct is:
struct Polygon
{
public:
Polygon (http://forums.cgsociety.org/#polygon1)(LONG t_a, LONG t_b, LONG t_c);
Polygon (http://forums.cgsociety.org/#polygon2)(LONG t_a, LONG t_b, LONG t_c, LONG t_d);
LONG a (http://forums.cgsociety.org/#a3);
LONG b (http://forums.cgsociety.org/#b4);
LONG c (http://forums.cgsociety.org/#c5);
LONG d (http://forums.cgsociety.org/#d6);
};
where a,b,c,d are indices into the Vector array storing the vertices. If the polygon is a triangle, d = c. This is all pretty standard construction for polygonal meshes.

Now the fun part. Since the Polygon mesh is a single object, the groups are stored and determined using 'tags'. Tags are attached to objects and store attributes. For our purposes, there are Point Selection and Polygon Selection tags that are of interest. Cinema 4D stores an array in each of these that just stores a selection state (selected or not) over the entire vertex or polygon set (i.e.: if there are 32000 polygons, there are 32000 selection states in each Polygon Selection tag). So, a group of the OBJ can be stored in a Polygon Selection tag by selecting the polygons in the selection array that match those referenced by the OBJ group.

Getting at the selection set is odd. Each Selection tag stores a BaseSelect class which contains the selection set (this class is used for various other selections in the interface as well). The 'quick' and recommended method to traverse through the selected items in the BaseSelect is:

BaseSelect* pbs = tag->GetBaseSelect();
for (pseg = 0; pbs->GetRange(pseg,&c,&d); ++pseg)
{
for (j = c; j <= d; ++j)
{
....
}
}

As you can see, this is very loop intensive. Doing two of these nested of course means doing a four-level nested for loop to compare polygons of each selection set (group).

I'm trying to divorce myself from these constructs so as to apply the welding optimally - and thus the reason for researching, searching, and asking for help. And, thank you very much for the ideas so far! It may take a little bit to get me to understand how to arrange this data in a fashion that makes the two processes optimal: vertex matching and polygon reindexing.

Robert

Kuroyume0161
12 December 2006, 09:09 PM
Back to provide small update notes:

1. I don't think a Binary Search will provide any usefulness. Every search on use of Binary Search Trees on 3D Vectors results in nothing - much on 'Vector' as in STL Vector, nothing on 3D Vectors. Binary Searches are fast and great for single value comparisons - but no info on multivariate comparisons. Additionally, it seems that my approach is used by others. Two source codes found on "Weld Vertex" showed nearly identical approachs to vertex welding (GLM library by Nate Robins is one). It does seem possible to do a triple sort - one for each vector component - maybe only doing later sorts as required. That is, sort by X. Then I need only concentrate on values of X that are within epsilon - that is if the next comparison element is beyond epsilon, the comparison can move to the next element immediately. Still a little fuzzy how this can be implemented for X, Y, and Z to make it progressive and optimal.

2. But I did see one possible optimization after thinking on this - check that the substitution index hadn't already been set in the main loop. (Big Smile) That DOUBLED the speed of the algorithm. A known artery clogger that took 6min10sec to import/setup now only takes 2min56sec with just that one single change. It was double checked just in case my eyes were deceiving me.

// - First index substitutes for subsequent ones that match within epsilon
Vector v;
BOOL subbed;
LONG j;
LONG pc = pointCount-1;
for (LONG i = 0; i != pc; i++)
{
if (subs[i] != 0xFFFFFFFFL) continue;
// Look for equivalent vertices for substitution
subbed = FALSE;
v = verts[i];
for (j = i+1; j != pointCount; j++)
{
if ((subs[j] == 0xFFFFFFFFL) && VectorEqual(v, verts[j], epsilon))
{
subs[j] = i;
subbed = TRUE;
}
}
if (subbed) subs[i] = i;
}

The line in yellow equal 2.1 times performance (amazing, huh?). Of course, I'm not going to stop here and will continue to see what can be done to find a good search/comparison algorithm to increase performance further as well as optimizing the polygon reindexing process.

ETA: Yes, I'm looking at Octrees. What is really needed here is 'clustering' - quickly get all of the vertices that are considered equivalent grouped together. Then it would just be a matter of working on clusters - skipping all singular references.

Thanks,
Robert

Kuroyume0161
12 December 2006, 05:46 AM
One idea that springs to mind is, reference all the vertices through a list that holds their position in the vertex list (as the polys have to reference them) then sort the list based on their X, then Y, then Z positions and do a binary search rather than looking up the value of each and every vertex.

I'm still not certain how A) fast this would be and B) how to do it. Sorting by X is easy. Subsequent sorts are not so easy. How are you sorting the Y values while retaining the X sort (which, one would assume, should be maintained). You end up with one sort and then a multitude of mini sorts and an even greater multitude of mini-mini sorts. You can't just split the vertex components apart - and if you did, you'd need to maintain the indices on each component. Three sorts is costly in time - no matter how fast the sort algorithm is - and especially when dealing with floating point values!

The Octree solution looks interesting as it can partition the search space to isolate sections of the vertices and reduce the search space per vertex. But it does require some preprocessing and structure preparation before it can be put into effect. There are also boundary issues with Octrees (loose Octrees specifically address this) - that is, what to do with things that may belong to two partitioned spaces - say a vertex that happens to sit on the partition plane of the octree subdivision.

For such a simple problem, very little is said about it anywhere - and I've done several dozen variant searches over the past week. The simple question is "how do you quickly find equivalent vertices in a vertex list or array". Obviously, not by looking for existing solutions!

One thing that I did ponder was computation of a single value that encapsulates the vertex position uniquely, but there is none of which I'm aware. Normals or component-averaging and such don't convey unique 3d positions. This is why octrees are interesting since they clump vertices by position and limit the search space accordingly. And I should mention that even a single X component sort should provide a speed advantage as it will quickly traverse the compare list - again, if the next X value is greater than epsilon, move on.

Robert

UrbanFuturistic
12 December 2006, 11:49 AM
I was thinking more along the lines of

if (X1 == X2)
if (Y1 < Y2)
{Sort by the Y value}
else
if (Y1 == Y2)
if (Z1 < Z2)
{sort by the Z value}

Of course, again you're actually sorting an index to the vertices and not the values of the vertice themselves so you might end up with an array with the values 9, 7, 8, 5, 3, 4, 1, 0, 2, 6 or you might end up with a binary tree holding similar values referring purely to the position in the vertex list. Or, if you wan't to speed things up a little more, create an array/linked list of pointers/structs with pointers that directly reference the arrays/structs holding the vertex info. That way, you can treat the data as both sorted and unsorted data depending on which pointers you use to access it.

Kuroyume0161
12 December 2006, 04:59 AM
Well, after sort of winding my way through various levels of search, it appears that there is quite a lot of related discussion on problems similar (if not identical) to this type of data problem - all-nearest neighbor or all-closest-points problems. The usual solutions seem to be forms of kd-trees and/or hash tables. And despite Mr. Bateman's berating, this is a TOUGH problem - one that has consumed hundreds of papers and probably thousands of books and lectures. Why? Why? Because you didn't understand the problem and dismissed it out of hand...

The great thing is that the fastest are O(n log n) with O(1) in searches many times. The bad thing is that, as usual, these PhDs are very good at explaining and providing complex maths and charts and before-and-after illustrations and time comparisons, but rarely is a good pseudo-code provided (let alone actual working code!). Supposedly Vaidya has a great O(n log n) k-nearest neighbor algorithm - if I could only get to his goddarned paper! Hard to see the advantage if one cannot even read the paper (grumble).

Interestingly enough, one solution actually does a tri-leveled hash table (for each coordinate component), but the explanation is complex for implementation. The preprocess involves five passes (!) alone. I'd rather something that does a simple cubic space partitioning with a possible hash table. This is towards where my current searches are filtering.

Wish me luck and thanks again! :)

Robert

UrbanFuturistic
12 December 2006, 03:26 PM
Actually, that is a couple of things I hadn't thought of. I'm getting rusty :)

If you wanted to create a unique single-integer (or long) identifier, I suppose an md5sum like calculation on the vertices in string format eg 3.57312, 5.573689, -0.456845 would give a suitable unique identifier while still matching when the vertices matched... although the 128bitness of an actual md5sum might be a bit much. It should mean that you're comparing only one value rather than three, one after the other. This will also require a certain amount of syntax checking though and possibly removal of, for example, excessive white space. It's probably also worth considering to what degree of accuracy you want to compare vertices considering both the differing accuracy levels a program may write for vertices and rounding errors. 0.56, 0.560000 and 0.5600001 may end up the same in memory but not as a string which would make accurate comparison difficult. Maybe the easiest thing to do would be to convert to floats/doubles then convert them back to strings in a format you can predict the structure of and hash that.

I was going to question the exact value of generating a kd-tree, but then I remembered the problem of a one sided binary tree being about as much use as a chocolate teapot.

I'm not exactly sure what hashing each co-ordinate component individually would achieve other than returning an integer where a floating point would be, and it is so often much more efficient to work with integers, but you're still having to repeat an operation twice when a single operation would do if the components were concatenated and hashed as one value.

As for the papers, you can buy an electronic version for \$30 from this website (http://www.springerlink.com/content/1432-0444/), it's in Volume 4 (1989) of Discrete and Computational Geometry listed as AnO(n logn) algorithm for the all-nearest-neighbors Problem.

Alternatively, this pdf (http://www.cgg.cvut.cz/~havran/ARTICLES/ingo06rtKdtree.pdf) cites the same paper although you'll need to bone up on your boolean algebra (http://en.wikipedia.org/wiki/Boolean_algebra) and Set Notation (http://en.wikipedia.org/wiki/Set) (I don't know how good you are with these).

Kuroyume0161
12 December 2006, 07:31 PM
It didn't occur to me either. :) As you say, if you haven't been working with a similar problem for a long time, it's easy to forget about these available solutions and what they're called.

Some of these espouse some sort of binary tree (red/black for instance), with others the hashing is part of the tree structuring, while with others the hash table is separate from the tree. There seem to be several approaches to generating hash keys for 3D vectors. The one seen on GameDev.net was rather, hmmm, unique (probably won't use that one). One of the papers uses something like:

Hash Table Size/(MaxSum-MinSum)

But the generation of MaxSum isn't explained well (does it summate a hash bin or the entire point set?).

As can be seen in my code, I'm using an epsilon equivalence, epsilon=(0.00001*'a-scaling-value'), for the vectors. The scaling value is the one used to scale up the vertices during import (Poser values tend to be minute, 1.0 equalling 8'. Using them in Cinema 4D needs between 500-1000 times scaling to avoid aliasing factors in accuracy).

I see that that paper does indeed love Boolean Algebra - their pseudo-code is practically a calculus proof. No problems with boolean algebra or set notation.

Vaidya's paper is available at ACM, but my membership there has expired recently. Will need to get it back in order to have access to it and other papers. Amazingly, nothing of his at IEEE, so I take it that is not part of his membership realm for publishing.

Thanks,
Robert

CGTalk Moderation
12 December 2006, 07:31 PM
This thread has been automatically closed as it remained inactive for 12 months. If you wish to continue the discussion, please create a new thread in the appropriate forum.

1