SSE/SSE2 & Full 64Bit Instructions Support?


#1

Hi,

I've been wondering whether SSE/SSE2 support is currently utilized in Cinema 4D 8.x.

And also, will there ever be a full 64-bit version?

Under Preferences / Render Threads in Windows XP 64, I tried setting the thread count to 2 and so on, but I noticed it renders a few seconds slower than if I leave it at Optimal…

Many thnx

Chris


#2

Cinema supports SSE. There's no 64-bit support yet, and since Microsoft has no good 64-bit SDK, you'll have to wait for it. The official 64-bit XP comes out in the 1st or 2nd quarter of 2005, so there's no need to hurry.


#3

If you set the number of threads higher than the number of CPUs you have (HT CPUs count as 2), then CINEMA will be a bit slower. However, sometimes it's nice to see more parts of the picture rendered at one time so you can spot errors earlier on.
Cheers
Srek
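
As a rough illustration of the rule of thumb above (not how CINEMA picks its "Optimal" setting, which the thread doesn't describe), here is a minimal Python sketch that caps a requested thread count at the number of logical CPUs. `pick_render_threads` is a made-up helper name; `os.cpu_count()` reports logical CPUs, so a Hyper-Threading CPU counts as 2, matching the comment above.

```python
import os

def pick_render_threads(requested=None):
    """Cap a requested thread count at the number of logical CPUs.

    Hypothetical helper for illustration only.
    """
    logical = os.cpu_count() or 1  # cpu_count() can return None
    if requested is None:
        return logical             # no preference: one thread per logical CPU
    return max(1, min(requested, logical))

print(pick_render_threads())      # one thread per logical CPU
print(pick_render_threads(1000))  # capped at the logical CPU count
```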


#4

Hi,
Thanks for clearing things up, but my Athlon 64 system does have hyperthreading, so it shouldn't slow down at 2 threads. Then again, it's only a couple of seconds! :wink:

Thanks
Chris


#5

“If you set the number of threads higher than the number of CPUs you have (HT CPUs count as 2), then CINEMA will be a bit slower. However, sometimes it's nice to see more parts of the picture rendered at one time so you can spot errors earlier on.”

Well, do you know what I’m still missing from the old days when I used Bryce?

Here’s how the rendering works there, if you don’t know anyway…

It always starts with big image blocks, rendering only a fraction of the image pixels: one in each 64 x 64 block, if I remember correctly. Then it renders one pixel in each 32 x 32 pixel block, and so forth…
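
The coarse-to-fine order described above can be sketched roughly like this in Python. This is only an illustration of the idea as recalled in the post, not Bryce's actual code; the function name and the `start_block` parameter are assumptions made up for the example.

```python
def progressive_passes(width, height, start_block=64):
    """Yield (block_size, pixels) per pass, from coarse to fine.

    Each pass samples the top-left pixel of every block of the current
    size and skips pixels an earlier pass already produced, so every
    pixel is rendered exactly once by the time block size reaches 1.
    """
    done = set()
    block = start_block
    while block >= 1:
        pixels = []
        for y in range(0, height, block):
            for x in range(0, width, block):
                if (x, y) not in done:
                    done.add((x, y))
                    pixels.append((x, y))
        yield block, pixels
        block //= 2  # halve the block size for the next pass

# Tiny example: an 8 x 8 image, starting with 4 x 4 blocks.
for block, pixels in progressive_passes(8, 8, start_block=4):
    print(block, len(pixels))
```

The early passes touch very few pixels, which is why a rough version of the whole picture shows up quickly. The sketch says nothing about the performance cost the post wonders about.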

This is incredibly valuable for spotting an error as early as possible. Maybe there are performance reasons not to do this, I really don't know. However, if it doesn't matter, or if the impact is not too massive, this might be a nice alternative rendering mode.

Maybe in R9.1? :wink: :thumbsup:

Cheers

Kabe


#6

Srek,

When rendering with two processors, there is a delay each time one of the processors finishes its “field” and is inserted in the middle of the other processor’s “field”. This delay is very long sometimes. Would it be possible that the second processor starts at the bottom and starts rendering upwards and so the lines of the two processors would only meet once? Am I saying something stupid?

Jorge Arango


#7

That would only work for two processors/threads.


#8

Actually, every indication I’ve heard so far bodes very well for Whidbey. The compiler performance so far has been good, and it’s not like it hasn’t been put through its paces.

The only catch right now is that XP64 SP2 is still in beta.


#9

Of course, but I’d bet that more than 90% of multiprocessor systems that work with C4D have two processors. For 4 processors, two of them would start at the middle. There would be one meeting for every pair of processors.

Anyway I value your opinion. Do you think that, at least for two processor systems, it could be implemented and it would improve performance?

Jorge Arango


#10

My guess is that it wouldn’t be hard to implement.

However, whether or not it would improve performance is an open question; what they're doing is dynamically allocating processor power where it's needed. If you watch the render bars, you'll see that in almost every scene, some parts of it render more quickly than others.
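
To make the "dynamic allocation" point concrete, here is a toy Python simulation comparing a fixed up-front split with workers that grab the next line whenever they are free. It is a simplified model with made-up per-line costs, not CINEMA's scheduler, and it ignores the preprocessing delay discussed elsewhere in this thread.

```python
import heapq

def static_split_time(costs, workers):
    """Each worker gets one contiguous slice up front.
    The frame is done when the slowest slice is done."""
    n = len(costs)
    bounds = [n * i // workers for i in range(workers + 1)]
    return max(sum(costs[bounds[i]:bounds[i + 1]]) for i in range(workers))

def dynamic_time(costs, workers):
    """Whichever worker is free first takes the next line.
    The heap holds the time at which each worker becomes free."""
    free_at = [0] * workers
    heapq.heapify(free_at)
    for cost in costs:
        t = heapq.heappop(free_at)        # earliest available worker
        heapq.heappush(free_at, t + cost)
    return max(free_at)

# Hypothetical frame: the top half is cheap, the bottom half is expensive.
costs = [1] * 50 + [9] * 50
print(static_split_time(costs, 2))  # 450
print(dynamic_time(costs, 2))       # 250
```

With these invented costs, the fixed split leaves one worker idle most of the time, while the dynamic version keeps both busy. Real scenes will of course behave differently.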

Anyway, just wait until multi-core, multi-processor systems with SMT (or HyperThreading in Intel-speak) start to become common :wink:


#11

OK Thalaxis, thanks for your input.

Jorge Arango


#12

I would be extremely surprised to see an Athlon or Opteron system with Hyper-Threading :wink:
It's Intel only, and it's included only in the current P4 and Xeon processors.
Cheers
Srek


#13

The delay happens when the thread has to preprocess some information before it can start to render. This will happen regardless of where the rendering line starts. As for a different starting point, I simply don't know, but it would only make sense for two threads; with four or more it's not that useful.
Cheers
Srek


#14

I see no way it could improve performance.
Cheers
Srek


#15

I think so, since the much larger available memory range is a godsend for 3D rendering.
You should be aware, though, that switching to 64-bit means the application will only run on 64-bit systems and that none of the old plugins will work with it. All of them would at least have to be recompiled.
Cheers
Srek


#16

Like Srek said, no it doesn’t… but next year you might be able to put a dual-core processor in there to replace the one you have :slight_smile:


#17

Hi,

Sorry, I only got the new system last week. After looking at the mainboard box, it's HyperTransport, not Hyper-Threading…

My bad… :wink:

Thnx

Chris


#18

Srek, talking of memory, will R9 be able to take advantage of the 3GB switch in XP?

Cheers
JDP


#19

Yes, it takes advantage of it. It works already with the demo.
Cheers
Srek
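
For context on the memory discussion above, here is the address-space arithmetic as a small Python sketch. The figures are the standard ones for 32-bit Windows (2 GiB of user address space per process by default, up to 3 GiB with the /3GB switch for executables marked large-address-aware). They are limits on address space, not a statement about how much memory CINEMA actually uses.

```python
GIB = 2 ** 30  # bytes in one gibibyte

# A 32-bit pointer can address 2**32 bytes in total.
total_32bit_gib = 2 ** 32 // GIB

# 32-bit Windows splits that between each process and the kernel.
user_default_gib = 2   # default split: 2 GiB user, 2 GiB kernel
user_3gb_switch_gib = 3  # with /3GB, for large-address-aware programs

# A 64-bit pointer can in theory address 2**64 bytes
# (real CPUs and operating systems expose far less than this).
total_64bit_gib = 2 ** 64 // GIB

print(total_32bit_gib)                         # 4
print(user_3gb_switch_gib - user_default_gib)  # 1 extra GiB from the switch
print(total_64bit_gib)                         # 17179869184
```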


#20

As you said:

“The delay happens when the thread has to preprocess some information before it can start to render. This will happen regardless of where the rendering line starts.”

On two processor systems one rendering line usually catches the other several times during the rendering of one frame, thus several of these delays. If the threads start at opposite ends, they will only meet once, and then the frame is finished rendering. The preprocessing will only happen once: at the beginning of each frame.
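
The opposite-ends idea can be modeled with a short Python sketch. This is a toy model with invented per-line costs. It only shows that two workers starting at opposite ends meet exactly once and that the meeting point shifts with the line costs. It does not model the preprocessing delay, so it cannot settle whether the proposal would actually speed up CINEMA.

```python
def render_from_both_ends(costs):
    """Two workers: one walks down from line 0, the other walks up
    from the last line. Whichever is free first takes its next line.

    Returns (lines done by top worker, lines done by bottom worker,
    finish time).
    """
    top, bottom = 0, len(costs) - 1
    t_top = t_bottom = 0
    while top <= bottom:  # stop as soon as the two workers meet
        if t_top <= t_bottom:
            t_top += costs[top]
            top += 1
        else:
            t_bottom += costs[bottom]
            bottom -= 1
    return top, len(costs) - 1 - bottom, max(t_top, t_bottom)

# Uniform lines: the workers meet in the middle.
print(render_from_both_ends([1] * 10))  # (5, 5, 5)

# Cheap top half, expensive bottom half: the top worker covers more lines.
done_top, done_bottom, finish = render_from_both_ends([1] * 50 + [9] * 50)
print(done_top, done_bottom, finish)
```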

Jorge Arango