MS STL String
doctorx256
07-09-2008, 11:00 AM
I am using the Standard Template Library string in Microsoft Visual Studio 2005 SP1, and it proves that Dinkumware's code for std::string is an unbelievable resource hog.
In some cases Dinkumware's string is 1000 times slower than a dynamically allocated C char array, and this is not an overstatement. I have used strstream from both Dinkumware and from a guy in the Boost library, and again found them dozens of times slower than a char array.
I am searching for a string library (preferably only a header file, not a whole library/framework) that is foremost fast and memory efficient. Conformity with the STL would be a plus.
Any ideas....
PS. I think that the whole STL library that Dinkumware has sold to Microsoft is unbelievably slow. If anyone has found a really good implementation of the STL, a link to that would be nice too.
Robert Bateman
07-09-2008, 01:23 PM
I am using the Standard Template Library string in Microsoft Visual Studio 2005 SP1, and it proves that Dinkumware's code for std::string is an unbelievable resource hog.
"Dinkumware is included in VC2005" therefore "std::string is a resource hog".
Am I the only one who finds this argument to be slightly flawed?
In some cases Dinkumware's string is 1000 times slower than a dynamically allocated C char array, and this is not an overstatement.
So again, using your words, your argument is:
std::string is 1000 times slower doing something than char[].
However, char[], by its very definition, doesn't have the ability to do anything. So your statement can be rewritten as:
std::string is 1000 times slower doing something than doing nothing.
Well compared to doing nothing, making a cup of tea takes a lifetime....
I have used strstream from both Dinkumware and from a guy in the Boost library, and again found them dozens of times slower than a char array.
Then I seriously doubt your testing methods. You are still testing something against nothing and finding out that doing nothing is always faster than doing something.
I am searching for a string library (preferably only a header file not a whole library/framework) that is foremost fast and memory efficient. Conformity with the STL would be a plus.
Any ideas....
How about STL?
PS. I think that the whole STL library that Dinkumware has sold to Microsoft is unbelievably slow.
Now we are starting to get somewhere. I will retort with the following:
I know that the whole STL library that Dinkumware has sold to Microsoft is not slow.
STLPort used to be quicker pre-VC2005, but that's certainly not the case anymore. STLPort does have a few additional 'optimisations' you can enable (such as preventing all strings from being NULL terminated), but all they tend to do is make your development times longer and make it harder to debug.
I've spent a lot of time working in the dinkumware STL implementation, and have even re-written some parts for 16 byte alignments and other small requirements. There really is nothing left in there once you switch to a release build.
Granted it may be slow if you start abusing the containers for things they never intended (eg, using vector and inserting/removing frequently instead of using a list or deque), but that's not really a case of STL being slow, you're just using the wrong container.
You *can* find that resizes are taking too long, but that's the fault of malloc, not STL - and you can always replace the allocator.
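To make the "replace the allocator" suggestion concrete, here is a minimal sketch of where a custom allocator plugs into the string type. The names CountingAllocator, FastString and allocationsForLongString are hypothetical, invented for illustration. The allocator below only forwards to std::malloc and counts calls, so it is not faster by itself; a pool or arena would go in its place. It also assumes the C++11 minimal allocator interface, which postdates VC2005.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical allocator for illustration only: it forwards to std::malloc
// and counts calls. A real replacement would draw from a pool or arena.
static std::size_t g_allocations = 0;

template <class T>
struct CountingAllocator {
    typedef T value_type;

    CountingAllocator() {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Same interface as std::string, different memory source.
typedef std::basic_string<char, std::char_traits<char>, CountingAllocator<char> > FastString;

// Returns how many allocator calls one 50-character string needs.
std::size_t allocationsForLongString() {
    g_allocations = 0;
    FastString s("12345678901234567890123456789012345678901234567890");
    return g_allocations;
}
```

Since FastString keeps the std::string interface, calling code does not change; only the typedef does.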
There is some overhead with std::vector, std::list etc., since they have to call the copy ctor and destructor on all array elements - but that's not going to be a problem with a std::string, since we're only talking about copying chars around.
If anyone has found a really good implementation of the STL, a link to that would be nice too.
link (http://www.dinkumware.com/)
doctorx256
07-09-2008, 07:11 PM
hm.. i still don't think std::string is as fast as it needs to be...
indeed i don't think it is safe either... take a look at what i found today...
http://bstring.sourceforge.net/features.html
http://www.sgi.com/tech/stl/string_discussion.html
http://bstring.cvs.sourceforge.net/*checkout*/bstring/tree/bstrlib.txt?pathrev=HEAD
i think that string is logged as having a bug in Microsoft's database because it is extremely slow, but it couldn't be fixed because that would break compatibility with existing applications. i can't find the thread tho
PS.
i can't stand java and C# having faster string manipulation/allocation than c++, i just knew there would be something better than """"STANDARD""""....
Robert Bateman
07-10-2008, 04:21 PM
http://bstring.sourceforge.net/features.html
This document is over 8 years old and refers to std::string as found in VC++6. We all know that was slow. That does not refer to STL provided with 2005 (which is substantially better). (Note at the bottom, they are using VC 6.0 (12.00.8168), Intel 6.0, and gcc 3.3.3)
http://www.sgi.com/tech/stl/string_discussion.html
This document is 11 years old and refers to string (note! not std::string). This is an ancient SGI implementation that is the forerunner to STL, but is not itself STL. STL (and the std namespace) came around afterwards to address performance issues with the original library. This has nothing to do with the STL found in VC2005.
http://bstring.cvs.sourceforge.net/*checkout*/bstring/tree/bstrlib.txt?pathrev=HEAD
Since bstring provides comparisons against the performance of the STL found in VC++6, I suspect the results are highly flawed and do not compare against the current Dinkumware STL implementation (which is about 5 or 6 times faster than the one that shipped with VC++6 - which was admittedly crap).
i think that string is logged as having a bug in Microsoft's database because it is extremely slow, but it couldn't be fixed because that would break compatibility with existing applications. i can't find the thread tho
slow for what?
PS. i can't stand java and C# having faster string manipulation/allocation than c++, i just knew there would be something better than """"STANDARD""""....
C# and Java have better allocators (as I said before, malloc is known to be slow); however, raw string performance itself is faster in C++. If malloc is therefore your bottleneck, go and replace the STL allocator and it'll go considerably faster.
doctorx256
07-10-2008, 08:42 PM
sorry, i was confused, you are right, your last post cleared my mind.
i think i should check on MFC string, wxWidgets, Trolltech's string implementation and check IF there is something faster than std::string...
so i will ask you something in a completely different direction...
if i used a smart pointer which pointed to a char[] and that smart pointer was stored in a vector, wouldn't that be faster and more memory efficient than using strings stored in vectors? Given that char[] is way faster than strings.
i think shared_ptr of tr1 should be avoided; which type of smart pointer would work for us in the above question...
i hope i was clear enough in the post above, because i seem to produce random thoughts from time to time...
cheers mate
UrbanFuturistic
07-10-2008, 09:42 PM
Who's to say that a particular implementation of String isn't just a dynamically remalloced (OK, sorry, re 'new'ed) char array with associated method set?
Using 'smart' pointers in a Vector is not going to get around the inherent problems of using Vectors. If you need fast random access to individual data items you need to work with less restrictive data structures that offer more immediate access than the push/pop of vectors, like Robert said.
String, on the other hand, will be exactly as fast as its implementation. The whole point of C++ (and object-oriented languages in general) is that the interface has no impact on the underlying implementation so you can be using the exact same object creation and method calls and just swap out one implementation for another without changing a single thing in the code outside the class.
With an efficient enough compiler and an efficient implementation there should be no advantage whatsoever to dropping String unless you count possible memory use reduction for embedded systems but well managed C++ can easily be as fast as well managed C.
Robert Bateman
07-12-2008, 10:37 PM
i think i should check on MFC string, wxWidgets, Trolltech's string implementation and check IF there is something faster than std::string...
No, there is not. The strings you have mentioned all feature DLL linkage rather than a fast inlined template. I assume you've investigated std::string fully before writing off STL, so your research has already directed you to:
#define _SECURE_SCL 0
If you do not know what I'm talking about, then I think it's fair to assume you've been premature in writing off its performance. STL is configurable, so if you want speed, replace the allocator and disable the runtime error checking. As I've said to you before, you can replace the allocator for std::string and improve performance significantly. Some people have reported speed increases of up to a factor of 50x. malloc is slow, but std::string is not.
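For reference, a sketch of how that define is typically applied. This is an assumption-laden illustration rather than project settings from the thread: _SECURE_SCL is specific to the Visual C++ 2005/2008 libraries (later Visual Studio versions use _ITERATOR_DEBUG_LEVEL for the same purpose), it has to be seen before the first standard header, and every translation unit and linked library should agree on its value. The function totalLength is a hypothetical example.

```cpp
// MSVC 2005/2008 only: must come before any standard header is included.
// Other compilers do not look at this macro, so it is harmless there.
#ifndef _SECURE_SCL
#define _SECURE_SCL 0
#endif

#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example function: sums the lengths of all strings.
// With checked iterators enabled, iterator operations like the ones in
// this loop carry extra validation in release builds.
std::size_t totalLength(const std::vector<std::string>& v) {
    std::size_t n = 0;
    for (std::vector<std::string>::const_iterator it = v.begin(); it != v.end(); ++it)
        n += it->size();
    return n;
}
```

Setting the macro in the project's preprocessor definitions instead of in source avoids the risk of one file being compiled with a different value.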
Given that char[] is way faster than strings.
No, it is not. char[] is an array. It has no functionality. std::string is a fast (and secure) string implementation that uses char[] as its storage mechanism.
so i will ask you something in a completely different direction...
if i used a smart pointer which pointed to a char[] and that smart pointer was stored in a vector, wouldn't that be faster and more memory efficient than using strings stored in vectors?
It's a waste of time. Most std::string implementations already provide this option. Those that do generally work faster with that functionality disabled. The simple fact is that trying to determine which strings are shared is an extremely difficult thing to do - as such it's generally more costly than simply copying the string (though there are a couple of contrived situations where it does give a performance benefit - but they are not things you see in real-world apps).
i think shared_ptr of tr1 should be avoided; which type of smart pointer would work for us in the above question...
which is generally a pointless thing to do.
scorpion007
07-13-2008, 01:05 PM
doctorx256, Perhaps you could post a sample test case where you find std::string significantly slower than your own implementation of some operation. That would provide something tangible we could look at and see if we could improve its performance.
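A test case of the kind requested here could be shaped roughly like the sketch below. The function names (fillStdStrings, fillCStrings, millisecondsFor) are hypothetical, and the timing uses std::chrono from C++11, which was not available in VC2005; it is an illustration of the shape of a fair comparison, not the code anyone in the thread ran. Both variants copy the characters and both free what they allocate.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

static const char kText[] = "12345678901234567890123456789012345678901234567890";

// Stores `count` std::string copies of kText; cleanup is automatic.
std::size_t fillStdStrings(std::size_t count) {
    std::vector<std::string> v;
    v.reserve(count);
    for (std::size_t i = 0; i != count; ++i)
        v.push_back(kText);
    return v.size();
}

// Stores `count` heap-allocated C-string copies of kText and frees them,
// so both variants do comparable work.
std::size_t fillCStrings(std::size_t count) {
    std::vector<char*> v;
    v.reserve(count);
    for (std::size_t i = 0; i != count; ++i) {
        char* copy = new char[sizeof kText];
        std::memcpy(copy, kText, sizeof kText);   // includes the '\0'
        v.push_back(copy);
    }
    const std::size_t n = v.size();
    for (std::size_t i = 0; i != n; ++i)
        delete[] v[i];
    return n;
}

// Wall-clock time of one call to fn(count), in milliseconds.
double millisecondsFor(std::size_t (*fn)(std::size_t), std::size_t count) {
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    fn(count);
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling millisecondsFor(fillStdStrings, 1000000) and millisecondsFor(fillCStrings, 1000000) several times in a release build, and comparing the medians, gives numbers that are easier to discuss than a description of the loop.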
doctorx256
07-13-2008, 10:49 PM
for comparison i iterate 1.000.000 times
char *ch;
ch = new char[51];
ch = "12345678901234567890123456789012345678901234567890";
and
string str;
str = "12345678901234567890123456789012345678901234567890";
storing them inside a vector
finding the latter 5 times slower than the first
although sometimes it executes only 60% slower without changing a line of code
when i do PGO it runs only 30% - 35% slower
BUT
when i run it in 64-bit release mode, string is 50% faster, which is really crazy
There must be something terribly wrong with my Visual Studio installation, or a bug maybe?
i don't know what it is, i just give up and use string...
PS.
Robert i hope i didn't upset u with my crazy suggestions because i have a feeling i did so...
cheers mate
scorpion007
07-14-2008, 12:59 AM
for comparison i iterate 1.000.000 times
char *ch;
ch = new char[51];
ch = "12345678901234567890123456789012345678901234567890";
Well, first off, this code is horribly wrong :). You allocate some space for ch, then you assign a constant string to the pointer. Why did you need to allocate memory if you're going to use a string constant?
Perhaps you meant:
char ch[] = "...";
or
char *ch = new char[N];
strcpy_s(ch, N, "...");
or even
const char *ch = "...";
In any case, you didn't provide a test case (a minimal compilable example that can be tested). It's hard to figure out what exactly you did by your description.
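The difference between assigning a pointer and copying the characters can be checked directly. The sketch below uses two hypothetical helpers, duplicate and isIndependentCopy, and portable std::memcpy in place of the Microsoft-specific strcpy_s shown above.

```cpp
#include <cstddef>
#include <cstring>

static const char kDigits[] = "12345678901234567890123456789012345678901234567890";

// Copies the text into freshly allocated storage. The caller owns the
// result and must release it with delete[].
char* duplicate(const char* text) {
    const std::size_t len = std::strlen(text) + 1;   // include the '\0'
    char* copy = new char[len];
    std::memcpy(copy, text, len);
    return copy;
}

// True when `candidate` holds the same characters as `original` but lives
// at a different address, i.e. it is a real copy and not just an alias.
bool isIndependentCopy(const char* original, const char* candidate) {
    return candidate != original && std::strcmp(original, candidate) == 0;
}
```

In the posted snippet the second assignment makes ch an alias of the literal, so the new char[51] buffer is never written to and can no longer be freed.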
playmesumch00ns
07-14-2008, 12:25 PM
Do you mean that you do
myvector.push_back( str );
1,000,000 times for each and compare the timings? If so it's the memory allocation that's screwing you as rob already said. Pushing a string onto a vector requires making a new string and copying its contents, whereas copying a char* just requires copying a char*.
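A small sketch of that asymmetry, with hypothetical function names. It assumes a C++11-or-later standard library, where std::string is not allowed to share buffers between copies; older copy-on-write implementations could behave differently.

```cpp
#include <string>
#include <vector>

// push_back on a vector of std::string copy-constructs a new string, so the
// stored element has its own character buffer.
bool stringElementOwnsItsCopy() {
    const std::string source("12345678901234567890123456789012345678901234567890");
    std::vector<std::string> v;
    v.push_back(source);
    return v[0] == source && v[0].data() != source.data();
}

// push_back on a vector of pointers copies only the pointer, so the stored
// element still refers to the original characters.
bool pointerElementSharesStorage() {
    const char* source = "12345678901234567890123456789012345678901234567890";
    std::vector<const char*> v;
    v.push_back(source);
    return v[0] == source;
}
```

So the string version pays for an allocation and a 50-character copy per element, while the pointer version pays for one pointer copy, which is why the two timings are not measuring the same thing.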
Robert Bateman
07-17-2008, 02:04 PM
for comparison i iterate 1.000.000 times
That's not really a real world test at all. All you are doing is allocating a load of memory. The std::string version cleans it up correctly, your char* version does not (and leaks - yet another reason why you should use std::string).
storing them inside a vector
which, as playmesumch00ns has said, will re-allocate all of the strings. This is a case of not using the correct container rather than a flaw with std::string. Try the test again using either std::deque or std::list and you'll find no such difference. Alternatively, if you *insist* on using std::vector, then you can always store pointers to std::string rather than the objects themselves (which will prevent the memory being re-allocated). The following code is a fairer test of std::string, using pointers to prevent the forced reallocation.
#include <cstring>
#include <string>
#include <vector>

// Owns heap-allocated std::string objects through raw pointers.
class StringTest
    : public std::vector<std::string*> {
public:
    StringTest() : std::vector<std::string*>() {}
    ~StringTest() {
        for(iterator it = begin(); it != end(); ++it)
            delete *it;
    }
    void addString(const char* str) {
        push_back( new std::string(str) );
    }
};

// Owns heap-allocated C strings.
class CStringTest
    : public std::vector<char*> {
public:
    CStringTest() : std::vector<char*>() {}
    ~CStringTest() {
        for(iterator it = begin(); it != end(); ++it)
            delete[] *it;   // allocated with new[], so release with delete[]
    }
    void addString(const char* str) {
        size_t len = strlen(str)+1;   // include the terminating '\0'
        char* data = new char[ len ];
        memcpy(data,str,len);
        push_back( data );
    }
};

const unsigned NUM_ITERS = 1000000;

void testStdString()
{
    StringTest test;
    for(unsigned i=0;i!=NUM_ITERS;++i)
        test.addString("12345678901234567890123456789012345678901234567890 ");
}

void testCString()
{
    CStringTest test;
    for(unsigned i=0;i!=NUM_ITERS;++i)
        test.addString("12345678901234567890123456789012345678901234567890 ");
}
If you know your containers though, you'll already know the problems with the above code can simply be avoided via reserving the correct capacity beforehand. This removes the need for the pointers completely with std::string (note the additional memory cleanup you *must* perform with C strings).
#include <cstring>
#include <string>
#include <vector>

// Stores std::string objects by value; no manual cleanup is needed.
class StringTest
    : public std::vector<std::string> {
public:
    StringTest() : std::vector<std::string>() {}
    void addString(const char* str) {
        push_back( str );
    }
};

// Owns heap-allocated C strings.
class CStringTest
    : public std::vector<char*> {
public:
    CStringTest() : std::vector<char*>() {}
    ~CStringTest() {
        for(iterator it = begin(); it != end(); ++it)
            delete[] *it;   // allocated with new[], so release with delete[]
    }
    void addString(const char* str) {
        size_t len = strlen(str)+1;   // include the terminating '\0'
        char* data = new char[ len ];
        memcpy(data,str,len);
        push_back( data );
    }
};

const unsigned NUM_ITERS = 1000000;

void testStdString()
{
    StringTest test;
    test.reserve(NUM_ITERS);   // one up-front allocation for the element array
    for(unsigned i=0;i!=NUM_ITERS;++i)
        test.addString("12345678901234567890123456789012345678901234567890 ");
}

void testCString()
{
    CStringTest test;
    test.reserve(NUM_ITERS);
    for(unsigned i=0;i!=NUM_ITERS;++i)
        test.addString("12345678901234567890123456789012345678901234567890 ");
}
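The effect of reserve() can be observed without a timer by watching the vector's capacity. This is a minimal sketch with a hypothetical helper, countReallocations; it counts capacity changes as a stand-in for reallocations and does not measure time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts how often the vector's capacity changes while `count` strings are
// appended. Each change means the element array was reallocated and the
// existing strings had to be copied or moved into the new storage.
std::size_t countReallocations(std::size_t count, bool reserveFirst) {
    std::vector<std::string> v;
    if (reserveFirst)
        v.reserve(count);

    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i != count; ++i) {
        v.push_back("12345678901234567890123456789012345678901234567890");
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserveFirst set to true the count is zero; without it, the count grows with the number of elements at a rate that depends on the library's growth factor.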
finding the latter 5 times slower than the first
but it's not std::string that's slower here. It's the malloc/free combination that you are forcing std::string to do by using the incorrect container....
There must be something terribly wrong with my Visual Studio installation, or a bug maybe?
Yes, you have a bug. The bug however is not in std::string, or in VC++, it's in the way you are using STL.
Robert i hope i didn't upset u with my crazy suggestions because i have a feeling i did so...
cheers mate
You didn't upset me in any way, I just prefer to nip misinformation and misunderstanding in the bud before it becomes a pseudo-i-read-it-on-a-forum-somewhere-so-it-must-be-true fact.....
doctorx256
07-17-2008, 04:28 PM
Thank you very much for the info. I did some of the things that you guys mentioned here, and I now find that std::string performs as I expected it to.
My conclusion is that std::string should be used because it is fast and convenient.
I think I have some "sharp edges" in my understanding of STL and should investigate further; I should find a more detailed online tutorial/reference for STL than the one I currently use.
Anyway, thanks to all for clearing this up.
mummey
07-24-2008, 06:01 PM
I had to read this beginning-to-end for a class and write papers on each chapter:
http://www.amazon.com/gp/product/0201889544/
CGTalk Moderation
07-24-2008, 06:01 PM
This thread has been automatically closed as it remained inactive for 12 months. If you wish to continue the discussion, please create a new thread in the appropriate forum.