Comments on: Keep data close, part 1

An array with unspecified length as the last member of a struct is only allowed in C99; that might be the reason MSVC++ dislikes it.

By: Dan Olson, Mon, 23 May 2011 08:33:10 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4748)

A zero-length array in this case is disallowed by the standard and nonportable, which makes it the opposite of legitimate in some ways. http://herbsutter.com/2009/09/02/when-is-a-zero-length-array-okay/

@Alexandre Trog : Hey ;)

“Moreover, with templates you will increase the code size, since for each different string size specified, it will generate a new class.”

(I’ll assume that by “code size” you’re referring to the final binary size. Tell me if I’m wrong. :) )

Well, that assertion isn’t strictly “exact”; we shouldn’t generalize or assume it’s always true (even in Peter’s example).
In practice, it depends on:
- the code (I mean what exactly is in the member functions, if any, and whether they can all be inlined)
- the compiler (including different versions of the same compiler!)
- the compilation mode (the real-world, heavily templated application I’m working on, with meta-programming magic and other simpler things: debug = 356 MB, release = 3 MB under gcc 4.4, with debug and release being non-optimal configurations, checked today; debug mode was over 500 MB before I used a simple C++-specific, not C, trick)
- therefore: the target platform
- therefore: the ability of the coder to understand what he’s doing with such template expressions

AFAIK, the standard doesn’t guarantee that code will be generated from template code, but in a lot of cases some may be. Depending on the compiler and compilation mode, some code that exists only at compile time has no use at runtime and will simply be discarded before linking. With some compilers, link-time and whole-program optimization can even factor out common code in some ways. Some compilers (I didn’t check that one) will in some cases generate only one class, with additional code that manipulates a void* type, instead of duplicating the class code for each type parameter T.

I just wanted to point out that “template code grows your binary size” isn’t a safe assumption; it depends entirely on what you’re doing with such code and how you’re compiling it.

The only thing that is guaranteed, in practice though not by the standard, is a compilation time explosion (if you don’t really know what you’re doing). :D (although a compiler like clang can reduce that A LOT)
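To make the “one shared implementation” idea concrete, here is a minimal sketch of doing that hoisting by hand rather than relying on the compiler. The names thin_fixed_string and detail::assign_impl are hypothetical and not from the article; whether this actually shrinks a given binary still depends on the compiler and build mode, as said above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace detail {
// Non-template helper: compiled once, however many sizes are instantiated.
// In a real project this would typically live in a single .cpp file.
// Returns false and leaves buf unchanged if text (plus '\0') does not fit.
bool assign_impl(char* buf, std::size_t capacity, const char* text) {
    assert(buf != nullptr && capacity > 0);
    const std::size_t n = std::strlen(text);
    if (n + 1 > capacity) return false;
    std::memcpy(buf, text, n + 1);
    return true;
}
}  // namespace detail

// Thin wrapper: each instantiation adds only storage and a forwarding call.
template <std::size_t N>
class thin_fixed_string {
    static_assert(N > 0, "need room for at least the terminator");

public:
    thin_fixed_string() { buf_[0] = '\0'; }

    bool assign(const char* text) {
        return detail::assign_impl(buf_, N, text);
    }

    const char* c_str() const { return buf_; }

private:
    char buf_[N];
};
```

With this shape, thin_fixed_string<8> and thin_fixed_string<32> are still distinct types, but the copying logic exists once, and the per-size code is small enough that it will usually be inlined away.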

By: Tomasz Dąbrowski, Thu, 19 May 2011 15:48:06 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4568)

@Peter : With what you suggest, it is not possible to have the string size specified at runtime. Moreover, with templates you will increase the code size, since for each different string size specified, it will generate a new class.

@Tomasz : What you suggest is perfectly suited to fixed-size strings, and terrible for varying-size strings (appending text, for example). Typically, string data loaded from disk that won’t need to change is a perfect match for your solution.
I used to do that a lot, but instead I simply used a char* that I patched, which only happened once: at load time.

We should always remember that it is essential to specify the context.

By: Peter Bindels, Thu, 19 May 2011 09:27:42 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4536)

You’re making a string class that holds an immutable string: you can’t make it longer or shorter. So why not use C++ techniques to do that, if that’s what you want? If you want a dynamically sized one, std::string does that, and it requires an extra allocation.

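#include <cstddef>
#include <exception>

template <std::size_t N>
class fixed_string {
public:
    // Take the source by reference-to-array so that M can be deduced.
    template <std::size_t M>
    fixed_string(const char (&source)[M]) {
        if (M > N) throw std::exception();
        // ...
    }
private:
    char buffer[N];
};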

No placement new, no hacky solutions, no C programming in C++.
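For reference, a compilable sketch of that idea might look like the following. It is an illustration rather than Peter’s exact code: it assumes the source is a string literal (so M includes the terminating '\0'), and it swaps the runtime throw for a static_assert, since both sizes are known at compile time. The c_str() and capacity() members are additions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

template <std::size_t N>
class fixed_string {
public:
    // Binding to a reference-to-array lets the compiler deduce M,
    // the size of the literal including its terminating '\0'.
    template <std::size_t M>
    fixed_string(const char (&source)[M]) {
        static_assert(M <= N, "string literal does not fit in the buffer");
        std::memcpy(buffer, source, M);
    }

    const char* c_str() const { return buffer; }
    static constexpr std::size_t capacity() { return N; }

private:
    char buffer[N];
};
```

Something like fixed_string<16> s("hello"); then compiles, while fixed_string<4> s("hello"); is rejected at compile time instead of throwing at runtime.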

By: Garett Bass, Wed, 18 May 2011 19:36:37 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4485)

I was not precise. GCC emits a “crash” instruction (UD2) instead of the actual printf call, because it doesn’t accept non-POD objects in vararg calls (in this case, printf(format, …)).
string_data is not a problem at all. The compiler complains about the string class itself (unfortunately, a class is considered non-POD if it has constructors, a destructor, or operator=, even if it contains only simple data).

Check it out (very minimal): http://ideone.com/qEwXn

C++0x relaxes the POD definition, and this is fine in G++ in C++0x mode. But in regular mode it instantly crashes.

I haven’t compiled this in GCC yet, so I’m not sure what it would say, but MSVC is perfectly happy with this:

void IncrementRefCount (string& s) {
    string_data* p((string_data*)(s.data - sizeof(string_data)));
    p->refcount += 1;
}

Seems like a fairly standard C-style cast, so I wouldn’t expect any warnings.

Is the non-POD complaint due to the “char data[0]”? If so, it should be OK to modify as follows:

struct string_data {
    int refcount;
    int length;
    char data[8]; // may as well be 16-bytes!
};

void IncrementRefCount (string& s) {
    // use offsetof instead...
    string_data* p((string_data*)(s.data - offsetof(string_data, data)));
    p->refcount += 1;
}
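As a self-contained sketch of the offsetof variant, the snippet below can be compiled on its own. The names string_ref, header_of and increment_refcount are hypothetical stand-ins for the string class and function discussed here. Stepping backwards from a member pointer to the enclosing struct works on mainstream compilers, but the standard is strict about this kind of pointer arithmetic, so treat it as a common idiom rather than guaranteed behavior.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

struct string_data {
    int refcount;
    int length;
    char data[8];  // fixed-size tail, as in the snippet above
};

// Hypothetical handle type: holds only a pointer to string_data::data.
struct string_ref {
    char* data;
};

// Step back from the character pointer to the start of the header.
inline string_data* header_of(string_ref& s) {
    assert(s.data != nullptr);
    return reinterpret_cast<string_data*>(s.data - offsetof(string_data, data));
}

inline void increment_refcount(string_ref& s) {
    header_of(s)->refcount += 1;
}
```

Using offsetof rather than sizeof matters once the tail array has a nonzero size, because sizeof(string_data) then includes the array (and any padding) while the offset of data does not.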

By: Tomasz Dąbrowski, Wed, 18 May 2011 15:23:28 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4448)

I use this technique in my string class, with one slight modification:

struct string_data
{
    int refcount;
    int length;
    char data[0]; // as mentioned above
};

struct string
{
    char * data; // points to string_data::data!
};

To recover the string_data object, I just subtract sizeof(string_data) from string::data. This conveniently allows me to pass string objects to printf() as though they were char*, without needing the std::string-style c_str() call.
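A rough sketch of how such a string might be allocated and released is shown below. It makes some assumptions that are not in the comment: data[1] replaces the nonstandard data[0], so the header is recovered with offsetof instead of sizeof, and make_string, header_of and destroy_string are hypothetical helper names. Writing past the declared one-element tail is the classic “struct hack”, which compilers tolerate in practice but the C++ standard does not bless.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct string_data {
    int refcount;
    int length;
    char data[1];  // standard C++ stand-in for the nonstandard data[0]
};

struct string {
    char* data;  // points at string_data::data
};

// One malloc holds the header and the characters side by side.
inline string make_string(const char* text) {
    const std::size_t len = std::strlen(text);
    // sizeof already covers one char of data, which serves as the '\0' slot.
    void* raw = std::malloc(sizeof(string_data) + len);
    if (raw == nullptr) std::abort();
    string_data* hdr = static_cast<string_data*>(raw);
    hdr->refcount = 1;
    hdr->length = static_cast<int>(len);
    char* chars = static_cast<char*>(raw) + offsetof(string_data, data);
    std::memcpy(chars, text, len + 1);
    string s = { chars };
    return s;
}

// Step back from the characters to the header in front of them.
inline string_data* header_of(const string& s) {
    assert(s.data != nullptr);
    return reinterpret_cast<string_data*>(s.data - offsetof(string_data, data));
}

inline void destroy_string(string& s) {
    std::free(header_of(s));
    s.data = nullptr;
}
```

With this layout, printf("%s\n", s.data) passes a plain char* through the varargs call, which avoids any question about handing a class object to printf.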

By: David Sveningsson, Tue, 17 May 2011 15:10:46 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4287)

The Microsoft C++ compiler complains about it: “nonstandard extension used : zero-sized array in struct/union”. But of course, it’s nice, especially the ability to use the actual size of the struct in the allocation.

By: Tom Gaulton, Tue, 17 May 2011 13:12:58 +0000 (/2011/05/17/keep-data-close-part-1/#comment-4272)

Cool trick – perfect for the string class you describe. I love the fact that allocating memory for it cannot be hidden away from the user – something that C++ seems designed to do. I suppose one drawback with this method is that it’s more difficult to create them on the stack if you need to do some temporary manipulations.
