Re: Size limit on NSString cStringUsingEncoding?
- From: Michael Ash <mike@xxxxxxxxxxx>
- Date: Thu, 16 Feb 2006 06:26:53 -0600
Steve Edwards <gfx@xxxxxxxxxxx> wrote:
OK, this seems the best way to go.
However, just 'cause it's bugging me:
(original [str length] = 49,804,307)
const char *cStr = [str cStringUsingEncoding:NSUTF8StringEncoding];
This now works with the UTF-8 encoding.
//-- Turn it back into an NSString to see its length:
NSString *newStr = [NSString stringWithUTF8String:cStr];
unsigned long newLen = [newStr length];
newLen now equals 46,804,207!
What's a missing hundred bytes between friends!?
Unicode does all kinds of weird things that could cause a difference in
length. For example, if you look at an accented character such as e'
(pretend that's all together), you can represent it as either a single
character, or a plain 'e' followed by a character which means "put a '
over the preceding character". The second version will take up two
characters, even though it appears as only one glyph.
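
To make the two representations concrete, here is a small sketch in plain C (which also compiles as Objective-C). The helper utf8_code_points and the two array names are made up for illustration and are not part of any Apple API. It assumes valid UTF-8 input and counts code points; NSString's length counts UTF-16 units, which gives the same numbers for these particular characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a valid,
   NUL-terminated UTF-8 string by skipping continuation bytes
   (bytes of the form 10xxxxxx). */
static size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* U+00E9 LATIN SMALL LETTER E WITH ACUTE as one precomposed character. */
static const char precomposed[] = "\xC3\xA9";

/* U+0065 'e' followed by U+0301 COMBINING ACUTE ACCENT. */
static const char decomposed[] = "e\xCC\x81";
```

With these definitions, utf8_code_points(precomposed) is 1 and utf8_code_points(decomposed) is 2, while strlen reports 2 and 3 bytes respectively, even though both would normally be drawn as the same glyph.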
Another possibility is if you have a null character near the end of the
file. This is a legal character for UTF-8, but it will indicate the end of
the string when you treat it as a C string.
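
A minimal sketch of that truncation, again in plain C with made-up names (data, full_length, c_string_length). It only illustrates how a NUL-terminated reader behaves; stringWithUTF8String: takes a NUL-terminated C string, so it would stop at the same place.

```c
#include <stddef.h>
#include <string.h>

/* Seven bytes of valid UTF-8 with a NUL (U+0000) in the middle.
   The array also holds the usual terminating NUL, so sizeof is 8. */
static const char data[] = "abc\0def";

/* Length as seen by code that carries an explicit byte count. */
static size_t full_length(void)
{
    return sizeof data - 1;
}

/* Length as seen by anything that treats the buffer as a C string:
   counting stops at the first NUL byte. */
static size_t c_string_length(void)
{
    return strlen(data);
}
```

Here full_length() gives 7 but c_string_length() gives 3, so everything after the embedded NUL is silently dropped once the data is handled as a C string.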
--
Michael Ash
Rogue Amoeba Software