size_t / long


Guest

If a size_t is cast to a long, and size_t is the length of a unicode string, does the resulting long need to be divided by sizeof(_TCHAR) in order to get the actual length in _TCHARs?
 
songie said:
If a size_t is cast to a long, and size_t is the length of a unicode
string, does the resulting long need to be divided by sizeof(_TCHAR)
in order to get the actual length in _TCHARs?

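If the size_t contains the size of a TCHAR string in bytes, you should always
divide by sizeof(TCHAR) to get the length of the string in characters. If it
is already a character count (for example, the return value of _tcslen), no
division is needed; the cast to long does not change the units.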
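
To make the distinction concrete, here is a minimal sketch. It assumes a Unicode build, where _TCHAR is wchar_t and _tcslen maps to wcslen, so it uses the standard wide-character functions directly. The helper names (length_in_chars, size_in_bytes, chars_from_bytes) are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Character count of a wide string. std::wcslen (what _tcslen maps to in a
// Unicode build) already counts characters, so the cast needs no division.
long length_in_chars(const wchar_t* text) {
    assert(text != nullptr);
    return static_cast<long>(std::wcslen(text));
}

// Size in bytes of the same string, including its terminating null character.
std::size_t size_in_bytes(const wchar_t* text) {
    assert(text != nullptr);
    return (std::wcslen(text) + 1) * sizeof(wchar_t);
}

// Converts a byte size (terminator included) back to a character count.
// This is the only case where dividing by the character size is needed.
long chars_from_bytes(std::size_t bytes) {
    assert(bytes >= sizeof(wchar_t) && bytes % sizeof(wchar_t) == 0);
    return static_cast<long>(bytes / sizeof(wchar_t)) - 1;
}
```

For L"quick", length_in_chars returns 5 with no division, while size_in_bytes returns 6 * sizeof(wchar_t) and has to go through chars_from_bytes to get back to 5.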

-cd
 
songie said:
ok Carl.
maybe you can help me:
I've got a string (of _TCHARs), for instance
const _TCHAR* mystring = _T("quick brown fox jumps over lazy dog");

I then use this:
wordlen = (long)_tcsspn(mystring, _T("abcdefgh...wxyz"));
(where the second argument is the whole alphabet but without space)

which of course returns 6. This is what I would expect, as 6 is the
position of the first character that ISN'T an alphabetic character,
i.e. a space.

It returns 5...

I now want to extract the word "quick" and copy it into a string of
its own.
For this I'm allocating a dynamic array of _TCHARs on the heap (I don't
know how long the word might be).
using:
_TCHAR* word = new _TCHAR[wordlen];
_tcsncpy(word, mystring, wordlen);
...<do some operations on the 'word' variable>
delete[] word;

You NEED to allocate space for the terminating null character, so do new
TCHAR[wordlen+1], and then set word[wordlen] to 0 yourself, because _tcsncpy
does not append a terminator when the source is longer than the count. As is,
you're getting an un-terminated word in your array and seeing garbage that's
past the end of your allocation. Note that in general, new will allocate more
memory than you ask for due to alignment/granularity requirements.
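
The allocation advice above can be sketched as follows. This is a minimal sketch that assumes a Unicode build, where _TCHAR is wchar_t and _tcsspn and _tcsncpy map to wcsspn and wcsncpy. The helper name copy_leading_word is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Copies the leading run of characters of `text` that all appear in `accept`
// into a freshly allocated, null-terminated buffer.
// The caller owns the result and must release it with delete[].
wchar_t* copy_leading_word(const wchar_t* text, const wchar_t* accept) {
    assert(text != nullptr && accept != nullptr);

    // Number of leading characters that belong to the accepted set.
    std::size_t wordlen = std::wcsspn(text, accept);

    // One extra element for the terminating null character.
    wchar_t* word = new wchar_t[wordlen + 1];

    // wcsncpy copies exactly wordlen characters here and adds no terminator,
    // because the source is at least wordlen characters long.
    std::wcsncpy(word, text, wordlen);
    word[wordlen] = L'\0';
    return word;
}
```

The buffer holds just the word plus its terminator, and the pointer returned by new[] is the one later passed to delete[].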

btw, all of this is much easier and less error-prone if you use std::string
instead of dealing with low-level details yourself:

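// needs <string> and <tchar.h>
typedef std::basic_string<TCHAR> tstring;

// _T() keeps the literals valid in both ANSI and Unicode builds
tstring mystring(_T("quick brown fox jumps over lazy dog"));
tstring alphabet(_T("abcdefgh...wxyz"));
// find_first_not_of gives the index of the first character not in alphabet
tstring word = mystring.substr(0,mystring.find_first_not_of(alphabet,0));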

-cd
 
It returns 5...

yes, sorry 5. I meant 5.
You NEED to allocate space for the terminating NULL, so do new
TCHAR[wordlen+1]. As is, you're getting an un-terminated word in your array
and seeing garbage that's past the end of your allocation. Note that in
general, new will allocate more memory than you ask for due to
alignment/granularity requirements.

OK so supposing I do, then I'll get even more memory. But I see the point,
it *might* fill it. I'll put it another way...
no I won't, I'll just repeat the question. How can I get the variable to
contain JUST
the string, with terminating null if that's what it entails, and then still
successfully delete[] it?

btw, all of this is much easier and less error-prone if you use std::string
instead of dealing with low-level details yourself:

Nah, that'd defeat the point of writing this part of the program in
unmanaged C++. I might as well write it in C#, which the rest of the
program is written in.
This is a routine that's going to be called probably every time
the user types a key, possibly many times per keystroke.
typedef std::basic_string<TCHAR> tstring;

tstring mystring(_T("quick brown fox jumps over lazy dog"));
tstring alphabet(_T("abcdefgh...wxyz"));
tstring word = mystring.substr(0,mystring.find_first_not_of(alphabet,0));


mmm. wonder where find_first_not_of() comes from, the tooth fairy?
 
songie said:
It returns 5...

yes, sorry 5. I meant 5.
You NEED to allocate space for the terminating NULL, so do new
TCHAR[wordlen+1]. As is, you're getting an un-terminated word in
your array and seeing garbage that's past the end of your
allocation. Note that in general, new will allocate more memory
than you ask for due to alignment/granularity requirements.

OK so supposing I do, then I'll get even more memory. But I see the
point, it *might* fill it. I'll put it another way...
no I won't, I'll just repeat the question. How can I get the variable
to contain JUST
the string, with terminating null if that's what it entails, and then
still successfully delete[] it?

You can: allocate wordlen+1. There's no way you can force new[] to allocate
exactly as much space as you request - it's free to allocate more. That
said, any attempt by you to access beyond the size you requested is
undefined behavior. It's free to allocate exactly the amount you request
one time, and 10X the amount you request the next time - you simply cannot
assume anything beyond:

1. the allocation was at least as large as you requested.
2. you can safely access all of the elements that you requested (i.e. for
new T[n], you can access indexes 0..n-1).
3. assuming you haven't violated #2, that you can pass the same pointer
returned by new[] to delete[].
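
The three guarantees can be illustrated with a minimal sketch. The function name sum_first_n is hypothetical; the point is only that the code touches indexes 0..n-1 and nothing else, then hands the same pointer back to delete[].

```cpp
#include <cassert>
#include <cstddef>

// Allocates n elements, uses only indexes 0..n-1, and returns their sum.
long sum_first_n(std::size_t n) {
    assert(n > 0);

    // Guarantee 1: room for at least n elements (possibly more, but any
    // extra space must never be touched).
    long* values = new long[n];

    // Guarantee 2: indexes 0..n-1 are safe to read and write.
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<long>(i);
    }

    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }

    // Guarantee 3: the pointer returned by new[] goes back to delete[].
    delete[] values;
    return total;
}
```

Reading or writing values[n] would be undefined behavior even if the allocator happened to hand back a larger block.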
Nah, that'd defeat the point of writing this part of the program in
unmanaged C++. I might as well write it in C#, which the rest of the
program is written in.

Then there's likely no valid reason to write it in unmanaged C++. You're
apparently operating under the fallacious assumption that managed code is
slow, or that this function is going to be a bottleneck in your program
(have you profiled it to find out?).
This is a routine that's going to be called probably every time
the user types a key, possibly many times per keystroke.

So? Users typing keys are monumentally slow - you can run tens (maybe
hundreds) of millions of CPU instructions between keystrokes on a modern CPU.
Besides, it's likely that the std::string solution, if properly written,
will be the same speed as your fragile hand-crafted solution.
mmm. wonder where find_first_not_of() comes from, the tooth fairy?

find_first_not_of is a member function of std::basic_string<CharT> - note
the . between mystring and find_first_not_of in the above sample. It comes
not from the tooth fairy, nor from Microsoft, but from the ISO C++ standard.
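
As a minimal sketch of the member function in use, the example below relies only on the standard header <string>. It uses std::wstring in place of the tstring typedef, assuming a Unicode build, and the helper name leading_word is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Returns the leading run of `text` made up only of characters in `alphabet`.
std::wstring leading_word(const std::wstring& text, const std::wstring& alphabet) {
    // Index of the first character not in `alphabet`, or npos if every
    // character of `text` is in it.
    std::wstring::size_type end = text.find_first_not_of(alphabet);

    // substr clamps the count to the string length, so passing npos simply
    // returns the whole string.
    std::wstring word = text.substr(0, end);
    assert(word.size() <= text.size());
    return word;
}
```

No manual allocation, terminator handling, or delete[] is involved; the string object manages its own storage.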

-cd
 
You can: allocate wordlen+1. There's no way you can force new[] to
allocate
exactly as much space as you request - it's free to allocate more. That
said, any attempt by you to access beyond the size you requested is
undefined behavior. It's free to allocate exactly the amount you request
one time, and 10X the amount you request the next time - you simply cannot
assume anything beyond:

ok. I get the picture. I'll try it.
1. the allocation was at least as large as you requested.
2. you can safely access all of the elements that you requested (i.e. for
new T[n], you can access indexes 0..n-1).
3. assuming you haven't violated #2, that you can pass the same pointer
returned by new[] to delete[].

Presumably you can also assume that the memory will be contiguous?
(Thus, for a pointer p, &p[n] == p + n)
I think I was just under the false impression that even though I'd
discovered that I wasn't allocating enough space, the fact that I happened
to have got more meant that this couldn't be the problem. I'm getting the
image that it would be undefined behaviour anyway.
Then there's likely no valid reason to write it in unmanaged C++.

If you mean write the whole lot in unmanaged C++, no - I don't want
to do that. The reason simply being that it would take me far too long.
The algorithms are what should be taking my programming time, not
spending hours writing code to display a user interface.
You're apparently operating under the fallacious assumption that managed
code is slow, or that this function is going to be a bottleneck in your
program (have you profiled it to find out?).

No, I'm not saying managed code is slow. I'm just saying it's a known fact
that it's slightly slowER than unmanaged C++ code.
If this program was being developed commercially, then some bod with
a degree in design architecture and who never actually has to write any code
would make a decision about which bits are going to be written in which
language. I figured that since I'm more of a "just do it" programmer
(both in profession and in hobby) I should take this step myself, rather
than simply writing it all in the same language.
So? Users typing keys are monumentally slow - you can run tens (maybe
hundreds) of millions of CPU instructions between keystrokes on a modern CPU.

Not at the speed I type (50 - 70 wpm). Which yes, at roughly 5 keystrokes
per second, is slow in electronics terms, you're right. But I don't think
that I want to be cutting any slack nevertheless.

Besides, it's likely that the std::string solution, if properly written,
will be the same speed as your fragile hand-crafted solution.

I doubt it'll be fragile, since it'll be tested with all possible inputs
(and checked for memory leaks if I'm feeling pedantic). I wouldn't have
thought that writing with a class library that I have no knowledge of
and that is generic enough to handle many different scenarios, would be as
fast as writing with native functions that I do have knowledge of and that
are specifically only programmed to do the task I have in mind.
find_first_not_of is a member function of std::basic_string<CharT> - note
the . between mystring and find_first_not_of in the above sample. It comes
not from the tooth fairy, nor from Microsoft, but from the ISO C++
standard.

oh ok, I stand corrected then. Although the STL is made by Hewlett-Packard,
you realise. But under the hood it's still probably a similar algorithm to
_tcsspn, and is just another header file of mainly unnecessary
bumph compiled into the application.
 
songie said:
oh ok, I stand corrected then. Although the STL is made by
Hewlett-Packard, you realise. But under the hood it's still probably
a similar algorithm to _tcsspn, and is just another header file of
mainly unnecessary bumph compiled into the application.

The STL was a proposal by Alex Stepanov (et al) of Hewlett Packard to the
C++ standards committee. The C++ standard incorporates the components
originally included in STL. Note that std::string was not part of the STL
proposal, but rather was created by the C++ committee based on other
proposals. In fact, the string class was in the proposed standard before
STL was proposed, with a number of changes being made to the string class
after STL was introduced to make the string more "STL-like".

And yes, find_first_not_of, under the covers, is no doubt a very similar
algorithm to _tcsspn, but it's standard and portable.
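
A minimal sketch of that similarity, assuming a Unicode build where the span function used earlier in the thread (_tcsspn) maps to wcsspn. Both wrappers below are hypothetical names and return the length of the leading run of accepted characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// C runtime version: length of the leading run of characters from `accept`.
std::size_t span_with_wcsspn(const wchar_t* text, const wchar_t* accept) {
    assert(text != nullptr && accept != nullptr);
    return std::wcsspn(text, accept);
}

// Standard library version built on find_first_not_of.
std::size_t span_with_string(const std::wstring& text, const std::wstring& accept) {
    std::wstring::size_type pos = text.find_first_not_of(accept);
    // npos means every character was accepted, which wcsspn reports as the
    // full string length.
    return pos == std::wstring::npos ? text.size() : pos;
}
```

The only difference in the results is the all-characters-match case, where find_first_not_of returns npos instead of the string length.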

-cd
 