Loading a big array on the heap

Guest · May 14, 2004

I am looking to write a very fast algorithm to merge ranges.
e.g. if it was input the following ranges (1,3), (4,6), (9, 12)
it should return exactly the same thing.
However if it's passed (1, 5), (3, 8), (10, 12) then it should return (1, 8), (10, 12).

For this I'm looking to load a big array onto the heap, use memset to
set the ranges, then read it back.
Technically, I would only need 1 bit for each valid number in the range (the maximum
value any number in any range can take will be known). But in practice, I'm going to need
at least a byte each, as that would remove the need for code to set individual bits.

So, would it be quicker to have bytes, or longs?

Simon Trew · May 14, 2004

songie D said:
I am looking to write a very fast algorithm to merge ranges.
e.g. if it was input the following ranges (1,3), (4,6), (9, 12)
it should return exactly the same thing.
However if it's passed (1, 5), (3, 8), (10, 12) then it should return (1, 8), (10, 12).

For this I'm looking to load a big array onto the heap, use memset to
set the ranges, then read it back.
Technically, I would only need 1 bit for each valid number in the range (the maximum
value any number in any range can take will be known). But in practice, I'm going to need
at least a byte each, as that would remove the need for code to set individual bits.

So, would it be quicker to have bytes, or longs?

Who knows? On the one hand with bytes you have fewer memory accesses and
more cache hits, but on the other hand if there are nonaligned accesses of
the bytes then it might be slower.

Why not just typedef the numeric type and use that, then profile with
different types? For that matter, templatize the algorithm on the numeric
type.
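
A minimal sketch of the templatized version of the marking approach, so that different cell types can be profiled against each other. The names `Range`, `merge_by_marking` and `max_value` are made up for illustration, ranges are assumed to be inclusive integer pairs within 0..max_value, and `std::fill` stands in for memset because memset only sets bytes and would not work for wider cell types.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation: an inclusive integer range [first, second].
using Range = std::pair<int, int>;

// Cell is the per-number storage type to profile (unsigned char, unsigned long, ...).
// Assumes 0 <= first <= second <= max_value for every input range.
template <typename Cell>
std::vector<Range> merge_by_marking(const std::vector<Range>& in, int max_value) {
    // One cell per value 0..max_value, plus a zero sentinel that closes a final run.
    std::vector<Cell> marks(static_cast<std::size_t>(max_value) + 2, Cell(0));

    // Mark every number covered by any range.
    for (const Range& r : in)
        std::fill(marks.begin() + r.first, marks.begin() + r.second + 1, Cell(1));

    // Read the array back, turning each run of marked cells into one range.
    std::vector<Range> out;
    int start = -1;  // -1 means "not currently inside a run"
    for (int i = 0; i <= max_value + 1; ++i) {
        if (marks[static_cast<std::size_t>(i)] != Cell(0)) {
            if (start < 0) start = i;
        } else if (start >= 0) {
            out.emplace_back(start, i - 1);
            start = -1;
        }
    }
    return out;
}
```

Calling `merge_by_marking<unsigned char>(ranges, 12)` and `merge_by_marking<unsigned long>(ranges, 12)` on the same input gives two variants to time. One thing to watch in this sketch: ranges that merely touch, such as (1,3) and (4,6), read back as the single run (1,6), because the array cannot tell where one range stopped and the next began.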

S.

Alexander Grigoriev · May 15, 2004

This algorithm will be very suboptimal, with O(n) complexity, where 'n' is
the total length of the ranges or the max span.

A better algorithm over a sorted sequence has O(m*log(m)) complexity (m*log(m)
is the sort cost), where m is the number of ranges, and doesn't depend on the
range length.

songie D · May 15, 2004

can you post a more detailed example?

Alexander Grigoriev said:
This algorithm will be very suboptimal, with O(n) complexity, where 'n' is
the total length of the ranges or the max span.

A better algorithm over a sorted sequence has O(m*log(m)) complexity (m*log(m)
is the sort cost), where m is the number of ranges, and doesn't depend on the
range length.


Raymond Chen · May 16, 2004

Well, think about it. How did you convert (1, 5), (3, 8), (10, 12) into (1,
8), (10, 12)? Did you take a piece of paper and make 12 boxes, then fill in
1 through 5, then 3 through 8, then 10 through 12, and then convert the boxes
back into ranges? Probably not. You probably used some brain shortcuts.
Convert those shortcuts into an algorithm.

Jerry Coffin · May 16, 2004

can you post a more detailed example?

How about a more detailed description instead?

Sort the ranges by their starting value.
Step through the ranges, and if the end of one range is at or past
the beginning of the next (or right before it, if touching ranges
should be merged too), merge the two. Repeat until end of ranges.

If merging is slow, you may want to delay doing a merge until you
find a range that is not contiguous (e.g. if you find the first four
ranges are contiguous, the new range is the beginning of the first
and the end of the fourth).
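
The description above could be sketched roughly as follows. The names `Range` and `merge_ranges` are made up for illustration, and ranges are assumed to be inclusive integer pairs. The sketch merges only ranges that actually overlap, which matches the original question's example where (1,3), (4,6) comes back unchanged; changing the test to `r.first <= out.back().second + 1` would also merge ranges that merely touch. It applies the delayed-merge idea too: the run being built only has its end extended, and a new output range is started once a non-overlapping range turns up.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical representation: an inclusive integer range [first, second].
using Range = std::pair<int, int>;

// Takes the ranges by value so they can be sorted without touching the caller's copy.
std::vector<Range> merge_ranges(std::vector<Range> ranges) {
    // Pairs sort by start, then by end: O(m log m) for m ranges.
    std::sort(ranges.begin(), ranges.end());

    std::vector<Range> out;
    for (const Range& r : ranges) {
        if (!out.empty() && r.first <= out.back().second) {
            // Overlaps the run being built: extend its end if this range reaches further.
            out.back().second = std::max(out.back().second, r.second);
        } else {
            // Gap found: the previous run is finished, start a new one.
            out.push_back(r);
        }
    }
    return out;
}
```

The cost is the sort plus one linear pass over the ranges, so it does not depend on how large the numbers inside the ranges are.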
