Algorithm Needed

  • Thread starter: Chris

Chris

Not in the right group I know, but it's the only one I visit and someone
here probably knows.

I have an array of integers 1 to X.
It is not sorted and already populated.
If there are 5 elements, then 1, 2, 3, 4, 5 would all have to be in the array.
The user could screw up and enter 2, 2, 3, 4, 5, which would be wrong.

Is there a way to check if I have a valid array without looping twice?

We couldn't figure it out, but I figured there was a way, so I asked people
smarter than me.

Thanks
Chris
 
Not really the right group, but here's an old story.

The class was rowdy and the teacher wanted a few minutes of peace. The
teacher asked the class to add up all the numbers from one to 100. Students
bent to their papers and started adding, and the teacher leaned back -- but
young Gauss was already done.

Gauss noted that you take 1+100 = 101, 2+99 = 101, 3+98 = 101, ...
50+51=101. There are 50 pairs in the set that add up to 101, so the answer
was 50*101 = 5050.

Add up your array and compare the sum to (1+X) * (X/2), with an obvious
adjustment if X is odd (in general, the target is X*(X+1)/2).
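
A minimal sketch of that sum test in C# (the method name SumLooksValid is
mine, not from the thread); as the later replies point out, a matching sum is
a necessary condition but not a sufficient one:

// Quick "checksum"-style test: the sum of 1..X is X * (X + 1) / 2.
// A wrong sum proves the data is invalid; a matching sum proves nothing by itself.
public static bool SumLooksValid(int[] input)
{
    int n = input.Length;
    long expected = (long)n * (n + 1) / 2;

    long sum = 0;
    foreach (int val in input)
    {
        sum += val;
    }

    return sum == expected;
}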
 
That algorithm wouldn't work if valid data isn't guaranteed. For example, if
the user enters the array "0, 0, 0, 0, 15", it would return a false
positive.

Although it's probably less efficient for smaller sets, this algorithm only
requires a single pass:

using System.Collections;

public static bool IsDataValid(int[] input)
{
    // Track which values have already been seen.
    Hashtable values = new Hashtable();

    for (int lcv = 0; lcv < input.Length; lcv++)
    {
        int val = input[lcv];

        // Reject duplicates and anything outside the range 1..input.Length.
        if (values.ContainsKey(val) || (val < 1) || (val > input.Length))
        {
            return false;
        }

        values.Add(val, true);
    }

    return true;
}
 
Ed Kaim said:
That algorithm wouldn't work if valid data isn't guaranteed. For example, if
the user enters the array "0, 0, 0, 0, 15", it would return a false
positive.

I think it depends on whether the numbers all have to be in the range
[1...max], in which case just checking that for each number and summing
is the efficient way of doing it.
Although it's probably less efficient for smaller sets, this algorithm only
requires a single pass:

public static bool IsDataValid(int[] input)
{
    Hashtable values = new Hashtable();

    for (int lcv = 0; lcv < input.Length; lcv++)
    {
        int val = input[lcv];

        if (values.ContainsKey(val) || (val < 1) || (val > input.Length))

I don't think that *really* counts as a single pass, as during each
"outer" pass you effectively need to go through all the numbers you've
already added. It's basically O(n log(n)) instead of O(n), assuming
hashtable lookup is O(log(n)).
 
I think it depends on whether the numbers all have to be in the range
[1...max], in which case just checking that for each number and summing
is the efficient way of doing it.

Not exactly. The set "1, 1, 3, 5, 5" would return a false positive. The real
challenge here is to remember which numbers you've already seen.
I don't think that *really* counts as a single pass, as during each
"outer" pass you effectively need to go through all the numbers you've
already added. It's basically O(n log(n)) instead of O(n), assuming
hashtable lookup is O(log(n)).

That's true. Instead of a Hashtable, an array of booleans would be optimal.
Then the lookup would be direct, such as:

...
if ((val < 1) || (val > input.Length) || values[val - 1])
...
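
A minimal sketch of that boolean-array variant, reusing the IsDataValid name
and the values identifier from the snippets above (the rest is illustrative):

public static bool IsDataValid(int[] input)
{
    // values[i] is set once the value i + 1 has been seen.
    bool[] values = new bool[input.Length];

    for (int lcv = 0; lcv < input.Length; lcv++)
    {
        int val = input[lcv];

        // One direct array lookup catches both out-of-range values and duplicates.
        if ((val < 1) || (val > input.Length) || values[val - 1])
        {
            return false;
        }

        values[val - 1] = true;
    }

    return true;
}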


 
Ed Kaim said:
I think it depends on whether the numbers all have to be in the range
[1...max], in which case just checking that for each number and summing
is the efficient way of doing it.

Not exactly. The set "1, 1, 3, 5, 5" would return a false positive. The real
challenge here is to remember which numbers you've already seen.

Whoops - true, true. That's what I get for posting in the middle of the
night :)
That's true. Instead of a Hashtable, an array of booleans would be optimal.

Yup.
 
This works, but it still requires looping, unless you know the size of the
array and create a hardcoded addition. Once you end up with a loop, you
might as well test with an inner loop.

The savings here would be in not having to fire off a second loop when the
numbers did not match. But, as with checksums, a matching sum does not prove
the data is valid, so the full check still has to run when the sum does match.
If many of the errors fail this "checksum", the algorithm you suggest would
reduce cycles burned, so it would certainly be of benefit in some applications.
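
A sketch of that tradeoff in code (the name IsDataValidWithChecksum is
hypothetical, and it reuses the IsDataValid method from earlier in the
thread): the sum acts as a fast reject, and the full check only runs when the
sum matches.

// Hypothetical combination of the two ideas: a cheap sum "checksum" that can
// reject quickly, followed by the full single-pass check when it passes.
public static bool IsDataValidWithChecksum(int[] input)
{
    int n = input.Length;
    long expected = (long)n * (n + 1) / 2;

    long sum = 0;
    foreach (int val in input)
    {
        sum += val;
    }

    // A wrong sum proves the data is invalid; no further work needed.
    if (sum != expected)
    {
        return false;
    }

    // A matching sum proves nothing by itself, so do the real check.
    return IsDataValid(input);
}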

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

 
Yup -- bad suggestion on my part (doh!). That's what *I* get for posting
before coffee.

Sorry.
 