distinct selection(collection)

juli

I have string variables in a collection and I want to create a new
collection that contains only the distinct strings (no duplicates).
For example, I have a sentence object that is the basis for a
collection. Each sentence contains words, and I want to create a new
collection of sentences in which no two sentences share the same
second word.
How do I do this distinct selection from a collection of objects?
Thanks!
 
You don't say what collection class you're using so I don't know all the
possibilities or limitations, but in your situation, I would probably create
a custom collection class, inherited from the ArrayList class, and add
methods to return arrays of string objects not having word x at position y.
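
A minimal sketch of that suggestion, under stated assumptions: the class name SentenceList and the method name WithoutWordAt are hypothetical, every element is assumed to be a string, and words are assumed to be separated by single spaces.

```csharp
using System;
using System.Collections;

// Hypothetical collection of sentence strings, derived from ArrayList.
public class SentenceList : ArrayList
{
    // Returns the sentences whose word at the given zero-based position
    // differs from the given word. Sentences too short to have a word at
    // that position are included as well.
    public string[] WithoutWordAt(string word, int position)
    {
        ArrayList result = new ArrayList();
        foreach (string sentence in this)   // throws if an element is not a string
        {
            string[] words = sentence.Split(' ');
            if (position >= words.Length || words[position] != word)
            {
                result.Add(sentence);
            }
        }
        return (string[])result.ToArray(typeof(string));
    }
}
```

Calling WithoutWordAt("quick", 1) on a list holding "the quick fox", "a quick dog" and "the slow cat" would return only "the slow cat".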

DalePres
MCAD, MCSE, MCDBA
 
Hello,
I am declaring a collection class that inherits from CollectionBase.
Each object in the collection has a string variable.
How exactly do I create a new collection that picks those strings
distinctly? (Which method?)
Thanks a lot!
 
Juli,

That should not be difficult for you, in my opinion.

Create a new collection and loop through the old one. While looping, add
what you want to the new record, and write the record to the new collection
as soon as the key differs from the previous one. Skip that write the first
time through, and write one more row at the end. This only works if the old
collection is ordered by the key, so that equal keys sit next to each other.

This is the oldest method of data processing.
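
The loop described above could look roughly like this. It is a simplified variant, not necessarily what Cor had in mind: it writes each key the first time it is seen, so no extra write is needed at the end. The names ControlBreak and DistinctKeys are hypothetical, and the sketch sorts a copy of the input first because comparing with the previous key only works on ordered data.

```csharp
using System;
using System.Collections;

public class ControlBreak
{
    // Returns every key exactly once, in ordinal sort order.
    public static string[] DistinctKeys(ICollection keys)
    {
        // Sort a copy so that equal keys become adjacent.
        ArrayList sorted = new ArrayList(keys);
        sorted.Sort(StringComparer.Ordinal);

        ArrayList result = new ArrayList();
        string last = null;
        bool first = true;
        foreach (string key in sorted)
        {
            // Write the key when it differs from the previous one.
            if (first || key != last)
            {
                result.Add(key);
                last = key;
                first = false;
            }
        }
        return (string[])result.ToArray(typeof(string));
    }
}
```

The sort costs O(n log n); the loop itself is a single pass.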

I hope this helps,

Cor
 
Hello,
I will try to define my problem better: I have a collection that is built
from a (long) text file, and it will take too much time to go through
the whole collection.
One of the members of the objects in the collection is a string variable.
I want a string array that contains each of those strings only once
(distinctly). How can I do it?
Thanks:)
 
Juli,

Did you try it? Even on a 0.5 GHz computer I assume (I did not test it now,
but from experience) that it will probably take less than about 1
millisecond to go through a collection of fewer than 10,000 rows.
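
One way to check this on your own machine is to time the loop. The sketch below is only an illustration: the names LoopTiming and MeasureLoopMilliseconds are hypothetical, the rows are generated dummy strings rather than your data, and it assumes the Stopwatch class, which is available from .NET Framework 2.0.

```csharp
using System;
using System.Collections;
using System.Diagnostics;

public class LoopTiming
{
    // Builds a list with the given number of rows and returns the time,
    // in milliseconds, needed to visit every element once.
    public static double MeasureLoopMilliseconds(int rows)
    {
        ArrayList list = new ArrayList(rows);
        for (int i = 0; i < rows; i++)
        {
            list.Add("row" + i);
        }

        int totalLength = 0;
        Stopwatch watch = Stopwatch.StartNew();
        foreach (string s in list)
        {
            totalLength += s.Length;
        }
        watch.Stop();

        // Use the result so the loop is not dead code.
        if (totalLength < 0) throw new InvalidOperationException();
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Calling MeasureLoopMilliseconds(10000) a few times and ignoring the first run gives a rough figure to compare with the estimate.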

Cor
 
I reckon you want something like the Java Set interface. Below is a
simple implementation of a set I developed some time ago. This class
acts like an ArrayList, but it does not allow duplicate entries.

using System;
using System.Collections;

namespace Objectware.Collections {
    // A list-backed set: acts like an ArrayList but ignores duplicates.
    public class Set : ICollection {
        private ArrayList innerList = new ArrayList();

        public Set() {}

        public void CopyTo(Array array) {
            CopyTo(array, 0);
        }

        public void CopyTo(Array array, int index) {
            innerList.CopyTo(array, index);
        }

        public int Count {
            get { return innerList.Count; }
        }

        public object SyncRoot {
            get { return innerList.SyncRoot; }
        }

        public bool IsSynchronized {
            get { return innerList.IsSynchronized; }
        }

        // Adds the value and returns its index, or -1 if already present.
        public int Add(object value) {
            if (!Contains(value)) {
                return innerList.Add(value);
            }
            return -1;
        }

        public void AddAll(ICollection collection) {
            foreach (object item in collection) {
                Add(item);
            }
        }

        public void Remove(object value) {
            innerList.Remove(value);
        }

        public void RemoveAll(ICollection collection) {
            foreach (object item in collection) {
                Remove(item);
            }
        }

        // Keeps only the items that also occur in the given collection.
        // Returns true if the set changed.
        public bool RetainAll(ICollection collection) {
            ArrayList other = new ArrayList(collection);
            ArrayList newList = new ArrayList();
            foreach (object item in innerList) {
                if (other.Contains(item)) newList.Add(item);
            }
            int oldCount = innerList.Count;
            innerList = newList;
            return oldCount != innerList.Count;
        }

        public bool Contains(object value) {
            return innerList.Contains(value);
        }

        public bool ContainsAll(ICollection collection) {
            foreach (object item in collection) {
                if (!Contains(item)) return false;
            }
            return true;
        }

        public void Clear() {
            innerList.Clear();
        }

        public IEnumerator GetEnumerator() {
            return innerList.GetEnumerator();
        }
    }
}

Anders Norås
http://dotnetjunkies.com/weblog/anoras/
 
Anders said:
I reckon you want something like the Java Set interface. Below is a
simple implementation of a set I developed some time ago. This class
acts like an ArrayList, but it does not allow duplicate entries.

Note that this is just as expensive as using an ArrayList directly.
Filling the list is O(n^2), both in the worst case and for random input,
because every Add scans the whole list. Lookups are O(n), again both in
the worst case and for random input.

You could use something (roughly) as simple as:

using System;
using System.Collections;

public interface ISet : ICollection {
    void Add(object item);
    void Remove(object item);
    bool Contains(object item);
}

// Stores the items as Hashtable keys; the values are unused.
public class HashSet : Hashtable, ISet {
    // Hashtable.Add throws ArgumentException if the key already exists.
    void ISet.Add(object item) { base.Add(item, null); }
    void ISet.Remove(object item) { base.Remove(item); }
    bool ISet.Contains(object item) { return base.Contains(item); }
    // Enumerate the keys rather than the DictionaryEntry pairs.
    IEnumerator IEnumerable.GetEnumerator()
    { return base.Keys.GetEnumerator(); }
    int ICollection.Count { get { return base.Count; } }
}

Which would give you expected O(n) filling and expected O(1) lookups.

ISet strings = new HashSet();
foreach ( string s in WhatEverReadsYourStrings ) {
    try {
        strings.Add(s);
    } catch ( ArgumentException ) {
        // Duplicate inserts are ignored
    }
}
// strings now contains no duplicates
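
If you can use a later framework version, the same idea is available without writing your own class or catching exceptions. The sketch below uses the framework's generic System.Collections.Generic.HashSet<T> (available from .NET Framework 3.5), not the HashSet class defined in this post. The names DistinctStrings and ToDistinctArray are hypothetical. It returns the string array asked for, keeping the order in which each string first appears.

```csharp
using System;
using System.Collections.Generic;

public class DistinctStrings
{
    // Returns each string once, in order of first appearance.
    public static string[] ToDistinctArray(IEnumerable<string> source)
    {
        HashSet<string> seen = new HashSet<string>();
        List<string> result = new List<string>();
        foreach (string s in source)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (seen.Add(s))
            {
                result.Add(s);
            }
        }
        return result.ToArray();
    }
}
```

The comparison is the default one for strings, so it is case-sensitive; a different comparer can be passed to the HashSet<string> constructor if needed.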
 