Replace any chars not in allowed list

  • Thread starter: Grok

Grok

I need an elegant way to remove any characters in a string if they are
not in an allowed char list. The part that strips the non-allowed
characters from files will run as a service, so it has no forms.

The list also needs to be editable by the end-user so I'll be
providing a form on which they can edit the allowed character list.

The end-user is non-technical so asking them to type a regular
expression is out.

Is there anything in the framework 2.0 to clean a string of unwanted
characters, or better, clean any characters not in an allow list?
Seems like there would be to clean SQL strings to prevent injection
attacks.
 
string.Replace works fine for me:

string Description = row["Desc"].ToString();
Description = Description.Replace("\n", " | "); // replace line feed
Description = Description.Replace("\r", " | "); // replace carriage return
char c = Convert.ToChar(19); // control character 0x13
Description = Description.Replace(c, ':');


schneider
 
Thanks but opposite of my needs. I am not looking for a way to
replace a known character, but to remove any characters that are NOT
in a given character list. Something like:

Function CleanAString(ByVal theDirtyString As String) As String
    Dim allowedChars As String
    allowedChars = GetAllowedCharsFromRegKey()
    Return String.AwesomeStringCleaner(theDirtyString, allowedChars)
End Function

with "abcdef" on allowed list,
CleanAString("aa+bb,cc112233/dd\ee:ff")
returns "aabbccddeeff"


 
Dim tempString As String = theDirtyString

For Each c As Char In theDirtyString
    If Not allowedChars.Contains(c.ToString) Then
        tempString = tempString.Replace(c.ToString, String.Empty)
    End If
Next

Return tempString

Something like that...?



 
Clever and perfect!

 
Grok said:
Is there anything in the framework 2.0 to clean a string of unwanted
characters, or better, clean any characters not in an allow list?
Seems like there would be to clean SQL strings to prevent injection
attacks.

If SQL is your objective then consider using parameterised queries instead:
they are automatically injection-proof.

Andrew
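
As a rough illustration of the parameterised-query suggestion, here is a minimal sketch using System.Data.SqlClient. The table name Documents, the column Title, the parameter @title, and the function name are all made up for the example; nothing in this thread specifies a schema. The point is only that the value travels as a parameter instead of being concatenated into the SQL text.

```vb
Imports System.Data
Imports System.Data.SqlClient

Module ParameterisedQueryExample

    ' Hypothetical helper: counts rows whose Title matches the supplied value.
    ' "Documents" and "Title" are assumed names, not taken from the thread.
    Function CountDocumentsByTitle(ByVal connectionString As String, ByVal title As String) As Integer
        Using conn As New SqlConnection(connectionString)
            Using cmd As New SqlCommand("SELECT COUNT(*) FROM Documents WHERE Title = @title", conn)
                ' The value is sent separately from the SQL text, so quotes,
                ' ampersands, and similar characters are treated as plain data.
                cmd.Parameters.Add("@title", SqlDbType.NVarChar, 200).Value = title
                conn.Open()
                Return CInt(cmd.ExecuteScalar())
            End Using
        End Using
    End Function

End Module
```

With this shape there is no need to strip characters from the input for the sake of the query itself.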
 
Grok said:
Clever and perfect!

A little more efficient, maybe?

Dim SB As New System.Text.StringBuilder

For Each c As Char In theDirtyString
    If allowedChars.IndexOf(c) > 0 Then
        SB.Append(c)
    End If
Next c

Return SB.ToString
 
Andrew said:
If SQL is your objective then consider using parameterised queries instead:
they are automatically injection-proof.

Andrew

No, that was just an example. The actual issue is that the customer has
an IBM Content Manager server and is using a custom-written importer
tool (Java). The tool does not properly escape certain characters when
passing values as indices to new documents. Somewhere between the Java
importer and the ICM libraries, one of them rejects ampersands,
apostrophes, and quite a few other characters.

So rather than fixing the utility, they asked me to clean the input.
The CleanString() function will go into my existing project, while a
new Windows Forms application will be created to let them edit the
allowed characters. All boring stuff, except for searching the FCL for
a canned string cleaner before writing one from scratch, as the two
gentlemen here did.
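
For the split described above (a form that edits the list, a service that reads it), one possible sketch of the storage side follows. GetAllowedCharsFromRegKey is the name used earlier in the thread; the key path, the value name, and SaveAllowedChars are assumptions invented for illustration, since the thread never says where the list is kept.

```vb
Imports Microsoft.Win32

Module AllowedCharsStore

    ' Hypothetical location; the thread does not say which key is used.
    Private Const KeyPath As String = "SOFTWARE\ExampleCompany\StringCleaner"
    Private Const ValueName As String = "AllowedChars"

    ' Returns the stored allow list, or an empty string if none has been saved.
    Function GetAllowedCharsFromRegKey() As String
        Dim key As RegistryKey = Registry.LocalMachine.OpenSubKey(KeyPath)
        If key Is Nothing Then Return String.Empty
        Try
            Return CStr(key.GetValue(ValueName, String.Empty))
        Finally
            key.Close()
        End Try
    End Function

    ' Called by the editor form to save the list; needs write access to HKLM.
    Sub SaveAllowedChars(ByVal allowedChars As String)
        Dim key As RegistryKey = Registry.LocalMachine.CreateSubKey(KeyPath)
        Try
            key.SetValue(ValueName, allowedChars, RegistryValueKind.String)
        Finally
            key.Close()
        End Try
    End Sub

End Module
```

Because the end-user only ever types the literal characters they want to allow, no regular-expression syntax is exposed to them.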
 
Memory-wise, yes: StringBuilder will be far easier on the working set (and
CPU) than repeatedly calling Replace on a System.String (and optimisation
was left up to the OP as far as I was concerned). However, your solution is
technically incorrect. IndexOf is 0-based in .NET and returns -1 when the
character is not found, so the test should be >= 0 (or <> -1) rather than
> 0. As written, the first character of allowedChars is never matched.

Happy to help ;-P
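
Putting the pieces from this thread together, a minimal sketch of the whole function might look like the following. It takes the allow list as a second parameter (an assumption made here so the example is self-contained; the original signature fetched it internally), and it tests IndexOf against -1, since IndexOf returns -1 for "not found" and 0 for a match at the first position. Returning an empty string for a null or empty input is also a choice made for this sketch, not something the thread settles.

```vb
Imports System
Imports System.Text

Module StringCleaner

    ' Returns theDirtyString with every character not present in allowedChars removed.
    Function CleanAString(ByVal theDirtyString As String, ByVal allowedChars As String) As String
        If String.IsNullOrEmpty(theDirtyString) Then Return String.Empty
        If String.IsNullOrEmpty(allowedChars) Then Return String.Empty

        Dim sb As New StringBuilder(theDirtyString.Length)

        For Each c As Char In theDirtyString
            ' IndexOf returns -1 when c is absent; 0 is a valid match position.
            If allowedChars.IndexOf(c) >= 0 Then
                sb.Append(c)
            End If
        Next

        Return sb.ToString()
    End Function

    Sub Main()
        ' Example from earlier in the thread; prints aabbccddeeff
        Console.WriteLine(CleanAString("aa+bb,cc112233/dd\ee:ff", "abcdef"))
    End Sub

End Module
```

The comparison is ordinal and case-sensitive, so "A" is removed unless it appears in the allow list alongside "a".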
 