Re: Formula to Remove Special Characters

  • Thread starter Thread starter Ron Rosenfeld
  • Start date Start date
R

Ron Rosenfeld

I am trying to figure out a formula that I can use to remove special
characters from Excel 2007 cells. I've used a user function in the
past but can't this time. The Paste special; doesn't cut it either.

Does anyone know of a way to build a formula to remove special
characters?

Thank you,
Jeff

You'll need to define what your "special characters" are.

When you write that you can't use what you used in the past, is that because
you are not allowed to use UDF's? Or because the previous UDF doesn't work?

If the former, you could use nested SUBSTITUTE's, and also, if required, TRIM
and/or CLEAN.

e.g.:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(A1)),"$",""),"%",""),CHAR(160),"")

and just extend the SUBSTITUTE nesting as required.

If you can still use a UDF, I would suggest this. Note that there are two ways
of expressing the pattern. One lists the characters you wish to keep. This
list is enclosed in brackets, and preceded by a "^" which means "match anything
NOT in the subsequent list".

The other method would be to list all of the special characters you wish to
remove. The caveat here is that if "-" is to be removed, it must appear first
or last. I have also included a \xA0 in the "to be removed list" to show how
certain ASCII characters can be listed. This is ASCII character 160 which is a
non-breaking space, often used in HTML documents.

==========================
Option Explicit
Function RemoveSpChars(s As String) As String
Dim re As Object

'List all allowed characters, but start with the negation "^"
Const sPat As String = "[^A-Z0-9 ]"

'List all characters to be removed, "-" must
' be listed first or last
'Const sPat As String = "[!@#$%^&*()_+=\xA0]"

Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = sPat
RemoveSpChars = re.Replace(s, "")
Set re = Nothing
End Function
================================
--ron
 
I tried a variation of this formula to clean up some text that was copied/pasted from MS Word with formatting intact. However, it doesn't do what I thought it would do. =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(K7)),"$",""),"%",""),CHAR(160)," "),CHAR(100)," "),CHAR(152)," ") results in replacing the letter "d" with a space. I thought that ASCII character 160 was a non-breaking space. Am I doing something wrong?
I am trying to figure out a formula that I can use to remove special
characters from Excel 2007 cells. I have used a user function in the
past but cannot this time. The Paste special; does not cut it either.

Does anyone know of a way to build a formula to remove special
characters?

Thank you,
Jeff


Here is what I have used ion the past that I cannot use now in case it
might help some one.

Here is first the formula used to call the function followed by the
code.

=RegExpReplace(EndUsrPhon1,"^ +| +$|\\n|/n| |[^0-9 ]","",TRUE,FALSE)



Function RegExpReplace(LookIn As String, PatternStr As String,
Optional ReplaceWith As String = "", _
Optional ReplaceAll As Boolean = True, Optional MatchCase As
Boolean = True)
' This function uses Regular Expressions to parse a string, and
replace parts of the string
' matching the specified pattern with another string. The
optional argument ReplaceAll controls
' whether all instances of the matched string are replaced (True)
or just the first instance (False)

Static RegX As Object

If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
With RegX
.Pattern = PatternStr
.Global = ReplaceAll
.IgnoreCase = Not MatchCase
End With

RegExpReplace = RegX.Replace(LookIn, ReplaceWith)

End Function
On Monday, December 07, 2009 10:57 PM T. Valko wrote:
TRIM will remove all char 32 spaces except for single spaces between words.

CLEAN will remove *some* nonprintable characters.

If you know what the character codes are you can use the SUBSTITUTE
function:

=SUBSTITUTE(A1,CHAR(255),"")

=SUBSTITUTE(SUBSTITUTE(A1,CHAR(255),""),CHAR(160),"")
On Tuesday, December 08, 2009 7:06 AM Ron Rosenfeld wrote:
You'll need to define what your "special characters" are.

When you write that you cannot use what you used in the past, is that because
you are not allowed to use UDF's? Or because the previous UDF does not work?

If the former, you could use nested SUBSTITUTE's, and also, if required, TRIM
and/or CLEAN.

e.g.:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(A1)),"$",""),"%",""),CHAR(160),"")

and just extend the SUBSTITUTE nesting as required.

If you can still use a UDF, I would suggest this. Note that there are two ways
of expressing the pattern. One lists the characters you wish to keep. This
list is enclosed in brackets, and preceded by a "^" which means "match anything
NOT in the subsequent list".

The other method would be to list all of the special characters you wish to
remove. The caveat here is that if "-" is to be removed, it must appear first
or last. I have also included a \xA0 in the "to be removed list" to show how
certain ASCII characters can be listed. This is ASCII character 160 which is a
non-breaking space, often used in HTML documents.

==========================
Option Explicit
Function RemoveSpChars(s As String) As String
Dim re As Object

'List all allowed characters, but start with the negation "^"
Const sPat As String = "[^A-Z0-9 ]"

'List all characters to be removed, "-" must
' be listed first or last
'Const sPat As String = "[!@#$%^&*()_+=\xA0]"

Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = sPat
RemoveSpChars = re.Replace(s, "")
Set re = Nothing
End Function
================================
--ron
 
I tried a variation of this formula to clean up some text that was copied/pasted from MS Word with formatting intact. However, it doesn't do what I thought it would do. =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(K7)),"$",""),"%",""),CHAR(160)," "),CHAR(100)," "),CHAR(152)," ") results in replacing the letter "d" with a space. I thought that ASCII character 160 was a non-breaking space. Am I doing something wrong?
I am trying to figure out a formula that I can use to remove special
characters from Excel 2007 cells. I have used a user function in the
past but cannot this time. The Paste special; does not cut it either.

Does anyone know of a way to build a formula to remove special
characters?

Thank you,
Jeff


Here is what I have used ion the past that I cannot use now in case it
might help some one.

Here is first the formula used to call the function followed by the
code.

=RegExpReplace(EndUsrPhon1,"^ +| +$|\\n|/n| |[^0-9 ]","",TRUE,FALSE)



Function RegExpReplace(LookIn As String, PatternStr As String,
Optional ReplaceWith As String = "", _
Optional ReplaceAll As Boolean = True, Optional MatchCase As
Boolean = True)
' This function uses Regular Expressions to parse a string, and
replace parts of the string
' matching the specified pattern with another string. The
optional argument ReplaceAll controls
' whether all instances of the matched string are replaced (True)
or just the first instance (False)

Static RegX As Object

If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
With RegX
.Pattern = PatternStr
.Global = ReplaceAll
.IgnoreCase = Not MatchCase
End With

RegExpReplace = RegX.Replace(LookIn, ReplaceWith)

End Function
On Monday, December 07, 2009 10:57 PM T. Valko wrote:
TRIM will remove all char 32 spaces except for single spaces between words.

CLEAN will remove *some* nonprintable characters.

If you know what the character codes are you can use the SUBSTITUTE
function:

=SUBSTITUTE(A1,CHAR(255),"")

=SUBSTITUTE(SUBSTITUTE(A1,CHAR(255),""),CHAR(160),"")
On Tuesday, December 08, 2009 7:06 AM Ron Rosenfeld wrote:
You'll need to define what your "special characters" are.

When you write that you cannot use what you used in the past, is that because
you are not allowed to use UDF's? Or because the previous UDF does not work?

If the former, you could use nested SUBSTITUTE's, and also, if required, TRIM
and/or CLEAN.

e.g.:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(A1)),"$",""),"%",""),CHAR(160),"")

and just extend the SUBSTITUTE nesting as required.

If you can still use a UDF, I would suggest this. Note that there are two ways
of expressing the pattern. One lists the characters you wish to keep. This
list is enclosed in brackets, and preceded by a "^" which means "match anything
NOT in the subsequent list".

The other method would be to list all of the special characters you wish to
remove. The caveat here is that if "-" is to be removed, it must appear first
or last. I have also included a \xA0 in the "to be removed list" to show how
certain ASCII characters can be listed. This is ASCII character 160 which is a
non-breaking space, often used in HTML documents.

==========================
Option Explicit
Function RemoveSpChars(s As String) As String
Dim re As Object

'List all allowed characters, but start with the negation "^"
Const sPat As String = "[^A-Z0-9 ]"

'List all characters to be removed, "-" must
' be listed first or last
'Const sPat As String = "[!@#$%^&*()_+=\xA0]"

Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = sPat
RemoveSpChars = re.Replace(s, "")
Set re = Nothing
End Function
================================
--ron
 
I tried a variation of this formula to clean up some text that was copied/pasted from MS Word with formatting intact. However, it doesn't do what I thought it would do. =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(K7)),"$",""),"%",""),CHAR(160)," "),CHAR(100)," "),CHAR(152)," ") results in replacing the letter "d" with a space. I thought that ASCII character 160 was a non-breaking space. Am I doing something wrong?
I am trying to figure out a formula that I can use to remove special
characters from Excel 2007 cells. I have used a user function in the
past but cannot this time. The Paste special; does not cut it either.

Does anyone know of a way to build a formula to remove special
characters?

Thank you,
Jeff


Here is what I have used ion the past that I cannot use now in case it
might help some one.

Here is first the formula used to call the function followed by the
code.

=RegExpReplace(EndUsrPhon1,"^ +| +$|\\n|/n| |[^0-9 ]","",TRUE,FALSE)



Function RegExpReplace(LookIn As String, PatternStr As String,
Optional ReplaceWith As String = "", _
Optional ReplaceAll As Boolean = True, Optional MatchCase As
Boolean = True)
' This function uses Regular Expressions to parse a string, and
replace parts of the string
' matching the specified pattern with another string. The
optional argument ReplaceAll controls
' whether all instances of the matched string are replaced (True)
or just the first instance (False)

Static RegX As Object

If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
With RegX
.Pattern = PatternStr
.Global = ReplaceAll
.IgnoreCase = Not MatchCase
End With

RegExpReplace = RegX.Replace(LookIn, ReplaceWith)

End Function
On Monday, December 07, 2009 10:57 PM T. Valko wrote:
TRIM will remove all char 32 spaces except for single spaces between words.

CLEAN will remove *some* nonprintable characters.

If you know what the character codes are you can use the SUBSTITUTE
function:

=SUBSTITUTE(A1,CHAR(255),"")

=SUBSTITUTE(SUBSTITUTE(A1,CHAR(255),""),CHAR(160),"")
On Tuesday, December 08, 2009 7:06 AM Ron Rosenfeld wrote:
You'll need to define what your "special characters" are.

When you write that you cannot use what you used in the past, is that because
you are not allowed to use UDF's? Or because the previous UDF does not work?

If the former, you could use nested SUBSTITUTE's, and also, if required, TRIM
and/or CLEAN.

e.g.:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(CLEAN(A1)),"$",""),"%",""),CHAR(160),"")

and just extend the SUBSTITUTE nesting as required.

If you can still use a UDF, I would suggest this. Note that there are two ways
of expressing the pattern. One lists the characters you wish to keep. This
list is enclosed in brackets, and preceded by a "^" which means "match anything
NOT in the subsequent list".

The other method would be to list all of the special characters you wish to
remove. The caveat here is that if "-" is to be removed, it must appear first
or last. I have also included a \xA0 in the "to be removed list" to show how
certain ASCII characters can be listed. This is ASCII character 160 which is a
non-breaking space, often used in HTML documents.

==========================
Option Explicit
Function RemoveSpChars(s As String) As String
Dim re As Object

'List all allowed characters, but start with the negation "^"
Const sPat As String = "[^A-Z0-9 ]"

'List all characters to be removed, "-" must
' be listed first or last
'Const sPat As String = "[!@#$%^&*()_+=\xA0]"

Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = sPat
RemoveSpChars = re.Replace(s, "")
Set re = Nothing
End Function
================================
--ron
 
Back
Top