DB field name valid

  • Thread starter Thread starter mp
  • Start date Start date
M

mp

to ensure string is legal field name, first thought is below:

private string ConvertToDBFieldName ( string str )
{//disallowed characters in database fieldname ` . ! ’ [ ] " * ?
StringBuilder sb = new StringBuilder ( str );
sb.Replace ( " ", "_" );
sb.Replace ( "`", "" );
sb.Replace ( ".", "" );
sb.Replace ( "!", "" );
sb.Replace ( "’", "" );
sb.Replace ( "[", "" );
sb.Replace ( "]", "" );
sb.Replace ( "\"", "" );
sb.Replace ( "*", "" );
sb.Replace ( "?", "" );
return sb.ToString ( );
}
comments?
thanks
mark
 
Peter Duniho said:
to ensure string is legal field name, first thought is below:

private string ConvertToDBFieldName ( string str )
{//disallowed characters in database fieldname ` . ! ? [ ] " *
?
StringBuilder sb = new StringBuilder ( str );
sb.Replace ( " ", "_" );
sb.Replace ( "`", "" );
sb.Replace ( ".", "" );
sb.Replace ( "!", "" );
sb.Replace ( "?", "" );
sb.Replace ( "[", "" );
sb.Replace ( "]", "" );
sb.Replace ( "\"", "" );
sb.Replace ( "*", "" );
sb.Replace ( "?", "" );
return sb.ToString ( );
}
comments?

Another example of where the Regex class might be nicer. But even without
Regex, it can be somewhat simpler/more-readable:

private static HashSet<char> _removeChars =
new HashSet(new char[] { '`', '.', '!', '?', '[', ']', '\\', '*',
'?' });

private string ConvertToDBFieldName(string str)
{
StringBuilder sb = new StringBuilder(str.Length);

foreach (char ch in str)
{
if (ch == ' ')
{
sb.Append('_');
}
else if (!_removeChars.Contains(ch))
{
sb.Append(ch);
}
}

return str;
}

The use of repeated calls to Replace() is not quite as bad when using
StringBuilder as if you were using String. But it's still potentially
expensive, and in any case always excessively so. IMHO, simply filtering
as you copy characters is clearer and more efficient. What's not to like?
:)

Pete

very nice!
:-)
thanks
mark
ps I assumed you meant > return sb.ToString();
and it seemed to want HashSet<char>(new char[]

i'll have to give some thought to the regex version..
still not in my head yet.
:-)
i'm sure there's a simple negating character grouping construct...
i'll look it up
thanks again
mark
 
to ensure string is legal field name, first thought is below:

Field name for what DBMS? Access allows spaces, for example, as does SQL
Server. Also, and unfortunately, SQL Server allows at least one of the
bracket characters ( [ & ] ) inside a field name. I've actually run across
one before. I cursed the table creator with everything I had. Hopefully he
at least contracted some boils....
 
Jeff Johnson said:
Field name for what DBMS?

sqlite

Access allows spaces, for example, as does SQL

i know they're allowed, i just didn't want them

Also, and unfortunately, SQL Server allows at least one of the
bracket characters ( [ & ] ) inside a field name. I've actually run across
one before. I cursed the table creator with everything I had. Hopefully he
at least contracted some boils....

:-)
 
Big Steel said:
to ensure string is legal field name, first thought is below:

private string ConvertToDBFieldName ( string str )
{//disallowed characters in database fieldname ` . ! ’ [ ]
" * ? []>> }
comments?

It's overkill validation.

actually this just runs once, i'm just massaging some existing strings
and making them palatable for a CREATE tbl (fldName T,....) string
I knew the input strings had some flaky chars so just throwing together
a little cleanup temp method
 
Peter Duniho said:
[...]
ps I assumed you meant> return sb.ToString();
and it seemed to want HashSet<char>(new char[]

Yes to both. The hazards of posting from a Mac. :) Fortunately,
mistakes like those are easy to detect and fix. :)

yep, just tagged that on for sake of archives
thanks again
mark
 
Jeff Johnson said:
i'm sure there's a simple negating character grouping construct...

[^<your negative character list>]

thats' what I thought i remembered but had to look it up again to be sure
seems i have to put the carat outside ^[<your negative character list>]
i tried this
Regex rgx = new Regex ( @"^[\s()'-]" );

sb.Append ( rgx.Match ( str ) );

but that wiped everything out

so i tried

foreach (char c in str)
{
if (!rgx.IsMatch ( c.ToString ( ) ))
{
sb.Append ( c.ToString ( ) );
}
}

which worked but i don't think that's how i'm supposed to have to do it with
regex...i have to go back to my articles and re-read

thanks

mark
 
mp said:
Jeff Johnson said:
i'm sure there's a simple negating character grouping construct...

[^<your negative character list>]

thats' what I thought i remembered but had to look it up again to be sure
seems i have to put the carat outside ^[<your negative character list>]

no you big dummy that means beginning of string...
jeff was right
but for some reason my mangling appeared to work????
i'll have to look at this some more

:-\
 
mp said:
to ensure string is legal field name, first thought is below:

Field name for what DBMS? Access allows spaces, for example, as does SQL
Server. Also, and unfortunately, SQL Server allows at least one of the
bracket characters ( [& ] ) inside a field name. I've actually run across
one before. I cursed the table creator with everything I had. Hopefully he
at least contracted some boils....

Better to pick names that can be expected to be valid in
most/all databases.

Arne
 
i'm sure there's a simple negating character grouping construct...

[^<your negative character list>]

I think that is by far the best solution proposed so far.

Suggestion for implementation.

private static readonly Regex dbid = new Regex("[^A-Za-z0-9]+",
RegexOptions.Compiled);
public static string Str2DBID(string s)
{
return dbid.Replace(s, "");
}

Easy to read. Easy to modify. Because it has the list of allowed
characters instead of the list of disallowed characters then it
is a lot more robust to "unusual" input (chinese etc.).

Arne
 
to ensure string is legal field name, first thought is below:

private string ConvertToDBFieldName ( string str )
{//disallowed characters in database fieldname ` . ! � [ ] " * ?
StringBuilder sb = new StringBuilder ( str );
sb.Replace ( " ", "_" );
sb.Replace ( "`", "" );
sb.Replace ( ".", "" );
sb.Replace ( "!", "" );
sb.Replace ( "�", "" );
sb.Replace ( "[", "" );
sb.Replace ( "]", "" );
sb.Replace ( "\"", "" );
sb.Replace ( "*", "" );
sb.Replace ( "?", "" );
return sb.ToString ( );
}
comments?

It's overkill validation.

More like underkill.

There are many other problematic characters.

Arne
 
i'm sure there's a simple negating character grouping construct...

[^<your negative character list>]

I think that is by far the best solution proposed so far.

By "so far" you mean as in it was suggested in the very first reply? :)

????

It was not the first reply.

And I did not see anyone else pointing to [^...].
Suggestion for implementation.

private static readonly Regex dbid = new Regex("[^A-Za-z0-9]+",
RegexOptions.Compiled);
public static string Str2DBID(string s)
{
return dbid.Replace(s, "");
}

Easy to read. Easy to modify. Because it has the list of allowed
characters instead of the list of disallowed characters then it
is a lot more robust to "unusual" input (chinese etc.).

As I mentioned in my first response, I think Regex can be very useful in
this kind of situation. The main issue is that he actually has two
different mappings; he wants ' ' to map to '_' as well, which would
require two passes through the string if using Regex.

Replacing spaces with underscore could just be done with replace:

public static string Str2DBID(string s)
{
return dbid.Replace(s.Replace(" ", "_"), "");
}
The alternative I mentioned (and for which I posted code) isn't quite as
elegant as Regex, but it has a potential performance advantage.

Of course, it may or may not be the case that that performance advantage
is important and/or realized. But at least it's better than a whole
bunch of calls to StringBuilder.Replace(). :)

Id to database name does not something that should be a
performance critical operation.

Arne
 
[...]
By "so far" you mean as in it was suggested in the very first reply? :)

????

It was not the first reply.

Another example of where the Regex class might be nicer. But even
without Regex, it can be somewhat simpler/more-readable:

That was the very first reply, 14 minutes after the original message. I
didn't mention the exact Regex syntax specifically, but it's implied.

Well you may consider "the Regex class might be nicer" to be a
solution.

But I don't - "implied" solutions are rarely helpful.

Jeff posted the regex to do it.
[...]
As I mentioned in my first response, I think Regex can be very useful in
this kind of situation. The main issue is that he actually has two
different mappings; he wants ' ' to map to '_' as well, which would
require two passes through the string if using Regex.

Replacing spaces with underscore could just be done with replace:

public static string Str2DBID(string s)
{
return dbid.Replace(s.Replace(" ", "_"), "");
}

So? That's still two passes through the string. It really doesn't matter
whether you're using Regex or String.Replace().

Sure it is two passes. But I am really not concerned about that.

And it does matter whether it is regex or string replace. Replacing
space with underscore is logical a simple replace operation and
should not be done with a pattern matching mechanism.

It is about simple understandable code not about performance.

Arne
 
Back
Top