Using string with unicode chars in OleDb command

P

Pavils Jurjans

Hello, here's my testing code:

string dbConnStr = "*connstr*";
string sqlQuery;
OleDbCommand myCommand;
OleDbConnection myConnection = new OleDbConnection(dbConnStr);
myConnection.Open();

bool useParameters = false;

if (useParameters)
{
// With parameters
sqlQuery = "UPDATE App_Companies SET cmpNotes = ? WHERE cmpID = 0";
myCommand = new OleDbCommand(sqlQuery, myConnection);
myCommand.Parameters.Add("notes", OleDbType.VarWChar).Value = "*some
japanese characters*";
}
else
{
// With direct execute
sqlQuery = "UPDATE App_Companies SET cmpNotes = '*some japanese
characters*' WHERE cmpID = 0";
foreach (char c in sqlQuery) Console.Write ("{0:x4} ", (int)c); // DEBUG
myCommand = new OleDbCommand(sqlQuery, myConnection);
}

I save this CS file in UTF-8 encoding, and I know that csc.exe understands
it correctly, as I get correct results with using parameters. However, I
fail to update the ntext field of my database, using the unicode chars in
the sql statemnt itself. Is there some settings I can adjust to be able
*not* to use the parameters? Why is this difference in the first place?

Thanks,

-- Pavils
 
M

Miha Markic [MVP C#]

Hi,

It is not clear to me, why don't you just use parameters as it is faster and
way more secure way.
I guess that behaviour with direct statement is linked to sql server client
settings.
 
J

Jon Skeet [C# MVP]

It is not clear to me, why don't you just use parameters as it is faster and
way more secure way.

Just to back this up - I totally agree. Keep the SQL simple and in
ASCII, and use parameters for the query.
 
P

Pavils Jurjans

Hello,
It is not clear to me, why don't you just use parameters as it is faster and
way more secure way.

In this particular test case, it's of course enough to be able to use
parameters. But, I am about to have some queries whose SELECT statements and
WHERE statements are built dynamically, and in those situations parameters
are not of help. What regards the security, it's not too hard to escape the
special sharacters, etc, so that's not an issue.
I guess that behaviour with direct statement is linked to sql server client
settings.

What settings exactly? I don't see any property of connection or command
object, that would be connected in any way with multi-byte string
interpretation.


-- Pavils
 
J

Jon Skeet [C# MVP]

Pavils Jurjans said:
In this particular test case, it's of course enough to be able to use
parameters. But, I am about to have some queries whose SELECT statements and
WHERE statements are built dynamically, and in those situations parameters
are not of help. What regards the security, it's not too hard to escape the
special sharacters, etc, so that's not an issue.

I believe it's a lot harder to *correctly* escape special characters in
*all* cases than it is to dynamically generate parameters. Why do you
believe that parameters are of no help with dynamically generated WHERE
statements?
 
P

Pavils Jurjans

Hello John,
I believe it's a lot harder to *correctly* escape special characters in
*all* cases than it is to dynamically generate parameters.

Hmm, what's so hard in this?
Why do you believe that parameters are of no help with dynamically
generated WHERE statements?

I have a situation, where SELECT is done on the two table join, where the
second table is another SELECT statement, that may vary from case to case.
As far as I know, and maybe I am mistaken, there is no way to put the inner
SQL statement in as a parameter, and then let the Command object parse the
inner statement with the same set of parameters so that dynamic WHERE
criteria values are inserted correctly. Also, I can't insert the field name
in sql statement, using parameter. Parameters are generally designed for
inserting literal constants. Of course, I could build the sql statement
using just string concatenation, to the point where it's almost "final", but
there's a line of "?"s, pending for parameter insertion. Then, according to
the prepared statement, add the parameters in correct order (as there is no
way to use parameter names in OleDb), and the execute it. I see that it's
absolutely possible, but I don't like the two-stage way of this, and also
the complexities what arise in thinking out correct order of the parameters.

As for my initial question, I've found the answer, it's in KB article
#239530, and it states, that I have to prefix unicode string literals with
"N", if I'm using MSSQL server (which is my current development platform,
but the project has to be database-agnostic, so that's why I use OleDb).

Rgds,

Pavils
 
J

Jon Skeet [C# MVP]

Pavils Jurjans said:
Hmm, what's so hard in this?

Getting it right for all types and across different database vendors.
As William Ryan has said before, no-one every suffered a SQL injection
attack because of parameters...
I have a situation, where SELECT is done on the two table join, where the
second table is another SELECT statement, that may vary from case to case.
As far as I know, and maybe I am mistaken, there is no way to put the inner
SQL statement in as a parameter, and then let the Command object parse the
inner statement with the same set of parameters so that dynamic WHERE
criteria values are inserted correctly.

No - you put everything in the SQL apart from the values of actual
parameters. All of that SQL should be ASCII with nothing to escape.
Also, I can't insert the field name in sql statement, using parameter.

True (AFAIK). You would have to include the field name directly in the
SQL statement - but there you can just make sure that the field name
only contains ASCII characters, and disallow things like quote
characters in field names. It's a lot easier to enforce relatively
rigid criteria than to make sure you can escape things, IMO.
Parameters are generally designed for inserting literal constants.

Could you expand on that?
Of course, I could build the sql statement
using just string concatenation, to the point where it's almost "final", but
there's a line of "?"s, pending for parameter insertion. Then, according to
the prepared statement, add the parameters in correct order (as there is no
way to use parameter names in OleDb), and the execute it. I see that it's
absolutely possible, but I don't like the two-stage way of this, and also
the complexities what arise in thinking out correct order of the parameters.

I believe that's likely to be the best course though.
As for my initial question, I've found the answer, it's in KB article
#239530, and it states, that I have to prefix unicode string literals with
"N", if I'm using MSSQL server (which is my current development platform,
but the project has to be database-agnostic, so that's why I use OleDb).

So there's one "gotcha" about putting the values directly into the SQL
- a task you thought was easy. Now, how can you be absolutely sure that
it's the only "gotcha" that you didn't previously know about?
 
P

Pavils Jurjans

Getting it right for all types and across different database vendors.
As William Ryan has said before, no-one every suffered a SQL injection
attack because of parameters...

That's true, I like parameters, they are here to help me, and I admit their
benefits. The problems appear when I need to insert non-literal values, such
as inline sql statements, or part of sql statements.
True (AFAIK). You would have to include the field name directly in the
SQL statement - but there you can just make sure that the field name
only contains ASCII characters, and disallow things like quote
characters in field names. It's a lot easier to enforce relatively
rigid criteria than to make sure you can escape things, IMO.

There's no talk about that the field names contain special characters, or
even unicode characters. That would be a very stupid thing to do, while
designing the schemas. However, dynamically changing the fields in the sql
request, via parameters, is not possible (as well as any other non-literal
part of sql statement). I am always bound to do thinkg in two stages -
concatenate stings so that the right fields are selected, and the right
order statement is used, and then inject parameters that supply literal
values. I also can not inject part of literal value, for example, if I want
to look for all strings that contain "abc", I want to use this statement:
WHERE myField like '%abc%'. Using parameters, I cant inject the parameter in
between the %%'s, so I need to adjust it beforehand. It's not exaclty huge
work to do, but it just points to little things that parameters are not
capable to do.
Could you expand on that?

I think I wanted to say "literal values". I don't know what's the right
semantics here. I mean, that parameters can be injected in places, where
there would be literal values in the sql statement, such as 5, 'abc', etc.
parameters.

I believe that's likely to be the best course though.

The best, if I'd settle for using parameters. However, I have code from
other projects that bould be called a "query builder', that handles what
parameters can do, and also is able to inject values without any adjusting
(as for field names and whole WHERE statements) (and yes, I am aware that
the source for these parameters in no way should be taken from form data in
web application). Of course, the code should be slightly adjusted for every
different database engine, but if I organize the code so that the string
formatting for different types is easily adjustable, there's no any problem
here.
So there's one "gotcha" about putting the values directly into the SQL
- a task you thought was easy. Now, how can you be absolutely sure that
it's the only "gotcha" that you didn't previously know about?

Well, there could be more gotchas, as in every project, where no 100% of
"how things exactly work" is known. The point in this case would be to
estimate the costs of developing an alternative query builder, vs doing the
two-stage query building, ie. first with string concatenation, then
injecting values using parameters.

Rgds,

Pavils
 
M

Miha Markic [MVP C#]

I want to use this statement:
WHERE myField like '%abc%'. Using parameters, I cant inject the parameter in
between the %%'s, so I need to adjust it beforehand. It's not exaclty huge
work to do, but it just points to little things that parameters are not
capable to do.

True, but you could add % to parameter value and use "WHERE myField like
@param".
 
J

Jon Skeet [C# MVP]

Pavils Jurjans said:
That's true, I like parameters, they are here to help me, and I admit their
benefits. The problems appear when I need to insert non-literal values, such
as inline sql statements, or part of sql statements.


There's no talk about that the field names contain special characters, or
even unicode characters. That would be a very stupid thing to do, while
designing the schemas.

Indeed - so you check for that when validating the input, and you don't
need to worry about escaping etc.
However, dynamically changing the fields in the sql
request, via parameters, is not possible (as well as any other non-literal
part of sql statement). I am always bound to do thinkg in two stages -
concatenate stings so that the right fields are selected, and the right
order statement is used, and then inject parameters that supply literal
values. I also can not inject part of literal value, for example, if I want
to look for all strings that contain "abc", I want to use this statement:
WHERE myField like '%abc%'. Using parameters, I cant inject the parameter in
between the %%'s, so I need to adjust it beforehand. It's not exaclty huge
work to do, but it just points to little things that parameters are not
capable to do.

Yes, it's a bit of work - but I believe it's well worth it in the long
run.
I think I wanted to say "literal values". I don't know what's the right
semantics here. I mean, that parameters can be injected in places, where
there would be literal values in the sql statement, such as 5, 'abc', etc.

Ah, right. Yes, I agree.

The best, if I'd settle for using parameters. However, I have code from
other projects that bould be called a "query builder', that handles what
parameters can do, and also is able to inject values without any adjusting
(as for field names and whole WHERE statements) (and yes, I am aware that
the source for these parameters in no way should be taken from form data in
web application). Of course, the code should be slightly adjusted for every
different database engine, but if I organize the code so that the string
formatting for different types is easily adjustable, there's no any problem
here.

Assuming you've got it right, of course. I know that I trust the
authors of DB providers to do escaping etc better than I can - they
know their product inside out, I hope. They've already done the hard
work for you in parameters, so why reproduce that effort? I know it
means expending effort somewhere else (the "like" handling, field names
etc) but I'd rather put a bit more effort in and be confident that I
wasn't going to suffer a SQL injection attack.
Well, there could be more gotchas, as in every project, where no 100% of
"how things exactly work" is known. The point in this case would be to
estimate the costs of developing an alternative query builder, vs doing the
two-stage query building, ie. first with string concatenation, then
injecting values using parameters.

.... and then estimate the costs to your business if you make a mistake
and a company loses some critical data due to an escaping bug.

Obviously it's your choice - but even after all you've written, I'd
strongly recommend using parameters wherever possible, even if it means
a bit more work.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top