Validation of text

Guest · Jun 28, 2006

Is it possible to validate entered text so that it must start with a
predefine letter, say 'H', followed by a numeric value of varying length?

Guest · Jun 28, 2006

Sure, in the Before Update event of the text box:

If Left(Me.SomeControl, 1) <> "H" Then
MsgBox "Must Start with H"
Cancel = True
ElseIf Not IsNumeric(Right(Me.SomeControl, Len(Me.SomeControl) -1)) Then
MsgBox "Non - Numeric Characters found after the H"
Cancel = True
End If

onedaywhen · Jun 28, 2006

Klatuu said:
Sure, in the Before Update event of the text box

The OP is also advised to have a validation rule or CHECK constraint in
the database.

The basic pattern is

NOT LIKE 'H*[!0-9]*'

To do this properly, both ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching between DAO and ADO
could (inadvertently) circumvent data integrity checks e.g.

CREATE TABLE Test3 (
data_col VARCHAR(10) NOT NULL,
CONSTRAINT data_col__pattern
CHECK (
data_col NOT LIKE 'H%[!0-9]%'
AND data_col NOT LIKE 'H*[!0-9]*'
)
);

Jamie.

--

Guest · Jun 28, 2006

I would advise the OP Not to use database level validation. Your post,
itself, is a good argument against it.

onedaywhen said:
Sure, in the Before Update event of the text box

Click to expand...

The OP is also advised to have a validation rule or CHECK constraint in
the database.

The basic pattern is

NOT LIKE 'H*[!0-9]*'

To do this properly, both ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching between DAO and ADO
could (inadvertently) circumvent data integrity checks e.g.

CREATE TABLE Test3 (
data_col VARCHAR(10) NOT NULL,
CONSTRAINT data_col__pattern
CHECK (
data_col NOT LIKE 'H%[!0-9]%'
AND data_col NOT LIKE 'H*[!0-9]*'
)
);

Jamie.

onedaywhen · Jun 28, 2006

Klatuu said:
I would advise the OP Not to use database level validation.
Because...?

Your post,
itself, is a good argument against it.

I don't understand. I'm in favour of database validation, not arguing
against it. What do you mean?

TIA,
Jamie.

--

Guest · Jun 28, 2006

onedaywhen said:
Because...?

To do this properly, both ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching between DAO and ADO
could (inadvertently) circumvent data integrity checks e.g.

Because table and field level validation are product specific. Should you
switch between ADO and DAO or decide to upsize to SQL Server, Oracle, etc. It
requires a lot of rewrite. Validation done at the form level doesn't care
what the DB engine is.

I don't understand. I'm in favour of database validation, not arguing
against it. What do you mean?

There are arguments on both sides of this issue. I prefer form level
validation because it is not DB engine specific and because I feel I have
more control over the error handling that way.

If any MVPs have weighed in in favor of table/field validation, I have not
seen it. If any choose to and have a convincing argument in it's favor, I am
open to new ideas.

onedaywhen · Jun 29, 2006

Klatuu said:
I would advise the OP Not to use database level
validation [because] to do this properly, both
ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching
between DAO and ADO could (inadvertently)
circumvent data integrity checks

I'm not sure what you are saying here.

If you are saying that designing effective database constraints is hard
work then I entirely agree. If you conclude that database constraints
are not worth the effort then I couldn't disagree more.

Consider that if you have no database level constraints then your data
integrity is can be destroyed by DAO *and* ADO...actually any
technology that can connect to your database e.g. do your users have
Excel? But most significantly, your data integrity is at the mercy of
bugs in your front end application.

Because table and field level validation are product specific. Should you
upsize to SQL Server, Oracle, etc. It
requires a lot of rewrite.

You have portability issues in mind and that is a good thing.

I have been suggesting that you write your database constraints in
standard SQL *because* it can be easily ported to other platforms. Can
your VBA code by easily ported to other front end application
programming languages? Does it even port to VB.NET? Other forms-based
apps in the MS Office suite?

There are arguments on both sides of this issue.

Of course. But have you considered that you can (and if you are doing
your best, should) write form-level validation *and* database level
validation? Yes, more hard work but that's what being a professional is
all about, don't you think?

If any MVPs have weighed in in favor of table/field validation, I have not
seen it. If any choose to and have a convincing argument in it's favor, I am
open to new ideas.

I'm glad to hear you are open-minded but the idea of putting
constraints in the database is not new.

OK, if MVPs is what it takes, I'll search on my favourite MVPs (in no
particular order):

John Vinson
http://groups.google.com/group/microsoft.public.access.forms/msg/a721397365f07afe
"Good point Jamie - though the OP crossposted to forms and formscoding
as well. It would probably be wise to do both - on the Form so you can
control the error message and make it friendlier (or more hostile if
you prefer <g>), and in the table to prevent "backdoor" entry of
invalid data."

Allen Browne
http://groups.google.com/group/microsoft.public.access.formscoding/msg/312525459bb1531e
"Start with a bound form (bound to a table), with bound controls (bound
to
fields.) You may find that you don't actually need any code.
Particularly if
you used the Validation Rule property of the fields in your table, you
can
probably avoid any code at all."

Albert D.Kallal
http://groups.google.com/group/microsoft.public.access.forms/msg/15d1f38048d27ac8
By the way, we could have skipped the "code" example, and not used the
before update. One could simply enter the following expression as a
validation rule in the properties sheet.

Ken Snell
http://groups.google.com/group/microsoft.public.access.tablesdbdesign/msg/52f746a8b0ee99fd
"where you use the Validation Rule sometimes is a matter of
preference; other times, it's a matter of where it's easier to
write/test/handle the validation. If you put the Validation Rule in the

table, ACCESS will generate an error message that may or may not be
meaningful to your user"

Michel 'Vanderghast' Walsh
http://groups.google.com/group/microsoft.public.access.queries/msg/c66ec7e912d24be8
"I wrote many CHECK constraint between tables, before, in Jet"

Tibor Karaszi
http://groups.google.com/group/microsoft.public.sqlserver.programming/msg/c33934af18bc8c0a
['Why do we do double work?'] "Nothing new here, just wanted to push
the point that constraints can be a great timesaver by
finding incorrect code immediately!"

I'm not saying the above people agree with any particular point of
view, rather I would suggest they at least seem *not* to opposed the
practise of putting constraints in the database layer.

BTW why are you fixating on MVPs? I'm sure every one of them would
agree there are countless more enthusiastic volunteers (Tim Fergusson
and david Epsom come quickly to mind), eminent authors, etc whose
opinions are just as valuable.

I would urge you to spend a few minutes reading this series of three
short articles on the location of database constraints:

http://www.dbazine.com/ofinterest/oi-articles/celko27/

Jamie.

--

onedaywhen · Jun 29, 2006

onedaywhen said:
The basic pattern is

NOT LIKE 'H*[!0-9]*'

Further thought: you may want to avoid 'leading' zeros in the numeric
portion e.g. basic pattern:

NOT LIKE 'H*[!0-9]*' AND NOT LIKE 'H0*'

Jamie.

--

Guest · Jun 29, 2006

Good morning.
Great Feedback.

onedaywhen said:
I would advise the OP Not to use database level
validation [because] to do this properly, both
ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching
between DAO and ADO could (inadvertently)
circumvent data integrity checks

Click to expand...

I'm not sure what you are saying here.

If you are saying that designing effective database constraints is hard
work then I entirely agree. If you conclude that database constraints
are not worth the effort then I couldn't disagree more.

It is not that they are not worth the effort. It is that I find it more
difficult to respond to a database generated error than an application
generated error. But, that may be just my problem. I would have to admit,
like most people, I tend to stay with what is familiar. I began programming
before there were databases. Everything was done at the application level.
There were no relationships or referential integrity that you didn't write
yourself.

Consider that if you have no database level constraints then your data
integrity is can be destroyed by DAO *and* ADO...actually any
technology that can connect to your database e.g. do your users have
Excel? But most significantly, your data integrity is at the mercy of
bugs in your front end application.

I agree.

You have portability issues in mind and that is a good thing.

I have been suggesting that you write your database constraints in
standard SQL *because* it can be easily ported to other platforms. Can
your VBA code by easily ported to other front end application
programming languages? Does it even port to VB.NET? Other forms-based
apps in the MS Office suite?

This is not a valid argument. All applications are product specific. But
then, and perhaps to a lesser degree, so are databases. Standard SQL isn't.
And that is not surprising. Since there have been computers, committies of
one kind or another have come up with "Standards". Then all the vendors get
on board and say "We support THE Standard <insert technology here>, but we
also offer 'extensions'". To name a few: COBOL, BASIC, XBASE, Java, SQL,
etc, etc, etc.
So we programmers start using the extensions and soon have a product that is
difficult to port. Even considering upsizing an mdb app to an adp app using
SQL Server takes more than running the upsize wizard.
That being said, we still want to consider the impact on upsizing when we
design or apps and databases.

Of course. But have you considered that you can (and if you are doing
your best, should) write form-level validation *and* database level
validation? Yes, more hard work but that's what being a professional is
all about, don't you think?

Sure. I will certainly agree that using database level validation where it
is appropriate, in a perfect world, would be the most ideal situation for all
the obvious reasons. My main reservation is that dealing with a database
level error is much more difficult in Access than dealing with an application
level error.

I'm glad to hear you are open-minded but the idea of putting
constraints in the database is not new.

After 30 years in this business, there is very little, at the conceptual
level, that is new. My first encounter with the concept of enforced data
integrity was in a COBOL/ISAM shop where all IO was done using canned
procedures that enforced the data rules.

OK, if MVPs is what it takes, I'll search on my favourite MVPs (in no
particular order):

John Vinson
http://groups.google.com/group/microsoft.public.access.forms/msg/a721397365f07afe
"Good point Jamie - though the OP crossposted to forms and formscoding
as well. It would probably be wise to do both - on the Form so you can
control the error message and make it friendlier (or more hostile if
you prefer <g>), and in the table to prevent "backdoor" entry of
invalid data."

Allen Browne
http://groups.google.com/group/microsoft.public.access.formscoding/msg/312525459bb1531e
"Start with a bound form (bound to a table), with bound controls (bound
to
fields.) You may find that you don't actually need any code.
Particularly if
you used the Validation Rule property of the fields in your table, you
can
probably avoid any code at all."

Albert D.Kallal
http://groups.google.com/group/microsoft.public.access.forms/msg/15d1f38048d27ac8
By the way, we could have skipped the "code" example, and not used the
before update. One could simply enter the following expression as a
validation rule in the properties sheet.

Ken Snell
http://groups.google.com/group/microsoft.public.access.tablesdbdesign/msg/52f746a8b0ee99fd
"where you use the Validation Rule sometimes is a matter of
preference; other times, it's a matter of where it's easier to
write/test/handle the validation. If you put the Validation Rule in the

table, ACCESS will generate an error message that may or may not be
meaningful to your user"

This is my main aversion to database level validation:

If you put the Validation Rule in the table, ACCESS will generate an error
message that may or may not be meaningful to your user"

Michel 'Vanderghast' Walsh
http://groups.google.com/group/microsoft.public.access.queries/msg/c66ec7e912d24be8
"I wrote many CHECK constraint between tables, before, in Jet"

Tibor Karaszi
http://groups.google.com/group/microsoft.public.sqlserver.programming/msg/c33934af18bc8c0a
['Why do we do double work?'] "Nothing new here, just wanted to push
the point that constraints can be a great timesaver by
finding incorrect code immediately!"

I'm not saying the above people agree with any particular point of
view, rather I would suggest they at least seem *not* to opposed the
practise of putting constraints in the database layer.

BTW why are you fixating on MVPs? I'm sure every one of them would
agree there are countless more enthusiastic volunteers (Tim Fergusson
and david Epsom come quickly to mind), eminent authors, etc whose
opinions are just as valuable.

Not fixating on MVPs. There are some who are not, I think, that are more
knowledgeable than some MVPs. It is only a credential. MVP is a
recongnition of knowledge and participation. But, let is not confuse
knowledge with wisdom

I would urge you to spend a few minutes reading this series of three
short articles on the location of database constraints:

http://www.dbazine.com/ofinterest/oi-articles/celko27/

Good articles by an arogant author. Thanks for the reference.

I certinly have enjoyed our discussion. You make some good points worth
serious consideration.

onedaywhen · Jun 29, 2006

Klatuu said:
After 30 years in this business, there is very little, at the conceptual
level, that is new. My first encounter with the concept of enforced data
integrity was in a COBOL/ISAM shop where all IO was done using canned
procedures that enforced the data rules.

I'm just about 'wise' enough to have used pre-SQL ISAMs. That's why I
just love thing like CHECK constraints in the database: simple,
flexible, powerful. They do port to other SQLs quite well, noting that
I have to dumb-down when porting to SQL Server because Jet's are better
implemented.

I certinly have enjoyed our discussion. You make some good points worth
serious consideration.

Likewise ;-)

Jamie.

--

Guest · Jun 29, 2006

Interesting idea, but I still am not sure how to deal with db level
validation. Any pointers there would be appreciated.

onedaywhen · Jun 30, 2006

Klatuu said:
Interesting idea, but I still am not sure how to deal with db level
validation. Any pointers there would be appreciated.

Well, if we're talking about database validation then let's just
consider Jet's CHECK constraints and only worry for now about ANSI
wildcard characters: % for multiple characters and _ (underscore) for a
single character.

The way I see it there are two flavours of Jet CHECK constraint: row
level and table level (but in reality they are the same animal).

Access's Validation Rules are considered to be either at the 'column'
level or at the 'table' level, though I'd argue the latter is actually
at the *row* level because one can reference multiple columns in the
same row but not other rows in the table. I think the differentiation
is based on the Validation Text property more than anything.

Basic validation of values to enforce business rules, like the one that
started this thread: "must start with a the letter 'H' followed by a
numeric value of varying length". I also assume that H01 is illegal
because it is considered equivalent to H1 and that H0 is illegal
because it is effectively 'zero'.

Jet allows use to define multiple CHECK constraints per row/table,
therefore we can follow Celko's advice in the article (up thread) and
split these into multiple rules. It would be good to set a Validation
Text property for each but because it is tied to closely to Access
we're only allowed one per *table*. The best we can do is use a
meaningful name. Remember it is our intention that we will trap input
errors with validation in the front end application so the user being
exposed to these in error messages is a 'last resort':

CREATE TABLE Employees (
employee_nbr VARCHAR(11) NOT NULL PRIMARY KEY,
CONSTRAINT employee_nbr__basic_pattern
CHECK (employee_nbr NOT LIKE 'H%[!0-9]%'),
CONSTRAINT employee_nbr__no_leading_zeros
CHECK (employee_nbr NOT LIKE 'H0%')
);

At this point, before we start building the front end, we should test
each rule e.g.

values I expect to pass:
INSERT INTO Employees (employee_nbr) VALUES ('H1');
INSERT INTO Employees (employee_nbr) VALUES ('H900');

values I expect to fail:
INSERT INTO Employees (employee_nbr) VALUES ('F111');
INSERT INTO Employees (employee_nbr) VALUES ('H1A3');
INSERT INTO Employees (employee_nbr) VALUES ('H0');
INSERT INTO Employees (employee_nbr) VALUES ('H05');

[Just to prove this is a worthy exercise, my first insert inexplicable
failed. A quick look at my code revealed a typo [!9-0], which could
have been tricky to debug later.]

Validation using multiple values on the same row. For our payroll
table, let's model periods of earnings history in the recommended way
using closed-open representation with start_date and end date pairs, a
null end date signifying the current pay period.

Basic validation of the dates is that the start date will be midnight
and the end date will be the smallest granule (for Jet this is one
second) before midnight.

Note that although end_date can be NULL there is no need to explicitly
test for NULL in validation. This is due to the nature of nulls in SQL
DDL (data declaration language e.g. table design). Whereas in SQL DML
(data declaration language e.g. queries) an UNKNOWN resulting from a
comparison with a NULL value causes rows to be removed from a
resultset, in SQL DDL the UNKNOWN cannot be known to fail the rule
therefore it is allowed to pass. This makes sense when you think of our
NULL end data as being a placeholder for a date which will certainly be
known at some time in the future, so it is right to defer validation
until the value is known. I think this is mainly due to pragmatics
though i.e. without this implicit behaviour, validation of nullable
columns would *always* have to explicitly handle NULL.

An obvious domain rule, yet one that is often missed in database
validation rules, is that the end date cannot occur before the start
date in the same row. Similarly, salary cannot be negative.

Another obvious one is that each employee there should only be a
maximum of one row with a null date, which is best enforces with a
UNIQUE constraint, because the end dates should not be duplicated
either for an employee. Start dates are similarly unique for an
employee and, because they are not nullable, makes a good compound
natural key:

CREATE TABLE EarningsHistory (
employee_nbr VARCHAR(11) NOT NULL
REFERENCES Employees (employee_nbr)
ON DELETE NO ACTION
ON UPDATE CASCADE,
start_date DATETIME DEFAULT DATE() NOT NULL,
CONSTRAINT earnings_start_date__open_interval
CHECK(
HOUR(start_date) = 0
AND MINUTE(start_date) = 0
AND SECOND(start_date) = 0),
end_date DATETIME,
CONSTRAINT earnings_end_date__one_granule_closed_interval
CHECK(
HOUR(end_date) = 23
AND MINUTE(end_date) = 59
AND SECOND(end_date) = 59),
CONSTRAINT earnings_dates_order
CHECK (start_date < end_date),
salary_amount CURRENCY NOT NULL,
CONSTRAINT earnings_salary_amount__value
CHECK (salary_amount >= 0),
UNIQUE (employee_nbr, end_date),
PRIMARY KEY (employee_nbr, start_date)
);

Once again, test immediately.

Rows I expected to pass:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-01-01 00:00:00#, #2004-12-31 23:59:59#, 10000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-01-01 00:00:00#, NULL, 12000.00)
;

Rows I expected to fail:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 00:00:00#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 14:29:39#, #2005-12-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-12-01 00:00:00#, #2005-01-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 23:59:59#, -99.99)
;

[Once again my first test insert revealed an error: a copy and paste
resulted in me testing start_date in my rule for end_date!]

The data integrity of our payroll table is quite good. If we can't get
bad values in using direct INSERTs then no front end application can.

However, the constraints are not complete to my satisfaction. For
example, this spoils the data:

INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-11-01 00:00:00#, #2006-02-28 23:59:59#, 11000.00)
;

It passes all the existing constraints yet it creates bad (illogical)
data. Note end_date can be null, signifying the current date, so we'll
replace it with the current date in queries:

SELECT employee_nbr,
#2006-01-01# AS report_date, salary_amount
FROM EarningsHistory
WHERE #2006-01-01# BETWEEN start_date AND
IIF(end_date IS NULL, NOW(), end_date);

According to the data, the employee was earning two different amounts
simultaneously. This could result in some tricky situations so a
further constraint is required.

To write this kind of constraint I usually start with some data that
fails the rule, write a query that identifies all the bad rows, then
turn it into a constraint.

First some more bad data:

INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-03-01 00:00:00#, #2006-04-01 23:59:59#, 15000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-07-01 00:00:00#, #2004-07-31 23:59:59#, 16000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-05-01 00:00:00#, #2006-12-31 23:59:59#, 17000.00)
;

As everyone knows, if they think about it, that a period overlaps a
later period if its end date occurs before the later period's start
date (I think...I'm doing this all off the top of my head!) Therefore,
I think this should identify the bad data:

SELECT *
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)

I then start deleting the bad rows individually. The resultset should
only return zero rows when all the bad data has been removed, so I run
the query after each single row deletion:

DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-11-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-03-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-07-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2006-05-01 00:00:00#
;

OK, only after the final delete did the resultset return empty. I'll
assume the logic is sound and convert it to a constraint:

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__no_overlapping_periods
CHECK (0 = (
SELECT *
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)
)
);

Try the inserts again and they should all fail.

Some further business rules spring to mind. Contiguous periods were the
salary amounts are the same is useless information indicative of an
error:

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__contiguous_periods_salary_must_change
CHECK (0 = (
SELECT COUNT(*)
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr =
E2.employee_nbr
AND DATEADD('s', 1,
IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)) = E2.start_date
AND EarningsHistory.salary_amount = E2.salary_amount
)
);

Another rule could disallow gaps between periods (if they still
employed but aren't being paid then add a period where the salary is
zero):

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__periods_must_be_contiguous
CHECK ( 0 = (
SELECT COUNT(*)
FROM EarningsHistory AS E1
WHERE EXISTS (
SELECT *
FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND E1.start_date < E2.start_date)
AND NOT EXISTS (
SELECT * FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND DATEADD('s', 1, E1.end_date) = E2.start_date
)
)
);

Then there are CHECK constraints that can reference rows in other
tables...

It should be obvious by now that writing validation rules in the
database layer is as easy as writing a query.

Jamie

--

Guest · Jun 30, 2006

I guess my question wasn't clear.
The creation of the rules is not a problem. What I am not clear on is how
to recognize and deal with a validation violation at the application level.
For example, using Form level validation for an End Date, I woul use the
control's Before Update event, check to see that the End Date is > Start
Date, and if not, present a message box to the user and cancel the Update.

What is the process using database validation rule to accomplish the same
thing and not have the user see one of Access' totally meaningless error
messages?

BTW, don't be so quick to put a < 0 constraint on salary. You haven't seen
my pay check

onedaywhen said:
Interesting idea, but I still am not sure how to deal with db level
validation. Any pointers there would be appreciated.

Click to expand...

Well, if we're talking about database validation then let's just
consider Jet's CHECK constraints and only worry for now about ANSI
wildcard characters: % for multiple characters and _ (underscore) for a
single character.

The way I see it there are two flavours of Jet CHECK constraint: row
level and table level (but in reality they are the same animal).

Access's Validation Rules are considered to be either at the 'column'
level or at the 'table' level, though I'd argue the latter is actually
at the *row* level because one can reference multiple columns in the
same row but not other rows in the table. I think the differentiation
is based on the Validation Text property more than anything.

Basic validation of values to enforce business rules, like the one that
started this thread: "must start with a the letter 'H' followed by a
numeric value of varying length". I also assume that H01 is illegal
because it is considered equivalent to H1 and that H0 is illegal
because it is effectively 'zero'.

Jet allows use to define multiple CHECK constraints per row/table,
therefore we can follow Celko's advice in the article (up thread) and
split these into multiple rules. It would be good to set a Validation
Text property for each but because it is tied to closely to Access
we're only allowed one per *table*. The best we can do is use a
meaningful name. Remember it is our intention that we will trap input
errors with validation in the front end application so the user being
exposed to these in error messages is a 'last resort':

CREATE TABLE Employees (
employee_nbr VARCHAR(11) NOT NULL PRIMARY KEY,
CONSTRAINT employee_nbr__basic_pattern
CHECK (employee_nbr NOT LIKE 'H%[!0-9]%'),
CONSTRAINT employee_nbr__no_leading_zeros
CHECK (employee_nbr NOT LIKE 'H0%')
);

At this point, before we start building the front end, we should test
each rule e.g.

values I expect to pass:
INSERT INTO Employees (employee_nbr) VALUES ('H1');
INSERT INTO Employees (employee_nbr) VALUES ('H900');

values I expect to fail:
INSERT INTO Employees (employee_nbr) VALUES ('F111');
INSERT INTO Employees (employee_nbr) VALUES ('H1A3');
INSERT INTO Employees (employee_nbr) VALUES ('H0');
INSERT INTO Employees (employee_nbr) VALUES ('H05');

[Just to prove this is a worthy exercise, my first insert inexplicable
failed. A quick look at my code revealed a typo [!9-0], which could
have been tricky to debug later.]

Validation using multiple values on the same row. For our payroll
table, let's model periods of earnings history in the recommended way
using closed-open representation with start_date and end date pairs, a
null end date signifying the current pay period.

Basic validation of the dates is that the start date will be midnight
and the end date will be the smallest granule (for Jet this is one
second) before midnight.

Note that although end_date can be NULL there is no need to explicitly
test for NULL in validation. This is due to the nature of nulls in SQL
DDL (data declaration language e.g. table design). Whereas in SQL DML
(data declaration language e.g. queries) an UNKNOWN resulting from a
comparison with a NULL value causes rows to be removed from a
resultset, in SQL DDL the UNKNOWN cannot be known to fail the rule
therefore it is allowed to pass. This makes sense when you think of our
NULL end data as being a placeholder for a date which will certainly be
known at some time in the future, so it is right to defer validation
until the value is known. I think this is mainly due to pragmatics
though i.e. without this implicit behaviour, validation of nullable
columns would *always* have to explicitly handle NULL.

An obvious domain rule, yet one that is often missed in database
validation rules, is that the end date cannot occur before the start
date in the same row. Similarly, salary cannot be negative.

Another obvious one is that each employee there should only be a
maximum of one row with a null date, which is best enforces with a
UNIQUE constraint, because the end dates should not be duplicated
either for an employee. Start dates are similarly unique for an
employee and, because they are not nullable, makes a good compound
natural key:

CREATE TABLE EarningsHistory (
employee_nbr VARCHAR(11) NOT NULL
REFERENCES Employees (employee_nbr)
ON DELETE NO ACTION
ON UPDATE CASCADE,
start_date DATETIME DEFAULT DATE() NOT NULL,
CONSTRAINT earnings_start_date__open_interval
CHECK(
HOUR(start_date) = 0
AND MINUTE(start_date) = 0
AND SECOND(start_date) = 0),
end_date DATETIME,
CONSTRAINT earnings_end_date__one_granule_closed_interval
CHECK(
HOUR(end_date) = 23
AND MINUTE(end_date) = 59
AND SECOND(end_date) = 59),
CONSTRAINT earnings_dates_order
CHECK (start_date < end_date),
salary_amount CURRENCY NOT NULL,
CONSTRAINT earnings_salary_amount__value
CHECK (salary_amount >= 0),
UNIQUE (employee_nbr, end_date),
PRIMARY KEY (employee_nbr, start_date)
);

Once again, test immediately.

Rows I expected to pass:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-01-01 00:00:00#, #2004-12-31 23:59:59#, 10000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-01-01 00:00:00#, NULL, 12000.00)
;

Rows I expected to fail:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 00:00:00#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 14:29:39#, #2005-12-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-12-01 00:00:00#, #2005-01-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 23:59:59#, -99.99)
;

[Once again my first test insert revealed an error: a copy and paste
resulted in me testing start_date in my rule for end_date!]

The data integrity of our payroll table is quite good. If we can't get
bad values in using direct INSERTs then no front end application can.

However, the constraints are not complete to my satisfaction. For
example, this spoils the data:

INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-11-01 00:00:00#, #2006-02-28 23:59:59#, 11000.00)
;

It passes all the existing constraints yet it creates bad (illogical)
data. Note end_date can be null, signifying the current date, so we'll
replace it with the current date in queries:

SELECT employee_nbr,
#2006-01-01# AS report_date, salary_amount
FROM EarningsHistory
WHERE #2006-01-01# BETWEEN start_date AND
IIF(end_date IS NULL, NOW(), end_date);

According to the data, the employee was earning two different amounts
simultaneously. This could result in some tricky situations so a
further constraint is required.

To write this kind of constraint I usually start with some data that
fails the rule, write a query that identifies all the bad rows, then
turn it into a constraint.

First some more bad data:

INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-03-01 00:00:00#, #2006-04-01 23:59:59#, 15000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-07-01 00:00:00#, #2004-07-31 23:59:59#, 16000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-05-01 00:00:00#, #2006-12-31 23:59:59#, 17000.00)
;

As everyone knows, if they think about it, that a period overlaps a
later period if its end date occurs before the later period's start
date (I think...I'm doing this all off the top of my head!) Therefore,
I think this should identify the bad data:

SELECT *
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)

I then start deleting the bad rows individually. The resultset should
only return zero rows when all the bad data has been removed, so I run
the query after each single row deletion:

DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-11-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-03-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-07-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2006-05-01 00:00:00#
;

OK, only after the final delete did the resultset return empty. I'll
assume the logic is sound and convert it to a constraint:

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__no_overlapping_periods
CHECK (0 = (
SELECT *
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)
)
);

Try the inserts again and they should all fail.

Some further business rules spring to mind. Contiguous periods were the
salary amounts are the same is useless information indicative of an
error:

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__contiguous_periods_salary_must_change
CHECK (0 = (
SELECT COUNT(*)
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr =
E2.employee_nbr
AND DATEADD('s', 1,
IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)) = E2.start_date
AND EarningsHistory.salary_amount = E2.salary_amount
)
);

Another rule could disallow gaps between periods (if they still
employed but aren't being paid then add a period where the salary is
zero):

ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__periods_must_be_contiguous
CHECK ( 0 = (
SELECT COUNT(*)
FROM EarningsHistory AS E1
WHERE EXISTS (
SELECT *
FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND E1.start_date < E2.start_date)
AND NOT EXISTS (
SELECT * FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND DATEADD('s', 1, E1.end_date) = E2.start_date
)
)
);

Then there are CHECK constraints that can reference rows in other
tables...

It should be obvious by now that writing validation rules in the

Rick Brandt · Jun 30, 2006

Klatuu said:
I guess my question wasn't clear.
The creation of the rules is not a problem. What I am not clear on
is how to recognize and deal with a validation violation at the
application level. For example, using Form level validation for an
End Date, I woul use the control's Before Update event, check to see
that the End Date is > Start Date, and if not, present a message box
to the user and cancel the Update.

What is the process using database validation rule to accomplish the
same thing and not have the user see one of Access' totally
meaningless error messages?

You put in the database rule and ALSO use your BeforeUpdate code to prevent
the ugly message or else use the Form's Error event to replace the built in
message with your own.

onedaywhen · Jun 30, 2006

Klatuu said:
What I am not clear on is how
to recognize and deal with a validation violation at the application level.

Don't ask me; that's the front end guy's problem said:
For example, using Form level validation for an End Date, I woul use the
control's Before Update event, check to see that the End Date is > Start
Date, and if not, present a message box to the user and cancel the Update.

For example, using Form level validation for an End Date, I woul use the
control's Before Update event, check to see that the End Date is > Start
Date, and if not, present a message box to the user and cancel the Update.

What is the process using database validation rule to accomplish the same
thing and not have the user see one of Access' totally meaningless error
messages?

I would guess you continuing doing everything you were already doing.
If one of the CHECK constraints bites it indicates there is a bug in
the front end.

I think you should be handling data engine error rather than directly
exposing them to the user e.g. (aircode):

' Step over Access' totally meaningless error messages
rowsAffected = 0
errored = False
On Error Resume Next
' Would really do this with
' ADO Command and Paramters
con.Execute _
"EXECUTE AddNewSalary 'H1'," & _
" #2006-06-30#, 15000.00", rowsAffected
errored = CBool(Err.Number <> 0)
On Error Goto 0
If errored Then
' Provide totally meaningful error message
End if
If rowsAffected = 0 Then
' Handle failure here
End If
etc

Thanks for getting me to finish typing up my notes. Actually, it's
still a WIP because I have to write the section on inter-table CHECK
constraints by re-writing my existing ones using joins on a Calendar
auxiliary table...

Jamie.

--

Guest · Jun 30, 2006

So what is the point, then, of doing the database level validation if I still
have to do the same checks at the form level to be able to handle the errors?

This is my primary aversion to db level validation. It is not as easy to
handle incorrect input this way.

Rick Brandt · Jun 30, 2006

Klatuu said:
So what is the point, then, of doing the database level validation if
I still have to do the same checks at the form level to be able to
handle the errors?

Because your front end is NOT the only way the data could be manipulated.
You should think of the database and the application as two separate
projects. Job 1 is to have a proper database with good data integrity
rules. Then you can think about how your apop interacts with that database.

This comes naturally to people like myself who almost exclusively work with
server databases, many of which existed long before I came along to add an
Access interface to it. With an all Jet application it is easy to consider
the tables as a simple subset of objects in the total application, but the
database should properly be able to stand completely on its own.

This is my primary aversion to db level validation. It is not as
easy to handle incorrect input this way.

Proper database application design is not always about what is "easy".

Guest · Jun 30, 2006

Thanks for the input, Rick. I see your point. In a larger database
environment where multiple applicatioins could be using the database, I'm
with you. I'm sure you have more knowledge and experience with server
databases than I. I have mostly been on the application side. I have
studied and had training in relational database theory, so it is not an
unfamiliar environment.

My intent on this thread was to get opinions and knowledge. That I have done.

There is one statement you made I object to:

Proper database application design is not always about what is "easy".

I find that rude and condescending. Nothing in life is always about what is
easy, but it make so sense perform useless tasks just so you can think of
yourself as a hero.

Rick Brandt · Jun 30, 2006

Klatuu said:
There is one statement you made I object to:
I find that rude and condescending. Nothing in life is always about
what is easy, but it make so sense perform useless tasks just so you
can think of yourself as a hero.

It was not my intent to be rude. If you took it that way I apologize.
"Easy" was your word after all, not mine.

I suppose I disagree with your term "useless tasks". It is not useless to
have good data integrity. It could be argued though whether avoiding engine
error messages is worth the extra trouble. For the most part I don't bother
with "user friendly" error messages. Most users only have to see "Duplicate
Key Error" a handful of times before they know what it means. If not, then
they can ask someone.

If the designer of the front end makes the decision that engine errors need
to be replaced with something more descriptive to the user then that is the
effort I would consider "extra". Not necessarily "useless", but definitely
"extra". That decision in my opinion is completely separate from the those
taken to keep the data clean with the latter having much more priority.

Guest · Jun 30, 2006

I respect your opinion and agree that both need serious and complete
consideration. I would not mind the "extra" because coming from 30 years of
writing user interface applications, I feel it is very important that users
receive information that is meaningful to them.
My problem here is not being familiar with how to trap and handle database
level errors. Any enlightenment would be appreciated. It doesn't have to be
detailed, conceptual is fine.

Validation of text

Guest

Guest

onedaywhen

Guest

onedaywhen

Guest

onedaywhen

onedaywhen

Guest

onedaywhen

Guest

onedaywhen

Guest

Rick Brandt

onedaywhen

Guest

Rick Brandt

Guest

Rick Brandt

Guest