Guest
Is it possible to validate entered text so that it must start with a
predefined letter, say 'H', followed by a numeric value of varying length?
Klatuu said: Sure, in the Before Update event of the text box
onedaywhen said:
The OP is also advised to have a validation rule or CHECK constraint in
the database.
The basic pattern is
NOT LIKE 'H*[!0-9]*'
To do this properly, both ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching between DAO and ADO
could (inadvertently) circumvent data integrity checks e.g.
CREATE TABLE Test3 (
data_col VARCHAR(10) NOT NULL,
CONSTRAINT data_col__pattern
CHECK (
data_col NOT LIKE 'H%[!0-9]%'
AND data_col NOT LIKE 'H*[!0-9]*'
)
);
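A rough way to see why both wildcard styles matter is to emulate the two LIKE dialects. The Python sketch below is only an illustration, not Jet's implementation: `like` and `check_passes` are hypothetical helpers, only the multi-character wildcard and `[...]` / `[!...]` classes are handled, and it assumes that whichever wildcard symbol is not active is matched as a literal character.

```python
import re

def like(value, pattern, many):
    """Simplified LIKE: `many` is the active multi-character wildcard."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == many:
            out.append(".*")
        elif ch == "[":
            end = pattern.index("]", i)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]  # [!0-9] becomes [^0-9]
            out.append("[" + body + "]")
            i = end
        else:
            out.append(re.escape(ch))  # inactive wildcard is a literal
        i += 1
    regex = re.compile("".join(out), re.IGNORECASE | re.DOTALL)
    return regex.fullmatch(value) is not None

def check_passes(value, patterns, many):
    """The CHECK passes when the value matches none of the NOT LIKE patterns."""
    return not any(like(value, p, many) for p in patterns)

STAR_ONLY = ["H*[!0-9]*"]
BOTH = ["H%[!0-9]%", "H*[!0-9]*"]

print(check_passes("H1A3", STAR_ONLY, "*"))  # False: rejected
print(check_passes("H1A3", STAR_ONLY, "%"))  # True: bad value slips through
print(check_passes("H1A3", BOTH, "%"))       # False: rejected
print(check_passes("H1A3", BOTH, "*"))       # False: rejected
```

With only the `*` pattern, the same bad value is caught under one wildcard mode and accepted under the other, which is the circumvention described above.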
Jamie.
Klatuu said: I would advise the OP not to use database level validation.
Because...?
Your post, itself, is a good argument against it.
onedaywhen said: Because...?
I don't understand. I'm in favour of database validation, not arguing
against it. What do you mean?
Klatuu said: I would advise the OP not to use database level
validation [because] to do this properly, both
ANSI (%) and non-ANSI (*) wildcard characters
should be accommodated, otherwise simply switching
between DAO and ADO could (inadvertently)
circumvent data integrity checks
Because table and field level validation are product specific. Should you
upsize to SQL Server, Oracle, etc., it
requires a lot of rewriting.
There are arguments on both sides of this issue.
If any MVPs have weighed in favor of table/field validation, I have not
seen it. If any choose to and have a convincing argument in its favor, I am
open to new ideas.
onedaywhen said:
I'm not sure what you are saying here.
If you are saying that designing effective database constraints is hard
work then I entirely agree. If you conclude that database constraints
are not worth the effort then I couldn't disagree more.
Consider that if you have no database level constraints then your data
integrity can be destroyed by DAO *and* ADO...actually any
technology that can connect to your database e.g. do your users have
Excel? But most significantly, your data integrity is at the mercy of
bugs in your front end application.
You have portability issues in mind and that is a good thing.
I have been suggesting that you write your database constraints in
standard SQL *because* it can be easily ported to other platforms. Can
your VBA code be easily ported to other front end application
programming languages? Does it even port to VB.NET? Other forms-based
apps in the MS Office suite?
Of course. But have you considered that you can (and if you are doing
your best, should) write form-level validation *and* database level
validation? Yes, more hard work but that's what being a professional is
all about, don't you think?
I'm glad to hear you are open-minded but the idea of putting
constraints in the database is not new.
OK, if MVPs is what it takes, I'll search on my favourite MVPs (in no
particular order):
John Vinson
http://groups.google.com/group/microsoft.public.access.forms/msg/a721397365f07afe
"Good point Jamie - though the OP crossposted to forms and formscoding
as well. It would probably be wise to do both - on the Form so you can
control the error message and make it friendlier (or more hostile if
you prefer <g>), and in the table to prevent "backdoor" entry of
invalid data."
Allen Browne
http://groups.google.com/group/microsoft.public.access.formscoding/msg/312525459bb1531e
"Start with a bound form (bound to a table), with bound controls (bound
to fields.) You may find that you don't actually need any code.
Particularly if you used the Validation Rule property of the fields in
your table, you can probably avoid any code at all."
Albert D.Kallal
http://groups.google.com/group/microsoft.public.access.forms/msg/15d1f38048d27ac8
By the way, we could have skipped the "code" example, and not used the
before update. One could simply enter the following expression as a
validation rule in the properties sheet.
Ken Snell
http://groups.google.com/group/microsoft.public.access.tablesdbdesign/msg/52f746a8b0ee99fd
"where you use the Validation Rule sometimes is a matter of
preference; other times, it's a matter of where it's easier to
write/test/handle the validation. If you put the Validation Rule in the
table, ACCESS will generate an error message that may or may not be
meaningful to your user"
Michel 'Vanderghast' Walsh
http://groups.google.com/group/microsoft.public.access.queries/msg/c66ec7e912d24be8
"I wrote many CHECK constraint between tables, before, in Jet"
Tibor Karaszi
http://groups.google.com/group/microsoft.public.sqlserver.programming/msg/c33934af18bc8c0a
['Why do we do double work?'] "Nothing new here, just wanted to push
the point that constraints can be a great timesaver by
finding incorrect code immediately!"
I'm not saying the above people agree with any particular point of
view, rather I would suggest they at least seem *not* to oppose the
practice of putting constraints in the database layer.
BTW why are you fixating on MVPs? I'm sure every one of them would
agree there are countless more enthusiastic volunteers (Tim Fergusson
and David Epsom come quickly to mind), eminent authors, etc. whose
opinions are just as valuable.
I would urge you to spend a few minutes reading this series of three
short articles on the location of database constraints:
http://www.dbazine.com/ofinterest/oi-articles/celko27/
Klatuu said: After 30 years in this business, there is very little, at the conceptual
level, that is new. My first encounter with the concept of enforced data
integrity was in a COBOL/ISAM shop where all IO was done using canned
procedures that enforced the data rules.
I certainly have enjoyed our discussion. You make some good points worth
serious consideration.
Klatuu said: Interesting idea, but I still am not sure how to deal with db level
validation. Any pointers there would be appreciated.
onedaywhen said:
Well, if we're talking about database validation then let's just
consider Jet's CHECK constraints and only worry for now about ANSI
wildcard characters: % for multiple characters and _ (underscore) for a
single character.
The way I see it there are two flavours of Jet CHECK constraint: row
level and table level (but in reality they are the same animal).
Access's Validation Rules are considered to be either at the 'column'
level or at the 'table' level, though I'd argue the latter is actually
at the *row* level because one can reference multiple columns in the
same row but not other rows in the table. I think the differentiation
is based on the Validation Text property more than anything.
Basic validation of values to enforce business rules, like the one that
started this thread: "must start with the letter 'H' followed by a
numeric value of varying length". I also assume that H01 is illegal
because it is considered equivalent to H1 and that H0 is illegal
because it is effectively 'zero'.
Jet allows us to define multiple CHECK constraints per row/table,
therefore we can follow Celko's advice in the article (up thread) and
split these into multiple rules. It would be good to set a Validation
Text property for each but because it is tied too closely to Access
we're only allowed one per *table*. The best we can do is use a
meaningful name. Remember it is our intention that we will trap input
errors with validation in the front end application so the user being
exposed to these in error messages is a 'last resort':
CREATE TABLE Employees (
employee_nbr VARCHAR(11) NOT NULL PRIMARY KEY,
CONSTRAINT employee_nbr__starts_with_h_digit
CHECK (employee_nbr LIKE 'H[0-9]%'),
CONSTRAINT employee_nbr__basic_pattern
CHECK (employee_nbr NOT LIKE 'H%[!0-9]%'),
CONSTRAINT employee_nbr__no_leading_zeros
CHECK (employee_nbr NOT LIKE 'H0%')
);
At this point, before we start building the front end, we should test
each rule e.g.
values I expect to pass:
INSERT INTO Employees (employee_nbr) VALUES ('H1');
INSERT INTO Employees (employee_nbr) VALUES ('H900');
values I expect to fail:
INSERT INTO Employees (employee_nbr) VALUES ('F111');
INSERT INTO Employees (employee_nbr) VALUES ('H1A3');
INSERT INTO Employees (employee_nbr) VALUES ('H0');
INSERT INTO Employees (employee_nbr) VALUES ('H05');
[Just to prove this is a worthy exercise, my first insert inexplicably
failed. A quick look at my code revealed a typo [!9-0], which could
have been tricky to debug later.]
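For comparison, the same business rule can be mirrored in front end code and run against the same test values. This is a minimal sketch in Python rather than VBA; `is_valid_employee_nbr` is a hypothetical name, and the sketch assumes at least one digit is required and that only an uppercase 'H' is acceptable (Jet's own comparison is normally case-insensitive).

```python
import re

# 'H', then a first digit 1-9 (no leading zero, so no H0 or H05),
# then any number of further digits.
EMPLOYEE_NBR = re.compile(r"H[1-9][0-9]*")

def is_valid_employee_nbr(text):
    """True when the whole string satisfies the employee number rule."""
    return EMPLOYEE_NBR.fullmatch(text) is not None

# The same values used for the INSERT tests.
for value in ["H1", "H900", "F111", "H1A3", "H0", "H05"]:
    print(value, is_valid_employee_nbr(value))
```

Running identical test values through both layers is a cheap way to spot a rule that the front end and the database interpret differently.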
Validation using multiple values on the same row. For our payroll
table, let's model periods of earnings history in the recommended way
using closed-open representation with start_date and end date pairs, a
null end date signifying the current pay period.
Basic validation of the dates is that the start date will be midnight
and the end date will be the smallest granule (for Jet this is one
second) before midnight.
Note that although end_date can be NULL there is no need to explicitly
test for NULL in validation. This is due to the nature of nulls in SQL
DDL (data definition language e.g. table design). Whereas in SQL DML
(data manipulation language e.g. queries) an UNKNOWN resulting from a
comparison with a NULL value causes rows to be removed from a
resultset, in SQL DDL the UNKNOWN cannot be known to fail the rule
therefore it is allowed to pass. This makes sense when you think of our
NULL end date as being a placeholder for a date which will certainly be
known at some time in the future, so it is right to defer validation
until the value is known. I think this is mainly due to pragmatics
though i.e. without this implicit behaviour, validation of nullable
columns would *always* have to explicitly handle NULL.
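The difference between the two treatments of UNKNOWN can be sketched with a toy model of three-valued logic. This is an illustration in Python, not Jet code: `None` stands in for NULL/UNKNOWN and the function names are made up for the example.

```python
def less_than(a, b):
    """SQL-style comparison: UNKNOWN (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a < b

def check_allows(result):
    """A CHECK constraint rejects a row only when the result is FALSE."""
    return result is not False

def where_keeps(result):
    """A WHERE clause keeps a row only when the result is TRUE."""
    return result is True

unknown = less_than(5, None)   # e.g. start_date < end_date with a NULL end_date
print(check_allows(unknown))   # True: the row is allowed into the table
print(where_keeps(unknown))    # False: the row is dropped from a resultset
```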
An obvious domain rule, yet one that is often missed in database
validation rules, is that the end date cannot occur before the start
date in the same row. Similarly, salary cannot be negative.
Another obvious one is that for each employee there should be a
maximum of one row with a null end date, which is best enforced with a
UNIQUE constraint, because the end dates should not be duplicated
either for an employee. Start dates are similarly unique for an
employee and, because they are not nullable, make a good compound
natural key:
CREATE TABLE EarningsHistory (
employee_nbr VARCHAR(11) NOT NULL
REFERENCES Employees (employee_nbr)
ON DELETE NO ACTION
ON UPDATE CASCADE,
start_date DATETIME DEFAULT DATE() NOT NULL,
CONSTRAINT earnings_start_date__open_interval
CHECK(
HOUR(start_date) = 0
AND MINUTE(start_date) = 0
AND SECOND(start_date) = 0),
end_date DATETIME,
CONSTRAINT earnings_end_date__one_granule_closed_interval
CHECK(
HOUR(end_date) = 23
AND MINUTE(end_date) = 59
AND SECOND(end_date) = 59),
CONSTRAINT earnings_dates_order
CHECK (start_date < end_date),
salary_amount CURRENCY NOT NULL,
CONSTRAINT earnings_salary_amount__value
CHECK (salary_amount >= 0),
UNIQUE (employee_nbr, end_date),
PRIMARY KEY (employee_nbr, start_date)
);
Once again, test immediately.
Rows I expected to pass:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-01-01 00:00:00#, #2004-12-31 23:59:59#, 10000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-01-01 00:00:00#, NULL, 12000.00)
;
Rows I expected to fail:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 00:00:00#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 14:29:39#, #2005-12-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-12-01 00:00:00#, #2005-01-31 23:59:59#, 11000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2005-01-01 00:00:00#, #2005-12-31 23:59:59#, -99.99)
;
[Once again my first test insert revealed an error: a copy and paste
resulted in me testing start_date in my rule for end_date!]
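The row-level rules can also be restated outside the database, which is handy for checking that each test row fails for the reason intended. The following is a sketch in Python under the assumption that the rules are exactly the four row-level CHECKs named in the table definition; `earnings_row_errors` is a hypothetical helper that returns the names of the rules a row breaks.

```python
from datetime import datetime, time
from decimal import Decimal

def earnings_row_errors(start_date, end_date, salary_amount):
    """Return the names of the row-level rules that a candidate row breaks."""
    errors = []
    if start_date.time() != time(0, 0, 0):
        errors.append("earnings_start_date__open_interval")
    # A missing end_date (None) is not tested, mirroring the NULL behaviour.
    if end_date is not None:
        if end_date.time() != time(23, 59, 59):
            errors.append("earnings_end_date__one_granule_closed_interval")
        if not start_date < end_date:
            errors.append("earnings_dates_order")
    if salary_amount < 0:
        errors.append("earnings_salary_amount__value")
    return errors

# One of the rows expected to fail: start date is not at midnight.
print(earnings_row_errors(datetime(2005, 1, 1, 14, 29, 39),
                          datetime(2005, 12, 31, 23, 59, 59),
                          Decimal("11000.00")))
```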
The data integrity of our payroll table is quite good. If we can't get
bad values in using direct INSERTs then no front end application can.
However, the constraints are not complete to my satisfaction. For
example, this spoils the data:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-11-01 00:00:00#, #2006-02-28 23:59:59#, 11000.00)
;
It passes all the existing constraints yet it creates bad (illogical)
data. Note end_date can be null, signifying the current date, so we'll
replace it with the current date in queries:
SELECT employee_nbr,
#2006-01-01# AS report_date, salary_amount
FROM EarningsHistory
WHERE #2006-01-01# BETWEEN start_date AND
IIF(end_date IS NULL, NOW(), end_date);
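To make the problem concrete, the same lookup can be replayed over the three rows inserted so far. This is a Python sketch, not Jet SQL; `salaries_on` is a hypothetical helper, and the `now` argument is fixed to an arbitrary date so the result does not depend on when it is run.

```python
from datetime import datetime

# (start_date, end_date, salary_amount) for employee H1
rows = [
    (datetime(2004, 1, 1), datetime(2004, 12, 31, 23, 59, 59), 10000),
    (datetime(2006, 1, 1), None, 12000),
    (datetime(2004, 11, 1), datetime(2006, 2, 28, 23, 59, 59), 11000),
]

def salaries_on(report_date, rows, now):
    """Salaries in force on report_date; a missing end date means 'until now'."""
    return [salary for start, end, salary in rows
            if start <= report_date <= (end if end is not None else now)]

print(salaries_on(datetime(2006, 1, 1), rows, now=datetime(2006, 6, 1)))
# [12000, 11000]
```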
According to the data, the employee was earning two different amounts
simultaneously. This could result in some tricky situations so a
further constraint is required.
To write this kind of constraint I usually start with some data that
fails the rule, write a query that identifies all the bad rows, then
turn it into a constraint.
First some more bad data:
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-03-01 00:00:00#, #2006-04-01 23:59:59#, 15000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2004-07-01 00:00:00#, #2004-07-31 23:59:59#, 16000.00)
;
INSERT INTO EarningsHistory
(employee_nbr, start_date, end_date, salary_amount)
VALUES ('H1', #2006-05-01 00:00:00#, #2006-12-31 23:59:59#, 17000.00)
;
As everyone knows, if they think about it, a period overlaps a
later period if its end date occurs after the later period's start
date (I think...I'm doing this all off the top of my head!) Therefore,
I think this should identify the bad data:
SELECT *
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)
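The overlap test in that query can be restated in a few lines to check the logic independently of Jet. This is a Python sketch with a hypothetical `overlapping_starts` helper; it keeps both branches of the OR as written in the query and takes `now` as a parameter in place of NOW().

```python
from datetime import datetime
from itertools import permutations

def overlapping_starts(periods, now):
    """Return (earlier_start, later_start) for each overlapping pair.

    periods holds (start, end) tuples; an end of None means 'until now'.
    """
    bad = []
    for (s1, e1), (s2, e2) in permutations(periods, 2):
        e1 = now if e1 is None else e1
        e2 = now if e2 is None else e2
        # The later period starts (or ends) before the earlier one has ended.
        if s1 < s2 and (s2 < e1 or e2 < e1):
            bad.append((s1, s2))
    return bad

good = [
    (datetime(2004, 1, 1), datetime(2004, 12, 31, 23, 59, 59)),
    (datetime(2006, 1, 1), None),
]
spoiler = (datetime(2004, 11, 1), datetime(2006, 2, 28, 23, 59, 59))

print(len(overlapping_starts(good, datetime(2006, 6, 1))))              # 0
print(len(overlapping_starts(good + [spoiler], datetime(2006, 6, 1))))  # 2
```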
I then start deleting the bad rows individually. The resultset should
only return zero rows when all the bad data has been removed, so I run
the query after each single row deletion:
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-11-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-03-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2004-07-01 00:00:00#
;
DELETE FROM EarningsHistory WHERE employee_nbr = 'H1'
AND start_date = #2006-05-01 00:00:00#
;
OK, only after the final delete did the resultset return empty. I'll
assume the logic is sound and convert it to a constraint:
ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__no_overlapping_periods
CHECK (0 = (
SELECT COUNT(*)
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr = E2.employee_nbr
AND EarningsHistory.start_date < E2.start_date
AND
(
E2.start_date
< IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)
OR IIF(E2.end_date IS NULL,
NOW(),
E2.end_date) < IIF(EarningsHistory.end_date IS NULL,
NOW(), EarningsHistory.end_date)
)
)
);
Try the inserts again and they should all fail.
Some further business rules spring to mind. Contiguous periods where the
salary amounts are the same are useless information indicative of an
error:
ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__contiguous_periods_salary_must_change
CHECK (0 = (
SELECT COUNT(*)
FROM EarningsHistory, EarningsHistory AS E2
WHERE EarningsHistory.employee_nbr =
E2.employee_nbr
AND DATEADD('s', 1,
IIF(EarningsHistory.end_date IS NULL, NOW(),
EarningsHistory.end_date)) = E2.start_date
AND EarningsHistory.salary_amount = E2.salary_amount
)
);
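A quick restatement of that rule, useful for trying out sample rows before adding the constraint. This is a Python sketch with a hypothetical `redundant_splits` helper; unlike the constraint it simply skips rows whose end date is missing rather than substituting the current time.

```python
from datetime import datetime, timedelta

def redundant_splits(rows):
    """Find pairs where one period starts a second after another ends
    and the salary has not changed. rows holds (start, end, salary)."""
    found = []
    for s1, e1, pay1 in rows:
        for s2, _e2, pay2 in rows:
            if (e1 is not None
                    and e1 + timedelta(seconds=1) == s2
                    and pay1 == pay2):
                found.append((s1, s2))
    return found

rows = [
    (datetime(2004, 1, 1), datetime(2004, 12, 31, 23, 59, 59), 10000),
    (datetime(2005, 1, 1), None, 10000),  # same salary: a pointless split
]
print(redundant_splits(rows))
```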
Another rule could disallow gaps between periods (if they are still
employed but aren't being paid then add a period where the salary is
zero):
ALTER TABLE EarningsHistory ADD
CONSTRAINT earnings_history__periods_must_be_contiguous
CHECK ( 0 = (
SELECT COUNT(*)
FROM EarningsHistory AS E1
WHERE EXISTS (
SELECT *
FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND E1.start_date < E2.start_date)
AND NOT EXISTS (
SELECT * FROM EarningsHistory AS E2
WHERE E1.employee_nbr = E2.employee_nbr
AND DATEADD('s', 1, E1.end_date) = E2.start_date
)
)
);
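The gap rule reads more easily when spelled out procedurally. In this Python sketch (`periods_with_gap_after` is a hypothetical name), a period is flagged when some later period exists but none begins exactly one second after it ends; a missing end date with a later period present is also flagged, which is how I read the NOT EXISTS branch when end_date is NULL.

```python
from datetime import datetime, timedelta

def periods_with_gap_after(periods):
    """Return start dates of periods that are followed by a gap.

    periods holds (start, end) tuples for one employee.
    """
    starts = {start for start, _end in periods}
    gaps = []
    for s1, e1 in periods:
        has_later = any(s1 < s2 for s2 in starts)
        if has_later and (e1 is None
                          or e1 + timedelta(seconds=1) not in starts):
            gaps.append(s1)
    return gaps

periods = [
    (datetime(2004, 1, 1), datetime(2004, 12, 31, 23, 59, 59)),
    (datetime(2006, 1, 1), None),  # nothing recorded for 2005
]
print(periods_with_gap_after(periods))
```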
Then there are CHECK constraints that can reference rows in other
tables...
It should be obvious by now that writing validation rules in the
Klatuu said: I guess my question wasn't clear.
The creation of the rules is not a problem. What I am not clear on
is how to recognize and deal with a validation violation at the
application level. For example, using Form level validation for an
End Date, I would use the control's Before Update event, check to see
that the End Date is > Start Date, and if not, present a message box
to the user and cancel the Update.
What is the process using database validation rule to accomplish the
same thing and not have the user see one of Access' totally
meaningless error messages?
Klatuu said: What I am not clear on is how
to recognize and deal with a validation violation at the application level.
Don't ask me; that's the front end guy's problem.
Klatuu said: For example, using Form level validation for an End Date, I would use the
control's Before Update event, check to see that the End Date is > Start
Date, and if not, present a message box to the user and cancel the Update.
What is the process using database validation rule to accomplish the same
thing and not have the user see one of Access' totally meaningless error
messages?
Klatuu said: So what is the point, then, of doing the database level validation if
I still have to do the same checks at the form level to be able to
handle the errors?
This is my primary aversion to db level validation. It is not as
easy to handle incorrect input this way.
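One possible way to handle a database-level failure without repeating every check in the form is to lean on the meaningful constraint names mentioned earlier in the thread. The sketch below is in Python rather than VBA and rests on an assumption that is not confirmed in this thread: that the error text raised by the database includes the name of the violated constraint. The message wording and the sample error string are invented for illustration.

```python
# Hypothetical lookup from constraint name to a message fit for users.
FRIENDLY = {
    "earnings_dates_order":
        "The end date must be later than the start date.",
    "earnings_salary_amount__value":
        "Salary cannot be negative.",
    "earnings_history__no_overlapping_periods":
        "This period overlaps an existing period for the employee.",
}

def friendly_message(error_text, default="The record could not be saved."):
    """Translate a raw database error into a friendlier message, if known."""
    for name, message in FRIENDLY.items():
        if name in error_text:
            return message
    return default

# Made-up error text, standing in for whatever the data layer reports.
raw = "Values prohibited by validation rule 'earnings_dates_order'."
print(friendly_message(raw))
```

The front end then needs only one error handler around the save, with the database remaining the single place where the rules are enforced.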
Proper database application design is not always about what is "easy".
Klatuu said: There is one statement you made I object to:
I find that rude and condescending. Nothing in life is always about
what is easy, but it makes no sense to perform useless tasks just so you
can think of yourself as a hero.