SQL code for 'first instance of X'

  • Thread starter Thread starter Jaek
  • Start date Start date
J

Jaek

Hello and thanks in advance -

I have a table with a list of books (identified by ISBN) and their prices
week-to-week, call it 'PRICES'.

PRICES has 4 fields: ISBN, Price, Year, WeekNumber.

Prices dont' change that often, so for a given book, the same price will
appear week after week for months at a time, but every now and then the
publisher ups the price by a dollar or two. My mission: I need to identify
the week when the price changes for a given book (put another way I need the
earliest week each unique price appears for a given book).

It's pretty easy to get a list of ISBNs and their unique prices in a given
period by only including the ISBN and Price and using a DISTINCT query (or
grouping in an aggregate query). I've then tried tying this back with a
correlated subquery to get the least value for the year and date columns
where a given ISBN and Price appear. But my query writing must be especially
cruddy because executing the SQL has consistently crashed the app!

I know one of y'all has the simple answer to this... Thanks.
 
Hello and thanks in advance -

I have a table with a list of books (identified by ISBN) and their prices
week-to-week, call it 'PRICES'.

PRICES has 4 fields: ISBN, Price, Year, WeekNumber.

Prices dont' change that often, so for a given book, the same price will
appear week after week for months at a time, but every now and then the
publisher ups the price by a dollar or two. My mission: I need to identify
the week when the price changes for a given book (put another way I need the
earliest week each unique price appears for a given book).

It's pretty easy to get a list of ISBNs and their unique prices in a given
period by only including the ISBN and Price and using a DISTINCT query (or
grouping in an aggregate query). I've then tried tying this back with a
correlated subquery to get the least value for the year and date columns
where a given ISBN and Price appear. But my query writing must be especially
cruddy because executing the SQL has consistently crashed the app!

I know one of y'all has the simple answer to this... Thanks.

Whoa. Your design is screwing you up. You say that "Prices don't
change that often" but you're still "storing the prices week to
week." WHY?

You don't need to store a value for each week. Only for when the
price changes. Then you can just get the last record where the date
is greater than or equal to the price change date. Just use MAX
(SaleDate) WHERE SaleDate<=[SomeDate].

ISBN, Price, Year, WeekNumber

SELECT TOP 1 BookPriceHistory.ISBN, BookPriceHistory.ChangeDate,
BookPriceHistory.Price
FROM BookPriceHistory
WHERE (((BookPriceHistory.ISBN)=[ISBN:]) AND
((BookPriceHistory.ChangeDate)<=[Enter a date:]))
ORDER BY BookPriceHistory.ChangeDate;
 
I am hving a hard time following what you want.

If I understand right, you have a table I'll call 'BOOKS', with fields
ISBN (PK),
BookTitle (Text)
Author (Text)

And you have a table named 'PRICES', with fields
ID (PK)
ISBN (FK into table 'BOOKS')
Price (Cur)
Year (Int) Note:this is a bad name for a field. It is a reserved word
WeekNumber (Int)


So for records in 'PRICES' like:

1-1-1, $3.99,2009, 2 **
1-1-1, $3.99,2009, 3
1-1-1, $5.49,2009, 4 **
1-1-1, $6.99,2009, 5 **
1-1-1, $6.99,2009, 6


1-2-1, $5.99,2009, 2 **
1-2-1, $6.59,2009, 4 **
1-2-1, $6.99,2009, 4 **
1-2-1, $6.99,2009, 5
1-2-1, $6.99,2009, 6
1-2-1, $6.99,2009, 7
1-2-1, $6.99,2009, 8

You want the starred records returned???


HTH
 
Steve - You couldn't be more right. Those starred records are exactly what
I'm looking for.

-Jake
 
I've been trying to get the SQL correct.... I think it will tale two queries
or a query/subquery.

Maybe one of the SQL gurus will have the answer....
 
Steve I think you're correct, it needs a subquery (at least one), and it is
trickier than it looks at first.

In general I think it would be an extremely useful operation to have a
template for...if there is indeed an SQL guru who can step out of the
crowd...
 
Hi Jaek

I am certainly not a SQL guru, and I've been watching this thread with
interest hoping that one would drop by, but FWIW, here's what I've come up
with:

SELECT ISBN, Price,
Min([Yr]*100+[Wk]) AS YrWk
FROM BookPrices
GROUP BY ISBN, Price;

However, it has two drawbacks:

1. (minor) the year and week have been concatenated into a numeric
expression for ordering purposes - not really a problem because it's easy to
separate them again.

2. (potentially major) if a price changes (up ow down) and then returns to
the original price then the query will return only the FIRST time it was
that price.

This will work if you can guarantee that the prices only move in one
direction, or at least will never be repeated.

I might be (probably am!) wrong, but I can't see any way of doing this with
pure Jet SQL. The reason is that to find the change points you need to read
the recordset sequentially.
 
After much thought I might propose the following UNTESTED solution. First
step create a ranking query this should return sequential numbers from 0 to n
where zero should be your first record for the ISBN and n will be the last
record for an ISBN

SELECT A.ISBN, A.[YEAR], A.WeekNumber, A.Price
, Count(ISBN) as RankNumber
FROM BookPrices as A LEFT JOIN BookPrices as B
ON A.ISBN = B.ISBN
AND A.[Year] + A.[WeekNumber] > B.[Year] + B.[WeekNumber]
GROUP BY A.ISBN, A.[YEAR], A.WeekNumber, A.Price

Now using that saved query you should be able to get the records you want.
You may have to adjust the line
AND A.RankNumber = B.RankNumber+1
to
AND A.RankNumber = B.RankNumber-1
if I got confused on which record you want to return.

SELECT A.ISBN, A.[YEAR], A.WeekNumber, A.Price
FROM qRanked as A LEFT JOIN qRanked as B
ON A.ISBN = B.ISBN
AND A.RankNumber = B.RankNumber+1
WHERE A.RankNumber = 0
OR A.Price <> B.Price

Please let me know if this works for you. It may be very slow with large
amounts of data. Make sure you have indexes on ISBN, Year, and WeekNumber to
speed up the process.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Hi John

You're brilliant!

With a couple of tweaks, that gives the required result. Count(ISBN) needs
to be changed to Count(B.ISBN) to avoid ambiguity, and
A.[Year] + A.[WeekNumber] > B.[Year] + B.[WeekNumber]
needs to be changed to
A.[Year]*100 + A.[WeekNumber] > B.[Year]*100 + B.[WeekNumber]
to ensure that 2009 week 1 is after 2008 week 52.

I don't know how it would perform with lots of data, but it works fine with
20 records <smile>

--
Cheers,
Graham M

John Spencer MVP said:
After much thought I might propose the following UNTESTED solution. First
step create a ranking query this should return sequential numbers from 0
to n where zero should be your first record for the ISBN and n will be the
last record for an ISBN

SELECT A.ISBN, A.[YEAR], A.WeekNumber, A.Price
, Count(ISBN) as RankNumber
FROM BookPrices as A LEFT JOIN BookPrices as B
ON A.ISBN = B.ISBN
AND A.[Year] + A.[WeekNumber] > B.[Year] + B.[WeekNumber]
GROUP BY A.ISBN, A.[YEAR], A.WeekNumber, A.Price

Now using that saved query you should be able to get the records you want.
You may have to adjust the line
AND A.RankNumber = B.RankNumber+1
to
AND A.RankNumber = B.RankNumber-1
if I got confused on which record you want to return.

SELECT A.ISBN, A.[YEAR], A.WeekNumber, A.Price
FROM qRanked as A LEFT JOIN qRanked as B
ON A.ISBN = B.ISBN
AND A.RankNumber = B.RankNumber+1
WHERE A.RankNumber = 0
OR A.Price <> B.Price

Please let me know if this works for you. It may be very slow with large
amounts of data. Make sure you have indexes on ISBN, Year, and WeekNumber
to speed up the process.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County

Graham said:
Hi Jaek

I am certainly not a SQL guru, and I've been watching this thread with
interest hoping that one would drop by, but FWIW, here's what I've come
up with:

SELECT ISBN, Price,
Min([Yr]*100+[Wk]) AS YrWk
FROM BookPrices
GROUP BY ISBN, Price;

However, it has two drawbacks:

1. (minor) the year and week have been concatenated into a numeric
expression for ordering purposes - not really a problem because it's easy
to separate them again.

2. (potentially major) if a price changes (up ow down) and then returns
to the original price then the query will return only the FIRST time it
was that price.

This will work if you can guarantee that the prices only move in one
direction, or at least will never be repeated.

I might be (probably am!) wrong, but I can't see any way of doing this
with pure Jet SQL. The reason is that to find the change points you need
to read the recordset sequentially.
 
Glad to hear it worked. I thought it should,but didn't have time to
build a test table and see if it would.

Thanks for the feedback.

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================


Graham said:
Hi John

You're brilliant!

With a couple of tweaks, that gives the required result. Count(ISBN) needs
to be changed to Count(B.ISBN) to avoid ambiguity, and
A.[Year] + A.[WeekNumber] > B.[Year] + B.[WeekNumber]
needs to be changed to
A.[Year]*100 + A.[WeekNumber] > B.[Year]*100 + B.[WeekNumber]
to ensure that 2009 week 1 is after 2008 week 52.

I don't know how it would perform with lots of data, but it works fine with
20 records <smile>
 
I **THINK** that my proposed solution (as modified by Graham Mendano) could be
modified to show both changes but it would require the ranking to be done on
unique values of ISBN, Year, and WeekNumber.

So you would need a starting query to get just the unique values of ISBN,
Year, and WeekNumber
, then a query get the ranking based on that
, then a query to add back the prices
, and then the final query

// Query named Q1//
SELECT DISTINCT ISBN, [Year] as TheYear, Weeknumber
FROM BookPrices

//Query named Q2 (yeah that's really imaginative) //
SELECT A.ISBN, A.TheYear, A.WeekNumber,
COUNT(B.ISBN) as RankNumber
FROM Q1 as A LEFT JOIN Q2 as B
ON A.ISBN = B.ISBN
AND A.TheYear*100 + A.WeekNumber > B.TheYear*100 + B.WeekNumber
GROUP BY A.ISBN, A.TheYear, A.WeekNumber

// New query named QRanked //
SELECT A.*, B.RankNumber
FROM Prices as A INNER JOIN Q2
ON A.ISBN = Q2.ISBN
AND A.[Year] = B.TheYear
AND A.WeekNumber = B.WeekNumber

// Final query //
SELECT A.ISBN, A.[YEAR], A.WeekNumber, A.Price
FROM qRanked as A LEFT JOIN qRanked as B
ON A.ISBN = B.ISBN
AND A.RankNumber = B.RankNumber+1
WHERE A.RankNumber = 0
OR A.Price <> B.Price

You could make a fairly complex query with nested subqueries out of that if
you didn't have a field named YEAR (a reserved word).

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Yes, I also hate to use ranking queries. But sometimes I don't see a better
solution. Of course, with large datasets things can get awfully slow.

Your solution may be faster, but even it is going to be slow with the need to
use calls to DateSerial. I don't think there is a good solution to this
problem with large datasets and the specified data structure.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Hi John/Dale

We seem to be chatting amonsgt ourselves, as Jaek hasn't dropped by since
last week. However, it is an interesting discussion :-)

I agree with Dale - for this kind of application I would *always* go for a
structure of:
ItemFK
EffectiveDate
Price

It is so easy, using that structure, to ascertain the price of an item on
any given date. It also makes maintenance much easier, as you have only one
record per price change.

Nevertheless, it was an interesting exercise. It actually crossed my mind
that Jaek's purpose might have been to run the query as a one-off to convert
the old structure to an "EffectiveDate" form.
 
Back
Top