Find similar items

  • Thread starter Thread starter Tem
  • Start date Start date
T

Tem

I have a single table that contains information of photos

ie.
ID PhotoName PhotoTags
23 my cat cat animal pet
24 cell phone electronic communication
25 tiger animal zoo

What would be a possible way to write a query that returns similar items
- share similar tags, similar photo name


can this be done with a sql query?

Thank you
Tem
 
It can, but you need a better design. Each individual tag should be in a
row by itself in a PhotoTags table, with a foreign key to the Photos table.

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau


I have a single table that contains information of photos

ie.
ID PhotoName PhotoTags
23 my cat cat animal pet
24 cell phone electronic communication
25 tiger animal zoo

What would be a possible way to write a query that returns similar items
- share similar tags, similar photo name


can this be done with a sql query?

Thank you
Tem
 
need to join the table on itself for this. note it will be SLOOOOWWWW if
the table is huge.

select t1.*, t2.*
from yourtable t1 join yourtable t2 on t1.photoname = t2.photoname
and t1.phototags = t2.phototags
 
can this be done with an SQL query? <<

Yes, but why not buy a document or textbase package which is designed
to work with this type of data?
 
thanks ill try it

TheSQLGuru said:
need to join the table on itself for this. note it will be SLOOOOWWWW if
the table is huge.

select t1.*, t2.*
from yourtable t1 join yourtable t2 on t1.photoname = t2.photoname
and t1.phototags = t2.phototags


--
Kevin G. Boles
Indicium Resources, Inc.
SQL Server MVP
kgboles a earthlink dt net
 
If you are on 2005, check Term Extraction and Term Lookup Transformations
You can build pretty much "smart" service on top of this
If you are interested in fuzzy matching - you can take a look at Fuzzy
Lookup and Fuzzy Grouping Transformations
 
this is very cool!

Mikhail Berlyant said:
If you are on 2005, check Term Extraction and Term Lookup Transformations
You can build pretty much "smart" service on top of this
If you are interested in fuzzy matching - you can take a look at Fuzzy
Lookup and Fuzzy Grouping Transformations
 
Table Tags
ID TagName PhotoId
1 cat 23
2 animal 23
3 pet 23
4 animal 25

select PhotoId from Tags where PhotoId = 23 and (other photoId that has 23's
tags)

Need some help with this sql statement
 
So is the requirement that there must be a match on ANY tags of PhotoId = 23
(in which case, PhotoID = 25 WILL match) or is it that you want a match on
ALL tags of PhotoId = 23 (in which case PhotoId WILL NOT match)?

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau


Table Tags
ID TagName PhotoId
1 cat 23
2 animal 23
3 pet 23
4 animal 25

select PhotoId from Tags where PhotoId = 23 and (other photoId that has 23's
tags)

Need some help with this sql statement
 
Below is VERY SIMPLIFIED example :
you create dictionary table and populate it with distinct terms from all
phrases you have:

TagID TagName
1 cat
2 animal
3 pat
4 electronic
5 communication
6 zoo

Next you join this Dictionary table with Phrase table and end up with Tags
table (in more real example you would have here frequency field, but for now
you can skip this for simplicity sake):

PhotoID TagID
23 1
23 2
23 3
24 4
24 5
25 2
25 6

Your final query would look like:

SELECT PhotoID, COUNT(*) AS Score
FROM Tags
WHERE TagID in (SELECT TagID FROM Tags WHERE PhotoID = 23)
GROUP BY PhotoID
ORDER BY Score DESC

I hope you will get an idea
 
What im trying to say is

since 23 has the tags "cat animal pet" it should return other photoIds that
contain "cat OR animal OR pet" (in any order)

1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25

If possible put the closest matching ones "cat AND animal AND pet" at the
top.
Hence getting similar photos
 
Then this should do it:

select
t1.PhotoID
, count (*)
from
Tags t1
join
Tags t2 on t2.TagName = t1.TagName
where
t2.PhotoId = 23
and
t1.PhotoId <> 23
group by
t1.PhotoID
order by
count (*) desc

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau


What im trying to say is

since 23 has the tags "cat animal pet" it should return other photoIds that
contain "cat OR animal OR pet" (in any order)

1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25

If possible put the closest matching ones "cat AND animal AND pet" at the
top.
Hence getting similar photos
 
The query did not return anything


ID TagName PhotoID
1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25
6 car 26
7 phone 26

if in the query @PhotoID = 25, it should return 23
if in the query @PhotoID = 26, it should return nothing
 
Please ignore my last post

Count still does not work

ID TagName PhotoID
1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25
6 car 26
7 phone 26
8 cat 27
9 animal 27

if in the query @PhotoID = 25, it should return
PhotoId Count
23 1


and if in the query @PhotoID = 27, it should return
PhotoId Count
27 2
25 1
 
My original code works. Here's a complete repro:

create table Tags
(
ID int primary key
, TagName varchar (12) not null
, PhotoID int not null
)
go
insert Tags values (1, 'cat', 23)
insert Tags values (2, 'animal', 23)
insert Tags values (3, 'pet', 23)
insert Tags values (4, 'animal', 25)
insert Tags values (5, 'dog', 25)
insert Tags values (6, 'car', 26)
insert Tags values (7, 'phone', 26)
insert Tags values (8, 'cat ', 27)
insert Tags values (9, 'animal', 27)
go
select
t1.PhotoID
, count (*)
from
Tags t1
join
Tags t2 on t2.TagName = t1.TagName
where
t2.PhotoId = 27
and
t1.PhotoId <> 27
group by
t1.PhotoID
order by
count (*) desc
go
drop table tags

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau


Please ignore my last post

Count still does not work

ID TagName PhotoID
1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25
6 car 26
7 phone 26
8 cat 27
9 animal 27

if in the query @PhotoID = 25, it should return
PhotoId Count
23 1


and if in the query @PhotoID = 27, it should return
PhotoId Count
27 2
25 1
 
thanks

Tom Moreau said:
My original code works. Here's a complete repro:

create table Tags
(
ID int primary key
, TagName varchar (12) not null
, PhotoID int not null
)
go
insert Tags values (1, 'cat', 23)
insert Tags values (2, 'animal', 23)
insert Tags values (3, 'pet', 23)
insert Tags values (4, 'animal', 25)
insert Tags values (5, 'dog', 25)
insert Tags values (6, 'car', 26)
insert Tags values (7, 'phone', 26)
insert Tags values (8, 'cat ', 27)
insert Tags values (9, 'animal', 27)
go
select
t1.PhotoID
, count (*)
from
Tags t1
join
Tags t2 on t2.TagName = t1.TagName
where
t2.PhotoId = 27
and
t1.PhotoId <> 27
group by
t1.PhotoID
order by
count (*) desc
go
drop table tags

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA, MCITP, MCTS
SQL Server MVP
Toronto, ON Canada
https://mvp.support.microsoft.com/profile/Tom.Moreau


Please ignore my last post

Count still does not work

ID TagName PhotoID
1 cat 23
2 animal 23
3 pet 23
4 animal 25
5 dog 25
6 car 26
7 phone 26
8 cat 27
9 animal 27

if in the query @PhotoID = 25, it should return
PhotoId Count
23 1


and if in the query @PhotoID = 27, it should return
PhotoId Count
27 2
25 1
 
Back
Top