Extract eMail data


vjp2.at

I get an email that says "Joe is a bozo, he has 333 balloons". I have
user-defined Outlook fields for BOZO and BALLOONS. I want the program
to automagically start a new record and put "joe" in the BOZO field
and 333 in BALLOONS. If "BOZO" is a KEY and I already have a record
for Joe, I want the program to update the existing record instead.
The eMail has not been previously formatted for XML. I have been told
"XML Smart Tags" is the way to accomplish this task. Is this so? How
do I start? I've written hundreds of FORTRAN and Algol programs, tens
of Pascal and C programs, and modified one Assembler program (twenty
years ago), but mostly from a math rather than a data perspective.


- = -
Vasos-Peter John Panagiotopoulos II, Reagan Mozart Pindus BioStrategist
http://ourworld.compuserve.com/homepages/vjp2/vasos.htm
---{Nothing herein constitutes advice. Everything fully disclaimed.}---
[Homeland Security means private firearms not lazy obstructive guards]
[Yellary Clinton & Yellalot Spitzer: Nasty Together]
 
The question would be, which different forms of "Joe is a bozo, he has
333 balloons" are you expecting?

If your emails follow a constrained format or stick to a certain
domain, you can write simple rules-engine-style code. You might even
get away with writing a lex/yacc/bison parser.
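
To make the rules idea concrete, here is a minimal Python sketch,
assuming every email follows the single pattern from the question. The
regular expression and the names RULE, extract and upsert are made up
for illustration, and a plain dictionary stands in for the Outlook
records; writing to real user-defined Outlook fields would go through
Outlook's own object model, which is not shown here.

```python
import re

# Hypothetical rule: "<name> is a bozo, he/she has <n> balloons".
# "ball?oons?" also accepts the misspelling "baloons".
RULE = re.compile(
    r"(?P<bozo>\w+)\s+is\s+a\s+bozo,\s+s?he\s+has\s+(?P<balloons>\d+)\s+ball?oons?",
    re.IGNORECASE,
)

def extract(body):
    """Return (bozo, balloons) if the body matches the rule, else None."""
    m = RULE.search(body)
    if m is None:
        return None
    return m.group("bozo").lower(), int(m.group("balloons"))

def upsert(records, body):
    """Create or overwrite the record keyed by BOZO; True if the rule matched."""
    fields = extract(body)
    if fields is None:
        return False
    bozo, balloons = fields
    # Assigning by key covers both cases: new record or update of an existing one.
    records[bozo] = {"BOZO": bozo, "BALLOONS": balloons}
    return True

records = {}  # stand-in for the Outlook store (assumption)
upsert(records, "Joe is a bozo, he has 333 baloons")
upsert(records, "Joe is a bozo, he has 400 balloons")
print(records)
```

The second call updates Joe's record instead of adding a duplicate,
because BOZO is used as the dictionary key. Each new phrasing you
expect would need its own rule.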

If you're expecting natural language input or input in different
languages, then there's a whole lot more you'd need to think about:
sentence boundary detection, word boundary detection, phrase
detection, part-of-speech tagging, and word-sense disambiguation. For
all of these, there are techniques ranging from the script-kiddie
approach, to rule generators, to various classification techniques,
Markov models, and so on...
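
For a feel of the simplest end of that range, here is a naive Python
sketch of sentence and word boundary detection using regular
expressions only. The function names are made up for illustration,
and the rules assume English-like punctuation; it is not a substitute
for a real tokenizer.

```python
import re

def split_sentences(text):
    """Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(sentence):
    """Naive rule: a word is a run of ASCII letters, digits or apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)

sentences = split_sentences("Joe is a bozo. He has 333 balloons!")
words = split_words(sentences[1])
print(sentences)
print(words)
```

Rules like these break quickly: "Dr. Smith has 2 balloons." gets split
after "Dr.", which is the kind of case that pushes people toward the
statistical techniques.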

What is the size of your problem, and what amount of resources do you
have to solve this problem? Depending on these constraints, you might
choose an approach that is not necessarily the best technology-wise,
but works for your problem.

Regards,
Milind
 