How can I copy some text to a file, and are there utilities for parsing it?

  • Thread starter: dara

dara

Hi

I'd appreciate it if someone could tell me which class to use to copy text
data to a file.
Also, are there any utilities to parse this data so that it can be
rewritten in comma-separated form?

Thanks in advance.
Dara
 
Hi,
To write data to files, try the System.IO namespace, with classes such as
TextWriter, StreamWriter, FileStream ...

For the comma separator, try string.Split (and string.Join to build the
comma-separated line).

Hope that helps.
ROM
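
For anyone following along, here is a minimal sketch of that suggestion, not a definitive implementation. It assumes the input records are tab-separated and that no field contains a comma or a quote. The names FileSketch, ToCommaSeparated and WriteRecords, and the file name output.csv, are made up for illustration.

```csharp
using System;
using System.IO;

public static class FileSketch
{
    // Turns one tab-separated record into a comma-separated line.
    // Assumption: the fields themselves contain no commas or quotes.
    public static string ToCommaSeparated(string record)
    {
        string[] fields = record.Split('\t');
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = fields[i].Trim();
        }
        return string.Join(",", fields);
    }

    // Writes each record on its own line; an existing file is overwritten.
    public static void WriteRecords(string path, string[] records)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            foreach (string record in records)
            {
                writer.WriteLine(ToCommaSeparated(record));
            }
        }
    }
}

public class FileSketchDemo
{
    public static void Main()
    {
        string[] records = { "name1\taddress1", "name2\taddress2" };
        FileSketch.WriteRecords("output.csv", records);

        // Read the file back to check what was written.
        Console.WriteLine(File.ReadAllText("output.csv"));
    }
}
```

StreamWriter is a concrete TextWriter, so the same WriteLine calls would work against any other TextWriter, such as Console.Out.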
 
Thanks, Rom, I'll give it a try.
Take care.

 
Dara,

Let me know exactly what you want to do, and I'll post some code for
you. :-)
 
Hi Adam,

I have a string that holds the HTML content of a web page. I'd like to parse
the names and addresses and make a list, then write this list to a
comma-separated file, with each record on a new line.

The text looks like this:

[html code]
name1 [html code] address1 [html code]
name2 [html code] address2 [html code]
....

And I'd like the text file to look like this:

name1, address1
name2, address2
....

Thanks for your time @----->>--
Dara


"Adam P. Tatusko, MCSD .NET, MCAD .NET, MCDBA, MCSE, MCSA"
 
Dara,

Very cool! I'm about to embark on a project that will do something
similar. Since the [html code] is most likely not constant, we need to
extract the needed data in a slightly sneakier way. Hopefully the
code in [html code] is unique enough that we can find each element
explicitly. This process is called HTML scraping or web scraping. Could
you provide the exact format of the HTML content in the string,
so that I can better write the code for you?

--Adam
 
Hi Adam,

I want to extract NAME and ADDRESS. I was also wondering whether there
is better support for writing the list in XML format. I don't know if
having the data in XML format makes it easier to use DataSets.
Thanks

Sample html:

<table cellpadding=0 cellspacing=0 width=300 border=2>
<tr><td colspan=4><div class="ls">
NAME<br>
</div></td></tr>
<td width=100 class="in" valign=top><div class="ls">
ADDRESS<br>
<SCRIPT LANGUAGE="javascript">
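
On the XML question: a DataSet can write itself out as XML with WriteXml and load it again with ReadXml, so no separate serializer is strictly needed. The following is only a sketch under assumptions: the DataSet name ContactList, the table name Contact, the column names, and the XmlSketch class are all invented for illustration.

```csharp
using System;
using System.Data;
using System.IO;

public static class XmlSketch
{
    // Builds a DataSet with a single table of name/address rows.
    public static DataSet BuildContacts(string[,] pairs)
    {
        DataSet dataSet = new DataSet("ContactList");
        DataTable table = dataSet.Tables.Add("Contact");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Address", typeof(string));

        for (int i = 0; i < pairs.GetLength(0); i++)
        {
            table.Rows.Add(pairs[i, 0], pairs[i, 1]);
        }
        return dataSet;
    }

    // Serializes the DataSet to an XML string (data only, no schema).
    public static string ToXml(DataSet dataSet)
    {
        using (StringWriter writer = new StringWriter())
        {
            dataSet.WriteXml(writer);
            return writer.ToString();
        }
    }
}

public class XmlSketchDemo
{
    public static void Main()
    {
        string[,] pairs = { { "NAME1", "ADDRESS1" }, { "NAME2", "ADDRESS2" } };
        string xml = XmlSketch.ToXml(XmlSketch.BuildContacts(pairs));
        Console.WriteLine(xml);

        // Load the XML back into a fresh DataSet; the schema is inferred.
        DataSet loaded = new DataSet();
        loaded.ReadXml(new StringReader(xml));
        Console.WriteLine("Rows read back: " + loaded.Tables["Contact"].Rows.Count);
    }
}
```

WriteXml also has an overload that takes a file name, which may be more convenient than going through a string.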
 
Since your data elements occur after the <div class=ls> tag and before
the next <br> tag, you can extract the data using a while loop as
follows.

********************
using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample input: every value follows a <div class=ls> tag and
            // ends at the next <br> tag. Names and addresses alternate.
            string htmlContent =
                "<table cellpadding=0 cellspacing=0 width=300 border=2>" +
                "<tr><td colspan=4><div class=ls>NAME1<br></div></td></tr></div></td></tr>" +
                "<td width=100 class=in valign=top><div class=ls>ADDRESS1<br>" +
                "<tr><td colspan=4><div class=ls>NAME2<br></div></td></tr></div></td></tr>" +
                "<td width=100 class=in valign=top><div class=ls>ADDRESS2<br>" +
                "<tr><td colspan=4><div class=ls>NAME3<br></div></td></tr></div></td></tr>" +
                "<td width=100 class=in valign=top><div class=ls>ADDRESS3<br>";
            string tokenStart = "<div class=ls>";
            string tokenEnd = "<br>";
            string name = String.Empty;
            string address = String.Empty;
            int endPosition = 0;
            int startPosition = 0;

            while (endPosition < htmlContent.Length - tokenEnd.Length)
            {
                // Locate the next name. IndexOf returns -1 when a token is
                // missing, so stop instead of reading from a bogus offset.
                startPosition = htmlContent.IndexOf(tokenStart, endPosition);
                if (startPosition < 0) break;
                startPosition += tokenStart.Length;
                endPosition = htmlContent.IndexOf(tokenEnd, startPosition);
                if (endPosition < 0) break;
                name = htmlContent.Substring(startPosition, endPosition - startPosition);

                // The address is the next delimited value after the name.
                startPosition = htmlContent.IndexOf(tokenStart, endPosition);
                if (startPosition < 0) break;
                startPosition += tokenStart.Length;
                endPosition = htmlContent.IndexOf(tokenEnd, startPosition);
                if (endPosition < 0) break;
                address = htmlContent.Substring(startPosition, endPosition - startPosition);

                Console.WriteLine("name = {0}, address = {1}", name, address);
            }

            Console.ReadLine();
        }
    }
}
********************
When run, this code produces the following output:
name = NAME1, address = ADDRESS1
name = NAME2, address = ADDRESS2
name = NAME3, address = ADDRESS3
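
The loop above only prints to the console. To get the comma-separated file that was asked for, the Console.WriteLine call could be swapped for a StreamWriter. Addresses often contain commas themselves, so the sketch below quotes such fields in the common CSV style (wrap in double quotes, double any embedded quotes). The CsvSketch class, its method names, the sample data and the file name contacts.csv are assumptions for illustration, not part of the original code.

```csharp
using System;
using System.IO;

public static class CsvSketch
{
    // Quotes a field only when it contains a comma, a quote or a line break.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new char[] { ',', '"', '\r', '\n' }) < 0)
        {
            return field;
        }
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV record from a name/address pair.
    public static string ToCsvLine(string name, string address)
    {
        return Escape(name) + "," + Escape(address);
    }
}

public class CsvSketchDemo
{
    public static void Main()
    {
        // Stand-in for the pairs produced by the extraction loop.
        string[,] pairs =
        {
            { "NAME1", "ADDRESS1" },
            { "NAME2", "12 High St, Apt 4" }
        };

        // "contacts.csv" is a placeholder path; an existing file is overwritten.
        using (StreamWriter writer = new StreamWriter("contacts.csv"))
        {
            for (int i = 0; i < pairs.GetLength(0); i++)
            {
                writer.WriteLine(CsvSketch.ToCsvLine(pairs[i, 0], pairs[i, 1]));
            }
        }

        Console.WriteLine(File.ReadAllText("contacts.csv"));
    }
}
```

If the data is known never to contain commas or quotes, a plain writer.WriteLine(name + "," + address) inside the loop would be enough.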
 
Sorry, Google Groups tends to distort posted code. Does anyone know how
to make code look good in Google Groups posts?
 
Adam,

Thanks so much, it looks neat. I've been given something else to do for a
couple of days, after which I'll use your code. I hope all those tags don't
change from page to page, but I'm sure I can find something constant to use
as anchor points.
I appreciate your help :)

Dara
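
If the tags do vary a little from page to page (quoted versus unquoted class values, extra whitespace, upper or lower case), a regular expression can serve as a more forgiving anchor than an exact string. This is only a sketch: it assumes names and addresses still alternate inside div elements of class ls and end at a br tag, and the ScrapeSketch class and ExtractPairs method are hypothetical names. For pages that vary more than this, a real HTML parser would likely be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ScrapeSketch
{
    // Matches <div class=ls>, <div class="ls"> or <DIV CLASS='ls'>, then
    // captures everything up to the next <br> (or <br/>), trimmed.
    private static readonly Regex ValuePattern = new Regex(
        @"<div\s+class\s*=\s*[""']?ls[""']?\s*>\s*(.*?)\s*<br\s*/?>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Pairs up consecutive matches as { name, address }.
    // A trailing unpaired value is ignored.
    public static List<string[]> ExtractPairs(string html)
    {
        List<string[]> pairs = new List<string[]>();
        MatchCollection matches = ValuePattern.Matches(html);
        for (int i = 0; i + 1 < matches.Count; i += 2)
        {
            pairs.Add(new string[]
            {
                matches[i].Groups[1].Value,
                matches[i + 1].Groups[1].Value
            });
        }
        return pairs;
    }
}

public class ScrapeSketchDemo
{
    public static void Main()
    {
        string html =
            "<tr><td colspan=4><div class=\"ls\">\nNAME1<br>\n</div></td></tr>" +
            "<td width=100 class=\"in\" valign=top><div class=ls>\nADDRESS1<br>";

        foreach (string[] pair in ScrapeSketch.ExtractPairs(html))
        {
            Console.WriteLine("name = {0}, address = {1}", pair[0], pair[1]);
        }
    }
}
```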
 