John M
Hi,
I have a program that does a lot of analysis on HTML files. The HTML files
range from 300 KB to 1 MB in size. My program processes the HTML using a
SAX-style approach. The program runs very slowly, taking several minutes to
process a file. I profiled the program and discovered that approximately half
the processing time is spent in the IndexOf method and the other half in
the Substring method of the String class.
What I would like to know is:
1) Are there any SAX-style HTML parsers for .NET?
2) What are my alternatives to the String class and its IndexOf and
Substring methods?
TIA,
John