Split large text file by number of lines?

ivan.perak

Hello,

I'm a beginner in VB.NET. Here is what I would like to do.

I have a text file (a list of names, one name per line) which
is about 350000 lines long. I would like to split it, creating a new
file every, let's say, 20000 lines, so the output directory would
look something like this:

File1: lines 1-20000 of the original file
File2: lines 20001-40000 of the original file
File3: lines 40001-60000 of the original file

etc.

Can it be done simply? One form with a field to enter the number of
lines, a button to load a text file, and a "Start" button...

Thanks in advance
 
Yes.

Read the source file line by line

Write each line to the target file

After each nth line, close the target file and open a new one (with a
different name of course).
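
The three steps above can be sketched roughly as follows. This is only a minimal sketch: the module name, the SplitByLines function and its parameters are made up for illustration, and it assumes the default UTF-8 handling of StreamReader and StreamWriter is acceptable for the file in question.

```vbnet
Imports System
Imports System.IO

Module SplitSketch

    ' Hypothetical helper (not from the thread): copies sourcePath into
    ' numbered files of at most linesPerFile lines each and returns the
    ' number of files written.
    Function SplitByLines(ByVal sourcePath As String, _
                          ByVal outputPrefix As String, _
                          ByVal linesPerFile As Integer) As Integer

        If linesPerFile < 1 Then Throw New ArgumentOutOfRangeException("linesPerFile")

        Dim fileCount As Integer = 0
        Dim linesInCurrent As Integer = 0
        Dim writer As StreamWriter = Nothing

        Using reader As New StreamReader(sourcePath)
            Try
                ' Step 1: read the source file line by line
                Dim line As String = reader.ReadLine()
                While line IsNot Nothing
                    If writer Is Nothing Then
                        ' Open the next target file under a new name
                        fileCount += 1
                        writer = New StreamWriter(outputPrefix & fileCount & ".txt")
                        linesInCurrent = 0
                    End If

                    ' Step 2: write each line to the target file
                    writer.WriteLine(line)
                    linesInCurrent += 1

                    ' Step 3: after every nth line, close the target file
                    If linesInCurrent = linesPerFile Then
                        writer.Close()
                        writer = Nothing
                    End If

                    line = reader.ReadLine()
                End While
            Finally
                ' Close a partially filled last file, if any
                If writer IsNot Nothing Then writer.Close()
            End Try
        End Using

        Return fileCount
    End Function

End Module
```

Called as SplitByLines("C:\names.txt", "C:\File", 20000) (both paths are placeholders), a 350000-line file would give 18 files, the last one holding the remaining 10000 lines.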
 
The code I have written works, but it takes some time to complete (about 50
seconds for a 1 MB text file).

It may be better to do a "read all" and then use the Split(str, vbCrLf)
function, but anyway, this should do it.

Add a textbox and a button. This was created in VB.NET 2005 (the free
version from Microsoft).



Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click

    TextSplitter()

End Sub


Sub TextSplitter()

    ' Number of lines per output file, taken from the textbox
    Dim LinesPerFile As Integer = CInt(TextBox1.Text)

    Dim sb As New System.Text.StringBuilder

    Dim LineCounter As Integer = 0

    Dim FileNumber As Integer = 0

    Me.Text = "processing file... "

    ' Open the source file; Using closes the reader when the block ends
    Using AsciiStreamReader As IO.StreamReader = _
            IO.File.OpenText("C:\HugeSourceTextFile1.txt")

        While Not AsciiStreamReader.EndOfStream

            sb.Append(AsciiStreamReader.ReadLine() & vbCrLf)

            ' Count the line before testing, so every file gets the same size
            LineCounter += 1

            If LineCounter = LinesPerFile OrElse AsciiStreamReader.EndOfStream Then

                ' Write the lines collected in the StringBuilder (sb) to the next file

                FileNumber += 1

                IO.File.WriteAllText("C:\" & "File " & FileNumber & ".txt", _
                        sb.ToString(), System.Text.Encoding.ASCII)

                ' Reset the line count and clear the StringBuilder

                LineCounter = 0

                sb.Length = 0

                ' Let the form process pending window messages once per file
                Application.DoEvents()

            End If

        End While

    End Using

    Me.Text = "Complete: created " & FileNumber & " files"

End Sub
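
The "read all, then split" idea mentioned before the code might look something like the sketch below. It is an illustration under assumptions, not a measured improvement: the module and function names are invented, and it uses File.ReadAllLines in place of Split(str, vbCrLf), so the whole file has to fit in memory.

```vbnet
Imports System
Imports System.IO

Module ReadAllSketch

    ' Hypothetical helper (not from the thread): loads every line into
    ' memory, then writes consecutive slices of linesPerFile lines.
    ' Returns the number of files written.
    Function SplitInMemory(ByVal sourcePath As String, _
                           ByVal outputPrefix As String, _
                           ByVal linesPerFile As Integer) As Integer

        If linesPerFile < 1 Then Throw New ArgumentOutOfRangeException("linesPerFile")

        ' ReadAllLines splits on line breaks itself
        Dim allLines As String() = File.ReadAllLines(sourcePath)

        Dim fileCount As Integer = 0
        Dim start As Integer = 0

        While start < allLines.Length
            ' The last slice may be shorter than linesPerFile
            Dim count As Integer = Math.Min(linesPerFile, allLines.Length - start)

            Dim chunk(count - 1) As String
            Array.Copy(allLines, start, chunk, 0, count)

            fileCount += 1
            File.WriteAllLines(outputPrefix & fileCount & ".txt", chunk)

            start += count
        End While

        Return fileCount
    End Function

End Module
```

For a list of about 350000 names this should be fine memory-wise; for much larger files, reading line by line avoids holding everything in memory at once.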
 
Suggestion (untested):

Sub TextSplitter()

    Dim fsIN, fsOut As IO.FileStream
    Dim sr As IO.StreamReader
    Dim sw As IO.StreamWriter
    Dim OutCount As Integer

    fsIN = New IO.FileStream( _
        "infile.txt", IO.FileMode.Open, IO.FileAccess.Read _
    )

    sr = New IO.StreamReader(fsIN, System.Text.Encoding.Default)

    Do
        Dim Line As String
        Dim LineCount As Integer

        Line = sr.ReadLine()
        If Line Is Nothing Then Exit Do

        If fsOut Is Nothing Then
            OutCount += 1

            fsOut = New IO.FileStream( _
                "outfile" & OutCount & ".txt", _
                IO.FileMode.CreateNew, IO.FileAccess.Write _
            )

            sw = New IO.StreamWriter(fsOut, System.Text.Encoding.Default)
            LineCount = 0
        End If

        sw.WriteLine(Line)
        LineCount += 1

        If LineCount = 20000 Then
            sw.Close()
            fsOut = Nothing
        End If
    Loop

    If fsOut IsNot Nothing Then
        sw.Close()
    End If

    fsIN.Close()

End Sub


Be aware that Encoding.ASCII supports only 7-bit characters, so anything
outside that range (accented letters, for example) is written as "?".


Armin
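
To illustrate the Encoding.ASCII remark, here is a small sketch of what happens to a name containing a non-ASCII letter. The module and function names are invented for the example; only Encoding.ASCII.GetBytes and GetString are real framework calls.

```vbnet
Imports System
Imports System.Text

Module AsciiSketch

    ' Hypothetical helper (not from the thread): encodes a string as
    ' ASCII bytes and decodes it again, showing what survives the trip.
    Function RoundTripAscii(ByVal value As String) As String
        ' Characters above U+007F cannot be represented in 7 bits;
        ' the ASCII encoder substitutes "?" for each of them.
        Dim bytes As Byte() = Encoding.ASCII.GetBytes(value)
        Return Encoding.ASCII.GetString(bytes)
    End Function

End Module
```

RoundTripAscii("Ivan") comes back unchanged, while RoundTripAscii("Mi" & ChrW(&H161) & "o") (the š is U+0161) comes back as "Mi?o". For a list of names it is probably safer to write with Encoding.Default or Encoding.UTF8, matching whatever the source file uses.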
 
Thanks man, this code does exactly what I need, and it's pretty fast.
 