My problem is that I have a text file I am reading with a StreamReader object and I am using ReadLine to get one line of text at a time. Nothing special here, but periodically I need to go back to a previous position in the file and read that part again.
If you use the BaseStream property of the StreamReader object, you can use its Position property and Seek method to get and set the current position in the file. These two members would appear to be all you need for random access. However, when you read Position after a ReadLine, you don’t get the offset of the end of the line just read; you get the offset of the end of the data the StreamReader has buffered (usually 1024 bytes at a time). So when you later use Seek to go back to that position, you are unlikely to land where you want unless the line happens to end on a buffer boundary.
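A minimal sketch of that behaviour, assuming a hypothetical sample.txt written to the temp folder just for the demonstration. After a single ReadLine, BaseStream.Position reports how far the reader has read ahead (the exact figure depends on the runtime's buffer size), not the end of the first line:

```vbnet
Imports System.IO
Imports System.Text

Module PositionDemo
    Sub Main()
        ' Hypothetical sample file: 500 lines, "line000" .. "line499",
        ' each 7 characters plus a linefeed (8 bytes in ASCII).
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "sample.txt")
        Dim sb As New StringBuilder()
        For i As Integer = 0 To 499
            sb.Append("line" & i.ToString("000") & vbLf)
        Next
        File.WriteAllText(fileName, sb.ToString(), Encoding.ASCII)

        Using reader As New StreamReader(fileName, Encoding.ASCII)
            Dim firstLine As String = reader.ReadLine()
            ' The first line ends at byte 8, but the reader has already
            ' pulled a whole buffer of data from the underlying stream.
            Console.WriteLine("Line read:           " & firstLine)
            Console.WriteLine("End of first line:   " & (firstLine.Length + 1).ToString())
            Console.WriteLine("BaseStream.Position: " & reader.BaseStream.Position.ToString())
        End Using
    End Sub
End Module
```

Seeking back to the reported Position would therefore skip every line that was already sitting in the reader's buffer.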
I searched around for a simple solution to this problem (there may be one that I haven’t found yet). There were lots of postings about using FileStream, but it doesn't have a ReadLine method to read a line of text, so they suggested implementing your own version of ReadLine for FileStream. I also found information on DiscardBufferedData, which can be used with StreamReader, but it doesn’t help you get the correct offset after using ReadLine.
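For reference, the FileStream approach those postings describe can be sketched roughly as follows. This is only an illustration under stated assumptions: ReadLineAt and the seekable.txt file are hypothetical names, and the text is assumed to be single-byte (ASCII), so one byte is one character. Because the line is assembled byte by byte, FileStream.Position is the exact offset of the next line:

```vbnet
Imports System.IO
Imports System.Text

Module SeekableLines
    ' Hypothetical helper: reads one line starting at the stream's current
    ' position. Returns Nothing at end of file. Assumes single-byte text.
    Function ReadLineAt(ByVal fs As FileStream) As String
        Dim sb As New StringBuilder()
        Dim b As Integer = fs.ReadByte()
        If b = -1 Then Return Nothing
        While b <> -1 AndAlso b <> 10          ' stop at LF or end of file
            If b <> 13 Then sb.Append(Convert.ToChar(b)) ' skip CR
            b = fs.ReadByte()
        End While
        Return sb.ToString()
    End Function

    Sub Main()
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "seekable.txt")
        File.WriteAllText(fileName, "header" & vbCrLf & "one" & vbCrLf & "two" & vbCrLf, Encoding.ASCII)

        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            ReadLineAt(fs)                      ' consume "header"
            Dim mark As Long = fs.Position      ' offset of the line after "header"
            Console.WriteLine(ReadLineAt(fs))   ' one
            Console.WriteLine(ReadLineAt(fs))   ' two
            fs.Seek(mark, SeekOrigin.Begin)     ' go back to the remembered offset
            Console.WriteLine(ReadLineAt(fs))   ' one
        End Using
    End Sub
End Module
```

Reading a byte at a time leans on FileStream's own buffering for speed, and it would need a proper decoder to cope with multi-byte encodings.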
There were several suggestions on writing your own version of StreamReader:
http://bytes.com/forum/thread238508.html
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=6720&SiteID=1
I finally bit the bullet and created my own class to accomplish what I need (and only what I need). The previous two posts didn’t provide me with a working solution, so I am going to post what I have come up with. It may need some more refinement, but it appears to work okay. This is in VB.NET, but should be easily translatable into C# if necessary.
The first code snippet is an example of using the FileClass to read a file. In this example, I am looking for the string “*****StartOfData*****”. When this string is found, I get the current offset and then call my PreprocessData function to read the rest of the data. I then go back to that offset and run ProcessData from the same point:
Dim list() As String
Dim sep() As Char = {" "c} ' field separator; a space is assumed here
Try
    Dim s As New FileClass
    s.Open(TextBoxDataFile.Text)
    Dim buffer As String = ""
    Do
        If Not s.GetNextLine(buffer) Then
            Exit Do
        End If
        list = buffer.Split(sep)
        If buffer = "*****StartOfData*****" Then
            ' Remember where the data starts so it can be read twice.
            Dim startOfData As Integer = s.GetCurrentOffset()
            PreprocessData(s)
            s.SetCurrentOffset(startOfData)
            ProcessData(s)
        End If
    Loop Until s.EOF()
    s.Close()
Catch ex As Exception
    Return False
End Try
The FileClass is shown below:
Imports System.IO
Imports System.Text
Public Class FileClass
    Const BUFFER_SIZE As Integer = 1024
    Private g_file As StreamReader = Nothing
    Private g_line As Integer = 0
    Private g_position As Integer = 0 ' index of the next character in g_buffer
    Private g_buffer(BUFFER_SIZE - 1) As Char ' holds BUFFER_SIZE characters
    Private g_bufferSize As Integer = 0 ' number of characters currently in g_buffer
    Private g_offset As Integer = 0 ' file offset just after the last line returned
    Private g_eofFlag As Boolean = True
    Private g_lineBuffer As New StringBuilder(BUFFER_SIZE)
    Private g_bufferOffset As Integer = 0 ' file offset of g_buffer(0)
    Public Function Open(ByVal filename As String) As Boolean
        If Not g_file Is Nothing Then Close()
        g_file = New StreamReader(filename)
        g_line = 0
        g_position = 0
        g_offset = 0
        g_eofFlag = False
        g_bufferSize = 0
        g_bufferOffset = 0
        LoadBuffer()
        ' A file that cannot be opened raises an exception above.
        Return True
    End Function
    Public Function Close() As Boolean
        ' Safe to call even if no file is open.
        If Not g_file Is Nothing Then g_file.Close()
        g_file = Nothing
        g_line = 0
        g_position = 0
        g_eofFlag = True
        g_bufferSize = 0
        Return True
    End Function
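    Public Function GetCurrentOffset() As Integer
        Return g_offset
    End Function
    Public Function SetCurrentOffset(ByVal offset As Integer) As Boolean
        Dim pos As Long = g_file.BaseStream.Seek(offset, SeekOrigin.Begin)
        ' Throw away whatever the StreamReader read ahead before the seek.
        g_file.DiscardBufferedData()
        g_offset = offset
        g_eofFlag = False ' LoadBuffer sets it again if nothing is left to read
        LoadBuffer()
        Return offset = pos
    End Function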
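    Public Function GetNextLine(ByRef data As String) As Boolean
        ' Nothing to read: no file is open, or the last LoadBuffer hit end of file.
        If g_file Is Nothing OrElse g_bufferSize = 0 Then Return False
        g_lineBuffer.Length = 0
        Dim ch As Char
        Dim flag As Boolean = False ' set when a linefeed ends the line
        While Not flag
            ch = g_buffer(g_position)
            If ch = ControlChars.Cr Then
                ' do nothing - skip cr
            ElseIf ch = ControlChars.Lf Then
                flag = True
            Else
                g_lineBuffer.Append(ch)
            End If
            g_position = g_position + 1
            If g_position = g_bufferSize Then
                If Not LoadBuffer() Then
                    Exit While
                End If
            End If
        End While
        ' Also return a final line that has no trailing linefeed.
        If flag OrElse g_lineBuffer.Length > 0 Then
            g_offset = g_bufferOffset + g_position
            data = g_lineBuffer.ToString
            Return True
        End If
        Return False
    End Function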
    Private Function LoadBuffer() As Boolean
        ' Assumes the StreamReader holds no unread data at this point, so the
        ' stream position is the offset of the next character to be read.
        g_bufferOffset = Convert.ToInt32(g_file.BaseStream.Position)
        g_position = 0
        g_bufferSize = g_file.Read(g_buffer, 0, BUFFER_SIZE)
        If g_bufferSize = 0 Then
            g_eofFlag = True
            Return False
        End If
        Return True
    End Function
    Public Function EOF() As Boolean
        Return g_eofFlag
    End Function
End Class
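The FileClass is pretty simple: it fills a buffer and looks for the carriage returns and linefeeds itself to build each line it returns. Note that it counts characters rather than bytes and takes each buffer's starting offset from BaseStream.Position, so the offsets are only correct when every character is stored as a single byte (plain ASCII text, for example) and the StreamReader's internal buffer holds the same BUFFER_SIZE bytes.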