Tuesday, 17 September 2013

Read file in blocks

I'm having an issue which I'm currently stuck on.
I have a big file in the following format:
Block 1
Line 1: Type1/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Block 2
Line 1: Type1/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Block 3
Line 1: Type1/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Block 4
Line 1: Type1/Type2
Line 2: Time
Line 3: Data we need
Line 4: 00.*
Line 5: Fix 100
Line 6: In..
Line 7: Ou..
Line 8: Data we need
Line 9: Next
Line 10: Multi_Exit
Etc
I want to read the first line of each block to check whether it is Type1 or Type2.
After that, I want to print Line 3 and Line 8 of each block, and keep
doing so until the file ends.
I have tried the following code:
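p = './file.txt'
fin = open(p, 'r')
for i, line in enumerate(fin):
    # Indices 2 and 7 are Line 3 and Line 8 of each 11-line block
    # (10 lines plus one blank separator line)
    if i % 11 == 2 or i % 11 == 7:
        print(line)
fin.close()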
After running this code on my big file, I noticed that the printed lines
drift away from Line 3 and Line 8. I can only assume my blocks are not
always exactly 10 lines long (plus one blank line before the next block
starts), so this method isn't ideal.
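One way around the fixed-length assumption would be to split the file on the blank separator lines instead of counting lines. The following is only a minimal sketch: it assumes that blocks are separated by one or more blank lines, and the helper name iter_blocks and the sample data are made up for illustration.

```python
from itertools import groupby


def iter_blocks(lines):
    """Yield each block as a list of lines with the trailing newline removed.

    Assumption: blocks are separated by one or more blank lines.
    """
    # groupby clusters consecutive lines by whether they are blank or not
    for is_blank, group in groupby(lines, key=lambda s: not s.strip()):
        if not is_blank:
            yield [s.rstrip("\n") for s in group]


# Hypothetical sample with blocks of different lengths
sample = [
    "Type1\n", "Time\n", "Data A\n", "\n",
    "Type2\n", "Time\n", "Extra\n", "Data B\n", "\n", "\n",
    "Type1\n",
]
blocks = list(iter_blocks(sample))
```

Because a file object yields its lines one by one, iter_blocks(fin) should work on the open file directly without loading the whole file into memory.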
I have also tried regular expressions, but I'm having trouble storing my
results in the format I want. For Type 1, each line of the output file
should look like this:
Line 3: Data Line 8: Data
with a single space between the two fields.
This is the next code I tried:
import re
# fin is the input file, opened as in the first snippet
for line in fin:
    if re.match("(Line 1|Line 3|Line 8)", line):
        writeToFile(line)
where the writeToFile function does the following:
def writeToFile(filein):
    p = './output.txt'
    # Append, so that each call adds to the existing output
    with open(p, 'a') as fo:
        fo.write(filein)
This is how the output.txt file looks:
Line 1: Type1/Type2
Line 3: Data we need
Line 8: Data we need
Line 1: Type1/Type2
Line 3: Data we need
Line 8: Data we need
Line 1: Type1/Type2
Line 3: Data we need
Line 8: Data we need
This is not exactly the desired outcome. I don't mind post-processing
this output file instead: check whether Line 1 is Type 1, then put Line 3
and Line 8 on the same line, and keep doing that until Type 2 is found.
Then do the same with Line 3 and Line 8, but store the result in a
different output file.
I hope I haven't complicated things.
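The desired outcome described above could be sketched roughly as follows. This is not a definitive solution: it assumes that the lines literally start with "Line N:" as in the sample, and that the text after "Line 1:" is the block type. The names collect and write_results and the output_<type>.txt file names are hypothetical.

```python
import os
import re
from collections import defaultdict

# Assumption: every line of interest literally starts with "Line N:"
LINE_RE = re.compile(r"Line (\d+):\s*(.*)")


def collect(lines):
    """Return {block_type: ["Line 3: <data> Line 8: <data>", ...]}."""
    results = defaultdict(list)
    block_type, line3 = None, None
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank lines, "Block N" labels, etc.
        number, text = int(m.group(1)), m.group(2)
        if number == 1:
            # A new block starts: remember its type, forget the old Line 3
            block_type, line3 = text, None
        elif number == 3:
            line3 = text
        elif number == 8 and block_type is not None and line3 is not None:
            results[block_type].append(
                "Line 3: {} Line 8: {}".format(line3, text)
            )
    return dict(results)


def write_results(results, directory="."):
    """Write one file per block type (hypothetical file naming)."""
    for block_type, rows in results.items():
        path = os.path.join(directory, "output_{}.txt".format(block_type))
        with open(path, "w") as fo:
            fo.write("\n".join(rows) + "\n")


# Hypothetical sample; the second block is shorter than the first
sample = """\
Line 1: Type1
Line 2: Time
Line 3: A
Line 8: B
Line 10: Multi_Exit

Line 1: Type2
Line 3: C
Line 4: 00.*
Line 8: D
""".splitlines()
results = collect(sample)
```

Since this keys on the line prefixes rather than on positions, it does not depend on every block having the same length. Calling write_results(collect(fin)) on the open input file would then produce one output file per type.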
