Introduction to Python
davidbpython.com
Data Parsing & Extraction: File Operations and the for Looping Statement
the for loop block statement with a list
for with a list repeats its block as many times as there are items in the list.
mylist = [1, 2, 'b']
for myvar in mylist:     # myvar is assigned the next item from mylist (first: 1)
    print(myvar)         # 1
    print('===')         # ===
print('done')
The above code produces this output:
# 1
# ===
# 2
# ===
# b
# ===
# done
- Similar to a while block, the for looping block statement repeats the contents of its block multiple times (looping).
- However, the for block will repeat only as many times as there are items in the list.
- myvar is called the control variable.
- The control variable is reassigned the next value from the list for each iteration of the loop.
- This means that if the list has 3 items, the loop executes 3 times and myvar is reassigned a new value 3 times.
- Special note: the variable myvar may be given any name - it is a variable like any other that you might create and use.
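To illustrate the last point, here is a minimal sketch using a different control variable name. The list colors and the names color and last_seen are made up for this example.

```python
colors = ['red', 'green', 'blue']   # hypothetical list, for illustration only

for color in colors:                # the control variable is named 'color' here
    print(color)

# the control variable is an ordinary variable: after the loop ends
# it still holds the last value it was assigned
last_seen = color
print(last_seen)                    # blue
```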
review: the concept of incrementing
We increment an integer by reassigning it a value one greater than its current value.
x = 0 # int, 0
x = x + 1 # int, 1
x = x + 1 # int, 2 (can also say x += 1)
x = x + 1 # int, 3
print(x) # 3
- For each of the three incrementing statements above, a new value equal to the value of x plus 1 is created, and then assigned back to x.
- The previous value of x is replaced with the new, incremented value.
- Incrementing is most often used for counting within loops -- see next.
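As a small sketch of the shorthand mentioned in the code comment above, the long form and the += form can be mixed freely; both add 1 to x.

```python
x = 0
x = x + 1      # long form
x += 1         # shorthand, same effect
x += 1
print(x)       # 3
```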
using a for loop to count list items
An integer, incremented once for each iteration, can be used to count iterations.
mylist = [1, 2, 'b']
my_counter = 0
for thisvar in mylist:
    my_counter = my_counter + 1
print(f'count: {my_counter} items')     # count: 3 items
- The value of my_counter is initialized at 0 before the loop begins.
- Then, since the incrementing line my_counter = my_counter + 1 is inside the looping block, the value of my_counter goes up by 1 with each iteration.
- (Please note that the len() function can count list items more efficiently, but we are using a counter to demonstrate the counter technique, which can be used in situations where len() can't be used -- as when looping through a file -- discussed shortly.)
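As a quick check of the note above, this sketch compares the loop counter with len() on the same list. The name builtin_count is introduced here only for the comparison.

```python
mylist = [1, 2, 'b']

my_counter = 0
for thisvar in mylist:
    my_counter = my_counter + 1     # count one item per iteration

builtin_count = len(mylist)         # len() gives the same answer directly

print(my_counter)                   # 3
print(builtin_count)                # 3
```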
using a for loop to sum list items
A numeric value, updated for each iteration, can be used to sum up the values that it encounters with each iteration.
mylist = [1, 2, 3]
my_sum = 0
for val in mylist:
    my_sum = my_sum + val
print(f'sum: {my_sum}')     # sum: 6 (value of 1 + 2 + 3)
- The value of my_sum is initialized at 0 before the loop begins.
- Then, since the summing line my_sum = my_sum + val is inside the looping block, the value of my_sum grows by val with each iteration.
- (Please note that the sum() function can sum list items more efficiently, but we are using a summing variable to demonstrate the summing technique, which can be used in situations where sum() can't be used, as when we are summing values from a file -- discussed shortly.)
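As a quick check of the note above, this sketch compares the summing variable with sum() on the same list. The name builtin_sum is introduced here only for the comparison.

```python
mylist = [1, 2, 3]

my_sum = 0
for val in mylist:
    my_sum = my_sum + val       # add each item to the running total

builtin_sum = sum(mylist)       # sum() gives the same answer directly

print(my_sum)                   # 6
print(builtin_sum)              # 6
```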
the open() function and the 'file' object
The 'file' object represents a connection to a file that is saved on disk.
fh = open('students.txt') # a 'file' object
print(type(fh)) # <class '_io.TextIOWrapper'>
- The open() function causes Python to ask the operating system to access the file.
- If the file exists and is readable, the operating system will create a connection to the file and give Python access to it.
- We will be able to use this object to read the file data into our program.
- (The actual type of the object is _io.TextIOWrapper, but we will call it a 'file' object.)
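The points above say the connection is made only if the file exists and is readable. The sketch below shows one way to see what happens otherwise: open() raises FileNotFoundError for a missing file. It uses try/except, which may not have been introduced yet, and the path is a made-up name inside a fresh temporary directory so that it cannot already exist.

```python
import os
import tempfile

# hypothetical path in a brand-new temporary directory (guaranteed missing)
demo_dir = tempfile.mkdtemp()
missing_path = os.path.join(demo_dir, 'no_such_file.txt')

try:
    fh = open(missing_path)     # Python asks the OS to access the file
    opened = True
    fh.close()
except FileNotFoundError:
    opened = False              # the OS could not find the file
    print('could not open: file does not exist')

os.rmdir(demo_dir)              # clean up the empty temporary directory
print(opened)                   # False
```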
reading a file with the for statement
for with a 'file' object repeats its block as many times as there are lines in the file.
fh = open('students.txt') # file object allows looping
# through a series of strings
for xx in fh:                     # xx is a string, a line from the file;
    print(xx)                     # this prints each line of students.txt
fh.close() # close the file
- xx is the control variable, and it is automatically assigned each line in the file, as a string.
- Again, the control variable xx is reassigned for each iteration of the loop.
- This means that if the file has 5 lines, the loop executes 5 times and xx is reassigned a new value (a new line of the file) 5 times.
- break and continue work with for as well as while loops.
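Here is a minimal sketch of break and continue inside a for loop over a file. To stay self-contained it first writes its own small sample file; the file name, its contents, and the 'STOP' marker are all assumptions made for this example.

```python
import os
import tempfile

# build a small sample file so the sketch runs on its own
# (file name and contents are hypothetical)
path = os.path.join(tempfile.mkdtemp(), 'students_demo.txt')
fh = open(path, 'w')
fh.write('alice\n\nbob\nSTOP\ncarol\n')
fh.close()

names = []
fh = open(path)
for xx in fh:
    xx = xx.rstrip()        # remove the trailing newline
    if xx == '':
        continue            # blank line: skip to the next line
    if xx == 'STOP':
        break               # marker line: leave the loop early
    names.append(xx)
fh.close()

print(names)                # ['alice', 'bob']
```

The line 'carol' is never reached, because break ends the loop as soon as 'STOP' is read.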
summarizing: csv parsing with for looping and string parsing
Here we put together all features learned in this session.
fh = open('revenue.csv') # 'file' object
counter = 0
summer = 0.0
for line in fh:                     # str, "Haddad's,PA,239.50\n" (first line from file)
    line = line.rstrip()            # str, "Haddad's,PA,239.50"
    fieldlist = line.split(',')     # list, ["Haddad's", 'PA', '239.50']
    rev_val = fieldlist[2]          # str, '239.50'
    f_rev = float(rev_val)          # float, 239.5
    counter = counter + 1           # incrementing once for each iteration
    summer = summer + f_rev         # adding the value found at each iteration to a sum
fh.close()
print(f'counter: {counter}') # 7 (number of lines in file)
print(f'summer: {summer}') # 662.01000001 (sum of all 3rd col values in file)
- This example puts together everything we learned in this session.
- Each line is a string, which gets stripped and split into fields; the third field (the last item on the line) is then converted to a float.
- We then use a summing variable to sum up the values found on each line.
- If we wish, we can derive an average value at the end by dividing summer by counter.
- (Note that the tiny floating-point remainder in the sum is expected, and the result can be rounded to 2 places.)
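The last two points can be sketched as follows, using round() for the rounding. Since revenue.csv is not available here, the sketch writes its own three-row sample file; the file name and every row in it are hypothetical, so the totals differ from those shown above.

```python
import os
import tempfile

# write a small sample file so the sketch runs on its own
# (file name and rows are hypothetical, not the course's revenue.csv)
path = os.path.join(tempfile.mkdtemp(), 'revenue_demo.csv')
fh = open(path, 'w')
fh.write("Haddad's,PA,239.50\n")
fh.write("Westfield,NJ,53.25\n")
fh.write("The Store,NJ,211.75\n")
fh.close()

counter = 0
summer = 0.0
fh = open(path)
for line in fh:
    fieldlist = line.rstrip().split(',')        # strip, then split into fields
    counter = counter + 1                       # count the line
    summer = summer + float(fieldlist[2])       # add the third field to the sum
fh.close()

average = summer / counter                      # sum divided by count

print(round(summer, 2))     # 504.5
print(round(average, 2))    # 168.17
```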
sidebar: writing and appending to files using the file object
Files can be opened for writing or appending; we use the 'file' object and the file .write() method.
fh = open('new_file.txt', 'w')
fh.write("here's a line of text\n")
fh.write('I add the newlines explicitly if I want to write to the file\n')
fh.close()
Note that we are explicitly adding newlines to the end of each line -- the write() method doesn't do this for us.
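The example above shows write mode ('w') only. As a minimal sketch of appending, the code below opens the same file a second time with mode 'a', which adds to the end of the file instead of replacing its contents. The file name and the text written are made up for this example, and the file is placed in a temporary directory.

```python
import os
import tempfile

# hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'new_file_demo.txt')

fh = open(path, 'w')            # 'w' creates the file (or empties an existing one)
fh.write('first line\n')
fh.close()

fh = open(path, 'a')            # 'a' keeps existing contents and adds to the end
fh.write('second line\n')
fh.close()

lines = []
fh = open(path)                 # read the file back to see both lines
for line in fh:
    lines.append(line.rstrip())
fh.close()

print(lines)                    # ['first line', 'second line']
```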