Introduction to Python
davidbpython.com
This technique forms the core of much of what we do in Python.
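In order to work with data, the usual steps are to extract it from a source such as a file, transform it into the form we need, and load the result into its destination.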
We call this process Extract-Transform-Load, or ETL. ETL is at the heart of what core Python does best.
Similar to the counting and summing algorithm, this one collects values instead.
# build a list of company names
company_list = []                    # empty list
fh = open('revenue.csv')             # 'file' object
for line in fh:                      # str, "Haddad's,PA,239.50\n"
    line = line.rstrip()             # str, "Haddad's,PA,239.50"
    items = line.split(',')          # list, ["Haddad's", 'PA', '239.50']
    company_list.append(items[0])    # list, ["Haddad's"]
print(company_list)                  # ["Haddad's", 'Westfield', 'The Store', "Hipster's",
                                     #  'Dothraki Fashions', "Awful's", 'The Clothiers']
fh.close()
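The same collecting pattern works for any column. As a minimal sketch, the block below collects the amounts as floats instead of the company names. It uses an in-memory io.StringIO object as a stand-in for revenue.csv so it can run without the file; only the first row comes from the example above, and the other rows are invented for illustration.

```python
import io

# Hypothetical stand-in for revenue.csv: only the first row appears in
# the example above; the other two rows are made up for this sketch.
sample = io.StringIO(
    "Haddad's,PA,239.50\n"
    "Westfield,NJ,53.90\n"
    "The Store,NJ,211.50\n"
)

amounts = []                           # empty list
for line in sample:                    # str, "Haddad's,PA,239.50\n"
    line = line.rstrip()               # remove the trailing newline
    items = line.split(',')            # list, ["Haddad's", 'PA', '239.50']
    amounts.append(float(items[2]))    # convert to float, then collect

print(amounts)                         # [239.5, 53.9, 211.5]
print(max(amounts))                    # 239.5
```

Once the values are in a list, functions such as max(), min() and len() can summarize them in a single call.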
Ex. 5.14 - 5.15
This program uses a set to collect unique items from repeating data.
state_set = set()                    # empty set
fh = open('revenue.csv')             # 'file' object
for line in fh:                      # str, "Haddad's,PA,239.50\n"
    items = line.split(',')          # list, ["Haddad's", 'PA', '239.50\n']
    state_set.add(items[1])          # set, {'PA'}
print(state_set)                     # set, {'PA', 'NY', 'NJ'} (your order may be different)
chosen_state = input('enter a state: ')
if chosen_state in state_set:
    print(f'{chosen_state} found in the file')
else:
    print(f'{chosen_state} not found')
fh.close()
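To see the de-duplication on its own, here is a small sketch that runs without the file or user input. The rows are illustrative stand-ins for revenue.csv (only the first row appears in the text above), and sorted() is used so the printed order is predictable.

```python
import io

# Hypothetical stand-in for revenue.csv; rows after the first are invented.
sample = io.StringIO(
    "Haddad's,PA,239.50\n"
    "Westfield,NJ,53.90\n"
    "The Store,NJ,211.50\n"
    "Hipster's,NY,11.98\n"
)

state_set = set()                    # empty set
for line in sample:
    items = line.split(',')          # list, ["Haddad's", 'PA', '239.50\n']
    state_set.add(items[1])          # adding 'NJ' a second time has no effect

unique_states = sorted(state_set)    # sorted() returns a list in a fixed order
print(len(state_set))                # 3
print(unique_states)                 # ['NJ', 'NY', 'PA']
```

Four rows go in, but only three states come out, because a set silently ignores a value it already holds.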
Ex. 5.22 - 5.23
A file is automatically closed upon exiting the 'with' block.
A 'best practice' is to open files using a 'with' block. When execution leaves the block, the file is automatically closed.
with open('pyku.txt') as fh:
    for line in fh:
        print(line)
# At this point (once outside the with block), filehandle fh
# has been closed. There is no need to call fh.close().
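One way to confirm the automatic close is to check the file object's closed attribute inside and outside the block. This is a sketch only: it first writes a one-line temporary file (the path and its contents are assumptions made for the demonstration) so that it can run anywhere.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# The temporary path is an assumption of this sketch.
path = os.path.join(tempfile.mkdtemp(), 'pyku.txt')
with open(path, 'w') as wfh:
    wfh.write("We're out of gouda.\n")

with open(path) as fh:
    first_line = fh.readline()       # read one line while the file is open
    open_inside = fh.closed          # False: still inside the with block

closed_after = fh.closed             # True: the with block closed the file
print(open_inside, closed_after)     # False True
```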
However, we should understand the minimal cost of not closing our files:
Once we have read a file as a single string, we can "chop it up" any way we like.
# read(): file text as a single string
fh = open('guido.txt') # 'file' object
text = fh.read() # read() method called on
# file object returns a string
fh.close() # close the file
print(text)
print(len(text)) # 207 (number of characters in the file)
# single string, entire text:
# 'For three months I did my day job, \nand at night and
# whenever I got a \nchance I kept working on Python. \n
# After three months I was to the \npoint where I could
# tell people, \n"Look here, this is what I built."'
String .split() on a whole file string returns a list of words.
file_text = """For three months I did my day job,
and at night and whenever I got a
chance I kept working on Python.
After three months I was to the
point where I could tell people,
"Look here, this is what I built." """
words = file_text.split() # split entire file on whitespace (spaces or newlines)
print(words)
# ['For', 'three', 'months', 'I', 'did', 'my', 'day', 'job,',
# 'and', 'at', 'night', 'and', 'whenever', 'I', 'got', 'a',
# 'chance', 'I', 'kept', 'working', 'on', 'Python.', 'After',
# 'three', 'months', 'I', 'was', 'to', 'the', 'point', 'where',
# 'I', 'could', 'tell', 'people,', '"Look', 'here,', 'this',
# 'is', 'what', 'I', 'built."']
print(len(words)) # 42 (number of words in the file)
String .splitlines() will split any string on the newlines, delivering a list of lines from the file.
file_text = """For three months I did my day job,
and at night and whenever I got a
chance I kept working on Python.
After three months I was to the
point where I could tell people,
"Look here, this is what I built." """
lines = file_text.splitlines()
print(lines)
# ['For three months I did my day job, ', 'and at night and whenever I got a ',
# 'chance I kept working on Python. ', 'After three months I was to the ',
# 'point where I could tell people, ', '"Look here, this is what I built."']
print(len(lines)) # 6 (number of lines in the file)
Combining read() with .splitlines() delivers a list of lines from a file, with the newlines removed.
fh = open('pyku.txt') # 'file' object
file_text = fh.read() # entire file as a single string
fh.close() # close the file
lines = file_text.splitlines()
print(lines)
# ["We're out of gouda.", 'That parrot has ceased to be.',
# 'Spam, spam, spam, spam, spam.']
print(len(lines)) # 3 (number of lines in the file)
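It may help to see how .splitlines() differs from splitting on '\n' directly. In this small sketch the sample text is an assumption (two lines, each ending in a newline, as file text usually does).

```python
# Illustrative text: two lines, each terminated by a newline.
text = "We're out of gouda.\nThat parrot has ceased to be.\n"

by_splitlines = text.splitlines()    # no empty string at the end
by_split = text.split('\n')          # final newline leaves an empty string

print(by_splitlines)   # ["We're out of gouda.", 'That parrot has ceased to be.']
print(by_split)        # ["We're out of gouda.", 'That parrot has ceased to be.', '']
```

Because of that trailing empty string, len() on the .split('\n') result would overcount the lines by one, which is why .splitlines() is the more convenient choice here.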
Ex. 5.27 -> 5.29
for: read line by line (a newline ('\n') marks the end of each line)
fh = open('students.txt')    # file object allows looping
                             # through a series of strings
for my_file_line in fh:      # my_file_line is a string
    print(my_file_line)      # prints each line of students.txt
fh.close()                   # close the file
read(): read entire file as a single string
fh = open('students.txt') # file object allows reading
text = fh.read() # read() method called on file
# object returns a string
fh.close() # close the file
print(text) # entire text as a single string
readlines(): read as a list of strings (each string a line)
fh = open('students.txt')
file_lines = fh.readlines() # file.readlines() returns
# a list of strings
fh.close() # close the file
print(file_lines) # entire text as a list of lines
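The three approaches can be compared side by side. The sketch below writes a two-line temporary file (its path and contents are assumptions for the demonstration) and reads it back each way.

```python
import os
import tempfile

# Throwaway file so the comparison can run anywhere; contents are made up.
path = os.path.join(tempfile.mkdtemp(), 'students.txt')
with open(path, 'w') as wfh:
    wfh.write('line one\nline two\n')

by_loop = []
with open(path) as fh:
    for line in fh:                  # one string per line, newline included
        by_loop.append(line)

with open(path) as fh:
    text = fh.read()                 # the whole file as a single string

with open(path) as fh:
    file_lines = fh.readlines()      # list of strings, newlines included

print(by_loop)                       # ['line one\n', 'line two\n']
print(file_lines == by_loop)         # True
print(text.splitlines())             # ['line one', 'line two']
```

Looping and readlines() both keep the trailing '\n' on each line, while read() followed by .splitlines() drops it.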
We have no need to write to a file in this course, but it's important to know how.
wfh = open('newfile.txt', 'w') # open for writing
# (will overwrite an existing file)
wfh.write('this is a line of text\n')
wfh.write('this is a line of text\n')
wfh.write('this is a line of text\n')
wfh.close()
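Writing also works inside a 'with' block, and opening with 'a' instead of 'w' appends rather than overwrites. A minimal sketch, assuming a temporary path so that no existing file is touched:

```python
import os
import tempfile

# Temporary location chosen for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'newfile.txt')

with open(path, 'w') as wfh:         # 'w': create the file, or overwrite it
    wfh.write('first line\n')

with open(path, 'a') as afh:         # 'a': add to the end of the file
    afh.write('second line\n')

with open(path) as fh:               # read it back to check the result
    written = fh.read().splitlines()

print(written)                       # ['first line', 'second line']
```

Note that write() does not add a newline for us, so each string ends with '\n' explicitly.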
The range() function allows us to iterate over an integer sequence.
counter = range(10)
for i in counter:
    print(i)                 # prints integers 0 through 9
for i in range(3, 8):        # prints integers 3 through 7
    print(i)
If we need a literal list of integers, we can simply pass the range object to list():
intlist = list(range(5))
print(intlist) # [0, 1, 2, 3, 4]
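range() also accepts an optional third argument, the step. The short sketch below uses list() as above to make the resulting sequences visible; the variable names are only illustrative.

```python
evens = list(range(0, 10, 2))        # start at 0, stop before 10, step by 2
countdown = list(range(5, 0, -1))    # a negative step counts downward

print(evens)                         # [0, 2, 4, 6, 8]
print(countdown)                     # [5, 4, 3, 2, 1]
```

In every form the stop value itself is excluded, just as range(3, 8) ends at 7.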