Python 3





Introduction; Installations and Setup

class goals


- hello and welcome! this is Python Programming
- course is practically focused
- all examples build towards practical skills
- learn to think like a coder
- important to pay close attention to how we solve problems
- in fact, our main goal in this class is getting to know the Interpreter


about python

Python's popularity is due to its elegance and simplicity.



Guido points out that we spend much more time reading code than writing it


the zen of python

This is the manifesto of the Python language.


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


about me: David Blaikie

I am dedicated to student success.



about you: welcome!

Prior exposure to Python is helpful, but not required.


You do not have to know anything about Python or programming, but some personal qualities will be very helpful. These are "soft skills" that will benefit you greatly as you proceed:


Thinking like a coder means keeping certain understandings in mind. Throughout this course, and especially at the start, I will emphasize these skills.


three technical requirements to write and run programs

If you already have an editor and Python installed, you do not need to add the editor or Python.



Please keep in mind that if you are already able to write and run Python programs, you only need to add the class files.


configuring VS Code

I personally feel that suggestions and popups are more of a distraction than a help.



To suppress some suggestions in VS Code:
1. go to your Settings (on Windows, File > Settings; on Mac, Code > Settings)
2. in the search blank, type 'suggestions' (no quotes)
3. check the 'Suppress Suggestions' box
4. set all three 'Quick Suggestions' to Value 'off'
5. you may also want to increase your font: search for 'Font' and set it to a higher number


the course materials

The zip file contains all files needed for our course exercises.


1. Please look for the file called python_data.zip in your course files.
2. Unzip the folder so that it has the following structure:


python_data/
├── 01/
│   ├── 1.1.py
│   ├── 1.2.py
│   ├── ..etc
│   ├── solutions/
├── 02
│   ├── 2.1.py
│   ├── 2.2.py
│   ├── ..etc
│   ├── solutions/
├── 03
├── 04
├── ..etc
├── 13
└── dir1

3. Place this folder in a location where you can find it.
4. Later in this session we'll open and explore the folder.




Your New Partner: the Python Interpreter

what do computers do?

Computers can do many different things for us.


Think about what our computers do for us:


what do computers really do?

At base, computers really only do three things.



Python can do many things, but we will focus on the first item -- working with data. The main purpose of any programming language is to allow us to store data in memory and then to process that data according to our needs.


programming languages

A programming language like Python is designed to allow us to give instructions to our computer.



the Python Interpreter

The Interpreter is Python itself.



evaluate - compile - run

When we run a Python program, the Interpreter takes these three steps.



what the interpreter can do

Python is very smart in some ways.



what the interpreter can't do

Python is not smart in some ways, too!



how to respond to exceptions (errors)

We should seek to understand what the Interpreter is telling us.




This learning is not just about making programs work -- it's about understanding the interpreter -- what it can and can't do.




Executing Programs and Using the Lab Exercises

creating a new script (.py file) in VS Code

A 'workspace' in VS Code is usually the same as a 'project'.


Open a Folder, which will correspond to a new workspace.


Add a new file.

Create a 'hello, world!' script.
print('hello, world!')
print()
Take care when reproducing the above script - every character must be in its place. (The print() at end is to clarify the Terminal output.) Next, we'll execute the script. scripts vs. programs


executing a script

VS Code may be able to run your script, or some configuration may be required.


Attempt to run your script.


understanding terminal output in VS Code

By default, VS Code passes your code to the Python Interpreter and executes it at the command line.


On my Mac, I see this output:

->  /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->  (base) DavidBs-MacBook-Pro:test_project david% /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->  hello, world!

->  (base) DavidBs-MacBook-Pro:test_project david%


when programs run without error in VSCode terminal

'Without error' means Python did everything you asked.


On my Mac, I see this output:

->  /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->  (base) DavidBs-MacBook-Pro:test_project david% /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->  hello, world!

->  (base) DavidBs-MacBook-Pro:test_project david%


when exceptions occur

An 'exception' is raised when Python cannot, or will not, do everything you asked in your program, or doesn't understand what the code means (e.g., because of a SyntaxError).


To demonstrate an exception, I removed one character from my code. Here is the result:

->  /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->  (base) DavidBs-MacBook-Pro:test_project david% /Users/david/miniconda3/bin/python /Users/david/test_project/test.py
->   File "/Users/david/test_project/test.py", line 2
->     print('hello, world!)
          ^
-> SyntaxError: unterminated string literal (detected at line 2)

(Again, the arrows above are just indicating where each line of output starts - they will not appear in your output.) How should we read our exception?


Throughout this course I will repeatedly call upon you to identify the exception type, pinpoint the error to the line, and seek to understand the error in terms of where Python says it occurred.
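
To practice reading an exception, it can help to reproduce one in a controlled way. The sketch below is not part of the course files. It assumes the script's first line is blank, so that the error lands on line 2 as in the report above, and it uses the built-in compile() function to check the syntax of some source text without running it. The variable names (source, exc_type, exc_line) are made up for illustration, and try/except is used only to capture the error so we can inspect it.

```python
# Source text with a missing closing quote on its second line
# (the first line is blank, an assumption made to match "line 2" above).
source = "\nprint('hello, world!)\nprint()\n"

try:
    compile(source, 'test.py', 'exec')   # check the syntax without running the code
except SyntaxError as err:
    exc_type = type(err).__name__        # the exception type
    exc_line = err.lineno                # the line where Python detected the problem
    print(exc_type, 'at line', exc_line) # SyntaxError at line 2
```

The two values printed here are the same two facts to look for in any exception report: the type of the exception and the line where Python says it occurred.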


the SyntaxError exception

Some element of the code is misplaced or missing.


print('hello, world!)
print()

File "/Users/david/test_project/test.py", line 2
  print('hello, world!)
        ^
SyntaxError: unterminated string literal (detected at line 2)

How do we respond to a SyntaxError? First by understanding that there's something missing or out of place in the syntax (the proper placement of language elements -- brackets, braces, parentheses, quotes, etc.) We look at the syntax on the line, and compare it to similar examples in other code that we've seen. Careful comparison between our code and working code will usually show us what's missing or misplaced. In the example above, the first print() statement is missing a quotation mark. It might be hard to see at first, but eventually you will develop "eyes" for this kind of error. pythonreference.com


writing code: comments and blank lines

Use hash marks to comment individual lines; blank lines are ignored.


# this program adds numbers
var1 = 5
var2 = 2

# add these numbers together
var3 = var1 + var2

# these lines are 'commented out'
# var3 = var3 * 2
# var3 = var3 / 20

print(var3)


Opening the Lab Exercises in VS Code

The exercises should be opened as a single folder in VS Code.


python_data
├── 01
├── 02
│   ├── 2.1.py
│   ├── 2.2.py
│   ├── ..etc
│   ├── 2.6_lab.py
│   ├── 2.7_lab.py
│   ├── ..etc
│   └── solutions/


Using the Lab Exercises

We will use some exercises for demos in class; you will use them to practice your skills, and prepare for tests.


├── 02
│   ├── 2.1.py       <-- 'journey' exercise
│   ├── 2.2.py
│   ├── ..etc
│   ├── 2.6_lab.py   <-- 'lab' (practice) exercise
│   ├── 2.7_lab.py
│   ├── ..etc
│   └── solutions/

The exercises come in two forms:




Creating and Identifying Objects by Type

the variable

A variable is a name that is assigned ("bound") to an object.


xx = 10               # assign 10 to xx
yy = 2

zz = xx * yy          # compute 10 * 2 and assign integer 20 to variable zz

print(zz)             # print 20 to screen

- xx is a variable, bound to 10
- = is an assignment operator, assigning 10 to xx
- yy is another variable, bound to 2
- * is a multiplication operator, computing the product of its operands (10 and 2)
- zz is bound to the product, 20
- print() is a function that renders its argument to the screen


the literal: a value typed into our code

early on we need to distinguish between a variable and a literal.


xx = 10               # assign 10 to xx
yy = 2

zz = xx * yy          # compute 10 * 2 and assign integer 20 to variable zz

print(zz)             # print 20 to screen



the object

An object is a data value of a particular type.


Every data value in Python is an object.


var_int = 100                  # assign integer object 100 to variable var_int

var2_float = 100.0             # assign float object 100.0 to variable var2_float

var3_str = 'hello!'            # assign str object 'hello!' to variable var3_str

At every point you must be aware of the type and value of every object in your code.


object types for this session

The three object types we'll look at in this unit are int, float and str. They are the "atoms" of Python's data model.


data type   known as   description                        example value
int         integer    a whole number                     5
float       float      a floating-point number            5.03
str         string     a character sequence, i.e. text    'hello, world!'


sidebar: string literal syntax

The string may be bounded by 3 different quotation marks -- all produce a string.


s1 = 'hello, quote'          # single quotes
s2 = "hello, quote"          # double quotes

# triple quotes:  put quotes around multiple lines
s3 = """hello, quote
Sincerely, Python"""


s4 = 'He said "yes!"'               # using single quotes to include double quotes
s5 = "Don't worry about that."      # using double quotes to include a single quote
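
As a quick check of the point above, this short sketch (not from the course files) shows that the kind of quote used is not part of the string's value, and that a triple-quoted string keeps its line break.

```python
s1 = 'hello, quote'          # single quotes
s2 = "hello, quote"          # double quotes

print(s1 == s2)              # True: both literals produce the same str value

s3 = """hello, quote
Sincerely, Python"""

print(s3)                    # prints two lines: the line break is part of the string
```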



identifying type through syntax

The way a value is written in the code determines its type.


It's vital that we always be aware of type.


a = 5.0         # float (written with a decimal point)
b = '5.0'       # str   (written with quotes)
c = 5           # int   (written as a whole number)

Other languages (like Java and C) use explicit type declarations to indicate type, for example int c = 5. But Python does not do this. Instead, it relies on the syntax of the literal (whole number, floating-point, quotation marks, etc.)


can we identify type through printing?

Printing is usually not enough to determine type, since a string can look like any object.


a = 5.0
b = '5.0'
c = 5

print(a)         # 5.0
print(b)         # 5.0
print(c)         # 5

b looks like a float, but it is a str.
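
One way to see the difference at print time is the built-in repr() function, which is not covered in these slides. It returns the value as it would be written in code, so a string keeps its quotes. A minimal sketch:

```python
a = 5.0
b = '5.0'

print(a)              # 5.0
print(b)              # 5.0

print(repr(a))        # 5.0
print(repr(b))        # '5.0'   (the quotes reveal that b is a str)
```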


identifying type through the type() function

If we're not sure, we can always have Python tell us an object's type.


a = 5.0
b = '5.0'
c = 5

print(type(a))         # <class 'float'>
print(type(b))         # <class 'str'>
print(type(c))         # <class 'int'>

exercise 2.1


python is strongly typed

This means that what an object can do is defined by its type.


a = 5            # int, 5
b = 10.0         # float, 10.0
c = '10.0'       # str, '10.0'

x = a + b        # 15.0           (adding int to float)

y = a + c        # TypeError      (cannot add int to str!)
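
When a TypeError like the one above comes up, the usual response is to convert one of the values so that both operands have compatible types. This sketch uses the float() and str() conversion functions that appear later in this unit. Which conversion is right depends on whether you want arithmetic or text.

```python
a = 5                  # int, 5
c = '10.0'             # str, '10.0'

x = a + float(c)       # convert the str to a float, then add:  float, 15.0
y = str(a) + c         # convert the int to a str, then concatenate:  str, '510.0'

print(x)               # 15.0
print(y)               # 510.0
```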


variable naming rules

You must follow correct style even though Python does not always require it.


name = 'Joe'
age = 29

my_wordy_variable = 100

student3 = 'jg61'




Math and String Operators

+, -, *, /: math operators

Math operators behave as you might expect.


var_int = 5
var2_float = 10.3

var3_float = 5 + 10.3       # int plus a float:  15.3, a float

var4_float = 10.3 - 0.3     # float minus a float:  10.0, a float

var5_float = 15.0 / 3       # float divided by an int:  5.0, a float


Ex. 2.2


identifying type through an operation

Every operation or function call results in a predictable type.


With two integers, the result is an integer. If a float is involved, the result is always a float.

var = 7
var2 = 3
varf = 3.0

var3 = var * var2      # 21, an int

var4 = var + varf      # 10.0, a float

However, when one integer is divided by another integer, the result is always a float, even if there is no remainder.

var = 6
var2 = 3

var3 = var / var2      # 2.0, a float
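
For comparison, Python also has a floor division operator (//), which is not covered on this slide. With two integers it returns an int, dropping any remainder. A short sketch showing both operators side by side:

```python
var = 7
var2 = 2

var3 = var / var2       # 3.5, a float   (division always gives a float)
var4 = var // var2      # 3, an int      (floor division drops the remainder)
var5 = 6 // 3           # 2, an int

print(var3, var4, var5) # 3.5 3 2
```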


pythonreference


** exponentiation operator

The exponentiation operator (**) raises its left operand to the power of its right operand and returns the result as a float or int.


var = 11 ** 2     # "eleven raised to the 2nd power (squared)"
print(var)        # 121

var = 3 ** 4
print(var)        # 81
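
Whether the result is an int or a float depends on the operands. The following sketch (variable names are arbitrary) shows the common cases:

```python
a = 2 ** 3        # int, 8       (int raised to a non-negative int)
b = 2.0 ** 3      # float, 8.0   (a float operand gives a float)
c = 2 ** -1       # float, 0.5   (a negative exponent gives a float)
d = 9 ** 0.5      # float, 3.0   (a fractional exponent: here, the square root)

print(a, b, c, d) # 8 8.0 0.5 3.0
```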

% Modulus Operator

The modulus operator (%) shows the remainder that would result from division of two numbers.


var = 11 % 2      # "eleven modulo two"
print(var)        # 1   (11/2 is 5, with a remainder of 1)


var2 = 10 % 2     # "ten modulo two"
print(var2)       # 0   (10/2 is 5, with a remainder of 0)
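
A common use of the modulus operator is testing whether a number is even or odd. The sketch below also uses the floor division operator (//), which is not covered on this slide, to show how the whole-number part and the remainder fit together. The variable names are made up for illustration.

```python
num = 11

quotient = num // 2                  # int, 5   (whole-number part of 11 / 2)
remainder = num % 2                  # int, 1   (what is left over)

print(quotient * 2 + remainder)      # 11: the two parts rebuild the original number

is_odd = (num % 2 == 1)              # True for 11; an even number would give remainder 0
print(is_odd)                        # True
```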


+ operator with strings: concatenation

The plus operator (+) with two strings returns a concatenated string.


aa = 'Hello, '
bb = 'World!'

cc = aa + bb     # 'Hello, World!'

Note that this is the same operator (+) that is used with numbers for summing. Python uses the type of the operands (values on either side of the operator) to determine behavior and result. Ex. 2.5


* operator with one string and one integer: string repetition

The "string repetition operator" (*) creates a new string with the operand string repeated the number of times indicated by the other operand:

aa = '!'
bb = 5

cc = aa * bb       # '!!!!!'

Note that this is the same operator (*) that is used with numbers for multiplication. Python uses the type of the operands to determine behavior and result. Ex. 2.6
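
A few more cases of string repetition, as a quick sketch (the variable names are arbitrary). The operands can appear in either order, and repeating zero times gives an empty string.

```python
line = '-' * 20            # str, twenty hyphens (handy as a divider in output)
word = 'ab' * 3            # str, 'ababab'
same = 3 * 'ab'            # str, 'ababab' (the operands can be in either order)
empty = 'ab' * 0           # str, '' (an empty string)

print(line)
print(word)
```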


+ operator "overloading"

Object types determine behavior.


int or float "added" to int or float: addition

tt = 5            # assign an integer value to tt
zz = 10.0         # assign a float value to zz

qq = tt + zz      # compute 5 plus 10 and assign float 15.0 to qq

str "added" to str: concatenation

kk = '5'          # assign a str value (quotes mean str) to kk
rr = '10.0'       # assign a str value to rr

mm = kk + rr      # concatenate '5' and '10.0'
                  # to construct a new str object, assign to mm

print(mm)         # 510.0


* operator "overloading"

Again, object types determine behavior.


int or float "multiplied" by int or float: multiplication

tt = 5            # int, 5
zz = 10           # int, 10

qq = tt * zz      # int, 50 (5 * 10)
print(qq)         # 50

str "multiplied" by int: string repetition

aa = '5'          # str, '5'
bb = 3            # int, 3

cc = aa * bb      # str, '555' ('5' * 3)




studying for the quizzes and the midterm and final exam

the exercises and weekly assignments are practice for the exams





Built-In Functions

built-in functions

Built-in functions activate functionality when they are called.


aa = 'hello'        # str, 'hello'

bb = len(aa)        # pass string object aa as an argument to function len(),
                    # which returns an integer object as a return value.

print(bb)            # int, 5


len() function

The len() function takes a string argument and returns an integer -- the length of (number of characters in) the string.


varx = 'hello, world!'

vary = len(varx)        # int, 13

pythonreference


round() function

The round() function takes a float argument and returns a float rounded to the specified number of decimal places. With no second argument, it returns an int rounded to the nearest whole number.


aa = 5.9583

bb = round(aa, 2)     # float, 5.96

cc = round(aa)        # int, 6


float precision and the round() function

Some floating-point operations will result in a number with a small remainder:

x = 0.1 + 0.2
print(x)            # 0.30000000000000004  (should be 0.3?)

y = 0.1 + 0.1 + 0.1 - 0.3
print(y)            # 5.551115123125783e-17  (should be 0.0?)

This remainder reflects the imprecision of binary floating-point numbers. Many decimal fractions (such as 0.1) cannot be represented exactly in binary, so small errors can appear in results, although some programs (like Excel) may hide them in what they display.


The solution when using Python is to round any result:

x = 0.1 + 0.2       # 0.30000000000000004

z = round(x, 1)
print(z)            # 0.3
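
The same imprecision matters when comparing floats with ==. As a sketch, here are two ways to make such a comparison behave as expected: rounding first, as above, or using math.isclose() from the standard library (not covered in these slides), which tests whether two values are equal within a small tolerance.

```python
import math

x = 0.1 + 0.2

print(x == 0.3)                  # False: x carries a tiny remainder
print(round(x, 2) == 0.3)        # True: rounding removes the remainder
print(math.isclose(x, 0.3))      # True: equal within a small tolerance
```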

input() function

This function allows us to enter data into the program through the keyboard.


cc = input('enter name:  ')    # program pauses!  Now the user types something

print(cc)                      # [a string, whatever the user typed]


exit() function: terminate the program

The exit() function terminates execution immediately. An optional string argument can be passed as an error message.


exit(0)             # 0 indicates a successful termination (no error)

exit('error!  here is a message')     # string argument passed to exit()
                                      # indicates an error led to termination

exit() to manipulate execution during development

This function can be used as a temporary stop to the program if we'd like to isolate some statements.


We can also use exit() to simply stop program execution in order to debug:

aa = '55'
bb = float(aa)
print('type of bb is:')
print(type(bb))

exit()                  # we inserted this to stop the code
                        # from continuing; we'll remove it later

cc = bb * 2             # because of exit() above, this code
                        # will not be reached

int() "conversion" function

This function can convert a str or float to the int type.


# str -> int
aa = '55'
bb = int(aa)         # int, 55
print(type(bb))      # <class 'int'>

# float -> int
var = 5.95
var2 = int(var)      # int, 5: the remainder is lopped off (not rounded)
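
To make the "not rounded" point concrete, this sketch compares int() with round(), and shows one way to handle a str that contains a decimal point (int() cannot read such a string directly, so it is converted with float() first).

```python
print(int(5.95))             # 5    (fractional part dropped)
print(int(-5.95))            # -5   (dropped toward zero, not rounded down)
print(round(5.95))           # 6    (round() gives the nearest whole number)

# int('5.95') would raise a ValueError, so convert to float first
print(int(float('5.95')))    # 5
```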


float() "conversion" function

This function converts an int or str to the float type.


# int -> float
xx = 5
yy = float(xx)       # float, 5.0

# str -> float
var = '5.95'
var2 = float(var)    # float, 5.95

str() "conversion" function

This function converts any value to the str type.


var = 5              # int, 5
var2 = 5.5           # float, 5.5

svar = str(var)      # str, '5'
svar2 = str(var2)    # str, '5.5'

Any object type can be converted to str. ex. 2.12 - 2.16


conversion challenge: treating a string like a number

Because Python is strongly typed, conversions can be necessary.


Numeric data sometimes arrives as strings (e.g. from input() or a file). Use int() or float() to convert to numeric types.


aa = input('enter number and I will double it:  ')

print(type(aa))         # <class 'str'>

num_aa = int(aa)        # int() takes the string as an argument
                        # and returns an integer

print(num_aa * 2)       # prints the input number doubled

You can use int() and float() to convert strings to numbers.
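
To see why the conversion matters, here is a sketch that does not wait for keyboard input. The variables aa and bb simply stand in for text that input() would return. Without conversion, the * operator repeats the string instead of doing arithmetic.

```python
aa = '21'                  # stands in for text returned by input()

print(aa * 2)              # 2121   (str repetition, not arithmetic)
print(int(aa) * 2)         # 42     (int multiplication)

bb = '2.5'                 # a str holding a decimal number
print(float(bb) * 2)       # 5.0    (use float() when the text has a decimal point)
```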


beginner's tip: avoid improvising syntax!

Just starting out, some students improvise syntax that doesn't exist.


Imagine that you would like to find the length of a string. What do you do? Some students begin writing code from memory, even though they are not completely familiar with the right syntax.


they may write something like this...

var = 'hello'

mylen = var.len()      # or mylen = length('var')
                       # or mylen = lenth(var)

...and then run it, only to get a strange error that's difficult to diagnose. The solution is to never improvise syntax. Instead, always start with an existing example.
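
For reference, this sketch shows which exceptions two of the improvised attempts above raise. The try/except syntax is used here only to capture each error so the program can keep going, and the names first_error and second_error are made up for illustration.

```python
var = 'hello'

try:
    mylen = var.len()            # improvised: str has no .len() method
except AttributeError as err:
    first_error = type(err).__name__
    print(first_error)           # AttributeError

try:
    mylen = lenth(var)           # improvised: no function named lenth exists
except NameError as err:
    second_error = type(err).__name__
    print(second_error)          # NameError

mylen = len(var)                 # the documented syntax
print(mylen)                     # 5
```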


beginner's tip: use existing examples of a feature to write new code using it

When you want to use a Python feature, you must follow an existing example!


Let's say you have a string and you'd like to get its length:

s = "this is a string I'd like to measure"

You look up the function in a reference, like pythonreference.com:

mylen = len('hello')

Then you use the feature syntax very carefully:

slen = len(s)           # int, 36

However, the code you write may be slightly different than the example code:


review: distinguish between variables and string literals

early on we need to distinguish between a variable and a literal.


xx = 10          # int, 10
yy = 2           # int, 2

zz = xx * yy     # int, 20

print(zz)



example: confusing a string literal with a variable name

Here's an example of this common error that beginners make - try to avoid it!


Going back to our previous example - you'd like to use len() to measure this string:

s = "this is a string I'd like to measure"

You look up the function in a reference, like pythonreference.com:

mylen = len('hello')

You have been told to make your syntax match the example's. But should you do this?

slen = len('s')            # int, 1

You were expecting a length of 36, but you got a length of 1. Can you see why? The variable s points to a long string. The literal string, 's', is just a one-character string. In trying to match the example code, you may have thought you needed to match the quotes as well. But the example code uses a literal where your code uses a variable, and the two are interchangeable. The takeaway is this: anyplace a literal is used, a variable can be used instead; and anyplace a variable is used, a literal can be used instead. Ex. 2.17 and 2.18 illustrate the difference between a literal and a variable.




Conditionals and Blocks; Object Methods

conditionals: if/elif/else and while

All programs must make decisions during execution.


Consider these decisions by programs you know:


Conditional statements allow any program to make the decisions it needs to do its work.


'if' statement

The if statement executes code in its block only if the test is True.


aa = input('please enter a positive integer: ')
int_aa = int(aa)

if int_aa < 0:                          # test:  is this a True statement?
    print('error:  input invalid')      # block (2 lines) -- lines are
    exit()                              # executed only if test is True

d_int_aa = int_aa * 2
print('your value doubled is ' + str(d_int_aa))

The two components of an if statement are the test and the block. The test determines whether the block will be executed.


'else' statement

An else statement will execute its block if the if test before it was not True.


xx = input('enter a value less than 100:  ')
yy = int(xx)

if yy < 100:
    print(xx + ' is a valid number')
    print('congratulations.')

else:
    print(xx + ' is too high')
    print('please re-run and try again.')

Since else means "otherwise", we can say that only one block of an if/else statement will execute.


'elif' statement

elif is also used with if (and optionally else): you can chain additional conditions for other behavior.


zz = input('type an integer and I will tell you its sign:  ')
zyz = int(zz)

if zyz > 0:
    print('that number is positive')

elif zyz < 0:
    print('that number is negative')

else:
    print('0 is neutral')
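
In an if/elif/else chain, Python tests each condition from the top and runs only the block of the first test that is True. The rest of the chain is skipped. A small sketch with made-up names (score, grade) shows why the order of the tests matters:

```python
score = 85

if score >= 90:
    grade = 'A'
elif score >= 80:        # first True test: this block runs
    grade = 'B'
elif score >= 70:        # also True for 85, but never tested
    grade = 'C'
else:
    grade = 'F'

print(grade)             # B
```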


the python code block

A code block is marked by indented lines. The end of the block is marked by a line that returns to the prior indent.


xx = input('enter a value less than 100:  ')      # not in any block
yy = int(xx)                                      # not in any block

if yy < 100:                               # the start of the 'if' block
    print(xx + ' is a valid number')
    print('congratulations.')              # last line of the 'if' block

else:                                      # the start of the 'else' block
    print(xx + ' is too high')
    print('please re-run and try again.')  # last line of the 'else' block

Note also that a block is preceded by an unindented line that ends in a colon.


nested blocks increase indent

Blocks can be nested within one another. A nested block (a "block within a block") is indented further to the right.


var_a = int(input('enter a number: '))
var_b = int(input('enter another number:  '))

if var_b >= var_a:                                  # 'outer' block
    print("the test was true")
    print("var b is at least as large")

    if var_a == var_b:                              # 'inner' block
        print('the two values are equivalent')

    print("in outer block, not in the inner block")  # back in 'outer' block

print('this gets printed in any case (i.e., not part of either block)')

Decision trees using 'if' and 'else' are a part of most programs.


comparison operators with numbers

>, <, <=, >= tests with numbers work as you might expect.


var = 5
var2 = 3.3

if var >= var2:
    print('var is greater or equal')

if var == var2:
    print('they are equivalent')

== with strings

With strings, this operator tests to see if two strings are identical.


var = 'hello'
var2 = 'hello'

if var == var2:
    print('these are equivalent strings')
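
Keep in mind that "identical" includes upper and lower case. In this sketch, the .lower() string method (introduced in the Object Methods section) is one way to compare two strings while ignoring case.

```python
var = 'Hello'
var2 = 'hello'

print(var == var2)                     # False: 'H' and 'h' are different characters
print(var.lower() == var2.lower())     # True: compare lowercased copies
```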


the in operator with strings

in with strings allows you to see if a 'substring' appears within a string.


article = 'The market rallied, buoyed by a rise in Samsung Electronics.'

if 'Samsung' in article:
    print('Samsung was found')


and "compound" test

Python uses the operator and to combine tests: both must be True.


With an and compound statement, if both tests are True, the entire statement is True.


xx = input('what is your ID?  ')
yy = input('what is your pin?  ')

if xx == 'dbb212' and yy == '3859':
    print('you are a validated user')
else:
    print('you are not validated')

Note the lack of parentheses around the tests -- if the syntax is unambiguous, Python will understand. We can use parentheses to clarify compound statements like these, but they often aren't necessary. Beginners may think they need to put parentheses around some values. You should avoid parentheses wherever you can.


or "compound" test

Python uses the operator or to combine tests: either can be True for the entire expression to be True.


aa = input('please enter "q" or "quit" to quit: ')

if aa == 'q' or aa == 'quit':
    exit()

print('continuing...')

Again, note the lack of parentheses around the tests -- if the syntax is unambiguous, Python will understand. We can use parentheses to clarify compound statements like these, but they often aren't necessary. Beginners may think they need to put parentheses around some values. You should avoid parentheses wherever you can.


testing a variable against two values

Both sides of an 'or' or 'and' must be complete tests.


if aa == 'q' or aa == 'quit':          # not "if aa == 'q' or 'quit'"
    exit()

Note the 'or' test above -- we would not say if aa == 'q' or 'quit'; this would always succeed (for reasons discussed later).
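
A short sketch of what goes wrong. In brief, a non-empty string such as 'quit' counts as True when used as a test, so the incomplete form succeeds no matter what aa holds. The names wrong and right are made up for illustration.

```python
aa = 'hello'

wrong = (aa == 'q' or 'quit')            # evaluates to 'quit', a non-empty str
right = (aa == 'q' or aa == 'quit')      # evaluates to False

print(wrong)        # quit
print(right)        # False

if aa == 'q' or 'quit':
    print('this prints for any value of aa')
```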


testing a variable against multiple values

We can also test a variable against multiple values by using in with a list (more on lists next week):

if aa in ['q', 'quit']:
    exit()

negating an if test with not

You can negate a test with the not keyword.


var_a = 5
var_b = 10

if not var_a > var_b:
    print("var_a is not larger than var_b (well - it isn't).")

Of course this particular test can also be expressed by replacing the comparison operator > with <=, but when we learn about new True/False condition types we'll see how this operator can come in handy.


boolean (bool) values True and False

True and False are boolean values (type bool), and are produced by expressions that can be seen as True or False.


aa = 3
bb = 5

if aa > bb:
    print("that is true")

Tests are actually expressions that resolve to True or False, which are values of boolean type:

var = 5
var2 = 10
xx = (5 > 3)
print(xx)            # True
print(type(xx))      # <class 'bool'>

Note that we would almost never assign comparisons like these to variables, but we are doing so here to illustrate that they resolve to boolean values. ex 3.1 - 3.9




The while Block Statement and Looping Blocks

the concept of incrementing

Incrementing means increasing by one.


x = 0         # int, 0

x = x + 1     # int, 1
x = x + 1     # int, 2     (can also say x += 1)
x = x + 1     # int, 3

print(x)      # 3


while looping block

A while block with a test causes Python to loop through a block repetitively, as long as the test is True.


This program prints each number between 0 and 4:

cc = 0                 # initialize a counter

while cc < 5:          # if test is True, enter the block; if False, drop below
    print(cc)
    cc = cc + 1        # increment cc:  add 1 to its current value

    # WHEN WE REACH THE END OF THE BLOCK,
    # JUMP BACK TO THE while TEST

print('done')

The block is executing the print() and cc = cc + 1 lines multiple times - again and again until the test becomes False. Of course, the value being tested (cc) must change as the loop progresses - otherwise the loop will cycle indefinitely (infinite loop).


understanding while looping blocks

while loop statements have 3 components: the test, the block, and the automatic return.


cc = 10

while cc > 0:         # the TEST (if True, enter the block)

    print(cc)      # the BLOCK (execute as regular Python statements)
    cc = cc - 1

    # the AUTOMATIC RETURN [invisible!]
    # (at end of block, go back to the test)

print('done')


loop control: break

break is used to exit a loop regardless of the test condition.


xx = 0
print('Hello, User')

while xx < 10:

    answer = input("do you want the loop to break? ")

    if answer == 'y':
        break                  # drop down below the block

    xx = xx + 1
    print('I have now greeted you ' + str(xx) + ' times')


print("ok, I'm done")

loop control: continue

The continue statement jumps program flow to the next loop iteration.


x = 0

while x < 10:

    x = x + 1

    if x % 2 != 0:             # will be True if x is odd
        continue               # jump back up to the test and test again

    print(x)

Note that print(x) will not be executed if the continue statement comes first. Can you figure out what this program prints?


2
4
6
8
10

the while True looping block

while with True and break provide us with a handy way to keep looping until we wish to stop, and at any point in the block.


while True:

    var = input('please enter a positive integer:  ')

    if int(var) > 0:
        break

    else:
        print('sorry, try again')


print('thanks for the integer!')

Note the use of True in a while expression: since True is always True, the while test will always be True, and will cause program flow to enter (and re-enter) the block every time execution returns to the top. Therefore the break statement is essential to keep this block from looping indefinitely. ex 3.17 - 3.22
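
The same pattern can be tried without keyboard input. In this sketch (not from the course files), the loop keeps doubling a number and break is the only way out. Without the break, the loop would never end.

```python
num = 1

while True:
    if num > 1000:       # the only way out of this loop
        break
    num = num * 2        # 1, 2, 4, 8, ...

print(num)               # 1024
```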


debugging loops: the "fog of code"

What do we do when we get bad output, but with no error messages?


The output of the code should be the sum of all numbers from 0-10 (i.e. 55), but instead it is 10:

revcounter = 0
while revcounter < 10:

    varsum = 0
    revcounter = revcounter + 1
    varsum = varsum + revcounter

    print("loop iteration complete")
    print("revcounter value: ", revcounter)
    print("varsum value: ", varsum)
    input('pausing...')
    print()
    print()

print(varsum)                         # 10

Why is it not working? You may see it right away, but I'd like you to imagine that this code is a lot more complicated, such that it won't be easy to see the reason. And you may be tempted to tinker with the code to see whether you can get the correct output, but it's important to understand that we need to be more methodical. We do this with print() statements.
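As a sketch of that methodical approach: the trace printed inside the loop shows varsum equal to revcounter on every pass, which suggests that varsum is being reset to 0 inside the block. One possible fix, assuming that diagnosis is right, is to initialize varsum once above the loop:

```python
revcounter = 0
varsum = 0                           # initialize once, before the loop

while revcounter < 10:

    revcounter = revcounter + 1
    varsum = varsum + revcounter

    # trace each iteration so we can watch both values change
    print("revcounter value: ", revcounter, "  varsum value: ", varsum)

print(varsum)                        # 55
```

Once the output is correct, the tracing print() call can be removed.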




Object Methods

object methods

Objects are capable of behaviors, which are expressed as methods.


var = 'Hello, World!'

var2 = var.replace('World', 'Mars')      # replace substring, return a str

print(var2)                              # Hello, Mars!


methods vs. functions

Compare method syntax to function syntax.


mystr = 'HELLO'

x = len(mystr)            # function len() (stands alone)

y = mystr.count('L')      # method .count() (attached to the string variable)

Methods and functions are both called (using the parentheses after the name of the function or method). Both also may take an argument and/or may return a return value.


string method: .upper()

This "transforming" method returns a new string with a string's value uppercased.


var = 'hello'
newvar = var.upper()        # str, 'HELLO'

print(newvar)               # 'HELLO'

This method does not take an explicit argument, because it works with the string object itself.


string method: .lower()

This "transforming" method returns a new string with a string's value lowercased.


var = 'Hello There'
newvar = var.lower()        # str, 'hello there'

print(newvar)               # 'hello there'

This method does not take an explicit argument, because it works with the string object itself.


string method: .replace()

This "transforming" method returns a new string based on an old string, with specified text replaced.


var = 'My name is Marie'

newvar = var.replace('Marie', 'Greta')    # str, 'My name is Greta'

print(newvar)                             # My name is Greta

This method takes two arguments, the search string and replace string.
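Because each "transforming" method returns a new string, the original string is left unchanged, and a second method can be called directly on the returned value. A small sketch, assuming we want both transformations at once:

```python
var = 'My name is Marie'

# .replace() returns a str, so .upper() can be called on that result
newvar = var.replace('Marie', 'Greta').upper()

print(newvar)                             # MY NAME IS GRETA
print(var)                                # My name is Marie (unchanged)
```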


string method: .isdigit()

This "inspector" method returns True if a string is all digits.


mystring = '12345'
if mystring.isdigit():
    print("that string is all numeric characters")

if not mystring.isdigit():
    print("that string is not all numeric characters")

Since they return True or False, inspector methods like isdigit() are typically used in an if or while expression. To test the reverse (i.e. "not all digits"), use if not before the method call.
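An inspector method can guard a conversion inside a while True loop, so that non-numeric text never reaches int(). The sketch below is only an illustration: the responses list is a hypothetical stand-in for what a user might type at successive input() prompts.

```python
# hypothetical stand-in for a series of input() responses
responses = ['abc', '-5', '0', '42']
position = 0

while True:
    var = responses[position]        # in a real program: var = input(...)
    position = position + 1

    # '-5' fails isdigit() because of the minus sign; '0' is not > 0
    if var.isdigit() and int(var) > 0:
        break

    print('sorry, try again')

print('thanks for the integer!')
print(var)                           # 42
```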


string method: .endswith()

This "inspector" method returns True if a string ends with a substring.


bb = 'This is a sentence.'
if bb.endswith('.'):
    print("that line had a period at the end")

string method: .startswith()

This "inspector" method returns True if the string starts with a substring.


cc = input('yes? ')
if cc.startswith('y') or cc.startswith('Y'):
    print('thanks!')
else:
    print("ok, I guess not.")

string method: .count()

This "inspector" method returns a count of occurrences of a substring within a string.


aa = 'count the substring within this string'
bb = aa.count('in')
print(bb)             # 3 (the number of times 'in' appears in the string)

string method: .find()

This "inspector" method returns the character position of a substring within a string.


xx = 'find the name in this string'
yy = xx.find('name')
print(yy)             # 9 -- the 10th character in xx
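When the substring is not present, .find() returns -1 rather than raising an exception, so the result should be checked before it is used as a position. A minimal sketch:

```python
xx = 'find the name in this string'

found = xx.find('name')
missing = xx.find('zebra')

print(found)          # 9
print(missing)        # -1 (substring not found)

if missing == -1:
    print("'zebra' does not appear in xx")
```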

ex. 3.27 - 3.28


f'' strings for string formatting

An f'' string allows us to embed any value (such as numbers) into a new, completed string.


aa = 'Jose'
var = 34

bb = f'{aa} is {var} years old.'

print(bb)                                  # Jose is 34 years old.


Ex. 3.29


f'' string format codes

There are numerous options for justifying, formatting numbers, and more.


overview of formatting

# text padding and justification
:<15       # left justify within a width of 15
:>10       # right justify within a width of 10
:^8        # center within a width of 8

# numeric formatting
:f         # as float (6 decimal places)
:.2f       # as float (2 decimal places)
:,         # thousands comma separators
:,.2f      # thousands comma separators with float rounded to 2 places

f'' string format code examples

There are even more options; you can search online for details.


examples

x = 34563.999999

f'hi:  {x:<30}'      # 'hi:  34563.999999                  '

f'hi:  {x:>30}'      # 'hi:                    34563.999999'

f'hi:  {x:^30}'      # 'hi:           34563.999999         '

f'hi:  {x:f}'        # 'hi:  34563.999999'

f'hi:  {x:.2f}'      # 'hi:  34564.00'

f'hi:  {x:,}'        # 'hi:  34,563.999999'

f'hi:  {x:,.2f}'     # 'hi:  34,564.00'

Please note that f'' strings are available only as of Python 3.6. ex 3.29
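Justification and numeric codes can be combined in a single placeholder, which is handy for lining up columns. A sketch with hypothetical product data (name and price are made up for illustration):

```python
name = 'Widget'        # hypothetical values
price = 1234.5

# left justify the name in 10 characters;
# right justify the price in 12, with commas and 2 decimal places
row = f'{name:<10}|{price:>12,.2f}|'

print(row)             # Widget    |    1,234.50|
```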


sidebar: method and function return values in an expression; combining expressions

The return value of an expression can be used in another expression.


letters = "aabbcdefgafbdchabacc"

vara = letters.count("a")         # 5

varb = len(letters)               # 20

varc = vara / varb                # 5 / 20, or 0.25

vard = varc * 100                 # 25.0


print(letters.count("a") / len(letters) * 100)  # statements combined: 25.0


a note on style in your homework submissions

Professional coders respect good style because it makes code easier to read.





Data Parsing & Extraction: String Methods

our first data format: csv

The CSV format will allow us to explore Python's text parsing tools.


    19260701,0.09,0.22,0.30,0.009
    19260702,0.44,0.35,0.08,0.009
    19270103,0.97,0.21,0.24,0.010


CSV structure: "fields" and "records"

Tables consist of records (rows) and fields (column values).


Tabular text files are organized into rows and columns.


comma-separated values file (CSV)

    19260701,0.09,0.22,0.30,0.009
    19260702,0.44,0.35,0.08,0.009
    19270103,0.97,0.21,0.24,0.010
    19270104,0.30,0.15,0.73,0.010
    19280103,0.43,0.90,0.20,0.010
    19280104,0.14,0.47,0.01,0.010

space-separated values file

    19260701   -0.09    0.22    0.30   0.009
    19260702    0.44    0.35   -0.08   0.009
    19270103    0.97   -0.21    0.24  -0.010
    19270104    0.30   -0.15    0.73   0.010
    19280103   -0.43    0.90    0.20   0.010
    19280104    0.14    0.47    0.01  -0.010


presentation note: ask student to name the two structural characters


table data in text files

Text files are just sequences of characters. Commas and newline characters separate the data.


If we print a CSV text file, we may see this:

    19260701,0.09,0.22,0.30,0.009
    19260702,0.44,0.35,0.08,0.009
    19270103,0.97,0.21,0.24,0.010
    19270104,0.30,0.15,0.73,0.010
    19280103,0.43,0.90,0.20,0.010
    19280104,0.14,0.47,0.01,0.010

However, here's what a text file really looks like under the hood:

19260701,0.09,0.22,0.30,0.009\n19260702,0.44,0.35,0.08,
0.009\n19270103,0.97,0.21,0.24,0.010\n19270104,0.30,0.15,
0.73,0.010\n19280103,0.43,0.90,0.20,0.010\n19280104,0.14,
0.47,0.01,0.010
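One way to see the newline characters for ourselves is the built-in repr() function, which shows a string the way Python stores it. A short sketch, using a shortened version of the data as a string literal:

```python
text = '19260701,0.09,0.22\n19260702,0.44,0.35\n'

print(text)                # newlines are rendered as line breaks
print(repr(text))          # '19260701,0.09,0.22\n19260702,0.44,0.35\n'

print(text.count('\n'))    # 2 (one newline at the end of each record)
```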


tabular data: looping, parsing and summarizing

Looping through file line strings, we can split and isolate fields on each line.


The process:


1. Open the file for reading.

fh = open('myfile.csv')

2. Use a for loop to read each line of the file, one at a time. Each line will be represented as a string.

for line in fh:

3. Remove the newline from the end of each string with .rstrip().

    line = line.rstrip()

4. Divide (using .split()) the string into fields.

    fields = line.split(',')

5. Read a value from one of the fields, representing the data we want.

    val = fields[4]

6. As the loop progresses, build a sum of values from each line.

    mysum = mysum + float(val)

We will begin by reviewing each feature necessary to complete this work, and then we will put it all together.
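As a preview, the six steps can be sketched as one short program. To keep the sketch self-contained it reads from io.StringIO, an in-memory stand-in for open('myfile.csv') that can be looped over in the same way; the three rows are taken from the sample data above.

```python
import io

# stand-in for fh = open('myfile.csv')
fh = io.StringIO(
    '19260701,0.09,0.22,0.30,0.009\n'
    '19260702,0.44,0.35,0.08,0.009\n'
    '19270103,0.97,0.21,0.24,0.010\n'
)

mysum = 0.0                          # must exist before the loop adds to it

for line in fh:                      # each line is a str ending in '\n'
    line = line.rstrip()             # remove the trailing newline
    fields = line.split(',')         # list of 5 strings
    val = fields[4]                  # last column, e.g. '0.009'
    mysum = mysum + float(val)       # convert to float and add to the sum

fh.close()

print(f'{mysum:.3f}')                # 0.028
```

Note that mysum is initialized before the loop; without that line, the first addition would raise a NameError.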


string method: .rstrip()

This method can remove any character, or whitespace from the right side of a string.


When no argument is passed, the newline character (or any "whitespace" character) is removed from the end of the line:

line_from_file = 'jw234,Joe,Wilson\n'

stripped = line_from_file.rstrip()      # str, 'jw234,Joe,Wilson'

When a string argument is passed, any of its characters found at the end of the line are removed:

line_from_file = 'I have something to say.'

stripped = line_from_file.rstrip('.')   # str, 'I have something to say'

Whitespace characters are any characters that don't print directly, but we may see their presence: space, tab, or newline characters are whitespace.
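The argument to .rstrip() is treated as a set of characters: stripping continues from the right until a character not in the set is reached. A small sketch showing why the order of operations matters when a line ends in a newline:

```python
line = 'I have something to say...\n'

no_newline = line.rstrip()           # 'I have something to say...'
no_change = line.rstrip('.')         # unchanged: the last character is '\n', not '.'
both = line.rstrip('.\n')            # 'I have something to say'

print(repr(no_newline))
print(repr(no_change))
print(repr(both))
```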


string method: .split() with a delimiter

This method divides a delimited string into a list.


line_from_file = 'jw234:Joe:Wilson:Smithtown:NJ:2015585894\n'

xx = line_from_file.split(':')

print(xx)                         # ['jw234', 'Joe', 'Wilson',
                                  #  'Smithtown', 'NJ', '2015585894\n']
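Notice that the last item above still carries the newline. A minimal sketch of the usual remedy, calling .rstrip() before .split():

```python
line_from_file = 'jw234:Joe:Wilson:Smithtown:NJ:2015585894\n'

clean = line_from_file.rstrip()   # remove the newline first
xx = clean.split(':')

print(xx)                         # ['jw234', 'Joe', 'Wilson',
                                  #  'Smithtown', 'NJ', '2015585894']
```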


string method: .split() without a delimiter

We can also think of a string as delimited by spaces.


gg = 'this is a file    with    some     whitespace'

hh = gg.split()                   # splits on any "whitespace character"

print(hh)                         # ['this', 'is', 'a', 'file',
                                  #  'with', 'some', 'whitespace']
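Splitting with no argument is not the same as splitting on a single space character. The sketch below contrasts the two on a string with runs of spaces:

```python
gg = 'a  b   c'

print(gg.split())        # ['a', 'b', 'c']  (runs of whitespace count as one delimiter)
print(gg.split(' '))     # ['a', '', 'b', '', '', 'c']  (every space is a delimiter)
```

The empty strings in the second result are the "fields" found between adjacent spaces.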


ex 4.1 - 4.2 (skipping 4.3, 4.4, slicing)




Data Parsing & Extraction: List Operations and String Slicing

lists and list subscripting

Subscripting allows us to select individual items from a list.


fields = ['jw234', 'Joe', 'Wilson', 'Smithtown', 'NJ', '2015585894']

var = fields[0]           # 'jw234', 1st item
var2 = fields[4]          # 'NJ', 5th item
var3 = fields[-1]         # '2015585894' (-1 means last item)


Ex. 4.5


list slicing

Slicing allows us to select multiple items from a list.


letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
first_four = letters[0:4]
print(first_four)                     # ['a', 'b', 'c', 'd']

# no upper bound takes us to the end
print(letters[5:])                    # ['f', 'g', 'h']
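A few more slices on the same list illustrate how the bounds behave: the lower bound is included, the upper bound is not, and either bound may be omitted or negative.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

print(letters[2:5])      # ['c', 'd', 'e']  (index 5 is not included)
print(letters[:3])       # ['a', 'b', 'c']  (no lower bound starts at the beginning)
print(letters[-2:])      # ['g', 'h']       (negative index counts from the end)
print(letters[5:100])    # ['f', 'g', 'h']  (an out-of-range bound is not an error)
```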

Here are the rules for slicing:


string slicing

Slicing a string selects characters the way that slicing a list selects items.


mystr = '20140313 15:33:00'
year =  mystr[0:4]               # '2014'
month = mystr[4:6]               # '03'
day =   mystr[6:8]               # '13'

Again, please review the rules for slicing:


now can go back to 4.3, 4.4


the IndexError exception

An IndexError exception indicates use of an index for a list item that doesn't exist.


mylist = ['a', 'b', 'c']

print(mylist[5])            # IndexError:  list index out of range

Since mylist does not contain a sixth item (i.e., at index 5), Python tells us it cannot complete this operation.
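One way to avoid the exception, sketched here with a hypothetical index value, is to compare the index to len() of the list before subscripting:

```python
mylist = ['a', 'b', 'c']
index = 5                      # hypothetical index we want to read

if index < len(mylist):        # valid indices run from 0 to len(mylist) - 1
    message = mylist[index]
else:
    message = f'no item at index {index}; list has {len(mylist)} items'

print(message)                 # no item at index 5; list has 3 items
```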




Data Parsing & Extraction: File Operations and the for Looping Statement

the for loop block statement with a list

for with a list repeats its block as many times as there are items in the list.


mylist = [1, 2, 'b']

for myvar in mylist:     # myvar = next item from mylist   (i.e., 1 on the first iteration)
    print(myvar)         # 1
    print('===')         # ===
print('done')

The above code produces this output:

# 1
# ===
# 2
# ===
# b
# ===
# done


Ex. 4.12


review: the concept of incrementing

We increment an integer by reassigning it to its current value plus 1.


x = 0         # int, 0

x = x + 1     # int, 1
x = x + 1     # int, 2     (can also say x += 1)
x = x + 1     # int, 3

print(x)      # 3


using a for loop to count list items

An integer, incremented once for each iteration, can be used to count iterations.


mylist = [1, 2, 'b']

my_counter = 0

for thisvar in mylist:
    my_counter = my_counter + 1

print(f'count:  {my_counter} items')   # count:  3 items


using a for loop to sum list items

A numeric value, updated on each iteration, can be used to sum the values encountered in the loop.


mylist = [1, 2, 3]

my_sum = 0

for val in mylist:
    my_sum = my_sum + val

print(f'sum:  {my_sum}')     # sum:  6  (value of 1 + 2 + 3)


4.12 - 4.13


the open() function and the 'file' object

The 'file' object represents a connection to a file that is saved on disk.


fh = open('students.txt')     # a 'file' object

print(type(fh))               # <class '_io.TextIOWrapper'>


reading a file with the for statement

for with a 'file' object repeats its block as many times as there are lines in the file.


fh = open('students.txt')              # file object allows looping
                                       # through a series of strings

for xx in fh:                          # xx is a string, a line from the file;
    print(xx)                          # this prints each line of students.txt

fh.close()                             # close the file


Ex. 4.14


summarizing: csv parsing with for looping and string parsing

Here we put together all features learned in this session.


fh = open('revenue.csv')          # 'file' object

counter = 0
summer = 0.0

for line in fh:                   # str, "Haddad's,PA,239.50\n"  (first line from file)

    line = line.rstrip()          # str, "Haddad's,PA,239.50"
    fieldlist = line.split(',')   # list, ["Haddad's", 'PA', '239.50']

    rev_val = fieldlist[2]        # str, '239.50'
    f_rev = float(rev_val)        # float, 239.5

    counter = counter + 1         # incrementing once for each iteration
    summer = summer + f_rev       # adding the value found at each iteration to a sum

fh.close()

print(f'counter:  {counter}')     # 7 (number of lines in file)
print(f'summer:   {summer}')      # 662.01000001  (sum of all 3rd col values in file)


Ex 4.28


sidebar: writing and appending to files using the file object

Files can be opened for writing or appending; we use the 'file' object and the file .write() method.


fh = open('new_file.txt', 'w')
fh.write("here's a line of text\n")
fh.write('I add the newlines explicitly if I want to write to the file\n')
fh.close()

Note that we are explicitly adding newlines to the end of each line -- the write() method doesn't do this for us.
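Appending uses the same .write() method; only the mode passed to open() changes, from 'w' to 'a'. The sketch below writes into a temporary directory (tempfile and os are used only so that the example does not touch any real files):

```python
import os
import tempfile

# scratch directory that is deleted when the block ends
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'new_file.txt')

    fh = open(path, 'w')             # 'w' creates or overwrites the file
    fh.write('first line\n')
    fh.close()

    fh = open(path, 'a')             # 'a' adds to the end of the file
    fh.write('second line\n')
    fh.close()

    fh = open(path)                  # reopen for reading to check the result
    text = fh.read()
    fh.close()

print(text)                          # first line
                                     # second line
```

Had the second open() used 'w' instead of 'a', the file would contain only the second line.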




Containers: More List Operations

using containers to collect data

Containers are Python objects that can contain other objects.



containers allow for manipulation and analysis

Once collected, values in a container can be sorted or filtered (i.e. selected) according to whatever rules we choose. A collection of integer or floating-point values offers many opportunities for analysis. We can calculate:


A collection of string values allows us to perform text analysis:


container object summary : list, set, tuple

Compare and contrast the characteristics of each container.


mylist =  ['a', 'b', 'c', 'd', 1, 2, 3]

mytuple = ('a', 'b', 'c', 'd', 1, 2, 3)

myset =   {'a', 'b', 'c', 'd', 1, 2, 3}

mydict =  {'a': 1, 'b': 2, 'c': 3, 'd': 4}

list: ordered, mutable sequence of objects
tuple: ordered, immutable sequence of objects
set: unordered, mutable, unique collection of objects
dict: mutable collection of object key-value pairs, with unique keys; keeps insertion order as of Python 3.7 (discussed upcoming)
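A short sketch of what these characteristics mean in practice, using the same three values in a list, a tuple and a set:

```python
mylist =  ['a', 'b', 'a']
mytuple = ('a', 'b', 'a')
myset =   {'a', 'b', 'a'}

mylist[0] = 'z'              # mutable: an item can be reassigned
print(mylist)                # ['z', 'b', 'a']

print(len(mytuple))          # 3 (duplicates are kept)
print(len(myset))            # 2 (duplicates are dropped)

# mytuple[0] = 'z'           # immutable: this line would raise
                             # TypeError: 'tuple' object does not
                             # support item assignment
```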


review: the list container object

A list is an ordered sequence of values.


var = []                     # initialize an empty list

var2 = [1, 2, 3, 'a', 'b']   # initialize a list of values


review: subscripting a list

Subscripting allows us to read individual items from a list.


mylist = [1, 2, 3, 'a', 'b']       # list

xx = mylist[2]                     # 3

yy = mylist[-1]                    # 'b'


review: slicing a list

Slicing a list returns a new list.


var2 = [1, 2, 3, 'a', 'b']            # list

sublist1 = var2[0:3]                  # [1, 2, 3]

sublist2 = var2[2:4]                  # [3, 'a']

sublist3 = var2[3:]                   # ['a', 'b']

Remember the rules of slicing:


in operator: finding an item within a list

The in operator returns True if an item is in the list.


mylist = [1, 2, 3, 'a', 'b']             # list

if 'b' in mylist:                        # this is True for mylist
    print("'b' can be found in mylist")

print('b' in mylist)                     # True:  the 'in' operator
                                         # actually returns True or False

Ex. 5.1


summary functions: len(), sum(), max(), min()

Summary functions offer a speedy answer to basic analysis questions: how many? How much? Highest value? Lowest value?


mylist = [1, 3, 5, 7, 9]        # list

print(len(mylist))               # 5 (count of items)
print(sum(mylist))               # 25 (sum of values)
print(min(mylist))               # 1 (smallest value)
print(max(mylist))               # 9 (largest value)

sorting a list

sorted() returns a new list of sorted values.


mylist = [4, 9, 1.2, -5, 200, 20]

smyl = sorted(mylist)              # list, [-5, 1.2, 4, 9, 20, 200]

Ex. 5.2


concatenating two lists with +

List concatenation works in the same way as it does with strings.


var = ['a', 'b', 'c']
var2 = ['d', 'e', 'f']

var3 = var + var2            # list, ['a', 'b', 'c', 'd', 'e', 'f']

adding (appending) an item to a list

var = []

var.append(4)                # Note well! call is not assigned
var.append(5.5)              # list is changed in-place

print(var)                    # [4, 5.5]


5.11


the AttributeError exception

An AttributeError exception occurs when calling a method on an object type that doesn't support that method.


mylines = ['line1\n', 'line2\n', 'line3\n']

mylines = mylines.rstrip()         # AttributeError:
                                   # 'list' object has no attribute 'rstrip'

Debugging:


Understanding the name AttributeError:


the AttributeError when using .append()

This exception may sometimes result from a misuse of the append() method -- it should not be assigned to any variable.


mylist = ['a', 'b', 'c']

# oops:  returns None -- call to append() should not be assigned
mylist = mylist.append('d')

mylist = mylist.append('e')        # AttributeError:  'NoneType'
                                   # object has no attribute 'append'


the correct use of .append()

Just remember that we don't assign from .append().


mylist = ['a', 'b', 'c']

mylist.append('d')                 # now mylist equals ['a', 'b', 'c', 'd']
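To see why assigning from .append() causes trouble, we can capture its return value in a separate variable (result is a name chosen for this sketch):

```python
mylist = ['a', 'b', 'c']

result = mylist.append('d')        # the list is changed in-place...

print(result)                      # None  (...and nothing is returned)
print(mylist)                      # ['a', 'b', 'c', 'd']
```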


sidebar: removing a container item

There are a number of additional list methods to manipulate a list, though they are less often used.


mylist = ['a', 'hello', 5, 9]

popped = mylist.pop(0)      # str, 'a'
                            # (argument specifies the index  of the item to remove)

mylist.remove(5)            # remove an item by value
print(mylist)               # ['hello', 9]

mylist.insert(0, 10)
print(mylist)               # [10, 'hello', 9]



Containers: Tuples and Sets

tuples and sets: like lists but different

It's helpful to contrast these containers with lists.



It's easy to remember how to use one of these containers by considering how they differ in behavior.


the tuple container object

A tuple is an immutable, ordered sequence of values.


var2 = (1, 2, 3, 'a', 'b')     # initialize a tuple of values


subscripting a tuple

Subscripting allows us to read individual items from a tuple.


mytuple = (1, 2, 3, 'a', 'b')       # initialize a tuple of values

xx = mytuple[3]                     # 'a'

Note that as with lists, indexing starts at 0, so index 1 is the 2nd item, index 2 is the 3rd item, etc.


slicing a tuple

Slicing a tuple returns a new tuple.


var2 = (1, 2, 3, 'a', 'b')             # initialize a tuple of values

subtuple1 = var2[0:3]                  # (1, 2, 3)

subtuple2 = var2[2:4]                  # (3, 'a')

subtuple3 = var2[3:]                   # ('a', 'b')

Remember the rules of slicing, which are the same as for lists and strings:


concatenating two tuples with +

Concatenation works in the same way as with lists and strings.


var = ('a', 'b', 'c')
var2 = ('d', 'e', 'f')

var3 = var + var2                  # ('a', 'b', 'c', 'd', 'e', 'f')

Ex. 5.12


set container object

A set is an unordered, unique collection of values.


myset = set()                  # initialize an empty set (note that empty
                               # curly braces are reserved for dicts)

myset = {'a', 9999, 4.3, 'a'}  # initialize a set with items

print(myset)                   # {9999, 4.3, 'a'}  (order may vary)


adding an item to a set

The set changes in place; any duplicate item will be ignored.


myset = set()        # initialize an empty set

myset.add(4.3)       # note well!  we do not assign back to myset
myset.add('a')
myset.add('a')

print(myset)         # {'a', 4.3}    (order is not
                     #                necessarily maintained)


getting information about a set or tuple

Here are len() with a set and in with a tuple.


# get the length of a set or tuple (compare to len() of a list or string)
myset = {1, 2, 3, 'a', 'b'}

yy = len(myset)                # 5


# test for membership in a set or tuple
mytuple = (1, 2, 3, 'a', 'b')

if 'b' in mytuple:                        # bool, True
    print("'b' can be found in mytuple")

print('b' in mytuple)                     # "True":  the 'in' operator
                                          # actually returns True or False


looping through a set or tuple

The 'for' loop allows us to traverse a set or tuple and work with each item.


mytuple = (1, 2, 3, 'a', 'b')            # could also be a set here

for var in mytuple:
    print(var)                           # prints 1, then 2, then 3,
                                         # then a, then b


summary functions: len(), sum(), max(), min()

These functions also work as they do with lists.


Whether a set or tuple, these operations work in the same way.


mytuple = (1, 3, 5, 7, 9)       # initialize a tuple
myset =   {1, 3, 5, 7, 9}       # initialize a set

print(len(mytuple))             # 5  (count of items)
print(sum(myset))               # 25 (sum of values)

print(min(myset))               # 1 (smallest value)
print(max(mytuple))             # 9 (largest value)

sorting a set or tuple

Regardless of type, sorted() returns a list of sorted values.


mytuple = (4, 9, 1.2, -5, 200, 20)       # could also be a set here

smyl = sorted(mytuple)                   # [-5, 1.2, 4, 9, 20, 200]


Ex. 5.13


why do we need sets?

The set's duplicate elimination behavior gives us certain advantages.


As we saw, sets have 2 important characteristics:


How can we use a set?
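One common use, sketched here with a hypothetical list of repeating state codes, is to reduce a collection to its distinct values by passing it to set():

```python
states = ['PA', 'NY', 'PA', 'NJ', 'NY', 'PA']    # hypothetical repeating data

unique_states = set(states)           # duplicates are dropped

print(len(states))                    # 6
print(len(unique_states))             # 3
print(sorted(unique_states))          # ['NJ', 'NY', 'PA']
```

Since a set is unordered, sorted() is used here to get a predictable order for display.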




Building Up Containers from File

introduction: building up containers from file

This technique forms the core of much of what we do in Python.


In order to work with data, the usual steps are:


We call this process Extract-Transform-Load, or ETL. ETL is at the heart of what core Python does best.


looping through a data source and building up a list

Similar to the counting and summing algorithm, this one collects values instead.


build a list of company names

company_list = []                        # empty list
fh = open('revenue.csv')                 # 'file' object

for line in fh:                          # str, "Haddad's,PA,239.50\n"
    line = line.rstrip()                 # str, "Haddad's,PA,239.50"

    items = line.split(',')              # list, ["Haddad's", 'PA', '239.50']

    company_list.append(items[0])        # list, ["Haddad's"]


print(company_list)   # ["Haddad's", 'Westfield', 'The Store', "Hipster's",
                      #  'Dothraki Fashions', "Awful's", 'The Clothiers']

fh.close()

Ex. 5.14 - 5.15


looping through a data source and building up a unique set

This program uses a set to collect unique items from repeating data.


state_set = set()                       # empty set
fh = open('revenue.csv')                # 'file' object

for line in fh:                         # str, "Haddad's,PA,239.50\n"

    items = line.split(',')             # list, ["Haddad's", 'PA', '239.50\n']
    state_set.add(items[1])             # set, {'PA'}

print(state_set)       # set, {'PA', 'NY', 'NJ'}   (your order may be different)

chosen_state = input('enter a state:  ')

if chosen_state in state_set:
    print(f'{chosen_state} found in the file')
else:
    print(f'{chosen_state} not found')

fh.close()


5.22 & 5.23


reading a file with with

A file is automatically closed upon exiting the 'with' block.


A 'best practice' is to open files using a 'with' block. When execution leaves the block, the file is automatically closed.

with open('pyku.txt') as fh:
    for line in fh:
        print(line)

# At this point (once outside the with block), filehandle fh
# has been closed.  There is no need to call fh.close().


However, we should understand the minimal cost of not closing our files:

A file open for writing should be closed as soon as possible. The data may not appear in the file until it has been closed. 4.15
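The automatic close can be confirmed with the file object's .closed attribute. The sketch below works in a temporary directory (tempfile and os are used only to keep the example self-contained) and uses with for both writing and reading:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:       # scratch directory
    path = os.path.join(tmpdir, 'pyku.txt')

    with open(path, 'w') as wfh:                    # closed on leaving the block,
        wfh.write("We're out of gouda.\n")          # so the data reaches the file

    print(wfh.closed)                               # True

    lines = []
    with open(path) as fh:
        for line in fh:
            lines.append(line.rstrip())

    print(fh.closed)                                # True

print(lines)                                        # ["We're out of gouda."]
```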


slicing and dicing a file: the line, word, character count (1/3)

Once we have read a file as a single string, we can "chop it up" any way we like.


# read(): file text as a single string
fh = open('guido.txt')          # 'file' object
text = fh.read()                # read() method called on
                                # file object returns a string

fh.close()                      # close the file

print(text)
print(len(text))                 # 207 (number of characters in the file)

    # single string, entire text:

    # 'For three months I did my day job, \nand at night and
    #  whenever I got a \nchance I kept working on Python.  \n
    #  After three months I was to the \npoint where I could
    #  tell people, \n"Look here, this is what I built."'


slicing and dicing a file: splitting a string into words (2/3)

String .split() on a whole file string returns a list of words.


file_text = """For three months I did my day job,
and at night and whenever I got a
chance I kept working on Python.
After three months I was to the
point where I could tell people,
"Look here, this is what I built." """

words = file_text.split()      # split entire file on whitespace (spaces or newlines)

print(words)
    # ['For', 'three', 'months', 'I', 'did', 'my', 'day', 'job,',
    #  'and', 'at', 'night', 'and', 'whenever', 'I', 'got', 'a',
    #  'chance', 'I', 'kept', 'working', 'on', 'Python.', 'After',
    #  'three', 'months', 'I', 'was', 'to', 'the', 'point', 'where',
    #  'I', 'could', 'tell', 'people,', '"Look', 'here,', 'this',
    #  'is', 'what', 'I', 'built."']

print(len(words))       # 42 (number of words in the file)


slicing and dicing a file: the line, word, character count (3/3)

String .splitlines() will split any string on the newlines, delivering a list of lines from the file.


file_text = """For three months I did my day job,
and at night and whenever I got a
chance I kept working on Python.
After three months I was to the
point where I could tell people,
"Look here, this is what I built." """

lines = file_text.splitlines()

print(lines)

    # ['For three months I did my day job,', 'and at night and whenever I got a',
    #  'chance I kept working on Python.', 'After three months I was to the',
    #  'point where I could tell people,', '"Look here, this is what I built." ']

print(len(lines))          # 6 (number of lines in the file)


"whole file" parsing: reading a file as a list of lines

String .splitlines() will split any string on the newlines, delivering a list of lines from the file.


fh = open('pyku.txt')           # 'file' object

file_text = fh.read()           # entire file as a single string

fh.close()                      # close the file

lines = file_text.splitlines()

print(lines)

    # ["We're out of gouda.", 'That parrot has ceased to be.',
    #  'Spam, spam, spam, spam, spam.']

print(len(lines))          # 3 (number of lines in the file)


Ex. 5.27 -> 5.29


Summary: 3 ways to read strings from a file

for: read line by line (newline ('\n') marks the end of each line)

fh = open('students.txt')        # file object allows looping
                                 # through a series of strings
for my_file_line in fh:          # my_file_line is a string
    print(my_file_line)           # prints each line of students.txt

fh.close()                       # close the file

read(): read entire file as a single string

fh = open('students.txt')  # file object allows reading
text = fh.read()                 # read() method called on file
                                 # object returns a string
fh.close()                       # close the file

print(text)                       # entire text as a single string

readlines(): read as a list of strings (each string a line)

fh = open('students.txt')
file_lines = fh.readlines()      # file.readlines() returns
                                 # a list of strings
fh.close()                       # close the file

print(file_lines)                 # entire text as a list of lines
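One difference worth noticing: the lines returned by .readlines() keep their trailing newlines, while .splitlines() on the .read() string removes them. A sketch using io.StringIO as an in-memory stand-in for open('students.txt'), with hypothetical contents:

```python
import io

text = 'line one\nline two\nline three\n'    # hypothetical file contents

fh = io.StringIO(text)                       # stand-in for open('students.txt')
file_lines = fh.readlines()
fh.close()

print(file_lines)             # ['line one\n', 'line two\n', 'line three\n']
print(text.splitlines())      # ['line one', 'line two', 'line three']
```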


sidebar: writing to a file

We have no occasion to write to a file in this course, but it's important to know how.


wfh = open('newfile.txt', 'w')    # open for writing
                                  # (will overwrite an existing file)

wfh.write('this is a line of text\n')
wfh.write('this is a line of text\n')
wfh.write('this is a line of text\n')

wfh.close()


sidebar: the range() function

This function allows us to iterate over an integer sequence.


counter = range(10)
for i in counter:
    print(i)                        # prints integers 0 through 9

for i in range(3, 8):               # prints integers 3 through 7
    print(i)

If we need a literal list of integers, we can simply pass the iterable to list():

intlist = list(range(5))
print(intlist)                      # [0, 1, 2, 3, 4]
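range() also accepts an optional third argument, the step, and it combines naturally with len() when we want each item's index. A brief sketch:

```python
evens = list(range(0, 10, 2))       # start, stop (not included), step
print(evens)                        # [0, 2, 4, 6, 8]

countdown = list(range(5, 0, -1))   # a negative step counts down
print(countdown)                    # [5, 4, 3, 2, 1]

letters = ['a', 'b', 'c']
for i in range(len(letters)):       # i takes the values 0, 1, 2
    print(i, letters[i])
```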



Dictionaries: Lookup Tables

dictionaries

A dictionary (or dict) is a collection of unique key/value pairs of objects.


mydict = {}                      # empty dict

mydict = {'a':1, 'b':2, 'c':3}   # dict with str keys and int values

val = mydict['a']                # look up 'a'; returns 1


example uses: dictionaries

Pairs describe data relationships that we often want to consider:


You yourself may consider data in pairs, even in your personal life:


types of dictionaries

There are three main ways dictionaries are used.



initialize a dict

Dicts are marked by curly braces. Keys and values are separated with colons.


mydict = {}                        # empty dict

mydict = {'a':1, 'b':2, 'c':3}     # dict with str keys and int values

add a key/value pair to a dict

We use subscript syntax to assign a value to a key.


mydict = {'a':1, 'b':2, 'c':3}

mydict['d'] = 4       # setting a new key and value

print(mydict)         # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

retrieve a value from a dict using a key

We also use subscript syntax to retrieve a value.


mydict = {'a':1, 'b':2, 'c':3, 'd': 4}

dval = mydict['d']       # value for 'd' is 4

xxx = mydict['c']        # value for 'c' is 3

You might notice that this subscripting is very close in syntax to list subscripting. The only difference is that instead of an integer index we are using the dict key (most often a string).


the KeyError exception

This exception is raised when we request a key that does not exist in the dict.


mydict = {'a': 1, 'b': 2, 'c': 3}

val = mydict['d']       # KeyError:  'd'

Like the IndexError exception, which is raised if we ask for a list item that doesn't exist, KeyError is raised if we ask for a dict key that doesn't exist.


check for key membership

If we're not sure whether a key is in the dict, before we subscript we can check to confirm.


mydict = {'a': 1, 'b': 2, 'c': 3}

if 'a' in mydict:
    print("'a' is a key in mydict")
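Putting the membership test to work, here is a sketch of a lookup that cannot raise KeyError (the key 'd' and the fallback value None are choices made for this illustration):

```python
mydict = {'a': 1, 'b': 2, 'c': 3}
key = 'd'                          # hypothetical key to look up

if key in mydict:
    result = mydict[key]           # safe: we know the key exists
else:
    result = None
    print(f"'{key}' is not a key in mydict")

print(result)                      # None
```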

Ex. 6.1 - 6.4




Dictionaries: Rankings

dictionary rankings

Dictionaries can be sorted by value to produce a ranking.



loop through dict keys and values

We loop through keys and then use subscripting to get values.


mydict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

for key in mydict:         # a
    val =  mydict[key]

    print(key)             # a
    print(val)             # 1
    print()
                           # b
                           # 2

                           # (continues with 'c' and 'd')


Ex. 6.8


review: sorting any container with sorted()

With any container or iterable (list, tuple, file), sorted() returns a list of sorted items.


namelist = ['banana', 'apple', 'dates', 'cherry']

slist = sorted(namelist, reverse=True)

print(slist)          # ['dates', 'cherry', 'banana', 'apple']

Remember that no matter what container is passed to sorted(), the function returns a list. Also remember that the reverse=True argument to sorted() can be used to sort the items in reverse order.


sorting a dict (sorting its keys)

sorted() returns a sorted list of a dict's keys.


bowling_scores = {'bob': 123, 'zeb': 98, 'mike': 202, 'alice': 184}

sorted_keys = sorted(bowling_scores)

print(sorted_keys)       # [ 'alice', 'bob', 'mike', 'zeb' ]

Ex. 6.9


sorting a dictionary's keys by its values

A special argument to sorted() can cause Python to sort a dict's keys by its values.


bowling_scores = {'jeb': 123, 'zeb': 98, 'mike': 202, 'alice': 184}

sorted_keys = sorted(bowling_scores, key=bowling_scores.get)

print(sorted_keys)                 # ['zeb', 'jeb', 'alice', 'mike']

for player in sorted_keys:
    print(f"{player} scored {bowling_scores[player]}")

        ##  zeb scored 98
        ##  jeb scored 123
        ##  alice scored 184
        ##  mike scored 202

The key= argument allows us to instruct sorted() how to sort the keys. The sorting works in part by use of the dict .get() method (discussed later): sorted() calls bowling_scores.get on each key and compares the values that come back, so the keys are sorted by value instead of by key.

Ex. 6.10
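To turn this into a highest-first ranking, one possible approach is to combine key= with the reverse=True argument reviewed earlier. This is a sketch using the same bowling_scores data; the top_two name is made up for illustration.

```python
bowling_scores = {'jeb': 123, 'zeb': 98, 'mike': 202, 'alice': 184}

# same key= argument as before, plus reverse=True for highest first
ranked = sorted(bowling_scores, key=bowling_scores.get, reverse=True)

print(ranked)                      # ['mike', 'alice', 'jeb', 'zeb']

top_two = ranked[0:2]              # sorted() returned a list, so we can slice it

for player in top_two:
    print(f"{player} scored {bowling_scores[player]}")   # mike scored 202
                                                         # alice scored 184
```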


assign multiple values to individual variables

multi-target assignment allows us to "unpack" the values in a container.


If the container on the right has 3 values, we may unpack them to three named variables.

company, state, revenue = ["Haddad's", 'PA', '239.50']

print(company)      # Haddad's
print(revenue)      # 239.50

But if the values we want are in a CSV line we can split them to a list -- and then assign them using multi-target assignment.

csv_line = "Haddad's,PA,239.50"

company, state, revenue = csv_line.split(',')

print(company)      # Haddad's
print(state)        # PA
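A line read from a file usually ends with a newline, and every field arrives as a string. The sketch below assumes a hypothetical line of that kind, strips it before splitting, and converts the revenue field to a float.

```python
csv_line = "Haddad's,PA,239.50\n"     # hypothetical line, as read from a file

company, state, revenue = csv_line.rstrip().split(',')

revenue = float(revenue)              # str '239.50' -> float 239.5

print(company)      # Haddad's
print(state)        # PA
print(revenue)      # 239.5

# note: the number of names on the left must equal the number of
# items on the right; otherwise Python raises a ValueError
```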

Ex. 6.14


build up a dict from two fields in a file

As with all containers, we loop through a data source, select and add to a dict.


ids_names = {}                 # empty dict

fh = open('student_db.txt')

for line in fh:
    stuid, street, city, state, zipcode = line.rstrip().split(':')

    ids_names[stuid] = state   # key id is paired to
                               # student's state


print("here is the state for student 'jb29':  ")
print(ids_names['jb29'])        #  NJ

fh.close()

ex. 6.15




Dictionaries: Aggregations

dict aggregations

A "counting" or "summing" dictionary answers the question "how many of each" or "how much of each".


Aggregations may answer the following questions:


The dict is used to store this information. Each unique key in the dict will be associated with a count or a sum, depending on how many we found in the data source or the sum of values associated with each key in the data source.


building a counting dict

A "counting" dict increments the value associated with each key, and adds keys as new ones are found.


state_count = {}                  # empty dict

fh = open('revenue.csv')

for line in fh:                   # str, "Haddad's,PA,239.50\n"

    items = line.split(',')       # list, ["Haddad's", 'PA', '239.50\n']
    state = items[1]              # str, 'PA'

    if state not in state_count:
        state_count[state] = 0

    state_count[state] = state_count[state] + 1

print(state_count)                # {'PA': 2, 'NJ': 2, 'NY': 3}
fh.close()


Ex. 6.16


building a summing dict

A "summing" dict sums the value associated with each key, and adds keys as new ones are found.


state_sum = {}                  # empty dict

fh = open('revenue.csv')        # 'file' object

for line in fh:                 # str, "Haddad's,PA,239.50\n"

    items = line.split(',')     # ["Haddad's", 'PA', '239.50\n']
    state = items[1]            # str, 'PA'
    value = float(items[2])     # float, 239.5

    if state not in state_sum:
        state_sum[state] = 0

    state_sum[state] = state_sum[state] + value

print(state_sum)      # {'PA': 263.45, 'NJ': 265.4, 'NY': 133.16}

fh.close()


dictionary size with len()

len() counts the pairs in a dict.


mydict = {'a': 1, 'b': 2, 'c': 3}

print(len(mydict))                 # 3 (number of keys in dict)

Ex. 6.21


sidebar: dict .get() method

This method may be used to retrieve a value without checking the dict to see if the key exists.


mydict = {'a': 1, 'b': 2, 'c': 3}

xx = mydict.get('a', 0)          # 1 (key exists so paired value is returned)

yy = mydict.get('zzz', 0)        # 0 (key does not exist so the
                                 #    default value is returned)
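One place .get() may come in handy is the counting dict shown earlier: with a default of 0, the 'if state not in state_count' test is no longer needed. This is a sketch that assumes a small in-memory list of made-up lines in place of revenue.csv.

```python
lines = [                                # hypothetical stand-in for a file
    "Haddad's,PA,239.50",
    "Westfield,NJ,53.90",
    "The Store,NJ,211.50",
]

state_count = {}

for line in lines:
    items = line.split(',')
    state = items[1]

    # .get() returns 0 when the state has not been seen yet
    state_count[state] = state_count.get(state, 0) + 1

print(state_count)                       # {'PA': 1, 'NJ': 2}
```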


Ex. 6.22


sidebar: obtaining keys in a dict

The .keys() method gives access to the keys in a dict.


mydict = {'a': 1, 'b': 2, 'c': 3}

these_keys = mydict.keys()

for key in these_keys:
    print(key)

print(list(these_keys))            # ['a', 'b', 'c']


sidebar: obtaining values in a dict

The .values() method gives access to the values in a dict.


mydict = {'a': 1, 'b': 2, 'c': 3}

values = list(mydict.values())     # [1, 2, 3]

if 3 in mydict.values():
    print("3 was found")

for value in mydict.values():
    print(value)


sidebar: using the dict .items() method

.items() gives key/value pairs as 2-item tuples.


mydict = {'a': 1, 'b': 2, 'c': 3}

for key, value in mydict.items():
    print(key, value)               # a 1
                                    # b 2
                                    # c 3

print(list(mydict.items()))         # [('a', 1), ('b', 2), ('c', 3)]


Ex. 6.23


sidebar: working with dict items()

dict items() can give us a list of 2-item tuples. dict() can convert this list back to a dictionary.


mydict = {'a': 1, 'b': 2, 'c': 3}
these_items = list(mydict.items())    # [('a', 1), ('b', 2), ('c', 3)]

some_items = these_items[0:2]         # [('a', 1), ('b', 2)]

newdict = dict(some_items)

print(newdict)                        # {'a': 1, 'b': 2}

A list of 2-item tuples can be sorted and sliced, so it is a handy alternate structure.
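As a sketch of sorting and slicing such a list, the example below uses a made-up scores dict. With no key= argument, sorted() compares tuples by their first item, which here is the dict key.

```python
scores = {'zeb': 98, 'alice': 184, 'mike': 202}    # hypothetical data

pairs = list(scores.items())       # [('zeb', 98), ('alice', 184), ('mike', 202)]

sorted_pairs = sorted(pairs)       # tuples are compared by first item (the key)
                                   # [('alice', 184), ('mike', 202), ('zeb', 98)]

first_two = dict(sorted_pairs[0:2])      # slice, then convert back to a dict

print(first_two)                   # {'alice': 184, 'mike': 202}
```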


sidebar: converting parallel lists to tuples

zip() zips up parallel lists into tuples; dict() can convert this to dict.


list1 = ['a', 'b', 'c', 'd']
list2 = [ 1,   2,   3,   4 ]

tupes = list(zip(list1, list2))

print(tupes)          # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
print(dict(tupes))    # {'a': 1,    'b': 2,   'c': 3,   'd': 4}

Occasionally we are faced with two lists that relate to each other on a 1-to-1 basis... or, we sometimes even shape our data into this form. Parallel lists like these can be zipped into multi-item tuples.
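One situation where parallel lists might arise is a list of column names alongside the fields of a CSV line. This is only a sketch; the headers list is an assumption made up for the illustration.

```python
headers = ['company', 'state', 'revenue']        # hypothetical column names
csv_line = "Haddad's,PA,239.50"

values = csv_line.split(',')                     # ["Haddad's", 'PA', '239.50']

record = dict(zip(headers, values))              # pair each header with its value

print(record['state'])                           # PA
print(record)       # {'company': "Haddad's", 'state': 'PA', 'revenue': '239.50'}
```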




The JSON File Format and Multidimensional Containers

the JSON file format

JavaScript Object Notation is a simple "data interchange" format for sending or storing structured data as text.



a sample json file

Fortunately for us, JSON resembles Python in many ways, making it easy to read and understand.


contents of file sample.json

{
   "key1":  ["a", "b", "c"],
   "key2":  {
              "innerkey1": 5,
              "innerkey2": "woah"
            },
   "key3":  false,
   "key4":  null
}


reading a structure from a json file

The json.load() function decodes the contents of a JSON file.


import json                 # this module is used to read JSON files

fh = open('sample.json')

mys = json.load(fh)         # load data objects from the file,
                            # convert into Python objects

fh.close()

print(type(mys))              # <class 'dict'> (the outer container of this struct)

print(mys['key2']['innerkey2'])     # woah

Ex. 7.1


reading a structure from a json string

The json.loads() function decodes the contents of a JSON string.


import json                   # we use this module to read JSON
import requests               # use 'pip install' to install


response = requests.get('https://davidbpython.com/mystruct.json')

text = response.text          # str, entire file data

mys = json.loads(text)        # read string, convert into Python container

print(type(mys))                # <class 'dict'> (the outer container of this struct)

print(mys['key2']['innerkey2'])      # woah
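json.loads() can also be tried without a network connection. The sketch below assumes a string holding the same content as sample.json, and shows how JSON values come back as Python objects.

```python
import json

# same content as sample.json, held in a str
text = '{"key1": ["a", "b", "c"], "key2": {"innerkey1": 5, "innerkey2": "woah"}, "key3": false, "key4": null}'

mys = json.loads(text)

print(type(mys['key1']))      # <class 'list'>
print(mys['key1'][1])         # b
print(mys['key3'])            # False   (JSON false becomes Python False)
print(mys['key4'])            # None    (JSON null becomes Python None)
```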


About requests:

Ex. 7.2


printing a complex object readably: writing to a string

A nested object can be confusing to read.


If we have a multidimensional object that is squished together and hard to read, we can use json.dumps() with indent=4:

import json

obj = {'a': {'x': 1, 'y': 2, 'z': 3}, 'b': {'x': 1, 'y': 2, 'z': 3}, 'c': {'x': 1, 'y': 2, 'z': 3} }

print(json.dumps(obj, indent=4))

this prints:

{
    "a": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "b": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "c": {
        "x": 1,
        "y": 2,
        "z": 3
    }
}

sidebar: writing an object to json file

We can use json.dump() to write to a JSON file.


import json

wfh = open('newfile.json', 'w')    # open file for writing

obj = {'a': 1, 'b': 2}

json.dump(obj, wfh)

wfh.close()
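To check that what was written can be read back, here is a sketch of a round trip: json.dump() writes the object and json.load() reads it again. The temporary directory and the contents of obj are assumptions chosen so the example cleans up after itself.

```python
import json
import os
import tempfile

obj = {'a': 1, 'b': [10, 20], 'c': None}

# write to a file in a temporary directory (location chosen for illustration)
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'newfile.json')

wfh = open(path, 'w')
json.dump(obj, wfh)
wfh.close()

# read the same file back
fh = open(path)
obj2 = json.load(fh)
fh.close()

print(obj2)                 # {'a': 1, 'b': [10, 20], 'c': None}
print(obj == obj2)          # True

os.remove(path)             # clean up the temporary file and directory
os.rmdir(tmpdir)
```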




Reading Multidimensional Containers: Subscripting

pinpointing a specific value within a structure

We can use subscripts to "travel to" a value within a multidimensional structure.


value_table =       [
                       [ 1, 2, 3 ],
                       [ 10, 20, 30 ],
                       [ 100, 200, 300 ]
                    ]

val1 = value_table[0][0]       # int, 1
val2 = value_table[0][2]       # int, 3
val3 = value_table[2][2]       # int, 300


pinpointing a value within a list of dicts

In a list of dicts, each item is a dict.


lod = [
    { 'fname': 'Ally',
      'lname': 'Kane'   },
    { 'fname': 'Bernie',
      'lname': 'Bain'   },
    { 'fname': 'Josie',
      'lname': 'Smith'  }
]

val = lod[2]['lname']         # 'Smith'

val2 = lod[0]['fname']        # 'Ally'


pinpointing a value in a dict of dicts

A dict of dicts has string keys and dict values.


dod = {
    'ak23':  { 'fname': 'Ally',
               'lname': 'Kane' },
    'bb98':  { 'fname': 'Bernie',
               'lname': 'Bain' },
    'js7':   { 'fname': 'Josie',
               'lname': 'Smith' },
}

val = dod['ak23']['fname']     # 'Ally'

val2 = dod['js7']['lname']     # 'Smith'


Ex. 7.8 -> 7.10




Reading Multidimensional Containers: Looping

looping through a struct to read each "inner" structure

We begin by identifying the "inner" structures; 'for' looping takes us to each one in turn.


value_table =       [
                       [ 1, 2, 3 ],
                       [ 10, 20, 30 ],
                       [ 100, 200, 300 ]
                    ]

for inner_list in value_table:    # list, [ 1, 2, 3 ]

    print(inner_list[0])          # 1
                                  # 10
                                  # 100
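Since each inner structure here is itself a list, a second 'for' loop can visit every value inside it. This is a sketch that totals each row; the row_totals name is made up for the illustration.

```python
value_table =       [
                       [ 1, 2, 3 ],
                       [ 10, 20, 30 ],
                       [ 100, 200, 300 ]
                    ]

row_totals = []

for inner_list in value_table:      # list, [ 1, 2, 3 ]
    total = 0
    for val in inner_list:          # inner loop: each value in this row
        total = total + val
    row_totals.append(total)

print(row_totals)                   # [6, 60, 600]
```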


looping through and accessing values within a list of dicts

In a list of dicts, each item is a dict.


lod = [
    { 'fname': 'Ally',
      'lname': 'Kane'   },
    { 'fname': 'Bernie',
      'lname': 'Bain'   },
    { 'fname': 'Josie',
      'lname': 'Smith'  }
]

for inner_dict in lod:
    print(inner_dict['fname'])         # Ally
    print(inner_dict['lname'])         # Kane
    print()

                                       # Bernie
                                       # Bain

                                       # Josie
                                       # Smith


looping through and accessing values within a dict of dicts

In dict of dicts, looping through retrieves each key, and we must subscript to retrieve the "inner" dict.


dod = {
    'ak23':  { 'fname': 'Ally',
               'lname': 'Kane' },
    'bb98':  { 'fname': 'Bernie',
               'lname': 'Bain' },
    'js7':   { 'fname': 'Josie',
               'lname': 'Smith' },
}

for id_key in dod:
    inner_dict = dod[id_key]

    print(inner_dict['fname'])          # Ally
    print(inner_dict['lname'])          # Kane
    print()
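An alternative sketch uses the dict .items() method from the earlier sidebar, which hands us each key together with its "inner" dict, so no subscript on dod is needed. The full_names list is made up for the illustration.

```python
dod = {
    'ak23':  { 'fname': 'Ally',   'lname': 'Kane' },
    'bb98':  { 'fname': 'Bernie', 'lname': 'Bain' },
    'js7':   { 'fname': 'Josie',  'lname': 'Smith' },
}

full_names = []

for id_key, inner_dict in dod.items():      # key and "inner" dict together
    full_name = f"{inner_dict['fname']} {inner_dict['lname']}"
    full_names.append(f'{id_key}: {full_name}')

print(full_names)     # ['ak23: Ally Kane', 'bb98: Bernie Bain', 'js7: Josie Smith']
```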


Ex. 7.16 -> 7.18 also to discuss building a struct from file (Ex. 7.25 -> 7.30)




Introduction to User-Defined Functions

user-defined functions

A user-defined function is a block of code that can be executed by name.


def add(val1, val2):
    valsum = val1 + val2
    return valsum

ret = add(5, 10)           # int, 15

ret2 = add(0.3, 0.9)       # float, 1.2

A function is a block of code:


user defined functions: calling the function

calling means activating the function and running its code.


def print_hello():
    print("Hello, World!")

print_hello()             # prints 'Hello, World!'
print_hello()             # prints 'Hello, World!'
print_hello()             # prints 'Hello, World!'


user defined functions: arguments

The arguments are the inputs to a function.


def print_hello(greeting, person):              # note we do not
    full_greeting = f'{greeting}, {person}!'    # refer to 'name1'
    print(full_greeting)                        # 'place2', etc.
                                                # inside the function
name1 = 'Hello'
place1 = 'World'

print_hello(name1, place1)             # prints 'Hello, World!'


name2 = 'Bonjour'
place2 = 'Python'

print_hello(name2, place2)             # prints 'Bonjour, Python!'


user defined functions: function return values

A function's return value is passed back from the function using the return statement.


def print_hello(greeting, person):
    full_greeting = f'{greeting}, {person}!'
    return full_greeting

msg = print_hello('Bonjour', 'parrot')

print(msg)                                       # 'Bonjour, parrot!'
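Because a return value comes back to the caller as an ordinary object, it can be assigned, printed, or passed to another function. The two functions below are hypothetical, written only to sketch that idea.

```python
def get_total(prices):
    total = 0
    for price in prices:
        total = total + price
    return total                        # hand the result back to the caller

def add_shipping(amount, fee):
    return amount + fee

subtotal = get_total([10, 5, 4])        # int, 19

final = add_shipping(subtotal, 6)       # int, 25 (one return value feeds the next call)

print(final)                            # 25
```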


Ex. 9.1 - 9.5




Exception Trapping

exception trapping: handling exceptions after they are raised

Introduction: unanticipated vs. anticipated exceptions


Think of exceptions that we see raised by Python (SyntaxError, IndexError, etc.) as being of two general kinds -- unanticipated and anticipated:


Examples of anticipated exceptions:

In each of these cases, we know the exception could be raised, and so we write code to try to avoid the exception, or to deal with it if it is raised.


KeyError: when a dictionary key cannot be found

If the user enters a key, but it can't be found in the dict.


mydict = {'1972': 3.08, '1973': 1.01, '1974': -1.09}

uin = input('please enter a year: ')         # user enters '2116'

print(f'value for {uin} is {mydict[uin]}')

  #  Traceback (most recent call last):
  #    File "/Users/david/test.py", line 5, in <module>
  #      print(f'value for {uin} is {mydict[uin]}')
  #                                  ~~~~~~^^^^^
  #  KeyError: '2116'


ValueError: when the wrong value is used with a function or statement.

If we ask the user for a number, but they give us something else.


uin = input('please enter an integer:  ')

intval = int(uin)                           # user enters 'hello'

print(f'{uin} doubled is {intval*2}')

  #  Traceback (most recent call last):
  #    File "/Users/david/test.py", line 3, in <module>
  #      intval = int(uin)                           # user enters 'hello'
  #               ^^^^^^^^
  #  ValueError: invalid literal for int() with base 10: 'hello'


FileNotFoundError: when a file can't be found.

If we attempt to open a file, but it has been moved or deleted.


filename = 'thisfile.txt'

fh = open(filename)

  #  Traceback (most recent call last):
  #    File "/Users/david/test.py", line 3, in <module>
  #      fh = open(filename)
  #           ^^^^^^^^^^^^^^
  #  FileNotFoundError: [Errno 2] No such file or directory: 'thisfile.txt'

one approach to managing exceptions: "asking for permission"

Up to now we have managed anticipated exceptions by testing to make sure an action will be successful.


Examples of testing for anticipated exceptions:


So far we have been dealing with anticipated exceptions by checking first -- for example, using .isdigit() to make sure a user's input is all digits before converting to int().
However, there is an alternative to "asking for permission": begging for forgiveness.


another approach to managing exceptions: "begging for forgiveness"

The try block can trap exceptions and the except block can deal with them.


try:
    uin = input('please enter an integer:  ')   # user enters 'hello'
    intval = int(uin)                           # int() raises a ValueError
                                                # ('hello' is not a valid value)

    print(f'{uin} doubled is {intval*2}')

except ValueError:
    exit('sorry, I needed an int')   # the except block cancels the
                                     # ValueError and takes action
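The same try/except pattern can be placed inside a function so that it can be tried without typing input. This is a sketch; double_it is a hypothetical name, and the strings passed to it stand in for what a user might enter.

```python
def double_it(text):
    """Return text converted to int and doubled, or None if text is not an integer."""
    try:
        intval = int(text)            # may raise ValueError
    except ValueError:
        return None                   # trapped: signal the problem to the caller
    return intval * 2

print(double_it('21'))       # 42
print(double_it('hello'))    # None
```

Here the except block returns None rather than calling exit(), which leaves the caller free to decide what to do next.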


the procedure for setting up exception handling

It's important to witness the exception and where it is raised before attempting to trap it.


It's strongly recommended that you follow a specific procedure in order to trap an exception:

  1. allow the exception to be raised
  2. note the exception type and line number where it was raised
  3. wrap the line that caused the exception in a try: block
  4. follow with an except: block, containing statements to be executed if the exception is raised
  5. test that when the exception is raised, the except block is executed
  6. test that when the exception is not raised, the except block is not executed

Ex. 9.12 - 9.13


    trapping multiple exceptions

    Multiple exceptions can be trapped using a tuple of exception types.


    companies = ['Alpha', 'Beta', 'Gamma']
    
    user_index = input('please enter a ranking:  ')   # user enters '4' or 'hello'
    
    try:
        list_idx = int(user_index) - 1
    
        print(f'company at ranking {user_index} is {companies[list_idx]}')
    
    except (ValueError, IndexError):
        exit(f'please enter a ranking from 1 to {len(companies)}')
    


    chaining except: blocks

    The same try: block can be followed by multiple except: blocks, which we can use to specialize our response to the exception type.


    companies = ['Alpha', 'Beta', 'Gamma']
    
    user_index = input('please enter a ranking:  ')   # user enters '4'
    
    try:
        list_idx = int(user_index) - 1
    
        print(f'company at ranking {user_index} is {companies[list_idx]}')
    
    except ValueError:
        exit('please enter a numeric ranking')
    
    except IndexError:
        exit(f'max ranking is {len(companies)}')
    

    The exception raised will be matched against each type in order, and the first one that matches will execute its block.

    Ex. 9.14
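To see each except: block chosen in turn, the chained blocks can be wrapped in a function and called with different inputs. This is a sketch; lookup is a hypothetical name, and it returns a message instead of calling exit() so that all three calls can run.

```python
companies = ['Alpha', 'Beta', 'Gamma']

def lookup(user_index):
    """Return the company at a 1-based ranking, or a message describing the problem."""
    try:
        list_idx = int(user_index) - 1
        return companies[list_idx]
    except ValueError:                       # int() could not convert the text
        return 'please enter a numeric ranking'
    except IndexError:                       # the ranking is past the end of the list
        return f'max ranking is {len(companies)}'

print(lookup('2'))         # Beta
print(lookup('hello'))     # please enter a numeric ranking
print(lookup('4'))         # max ranking is 3
```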


    avoiding except: and except Exception:

    When we don't specify an exception, Python will trap any exception. This is a bad practice.


    ui = input('please enter a number: ')
    
    try:
        fval = float(ui)
    except:                  # AVOID!!  Should be 'except ValueError:'
        exit('please enter a number - thank you')
    

    Why is this a bad practice?

    1. except: or except Exception: can trap any type of exception, so an unexpected exception could go undetected
    2. except: or except Exception: does not specify which type of exception was expected, so it is less clear to the reader


    (There are certain limited circumstances under which we might use except: by itself, or except Exception. One common practice is to place the entire program execution in a try: block and to trap any exception that is raised, so the exception can be logged and the program doesn't need to exit as a result.)
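The limited "log everything at the top level" circumstance might look something like the sketch below. The main() function, its deliberate bug, and the error_log list (standing in for a real log file) are all assumptions made for the illustration.

```python
def main():
    values = [1, 2, 3]
    return values[10]                 # unanticipated bug: raises IndexError

error_log = []                        # hypothetical stand-in for a log file

try:
    main()
except Exception as exc:              # deliberately broad, at the top level only
    error_log.append(f'{type(exc).__name__}: {exc}')

print(error_log)       # ['IndexError: list index out of range']
```

The 'as exc' part gives the except block access to the exception object, so its type and message can be recorded rather than silently discarded.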




    Set Operations and List Comprehensions

    container processing: set comparisons

    Set comparisons make it easy to compare 2 sets for membership.


    set_a = {1, 2, 3, 4}
    set_b = {3, 4, 5, 6}
    
    print(set_a.union(set_b))           # {1, 2, 3, 4, 5, 6}  (set_a | set_b)
    
    print(set_a.difference(set_b))      # {1, 2}              (set_a - set_b)
    
    print(set_a.intersection(set_b))    # {3, 4}     (what is common between them?)
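The same comparisons can be written with the set operators |, - and &. The sketch below assumes two made-up lists, converts them to sets with set(), and sorts each result so the printed order is predictable.

```python
last_week = ['apple', 'banana', 'cherry']       # hypothetical data
this_week = ['banana', 'cherry', 'dates']

set_a = set(last_week)
set_b = set(this_week)

print(sorted(set_a | set_b))     # ['apple', 'banana', 'cherry', 'dates']   (union)
print(sorted(set_a - set_b))     # ['apple']                                (difference)
print(sorted(set_a & set_b))     # ['banana', 'cherry']                     (intersection)
```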
    


    Ex. 9.15 - 9.17


    "transforming" list comprehension

    List comprehensions build a new list based on an existing list.


    This list comprehension doubles each value in the nums list

    nums = [1, 2, 3, 4, 5]
    
    dblnums = [      val * 2      for val in nums     ]
       #            transform       'for' loop
    
    print(dblnums)            # [2, 4, 6, 8, 10]
    


    Ex. 9.18


    "filtering" list comprehension

    List comprehensions can also select values to place in the new list.


    This list comprehension selects only those values above 35 degrees Celsius:

    daily_temps = [26.1, 31.0, 38.4, 36.1, 38.3, 34.1, 32.7, 33.3]
    
    hitemps = [       t          for t in daily_temps        if t > 35     ]
        #          transform          'for' loop               filter
    
    print(hitemps)          # [38.4, 36.1, 38.3]
    


    Ex. 9.19


    combining a transforming with a filtering list comprehension

    We can choose to filter or transform or both.


    This list comprehension selects values above 35C and converts them to Fahrenheit:

    daily_temps = [26.1, 31.0, 38.4, 36.1, 38.3, 34.1, 32.7, 33.3]
    
    f_hitemps = [ round((t * 9/5) + 32, 1)     for t in daily_temps     if t > 35 ]
         #              transform                'for' loop               filter
    
    print(f_hitemps)          # [101.1, 97.0, 100.9]
    

    Ex. 9.19


    list comprehensions: examples

    List comprehensions are a powerful convenience, but never required.


    Some common operations can be accomplished in a single line. In this example, we produce a list of lines from a file, stripped of whitespace.

    stripped_lines = [ i.rstrip() for i in open('pyku.txt').readlines() ]
    

    We can even combine expressions for some fancy footwork

    totals = [  float(i.split(',')[2])
                for i in open('revenue.csv')
                if i.split(',')[1] == 'NY'    ]
    

    This last example borders on the overcomplicated -- if we are trying to do too much with a list comprehension, we might be better off with a conventional 'for' loop.
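For comparison, here is a sketch of that conventional 'for' loop next to the comprehension. It assumes a small in-memory list of made-up lines in place of revenue.csv.

```python
lines = [                              # hypothetical stand-in for revenue.csv
    "Haddad's,PA,239.50",
    "Westfield,NJ,53.90",
    "Hipster's,NY,11.98",
    "Dothraki Fashions,NY,5.98",
]

# list comprehension: filter on the state, transform the revenue
totals = [ float(i.split(',')[2]) for i in lines if i.split(',')[1] == 'NY' ]

# conventional 'for' loop that builds the same list
totals2 = []
for line in lines:
    items = line.split(',')            # split once and reuse the result
    if items[1] == 'NY':
        totals2.append(float(items[2]))

print(totals)               # [11.98, 5.98]
print(totals == totals2)    # True
```

The loop is longer, but it splits each line only once and gives each step a name.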


    list comprehensions: why?

    A list comprehension is a single statement.



    sidebar: list comprehensions with dictionaries

    Since dicts can be converted to and from 2-item tuples, we can manipulate them using list comprehensions.


    Recall that dict .items() gives key/value pairs as 2-item tuples (which list() can convert to a list), and that the dict() constructor uses the same 2-item tuples to build a dict.

    mydict =  {'a': 5, 'b': 1, 'c': -3}
    
    # dict -> list of tuples
    my_items = list(mydict.items())      # list, [('a', 5), ('b', 1), ('c', -3)]
    
    # list of tuples -> dict
    mydict2 = dict(my_items)       # dict, {'a':5,   'b':1,   'c':-3}
    

    Here's an example: filtering a dictionary by value - accepting only those pairs whose value is larger than 0:

    mydict = {'a': 5, 'b': 1, 'c': -3}
    
    filtered_dict = dict([ (i, j)
                           for (i, j) in mydict.items()
                           if j > 0 ])
    
               # {'a': 5, 'b': 1}
    



    The Command Prompt: Moving Around and Looking

    the command prompt

    The Command Prompt includes powerful commands for working with files and programs.



    opening the command prompt: Windows

    In your Windows search, look for one of the following, and open it:


    You should see something similar to the following:

    C:\Users\david>                       < -- command line

    After opening this window, note the blinking cursor: this is your computer's operating system, awaiting your next command. (Please note that there may be small differences between your output and this illustration; these can usually be ignored.)


    opening the command prompt: Mac or Linux*

    In your Spotlight search, look for Terminal, and open it:


    You should see something similar to the following:

    Last login: Thu Sep  3 13:46:14 on ttys001
    
    Davids-MBP-3:~ david$                 < -- command line

    After opening the command prompt program on your computer, note the blinking cursor: this is your computer's operating system awaiting your next command. (Please note that there may be small differences between your output and this illustration; these can usually be ignored.)


    the present working directory (pwd)

    Your command line session is located at one particular directory on the file tree at any given time.


    On Windows, the pwd is automatically displayed at the prompt:

    C:\Users\david>

    On Mac/Linux, type pwd and hit [Enter]:

    Davids-MBP-3:~ david$ pwd
    /Users/david


    listing files in a directory: Windows

    dir is the command to list the contents of a directory.


    Type dir and hit [Enter]:

    C:\Users\david> dir
    
     Volume Serial Number is 0246-9FF7
    
     Directory of C:\Users\david
    
    08/29/2020  11:37 AM    <DIR>          .
    08/29/2020  11:37 AM    <DIR>          ..
    08/29/2020  11:28 AM    <DIR>          Contacts
    08/29/2020  12:50 PM    <DIR>          Desktop
    ... etc ...

    The contents of the directory include all files and folders that can be found in it.


    listing files in a directory: Mac

    ls is the command to list the contents of a directory.


    Type ls and hit [Enter]:

    Davids-MBP-3:~ david$ ls
    
    Applications          Downloads         Movies
    Desktop               Dropbox           Music
    Documents             Library           Public
    ... etc ...

    The contents of the directory include all files and folders that can be found in it.


    visualizing the directory tree

    Starting from the root, each folder may have files and other folders within.


    C:\Users
    ├── david           <--- my pwd when I open my Terminal
    │   ├── Desktop
    │   │   └── python_data
    │   │        ├── 00
    │   │        ├── 01
    │   │        │    ├─ 1.1.py
    │   │        │    ├─ 1.2.py
    │   │        │    ├─ 1.3.py
    │   │        │    etc.
    │   │        ├── 02
    │   │        │    ├─ 2.1.py
    │   │        │    ├─ 2.2.py
    │   │        │    ├─ 2.3.py
                 etc.


    moving around the directory tree with 'cd'

    cd stands for 'change directory'. This command works for both Windows and Mac.


    on Mac/Linux:

    Davids-MBP-3:~ david$ pwd
    /Users/david
    
    Davids-MBP-3:~ david$ cd Desktop
    
    Davids-MBP-3:Desktop david$ pwd
    /Users/david/Desktop

    on Windows:

    C:\Users\david> cd Desktop
    C:\Users\david\Desktop>

    moving to the child directory

    To visit a directory "below" where we are, we simply name the child dir.


    We move "down" the directory tree by using the name of the next directory -- this extends the path:

    C:\Users\david> cd Desktop
    
    C:\Users\david\Desktop> cd python_data
    
    C:\Users\david\Desktop\python_data> cd 02
    
    C:\Users\david\Desktop\python_data\02>

    We can also travel multiple levels by specifying a longer path:

    C:\Users\david> cd Desktop\python_data\02
    
    C:\Users\david\Desktop\python_data\02>

    (Please note that on Windows we use the backslash separator (\); on Mac it is the forward slash (/).)


    moving to the parent directory

    The '..' (double dot) indicates the parent directory and can move us one directory "up".


    If we'd like to travel up the directory tree, we use the special .. directory value, which signifies the parent directory:

    C:\Users\david\Desktop\python_data\02> cd ..
    
    C:\Users\david\Desktop\python_data> cd ..
    
    C:\Users\david\Desktop> cd ..
    
    C:\Users\david>

    We can also travel multiple levels with multiple ../'s:

    C:\Users\david\Desktop\python_data\02> cd ..\..\..
    
    C:\Users\david>

    (Please note that on Windows we use the backslash separator (\); on Mac it is the forward slash (/).)


    using ls or dir with cd

    These two commands together allow us to explore our filesystem.


    Here is an example journey through some folders, viewing the contents of each folder as we move (this shows Mac/Linux output, but in Windows you may replace ls with dir):

    Davids-MBP-3:Desktop david$ pwd
    /Users/david/Desktop
    
    Davids-MBP-3:Desktop david$ ls
    python_data
    
    Davids-MBP-3:Desktop david$ cd python_data
    
    Davids-MBP-3:python_data david$ pwd
    /Users/david/Desktop/python_data
    
    Davids-MBP-3:python_data david$ ls
    00
    01
    02
    ... etc.
    
    Davids-MBP-3:python_data david$ cd 02
    
    Davids-MBP-3:02 david$ ls
    2.1.py       2.2.py       2.3.py       2.4.py       2.5.py
    2.6.py       2.7.py       2.8.py       2.9.py       2.10.py
    
    Davids-MBP-3:02 david$ pwd
    /Users/david/Desktop/python_data/02



    The Command Prompt: Executing Python Programs

    verifying the PATH environment variable for Python

    To execute scripts from the command line, we must ensure that the OS can find Python.


    Please begin by opening a new Terminal or Command Prompt window. At the prompt, type python -V (make sure it is a capital V). (Please note that your prompt may look different than mine.)


    Python can be found: python version is displayed

    C:\Users\david> python -V
    Python 3.11.5

    Python can be found, but at the wrong version:

    david@192 ~ % python -V
    Python 2.7.16

    Python can't be found:

    david@192 ~ %  python -V
    'python' is not a recognized...   or   'python': command not found...

    If your path is not set correctly to a 3.x version of Python, you can find instructions on setting it in the supplementary documents for this class. You'll need to set the PATH to point to Python to continue with the remaining steps in this lesson. You may also contact your course manager for assistance.


    executing a python script from the command line

    Here we ask Python to run a script directly (without our IDE's help).


    If you are in the same directory as the script, you can execute a program by running Python and telling Python the name of the script:


    On Windows:

    C:\Users\david\Desktop\python_data\02> python 2.1.py

    On Mac:

    Davids-MBP-3:02 david$ python 2.1.py

    Please note: if your prompt looks like this: >>>, you have entered the python interactive prompt. Type quit() and hit [Enter] to leave it. If there are any issues finding Python, please contact your course manager for assistance.


    the STDIN, STDOUT and STDERR data streams

    Your output goes to the screen, but in truth, it's going to "standard out".



    redirecting the STDOUT data stream to a file

    STDOUT can be redirected to other places besides the screen.


    hello.py: print a greeting

    print('hello, world!')
    

    redirecting STDOUT to a file at the command line (Windows or Mac):

    mycomputer% python hello.py                # default:  to the screen
    hello, world!
    
    mycomputer% python hello.py > newfile.txt  # redirect to a file (not the screen)
                                               # (we see no output)
    
    mycomputer% cat newfile.txt       # Mac:  cat spits out a file's contents
    hello, world!
    
    C:\> type newfile.txt             # Windows: type spits out a file's contents
    hello, world!


    redirecting the STDOUT data stream to STDIN of another program

    The 'pipe' character (|) can connect two programs: the output (STDOUT) of the first program is redirected to become the input (STDIN) of the second.


    Mac: direct output to the wc command (count lines, words and characters)

    mycomputer% python hello.py | wc
    
           1       2      14                   # the output of wc

    Windows: direct output to find command (count lines):

    C:\> python hello.py | find /c /v ""
    
    1


    reading and redirecting the STDIN data stream

    STDIN is the 'input pipe' to our program (usually the keyboard, but can be redirected to read from a file or other program).


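    import sys
    
    for line in sys.stdin:               # loop through STDIN line by line
        print(line, end='')              # each line already ends in a newline
    
    # alternative to the loop above (use one or the other, not both):
    # filetext = sys.stdin.read()        # read all of STDIN as one string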
    

    A program like the above could be called this way, directing a file into our program's STDIN:

    mycomputer% python readfile.py < file_to_be_read.txt

    We can of course also direct the output of a program into our program's STDIN through use of a pipe:

    mycomputer% ls -l | python readfile.py
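
    To make the idea of a program that reads STDIN more concrete, here is a minimal sketch of a wc-style counter. The function name count_stream is hypothetical (it is not part of Python), and the sketch assumes the input is plain text. It accepts any file-like object, so an in-memory stream stands in for sys.stdin here:

```python
import io
import sys


def count_stream(stream):
    """Return (lines, words, characters) read from a file-like object.

    count_stream is an illustrative name, not a standard library function.
    """
    text = stream.read()                 # read everything as one string
    lines = text.count('\n')             # count newline characters
    words = len(text.split())            # split on any whitespace
    chars = len(text)
    return lines, words, chars


# stand-in for sys.stdin so the example runs without redirection
demo = io.StringIO('hello, world!\n')
print(count_stream(demo))                # (1, 2, 14)

# in a real script we would pass sys.stdin instead:
# print(count_stream(sys.stdin))
```

    Saved as a script with the last line uncommented, this could be run as python hello.py | python myscript.py (the script name is up to you).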



    The Command Prompt: Program Arguments

    sys.argv to capture command line arguments

    sys.argv is a list that holds string arguments entered at the command line


    a python script get_args.py

    import sys                           # import the sys module
    
    print('first arg: ' + sys.argv[1])   # print first command line arg
    print('second arg: ' + sys.argv[2])  # print second command line arg
    

    running the script from the command line, with two arguments

    % python get_args.py hello there
    first arg: hello
    second arg: there


    Please note: if your prompt looks like this: >>>, you have entered the python interactive prompt. Type quit() and hit [Enter] to leave it.


    The default item in sys.argv: the program name

    sys.argv[0] will always contain the name of our program.


    a python script print_args.py

    import sys
    print(sys.argv)
    

    running the script from the command line (passing 3 arguments)

    % python print_args.py hello there budgie
    ['print_args.py', 'hello', 'there', 'budgie']

    running the script from the command line (passing no arguments)

    % python print_args.py
    ['print_args.py']


    IndexError with sys.argv (when user passes no argument)

    Since we read arguments from a list, we can trigger an IndexError if we try to read an argument from sys.argv that wasn't passed at the command line.


    a python script addtwo.py

    import sys
    
    firstint = int(sys.argv[1])
    secondint = int(sys.argv[2])
    
    mysum = firstint + secondint
    
    print(f'the sum of the two values is {mysum}')
    

    passing 2 arguments

    % python addtwo.py 5 10
    the sum of the two values is 15

    passing no arguments

    % python addtwo.py
    Traceback (most recent call last):
      File "addtwo.py", line 3, in <module>
        firstint = int(sys.argv[1])
    IndexError: list index out of range

    How to handle this exception? Test the len() of sys.argv, or trap the IndexError exception.
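
    Both approaches can be sketched as follows. The helper names parse_two_ints and first_arg_or_default are hypothetical, and the argument lists are simulated so that the example runs without a command line:

```python
import sys


def parse_two_ints(argv):
    """Test the length first: return two ints, or None if args are missing.

    parse_two_ints is a hypothetical helper name used for illustration.
    """
    if len(argv) < 3:                    # program name + 2 arguments expected
        return None
    return int(argv[1]), int(argv[2])


def first_arg_or_default(argv, default='(none)'):
    """Trap the IndexError instead of testing the length."""
    try:
        return argv[1]
    except IndexError:
        return default


# simulated argument lists; a real script would pass sys.argv
print(parse_two_ints(['addtwo.py', '5', '10']))   # (5, 10)
print(parse_two_ints(['addtwo.py']))              # None
print(first_arg_or_default(['addtwo.py']))        # (none)
```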




    File and Directory Listings

    writing to files using the file object

    To open a file for writing, use the 2nd argument 'w'.


    fh = open('new_file.txt', 'w')
    fh.write("here's a line of text\n")
    fh.write('I add the newlines explicitly if I want to write to the file\n')
    fh.close()
    


    appending to files using the file object

    To open a file for appending, use the 2nd argument 'a'.


    fh = open('new_file.txt', 'a')
    fh.write("20250505 1203   something happened\n")
    fh.close()
    


    use os.getcwd() to show the present/current working directory.

    The program below assumes we are starting in our home directory:

    import os                # os ('operating system') module talks
                             # to the os (for file access & more)
    
    cwd = os.getcwd()        # str, '/Users/david'
    
    print(cwd)
    


    a sample file tree

    We'll use this tree to explore relative filepaths.


    dir1
    ├── file1.txt
    ├── test1.py
    │
    ├── dir2a
    │   ├── file2a.txt
    │   ├── test2a.py
    │   │
    │   ├── dir3a
    │   │   ├── file3a.txt
    │   │   ├── test3a.py
    │   │   │
    │   │   └── dir4
    │   │       ├── file4.txt
    │   │       └── test4.py
    └── dir2b
        ├── file2b.txt
        ├── test2b.py
        │
        └── dir3b
            ├── file3b.txt
            └── test3b.py
    


    relative filepaths

    These paths locate files relative to the present working directory.


    If the file you want to open is in the present working directory (normally the directory from which you run your script), use the filename alone:

    fh = open('filename.txt')
    

    relative filepaths: parent directory

    To reach the parent directory, prepend the filename with ../


    fh = open('../filename.txt')
    

    relative filepaths: child directory

    To reach the child directory, prepend the filename with the name of the child directory.


    fh = open('childdir/filename.txt')
    

    relative filepaths: sibling directory

    To reach a sibling directory, prepend the filename with ../ and the name of the sibling directory.


    fh = open('../siblingdir/filename.txt')
    

    To reach a sibling directory, we must go "up, then down" by using ../ to go to the parent, then the sibling directory name to go down into the sibling.
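
    The relative paths above can be tried out with a small sketch. It assumes we rebuild part of the sample tree (dir1, dir2a and dir2b) inside a temporary directory and then make dir2a the present working directory; the variable names are illustrative only:

```python
import os
import shutil
import tempfile

# build a small piece of the sample tree inside a temporary directory
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'dir1', 'dir2a'))
os.makedirs(os.path.join(base, 'dir1', 'dir2b'))

for relpath in ['dir1/file1.txt', 'dir1/dir2a/file2a.txt', 'dir1/dir2b/file2b.txt']:
    with open(os.path.join(base, relpath), 'w') as wfh:
        wfh.write(relpath + '\n')                 # each file holds its own path

original_dir = os.getcwd()
os.chdir(os.path.join(base, 'dir1', 'dir2a'))     # pretend we run from dir2a

with open('file2a.txt') as fh:                    # same directory
    same = fh.read().strip()
with open('../file1.txt') as fh:                  # parent directory
    parent = fh.read().strip()
with open('../dir2b/file2b.txt') as fh:           # sibling: up, then down
    sibling = fh.read().strip()

os.chdir(original_dir)                            # restore the working directory
shutil.rmtree(base)                               # remove the temporary tree

print(same)      # dir1/dir2a/file2a.txt
print(parent)    # dir1/file1.txt
print(sibling)   # dir1/dir2b/file2b.txt
```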


    absolute filepaths

    These paths locate files from the root of the filesystem.


    In Windows, absolute paths begin with a drive letter, usually C:\:

    """ test3a.py:  open and read a file """
    
    filepath = r'C:\Users\david\Desktop\python_data\dir1\file1.txt'
    fh = open(filepath)
    
    print(fh.read())
    

    (Note that r'' should be used when expressing in our Python program any Windows paths that contain backslashes.)


    On the Mac, absolute paths begin with a forward slash:

    """ test3a.py:  open and read a file """
    
    filepath = '/Users/david/Desktop/python_data/dir1/file1.txt'
    fh = open(filepath)
    
    print(fh.read())
    

    (The above paths assume that the python_data folder is in the Desktop directory; you may have placed yours elsewhere on your system. Of course, the above paths also assume that my home directory is called david/; yours is likely different.)


    os.path.join()

    This function joins together directory and file strings with slashes appropriate to the current operating system.


    import os
    
    dirname = '/Users/david'
    filename = 'journal.txt'
    
    filepath = os.path.join(dirname, filename)             # '/Users/david/journal.txt'
    
    filepath2 = os.path.join(dirname, 'backup', filename)  # '/Users/david/backup/journal.txt'
    


    os.listdir(): list a directory

    os.listdir() can read the contents of any directory.


    import os
    
    mydirectory = '/Users/david'
    
    items = os.listdir(mydirectory)
    
    for item in items:                                # 'photos'
    
        item_path = os.path.join(mydirectory, item)
    
        print(item_path)  # /Users/david/photos
                          # /Users/david/backups
                          # /Users/david/college_letter.docx
                          # /Users/david/notes.txt
                          # /Users/david/finances.xlsx
    

    Note the os.path.join() call. This is a standard algorithm for looping through a directory -- each item must be joined to the directory to ensure that the filepath is correct.


    exceptions for missing or incorrect files or directories

    Several exceptions can indicate a file or directory misfire.


    exception type         triggered by
    FileNotFoundError      attempt to open a file that does not exist at the given location
    FileExistsError        attempt to create a directory (or in some cases a file) that already exists
    IsADirectoryError      attempt to open() a path that is actually a directory
    NotADirectoryError     attempt to os.listdir() a path that is not a directory
    PermissionError        attempt to read or write a file or directory for which you lack permission
    WindowsError, OSError  these exception types are sometimes raised in place of one or more of the above when on a Windows computer
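
    As a minimal sketch of trapping these exceptions, the hypothetical helper describe_open() below tries to open a path and reports what happened. It relies on the fact that the specific file exceptions are subclasses of OSError, so a final except OSError catches whichever one the operating system raises:

```python
import os
import tempfile


def describe_open(path):
    """Try to open a path and report which exception, if any, occurred.

    describe_open is a hypothetical helper used for illustration.
    """
    try:
        with open(path) as fh:
            fh.read()
        return 'ok'
    except FileNotFoundError:
        return 'not found'
    except OSError as exc:               # IsADirectoryError, PermissionError, ...
        return type(exc).__name__


tmpdir = tempfile.mkdtemp()              # an empty temporary directory
missing = os.path.join(tmpdir, 'no_such_file.txt')

print(describe_open(missing))            # not found
print(describe_open(tmpdir))             # IsADirectoryError on Mac/Linux,
                                         # PermissionError on Windows
os.rmdir(tmpdir)                         # clean up
```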



    traversing a directory tree with os.walk()

    os.walk() visits every directory in a directory tree so we can list files and folders.


    import os
    root_dir = '/Users/david'
    for root, dirs, files in os.walk(root_dir):
    
        for tdir in dirs:                    # loop through dirs in this directory
            print(os.path.join(root, tdir))  # print full path to tdir
    
        for tfile in files:                  # loop through files in this dir
            print(os.path.join(root, tfile)) # print full path to file
    


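    At each iteration, these three variables are assigned these values: root is the path of the directory currently being visited (a string), dirs is a list of the names of the subdirectories found in it, and files is a list of the names of the files found in it.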
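
    To see the three values in action, here is a small sketch that builds a tiny tree in a temporary directory and records what os.walk() yields on each visit. The tree layout and the names top and visits are assumptions made for the example:

```python
import os
import shutil
import tempfile

# build a tiny tree: top/ holds a.txt and sub/, and sub/ holds b.txt
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, 'sub'))
for relpath in ['a.txt', os.path.join('sub', 'b.txt')]:
    with open(os.path.join(top, relpath), 'w') as wfh:
        wfh.write('data\n')

visits = []
for root, dirs, files in os.walk(top):
    # record the directory path relative to top, plus the two name lists
    visits.append((os.path.relpath(root, top), sorted(dirs), sorted(files)))

shutil.rmtree(top)                       # clean up the temporary tree

for visit in visits:
    print(visit)
# ('.', ['sub'], ['a.txt'])
# ('sub', [], ['b.txt'])
```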




    File Tests and Manipulations

    os.path.isfile() and os.path.isdir()

    With these functions we can see whether a file is a plain file, or a directory.


    import os                         # os ('operating system') module talks
                                      # to the os (for file access & more)
    mydirectory = '/Users/david'
    
    items = os.listdir(mydirectory)   # list of strings, files found in this directory
    
    for item in items:                # str, first file or dir found in directory
    
        item_path = os.path.join(mydirectory, item)  # join directory name and file or dir
    
        if os.path.isdir(item_path):
            print(f"{item}:  directory")
        elif os.path.isfile(item_path):
            print(f"{item}:  file")
                                         # photos:  directory
                                         # backups:  directory
                                         # college_letter.docx:  file
                                         # notes.txt:  file
                                         # finances.xlsx:  file
    


    os.path.exists()

    This function tests to see if a file or directory exists on the filesystem.


    import os
    
    fn = input('please enter a file or directory name:  ')
    if not os.path.exists(fn):
        print('item does not exist')
    
    elif os.path.isfile(fn):
        print('item is a file')
    
    elif os.path.isdir(fn):
        print('item is a directory')
    


    read file size with os.path.getsize()

    os.path.getsize() takes a filename and returns the size of the file in bytes


    import os
    
    mydirectory = '/Users/david'
    
    items = os.listdir(mydirectory)
    
    for item in items:
        item_path = os.path.join(mydirectory, item)
        item_size = os.path.getsize(item_path)
        print(f"{item_path}:  {item_size} bytes")
    


    moving or renaming a file

    Moving and renaming a file are essentially the same thing.


    import os
    
    filename = 'file1.txt'
    new_filename = 'newname.txt'
    
    os.rename(filename, new_filename)
    

    import os
    
    filename = 'file1.txt'      # or could be a filepath including directory
    move_to_dir = 'old/'
    
    # renaming file1.txt to old/file1.txt
    os.rename(filename, os.path.join(move_to_dir, filename))
    


    copying or backing up a file

    import shutil                      # the 'shell utilities' module
    
    filename = 'file1.txt'
    backup_filename = 'file1.txt_bk'   # must be a filepath, including filename
    
    shutil.copyfile(filename, backup_filename)
    

    import shutil
    
    filename = 'file1.txt'
    target_dir = 'backup'              # can be a filepath or just a directory name
    
    shutil.copy(filename, target_dir)  # dst can be a folder; shutil.copy2() also preserves file metadata
    


    creating a directory: os.mkdir()

    This function is named after the unix utility mkdir.


    import os
    
    os.mkdir('newdir')
    

    A new directory will be created. If a file or directory with that name already exists, os.mkdir() raises FileExistsError.
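
    A short sketch of what happens when the directory is already there: calling os.mkdir() a second time raises FileExistsError, while os.makedirs() with exist_ok=True does not complain. The example works inside a temporary directory so that it leaves nothing behind; the variable names are illustrative:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
newdir = os.path.join(base, 'newdir')

os.mkdir(newdir)                         # first call creates the directory

try:
    os.mkdir(newdir)                     # second call: directory already exists
    outcome = 'created again'
except FileExistsError:
    outcome = 'already exists'

os.makedirs(newdir, exist_ok=True)       # no error if the directory is present

print(outcome)                           # already exists
shutil.rmtree(base)                      # clean up
```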


    removing a directory or file tree: os.rmdir() and shutil.rmtree()

    If your directory is not empty, shutil.rmtree must be used.


    import os
    import shutil
    
    os.mkdir('newdir')
    
    wfh = open('newdir/newfile.txt', 'w')  # creating a file in the dir
    wfh.write('some data')
    wfh.close()
    
    os.rmdir('newdir')        # OSError: [Errno 66] Directory not empty: 'newdir'
    

    shutil.rmtree('newdir')   # success
    


    copying a file tree

    Again, take care when working with entire trees!


    import shutil
    
    shutil.copytree('olddir', 'newdir')
    

    Regardless of what files and folders are in the directory to be copied, all files and folders (and indeed all child folders and files within those) will be copied to the new name or location.




    Interacting with External Processes

    the operating system (OS) manages files and processes

    Through interacting with the OS, we can manage files and launch other programs.



    the subprocess module

    This module can launch external programs from your Python script.


    The subprocess module allows us to:


    subprocess.call()

    Executes a command and outputs to STDOUT.


    for Mac/Linux, using ls:

    import subprocess
    
    subprocess.call(['ls', '-l'])      # -l means 'long listing'
    

    for Windows, using dir:

    import subprocess
    
    subprocess.call(['dir', '/b'], shell=True)  # /b means 'bare listing'
    


    subprocess.call(): redirecting output

    The output of the called program can be directed to a file or other process.


    sending output of command to a write file (and error output to STDOUT)

    import subprocess
    import sys
    
    wfh = open('outfile.txt', 'w')
    subprocess.call(['ls', '-l'], stdout=wfh, stderr=sys.stdout)
    wfh.close()
    

    reading the contents of a file to the input of command wc

    import subprocess
    
    fh = open('pyku.txt')
    subprocess.call(['wc'], stdin=fh)
    fh.close()
    


    subprocess.call(): executing through the shell with shell=True

    The shell means the program that runs your Command or Terminal Prompt.


    import subprocess
    
    subprocess.call('dir /b', shell=True)
    


    subprocess.check_output()

    This function executes a command and returns its output as a bytes object rather than sending it to STDOUT.


    (using the dir Windows command)

    import subprocess
    
    var = subprocess.check_output(["dir", "."], shell=True)  # dir is built into the Windows shell
    var = var.decode('utf-8')
    print(var)                   # prints the file listing for the current directory
    

    (using the wc Mac/Linux command)

    import subprocess
    
    out = subprocess.check_output(['wc', 'pyku.txt'])
    out = out.decode('utf-8')
    print(out)                  #        3     15     80 pyku.txt
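
    As a portable variation on capturing output, the sketch below uses subprocess.run() with capture_output=True and text=True (both available since Python 3.7), which hands back the output already decoded to a string. To avoid depending on dir, ls or wc, it assumes we launch a second Python interpreter as the external program:

```python
import subprocess
import sys

# run a second Python interpreter as the external program, so the
# example does not depend on ls, dir or wc being available
result = subprocess.run(
    [sys.executable, '-c', "print('hello, world!')"],
    capture_output=True,                 # collect STDOUT and STDERR
    text=True,                           # decode bytes to str for us
)

print(result.returncode)                 # 0
print(result.stdout.strip())             # hello, world!
```

    The returncode attribute holds the exit status of the external program; 0 conventionally means success.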
    


    forking child processes with multiprocessing

    forking allows a running program to execute multiple copies of itself simultaneously.



    multiprocessing example



    from multiprocessing import Process
    import os
    import time
    
    def info(title):                # function for a process to identify itself
        print(title)
        if hasattr(os, 'getppid'):  # Unix; also Windows since Python 3.2
            print('parent process:', os.getppid())
        print('process id:', os.getpid())
    
    def func(childnum):                # function for child to execute
        info(f'|Child Process {childnum}|')
        print('now taking on time consuming task...')
        time.sleep(3)
        print(f'{childnum}:  done')
        print()
    
    if __name__ == '__main__':
    
        info('|Parent Process|')
        print(); print()
        procs = []
        for num in range(3):
            p = Process(target=func, args=(num,))
                                             # a new process object
                                             # target is the function func
            p.start()                        # new process is spawned
            procs.append(p)                  # collecting list of Process objects
    
        for p in procs:
            p.join()                         # parent waits for child to return
        print('parent concludes')
    

    multiprocessing output and discussion

    Because multiple processes are spawned, we must imagine each one as a separately running program.



    Looking closely at the output, note that all three processes executed and that the parent didn't continue until it had heard back from each of them:

    |Parent Process|
    parent process: 92180
    process id: 92316
    
    |Child Process 0|
    parent process: 92316
    process id: 92318
    now taking on time consuming task...
    |Child Process 2|
    parent process: 92316
    process id: 92320
    now taking on time consuming task...
    |Child Process 1|
    parent process: 92316
    process id: 92319
    now taking on time consuming task...
    0:  done
    2:  done
    1:  done
    parent concludes



    More About User-Defined Functions

    user-defined functions and code organization

    User-defined functions help us organize our code -- and our thinking.


    Let's now return to functions from the point of view of code organization. Functions are useful because they:


    review: function block, argument and return value

    def add(val1, val2):
        mysum = val1 + val2
        return mysum
    
    a = add(5, 10)      # int, 15
    
    b = add(0.2, 0.2)   # float, 0.4
    

    Review what we've learned about functions:


    Ex. 12.1 - 12.4


    functions without a return statement return None

    When a function does not return anything, it returns None.


    def do(arg):
        print(f'{arg} doubled is {arg * 2}')
        # no return statement returns None
    
    x = do(5)        # (prints '5 doubled is 10')
    
    print(x)         # None
    


    Since do() does not return anything useful, we should not call it with an assignment (i.e., x = above). If you call a function and find that its return value is None, it often means that the function was not meant to be assigned because it has no useful return value.


    Ex. 12.5 - 12.6


    the None object type

    The None value is the "value that means 'no value'".


    zz = None
    
    print(zz)        # None
    print(type(zz))  # <class 'NoneType'>
    
    aa = 'None'      # oops, this is a string -- not the None value!
    


    function argument type: positional

    Positional arguments are required to be passed, and assigned by position.


    def greet(firstname, lastname):
        print(f"Hello, {firstname} {lastname}!")
    
    greet('Joe', 'Wilson')   # passed two arguments:  correct
    
    greet('Marie')           # TypeError: greet() missing 1 required positional argument: 'lastname'
    


    function argument type: keyword

    Keyword arguments are optional; if one is not passed, the parameter takes its default value.


    def greet(lastname, firstname='Citizen'):
        print(f"Hello, {firstname} {lastname}!")
    
    greet('Kim', firstname='Joe')   # Hello, Joe Kim!
    
    greet('Kim')                    # Hello, Citizen Kim!
    


    Ex. 12.7 - 12.8




    User-Defined Function Variable Scoping

    variable name scoping: the local variable

    Variable names initialized inside a function are local to the function.


    def myfunc():
        tee = 10
        return tee
    
    var = myfunc()
    
    print(var)          # 10
    
    print(tee)          # NameError ('tee' does not exist here)
    


    Ex. 12.9


    variable name scoping: the global variable

    Any variable defined in our code outside a function is global.


    var = 'hello global'      # global variable
    
    def myfunc():
        print(var)            # this global is available here
    
    myfunc()                  # hello global
    


    Ex. 12.10 - 12.11


    "pure" functions

    Functions that do not touch outside variables, and do not create "side effects" (for example, calling exit(), print() or input()), are considered "pure" -- and are preferred.


    "Pure" functions have the following characteristics:


    "pure" functions: working only with "inside" (local) variables

    "Outside" (Global) variables are ones defined outside the function -- they should be avoided.


    wrong way: referring to an outside variable inside a function

    val = '5'                   # defined outside any function
    
    def doubleit():
        dval = int(val) * 2     # BAD:  function refers to "global" variable 'val'
        return dval
    
    new_val = doubleit()
    

    right way: passing outside variables as arguments

    val = '5'                   # defined outside any function
    
    def doubleit(arg):
        dval = int(arg) * 2     # GOOD:  refers to the same value '5',
        return dval             #        but accessed through local
                                #        argument 'arg'
    
    new_val = doubleit(val)     # passing variable to function -
                                #   correct way to get a value into the function
    


    "pure" functions: avoiding "side-effects"

    print(), input(), exit() all "touch" the outside world and in many cases should be avoided inside functions.



    Although it is of course possible (and sometimes practical) to use these built-in functions inside our function, we should avoid them if we are interested in making a function "pure". It should also be noted that it's fine to use any of these in a function during development - they are all useful development tools.


    "pure" functions: using raise instead of exit() inside functions

    exit() should not be called inside a function.


    def doubleit(arg):
        if not arg.isdigit():
            raise ValueError('arg must be all digits')   # GOOD:  error signaled with raise
        dval = int(arg) * 2
        return dval
    
    val = input('what is your value? ')
    new_val = doubleit(val)
    


    signalling errors (exceptions) with raise

    raise creates an error condition (exception) that usually terminates program execution.



    To raise an exception, we simply follow raise with the type of error we would like to raise, and an optional message:

    raise ValueError('please use a correct value')
    

    You may raise any existing exception (you may even define your own). Here is a list of common exceptions:

    Exception Type       Reason
    TypeError            the wrong type used in an expression
    ValueError           the wrong value used in an expression
    FileNotFoundError    a file or directory is requested that doesn't exist
    IndexError           use of an index for a nonexistent list/tuple item
    KeyError             a requested key does not exist in the dictionary


    Ex. 12.12 - 12.13
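
    A raised exception does not have to end the program: the calling code can trap it. The sketch below reuses doubleit() from the earlier slide and adds a hypothetical caller, safe_double(), that catches the ValueError and reports its message:

```python
def doubleit(arg):
    if not arg.isdigit():
        raise ValueError('arg must be all digits')   # signal the error
    return int(arg) * 2


def safe_double(text):
    """Hypothetical caller that traps the ValueError raised by doubleit()."""
    try:
        return doubleit(text)
    except ValueError as exc:
        return f'error: {exc}'           # exc carries the message from raise


print(safe_double('21'))                 # 42
print(safe_double('abc'))                # error: arg must be all digits
```

    The function itself stays free of print() and exit(); the decision about what to do with the error is left to the caller.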


    global variables and function "purity"

    Globals should be used inside functions only in select circumstances.


    STATE_TAX = .05    # ALL CAPS designates a "constant"
    
    
    def calculate_bill(bill_amount, tip_pct):
    
        tax = bill_amount * STATE_TAX     # float, 5.0
        tip = bill_amount * tip_pct       # float, 20.0
    
        total_amount = bill_amount + tax + tip   # float, 125.0
    
        return total_amount
    
    
    total = calculate_bill(100, .20)      # float, 125.0
    


    "pure" functions: why prefer them?

    Here are some positive reasons to strive for purity.


    You may have noticed that these "impure" practices do not cause Python errors. So why should we avoid them?



    The above perspective will become clearer as you write longer programs. As your programs become more complex, you will be confronted with more complex errors that are sometimes difficult to trace. Over time you'll realize that the best practice of using pure functions enhances the "quality" of your code -- making it easier to write, maintain, extend and understand the programs you create.

    Again, please note that during development it is perfectly allowable to call print(), exit() or input() from inside a function. We may also decide on our own that this is all right in shorter programs, or ones that we are working on in isolation. It is with longer programs and collaborative projects where purity becomes more important.


    proper code organization

    Let's discuss some essential elements of a program.


    Here are the main components of a properly formatted program. Please see the tip_calculator.py file in your files directory for an example:


    review of tip_calculator.py


    the four variable scopes: L-E-G-B

    Four kinds of variables: (L)ocal, (E)nclosing, (G)lobal and (B)uiltin.


    filename = 'pyku.txt'       # 'filename':  global
    
                                # 'get_text':  global (function name is a
                                #                      variable as well)
    def get_text(fname):        # 'fname':     local
        fh = open(fname)        # 'fh':        local; 'open':  builtin
        text = fh.read()        # 'text':      local
        return text
    
    txt = get_text(filename)    # 'txt':       global
    print(txt)                  # 'print':     builtin
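
    The example above shows local, global and builtin names. The enclosing scope only comes into play with a function defined inside another function. Here is a minimal sketch of that case; the names make_greeter, greet and punctuation are made up for illustration:

```python
greeting = 'hello'                       # 'greeting':     global

def make_greeter(name):                  # 'name':         local to make_greeter
    punctuation = '!'                    # 'punctuation':  local here, and
                                         #                 enclosing for greet()
    def greet():
        # greet() defines no variables of its own: 'name' and 'punctuation'
        # are found in the enclosing scope, 'greeting' in the global scope,
        # and 'str' in the builtin scope
        return str(greeting) + ', ' + name + punctuation

    return greet                         # hand back the inner function

greeter = make_greeter('world')          # 'greeter':      global
print(greeter())                         # hello, world!
```

    Python looks a name up in that order: Local first, then Enclosing, then Global, then Builtin.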