Introduction to Python

davidbpython.com




Projects, Session 3



PLEASE REMEMBER:

  1. re-read the assignment before submitting
  2. go through the checklist including the tests
  3. make sure your notations are as specified in the homework instructions

All requirements are detailed in the homework instructions document.

Careless omissions will result in reductions to your solution grade.

 

SPECIAL NOTE ON OPENING FILES

You may have noticed that the filenames we are using in the Inclass Exercises begin with '../'. This specifies that the file can be found in the parent directory, i.e. the directory "one above" the folder our script is in. Any path information that precedes a filename, as long as it does not begin with C:/ (Windows) or / (Mac), is called a relative path. Not all relative paths will specify the parent directory as ../ does. To find a file using a relative path, we must know a) the location of the file, and b) the location from which we are running our script (the "present working directory"); from this we can determine c) the path needed to access the file location from the pwd. If Python can't find your file, it may be because the relative path is incorrect.

If the file you want to open is in the same directory as the script you're executing, use the filename alone:
fh = open('filename.txt')
If the file you want to open is in the parent directory from the script you're executing, use the filename with ../:
fh = open('../filename.txt')
If the file you want to open is in a child directory from the script you're executing, use the filename with the child directory name prepended:
fh = open('<childdir>/filename.txt')

(Replace <childdir> with the name of the child directory.)
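
To see how a relative path resolves, it can help to print the present working directory and the absolute path Python will actually try to open. The following is a minimal sketch using the standard os module; filename.txt and childdir are placeholder names, not files from the course.

```python
import os

# the directory Python treats as the starting point for relative paths
pwd = os.getcwd()
print(f'present working directory: {pwd}')

# placeholder relative paths (same directory, parent directory, child directory)
relative_paths = ['filename.txt', '../filename.txt', 'childdir/filename.txt']

for rel_path in relative_paths:
    abs_path = os.path.abspath(rel_path)     # where the relative path actually points
    found = os.path.exists(abs_path)         # True only if something exists at that location
    print(f'{rel_path} -> {abs_path} (exists: {found})')
```

If a path you expected to work shows "exists: False", compare the printed absolute path with the file's real location.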

 
3.1 Notes typing assignment. Please write out this week's transcription notes. The notes are displayed as an image named transcription in each week's project files folder.

This does not need to be in a Python program - you can use a simple text file.

 
3.2 Use a while True loop to keep taking input until user enters a 4-digit string.

The while True loop will repeatedly take user input from the keyboard -- a 4-digit year -- and reject the input with a polite (or rude) message unless it is all digits and exactly 4 characters long. If the input is bad, the program asks again. If the input is good, it should break out of the loop.

Sample program run:
please enter a 4-digit year:  hello
sorry, that was bad input
please enter a 4-digit year:  what's wrong?
sorry, that was bad input
please enter a 4-digit year:  1990
thanks!  Your value is 1990

[program breaks out and then exits]

Please use a single 'if' statement with an 'and' or 'or' compound test, rather than two 'if' statements -- see discussion for more detail. Please keep in mind that in a compound test, each test stands alone and is evaluated by itself -- this means that the comparison test on either side of 'and' or 'or' must be complete.

HOMEWORK CHECKLIST: all points are required


    testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

    when testing for bad input (must be 4-characters long and must be all digits, otherwise input is "bad"), program uses a compound test (with 'and' or 'or') rather than using an 'if' block nested inside another 'if' block (nested 'if' blocks are inherently more complex logic)

    when testing for .isdigit(), code does not compare to True (e.g., 'if x.isdigit() != True:'). Instead, write 'if x.isdigit():' or 'if not x.isdigit():'

    program does not use exit() to stop the loop -- it uses break (which immediately takes us out of the loop) and drops down below

    code does not feature input() more than once. To do this, put your input() call just inside the while True block as the first statement, and to signal an error and ask again, put the 'sorry' message at the bottom and then allow the while to loop back to take input again.

    there are no extraneous comments or "testing" code lines

    program follows all recommendations in the "Code Quality" handout
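
As an illustration of the compound-test and loop-structure points above (not a solution to this assignment), here is a sketch that validates a different kind of value: a 2-letter code. The function name and the list of canned responses are hypothetical; the list stands in for what a user might type, so that the example runs without a keyboard.

```python
def first_good_code(responses):
    """Return the first response that is exactly 2 letters long."""
    answers = iter(responses)              # stands in for the user typing at the keyboard
    while True:
        text = next(answers)               # values are read in only one place
        # each side of 'and' is a complete test on its own
        if len(text) == 2 and text.isalpha():
            break                          # good input: leave the loop immediately
        print('sorry, that was bad input') # bad input: loop back and read again
    return text


print(first_good_code(['hello', '42', 'NY']))
```

Note that writing the test as "if len(text) == 2 and isalpha()" would not work: the right-hand side must be a complete expression, text.isalpha().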

 
3.3 Calculate the sum and count of Mkt-RF values (leftmost column of floats) for a given year.

Start with a 4-digit string year (for example, '1990') assigned to a string variable:

year = '1990'

Then, looping through the FF_data.txt file, calculate the count and the sum of the Mkt-RF values (the leftmost column of floating-point values, i.e. the 2nd column in the file) for that year. (Note that the output below is from FF_data.txt.)

Sample run (with year = '1990')
253
-12.77
Sample run (with year = '1927')
301
26.26
Sample run (with year = '1945')
286
32.55

Note: it will be much easier to use FF_abbreviated.txt for testing, as you can mentally calculate the correct amount and check it against your program's output. Once this is confirmed, you can then run the program against the FF_data.txt file and compare your output to the sample output.

If Python can't find your file, it may be because the relative path is incorrect. Please see the note on filepaths near the top of this page. See the homework discussion for more detail and a step-by-step.

HOMEWORK CHECKLIST: all points are required


    testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

    all variable names are lowercase only. MktRF makes sense as a name but it is not appropriate for regular Python variables.

    actions that need to be done repetitively are located inside the loop - for example, incrementing the count or adding to the sum, which needs to happen once for every line in the file.

    the 'for' loop is not nested in an 'if' or 'while' block.

    actions that should only be done once are located outside the loop - for example, initializing the count and sum variables to 0, or printing the result

    there are no extraneous comments or "testing" code lines

    program follows all recommendations in the "Code Quality" handout
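
The inside-the-loop versus outside-the-loop distinction in the checklist can be sketched with a small in-memory example. The lines below are made-up data in a made-up layout (a label followed by one number), not the FF_data.txt format, and the variable names are illustrative only.

```python
# made-up sample lines: a label, then a float value
lines = ['apple 1.5', 'pear 2.0', 'apple 0.25', 'apple 3.0']

wanted = 'apple'

# done once, before the loop: initialize the running totals
item_count = 0
item_sum = 0.0

for line in lines:
    fields = line.split()              # split the line on whitespace
    if fields[0] == wanted:
        # done once per matching line: update the running totals
        item_count = item_count + 1
        item_sum = item_sum + float(fields[1])

# done once, after the loop: report the results
print(item_count)
print(round(item_sum, 2))
```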

 
3.4 Assign 35.9 to a float variable, and 7 to an int variable:
mysum = 35.9
mycount = 7

Next calculate an "average" value by dividing the float by the int. Round the result to two places so the result is 5.13.
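
For reference, the built-in round() function takes the value and the number of decimal places. Here is a small sketch with different numbers than the assignment's (the variable names are illustrative):

```python
total = 22.0
parts = 7

ratio = total / parts          # dividing a float by an int gives a float
rounded = round(ratio, 2)      # keep two decimal places

print(ratio)
print(rounded)                 # 3.14
```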

 
3.5 Create a complete program built with the last 3 solutions, and test the entire program.

PASTE IN THE CODE, ONE AFTER THE OTHER, for the previous three assignments into one complete program, and alter variable names so that they work together. (You'll also remove the hard-coded year from the 'for' loop section, the hard-coded sum and count values from the average calculation section, and any print statements that show the result from the first two solutions.) However, please don't indent any one of the pasted solutions inside any of the others' blocks! They are to follow each other, one after the other.

This is a complete program. You must test this entire program as shown below. Please paste in the 1st code solution (not the transcription), then paste in the 2nd code solution, then paste in the 3rd code solution. You must not try to rewrite the code or remove or change lines (however, please do remove the hard-coded year from the 2nd solution and the hard-coded count and sum from the 3rd solution; you may also need to change some variable names and the print statement at the end). You must not mix the solutions in any way. You must test the solution to see that it works as shown below.

Consider the work done in the previous three assignments:

  • the first program took user input for a 4-digit year
  • the second program used a 4-digit year to add up a sum (float) and count (int) of values for that year
  • the third program divided a float by an int to calculate an average, and rounded that result to 2 places.

These assignments are meant to be built into a single program. You only need to paste them in one after the other, remove the hard-coded year from the 2nd program and hard-coded count and sum from the 3rd solution, and change the variable names as necessary for the program to work as shown below.

Please do not rewrite this solution! Do not let your pasted-in solutions overlap or "nest"! You only need to paste the three solutions, in order, remove the hard-coded year, count and sum, and "hook up" the three solutions by changing the variable names so that they match between parts.

Why shouldn't you rewrite your solution or let your solutions overlap? If you attempt to rewrite the solution, you will likely begin nesting blocks in a way that this assignment is designed to prevent. We will discuss this in class; if you are curious before then, contact me.

There is no need for notations in this solution.

(Note the below output is from FF_data.txt)

Sample program runs:
please enter a 4-digit year:  hello
sorry, that was bad input
please enter a 4-digit year:  what's wrong?
sorry, that was bad input
please enter a 4-digit year:  1990
count 253, sum -12.77, avg -0.05
please enter a 4-digit year:  1927
count 301, sum 26.26, avg 0.09
please enter a 4-digit year:  1945
count 286, sum 32.55, avg 0.11

HOMEWORK CHECKLIST: all points are required


    testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

    this is a complete program that has been successfully tested as shown in the example runs above

    the three original programs do not overlap and are not rewritten in any way - they follow one another separately.

    the program uses f'' strings to combine numbers with strings, or strings with other strings

    you do not need to include notations for this solution

    there are no extraneous comments or "testing" code lines

    program follows all recommendations in the "Code Quality" handout
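
As a reminder of how f'' strings combine values with text, here is a short sketch using made-up values (the variable names and numbers are illustrative, not taken from the data files):

```python
item_count = 3
item_sum = 4.75
item_avg = round(item_sum / item_count, 2)

# expressions inside the curly braces are converted to text automatically
message = f'count {item_count}, sum {item_sum}, avg {item_avg}'
print(message)
```

No str() conversion or '+' concatenation is needed; the f'' string handles the int and float values directly.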

 

EXTRA CREDIT / SUPPLEMENTARY

 
3.6 (Extra credit / supplementary): Write a Python program that can download CSV data from the internet and parse it with the csv module.

Real-world data is often more complex than the data we are working with, and supplementary exercises like these may be more complicated and/or time consuming than the carefully constructed assignments that are required. For these I will not work out solutions, so please enjoy the adventure and send me your questions.

Using requests

You can refer to the slide deck from Session 3 for complete info on requests, but here is a summary:

import requests

my_url = ('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&'
          'symbol=IBM&interval=5min&apikey=demo&datatype=csv')

         # please note parens are only needed because I have 2 parts of
         # a string I'd like to put together -- it's called "implicit concatenation"
         # if you're not doing that, then parens around the str are not needed


response = requests.get(my_url)         # return a 'response' object
text = response.text                    # read into a string

Parsing CSV Files

Downloaded CSV files should be parsed with the csv module, as CSV can be more complex than just comma separators.

The csv.reader() function is usually given a file object, but it accepts any iterable of lines, so we can also pass a list of lines to it:
import csv
import requests

my_url = ('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&'
          'symbol=IBM&interval=5min&apikey=demo&datatype=csv')

         # please note parens are only needed because I have 2 parts of
         # a string I'd like to put together -- it's called "implicit concatenation"
         # if you're not doing that, then parens around the str are not needed


response = requests.get(my_url)         # return a 'response' object
text = response.text                    # read into a string

lines = text.splitlines()               # list of str (lines)

reader = csv.reader(lines)

header_line = next(reader)              # only if the data has a header line: returns the
                                        # header row and advances the reader past it

print(f'header line (may not be needed for your program):  {header_line}')

for row in reader:
    print(row)
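
To illustrate why the csv module is preferable to splitting on commas, here is a small offline sketch. The two lines of data are made up; the second contains a quoted field with a comma inside it.

```python
import csv

# made-up CSV lines; the quoted field contains a comma
lines = ['name,city,amount',
         '"Smith, Jane",Boston,12.5']

# a plain split breaks the quoted field into two pieces
print(lines[1].split(','))

# csv.reader understands the quotes and keeps the field whole
reader = csv.reader(lines)
header_line = next(reader)
rows = list(reader)
print(rows[0])
```

The plain split produces four pieces for the second line, while csv.reader produces the intended three fields.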

For discussion of requests, see the slide deck entitled "Supplementary Modules: CSV, SQL, JSON and the Internet".

Select one of the sources listed at the link below. Pick something interesting, and something you feel you might be able to learn from based on the (albeit simple) techniques we've described.

https://catalog.data.gov/dataset?res_format=CSV&page=1

The yellow [CSV] button under each source is actually a link to CSV data (in most cases), but we don't want to download the data to a browser, so don't click the button. When you've found data you'd like to analyze, right-click this yellow [CSV] button and select "Copy Link Location". (If you don't see that choice, move your cursor on and off, and try again.) Copy this link to a plain text file for use in your program. Now use your selected link (not necessarily the one above) with the requests module to download the data of your choice from the website.

Remember that the .text attribute of the response object will return a string of the server's response. You can then split into lines if desired:
text = response.text
lines = text.splitlines()
print(lines[0:4])

Then you can decide which values you'd like to analyze (at this point, we can only do sums and averages). Send me your questions. Enjoy!
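
Once the rows are parsed, a sum and an average of one column can be computed with the same loop pattern used earlier in this session. The sketch below uses made-up lines in place of a downloaded response, and the column position (index 1) is an assumption that depends on the data you choose.

```python
import csv

# made-up lines standing in for text.splitlines() from a downloaded response
lines = ['station,reading',
         'A,10.5',
         'B,4.0',
         'C,0.5']

reader = csv.reader(lines)
header_line = next(reader)             # skip the header row

value_count = 0
value_sum = 0.0

for row in reader:
    value_sum = value_sum + float(row[1])    # column index 1 is assumed to hold the number
    value_count = value_count + 1

value_avg = round(value_sum / value_count, 2)
print(f'count {value_count}, sum {value_sum}, avg {value_avg}')
```

Real data may contain empty or non-numeric values in the chosen column, in which case float() will raise an error and the rows will need to be filtered first.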

 