Introduction to Python

davidbpython.com




Projects, Session 5



PLEASE REMEMBER:

  1. re-read the assignment before submitting
  2. go through the checklist including the tests
  3. make sure your notations are as specified in the homework instructions

All requirements are detailed in the homework instructions document.

Careless omissions will result in reductions to your solution grade.

 

NOTE ON OPENING FILES

If the file you want to open is in the same directory as the script you're executing, use the filename alone:
fh = open('filename.txt')
If the file you want to open is in the parent directory from the script you're executing, use the filename with ../:
fh = open('../filename.txt')
If the file you want to open is in a child directory from the script you're executing, use the filename with the child directory name prepended:
fh = open('<childdir>/filename.txt')

(Replace <childdir> with the name of the child directory.)
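
The three cases above can be tried out safely in a throwaway folder. The sketch below is only an illustration: the folder names parent, scripts and child and the file note.txt are made up for the demonstration. It relies on the fact that Python resolves a relative path against the current working directory, which is the script's own folder when you run the script from there.

```python
import os
import shutil
import tempfile

# Build a small throwaway tree (all names here are hypothetical):
#   parent/note.txt
#   parent/scripts/note.txt
#   parent/scripts/child/note.txt
start_dir = os.getcwd()
base = tempfile.mkdtemp()
scripts_dir = os.path.join(base, 'parent', 'scripts')
os.makedirs(os.path.join(scripts_dir, 'child'))

for folder, text in [('..', 'in parent'), ('.', 'in same dir'), ('child', 'in child')]:
    with open(os.path.join(scripts_dir, folder, 'note.txt'), 'w') as fh:
        fh.write(text)

# Pretend a script is running from inside parent/scripts
os.chdir(scripts_dir)

results = []
for path in ['note.txt', '../note.txt', 'child/note.txt']:
    fh = open(path)
    results.append(fh.read())
    fh.close()

os.chdir(start_dir)      # go back to where we started
shutil.rmtree(base)      # remove the throwaway tree
print(results)           # ['in same dir', 'in parent', 'in child']
```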

 
5.1 Notes typing assignment. Please write out this week's transcription notes. The notes are displayed as an image named transcription in each week's project files folder.

This does not need to be in a Python program - you can use a simple text file.

 
5.2 Filepaths Exercises.

As usual, returned solutions will lose points. It is recommended to confirm (through testing) that your answer is correct before submitting to ensure that you will receive credit. Notations are not required for this solution.

Start with the below file tree, which is available in this week's data folder:
dir1
├── file1.txt
├── test1.py
│
├── dir2a
│   ├── file2a.txt
│   ├── test2a.py
│   │
│   └── dir3a
│       ├── file3a.txt
│       ├── test3a.py
│       │
│       └── dir4
│           ├── file4.txt
│           └── test4.py
└── dir2b
    ├── file2b.txt
    ├── test2b.py
    │
    └── dir3b
       ├── file3b.txt
       └── test3b.py

To complete this assignment, please open and edit each of the below 5 .py scripts in the tree so that they open the noted .txt files:

  1. test2a.py: open and read file3a.txt
  2. test1.py: open and read file4.txt
  3. test4.py: open and read file3a.txt
  4. test3a.py: open and read file1.txt
  5. test2b.py: also open and read file2a.txt

Your job is to fill in the relative filepath (i.e. not starting with C:\Users or /Users) needed to open the indicated file in the open(r'') function call in each script.
Test (i.e., run) each of the above scripts to verify that they open and read the indicated file.
Finally, copy out the paths that you used to open each file as indicated below. (The first has been done for you.)
Keep in mind, this is not a program to be run! Simply fill in the open() filepaths you used to read each .txt file from the indicated .py file.

######## test2a.py:  read file3a.txt ########

fh = open('')            # add relative filepath here to open file3a.txt


######## test1.py:  read file4.txt ########

fh = open('')            # add relative filepath here to open file4.txt


######## test4.py:  read file3a.txt ########

fh = open('')            # add relative filepath here to open file3a.txt


######## test3a.py:  read file1.txt ########

fh = open('')            # add relative filepath here to open file1.txt


######## test2b.py:  read file2a.txt ########

fh = open('')            # add relative filepath here to open file2a.txt
                         # hint:  you must go "down, then up" within the single filepath

See "Filepaths for Locating Files" slide deck from last session for a discussion to assist in completing this assignment. Send me any questions you may have.

 
5.3 Lookup Dictionary. Reading file states.csv (see the file in this session's source data), build a dict of pairs with each state's name as the key and its abbreviation as the value (for example, New York as key and NY as value). Then read user input for a state name. If the state name is a key in the dict, display the abbreviation for that state. If it is not a key in the dict (check this with 'if'), print the error message no state found with name "<keyname>", where <keyname> is the name that was not found.

PLEASE DO NOT USE try/except FOR THIS EXERCISE. Use an 'if' statement to see if the key is in the dict. Note that if Python can't find your file, it may be because the relative path is incorrect. Please see "Filepaths for Locating Files" in the Session 4 Slides.

Sample program runs:
there are 50 pairs in the lookup dict
please enter a state name:  California
Abbreviation for California is CA

there are 50 pairs in the lookup dict
please enter a state name:  New York
Abbreviation for New York is NY

there are 50 pairs in the lookup dict
please enter a state name:  Oman
no state found with name "Oman"

  • Start with an empty dict
  • Loop through state_name_abbrev.csv line-by-line
  • Inside the loop, isolate (split out) the state name and abbreviation from each line
  • Still inside the loop, add a key/value pair to the dict: each state name key and 2-character abbreviation value
  • After the dict is fully built and the loop is complete, ask the user for a state name using input()
  • If the state name is a key in the dict, print out its abbreviation
  • If the name is not in the dict, print out "no state found with name ... " and include the submitted name

See discussion for more detail.

HOMEWORK CHECKLIST: all points are required


    testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

    program does not include try/except

    program loads the entire 50 pair dict before taking user input for a state name

    program does not loop through the dict to find the key. You can use the state name directly to get the value from the dict -- no looping is used.

    Program does not use a counter to determine the length of the dict, nor does it use a hard-coded "50". We want to avoid hard-coding values into our programs in case the data changes (for example, we add territories or Canadian provinces). Instead use len() on the dict to get its size.

    program does not require Python to take the len() of the dict more than one time

    there are no extraneous comments or "testing" code lines

    program follows all recommendations in the "Code Quality" handout

 
5.4 Lookup Dictionary with try/except (notations not required for this solution). Please resubmit your solution to the previous exercise, this time using try/except instead of 'if' to respond if the user types an invalid state name. With an invalid state name, Python should raise an exception -- if so, trap the exception and print the error message no state found with name "<keyname>" where <keyname> is the key that was not found.

YOU MUST USE try/except TO TRAP THE ERROR if the user types a bad key. Please do not use 'if' in this exercise. YOU MUST NOT USE except: by itself, nor except Exception: as these trap any type of exception. We must be specific when trapping exceptions.

Sample program runs:
there are 50 pairs in the lookup dict
please enter a state name:  California
Abbreviation for California is CA

there are 50 pairs in the lookup dict
please enter a state name:  New York
Abbreviation for New York is NY

there are 50 pairs in the lookup dict
please enter a state name:  Oman
no state found with name "Oman"

See discussion for more detail.

HOMEWORK CHECKLIST: all points are required


    testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

    program is exactly the same except for using try/except (rather than 'if') to handle the exception raised if the user inputs a bad state name

    program does not use an 'if' statement to detect a bad state name. Instead, it uses try to trap the exception that is raised, and except to print an error message

    program does not use except: by itself, nor except Exception: as these trap any type of exception

    try: block is placed around only the line where the exception is expected, and no other lines are included
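
A minimal sketch of the pattern this checklist describes, using a three-entry stand-in dict instead of the real file. Subscripting a dict with a missing key raises KeyError, so that is the exception named in the except clause. The helper lookup_message is hypothetical; it takes the name as an argument so the example runs without input().

```python
# Stand-in for the dict that would be built from the csv file
state_abbrevs = {'California': 'CA', 'New York': 'NY', 'Texas': 'TX'}


def lookup_message(name, lookup):
    """Return the line to print for a state name (hypothetical helper)."""
    try:
        abbrev = lookup[name]            # the only line expected to raise
    except KeyError:                     # specific: a missing dict key
        return f'no state found with name "{name}"'
    return f'Abbreviation for {name} is {abbrev}'


print(lookup_message('California', state_abbrevs))
print(lookup_message('Oman', state_abbrevs))
```

Keeping only the subscript inside try: means a KeyError from any other line would not be silently trapped.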

 
5.5 Ranking. Reading cities_green_space.csv, build a dictionary that pairs city name keys with "pct" float values.
{'Amsterdam': 13.0, 'Austin': 10.0, 'Barcelona': 28.0, 'Bogotá': 4.9,
 'Brussels': 18.8, 'Buenos Aires': 9.4, 'Cape Town': 24.0, 'Chengdu': 42.3,
 'Dublin': 26.0, 'Edinburgh': 49.2, 'Guangzhou': 19.78, 'Helsinki': 40.0,
 'Hong Kong': 40.0, 'Istanbul': 2.2, 'Johannesburg': 24.0, 'Lisbon': 18.0,
 'London': 33.0, 'Los Angeles': 34.7, 'Melbourne': 9.3, 'Milan': 13.74,
 'Montréal': 12.82, 'Moscow': 18.0, 'Nanjing': 40.67, 'New York': 27.0,
 'Oslo': 68.0, 'Paris': 10.0, 'Rome': 38.9, 'San Francisco': 13.0,
 'Seoul': 27.91, 'Shanghai': 16.2, 'Shenzhen': 40.9, 'Singapore': 47.0,
 'Stockholm': 40.0, 'Sydney': 46.0, 'Taipei': 6.56, 'Tokyo': 7.5,
 'Toronto': 13.0, 'Vienna': 50.0, 'Warsaw': 17.0, 'Zürich': 41.0}

Next, sort the dict keys by value in reverse order, then loop through and print each city name and its pct value.
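
One way to sort keys by value, shown on a five-city subset of the dict above: sorted() accepts a key= function, and passing the dict's get method makes each key sort by its value. This is one common approach and not necessarily the one covered in the session slides; the variable names are chosen for the illustration.

```python
green_pct = {'Amsterdam': 13.0, 'Austin': 10.0, 'Helsinki': 40.0,
             'Hong Kong': 40.0, 'Oslo': 68.0}

# key=green_pct.get tells sorted() to compare the values, not the names
ranked = sorted(green_pct, key=green_pct.get, reverse=True)

for city in ranked:
    print(city, green_pct[city])
```

Because sorted() is stable even with reverse=True, cities with equal values (Helsinki and Hong Kong here) stay in the order in which they were added to the dict.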

Expected Output:
Cities Ranked by Greenspace (% of total area)

Oslo 68.0
Vienna 50.0
Edinburgh 49.2
Singapore 47.0
Sydney 46.0
Chengdu 42.3
Zürich 41.0
Shenzhen 40.9
Nanjing 40.67
Helsinki 40.0
Hong Kong 40.0
Stockholm 40.0
Rome 38.9
Los Angeles 34.7
London 33.0
Barcelona 28.0
Seoul 27.91
New York 27.0
Dublin 26.0
Cape Town 24.0
Johannesburg 24.0
Guangzhou 19.78
Brussels 18.8
Lisbon 18.0
Moscow 18.0
Warsaw 17.0
Shanghai 16.2
Milan 13.74
Amsterdam 13.0
San Francisco 13.0
Toronto 13.0
Montréal 12.82
Austin 10.0
Paris 10.0
Buenos Aires 9.4
Melbourne 9.3
Tokyo 7.5
Taipei 6.56
Bogotá 4.9
Istanbul 2.2
 
5.6 (Extra credit.) Summing dictionary. Reading FF_abbreviated.txt, build a dictionary that sums all of the Mkt-RF values (the 2nd column, or leftmost float values) associated with each year. Sort the dictionary's keys by value and print each key and corresponding value, so that the values sort ascending. Do not loop through the source data more than once. (You will loop through the sorted dict keys once you have built the dict.)
Sample program run:
1926:  3.39
1928:  3.88
1927:  4.67

(Note: these results are rounded - don't worry if yours are off by a small amount.)

  • Loop through FF_abbreviated.txt
  • On each line, isolate the 4-digit year and 1st float value, converting the float value to float
  • If the year is not yet in the dictionary, set the year as a key with the float value as its value in the dict
  • Otherwise (if the year is already in the dictionary), add the float value from the current line to the value for this year in the dict
  • (Loop will repeat the 3 above steps once for each row in the file.)
  • When the loop is done, sort the dict keys by value
  • Loop through the sorted keys and print out the year and value as shown in the sample run
  • Avoid looping through the source data lines more than once
  • Store all values in a dictionary only. A dictionary is all you need to compile a sum for each year.
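
The steps above can be sketched on a few made-up rows. The layout is an assumption (an 8-digit date followed by whitespace-separated floats, no header lines) and the numbers are invented, so check the real file before reusing the slicing and splitting shown here.

```python
# Made-up rows in an assumed layout: date, then whitespace-separated floats
lines = [
    '19260701    0.50    0.10\n',
    '19260702    0.25    0.20\n',
    '19270103    1.50    0.30\n',
    '19270104   -0.25    0.40\n',
    '19280103    0.75    0.50\n',
]

year_sums = {}
for line in lines:                       # single pass through the data
    fields = line.split()
    year = fields[0][:4]                 # first 4 characters of the date
    value = float(fields[1])             # first float column
    if year not in year_sums:
        year_sums[year] = value          # first time we see this year
    else:
        year_sums[year] = year_sums[year] + value

sorted_years = sorted(year_sums, key=year_sums.get)   # ascending by value
for year in sorted_years:
    print(f'{year}:  {round(year_sums[year], 2)}')
```

The running total for each year lives only in the dict, so no separate float variable is needed to hold the current sum.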

Note that if Python can't find your file, it may be because the relative path is incorrect. Please see "Filepaths for Locating Files" in the Session 4 Slides. See the discussion for more detail.

Additional challenge: use the full file; select sort direction; select number of results. Read from FF_data.txt. Take two inputs from the user: number of results, and the word 'ascending' or 'descending'.

Sample program run:
please enter a number of results:  5
please enter a direction (ascending or descending):  descending
1933:  53.89
1954:  40.06
1935:  38.15
1958:  35.96
1945:  32.55

These results show the best market years in history -- can you calculate the worst? (Note: these results are rounded at the very end during the loop in which we display the floats. If your numbers are off by a small amount, it may be due to your own computer's float precision.)

  • Read and validate two values from two calls to input(): number of results (must be an integer) and 'ascending' or 'descending' (must be one of those two words)
  • Read the full FF_data.txt file
  • Populate the dictionary as in the for-credit part of the assignment
  • Generate a list of year keys sorted by value, and use reverse=True if user specified 'descending'
  • Slice the sorted list so there are only as many results as specified in the integer argument
  • Do not use a counter to count how many to display! Use a slice of the sorted list of keys
  • Loop through the sorted list and print out the year and associated value from the dictionary, rounding each value to 2 places.
  • DO NOT REPEAT CODE (within this solution)! Some students repeat the same sorting, slicing and looping code once for each of 'ascending' and 'descending'. This is not necessary; instead, simply use the 'ascending' or 'descending' value to determine whether or not to reverse the sort (reverse=True or reverse=False).

    Note: this solution must be completed without repeating any code within this solution! (Of course it will repeat much of the code in the regular credit solution.)
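
A hedged sketch of the "no repeated code" idea, using an invented four-entry dict in place of the sums built from the file. The helper top_years is hypothetical, and it assumes the count has already been validated as an integer. The comparison direction == 'descending' evaluates to True or False, so a single sorted() call serves both directions.

```python
def top_years(year_sums, count, direction):
    """Return `count` year keys sorted by value (hypothetical helper)."""
    if direction not in ('ascending', 'descending'):
        raise ValueError('direction must be "ascending" or "descending"')
    # One sorted() call covers both directions
    ordered = sorted(year_sums, key=year_sums.get,
                     reverse=(direction == 'descending'))
    return ordered[:count]               # slice instead of counting


# Made-up sums standing in for the dict built from the file
sums = {'1931': -4.5, '1933': 5.25, '1935': 3.75, '1954': 4.0}

for year in top_years(sums, 2, 'descending'):
    print(f'{year}:  {round(sums[year], 2)}')
```

Calling top_years(sums, 2, 'ascending') would return the two lowest years from the same code path.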


    HOMEWORK CHECKLIST: all points are required
        testing: you have run the program with the sample inputs as shown and are seeing the output exactly as shown (contact me if your output is different and you're unable to adjust to match)

        the program loops through the file only once

        all of the sums are stored in the dict - program does not use an additional float value to hold the current sum

        there are no repeated statements (for example, the same 'for' loop with everything the same except for direction of the sort -- see note about repeated code above)

        there are no extraneous comments or "testing" code lines

        program follows all recommendations in the "Code Quality" handout
