Advanced Python
Project Discussion, Session 5
5.1 | Make another git commit. Add a new file or change an existing file, then commit and push the change to GitHub. Again, please paste your repository URL here so I can view it. Thanks! |
5.2 | line_by_date(): a sorting helper function that helps sorted() to sort a line from the file by its "canonical date" (a value that would sort in YYYYMMDD order). |
Sort the lines in dated_file.csv by date and write back a new file with the dates in order. |
|
You can choose to use this starter code, completing only the line_by_date() function:
source_fname = 'dated_file.csv'
target_fname = 'sorted_file.csv'

def line_by_date(this_line):
    """ this sorting helper function takes a single line as argument
        and returns the value by which that line should be sorted """
    # your code here

# read the file into a list of lines
fh = open(source_fname)
lines = fh.readlines()
slines = sorted(lines, key=line_by_date)

# write the lines to a new file
wfh = open(target_fname, 'w')    # open file for writing
for line in slines:
    wfh.write(line)
wfh.close()

print(f'wrote to {target_fname}')
|
|
Using this code, you only need to write the line_by_date() function. The program writes the target file automatically; check the data in the target file to confirm the results are as expected. The sorting helper function line_by_date() receives a single line from the file (a string) as its argument and must return a "sortable value" for that line. It does not loop through the lines: sorted() calls it once for each line. |
|
So, how can we tell sorted() how to sort a line like this one?
09/03/2018,A,C,23.85 |
|
We must have our sort function take the line as argument and return a sortable value for that line. Consider that every date can be represented as an 8-digit "canonical" number that is sortable as a number. The canonical date for the above string is: |
|
20180903 |
|
Since the YYYYMMDD value is universally sortable (it will be sorted correctly as an integer or a string) it can stand as a "sortable value" for a dated line. Therefore your sorting helper function need only split or slice out the elements of the date (year, month, day) and return an 8-digit value in the format YYYYMMDD like the one illustrated above. In this way, the lines from the file can be sorted by their date. |
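As a rough illustration of the idea above, here is a minimal sketch. It assumes every line starts with a zero-padded MM/DD/YYYY date followed by a comma, as in the sample line. The helper name canonical_date() and the second and third sample lines are invented for this example; check the real dated_file.csv before relying on these assumptions.

```python
def canonical_date(date_str):
    """Convert a zero-padded 'MM/DD/YYYY' string to 'YYYYMMDD' (hypothetical helper)."""
    month, day, year = date_str.split('/')
    return year + month + day

def line_by_date(this_line):
    # assumption: the date is the first comma-separated field on the line
    date_str = this_line.split(',')[0]
    return canonical_date(date_str)

# the first line is from the handout; the other two are made up for the demo
sample_lines = [
    '09/03/2018,A,C,23.85\n',
    '01/15/2019,B,D,10.00\n',
    '12/31/2017,C,E,7.50\n',
]

sorted_lines = sorted(sample_lines, key=line_by_date)
for line in sorted_lines:
    print(line.rstrip())
```

Because sorted() calls line_by_date() once per line, the function itself never needs a loop.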
|
5.3 | A sorting helper function that helps sorted() to sort a list of dicts by each dict's mean_temp: open and load the file weny_lod_tiny.json, then sort the dicts in the list by their mean_temp value. |
The first thing you should do inside the by_mean_temp(arg) function is print arg, then run the program: |
|
def by_mean_temp(arg):
    print(f'arg to sort function: {arg}')
|
|
You should see this result: |
|
arg to sort function: {'date': '1/1/16', 'mean_temp': '98', 'precip': '0', 'events': ''}
arg to sort function: {'date': '1/2/16', 'mean_temp': '93', 'precip': '0', 'events': ''}
arg to sort function: {'date': '1/3/16', 'mean_temp': '101', 'precip': '0', 'events': ''} |
|
Observe the type and value of arg. The container being sorted is a list of dicts, so each item to be sorted is a dict. The argument to the sort function is therefore a dict (each item from the list is sent to the function one at a time, for a total of 3 function calls), and the return value should be the 'mean_temp' value from that dict. Remember to think in terms of one item being sent to the sort function: it does not loop through the items, because it receives only one item (one dict) per call. If you're seeing the '101' dict come first, remember that strings are compared character by character, so '101' sorts before '93' because '1' comes before '9'. What can you do to this value to cause it to be treated as an integer? |
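The difference between string order and integer order can be seen in a small sketch. It assumes the three dicts printed above are the whole list and hard-codes them instead of loading weny_lod_tiny.json. The variable names lod, as_strings and as_ints are chosen just for this example.

```python
# assumption: the list loaded from the JSON file looks like this
lod = [
    {'date': '1/1/16', 'mean_temp': '98', 'precip': '0', 'events': ''},
    {'date': '1/2/16', 'mean_temp': '93', 'precip': '0', 'events': ''},
    {'date': '1/3/16', 'mean_temp': '101', 'precip': '0', 'events': ''},
]

def by_mean_temp(arg):
    # arg is one dict; convert its string value so it sorts numerically
    return int(arg['mean_temp'])

# sorting on the raw string value compares character by character
as_strings = sorted(lod, key=lambda d: d['mean_temp'])
# sorting with the helper compares integers
as_ints = sorted(lod, key=by_mean_temp)

print([d['mean_temp'] for d in as_strings])   # ['101', '93', '98']
print([d['mean_temp'] for d in as_ints])      # ['93', '98', '101']
```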
|
5.4 | A sorting helper function that helps sorted() to sort the keys in a dict of dicts by each dict's mean_temp, then print each key and associated dict. |
Hint: you may refer to the dictionary, a global variable, inside the function. Usually we discourage referring to a global inside a function, but with sort functions it is much more acceptable. Again, the first thing you should do inside the by_mean_temp(arg) function is print the value of arg, then run the program. Use this to determine what value to return in order to sort the dicts and print them out. |
|
def by_mean_temp(arg):
    print(f'arg to sort function: {arg}')
|
|
arg to sort function: 1/1/16
arg to sort function: 1/2/16
arg to sort function: 1/3/16 |
|
Observe the type and value of arg. The container being sorted is a dict of dicts, and when we sort a dictionary we're sorting its keys, so each item to be sorted is a string key. To return the 'mean_temp' value associated with a key, subscript the dictionary with the key to retrieve the dict associated with that key, then subscript that dict to get the mean temp. Remember to think in terms of one item being sent to the sort function: it does not loop through the items, because it receives only one item (one dict key) per call. Please also note that it's OK to refer to the dict dod inside the function; that is how we'll get the dict associated with the key. If you're seeing the '101' dict come first, remember that strings are compared character by character, so '101' sorts before '93' because '1' comes before '9'. What can you do to this value to cause it to be treated as an integer? |
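The double subscript described above might look like the following sketch. The contents of dod here are an assumption made for illustration (the date keys are from the output above, and the inner dicts are modeled on the 5.3 data). The real structure should be confirmed by printing it.

```python
# assumption: a dict of dicts keyed by date string
dod = {
    '1/1/16': {'mean_temp': '98', 'precip': '0'},
    '1/2/16': {'mean_temp': '93', 'precip': '0'},
    '1/3/16': {'mean_temp': '101', 'precip': '0'},
}

def by_mean_temp(key):
    inner = dod[key]                  # first subscript: key -> inner dict
    return int(inner['mean_temp'])    # second subscript: inner dict -> value

# sorting a dict sorts its keys; the helper receives one key per call
sorted_keys = sorted(dod, key=by_mean_temp)
for key in sorted_keys:
    print(key, dod[key])
```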
|
EXTRA CREDIT |
|
5.5 | (Extra credit) Decorator Function. Write a decorator function logfunc() that logs the entry and exit of every function in a program by printing the datetime as well as the function's name and arguments. |
A decorator function applies extra functionality to any function by taking the function as argument and returning a newly defined function as return value. The returned function will "replace" the original function, usually by calling the original function and allowing it to do its work, as well as performing additional tasks.
Step 1: Define and call function greet() with arguments. |
|
Start with this code, and make sure it's clear to you:
import time

def greet(greeting, place='world'):
    time.sleep(2)
    return f'{greeting}, {place}!'

msg = greet('hello')
print(f'returned value: {msg}')

msg2 = greet('goodbye', place='Mars')
print(f'returned value: {msg2}')
|
|
We are calling this function first with only a greeting, then with a greeting and a place. Make sure you understand the mechanics of the keyword argument. The function contains a line that pauses execution for 2 seconds (time.sleep(2)). This is done so we will see differing timestamps before and after we call the function.
Step 2: Print the function object you created. A function object is no different from any other kind of object. You can print it by printing the variable. But when printing, assigning or passing a function object you must make sure not to call it unless you intend to see it work and return its return value. |
|
At the bottom of your script, print the function object thusly:
print(greet) # <function greet at 0x1034bc560>
|
|
Note that we didn't say greet() -- the parentheses mean a call. For now, we're just interested in seeing the function object itself, because we're going to pass the object to another function. If you add the parentheses (and pass arguments), you'll be printing the return value, not the function object.
Step 3: Define function logfunc(). Define logfunc() to take one argument, print the argument, and then return it. Next, at the bottom of your script, call logfunc() and pass it greet. Make sure you do not call greet(); you must only pass the variable greet. You should see the same greet function object printed as when you printed greet.
Step 4: Decorate greet() with @logfunc and then call greet(). Remove your call to logfunc(). Instead, place @logfunc on the line above def greet() to decorate the greet() function. This is basically saying, "when you see that I'm decorated, replace me with the function returned from logfunc()". You should see the same printing of the greet function object that you did when you printed greet directly, but this time the printed function is the first thing you see printed. This is because when greet() is defined, Python sees the decorator and automatically calls logfunc(). It invisibly replaces greet with the function returned from logfunc() -- which in this case is the same function greet.
Step 5: Define a function impostor() inside logfunc(). Inside logfunc(), delete the argument printing and, indented inside logfunc(), place a def impostor function definition. The impostor() function should accept the arguments arg and place, print the following messages, and return 99:
|
|
print('I am an impostor function!')
print('I am getting called when you call greet()!')
print(f'The arguments you passed: {arg}, {place}')
|
|
Finally, instead of returning the function argument from logfunc(), return impostor (make sure not to call impostor, simply return it). You should see the impostor messages printed twice. You should also see returned value: 99 printed twice -- these are the return values from when you called greet(). Finally, if you are still printing the function greet, you'll see that you are no longer printing greet -- you're printing the impostor function ("logfunc.<locals>.impostor"). Take extra care over what is inside def impostor() and what is inside def logfunc(): impostor() returns 99, while logfunc() returns impostor. If you don't line up the indents to reflect this, the functions will not operate as expected.
Step 6: Confirm your understanding. You should make sure you understand, or at least keep in mind, the following:
|
|
Step 7: Replace the arguments in the impostor() definition. Make the impostor arguments look like this: |
|
def impostor(*args, **kwargs):
|
|
Inside impostor, instead of printing arg and place, print args and kwargs (leave off the stars to see the actual objects). Now instead of seeing the arguments, you'll see a tuple and a dict that contain the arguments. Why is this? The * operator in a function def collects positional arguments into a tuple; the ** operator collects keyword arguments into a dict. args and kwargs will be a tuple and dict holding the values that were passed to the function.
Step 8: Inside impostor(), call the function argument you received in logfunc(), passing *args and **kwargs as arguments. This is the function object that was passed to logfunc(), the one you originally printed. Call the function with the arguments *args, **kwargs and assign the return value to a variable. Return the variable from impostor(). Please note you are not to call greet inside impostor(); you are to call the argument that was passed to logfunc(). You should see the impostor messages alongside the message from greet(). This is because when you call greet() with arguments, Python calls the impostor, and the impostor prints some messages and calls the original greet() with the arguments. Finally, the impostor returns the return value it received from its call to greet(), and that is returned to the original function call. Note that * and ** work in reverse when used in a function call: * unpacks a tuple or list into positional arguments; ** unpacks a dict into keyword arguments.
Step 9: Complete the messages requested in the assignment. At this point you should hopefully be able to complete the assignment as requested, but please let me know your questions! Thanks |
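The packing and unpacking described in Steps 7 and 8 can be sketched with a generic pass-through decorator. This is only an illustration of the mechanism, not the finished logfunc(): the names show_args and wrapper are invented here, time.sleep() is left out, and the datetime logging the assignment asks for is deliberately omitted.

```python
def show_args(func):
    """Hypothetical decorator: print the packed arguments, then call func."""
    def wrapper(*args, **kwargs):
        # in a def, * packs positional args into a tuple, ** packs keyword args into a dict
        print(f'args: {args}')
        print(f'kwargs: {kwargs}')
        # in a call, * and ** unpack them again for the original function
        result = func(*args, **kwargs)
        return result
    return wrapper      # return the function object itself; do not call it here

@show_args
def greet(greeting, place='world'):
    return f'{greeting}, {place}!'

msg = greet('goodbye', place='Mars')
print(f'returned value: {msg}')
```

Note that wrapper() calls func, the object passed to the decorator, rather than the name greet, which by then refers to wrapper itself.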
|
5.6 | (extra credit) Sort a dict of lists by sum of list values associated with each key. |
year_mktrfs = {
    '1926': [0.97, 0.44],
    '1927': [0.83, 0.3, 0.0, 0.72],
    '1928': [0.43, 0.34, 0.71]
}
|
|
Using a sorting helper function, sort the dict keys by the sum of values for each key. Loop through the sorted keys and print the key and the list of values for that key, in the following order (highest sum to lowest): |
|
1927 [0.83, 0.3, 0.0, 0.72]
1928 [0.43, 0.34, 0.71]
1926 [0.97, 0.44] |
|
The dict keys can be sorted with the help of a sorting helper function by_yearsum():
skeys = sorted(year_mktrfs, key=by_yearsum, reverse=True)
|
|
If you use the above sorted() code, the sort function by_yearsum() will automatically receive a year key as argument. You can then use the key to get the value from the dict (the list of Mkt-RF values) and use the sum() function on that list to get the sum of its values. Please keep in mind that the sort function does not itself sort anything. It receives exactly one key (a year) as argument and returns the sum of the list of values that it obtains from the dict for that year. |
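The same pattern can be tried on unrelated data first. In this sketch the dict rainfall, its values, and the helper by_total() are all invented for illustration. Only the technique (look up the list by key, return its sum) carries over to by_yearsum().

```python
# hypothetical data: each key maps to a list of numbers
rainfall = {
    'north': [1.0, 2.5],
    'south': [0.5, 0.5, 0.5],
    'east': [4.0],
}

def by_total(key):
    # receives one key; returns the sum of the list stored under that key
    return sum(rainfall[key])

# reverse=True orders the keys from highest sum to lowest
ordered = sorted(rainfall, key=by_total, reverse=True)
for name in ordered:
    print(name, rainfall[name])
```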
|
Your main body code will look like this - the only code you need to supply is for by_yearsum()
year_mktrfs = {
    '1926': [0.97, 0.44],
    '1927': [0.83, 0.3, 0.0, 0.72],
    '1928': [0.43, 0.34, 0.71]
}

def by_yearsum(key):
    """ helper function to help sort dict keys by sum of value """
    # your code here

# reverse=True gives the requested highest-to-lowest order
sorted_keys = sorted(year_mktrfs, key=by_yearsum, reverse=True)
for thiskey in sorted_keys:
    print(thiskey, year_mktrfs[thiskey])
|
|
5.7 | (extra credit) Travel to and sort the keys of a dict of dicts that is embedded in a deeply nested structure: open ip_routes_tiny.json in this week's archive, and determine the path that can lead you to the dict associated with the key "routes". |
First, open ip_routes_tiny.json in a text editor and take a look at it. Note the structure of the outermost container: it is a dictionary that has 2 keys. The 2nd key, "result", has a value that is a list. The first item in that list is a dict. That dict has a key "vfrs" which is paired with another dict. That dict has a key "default" paired with another dict. That dict has a key "routes" which is paired with the dict we wish to sort. The first step is to "travel to" the dict of dicts in the file that is associated with the "routes" key. By this I mean that you can use subscripts to get to this dict of dicts. |
|
For example, if your code starts this way:
import json
fh = open('ip_routes_tiny.json')
routes_dict = json.load(fh)
|
|
Then to get to the first list inside routes_dict we would use this code:
flist = routes_dict['result']
|
|
Further, to get to the first dict inside the list, we would replace the above with this code:
fdict = routes_dict['result'][0]
|
|
We can continue chaining dict subscripts to the above to take us all the way to the dict of ip address keys. That is the dict we wish to sort. |
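Chained subscripts are easiest to see on a small structure. The sketch below uses an invented nested dict. The key names 'groups', 'main', 'items' and 'rank' are placeholders and do not match ip_routes_tiny.json, so substitute the key names you find in the actual file.

```python
import json

# hypothetical nested structure: dict -> list -> dict -> dict -> dict -> dict of dicts
data = {
    'status': 'ok',
    'result': [
        {'groups': {'main': {'items': {
            'b': {'rank': 2},
            'a': {'rank': 1},
        }}}},
    ],
}

# each subscript moves one level deeper; [0] selects the first list item
items = data['result'][0]['groups']['main']['items']

# pretty-print to confirm this is the structure we meant to reach
print(json.dumps(items, indent=4))

# sorting the dict sorts its keys
for key in sorted(items):
    print(key, items[key])
```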
|
The best way to begin is to create a chained subscript to assign the ip address dict to a new variable name, and then sort that variable. Make sure the structure you are sorting is the one you expect. Use the json module to print out this structure so you can see it properly:
print(json.dumps(ip_dict, indent=4))
|
|
This assumes that you have assigned the ip address dict to a variable named ip_dict. If you are printing the correct structure, you should see something close to this:
{
    "209.191.231.0/24": {
        "kernelProgrammed": true,
        "directlyConnected": false,
        "preference": 20,
        etc. |
|
You can then sort the keys of this simpler dict. Please let me know if you would like additional discussion on this assignment. |
|