Advanced Python
In-Class Exercises, Session 3
Notes if you are using Jupyter Notebook: to call exit() from a notebook, please use sys.exit() (requires import sys). If a strange error occurs, it may be because Jupyter retains variables from all executed cells. To reset the notebook, click 'Restart Kernel' (the circular arrow) -- this will reset variables, but will not undo any changes made. |
|
REQUESTS: VIEWING RESPONSE |
|
Ex. 3.1 | Issue a GET request, view response text with .text. Use requests.get() with a url to retrieve a response from the weather service. Print the .text attribute to see the body of the response. |
import requests
url = 'https://forecast.weather.gov/product.php?site=NWS&issuedby=CTP&product=AFD'
response = requests.get()
|
|
Ex. 3.2 | Issue a GET request, view headers. Print the .headers attribute of the response to see the headers sent back by the weather server. |
You can also loop through the dict-like response.headers to see each key/value pair clearly. |
|
import requests
url = 'https://forecast.weather.gov/product.php?site=NWS&issuedby=CTP&product=AFD'
response = requests.get(url)
|
|
Ex. 3.3 | Issue a GET request, view status code. Print the .status_code attribute from the response. |
Next, change one of the parameters to see how the code changes; also, change the spelling of the word 'product'. In addition, use requests.status_codes._codes[response.status_code] with the status code to see the meaning of the response code. |
|
import requests
url = 'https://forecast.weather.gov/product.php?site=NWS&issuedby=CTP&product=AFD'
response = requests.get(url)
# print(requests.status_codes._codes[])
|
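Note that requests.status_codes._codes is a private attribute (leading underscore), so its contents may change between versions. As a hedged alternative, the sketch below uses the standard library's http.HTTPStatus to turn a status code into its reason phrase. The helper name describe_status is an assumption made for illustration; it is not part of requests.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a label such as '404 Not Found' (illustrative helper, not a requests API)."""
    try:
        return f'{code} {HTTPStatus(code).phrase}'
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not define
        return f'{code} (unrecognized status code)'


for code in (200, 404, 500):
    print(describe_status(code))
```

In an exercise, response.status_code could be passed to the helper in place of the literal codes.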
|
REQUESTS: ENCODED (text) and UNENCODED (bytes) RESPONSE |
|
Ex. 3.4 | Issue a web request, view response.text (decoded) and response.content (undecoded). |
With the first URL, check the type of response.text and the type of response.content. Each contains the response content, but one is decoded as a string and the other is encoded as bytes. Next, check the value of response.encoding to see what encoding the Yahoo page uses. Then check the same with the Microsoft English page and the Microsoft French page. |
|
import requests
url = 'http://www.yahoo.com'
# url = 'https://www.microsoft.com/en-us/' # Microsoft in English
# url = 'https://www.microsoft.com/fr-fr/' # Microsoft in French
response = requests.get(url)
|
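The difference between the two attributes can be sketched without a network call. In the example below, raw_bytes stands in for response.content and the decoded strings stand in for response.text; the sample word and the variable names are assumptions chosen for illustration.

```python
# stand-in for response.content: the raw bytes a server might send
raw_bytes = 'café'.encode('utf-8')

# stand-in for response.text: the same bytes decoded with a chosen encoding
decoded_ok = raw_bytes.decode('utf-8')

# decoding with the wrong codec does not fail here, but it garbles the accent
decoded_wrong = raw_bytes.decode('latin-1')

print(type(raw_bytes), raw_bytes)
print(type(decoded_ok), decoded_ok)
print(decoded_wrong)
```

This is why response.encoding matters: it determines which codec is used to turn the bytes into the string.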
|
Ex. 3.5 | Saving an image, sound or zip file. Issue the below web request, open a local file with 'wb' (meaning "write as bytes") and write the undecoded bytes to the file. (To access the undecoded bytes, use response.content instead of response.text.) |
import requests
url = 'https://davidbpython.com/advanced_python/supplementary/python.png' # a URL to an image
response = requests.get(url)
|
|
Check the folder where this script or notebook is located; you should find the image file there. |
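A minimal offline sketch of the 'wb' pattern follows. The bytes used here are only the 8-byte signature that begins every PNG file, standing in for response.content, and the temporary output path is an assumption made so the example does not write into your working folder.

```python
import os
import tempfile

# stand-in for response.content: the signature bytes that start a PNG file
fake_content = b'\x89PNG\r\n\x1a\n'

# hypothetical output path inside a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), 'python.png')

with open(out_path, 'wb') as fh:   # 'wb': write bytes, no encoding step
    fh.write(fake_content)

with open(out_path, 'rb') as fh:   # 'rb': read the bytes back unchanged
    saved = fh.read()

print(saved == fake_content)
```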
|
LAB |
|
Ex. 3.6 | Retrieve a page and save to disk. |
Download the following page and save to disk in a new file (write response.text to the file). Make sure to close the file. After writing the file, please open the file in a browser (use your browser's File > Open File menu command). In another tab on your browser, open the url directly. Compare the two pages. (The locally saved file will not have correct formatting or colors - this is because these are supplied by a separate CSS file that was not downloaded.) |
|
import requests
url = 'https://pycoders.com/'
|
|
Ex. 3.7 | View response status code. |
Continuing the previous exercise, use the .status_code attribute to view the response status. Now change the URL to something you wouldn't expect to be correct, run and view the status code. |
|
Ex. 3.8 | View response headers. |
Continuing the previous exercise, use the .headers attribute to retrieve the headers coming back as a dict from the response. Loop through and print each key/value pair in the dict. |
|
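The headers object returned by requests is a requests.structures.CaseInsensitiveDict, so it can be looped like a dict and looked up without regard to the case of the header name. The sketch below builds one by hand with made-up header values so that it runs without a network call.

```python
from requests.structures import CaseInsensitiveDict

# made-up header values standing in for response.headers
headers = CaseInsensitiveDict({
    'Content-Type': 'text/html; charset=utf-8',
    'Content-Length': '1024',
})

# loop through each key/value pair, as with an ordinary dict
for key, value in headers.items():
    print(f'{key}: {value}')

# lookups ignore the case of the header name
content_type = headers['content-type']
print(content_type)
```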
Ex. 3.9 | Retrieve and save an image. |
Use the below URL to retrieve an image from the internet, and save it locally to a filename of your choice (make sure it has a .jpg file extension so it is recognized properly by your image viewer). The data must be read and written as binary data (i.e., not plaintext). Therefore when writing, use 'wb' (write binary) instead of 'w'. |
|
import requests
url = 'https://cdn.vox-cdn.com/uploads/chorus_image/image/32167377/monty-python-3by2.0.jpg'
|
|
REQUESTS: CONFIGURING THE REQUEST |
|
Ex. 3.10 | Demo exercise: issue a request and view elements of the request. |
First, issue the following request and view the headers reflected back. Next, uncomment the 'headers' lines to see how http_reflect "sees" you (i.e., what browser and platform it thinks you are requesting from). Finally, change 'text/plain' to 'text/html' and see http_reflect respond with HTML instead of plain text. |
|
import requests
spoof_browser = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
response = requests.get('http://davidbpython.com/cgi-bin/http_reflect'
#, headers={
# 'User-Agent': spoof_browser,
# 'Accept': 'text/plain',
# }
)
print(response.text)
|
|
Ex. 3.11 | Demo exercise: issue a request and send parameters with .get(). |
In the below program, add the params= argument with the dict 'my_params' to send key/value pairs to the http_reflect service as part of the query string. In the response, find the parameters you sent. |
|
import requests
my_params = {'a': 1, 'b': 'hello'}
response = requests.get('http://davidbpython.com/cgi-bin/http_reflect')
print(response.text)
|
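To see what params= does before sending anything, one option is to build the request and prepare it without issuing it. This is a sketch using requests.Request and its .prepare() method; the variable name prepared is an assumption, and no network access is needed.

```python
import requests

my_params = {'a': 1, 'b': 'hello'}

# build the request without sending it
prepared = requests.Request(
    'GET',
    'http://davidbpython.com/cgi-bin/http_reflect',
    params=my_params,
).prepare()

# the key/value pairs are appended to the URL as a query string
print(prepared.url)
```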
|
Ex. 3.12 | Issue a request and send parameters with .post(). |
In the below program, add the data= argument with the dict 'my_data' to send key/value pairs to the http_reflect service as part of the body of the request. In the response, find the parameters you sent. |
|
import requests
my_data = {'a': 1, 'b': 'hello'}
response = requests.post('http://davidbpython.com/cgi-bin/http_reflect')
print(response.text)
|
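The same prepare-without-sending approach shows where data= places the key/value pairs. In this sketch (variable names are assumptions), the pairs end up in the request body rather than in the URL.

```python
import requests

my_data = {'a': 1, 'b': 'hello'}

# build the POST request without sending it
prepared = requests.Request(
    'POST',
    'http://davidbpython.com/cgi-bin/http_reflect',
    data=my_data,
).prepare()

print(prepared.url)                      # no query string is added
print(prepared.body)                     # form-encoded key/value pairs
print(prepared.headers['Content-Type'])  # tells the server how the body is encoded
```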
|
REQUESTS: UPLOADING A FILE |
|
Ex. 3.13 | Upload a file to a server program. Add the files= parameter to .post() to upload the below file. |
Note that the file has been opened with 'rb', which stands for 'read binary': we are uploading encoded bytes. |
|
import requests
# open file in binary mode: reads return bytes, with no decoding
file_bytes = open('../test_file.txt', 'rb')
file_dict = { 'file': ('test_file.txt', file_bytes,
'text/plain') }
response = requests.post('https://davidbpython.com/cgi-bin/http_reflect')
print(response.text)
|
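The effect of files= can also be inspected offline. The sketch below substitutes an in-memory io.BytesIO object for the opened file (an assumption, so that no test_file.txt is required) and prepares the request without sending it.

```python
import io

import requests

# in-memory stand-in for open('../test_file.txt', 'rb')
file_bytes = io.BytesIO(b'line one\nline two\n')
file_dict = {'file': ('test_file.txt', file_bytes, 'text/plain')}

prepared = requests.Request(
    'POST',
    'https://davidbpython.com/cgi-bin/http_reflect',
    files=file_dict,
).prepare()

# the upload is sent as a multipart body with a generated boundary string
print(prepared.headers['Content-Type'])
print(prepared.body.decode('utf-8'))
```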
|
LAB |
|
Ex. 3.14 | Demo exercise: issue request with headers, parameters and data. |
Please note below the approach for setting headers, parameters and data in a request, and note that each has been successfully sent to (and reflected back from) the server. |
|
import requests
# link to my reflection program
url = 'http://davidbpython.com/cgi-bin/http_reflect'
div_bar = '=' * 10
# headers, parameters and message data to be passed to request
# change to 'text/html' for an HTML response
header_dict = { 'Accept': 'text/plain' }
param_dict = { 'key1': 'val1', 'key2': 'val2' }
data_dict = { 'text1': "We're all out of gouda." }
# a GET request (change to .post for a POST request)
response = requests.get(url, headers=header_dict,
                        params=param_dict,
                        data=data_dict)
# status of the response (OK, Not Found, etc.)
response_status = response.status_code
# headers sent by the server
response_headers = response.headers
# body sent by server
response_text = response.text
# outputting response elements (status, headers, body)
# response status
print(f'{div_bar} response status {div_bar}\n')
print(response_status)
print(); print()
# response headers
print(f'{div_bar} response headers {div_bar}\n')
for key in response_headers:
    print(f'{key}: {response_headers[key]}\n')
print()
# response body
print(f'{div_bar} response body {div_bar}\n')
print(response_text)
|
|
Ex. 3.15 | Use a parameter to configure a web API request. |
Complete a request from the below URL by using the parameter term with a value that is a term you would like to search on urbandictionary.com. (Warning: urbandictionary.com definitions can use offensive language. For a "clean" set of definitions, try the definition for squee as shown below.) Read and save the resulting JSON text to a file. |
|
import requests
url = 'http://api.urbandictionary.com/v0/define'
|
|
Ex. 3.16 | Read and parse JSON data. |
Continue the previous program by using the .json() method of the response object to read the data as a JSON object. You should find that the object is a dict, that it has one key, list, whose value is a list of dicts, and that if you loop through this list, each dict's author key holds the author's name and its definition key holds the definition. |
|
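The structure described above can be sketched with the standard json module. The sample text below is made up to match that shape (a dict with a list key holding dicts with author and definition keys); it is not real output from the API.

```python
import json

# made-up JSON text shaped like the structure described in the exercise
sample_text = (
    '{"list": ['
    '{"author": "ann", "definition": "first sample definition"}, '
    '{"author": "bob", "definition": "second sample definition"}'
    ']}'
)

# json.loads() on text gives the same kind of object as response.json()
data = json.loads(sample_text)

for entry in data['list']:
    print(f"{entry['author']}: {entry['definition']}")

authors = [entry['author'] for entry in data['list']]
```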
Ex. 3.17 | Read and parse CSV data. |
Request the following data and retrieve as text. Load the text as a csv.reader object, then read line-by-line and print out each parsed line. (Hint: to load the text into csv.reader, split the text using .splitlines() first, and pass this list to csv.reader().) |
|
import csv
import requests
url = 'http://davidbpython.com/advanced_python/supplementary/dated_file.csv'
|
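The hint works because csv.reader accepts any iterable of lines, not only an open file. Here is a short sketch with made-up CSV text standing in for response.text; the column names and values are assumptions.

```python
import csv

# made-up text standing in for response.text
sample_text = 'date,value\n2018-03-24,"1,250"\n2018-03-25,980\n'

# .splitlines() yields a list of lines, which csv.reader can iterate over
reader = csv.reader(sample_text.splitlines())
rows = list(reader)

for row in rows:
    print(row)
```

Note that the quoted field containing a comma is kept as one value, which a plain str.split(',') would not do.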
|
WEB SCRAPING |
|
Ex. 3.18 | BeautifulSoup object: the below code reads a string read from an html file and parses the file and its tags into a BeautifulSoup object. |
Explore the following attributes of the object named 'soup':
|
|
from bs4 import BeautifulSoup
scrapee = '../dormouse.html'
text = open(scrapee).read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Note that you may occasionally encounter a UnicodeDecodeError when you attempt to read a file from the internet. In these cases you should tell Python which encoding to use: |
|
text = open(scrapee, encoding='utf-8').read()
|
|
Ex. 3.19 | "first tag" attribute access; .text attribute: access the object for soup.title, soup.body, soup.p, soup.meta. |
Print the type of one of these objects. Print the .text attribute of each of these objects. |
|
from bs4 import BeautifulSoup
scrapee = '../dormouse.html'
text = open(scrapee).read()
soup = BeautifulSoup(text, 'html.parser')
|
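As a hedged illustration of "first tag" access, the sketch below parses a small made-up HTML string (not dormouse.html) so that it runs without any file on disk.

```python
from bs4 import BeautifulSoup

# made-up HTML used only for this illustration
html = (
    '<html><head><title>Sample page</title></head>'
    '<body><p>First paragraph.</p><p>Second paragraph.</p></body></html>'
)
soup = BeautifulSoup(html, 'html.parser')

# attribute access returns the FIRST matching tag in the document
first_p = soup.p

print(type(first_p))
print(soup.title.text)
print(first_p.text)
```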
|
Ex. 3.20 | Access tag by parameter using a dict. Use a dict with .find() to specify a 'class' parameter value (e.g. attrs={'class': 'story_title'}) |
from bs4 import BeautifulSoup
scrapee = '../dormouse.html'
text = open(scrapee).read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Ex. 3.21 | individual parameter values: for a tag that has them, use a subscript to access parameter values of a tag (e.g. tag['value'] for the first meta tag) |
from bs4 import BeautifulSoup
scrapee = '../dormouse.html'
text = open(scrapee).read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Ex. 3.22 | Find all parameter values: for those tags that have them, use .find_all() to access multiple tags with the same criteria. |
from bs4 import BeautifulSoup
scrapee = '../dormouse.html'
text = open(scrapee).read()
soup = BeautifulSoup(text, 'html.parser')
|
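The three techniques from the last few exercises (.find() with an attrs dict, subscripting a tag, and .find_all()) can be sketched together on a small made-up HTML string. The tag names, class names and href values here are assumptions and do not come from dormouse.html.

```python
from bs4 import BeautifulSoup

# made-up HTML used only for this illustration
html = '''
<html><body>
<h2 class="intro">Welcome</h2>
<a class="nav" href="/one">One</a>
<a class="nav" href="/two">Two</a>
</body></html>
'''
soup = BeautifulSoup(html, 'html.parser')

# .find() returns the first tag matching the name and attrs
heading = soup.find('h2', attrs={'class': 'intro'})

# subscripting a tag returns the value of one of its parameters
first_link = soup.find('a')

# .find_all() returns every matching tag
hrefs = [tag['href'] for tag in soup.find_all('a', attrs={'class': 'nav'})]

print(heading.text)
print(first_link['href'])
print(hrefs)
```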
|
LAB |
|
The following exercises will parse the test_scrape.html page in this week's session folder. |
|
<html>
  <head>
    <title>This is a page title.</title>
  </head>
  <body>
    <h3 class="heading">This is a page heading.</h3>
    <p>This is some text.</p>
    <p>This is some more text.</p>
    <h3 class="midpage">This is a midpage heading.</h3>
    <p>This is even more text.</p>
    <div content="Some div parameter content we want!"> Some div text! </div>
  </body>
</html>
|
Ex. 3.23 | In test_scrape.html, scrape the page title (the title within the <title> tags) and print it. |
from bs4 import BeautifulSoup
fname = '../test_scrape.html'
fh = open(fname)
text = fh.read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Expected output: |
|
This is a page title.
|
|
Note that you may occasionally encounter a UnicodeDecodeError when you attempt to read a file from the internet. In these cases you should tell Python which encoding to use: |
|
text = open(fname, encoding='utf-8').read()
|
|
Ex. 3.24 | Scrape and print the <h3> tag in the middle of the page (not the first <h3> tag) |
from bs4 import BeautifulSoup
fname = '../test_scrape.html'
fh = open(fname)
text = fh.read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Expected Output: |
|
This is a midpage heading. |
|
Ex. 3.25 | Scrape and print the <div> text as well as the "content" parameter value. |
from bs4 import BeautifulSoup
fname = '../test_scrape.html'
fh = open(fname)
text = fh.read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Expected Output: |
|
Some div parameter content we want!
Some div text!
|
Ex. 3.26 | Scrape and print each of the <p> tag texts. |
from bs4 import BeautifulSoup
fname = '../test_scrape.html'
fh = open(fname)
text = fh.read()
soup = BeautifulSoup(text, 'html.parser')
|
|
Expected Output: |
|
This is some text.
This is some more text.
This is even more text.
|
WORKING WITH ENCODINGS AND UNICODE |
|
Ex. 3.27 | Use ord() and chr() to encode characters to integers and decode integers to characters. |
Use the ord() function to convert this character to an integer; then use the chr() function to convert back from integer to character. |
|
char = 'A'
|
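A short sketch of the round trip, using characters other than the one in the exercise. The sample characters are arbitrary choices; ord() returns the Unicode code point and chr() reverses it.

```python
# map each sample character to its integer code point
code_points = {char: ord(char) for char in ('Z', 'é', '€')}

for char, number in code_points.items():
    # chr() converts the integer back to the original character
    print(char, number, chr(number))

round_trip = chr(ord('Z'))
```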
|
Ex. 3.28 | Demonstration: note how you can use integers to retrieve characters. |
Run the following program: |
|
for idx in range(65, 70):
    print(chr(idx))
|
|
Ex. 3.29 | Use str.encode() and bytes.decode() to convert a string to a bytestring and back to string. |
greet.encode() should include an encoding ('ascii', 'latin-1' or 'utf-8'): |
|
greet = 'Hello, world!'
bytestr = greet.encode()  # add encoding here
print(bytestr)
# subscript the bytestring to see individual characters
# now call bytestr.decode() with the same encoding to see the string again
|
|
Ex. 3.30 | Encode a latin-1 string to bytes and back to string, then try to encode as ascii |
The below string contains a non-ascii character. Encode into the following encodings: 'latin-1', 'utf-8' and 'ascii'. |
|
string = 'voilà'
bytestr = string.encode()  # add encoding here
print(bytestr)
|
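To illustrate how the choice of encoding changes the bytes, here is a sketch using a different word that also contains a non-ascii character. The word and the variable names are assumptions made for illustration.

```python
word = 'naïve'

# latin-1 stores the accented character in a single byte
as_latin1 = word.encode('latin-1')

# utf-8 stores the same character in two bytes
as_utf8 = word.encode('utf-8')

# ascii has no representation for the character, so encoding raises an error
try:
    word.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    print('ascii failed:', err)

print(as_latin1, len(as_latin1))
print(as_utf8, len(as_utf8))
```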
|
Ex. 3.31 | Open a file in various encodings. The text of the following file is French, and we are opening it as utf-8 (the default for open() on many systems, although the default depends on the platform). Try to open the file with encoding='latin-1' and encoding='ascii'. |
filename = '../la_vie.txt'
fh = open(filename, encoding='utf-8')
print(fh.read())
|
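What to expect from each encoding can be sketched with a temporary file. The file name and its French text below are assumptions (this is not la_vie.txt); the file is written as utf-8 and then read back three ways.

```python
import os
import tempfile

# hypothetical sample file written as utf-8
path = os.path.join(tempfile.mkdtemp(), 'sample_fr.txt')
original = 'déjà vu'
with open(path, 'w', encoding='utf-8') as fh:
    fh.write(original)

# matching encoding: the text comes back intact
with open(path, encoding='utf-8') as fh:
    as_utf8 = fh.read()

# latin-1 accepts every byte, so there is no error, but accents are garbled
with open(path, encoding='latin-1') as fh:
    as_latin1 = fh.read()

# ascii cannot decode bytes above 127, so reading raises an error
try:
    with open(path, encoding='ascii') as fh:
        fh.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(as_utf8)
print(as_latin1)
print(ascii_failed)
```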
|
Ex. 3.32 | Use the chardet library to "sniff" an encoding. Given the below strings, use chardet.detect() with each of the two bytestrings to get chardet's best guess as to its encoding. Print the resulting dict from .detect(). |
import chardet
bytestr1 = 'voilà'.encode('utf-8')
bytestr2 = 'hello'.encode('utf-8')
|
|
optional: FLASK |
|
In these exercises, we'll re-create a Flask application that takes form input and displays an image. You can supplement these instructions with your own application idea -- one that uses user input to choose and display images. |
|
Ex. 3.33 | Create a simple Flask 'hello world' app that greets the user, and start the application. |
Use the basic hello_flask.py example. Start the application by running it from the command line (the Terminal or Command Prompt window). Expected Output: |
|
$ ./hello.py
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 323-629-461
|
Ex. 3.34 | Call the application from a browser. |
Call the application from a browser with the following URL; you should see "Hello, World!" displayed in the browser. |
|
http://localhost:5000/hello |
|
Expected Output: |
|
Hello, world! |
|
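For reference, a minimal sketch of what such an app might look like follows. It is not the course's hello_flask.py, whose contents are not shown here. Instead of starting a server, it uses Flask's built-in test client to issue the request, so it can be run as an ordinary script.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/hello')
def hello_world():
    # the string returned here becomes the body of the response
    return 'Hello, world!'


# the test client calls the app directly, without a running server or browser
client = app.test_client()
response = client.get('/hello')

print(response.status_code)
print(response.get_data(as_text=True))
```

To serve the app to a real browser, app.run() would be called instead of the test client.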
Ex. 3.35 | (Exploratory exercise): view log messages at command line. |
The sample code from the hello world app contains a print() statement. Look in the Terminal or Command Prompt window to see the print output displayed. Refresh the page several times so you can see that the log message is displayed. Sample program run: |
|
*** DEBUG: inside hello_world() ***
127.0.0.1 - - [24/Mar/2018 13:45:59] "GET /hello HTTP/1.1" 200 -
*** DEBUG: inside hello_world() ***
127.0.0.1 - - [24/Mar/2018 13:45:59] "GET /hello HTTP/1.1" 200 -
*** DEBUG: inside hello_world() ***
127.0.0.1 - - [24/Mar/2018 13:46:00] "GET /hello HTTP/1.1" 200 -
|
Ex. 3.36 | Add another @app.route() function. |
Add another @app.route() function that says "Goodbye, now! Come on back, y'hear?". This function should be called when the user adds /goodbye to the URL. Call the new URL: |
|
http://localhost:5000/goodbye |
|
Expected Output: |
|
Goodbye, now! Come on back, y'hear? |
|
Ex. 3.37 | Add templates for /hello and /goodbye so that when the user visits, it returns a template rather than a string. |
The templates must be placed within a folder called templates/ that is in the same directory as the script. Here is the template to add for hello (hello.html, included in this session's files): |
|
<html>
  <head><title>A Greeting</title></head>
  <body>
    <h1>Hello, Template!</h1>
  </body>
</html>
|
Expected Output: |
|
Hello, Template! |
|
Here is the template to add for goodbye (goodbye.html, included in this session's files): |
|
<html>
  <head><title>A Farewell</title></head>
  <body>
    <h1>Goodbye, now! Come on back, y'hear?!</h1>
  </body>
</html>
|
Expected Output: |
|
Goodbye, now! Come on back, y'hear?! |
|
Ex. 3.38 | Add a "static" HTML page with a link that calls the hello/ function, and another link that calls the goodbye/ function. |
This page has been added as launch.html, also in the templates/ directory. |
|
<html>
  <head><title>Launch</title></head>
  <body>
    <h1>Launch!</h1>
    <A HREF="http://localhost:5000/hello">say hello</A><br><br>
    <A HREF="http://localhost:5000/goodbye">say goodbye</A>
  </body>
</html>
|
As you can see, each link points to another place in the app. After saving this page as launch.html, use your browser's File > Open File... to navigate to the page and display it. You'll see the two links displayed. Expected output: clicking each link should call the Flask app and display the proper template. |
|
Ex. 3.39 | Add a link on the hello/ page that calls the goodbye/ page, and one on the goodbye/ page that calls the hello/ page. |
However, don't use the <A HREF= URL as was done in the launch page. Instead, use the flask.url_for() function and allow Flask to build the URL for you: |
|
<A HREF="{{ url_for('hello_world') }}">say hello</A> |
|
Also make special note that the argument to url_for() is the name of the function, not the @app.route() URL string. Expected output: You should be able to click back and forth between one page and another by clicking the link on each page. |
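The point about function names can be sketched outside a template. In the example below, say_goodbye is a hypothetical function name chosen to differ from its route string; url_for() needs a request context, which app.test_request_context() supplies here in place of a real browser request.

```python
from flask import Flask, url_for

app = Flask(__name__)


@app.route('/hello')
def hello_world():
    return 'Hello, world!'


@app.route('/goodbye')
def say_goodbye():   # hypothetical name, deliberately different from the route
    return "Goodbye, now! Come on back, y'hear?"


with app.test_request_context():
    # the argument is the FUNCTION name; Flask looks up the matching route
    hello_url = url_for('hello_world')
    goodbye_url = url_for('say_goodbye')

print(hello_url)
print(goodbye_url)
```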
|
Ex. 3.40 | Add the following form to the hello.html template, which features a <select> dropdown and an <input> field. Call the app through the form by submitting the form. |
<form action="http://localhost:5000/goodbye">
  Please enter a caption: <input type="text" name="caption">
  <br>
  Please select a mood:
  <select name="mood">
    <option>happy</option>
    <option>sad</option>
  </select>
  <br>
  <input type="submit">
</form>
|
Ex. 3.41 | Read the caption and mood field values and include them in the goodbye.html template. |
The form sends field values for "caption" and "mood" to the app when it calls the goodbye/ URL; retrieve these values using flask.request.args.get('caption') and flask.request.args.get('mood'). Use the {{ [token_name] }} token (where [token_name] is the token name) to insert a value into the template. Add parameter arguments (for example caption=[caption variable], where [caption variable] is whatever variable you used to retrieve the value from the form) to render_template() to insert the value. Expected Output: |
|
Goodbye, now! Come on back, y'hear?!
caption: Hi there (or whatever you typed into form)
mood: happy (or whichever mood you chose)
say hello
|
Note that Hi there and happy are values that I entered into the form when I submitted it. |
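The flow from form field to template token can be sketched in one self-contained script. To avoid needing a templates/ folder, this example uses flask.render_template_string() with an inline template in place of render_template('goodbye.html', ...); the template text and the use of the test client are assumptions made for illustration.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# inline stand-in for goodbye.html; {{ ... }} tokens are filled in at render time
GOODBYE_TEMPLATE = 'caption: {{ caption }} / mood: {{ mood }}'


@app.route('/goodbye')
def goodbye():
    # request.args holds the field values sent in the query string
    caption = request.args.get('caption')
    mood = request.args.get('mood')
    # keyword arguments become the token names available in the template
    return render_template_string(GOODBYE_TEMPLATE, caption=caption, mood=mood)


# simulate submitting the form with the test client
client = app.test_client()
response = client.get('/goodbye', query_string={'caption': 'Hi there', 'mood': 'happy'})
page = response.get_data(as_text=True)
print(page)
```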
|
Ex. 3.42 | Add an image to the goodbye page. |
Add the happup.jpg image to the goodbye.html template. In order to reference images in an HTML page served by flask, we must put the images in static/images and reference them using <IMG SRC="{{ url_for('static', filename='images/[name of image file]') }}">. Expected Output: You should see the happup.jpg image along with the form input values |
|
Ex. 3.43 | Choose an image based on the user's dropdown choice. |
Now add the sadpup.jpg image to the static/images/ directory and modify the script so that it sends either image name to the template based on the user's "mood" choice. Use the {% if [varname] == [value] %} {% else %} {% endif %} construct to have the template show either the happup.jpg or sadpup.jpg image. Also, place the caption string beneath the image. Expected Output: You should see the happup.jpg or sadpup.jpg image displayed along with a caption beneath. |
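One way the template logic might look is sketched below, again with an inline template rendered by render_template_string() instead of a file in templates/. The template text is an assumption, and the image files do not need to exist for the URLs to be generated.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# inline stand-in for the image portion of goodbye.html
PUP_TEMPLATE = (
    "{% if mood == 'happy' %}"
    "<IMG SRC=\"{{ url_for('static', filename='images/happup.jpg') }}\">"
    "{% else %}"
    "<IMG SRC=\"{{ url_for('static', filename='images/sadpup.jpg') }}\">"
    "{% endif %}"
    "<p>{{ caption }}</p>"
)

# url_for() inside the template needs a request context
with app.test_request_context():
    happy_page = render_template_string(PUP_TEMPLATE, mood='happy', caption='Hi there')
    sad_page = render_template_string(PUP_TEMPLATE, mood='sad', caption='Oh no')

print(happy_page)
print(sad_page)
```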
|