Advanced Python
Projects, Pandas Part II

9.1 Analyze COVID-19 Infection Rates or U.S. Population Rates by County over Time.

Jupyter Notebook: covid_analysis_event.ipynb
Data File: covid_event.csv

The notebook's 15 questions take you through various types of analysis of reported cases in the states and counties of the U.S. The Project Discussion document offers many useful hints on which features and parameter arguments to use.

Option 2: U.S. Population

Create a new Jupyter Notebook and build the following charts:

  • count: number of counties in each state
  • line chart: given a state input, population over time
  • bar chart: population of all 50 states for the latest year, sorted
  • bar chart: top 10 states by population for the latest year, sorted
  • bar chart: given a state input, top 5 most populous counties in the state
  • bar chart: horizontal "share" bar?
  • pie chart: 50 states, each state's share of population
  • pie chart: top 10 states and share of population among top 10
  • bar chart: given a state input, calculate % annual increase in population (use df.shift())
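The calculations behind several of these charts can be sketched on a tiny made-up table. This is only an illustration: the column names (state, county, year, population), the long one-row-per-year layout, and all values are assumptions, and the real county_population.csv may be organized differently (for example, one column per year), in which case it would need reshaping first.

```python
import pandas as pd

# Made-up sample data; names and numbers are hypothetical, not from the real file.
sample = pd.DataFrame({
    "state":      ["A", "A", "A", "A", "A", "A", "B", "B", "B"],
    "county":     ["A1", "A1", "A1", "A2", "A2", "A2", "B1", "B1", "B1"],
    "year":       [2018, 2019, 2020, 2018, 2019, 2020, 2018, 2019, 2020],
    "population": [100, 110, 121, 200, 210, 231, 50, 55, 66],
})

# Count of counties in each state.
counties_per_state = sample.groupby("state")["county"].nunique()

# State population per year (the basis for the line chart).
state_year = sample.groupby(["state", "year"])["population"].sum()

# Given a state input, % annual increase using shift():
# shift(1) lines each year up with the previous year's value.
state_input = "A"
pop = state_year.loc[state_input]
pct_increase = (pop - pop.shift(1)) / pop.shift(1) * 100

# Latest year only: state totals sorted, and each state's share of the total.
latest = sample[sample["year"] == sample["year"].max()]
state_totals = latest.groupby("state")["population"].sum().sort_values(ascending=False)
state_share = state_totals / state_totals.sum() * 100

# Given a state input, the most populous counties in the latest year.
top_counties = latest[latest["state"] == state_input].nlargest(5, "population")

print(counties_per_state)
print(pct_increase)
print(state_share)
print(top_counties[["county", "population"]])
```

Each result is a Series or DataFrame, so in a notebook with matplotlib installed it can be drawn with the pandas plotting accessors, e.g. state_totals.head(10).plot.bar(), state_share.plot.pie(), pop.plot.line(), or .plot.barh() for a horizontal bar. The first value of pct_increase is NaN because the earliest year has no previous year to compare against.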

Data File: county_population.csv

You may want to examine the data in Excel or another spreadsheet program. You may also need .isna().any() and .isna().all() to find where empty fields exist, and .fillna() to fill empty cells or .dropna() to remove rows with empty cells where needed.

Original URL for data: https://data.nber.org/census/popest/county_population.csv

Please note that after downloading, this file should be opened with the latin-1 character set, i.e. df = pd.read_csv('county_population.csv', encoding='latin-1')
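As a rough sketch of the loading and cleanup steps, the example below reads a small latin-1 encoded CSV from memory instead of the downloaded file. The column names (county, pop2018, pop2019) and all values are invented for illustration and are not taken from county_population.csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file, encoded as latin-1.
# "\u00f1" is the letter n with a tilde, which is a single byte in latin-1.
raw = (
    "county,pop2018,pop2019\n"
    "Do\u00f1a Ana,100,\n"
    "Adams,,\n"
    "Baker,300,310\n"
).encode("latin-1")

# Same call as for the real file, with a bytes buffer in place of the filename.
df = pd.read_csv(io.BytesIO(raw), encoding="latin-1")

# Per column: does it contain at least one empty field? Is it entirely empty?
print(df.isna().any())
print(df.isna().all())

year_cols = ["pop2018", "pop2019"]

# Drop rows that have no population value in any year column.
cleaned = df.dropna(subset=year_cols, how="all")

# Fill the remaining empty cells. 0 is only a placeholder here;
# choose a fill value that makes sense for the chart being built.
filled = cleaned.fillna(0)

print(filled)
```

Passing how="all" with subset keeps rows that have at least one usable year, while the default how="any" would drop every row with any gap in those columns.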
