• Assigned: Sunday, February 28, 2016
• Due: By the beginning of class Thursday, March 8, 2016
• Submit via GitHub

Building Violations

Export the building violations data for 2016 to date as a CSV from the data portal:

https://data.cityofchicago.org/Buildings/Building-Violations/22u3-xenr

Also download the community areas shapefile:

https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6

(Click Export... then click Shapefile)

Finally, open the NLTK downloader from Python by running:

import nltk
nltk.download()

and then download the “stopwords” package from the All Packages tab.

1a

Using QGIS, generate a heatmap of building violations overlaid on the community areas. Save the map as an image (Project > Save as Image) and include it in your solution repository.

Open the building violations data in Python. The “VIOLATION INSPECTOR COMMENTS” field contains (uppercase!) free text describing each building violation.

1b

Concatenate the non-null inspector comments into one string.
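
As a hint, here is a minimal sketch of one way to do this with pandas. The three-row CSV is a made-up stand-in for the exported file (the ID column and the comment text are assumptions for illustration); only the “VIOLATION INSPECTOR COMMENTS” column name comes from the data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the exported CSV; the real file has many more
# rows and columns. Row 2 has an empty comment, which pandas reads as NaN.
sample_csv = io.StringIO(
    "ID,VIOLATION INSPECTOR COMMENTS\n"
    "1,NO SMOKE DETECTOR.\n"
    "2,\n"
    '3,"BROKEN WINDOW, 2ND FLOOR."\n'
)

df = pd.read_csv(sample_csv)

# Drop the null comments, then join the rest with a space so that the last
# word of one comment does not run into the first word of the next.
comments = df["VIOLATION INSPECTOR COMMENTS"].dropna()
all_comments = " ".join(comments)

print(all_comments)
```

For the real data, you would pass the path of your exported CSV to pd.read_csv instead of the in-memory sample.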

1c

Remove punctuation.

1d

Tokenize the inspector comments.

1e

Remove stopwords.

1f

Apply the Porter stemmer.

1g

What are the ten most common terms in the building violation inspector comments (after the above transformations)?
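
The steps in 1c through 1g can be sketched on a toy string as follows. This is only an illustration under stated assumptions: the stopword set is a tiny made-up list (use the NLTK “stopwords” corpus for the assignment), whitespace splitting stands in for an NLTK tokenizer, and the stem argument defaults to doing nothing so the sketch runs without NLTK. Passing nltk.stem.PorterStemmer().stem as stem would apply the Porter stemmer. The function name top_terms and the sample text are hypothetical.

```python
import string
from collections import Counter

# Hypothetical tiny stopword list, for illustration only.
STOPWORDS = {"the", "and", "on", "in", "no"}


def top_terms(text, n=10, stem=lambda token: token):
    """Return the n most common terms in text as (term, count) pairs."""
    # The comments are uppercase, but NLTK's English stopwords are
    # lowercase, so lowercase before comparing.
    text = text.lower()
    # 1c: delete every character listed in string.punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # 1d: tokenize on whitespace.
    tokens = text.split()
    # 1e: remove stopwords.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 1f: stem each token (identity unless a stemmer is passed in).
    stems = [stem(t) for t in tokens]
    # 1g: count the terms and keep the n most common.
    return Counter(stems).most_common(n)


sample = (
    "BROKEN WINDOW ON THE 2ND FLOOR. WINDOW FRAME ROTTED AND BROKEN; "
    "NO SMOKE DETECTOR IN THE HALL. REPLACE WINDOW."
)
top = top_terms(sample, n=2)
print(top)  # [('window', 3), ('broken', 2)]
```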

Introduction to Flask

As a starting point, we will use the web interface to a pandas DataFrame that we developed in class during the last lecture:

https://github.com/computationforpolicy/lecture-examples/tree/master/webapp

We’ll examine this web app and then add a new page.

2a

Recall from lecture that the code for our web application is stored in dfdisplay.py. Note that config.py contains the path to the CSV we’d like to display. The code in dfdisplay.py takes a CSV file and renders it as an HTML page:

@app.route('/')
def load_dataframe():
    return render_template('dataframe.html', data=df.to_dict(), cols=cols, nrows=len(df))

Try to run the application from the command line:

$ python dfdisplay.py

On the command line, it will produce output beginning with “ * Running on …”.

In a web browser, go to the address shown in that output (it should be an ip:port pair like 127.0.0.1:5000). As your web browser submits HTTP GET requests, the server running on the command line should log each one by printing a line of output. Paste one of these lines in your response.

2b

Look at templates/dataframe.html. We can run some code between {{ and }} and between {% and %}. This code is processed by the Jinja2 template engine that Flask uses. Note that not all Python expressions are supported by the Jinja2 template engine. What syntax differences do you notice between the code in the template and regular Python?
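
To experiment with the two kinds of delimiters outside of Flask, you can render a template string directly with the jinja2 package. The sketch below is a made-up one-line template, not an excerpt from dataframe.html, and the cols values are invented for illustration.

```python
from jinja2 import Template

# {% ... %} holds statements (loops, conditionals); {{ ... }} holds
# expressions whose value is written into the output.
template = Template(
    "{% for col in cols %}"
    "{{ col|upper }}{% if not loop.last %}, {% endif %}"
    "{% endfor %}"
)

rendered = template.render(cols=["id", "address"])
print(rendered)  # ID, ADDRESS
```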

2c

We want to add a link that will take us to an “About” page. Add a new route to the Python code in dfdisplay.py to accomplish this. Write a code stanza that routes ‘/about’ to the template about.html, which already exists in the templates/ directory. Your code should render this template and pass it a string variable called name that contains your name. Check that it works and that both pages render correctly. Copy this working webapp directory to your repo and commit.
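
If you want to check a route without opening a browser, Flask's built-in test client can request it for you. The sketch below is a standalone toy app, not dfdisplay.py: the inline template string is a hypothetical stand-in for templates/about.html, and render_template_string is used only so the example runs without a templates/ directory. In dfdisplay.py you would call render_template with the template file name instead.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical stand-in for the contents of templates/about.html.
ABOUT_TEMPLATE = "<h1>About</h1><p>Built by {{ name }}.</p>"


@app.route('/about')
def about():
    # The keyword argument becomes the variable "name" inside the template.
    return render_template_string(ABOUT_TEMPLATE, name="Your Name")


# Issue a GET request against the route without starting the server.
with app.test_client() as client:
    response = client.get('/about')
    body = response.get_data(as_text=True)

print(response.status_code)
print(body)
```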