Classifying Forest Cover Type

In this assignment, you will analyze the Forest Cover data set, build several classifiers to predict the forest cover type of a parcel of land, and evaluate their performance. Although we previously used the forest cover data from Kaggle, this assignment uses the larger, more complete forest cover data set from the UCI data repository.
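As a rough illustration of the build-and-evaluate loop described above, here is a minimal sketch using scikit-learn. It runs on synthetic data with seven classes as a stand-in for the real data set. The choice of a random forest, the split ratio, and every parameter value are assumptions made for illustration, not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 7 classes, 20 numeric features.
X, y = make_classification(
    n_samples=700,
    n_features=20,
    n_informative=6,
    n_redundant=0,
    n_classes=7,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out 30% of the rows for evaluation, preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One possible classifier; any other scikit-learn estimator fits the same pattern.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Overall accuracy plus a per-class breakdown of the errors.
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=list(range(7)))
print(f"accuracy: {accuracy:.3f}")
print(cm)
```

Swapping in a different estimator or metric only changes the lines that construct `clf` and compute the scores, which makes it easy to compare several classifiers on the same split.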

While you will be iteratively wrangling the data, selecting relevant features, tweaking classifiers, and evaluating the results, your final report (in the form of a Jupyter Notebook) should describe only the most essential parts of your efforts (i.e., do not describe in full detail all your missteps and detours). Your report should discuss:

Other considerations:

Deliverables

  1. Jupyter Notebook (ipynb file) describing your work
  2. Any additional scripts you have used outside of the Python/Jupyter environment

If you have problems running the processing within Jupyter Notebook, you may run the heavy-lifting Python code outside of it, but you must still write up your work in a Jupyter Notebook for submission.
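One way to split the work, sketched here under assumptions, is to have the external script write its results to a small JSON file that the notebook then loads and discusses. The function names `save_results` and `load_results`, the file name, and the example values are hypothetical.

```python
import json
from pathlib import Path


def save_results(results: dict, path: str) -> None:
    """Call from the external script once the heavy computation finishes."""
    Path(path).write_text(json.dumps(results, indent=2))


def load_results(path: str) -> dict:
    """Call from the notebook to read back what the script produced."""
    return json.loads(Path(path).read_text())
```

In the script, this might look like `save_results({"model": "random_forest", "accuracy": acc}, "results.json")`, and in the notebook `results = load_results("results.json")`. The script itself still has to be submitted along with the notebook.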

Do not include the data set in your submission. You must, however, submit all code and scripts needed to run your analysis starting from the data set as downloaded from the website.
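To make the analysis reproducible from the raw download, the first step of your code can read the file exactly as it comes from the website. The sketch below assumes a gzip-compressed, headerless CSV with 55 integer columns per row (54 features followed by the cover type label). Check this layout against the data set's own description before relying on it. The function name `load_covtype` is hypothetical.

```python
import csv
import gzip

N_COLUMNS = 55  # assumption: 54 feature columns + 1 label column


def load_covtype(path):
    """Read a gzip-compressed, headerless CSV into (features, labels) lists."""
    features, labels = [], []
    with gzip.open(path, "rt", newline="") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            if not row:
                continue  # skip blank lines
            if len(row) != N_COLUMNS:
                raise ValueError(
                    f"line {line_no}: expected {N_COLUMNS} columns, got {len(row)}"
                )
            values = [int(v) for v in row]
            features.append(values[:-1])  # everything except the last column
            labels.append(values[-1])     # last column is the class label
    return features, labels
```

Failing loudly on an unexpected column count helps catch a truncated or mismatched download early, before any modeling starts.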

Submission Procedure

Submit your files via Laulima->Assignment.

If you have many files, you may zip them into one archive (zip and tgz are accepted; rar is NOT accepted).
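If you prefer to build the archive from Python, the following is a minimal sketch using the standard library's `zipfile` module. The function name `make_archive` and the example file names are hypothetical.

```python
import zipfile
from pathlib import Path


def make_archive(files, archive_path):
    """Write the given files into a zip archive, stored by base name only."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            f = Path(f)
            zf.write(f, arcname=f.name)  # drop directory components
    return archive_path
```

For example, `make_archive(["report.ipynb", "train.py"], "submission.zip")` would bundle a notebook and one helper script. Remember to leave the data set out of the file list.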