Getting Started with Python

1. Get Acquainted with the Jupyter Environment

2. Get Some Data

We will use the forest cover data set from Kaggle.

3. Reading the Data

Data types, data structures, array indices
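
A minimal sketch of the basic types, a list, a dict, and zero-based indexing. All values and column names here are made up for illustration and are not taken from the actual data set.

```python
# Basic data types
elevation = 2500           # int
slope = 3.5                # float
column = "Elevation"       # str
is_forest = True           # bool

# A list is an ordered, mutable sequence; indices start at 0
elevations = [2500, 2600, 2700, 2800]
first = elevations[0]      # first element
last = elevations[-1]      # negative indices count from the end
middle = elevations[1:3]   # slice: indices 1 and 2 (the end index is exclusive)

# A dict maps keys to values (the keys below are hypothetical column names)
row = {"Id": 1, "Elevation": 2500, "Cover_Type": 5}
cover = row["Cover_Type"]

print(type(elevation).__name__, type(slope).__name__, type(column).__name__)
print(first, last, middle, cover)
```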

Reading CSV files
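
One possible way to read a CSV file, sketched with the standard-library csv module. The inline sample, its column names, and the file name train.csv mentioned in the comment are assumptions; substitute whatever file you downloaded.

```python
import csv
import io

# Hypothetical stand-in for the downloaded file. In the notebook you would
# use open("train.csv", newline="") instead (the file name is an assumption).
sample = io.StringIO(
    "Id,Elevation,Cover_Type\n"
    "1,2500,5\n"
    "2,2600,5\n"
    "3,2700,2\n"
)

reader = csv.DictReader(sample)   # the first line supplies the column names
rows = []
for record in reader:
    # csv yields every field as a string, so convert the numeric columns
    rows.append({key: int(value) for key, value in record.items()})

print(reader.fieldnames)
print(rows[0])
```

Each row becomes a dict keyed by column name, so a single column can be collected with a list comprehension such as [r["Elevation"] for r in rows].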

4. Manipulating the Data

Copy semantics vs side effects
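
A small sketch of the difference between assigning a second name to a list (no copy, so changes are visible through both names) and making an actual copy. The function names to_km and to_km_in_place are hypothetical, chosen only for this illustration.

```python
import copy

elevations = [2500, 2600, 2700]

alias = elevations            # no copy: both names refer to the same list
alias[0] = 0                  # side effect: elevations changes too

shallow = list(elevations)    # a new list (shallow copy)
shallow[1] = 0                # elevations is unaffected


def to_km_in_place(values):
    """Convert metres to kilometres by mutating the caller's list."""
    for i in range(len(values)):
        values[i] = values[i] / 1000


def to_km(values):
    """Return a new list in kilometres and leave the input untouched."""
    return [v / 1000 for v in values]


km = to_km(elevations)        # elevations keeps its values
work = list(elevations)
to_km_in_place(work)          # work itself is modified

nested = [[1, 2], [3, 4]]
deep = copy.deepcopy(nested)  # also copies the inner lists
deep[0][0] = 99               # nested is unaffected

print(elevations, shallow, km, work, nested)
```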

Transformation pipelines
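
A transformation pipeline can be sketched as a chain of small steps, each consuming the output of the previous one. The rows below are made-up sample data with hypothetical column names.

```python
rows = [
    {"Id": 1, "Elevation": 2500, "Cover_Type": 5},
    {"Id": 2, "Elevation": 2600, "Cover_Type": 5},
    {"Id": 3, "Elevation": 2700, "Cover_Type": 2},
    {"Id": 4, "Elevation": 2800, "Cover_Type": 2},
]

# Step 1: keep only the rows with one cover type (filter)
type_2 = (r for r in rows if r["Cover_Type"] == 2)

# Step 2: pull out a single column (map)
elevations_m = (r["Elevation"] for r in type_2)

# Step 3: convert metres to kilometres and collect the result
elevations_km = [e / 1000 for e in elevations_m]

print(elevations_km)  # [2.7, 2.8]
```

The first two steps are generator expressions, so nothing is computed until the final list comprehension pulls values through the chain, and the original rows are never modified.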

For-loops
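
Three common for-loop patterns, sketched on made-up values: accumulating a total, looping with an index, and counting occurrences.

```python
elevations = [2500, 2600, 2700, 2800]
labels = [5, 5, 2, 2, 2]

# Accumulate a running total
total = 0
for e in elevations:
    total += e

# enumerate() yields (index, value) pairs
highest_index = 0
for i, e in enumerate(elevations):
    if e > elevations[highest_index]:
        highest_index = i

# Count how often each label occurs
counts = {}
for label in labels:
    counts[label] = counts.get(label, 0) + 1

print(total, highest_index, counts)
```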

5. Getting the Basic Statistics

Find the mean and distinct values

Function definitions
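
As an illustration of function definitions, here is one way to write the mean and the distinct values as plain Python functions. The names mean and distinct and the sample values are assumptions made for this sketch.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def distinct(values):
    """Return the distinct values in sorted order."""
    return sorted(set(values))


elevations = [2500, 2600, 2700, 2800]
labels = [5, 5, 2, 2, 2]

print(mean(elevations))   # 2650.0
print(distinct(labels))   # [2, 5]
```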

Test yourself: Write a function to calculate the variance of a given column in the data set.

numpy

numpy statistics
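
A short sketch of the numpy equivalents, using made-up values. Note that var() computes the population variance by default; pass ddof=1 for the sample variance.

```python
import numpy as np

elevations = np.array([2500, 2600, 2700, 2800])
labels = np.array([5, 5, 2, 2, 2])

mean = elevations.mean()               # arithmetic mean
pop_var = elevations.var()             # population variance (ddof=0)
sample_var = elevations.var(ddof=1)    # sample variance (divides by n - 1)

# Distinct values together with how often each occurs
values, counts = np.unique(labels, return_counts=True)

print(mean, pop_var, sample_var)
print(values, counts)
```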

scipy statistics
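
scipy.stats.describe returns several summary statistics in one call. A minimal sketch on made-up values follows; the variance it reports is the sample variance (ddof=1 by default).

```python
from scipy import stats

elevations = [2500, 2600, 2700, 2800]

summary = stats.describe(elevations)

print(summary.nobs)       # number of observations
print(summary.minmax)     # (minimum, maximum)
print(summary.mean)       # arithmetic mean
print(summary.variance)   # sample variance (ddof=1)
print(summary.skewness)   # 0 for a symmetric sample like this one
```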

6. Plotting the Data

matplotlib

Plot id and elevation.

Try sorting the data by decreasing elevation.

Plot a histogram of the forest cover labels.

Plot a histogram of another attribute.
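
The first and third plotting tasks can be sketched as follows; sorting and the second histogram are left as exercises. The ids, elevations, and labels are made-up sample values, and the histogram bins are derived from the labels so that each integer label gets its own bar.

```python
import matplotlib.pyplot as plt

# Made-up sample values standing in for columns of the data set
ids = [1, 2, 3, 4, 5]
elevations = [2500, 2800, 2600, 2700, 2550]
labels = [5, 5, 2, 2, 2]

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Elevation against id
ax_line.plot(ids, elevations, marker="o")
ax_line.set_xlabel("Id")
ax_line.set_ylabel("Elevation")

# One bin per integer label; align="left" centres each bar on its label
bins = range(min(labels), max(labels) + 2)
counts, edges, _ = ax_hist.hist(labels, bins=bins, align="left", rwidth=0.8)
ax_hist.set_xlabel("Cover label")
ax_hist.set_ylabel("Count")

fig.tight_layout()
# In a notebook the figure renders when the cell finishes;
# in a plain script, call plt.show() to display it.
```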

7. Training a Model

scikit-learn

Decision trees
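
A minimal sketch of fitting a decision tree with scikit-learn. The two features (elevation and slope), the labels, and the max_depth setting are made up for illustration; the real data set has more columns and rows.

```python
from sklearn.tree import DecisionTreeClassifier

# Each row is [elevation, slope]; values and labels are invented
X = [
    [2500, 10], [2550, 12], [2600, 8],
    [3000, 20], [3100, 25], [3050, 22],
]
y = [2, 2, 2, 1, 1, 1]

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X, y)

# Predict labels for two unseen rows
predicted = model.predict([[2520, 11], [3080, 21]])
print(predicted)
```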

8. Testing the Model
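
One common way to test a model is to hold out part of the data and measure accuracy on it. The sketch below uses synthetic data in which the label depends only on elevation; the threshold of 2800, the feature names, and the 25% test split are assumptions chosen for the example.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the label is determined by elevation alone
rng = np.random.default_rng(0)
elevation = rng.uniform(2000, 3500, size=200)
slope = rng.uniform(0, 40, size=200)
X = np.column_stack([elevation, slope])
y = np.where(elevation > 2800, 1, 2)

# Hold out 25% of the rows; the model never sees them during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Fraction of held-out rows that were labelled correctly
accuracy = accuracy_score(y_test, model.predict(X_test))
print(accuracy)
```

Accuracy measured on the training rows would be overly optimistic, which is why the score is computed on the held-out rows only.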

Notebook Files