Data Science Foundations

CIS 705 (CRN 89862) — ICS 691E Topics in CS Area 4 (CRN 89321)

Tue & Thu 12:00 PM - 1:15 PM, Hamilton Library Basement Room 2K


Data Science Foundations is a graduate course that introduces the foundations of data science to non-specialists; computer programming is not a prerequisite. The course covers: ethics, policy, and regulatory frameworks; the data analysis process; programming tools; data acquisition and cleaning; data analysis and mining methods; data visualization; publication, curation, and preservation; and applications of data science in various domains, industries, and sectors.

For more information, please consult the syllabus.

Instructor: Lipyeow Lim. POST 303E. Wed 2-3pm or by appointment. 808-956-3495. lipyeow at hawaii dot edu.

Examinations: There is no written final exam.

Textbooks:

  1. DSFS - Data Science from Scratch: First Principles with Python. 1st Edition. Joel Grus. O’Reilly Media. ISBN-13: 978-1491901427
  2. DMAA - Data Mining and Analysis: Fundamental Concepts and Algorithms. Mohammed J. Zaki, Wagner Meira, Jr. Cambridge University Press. ISBN-13: 978-0521766333
  3. TS - Think Stats. 2nd Edition. Allen B. Downey.
  4. TB - Think Bayes. Allen B. Downey.

Communications: We will use Slack for communications, falling back on email where appropriate. Please post questions there so that the whole class can benefit. Remote students (limited to 3 participants) can join the class via appear.in.

Late policy: Work submitted after the due date and time will receive zero credit.

Student Conduct: All students are expected to conduct themselves above and beyond the standard set forth in the UH Systemwide Student Conduct Code.

Disability: Any student who feels s/he may need an accommodation based on the impact of a disability is invited to contact the instructor privately. The instructor will be happy to work with you and the KOKUA Program (Office for Students with Disabilities) to ensure reasonable accommodations in the course. KOKUA can be reached at (808) 956-7511 or (808) 956-7612 (voice/text) in room 013 of the Queen Liliuokalani Center for Student Services.

Schedule

Week Date Topic Before Class In Class After Class
1 Tue Aug 21 Introduction Slides | Syllabus | Buy vs Rent Install Python
1 Thu Aug 23 No class because of Hurricane Lane
2 Tue Aug 28 Introduction Foundational Methodology for Data Science Slides | Buy vs Rent video
2 Thu Aug 30 Python DSFS Ch.1-2 Getting Started with Python video
3 Tue Sep 4 Python - numpy, matplotlib, sklearn DSFS Ch.3 Getting Started with Python video
3 Thu Sep 6 Databases - data modeling Slides | Modeling Data video
4 Tue Sep 11 Databases - SQL Slides | Modeling Data video | Install sqlite3 | Assignment 1
4 Thu Sep 13 Databases - SQL DSFS Ch.23 Slides | SQL video
5 Tue Sep 18 Databases - SQL Slides | SQL video
5 Thu Sep 20 Linear Algebra - no F2F class DSFS Ch.4 Linear Algebra Review
6 Tue Sep 25 Linear Algebra - kNN, SVD DSFS Ch.12 kNN ipynb | Hi.Dim.ipynb | PCA ipynb video
6 Thu Sep 27 Linear Algebra - LDA, DT DSFS Ch.17 LDA ipynb | DT ipynb video
7 Tue Oct 2 Linear Algebra - SGD, SVM DSFS Ch.8 video | Assignment 1 Due.
7 Thu Oct 4 Linear Algebra - NN, OneVsAll, Kernel, Ensemble Ch.18 NN demos | Ensemble demos
8 Tue Oct 9 Probability & Statistics TS Ch.1-2 Stats 1 video
8 Thu Oct 11 Probability & Statistics - no F2F class TS Ch.3-4 Stats 2
9 Tue Oct 16 DS in Agriculture - smartyields video video
9 Thu Oct 18 Probability & Statistics - PMF, PDF, CDF TS Ch.5-6 Stats 3 video | Assignment 2
10 Tue Oct 23 Probability & Statistics - two random variables, estimation TS Ch.7-8 Stats 4 | Stats 5 video
10 Thu Oct 25 Ethics & Policy - Guest Lecture by Dr. Winter Readings for Ethics & Policy | WMD Ch.0 | WMD Ch.1 Slides video
11 Tue Oct 30 Probability & Statistics - Hypothesis Testing, CLT TS Ch.9+14 Stats 6 video
11 Thu Nov 1 Data - scraping, probabilistic interpretation, time series, text TS Ch.12 | DSFS Ch.9+20 Data Preproc. video
12 Tue Nov 6 Election Day Holiday
12 Thu Nov 8 Data - embeddings for text & graph, Bayesian stats TB Ch.1 video
13 Tue Nov 13 Probability & Statistics - Bayesian methods Stats 7 video
13 Thu Nov 15 Class Cancelled
14 Tue Nov 20 Cluster Analysis - k-means, hierarchical, Gaussian Mixture DSFS Ch.19 k-means video
14 Thu Nov 22 Thanksgiving Holiday
15 Tue Nov 27 Data Curation - Dr. Sutherland Slides video
15 Thu Nov 29 Cluster Analysis - frequent itemsets, LDA, Recommender Systems Slides | Slides | LDA video
16 Tue Dec 4 Project Project Videos
16 Thu Dec 6 Project Project Videos

About this site: Modules list the topics covered. Learning outcomes collect the desired student learning outcomes across all modules. Readings list the “passive” learning opportunities, such as reviewing textbook sections, web pages, and screencasts. Experiences list the “active” learning opportunities where you must actually demonstrate a capability.