ICS 491 Special Topics: Big Data Analytics (CRN 79342)

Mon & Wed 3:00PM - 4:15PM POST 126


ICS 491 is a special topics course covering the concepts and skills required for mining massive data sets, with a focus on the practical application of concepts, tools, and techniques in real-world data mining situations. The course teaches students what they need to know to get started: selecting appropriate big data platforms, preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining approaches. This is a writing-intensive course. For more information, please consult the syllabus.

Instructor: Lipyeow Lim. Office: POST 303E. Office hours: Wed 10:30AM - 12:30PM or by appointment. Phone: 808-956-3495. Email: lipyeow at hawaii dot edu.

Examinations: There is no written final exam, but there will be a course project.

Textbook: Data Mining and Analysis: Fundamental Concepts and Algorithms. Mohammed J. Zaki, Wagner Meira, Jr. Cambridge University Press, ISBN-13: 978-0521766333

Communications: We will be using Slack for communications (falling back on email where necessary). Please post questions there so that the whole class can benefit.

Remote Students: A Skype audio and video link will be available. Classes will be recorded and posted to YouTube.

Late policy: Work submitted past the due date and time will receive zero credit.

Student Conduct: All students are expected to conduct themselves above and beyond the standard set forth in the UH Systemwide Student Conduct Code.

Disability: Any student who feels s/he may need an accommodation based on the impact of a disability is invited to contact the instructor privately. The instructor would be happy to work with you and the KOKUA Program (Office for Students with Disabilities) to ensure reasonable accommodations in the course. KOKUA can be reached at (808) 956-7511 or (808) 956-7612 (voice/text) in room 013 of the Queen Liliuokalani Center for Student Services.

Schedule

Week Date Topic Before Class In Class After Class
1 Mon Aug 21 Introduction Slides | Syllabus Install Python & Libraries
1 Wed Aug 23 Introduction - data acquisition Ch.1.{1-2} Python Intro video | HW1
2 Mon Aug 28 Introduction - linear regression Ch.1.{3-5} Python Intro video
2 Wed Aug 30 Introduction - nearest neighbor Ch.1 Slides video
3 Mon Sep 4 Labor Day Holiday HW2 | HW1 due
3 Wed Sep 6 No F2F Class. CLUSTER Conference. Watch video. Do HW2. video
4 Mon Sep 11 Thinking about Data Ch.2-3 Slides | Ex1. Analyzing Numeric Data video | HW3 | HW2 due
4 Wed Sep 13 Frequent Itemset Analysis Ch.8 Slides | Ex2. Analyzing Co-occurring Events video
5 Mon Sep 18 Frequent Itemset Analysis - FP Ch.8 Slides video | HW3 due Sep 19
5 Wed Sep 20 Collaborative Filtering Ex3. Analyzing Movie Ratings Data video | HW4
6 Mon Sep 25 Collaborative Filtering - Alternating Least Squares ICDM 2008 paper video
6 Wed Sep 27 Cluster Analysis - k-means Ch.13 Slides | Ex4. Analyzing Clusters video | HW5
7 Mon Oct 2 Cluster Analysis - hierarchical Ch.14 video | HW4 due.
7 Wed Oct 4 No F2F class. FutureFocus Conference. Learn about Docker Containers. A Beginner-Friendly Introduction to Containers, VMs and Docker Ex5. Containers HW6
8 Mon Oct 9 Big Data Platforms - Hadoop Slides video
8 Wed Oct 11 Big Data Platforms - Hadoop Run the Hadoop Standalone example Ex6. Hadoop video | HW7 | HW5 due on Fri Oct 13.
9 Mon Oct 16 Big Data Platforms - Spark Slides video | Start thinking about project | HW8 | HW6 due.
9 Wed Oct 18 Cluster Analysis - density based Ch.15 Slides | Project video
10 Mon Oct 23 Probabilistic Classification - Naive Bayes, Bayesian Networks Ch 18 Slides video
10 Wed Oct 25 Probabilistic Classification - revisit EM, LDA Ch 13.3 | Latent Dirichlet Allocation Slides video | Project Proposal due.
11 Mon Oct 30 Decision Trees & Forests Ch 19 Slides video
11 Wed Nov 1 Data Science @ Booz Allen Hamilton Talk video | HW7 due
12 Mon Nov 6 Dimensionality Reduction Ch 6-7 Slides | Slides video | HW8 due.
12 Wed Nov 8 Linear Discriminant Analysis | SVM | Logistic Regression Ch 20-21 | Logistic Regression Slides | Slides | Slides video
13 Mon Nov 13 Feature Engineering - tf.idf Slides | Slides video
13 Wed Nov 15 Deep Learning | Feature Engineering - neural networks, skip gram, CBOW But what is a neural network? | Gradient Descent: how neural networks learn | What is backpropagation? Neural Networks video
14 Mon Nov 20 Feature Engineering | Deep Learning - word2vec, audio data, RNN, LSTM, autoencoders Slides video
14 Wed Nov 22 Deep Learning - CNN for image data (by Jonas Krause) Slides video
15 Mon Nov 27 The Dark Side Weapons of Math Destruction Ch.0 | Weapons of Math Destruction Ch.1 Discussion on WMD
15 Wed Nov 29 Visual Analytics (Aberto) Slides video
16 Mon Dec 4 Project Kyle | Stephanie | Hailing | Ed
16 Wed Dec 6 Project Ayush | Mano | Eric | Ling-chih | Wyatt

About this site: Modules lists the topics covered. Learning outcomes collects the desired student learning outcomes from all the modules. Readings lists the “passive” learning opportunities, such as reviewing textbook sections, web pages, and screencasts. Experiences lists the “active” learning opportunities where you must actually demonstrate a capability.