Modeling

MATERIALS

Course videos & presentation materials

Module 1

Welcome & Introduction

Weeks 1-2

Read

Beginner's Guide to Python

Start with the Video for Module 1 above & this link if you're new to programming

Interactive | Watch

Video and Slides for Module 1 (above & in PDF)

Do

  1. Download & install Python

  2. Download & install PyCharm (or other editor)

  3. Bookmark W3Schools Python Intro

  4. Optional: Sign up for zyBooks and subscribe to the Python book (Code: UTSABME6303Fall2020) - zyBooks is additional practice

  5. Try the first code "3 ways" as presented in the video lecture (@ ~28:20 into the video)

Slides for Module 1

Module 2

Python Syntax 

Weeks 2-3

Read

W3Schools: Python Syntax Intro

Interactive | Watch

Video and Slides for Module 2 (above & in PDF)

Do

  1. Complete Class Survey (also above)

  2. Make & Upload a 15-30 sec Introduction Video about yourself here

  3. Complete W3Schools Python Exercises 1-34, Syntax through Booleans

  4. Optional: Complete zyBooks Chapters 1 (Intro) & 2 (Variables & Expressions)

  5. Try the “Joining a tiger team” syntax practice as presented in the video lecture (@ 27:48 in the video)

Module 3

Python Operators 

Weeks 3-4

Read

  1. W3Schools: Python Operators Intro

  2. Python Reference Operators Guide (credit:  Jakub Przywóski)

Interactive | Watch

Video and Slides for Module 3 (above & in PDF)

Do

  1. Complete W3Schools Python Exercises 35-39 

  2. Optional: Complete zyBooks Chapters 3 & 4

  3. Complete Coding Challenge I. Submit via Blackboard for grading by next Tuesday, Sept 15th at 11:59 pm.

  4. Walk through why 1^2 returns 3, 1^10 returns 11, and 1^11 returns 10 in the Python command shell. For more background, see the Module 3 video @ 7:59 for an illustration of the bitwise operator XOR (^). Read more about Python logic gates in this tutorial.
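A minimal sketch of the walk-through, assuming 4-bit binary is wide enough to display the values involved. In Python, `^` is bitwise XOR: each result bit is 1 only when the two input bits differ. The helper name `show_xor` is made up for this illustration.

```python
# Bitwise XOR compares two integers bit by bit:
# a result bit is 1 only when the two input bits differ.
def show_xor(a, b, width=4):
    """Print a, b and a ^ b in binary, then return a ^ b."""
    result = a ^ b
    print(f"{a:>2} = {a:0{width}b}")
    print(f"{b:>2} = {b:0{width}b}")
    print(f"{a} ^ {b} = {result:0{width}b} = {result}")
    return result

show_xor(1, 2)    # 0001 ^ 0010 = 0011 = 3
show_xor(1, 10)   # 0001 ^ 1010 = 1011 = 11
show_xor(1, 11)   # 0001 ^ 1011 = 1010 = 10
```

Because 1 is `0001` in binary, XOR with 1 simply flips the last bit of the other number, which is why 10 becomes 11 and 11 becomes 10.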

Module 4

Python Logic Expressions 

Weeks 4-5

Read

W3Schools: Python "If....Else", "While" and "For Loops" Intro

Interactive | Watch

Slides for Module 4

Do

  1. Watch introduction videos from classmates

  2. Upload a 15-30 sec video if you have not already here by Sept 18. You will be assigned a team via email by Sept 21. Introduce yourself to your team project members.

  3. Complete W3Schools Python Exercises 61-78, "If....Else", "While" and "For Loops"

  4. Optional: Complete zyBooks Chapters 5 & 6

  5. Try the “Loop & logic” logic expression practice as presented in the video lecture

Module 5

Python Applications:

Omics and Biosensor Data Processing 

Weeks 5-6 

Read

1) New to biomedicine? Read about "Omics" (Wikipedia Entry)

2) A Primer on Data Analytics in Functional Genomics: How to Move from Data to Insight?

3) 10 Clustering Algorithms with Python

4) Read about the data and challenges for the 3 class project options:

  1. Activity data from fitness wearables (data download).

  2. Proteomics & clinical data from cancer patients (data download)

  3. Facebook data on COVID symptoms

 

Interactive | Watch

Participate in the live session (9/21), watch the "Introduction to Clustering" tutorial video (9/28) and / or review slides 

 

Link to data examples used in live session

Do

  1. Cluster the example fitness data in your favorite interactive high-dimensional data tool. For an example in Biowheel, see below: 

    1. Register for a free Biowheel account (academic.dibsvis.com). For more details on using Biowheel to visualize data, click here.

    2. Download the Fit_Profile_2020.xlsx file, then drag and drop the data into the Biowheel browser. 

    3. Check that the data uploaded by clicking the "data" icon (first icon on the left)

    4. Pick the data types you want to cluster (e.g., Miles Ran, Calories) by clicking the "pie" icon then the name of the data type 

    5. Cluster the data by clicking the "magic wand" icon. For details on using Biowheel to cluster data, see video here

      • Try 2, 3 and 4 clusters. In k-means, k is the number of clusters

      • Try adding or deleting the data types you cluster

  2. Introduce yourself to your project teammates (assigned via email on 9/21-9/22)

  3. Decide on a class project topic with your teammates: (1) identify trends in cardiovascular health from fitness watch & smartphone data; (2) predict leukemia patients' response to chemotherapy from clinical and proteomics data; or (3) develop an early detection method from Facebook COVID symptoms data

  4. Download the data for your choice of class project. For open source biomedical data stored on Synapse, you need to register an email first.

  5. Optional: Read about Biopython and download Biopython if your team plans to use it for your course project (useful for molecular data)

Tutorial

Introduction to Clustering

Q & A Session (9/28)

Module 6

Python Functions 

Weeks 6-7 

Read

1) K-means Clustering in Python using Scikit-learn and a tutorial

2) A second K-means Clustering in Python tutorial

3) A third K-means Clustering in Python tutorial

4) Read about the syntax of Python functions in W3Schools 

Interactive | Watch

  1. Review the tutorial video "Introduction to Clustering" (from 9/28 Q & A session). 

Do

  1. Complete W3Schools Function Exercises

  2. Install scikit-learn into PyCharm or your favorite editor. Follow these instructions for PyCharm. 

  3. Download your team project data:

  4. Cluster a subset of the data in Biowheel (see Module 5) or your favorite data clustering visualization tool.

    1. For more details on using Biowheel to visualize data, see video tutorials here. 

    2. For details on using Biowheel to cluster data, see video here

  5. Copy  & modify the k-means algorithm described here into an executable Python function in PyCharm in a directory where it can call your data as input

  6. Cluster your class project data (or a subset of it) in Python using the k-means function you saved.   
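As a rough illustration of steps 5 and 6, here is a minimal from-scratch k-means function written with NumPy. It is a sketch of the standard Lloyd's algorithm, not the linked tutorial's code, and the function name `kmeans`, its parameters, and the example values are all assumptions made for this illustration.

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Cluster the rows of `data` into k groups with Lloyd's algorithm."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points; keep it if the cluster is empty.
        new_centers = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Made-up values for illustration: columns are (miles ran, calories).
example = np.array([[1.0, 200], [1.5, 250], [1.2, 220],
                    [6.0, 900], [6.5, 950], [5.8, 880]])
labels, centers = kmeans(example, k=2)
print(labels)
print(centers)
```

Note that the columns here are on very different scales, so calories dominate the distance calculation. Rescaling the columns before clustering is one common way to handle that.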

Logic Gates

Introduction to Binary Numbers & Bitwise Operators

Q & A Session (10/5)

Recap of Python Operators 

Interactive | Watch

 

Review Python operators: watch the Module 3 video on Python Operators, and read more about Python logic gates and the binary number system

 

Question: Why does 1^11 return 10 in Python?

 

Read more about Python logic gates like the XOR gate above in this JournalDev tutorial (credit: Pankaj Kumar). 

Circular Heatmaps & Visual Clustering

Introduction to Biowheel

Q & A Session (10/5)

Interactive | Watch

 

Circular Heatmaps

Biowheel is a software program for making circular heatmaps, available at academic.dibsvis.com.

  1. Register for a free academic account.

  2. Drag and drop Excel files into the program

  3. Select desired data and format. 

For details on using Biowheel to visualize data, see video tutorials here.

As noted, Dr. Qutub and Qutub lab alumni co-founded DiBS, which developed Biowheel. The academic portal is provided free for academic research use. Citations: Hill et al., Nature Methods, 2016; DREAM 8 Breast Cancer Portal; bioRxiv 099739; Bioinformatics Peer Prize.

Modified K-Means Clustering

For details on using Biowheel to cluster data (the "magic wand" icon @ academic.dibsvis.com), read the description here. Clustering can be performed in an unsupervised (k-means) or semi-supervised manner.  

NOTE: Students are welcome to use any choice of tool to visualize data in clusters (e.g., Excel, Biowheel, Cluster 3.0, cBioPortal) and/or design their own program. The goal of displaying the clusters via a heatmap is to identify how the cluster algorithm is handling data and grouping patterns. 

Module 7

Python Functions - Part 2 

Weeks 7-8 

Read

1) Read (or re-read) about the syntax of Python functions in W3Schools 

2) Read the Python function tutorial from Tutorials Point.

3) Read about built-in Python functions at Python.org.

4) Read another Python function tutorial from Programiz.

Interactive | Watch

  1. Participate in the live Q & A session (10/5) on logic operators and Python functions.

  2. Review the video tutorials on Biowheel data visualization and interactive clustering. Alternatives: review tutorials on cBioPortal,  Cluster 3.0, or your choice of favorite interactive, high-dimensional data visualization platform. 

Do

  1. Complete Coding Challenge II. Submit via Blackboard for grading by next Tuesday, Oct 13th at 11:59 pm.

  2. Meet with your team members for the class project

    1. Plan a weekly schedule for completing the project over the next 8 weeks.

    2. Assign weekly project parts to each team member

  3. Optional: Complete Chapter 6: Functions in zyBooks.

 

NOTE: Final coding presentations are 12/3 and the project report is due 12/11. The project counts for 50% of the course grade. The three Coding Challenges count for 36% (12% per Coding Challenge).

Module 8

Python Functions - Part 3 

Weeks 8-9 

Read

1) Read (or re-read) about clustering in Python using scikit-learn (sklearn)

2) Read (or re-read) "getting started" tips for PyCharm

3) Read (or re-read) installation tips for scikit-learn (see Module 6)

Trouble with installation? See tips on installing libraries below.

Interactive | Watch

  1. Optional: Participate in the live Q & A session (10/12) on PyCharm programs and clustering.

  2. Install packages (sklearn + matplotlib) as outlined here (see Module 6)

  3. Download and run the example k-means vs mini-batch k-means program covered in the Q & A, and/or other example clustering programs. Scroll to the bottom of the scikit-learn website page to download the source code.

Do

  1. Follow-up from Module 7: Complete Coding Challenge II if you have not already. Submit via Blackboard for grading by Tuesday, Oct 13th at 11:59 pm.

  2. Meet with your team members for the class project. Complete weekly assigned parts 

NOTE:

  • The fitness data for Coding Challenge II can be imported into your program; data can be input into a function you define; or data can be copy/pasted into an np.array, as shown in the Q & A on 10/12.

  • Coding Challenge II is focused on (1) learning to run programs in Python and (2) understanding the steps, assumptions and mathematics of clustering data. You will not be graded on data input/output for Coding Challenge II, and this material will be covered in later Modules.

  • Need help getting started for Coding Challenge II? Try the default k-means cluster algorithm from scikit-learn in the Python command shell on values you provide manually in a matrix via np.array. Example screenshot below.

Example of applying k-means in the Python command shell
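In case the screenshot is unavailable, the following is a minimal sketch of the same idea using scikit-learn's `KMeans`. The data values are made up for illustration and are not the Coding Challenge II fitness data. The lines can be typed one at a time into the Python command shell or saved as a script.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up values typed in by hand: each row is one person,
# columns are (miles ran, calories burned).
X = np.array([[1.0, 200], [1.5, 250], [1.2, 220],
              [6.0, 900], [6.5, 950], [5.8, 880]])

# n_init and random_state are set explicitly so repeated runs give the same clusters.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(model.labels_)           # cluster index assigned to each row
print(model.cluster_centers_)  # one center per cluster
```

Changing `n_clusters` to 3 or 4 is a quick way to see how the choice of k changes the grouping.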

Once you are comfortable with clustering using the command shell, try downloading and modifying a cluster program or writing your own in PyCharm or another editor. 

 

Installation errors? Check out installation tips on scikit learn and NumPy below.

Installation Tips

Installing Scikit Learn, NumPy & matplotlib in PyCharm 

Q & A Session (10/12)

Install Scikit Learn (sklearn + NumPy)

in PyCharm

by the terminal

Install matplotlib

in PyCharm

by the terminal

K-Means Example

Python Code

Module 9

Introduction to Image Analysis in Python

Weeks 9-10 

Read

Read about Scikit-image, a Python image analysis package, and review example image processing code here. A user guide is here.

Interactive | Watch

  1. Participate in the live class session (10/19) on a basic introduction to image analysis in Python.

Do

  1. Install packages scikit-image, scipy and matplotlib.

  2. Access and run the example code provided from the live class session.

    • Access the example code and cell image here

    • Run the code in PyCharm or another Python editor

    • Try uncommenting and commenting out parts of the code to display images and manipulations of images

  3. Meet with your team members for the class project. Complete weekly assigned parts 
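As a hedged illustration of the kind of image manipulation covered in this module, the sketch below thresholds and counts bright objects with scikit-image. It is not the class example code: it uses a synthetic image built with NumPy as a stand-in for the cell image, and the sizes and intensities are arbitrary assumptions.

```python
import numpy as np
from skimage import filters, measure

# Synthetic stand-in for a cell image: two bright squares
# on a dark, slightly noisy background.
rng = np.random.default_rng(0)
image = rng.normal(loc=0.1, scale=0.02, size=(64, 64))
image[10:20, 10:20] += 0.8
image[40:55, 30:45] += 0.8

# Otsu's method picks a threshold separating bright objects from background.
threshold = filters.threshold_otsu(image)
mask = image > threshold

# Label connected bright regions and count them.
labels, n_objects = measure.label(mask, return_num=True)
print(f"threshold = {threshold:.2f}, objects found = {n_objects}")
```

To view the result, `matplotlib.pyplot.imshow(mask)` followed by `matplotlib.pyplot.show()` displays the thresholded image.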

Module 10

Introduction to Objects, Inheritance, Modules

Weeks 10-11 

Read

  1. An introduction to modules at Python.org

  2. W3Schools introduction to classes, objects and inheritance

Interactive | Watch

  1. Optional: Participate in the live Q & A session (10/26) 

Do

  1. Complete W3Schools exercises on objects, inheritance and modules

  2. Access and run the example code provided from the live class session.

    • Access the example code on modules here

    • Run the code in PyCharm or another Python editor

    • Try defining a new child class of favoriteIceCream that inherits all attributes of favoriteIceCream and includes additional attributes of 'color' and 'temperature'

  3. Meet with your team members for the class project. Complete weekly assigned parts

  4. Optional: Complete zyBooks chapters 9, 11 and 13 on Python classes, modules and inheritance  
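One possible shape for the child-class exercise in step 2 is sketched below. The parent class here is a hypothetical stand-in: the real `favoriteIceCream` from the class example may define different attributes and methods, and the child class name `servedIceCream` is made up for this illustration.

```python
# Hypothetical stand-in for the class example's parent class; the real
# favoriteIceCream may define different attributes.
class favoriteIceCream:
    def __init__(self, flavor, topping):
        self.flavor = flavor
        self.topping = topping

    def describe(self):
        return f"{self.flavor} with {self.topping}"


class servedIceCream(favoriteIceCream):
    """Child class: inherits flavor and topping, adds color and temperature."""

    def __init__(self, flavor, topping, color, temperature):
        super().__init__(flavor, topping)   # reuse the parent's attributes
        self.color = color
        self.temperature = temperature

    def describe(self):
        base = super().describe()
        return f"{base}, {self.color}, {self.temperature} degrees C"


scoop = servedIceCream("mint", "chocolate chips", "green", -12)
print(scoop.describe())
```

Calling `super().__init__` keeps the parent's setup in one place, so any attribute later added to the parent is inherited by the child automatically.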

Module 11

Introduction to Machine Learning and AI in Python

Weeks 11-12 

Read

  1. W3Schools introduction to machine learning

  2. Review scikit-learn options for machine learning algorithms

Interactive | Watch

  1. Optional: Participate in the live Q & A session (11/9) on predictive algorithms

  2. Run the example linear regression model "example_linear_regression.py" provided in the Q & A and modify the code by changing the input data values for "cups of coffee" and "heartrate." Try out the example provided for principal component analysis, or PCA ("example_PCA_fitnessdata").

  3. Google's Crash Course in Machine Learning
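For readers without the Q & A file at hand, here is a minimal sketch of a one-variable linear regression in scikit-learn. It is not the contents of "example_linear_regression.py"; the variable names and the coffee and heartrate values are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up values for illustration only; replace with your own measurements.
cups_of_coffee = np.array([0, 1, 2, 3, 4, 5]).reshape(-1, 1)  # one input column
heartrate = np.array([62, 66, 69, 75, 78, 83])                # beats per minute

model = LinearRegression().fit(cups_of_coffee, heartrate)

print("slope (bpm per cup):", model.coef_[0])
print("intercept (bpm):", model.intercept_)
print("predicted heartrate after 6 cups:", model.predict([[6]])[0])
```

The `reshape(-1, 1)` call is needed because scikit-learn expects inputs as a two-dimensional array with one row per sample and one column per input variable.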

Do

  1. Complete Coding Challenge III and submit your answers through Blackboard by 11/17 at 11:59 pm.

  2. Meet with your team members for the class project. Complete weekly assigned parts

Module 12

Python File Handling & Intro to Databases

Intro to Pandas data structures

Linear regression example applied to multiple input variables 

Weeks 12-13 

Read

  1. W3Schools introduction to file handling

  2. W3Schools background on multiple input linear regression models

  3. Review scikit-learn's datasets

  4. Introduction to pandas

Interactive | Watch

  1. Optional: Participate in the live Q & A session (11/16) on file handling and databases used in the predictive algorithms or watch the tutorials above.

  2. Run the example linear regression models "example_linear_regression_2.py" and "multiple_input_linear_regression.py" provided in the Q & A and modify the code to import a new dataset from the Scikit-learn databases or a file you upload. Decide on input variables and what you want to predict.

    • Apply linear regression to this dataset.

    • Try using multiple input variables to predict the output or target attribute. See the videos above and the code examples  "multiple_input_linear_regression.py" and "data_Example.py"

  3. Try out other predictive algorithms including Random Forest using the example code "example_Predictions.py"
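The multiple-input regression step can be sketched as follows, assuming the diabetes dataset bundled with scikit-learn as the new dataset. This is not the class file "multiple_input_linear_regression.py"; the choice of the `bmi` and `bp` columns as inputs is arbitrary and only for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Bundled scikit-learn dataset: 10 input columns, target = disease progression score.
data = load_diabetes()
names = list(data.feature_names)

# Pick two input variables (an arbitrary choice for this sketch).
columns = [names.index("bmi"), names.index("bp")]
X = data.data[:, columns]
y = data.target

# Hold out a quarter of the rows to check the fit on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
print("coefficients (bmi, bp):", model.coef_)
print("R^2 on held-out rows:", model.score(X_test, y_test))
```

Swapping `LinearRegression` for `sklearn.ensemble.RandomForestRegressor` is one way to try a different predictive algorithm on the same split.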

Do

  1. Complete Coding Challenge III and submit your answers through Blackboard by 11/17 at 11:59 pm.

  2. Meet with your team members for the class project. Complete weekly assigned parts

Modules 13-14

Working with Public Biomedical Databases

Weeks 13-14 

Read

  1. W3Schools introduction to making a database in Python using MySQL

  2. How to search PubMed with Python (Credit: Marco Bonzanini)

  3. Read requirements for the class project presentations and report.

Interactive | Watch

  1. Required: Participate in the live class presentations Thursday (12/3) on the final class projects.

  2. Optional: Participate in the live Q & A on the class project (11/30 at 1:30 pm) and class session (11/30 at 2:30 pm) on applications of Python to public biomedical data

  3. Explore examples of freely available biomedical databases. Examples:

    1. Allen Brain atlases

    2. Human Cell Atlas | Human Tissue Protein Atlas

    3. COVID-19 databases

    4. Leukemia Atlases

    5. Image databases

    6. Cluster databases

    7. Satellite databases (e.g., CNES, NASA ECOSTRESS)

    8. Space health & biomedical databases

Do

  1. Meet with your team members to work on completing the class project presentations for Thursday, Dec 3rd at 2:30 pm and reports due Friday, Dec 11th by 11:59 pm. 

Modules 15-16

Final Applications of Python to the Biomedical Industry Projects  

Weeks 15-16 

Read

  1. Read requirements for the class project presentations and report.

Interactive | Watch

  1. Participate in the live class presentations Thursday (12/3) on the final class projects.

  2. Provide feedback to teams on Thursday 12/3. Submit via email.

  3. Optional: Participate in the live Q & A on the class project (12/7 at 2:30 pm).

Do

  1. Complete the class project presentations for Thursday, Dec 3rd at 2:30 pm

  2. Submit final project reports on Blackboard by Friday, Dec 11th by 11:59 pm.