Classification and Clustering of Paediatric Cancers
Between March and November 2018, I completed Udacity's Machine Learning Engineer Nanodegree. After taking courses in supervised, unsupervised, deep, and reinforcement learning, I designed and implemented a capstone project: a series of studies on a dataset from a pan-cancer analysis of paediatric cancers.
I built a series of classifiers to predict cancer histotypes and trained them on the dataset, which comprises the activities of mutational signatures. The classifiers were a decision tree, a naive Bayes classifier, support vector machines, an ensemble method (AdaBoost), and a multilayer perceptron.
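To give a feel for that comparison, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data are synthetic stand-ins for signature activities, the histotype labels are just integers, the naive Bayes variant (Gaussian) is my assumption here, and all hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (an assumption): rows are tumours,
# columns are non-negative signature activities, labels are histotypes.
rng = np.random.default_rng(0)
n_per_class, n_signatures, n_histotypes = 60, 8, 3
X_parts, y_parts = [], []
for label in range(n_histotypes):
    # Each hypothetical histotype gets its own mean activity profile.
    mean_profile = rng.uniform(0.0, 5.0, size=n_signatures)
    noise = rng.gamma(shape=2.0, scale=1.0, size=(n_per_class, n_signatures))
    X_parts.append(noise + mean_profile)
    y_parts.append(np.full(n_per_class, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Hold out a stratified test set so every histotype appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One model per classifier family named in the text; settings are placeholders.
classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}

# Fit each model and record its accuracy on the held-out set.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print(f"{name}: {scores[name]:.2f}")
```

Accuracy is used here only because it is the default `score` for scikit-learn classifiers; with imbalanced histotypes, a metric such as macro-averaged F1 would probably be a better choice.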
I also drew biological insights from the dataset: I quantified intra-histotype variation by hierarchical clustering and extracted latent features by principal component analysis.
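Both steps can be sketched in a few lines with SciPy and scikit-learn. Again, this is only an illustration on made-up data: the two subgroups, the Ward linkage, the cut into two clusters, and the choice of two principal components are all assumptions for the example, not the settings used in the project.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA

# Synthetic stand-in (an assumption): 40 tumours of one histotype that
# fall into two hypothetical subgroups across 6 signature activities.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(20, 6))
group_b = rng.normal(loc=4.0, scale=0.5, size=(20, 6))
X = np.vstack([group_a, group_b])

# Agglomerative (hierarchical) clustering with Ward linkage, then cut the
# tree so that at most two flat clusters remain.
Z = linkage(X, method="ward")
cluster_ids = fcluster(Z, t=2, criterion="maxclust")

# PCA projects the samples onto the two directions of greatest variance.
pca = PCA(n_components=2)
components = pca.fit_transform(X)

print("cluster sizes:", np.bincount(cluster_ids)[1:])
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The linkage matrix `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` to inspect how samples within a histotype merge, which is one way to look at intra-histotype variation.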
Here are the results: report and code.