Data Analytics - NUS Medical Sciences Cluster

Our Mission

The Data Analytics Core (DAC) supports the data analytics needs of the Medical Sciences Cluster (MSC), which comprises the Departments of Anatomy, Biochemistry, Microbiology & Immunology, Pharmacology, and Physiology.

Our Goals

Enabler of data-driven medical science research: We provide support and services spanning the design of high-throughput experiments through to the analysis of the resulting datasets for hypothesis generation and testing, with a focus on next-generation sequencing and mass-spectrometry proteomic/lipidomic data.

Facilitator to bridge medical and data sciences: We facilitate discussions to build a research community empowered to harness advances in data science to address biomedical research questions. In addition, we provide training workshops to introduce medical science researchers to data analytics and to build their skills for self-directed analysis.

Gateway to computational resources: The facility aims to build local compute and storage resources to serve as a research data repository and computing platform for small- to medium-scale analysis. For large-scale analysis, the facility is connected to the National Supercomputing Centre Singapore (NSCC) via a 100G link for data-intensive operations. For routine and self-directed analysis of DNA-seq and RNA-seq data, we provide access to a web-based analysis portal through our collaboration with the Cancer Science Institute (CSI).

At the heart of what we do is our mission to foster the MSC research community by supporting collaborative efforts in data-intensive projects.

Who We Are

The Data Analytics Core is anchored at the Department of Biochemistry, a member of the Medical Sciences Cluster at the Yong Loo Lin School of Medicine. Our multi-disciplinary team provides consultation and training in areas ranging from genomics and transcriptomics to high-performance computing and machine learning.

Research Computing

The Data Analytics Core provides services and support for the analysis of large or complex datasets with multiple biological parameters.

In collaboration with the Cancer Science Institute (CSI), MSC researchers can request access to the CSI analysis portal by emailing A/Prof Henry Yang. Once access is granted, researchers can upload NGS sequences for analysis via a web interface. Examples of analysis pipelines include:

  • DNA-seq (Genome alignment, mutation calling, mutation annotation)
  • RNA-seq (Genome alignment, gene expression, isoform expression, alternative splicing)
  • ChIP/RIP-seq (Genome alignment, peak calling, UCSC hub)
  • smallRNA-seq (Genome alignment, small-RNA expression)
  • 4C-seq (Genome alignment, interactions, report)
  • SHAPE-seq (Transcriptome alignment, reactivity calculation, structure prediction)
  • RNA-editing (Genome alignment, variant calling, candidates selection)
  • rMATS (Genome alignment, splicing analysis)
  • mmPCR (Primer design, genome search, multiplex primers)

Training & Workshops

The Data Analytics Core will be launching introductory training workshops for researchers who wish to learn how to conduct analyses themselves. In addition, we are building a repository of links and online resources that are useful for analyzing datasets (https://datacore.space).

How to Get Started

Contact Kenneth (bchbhkk@nus.edu.sg) or Maulana (bchmb@nus.edu.sg) to start a project with the Data Analytics Core facility!