Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/4745
Title: High Throughput Reproducible Literate Phylogenetic Analysis
Authors: S., Ruhila
Keywords: High Throughput Reproducible Literate
Phylogenetic Analysis
Issue Date: 2022
Publisher: IEEE
Citation: PDGC 2022 - 2022 7th International Conference on Parallel, Distributed and Grid Computing, pp. 337-340.
Abstract: We present a holistic approach, from a literate programming perspective, to framing and solving systems biology problems. In particular, given the large datasets required to answer questions about evolutionary histories, we focus on the generalization and workflow required on a typical high-performance computing (HPC) cluster driven by a SLURM or PBS TORQUE queue. We demonstrate how to leverage, in a portable manner, multiple CLI tools compiled for efficient use on heterogeneous computational resources, and we further demonstrate the use of R to generate literate, data-driven plots and analyses. HPC bottlenecks and installation barriers are also discussed, and mitigation strategies are developed. As a concrete example, we demonstrate the estimation of a phylogenetic tree, which is used to pose and answer questions about evolutionary lineages. In this manner, we elucidate a generalized approach for systems biology that covers the manipulation of phylogenetic data, including its validation, multiple sequence alignment, tree estimation under different models, and reproduction.
Description: Only IISERM authors are available in the record.
URI: https://doi.org/10.1109/PDGC56933.2022.10053210
http://hdl.handle.net/123456789/4745
Appears in Collections: Research Articles

Files in This Item:
File: Need To Add…Full Text_PDF.
Size: 15.36 kB
Format: Unknown

