High Throughput Reproducible Literate Phylogenetic Analysis
| dc.contributor.author | S., Ruhila | |
| dc.date.accessioned | 2023-08-16T17:56:07Z | |
| dc.date.available | 2023-08-16T17:56:07Z | |
| dc.date.issued | 2022 | |
| dc.description | Only IISERM authors are available in the record. | en_US |
| dc.description.abstract | We present a holistic approach, from a literate programming perspective, to framing and solving systems biology problems. In particular, given the large datasets required to answer questions about evolutionary histories, we focus on the generalization and workflow required on a typical SLURM or PBS TORQUE queue-driven high performance computing (HPC) cluster. We demonstrate how to leverage multiple CLI tools, compiled for efficient and portable use on heterogeneous computational resources, and further demonstrate the use of R to generate literate, data-driven plots and analyses. HPC cluster bottlenecks and installation barriers are also discussed, and mitigation strategies are developed. As a concrete example, we demonstrate the estimation of a phylogenetic tree, which is used to pose and answer questions on evolutionary lineages. In this manner, we elucidate a generalized approach, applicable to systems biology, for manipulating phylogenetic data, including its validation, multiple sequence alignment, tree estimation through different models, and reproduction. | en_US |
| dc.identifier.citation | PDGC 2022 - 2022 7th International Conference on Parallel Distributed and Grid Computing, 337-340. | en_US |
| dc.identifier.uri | https://doi.org/10.1109/PDGC56933.2022.10053210 | |
| dc.identifier.uri | http://hdl.handle.net/123456789/4745 | |
| dc.language.iso | en_US | en_US |
| dc.publisher | IEEE | en_US |
| dc.subject | High Throughput Reproducible Literate | en_US |
| dc.subject | Phylogenetic Analysis | en_US |
| dc.title | High Throughput Reproducible Literate Phylogenetic Analysis | en_US |
| dc.type | Article | en_US |