Śaunak Sen | Research Group

matrix linear models for connecting metabolite composition to individual characteristics: This paper shows how we can enhance metabolomics data analysis using matrix linear models. The framework allows us to connect metabolite characteristics (such as type of lipid or number of double bonds) directly to individual characteristics (such as sex or an intervention).
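
As a rough illustration of the idea, the sketch below assumes the bilinear form Y = X B Z' + E, where rows of Y are individuals, columns of Y are metabolites, X holds individual characteristics, and Z holds metabolite characteristics. It is written in Python with NumPy for brevity and does not show the MatrixLM package interface. The function name, the toy covariates, and the coefficient values are all hypothetical.

```python
import numpy as np

def fit_matrix_linear_model(Y, X, Z):
    """Least-squares estimate of B in Y = X B Z' + E (assumed model form).

    Closed form: B_hat = (X'X)^{-1} X' Y Z (Z'Z)^{-1}.
    """
    # Left step: regress each metabolite (column of Y) on individual covariates.
    left = np.linalg.solve(X.T @ X, X.T @ Y)              # shape (p, m)
    # Right step: regress the resulting rows on metabolite covariates.
    B_hat = np.linalg.solve(Z.T @ Z, Z.T @ left.T).T      # shape (p, q)
    return B_hat

# Toy data (hypothetical): 20 individuals, 30 metabolites.
n, m = 20, 30
sex = np.arange(n) % 2                 # individual characteristic
double_bonds = np.arange(m) % 6        # metabolite characteristic
X = np.column_stack([np.ones(n), sex])            # intercept + sex
Z = np.column_stack([np.ones(m), double_bonds])   # intercept + double bonds

# B[1, 1] is the sex-by-double-bond interaction in this toy setup.
B_true = np.array([[1.0, 0.5],
                   [-0.3, 0.2]])
Y = X @ B_true @ Z.T                   # noise-free, so B is recovered exactly

B_hat = fit_matrix_linear_model(Y, X, Z)
```

In this toy setup, each entry of B links one individual characteristic to one metabolite characteristic, which is what lets a single model describe how, for example, the effect of sex varies with the number of double bonds.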

We are working on methods and tools for extracting information from large matrix-valued data (think very large spreadsheets) that are common in modern biology. Examples include technologies such as transcriptomics, metabolomics, and high-throughput genetic screens. We are using a number of complementary approaches including bilinear models, penalized regression, multivariate kernel regression, matrix factorization, and gradient-based optimization techniques.

We have developed a number of software packages in Julia, a promising new programming language for scientific computing and data science: MatrixLM/MatrixLMnet (penalized matrix linear models), FlxQTL (multivariate linear mixed models for genetic mapping), BulkLMM/LiteQTL (real-time eQTL mapping), and GeneNetworkAPI/MetabolomicsWorkbenchAPI (interfaces to the GeneNetwork and Metabolomics Workbench databases and computational tools).

joining the research group

If you are interested in joining the research group, please reach out to me by email.

Internships: If you are an undergraduate or graduate student (master's, doctoral or medical) you may work with us as an intern. Most interns work over the summer, and it is best to apply directly to the Biomedical Data Science Internship Program for consideration.
Graduate students (UTHSC/CGHS): I am part of the UTHSC College of Graduate Health Sciences. The most appropriate programs are Epidemiology, Health Outcomes and Policy Research, and Biomedical Sciences.
Graduate students (UTK/Bredesen Center): I am also a credentialed faculty member of the Bredesen Center for Interdisciplinary Research and Graduate Education. The most appropriate programs are Data Science and Engineering (DSE) and Genome Science and Technology (GST).
Postdoctoral scholars: Generally speaking, we are interested in scholars with strong statistical intuition, data analysis skills, and computer programming experience.

Regardless of the …

real-time genome scans for multiple quantitative traits using linear mixed models: We have developed algorithms and a Julia implementation for performing genome scans of a large number of quantitative traits using linear mixed models. The approach is suitable for genome scans of whole-transcriptome data (eQTL scanning) or other high-throughput traits. For the BXD mouse data, such computations take a few seconds.

speeding up eQTL scans in the BXD population using GPUs: To facilitate interactive use of genotype-phenotype relationships in the BXD population, we sought to speed up eQTL scans. We reduced runtimes to the point that they approach real-time computation.

flexible multivariate linear mixed models for structured multiple traits: Flexible modeling of structured traits using multivariate linear mixed models

sparse matrix linear models for structured high-throughput data: We have developed a fast algorithm for fitting L1-penalized multivariate linear models for high-throughput data.

matrix linear models for high-throughput genetic screens: A flexible and computationally efficient approach for analyzing high-throughput chemical genetic screens.

contact

Mailing address:
Department of Preventive Medicine
University of Tennessee Health Science Center
645 Doctors Office Building
66 North Pauline Street, Memphis, TN 38163-2181

Phone: +1.901.448.4590
Fax: +1.901.448.7041
Email: sen@uthsc.edu
Twitter: @saunaksen
ORCID: 0000-0003-4519-6361