Data Analysis

Data science and AI have become critically important to big pharma and play a growing role in drug discovery and development. Pharma companies are looking for ways to cut costs and are developing innovative approaches to drug discovery in order to stay relevant and sustain their impact.

Data science has the potential to make an impact across all operational pipelines. The underlying data comes in different types: structured data, such as SQL databases (tables with rows and columns); unstructured data, such as documents and image files (satellite photos, for example); and streaming data from sensors. Graph theory and graph learning are increasingly used in data science and deep learning, and drug discovery is one of their most promising applications.

Today, a public cloud provider can store petabytes of data and scale out to thousands of servers for as long as a big data project requires. This capacity is available at a reasonable price to any organization in the world. In drug R&D, big data analysis is clearly delivering more hope than hype.

Genotype to Phenotype Analysis and Gene Expression Regulation


PrediXcan uses statistical machine learning approaches such as Random Forest or OmicKriging to train models on the genotype and gene expression data in the GTEx Project, GEUVADIS, and DGN databases. The trained models are then used to estimate the unmeasured gene expression levels for newly imported genotype data. The purpose of PrediXcan is to establish the relationship between genetically regulated gene expression and traits.
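
The three steps described above (train on a reference panel, estimate expression from genotype alone, test the estimate against a trait) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PrediXcan implementation: ridge regression is swapped in for the models named above because it fits in pure Python, the data are simulated, and every function name, effect size, and sample size here is hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ridge(X, y, lam=1.0):
    """Ridge regression of expression on SNP dosages (stand-in model, an assumption)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations on centered data: (X^T X + lam * I) w = X^T y
    A = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    w = solve(A, rhs)
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept

def predict_expression(X, w, intercept):
    """Estimate expression from genotype alone using the trained weights."""
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]

def pearson(a, b):
    """Pearson correlation, used here as a simple association measure."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def simulate(n, rng):
    """Simulated dosages (0/1/2) for 3 SNPs; expression depends on the first two."""
    X = [[rng.choice([0, 1, 2]) for _ in range(3)] for _ in range(n)]
    expr = [0.8 * r[0] - 0.5 * r[1] + rng.gauss(0, 0.3) for r in X]
    return X, expr

rng = random.Random(0)
ref_X, ref_expr = simulate(200, rng)        # reference panel: genotype + measured expression
w, b0 = fit_ridge(ref_X, ref_expr)          # step 1: train the expression model
study_X, hidden_expr = simulate(300, rng)   # study cohort: expression is not measured
trait = [1.5 * e + rng.gauss(0, 1.0) for e in hidden_expr]
pred = predict_expression(study_X, w, b0)   # step 2: estimate expression from genotype
print("predicted expression vs. trait: r =", round(pearson(pred, trait), 3))  # step 3
```

In a real analysis the weights would be trained per gene on a reference transcriptome resource, and the final step would typically be a regression with covariates instead of a plain correlation.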


The Bayesian Sparse Linear Mixed Model (BSLMM) is a hybrid of the linear mixed model (LMM) and the sparse regression model. Both LMMs and sparse regression models are widely used in genetic applications, including multi-gene (polygenic) modeling in genome-wide association studies. BSLMM assumes that genetic effects on the phenotype follow a mixture of two normal distributions, with the mixing proportion determining how much each component contributes.
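
One common way to write this mixture assumption is sketched below. The symbols are chosen here for illustration and the parameterization is simplified: y is the phenotype vector for n individuals, X the genotype matrix with p variants, \beta_i the effect of variant i, \pi the mixing proportion, and \sigma_a^2, \sigma_b^2, \sigma_e^2 variance parameters.

```latex
\begin{aligned}
y &= \mu \mathbf{1}_n + X\beta + \varepsilon,
  \qquad \varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_e^2 I_n\right), \\
\beta_i &\sim \pi\, \mathcal{N}\!\left(0,\ \sigma_a^2 + \sigma_b^2\right)
  + (1 - \pi)\, \mathcal{N}\!\left(0,\ \sigma_b^2\right),
  \qquad i = 1, \dots, p.
\end{aligned}
```

Under this form, setting \pi = 0 (or \sigma_a^2 = 0) draws every effect from a single normal distribution, which is the LMM assumption. Setting \sigma_b^2 = 0 leaves only a fraction \pi of the effects nonzero, which is the sparse regression assumption.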


Online Inquiry

Please submit a detailed description of your project. We will provide a customized project plan to meet your research needs. You can also email us directly with inquiries.


  • 2200 Smithtown Avenue, Room 1, Ronkonkoma, NY 11779-7329, USA
  • Phone: 1-516-666-0889
  • Fax: 1-516-927-0118


Copyright © 2005-2020 [ Protheragen ] All Rights Reserved.