Data Analytics for cell and gene therapy by Dr. Ruchi Dass, @drruchibhatt

Cell and gene therapies are gaining ground on the back of encouraging clinical results worldwide, and major pharmaceutical companies have invested in commercializing them. Recent examples include Takeda’s license to commercialize Alofisel (developed by TiGenix), Celgene’s acquisition of Juno Therapeutics and Gilead’s acquisition of Kite Pharma.

As the sector grows, a larger number of increasingly complex therapies is expected to reach the market, and the number of patients treated will rise with it. This will put further pressure on manufacturing and R&D and drive wider adoption of Quality by Design (QbD) principles. Data volumes will increase significantly, and that is where big data analytics comes in.

Data is important: it guides manufacturing operations, enables the monitoring and control needed to ensure quality, and supports efficiency and regulatory compliance. Listed below are possible use cases for predictive models in the pharma and life sciences industry.

Predictive Models for manufacturing outcomes– Predictive analytics modules can be used for process improvement analysis to improve scheduling and throughput. Take the example of a biotech company that runs the only final-stage bio-manufacturing facility in the world and is struggling to meet rapidly increasing customer demand. Repeated unanticipated production delays and starvation at critical parts of the operation are causing not only late and missed deliveries but also the expiration of batches of product, at a cost of approximately $1 million per batch. Predictive analytics can be deployed in such a facility to uncover the root cause(s) of the unanticipated delays behind the late and missed deliveries; to anticipate such bottlenecks in advance; to run “what if” scenarios (for example, whether additional capacity from a new facility would be required, or what would change if the facility could produce two more lots per month); and to support long-term expansion planning by predicting when and where more line capacity will be required.
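
The “what if” capacity question above can be sketched as a small calculation. This is only an illustration under stated assumptions, not the facility’s actual model: the demand forecast, the baseline capacity of 10 lots per month and the function name `first_shortfall_month` are hypothetical. Only the “two more lots per month” scenario comes from the text.

```python
def first_shortfall_month(demand_forecast, lots_per_month):
    """Return the 1-based month in which cumulative demand first exceeds
    cumulative production, or None if capacity keeps up throughout.

    Simplifying assumption: surplus lots can be held in stock, which
    ignores the batch expiry problem described in the text.
    """
    produced = 0
    demanded = 0
    for month, demand in enumerate(demand_forecast, start=1):
        produced += lots_per_month
        demanded += demand
        if demanded > produced:
            return month
    return None


# Hypothetical forecast of lots demanded per month (illustrative numbers only).
demand_forecast = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

# Scenario 1: assumed current capacity. Scenario 2: two more lots per month.
baseline = first_shortfall_month(demand_forecast, lots_per_month=10)
expanded = first_shortfall_month(demand_forecast, lots_per_month=12)

print("Baseline shortfall month:", baseline)
print("Shortfall month with two extra lots:", expanded)
```

With these made-up numbers, the extra two lots per month delay the first shortfall rather than remove it, which is the kind of result that would feed into a decision about a new facility.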

Predictive Models for Risk Monitoring– Traditional monitoring typically allocates resources equally among study sites, regardless of clinical data or the risk to patients. Routine visits to all clinical sites with 100% source data verification (SDV: comparing all data points on every case report form to all subjects’ medical records) are common, and they are the largest cost driver in clinical trial budgets. Predictive analytics allows sponsors to assess investigator risk and allocate monitoring resources where they are needed most. Real-time predictive modeling in the form of risk-based monitoring enables a study sponsor to adjust the level of monitoring as risk changes at individual sites.
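
As a rough sketch of the idea, monitoring visits can be split across sites in proportion to a risk score instead of equally. Everything here is an assumption made for illustration: the indicator names, the weights, the site values and the functions `risk_score` and `allocate_visits` are hypothetical, and a real risk-based monitoring plan would define its own indicators and thresholds.

```python
def risk_score(indicators, weights):
    """Weighted sum of a site's risk indicators (each scaled to 0..1)."""
    return sum(weights[name] * indicators[name] for name in weights)


def allocate_visits(sites, weights, total_visits):
    """Split a monitoring budget across sites in proportion to risk.

    Each share is rounded, so the parts may not add up exactly
    to total_visits.
    """
    scores = {site: risk_score(ind, weights) for site, ind in sites.items()}
    total = sum(scores.values())
    return {site: round(total_visits * s / total) for site, s in scores.items()}


# Hypothetical weights and site indicators (illustrative values only).
weights = {"query_rate": 0.5, "deviation_rate": 0.3, "late_entry_rate": 0.2}
sites = {
    "site_A": {"query_rate": 0.1, "deviation_rate": 0.0, "late_entry_rate": 0.25},
    "site_B": {"query_rate": 0.6, "deviation_rate": 0.2, "late_entry_rate": 0.2},
    "site_C": {"query_rate": 0.8, "deviation_rate": 0.2, "late_entry_rate": 0.2},
}

visits = allocate_visits(sites, weights, total_visits=20)
print(visits)
```

Re-running the allocation as new site data arrives is one simple way to adjust the level of monitoring as risk changes.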

Predictive Models for Financial Modelling and Cost-effectiveness– If you have a rich data asset, predictive analytics lets you benchmark costs and visualize data in that context, helping sponsors forecast, budget and negotiate the cost of outsourcing clinical trials.

Predictive Models for Performance and Operational Analytics– For a mature CRO, determining the productivity, utilization, and profitability of clinical research initiatives can help the CRO allocate the most effective team members to each project. For each client that contracts a study, the CRO can track trends such as invoice-to-cash cycle times and the client’s propensity to request amendments to the research deliverables. It can also check that the scheduling of clinical researchers and clinical laboratory space is realistic, given the nature and volume of projects. The same data offers a view of trial site performance, which helps in selecting the best investigators, eliminating non-performing sites and reducing enrolment timelines. With such insights, planning and forecasting clinical enrolment performance, rescuing off-track trials and optimizing contingency plans for those trials become easier.

Predictive Models for Patient/Subject discovery– Today a researcher might, for example, use data mining to find clusters of disease subtypes, in the hope of identifying a subtype to target or of enabling a more precise treatment course. Attribute-importance algorithms help researchers select, for instance, the subset of genes most useful in discriminating between types of cancer. Researchers can also use predictive analytics to find factors associated with a disease or to predict which patients might respond best to an experimental treatment.
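
To make the attribute-importance idea concrete, the sketch below ranks genes by how well their expression separates two sample classes. It uses a simple signal-to-noise score as a stand-in for whichever attribute-importance algorithm a research team actually applies. The gene names, class labels and expression values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(values_a, values_b):
    """Absolute difference in class means divided by the sum of the
    class standard deviations. Assumes each class has at least two
    samples and a non-zero spread."""
    spread = stdev(values_a) + stdev(values_b)
    return abs(mean(values_a) - mean(values_b)) / spread


def rank_genes(expression, labels, class_a, class_b):
    """Return gene names ordered from most to least discriminating."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression levels for six samples, three per cancer type.
labels = ["type_1", "type_1", "type_1", "type_2", "type_2", "type_2"]
expression = {
    "gene_1": [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # clearly separates the types
    "gene_2": [3.0, 2.0, 4.0, 3.1, 2.1, 3.9],  # nearly identical in both
    "gene_3": [2.0, 3.0, 4.0, 5.0, 6.0, 7.0],  # separates them, but noisily
}

ranking = rank_genes(expression, labels, "type_1", "type_2")
print(ranking)
```

Keeping only the top-ranked genes gives the kind of reduced subset the paragraph above describes.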

Carefully conducted clinical trials are performed in human volunteers to provide answers to questions such as:

  1. Does this treatment work?
  2. Does it work better than other treatments?
  3. Does it have side effects?

Clinical trials also provide important information on the cost-effectiveness of a treatment, the clinical value of a diagnostic test and how a treatment improves quality of life. Ever wondered how patients are selected for clinical trials without these advanced algorithms? Traditionally, physicians have selected trials by manually analyzing patients’ data. Reviews of the resulting selections have shown that physicians usually do not check all clinical trials and occasionally miss an appropriate one.

Until now, only a handful of web-based systems have addressed this problem, and only to an extent. The industry has developed near-expert systems that help select trials for each patient. Such a system prompts a clinician to enter the results of medical tests and uses them to identify appropriate trials. If the available records do not provide enough data, the system suggests additional tests. This remains a cumbersome way to find eligible patients and does little to reduce the related costs.
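
The behaviour described above (check the entered test results against each trial, and suggest further tests when the record is incomplete) can be sketched with simple range rules. This is an illustration only, not any particular vendor’s system: the trial names, the criteria, the patient record and the function `evaluate_trial` are hypothetical.

```python
def evaluate_trial(criteria, record):
    """Check one patient record against one trial's range criteria.

    Returns a (status, missing_tests) pair where status is
    "ineligible", "eligible" or "needs_tests".
    """
    missing = []
    for test, (low, high) in criteria.items():
        if test not in record:
            missing.append(test)          # no result yet: suggest this test
        elif not (low <= record[test] <= high):
            return "ineligible", []       # a known value violates a criterion
    if missing:
        return "needs_tests", missing
    return "eligible", []


# Hypothetical trials, each mapping a test name to an allowed (min, max) range.
trials = {
    "trial_A": {"age": (18, 65), "ecog_status": (0, 1)},
    "trial_B": {"age": (18, 75), "hemoglobin_g_dl": (10.0, 20.0)},
    "trial_C": {"age": (40, 80)},
}

# Hypothetical patient record entered by a clinician.
record = {"age": 34, "ecog_status": 1}

results = {name: evaluate_trial(criteria, record) for name, criteria in trials.items()}
for name, (status, missing) in results.items():
    print(name, status, missing)
```

Because every rule has to be written and every result entered by hand, a system like this shares the limitation noted above: it is thorough but labour-intensive.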

With predictive analytics models, patients can be identified for enrolment in clinical trials from more sources than doctors’ visits alone, for example social media. Furthermore, the criteria for including patients in a trial could take significantly more factors (for instance, genetic information) into account to target specific populations, thereby enabling trials that are smaller, shorter, less expensive, and more powerful.




Dr. Ruchi Dass | Digital Health Influencer & Health Innovator (HIT, Big Data, IoT, Analytics and Cloud) | TED speaker | Investor and Mentor
