From cells to drug responses - machine learning in cancer research - Julian d...PyData
PyData Amsterdam 2018
Machine learning plays an important role in cancer research. In this talk, we’ll tackle the challenge of predicting which patients are likely to respond to given anti-cancer treatments. In doing so, we’ll show how tools such as Snakemake/Bioconda can be used to create reproducible workflows and illustrate the challenges of interpreting predictive models in large, highly-correlated feature spaces.
tranSMART Community Meeting 5-7 Nov 13 - Session 3: transmart’s application t...David Peyruc
tranSMART Community Meeting 5-7 Nov 13 - Session 3: tranSMART’s Application to Clinical Biomarker Discovery Studies in Sanofi
Sherry Cao, Sanofi
This presentation will discuss the challenges we are encountering in clinical biomarker discovery studies and how we are using tranSMART to help address them.
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...dkNET
Abstract
Omics techniques (e.g., transcriptomics, genomics, and epigenomics) report quantitative measures of more than tens of thousands of biological features and provide a more comprehensive molecular perspective of studied diabetes mechanisms compared to traditional approaches. Identifying representative molecular signatures from the tremendous number of biological features becomes a central problem in utilizing the data for clinical decision-making. Exploring the complex causal relations of the identified representative molecular signatures and diabetes phenotypes can be the most effective and efficient ways to improve the understanding of diabetes and assess the cause of diabetes for the new patients with already collected data influencing (e.g., TEDDY project). However, due to the unavoidable patient heterogeneity, statistical randomness, and experimental noise in the high-dimension, low-sample-size omics data of the diabetic patients, utilizing the available data for clinical decision-making remains an ongoing challenge for many researchers. To overcome the limitations, in this study we developed (1) a generative adversarial network (GAN)-based model to generate synthetic omics data for the samples with few omics profiles available; (2) a deep learning-based fusion network model for phenotype prediction of type-1 diabetes; (3) a long short-term memory (LSTM)-based model for predicting outcomes of islet autoantibody and persistent positivity. The models are tested on the multi-omics data in the TEDDY project.
Presenter: Wei Zhang, Ph.D. Assistant Professor, Department of Computer Science & Genomics and Bioinformatics Cluster, University of Central Florida
Upcoming webinars schedule: https://dknet.org/about/webinar
In this presentation important aspects of target selection and internal standardization in protein LC-MS are discussed. In addition there are 3 slides about coronavirus protein LC-MS considerations.
Golden Helix’s SNP & Variation Suite (SVS) has been used by researchers around the world to do trait analysis and association testing on large cohorts of samples in both humans and other species. As Next-Generation Sequencing of whole genomes becomes more affordable, large cohorts of Whole Genome Sequencing (WGS) samples are available to search for additional trait association signals that were not found in array-based testing. In fact, recent papers have shown that WGS analysis using advanced GREML (Genomic Relatedness Restricted Maximum Likelihood) techniques is able to outperform micro-array based GWAS methods in the analysis of complex traits and proportion of the trait heritability explained.
Our latest update release of SVS has expanded the existing maximum likelihood and GRM methods to support these new techniques. We have also enhanced various other association testing and prediction methodologies. This webcast showcases:
- Newly supported analysis workflow for whole genome variants using LD binning and enhanced GBLUP analysis
- Enhanced gender correction using REML
- Additional capabilities for genomic prediction and phenotype prediction
We are continually improving our products based on our customers’ feedback. We hope you enjoy this recording highlighting the exciting new features and select enhancements we have made.
The field of genomics is experiencing rapid transformation due to the development of novel experimental techniques and the application of artificial intelligence (AI) methodologies. In this presentation, we will explore the various ways in which machine learning can be applied to genomics datasets, as well as the challenges that must be addressed to effectively implement these techniques.
"Microbial Genomics @NIST" presentation at the Standards for Pathogen Identification via NGS (SPIN) workshop hosted by the National Institute for Standards and Technology October 2014 by Nathan Olson from NIST.
Targeted RNAseq for Gene Expression Using Unique Molecular Indexes (UMIs): In...QIAGEN
Traditional RNA sequencing (RNA-Seq) is a powerful tool for expression profiling, but is hindered by PCR amplification bias and inaccuracy at lowly expressed genes. QIAseq RNA is a flexible and precise tool developed for mitigating these complications, allowing digital gene expression analysis. This in-depth webinar will cover sample requirements, experimental design, NGS platform-specific challenges and workflow for gene enrichment, library prep and sequencing. The applications of QIAseq RNA Panels in cancer research, stem cell differentiation and elucidating the effects of small molecules on signaling pathways will be highlighted.
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...QIAGEN
Liquid biopsies enable us to monitor the evolution of genetic aberrations in primary tumors as they shed the tumor cells into the circulation. The limitation is the ability to detect these low frequency genetic aberrations in a consistent manner to understand short- and long-term implications and how this information will be used in the clinic. This slidedeck will cover the challenges and solutions associated with multiple steps as one starts with liquid biopsy and moves towards finding a new biomarker.
Analysis of Single-Cell Sequencing Data by CLC/Ingenuity: Single Cell Analysi...QIAGEN
Single-cell analysis is useful to study genetic heterogeneity between individual cells and can help in result interpretation by looking at the average behavior of a large number of cells. Applications include circulating tumor cells, cells from small biopsies and cells from in vitro fertilized embryos. In this slidedeck, we show how single cell next-generation sequencing data can be analyzed and what challenges need to be overcome. One of the examples we use is single cell data from two colorectal cancer cell lines.
Similar to [DigiHealth 22] Budget friendly sample sizes for genomics research - Ognjen Milicevic (20)
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdfDataScienceConferenc1
In this talk, I'll journey from my time as a Research Assistant at the Bernoulli Institute, delving into the classification of neurodegenerative diseases, to my encounters with groundbreaking biotechnology and AI companies like Proteinea, AlProtein, Rology, and Natrify in Egypt. These innovative ventures are reshaping industries from their Egyptian hub. Join me as I illuminate the transformative power of this thriving ecosystem, showcasing Egypt's remarkable strides in biotech and AI on the global stage.
Building a big-scale data product doesn't rely only on sophisticated modeling. It also requires an agile methodology, an iterative research & development process, a versatile big data stack, and a value-oriented mindset. I'll discuss how we at Dsquares build big-scale AI products that leverage clients' data from different industries to deliver business-critical value to the end customer. I'll cover the process of product discovery, R&D tasks for unsolved problems, and mapping business requirements into big data technical requirements.
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptxDataScienceConferenc1
Innovation thrives at the intersection of data and creativity. While brainstorming has traditionally fueled the generation of new ideas, leveraging data alongside creative techniques empowers organizations to develop more effective and impactful innovations
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...DataScienceConferenc1
In today's fast-paced and competitive business environment, harnessing the power of data is essential for staying ahead. Building a data-driven culture within an organization is not just a strategic advantage, but a necessity for those who wish to thrive and innovate. In this insightful talk, our esteemed speaker, a Chief Data Scientist with a decade of experience in the financial services sector, will unravel the complexities of embedding data into the DNA of your organization. The speaker will explore the key tenets of establishing a data-centric mindset, the importance of executive support, and the need for enhancing data literacy across the company. Practical solutions and real-world examples will be provided, demonstrating how to overcome obstacles and successfully integrate a data-driven approach. Attendees will learn strategies for empowering every team member to use data effectively and how to leverage technology to facilitate this cultural shift. The session promises to be a guide for those looking to champion data within their organizations, offering actionable insights for transformation.
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdfDataScienceConferenc1
The use of Artificial Intelligence (AI) is rapidly transforming the recruitment landscape. This talk explores the various ways AI is being used in hiring, from candidate sourcing and screening to skills assessments and interview preparation. We'll discuss the benefits of AI, such as increased efficiency and reduced bias, but also address potential drawbacks like ethical considerations and the human touch.
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...DataScienceConferenc1
In today's business landscape, data strategy plays a pivotal role in driving innovation within business models. This talk explores how organizations can leverage data effectively to transform their operations, products, and services.
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...DataScienceConferenc1
Delve into the unexplored potential of scene graphs in the realms of Generative AI and innovative data product development. This session unveils the intricate role of scene graphs in generating realistic content and driving advancements in computer vision, and automated content creation. Join us for a journey into the intersection of scene graphs and cutting-edge AI, gaining insights into their pivotal role in reshaping the landscape of data-centric innovation. This talk is your gateway to understanding how structured visual representations are shaping the future of AI and revolutionizing the creation of data-driven solutions.
This presentation will delve into the transformative role of Artificial Intelligence in reshaping social media landscapes. We'll explore cutting-edge AI technologies that are integrating with social media platforms, altering how we interact, consume content, and perceive digital communities. The talk will also cast a visionary eye towards future trends, discussing potential impacts on user experience, content creation, digital marketing, and privacy concerns. Join us to uncover how AI is not just a tool but a game-changer in the evolving narrative of social media.
Supercharge your software development with Azure OpenAI Service! Azure cloud platform provides access to cutting-edge AI models for diverse tasks. Explore different models for generating content, translating languages, and even generating code. Leverage data grounding to fine-tune models for your specific needs. Discover how Azure OpenAI Service accelerates innovation and injects intelligence into your software creations.
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...DataScienceConferenc1
In this insightful talk, we'll embark on a journey from the origins of programming in 1883 and the conceptualization of AI in the 1950s, to the current explosion of AI applications reshaping our world. We'll unravel why AI has surged to prominence in the last decade, driven by unprecedented data generation and significant hardware advancements. With examples ranging from individual email filtering to complex supply chain optimizations, we'll explore AI's pervasive impact across various sectors including finance, manufacturing, healthcare, and media. The talk will address the challenges of AI implementation, such as the high cost of AI teams and the quest for universally applicable models, while highlighting the promising horizon of no-code AI platforms democratizing access. Furthermore, we'll delve into the ethical dimensions of AI, from biases to privacy concerns, and the pressing question of AI's potential to replace human roles. Lastly, we'll discuss the transformative potential of language models and generative AI, underscoring the importance of understanding and integrating AI into our lives and businesses for a future that's both scalable and sustainable.
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...DataScienceConferenc1
Transitioning to a career in data science requires careful planning and smart choices. In this session, I'll help you understand how to switch to data science. Using my own experiences and what I've learned from the industry, we'll break down the important steps for a successful transition. We'll cover everything from figuring out which skills you can carry over to learning the technical stuff and connecting with other professionals. By the end, you'll have the knowledge and tools you need to start your journey into data science, whether you're a seasoned professional looking for something new or just starting out in the field.
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...DataScienceConferenc1
With the continuous growth of the digital environment, the risks in the online realm also increase. This calls for strong security measures to safeguard valuable information and essential systems. Artificial Intelligence (AI) has become a powerful weapon in the fight against cyber threats. This talk presents a thorough examination of the most recent algorithms and applications of artificial intelligence in the field of cybersecurity.
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptxDataScienceConferenc1
What is Generative AI and how does it work? Could it eventually replace us? Let's delve deep into the heart of this groundbreaking technology and uncover the truths and myths surrounding Generative AI and how to make the most of it.
Background: The digital twin paradigm holds great promise for healthcare, most importantly efficiently integrating many disparate healthcare data sources and servicing complex tasks like personalizing care, predicting health outcomes, and planning patient care, even though many technical and scientific challenges remain to be overcome. Objective: As part of the QUALITOP project, we conducted a comprehensive analysis of diverse healthcare data, encompassing both prospective and retrospective datasets, along with an in-depth examination of the advanced analytical needs of medical institutions across five European Union countries. Through these endeavors, we have systematically developed and refined a formal Personal Medical Digital Twin (PMDT) model subjected to iterative validation by medical institutions to ensure its applicability, efficacy, and utility. Findings: The PMDT is based on an interconnected set of expressive knowledge structures that are calibrated to capture an individual patient’s psychosomatic, cognitive, biometrical and genetic information in one personal digital footprint in a manner that allows medical professionals to run various models to predict an individual’s health issues over time and intervene early with personalized preventive care. Conclusion: At the forefront of digital transformation, the PMDT emerges as a pivotal entity, positioned at the convergence of Big Data and Artificial Intelligence. This paper introduces a PMDT environment that lays the foundation for the application of comprehensive big data analytics, continuous monitoring, cognitive simulations, and AI techniques. By integrating stakeholders across the care continuum, including patients, this system enables the derivation of insights and facilitates informed decision-making for personalized preventive care.
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdfSachin Sharma
Pediatric nurses play a vital role in the health and well-being of children. Their responsibilities are wide-ranging, and their objectives can be categorized into several key areas:
1. Direct Patient Care:
Objective: Provide comprehensive and compassionate care to infants, children, and adolescents in various healthcare settings (hospitals, clinics, etc.).
This includes tasks like:
Monitoring vital signs and physical condition.
Administering medications and treatments.
Performing procedures as directed by doctors.
Assisting with daily living activities (bathing, feeding).
Providing emotional support and pain management.
2. Health Promotion and Education:
Objective: Promote healthy behaviors and educate children, families, and communities about preventive healthcare.
This includes tasks like:
Administering vaccinations.
Providing education on nutrition, hygiene, and development.
Offering breastfeeding and childbirth support.
Counseling families on safety and injury prevention.
3. Collaboration and Advocacy:
Objective: Collaborate effectively with doctors, social workers, therapists, and other healthcare professionals to ensure coordinated care for children.
Objective: Advocate for the rights and best interests of their patients, especially when children cannot speak for themselves.
This includes tasks like:
Communicating effectively with healthcare teams.
Identifying and addressing potential risks to child welfare.
Educating families about their child's condition and treatment options.
4. Professional Development and Research:
Objective: Stay up-to-date on the latest advancements in pediatric healthcare through continuing education and research.
Objective: Contribute to improving the quality of care for children by participating in research initiatives.
This includes tasks like:
Attending workshops and conferences on pediatric nursing.
Participating in clinical trials related to child health.
Implementing evidence-based practices into their daily routines.
By fulfilling these objectives, pediatric nurses play a crucial role in ensuring the optimal health and well-being of children throughout all stages of their development.
CHAPTER 1 SEMESTER V PREVENTIVE-PEDIATRICS.pdfSachin Sharma
This content provides an overview of preventive pediatrics. It defines preventive pediatrics as preventing disease and promoting children's physical, mental, and social well-being to achieve positive health. It discusses antenatal, postnatal, and social preventive pediatrics. It also covers various child health programs like immunization, breastfeeding, ICDS, and the roles of organizations like WHO, UNICEF, and nurses in preventive pediatrics.
For those battling kidney disease and exploring treatment options, understanding when to consider a kidney transplant is crucial. This guide aims to provide valuable insights into the circumstances under which a kidney transplant at the renowned Hiranandani Hospital may be the most appropriate course of action. By addressing the key indicators and factors involved, we hope to empower patients and their families to make informed decisions about their kidney care journey.
PET CT beginners Guide covers some of the underrepresented topics in PET CTMiadAlsulami
This lecture briefly covers some of the underrepresented topics in molecular imaging with cases, such as:
- Primary pleural tumors and pleural metastases.
- Distinguishing between MPM and Talc Pleurodesis.
- Urological tumors.
- The role of FDG PET in NET.
ICH Guidelines for Pharmacovigilance.pdfNEHA GUPTA
The "ICH Guidelines for Pharmacovigilance" PDF provides a comprehensive overview of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) guidelines related to pharmacovigilance. These guidelines aim to ensure that drugs are safe and effective for patients by monitoring and assessing adverse effects, ensuring proper reporting systems, and improving risk management practices. The document is essential for professionals in the pharmaceutical industry, regulatory authorities, and healthcare providers, offering detailed procedures and standards for pharmacovigilance activities to enhance drug safety and protect public health.
The dimensions of healthcare quality refer to various attributes or aspects that define the standard of healthcare services. These dimensions are used to evaluate, measure, and improve the quality of care provided to patients. A comprehensive understanding of these dimensions ensures that healthcare systems can address various aspects of patient care effectively and holistically. Dimensions of Healthcare Quality and Performance of care include the following; Appropriateness, Availability, Competence, Continuity, Effectiveness, Efficiency, Efficacy, Prevention, Respect and Care, Safety as well as Timeliness.
How many patients does case series should have In comparison to case reports.pdfpubrica101
Pubrica’s team of researchers and writers create scientific and medical research articles, which may be important resources for authors and practitioners. Pubrica medical writers assist you in creating and revising the introduction by alerting the reader to gaps in the chosen study subject. Our professionals understand the order in which the hypothesis topic is followed by the broad subject, the issue, and the backdrop.
https://pubrica.com/academy/case-study-or-series/how-many-patients-does-case-series-should-have-in-comparison-to-case-reports/
The Importance of Community Nursing Care.pdfAD Healthcare
NDIS and Community 24/7 Nursing Care is a specific type of support that may be provided under the NDIS for individuals with complex medical needs who require ongoing nursing care in a community setting, such as their home or a supported accommodation facility.
Empowering ACOs: Leveraging Quality Management Tools for MIPS and BeyondHealth Catalyst
Join us as we delve into the crucial realm of quality reporting for MSSP (Medicare Shared Savings Program) Accountable Care Organizations (ACOs).
In this session, we will explore how a robust quality management solution can empower your organization to meet regulatory requirements and improve processes for MIPS reporting and internal quality programs. Learn how our MeasureAble application enables compliance and fosters continuous improvement.
India Clinical Trials Market: Industry Size and Growth Trends [2030] Analyzed...Kumar Satyam
According to TechSci Research report, "India Clinical Trials Market- By Region, Competition, Forecast & Opportunities, 2030F," the India Clinical Trials Market was valued at USD 2.05 billion in 2024 and is projected to grow at a compound annual growth rate (CAGR) of 8.64% through 2030. The market is driven by a variety of factors, making India an attractive destination for pharmaceutical companies and researchers. India's vast and diverse patient population, cost-effective operational environment, and a large pool of skilled medical professionals contribute significantly to the market's growth. Additionally, increasing government support in streamlining regulations and the growing prevalence of lifestyle diseases further propel the clinical trials market.
Growing Prevalence of Lifestyle Diseases
The rising incidence of lifestyle diseases such as diabetes, cardiovascular diseases, and cancer is a major trend driving the clinical trials market in India. These conditions necessitate the development and testing of new treatment methods, creating a robust demand for clinical trials. The increasing burden of these diseases highlights the need for innovative therapies and underscores the importance of India as a key player in global clinical research.
3. Common biostatistics tasks
● Cleaning and transforming data
● Data description
● Statistical testing
● Tabulation and visualization
● Bioinformatics (applied statistics for genomics)
● Post-hoc power calculations
● ...
4. Common biostatistics tasks
● Cleaning and transforming data
● Data description
● Statistical testing
● Tabulation and visualization
● Bioinformatics (applied statistics for genomics)
● Post-hoc power calculations
● Complain they weren't consulted earlier
6. Post-hoc sample size / power analysis
● Due to convenience, we justify choices already made
● Find a similar effect size in the literature
● Use the posterior distribution as prior
● Set the desired power (80-100%)
● Adjust as needed for dropout, loss, margin-of-error
● Obtain the sample size you already have
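For contrast with the post-hoc recipe above, a prospective calculation fixes the effect size, significance level, and power before any samples are collected. The following is a minimal sketch, assuming a two-sided paired test and the normal approximation; the function name, defaults, and dropout adjustment are illustrative and not taken from the talk.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(effect_size, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate number of pairs needed for a two-sided paired test.

    effect_size is the standardized mean of the paired differences
    (mean difference divided by the SD of the differences).
    The normal approximation slightly underestimates n for small samples.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    n = ((z_alpha + z_power) / effect_size) ** 2
    # Inflate for expected dropout so that enough complete pairs remain.
    return ceil(n / (1 - dropout))


pairs_needed = paired_sample_size(0.5)
pairs_with_dropout = paired_sample_size(0.5, dropout=0.1)
```

The point of doing this up front is that the resulting number is an input to the study, not a justification of the sample size already in hand.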
12. Natural variability of RNA per gene
De Torrente et al. (2020)
Surprisingly, the expression of less than 50% of all genes was Normally-distributed, with other distributions including Gamma, Bimodal, Cauchy, and Lognormal also represented.
Liu et al. (2019)
Based on the analysis of a group of real gene expression profiles, this study reveal that the primary density distributions of the real profiles are normal/log-normal and t distributions, accounting for 80% and 19% respectively.
20K+ genes
13. Representing RNAs with fragments
Gamma-Poisson distribution
Count and normalize to quantify (TPM)
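The "count and normalize" step can be sketched as follows. This is a minimal illustration of the usual TPM definition (length-normalize the counts, then rescale each sample to sum to one million); the gene names, counts, and lengths are made up for the example.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million from fragment counts and transcript lengths (kb)."""
    # Step 1: length-normalize, giving fragments per kilobase for each gene.
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    # Step 2: rescale so that the values of one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts for a single sample.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_kb = {"geneA": 1.0, "geneB": 3.0, "geneC": 2.0}
values = tpm(counts, lengths_kb)
```

Here geneA and geneB end up with the same TPM even though geneB has three times the raw count, because geneB is three times longer and so produces proportionally more fragments.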
14. Overview of the pipeline
[Diagram: pipeline stages (tissue sample, chemical preparation, sequencing) and sources of variability (effect between groups, inter-individual variation in RNA, batch effects, representation variability)]
18. RNA characterization of COVID-19 (2021) - Plan
● Total RNA – virus and host (human)
● Nasopharyngeal swabs and blood samples
● Paired design (on admittance and discharge from hospital)
● 18 individuals, total of 72 samples
● Which biological pathways are affected? (DEG)
● What can we say about the viral load? (metagenomics)
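The sample count can be checked by enumerating the design. This sketch assumes that the 72 samples arise as 18 individuals × 2 sample types × 2 time points, which is consistent with the bullets above but not stated explicitly; the identifiers and labels are hypothetical.

```python
from itertools import product

individuals = [f"P{i:02d}" for i in range(1, 19)]        # hypothetical patient IDs
sample_types = ["nasopharyngeal_swab", "blood"]
time_points = ["admission", "discharge"]

# One row per sequenced sample.
samples = [
    {"individual": p, "sample_type": s, "time_point": t}
    for p, s, t in product(individuals, sample_types, time_points)
]

# Each (individual, sample type) combination forms one admission/discharge pair.
pairs = {(row["individual"], row["sample_type"]) for row in samples}
```

Under this assumption the paired comparison within each tissue is made on 18 pairs, 36 pairs in total.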
19. Estimating sample size for RNA
● Theoretical models with assumed distributions
● Parameters inferred from previous datasets
● R-packages: RNASeqDesign, PROPER, powsimR, ssizeRNA
● Web tool: RNASeqSampleSize
● Results vary between tools
● If cost is not relevant, choose the most conservative (largest)
20. Proposed approach
● Perform one estimate and use it
● Remove unwanted variability (batch effect)
● Reduce variability with paired design
● Use meaningful metadata
● Filter the genes
21. Remove unwanted variability
A number of methods based on SVD remove high-level batch effects without specifically tracing them to interpretable variables.
One can use housekeeping or control genes as markers.
• SVA
• RUVseq
These methods produce new surrogate variables.
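The general idea of an SVD-derived surrogate variable can be illustrated with a toy example. This is not the SVA or RUVseq algorithm, only a sketch of the underlying step: find the dominant direction of variation across samples by power iteration and subtract it from every gene. All data and function names are hypothetical.

```python
import random
from math import sqrt


def leading_sample_vector(X, iterations=50, seed=0):
    """Leading right singular vector of X (genes x samples) by power iteration."""
    rng = random.Random(seed)
    n_samples = len(X[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(iterations):
        # u = X v gives one score per gene; X^T u maps back to sample space.
        u = [sum(x * w for x, w in zip(row, v)) for row in X]
        v = [sum(row[s] * u_g for row, u_g in zip(X, u)) for s in range(n_samples)]
        norm = sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return v


def remove_component(X, v):
    """Subtract each gene's projection onto the unit vector v."""
    cleaned = []
    for row in X:
        score = sum(x * w for x, w in zip(row, v))
        cleaned.append([x - score * w for x, w in zip(row, v)])
    return cleaned


# Toy data: 4 genes x 6 samples, already centered per gene.
batch = [1, 1, 1, -1, -1, -1]            # samples 1-3 and 4-6 processed separately
condition = [1, -1, 1, -1, 1, -1]        # biological grouping of interest
batch_loading = [5.0, -4.0, 3.0, 6.0]    # strong per-gene batch effect
signal_loading = [0.5, 0.2, -0.3, 0.1]   # weaker biological effect
X = [[bl * b + sl * c for b, c in zip(batch, condition)]
     for bl, sl in zip(batch_loading, signal_loading)]

surrogate = leading_sample_vector(X)
cleaned = remove_component(X, surrogate)
# Cosine similarity between the surrogate variable and the true batch labels.
similarity = abs(sum(s * b for s, b in zip(surrogate, batch))) / sqrt(len(batch))
```

In this toy design the surrogate variable recovers the batch labels almost exactly without being told what they are. Because batch and condition are partly correlated here, subtracting it also removes some of the biological signal, which is why restricting the estimation to control genes is attractive.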
Colleague quote:
"Once I see batch effects, I can correct them mathematically, but I never trust that dataset again."
23. Paired design
Paired design: taking control samples from the same patients after resolution or before the event.
● Increases power
● Not all analysis frameworks can take advantage of it
● Sometimes biologically difficult
● Halves the degrees of freedom (DF)
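The trade-off between power and degrees of freedom can be seen on a small made-up example. This is a sketch with hypothetical values: the same twelve measurements are analyzed once as six pairs and once as two independent groups of six.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic and degrees of freedom for the within-pair differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def pooled_t(before, after):
    """Student's two-sample t (equal variances assumed), ignoring the pairing."""
    n1, n2 = len(before), len(after)
    pooled_var = ((n1 - 1) * stdev(before) ** 2 + (n2 - 1) * stdev(after) ** 2) / (n1 + n2 - 2)
    t = (mean(after) - mean(before)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical expression of one gene in six patients at two time points.
# Patients differ a lot from each other, but each changes by only 1 or 2 units.
before = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
after = [11.0, 22.0, 31.0, 42.0, 51.0, 62.0]

t_paired, df_paired = paired_t(before, after)
t_unpaired, df_unpaired = pooled_t(before, after)
```

The paired analysis has 5 degrees of freedom instead of 10, yet its t statistic is far larger, because the large inter-individual variation cancels out within each pair.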
24. Meaningful metadata
Gender and age can always be relevant.
Collect metrics of sample quality (before and after sequencing).
Disease subtypes can be a covariate or group variable.
Helps choose which samples to sequence when only a subset can be sequenced.
25. Filter genes
Multiple testing correction for 20K+ genes.
Remove mostly unexpressed genes.
A priori removal is allowed.
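The interaction between filtering and multiple testing correction can be sketched as follows. The Benjamini-Hochberg procedure is used here as one common choice of correction; the talk does not say which method was applied. The filter thresholds are hypothetical, and the filter uses only expression levels, never the group labels, which is what makes a priori removal legitimate.

```python
def filter_expressed(counts, min_count=10, min_samples=2):
    """Keep genes with at least min_count reads in at least min_samples samples."""
    return {
        gene: row
        for gene, row in counts.items()
        if sum(c >= min_count for c in row) >= min_samples
    }


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: gene -> reads in four samples.
counts = {"g1": [0, 1, 0, 2], "g2": [50, 60, 55, 40], "g3": [12, 0, 15, 3]}
kept = filter_expressed(counts)

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Every gene removed before testing lowers the number of tests m, so the same raw p-value receives a smaller adjustment.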
27. Annotation representation testing – Panther.db
● Annotation is a subset of genes
● Multiple available annotation sets (structure, function, pathway...)
● We only use significant genes
● Overrepresentation test – chi-square to compare observed and expected frequencies
● Enrichment test – Mann-Whitney to test randomness of ranks
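The overrepresentation test can be sketched as a chi-square test on a 2x2 table of significant versus non-significant genes, inside versus outside one annotation set. This is a generic illustration with made-up numbers and no continuity correction, not necessarily the exact computation the annotation tool performs.

```python
from math import erfc, sqrt


def overrepresentation_chi2(hits_in_set, set_size, hits_total, genes_total):
    """Pearson chi-square (1 df) for one annotation set.

    Returns the statistic, its p-value, and the expected number of
    significant genes in the set under independence.
    """
    a = hits_in_set                    # significant, in the annotation set
    b = hits_total - hits_in_set       # significant, outside the set
    c = set_size - hits_in_set         # not significant, in the set
    d = genes_total - set_size - b     # not significant, outside the set
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))     # upper tail of chi-square with 1 df
    expected = hits_total * set_size / genes_total
    return chi2, p_value, expected


# Hypothetical numbers: 20,000 genes tested, 1,000 significant,
# an annotation set of 200 genes containing 30 of the significant ones.
chi2, p_value, expected = overrepresentation_chi2(30, 200, 1000, 20000)
```

With 10 significant genes expected by chance and 30 observed, the set is clearly overrepresented. In practice the same test is repeated for every annotation set, so these p-values also need multiple testing correction.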
28. Molecular function in blood (PAIRED)
● Increased immunoglobulin binding
● Reduced smell (in blood!)
● Reduced oxygen binding and carrier activity
● We consider the result validated
29. Takeaways of the study
● Study rescued by pairing
● No batch to correct
● Almost no metadata
● Smaller signal in blood
● Specific tissue (nasal) more robust
33. Easier to control for batches
● Pairing absorbs a proportion of batch effects
● Usually 8 lanes in a flowcell
● Focus on pairs instead of whole samples
● Aggregation of datasets easier
34. Technical downsides of pairing
● Loss of half DF
● Many frameworks cannot use it as easily as GLM-based ones
● RNA is used for other analyses:
○ SUPPA2 for alternative splicing
○ Building empirical distribution from all pairs of samples
○ If pairing were enforced, it would drastically reduce the number of observations
36. Tissue implications
● Specific tissues have robust signatures without pairing
● Blood reflects many tissues:
○ Weaker signal
○ Local changes reflected
● Systemic effects are found only in blood
● Always available for sampling (minimally invasive)
● Blood analysis benefits from pairing
37. Utility implications
● Paired designs are easier to aggregate to meta-studies (robust to batch effects)
● Blood controls can be used as unpaired controls for other studies (if healthy enough)
● Solves the problem of finding controls
● If controls are after resolution, questionable health (long COVID)
● Some chronic diseases cannot be caught early or ever resolved, so pairing is impossible
38. Example – cardiovascular events
● We are interested in markers of plaque progression/instability
● Patient checkup and sampling every X months
● Sequencing is expensive, sampling and storing is not
● Sequence only the previous two samples before the event
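The selection rule in the last bullet can be sketched as follows, assuming each stored sample is identified by its collection date. The dates and the function name are hypothetical.

```python
from datetime import date


def samples_to_sequence(sample_dates, event_date, n_before=2):
    """Return the n_before most recent sampling dates strictly before the event."""
    earlier = sorted(d for d in sample_dates if d < event_date)
    return earlier[-n_before:]


# Hypothetical checkups every six months, with an event in March 2022.
checkups = [date(2021, 1, 10), date(2021, 7, 12), date(2022, 1, 9), date(2022, 7, 11)]
event = date(2022, 3, 1)
selected = samples_to_sequence(checkups, event)
```

Only the two stored samples preceding the event are sent for sequencing, giving a within-patient pair without sequencing the whole series. If fewer than two earlier samples exist, the function returns whatever is available.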
39. Example – neurodegenerative disease (ALS)
● We cannot predict the disease (10% familial)
● Patient available for sampling once diseased
● Sequence patients sufficiently apart
● We cannot find the root cause of ALS, as we are not catching the initial event
● We can find signatures of neuronal suffering and death, which is an actionable point
● Generalizes to all chronic diseases
40. Example – cancer
● For DNA, tumor is matched with blood sample control
● For RNA, we need the normal surrounding tissue
● Sampling the healthy normal target tissue may be problematic
● Tissue margin – potential normal sample
● Admixture of tumor in normal reduces the signal (but not critically for RNA)
41. Many thanks to...
● Institute for Biocides and Medical Ecology for providing the samples and sequencing
● HTEC Group for providing computational resources and support
● School of Medicine, University of Belgrade for supporting research
● Thanks to DSC organizers for the invite
● Last but not least...
Hello, my name is Ognjen Milicevic from Belgrade, Serbia. Because of my mixed medical and engineering background, today I chose to tackle an interdisciplinary subject -