Hello!
I am Manu Gupta
I am the product owner for the data
warehousing solution for the Department of
Internal Medicine, University of Michigan.
You can find me at gmanu@umich.edu
BIG CONCEPT
A data warehousing solution for
storing expression data.
Tech Stack
Our stack includes
◦ openSUSE Linux Server
◦ PostgreSQL
◦ Django
◦ R
◦ Nginx
Currently, we have over 3.5 million
data points in the database and
46 relations.
What do we store?
What is it intended to do?
◦ Internal data storage & analysis
◦ Expression data with their annotation.
◦ Data includes
▫ Platforms (Affymetrix Array)
▫ Dataset (GDS* / GSE* based on GEO)
▫ Series (GSE* based on GEO)
▫ Samples (GSM* based on GEO)
▫ Annotations (Technical & Biological)
▫ Expression Values
▫ QC Results
▫ Sequence Data
Quick Review
What it is not intended to do
◦ Store clinical data
OUR MANTRA
Usable, Flexible, Efficient
“
◦ Accommodate upcoming
technologies (for example,
RNA-Seq).
◦ Answer a variety of questions.
- Flexibility
Flexibility
The questions we can answer
◦ Dataset
▫ Find all the series a sample belongs to.
▫ Find the QC results and expression values of a series.
▫ Find groups of genes that are very highly
expressed or very poorly expressed.
▫ Find expression values between specific sequence
start and end positions.
◦ Sequence Groups
▫ Find all the homologous genes of a particular gene.
▫ Find the annotation of a sequence.
▫ Find all the sequences between particular
start and end positions.
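As an illustration only, two of the questions above (expression values within a position range, and very highly or very poorly expressed genes) could be sketched as SQL queries. The two-table schema, the table and column names, the gene and sample identifiers, and the thresholds below are all hypothetical; the real warehouse has 46 relations in PostgreSQL, and SQLite is used here only so that the sketch runs on its own.

```python
import sqlite3

# Hypothetical, heavily simplified schema (not the warehouse's actual relations).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    id INTEGER PRIMARY KEY, gene TEXT, seq_start INTEGER, seq_end INTEGER
);
CREATE TABLE expression (
    sample TEXT, sequence_id INTEGER REFERENCES sequence(id), value REAL
);
""")
# Made-up rows, for illustration only.
conn.executemany("INSERT INTO sequence VALUES (?, ?, ?, ?)", [
    (1, "GENE_A", 100, 500),
    (2, "GENE_B", 800, 1200),
    (3, "GENE_C", 5000, 5600),
])
conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", [
    ("GSM_1", 1, 2.5), ("GSM_1", 2, 11.0), ("GSM_1", 3, 7.2),
    ("GSM_2", 1, 3.1), ("GSM_2", 2, 12.4), ("GSM_2", 3, 6.8),
])

def expression_in_range(conn, start, end):
    """Expression values for sequences lying fully within [start, end]."""
    return conn.execute(
        """SELECT e.sample, s.gene, e.value
           FROM expression e JOIN sequence s ON s.id = e.sequence_id
           WHERE s.seq_start >= ? AND s.seq_end <= ?
           ORDER BY e.sample, s.gene""",
        (start, end),
    ).fetchall()

def genes_by_mean_expression(conn, low, high):
    """Genes whose mean expression is below `low` or above `high`."""
    return conn.execute(
        """SELECT s.gene, AVG(e.value)
           FROM expression e JOIN sequence s ON s.id = e.sequence_id
           GROUP BY s.gene
           HAVING AVG(e.value) < ? OR AVG(e.value) > ?
           ORDER BY s.gene""",
        (low, high),
    ).fetchall()
```

Calling `expression_in_range(conn, 0, 1500)` returns the rows for the two sequences that fall inside that window, and `genes_by_mean_expression(conn, 3.0, 10.0)` returns the genes at either extreme of the made-up data.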
“
◦ Storing and retrieving data
should take as little effort as
possible.
- Usability
OUR PROCESS IS EASY
Use batch uploads to load large amounts of data
(demo)
Retrieve datasets quickly
Use a web interface
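A batch upload of this kind might look roughly like the sketch below, which reads rows from CSV text and inserts them in a single transaction. The CSV layout, the `expression` table, and the `batch_upload` helper are assumptions made for illustration, not the warehouse's actual upload code; SQLite stands in for PostgreSQL so the example is self-contained.

```python
import csv
import io
import sqlite3

def batch_upload(conn, csv_text):
    """Insert (sample, probe, value) rows from CSV text in one transaction."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["sample"], r["probe"], float(r["value"])) for r in reader]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO expression (sample, probe, value) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)

# Hypothetical table and made-up data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expression (sample TEXT, probe TEXT, value REAL)")
uploaded = batch_upload(
    conn,
    "sample,probe,value\nGSM_1,probe_1,2.5\nGSM_1,probe_2,11.0\n",
)
```

Grouping all rows into one transaction keeps a partially failed upload from leaving half a dataset behind.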
“
◦ Wherever possible, allow
computations.
- Efficiency
OUR PROCESS IS EASY
Statistical analysis in R, Python, or any other programming
language
Visualizations
Data transformations for external databases such as
Nephromine and tranSMART
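As a small example of the kind of analysis that can run on retrieved data, the sketch below computes the mean and sample standard deviation of expression values per gene. The `(gene, value)` row shape, the `summarize_by_gene` helper, and the numbers are assumptions for illustration; they are not taken from the warehouse.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_gene(rows):
    """Group (gene, value) rows and return {gene: (mean, sample stdev)}."""
    values = defaultdict(list)
    for gene, value in rows:
        values[gene].append(value)
    return {
        # stdev needs at least two values; report 0.0 for a single value
        gene: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for gene, v in values.items()
    }

# Made-up rows, shaped as a query result might be.
rows = [("GENE_A", 2.0), ("GENE_A", 4.0), ("GENE_B", 10.0)]
summary = summarize_by_gene(rows)
```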
THANK YOU
ANY QUESTIONS
CREDITS
Special thanks to
◦ Felix Eichinger, Viji Nair
◦ Shruti, Fan, Adam, Lauren & Izhar for
visualizations
◦ Presentation template by SlidesCarnival
◦ Photographs by Unsplash

Data warehousing solution for Department of Internal Medicine, University of Michigan
