ForPeerReview
Only/NotforDistribution
Big Data
Towards more transparent and reproducible omics studies
through a common metadata checklist and data publications
Journal: Big Data
Manuscript ID: BIG-2013-0039
Manuscript Type: Original Article
Date Submitted by the Author: 05-Nov-2013
Complete List of Authors: Kolker, Eugene; Seattle Children's Research Institute, Bioinformatics and
High-Throughput Analysis Laboratory; Seattle Children's, Predictive
Analytics; Data-Enabled Life Sciences Alliance,
Ozdemir, Vural; Gaziantep University, Office of the President; Data-
Enabled Life Sciences Alliance,
Martens, Lennart; VIB, Department of Medical Protein Research; Ghent
University, Department of Biochemistry; Data-Enabled Life Sciences
Alliance,
Hancock, William; Northeastern University, Barnett Institute; Data-Enabled
Life Sciences Alliance,
Anderson, Gordon; Pacific Northwest National Laboratory, Fundamental &
Computational Sciences Directorate; Data-Enabled Life Sciences Alliance,
Anderson, Nathaniel; Seattle Children's Research Institute, Bioinformatics
and High-Throughput Analysis Laboratory; Data-Enabled Life Sciences
Alliance,
Aynacioglu, Sukru; Gaziantep University, Department of Pharmacology;
Data-Enabled Life Sciences Alliance,
Baranova, Ancha; George Mason University, School of Systems Biology;
Data-Enabled Life Sciences Alliance,
Campagna, Shawn; University of Tennessee, Knoxville, Department of
Chemistry; Data-Enabled Life Sciences Alliance,
Chen, Rui; Stanford University, Department of Genetics; Data-Enabled Life
Sciences Alliance,
Choiniere, John; Seattle Children's Research Institute, Bioinformatics and
High-Throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance,
Dearth, Stephen; University of Tennessee, Knoxville, Department of
Chemistry; Data-Enabled Life Sciences Alliance,
Feng, Wu-Chun; Virginia Tech, SyNeRGy Laboratory; Virginia Tech,
Department of Computer Science; Virginia Tech, Department of Electrical
and Computer Engineering; Data-Enabled Life Sciences Alliance,
Ferguson, Lynnette; University of Auckland, Department of Nutrition; Data-
Enabled Life Sciences Alliance,
Fox, Geoffrey; Indiana University, School of Informatics and Computing;
Data-Enabled Life Sciences Alliance,
Frishman, Dmitrij; Technische Universität München, ; Data-Enabled Life
Sciences Alliance,
Grossman, Robert; University of Chicago, Institute for Genomics and
ScholarOne Support phone: 434-964-4100 email: ts.mcsupport@thomson.com
Mary Ann Liebert, Inc.
Systems Biology; University of Chicago, Department of Medicine; Data-
Enabled Life Sciences Alliance,
Heath, Allison; University of Chicago, Institute for Genomics and Systems
Biology; Data-Enabled Life Sciences Alliance,
Higdon, Roger; Seattle Children's Research Institute, Bioinformatics and
High-Throughput Analysis Laboratory; Seattle Children's, Predictive
Analytics; Data-Enabled Life Sciences Alliance,
Hutz, Mara; Federal University of Rio Grande do Sul, Departamento de
Genetica; Data-Enabled Life Sciences Alliance,
Janko, Imre; Seattle Children's Research Institute, High-Throughput
Analysis Core; Data-Enabled Life Sciences Alliance,
Jiang, Lihua; Stanford University, Department of Genetics; Data-Enabled
Life Sciences Alliance,
Joshi, Sanjay; EMC, Life Sciences; Data-Enabled Life Sciences Alliance,
Kel, Alexander; GeneXplain GmbH, ; Data-Enabled Life Sciences Alliance,
Kemnitz, Joseph; University of Wisconsin-Madison, Department of Cell and
Regenerative Biology; University of Wisconsin-Madison, Wisconsin National
Primate Research Center; Data-Enabled Life Sciences Alliance,
Kohane, Isaac; Harvard Medical School, Department of Pediatrics; Harvard
Medical School, Health Sciences and Technology; Harvard Medical School,
Center for Biomedical Informatics; Data-Enabled Life Sciences Alliance,
Kolker, Natali; Seattle Children’s Research Institute, High-throughput
Analysis Core; Seattle Children’s Hospital, Predictive Analytics; Data-
Enabled Life Sciences Alliance,
Lancet, Doron; Weizmann Institute of Science, Department of Molecular
Genetics; Data-Enabled Life Sciences Alliance,
Lee, Elaine; Data-Enabled Life Sciences Alliance, ; Seattle Children's
Research Institute, High-Throughput Analysis Core
Li, Weizhong; University of California, San Diego, Center for Research in
Biological Systems; Data-Enabled Life Sciences Alliance,
Lisitsa, Andrey; Russian Human Proteome Organization, ; Institute of
Biomedical Chemistry, ; Data-Enabled Life Sciences Alliance,
Llerena, Adrian; Extremadura University Hospital and Medical School,
Clinical Research Center; Data-Enabled Life Sciences Alliance,
MacNealy-Koch, Courtney; Seattle Children's Research Institute,
Bioinformatics and High-throughput Analysis Laboratory; Data-Enabled Life
Sciences Alliance (DELSA Global),
Marshall, Jean-Claude; Catholic Health Initiatives, Center for Translational
Research; Data-Enabled Life Sciences Alliance,
Masuzzo, Paola; VIB, Department of Medical Protein Research; Ghent
University, Department of Biochemistry; Data-Enabled Life Sciences
Alliance,
May, Amanda; University of Tennessee, Knoxville, Department of
Chemistry; Data-Enabled Life Sciences Alliance,
Mias, George; Stanford University, Department of Genetics; Data-Enabled
Life Sciences Alliance,
Monroe, Matthew; Pacific Northwest National Laboratory, Biological
Sciences Division; Data-Enabled Life Sciences Alliance,
Montague, Elizabeth; Seattle Children's Research Institute, Bioinformatics
and High-Throughput Analysis Laboratory; Data-Enabled Life Sciences
Alliance,
Mooney, Sean; The Buck Institute for Research on Aging, ; Data-Enabled
Life Sciences Alliance,
Nesvizhskii, Alexey; University of Michigan, Department of Pathology;
University of Michigan, Department of Computational Medicine and
Bioinformatics; Data-Enabled Life Sciences Alliance,
Noronha, Santosh; Indian Institute of Technology Bombay, Department of
Chemical Engineering; Data-Enabled Life Sciences Alliance,
Omenn, Gilbert; University of Michigan, Department of Computational
Medicine and Bioinformatics; University of Michigan, Department of
Molecular Medicine and Genetics; University of Michigan, School of Public
Health ; Data-Enabled Life Sciences Alliance,
Rajasimha, Harsha; Jeeva Informatics Solutions LLC, ; Data-Enabled Life
Sciences Alliance,
Ramamoorthy, Preveen; National Jewish Health, Department of Medicine;
Data-Enabled Life Sciences Alliance,
Sheehan, Jerry; University of California, San Diego, California Institute of
Telecommunications and Information Technology; Data-Enabled Life
Sciences Alliance,
Smarr, Larry; University of California, San Diego, California Institute of
Telecommunications and Information Technology; Data-Enabled Life
Sciences Alliance,
Smith, Charles; Seattle Children's Research Institute, Center for
Developmental Therapeutics; Data-Enabled Life Sciences Alliance,
Smith, Todd; Digital World Biology, ; Data-Enabled Life Sciences Alliance,
Snyder, Michael; Stanford University, Department of Genetics; Stanford
University, Center for Genomics and Personalized Medicine; Data-Enabled
Life Sciences Alliance,
Rapole, Srikanth; National Centre for Cell Science, Proteomics Laboratory;
Data-Enabled Life Sciences Alliance,
Srivastava, Sanjeeva; Indian Institute of Technology Bombay, Proteomics
Laboratory; Data-Enabled Life Sciences Alliance,
Stanberry, Larissa; Seattle Children's Research Institute, Bioinformatics
and High-throughput Analysis Laboratory; Seattle Children’s Hospital,
Predictive Analytics; Data-Enabled Life Sciences Alliance,
Stewart, Elizabeth; Seattle Children's, Bioinformatics and High-Throughput
Data Analysis Laboratory; Data-Enabled Life Sciences Alliance,
Toppo, Stefano; University of Padova, Department of Molecular Medicine;
Data-Enabled Life Sciences Alliance,
Uetz, Peter; J Craig Venter Institute, ; Data-Enabled Life Sciences Alliance,
Verheggen, Kenneth; VIB, Department of Medical Protein Research; Ghent
University, Department of Biochemistry; Data-Enabled Life Sciences
Alliance,
Voy, Brynn; University of Tennessee, Knoxville, Department of Animal
Science; Data-Enabled Life Sciences Alliance,
Warnich, Louise; University of Stellenbosch, Department of Genetics; Data-
Enabled Life Sciences Alliance,
Wilhelm, Steven; University of Tennessee, Knoxville, Department of
Microbiology; Data-Enabled Life Sciences Alliance,
Yandl, Gregory; Seattle Children's Research Institute, Bioinformatics and
High-throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance,
Keywords: genomics, computational biology, data acquisition and cleaning
Title: Towards more transparent and reproducible omics studies
through a common metadata checklist and data publications
Running Title:
Metadata checklist for omics studies and data publications
Keywords:
Multi-disciplinary, multi-omics, metadata, checklist, guidelines, transparency, reproducibility,
integrated analysis, data, reuse
Authors:
Eugene Kolker*
Bioinformatics and High-Throughput Analysis Laboratory,
Seattle Children’s Research Institute,
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: eugene.kolker@seattlechildrens.org
Phone: 206-884-7170
Vural Özdemir
Office of the President, Gaziantep University,
International Affairs and Global Development Strategy,
and Faculty of Communications
Üniversite Bulvarı P.K. 27310 Şehitkamil
Gaziantep, Kilis Yolu, Turkey
Email: vural.ozdemir@gantep.edu.tr
Phone: 90-545-550-88-80
Lennart Martens
Department of Medical Protein Research,
VIB, Ghent, Belgium
Department of Biochemistry,
Ghent University, Ghent, Belgium
Albert Baertsoenkaai 3
9000 GENT
Email: lennart.martens@vib-ugent.be
Phone: 32-9-264-93-58
William Hancock
Barnett Institute, Department of Chemistry
Northeastern University
341 Mugar Building
Boston MA 02155
Email: wi.hancock@neu.edu
Phone: 617-373-4881
Gordon Anderson
Fundamental & Computational Sciences Directorate
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, MSIN K8-98
Richland, WA 99352
Email: Gordon@pnl.gov
Phone: 509-371-6582
Nathaniel Anderson
Bioinformatics and High-Throughput Analysis Laboratory,
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: nate.anderson@seattlechildrens.org
Phone: 206-884-1230
Şükrü Aynacioglu
Department of Pharmacology
Gaziantep University
University Boulevard, 27310 Sehitkamil
Gaziantep, Turkey
Email: aynacioglu@gantep.edu.tr
Phone: 90-342-360-1023
Ancha Baranova
School of Systems Biology
George Mason University
10900 University Blvd, MSN 5B3
Manassas, VA 20110
Email: abaranov@gmu.edu
Phone: 703-993-4293
Shawn R. Campagna
Department of Chemistry
University of Tennessee, Knoxville
618 Buehler Hall
Knoxville, TN 37996-1600
Email: campagna@utk.edu
Phone: 865-974-7337
Rui Chen
Department of Genetics
Stanford University
300 Pasteur Drive, Alway M-308
Stanford, CA 94305-5120
Email: ruichens@stanford.edu
Phone: 650-723-3277
John Choiniere
Bioinformatics and High-Throughput Analysis Laboratory
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: john.choiniere@seattlechildrens.org
Phone: 206-884-1251
Stephen P. Dearth
Department of Chemistry
University of Tennessee, Knoxville
1420 Circle Dr.
Knoxville, TN 37996
Email: sdearth@ion.chem.utk.edu
Phone: 865-974-3141
Wu-Chun Feng
SyNeRGy Laboratory
Departments of Computer Science & Electrical and Computer Engineering
Virginia Tech
2202 Kraft Dr
Blacksburg, VA 24060
Email: feng@cs.vt.edu
Phone: 540-231-1192
Lynnette Ferguson
Auckland Cancer Society Research Centre
Department of Nutrition
University of Auckland
Private Bag 92019, Auckland Mail Centre
Auckland 1142
New Zealand
Email: l.ferguson@auckland.ac.nz
Phone: 64-9-373-7599, ext. 86372
Geoffrey Fox
School of Informatics and Computing
Indiana University
815 E 10th Street, Room 105
Bloomington, IN 47408
Email: gcf@indiana.edu
Phone: 812-856-7977
Dmitrij Frishman
Technische Universität München,
Wissenschaftszentrum Weihenstephan,
Am Forum 1, 85354 Freising, Germany
Email: d.frishman@wzw.tum.de
Phone: 49-8161-712134
Robert Grossman
Institute for Genomics and Systems Biology
and Department of Medicine,
University of Chicago
900 East 57th Street, Room 10142
Chicago, IL 60637
Email: robert.grossman@uchicago.edu
Phone: 773-834-4669
Allison Heath
Institute for Genomics and Systems Biology,
University of Chicago
Knapp Center for Biomedical Discovery
900 East 57th Street
Chicago, IL 60637
Email: aheath@uchicago.edu
Phone: 773 573-9560
Roger Higdon
Bioinformatics and High-Throughput Analysis Laboratory,
Seattle Children’s Research Institute
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: roger.higdon@seattlechildrens.org
Phone: 206-884-7172
Mara H. Hutz
Departamento de Genética,
Instituto de Biociências,
Federal University of Rio Grande do Sul,
Caixa Postal 15053, 91501-970
Porto Alegre, RS, Brazil.
E-mail: mara.hutz@ufrgs.br
Imre Janko
High-Throughput Analysis Core,
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: imre.janko@seattlechildrens.org
Phone: 206-884-1064
Lihua Jiang
Department of Genetics
Stanford University
300 Pasteur Drive, M-344A
Stanford, California 94305
Email: lihuaj@stanford.edu
Phone: 650-723-9914
Sanjay Joshi
Life Sciences
EMC2
Email: sanjay.joshi@emc.com
Phone: 425-429-4727
Alexander Kel
GeneXplain GmbH
Am Exer 10 b
D-38302 Wolfenbüttel
Germany
Email: alexander.kel@genexplain.com
Phone: +49-05331-992200-13
Joseph W. Kemnitz
Department of Cell and Regenerative Biology
Wisconsin National Primate Research Center
University of Wisconsin-Madison
1220 Capitol Court
Madison, WI 53715-1299
Email: kemnitz@primate.wisc.edu
Phone: 608-263-3588
Isaac S. Kohane
Pediatrics and Health Sciences and Technology
Children's Hospital and Harvard Medical School
HMS Center for Biomedical Informatics
Countway Library of Medicine
10 Shattuck Street
Boston, MA 02115
Email: isaac_kohane@harvard.edu
Phone: 617-432-2144
Natali Kolker
High-Throughput Analysis Core,
Seattle Children’s Research Institute
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: natali.kolker@seattlechildrens.org
Phone: 206-884-7171
Doron Lancet
Crown Human Genome Center
Department of Molecular Genetics
Weizmann Institute of Science
Rehovot 76100, Israel
Email: doron.lancet@weizmann.ac.il
Phone: 972-8-934-3683
Elaine Lee
High-throughput Analysis Core,
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: Elaine.lee@seattlechildrens.org
Phone: 206-884-1250
Weizhong Li
Center for Research in Biological Systems
University of California, San Diego
9500 Gilman Drive MC 0446
Atkinson Hall, Room 3113
La Jolla, CA 92093-0446
Email: liwz@sdsc.edu
Phone: 858-534 4143
Andrey Lisitsa
Russian Human Proteome Organization (RHUPO)
Institute of Biomedical Chemistry
Moscow, 199121 Russia
Pogodinskaya str., 10
Email: Lisitsa063@gmail.com
Phone: 7-499-246-3731
Adrián Llerena
Clinical Research Center
Extremadura University Hospital and Medical School
Hospital Infanta Cristina Av de Elvas s/n
06080 Badajoz, Spain
Email: allerena@unex.es
Phone: +34924218040
Courtney MacNealy-Koch
DELSA Global,
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: courtney.macnealykoch@seattlechildrens.org
Phone: 206-884-1171
Jean-Claude Marshall
Center for Translational Research
Catholic Health Initiatives
7601 Osler Drive,
Towson, MD 21204
Email: Jean-ClaudeMarshall@catholichealth.net
Phone: 410.427.2587
Paola Masuzzo
Department of Medical Protein Research
VIB, Ghent, Belgium
Department of Biochemistry
Ghent University, Ghent, Belgium
Albert Baertsoenkaai 3
9000 GENT
Email: paola.masuzzo@vib-ugent.be
Phone: 32-9-264-93-33
Amanda L. May
Department of Chemistry
UT Knoxville
1420 Circle Dr.
Knoxville, TN 37996-1600
Email: amay@ion.chem.utk.edu
Phone: 865-974-3141
George Mias
Department of Genetics
Stanford University
300 Pasteur Drive, Alway Building M308
Stanford, CA 94305-5120
Email: george.mias@stanford.edu
Phone: 650-723-3277
Matthew Monroe
Biological Sciences Division
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999 MSIN K8-98
Richland, WA 99352
Email: matthew.monroe@pnnl.gov
Phone: 509-371-6580
Elizabeth Montague
Bioinformatics and High-Throughput Analysis Lab,
Seattle Children’s Research Institute
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: elizabeth.montague@seattlechildrens.org
Phone: 206-884-1137
Sean Mooney
The Buck Institute for Research on Aging
8001 Redwood Blvd.
Novato, CA 94945
Email: smooney@buckinstitute.org
Phone: 415-209-2038
Alexey Nesvizhskii
Department of Pathology,
Department of Computational Medicine and Bioinformatics
University of Michigan
4237 Med Sci I, 1301 Catherine Road
Ann Arbor, MI 48109-0602
Email: nesvi@med.umich.edu
Phone: 734-764-3516
Santosh Noronha
Department of Chemical Engineering
Indian Institute of Technology Bombay
Powai, Mumbai
400076, INDIA
Email: noronha@iitb.ac.in
Phone: 91-22-2576 7238
Gilbert Omenn
Center for Computational Medicine and Bioinformatics
Departments of Molecular Medicine & Genetics and Human Genetics
Department of Computational Medicine and Bioinformatics
School of Public Health
100 Washtenaw Avenue
2017 Palmer Commons
Ann Arbor, MI 48109-2218
Email: gomenn@umich.edu
Phone: 734-763-7583
Harsha K. Rajasimha
Jeeva Informatics Solutions LLC
7706 Majestic Way
Derwood, MD USA
Email: harsha@jeevadx.com
Phone: 540-239-0465
Preveen Ramamoorthy
Advanced Diagnostic Laboratories,
Department of Medicine National Jewish Health
1400 Jackson St
Denver, CO 80206
Email: RamamoorthyP@NJHealth.org
Phone: 303-398-1501
Jerry Sheehan
California Institute for Telecommunications and Information Technology
University of California, San Diego
9500 Gilman Drive #0436
La Jolla, CA 92093-0436
Email: jsheehan@ucsd.edu
Phone: 858-534-1723
Larry Smarr
California Institute for Telecommunications and Information Technology
University of California, San Diego
9500 Gilman Drive
La Jolla, CA 92093-0436
Email: lsmarr@ucsd.edu
Phone: 858-822-4284
Charles V. Smith
Center for Developmental Therapeutics
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: charlesskip.smith@seattlechildrens.org
Phone: 206-884-7453
Todd Smith
Digital World Biology
2442 NW Market St. PMB 160
Seattle, WA 98107
Email: todd@digitalworldbiology.com
Phone: 206-375-6843
Michael Snyder
Department of Genetics
Stanford Center for Genomics and Personalized Medicine
Stanford University
300 Pasteur Drive, M-344A
Stanford, California 94305
Email: mpsnyder@Stanford.edu
Phone: 650-736-8099
Srikanth Rapole
Proteomics Laboratory
National Centre for Cell Science
NCCS Complex, Pune University Campus
Ganeshkhind, Maharashtra
Pune- 411007, India
Email: rsrikanth@nccs.res.in
Phone: +91-20-25708075
Sanjeeva Srivastava
Proteomics Laboratory
Indian Institute of Technology Bombay
Powai, Mumbai
400 076, India
Email: sanjeeva@iitb.ac.in
Phone: 022-2576-7779
Larissa Stanberry
Bioinformatics and High-Throughput Analysis Laboratory,
Seattle Children’s Research Institute
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: larissa.stanberry@seattlechildrens.org
Phone: 206-884-1059
Elizabeth Stewart
Bioinformatics and High-Throughput Analysis Laboratory
Seattle Children’s Research Institute
1900 Ninth Avenue
Seattle, WA 98101
Email: elizabeth.stewart@seattlechildrens.org
Phone: 206-884-7176
Stefano Toppo
Department of Molecular Medicine
University of Padova
Via U. Bassi 58/b
I-35131 Padova, Italy
Email: stefano.toppo@unipd.it
Phone: 39-049-8276958
Peter Uetz
J Craig Venter Institute (JCVI)
9704 Medical Center Drive
Rockville, MD 20850
Email: uetz@jcvi.org
Phone: 301-795-7589
Kenneth Verheggen
Department of Medical Protein Research
VIB, Ghent, Belgium
Department of Biochemistry
Ghent University, Ghent, Belgium
Albert Baertsoenkaai 3
9000 GENT
Email: kenneth.verheggen@ugent.be
Phone: 32-9-264-93-58
Brynn H. Voy
Department of Animal Science
UT Institute of Agriculture
2506 River Drive
Knoxville, Tennessee 37996
Email: bhvoy@utk.edu
Phone: 865-974-4729
Louise Warnich
Department of Genetics
Faculty of AgriSciences
University of Stellenbosch
Private Bag X1,
Matieland, 7602,
Stellenbosch, South Africa
E-mail: lw@sun.ac.za
Phone: 27-21-808-5888
Steven W. Wilhelm
Department of Microbiology
University of Tennessee, Knoxville
M409 Walters Life Sciences
Knoxville, Tennessee 37996-0845
Email: Wilhelm@utk.edu
Phone: 865-974-0665
Gregory Yandl
Bioinformatics and High-Throughput Analysis Laboratory,
Seattle Children’s Research Institute
Predictive Analytics, Seattle Children’s
1900 Ninth Avenue
Seattle, WA 98101
Email: gregory.yandl@seattlechildrens.org
Phone: 206-884-3009
*Signifies corresponding author
Abstract
Biological processes are fundamentally driven by complex interactions between
biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells,
organisms, or their communities. With the advent of new post-genomics technologies, omics
studies are becoming increasingly prevalent, yet the full impact of these studies can only be
realized through data harmonization, sharing, meta-analysis, and integrated research. These
essential steps require consistent generation, capture, and distribution of the metadata. To
ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of
life sciences studies, we propose a simple common omics metadata checklist. The proposed
checklist is built on the rich ontologies and standards already in use by the life sciences
community. The checklist will serve as a common denominator to guide experimental design,
capture important parameters, and be used as a standard format for stand-alone data publications.
This omics metadata checklist and data publications will create efficient linkages between omics
data and knowledge-based life sciences innovation and, importantly, allow for appropriate
attribution to data generators and infrastructure science builders in the post-genomics era. We
ask that the life sciences community test the proposed omics metadata checklist and data
publications and provide feedback for their use and improvement.
A COMMON OMICS METADATA CHECKLIST PROPOSAL
Modern life science technologies enable rapid and efficient acquisition of omics data.
These data comprehensively measure multi-layered molecular networks and provide a snapshot
of biological processes in a cell, organism, or their communities. Collected on the same sample
at the same time, omics data provide information on the functioning of biomolecules and their
interactions. Omics studies are essential for the systemic investigation of biological systems--an
endeavor that is crucial to improve our ability to manage and cure diseases, identify drug targets,
understand regulatory cascades, and predict ecosystem responses to environmental changes.
Through the pioneering efforts of Drs. Smarr and Snyder 1-4, two powerful multi-omics
human datasets were recently made available. Smarr's dataset includes a wide variety of
molecular measures and clinical parameters meticulously collected and cataloged for years,
while Snyder's integrative personal multi-omics study presents his personal genomics,
transcriptomics, proteomics, metabolomics, and autoantibody profiles collected over a 14-month
period. Both studies yielded unique physiological insights not previously possible, including
early indications of vulnerabilities to specific diseases.
In the near future, these kinds of personal omics studies will become routine and will
inevitably result in vast and diverse volumes of omics data. Therefore, the scientific community
must commit to a common format for publishing the design and analysis of these studies that will
ensure the compatibility, reproducibility, and reuse of the resulting data.
The use, integration, and reuse of data require accurate and comprehensive capture of the
associated metadata, including details describing experimental design, sample acquisition and
preparation, instrument protocols, and processing steps. The data and metadata must be captured
together in a rigorous and consistent manner to allow the integration of data across omics
experiments. The use of ontologies, naming conventions, and standards can increase the
compatibility and usability of these diverse data. Fortunately, life sciences data have certain core
similarities. However, these similarities are accompanied by differing nuances among
technology platforms such as transcriptomics, proteomics, and metabolomics, as well as among
application contexts such as neuroscience and hematology. The differences are compounded by
the multiplicity of standards within a field--transcriptomics alone has at least 15 standards
potentially applicable to the data 5,6. Such complexities not only make reproducible, integrative,
accurate, and comprehensive capture of data and metadata an intricate challenge that must be
overcome but also place an excessive burden on researchers trying to convey metadata 5,7.
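As an illustration only, the metadata categories named in the preceding paragraph could be held in a single structured record that travels with the data. The Python sketch below is not the checklist proposed in this article: the class name, field names, helper function, and sample values are all hypothetical and simply mirror the categories listed in the text.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class OmicsMetadataRecord:
    """Hypothetical record; fields mirror the metadata categories named in the text."""
    experimental_design: str = ""
    sample_acquisition: str = ""
    sample_preparation: str = ""
    instrument_protocols: str = ""
    processing_steps: List[str] = field(default_factory=list)


def missing_fields(record: OmicsMetadataRecord) -> List[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values for illustration; they do not describe any real study.
record = OmicsMetadataRecord(
    experimental_design="longitudinal, single individual",
    sample_acquisition="venous blood draw",
    instrument_protocols="LC-MS/MS",
)
print(missing_fields(record))  # ['sample_preparation', 'processing_steps']
```

A check of this kind could flag incomplete records before the data are shared, although a real implementation would presumably draw its terms from the community ontologies and standards discussed here rather than from free text.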
Pioneering attempts in this area were made in 2007, when the “Minimum Information
about a Biomedical or Biological Investigation” project brought many of these efforts for the life
sciences together into an umbrella organization: MIBBI 8,9. In MIBBI, each set of guidelines is
developed by a working group concentrated in a specific field (for example, fMRI, or QTL and
association studies). Through this approach, MIBBI aspires to capture all essential metadata and
data that are necessary to replicate any given experiment within a field. Similarly, the framework
known as Minimum Information about any (x) Sequence (MIxS) expands the breadth of information
available by integrating the individual genomics checklists developed by the Genomic
Standards Consortium with environmental information 10. In addition, the NIH’s National Center
for Biotechnology Information developed a format for cataloging information about samples,
enabling further metadata availability 11. While these frameworks are critical to the reuse of data,
they do not fully take into account the interlocking aspects needed for harmonization of diverse
omics data types.
Recently, the Nature Publishing Group implemented a publication checklist that provides
another example of an approach to improve the transparency and reproducibility of life sciences
publications 12,13
. The checklist requires the researcher and/or corresponding author to enter
specific information on experimental design, statistical analysis, and reagents. This checklist is
endorsed by the Data-Enabled Life Sciences Alliance (DELSA Global) 14.
The unveiling of the Nature publication initiative brought into focus the need for a
complementary omics checklist that allows the capture and publication of critical metadata
associated with omics data sets. To this end, life sciences researchers from DELSA Global 15-19
propose a single common omics metadata checklist as described below. By integrating DELSA
researchers’ collective experiences with omics guidelines and publication requirements, one
simplified, yet informative and flexible checklist was created to capture the essential aspects of
omics studies 20.
Publication of a completed checklist will serve to inform the life sciences community of
the details needed to properly utilize the given data set. This type of “resource publication” has
long been done by Nucleic Acids Research in its annual Database issue. Nanopubs and
Micropubs are two newer publication avenues that could serve to quickly and accurately share
information 21,22. There are also other forms of data publications including, for example, ISA
Tools and the Scientific Data journal 23,24.
It is worth noting that multi-omics data from a longitudinal study of a single individual
(e.g., the Smarr and Snyder datasets) in their entirety constitute essentially a whole new data
type. Supplied with detailed metadata, these data could become a part of a greater, well-documented
collage of data within a specific domain. Due to the large amount of data and the
complexity of data acquisition, it is exceedingly difficult to capture, disseminate, and interpret
the metadata. Generally, minimal reporting requirements are aimed at enabling replication of an
experiment, a concept that is not easily applied to longitudinal personal omics studies.
Reuse of data can instead be enabled with more succinct and concise reporting.
The checklist we propose therefore has a simple structure covering four concise sections:
experiment information, experimental design, experimental methods, and data processing (Table 1). The
experiment information section includes details of the lab, funding sources, data identification,
and a brief abstract explaining why the experiment was done. The experimental design section
captures high-level information about the experiment and its statistical design, including
sample selection, replication, and randomization. The experimental methods section contains
details about instrumentation and sample preparation. The data processing section captures
information about the methods and tools used in experimental data processing and data analysis.
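As an illustration only, the four-section structure could be held in a small machine-readable template. The sketch below is in Python; the section and field names are taken from Table 1, whereas the dictionary layout, the helper names `empty_checklist` and `missing_fields`, and the choice to treat every field as required are assumptions made for the example and are not part of the checklist itself.

```python
# Illustrative sketch: section and field names follow Table 1; the layout
# and helper names are assumptions, not part of the published checklist.
CHECKLIST_SECTIONS = {
    "Experiment Information": [
        "Lab Name", "Date", "Author Information", "Title of Experiment",
        "Project", "Funding", "Digital ID", "Abstract",
    ],
    "Experimental Design": [
        "Organism", "OMICS Type(s) Utilized", "Reference",
        "Experimental Design", "Experiment Focus", "Sample Description",
        "Tissue/Cell Type ID", "Localization ID", "Condition ID",
    ],
    "Experimental Methods": [
        "Sample Prep Description", "Platform Type", "Instrument Name",
        "Instrument Details", "Instrument Protocol",
    ],
    "Data Processing": [
        "Instrument Software Name",
        "Processing/Normalization Methods/Software",
        "Sequence/Annotation Database", "ID Method/Software",
        "ID/Expression measures", "Data Analysis Method/Software",
        "I/O Data File Formats", "Reserved/Extension",
    ],
}


def empty_checklist():
    """Return a blank checklist: every field present, every value empty."""
    return {
        section: {field: "" for field in fields}
        for section, fields in CHECKLIST_SECTIONS.items()
    }


def missing_fields(checklist):
    """List (section, field) pairs that are absent or blank.

    Assumption: every field is treated as required; a real deployment
    might mark some fields (e.g., Reserved/Extension) as optional.
    """
    missing = []
    for section, fields in CHECKLIST_SECTIONS.items():
        entries = checklist.get(section, {})
        for field in fields:
            if not str(entries.get(field, "")).strip():
                missing.append((section, field))
    return missing
```

A submission tool could call `missing_fields` before accepting a checklist and report the blank entries back to the researcher, grouped by section.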
The metadata captured by this checklist will serve as interlocking bridges for data
harmonization and therefore focus on details of the experimental design and subsequent
data analyses. In multi-omics studies, the researcher would fill out one checklist for each omics data
type measured. As test cases, two datasets from the integrative personal multi-omics study were
used 25. The proposed checklist integrates existing ontologies and standards in order to
standardize terminology and simplify data input. In its short structured form, the checklist
captures important experimental parameters and strikes a balance between comprehensiveness
and ease of use. As such, the checklist can serve as a guide to the design of omics studies.
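The one-checklist-per-data-type convention can be sketched as follows. This is a minimal Python illustration under stated assumptions: the field names come from Table 1, but the helper names `checklists_for_study` and `common_values`, the flat dictionary layout, and the placeholder values are hypothetical and do not describe the actual test datasets.

```python
from copy import deepcopy

# Hypothetical study-level values shared by every checklist in one study.
SHARED_FIELDS = {
    "Organism": "Human",
    "Condition ID": "Text: placeholder condition",
}


def checklists_for_study(shared, omics_types):
    """Create one checklist record per omics data type measured."""
    records = {}
    for omics_type in omics_types:
        record = deepcopy(shared)  # avoid aliasing between records
        record["OMICS Type(s) Utilized"] = omics_type
        records[omics_type] = record
    return records


def common_values(records, fields):
    """Return the fields whose values agree across all checklists.

    Fields that agree are candidate "bridges" for harmonizing the
    corresponding data sets; which fields to compare is an assumption.
    """
    agreed = {}
    for field in fields:
        values = {record.get(field) for record in records.values()}
        if len(values) == 1 and None not in values:
            agreed[field] = values.pop()
    return agreed


if __name__ == "__main__":
    study = checklists_for_study(SHARED_FIELDS, ["Proteomics", "Metabolomics"])
    print(common_values(study, ["Organism", "OMICS Type(s) Utilized"]))
```

In this sketch the organism agrees across both checklists while the omics type differs, so only the former is reported as a shared field.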
Implementation of this checklist will enable efficient portability and meta-analysis of the
data, as well as transparent communication and greater reproducibility of omics studies. Yet the
checklist is just the first step towards full utilization of the data. Traditional publication avenues
and new data publications, for example, OMICS Journal of Integrative Biology, Journal of
Proteome Research, Big Data, eLife, and Scientific Data could test and adopt the format to
ensure that the crucial information needed to allow data to be harmonized for broader usage is
published 26-28. The quality of the metadata, and of the data they accompany, could be
assessed through community resources like PubMed Commons 29,30.
Data submissions to single-omics databases such as ArrayExpress and GEO
for transcriptomics, or PRIDE and ProteomeXchange for proteomics, would benefit from both
additional omics metadata within the given database and robust harmonization with other data
types in other databases 31-34. The checklist could also aid submissions to multi-omics databases,
data repositories, or data clouds. Examples include data clouds, such as the Open Science Data
Cloud, and data repositories, such as Dryad for raw data and MOPED for processed data 35-37.
When compatibility and sharing of data and metadata cease to be an issue, a deeper
understanding of cells, organisms, and their communities will ensue.
CONCLUSION
The proposed metadata checklist offers a much-needed and balanced approach to bring
about data harmonization across omics studies. This is accomplished while also maintaining the
flexibility needed to adapt to complex and ever-evolving study designs and omics application
contexts in the post-genomics era of the life sciences.
ACKNOWLEDGEMENTS
All authors of this publication are members of the Data-Enabled Life Sciences Alliance (DELSA
Global). This manuscript has been reviewed and endorsed by the Alliance. Research reported in
this publication was supported by Seattle Children’s Research Institute (SCRI), EMC², and Intel.
This support is very much appreciated. The content is solely the responsibility of the authors and
does not necessarily represent the official views of the Moore Foundation, SCRI, EMC², or Intel.
AUTHOR DISCLOSURE STATEMENT
No competing financial interests exist.
REFERENCES
1. Smarr L. Quantifying your body: A how-to guide from a systems biology perspective.
Biotechnology Journal 2012; 7:980-991.
2. Bowden M. The Measured Man. The Atlantic 2012; June 13.
www.theatlantic.com/magazine/print/2012/07/the-measured-man/309018/
3. Chen R, Mias G, Li-Pook-Than J, et al. Personal omics profiling reveals dynamic molecular
and medical phenotypes. Cell 2012; 148:1293–1307.
4. Mias G, Snyder M. Personal genomes, quantitative dynamic omics and personalized medicine.
Quantitative Biology 2013; 1:71–90.
5. Tenenbaum JD, Sansone SA, Haendel M. A sea of standards for omics data: sink or swim?
J Am Med Inform Assoc 2013; [accessed 2013-10-17].
http://jamia.bmj.com/content/early/2013/09/27/amiajnl-2013-002066.full.html.
6. Field D, Sansone SA, Collis A, et al. 'Omics Data Sharing. Science 2009; 326:234-236.
7. Editorial. On the table. Nature Genetics 2011; 43:1.
8. Taylor C, Field D, Sansone S, et al. Promoting coherent minimum reporting guidelines for
biological and biomedical investigations: the MIBBI project. Nat Biotechnol 2008; 26:889–896.
9. Kettner C, Field D, Sansone SA, et al. Meeting Report from the Second “Minimum
Information for Biological and Biomedical Investigations” (MIBBI) workshop. Stand Genomic
Sci 2010; 3:259–266.
10. Yilmaz P, Kottmann R, Field D, et al. Minimum information about a marker gene sequence
(MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat
Biotechnol 2011; 29:415–420.
11. Barrett T, Clark K, Gevorgyan R, et al. BioProject and BioSample databases at NCBI:
facilitating capture and organization of metadata. Nucleic Acids Res 2012; 40:D57–D63.
12. Editorial. Announcement: Reducing our irreproducibility. Nature 2013; 496:398.
13. Reporting Checklist For Life Sciences Articles. Nature Publishing Group, May-2013.
http://www.nature.com/authors/policies/checklist.pdf. [Accessed: 01-Aug-2013].
14. Kolker E, Altintas I, Bourne P, et al. In praise of open research measures. Nature 2013;
498:170.
15. Data-Enabled Life Sciences Alliance. 2013. www.delsaglobal.org.
16. Kolker E and Stewart E. Opinion: Data to Knowledge to Action: Introducing DELSA Global,
a community initiative to connect experts, share data, and democratize science. The Scientist,
April 18, 2012. [accessed 2013-10-23]
http://www.the-scientist.com/?articles.view/articleNo/31985/title/Opinion--Data-to-Knowledge-to-Action/
17. Kolker E, Stewart E, and Ozdemir V. Opportunities and challenges for the life sciences
community. OMICS 2012; 16:138–147.
18. Kolker E, Stewart E, and Ozdemir V. DELSA Global for “Big Data” and the Bioeconomy:
Catalyzing Collective Innovation. Industrial Biotechnology 2012; 8:176-178.
19. Stewart E, Smith T, De Souza A, et al. DELSA Workshop IV: Launching the Quantified
Human Initiative. Big Data 2013; 1:187-190.
20. Data-Enabled Life Sciences Alliance. Multi-Omics Metadata Checklist. 2013.
http://www.delsaglobal.org/news/publications/item/84-checklist
21. Nanopub. 2013. nanopub.org.
22. Micropub. 2013. micropub.org.
23. ISA tools. 2013. isa-tools.org.
24. Scientific Data. 2013. www.nature.com/scientificdata.
25. Snyder M, Mias G, Stanberry L, et al. Metadata checklist for the integrated personal omics
study: proteomics and metabolomics experiments; OMICS submitted.
26. OMICS Journal of Integrative Biology. 2013. www.liebertpub.com/OMI.
27. Journal of Proteome Research. 2013. pubs.acs.org/journal/jprobs.
28. eLife. 2013. elife.elifesciences.org.
29. PubMed commons. 2013. www.ncbi.nlm.nih.gov/pubmedcommons
30. Swartz A. Post-Publication Peer Review Mainstreamed. The Scientist [accessed 2013-10-23].
www.the-scientist.com/?articles.view/articleNo/37969/title/Post-Publication-Peer-Review-Mainstreamed/
31. ArrayExpress. 2013. www.ebi.ac.uk/arrayexpress
32. GEO. 2013. www.ncbi.nlm.nih.gov/geo/
33. PRIDE. 2013. www.ebi.ac.uk/pride
34. ProteomeXchange. 2013. proteomexchange.org.
35. Open Science Data Cloud. 2013. www.opensciencedatacloud.org
36. Dryad. 2013. datadryad.org.
37. MOPED. 2013. moped.proteinspire.org.
Table 1. Multi-omics metadata checklist
Checklist Version | Version 1.0 (2013)
Experiment Information | Description
Lab Name | Lab conducting the experiment
Date | Checklist submission date
Author Information | Name, organization, contacts
Title of Experiment | One-sentence description of the particular experiment
Project | Project name, ID, organization
Funding | Funding sources for the project
Digital ID | Multiple Digital IDs may be listed, such as those to GEO, MOPED, PRIDE, DOIs, etc.
Abstract | A short description of the experiment briefly stating the goals of the research and principal outcomes, if any (100 words or less)
Experimental Design | Description
Organism | e.g., Human, Mouse
OMICS Type(s) Utilized | e.g., Proteomics, Metabolomics
Reference | Published papers that utilize these data, their PMIDs, or other relevant IDs or links
Experimental Design | Design specifications, type of replication (biological, technical, time points), grouping of subjects, samples, or replicates, randomization, comparisons, other salient design attributes
Experiment Focus | Description of experiment goals and objectives
Sample Description | Description of samples
Tissue/Cell Type ID* | e.g., BRENDA
Localization ID | e.g., GO
Condition ID | DOID, Text
Experimental Methods | Description
Sample Prep Description | Description of steps taken, kits used
Platform Type | e.g., Microarray, LC-MS/MS, GC-MS, sequencing platform
Instrument Name | e.g., LTQ-Orbitrap, psi-MS ontology, HiSeq, IonTorrent, Chip name (Microarray)
Instrument Details | e.g., Ion source, mass analyzer
Instrument Protocol | e.g., Fragmentation method (CID, HCD, ETD), MS/MS scans per MS scan, sequencing cycles, paired ends, single reads, hybridization methods (microarray)
Data Processing | Description
Instrument Software Name | List of software used, including version
Processing/Normalization Methods/Software | Description of processing and normalization methods and software
Sequence/Annotation Database | Source, version or date
ID Method/Software | Name of search engine used, plus post-processing to ID molecules
ID/Expression measures | e.g., thresholds and cutoffs for ID, spectral counts, peak area, reads, log2 expression
Data Analysis Method/Software | Methods and software used for expression analysis, error estimation
I/O Data File Formats | List of file formats for raw and processed data (e.g., txt, xml, etc.), specifications and software tools to ensure readability
Reserved/Extension | Any additional information related to the experiment
*ID – Identification
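A completed checklist could be exchanged in a plain machine-readable form. The Python sketch below is one possible approach and not a format prescribed by the checklist: the choice of JSON, the helper names `abstract_within_limit` and `checklist_to_json`, and the sample record are assumptions. The only rule taken from Table 1 is the 100-word limit on the Abstract field.

```python
import json

MAX_ABSTRACT_WORDS = 100  # limit stated for the Abstract field in Table 1


def abstract_within_limit(abstract, limit=MAX_ABSTRACT_WORDS):
    """Check the abstract length by counting whitespace-separated words."""
    return len(abstract.split()) <= limit


def checklist_to_json(record):
    """Serialize a checklist record to JSON after checking the abstract.

    Assumption: the record is a flat dict keyed by Table 1 field names,
    with "Digital ID" holding a list because several IDs may be given.
    """
    if not abstract_within_limit(record.get("Abstract", "")):
        raise ValueError("Abstract exceeds %d words" % MAX_ABSTRACT_WORDS)
    return json.dumps(record, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    example = {
        "Lab Name": "Example Lab",
        "Digital ID": ["placeholder-id-1", "placeholder-id-2"],
        "Abstract": "A short placeholder abstract.",
    }
    print(checklist_to_json(example))
```

Storing the Digital ID field as a list mirrors the note in Table 1 that multiple identifiers may be listed for one experiment.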
Page 28 of 26
ScholarOne Support phone: 434-964-4100 email: ts.mcsupport@thomson.com
Mary Ann Liebert, Inc.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60

More Related Content

What's hot

Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017Warren Kibbe
 
Principles organization and_operation_of_a_dna_bank
Principles organization and_operation_of_a_dna_bankPrinciples organization and_operation_of_a_dna_bank
Principles organization and_operation_of_a_dna_bankEspirituanna
 
Clinical Genomics and Medicine
Clinical Genomics and MedicineClinical Genomics and Medicine
Clinical Genomics and MedicineWarren Kibbe
 
BioData West 2017 Brochure.PDF
BioData West 2017 Brochure.PDFBioData West 2017 Brochure.PDF
BioData West 2017 Brochure.PDFMichael Shackil
 
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...FAO
 
Dr. Nanyingi Technology Keynote
Dr. Nanyingi Technology KeynoteDr. Nanyingi Technology Keynote
Dr. Nanyingi Technology KeynoteNanyingi Mark
 
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic MedicineBioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicineiosrjce
 
Reg Sci Lecture Dec 2016
Reg Sci Lecture Dec 2016Reg Sci Lecture Dec 2016
Reg Sci Lecture Dec 2016Rick Silva
 
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & PhenotypingInsideScientific
 
Added Value of Open data sharing using examples from GenomeTrakr
Added Value of Open data sharing using examples from GenomeTrakrAdded Value of Open data sharing using examples from GenomeTrakr
Added Value of Open data sharing using examples from GenomeTrakrExternalEvents
 
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...ExternalEvents
 
헬스케어 빅데이터로 무엇을 할 수 있는가?
헬스케어 빅데이터로 무엇을 할 수 있는가?헬스케어 빅데이터로 무엇을 할 수 있는가?
헬스케어 빅데이터로 무엇을 할 수 있는가? Hyung Jin Choi
 
CV of Rong Chen
CV of Rong ChenCV of Rong Chen
CV of Rong ChenRong Chen
 

What's hot (20)

Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017Genomics and Computation in Precision Medicine March 2017
Genomics and Computation in Precision Medicine March 2017
 
Dr. Leroy Hood Lecuture on P4 Medicine
Dr. Leroy Hood Lecuture on P4 MedicineDr. Leroy Hood Lecuture on P4 Medicine
Dr. Leroy Hood Lecuture on P4 Medicine
 
Principles organization and_operation_of_a_dna_bank
Principles organization and_operation_of_a_dna_bankPrinciples organization and_operation_of_a_dna_bank
Principles organization and_operation_of_a_dna_bank
 
Clinical Genomics and Medicine
Clinical Genomics and MedicineClinical Genomics and Medicine
Clinical Genomics and Medicine
 
BioData West 2017 Brochure.PDF
BioData West 2017 Brochure.PDFBioData West 2017 Brochure.PDF
BioData West 2017 Brochure.PDF
 
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...
Whole Genome Sequencing and Food Safety: Potential relevance to the work of C...
 
Cco retroviruses_2013_art_slides
Cco  retroviruses_2013_art_slidesCco  retroviruses_2013_art_slides
Cco retroviruses_2013_art_slides
 
2013 03 genomic medicine slides
2013 03 genomic medicine slides2013 03 genomic medicine slides
2013 03 genomic medicine slides
 
Dr. Nanyingi Technology Keynote
Dr. Nanyingi Technology KeynoteDr. Nanyingi Technology Keynote
Dr. Nanyingi Technology Keynote
 
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic MedicineBioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
 
Dreyfuss.berkeley.2010
Dreyfuss.berkeley.2010Dreyfuss.berkeley.2010
Dreyfuss.berkeley.2010
 
Reg Sci Lecture Dec 2016
Reg Sci Lecture Dec 2016Reg Sci Lecture Dec 2016
Reg Sci Lecture Dec 2016
 
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping
24x7 Automated Behavior Tracking For Rodent Safety Pharmacology & Phenotyping
 
P4 Medicine May 2011
P4 Medicine May 2011P4 Medicine May 2011
P4 Medicine May 2011
 
Temprano trial nejm
Temprano trial nejmTemprano trial nejm
Temprano trial nejm
 
The Foundation of P4 Medicine
The Foundation of P4 MedicineThe Foundation of P4 Medicine
The Foundation of P4 Medicine
 
Added Value of Open data sharing using examples from GenomeTrakr
Added Value of Open data sharing using examples from GenomeTrakrAdded Value of Open data sharing using examples from GenomeTrakr
Added Value of Open data sharing using examples from GenomeTrakr
 
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...
Real-Time Genome Sequencing of Resistant Bacteria Provides Precision Infectio...
 
헬스케어 빅데이터로 무엇을 할 수 있는가?
헬스케어 빅데이터로 무엇을 할 수 있는가?헬스케어 빅데이터로 무엇을 할 수 있는가?
헬스케어 빅데이터로 무엇을 할 수 있는가?
 
CV of Rong Chen
CV of Rong ChenCV of Rong Chen
CV of Rong Chen
 

Viewers also liked

Developing the Reimbursement Story 2016-03-10
Developing the Reimbursement Story 2016-03-10Developing the Reimbursement Story 2016-03-10
Developing the Reimbursement Story 2016-03-10Lyssa Friedman
 
Hassan el meligy cv sc vc 20160212
Hassan el meligy cv sc vc 20160212Hassan el meligy cv sc vc 20160212
Hassan el meligy cv sc vc 20160212Beta-Research.org
 
Piel y anexos
Piel y anexos Piel y anexos
Piel y anexos IPN
 
Benchmarking for Co X
Benchmarking for Co XBenchmarking for Co X
Benchmarking for Co XA P Dharkar
 
FINAL 07.06.16 Proposal L Guihot
FINAL 07.06.16 Proposal  L GuihotFINAL 07.06.16 Proposal  L Guihot
FINAL 07.06.16 Proposal L GuihotLeanne Guihot
 
Rapport Huurders in BeeldRku
Rapport Huurders in BeeldRkuRapport Huurders in BeeldRku
Rapport Huurders in BeeldRkuRoelof Kuik
 
Disability Research
Disability ResearchDisability Research
Disability ResearchMir Mehboob
 
Konsep Asas Etnik, Budaya, Masyarakat, Perpaduan
Konsep Asas Etnik, Budaya, Masyarakat, PerpaduanKonsep Asas Etnik, Budaya, Masyarakat, Perpaduan
Konsep Asas Etnik, Budaya, Masyarakat, Perpaduankhatijah1889
 
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat Berintegrasi
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat BerintegrasiCTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat Berintegrasi
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat BerintegrasiMahyuddin Khalid
 

Viewers also liked (10)

Developing the Reimbursement Story 2016-03-10
Developing the Reimbursement Story 2016-03-10Developing the Reimbursement Story 2016-03-10
Developing the Reimbursement Story 2016-03-10
 
Hassan el meligy cv sc vc 20160212
Hassan el meligy cv sc vc 20160212Hassan el meligy cv sc vc 20160212
Hassan el meligy cv sc vc 20160212
 
Piel y anexos
Piel y anexos Piel y anexos
Piel y anexos
 
Benchmarking for Co X
Benchmarking for Co XBenchmarking for Co X
Benchmarking for Co X
 
FINAL 07.06.16 Proposal L Guihot
FINAL 07.06.16 Proposal  L GuihotFINAL 07.06.16 Proposal  L Guihot
FINAL 07.06.16 Proposal L Guihot
 
Rapport Huurders in BeeldRku
Rapport Huurders in BeeldRkuRapport Huurders in BeeldRku
Rapport Huurders in BeeldRku
 
Checklist
ChecklistChecklist
Checklist
 
Disability Research
Disability ResearchDisability Research
Disability Research
 
Konsep Asas Etnik, Budaya, Masyarakat, Perpaduan
Konsep Asas Etnik, Budaya, Masyarakat, PerpaduanKonsep Asas Etnik, Budaya, Masyarakat, Perpaduan
Konsep Asas Etnik, Budaya, Masyarakat, Perpaduan
 
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat Berintegrasi
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat BerintegrasiCTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat Berintegrasi
CTU555 Sejarah Malaysia - Hubungan Etnik ke arah Masyarakat Berintegrasi
 

Similar to BIG DATA paper

Genomics, Cellular Networks, Preventive Medicine, and Society
Genomics, Cellular Networks, Preventive Medicine, and SocietyGenomics, Cellular Networks, Preventive Medicine, and Society
Genomics, Cellular Networks, Preventive Medicine, and SocietyLarry Smarr
 
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...Larry Smarr
 
Standards for public health genomic epidemiology - Biocuration 2015
Standards for public health genomic epidemiology - Biocuration 2015Standards for public health genomic epidemiology - Biocuration 2015
Standards for public health genomic epidemiology - Biocuration 2015Melanie Courtot
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...The Hive
 
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...Golden Helix Inc
 
Pgd discussion challengesconcerns
Pgd discussion challengesconcernsPgd discussion challengesconcerns
Pgd discussion challengesconcernst7260678
 
Suggested guidelines for_immunohistochemical_techn
Suggested guidelines for_immunohistochemical_technSuggested guidelines for_immunohistochemical_techn
Suggested guidelines for_immunohistochemical_technEduardo J Kwiecien
 
DIYgenomics: An Open Platform for Democratizing the Genome
DIYgenomics: An Open Platform for Democratizing the GenomeDIYgenomics: An Open Platform for Democratizing the Genome
DIYgenomics: An Open Platform for Democratizing the GenomeMelanie Swan
 
The Learning Health System: Thinking and Acting Across Scales
The Learning Health System: Thinking and Acting Across ScalesThe Learning Health System: Thinking and Acting Across Scales
The Learning Health System: Thinking and Acting Across ScalesPhilip Payne
 

Similar to BIG DATA paper (20)

Genomics, Cellular Networks, Preventive Medicine, and Society
Genomics, Cellular Networks, Preventive Medicine, and SocietyGenomics, Cellular Networks, Preventive Medicine, and Society
Genomics, Cellular Networks, Preventive Medicine, and Society
 
2013 09 atul butte mahajani symposium
2013 09 atul butte mahajani symposium2013 09 atul butte mahajani symposium
2013 09 atul butte mahajani symposium
 
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So...
 
Standards for public health genomic epidemiology - Biocuration 2015
Standards for public health genomic epidemiology - Biocuration 2015Standards for public health genomic epidemiology - Biocuration 2015
Standards for public health genomic epidemiology - Biocuration 2015
 
Atul Butte's presentation at JGI March 2015
Atul Butte's presentation at JGI March 2015Atul Butte's presentation at JGI March 2015
Atul Butte's presentation at JGI March 2015
 
Atul Butte's presentation at ASHG 2014
Atul Butte's presentation at ASHG 2014Atul Butte's presentation at ASHG 2014
Atul Butte's presentation at ASHG 2014
 
2014 simr presentation
2014 simr presentation2014 simr presentation
2014 simr presentation
 
Atul Butte's presentation at LINCS 2013
Atul Butte's presentation at LINCS 2013Atul Butte's presentation at LINCS 2013
Atul Butte's presentation at LINCS 2013
 
Presentation given at UCSF Precision Medicine meeting 4/11/2015
Presentation given at UCSF Precision Medicine meeting 4/11/2015 Presentation given at UCSF Precision Medicine meeting 4/11/2015
Presentation given at UCSF Precision Medicine meeting 4/11/2015
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
 
JALANov2000
JALANov2000JALANov2000
JALANov2000
 
Atul Butte's AAPS keynote presentation 6/2015
Atul Butte's AAPS keynote presentation 6/2015Atul Butte's AAPS keynote presentation 6/2015
Atul Butte's AAPS keynote presentation 6/2015
 
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...
New Study Identifies High-Risk Variants Associated with Autism Spectrum Disor...
 
Pgd discussion challengesconcerns
Pgd discussion challengesconcernsPgd discussion challengesconcerns
Pgd discussion challengesconcerns
 
BIG DATA paper

  • 1. For Peer Review Only/Not for Distribution Big Data Towards more transparent and reproducible omics studies through a common metadata checklist and data publications Journal: Big Data Manuscript ID: BIG-2013-0039 Manuscript Type: Original Article Date Submitted by the Author: 05-Nov-2013 Complete List of Authors: Kolker, Eugene; Seattle Children's Research Institute, Bioinformatics and High-Throughput Analysis Laboratory; Seattle Children's, Predictive Analytics; Data-Enabled Life Sciences Alliance, Ozdemir, Vural; Gaziantep University, Office of the President; Data-Enabled Life Sciences Alliance, Martens, Lennart; VIB, Department of Medical Protein Research; Ghent University, Department of Biochemistry; Data-Enabled Life Sciences Alliance, Hancock, William; Northeastern University, Barnett Institute; Data-Enabled Life Sciences Alliance, Anderson, Gordon; Pacific Northwest National Laboratory, Fundamental & Computational Sciences Directorate; Data-Enabled Life Sciences Alliance, Anderson, Nathaniel; Seattle Children's Research Institute, Bioinformatics and High-Throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance, Aynacioglu, Sukru; Gaziantep University, Department of Pharmacology; Data-Enabled Life Sciences Alliance, Baranova, Ancha; George Mason University, School of Systems Biology; Data-Enabled Life Sciences Alliance, Campagna, Shawn; University of Tennessee, Knoxville, Department of Chemistry; Data-Enabled Life Sciences Alliance, Chen, Rui; Stanford University, Department of Genetics; Data-Enabled Life Sciences Alliance, Choiniere, John; Seattle Children's Research Institute, Bioinformatics and High-Throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance, Dearth, Stephen; University of Tennessee, Knoxville, Department of Chemistry; Data-Enabled Life Sciences Alliance, Feng, Wu-Chun; Virginia Tech, SyNeRGy Laboratory; Virginia Tech, Department of Computer Science; Virginia Tech, Department of Electrical and Computer Engineering;
Data-Enabled Life Sciences Alliance, Ferguson, Lynnette; University of Auckland, Department of Nutrition; Data-Enabled Life Sciences Alliance, Fox, Geoffrey; Indiana University, School of Informatics and Computing; Data-Enabled Life Sciences Alliance, Frishman, Dmitrij; Technische Universität München; Data-Enabled Life Sciences Alliance, Grossman, Robert; University of Chicago, Institute for Genomics and ScholarOne Support phone: 434-964-4100 email: ts.mcsupport@thomson.com Mary Ann Liebert, Inc.
  • 2. Systems Biology; University of Chicago, Department of Medicine; Data-Enabled Life Sciences Alliance, Heath, Allison; University of Chicago, Institute for Genomics and Systems Biology; Data-Enabled Life Sciences Alliance, Higdon, Roger; Seattle Children's Research Institute, Bioinformatics and High-Throughput Analysis Laboratory; Seattle Children's, Predictive Analytics; Data-Enabled Life Sciences Alliance, Hutz, Mara; Federal University of Rio Grande do Sul, Departamento de Genetica; Data-Enabled Life Sciences Alliance, Janko, Imre; Seattle Children's Research Institute, High-Throughput Analysis Core; Data-Enabled Life Sciences Alliance, Jiang, Lihua; Stanford University, Department of Genetics; Data-Enabled Life Sciences Alliance, Joshi, Sanjay; EMC, Life Sciences; Data-Enabled Life Sciences Alliance, Kel, Alexander; GeneXplain GmbH; Data-Enabled Life Sciences Alliance, Kemnitz, Joseph; University of Wisconsin-Madison, Department of Cell and Regenerative Biology; University of Wisconsin-Madison, Wisconsin National Primate Research Center; Data-Enabled Life Sciences Alliance, Kohane, Isaac; Harvard Medical School, Department of Pediatrics; Harvard Medical School, Health Sciences and Technology; Harvard Medical School, Center for Biomedical Informatics; Data-Enabled Life Sciences Alliance, Kolker, Natali; Seattle Children’s Research Institute, High-throughput Analysis Core; Seattle Children’s Hospital, Predictive Analytics; Data-Enabled Life Sciences Alliance, Lancet, Doron; Weizmann Institute of Science, Department of Molecular Genetics; Data-Enabled Life Sciences Alliance, Lee, Elaine; Data-Enabled Life Sciences Alliance; Seattle Children's Research Institute, High-Throughput Analysis Core, Li, Weizhong; University of California, San Diego, Center for Research in Biological Systems; Data-Enabled Life Sciences Alliance, Lisitsa, Andrey; Russian Human Proteome Organization; Institute of Biomedical Chemistry;
Data-Enabled Life Sciences Alliance, Llerena, Adrian; Extremadura University Hospital and Medical School, Clinical Research Center; Data-Enabled Life Sciences Alliance, MacNealy-Koch, Courtney; Seattle Children's Research Institute, Bioinformatics and High-throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance (DELSA Global), Marshall, Jean-Claude; Catholic Health Initiatives, Center for Translational Research; Data-Enabled Life Sciences Alliance, Masuzzo, Paola; VIB, Department of Medical Protein Research; Ghent University, Department of Biochemistry; Data-Enabled Life Sciences Alliance, May, Amanda; University of Tennessee, Knoxville, Department of Chemistry; Data-Enabled Life Sciences Alliance, Mias, George; Stanford University, Department of Genetics; Data-Enabled Life Sciences Alliance, Monroe, Matthew; Pacific Northwest National Laboratory, Biological Sciences Division; Data-Enabled Life Sciences Alliance, Montague, Elizabeth; Seattle Children's Research Institute, Bioinformatics and High-Throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance, Mooney, Sean; The Buck Institute for Research on Aging; Data-Enabled Life Sciences Alliance, Nesvizhskii, Alexey; University of Michigan, Department of Pathology; University of Michigan, Department of Computational Medicine and Bioinformatics; Data-Enabled Life Sciences Alliance, Noronha, Santosh; Indian Institute of Technology Bombay, Department of Chemical Engineering; Data-Enabled Life Sciences Alliance, Omenn, Gilbert; University of Michigan, Department of Computational Medicine and Bioinformatics; University of Michigan, Department of Molecular Medicine and Genetics; University of Michigan, School of Public
  • 3. Health; Data-Enabled Life Sciences Alliance, Rajasimha, Harsha; Jeeva Informatics Solutions LLC; Data-Enabled Life Sciences Alliance, Ramamoorthy, Preveen; National Jewish Health, Department of Medicine; Data-Enabled Life Sciences Alliance, Sheehan, Jerry; University of California, San Diego, California Institute of Telecommunications and Information Technology; Data-Enabled Life Sciences Alliance, Smarr, Larry; University of California, San Diego, California Institute of Telecommunications and Information Technology; Data-Enabled Life Sciences Alliance, Smith, Charles; Seattle Children's Research Institute, Center for Developmental Therapeutics; Data-Enabled Life Sciences Alliance, Smith, Todd; Digital World Biology; Data-Enabled Life Sciences Alliance, Snyder, Michael; Stanford University, Department of Genetics; Stanford University, Center for Genomics and Personalized Medicine; Data-Enabled Life Sciences Alliance, Rapole, Srikanth; National Centre for Cell Science, Proteomics Laboratory; Data-Enabled Life Sciences Alliance, Srivastava, Sanjeeva; Indian Institute of Technology Bombay, Proteomics Laboratory; Data-Enabled Life Sciences Alliance, Stanberry, Larissa; Seattle Children's Research Institute, Bioinformatics and High-throughput Analysis Laboratory; Seattle Children’s Hospital, Predictive Analytics; Data-Enabled Life Sciences Alliance, Stewart, Elizabeth; Seattle Children's, Bioinformatics and High-Throughput Data Analysis Laboratory; Data-Enabled Life Sciences Alliance, Toppo, Stefano; University of Padova, Department of Molecular Medicine; Data-Enabled Life Sciences Alliance, Uetz, Peter; J Craig Venter Institute; Data-Enabled Life Sciences Alliance, Verheggen, Kenneth; VIB, Department of Medical Protein Research; Ghent University, Department of Biochemistry; Data-Enabled Life Sciences Alliance, Voy, Brynn; University of Tennessee, Knoxville, Department of Animal Science; Data-Enabled Life Sciences
Alliance, Warnich, Louise; University of Stellenbosch, Department of Genetics; Data-Enabled Life Sciences Alliance, Wilhelm, Steven; University of Tennessee, Knoxville, Department of Microbiology; Data-Enabled Life Sciences Alliance, Yandl, Gregory; Seattle Children's Research Institute, Bioinformatics and High-throughput Analysis Laboratory; Data-Enabled Life Sciences Alliance, Keywords: genomics, computational biology, data acquisition and cleaning
  • 4. Title: Towards more transparent and reproducible omics studies through a common metadata checklist and data publications Running Title: Metadata checklist for omics studies and data publications Keywords: Multi-disciplinary, multi-omics, metadata, checklist, guidelines, transparency, reproducibility, integrated analysis, data, reuse Authors: Eugene Kolker* Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children’s Research Institute, Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: eugene.kolker@seattlechildrens.org Phone: 206-884-7170 Vural Özdemir Office of the President, Gaziantep University, International Affairs and Global Development Strategy, and Faculty of Communications Üniversite Bulvarı P.K. 27310 Şehitkamil Gaziantep, Kilis Yolu, Turkey Email: vural.ozdemir@gantep.edu.tr Phone: 90-545-550-88-80 Lennart Martens Department of Medical Protein Research, VIB, Ghent, Belgium Department of Biochemistry, Ghent University, Ghent, Belgium Albert Baertsoenkaai 3 9000 GENT Email: lennart.martens@vib-ugent.be Phone: 32-9-264-93-58 William Hancock Barnett Institute, Department of Chemistry Northeastern University
  • 5. 341 Mugar Building Boston MA 02155 Email: wi.hancock@neu.edu Phone: 617-373-4881 Gordon Anderson Fundamental & Computational Sciences Directorate Pacific Northwest National Laboratory 902 Battelle Boulevard P.O. Box 999, MSIN K8-98 Richland, WA 99352 Email: Gordon@pnl.gov Phone: 509-371-6582 Nathaniel Anderson Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: nate.anderson@seattlechildrens.org Phone: 206-884-1230 Şükrü Aynacioglu Department of Pharmacology Gaziantep University University Boulevard, 27310 Sehitkamil Gaziantep, Turkey Email: aynacioglu@gantep.edu.tr Phone: 90-342-360-1023 Ancha Baranova School of Systems Biology George Mason University 10900 University Blvd, MSN 5B3 Manassas, VA 20110 Email: abaranov@gmu.edu Phone: 703-993-4293 Shawn R. Campagna Department of Chemistry University of Tennessee, Knoxville 618 Buehler Hall Knoxville, TN 37996-1600 Email: campagna@utk.edu Phone: 865-974-7337
  • 6. Rui Chen Department of Genetics Stanford University 300 Pasteur Drive, Alway M-308 Stanford, CA 94305-5120 Email: ruichens@stanford.edu Phone: 650-723-3277 John Choiniere Bioinformatics and High-Throughput Analysis Laboratory Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: john.choiniere@seattlechildrens.org Phone: 206-884-1251 Stephen P. Dearth Department of Chemistry University of Tennessee, Knoxville 1420 Circle Dr. Knoxville, TN 37996 Email: sdearth@ion.chem.utk.edu Phone: 865-974-3141 Wu-Chun Feng SyNeRGy Laboratory Departments of Computer Science & Electrical and Computer Engineering Virginia Tech 2202 Kraft Dr Blacksburg, VA 24060 Email: feng@cs.vt.edu Phone: 540-231-1192 Lynnette Ferguson Auckland Cancer Society Research Centre Department of Nutrition University of Auckland Private Bag 92019, Auckland Mail Centre Auckland 1142 New Zealand Email: l.ferguson@auckland.ac.nz Phone: 64-9-373-7599, ext. 86372 Geoffrey Fox School of Informatics and Computing Indiana University
  • 7. 815 E 10th Street, Room 105 Bloomington, IN 47408 Email: gcf@indiana.edu Phone: 812-856-7977 Dmitrij Frishman Technische Universität München, Wissenschaftszentrum Weihenstephan, Am Forum 1, 85354 Freising, Germany Email: d.frishman@wzw.tum.de Phone: 49-8161-712134 Robert Grossman Institute for Genomics and Systems Biology and Department of Medicine, University of Chicago 900 East 57th Street, Room 10142 Chicago, IL 60637 Email: robert.grossman@uchicago.edu Phone: 773-834-4669 Allison Heath Institute for Genomics and Systems Biology, University of Chicago Knapp Center for Biomedical Discovery 900 East 57th Street Chicago, IL 60637 Email: aheath@uchicago.edu Phone: 773 573-9560 Roger Higdon Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children’s Research Institute Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: roger.higdon@seattlechildrens.org Phone: 206-884-7172 Mara H. Hutz Departamento de Genética, Instituto de Biociências, Federal University of Rio Grande do Sul, Caixa Postal 15053, 91501-970 Porto Alegre, RS, Brazil. E-mail: mara.hutz@ufrgs.br
  • 8. Imre Janko High-Throughput Analysis Core, Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: imre.janko@seattlechildrens.org Phone: 206-884-1064 Lihua Jiang Department of Genetics Stanford University 300 Pasteur Drive, M-344A Stanford, California 94305 Email: lihuaj@stanford.edu Phone: 650-723-9914 Sanjay Joshi Life Sciences EMC2 Email: sanjay.joshi@emc.com Phone: 425-429-4727 Alexander Kel GeneXplain GmbH Am Exer 10 b D-38302 Wolfenbüttel Germany Email: alexander.kel@genexplain.com Phone: +49-05331-992200-13 Joseph W. Kemnitz Department of Cell and Regenerative Biology Wisconsin National Primate Research Center University of Wisconsin-Madison 1220 Capitol Court Madison, WI 53715-1299 Email: kemnitz@primate.wisc.edu Phone: 608-263-3588 Isaac S. Kohane Pediatrics and Health Sciences and Technology Children's Hospital and Harvard Medical School HMS Center for Biomedical Informatics Countway Library of Medicine 10 Shattuck Street
  • 9. Boston, MA 02115 Email: isaac_kohane@harvard.edu Phone: 617-432-2144 Natali Kolker High-Throughput Analysis Core, Seattle Children’s Research Institute Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: natali.kolker@seattlechildrens.org Phone: 206-884-7171 Doron Lancet Crown Human Genome Center Department of Molecular Genetics Weizmann Institute of Science Rehovot 76100, Israel Email: doron.lancet@weizmann.ac.il Phone: 972-8-934-3683 Elaine Lee High-throughput Analysis Core, Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: Elaine.lee@seattlechildrens.org Phone: 206-884-1250 Weizhong Li Center for Research in Biological Systems University of California, San Diego 9500 Gilman Drive MC 0446 Atkinson Hall, Room 3113 La Jolla, CA 92093-0446 Email: liwz@sdsc.edu Phone: 858-534 4143 Andrey Lisitsa Russian Human Proteome Organization (RHUPO) Institute of Biomedical Chemistry Moscow, 199121 Russia Pogodinskaya str., 10 Email: Lisitsa063@gmail.com Phone: 7-499-246-3731
  • 10. Adrián Llerena Clinical Research Center Extremadura University Hospital and Medical School Hospital Infanta Cristina Av de Elvas s/n 06080 Badajoz, Spain Email: allerena@unex.es Phone: +34924218040 Courtney MacNealy-Koch DELSA Global, Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: courtney.macnealykoch@seattlechildrens.org Phone: 206-884-1171 Jean-Claude Marshall Center for Translational Research Catholic Health Initiatives 7601 Osler Drive, Towson, MD 21204 Email: Jean-ClaudeMarshall@catholichealth.net Phone: 410.427.2587 Paola Masuzzo Department of Medical Protein Research VIB, Ghent, Belgium Department of Biochemistry Ghent University, Ghent, Belgium Albert Baertsoenkaai 3 9000 GENT Email: paola.masuzzo@vib-ugent.be Phone: 32-9-264-93-33 Amanda L. May Department of Chemistry UT Knoxville 1420 Circle Dr. Knoxville, TN 37996-1600 Email: amay@ion.chem.utk.edu Phone: 865-974-3141 George Mias Department of Genetics Stanford University 300 Pasteur Drive, Alway Building M308
  • 11. Stanford, CA 94305-5120 Email: george.mias@stanford.edu Phone: 650-723-3277 Matthew Monroe Biological Sciences Division Pacific Northwest National Laboratory 902 Battelle Boulevard P.O. Box 999 MSIN K8-98 Richland, WA 99352 Email: matthew.monroe@pnnl.gov Phone: 509-371-6580 Elizabeth Montague Bioinformatics and High-Throughput Analysis Lab, Seattle Children’s Research Institute Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: elizabeth.montague@seattlechildrens.org Phone: 206-884-1137 Sean Mooney The Buck Institute for Research on Aging 8001 Redwood Blvd. Novato, CA 94945 Email: smooney@buckinstitute.org Phone: 415-209-2038 Alexey Nesvizhskii Department of Pathology, Department of Computational Medicine and Bioinformatics University of Michigan 4237 Med Sci I, 1301 Catherine Road Ann Arbor, MI 48109-0602 Email: nesvi@med.umich.edu Phone: 734-764-3516 Santosh Noronha Department of Chemical Engineering Indian Institute of Technology Bombay Powai, Mumbai 400076, INDIA Email: noronha@iitb.ac.in Phone: 91-22-2576 7238
  • 12. Gilbert Omenn Center for Computational Medicine and Bioinformatics Departments of Molecular Medicine & Genetics and Human Genetics Department of Computational Medicine and Bioinformatics School of Public Health 100 Washtenaw Avenue 2017 Palmer Commons Ann Arbor, MI 48109-2218 Email: gomenn@umich.edu Phone: 734-763-7583 Harsha K. Rajasimha Jeeva Informatics Solutions LLC 7706 Majestic Way Derwood, MD USA Email: harsha@jeevadx.com Phone: 540-239-0465 Preveen Ramamoorthy Advanced Diagnostic Laboratories, Department of Medicine National Jewish Health 1400 Jackson St Denver, CO 80206 Email: RamamoorthyP@NJHealth.org Phone: 303-398-1501 Jerry Sheehan California Institute for Telecommunications and Information Technology University of California, San Diego 9500 Gilman Drive #0436 La Jolla, CA 92093-0436 Email: jsheehan@ucsd.edu Phone: 858-534-1723 Larry Smarr California Institute for Telecommunications and Information Technology University of California, San Diego 9500 Gilman Drive La Jolla, CA 92093-0436 Email: lsmarr@ucsd.edu Phone: 858-822-4284 Charles V. Smith Center for Developmental Therapeutics Seattle Children’s Research Institute 1900 Ninth Avenue
  • 13. Seattle, WA 98101 Email: charlesskip.smith@seattlechildrens.org Phone: 206-884-7453 Todd Smith Digital World Biology 2442 NW Market St. PMB 160 Seattle WA, 98107 Email: todd@digitalworldbiology.com Phone: 206-375-6843 Michael Snyder Department of Genetics Stanford Center for Genomics and Personalized Medicine Stanford University 300 Pasteur Drive, M-344A Stanford, California 94305 Email: mpsnyder@Stanford.edu Phone: 650-736-8099 Srikanth Rapole Proteomics Laboratory National Centre for Cell Science NCCS Complex, Pune University Campus Ganeshkhind, Maharashtra Pune- 411007, India Email: rsrikanth@nccs.res.in Phone: +91-20-25708075 Sanjeeva Srivastava Proteomics Laboratory Indian Institute of Technology Bombay Powai, Mumbai 400 076, India Email: sanjeeva@iitb.ac.in Phone: 022-2576-7779 Larissa Stanberry Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children’s Research Institute Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: larissa.stanberry@seattlechildrens.org Phone: 206-884-1059
  • 14. Elizabeth Stewart Bioinformatics and High-Throughput Analysis Laboratory Seattle Children’s Research Institute 1900 Ninth Avenue Seattle, WA 98101 Email: elizabeth.stewart@seattlechildrens.org Phone: 206-884-7176 Stefano Toppo Department of Molecular Medicine University of Padova Via U. Bassi 58/b I-35131 Padova, Italy Email: stefano.toppo@unipd.it Phone: 39-049-8276958 Peter Uetz J Craig Venter Institute (JCVI) 9704 Medical Center Drive Rockville, MD 20850 Email: uetz@jcvi.org Phone: 301-795-7589 Kenneth Verheggen Department of Medical Protein Research VIB, Ghent, Belgium Department of Biochemistry Ghent University, Ghent, Belgium Albert Baertsoenkaai 3 9000 GENT Email: kenneth.verheggen@ugent.be Phone: 32-9-264-93-58 Brynn H. Voy Department of Animal Science UT Institute of Agriculture 2506 River Drive Knoxville, Tennessee 37996 Email: bhvoy@utk.edu Phone: 865-974-4729 Louise Warnich Department of Genetics Faculty of AgriSciences University of Stellenbosch Private Bag X1,
  • 15. Matieland, 7602, Stellenbosch, South Africa E-mail: lw@sun.ac.za Phone: 27-21-808-5888 Steven W. Wilhelm Department of Microbiology University of Tennessee, Knoxville M409 Walters Life Sciences Knoxville, Tennessee 37996-0845 Email: Wilhelm@utk.edu Phone: 865-974-0665 Gregory Yandl Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children’s Research Institute Predictive Analytics, Seattle Children’s 1900 Ninth Avenue Seattle, WA 98101 Email: gregory.yandl@seattlechildrens.org Phone: 206-884-3009 *Signifies corresponding author
  • 16. Abstract Biological processes are fundamentally driven by complex interactions between biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells, organisms, or their communities. With the advent of new post-genomics technologies, omics studies are becoming increasingly prevalent, yet the full impact of these studies can only be realized through data harmonization, sharing, meta-analysis, and integrated research. These essential steps require consistent generation, capture, and distribution of the metadata. To ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of life sciences studies, we propose a simple common omics metadata checklist. The proposed checklist is built on the rich ontologies and standards already in use by the life sciences community. The checklist will serve as a common denominator to guide experimental design, capture important parameters, and be used as a standard format for stand-alone data publications. This omics metadata checklist and data publications will create efficient linkages between omics data and knowledge-based life sciences innovation and, importantly, allow for appropriate attribution to data generators and infrastructure science builders in the post-genomics era. We ask that the life sciences community test the proposed omics metadata checklist and data publications and provide feedback for their use and improvement.
  • 17. A COMMON OMICS METADATA CHECKLIST PROPOSAL Modern life science technologies enable rapid and efficient acquisition of omics data. These data comprehensively measure multi-layered molecular networks and provide a snapshot of biological processes in a cell, organism, or their communities. Collected on the same sample at the same time, omics data provide information on the functioning of biomolecules and their interactions. Omics studies are essential for the systemic investigation of biological systems, an endeavor that is crucial to improve our ability to manage and cure diseases, identify drug targets, understand regulatory cascades, and predict ecosystem responses to environmental changes. Through the pioneering efforts of Drs. Smarr and Snyder 1-4, two powerful multi-omics human datasets were recently made available. Smarr's dataset includes a wide variety of molecular measures and clinical parameters meticulously collected and cataloged for years, while Snyder's integrative personal multi-omics study presents his personal genomics, transcriptomics, proteomics, metabolomics, and autoantibody profiles collected over a 14-month period. Both studies yielded unique physiological insights not previously possible, including early indications of vulnerabilities to specific diseases. In the near future, these kinds of personal omics studies will become routine and will inevitably result in vast and diverse volumes of omics data. Therefore, the scientific community must commit to a common format for publishing the design and analysis of these studies that will ensure the compatibility, reproducibility, and reuse of the resulting data.
  • 18. The use, integration, and reuse of data require accurate and comprehensive capture of the associated metadata, including details describing experimental design, sample acquisition and preparation, instrument protocols, and processing steps. The data and metadata must be captured together in a rigorous and consistent manner to allow the integration of data across omics experiments. The use of ontologies, naming conventions, and standards can increase the compatibility and usability of these diverse data. Fortunately, life sciences data have certain core similarities. However, these similarities are accompanied by differences among technology platforms such as transcriptomics, proteomics, and metabolomics, as well as among application contexts such as neuroscience and hematology. The differences are compounded by the multiplicity of standards within a field: transcriptomics alone has at least 15 standards potentially applicable to the data 5,6. Such complexities not only make reproducible, integrative, accurate, and comprehensive capture of data and metadata an intricate challenge that must be overcome, but also place an excessive burden on researchers trying to convey metadata 5,7. Pioneering attempts in this area were made in 2007, when the “Minimum Information about a Biomedical or Biological Investigation” project brought many of these efforts for the life sciences together into an umbrella organization: MIBBI 8,9. In MIBBI, each set of guidelines is developed by a working group concentrated in a specific field (for example, fMRI, or QTL and association studies). Through this approach, MIBBI aspires to capture all essential metadata and data that are necessary to replicate any given experiment within a field.
Also, the framework known as Minimum Information about any (x) Sequence (MIxS) expands the breadth of information available by integrating the individual genomics checklists developed by the Genomic Standards Consortium with environmental information [10]. In addition, the NIH's National Center
  • 19. for Biotechnology Information developed a format for cataloging information about samples, enabling further metadata availability [11]. While these frameworks are critical to the reuse of data, they do not fully take into account the interlocking aspects needed for harmonization of diverse omics data types.
    Recently, the Nature Publishing Group implemented a publication checklist that provides another example of an approach to improve the transparency and reproducibility of life sciences publications [12,13]. The checklist requires the researcher and/or corresponding author to enter specific information on experimental design, statistical analysis, and reagents. This checklist is endorsed by the Data-Enabled Life Sciences Alliance (DELSA Global) [14].
    The unveiling of the Nature publication initiative brought into focus the need for a complementary omics checklist that allows the capture and publication of critical metadata associated with omics data sets. To this end, life sciences researchers from DELSA Global [15-19] propose a single common omics metadata checklist as described below. By integrating DELSA researchers' collective experiences with omics guidelines and publication requirements, one simplified, yet informative and flexible checklist was created to capture the essential aspects of omics studies [20].
    Publication of a completed checklist will serve to inform the life sciences community of the details needed to properly utilize the given data set. This type of "resource publication" has long been done by Nucleic Acids Research in its annual Database issue. Nanopubs and Micropubs are two newer publication avenues that could serve to quickly and accurately share
  • 20. information [21,22]. There are also other forms of data publications including, for example, ISA Tools and the Scientific Data journal [23,24].
    It is worth noting that multi-omics data from a longitudinal study of a single individual (e.g., the Smarr and Snyder datasets) in their entirety constitute essentially a whole new data type. Supplied with detailed metadata, these data could become a part of a greater, well-documented collage of data within a specific domain. Due to the large amount of data and the complexity of the data acquisition, it is exceedingly difficult to capture, disseminate, and interpret the metadata. Generally, minimal reporting requirements are aimed at enabling replication of an experiment, a concept that is not easily applied to longitudinal personal omics studies. Reuse of data can be enabled with more succinct and concise reporting.
    The checklist we propose therefore has a simple structure covering four concise sections: experiment information, experimental design, experimental methods, and data processing. The experiment information section includes details of the lab, funding sources, data identification, and a brief abstract to address why the experiment was done. The experimental design section is meant to capture the high-level data about the experiment and its statistical design, including sample selection, replication, and randomization. The experimental methods section contains details about instrumentation and sample preparation. The data processing section captures information regarding methods and tools used in experimental data processing and data analysis (Table 1).
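    The four-section structure described above can be sketched as a small data structure. This is a minimal, hypothetical illustration in Python and not part of the proposed checklist: the class name `OmicsChecklist`, the attribute names, and the example values are all assumptions chosen to mirror the section names in the text.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class OmicsChecklist:
    """Hypothetical container with one dict per checklist section."""
    experiment_information: dict = field(default_factory=dict)
    experimental_design: dict = field(default_factory=dict)
    experimental_methods: dict = field(default_factory=dict)
    data_processing: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize all four sections to a JSON string for sharing."""
        return json.dumps(asdict(self), indent=2)


# Illustrative values only; they do not describe a real experiment.
example = OmicsChecklist(
    experiment_information={"lab": "Example Lab",
                            "abstract": "Why the experiment was done."},
    experimental_design={"replication": "biological",
                         "randomization": "yes"},
    experimental_methods={"instrumentation": "LC-MS/MS"},
    data_processing={"tools": ["example-tool 1.0"]},
)
print(example.to_json())
```

    Keeping each section as a free-form dictionary is one possible design choice; it leaves room for the platform-specific details that differ between omics types.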
  • 21. The metadata captured by this checklist will serve as interlocking bridges for data harmonization, and therefore they focus on details of the experimental design and subsequent data analyses. In multi-omics studies, the researcher would fill out a checklist for each omics data type measured. As test cases, two datasets of the integrative personal multi-omics study were used [25].
    The proposed checklist integrates existing ontologies and standards in order to standardize terminology and simplify data input. In its short structured form, the checklist captures important experimental parameters and strikes a balance between comprehensiveness and ease of use. As such, the checklist can serve as a guide to the design of omics studies. Implementation of this checklist will enable efficient portability and meta-analysis of the data, as well as transparent communication and greater reproducibility of omics studies.
    Yet the checklist is just the first step towards full utilization of the data. Traditional publication avenues and new data publications, for example, OMICS Journal of Integrative Biology, Journal of Proteome Research, Big Data, eLife, and Scientific Data, could test and adopt the format to ensure that the crucial information needed to allow data to be harmonized for broader usage is published [26-28]. The assessment of the quality of the metadata and of the data they accompany could be done through community resources like PubMed Commons [29,30].
    Data submissions to single-omics databases, for example ArrayExpress and GEO for transcriptomics, or PRIDE and ProteomeXchange for proteomics, would benefit from both additional omics metadata within the given database and robust harmonization with other data types in other databases [31-34]. The checklist could also aid submissions to multi-omics databases, data repositories, or data clouds.
Examples include data clouds, such as the Open Science Data
  • 22. Cloud, and data repositories, such as Dryad for raw data, and MOPED for processed data [35-37]. When compatibility and sharing of data and metadata cease to be an issue, a deeper understanding of cells, organisms, and their communities will ensue.
    CONCLUSION
    The proposed metadata checklist offers a much-needed and balanced approach to bring about data harmonization across omics studies. This is accomplished while also maintaining the flexibility needed to adapt to complex and ever-evolving study designs and omics application contexts in the post-genomics era of the life sciences.
    ACKNOWLEDGEMENTS
    All authors of this publication are members of the Data-Enabled Life Sciences Alliance (DELSA Global). This manuscript has been reviewed and endorsed by the Alliance. Research reported in this publication was supported by Seattle Children's Research Institute (SCRI), EMC2, and Intel. This support is very much appreciated. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Moore Foundation, SCRI, EMC2, or Intel.
    AUTHOR DISCLOSURE STATEMENT
    No competing financial interests exist.
  • 23. REFERENCES
    1. Smarr L. Quantifying your body: A how-to guide from a systems biology perspective. Biotechnology Journal 2012; 7:980-991.
    2. Bowden M. The Measured Man. The Atlantic 2012; June 13. www.theatlantic.com/magazine/print/2012/07/the-measured-man/309018/
    3. Chen R, Mias G, Li-Pook-Than J, et al. Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 2012; 148:1293-1307.
    4. Mias G, Snyder M. Personal genomes, quantitative dynamic omics and personalized medicine. Quantitative Biology 2013; 1:71-90.
    5. Tenenbaum JD, Sansone SA, Haendel M. A sea of standards for omics data: sink or swim? J Am Med Inform Assoc 2013; [accessed 2013-10-17]. http://jamia.bmj.com/content/early/2013/09/27/amiajnl-2013-002066.full.html.
    6. Field D, Sansone SA, Collis A, et al. 'Omics Data Sharing. Science 2009; 326:234-236.
    7. Editorial. On the table. Nature Genetics 2011; 43:1.
    8. Taylor C, Field D, Sansone SA, et al. Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nat Biotechnol 2008; 26:889-896.
  • 24. 9. Kettner C, Field D, Sansone SA, et al. Meeting Report from the Second "Minimum Information for Biological and Biomedical Investigations" (MIBBI) workshop. Stand Genomic Sci 2010; 3:259-266.
    10. Yilmaz P, Kottmann R, Field D, et al. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat Biotechnol 2011; 29:415-420.
    11. Barrett T, Clark K, Gevorgyan R, et al. BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata. Nucleic Acids Res 2012; 40:D57-D63.
    12. Editorial. Announcement: Reducing our irreproducibility. Nature 2013; 496:398.
    13. Reporting Checklist For Life Sciences Articles. Nature Publishing Group, May 2013. http://www.nature.com/authors/policies/checklist.pdf. [accessed 2013-08-01].
    14. Kolker E, Altintas I, Bourne P, et al. In praise of open research measures. Nature 2013; 498:170.
    15. Data-Enabled Life Sciences Alliance. 2013. www.delsaglobal.org.
  • 25. 16. Kolker E, Stewart E. Opinion: Data to Knowledge to Action: Introducing DELSA Global, a community initiative to connect experts, share data, and democratize science. The Scientist, April 18, 2012. [accessed 2013-10-23]. http://www.the-scientist.com/?articles.view/articleNo/31985/title/Opinion--Data-to-Knowledge-to-Action/
    17. Kolker E, Stewart E, Ozdemir V. Opportunities and challenges for the life sciences community. OMICS 2012; 16:138-147.
    18. Kolker E, Stewart E, Ozdemir V. DELSA Global for "Big Data" and the Bioeconomy: Catalyzing Collective Innovation. Industrial Biotechnology 2012; 8:176-178.
    19. Stewart E, Smith T, De Souza A, et al. DELSA Workshop IV: Launching the Quantified Human Initiative. Big Data 2013; 1:187-190.
    20. Data-Enabled Life Sciences Alliance. Multi-Omics Metadata Checklist. 2013. http://www.delsaglobal.org/news/publications/item/84-checklist
    21. Nanopub. 2013. nanopub.org.
    22. Micropub. 2013. micropub.org.
    23. ISA tools. 2013. isa-tools.org.
  • 26. 24. Scientific Data. 2013. www.nature.com/scientificdata.
    25. Snyder M, Mias G, Stanberry L, et al. Metadata checklist for the integrated personal omics study: proteomics and metabolomics experiments. OMICS, submitted.
    26. OMICS Journal of Integrative Biology. 2013. www.liebertpub.com/OMI.
    27. Journal of Proteome Research. 2013. pubs.acs.org/journal/jprobs.
    28. eLife. 2013. elife.elifesciences.org.
    29. PubMed Commons. 2013. www.ncbi.nlm.nih.gov/pubmedcommons
    30. Swartz A. Post-Publication Peer Review Mainstreamed. The Scientist [accessed 2013-10-23]. www.the-scientist.com/?articles.view/articleNo/37969/title/Post-Publication-Peer-Review-Mainstreamed/
    31. ArrayExpress. 2013. www.ebi.ac.uk/arrayexpress
    32. GEO. 2013. www.ncbi.nlm.nih.gov/geo/
    33. PRIDE. 2013. www.ebi.ac.uk/pride
  • 27. 34. ProteomeXchange. 2013. proteomexchange.org.
    35. Open Science Data Cloud. 2013. www.opensciencedatacloud.org
    36. Dryad. 2013. datadryad.org.
    37. MOPED. 2013. moped.proteinspire.org.
  • 28. Table 1. Multi-omics metadata checklist
    Checklist Version: Version 1.0 (2013)
    Experiment Information (field: description)
      Lab Name: Lab conducting the experiment
      Date: Checklist submission date
      Author Information: Name, organization, contacts
      Title of Experiment: One-sentence description of the particular experiment
      Project: Project name, ID, organization
      Funding: Funding sources for the project
      Digital ID: Multiple Digital IDs may be listed, such as those to GEO, MOPED, PRIDE, DOIs, etc.
      Abstract: A short description of the experiment briefly stating the goals of the research and principal outcomes, if any (100 words or less)
    Experimental Design (field: description)
      Organism: e.g., Human, Mouse
      OMICS Type(s) Utilized: e.g., Proteomics, Metabolomics
      Reference: Published papers that utilize these data, their PMIDs, or other relevant IDs or links
      Experimental Design: Design specifications, type of replication (biological, technical, time points), grouping of subjects, samples or replicates, randomization, comparisons, other salient design attributes
      Experiment Focus: Description of experiment goals and objectives
      Sample Description: Description of samples
      Tissue/Cell Type ID*: e.g., BRENDA
      Localization ID: e.g., GO
      Condition ID: DOID, Text
    Experimental Methods (field: description)
      Sample Prep Description: Description of steps taken, kits used
      Platform Type: e.g., Microarray, LC-MS/MS, GC-MS, sequencing platform
      Instrument Name: e.g., LTQ-Orbitrap, psi-MS ontology, HiSeq, IonTorrent, Chip name (Microarray)
      Instrument Details: e.g., Ion source, mass analyzer
      Instrument Protocol: e.g., Fragmentation method (CID, HCD, ETD), MS/MS scans per MS scan, sequencing cycles, paired ends, single reads, hybridization methods (microarray)
    Data Processing (field: description)
  • 29.   Instrument Software Name: List of software used, including version
      Processing/Normalization Methods/Software: Description of processing and normalization methods and software
      Sequence/Annotation Database: Source, version or date
      ID Method/Software: Name of search engine used + post-processing to ID molecules
      ID/Expression Measures: e.g., thresholds and cutoffs for ID, spectral counts, peak area, reads, log2 expression
      Data Analysis Method/Software: Methods and software used for expression analysis, error estimation
      I/O Data File Formats: List of file formats for raw and processed data (e.g., txt, xml, etc.), specifications and software tools to ensure readability
      Reserved/Extension: Any additional information related to the experiment
    *ID: Identification
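    As a rough illustration of how the fields of Table 1 might be used in practice, the sketch below lists the field names per section and reports which ones are still unfilled in a partially completed checklist. The field names are taken from Table 1; the dictionary layout, the helper `missing_fields`, the example values, and the decision to treat Reserved/Extension as optional are assumptions made for this example only.

```python
# Field names follow Table 1; the layout and the helper below are
# illustrative assumptions, not part of the proposed checklist.
# "Reserved/Extension" is left out because it is treated here as
# optional free text (an assumption).
CHECKLIST_FIELDS = {
    "Experiment Information": [
        "Lab Name", "Date", "Author Information", "Title of Experiment",
        "Project", "Funding", "Digital ID", "Abstract",
    ],
    "Experimental Design": [
        "Organism", "OMICS Type(s) Utilized", "Reference",
        "Experimental Design", "Experiment Focus", "Sample Description",
        "Tissue/Cell Type ID", "Localization ID", "Condition ID",
    ],
    "Experimental Methods": [
        "Sample Prep Description", "Platform Type", "Instrument Name",
        "Instrument Details", "Instrument Protocol",
    ],
    "Data Processing": [
        "Instrument Software Name",
        "Processing/Normalization Methods/Software",
        "Sequence/Annotation Database",
        "ID Method/Software",
        "ID/Expression Measures",
        "Data Analysis Method/Software",
        "I/O Data File Formats",
    ],
}


def missing_fields(checklist):
    """Return (section, field) pairs that are absent or blank."""
    missing = []
    for section, names in CHECKLIST_FIELDS.items():
        filled = checklist.get(section, {})
        for name in names:
            value = filled.get(name)
            if value is None or not str(value).strip():
                missing.append((section, name))
    return missing


# Illustrative, partially completed checklist (made-up values).
partial = {
    "Experiment Information": {"Lab Name": "Example Lab",
                               "Date": "2013-11-05"},
    "Experimental Design": {"Organism": "Human"},
}
gaps = missing_fields(partial)
print(len(gaps), "fields still to fill")
```

    In a multi-omics study, one such dictionary would be filled and checked per omics data type measured.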