6. Scientific IT Principles
• Better infrastructure than any non-profit life sciences research institution in the world
• None of our scientists should say:
– “I wish I was back at ______”
• We don’t lose data
• Minimize data movement
• 365x24x7 Availability (- 8 hrs/year downtime)
• Compute and storage cost recovery
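The availability target above can be turned into a percentage with a little arithmetic. The sketch below reads the slide's figure as roughly 8 hours of downtime per year and assumes a 365-day year; the function names are hypothetical and only illustrate the calculation.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year the service is up, given yearly downtime in hours."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


def downtime_budget(target_availability: float) -> float:
    """Hours of downtime per year allowed by a target availability fraction."""
    return (1.0 - target_availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # 8 hours of downtime per year is slightly better than "three nines".
    print(f"availability with 8 h/year down: {availability(8):.4%}")
    print(f"downtime budget at 99.9%: {downtime_budget(0.999):.2f} h/year")
```

Under these assumptions, 8 hours per year sits just inside a 99.9% target, which allows about 8.76 hours.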
7. Computational Environment
• Fiber-connected Data Centers: Janelia and HQ
• High Throughput Computing
– 3 million jobs scheduled per month
– Heterogeneous job mix, many interactive jobs
• Hundreds of Users
– Wide range of sophistication
• Trends
– Spark, MongoDB, Scality Object Store
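To put the job volume in perspective, the monthly figure can be converted to an average scheduling rate. This is a back-of-the-envelope sketch that assumes a 30-day month and an even spread of jobs, which a real interactive workload will not have; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def jobs_per_second(jobs_per_month: int, days_in_month: int = 30) -> float:
    """Average scheduling rate, assuming jobs arrive evenly over the month."""
    return jobs_per_month / (days_in_month * SECONDS_PER_DAY)


if __name__ == "__main__":
    rate = jobs_per_second(3_000_000)
    print(f"{rate:.2f} jobs/s on average")
```

The average is a little over one job per second; peaks during working hours are presumably much higher.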
9. Storage: Match storage class to use case
• General purpose, remote backup - Isilon - (2.5 PB)
– Home directories, most lab files
– Cluster accessible
• High performance scratch space - DDN GPFS (1 PB)
– Designed for larger (>40MB) files
– Cluster accessible
• Image Data Storage - Scality Object Store (1.4 PB)
– Replicated (offsite) and Unreplicated Rings (0.7PB each)
– Cluster accessible via API
• Cold Storage, no backup - Isilon (1 PB)
– low performance
– NOT cluster accessible
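The four storage classes above can be written down as a small lookup table, which makes the "match storage class to use case" idea concrete. This is only an illustrative sketch: the keys, field names, and helper functions are hypothetical, and only the capacities, access modes, and the 40 MB guideline come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    system: str
    capacity_pb: float
    cluster_access: str  # "direct", "api", or "none" (labels are assumptions)


# Keys are hypothetical use-case labels; values mirror the slide.
TIERS = {
    "general": StorageClass("Isilon", 2.5, "direct"),
    "scratch": StorageClass("DDN GPFS", 1.0, "direct"),
    "image": StorageClass("Scality Object Store", 1.4, "api"),
    "cold": StorageClass("Isilon (cold)", 1.0, "none"),
}


def pick_storage(use_case: str) -> StorageClass:
    """Return the storage class for a use case, or raise if it is unknown."""
    try:
        return TIERS[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None


def suits_scratch(file_size_mb: float) -> bool:
    """Scratch space is described as designed for files larger than 40 MB."""
    return file_size_mb > 40


def total_capacity_pb() -> float:
    """Sum of the capacities listed on the slide."""
    return sum(tier.capacity_pb for tier in TIERS.values())


if __name__ == "__main__":
    print(pick_storage("image"))
    print(f"total: {total_capacity_pb():.1f} PB")
```

Summing the listed capacities gives about 5.9 PB across all four classes.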
11. Scientific Computing Software
Sean Murphy Todd Safford Rob Svirskas
• Perl, Python
• Web-based GUI
• Image Processing
• Enterprise Java
• Relational and NoSQL DB
• 3D Graphics
• MatLab, C++, Java, Python
• Neuroscience domain
• Algorithms
12. Scientific Computing Software
Don’t be fooled!
Saul’s Presentation
Rest of Scientific Computing Activities
13. Software-Intensive Team Projects
• Mouse Light
– Mouse Projections and Connections
• Fly Light
– Optical Mapping of the Fly Nervous System
• FlyEM
– Electron Microscopic Reconstruction of the Drosophila Nervous System
15. Mouse Light
• 3D Mosaic Imaging of the entire mouse brain
• Custom built 2-photon microscope (N. Clack)
• ~20,000 tiles (0.3um x 0.3um x 1.0um)
• ~15 TB/channel
• Stitched/rendered into seamless 3D volume
Jayaram Chandrashekar (MouseLight)
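The tile count and per-channel volume above imply an average tile size, which is useful when sizing the stitching and rendering step. The sketch below assumes decimal units (1 TB = 10^12 bytes) and treats both slide figures as exact, although they are approximate; the function name is hypothetical.

```python
def tile_size_mb(total_tb: float, n_tiles: int) -> float:
    """Average size of one tile in MB, using decimal units (1 TB = 10**12 bytes)."""
    total_bytes = total_tb * 10**12
    return total_bytes / n_tiles / 10**6


if __name__ == "__main__":
    # ~15 TB per channel spread over ~20,000 tiles
    print(f"{tile_size_mb(15, 20_000):.0f} MB per tile per channel")
```

Under these assumptions each tile averages roughly 750 MB per channel.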
18. Tools for Confocal Imaging of Fly Brains
• Confocal imaging of Gal4 and Split-Gal4 Lines
– sparse labeling of neurons → few visible per image
• >100k samples imaged
– aligned to common template
• Janelia Workstation
– Tools to manage, browse and annotate
– View full spatial resolution 3D stack in 2s
S Murphy, T Safford, K Rokicki, C Bruns, Y Yu, L Foster, E Trautman, D Olbris, P Davies
19. Alignment Board for Separated Neurons
Key Benefits:
• Realizes the value of sample alignment
• Reveals structure/function relationships
• Gets max utility from collection of sparse samples
Credits:
• Inspiration: A Nern, T Wolff, Y Aso
• Conception: S Murphy
• Design and Implementation: L Foster
• Advanced 3D Consultant: C Bruns
22. Tools for EM Connectomics
• Two EM Projects on Fly Brain
– isotropic FIB-SEM (Harald Hess, Steve Plaza)
– anisotropic TEM (Davi Bock, Khaled Khairy)
• Multiple tools for tracing and proofreading
• Goal: mix and match computation and GUI tools
• Key: shared services-based storage
• Full rendered CNS 40-100TB
24. DVID Datatype Examples
Image tiles (for low-latency image browsing)
3D body volume (validation / modifying neuron shape)
Regions of interest (define important parts of the dataset)
Graph (define overlap between neurons)
(figure labels: L1, T4, Mi1, Tm3)
Segmentation (validation / modifying segmentation)
DVID: Bill Katz (FlyEM)