Cognitive Engine: Boosting Scientific Discovery
1. Scalable Software Systems Laboratory
Department of Electrical and Computer Engineering
CognitiveEngine: Boosting Scientific Discovery
Xiaolin Andy Li
http://www.andyli.ece.ufl.edu
2. Scalable Software Systems Laboratory
Information Technology
[Timeline figure, 1939 to the "New Age". Axis marks: 1939, 1946, 1970, 1980, 1990. Milestones: ABC (John Atanasoff, BSEE@UF, 1925); ENIAC; ARPANET; The Internet (Vint Cerf, Bob Kahn); Fiber Optics (Charles Kuen Kao); WWW (Tim Berners-Lee); Mosaic Web Browser (Marc Andreessen and Eric Bina); mobile phones (Martin Cooper, 1973; Steve Jobs, 2007); 1G, 1980s; 2G, 1990s; 3G, 2000s; 4G, 2010s.]
3. Scalable Software Systems Laboratory
Cloud Computing
• SaaS: Software as a Service (Salesforce, 1999)
• StaaS: Storage as a Service (Amazon S3, 2006; Dropbox, 2008)
• PaaS: Platform as a Service (Google App Engine, 2008; Microsoft Azure, 2010; Docker, 2013; IBM Bluemix, 2014)
• IaaS: Infrastructure as a Service (Amazon AWS, 2002; Eucalyptus, 2008; Rackspace/NASA OpenStack, 2010; Google Compute Engine, 2012)
2000
4. Scalable Software Systems Laboratory
SDN: Software-Defined Networking
Nick McKeown, Scott Shenker, Martin Casado
2009
6. Scalable Software Systems Laboratory
Geoffrey Hinton, Yann LeCun, Yoshua Bengio, Andrew Ng, Demis Hassabis
2013
7. Scalable Software Systems Laboratory
2D IT Booming Cycles
1970 → 1990 → 2010 → 2030 →
IT Boom V1, IT Boom V2, IT Boom V3, IT Boom V4
3D Computing Platform Cycles
1950 → 1980 → 2010 → 2040
1st Platform, 2nd Platform, 3rd Platform, 4th Platform
Towards Intelligent Platform
8. Scalable Software Systems Laboratory
Time for Change
Current Unified Big Systems
Hadoop
OpenStack
Torque
Pig
Dryad
Pregel
Percolator
CIEL
Container
Virtual Machine
Bare Metal
9. Scalable Software Systems Laboratory
GatorCloud
- Towards Software-Defined Ecosystems
[Diagram: GatorCloud architecture. Software-Defined Computing (SDC): Apps, Runtime, Big Data, PBS/Torque, Virtual Machine, Container, Nova Controller, HPC, Program Models. Software-Defined Networking (SDN): Apps, Low Latency, High Throughput, SDN Hypervisor, OVS, OF-Config, OpenFlow, GENI, SDN Controller.]
10. Scalable Software Systems Laboratory
GatorCloud Network Topology
[Figure: GatorCloud campus network topology. Labeled sites and labs: UF Physics CMS/OSG Data Center, SSRB, CNS Lab, NEB, S3Lab, CISE Lab, Larsen, HCS Lab, ECDC, Campus Datacenter, HPC Center - ES, HPC Center - Phy 2, HPC Center - Eng. Labeled components: GatorVisor, Golfer, Apps Controller, Nets Controller, Hybrid Controller, Cloud Orange, Cloud Green, VM Cloud, Data Cloud, Cloud Portal. Links are marked 10G, 40G, and 100G. External uplinks: 2*10Gb/s upgraded to 2*100Gb/s, to National Lambda Rail, Internet2, and GENI (via Jacksonville) over FLR.]
Deployed in 2012, one of the first 100 Gbps SDN campus research networks in the USA
Legend: SDN Switch; Phase 1 SDN, 40G/10G; Phase 2 SDN, 100G; SDN Control Plane
11. Scalable Software Systems Laboratory
HiPerGator Supercomputer
Ranking on the TOP500 supercomputer list:
#4 among public universities in the US
#8 among universities in the US
#115 among all machines listed
Major Data Centers at UF
HiPerGator Supercomputer
CMS/OSG Physics
HPC Centers
ICBR: Interdisciplinary Center for Biotech Research
CTSI: Clinical and Translational Science Institute
ACIS/CAC Data Center
CHREC Data Center (Novo-G)
NEB Data Center
12. Scalable Software Systems Laboratory
What Changed?
Slide credit: Lecture 1, Fei-Fei Li & Andrej Karpathy & Justin Johnson, 4-Jan-16

[Figure: ImageNet challenge winning systems by year. Legend: Convolution, Pooling, Softmax, Other.]
Year 2010: NEC-UIUC [Lin CVPR 2011]. Dense grid descriptor: HOG, LBP; Coding: local coordinate, super-vector; Pooling, SPM; Linear SVM
Year 2012: SuperVision [Krizhevsky NIPS 2012]
Year 2014: GoogLeNet [Szegedy arxiv 2014], VGG [Simonyan arxiv 2014]
Year 2015: MSRA

Revolution of Depth: engines of visual recognition
PASCAL VOC 2007 Object Detection mAP (%)
HOG, DPM (shallow): 34
AlexNet (RCNN, 8 layers): 58
VGG (RCNN, 16 layers): 66
ResNet (Faster RCNN, 101 layers)*: 86
*w/ other improvements

Revolution of Depth
ImageNet Classification top-5 error (%)
ILSVRC'10 (shallow): 28.2
ILSVRC'11 (shallow): 25.8
ILSVRC'12 AlexNet (8 layers): 16.4
ILSVRC'13 (8 layers): 11.7
ILSVRC'14 VGG (19 layers): 7.3
ILSVRC'14 GoogleNet (22 layers): 6.7
ILSVRC'15 ResNet (152 layers): 3.57
Annotation: Beyond Human
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. arXiv 2015.
13. Scalable Software Systems Laboratory
CognitiveEngine: Beyond Hadoop and Spark
• Bulk Synchronous Parallel (BSP)
• Both a blessing and a curse
• Easy to schedule and arrange dependencies
• All synchronized
[Diagram: a Map phase and a Reduce phase, shown next to a sequence of Stages separated by synchronization points.]
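The bulk synchronous pattern on this slide can be sketched with plain Python threads. This is an illustration only, not CognitiveEngine, Hadoop, or Spark code; the worker count, the stage count, and the name `run_bsp` are assumptions made for the example. Every worker must reach the barrier before any worker starts the next stage, which is what makes scheduling simple and also what forces fast workers to wait for slow ones.

```python
import threading

NUM_WORKERS = 4  # assumed for illustration
NUM_STAGES = 3   # assumed for illustration


def run_bsp(num_workers=NUM_WORKERS, num_stages=NUM_STAGES):
    """Run a toy BSP job; return (stage, worker_id) pairs in completion order."""
    barrier = threading.Barrier(num_workers)
    log = []
    log_lock = threading.Lock()

    def worker(worker_id):
        for stage in range(num_stages):
            # Local computation for this stage (stand-in for map or reduce work).
            with log_lock:
                log.append((stage, worker_id))
            # Global synchronization: nobody proceeds until everyone arrives.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    finished = run_bsp()
    # Stages never interleave: every stage-0 entry precedes every stage-1 entry.
    print([stage for stage, _ in finished])
```

Because each worker records its step before waiting at the barrier, the stage numbers in the log are always non-decreasing, whatever order the threads happen to run in.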
14. Scalable Software Systems Laboratory
ADD Design Choices
• Asynchronous Distributed Datasets (ADD)
• Inherits the easy-to-use programming interface
• Differentiates static data (samples) from iteratively updated data (parameters)
• Automatic asynchronous updates, with a user-specified bound
• Asynchronous-aware scheduling
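One common way to put a user-specified bound on asynchronous updates is a staleness bound in the style of stale synchronous parallel execution. The slides do not say that ADD uses this exact rule, so the sketch below is an assumption for illustration; the class `StalenessClock` and its method names are hypothetical.

```python
class StalenessClock:
    """Tracks completed iterations per worker and enforces a staleness bound."""

    def __init__(self, num_workers, bound):
        if bound < 0:
            raise ValueError("bound must be non-negative")
        self.clocks = [0] * num_workers  # completed iterations per worker
        self.bound = bound               # user-specified bound (assumed semantics)

    def can_start_next(self, worker_id):
        # A worker may start a new iteration only while it is at most `bound`
        # completed iterations ahead of the slowest worker.
        return self.clocks[worker_id] - min(self.clocks) <= self.bound

    def finish_iteration(self, worker_id):
        self.clocks[worker_id] += 1


if __name__ == "__main__":
    clock = StalenessClock(num_workers=2, bound=1)
    steps = 0
    # Worker 0 runs ahead while worker 1 is stalled.
    while clock.can_start_next(0):
        clock.finish_iteration(0)
        steps += 1
    print("worker 0 ran", steps, "iterations before having to wait")
```

With `bound=0` a worker can only start an iteration when it is level with the slowest worker, which reproduces fully synchronized execution; larger bounds trade fresher parameters for less waiting.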
15. Scalable Software Systems Laboratory
ADD System
[Diagram: several ADD Clients, each with its own training samples and an ADD local copy, run Feed Forward + Back Propagation and exchange updates with ADD Servers through Async push and Async pull.]
ADD features
• Async push and pull of model updates
• Users can specify the condition for returning from pull/push, so they do not have to wait
• Adaptive model update method: all-to-one, tree aggregation, or P2P approximate update
• User-controllable tradeoff between asynchrony and convergence rate
• Model snapshot and sharing
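Two of the update methods named in the feature list, all-to-one and tree aggregation, can be contrasted in a few lines. This is a sketch under assumptions: updates are modeled as plain lists of numbers, and the function names are hypothetical rather than part of any ADD interface. Both functions produce the same sum; the tree version combines updates pairwise, so the number of sequential rounds grows with the logarithm of the number of clients instead of linearly.

```python
def add_vectors(a, b):
    """Element-wise sum of two equal-length update vectors."""
    return [x + y for x, y in zip(a, b)]


def all_to_one(updates):
    """Every client sends its update to one server, which sums them in sequence."""
    if not updates:
        raise ValueError("need at least one update")
    total = list(updates[0])
    for update in updates[1:]:
        total = add_vectors(total, update)
    return total


def tree_aggregate(updates):
    """Sum updates pairwise in rounds; return (sum, number_of_rounds)."""
    if not updates:
        raise ValueError("need at least one update")
    level = [list(u) for u in updates]
    rounds = 0
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(add_vectors(level[i], level[i + 1]))
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd one out moves up unchanged
        level = next_level
        rounds += 1
    return level[0], rounds


if __name__ == "__main__":
    client_updates = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
    print(all_to_one(client_updates))
    print(tree_aggregate(client_updates))
```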
16. Scalable Software Systems Laboratory
Execution
[Diagram: ADD execution. Labels: ADD Partition, Static Data, Dynamic Data, Handler Function, State, three ADD Tasks, and scheduling by Locality, Iteration, etc. Task steps: Fetch, Compute, Update, Bookkeeping.]
17. Scalable Software Systems Laboratory
Advantages
• Asynchronous update
• IO/CPU overlap
• Fault tolerant
• Derived from, and lives with, a state-of-the-art system: Spark
• Sharing among jobs and users
• Maximizing parallelism of GPUs
18. Scalable Software Systems Laboratory
DeepApps
• DeepScience
• DeepSky
• DeepDefense
• DeepHealth
• DeepBipolar
• DeepVital
• DeepGuard
• DeepCancer
• DeepBot/Dingding
• DeepDrug
20. Scalable Software Systems Laboratory
The animation shows how Kepler detects planets. As the
planet passes between the host star and the spacecraft,
the observed star brightness decreases slightly, signaling
the potential detection of a planet.
Kepler looked at over 150,000 stars continuously for four
years in the constellations Cygnus and Lyra, seeking to
record the slight periodic brightness changes in stars that
could reveal the presence of planets.
Kepler detects planets by taking a photometric measurement
of the stars in its field of view every 30 minutes. A planet
transit will show as a small periodic dip in the “light curve” of
a star over time.
Kepler Data
Goal: Detect planet(s) currently missed by the
Kepler Team’s automatic search programs --
likely “super-Earths” with long periods
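The transit signal described on this slide, a small periodic dip in a light curve sampled every 30 minutes, can be illustrated with a toy phase-folding search. This is a sketch under assumptions and not the Kepler team's pipeline or the DeepSky method: the light curve is synthetic, and the period, depth, duration, noise level, and all function names are made up for the example.

```python
import random

CADENCE_DAYS = 0.5 / 24  # one photometric sample every 30 minutes, as on the slide


def simulate_light_curve(num_days, period_days, depth, duration_days, noise=0.0, seed=0):
    """Synthetic star brightness: 1.0, minus `depth` during each transit, plus noise."""
    rng = random.Random(seed)
    n = round(num_days / CADENCE_DAYS)
    times = [i * CADENCE_DAYS for i in range(n)]
    flux = []
    for t in times:
        in_transit = (t % period_days) < duration_days
        flux.append(1.0 - (depth if in_transit else 0.0) + rng.gauss(0.0, noise))
    return times, flux


def folded_dip(times, flux, period_days, num_bins=50):
    """Fold at a trial period; return how far the faintest phase bin sits below the mean."""
    sums = [0.0] * num_bins
    counts = [0] * num_bins
    for t, f in zip(times, flux):
        b = min(int((t % period_days) / period_days * num_bins), num_bins - 1)
        sums[b] += f
        counts[b] += 1
    bin_means = [s / c for s, c in zip(sums, counts) if c > 0]
    return sum(flux) / len(flux) - min(bin_means)


def best_period(times, flux, trial_periods):
    """Pick the trial period whose folded light curve shows the deepest dip."""
    return max(trial_periods, key=lambda p: folded_dip(times, flux, p))


if __name__ == "__main__":
    times, flux = simulate_light_curve(90, 10.0, 0.001, 0.25, noise=0.0002, seed=1)
    print(best_period(times, flux, [3.0, 5.0, 7.0, 10.0, 13.0]))
```

Folding at the true period stacks every transit into the same phase bin, so the dip stands out; at a wrong period the transits are spread across bins and diluted. Long-period planets produce few transits in a fixed observing window, which leaves less to stack.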
21. Scalable Software Systems Laboratory
Quasar Spectra Pair Method
Identification of the 2175 Å bump is based on the Mg II absorber catalog, which imposes two limitations:
• The 2175 Å bump can only be identified in the redshift range from 0.7 to 2.5.
• The method depends on the Mg II absorber catalog. If that catalog is incomplete, the 2175 Å bump sample may also be incomplete.
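To make the redshift range concrete, the standard relation lambda_obs = lambda_rest * (1 + z) gives the observed wavelength of the bump at each end of the range. The slide does not state why the range is 0.7 to 2.5, so this sketch only shows where the feature lands in the observed frame; the function names are hypothetical.

```python
BUMP_REST_ANGSTROM = 2175.0  # rest wavelength of the bump named on the slide


def observed_wavelength(rest_angstrom, z):
    """Cosmological redshift relation: lambda_obs = lambda_rest * (1 + z)."""
    if z < 0:
        raise ValueError("redshift must be non-negative")
    return rest_angstrom * (1.0 + z)


def bump_window(z_min=0.7, z_max=2.5):
    """Observed-frame wavelengths of the bump at both ends of the redshift range."""
    return (
        observed_wavelength(BUMP_REST_ANGSTROM, z_min),
        observed_wavelength(BUMP_REST_ANGSTROM, z_max),
    )


if __name__ == "__main__":
    low, high = bump_window()
    print(f"bump centre observed between {low:.1f} and {high:.1f} Angstrom")
```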
22. Scalable Software Systems Laboratory
Analysis of the Effects
(a) Input data with bumps
(b) Filters of the first convolutional layer
(c) Feature map of last convolutional layer
23. Scalable Software Systems Laboratory
Reconstruction of Bumps
(d) Reconstructed input image with bump
(e) Reconstructed input image without bump
25. Scalable Software Systems Laboratory
DeepDefense Architecture
[Diagram: DeepDefense pipeline. Inputs DataSequence1 through DataSequence4 feed a row of CNN blocks (Spatial), followed by stacked rows of LSTM blocks (Temporal, Recurrent, Cascading LSTM) and a CTC layer, with BPTT and BPTS marked. Further stages: Feature Analysis, Ensemble Analysis, Knowledge Fusion, Performance Evaluation.]
BPTT: Backpropagation Through Time
BPTS: Backpropagation Through Space
CNN: Convolutional Neural Network
LSTM: Long Short-Term Memory
CTC: Connectionist Temporal Classification
Searchable Outputs
26. Scalable Software Systems Laboratory
Data-Driven DeepHealth
With Azra Bihorac, Lizi Wu, Parisa Rashidi, et al.
27. Scalable Software Systems Laboratory
Bipolar Disorder & Challenge Objectives
• Bipolar disorder is a brain disease that causes unusual mood shifts
• An estimated 51% of the affected population goes untreated in a given year
• Detection is not straightforward: symptoms and test metrics are similar to those of other brain diseases
• Recent studies point to heritability and genetic factors as causes, opening a new area of detection using genome data
• The CAGI challenge asks participants to predict bipolar disorder from exomes
• Exome sequencing data of 1000 samples, with 500 for training and 500 for the prediction challenge
Image source: http://www.nimh.nih.gov/health/statistics/prevalence/bipolar-disorder-among-adults.shtml
28. Scalable Software Systems Laboratory
Data Pre-Processing
• Extracted genotype information from the exomes
• The genotypes were 0/0, 0/1, 1/1, and ./.
• One-hot encoding transformation on the genotypes, i.e., 0/0 encoded as 0100, 0/1 encoded as 0010, etc.
• One-hot encoding treats all categorical values as equidistant
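The encoding step above can be sketched as follows. The codes for 0/0 and 0/1 come from the slide; the codes for ./. and 1/1 are assumptions that complete the pattern, and the function names are hypothetical. The Hamming distance check illustrates the equidistance point: any two distinct genotypes differ in exactly two positions.

```python
from itertools import combinations

GENOTYPE_CODES = {
    "./.": (1, 0, 0, 0),  # assumed: not given on the slide
    "0/0": (0, 1, 0, 0),  # from the slide
    "0/1": (0, 0, 1, 0),  # from the slide
    "1/1": (0, 0, 0, 1),  # assumed: not given on the slide
}


def encode_sample(genotypes):
    """Concatenate the one-hot codes of a sample's genotypes into one flat vector."""
    vector = []
    for genotype in genotypes:
        vector.extend(GENOTYPE_CODES[genotype])
    return vector


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(encode_sample(["0/0", "0/1", "./."]))
    for g1, g2 in combinations(GENOTYPE_CODES, 2):
        print(g1, g2, hamming(GENOTYPE_CODES[g1], GENOTYPE_CODES[g2]))
```

An integer coding such as 0, 1, 2 would instead place 0/0 closer to 0/1 than to 1/1, which is the ordering that one-hot encoding deliberately avoids.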
31. Scalable Software Systems Laboratory
DeepCloud: Towards Composable Intelligent Platform
[Diagram: GatorCloud, an SDN-enabled campus cloud. Platform labels: SDE App Store, SDE Controller, SDDC Hypervisor, GolfStore, Golfer, GolfVisor, CloudDashboard. Infrastructure: Gator, GENI, and Testbed Racks, with 100G links to Internet2/NLR and GENI. Users: Researchers, Scientists, Developers, Engineers, Admins. Services: IaaS, PaaS, SaaS, StaaS, CPSaaS, NaaS, HPCaaS, iBDaaS. Apps: Security Apps, Network Apps, BigData Apps, HPC Apps, Self-Protection. Major data centers at UF: HiPerGator Supercomputer; CMS/OSG Physics; HPC Centers; ICBR: Interdisciplinary Center for Biotech Research; CTSI: Clinical and Translational Science Institute; ACIS Data Center; NEB Data Center.]
32. Scalable Software Systems Laboratory
S3Lab Research Highlights
• Finest: Smartphone Indoor Location Ecosystem
• First: SDN-enabled Campus Cloud (GatorCloud)
• Fastest: Campus Research Network (100G)
• Fourth: DeepCloud Intelligent Platform
IMPACT
33. Scalable Software Systems Laboratory
NSF I/UCRC: Center for Big Learning (Pending)
[Diagram: Deep Learning, Big Systems, and Big Data converging on Intelligence.]
Member Benefits
• Leveraging world-class talent (about 40 professors and 200 graduate students) in the era of big learning, big data, and big systems.
• Realizing a 10:1 return on investment.
• Discovering top students in top universities.
• Joining peer members from high-profile companies and research units.
CBL Consortium: University of Florida (UF, South), Carnegie Mellon University (CMU, East), University of Missouri at Kansas City (UMKC, Central), University of Notre Dame (ND, North), University of Oregon (UO, West), and a large number of industrial partners.