XCube StreamX®
	
  
	
  
The StreamX® streaming data management platform from XCube is an integrated, turnkey, end-to-end solution for huge streaming data.

Whether your data streams are combinations of video, Lidar, radar, or other sensors, you can find the right time segments of data in hundreds of petabytes of data files, then use those segments as input to your application and run it all in parallel.
  
The StreamX platform is designed to solve two of the biggest challenges when working with huge streaming datasets:

1. How to run an existing desktop application, such as a simulator, on lots of data many times faster than on a workstation or standard server.

2. How to find the specific time segments inside petabytes of streaming data files that you want to run the application on or further analyze.

The StreamX platform includes a distributed virtual file system, virtual machine technology, and a parallel execution engine that together enable any desktop application to run in parallel in a distributed cluster. Applications are sent into the cluster to where the data resides and then executed in parallel, all controlled by a simple user interface from any web browser. No parallel programming. No transferring or reformatting the data.

Working with huge streaming data sets has never been easier. The StreamX streaming data management platform from XCube can shave months off of your development, test, and validation.
Data Management Tools for Big Streaming Data
Capture, Store, Tag, Search, Simulate, and Automate in Parallel
without reprogramming applications or reformatting data

With the StreamX platform you can:

Automatically tag content in data streams
• Mine the data with user-defined detection algorithms
• Run the detection algorithms in parallel without parallel programming
• Tag detected scenes in the stream to enable content-based searches

Find and manage specific content out of petabytes stored globally
• Search in parallel for tagged content in 100s of petabytes stored globally
• Manage the search results to create independent sets of data for training, testing, and validation

Run desktop applications in parallel using search results as input
• Run desktop applications such as simulators in parallel without recoding
• Work with petabytes of data distributed globally without ever transmitting the data

Automate training and testing
• Automate AI training and regression testing
• Partition training and test data to maintain test integrity

Validate using re-simulated data
• Bring real-world streaming data into your simulators
• Automatically vary input parameters to create new simulated situations
• Use results from one simulation run as input to the next to provide active feedback

“With XCube’s StreamX systems, each of our GIDAS simulations runs overnight instead of a week.”
- Automotive OEM test engineer
The StreamX Platform: Powerful and Easy to Use
	
  
The StreamX streaming data management platform is powerful yet easy to use. After capturing the data, there are five main functions in the system, all of which can be performed remotely from any browser.
	
  
Capture. Capture the data stream using your own system or the StreamX Data Recorder. The StreamX data recorder can capture multi-sensor data at up to 4.4 GB/s, eliminating the need to have separate recorders for each sensor stream. Up to 256 TB of removable storage holds extended data collection sessions and can then be removed and shipped to the data center without having to disconnect the sensors.
Store. The incoming data needs to be incorporated into the distributed storage cluster and organized in a way that the StreamX tools can see the data. This is done with an import function that associates each of the streams in a multi-sensor dataset with a time-synchronized project. When using the StreamX data recorder, the data is already stored in projects and is immediately accessible.
  
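The brochure does not document the import function's actual interface. Purely as an illustration of the idea, associating per-sensor streams with a time-synchronized project could look like the following sketch, where the project structure, field names, and `import_dataset` helper are all hypothetical:

```python
# Hypothetical sketch of the import step: each sensor stream in a
# multi-sensor dataset is registered under one project so the tools can
# find them under a common time base. Names and structures are invented
# for illustration only.

def import_dataset(project, streams):
    """Associate each stream with the project, keyed by sensor name."""
    for s in streams:
        project["streams"][s["sensor"]] = {"file": s["file"], "t0": s["t0"]}
    # A shared time origin lets the tools align all streams to one clock
    project["t0"] = min(s["t0"] for s in streams)
    return project

project = {"name": "highway_run_042", "streams": {}}
streams = [{"sensor": "front_camera", "file": "cam0.dat",  "t0": 12.50},
           {"sensor": "lidar",        "file": "lidar.dat", "t0": 12.48}]

p = import_dataset(project, streams)
print(p["t0"])  # 12.48
```

The key point the sketch captures is that streams stay in their original files; the project only records where each stream lives and how to synchronize it.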
Tag. Once the data is recognized by the system, the next step is to tag or annotate the data based on the content important to the application. The user provides one or more image or pattern recognition algorithms in the form of a small application that will run on the data and indicate when the content of interest is detected in a particular data frame. The StreamX platform uses those mini-applications to crawl through the data distributed across the cluster nodes in the background, mining the data in parallel for interesting content. When the content is found, the system tags all the timestamps of the data file where the content is located. Timestamps can contain many different tags to indicate multiple types of content or multiple versions of the detection algorithm.
  
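The contract between a detection mini-application and the platform is not specified here. As an assumption-labeled sketch only, a tagger of this kind could look like the following, where the frame layout, the `detect_braking` rule, and `tag_stream` are all invented names:

```python
# Hypothetical sketch of a tagging mini-application: the platform would
# supply the frames and collect the emitted tags; here both sides are
# simulated with plain Python so the example is self-contained.

def detect_braking(frame):
    """User-defined detection algorithm (stand-in logic)."""
    return frame["decel_mps2"] > 6.0  # tag hard-braking scenes

def tag_stream(frames, detector):
    """Run the detector over a time-ordered stream; return tagged timestamps."""
    return [f["t"] for f in frames if detector(f)]

# Simulated sensor frames (timestamp, deceleration)
frames = [{"t": 0.0, "decel_mps2": 1.2},
          {"t": 0.1, "decel_mps2": 7.5},
          {"t": 0.2, "decel_mps2": 8.1},
          {"t": 0.3, "decel_mps2": 0.4}]

tags = tag_stream(frames, detect_braking)
print(tags)  # [0.1, 0.2]
```

In the platform described above, the detector itself is the only part the user writes; distributing it across the cluster nodes and running it in parallel is the platform's job.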
Search. When a user wants to find specific sequences of data within any of the large multi-sensor data files, he uses a web-based interface to set up a parallel search. The search criteria are the user-defined tags set up during the tagging and mining phase, and the criteria can be as simple or complex as needed. The search runs in parallel on every data node in the cluster and returns a list of time sequences that matched the search criteria. The search results can be stored, partitioned, and managed for repeated testing and validation.
  
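The query syntax is not shown in this brochure. As an illustrative sketch under that assumption, a tag-based search over timestamped tag sets might combine criteria as a simple boolean predicate; the tag names and predicate form below are invented:

```python
# Hypothetical sketch of a tag-based search: timestamps carry sets of tags
# (as produced in the Tag step) and the criteria are boolean tests over
# those sets. Tag names and the predicate form are illustrative only.

tagged = {0.1: {"hard_braking", "rain"},
          0.2: {"hard_braking"},
          0.5: {"pedestrian", "rain"},
          0.9: {"pedestrian", "hard_braking"}}

def search(tagged, criteria):
    """Return timestamps whose tag set satisfies the criteria predicate."""
    return sorted(t for t, tags in tagged.items() if criteria(tags))

# "hard braking AND NOT rain" expressed as a predicate
hits = search(tagged, lambda tags: "hard_braking" in tags and "rain" not in tags)
print(hits)  # [0.2, 0.9]
```

In the real system this filtering would run on every data node in parallel, with each node returning only its matching time sequences.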
Use. Once the user has identified what sections of each data file are needed, he then uses a web-based interface to launch an application, such as a simulator, into the cluster to run in parallel on every node where any of the target data is located. The application is the same one he runs on a desktop or workstation, without recoding. Often the application is large, complex, and proprietary, such as a simulator. The application results are collected and stored separately. The process can be saved so that the same test run can be performed the same way in the future automatically.
Automate. Every part of the process can be saved and automated to save time and provide reliable testing and validation. Complex searches can be saved and re-run after new datasets are imported. Parallel execution runs can be saved and re-run on updated sets of search results. Training runs can be set up with thousands or millions of examples to train a classifier. Simulators can be given slightly different input parameters automatically to vary the environmental test conditions and create more robust testing and validation. The test automation system supports active feedback, so that the results from one simulation run can be used as input to the next.
  
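The automation engine's interface is not described in this brochure. As an illustration of the active-feedback idea only, a loop that varies one input parameter between runs based on the previous result could look like this sketch, where the failure model and adaptation rule are invented:

```python
# Hypothetical sketch of active-feedback test automation: the result of one
# simulation run feeds the input parameters of the next. run_simulation()
# is a stand-in for launching the real simulator; the failure model and
# adaptation rule are invented for illustration only.

def run_simulation(params):
    """Stand-in simulator: pretend failures peak around 70 kph."""
    return max(0.0, 1.0 - abs(params["speed_kph"] - 70) / 50.0)

def feedback_loop(initial, runs):
    """Vary one parameter run-to-run, steering toward higher failure scores."""
    params, history = dict(initial), []
    for _ in range(runs):
        score = run_simulation(params)
        history.append((params["speed_kph"], score))
        # Active feedback: step toward the region where failures concentrate
        params["speed_kph"] += 5 if params["speed_kph"] < 70 else -5
    return history

history = feedback_loop({"speed_kph": 50}, runs=5)
print([speed for speed, _ in history])  # [50, 55, 60, 65, 70]
```

Each iteration stands in for a saved parallel execution run; the point is that the parameter sweep homes in on interesting test conditions instead of covering the whole state space blindly.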
	
  
It is as simple as that. The user supplies the data, the content detection algorithm, the search criteria, and the desktop application. The StreamX streaming data management platform takes care of the rest.

Capture • Store • Tag • Search • Use • Automate
One Platform Addressing Multiple Streaming Data Challenges
	
  
Many different challenges with big streaming data can be solved by the StreamX platform. Whether the focus is on analyzing streams as they come in, mining the data offline for deeper understanding, or training AIs, the StreamX platform has the capability to find the exact data you are looking for and accelerate your use of that data.
Test and Validation

Challenges
• Finding the right time slices of data to test with from inside many large data files
• Quickly testing the new application version against those test data
• Sensitivity analysis based on parametric variation on the state space

StreamX Solution
• Content-based search of up to 100 PB of streaming data in parallel
• Automatic parallel execution of the application against that data
• Test automation engine allowing flexible parametric variation schemes and active feedback control of test scenarios
Training Classifiers

Challenges
• Finding the right slices of video or sensor data files for training out of 100s of terabytes or petabytes
• Manual labor or scripts to feed the training data to the classifier

StreamX Solution
• Content-based search of up to 100 PB of streaming data in parallel, extracting just the relevant scenes
• Automated test harness trains the classifier after initial setup from a web-based user interface
Real-time Analysis

Challenges
• Making newly acquired data instantly available for analysis
• Quickly comparing newly acquired data to mounds of historical data

StreamX Solution
• Instantly makes newly acquired data on StreamX data recorders part of the StreamX distributed file system
• Automatic parallel execution of a comparison application for the new data against existing data
Automated Regression Testing

Challenges
• Separating test data from data used to develop the application
• Manually setting up each regression run, or writing complicated scripts to manage some of the testing

StreamX Solution
• Data management that keeps regression test data separate from development data
• Automated runs of test data in parallel, executing the application on the cluster nodes where the data resides
	
  
How We Do It
XCube has five guiding principles that drive the design of our streaming data management solutions:

Never require a rewrite of the customer's application. Those are complex and proprietary.

Never move the data, because individual datasets can be terabytes and collections of testing can be petabytes. Also, some data sets are restricted from being moved.

Support globally distributed data and teams, so that any of the data can be accessed and used anywhere in the world.

Enable customers to define what content is important for searches.

Provide the maximum amount of automation, so the customer can focus on the science and engineering instead of IT and data management.
Why We Do It
Our inspiration came while XCube founder Mikael Taveniku was on a consulting project to design the autonomous drive architecture for a major automotive company. His approach extended the US DoD multi-sensor fusion architecture for unmanned aerial vehicles to unmanned land vehicles, and their architecture is still in use today.

While completing that project, he realized that there was a challenge even bigger than the design of the autonomous drive system – validating the correctness of the system design and operation. The enabling functionality for that validation is to find relevant data, process it, and do so in parallel without parallel programming. From that vision, XCube is changing the way ADAS and autonomous vehicles get developed, tested, and validated.
	
  
 
StreamX Data Recorders
	
  
The StreamX streaming data recorder has a unique combination of high speed, high capacity, and industrial ruggedness. Designed for the trunk of a car or the cabin of a helicopter, the data recorder provides flexible interfaces for multi-sensor streaming data such as video, Lidar, radar, and sonar.
	
  
	
  
StreamX Distributed Storage and Simulation Clusters
	
  
The StreamX streaming data management software comes preloaded on a StreamX distributed server cluster based on hyper-converged infrastructure elements. Configurations range from a single 24U rack with 12 multi-core CPUs, 1 PB of storage, and a 1 Gigabit Ethernet backbone to multiple 48U racks with 800 multi-core CPUs, 100 petabytes of storage, and a 10 Gigabit Ethernet fat-tree interconnect.
	
  
	
  
	
  
For More Information
	
  
To receive a quotation customized to your requirements or to schedule a demo, please send an email to sales@x3-c.com. You can also visit the Partners section of our website for a list of resellers and local representatives.
StreamX Product Family
XCube
© Copyright 2015. All rights reserved. XCube and StreamX are registered trademarks of XCube Research and Development, Inc. All other company and product names are trademarks or registered trademarks of the respective companies.
XCube R&D, Inc. • +1 978.799.2024 • www.X3-c.com
30 Technology Way, Nashua, NH 03060 USA
Example Configuration (StreamX Data Recorder)
• 4U rack-mountable chassis
• (4) 10 Gigabit Ethernet inputs
• 64 TB removable internal SSD or 256 TB external SSD

Example Configuration (StreamX Cluster)
• 1 24U controller rack + 3 48U execution racks
• 60 CPUs + 30 GPUs
• 3 petabytes of storage
• 10 Gigabit Ethernet interconnect
More Related Content

What's hot

Big data & hadoop framework
Big data & hadoop frameworkBig data & hadoop framework
Big data & hadoop framework
Tu Pham
 
SplunkLive! Beginner Session
SplunkLive! Beginner SessionSplunkLive! Beginner Session
SplunkLive! Beginner Session
Splunk
 

What's hot (20)

Data automation 101
Data automation 101Data automation 101
Data automation 101
 
From discovering to trusting data
From discovering to trusting dataFrom discovering to trusting data
From discovering to trusting data
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
The Future of Real-Time in Spark
The Future of Real-Time in SparkThe Future of Real-Time in Spark
The Future of Real-Time in Spark
 
Big data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructureBig data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructure
 
Big data & hadoop framework
Big data & hadoop frameworkBig data & hadoop framework
Big data & hadoop framework
 
Data stax no sql use cases
Data stax  no sql use casesData stax  no sql use cases
Data stax no sql use cases
 
Data Onboarding Breakout Session
Data Onboarding Breakout SessionData Onboarding Breakout Session
Data Onboarding Breakout Session
 
Big Data Testing
Big Data TestingBig Data Testing
Big Data Testing
 
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
Show me the Money! Cost & Resource  Tracking for Hadoop and Storm Show me the Money! Cost & Resource  Tracking for Hadoop and Storm
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
 
Big Data Heterogeneous Mixture Learning on Spark
Big Data Heterogeneous Mixture Learning on SparkBig Data Heterogeneous Mixture Learning on Spark
Big Data Heterogeneous Mixture Learning on Spark
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
SplunkLive! Beginner Session
SplunkLive! Beginner SessionSplunkLive! Beginner Session
SplunkLive! Beginner Session
 
High-Performance Analytics with Probabilistic Data Structures: the Power of H...
High-Performance Analytics with Probabilistic Data Structures: the Power of H...High-Performance Analytics with Probabilistic Data Structures: the Power of H...
High-Performance Analytics with Probabilistic Data Structures: the Power of H...
 
سکوهای ابری و مدل های برنامه نویسی در ابر
سکوهای ابری و مدل های برنامه نویسی در ابرسکوهای ابری و مدل های برنامه نویسی در ابر
سکوهای ابری و مدل های برنامه نویسی در ابر
 
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino BusaReal-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa
 
Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01
 
La potenza è nulla senza controllo
La potenza è nulla senza controlloLa potenza è nulla senza controllo
La potenza è nulla senza controllo
 
La potenza è nulla senza controllo
La potenza è nulla senza controlloLa potenza è nulla senza controllo
La potenza è nulla senza controllo
 
Optimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystemOptimizing industrial operations using the big data ecosystem
Optimizing industrial operations using the big data ecosystem
 

Viewers also liked

J. R. Land Res
J. R. Land ResJ. R. Land Res
J. R. Land Res
John Land
 
BrodtKerry_122016
BrodtKerry_122016BrodtKerry_122016
BrodtKerry_122016
Kerry Brodt
 

Viewers also liked (15)

J. R. Land Res
J. R. Land ResJ. R. Land Res
J. R. Land Res
 
Socialize your ERP, and collaborate with him!
Socialize your ERP, and collaborate with him!Socialize your ERP, and collaborate with him!
Socialize your ERP, and collaborate with him!
 
SPECTRUM_layout
SPECTRUM_layoutSPECTRUM_layout
SPECTRUM_layout
 
Chuong trinh bh i12 a (1)
Chuong trinh bh i12 a (1)Chuong trinh bh i12 a (1)
Chuong trinh bh i12 a (1)
 
Fabien Liénard : Pourquoi et comment les adolescents publient-ils sur le web ...
Fabien Liénard : Pourquoi et comment les adolescents publient-ils sur le web ...Fabien Liénard : Pourquoi et comment les adolescents publient-ils sur le web ...
Fabien Liénard : Pourquoi et comment les adolescents publient-ils sur le web ...
 
Analisis sinyal sinyal kecil
Analisis sinyal sinyal kecilAnalisis sinyal sinyal kecil
Analisis sinyal sinyal kecil
 
Tb chuong trinh bh 14 nen (1)
Tb chuong trinh bh 14 nen (1)Tb chuong trinh bh 14 nen (1)
Tb chuong trinh bh 14 nen (1)
 
Chi tiet can ho celadon
Chi tiet can ho celadonChi tiet can ho celadon
Chi tiet can ho celadon
 
Calendario jornadas
Calendario jornadasCalendario jornadas
Calendario jornadas
 
BrodtKerry_122016
BrodtKerry_122016BrodtKerry_122016
BrodtKerry_122016
 
Règlement concours Mangawa et revue de presse
Règlement concours Mangawa et revue de presseRèglement concours Mangawa et revue de presse
Règlement concours Mangawa et revue de presse
 
Young Lions 2016- Natalia Fernandes de Oliveira
Young Lions 2016- Natalia Fernandes de OliveiraYoung Lions 2016- Natalia Fernandes de Oliveira
Young Lions 2016- Natalia Fernandes de Oliveira
 
Tatiana Vidonscky - Young Lions 2016
Tatiana Vidonscky - Young Lions 2016Tatiana Vidonscky - Young Lions 2016
Tatiana Vidonscky - Young Lions 2016
 
Robótica
RobóticaRobótica
Robótica
 
Create rest webservice for oracle public api using java class via jdeveloper
Create rest webservice for oracle public api using java class via jdeveloperCreate rest webservice for oracle public api using java class via jdeveloper
Create rest webservice for oracle public api using java class via jdeveloper
 

Similar to XCube-overview-brochure-revB

Similar to XCube-overview-brochure-revB (20)

Internship msc cs
Internship msc csInternship msc cs
Internship msc cs
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
Understanding the Anametrix Cloud-based Analytics Platform
Understanding the Anametrix Cloud-based Analytics PlatformUnderstanding the Anametrix Cloud-based Analytics Platform
Understanding the Anametrix Cloud-based Analytics Platform
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016
 
Msbi Architecture
Msbi ArchitectureMsbi Architecture
Msbi Architecture
 
Enhanced Data Visualization provided for 200,000 Machines with OpenTSDB and C...
Enhanced Data Visualization provided for 200,000 Machines with OpenTSDB and C...Enhanced Data Visualization provided for 200,000 Machines with OpenTSDB and C...
Enhanced Data Visualization provided for 200,000 Machines with OpenTSDB and C...
 
SplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk OverviewSplunkLive! London 2016 Splunk Overview
SplunkLive! London 2016 Splunk Overview
 
Getting Started with Splunk Enterprise
Getting Started with Splunk EnterpriseGetting Started with Splunk Enterprise
Getting Started with Splunk Enterprise
 
CS8091_BDA_Unit_IV_Stream_Computing
CS8091_BDA_Unit_IV_Stream_ComputingCS8091_BDA_Unit_IV_Stream_Computing
CS8091_BDA_Unit_IV_Stream_Computing
 
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
 
Apache Spark Streaming -Real time web server log analytics
Apache Spark Streaming -Real time web server log analyticsApache Spark Streaming -Real time web server log analytics
Apache Spark Streaming -Real time web server log analytics
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
 
Setting up Sumo Logic - June 2017
Setting up Sumo Logic - June 2017Setting up Sumo Logic - June 2017
Setting up Sumo Logic - June 2017
 
Setting Up Sumo Logic - Sep 2017
Setting Up Sumo Logic -  Sep 2017Setting Up Sumo Logic -  Sep 2017
Setting Up Sumo Logic - Sep 2017
 
60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt
 
cametrics-report-final
cametrics-report-finalcametrics-report-final
cametrics-report-final
 
Level 3 Certification: Setting up Sumo Logic - Oct 2018
Level 3 Certification: Setting up Sumo Logic - Oct  2018Level 3 Certification: Setting up Sumo Logic - Oct  2018
Level 3 Certification: Setting up Sumo Logic - Oct 2018
 

XCube-overview-brochure-revB

  • 1. XCube StreamX®     The  StreamX®  streaming  data  management  platform   from  XCube  is  an  integrated,  turnkey,  end-­‐to-­‐end   solution  for  huge  streaming  data.       Whether  your  data  streams  are  combinations  of  video,  Lidar,   radar,  or  other  sensors,  you  can  find  the  right  time  segments  of   data  in  100’s  of  Petabytes  of  data  files,  then  use  those  segments  as   input  to  your  application  and  run  it  all  in  parallel.   The  StreamX  platform  is  designed  to  solve  two  of  the  biggest   challenges  when  working  with  huge  streaming  datasets:       The  StreamX  platform  includes  a  distributed  virtual  file  system,   virtual  machine  technology,  and  parallel  execution  engine  that   enables  any  desktop  application  to  run  in  parallel  in  a  distributed   cluster.    Applications  are  sent  into  the  cluster  to  where  the  data   resides  and  then  executed  in  parallel,  all  controlled  by  a  simple   user  interface  from  any  web  browser.    No  parallel  programming.     No  transferring  or  reformatting  the  data.   Working  with  huge  streaming  data  sets  has  never  been  easier.    The   StreamX  streaming  data  management  platform  from  XCube  can   shave  months  off  of  your  development,  test,  and  validation.       
How  to  run  an  existing  desktop  application,  such  as   a  simulator,  on  lots  of  data  many  times  faster  than  a   workstation  or  standard  server     How  to  find  the  specific  time  segments  inside  of   Petabytes  of  streaming  data  files  that  you  want  to   run  the  application  on  or  further  analyze   Data Management Tools for Big Streaming Data Capture, Store, Tag, Search, Simulate, and Automate in Parallel without reprogramming applications or reformatting data With the StreamX platform you can: Automatically tag content in data streams • Mine the data with user-defined detection algorithms • Run the detection algorithms in parallel without parallel programing • Tag detected scenes in the stream to enable content-based searches Find and manage specific content out of petabytes stored globally • Search in parallel for tagged content in 100s petabytes stored globally • Manage the search results to create independents sets of data for training, testing, and validation Run desktop applications in parallel using search results as input • Run desktop applications such as simulators in parallel without recoding • Work with Petabytes of data distributed globally without ever transmitting the data Automate training and testing • Automate AI training and regression testing. • Partition training and test data to maintain test integrity. Validate using Re-simulated Data • Bring real-world streaming data in your simulators • Automatically vary input parameters to create new simulated situations • Use results from one simulation run as input to the next to provide active feedback With XCube’s StreamX systems, each of our GIDAS simulations runs overnight instead of a week. - Automotive OEM test engineer “ ” 1   2  
  • 2. The StreamX Platform: Powerful and Easy to Use   The  StreamX  streaming  data  management  platform  is  powerful  yet  easy  to  use.    After  capturing  the  data,   there  are  five  main  functions  with  the  system,  all  of  which  can  be  performed  remotely  from  any  browser.     Capture.  Capture  the  data  stream  using  your  own  system  or  the  StreamX  Data   Recorder.    The  StreamX  data  recorder  can  capture  multi-­‐sensor  data  at  up  to  4.4   GB/s,  eliminating  the  need  have  separate  recorders  for  each  sensor  stream.  Up  to   256  TB  of  removable  storage  holds  extended  data  collection  sessions  and  then   removed  and  shipped  to  the  data  center  without  having  to  disconnect  the  sensors.   Store.    The  incoming  data  needs  to  be  incorporated  in  the  distributed  storage  cluster   and  organized  in  a  way  that  the  StreamX  tools  can  see  the  data.    This  is  done  with  an   import  function  that  associates  each  of  the  streams  in  a  multi-­‐sensor  dataset  with  a   time-­‐synchronized  project.    When  using  the  StreamX  data  recorder,  the  data  is   already  stored  in  projects  and  is  immediately  accessible.   Tag.  Once  the  data  is  recognized  by  the  system,  the  next  step  is  to  tag  or  annotate  the   data  based  on  the  content  important  to  the  application.    The  user  provides  one  or   more  image  or  pattern  recognition  algorithms  in  the  form  of  a  small  application  that   will  run  on  the  data  and  indicate  when  the  content  of  interest  is  detected  in  a   particular  data  frame.    The  StreamX  platform  uses  those  mini-­‐applications  to  crawl   through  the  data  distributed  across  the  cluster  nodes  in  the  background,  mining  the   data  in  parallel  for  interesting  content.  
When  the  content  is  found,  the  system  tags  all   the  timestamps  of  the  data  file  where  the  content  is  located.    Time  stamps  can   contain  many  different  tags  to  indicate  multiple  types  of  content  or  multiple  versions   of  the  detection  algorithm.   Search.  When  a  user  wants  to  find  specific  sequences  of  data  within  any  of  the  large   multi-­‐sensor  data  files,  he  uses  a  web-­‐based  interface  to  setup  a  parallel  search.    The   search  criteria  are  the  user-­‐defined  tags  setup  during  the  tagging  and  mining  phase,   and  the  criteria  can  be  as  simple  or  complex  as  needed.    The  search  runs  in  parallel   on  every  data  node  in  the  cluster  and  returns  a  list  of  time  sequences  that  matched   the  search  criteria.    The  search  results  can  be  stored,  partitioned,  and  managed  for   repeated  testing  and  validation.   Use.    Once  the  user  has  identified  what  sections  of  each  data  file  are  needed,  he  then   uses  a  web-­‐based  interface  to  launch  an  application,  such  as  a  simulator,  into  the   cluster  to  run  in  parallel  on  every  node  where  any  of  the  target  data  is  located.    The   application  is  the  same  one  he  runs  on  a  desktop  or  workstation  without  recoding.     Often  the  application  is  large,  complex,  and  proprietary,  such  as  a  simulator.    The   application  results  are  collected  and  stored  separately.    The  process  can  be  saved  so   that  the  same  test  run  can  be  performed  the  same  way  in  the  future  automatically.   Automate.    Every  part  of  the  process  can  be  saved  and  automated  to  save  time  and   provide  reliable  testing  and  validation.    Complex  searches  can  be  saved  and  re-­‐run   after  new  datasets  are  imported.    Parallel  execution  runs  can  be  saved  and  re-­‐run  on   updated  sets  of  search  results.    
Training runs can be set up with thousands or millions of examples to train a classifier. Simulators can automatically be given slightly different input parameters, varying the environmental test conditions for more robust testing and validation. The test automation system supports active feedback, so the results from one simulation run can be used as input to the next.

It is as simple as that. The user supplies the data, the content-detection algorithm, the search criteria, and the desktop application. The StreamX streaming data management platform takes care of the rest.

Workflow: Capture → Store → Tag → Search → Use → Automate
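To make the tag-and-search model concrete, the data model can be thought of as timestamps carrying sets of tags, with a search returning contiguous time sequences whose tags satisfy the criteria. The following is a minimal sketch of that concept only; all names are hypothetical and this is not the StreamX API:

```python
# Minimal sketch of the tag-then-search data model (hypothetical, not the StreamX API).
# Each timestamp in a stream can carry multiple tags (content types, detector versions);
# a search returns contiguous runs of timestamps whose tag sets satisfy the criteria.

from collections import defaultdict

class TaggedStream:
    def __init__(self):
        self.tags = defaultdict(set)  # timestamp -> set of tags

    def tag(self, timestamp, label):
        """Record that a detector found `label` content at `timestamp`."""
        self.tags[timestamp].add(label)

    def search(self, criteria):
        """Return (start, end) timestamp pairs for contiguous runs where
        criteria(tag_set) is true. The real platform runs this in parallel
        across cluster nodes; this sketch is serial."""
        matches = sorted(t for t, labels in self.tags.items() if criteria(labels))
        sequences, run_start, prev = [], None, None
        for t in matches:
            if run_start is None:
                run_start = prev = t
            elif t == prev + 1:          # contiguous frame: extend the run
                prev = t
            else:                        # gap: close the run, start a new one
                sequences.append((run_start, prev))
                run_start = prev = t
        if run_start is not None:
            sequences.append((run_start, prev))
        return sequences

stream = TaggedStream()
for t in (10, 11, 12, 30, 31):
    stream.tag(t, "pedestrian")
stream.tag(11, "bicycle")

# Criteria can be as simple or as complex as needed:
print(stream.search(lambda labels: "pedestrian" in labels))
# -> [(10, 12), (30, 31)]
```

Because tags accumulate per timestamp, multiple detector versions can coexist on the same data and be combined freely at search time.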
3. One Platform Addressing Multiple Streaming Data Challenges

Many different big streaming data challenges can be solved with the StreamX platform. Whether the focus is on analyzing streams as they come in, mining the data offline for deeper understanding, or training AIs, the StreamX platform can find the exact data you are looking for and accelerate your use of that data.

Test and Validation

Challenges:
• Finding the right time slices of data to test with from inside many large data files
• Quickly testing the new application version against that test data
• Sensitivity analysis based on parametric variation of the state space

StreamX Solution:
• Content-based search of up to 100 PB of streaming data in parallel
• Automatic parallel execution of the application against that data
• Test automation engine allowing flexible parametric-variation schemes and active feedback control of test scenarios

Training Classifiers

Challenges:
• Finding the right slices of video or sensor data files for training out of hundreds of terabytes or petabytes
• Manual labor, or scripts, to feed the training data to the classifier

StreamX Solution:
• Content-based search of up to 100 PB of streaming data in parallel, extracting just the relevant scenes
• Automated test harness that trains the classifier after initial setup from the web-based user interface

Real-time Analysis

Challenges:
• Making newly acquired data instantly available for analysis
• Quickly comparing newly acquired data to mounds of historical data

StreamX Solution:
• Instantly makes newly acquired data on StreamX data recorders part of the StreamX distributed file system
• Automatic parallel execution of a comparison application for the new data against existing data

Automated Regression Testing

Challenges:
• Separating test data from the data used to develop the application
• Manually setting up each regression run, or writing complicated scripts to manage some of the testing

StreamX Solution:
• Data management that keeps regression test data separate from development data
• Automated runs of test data in parallel, executing the application on the cluster nodes where the data resides

How We Do It

XCube has five guiding principles that drive the design of our streaming data management solutions:
• Never require a rewrite of the customer's application; those applications are complex and proprietary.
• Never move the data, because individual datasets can be terabytes and test collections can be petabytes; some datasets are also restricted from being moved.
• Support globally distributed data and teams, so that any of the data can be accessed and used anywhere in the world.
• Enable customers to define what content is important for searches.
• Provide the maximum amount of automation, so the customer can focus on the science and engineering instead of IT and data management.

Why We Do It

Our inspiration came while XCube founder Mikael Taveniku was on a consulting project to design the autonomous-drive architecture for a major automotive company. His approach extended the US DoD multi-sensor fusion architecture for unmanned aerial vehicles to unmanned land vehicles, and that architecture is still in use today. While completing the project, he realized that there was a challenge even bigger than designing the autonomous-drive system: validating the correctness of the system's design and operation. The enabling functionality for that validation is to find relevant data, process it, and do so in parallel, without parallel programming.
From that vision, XCube is changing the way ADAS and autonomous vehicles get developed, tested, and validated.  
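The test automation engine described earlier varies simulator input parameters and feeds the results of one run into the next. The control flow of such an active-feedback sweep can be sketched as a simple loop; every name here, including the scoring function, is hypothetical and stands in for the user's own simulator:

```python
# Sketch of an active-feedback parameter sweep (hypothetical names, not the StreamX API):
# each simulation run's result is fed back to choose the next run's input parameters.

def run_simulation(params):
    """Stand-in for launching the user's simulator on the cluster.
    Here the score peaks when fog_density is near 0.6."""
    return 1.0 - abs(params["fog_density"] - 0.6)

def feedback_sweep(initial, step=0.2, rounds=6):
    """Vary one environment parameter; each round uses the previous
    result to decide whether to keep moving or reverse and refine."""
    params = dict(initial)
    best_score = run_simulation(params)
    history = [(params["fog_density"], best_score)]
    for _ in range(rounds):
        params["fog_density"] += step
        score = run_simulation(params)
        history.append((round(params["fog_density"], 2), score))
        if score < best_score:        # got worse: reverse and shrink the step
            step = -step / 2
        best_score = max(best_score, score)
    return history

for fog, score in feedback_sweep({"fog_density": 0.0}):
    print(f"fog={fog:.2f}  score={score:.2f}")
```

In the platform described above, each `run_simulation` call would correspond to a saved parallel execution run over the selected time sequences, so the same loop scales from one workstation-sized test to a full regression campaign.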
4. StreamX Product Family

StreamX Data Recorders

The StreamX streaming data recorder offers a unique combination of high speed, high capacity, and industrial ruggedness. Designed for the trunk of a car or the cabin of a helicopter, the data recorder provides flexible interfaces for multi-sensor streaming data such as video, Lidar, radar, and sonar.

Example Configuration
• 4U rack-mountable chassis
• (4) 10 Gigabit Ethernet inputs
• 64 TB removable internal SSD or 256 TB external SSD

StreamX Distributed Storage and Simulation Clusters

The StreamX streaming data management software comes preloaded on a StreamX distributed server cluster based on hyper-converged infrastructure elements. Configurations range from a single 24U rack with 12 multi-core CPUs, 1 PB of storage, and a 1 Gigabit Ethernet backbone to multiple 48U racks with 800 multi-core CPUs, 100 PB of storage, and a 10 Gigabit Ethernet fat-tree interconnect.

Example Configuration
• 1 24U controller rack + 3 48U execution racks
• 60 CPUs + 30 GPUs
• 3 PB of storage
• 10 Gigabit Ethernet interconnect

For More Information

To receive a quotation customized to your requirements or to schedule a demo, please email sales@x3-c.com. You can also visit the Partners section of our website for a list of resellers and local representatives.

XCube R&D, Inc. • +1 978.799.2024 • www.X3-c.com
30 Technology Way, Nashua, NH 03060 USA

© Copyright 2015. All rights reserved. XCube and StreamX are registered trademarks of XCube Research and Development, Inc. All other company and product names are trademarks or registered trademarks of the respective companies.