Cosmos
Big Data and Big Challenges

Pat Helland (“MSFT Ex-Pat”/Free Agent)
and
Ed Harris (Microsoft Online Services)
Stanford, Oct 26, 2011
  
Outline
•  Introduction
•  Cosmos Overview
•  The Structured Streams Project
•  Conclusion
  
What Is COSMOS?
•  Petabyte Store and Computation System
   –  Several hundred petabytes of data
   –  Tens of thousands of computers across many datacenters
•  Massively parallel processing based on Dryad
   –  Similar to MapReduce but can represent arbitrary DAGs of computation
   –  Automatic computation placement with data
•  SCOPE (Structured Computation Optimized for Parallel Execution)
   –  SQL-like language with set-oriented record and column manipulation
   –  Automatically compiled and optimized for execution over Dryad
•  Management of hundreds of “Virtual Clusters” for computation allocation
   –  Buy your machines and give them to COSMOS
   –  Guaranteed that many compute resources
   –  May use more when they are not in use
•  Ubiquitous access to OSD’s data
   –  Combining knowledge from different datasets is today’s secret sauce
  
Cosmos and OSD Computation
•  OSD Applications fall into two broad categories:
   –  Back-end: Massive batch processing creates new datasets
   –  Front-end: Online request processing serves up and captures information
•  Cosmos provides storage and computation for Back-End Batch data analysis
   –  It does not support storage and computation needs for the Front-End
[Figure: OSD Computing/Storage. Two blocks, Front-End On-Line Web-Serving and Back-End Batch Data Analysis, with flows labeled Crawling, Internet, Other Data, User & System Data, Data for On-Line Work, Results, and Large Read-Only Datasets. A “Cosmos!” callout appears on the diagram.]
  
COSMOS: The Service
•  Data drives search and advertising
   –  Web pages: Links, text, titles, etc
   –  Search logs: What people searched for, what they clicked, etc
   –  Browse logs: What sites people visit, the browsing order, etc
   –  Advertising logs: What ads do people click on, what was shown, etc
•  We ingest or generate a couple of PiB every day
   –  Bing, MSN, Hotmail, Client telemetry
   –  Web crawl snapshots
   –  Structured data feeds
   –  Long tail of other data sets of interest
•  COSMOS is the backbone for Bing analysis and relevance
   –  Click-stream information is imported from many sources and “cooked”
   –  Queries analyzing user context, click commands, and success are processed
•  COSMOS is a service
   –  We run the code ourselves (on many tens of thousands of servers)
   –  Users simply feed in data, submit jobs, and extract the results
  
Cosmos Architecture from 100,000 Feet
Store Layer:
-- Many Extent Nodes store and compress replicated extents on disk
-- Extents are combined to make unstructured streams
-- CSM (COSMOS Store Manager) handles names, streams, & replication
Execution Layer:
-- Jobs queue up per Virtual Cluster
-- When a job starts, it gets a Job Mgr to deploy work in parallel close to its data
-- Many Processing Nodes (PNs) host execution vertices running SCOPE code
SCOPE Layer:
-- SCOPE code is submitted to the SCOPE Compiler
-- The optimizer makes decisions about execution plan and parallelism
-- Algebra (describing the job) is built to run on the SCOPE Runtime
[Figure: three layers. Store Layer: Extent Nodes (EN) hold Extents, which make up Streams “Foo”, “Bar”, and “Zot”. Execution Layer: a Job Mgr and several PNs, each hosting a SCOPE Run/T, over the same streams. SCOPE Layer: the example query "Data = SELECT * FROM S WHERE Col-1 > 10", the SCOPE Compiler, the SCOPE Optimizer, and SCOPE Run/T instances.]
  
The Store Layer
•  Extent Nodes:
   –  Implement a file system holding extents
   –  Each extent is up to 2GB
   –  Compression and fault detection are important parts of the EN
•  CSM: COSMOS Store Manager
   –  Instructs replication across 3 different ENs per extent
   –  Manages composition of streams out of extents
   –  Manages the namespace of streams
[Figure: Store Layer. Extent Nodes (EN) hold Extents; Extents make up Streams “Foo”, “Bar”, and “Zot”.]
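
The relationship between streams, extents, and extent nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2GB cap and the 3 replicas come from the slide, while the `Extent` class, the `plan_stream` function, the node names, and the round-robin placement are hypothetical and say nothing about how the CSM actually chooses nodes.

```python
from dataclasses import dataclass

EXTENT_CAP = 2 * 1024**3   # "up to 2GB" per the slide
REPLICAS = 3               # "3 different ENs per extent" per the slide


@dataclass
class Extent:
    index: int     # position of the extent within the stream
    size: int      # bytes held by this extent
    nodes: tuple   # names of the extent nodes holding a replica


def plan_stream(stream_bytes, nodes, cap=EXTENT_CAP, replicas=REPLICAS):
    """Cut a stream of `stream_bytes` bytes into extents and place replicas.

    Round-robin placement is an assumption made for illustration only.
    """
    if len(nodes) < replicas:
        raise ValueError("need at least as many extent nodes as replicas")
    extents = []
    offset = 0
    index = 0
    while offset < stream_bytes:
        size = min(cap, stream_bytes - offset)
        # Consecutive nodes are distinct because replicas <= len(nodes).
        chosen = tuple(nodes[(index + r) % len(nodes)] for r in range(replicas))
        extents.append(Extent(index, size, chosen))
        offset += size
        index += 1
    return extents


extents = plan_stream(5 * 1024**3, ["EN1", "EN2", "EN3", "EN4"])
for e in extents:
    print(e.index, e.size, e.nodes)
```

A 5 GiB stream is cut into three extents here, and each extent lists three distinct nodes, so losing any single node leaves two copies of every extent.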
  
The Execution Engine
•  Execution Engine:
   –  Takes the plan for the parallel execution of a SCOPE job and finds computers to perform the work
   –  Responsible for the placement of the computation close to the data it reads
   –  Ensures all the inputs for the computation are available before firing it up
   –  Responsible for failures and restarts
•  Dryad is similar to Map-Reduce
[Figure: Execution Layer. A Job Mgr and several PNs, each hosting a SCOPE Run/T, over Streams “Foo”, “Bar”, and “Zot”.]
  
The SCOPE Language
•  SCOPE (Structured Computation Optimized for Parallel Execution)
   –  Heavily influenced by SQL and relational algebra
   –  Changed to deal with input and output streams
•  SCOPE is a high level declarative language for data manipulation
   –  It translates very naturally into parallel computation
[Figure: a Scope Job with inputs Stream-1, Stream-2, and Stream-3 and outputs Stream-A and Stream-B. Input arrives as sets of records; computation occurs as sets of records; output is written as sets of records.]
  
The SCOPE Compiler and Optimizer
•  The SCOPE Compiler and Optimizer take SCOPE programs and create:
   –  The algebra describing the computation
   –  The breakdown of the work into processing units
   –  The description of the inputs and outputs from the processing units
•  Many decisions about compiling and optimizing are driven by data size and minimizing data movement
[Figure: SCOPE Layer. The example query "Data = SELECT * FROM S WHERE Col-1 > 10", the SCOPE Compiler, the SCOPE Optimizer, and SCOPE Run/T instances.]
  
The Virtual Cluster
•  Virtual Cluster: a management tool
   –  Allocates resources across groups within OSD
   –  Cost model captured in a queue of work (with priority) within the VC
•  Each Virtual Cluster has a guaranteed capacity
   –  We will bump other users of the VC’s capacity if necessary
   –  The VC can use other idle capacity
[Figure: five Virtual Clusters, each with its own Work Queue: VC-A (100 Hi-Pri PNs), VC-B (500 Hi-Pri PNs), VC-C (20 Hi-Pri PNs), VC-D (1000 Hi-Pri PNs), VC-E (350 Hi-Pri PNs).]
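
The two rules on this slide (a guaranteed capacity per VC, plus use of idle capacity) can be sketched as a small allocation function. The sketch is an assumption-laden illustration in Python, not the Cosmos scheduler: the `allocate` function, the `demand` figures, and the first-come order for handing out idle PNs are all hypothetical. Only the VC names and their Hi-Pri PN counts come from the slide, and the sketch assumes the cluster has at least as many PNs as the sum of the guarantees.

```python
def allocate(guarantees, demand, total_pns):
    """Return the number of PNs granted to each VC.

    guarantees: VC name -> guaranteed PNs
    demand:     VC name -> PNs the VC's queued work could use right now
    total_pns:  PNs in the whole cluster (assumed >= sum of guarantees)
    """
    # Step 1: every VC gets what it asks for, up to its guarantee.
    grant = {vc: min(demand.get(vc, 0), g) for vc, g in guarantees.items()}
    idle = total_pns - sum(grant.values())
    # Step 2: hand idle PNs to VCs that still want more.
    # Dictionary order is an arbitrary policy chosen for this sketch.
    for vc in guarantees:
        extra = min(idle, demand.get(vc, 0) - grant[vc])
        grant[vc] += extra
        idle -= extra
    return grant


guarantees = {"VC-A": 100, "VC-B": 500, "VC-C": 20, "VC-D": 1000, "VC-E": 350}
demand = {"VC-A": 300, "VC-B": 200, "VC-C": 20, "VC-D": 0, "VC-E": 350}
grant = allocate(guarantees, demand, total_pns=sum(guarantees.values()))
print(grant)
```

In this run VC-A borrows PNs that VC-B and VC-D are not using. Calling `allocate` again after VC-B and VC-D raise their demand to their full guarantees returns VC-A to its own 100 PNs, which corresponds to the "bump" described on the slide.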
  
Introducing Structured Streams
•  Cosmos currently supports streams
   –  An unstructured byte stream of data
   –  Created by append-only writing to the end of the stream
•  Structured streams are streams with metadata
   –  Metadata defines column structure and affinity/clustering information
•  Structured streams simplify extractors and outputters
   –  A structured stream may be imported into Scope without an extractor
•  Structured streams offer performance improvements
   –  Column features allow for processing optimizations
   –  Affinity management can dramatically improve performance
   –  Key-oriented features offer (sometimes very significant) access performance improvements
[Figure: Today’s Streams (unstructured streams): Stream “A” is a sequence of bytes. New Structured Streams: Stream “A” is a sequence of bytes plus metadata, with record-oriented access.]
  
Today’s Use of Extractors and Outputters
•  Extractors
   –  Programs to input data and supply metadata
•  Outputters
   –  Take Scope data and create a bytestream for storage
   –  Discard metadata known to the system
[Figure: Stream “A” (unstructured stream) → Extractor (supplies metadata) → Scope processing with metadata, structure, and relational ops → Outputter → Stream “D” (unstructured stream).]
source = EXTRACT col1, col2 FROM “A”
Data = SELECT * FROM source
<process Data>
OUTPUT Data to “D”
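
To make the extractor and outputter roles concrete, here is a minimal Python sketch of the same pipeline. Everything in it is an assumption for illustration: the tab-separated encoding, the `extract` and `output` functions, and the sample rows are hypothetical and are not the SCOPE extractor or outputter interfaces.

```python
def extract(raw: bytes, columns):
    """Hypothetical extractor: parse tab-separated lines and attach column names."""
    rows = []
    for line in raw.decode("utf-8").splitlines():
        if line:
            rows.append(dict(zip(columns, line.split("\t"))))
    return rows


def output(rows, columns) -> bytes:
    """Hypothetical outputter: serialize rows to bytes; column names are not stored."""
    lines = ("\t".join(str(row[c]) for c in columns) for row in rows)
    return "".join(line + "\n" for line in lines).encode("utf-8")


stream_a = b"www.imdb.com\t12\nwww.example.com\t3\n"
source = extract(stream_a, ["col1", "col2"])              # EXTRACT col1, col2 FROM "A"
data = [row for row in source if int(row["col2"]) > 10]   # <process Data>
stream_d = output(data, ["col1", "col2"])                 # OUTPUT Data to "D"
print(stream_d)
```

The bytes written to `stream_d` carry no column names or types, so the next job that reads them needs its own extractor to recover the structure.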

Metadata, Streams, Extractors, & Outputters
•  Scope has metadata for the data it is processing
   –  Extractors provide metadata info as they suck up unstructured streams
•  Processing the Scope queries ensures metadata is preserved
   –  The new results may have different metadata than the old
   –  Scope knows the new metadata
•  Scope writes structured streams
   –  The internal information used by Scope is written out as metadata
•  Scope reads structured streams
   –  Reading a structured stream allows later jobs to see the metadata
[Figure: Scope processing with metadata, structure, and relational ops. Stream “A” (unstructured stream) enters through an Extractor that supplies metadata; Stream “D” (unstructured stream) is written through an Outputter; Streams “B” and “C” are structured streams that carry their own metadata.]
Note: No Cosmos notion of metadata for Stream “D” -- only the Outputter knows…
The representation of a structured stream on disk is only visible to Scope!
  
Streams, Metadata, and Increased Performance
•  By adding metadata (describing the stream) into the stream, we can provide performance improvements:
   –  Cluster-Key access: random reads of records identified by key
   –  Partitioning and affinity: data to be processed together (sometimes across multiple streams) can be placed together for faster processing
•  Metadata for a structured stream is kept inside the stream
   –  The stream is a self-contained unit
   –  The structured stream is still an unstructured stream (plus some stuff)
[Figure: Stream “A” with metadata, supporting Cluster-Key Access and Partitioning and Affinity.]
  
Cluster-Key Lookup
•  Cluster-Key Indices make a huge performance improvement
   –  Today: If you want a few records, you must process the whole stream
   –  Structured Streams: Directly access the records by cluster-key index
•  How it works:
   –  Cluster-Key lookup is implemented by having indexing information contained in the metadata inside the stream
      •  The records must be stored in cluster-key order to use cluster-key lookup
      •  Cosmos managed index generated at structured stream creation
[Figure: Stream “Foo” with metadata (including index). Records A, B, C, … Z are stored in cluster-key order, and the index holds the keys A, E, J, N, Q, W. Example: Lookup “D”.]
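
The idea of an index over records stored in cluster-key order can be sketched with a sparse index and a binary search. This is a minimal Python illustration under assumptions, not the Cosmos on-disk format: fixed-size blocks, unique keys, and the `build_index` and `lookup` functions are all hypothetical (the index in the figure uses uneven blocks).

```python
from bisect import bisect_right


def build_index(records, block_size):
    """Sparse index: the first key of every block of `block_size` records."""
    return [records[i][0] for i in range(0, len(records), block_size)]


def lookup(records, index, block_size, key):
    """Return the values stored under `key`, scanning only one block."""
    block = bisect_right(index, key) - 1   # last block whose first key <= key
    if block < 0:
        return []
    start = block * block_size
    return [v for k, v in records[start:start + block_size] if k == key]


# Records must already be sorted by key, as the slide requires.
records = [(chr(c), chr(c).lower()) for c in range(ord("A"), ord("Z") + 1)]
index = build_index(records, 4)
print(index)
print(lookup(records, index, 4, "D"))
```

Looking up "D" touches only the block that starts at "A" rather than the whole stream, which is the saving the slide describes.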
  
Implementing Partitioning and Affinity
•  Joins across streams can be very expensive
   –  Network traffic is a major expense when joining large datasets together
   –  Placing related data together can dramatically reduce processing cost
•  We affinitize data when we believe it is likely to be processed together
   –  Affinitization places the data close together
   –  If we want affinity, we create a “partition” as we create a structured stream
   –  A partition is a subset of the stream intended to be affinitized together
[Figure: Scope reading affinitized data. Affinitized data is stored close together.]
  
Case Study 1: Aggregation
SELECT GetDomain(URL) AS Domain,
       SUM(MyNewScoreFunction(A, B, …)) AS TotalScore
FROM Web-Table
GROUP BY Domain;

SELECT TOP 100 Domain ORDER BY TotalScore;
[Figure: with Unstructured Datasets, the plan is annotated “Expensive” and “Super Expensive”. With Structured Datasets (Sstream, partitioned by URL, sorted by URL), it is much more efficient w/o shuffling data across the network.]
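
Why sorted, partitioned input avoids a shuffle can be sketched as a one-pass aggregation inside a single partition. The Python below is an illustration under assumptions: `get_domain` is a hypothetical stand-in for the slide's GetDomain (it takes the text before the first "/", and every URL is assumed to contain one), the score column stands in for MyNewScoreFunction, and the sample rows are invented.

```python
from itertools import groupby


def get_domain(url):
    """Hypothetical stand-in for GetDomain: the text before the first '/'."""
    return url.split("/", 1)[0]


def aggregate_partition(rows):
    """Sum scores per domain in one pass over rows sorted by URL.

    Because the rows are sorted by URL, all rows of one domain are adjacent,
    so no hash table and no exchange with other partitions is needed.
    """
    return [(domain, sum(score for _, score in group))
            for domain, group in groupby(rows, key=lambda r: get_domain(r[0]))]


partition = [
    ("www.a.com/x", 1.0),
    ("www.a.com/y", 2.0),
    ("www.b.com/", 5.0),
]
print(aggregate_partition(partition))
```

Each partition can run this locally; only the small per-domain totals would then need to be combined for the TOP 100 step.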
  
Case Study 2: Selection
SELECT URL, feature1, feature2
FROM Web-Table
WHERE URL == www.imdb.com;

Unstructured Datasets: Massive data reads
Structured Datasets (Sstream) (partitioned by URL, sorted by URL):
•  Judiciously choose partition
•  Push predicate close to data

Partition Metadata
Partition   Range                            Metadata
…           …                                …
P100        www.imc.com ⇔ www.imovie.com
P101        www.imz.com ⇔ www.inode.com
…           …                                …
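
Choosing the partition from its range metadata can be sketched as a simple filter. This Python fragment is an illustration under assumptions: the two ranges are copied from the slide, while the `PARTITIONS` list, the `partitions_for` function, and the use of plain string comparison as the sort order are hypothetical.

```python
# (partition name, lowest URL, highest URL), taken from the slide's table.
PARTITIONS = [
    ("P100", "www.imc.com", "www.imovie.com"),
    ("P101", "www.imz.com", "www.inode.com"),
]


def partitions_for(url, partitions=PARTITIONS):
    """Names of the partitions whose [low, high] range can contain `url`."""
    return [name for name, low, high in partitions if low <= url <= high]


print(partitions_for("www.imdb.com"))
```

Only the partitions returned here need to be read; the predicate URL == www.imdb.com is then applied inside them.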
  
Case Study 3: Join Multiple Datasets
SELECT URL, COUNT(*) AS TotalClicks
FROM Web-Table AS W, Click-Stream AS C
WHERE GetDomain(W.URL) == www.shopping.com
    AND W.URL == C.URL AND W.Outlinks > 10
GROUP BY URL;

SELECT TOP 100 URL ORDER BY TotalClicks;
[Figure: with Unstructured Datasets, the plan is annotated “Expensive”, “Super Expensive”, and “Massive data reads”. With Structured Datasets (Sstream, partitioned by URL, sorted by URL), it is much more efficient w/o shuffling data across the network.]
•  Targeted partitions
•  Affinitized location
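
When both datasets are partitioned and sorted by URL, the join can run as a merge over two sorted inputs inside one partition. The Python below is a sketch under assumptions: the `merge_join_counts` function and the sample rows are hypothetical, each web URL is assumed to appear once, and the domain filter is assumed to have been handled already by reading only the targeted partitions.

```python
def merge_join_counts(web, clicks, min_outlinks=10):
    """Count clicks per URL for web rows with more than `min_outlinks` outlinks.

    web:    list of (url, outlinks), sorted by url, one row per url
    clicks: list of clicked urls, sorted
    """
    counts = []
    i = 0
    for url, outlinks in web:
        # Skip clicks that sort before the current web URL.
        while i < len(clicks) and clicks[i] < url:
            i += 1
        # Count the clicks that match the current web URL.
        n = 0
        while i < len(clicks) and clicks[i] == url:
            n += 1
            i += 1
        if n and outlinks > min_outlinks:
            counts.append((url, n))
    return counts


web = [("www.shopping.com/a", 25), ("www.shopping.com/b", 4), ("www.shopping.com/c", 40)]
clicks = ["www.shopping.com/a", "www.shopping.com/a",
          "www.shopping.com/b", "www.shopping.com/c"]
print(merge_join_counts(web, clicks))
```

Both inputs are read once, front to back, and no rows cross the network as long as matching partitions of the two streams are stored on the same machines.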
  
Takeaways
•  Cosmos manages OSD’s (Online Services Division) data and computation
   –  Hundreds of petabytes of data
   –  Many thousands of very large batch jobs per day
   –  Tens of thousands of servers in multiple datacenters
•  All the data in the same addressable store is essential to the business
   –  The money lies in combining data in surprising ways…
•  Cosmos is evolving:
   –  Structured Streams combines large scale parallelism with DB techniques to dramatically improve the efficiencies of computation
•  In the works:
   –  Integrated random and sequential processing over the same data
   –  Reliable workflow via pub-sub
   –  Transactionally correct changes over many petabytes of data
•  Cosmos needs fresh blood!
   –  Send your students our way! 60+ PhD and undergrad internships per year
  

More Related Content

What's hot

Cloud computing visualized.
Cloud computing visualized. Cloud computing visualized.
Cloud computing visualized. Soujan Kamesh
 
Report on cloud computing
Report on cloud computingReport on cloud computing
Report on cloud computingFarhanAhmade
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingJithin Parakka
 
Restaurant billing application
Restaurant billing applicationRestaurant billing application
Restaurant billing applicationch samaram
 
Chapter 1 Introduction to Cloud Computing
Chapter 1 Introduction to Cloud ComputingChapter 1 Introduction to Cloud Computing
Chapter 1 Introduction to Cloud Computingnewbie2019
 
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)HemaArora2
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computingnitinw25
 
Historical development of cloud computing
Historical development of cloud computingHistorical development of cloud computing
Historical development of cloud computinggaurav jain
 
A brief history of cloud computing
A brief history of cloud computingA brief history of cloud computing
A brief history of cloud computingOneserve
 
Sustainability and fog computing applications, advantages and challenges
Sustainability and fog computing applications, advantages and challengesSustainability and fog computing applications, advantages and challenges
Sustainability and fog computing applications, advantages and challengesAbdulMajidFarooqi
 

What's hot (20)

Churn Predictive Modelling
Churn Predictive ModellingChurn Predictive Modelling
Churn Predictive Modelling
 
Cloud computing visualized.
Cloud computing visualized. Cloud computing visualized.
Cloud computing visualized.
 
Report on cloud computing
Report on cloud computingReport on cloud computing
Report on cloud computing
 
Itil Mind Maps
Itil Mind MapsItil Mind Maps
Itil Mind Maps
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
Restaurant data
Restaurant data Restaurant data
Restaurant data
 
Restaurant billing application
Restaurant billing applicationRestaurant billing application
Restaurant billing application
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Tailor Management System
Tailor Management SystemTailor Management System
Tailor Management System
 
Cloud computing PPT
Cloud computing PPTCloud computing PPT
Cloud computing PPT
 
Cloud Computing and Edge Computing(CTO Kieun Park) - Edge Computing Seminar
Cloud Computing and Edge Computing(CTO Kieun Park) - Edge Computing SeminarCloud Computing and Edge Computing(CTO Kieun Park) - Edge Computing Seminar
Cloud Computing and Edge Computing(CTO Kieun Park) - Edge Computing Seminar
 
Chapter 1 Introduction to Cloud Computing
Chapter 1 Introduction to Cloud ComputingChapter 1 Introduction to Cloud Computing
Chapter 1 Introduction to Cloud Computing
 
Mis cloud computing
Mis cloud computingMis cloud computing
Mis cloud computing
 
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)
Unit 1 Software Testing (HANDWRITTEN+PRINTED NOTES)
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Historical development of cloud computing
Historical development of cloud computingHistorical development of cloud computing
Historical development of cloud computing
 
A brief history of cloud computing
A brief history of cloud computingA brief history of cloud computing
A brief history of cloud computing
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Sustainability and fog computing applications, advantages and challenges
Sustainability and fog computing applications, advantages and challengesSustainability and fog computing applications, advantages and challenges
Sustainability and fog computing applications, advantages and challenges
 
basics of cloud computing
basics of cloud computingbasics of cloud computing
basics of cloud computing
 

Similar to Microsoft cosmos

Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale SupercomputerSagar Dolas
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging振东 刘
 
Kognitio - an overview
Kognitio - an overviewKognitio - an overview
Kognitio - an overviewKognitio
 
EEDC - Apache Pig
EEDC - Apache PigEEDC - Apache Pig
EEDC - Apache Pigjavicid
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09Chris Purrington
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike, Inc.
 
Hadoop first mr job - inverted index construction
Hadoop first mr job - inverted index constructionHadoop first mr job - inverted index construction
Hadoop first mr job - inverted index constructionSubhas Kumar Ghosh
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAndre Essing
 
H2O World - What's New in H2O with Cliff Click
H2O World - What's New in H2O with Cliff ClickH2O World - What's New in H2O with Cliff Click
H2O World - What's New in H2O with Cliff ClickSri Ambati
 
Script of Scripts Polyglot Notebook and Workflow System
Script of ScriptsPolyglot Notebook and Workflow SystemScript of ScriptsPolyglot Notebook and Workflow System
Script of Scripts Polyglot Notebook and Workflow SystemBo Peng
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 

Similar to Microsoft cosmos (20)

Programmable Exascale Supercomputer
Programmable Exascale SupercomputerProgrammable Exascale Supercomputer
Programmable Exascale Supercomputer
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
Lecture17.ppt
Lecture17.pptLecture17.ppt
Lecture17.ppt
 
Lecture17 (1).ppt
Lecture17 (1).pptLecture17 (1).ppt
Lecture17 (1).ppt
 
Lecture17.ppt
Lecture17.pptLecture17.ppt
Lecture17.ppt
 
Kognitio - an overview
Kognitio - an overviewKognitio - an overview
Kognitio - an overview
 
EEDC Apache Pig Language
EEDC Apache Pig LanguageEEDC Apache Pig Language
EEDC Apache Pig Language
 
Apache Spark Components
Apache Spark ComponentsApache Spark Components
Apache Spark Components
 
EEDC - Apache Pig
EEDC - Apache PigEEDC - Apache Pig
EEDC - Apache Pig
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory Architecture
 
Hadoop first mr job - inverted index construction
Hadoop first mr job - inverted index constructionHadoop first mr job - inverted index construction
Hadoop first mr job - inverted index construction
 
Exascale Capabl
Exascale CapablExascale Capabl
Exascale Capabl
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep Dive
 
H2O World - What's New in H2O with Cliff Click
H2O World - What's New in H2O with Cliff ClickH2O World - What's New in H2O with Cliff Click
H2O World - What's New in H2O with Cliff Click
 
Script of Scripts Polyglot Notebook and Workflow System
Script of ScriptsPolyglot Notebook and Workflow SystemScript of ScriptsPolyglot Notebook and Workflow System
Script of Scripts Polyglot Notebook and Workflow System
 
HPC in the Cloud
HPC in the CloudHPC in the Cloud
HPC in the Cloud
 
Eedc.apache.pig last
Eedc.apache.pig lastEedc.apache.pig last
Eedc.apache.pig last
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 

More from Karthik Murugesan

Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation PlatformKarthik Murugesan
 
Yahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesYahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesKarthik Murugesan
 
Free servers to build Big Data Systems on: Bing's Approach
Free servers to build Big Data Systems on: Bing's  Approach Free servers to build Big Data Systems on: Bing's  Approach
Free servers to build Big Data Systems on: Bing's Approach Karthik Murugesan
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionKarthik Murugesan
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2Karthik Murugesan
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesKarthik Murugesan
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019Karthik Murugesan
 
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Unifying Twitter around a single ML platform  - Twitter AI Platform 2019Unifying Twitter around a single ML platform  - Twitter AI Platform 2019
Unifying Twitter around a single ML platform - Twitter AI Platform 2019Karthik Murugesan
 
The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...Karthik Murugesan
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019Karthik Murugesan
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at UberKarthik Murugesan
 
Developing a ML model using TF Estimator
Developing a ML model using TF EstimatorDeveloping a ML model using TF Estimator
Developing a ML model using TF EstimatorKarthik Murugesan
 
Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Karthik Murugesan
 
Netflix factstore for recommendations - 2018
Netflix factstore  for recommendations - 2018Netflix factstore  for recommendations - 2018
Netflix factstore for recommendations - 2018Karthik Murugesan
 
Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Karthik Murugesan
 
Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Karthik Murugesan
 
Spotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoverySpotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoveryKarthik Murugesan
 
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform Karthik Murugesan
 
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...
Uber - Building Intelligent Applications, Experimental ML with Uber’s Data Sc...Karthik Murugesan
 

Microsoft cosmos

  • 1. Cosmos: Big Data and Big Challenges
      Pat Helland (“MSFT Ex-Pat”/Free Agent) and Ed Harris (Microsoft Online Services)
      Stanford, Oct 26, 2011
  • 2. Outline
      • Introduction
      • Cosmos Overview
      • The Structured Streams Project
      • Conclusion
  • 3. What Is COSMOS?
      • Petabyte Store and Computation System
          – Several hundred petabytes of data
          – Tens of thousands of computers across many datacenters
      • Massively parallel processing based on Dryad
          – Similar to MapReduce but can represent arbitrary DAGs of computation
          – Automatic computation placement with data
      • SCOPE (Structured Computation Optimized for Parallel Execution)
          – SQL-like language with set-oriented record and column manipulation
          – Automatically compiled and optimized for execution over Dryad
      • Management of hundreds of “Virtual Clusters” for computation allocation
          – Buy your machines and give them to COSMOS
          – Guaranteed that many compute resources
          – May use more when they are not in use
      • Ubiquitous access to OSD’s data
          – Combining knowledge from different datasets is today’s secret sauce
  • 4. Cosmos and OSD Computation
      [Diagram: OSD Computing/Storage. Labels: Front-End On-Line Web-Serving; Back-End Batch Data Analysis; Crawling; Internet; Other Data; User & System Data; Data for On-Line Work; Results; Large Read-Only Datasets; Cosmos!]
      • OSD applications fall into two broad categories:
          – Back-end: Massive batch processing creates new datasets
          – Front-end: Online request processing serves up and captures information
      • Cosmos provides storage and computation for Back-End Batch data analysis
          – It does not support storage and computation needs for the Front-End
  • 5. COSMOS: The Service
      • Data drives search and advertising
          – Web pages: Links, text, titles, etc.
          – Search logs: What people searched for, what they clicked, etc.
          – Browse logs: What sites people visit, the browsing order, etc.
          – Advertising logs: What ads do people click on, what was shown, etc.
      • We ingest or generate a couple of PiB every day
          – Bing, MSN, Hotmail, Client telemetry
          – Web crawl snapshots
          – Structured data feeds
          – Long tail of other data sets of interest
      • COSMOS is the backbone for Bing analysis and relevance
          – Click-stream information is imported from many sources and “cooked”
          – Queries analyzing user context, click commands, and success are processed
      • COSMOS is a service
          – We run the code ourselves (on many tens of thousands of servers)
          – Users simply feed in data, submit jobs, and extract the results
  • 6. Outline
      • Introduction
      • Cosmos Overview
      • The Structured Streams Project
      • Conclusion
  • 7. Cosmos Architecture from 100,000 Feet
      • Store Layer:
          – Many Extent Nodes store and compress replicated extents on disk
          – Extents are combined to make unstructured streams
          – CSM (COSMOS Store Manager) handles names, streams, & replication
      • Execution Layer:
          – Jobs queue up per Virtual Cluster
          – When a job starts, it gets a Job Mgr to deploy work in parallel close to its data
          – Many Processing Nodes (PNs) host execution vertices running SCOPE code
      • SCOPE Layer:
          – SCOPE code is submitted to the SCOPE Compiler
          – The optimizer makes decisions about execution plan and parallelism
          – Algebra (describing the job) is built to run on the SCOPE Runtime
      [Diagram: three stacked layers. SCOPE Layer: example query Data = SELECT * FROM S WHERE Col-1 > 10, SCOPE Compiler, SCOPE Optimizer, SCOPE Run/T instances. Execution Layer: Job Mgr, PNs each hosting a SCOPE Run/T, Streams “Foo”, “Bar”, “Zot”. Store Layer: ENs holding extents, Streams “Foo”, “Bar”, “Zot”]
  • 8. The Store Layer
      • Extent Nodes:
          – Implement a file system holding extents
          – Each extent is up to 2GB
          – Compression and fault detection are important parts of the EN
      • CSM: COSMOS Store Manager
          – Instructs replication across 3 different ENs per extent
          – Manages composition of streams out of extents
          – Manages the namespace of streams
      [Diagram: Store Layer. ENs holding extents; Streams “Foo”, “Bar”, “Zot”]
  • 9. The Execution Engine
      • Execution Engine:
          – Takes the plan for the parallel execution of a SCOPE job and finds computers to perform the work
          – Responsible for the placement of the computation close to the data it reads
          – Ensures all the inputs for the computation are available before firing it up
          – Responsible for failures and restarts
      • Dryad is similar to Map-Reduce
      [Diagram: Execution Layer. Job Mgr; PNs each hosting a SCOPE Run/T; Streams “Foo”, “Bar”, “Zot”]
  • 10. The SCOPE Language
      • SCOPE (Structured Computation Optimized for Parallel Execution)
          – Heavily influenced by SQL and relational algebra
          – Changed to deal with input and output streams
      • SCOPE is a high-level declarative language for data manipulation
          – It translates very naturally into parallel computation
      [Diagram: a Scope Job with inputs Stream-1, Stream-2, Stream-3 and outputs Stream-A, Stream-B. Input arrives as sets of records; computation occurs as sets of records; output is written as sets of records]
  • 11. The SCOPE Compiler and Optimizer
      • The SCOPE Compiler and Optimizer take SCOPE programs and create:
          – The algebra describing the computation
          – The breakdown of the work into processing units
          – The description of the inputs and outputs from the processing units
      • Many decisions about compiling and optimizing are driven by data size and minimizing data movement
      [Diagram: SCOPE Layer. Example query Data = SELECT * FROM S WHERE Col-1 > 10; SCOPE Compiler; SCOPE Optimizer; SCOPE Run/T instances]
  • 12. The Virtual Cluster
      • Virtual Cluster: a management tool
          – Allocates resources across groups within OSD
          – Cost model captured in a queue of work (with priority) within the VC
      • Each Virtual Cluster has a guaranteed capacity
          – We will bump other users of the VC’s capacity if necessary
          – The VC can use other idle capacity
      [Diagram: five Virtual Clusters, each with its own Work Queue. VC-A: 100 Hi-Pri PNs; VC-B: 500 Hi-Pri PNs; VC-C: 20 Hi-Pri PNs; VC-D: 1000 Hi-Pri PNs; VC-E: 350 Hi-Pri PNs]
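The capacity rule on this slide (each VC is guaranteed its own PNs and may borrow idle ones) can be illustrated with a small sketch. This is not the Cosmos scheduler: the `VirtualCluster` class, the `allocate` function, and the first-come order for handing out idle PNs are all assumptions made for illustration, and the sketch assumes the guarantees sum to no more than the total PN count.

```python
from dataclasses import dataclass


@dataclass
class VirtualCluster:
    name: str
    guaranteed: int  # PNs this VC owns and can always reclaim
    demand: int      # PNs its queued work could use right now


def allocate(vcs, total_pns):
    """Return {vc name: PNs granted}; hypothetical two-pass policy."""
    # Pass 1: every VC gets what it asks for, up to its guarantee.
    alloc = {vc.name: min(vc.demand, vc.guaranteed) for vc in vcs}
    idle = total_pns - sum(alloc.values())
    # Pass 2: lend idle PNs to VCs that still want more (list order is arbitrary).
    for vc in vcs:
        extra = min(vc.demand - alloc[vc.name], idle)
        if extra > 0:
            alloc[vc.name] += extra
            idle -= extra
    return alloc


vcs = [
    VirtualCluster("VC-A", guaranteed=100, demand=150),
    VirtualCluster("VC-B", guaranteed=500, demand=200),
    VirtualCluster("VC-C", guaranteed=20, demand=20),
]
print(allocate(vcs, total_pns=620))
```

Re-running `allocate` when VC-B's demand rises back to 500 shrinks VC-A to its guarantee, which is one way to model the "bump" behavior the slide mentions.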
  • 13. Outline
      • Introduction
      • Cosmos Overview
      • The Structured Streams Project
      • Conclusion
  • 14. Introducing Structured Streams
      • Cosmos currently supports streams
          – An unstructured byte stream of data
          – Created by append-only writing to the end of the stream
      • Structured streams are streams with metadata
          – Metadata defines column structure and affinity/clustering information
      • Structured streams simplify extractors and outputters
          – A structured stream may be imported into Scope without an extractor
      • Structured streams offer performance improvements
          – Column features allow for processing optimizations
          – Affinity management can dramatically improve performance
          – Key-oriented features offer (sometimes very significant) access performance improvements
      [Diagram: Today’s streams (unstructured): Stream “A” as a sequence of bytes. New structured streams: Stream “A” as a sequence of bytes plus metadata, with record-oriented access]
  • 15. Today’s Use of Extractors and Outputters
      • Extractors
          – Programs to input data and supply metadata
      • Outputters
          – Take Scope data and create a bytestream for storage
          – Discards metadata known to the system
      Example:
          source = EXTRACT col1, col2 FROM "A"
          Data = SELECT * FROM source
          <process Data>
          OUTPUT Data to "D"
      [Diagram: unstructured Stream “A” → Extractor (supplies metadata) → Scope processing with metadata, structure, and relational ops → Outputter → unstructured Stream “D”]
  • 16. Metadata, Streams, Extractors, & Outputters
      • Scope has metadata for the data it is processing
          – Extractors provide metadata info as they suck up unstructured streams
      • Processing the Scope queries ensures metadata is preserved
          – The new results may have different metadata than the old
          – Scope knows the new metadata
      • Scope writes structured streams
          – The internal information used by Scope is written out as metadata
      • Scope reads structured streams
          – Reading a structured stream allows later jobs to see the metadata
      [Diagram: unstructured Stream “A” → Extractor (supplies metadata) → Scope processing with metadata, structure, and relational ops → Outputter → unstructured Stream “D”. Scope also reads structured Stream “B” (with metadata) and writes structured Stream “C” (with metadata)]
      Note: There is no Cosmos notion of metadata for Stream “D”; only the Outputter knows it.
      The representation of a structured stream on disk is only visible to Scope!
  • 17. Streams, Metadata, and Increased Performance
      • By adding metadata (describing the stream) into the stream, we can provide performance improvements:
          – Cluster-Key access: random reads of records identified by key
          – Partitioning and affinity: data to be processed together (sometimes across multiple streams) can be placed together for faster processing
      • Metadata for a structured stream is kept inside the stream
          – The stream is a self-contained unit
          – The structured stream is still an unstructured stream (plus some stuff)
      [Diagram: Stream “A” with metadata enabling Cluster-Key Access and Partitioning and Affinity]
  • 18. Cluster-Key Lookup
      • Cluster-Key indices make a huge performance improvement
          – Today: If you want a few records, you must process the whole stream
          – Structured Streams: Directly access the records by cluster-key index
      • How it works:
          – Cluster-Key lookup is implemented by having indexing information contained in the metadata inside the stream
              • The records must be stored in cluster-key order to use cluster-key lookup
              • Cosmos-managed index generated at structured stream creation
      [Diagram: Stream “Foo” with metadata (including index). Records are stored in cluster-key order (A, B, C, D, …); the index holds a subset of the keys (A, E, J, N, Q, W); a lookup of “D” uses the index to find the block holding D]
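A minimal sketch of the idea on this slide: records are kept in cluster-key order, a small index of block boundaries is stored with the stream, and a lookup reads only one block. The block layout, the names `build_index` and `lookup`, and the assumption of unique keys are illustrative choices, not the actual Cosmos on-disk format (which the slides say is visible only to Scope).

```python
import bisect


def build_index(records, block_size):
    """Split key-sorted (key, value) records into blocks; index each block's first key."""
    blocks = [records[i:i + block_size] for i in range(0, len(records), block_size)]
    first_keys = [block[0][0] for block in blocks]
    return blocks, first_keys


def lookup(blocks, first_keys, key):
    """Return values stored under `key`, scanning only the one candidate block."""
    i = bisect.bisect_right(first_keys, key) - 1  # last block whose first key <= key
    if i < 0:
        return []  # key sorts before everything in the stream
    return [value for k, value in blocks[i] if k == key]


# Records "A".."P" in cluster-key order, four per block (hypothetical data).
records = [(c, c.lower()) for c in "ABCDEFGHIJKLMNOP"]
blocks, first_keys = build_index(records, block_size=4)
print(first_keys)
print(lookup(blocks, first_keys, "D"))
```

The index here is sparse (one entry per block), which mirrors the diagram's index holding only some of the keys; the sort-order requirement on the slide is what makes the binary search valid.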
  • 19. Implementing Partitioning and Affinity
      • Joins across streams can be very expensive
          – Network traffic is a major expense when joining large datasets together
          – Placing related data together can dramatically reduce processing cost
      • We affinitize data when we believe it is likely to be processed together
          – Affinitization places the data close together
          – If we want affinity, we create a “partition” as we create a structured stream
          – A partition is a subset of the stream intended to be affinitized together
      [Diagram: Scope reading affinitized data, which is stored close together]
  • 20. Case Study 1: Aggregation
      Query:
          SELECT GetDomain(URL) AS Domain,
                 SUM(MyNewScoreFunction(A, B, …)) AS TotalScore
          FROM Web-Table
          GROUP BY Domain;
          SELECT TOP 100 Domain ORDER BY TotalScore;
      • Unstructured Datasets
          – Plan stages are marked “Expensive” and “Super Expensive”
      • Structured Datasets (Sstream), partitioned by URL, sorted by URL
          – Much more efficient w/o shuffling data across network
  • 21. Case Study 2: Selection
      Query:
          SELECT URL, feature1, feature2
          FROM Web-Table
          WHERE URL == "www.imdb.com";
      • Unstructured Datasets
          – Massive data reads
      • Structured Datasets (Sstream), partitioned by URL, sorted by URL
          – Judiciously choose partition
          – Push predicate close to data
      Partition metadata:
          Partition | Range                          | Metadata
          …         | …                              | …
          P100      | www.imc.com ↔ www.imovie.com   |
          P101      | www.imz.com ↔ www.inode.com    |
          …         | …                              | …
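The "judiciously choose partition" step can be sketched as a range check against the partition metadata. The tuple layout and the names `partitions_for_key` and `select_url` are hypothetical; only the two ranges come from the slide, and the rows are made-up sample data.

```python
def partitions_for_key(partition_ranges, key):
    """Return names of partitions whose [low, high] key range could contain `key`."""
    return [name for name, low, high in partition_ranges if low <= key <= high]


def select_url(partition_ranges, partition_data, url):
    """Read only the partitions that can hold `url`, then apply the predicate there."""
    rows = []
    for name in partitions_for_key(partition_ranges, url):
        rows.extend(row for row in partition_data[name] if row["URL"] == url)
    return rows


# Ranges taken from the slide's partition metadata table.
partition_ranges = [
    ("P100", "www.imc.com", "www.imovie.com"),
    ("P101", "www.imz.com", "www.inode.com"),
]
# Hypothetical partition contents.
partition_data = {
    "P100": [{"URL": "www.imdb.com", "feature1": 1, "feature2": 2}],
    "P101": [{"URL": "www.infoq.com", "feature1": 3, "feature2": 4}],
}
print(partitions_for_key(partition_ranges, "www.imdb.com"))
print(select_url(partition_ranges, partition_data, "www.imdb.com"))
```

Only P100 is opened for this query, which is the contrast the slide draws with the "massive data reads" of the unstructured case.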
  • 22. Case Study 3: Join Multiple Datasets
      Query:
          SELECT URL, COUNT(*) AS TotalClicks
          FROM Web-Table AS W, Click-Stream AS C
          WHERE GetDomain(W.URL) == "www.shopping.com"
            AND W.URL == C.URL AND W.Outlinks > 10
          GROUP BY URL;
          SELECT TOP 100 URL ORDER BY TotalClicks;
      • Unstructured Datasets
          – Massive data reads
          – Plan stages are marked “Expensive” and “Super Expensive”
      • Structured Datasets (Sstream), partitioned by URL, sorted by URL
          – Much more efficient w/o shuffling data across network
          – Targeted partitions
          – Affinitized location
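Why partitioning and sorting both datasets by URL avoids the shuffle can be shown with a local merge join. This is a sketch under assumptions: each dataset is already split into matching, URL-sorted partitions stored together, Web-Table URLs are unique, and the function names are invented for illustration. The domain and outlink filters from the query are omitted to keep the example short.

```python
def merge_join_counts(web_urls, click_urls):
    """Count clicks per URL by walking two URL-sorted lists in step (no hashing, no shuffle)."""
    counts = {}
    i = j = 0
    while i < len(web_urls) and j < len(click_urls):
        if web_urls[i] < click_urls[j]:
            i += 1
        elif web_urls[i] > click_urls[j]:
            j += 1
        else:
            counts[web_urls[i]] = counts.get(web_urls[i], 0) + 1
            j += 1  # stay on the same web URL; more clicks for it may follow
    return counts


def affinitized_join(web_partitions, click_partitions):
    """Join partition k of one dataset with partition k of the other, one pair at a time."""
    total = {}
    for web_part, click_part in zip(web_partitions, click_partitions):
        total.update(merge_join_counts(web_part, click_part))
    return total


# Hypothetical co-partitioned data: partition k of each dataset covers the same URL range.
web_partitions = [["a.com/1", "a.com/2", "a.com/4"], ["b.com/1"]]
click_partitions = [["a.com/1", "a.com/1", "a.com/3", "a.com/4"], ["b.com/1"]]
print(affinitized_join(web_partitions, click_partitions))
```

Each pair of partitions can be joined on the machine that holds both, so the per-partition results only need to be combined for the final TOP 100 step.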
  • 23. Outline
      • Introduction
      • Cosmos Overview
      • The Structured Streams Project
      • Conclusion
  • 24. Takeaways
      • Cosmos manages OSD’s (Online Services Division) data and computation
          – Hundreds of petabytes of data
          – Many thousands of very large batch jobs per day
          – Tens of thousands of servers in multiple datacenters
      • All the data in the same addressable store is essential to the business
          – The money lies in combining data in surprising ways…
      • Cosmos is evolving:
          – Structured Streams combines large-scale parallelism with DB techniques to dramatically improve the efficiencies of computation
      • In the works:
          – Integrated random and sequential processing over the same data
          – Reliable workflow via pub-sub
          – Transactionally correct changes over many petabytes of data
      • Cosmos needs fresh blood!
          – Send your students our way! 60+ PhD and undergrad internships per year