2010.04.30 summary of cloud futures 2010 marco parenzan pov

Transcript

  • 1. Cloud Futures 2010 http://research.microsoft.com/en-us/events/cloudfutures2010/ Marco Parenzan, MOSE. MOSE – University of Trieste 30 April, 2010 - slide 1
  • 2. Marco Parenzan. Age 36 ("over 30"); married with two children; I live in Fiume Veneto (PN). A "product" of this University. Background in software development in companies: software companies (external view) and manufacturing companies (internal view); "insourcing" (the opposite of outsourcing). I am a freelancer: custom application development and consulting. I collaborate with the MOSE Laboratory of the University of Trieste, where I work on development methodologies and tools and on software projects.
  • 3. Teaching activities. 9 years as Adjunct Lecturer (Docente a Contratto) at this University: 4 in the Bachelor's program in Computer Engineering (Computer Programming, formerly Prof. Cesari's course) and 5 in the Master's program in Computer Engineering (Web Services Programming). Corporate training. Training at regional institutions: IAL/Centro Formazione Pordenone (Villaggio del Fanciullo); 2 IFTS courses (Istruzione e formazione tecnica superiore, i.e. higher technical education and training; 1200 hours, Software Technician), entire Microsoft curriculum. Speaker for user groups: xe.net (http://www.xe.net/), UGI.ALT.net (http://www.ugialt.net/), 1nn0va (http://www.1nn0va.net/); events at the Consorzio Universitario di Pordenone.
  • 4. 1nn0va. The association is non-profit, non-partisan and apolitical, and has exclusively scientific aims. It seeks to spread emerging and current technologies by organizing conferences, writing and distributing publications, applying innovative software development techniques and methodologies in non-profit projects, and coordinating with other associations, groups or institutions. Outreach in the local (Pordenone) area.
  • 5. First 1nn0valab: 28 May 2010. EXCEPTIONAL 1NN0VA WPF EVENT: 1nn0vaLAB IS BORN. No slides or long monologues: you will get hands-on, typing the code shown by Marco yourself on a PC placed at your complete disposal specially for this event. Since version 3.0 of the .NET Framework, released in November 2006, a new library has been available for developing desktop applications: Windows Presentation Foundation. Four years after its release, and coinciding with the release of .NET Framework 4.0, it is time to make the leap. We will cover: · the new underlying assumptions, and hence the differences from the old GDI model of Windows Forms · the new layout system · the new relationship between designer and developer, with the declarative approach and the XAML language · the use of architectural patterns in desktop application development with Model-View-ViewModel (M-V-VM) · why it is time to move to WPF, given that we also have Silverlight for developing Rich Internet Applications (now at version 4) and (brand new) application development for the upcoming Windows Phone 7. The event will take place at the Consorzio Universitario di Pordenone, via Prasecco 3a, room L2 (basement floor), building B.
  • 6. Call for Abstracts
  • 7. Call for Abstracts: "Advancing Research with Cloud Computing". Cloud computing is fast becoming the most important platform for research. Researchers today need vast computing resources to collect, share, manipulate, and explore massive data sets as well as to build and deploy new services for research. Cloud computing has the potential to advance research discoveries by making data and computing resources readily available at unprecedented economy of scale and nearly infinite scalability. To realize the full promise of cloud computing for research, however, one must think about the cloud as a holistic platform for creating new services, new experiences, and new methods to pursue research, teaching and scholarly communication. This goal presents a broad range of interesting questions. We invited extended abstracts that illustrate the role of cloud computing across a variety of research and curriculum development areas (including computer science, earth sciences, healthcare, humanities, life sciences, and social sciences) and that highlight how new techniques and methods of research in the cloud may solve distinct challenges arising in those diverse areas. Source: http://research.microsoft.com/en-us/events/cloudfutures2010/
  • 8. Microsoft Research
  • 9. The University of Washington eScience Institute. Ed Lazowska: Bill & Melinda Gates Chair in Computer Science & Engineering, University of Washington; Director, University of Washington eScience Institute. http://lazowska.cs.washington.edu/cloud2010.pdf
  • 10. Dan Reed
  • 11. Massive volumes of data from sensors and networks of sensors
  • 12. SDSS Apache Point telescope: 80 TB of raw image data (80,000,000,000,000 bytes) over a 7-year period
  • 13. Large Hadron Collider: 700 MB of data per second, 60 TB/day, 20 PB/year
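The three Large Hadron Collider figures on slide 13 are related by simple unit arithmetic. The sketch below checks them, assuming decimal (SI) units and a constant data rate all year round; both are assumptions of this sketch, not statements from the slide, and the function name is illustrative.

```python
# Decimal (SI) units, assumed because the slide quotes round figures.
MB, TB, PB = 10**6, 10**12, 10**15

def daily_and_yearly_volume(bytes_per_second, days_per_year=365):
    """Return (bytes per day, bytes per year) for a constant data rate."""
    per_day = bytes_per_second * 86_400  # 86,400 seconds in a day
    return per_day, per_day * days_per_year

per_day, per_year = daily_and_yearly_volume(700 * MB)
print(f"{per_day / TB:.1f} TB/day, {per_year / PB:.1f} PB/year")
# prints "60.5 TB/day, 22.1 PB/year"
```

Under these assumptions the result is of the same order as the slide's rounded 60 TB/day and 20 PB/year.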
  • 14. Illumina HiSeq 2000 Sequencer: ~1 TB/day. Major labs have 25-100 of these machines.
  • 15. Regional Scale Nodes of the NSF Ocean Observatories Initiative: 1000 km of fiber optic cable on the seafloor, connecting thousands of chemical, physical, and biological sensors
  • 16. The Web: 20+ billion web pages x 20 KB = 400+ TB. One computer can read 30-35 MB/sec from disk => 4 months just to read the web.
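The estimate on slide 16 can be reproduced in a few lines. This is a minimal sketch assuming decimal units and a single disk read at a constant rate; the helper name `scan_time_days` is illustrative.

```python
KB, MB, TB = 10**3, 10**6, 10**12

def scan_time_days(total_bytes, bytes_per_second):
    """Days needed to read `total_bytes` sequentially at a constant rate."""
    return total_bytes / bytes_per_second / 86_400

web_bytes = 20 * 10**9 * 20 * KB            # 20 billion pages x 20 KB each
slow = scan_time_days(web_bytes, 30 * MB)   # about 154 days
fast = scan_time_days(web_bytes, 35 * MB)   # about 132 days
print(f"{web_bytes / TB:.0f} TB, {fast:.0f}-{slow:.0f} days")
```

At 30-35 MB/sec the read takes between four and five months, consistent with the slide's "4 months" as a rounded lower figure.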
  • 17. eScience: sensor-driven (data-driven) science and engineering. Transforming science (again!). Jim Gray
  • 18. http://research.microsoft.com/en-us/collaboration/fourthparadigm/
  • 19. eScience is about the analysis of data: the automated or semi-automated extraction of knowledge from massive volumes of data. It's not just a matter of volume.
  • 20. Large Synoptic Survey Telescope (LSST): 40 TB/day (an SDSS every two days), 100+ PB in its 10-year lifetime; 400 Mbps sustained data rate between Chile and NCSA
  • 21. LSST Data Management System is widely distributed. [Diagram. Legend: a Site is a physical location/space that hosts DM centers; a Center is a DM functional capability hosted at a Site; sites are connected via dedicated, protected fiber optic circuits. Labels: Headquarters Site, Archive Site, Base Site; Archive Center, Base Center, Systems Operations Center (SOC), co-located Data Access Center (DAC), Education and Public Outreach Center (EPOC).] [Andy Connolly, University of Washington, and LSST; NSF Review, December 15-17, 2009, Tucson, AZ]
  • 22. But astronomy is substantially ahead of most other fields. Data management in computational astrophysics: fopen(), fread(), fwrite(), fclose(), scp (Jeff Gardner, UW eScience Institute). Each simulation generates a sequence of snapshots; each snapshot is a single flat file; analysis is via C or Fortran programs.
  • 23. Data management in biology: "90% of all business data is maintained in spreadsheets" (Enrique Godreau, Voyager Capital)
  • 24. Top faculty across all disciplines understand and fear the coming data tsunami. Survey of 125 top investigators: "Data, data, data". Flat files and Excel are the most common data management tools … Great for Microsoft, lousy for science! Typical science workflow: 2 years ago: 1/2 day/week; now: 1 FTE; in 2 years: 10 FTE. Need tools, tools, tools!
  • 25. eScience is married to the Cloud: scalable computing and storage for everyone
  • 26. Economics of Cloud Users: pay by use instead of provisioning for peak. [Figure: two plots of resources against time showing capacity and demand, one labelled "Static data center" (with unused resources marked) and one labelled "Data center in the cloud".]
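The "pay by use instead of provisioning for peak" argument on slide 26 can be illustrated with a toy cost model. The hourly demand series and the unit price below are invented for illustration, and the two function names are hypothetical.

```python
def static_cost(demand, price_per_unit_hour):
    """Provision for peak: pay for peak capacity in every hour."""
    return max(demand) * len(demand) * price_per_unit_hour

def pay_per_use_cost(demand, price_per_unit_hour):
    """Pay only for the units actually used in each hour."""
    return sum(demand) * price_per_unit_hour

demand = [2, 2, 3, 10, 4, 2]  # hypothetical servers needed, hour by hour
unused = max(demand) * len(demand) - sum(demand)
print(static_cost(demand, 1), pay_per_use_cost(demand, 1), unused)
# prints "60 23 37"
```

In this made-up series the static data center pays for 60 server-hours to serve 23, so pay-per-use remains cheaper as long as its unit price is less than about 2.6 times the in-house price.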
  • 27. Cloud Computing: Confusion. "The interesting thing about cloud computing is that we've redefined Cloud Computing to include everything that we already do… I don't understand what we would do differently in the light of Cloud Computing than change some of the words in our ads." Larry Ellison (Oracle CEO), quoted in the Wall Street Journal, Sept 26, 2008
  • 28. Cloud Computing: Confusion. "A lot of people are jumping on the [cloud] bandwagon, but I have not heard two people say the same thing about it. There are multiple definitions out there of 'the cloud'." Andy Isherwood (HP VP of European Software Sales), in ZDNews, Dec 11, 2008
  • 29. Cloud Computing: Confusion. "It's stupidity. It's worse than stupidity: it's a marketing hype campaign. Somebody is saying this is inevitable – and whenever you hear somebody saying that, it's very likely to be a set of businesses campaigning to make it true." Richard Stallman ("free software" advocate), in The Guardian, Sept 29, 2008
  • 30. Dilbert on Cloud Computing. November 18th, 2009: http://www.dilbert.com/strips/comic/2009-11-18/ November 19th, 2009: http://www.dilbert.com/strips/comic/2009-11-19/
  • 31. The Cloud... this great unknown?
  • 37. http://www.youtube.com/watch?v=PPnoKb9fTkA&feature=player_embedded
  • 39. Cloud Computing: "A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet." [I. Foster, Y. Zhao, I. Raicu, S. Lu, "Cloud Computing and Grid Computing 360-Degree Compared", in Proc. IEEE Grid Computing Environments Workshop, Austin (TX), Nov. 2008, pp. 1-10.]
  • 40. What Is Cloud Computing? Three New Aspects to Cloud Computing: the Illusion of Infinite Computing Resources Available on Demand; the Elimination of an Upfront Commitment by Cloud Users; the Ability to Pay for Use of Computing Resources on a Short-Term Basis as Needed.
      Above the Clouds: a Berkeley View of Cloud Computing: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
      "Cloud Computing Book Report" on the UC Berkeley Paper "Above the Clouds: a Berkeley View of Cloud Computing": http://blogs.msdn.com/pathelland/archive/2009/04/10/book-report-on-the-uc-berkeley-paper-above-the-clouds-a-berkeley-view-of-cloud-computing.aspx and http://cid-84f3c5ef51d06e8b.skydrive.live.com/self.aspx/.Public/2009/Above-the-Clouds-090401k.pptx
  • 41. Evolution to Cloud Computing (from another presentation)
      Application runs on-premises: bring my own machines, connectivity, software, etc.; buy my own hardware, and manage my own data center; complete control and responsibility; upfront capital costs for the infrastructure.
      Application runs at a hoster: rent machines, connectivity, software; pay someone to host my application using hardware that I specify; less control, but fewer responsibilities; lower capital costs, but pay for fixed capacity, even if idle.
      Application runs using cloud platform: shared, multi-tenant environment; pay someone for a pool of computing resources that can be applied to a set of applications; offers pool of computing resources, abstracted from infrastructure; pay as you go.
  • 42. New Application Opportunities. Some interesting new types of applications enabled by the Cloud:
      Mobile Interactive Apps: applications that respond in real time but work with lots of data. Cloud computing offers highly-available large datasets.
      Parallel Batch Processing: "Cost Associativity", many systems for a short time. The Washington Post used 200 EC2 instances to process 17,481 pages of Hillary Clinton's travel documents within 9 hours of their release.
      Rise of Analytics: again, "Cost Associativity", many systems for a short time. Compute-intensive data analysis which may be parallelized.
      Compute Intensive Desktop Apps: for example, symbolic mathematics requires lots of computing per unit of data. Cost efficient to push the data to the cloud for computation.
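"Cost associativity" on slide 42 means that, under per-instance-hour pricing, many machines for a short time cost the same as one machine for a long time. A minimal sketch follows; the price is hypothetical, and the 9 hours are used as an upper bound because the slide only says the job finished within 9 hours of the documents' release.

```python
def cost_cents(instances, hours, cents_per_instance_hour):
    """Total bill under simple per-instance-hour pricing."""
    return instances * hours * cents_per_instance_hour

PRICE = 10  # hypothetical price: 10 cents per instance-hour

parallel = cost_cents(200, 9, PRICE)      # 200 instances for 9 hours
serial = cost_cents(1, 200 * 9, PRICE)    # 1 instance for 1800 hours (75 days)
print(parallel, serial)  # prints "18000 18000"
```

The bill is identical, but the parallel run delivers the result in hours rather than months.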
  • 43. Conclusions and Questions about the Cloud of Tomorrow
      Utility Computing: It's Happening! Grow and Shrink on Demand; Pay-As-You-Go.
      Cloud Provider's View: Huge Datacenters Opened Economies and Possibilities.
      Cloud User's View: Startups Don't Need Datacenters; Established Organizations Leverage Elasticity; UC Berkeley Has Extensively Leveraged Elasticity to Meet Deadlines.
      Cloud Computing: High-Margin or Low-Margin Business? Potential Cost Factor of 5-7X; Today's Cloud Providers Had Big Datacenter Infrastructure Anyway.
      Implications of Cloud: Application Software: Scale-Up and Down Rapidly; Client and Cloud. Infrastructure Software: Runs on VMs; Has Built-in Billing. Hardware Systems: Huge Scale; Container-Based; Energy Proportional.
  • 44. Top 10 Obstacles and Opportunities (Obstacle: Opportunity)
      1. Availability of Service: Use Multiple Cloud Providers; Use Elasticity to Prevent DDOS
      2. Data Lock-In: Standardized APIs; Compatible Software to Enable Surge Computing
      3. Data Confidentiality and Auditability: Deploy Encryption, VLANs, Firewalls; Geographical Data Storage
      4. Data Transfer Bottlenecks: FedExing Disks; Data Backup/Archival; Higher Bandwidth Switches
      5. Performance Unpredictability: Improved VM Support; Flash Memory; Gang Scheduling VMs
      6. Scalable Storage: Invent Scalable Store
      7. Bugs in Large Distributed Systems: Invent Debugger that Relies on Dist VMs
      8. Scaling Quickly: Auto-Scaler; Snapshots for Conservation
      9. Reputation Fate Sharing: Reputation Guarding Services
      10. Software Licensing: Pay-for-Use Licenses; Bulk Use Sales
  • 45. #3 Obstacle: Data Confidentiality and Auditability. "My sensitive corporate data will never be in the cloud!"
      Current Clouds Are Essentially Public Networks: They Are Exposed to More Attacks.
      Auditability Is Required: Sarbanes-Oxley, HIPAA.
      Berkeley Believes There Are No Fundamental Obstacles to Making Cloud Computing as Secure as Most In-House IT: Encrypted Storage; Virtual LANs; Network Middleboxes (Firewalls, Packet Filters). Encrypted Data in the Cloud Is Likely More Secure than Unencrypted Data on Premises.
      Maybe: Cloud Provided Auditability. More Focus on Virtual Capabilities…; Auditing Below VMs; Maybe More Tamper Resistant.
      Concerns over National Boundaries: USA PATRIOT Act Gives Some Europeans Worries over SaaS in the USA; Foreign Subpoenas; Blind Subpoenas.
  • 46. #4 Obstacle: Data Transfer Bottlenecks. Problem: At $100 to $150 per Terabyte Transferred, Data Placement and Movement Is an Issue.
      Opportunity-1: Sneaker-Net. Jim Gray Found Cheapest Transfer Was FedEx-ing Disks; 1 Data Failure in 400 Attempts. Example: Ship 10TB from UC Berkeley to Amazon.
          WAN: S3 < 20 Mbits/sec: 10TB => 4 Mil Seconds => > 45 Days; $1000 in AMZN Net Fees.
          FedEx: Ten 1TB Disks via Overnight Shipping; < 1 Day to Write 10TB to Disks Locally; Cost ≈ $400; Effective BW of 1500 Mbits/Sec; "NetFlix for Cloud Computing".
      Opportunity-2: Keep Data in Cloud. If the Data Is in the Cloud, Transfer Doesn't Cost; Amazon Hosting Large Data (E.g. US Census): Free on S3; Free on EC2; Entice EC2 Business.
      Opportunity-3: Cheaper WAN. High-End Routers Are a Big Part of the Cost of Data Transfer; Research into Routing using Cheap Commodity Computers.
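The WAN versus FedEx comparison on slide 46 follows from one formula: time = bits / bandwidth. The sketch below assumes decimal terabytes and constant throughput; the function name is illustrative.

```python
TB = 10**12

def transfer_seconds(num_bytes, megabits_per_second):
    """Seconds needed to move `num_bytes` at a constant link speed."""
    return num_bytes * 8 / (megabits_per_second * 10**6)

wan_seconds = transfer_seconds(10 * TB, 20)           # 4,000,000 s
wan_days = wan_seconds / 86_400                       # about 46 days
fedex_hours = transfer_seconds(10 * TB, 1500) / 3600  # about 15 hours
print(f"WAN: {wan_days:.1f} days; at 1500 Mbit/s: {fedex_hours:.1f} hours")
# prints "WAN: 46.3 days; at 1500 Mbit/s: 14.8 hours"
```

The WAN figures match the slide's "4 Mil Seconds" and "> 45 Days". The quoted effective bandwidth of 1500 Mbits/sec corresponds to moving 10 TB in roughly 15 hours.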
  • 47. To better understand, read the originals...
      Above the Clouds: a Berkeley View of Cloud Computing: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
      "Cloud Computing Book Report" on the UC Berkeley Paper "Above the Clouds: a Berkeley View of Cloud Computing": http://blogs.msdn.com/pathelland/archive/2009/04/10/book-report-on-the-uc-berkeley-paper-above-the-clouds-a-berkeley-view-of-cloud-computing.aspx and http://cid-84f3c5ef51d06e8b.skydrive.live.com/self.aspx/.Public/2009/Above-the-Clouds-090401k.pptx
      Demystifying the Cloud (Simon Guest): http://simonguest.com/blogs/smguest/archive/2009/05/14/Slides-from-TechEd-2009.aspx
      An introduction to Cloud Computing: http://s3.amazonaws.com/ppt-download/ima-cloud-computing-mar2010-v8-100320181538-phpapp02.pdf?Signature=GhK3ogCr2Z%2FzhWFa%2F%2BJUr1cT1eg%3D&Expires=1269958049&AWSAccessKeyId=AKIAJLJT267DEGKZDHEQ
      …and many others
  • 48. A Spectrum of Application Models. [Chart with two axes: constraints in the app model (less constrained to more constrained) and automated management services (less automation to more automation).]
      Amazon AWS: VMs Look Like Hardware; No Limit on App Model; User Must Implement Scalability and Failover.
      Microsoft Azure: .NET CLR/Windows Only; Choice of Language; Some Auto Failover/Scale (but needs declarative application properties).
      Google App Engine: Traditional Web Apps; Auto Scaling/Provisioning.
      Force.Com: SalesForce Biz Apps; Auto Scaling/Provisioning.
  • 49. Spectrum of Clouds: Instruction Set VM (Amazon EC2, 3Tera); Bytecode VM (Microsoft Azure); Framework VM (Google AppEngine, Force.com). From lower-level with less management to higher-level with more management: EC2, Azure, AppEngine, Force.com.
  • 50. A Spectrum of Application Models: Which Model Will Dominate?? High-Level Languages and Frameworks Can Be Built on Lower-Level. Analogy: Programming Languages and Frameworks. • Low-Level Languages (C/C++) Allow Fine-Grained Control • Building a Web App in C++ Is a Lot of Cumbersome Work • Ruby-on-Rails Hides the Mechanics but Only If You Follow Request/Response and Ruby's Abstractions. More-Constrained Clouds May Be Built on Less-Constrained Ones.
  • 51. Deploying A Service Manually
      Resource allocation: machines must be chosen to host roles of the service (fault domains, update domains, resource utilization, hosting environment, etc.); procure additional hardware if necessary; IP addresses must be acquired.
      Provisioning: machines must be set up; virtual machines created; applications configured; DNS setup; load balancers must be programmed.
      Upgrades: locate appropriate machines; update the software/settings as necessary; only bring down a subset of the service at a time.
      Maintaining service health: software faults must be handled; hardware failures will occur; logging infrastructure is provided to diagnose issues.
      This is ongoing work… you're never done.
  • 52. Windows Azure Service Lifecycle. Goal is to automate the life cycle as much as possible.
      Coding & Modeling (Developer): new services and updates.
      Provisioning (Developer/Deployer): desired configuration.
      Deployment (Automated): mapping and deploying to actual hardware; network configuration.
      Maintain goal state (Automated): monitor; react to events.
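The upgrade rule from slide 51, "only bring down a subset of the service at a time", combined with the notion of update domains, can be sketched as a rolling upgrade. Everything in this sketch (the function name, the instance names, the `(name, update_domain)` pairs) is hypothetical and is not the Windows Azure API.

```python
from collections import defaultdict

def rolling_upgrade(instances, upgrade):
    """Upgrade one update domain at a time.

    `instances` is a list of (name, update_domain) pairs and `upgrade` is a
    callable applied to each instance name. Yields (domain, names) after each
    domain is done, so instances in other domains keep serving meanwhile.
    """
    by_domain = defaultdict(list)
    for name, domain in instances:
        by_domain[domain].append(name)
    for domain in sorted(by_domain):
        for name in by_domain[domain]:
            upgrade(name)
        yield domain, by_domain[domain]

log = []
service = [("web0", 0), ("web1", 1), ("web2", 0)]
steps = list(rolling_upgrade(service, log.append))
print(steps)  # prints "[(0, ['web0', 'web2']), (1, ['web1'])]"
```

A real deployer would also wait for each domain to report healthy before moving on; that check is omitted here.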
  • 53. ARCHITECTURE OF A CLOUD ENVIRONMENT. [Diagram labels: users and developers; INTERNET; Applications and Application Data; Platform (COMPUTE, STORAGE, FABRIC); Infrastructure.]
  • 54. Domain Specific Cloud Components for General Availability in the Research. Marco Parenzan: Tenure, Web Service Programming, Computer Engineering, University of Trieste; Researcher, Methodologies and Tools, MOSE Laboratory, University of Trieste. Maurizio Fermeglia: Full Professor, Chemical Engineering, MOSE Laboratory, University of Trieste.
  • 55. MOSE: Molecular Simulation Engineering.
      Vision: Multi-Scale Molecular Modeling will revolutionize the world of research and industrial production in the next years by strongly accelerating the development of new products.
      Mission: Material Sciences: thermophysical properties for materials, polymer technology and nanoscience/nanotechnology. Life Sciences: drug-receptor interactions, drug design, QSAR, drug delivery… Process simulation: process synthesis, design, modeling for chemical, biochemical, energy production. [Figure: process flowsheet.]
  • 56. Multiscale Molecular Modeling. [Figure: characteristic time (femtoseconds to years) plotted against characteristic length (1 Å to 1 m). From the smallest to the largest scale: Quantum Mechanics (electrons); Molecular Mechanics (atoms); Mesoscale modeling (segments); FEM; Process Simulation; Engineering design.]
  • 57. Message Passing Multiscale Molecular Modeling. [Figure: the same scales linked by message passing: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), FEM, Process Simulation, Engineering design.]
  • 58. Cloud-based Message Passing for Multiscale Molecular Modeling. [Figure: the same modeling scales (Quantum Mechanics, Molecular Mechanics, Mesoscale modeling, FEM, Process Simulation, Engineering design) exchanging messages through the cloud.]
  • 59. Abstract. This paper deals with the availability of cloud computing to computational research labs. We will focus on the concept of availability. This concept may have two different interpretations, namely: "available" as an accessible resource, always, from everywhere; and "available" as the ability to consume a service (as a client or as the publisher). Available / Accessible. This paper will focus on the second interpretation: a cloud service is "available" if it is easy for anyone in the academic community (and beyond) to consume the cloud. Indeed, the cloud allows sharing "knowledge" in the form of components or data to be "executed" in the cloud. The challenge here is to make it possible for researchers, not necessarily expert in programming and computer science, to make their knowledge available in the form of components and data tables. The solution we propose is based on Domain Specific Languages (DSLs), by which a researcher will express the components in her/his specific language, which will be user-friendly since it is directly related to the particular research field. In this framework, cloud components will be expressed in terms of a generic mathematical model rather than a software component. This vision is quite common in computing thanks to the availability of many tools that simplify the development of DSLs, such as dynamic languages like IronRuby or revolutionary data access with SQL Server Modeling. The objective of this work is to present a model of a general "Domain Specific Cloud Component" (DSCC) that can be expressed, published and consumed by the research community using tools that allow an easy and direct implementation of the mathematical algorithms developed by the scientists. The general concept will be applied to specific examples by developing frameworks customized to share a specific "DSCC". Examples will be taken from the area of multiscale molecular modeling for the design of nanostructured polymer systems (nanotechnology) and the estimation of the environmental impact of a production process (sustainability).
  • 60. MOSE in the cloud… …no. Why? Because it heavily depends on software (molecular simulation, process simulation) that is not on the Cloud. Can MOSE access the Cloud "alone"? No, at the moment. The actors: Chemists, Chemical Engineers, Materials Engineers, Biologists, Medical Doctors… They have just the "Computer Science" classes in the first two years of the Engineering curriculum (some C/C++, no VB(A) or .NET, some Matlab). But they need programs to solve their problems… …and sometimes they try to write them!
  • 61. Objectives of this Research. Move MOSE to the Cloud! We cannot wait for software companies. Computer engineers can "simplify" writing this code, but they need to speak with (non-computer) engineers about the details (Analysis, Specifications, "DOMAIN"). Why don't we enable (non-computer) scientists to write their own code? Simplify the (programming) tools used to consume the Cloud. Allow DOMAIN engineers to participate actively in building the platform.
  • 62. Simplification development path. We are "still @ C++" (some apps need C++ plug-in/custom code). We have already stepped into the CLR world: for example, our development in CAPE-OPEN (http://co-lan.org). The next steps are: dynamic languages such as Python or Ruby; the DSL world for data (custom data texts); again, another example in CAPE-OPEN. Path: Native (C/C++), VM (C#/Java), Dynamic (Python/Ruby), DSL (internal, external).
  • 63. Traditional App Architecture: n-Tier apps. [Diagram layers: Presentation; Business Logic (BLL); Data Access (DAL); Infrastructure.]
  • 64. Onion Architecture, Domain Driven Design. [Diagram labels: User Interface; Application Services; Domain Services; Domain Model; Database; Services; File system; etc.]
  • 67. Domain Driven Design: Objects ---> Vocabulary; Grammar ---> Language; DSL
  • 71. VISUAL UNIT PROGRAMMING WITH CAPEOPENSTUDIO.NET. Marco Parenzan, Maurizio Fermeglia, MOSE Lab, University of Trieste
  • 72. CAPE-OPEN Toolkit Wizard
  • 73. Exploring the model
  • 74. Language: our tool. The programming language is our tool ("our toolbox"), but the (mechanical) metaphor stops here: a wrench is immutable; our languages are not. We are no longer in the 20th century, when VB6, C, C++, Pascal, Perl, Java and Javascript stayed stable for a long time… We are living in a particularly fertile moment…
  • 75. A Language Renaissance. Several programming paradigms: imperative, object-oriented, functional, declarative. Visual Studio 2010 ships with a new official language (F#). Typing: static (compile time) or dynamic (runtime).
  • 76. Why should we care then? More languages, more options. The DLR gives apps instant scripting abilities. C# has moved in that direction too: LINQ; lambda expressions; parallel extensions (C# 4.0); "dynamic" (C# 4.0) and "var" keywords. C# evolution: C# 1.0 Managed Code; C# 2.0 Generics; C# 3.0 Language Integrated Query; C# 4.0 Dynamic Programming.
  • 77. Using internal DSL (aka Fluent Interface)
  • 79. Internal DSL. Good: easy to do... no parsers, etc.; full IDE support. Bad: limited by host language.
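An internal DSL (fluent interface) is ordinary host-language code arranged so that calls chain into something that reads like a domain sentence. The sketch below is a minimal Python illustration; the `Stream` class, its methods and the stream name are invented for this example and do not come from the slides or from CAPE-OPEN.

```python
class Stream:
    """Hypothetical process-stream description built with chained calls."""

    def __init__(self, name):
        self.name = name
        self.components = {}
        self.temperature_k = None

    def with_component(self, species, mole_fraction):
        self.components[species] = mole_fraction
        return self  # returning self is what makes chaining possible

    def at_temperature(self, kelvin):
        self.temperature_k = kelvin
        return self

feed = (Stream("S01")
        .with_component("water", 0.7)
        .with_component("ethanol", 0.3)
        .at_temperature(298.15))
print(feed.components, feed.temperature_k)
```

No parser is needed and the editor can autocomplete every method, which is the "good" side listed on the slide. The syntax, however, can never stray from valid Python, which is the "limited by host language" side.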
  • 80. Using external DSL
  • 82. External DSL. Good: unlimited expressiveness; you choose the execution environment. Bad: requires "more" work; no IDE support.
  • 83. Don Box
  • 84. Don Box Career. DevelopMentor years ('90s): worldwide COM expert (Essential COM). Millennium work ('90s through the 21st century): SOAP specifications; XML musings (Essential XML): XML Schema, XML Infoset. .NET expert (Essential .NET). TechEd 2001, Barcelona: musings while having a bath in a tub on stage. Microsoft years (since 2002): Indigo architect (Windows Communication Foundation, 2nd-generation Web Services, .NET 3.0); Oslo (now SQL Server Modeling).
  • 85. The "M" language. Each DSL pairs a domain model with a domain grammar: DSL: Point.m (domain model) and GPSLanguage.mg (domain grammar); DSLX: DomainX.m and DomainX.mg; DSLY: DomainY.m and DomainY.mg.
      MSchema, domain-specific data models: type Point { X : Integer where X < 100; Y : Integer?; }
      MGrammar, domain-specific grammars: language GPSLanguage { syntax Main = h:Integer ("," v:Integer)? => Point { X { h }, Y { v }}; }
      MGraph, abstract data model: Point { X { 100 }, Y { 200 } }
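The three pieces on slide 85 (a schema with a constraint, a grammar that maps text to a structure, and the resulting data) can be mimicked in plain Python to show what an external DSL toolchain does. This is only an analogy written for illustration: it does not use or reproduce the "M" tools. The regular expression, the `parse_point` function and the decision to enforce the constraint at parse time are all assumptions of this sketch.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Point:
    x: int
    y: Optional[int] = None  # mirrors "Y : Integer?" (optional field)

    def __post_init__(self):
        if not self.x < 100:  # mirrors "X : Integer where X < 100"
            raise ValueError("X must be less than 100")

# Mirrors: syntax Main = h:Integer ("," v:Integer)?
_MAIN = re.compile(r"\s*(\d+)\s*(?:,\s*(\d+)\s*)?")

def parse_point(text):
    """Turn DSL text such as "42, 200" into a Point (the projection step)."""
    match = _MAIN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid point: {text!r}")
    h, v = match.groups()
    return Point(int(h), int(v) if v is not None else None)

print(parse_point("42, 200"))  # prints "Point(x=42, y=200)"
print(parse_point("7"))        # prints "Point(x=7, y=None)"
```

The hand-written regular expression stands in for the generated parser. With a grammar tool the rule is declared once and the parser is produced automatically.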
  • 86. Da M al Repository Da “Oslo” a “SQL Server Modeling” ModelA.m M.exe MX.exe Domain Model Domain Model ModelB.m Compiler Loader ModelC.m ModelABC.mx SQL M M Framework Framework Server MOSE – University of Trieste 30 April, 2010 - slide 87
  • 87. IntelliPad Chiamato inizialmente EMACS.NET Editor testuale, non ha funzioni visuali Buffer interni interagiscono con runtime  Parsing in tempo reale  Generazione risultati in finestre side-by-side MOSE – University of Trieste 30 April, 2010 - slide 88
  • 88. MGrammar in Intellipad Input Grammar Output Text MGraph Transfor m Errors MOSE – University of Trieste 30 April, 2010 - slide 89
  • 89. Douglas Purdy
  • 90. ANTLR Overview. ANother Tool for Language Recognition, written by Terence Parr in Java. Easier to use than most/all similar tools. Supported by ANTLRWorks, a graphical grammar editor and debugger written by Jean Bovet using Swing. Used to implement "real" programming languages and domain-specific languages (DSLs). http://www.antlr.org: download ANTLR and ANTLRWorks here (both are free and open source); docs, articles, wiki, mailing list, examples. Callouts: Ter: "I'm a professor at the University of San Francisco." Jean: "I worked with Ter as a masters student there."
  • 91. ANTLR Overview ... Uses EBNF (Extended Backus-Naur Form) grammars: can directly express optional and repeated elements (BNF grammars require more verbose syntax to express these); supports subrules (parenthesized groups of elements). Supports many target languages for generated code: Java, Ruby, Python, Objective-C, C, C++ and C#. Provides infinite lookahead: most parser generators don't; used to choose between rule alternatives. Plug-ins available for IDEA and Eclipse.
  • 92. ANTLR Overview ... Supports LL(*). LL(k) parsers are top-down parsers that parse from Left to right, construct a Leftmost derivation of the input, and look ahead k tokens. LR(k) parsers are bottom-up parsers that parse from Left to right, construct a Rightmost derivation of the input, and look ahead k tokens. LL parsers can't handle left-recursive rules; most people find LL grammars easier to understand than LR. Supports predicates: aid in resolving ambiguities (non-syntactic rules). Side note: Wikipedia has good descriptions of LL and LR.
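The point about left recursion and EBNF repetition can be illustrated with a tiny hand-written top-down parser. This is a sketch in plain Python, not ANTLR-generated code. A left-recursive rule such as `expr : expr '+' INT` would make a top-down parser call itself forever without consuming input, so the rule is rewritten with an EBNF repetition, `expr : INT (('+' | '-') INT)*`, and the parser picks its path with one token of lookahead. The function names `tokenize` and `parse_expr` are hypothetical.

```python
import re


def tokenize(text):
    """Split the input into integer and operator tokens.

    Simplification: characters that are neither digits nor '+'/'-' are ignored.
    """
    return re.findall(r"\d+|[+\-]", text)


def parse_expr(tokens):
    """Parse and evaluate: expr : INT (('+' | '-') INT)*"""
    pos = 0

    def next_int():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected an integer at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    result = next_int()
    # One token of lookahead decides whether the repetition continues.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        rhs = next_int()
        result = result + rhs if op == "+" else result - rhs
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

The loop replaces the left-recursive call while keeping left associativity, so `parse_expr(tokenize("10 - 2 - 3"))` evaluates as `(10 - 2) - 3`.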
  • 93. ANTLRWorks. A graphical grammar editor and debugger. Features: highlights grammar syntax errors; checks for grammar errors beyond the syntax variety, such as conflicting rule alternatives; displays a syntax diagram for the selected rule; the debugger can step through the creation of parse trees and ASTs.
  • 94. ANTLRWorks ... [Figures: parser rule syntax diagram; lexer rule syntax diagram]
  • 95. ANTLRWorks ... [Figure: grammar check result]
  • 96. ActiProSoftware. Free add-ons are included that integrate domain-specific language (DSL) parsers created using Microsoft Oslo's MGrammar and ANTLR with SyntaxEditor.
  • 97. The peculiarities of a Dynamic Language. Simple and terse: a typical choice of those who designed these languages; lightweight syntax; many features that static languages leave to APIs are implemented in the language itself. Interpreted: a direct consequence of simplicity, avoiding the "complication" of a compilation process. Implicitly typed: the type is associated with values, not with variables; type errors cannot be checked and reported until the code runs.
  • 98. What is Python? A general-purpose programming language. Developed by Guido van Rossum (first released in 1991). A dynamic language often used as a scripting language. Supports several programming paradigms: object-oriented, imperative, functional. It was created with these goals in mind: code readability; minimalist syntax; an extensive set of libraries; duck typing.
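Three of the properties listed for dynamic languages (types attached to values rather than variables, type errors surfacing only at run time, and duck typing) can be seen in a few lines of Python. This is a minimal sketch. The class and function names (`Duck`, `Robot`, `greet`, `type_error_at_runtime`) are made up for illustration.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def greet(thing):
    # Duck typing: any object with a speak() method is accepted,
    # regardless of its class or any declared interface.
    return thing.speak()


# The type belongs to the value, not to the variable:
value = 42
kind_before = type(value).__name__   # the name of the int type
value = "forty-two"
kind_after = type(value).__name__    # the same variable now holds a str


def type_error_at_runtime():
    # Nothing rejects this expression ahead of time;
    # the error is only raised when the line actually executes.
    try:
        return 1 + "1"
    except TypeError as exc:
        return type(exc).__name__
```

Calling `greet(Duck())` and `greet(Robot())` both succeed even though the two classes share no base class.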
  • 99. What is IronPython? An implementation of the Python language on the .NET platform; IronPython is written entirely in C#. Created by Jim Hugunin, who also developed Jython (Python on the JVM). He wanted to write a paper titled "Why .NET is a Terrible Platform for Dynamic Languages": "It was a little less than a year ago that I first started investigating the Common Language Runtime (CLR). My plan was to do a little work and then write a short pithy article called, 'Why .NET is a terrible platform for dynamic languages'" (http://www.ironpython.com/old.html). In September 2004 he started working at Microsoft: "My plans changed when I found the CLR to be an excellent target for the highly dynamic Python language. Since then I've spent much of my spare time working on the development of IronPython" (http://www.ironpython.com/old.html). See also: http://www.python.org/community/pycon/dc2004/papers/9/ and http://conferences.oreillynet.com/presentations/os2004/hugunin_jim_up.ppt
  • 100. Armando Fox (U.C. Berkeley). He talked about using Python in cloud environments. He speaks of PLLs (Production Level Languages)... vs. BLL
  • 101. Dynamic Languages on .NET. Languages: IronPython, IronRuby, C#, VB.NET, others… Dynamic Language Runtime: Expression Trees, Dynamic Dispatch, Call Site Caching. Binders: Object binder, JavaScript binder, Python binder, Ruby binder, COM binder.
  • 102. Storage in Windows Azure. GOAL: SCALABLE, DURABLE STORAGE. Blobs: large, unstructured data (audio, video, etc). Tables: simply structured data, accessed using ADO.NET Data Services. Queues: serially accessed messages or requests, allowing web-roles and worker-roles to interact. Windows Azure storage is an application managed by the Fabric Controller. Windows Azure applications can use native storage or SQL Azure. Application state is kept in storage services, so worker roles can replicate as needed.
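The interaction pattern described here (a web role hands work to a worker role through a queue, and all state lives in storage) can be sketched without any Azure SDK. The following Python uses an in-process `queue.Queue` and a plain dictionary purely as stand-ins for an Azure Queue and Table storage. The function names `web_role_submit` and `worker_role_step` are hypothetical and nothing here is Azure API.

```python
import queue

work_queue = queue.Queue()   # stand-in for an Azure Queue (assumption)
table = {}                   # stand-in for Table storage (assumption)


def web_role_submit(job_id, numbers):
    """Front-end side: persist the job state, then enqueue a message."""
    table[job_id] = {"status": "queued", "input": numbers}
    work_queue.put(job_id)


def worker_role_step():
    """Back-end side: take one message, process it, store the result.

    Returns False when there is no pending work.
    """
    try:
        job_id = work_queue.get_nowait()
    except queue.Empty:
        return False
    job = table[job_id]
    job["result"] = sum(job["input"])   # placeholder for the real computation
    job["status"] = "done"
    return True
```

Because the worker keeps no state of its own between calls, several workers could drain the same queue, which is the property the slide points to when it says worker roles can replicate as needed.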
  • 103. Simplification steps. 1. Write apps running on the cloud: 1.1 Windows Azure; 1.2 (ASP.NET MVC2) Web Role for the front-end; 1.3 Worker Role for background processing; 1.4 Table, Blob and Queue for "unstructured", but easy, storage. 2. Use Dynamic Languages to do the processing: 2.1 simplified deployment; 2.2 simplified "code" model; 2.3 simplified type management (dynamic typing, no variable declaration); 2.4 now fully integrated in .NET with DLR, IronPython and IronRuby. 3. Input and output as structured text: 3.1 "M" (in "Oslo", now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more "readable" than XML; 3.2 this gives structure and metadata to the Azure Storage data (as requested by Ed Lazowska in his wonderful keynote yesterday).
  • 104. Domain Specific Cloud Components for General Availability in the Research Demo
  • 105. Was the matrix too simple? This is a two-dimensional matrix of three-dimensional vectors. Size of the cube: 100 nanometers.
  • 106. The results. 1. Write apps running on the cloud: 1.1 Windows Azure; 1.2 (ASP.NET MVC2) Web Role for the front-end; 1.3 Web Role for background processing. 2. Use Dynamic Languages to do the processing: 2.1 simplified deployment; 2.2 simplified "code" model; 2.3 simplified type management (dynamic typing, no variable declaration); 2.4 now fully integrated in .NET with DLR, IronPython and IronRuby. 3. Input and output as structured text: 3.1 Oslo (now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more "readable" than XML; 3.2 structured text as data sources.
  • 107. Conclusions. Why does MOSE need the cloud? To build a platform to orchestrate the message passing in the Multiscale Molecular Modeling activity; to empower our research team with a flexible scientific platform that drives efficiency, collaboration and innovation. In the demo we have seen: the "creation" and the execution (invocation) of a single step of the process; the input and the output are the "messages" that move through the scales. The code: definition of a library for a generic cloud component; usage of Dynamic Languages (IronPython): a new opportunity in .NET development, more productive (PLLs, as Armando Fox said yesterday), simpler for non-programmers; application of DSLs (Oslo) for the definition of simple input/output messages: more familiar to scientists, simpler to implement than a graphical UI, and it gives metadata/schema to flat files (as requested by Ed Lazowska in his wonderful keynote yesterday). What's next?
  • 108. What is next? Continue with the project. The definition of a process (an orchestration): did you see Paul Watson's session yesterday ("Cloud Computing for Chemical Property Prediction")? The users in the process: collaboration in the process; again, as Paul said, we agree on a structure like a "social science community", a Web 2.0 application; security, confidentiality. Verticalization on the domain: remove all the nitty-gritty details that degrade the experience; define custom component languages.
  • 109. Simplification steps. 1. Write apps running on the cloud: 1.1 Windows Azure; 1.2 (ASP.NET MVC2) Web Role for the front-end; 1.3 Worker Role for background processing; 1.4 Table, Blob and Queue for "unstructured", but easy, storage. 2. Use Dynamic Languages to do the processing: 2.1 simplified deployment; 2.2 simplified "code" model; 2.3 simplified type management (dynamic typing, no variable declaration); 2.4 now fully integrated in .NET with DLR, IronPython and IronRuby. 3. Input and output as structured text: 3.1 "M" (in "Oslo", now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more "readable" than XML; 3.2 this gives structure and metadata to the Azure Storage data. 4. Write DSL.
  • 110. Writing a Custom DSL. (Supposed) needs of the "non-programmer". Libraries: integrated functionalities; no "include". Data access as libraries: Connect, Command, Execute; LINQ; define datasource (metadata), no SQL schema. All-in-one: one component, one "file" (as much as possible); simplifying deployment. Needs of the programmer: not so (much) imperative, not so (much) functional, not so (much) object oriented; state is not so bad; lambdas are cool (no functions, all lambdas). Escape to power (if the DSL is "poor"): backend of a full language, totally integrated; DLR, (Iron)Python, (Iron)Ruby, (Iron)JS (Javascript) and so on. Sample component in the DSL:
      cloud component
      # naming part (entry point)
      Name = "test 0004"
      # declarative part
      # sections like cobol
      input
        i(label = "Input Vector")
      data
        # static declaration
        m(name = "matrix01" # this is the "query"
          label = "multiplication matrix")
      output
        o(label = "Output Vector")
      # coding part
      # dynamic like python (and vb)
      # verbose like visual basic
      code "this is the main"
        # alternative syntax of query from storage
        # calculated
        m = lookup in Matrici for NomeMatrice
        ### multiline comment ###
        assign 0 to r
        while r is less then m.rows do
          assign 0 to c
          assign 0nm to a
          while c is less then m.cols do
            #a = a + m(r,c) * i(c)
            increment a by m(r,c) * i(c)
            # python has no matrix, but jagged arrays
            increment c by 1
          end do
          assign a to o
          increment r by 1
        end do
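For comparison, the coding part of the DSL sample on this slide reads as a matrix-by-vector multiplication. A hedged Python rendering might look like the following. It assumes that `assign a to o` appends one element to the output vector, it drops the `nm` unit suffix of the `0nm` literal, and it replaces the storage lookup with a plain dictionary. `MATRICI`, `lookup` and `multiply` are hypothetical names, and the matrix values are invented for illustration.

```python
# Stand-in for the storage that "lookup in Matrici for NomeMatrice" queries.
# The contents are invented for illustration.
MATRICI = {"matrix01": [[1, 2], [3, 4]]}


def lookup(name):
    """Fetch a matrix by name; Python has no matrix type, so it is a jagged list."""
    return MATRICI[name]


def multiply(m, i):
    """Mirror the DSL loops: o[r] = sum over c of m[r][c] * i[c]."""
    o = []
    r = 0
    while r < len(m):               # while r is less then m.rows do
        c = 0                       # assign 0 to c
        a = 0                       # assign 0nm to a (unit dropped here)
        while c < len(m[r]):        # while c is less then m.cols do
            a += m[r][c] * i[c]     # increment a by m(r,c) * i(c)
            c += 1                  # increment c by 1
        o.append(a)                 # assign a to o (assumed to append)
        r += 1                      # increment r by 1
    return o
```

The explicit `while` loops are kept on purpose so each Python line maps to one DSL statement; idiomatic Python would use a list comprehension instead.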
  • 111. Home Automation
  • 112. Home Automation
  • 113. Cloud Futures 2010: the sessions
  • 114. The sessions. Experiences with private clouds; ideas about public/hybrid clouds. "Systems-oriented" experiences; fewer... "programming-oriented" experiences.
  • 115. Cloud Computing for Chemical Property Prediction. Paul Watson, Newcastle University
  • 116. Bertrand Meyer, ETH Zurich
  • 117. Cloud Futures 2010: the Italians' sessions
  • 118. Danilo Montesi. Danilo Montesi (UniBO) presented the "Connected City Campus" project, which connects different facilities (from the hospital to the university to the library), easing communication and services for citizens and exploiting the existing wireless network. As I said, many of the questions at the end were about Italian laws on privacy and data retention.
  • 119. Fabio Panzieri. Fabio Panzieri (pictured with Judith Bishop, Director of External Relations at Microsoft Research), also from UniBO, presented "QoS-aware Clouds", which proposes building a middleware inside the cloud platform to assign "slices of cloud" dynamically, according to the service level purchased by users.
  • 120. Domenico Talia. Domenico Talia of the University of Calabria is probably the one among us most involved in the cloud topic: he already covers it in his teaching and in national and European projects, and he teaches Grid Computing. His presentation drew many questions because it presented a real problem already largely solved with open-source cloud solutions, namely by defining a framework that lets developers create processes by composing services available on a cloud or across clouds.
  • 121. Antonio Cisternino. The presentation by Antonio Cisternino of the University of Pisa (Computer Science), a veteran of these events who has been part of the Redmond group for over 10 years, described a dynamic control system for virtual machines, with clients already available for mobile devices in HTML5 as well, for cloud service farms that must guarantee availability and energy savings.
  • 122. Marco Parenzan. Marco Parenzan, a researcher at the University of Trieste and the youngest of our delegation, presented a very interesting project to make the cloud usable by researchers who are NOT computer science experts (in this specific case, chemists) through a language (DSL) that lets them express and define, in a simple way, the processing requests and the data they operate on.
  • 123. Thank you! Q&A. blog: http://blog.codeisvalue.com/ email: marco.parenzan@libero.it web: http://www.codeisvalue.com/ skype: marco.parenzan messenger: marco.parenzan@live.it slides: http://www.slideshare.com/marco.parenzan twitter: marco_parenzan