Cloud and Big Data Come Together in the Ocean Observatories Initiative to Give Scientists Real-Time Access to Environmental Measurements
Transcript of a BriefingsDirect podcast on how cloud and big data come together to offer researchers a treasure trove of new real-time information.

Listen to the podcast. Find it on iTunes/iPod. Sponsor: VMware

Dana Gardner: Hi. This is Dana Gardner, Principal Analyst at Interarbor Solutions, and you're listening to BriefingsDirect. Today, we present a sponsored podcast discussion on a fascinating global ocean studies initiative that defines some of the superlatives around big data, cloud, and middleware integration capabilities.

We'll be exploring the Ocean Observatories Initiative (OOI) and its accompanying Cyberinfrastructure Program. This undertaking aims at providing an unprecedented ability to study the Earth's oceans and climate impact using myriad distributed centers and oceans' worth of data.

The scale and impact of the science's importance is closely followed by the magnitude of the computer science needed to make that data accessible and actionable by scientists. In a sense, the OOI and its infrastructure program are constructing a big-data-scale, programmable, and integratable cloud fabric. [Disclosure: VMware is a sponsor of BriefingsDirect podcasts.]

We've gathered three leaders to explain the OOI and how the Cyberinfrastructure Program may not only solve this set of data and compute problems, but perhaps establish how future massive data and analysis problems are solved.

Here to share their story on OOI are our guests. Please join me in welcoming Matthew Arrott, Project Manager at the OOI Cyberinfrastructure. Matthew's career spans more than 20 years in design leadership and engineering management for software and network systems. He's held leadership positions at Currenex, DreamWorks SKG, Autodesk, and the National Center for Supercomputing Applications. His most recent work has been with the University of California as e-Science Program Manager, while focusing on delivering the OOI Cyberinfrastructure capabilities.

Also joining us is Michael Meisinger. He is the Managing Systems Architect for the Ocean Observatories Initiative Cyberinfrastructure. Since 2007, Michael has been employed by the University of California, San Diego. He leads a team of systems architects on the OOI Project.
Prior to UC San Diego, Michael was a lead developer in an Internet startup, developing a platform for automated customer interactions and data analysis.

Michael holds a master's degree in computer science from the Technical University of Munich and will soon complete a PhD in formal services-oriented computing and distributed systems architecture.

Lastly, we're joined by Alexis Richardson, Senior Director for the VMware Cloud Application Platform. He is a serial entrepreneur and a technologist. Previously, he was a founder of RabbitMQ and the CEO of Rabbit Technologies Limited, which was acquired by VMware in April of 2010.

Alexis plays a leading role in both the cloud and messaging communities, in addition to working with AMQP. He is a co-founder of the CloudCamp conferences and a co-chair of the Open Cloud Computing Interface at the Open Grid Forum.

Welcome to you all.

Michael Meisinger, let me start with you. Could you sum up the OOI for our audience? Let us know a little bit about how it came about.

Ocean Observatories Initiative

Michael Meisinger: Thanks, Dana. The Ocean Observatories Initiative is a large project. It's a US National Science Foundation project that is intended to build a platform for ocean sciences end users and communities interested in this form of data, with an operational life span of 30 years. It comprises a construction period of five years and will integrate a large number of resources and assets. These range from typical oceanographic assets, like instruments that are mounted on buoys deployed in the ocean, to networking infrastructure on the cyberinfrastructure side. It also includes a large number of sophisticated software systems.

I'm the managing architect for the cyberinfrastructure, so I'm primarily concerned with the interfaces to the oceanographic infrastructure, including data interfaces and networking interfaces, and then, primarily, the design of the system -- the network hardware and software system -- that comprises the cyberinfrastructure.

As I said, OOI's goals include serving the science and education communities with their needs for receiving, analyzing, and manipulating ocean sciences and environmental data. This will have a large impact on the science community and the overall public as a whole, because ocean sciences data is very important in understanding the changes and processes of the earth, the environment, and the climate as a whole.
Ocean sciences, as a discipline, hasn't yet received as much infrastructure and central attention as other communities, so the OOI initiative is very important in bringing this to the community. It has a large volume: an almost $400 million construction budget and an annual operations budget of $70 million for a planned lifetime of 25-30 years.

Gardner: Matthew Arrott, what is the big hurdle here in terms of a compute issue that you've faced? Obviously, it's a tremendously important project with a tremendous amount of data, but from a purely compute requirement perspective, what makes this so challenging?

Matthew Arrott: It has a number of key aspects that we had to address. It's best to start at the top of the functional requirements, which is to provide interactive mission planning and control of the overall instrumentation on the 65 independent platforms that are deployed throughout the ocean. The issue there is how to provide a standard command-and-control infrastructure over a core set of 800 instruments, about 50 different classes of instrumentation, as well as be able to deploy, over the 30-year lifecycle, new instrumentation brought to us by different scientific communities for experimentation.

The next is that the mission planning and control is meant to be interactive and to respond to emergent changes. So we needed an event-response infrastructure that allowed us to operate on scales from microseconds to hours in being able to detect and respond to the changes. We needed an ability to move computing throughout the network to deal with the different latency requirements that were needed for the event-response analysis.

Finally, we have computational nodes all the way down in the ocean, as well as on the shore stations, that are accepting or acquiring the data coming off the network. And we're distributing that data in real time to anyone who wants to listen to the signals to develop their own sense-and-response mechanisms, whether they're in the cloud, in their local institutions, or on their laptops.

Domain of control

The fundamental challenge was the ability to create a domain of control over instrumentation that is deployed by operators, and for processing and data distribution to be agile in its deployment anywhere in the global network.

Gardner: Alexis Richardson, it sounds like a very interesting problem to solve. Why is this a good time to try to solve it? Of course, big data, cloud, and doing tremendous amounts of services orientation across middleware and a variety of different formats and transports are all very prominent in the enterprise now. Given that, what makes this such an interesting pursuit for you, thinking about this from a software distribution and data distribution perspective?
Alexis Richardson: It really comes down to the scale of the system and the ability of technologies to meet the scale need today. If we had been talking about this 12 years ago, in the year 2000, we would have been talking about companies like Google and Yahoo, which we would not have considered to be of moderate scale. Since then, many companies have appeared -- for example, Facebook, which has many hundreds of millions of users connecting throughout the world, sharing vast amounts of data all the time.

It's that scale that's changed the architecture and deployment patterns that people have been using for these applications. In addition to that, many of these companies have brought out essentially a platform capability, whereby others, such as Zynga in the case of Facebook, can create applications that run inside these networks -- social networks in the case of Facebook.

We can see that the OOI project is essentially bringing the science needs to collaborate between vast numbers of sensors and signals and a comparatively smaller number of scientists, research institutions, and scientific applications to do analytics, in a similar way to how Facebook combines what people say, what pictures they post, and what music they listen to with everybody's friends, and then allows an application to be attached to that.

So it's a huge technology challenge that would have been simply infeasible 12 years ago in the year 2000, when we thought things were big, but they were not. Now, when we talk about big data being masses of terabytes and petabytes that need to be analyzed all the time, we're starting to glimpse what's possible with the technology that's been created in the last 10 years.

Arrott: I'd like to go one step further than that. The challenge goes beyond just the big data challenge. It also now introduces, as Alexis talked about, the human putting in what they say and their pictures. It introduces the concept of the instrument as an equal partner with the human in the participation in the network.

So you now have to think about what it means to have a device that's acting like a human in the network, and the notion that the instrument is, in fact, owned by someone and must be governed by someone, which is not the case with the human, because humans govern themselves. So it represents the notion of an autonomous agent in the network, as well as that agent having a notion of control that has to stay on the network.

Gardner: Thank you, Matthew. I'd like to try to explain for our audience a bit more about what is going on here. We understand that we've got a tremendous diversity of sensors gathering, in real time, a tremendous scale of data. But we're also talking about automating the gathering and distribution of that data to a variety of applications.
Numerical framework

We're talking about having applications within this fabric, so that the output is not necessarily data, but a computational numerical framework that's distributed. So there's computation being done at the data level, and then it has to be regulated: certain data goes to certain people for certain reasons under certain circumstances.

So there's a lot of data, a lot of logic, and a lot of scale. Can one of you help step me through a little bit more of the architecture of what's being conducted here, so that we can then move into how it's being done?

Meisinger: The challenge, as you mentioned, is very heterogeneous. We deal with various classes of sensors, classes of data, classes of users, or even communities of users, and with classes of technological problems and solution spaces.

So the architecture is based on a tiered, or layered, model, with the most invariant things at the bottom -- things that shouldn't change over the lifetime of 30 years and that deserve the highest level of attention.

Then, we go into a more specialized layered architecture, where we try to find optimal solutions using today's technologies for high-speed messaging, big data, and so on. Then, we go into specialized solutions for specific groups of users and specific sensors that are there as last-mile technologies to integrate them into the system.

So you basically see an onion-layer model of the architecture, with the externalizations on the outside. Then, as you go toward the core, you approach the invariants of the system. What are the invariants? We recognized that a system of this scale and this heterogeneity cannot be reinvented every five years as part of typical maintenance. A strongly scalable and extensible system is distributed in its nature, and as part of the distribution, the most invariant parts are the protocols and the interactions between the distributed entities in the system.

We found that it's essential to define a common language, a common format, for the various applications and participants of the network -- sensors and sensor agents, but also higher-level software services -- to communicate in a common format.

This architecture is based on defining a common interaction format. It's based on defining a common data format. You mentioned the complex numerical model. A lot of things in this architecture are defined so that you have an easier model of reaching many heterogeneous communities: ingesting specific solutions into the system, representing them consistently, and then presenting them again in the specific format for the audience.

Our architecture is strongly communication-oriented, service-oriented, message-oriented, and federated.
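To make that common-format, message-oriented idea concrete, here is a minimal sketch, in Python with the pika client for RabbitMQ, of what a sensor agent publishing an observation over AMQP might look like. The exchange name, routing-key scheme, and message fields are illustrative assumptions, not the OOI's actual wire format.

```python
# A minimal sketch of a sensor agent publishing observations over AMQP.
# Assumes a local RabbitMQ broker; the exchange name 'ooi.data' and the
# envelope fields below are hypothetical, not the OOI wire format.
import json
import time

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A topic exchange lets consumers subscribe by pattern, e.g. 'instrument.ctd.*'.
channel.exchange_declare(exchange="ooi.data", exchange_type="topic", durable=True)

# A common message format: every participant agrees on this envelope.
observation = {
    "resource_id": "ctd.platform-42",  # hypothetical instrument identifier
    "timestamp": time.time(),
    "variables": {"temperature_c": 11.3, "salinity_psu": 33.9},
}

channel.basic_publish(
    exchange="ooi.data",
    routing_key="instrument.ctd.platform-42",
    body=json.dumps(observation),
    # delivery_mode=2 marks the message persistent, matching the storage
    # and replay requirement discussed later in this conversation.
    properties=pika.BasicProperties(delivery_mode=2, content_type="application/json"),
)
connection.close()
```

Any consumer on the network can then bind its own queue to the same exchange with a pattern such as instrument.ctd.* and receive these messages in real time, without the publisher knowing who is listening.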
As Matthew mentioned, it's an important means to have the individual resources, the agents, provide their own policies, rather than having a central bottleneck in the system, or a central governing entity in the system that defines policies.

Strongly federated

So it's a strongly federated system. It's a system that's strongly technology-independent. The communication protocols can be implemented by various technologies, and we're choosing a couple of programming languages and technologies for our initial reference implementation, but it's strongly extensible for future communities to use.

Gardner: One of the aspects of this that was particularly interesting to me is that this is very much a two-way street. The scientists who are gathering their analysis can very rapidly go back to these sensors, go back to this compute fabric, this fusion of data, and ask it to do other things in real time, or bring in data from outside sources to compare and contrast, to find the commonalities and to find what it is they're looking for in terms of trends.

Could one of you help me understand why this is a two-way street and how that's possible, given the scale and complexity?

Arrott: The way to think about it, first and foremost, is to think of it as four core layers. There is the underlying network resource management layer. We talk about agents; they supply that capability to any process in the system, and we create devices that process.

The next layer up is the data layer, and the data layer consists of two core parts. One is the distribution system that allows for data to be moved in real time from the source to the interested parties. It's fundamentally a publish-subscribe (pub-sub) model. We're currently using point-to-point as well as topic-based subscriptions, but we're quickly moving toward content-based routing, which is based more on the selector that is provided by the consumer to direct traffic toward them.

The other part of the data layer is the traditional harvesting or retrieval of data from historical repositories.

The next layer up is the analytic layer. It looks a lot like the device layer, but it's focused on the management of processes that are using the big data and responding to the new arrival of data in the network, or to changes in data in the network. Finally, there is the fourth layer, the mission planning and control layer, which we'll talk about later.

Gardner: I'd like to go to Alexis Richardson. When you saw the problem that needed to be solved here, you had a lot of experience with the Advanced Message Queuing Protocol (AMQP), which I'd like you to explain to us, and you also understand the requirements of a messaging system that can accomplish what Matthew just described.
So tell me about AMQP, and why this problem seems to be the right fit for that particular technology, RabbitMQ, and a messaging infrastructure in general.

Richardson: What Matthew and Michael have described can be broken down into three fundamental pieces of technology.

Lot of chatter

Number one, you've got a lot of chatter coming from these devices -- machines, people, and other kinds of processes -- and that needs to get to the right place. It's being chattered or twittered away, possibly at high rates and high frequencies, and it needs to get to just the set of receivers following that stream, very similar to how we understand distribution to our computers. So you need what's called pub-sub, which is a fundamental technology.

In addition, that data needs to be stored somewhere. People need to go back and audit it, to pull it out of the archive and replay it, or view it again. So you need some form of storage and reliability built into your messaging network.

Finally, you need the ability to attach applications that will be written by autonomous groups, scientists, and other people who don't necessarily talk to one another, may choose different programming languages, and may be deploying their applications, as Matthew said, on their own servers, on multiple different clouds, through what you would like to be a common platform. So you need this to be done in a standard way.

AMQP is unique in bringing together pub-sub with reliable messaging and with standards, so that this can happen. That is precisely why AMQP is important. It's like HTTP and email's SMTP, but it's aimed at messaging -- publish-subscribe and reliable message delivery in a standard way. And RabbitMQ is one of the first implementations, and that's how we ended up working with the OOI team, because RabbitMQ provides these and does it well.

Gardner: Now, we've talked a lot about the computer science and some of the thorny issues that have been created as a result of this project going forward, but I'd also like to go back to the project itself and give our listeners a sense of what this can accomplish. I've heard it described as the Hubble Telescope of the oceans.

Let's go back to the oceanography and the climate science. What can we accomplish with this, when this data is delivered in the fashion we've been discussing, where the programmability is there, where certain scientists can interact with these sensors and data, ask it to do things, and then get that information back in a format that's not raw, but is in fact actionable intelligence?

Matthew, what could possibly happen in terms of the change in our understanding of the oceans from this type of undertaking?

Arrott: The way to think about this is not so much from the fact that we know exactly what will happen. It's the notion that we're providing capabilities that do not currently exist for
oceanographers. It can be summed up as continual presence in the oceans, at multiple scales, through multiple perspectives -- also known as the different classes of instrumentation that observe the ocean. Another class of instrumentation is deployed specifically for refocusing. The scope of the OOI is such that it is considered to be observing the ocean at multiple scales -- coastal, regional, and global. It is an expandable model, such that other observatories, as well as additions to the OOI network, can be considered and deployed in subsequent years.

This allows us now, as Alexis talked about, to attach many different classes of applications to the network. One of the largest classes of applications that we'll attach to the network is modeling, in particular nowcast and forecast modeling.

Happening at scale

Those models make observations about the ocean now and about what the ocean will be, and the ability to ground-truth those models going forward, based on data arriving at the same time as the forecasts, allows a broad range of modeling that has been done for a fair amount of time to now happen at scale.

Once you have that ability to actually model the oceans and predict where they're going, you can use that to refocus the instrumentation on emergent events. It's this ability to have a long-term presence in the ocean, and the ability to refocus the instrumentation on emergent events, that really represents the revolutionary change in the formation of this infrastructure.

Meisinger: Let me add, I'm very fascinated by the Hubble Space Telescope as something that produces fantastic imagery and fantastic insights into the universe. For me, as a computer scientist, it's often very difficult to imagine what users of the system would do with the system. I'd like to see the OOI as a platform that's developed by the experts in their fields -- who deploy the platforms, the buoys, the cables, the sensors into the ocean -- and that then enables the users of the system over 25 years to produce unprecedented knowledge and results out of that system.

The primary mission of our project is to provide this platform, the space telescope in the ocean. And it's not a single telescope. In our case, it's a set of 65 buoys and locations in the ocean, and even a cable that runs 1,000 miles along the seafloor of the Pacific Northwest and provides 10-gigabit Ethernet connectivity and high power to the instruments.

It's a model where scientists have to compete. They have to compete for a slot on that infrastructure. They'll have to apply for grants, and they'll have to reserve the spot, so that they can accomplish the best scientific discoveries out of that system.
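Picking up Matthew Arrott's point about refocusing instrumentation on emergent events, here is a minimal consumer-side sketch of that event-response loop, under the same illustrative assumptions as the earlier publishing example: watch a data stream, detect a change, and publish a retasking command to the responsible agent. The control exchange, threshold, and command format are hypothetical, not the OOI's actual command-and-control protocol.

```python
# A minimal sketch of the event-response pattern: subscribe to a topic
# stream, detect an emergent event, and publish a retasking command.
# Exchange names, routing keys, and the threshold are hypothetical.
import json

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="ooi.data", exchange_type="topic", durable=True)
channel.exchange_declare(exchange="ooi.control", exchange_type="topic", durable=True)

# An exclusive, broker-named queue bound to all CTD instrument streams.
queue = channel.queue_declare(queue="", exclusive=True).method.queue
channel.queue_bind(exchange="ooi.data", queue=queue, routing_key="instrument.ctd.*")

TEMP_SPIKE_C = 15.0  # illustrative threshold for an "emergent event"

def on_observation(ch, method, properties, body):
    obs = json.loads(body)
    if obs["variables"].get("temperature_c", 0.0) > TEMP_SPIKE_C:
        # Emergent event: ask the instrument's agent to sample faster.
        command = {
            "resource_id": obs["resource_id"],
            "command": "set_sample_interval",  # hypothetical agent command
            "seconds": 10,
        }
        ch.basic_publish(
            exchange="ooi.control",
            routing_key="agent." + obs["resource_id"],
            body=json.dumps(command),
        )

channel.basic_consume(queue=queue, on_message_callback=on_observation, auto_ack=True)
channel.start_consuming()
```

Because the command goes to the instrument's agent rather than straight to the hardware, the agent can still apply its owner's policies -- the federated governance point made above.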
It's kind of the analogy of the space telescope that will bring ocean scientists to the next level. This is our large platform, our large infrastructure, that lets the best scientists develop and research the best results. That's the fascination that I see as part of this project.

Gardner: For the average listener to understand, is this comparable to tracking the weather and the climate on the surface? Many of us, of course, get our weather forecasts, and they seem to be getting better. We have satellites, radar, measurements, and historical data to compare, and we have models of what weather should do. Is this, in some ways, taking the weather of the oceans? Is it comparable?

Arrott: Quite comparable. There's a movement to instrument the earth, so that we can understand from observation, as opposed to speculation, what the earth is actually doing, and, from a notion of climate and climate change, what we might be doing to the earth as participants on it.

The weather community, because of the commercial demand for weather data, has been well in advance of the other environmental sciences in this regard. What you'll find is that OOI is just one of several ongoing initiatives to do exactly what weather has done.

The work that I did at NCSA was with the atmospheric sciences community, and it was very clear at the time what they could do if they had the kind of resources that we now have here in the 21st century. We've worked with them and modeled much of our system on the systems that they built, both in the research area and in the operational area, in programs such as Nova.

Science more mature

Gardner: So, in a sense, we're following the path of what we've done with the weather and with understanding the climate on land. We're now moving into the oceans, but at a time when the computer science is more mature and, in fact, perhaps even more productive.

Back to you, Alexis Richardson. This is being sponsored by the US National Science Foundation, so being cost-efficient is very important, of course. How is it that cloud computing is being brought to bear, making this productive, and perhaps even ahead of where weather prediction has been, because we can now avail ourselves of some of the newer tools and models around data and cloud infrastructure?

Richardson: Happily, that's an easy one. Imagine if a person or scientist wanted to process very quickly a large amount of data that's come from the oceans to build a picture of the climate, the ocean, or anything to do with the coastal properties of the North American coast. They might need to borrow 10,000 or 20,000 machines for an hour, and they might need to have a vast amount of data readily accessible to those machines.

In the cloud, you can do that, and with big data technologies today, that is a realistic proposition. It was not 5-10 years ago. It's that simple.
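As a sketch of that "borrow the machines for an hour, then give them back" model, the sequence against a commercial infrastructure-as-a-service provider might look like the following, here using AWS's boto3 SDK for Python. The image ID, instance type, and counts are placeholders, and the OOI's own common execution infrastructure, which brokers across multiple providers, is not shown.

```python
# A minimal sketch of elastic, on-demand compute: acquire a batch of
# cloud machines, run a job, and release them. The AMI ID, instance
# type, and counts are placeholders, not a real deployment.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Borrow the machines: request up to 100 workers for the analysis run.
resp = ec2.run_instances(
    ImageId="ami-00000000000000000",  # placeholder image with the model code
    InstanceType="c5.xlarge",
    MinCount=1,
    MaxCount=100,
)
instance_ids = [inst["InstanceId"] for inst in resp["Instances"]]

# ... dispatch work to the instances, e.g. over the same AMQP fabric ...

# Give them back when the run is done, so someone else can use them.
ec2.terminate_instances(InstanceIds=instance_ids)
```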
Obviously, you need to have the technologies, like the messaging that we talked about, to get that data to those machines so it can be processed. But the cloud is really there to bring it all together and to make it seem to the application owner like something that's just ready for them to acquire, and when they don't need it anymore, they can put it back and someone else can use it.

Gardner: Back to you, Michael. How do you view the advent of cloud computing as a benefit to this sort of initiative? We've got a piece of it from Alexis, but I'd like to hear your perspective on why cloud models are enabling this, perhaps at an unprecedented scale, but also at a most efficient cost.

Meisinger: Absolutely. It does enable computing at an unprecedented scale, for exactly the reasons that Alexis mentioned. A lot of the earth's environment is changing. Assume that you're interested in tracking the effect of a hurricane somewhere in the ocean, and you're interested in computing a very complex numerical model that provides certain predictions about currents and other variables of the ocean. You want to do that when the hurricane occurs, and you want to do it quickly. Part of the strategy is to enable quick computation on demand.

The OOI architecture -- in particular, its common execution infrastructure subsystem -- is built to enable this access to computation and big data very quickly. You want to be able to make use of an execution provider's infrastructure as a service very quickly, to run your own models with the infrastructure that the OOI provides.

Then, there are other users that want to do things more regularly, and they might have their own hardware. They might run their own clusters, but in order to be interoperable, and in order to have excess overflow capabilities, it's very important to have cloud infrastructure as a means of making the system more homogeneous.

So the cloud is a way of abstracting the compute resources of the various participants of the system -- be they commercial or academic cloud computing providers, or institutions that provide their own clusters as cloud systems -- and they all form a large compute network, a compute fabric, so that they can run the computation in a predictable way, but also in a very episodic way.

Cloud as enabler

I really see the cloud paradigm as one of the enablers of doing this very efficiently, and it enables us, as a software infrastructure project, to develop the systems and the architecture to actually manage this computation from a system's point of view in a central way.

Gardner: Alexis, because of AMQP and the VMware Cloud Application Platform, it seems to me that you've been able to shop around for cloud resources, using the marketplace, because you've allowed for interoperability among and between platforms, applications, tools, and frameworks.
Is it the case that leveraging AMQP has given you the opportunity to go to where the compute resources are available at the lowest cost, when that's in your best interest?

Richardson: The dividend of interoperability for the end user and the end customer in this platform environment is ultimately portability -- portability through being able to choose where your application will run.

Michael described it very well. A hurricane is coming. Do you want to use the machines provided by the cloud provider here, for this price? Do you want to use your own servers? Maybe your neighboring data center has servers available to you -- provided those are visible, and provided there is this fundamental interoperability through cloud platforms of the type that we are investing in. Then you will be able to have that choice. And that lets you make these decisions in a way that you could not do before.

Gardner: I'm afraid we're almost out of time, but I want to try to compare this to what it will allow in other areas. It's been mentioned by Alexis and others that this has some common features with Twitter, Facebook, or Zynga. We think of the social environment because of the scale, the complexity, and the use of cloud models. But we're doing far more advanced computational activities here. This is simply not a display of 140 characters based on a very rudimentary search, for example. These are high-performance-computing-level, supercomputer-level types of requests and analysis.

So are we combining the best of a social-fabric approach, and the architecture behind that, with what we've been traditionally exposed to in high-performance computing and supercomputing? If so, what does that mean for how we could bring this to other types of uses in the future? I'll throw this out to any of you. How are we doing the best of the old and the new, and what does that mean for the future?

Meisinger: This is the direction in which the future will evolve, and it's the combination of proven patterns of interaction that are emerging out of how humans interact, applied to high-performance computing. Providing a strong platform, a strong technological footprint that's not specific to any technology, is a great benefit to the community out there.

Providing a reference architecture and a reference implementation that can solve these problems -- that social network for sensor networks and for device computation -- will be a pattern that can be leveraged by other interested participants, either by participating in the system directly or indirectly, or by just taking that pattern and the technologies that come with it and basically bringing it to the next level in the future. Developing it as one large project in a coherent set really yields a technology stack and an architecture that will carry us far into the future.

Arrott: The incremental change that we're introducing takes the concepts of Facebook and Twitter, and the notion of Dropbox, which is my ability to move a file to a shared place so someone else can pick it up later -- something that was really not possible long ago. I had to set up an FTP server or put up an HTTP server to accomplish that.
Sharing processes

What we are now adding to the mix is not just sharing artifacts, but actually sharing processes with one another, and then, specifically, sharing instrumentation. I can say to you, "Here, have a look through my telescope." You can move it around and focus it.

Basically, we introduced the concept of artifacts, or information resources, as well as the concept of a taskable resource, and the things that we're adding that can be shared are taskable resources.

Gardner: I'm just going to throw out a few blue-sky ideas that it seems this could be applicable to: things like genetics and the human genome, but on an individual basis; or crime statistics, in order to have better insight into human behavior at a massive scale; or perhaps even healthcare, where you're diagnosing specific types of symptoms and then correlating them across entire regions, or genetic patterns that would be brought to bear on those symptoms.

Am I off-base? Is this science fiction? Or am I perhaps pointing to where this sort of capability might go next?

Richardson: The answer to your question is yes, if you add one little phrase into that: in real time. If you're talking about crime statistics, as events happen on the streets, information is gathered, shared, and processed. As people go on jobs, if information is gathered, shared, and processed on how people are doing, then you will be able to have the kind of crime or healthcare benefits that you described. I'm sure we could think of lots of use cases. Transport is another one.

Arrott: At the institution in which the OOI Cyberinfrastructure is housed, the California Institute for Telecommunications and Information Technology (Calit2), all of the concerns that you've mentioned are, in fact, active development research programs, all of which have yielded significant improvements in the computational environment for that scientific community.

Gardner: Michael, last word to you. Where do you see this potentially going in terms of the capability? Obviously, it's a very important activity with the oceans. But the methods that you're defining, the implementations that you're perfecting -- where do you see them being applied in the not-too-distant future?

Meisinger: You're absolutely right. This pattern is very applicable, and it's not that frequent that a research and construction project of this size has the ability to provide an end-to-end technology solution to this challenge of big data combined with real-time analysis and real-time command and control of the infrastructure.

What I see this evolving into is, first of all, that you can take the solutions built in this project and apply them to other communities that are in need of such a solution. But then it could go further. Why not combine these communities into a larger system? Why not federate or connect all these
communities into a larger infrastructure that is all based on common ideas and common standards, and that still enables open participation?

It's a platform where you can plug in your own system or subsystem that you can then make available to whoever is connected to that platform, whoever you trust. So it can evolve into a large ecosystem, and that does not have to happen under the umbrella of one organization such as OOI.

Larger ecosystem

It can grow into a larger ecosystem of connected computing based on your own policies, your own technologies, your own standards, but where everyone shares a common piece of the same idea and can take whatever they want and not consume what they're not interested in.

Gardner: And as I said earlier, at that very interesting intersection of where you can find the most efficient compute resources available and avail yourself of them with that portability, it sounds like a really powerful combination.

We've been talking about how the Ocean Observatories Initiative and its accompanying Cyberinfrastructure Program have been not only feeding the means for the ocean to be better understood and climate interaction to be better appreciated, but we're also seeing how the architecture behind that is leading to the potential for many other big data, cloud fabric, real-time, compute-intensive applications.

I'd like to thank our guests. We've been joined by Matthew Arrott, Project Manager at the OOI Cyberinfrastructure. Thank you so much, Matthew.

Arrott: Thank you.

Gardner: We've also been joined by Michael Meisinger. He is the Managing Systems Architect for the OOI Cyberinfrastructure. Thank you, Michael.

Meisinger: Thanks, Dana.

Gardner: And Alexis Richardson, the Senior Director for the VMware Cloud Application Platform. Thank you, Alexis.

Richardson: Thank you very much.

Gardner: And this is Dana Gardner, Principal Analyst at Interarbor Solutions. Thanks to you, our audience, for listening, and come back next time.

Listen to the podcast. Find it on iTunes/iPod. Sponsor: VMware
Transcript of a BriefingsDirect podcast on how cloud and big data come together to offer researchers a treasure trove of new real-time information. Copyright Interarbor Solutions, LLC, 2005-2012. All rights reserved.

You may also be interested in:

  • Case Study: Strategic Approach to Disaster Recovery and Data Lifecycle Management Pays Off for Australia's SAI Global
  • Virtualization Simplifies Disaster Recovery for Insurance Broker Myron Steves While Delivering Efficiency and Agility Gains Too
  • SAP Runs VMware to Provision Virtual Machines to Support Complex Training Courses
  • Case Study: How SEGA Europe Uses VMware to Standardize Cloud Environment for Globally Distributed Game Development
  • Germany's Largest Travel Agency Starts a Virtual Journey to Get Branch Office IT Under Control