This document discusses using cloud computing services like Amazon Web Services to perform social network analysis on large-scale social network data. It proposes an architecture that uses a distributed hash table for storage, CouchDB for key-value storage, and MapReduce on Amazon's Elastic MapReduce service for analysis. The goal is to perform tasks like social graph analysis, visualization, relationship mining, and content clustering on massive social network datasets in a scalable way using cloud infrastructure.
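As a minimal sketch of the kind of MapReduce-style social graph analysis described above, the following Python example counts each user's number of connections (degree) from a friendship edge list. It runs in a single process purely for illustration; the function names and the sample edges are hypothetical and are not taken from the original slides, which target Amazon Elastic MapReduce rather than local code.

```python
from collections import defaultdict
from itertools import chain


def map_edge(edge):
    """Map step: emit (user, 1) for each endpoint of an undirected edge."""
    a, b = edge
    return [(a, 1), (b, 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a MapReduce framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_degree(user, counts):
    """Reduce step: sum the emitted ones to get the user's degree."""
    return user, sum(counts)


def degree_counts(edges):
    """Run map, shuffle and reduce over an edge list in one process."""
    mapped = chain.from_iterable(map_edge(e) for e in edges)
    return dict(reduce_degree(u, c) for u, c in shuffle(mapped).items())


# Hypothetical sample data: each tuple is one friendship link.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("carol", "dave")]
degrees = degree_counts(edges)
```

On a real cluster the map and reduce functions would run on separate workers and the shuffle would be handled by the framework; only the per-record logic would stay the same.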
Cloud applications - Protein Structure Predication and gene expression data...Pushpendra Singh Dangi
This document discusses using cloud computing for protein structure prediction and gene expression data analysis. Protein structure prediction is a computationally intensive task that determines the 3D structure of proteins from their amino acid sequences. Cloud computing allows this task to be parallelized across multiple machines to reduce computational time. Gene expression profiling measures thousands of genes and is used for cancer prediction and diagnosis. Analyzing large gene expression datasets for cancer classification is solved using an extended classifier system on cloud infrastructure to further divide and parallelize the problem.
Satellite image processing involves correcting satellite images for defects, overlaying the 2D images onto a 3D model of Earth, and applying the images for scientific and practical uses. Early satellite photographs from the 1940s and 1950s provided initial images of Earth and the Moon. Modern satellite image processing utilizes large amounts of data from numerous sensors and satellites to monitor the planet. Cloud computing provides advanced infrastructure for processing large satellite images, performing corrections, and generating meaningful results through on-demand public and private cloud resources.
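This material summarizes the overall process of analyzing big data and describes how various fields, including the social sciences as well as computer engineering, statistics and mathematics, can take part in that process. The detailed stages of the analysis process are described mainly from a social science perspective. The material draws on experience gained from doing data analysis from a social science perspective since 2010, together with literature and presentation materials. The author plans to keep developing the big data analysis process while studying further fields.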
Cloud computing has been a growing research topic in recent years. Its key concept is to provide a resource sharing model based on virtualization, distributed file systems, parallel algorithms and web services. But how can we provide a testbed for cloud computing training courses? In this talk we will share our experience of building a cloud computing testbed for virtualization, high-throughput computing and bioinformatics applications. It covers many open source projects, such as DRBL, Xen, Hadoop and bioinformatics-related applications.
In short, Diskless Remote Boot in Linux (DRBL) provides a diskless or systemless environment for client machines. It works on Debian, Ubuntu, Mandriva, Red Hat, Fedora, CentOS and SuSE. DRBL uses distributed hardware resources and makes it possible for clients to fully access local hardware.
Xen is one of the open source hypervisors for the Linux kernel. It has been used in the Amazon EC2 production environment to provide cloud service model (1): "Infrastructure as a Service (IaaS)". In this talk, we will show how DRBL can help with fast deployment of a Xen playground in the classroom.
Hadoop is becoming a well-known open source cloud computing technology developed by the Apache community. It is a very powerful tool for data mining. It has been used in Yahoo and Facebook production environments to provide cloud service model (2): "Platform as a Service (PaaS)". It is easy to set up a single Hadoop node but difficult to manage a Hadoop cluster. In this talk, we will show how DRBL can help with fast deployment and management.
Most bioinformatics applications are open source, such as R, Bioconductor, BLAST, Clustal, PipMaker, Phylip, etc. But they also require traditional cluster job submission. In this talk we will show how DRBL can help build a testbed for bioinformatics research and provide cloud service model (3): "Software as a Service (SaaS)". We will cover how to:
- 1. Use DRBL to deploy Xen virtual cluster (drbl-xen)
- 2. Use DRBL to deploy Hadoop cluster (drbl-hadoop)
- 3. Use DRBL to deploy bioinformatics cluster (drbl-biocluster)
A live demonstration of drbl-hadoop and drbl-biocluster will also be given in the talk.
This document discusses big data and cloud computing. It introduces cloud storage and computing models. It then discusses how big data requires distributed systems that can scale out across many commodity machines to handle large volumes and varieties of data with high velocity. The document outlines some famous cloud products and their technologies. Finally, it provides an overview of the company's focus on enterprise big data management leveraging cloud technologies, and lists some of its cloud products and services including data storage, object storage, MapReduce and compute cloud services.
Grid and Cloud Computing Lecture-2a.pptxDrAdeelAkram2
The document discusses grid architecture and tools. It covers the hourglass model of grid architecture, which focuses on core services to enable diverse solutions. It also discusses the layered grid architecture with four layers - fabric, connectivity, collective, and application. Simulation tools for modeling grid environments like GridSim are presented. The document then discusses clouds and defines cloud computing. Key aspects of clouds like scalability, virtualization, and on-demand services are covered. It compares clouds to grids and other paradigms. Finally, it introduces service-oriented architecture and defines the characteristics of services.
This document summarizes tools for auditing and securing AWS infrastructure:
Cloudmapper visualizes AWS infrastructure and finds misconfigurations using commands like "audit" and "collect". Scout Suite provides detailed reports on individual AWS services' security. CloudTrail monitors API calls but requires processing logs. GuardDuty detects threats in real-time but is expensive. Together these tools can monitor for issues, but real-time response still requires manual incident response.
Cloud 2.0 - How Containers, Microservices and Open Source Software are Redefi...Mark Hinkle
Led by the rocket-like success of Amazon Web Services, cloud computing is a paradigm shift in the way we host and deploy infrastructure. Organizations are consuming cloud infrastructure across multiple cloud providers, both inside their own data centers and in the data centers of others. The advent of highly portable workloads via containers (e.g. Docker) and discrete units of computing delivered by microservices is enabling organizations (like Netflix) to deploy complex multi-layered products and services at breakneck speeds.
This talk will give an overview of the major cloud services and the open source software (e.g. OpenStack, Apache CloudStack) that can be used to deliver and manage cloud computing infrastructure (e.g. Puppet, Chef, Ansible). The discussion will cover the evolution of cloud computing and how that sets the stage for realizing the agility, flexibility and power of cloud computing.
Attendees should expect to learn about the leading technologies in cloud computing and strategies for using open source software to create and manage cloud computing services, and to gain an understanding of how current developments are providing a way to create a single cloud fabric that best serves their individual needs.
Stateful Microservices with Apache Kafka and Spring Cloud Stream with Jan Svo...HostedbyConfluent
You have been building your applications with stateless microservices. You might even be a rockstar using Kafka for inter service communication. Everything works wonderfully but you feel you could do something more. You want your microservices to have a state.
Developing stateful microservices can be hard. I will share my experience with building stateful applications with Kafka and Spring Cloud Stream libraries.
Kafka Streams State Stores and Interactive Queries are the main building blocks. They are used by stream processing applications to store and query data. They can scale and be fault tolerant together with your application instances in your container platform. But there are some limitations and we need to know how to monitor their performance.
This session is targeted for developers who are interested in learning event streaming practices. Demo application code will be available to participants.
Towards CloudML, a Model-Based Approach to Provision Resources in the CloudsSébastien Mosser
The Cloud-computing paradigm advocates the use of resources available “in the clouds”. Given the multiplicity of cloud providers, it becomes cumbersome to manually tackle this heterogeneity. In this paper, we propose to define an abstraction layer used to model resources available in the clouds. This cloud modelling language (CloudML) allows cloud users to focus on their needs, i.e., modelling the resources they expect to retrieve in the clouds. An automated provisioning engine is then used to automatically analyse these requirements and actually provision resources in clouds. The approach is implemented, and was tested on prototypical examples to provision resources in major public clouds (e.g., Amazon EC2 and Rackspace).
This document provides an agenda and overview of topics to be covered in a session on Google Cloud infrastructure and services, including cloud storage, monitoring, functions, pub/sub, IAM, BigQuery, Cloud SQL, VPC networks, and Kubernetes Engine. It also includes primers on cloud storage, monitoring, functions, and pub/sub that define their purposes and capabilities. Hands-on examples for working with containers using Docker are outlined at the end.
Session presented at the 2nd IndicThreads.com Conference on Cloud Computing held in Pune, India on 3-4 June 2011.
http://CloudComputing.IndicThreads.com
Abstract: Cloud computing is no longer a buzz term but a reality. With a great opportunity for huge financial savings and demand for Software-as-a-Service products, developing products for the cloud is something that cannot be ignored. In this talk, I would like to touch upon 3 key aspects of cloud engineering – scalability, security and flexibility and its impact on application architecture, data processing needs and deployment.
* By Manjusha Madabushi, Co-Founder and CTO of Talentica Software Pvt. Ltd.
Speaker: Manjusha is a Co-Founder and CTO of Talentica Software Pvt. Ltd. She has a Bachelor’s degree from IIT Mumbai and a Master’s degree from Northwestern University, Chicago. She has over 23 years experience working in the IT industry. She started her career working for Amoco Research Centre, USA till 1989 before returning to India and joining TCS. During her 9 year career at TCS, Manjusha worked in different technology areas such as Artificial Intelligence, Application Modeling, Compilers etc. She was also the Engineering head of the TCS’ product – E.X. NGN. Post TCS, she founded Nitman Software, which was acquired by a US based CRM company, eGain Communications in the year 2000. She co-founded Talentica Software, a company that helps technology companies transform their ideas into successful products in 2003. Talentica specializes in building highly scalable products using cutting edge technologies in the areas of Social Analytics, CRM, Natural Language processing and Advertising.
O Outro Lado BSidesSP Ed. 5 - As Nove Principais Ameaças na Computação em NuvemAndre Serralheiro
The goal of this presentation is to discuss the main threats to cloud computing, based on the document "As Nove Principais Ameaças na computação em Nuvem" (The Nine Main Threats in Cloud Computing) released by the Cloud Security Alliance in early 2013. Based on a survey conducted among its members, this document provides the context needed to help organizations make risk decisions when analyzing their cloud computing adoption strategies.
The Cloud is dead ?! Blockchain in the new cloudYuval Birenboum
The document discusses the potential for a "decentralized cloud" using blockchain technology for decentralized applications (Dapps), computing, databases, and storage. It defines blockchain and how it allows for a distributed ledger and peer-to-peer transactions without a central authority. The benefits of decentralization are a reduced need for trust and reduced privacy risks, while unleashing resources at the network edges. Decentralized storage and databases on the blockchain could provide an alternative to current cloud-based services.
OW2con'16 Keynote address: Kubernetes, the rising tide of systems administrat...OW2
Kubernetes, the rising tide of systems administration. Containers and cloud have moved from "why" to "how and when?" Learn how Google is helping the world go Cloud Native.
* Explore the similar histories of the cloud and Content Delivery Networks (CDNs)
* Discover the future relationship between CDNs and emerging cloud platforms as the lines of distinction continue to blur
* Learn from real world use cases in which these technologies interact together
Building realtime data applications that can seamlessly run and integrate data across On Prem, and multiple public cloud vendors. How Hybrid Cloud can help tackle regulatory requirements for Data Sovereignty, Stressed Exit, and operational resilience.
Cloud computing and grid computing 360 degree comparedMd. Hasibur Rashid
Cloud computing builds upon concepts from cluster and grid computing. Cluster computing links multiple computers to share workloads, while grid computing dynamically aggregates distributed resources for tasks. Cloud computing provides scalable resources and services over the internet. It extends concepts from grid computing by offering virtualized, dynamically provisioned resources on-demand. Key differences are that cloud computing has loose coupling between providers and consumers, supports scaling, and offers services under a pay-per-use business model. Common cloud services are SaaS, PaaS, and IaaS. Challenges include dynamic scalability, security, and standardization. Cloud computing shows promise for further research in areas like security, interoperability and dynamic pricing models.
Content Delivery Using Amazon CloudFront - AWS Presentation - John MancusoAmazon Web Services
CloudFront is Amazon's content delivery network (CDN) that caches copies of content across a global network of edge servers to improve performance and reduce latency. It uses a distribution configuration to determine how to route requests for content to the optimal edge location. Origins specify the source of the content. CloudFront delivers content through its edge locations, improving load times, providing high bandwidth, and ensuring availability. Many companies use CloudFront to deliver media, software downloads, web assets and even dynamic content through features like cache behaviors and multiple origins. Getting started is self-service through the AWS Management Console or APIs.
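The idea of cache behaviors routing requests to multiple origins can be illustrated with a small Python sketch. It is a simplified model, not CloudFront's implementation: it assumes behaviors are checked in listed order, that the first matching path pattern wins, and that a default origin handles everything else. The origin names and patterns are hypothetical, and Python's fnmatchcase is used only as a stand-in for wildcard matching (it accepts some syntax, such as bracket sets, that a real CDN configuration may not).

```python
from fnmatch import fnmatchcase

# Hypothetical ordered cache behaviors: (path pattern, origin name).
BEHAVIORS = [
    ("/api/*", "app-server-origin"),
    ("/images/*.jpg", "media-bucket-origin"),
]

# Hypothetical default origin used when no pattern matches.
DEFAULT_ORIGIN = "static-site-origin"


def pick_origin(path):
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in BEHAVIORS:
        # Case-sensitive wildcard match; '*' also matches '/' here.
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN
```

For example, pick_origin("/api/users") selects the application server origin, while a path such as "/index.html" matches no pattern and falls through to the default origin.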
Cloud lockin and interoperability v2 indic threads cloud computing conferen...IndicThreads
This document discusses cloud lock-in and interoperability. It begins with recapping cloud computing concepts like deployment models and service models. It then defines cloud lock-in and discusses how portability and interoperability can help address lock-in concerns. Emerging standards from groups like DMTF, SNIA, and CSA that aim to improve interoperability are presented. Best practices for vendors and customers to reduce lock-in are outlined. While lock-in exists now due to proprietary systems, the future of interoperability is promising. Standards and informed customer decisions can help minimize negative impacts of lock-in.
Cloud lockin and interoperability v2 indic threads cloud computing conferen...IndicThreads
This document discusses cloud lock-in and interoperability. It begins with recapping cloud computing concepts like deployment models and service models. It then defines lock-in, portability, and interoperability. Lock-in occurs when there are significant costs to switch cloud vendors. The document discusses how portability and interoperability benefit customers by increasing choice and lowering costs. It provides examples of lock-in for different cloud platforms and analyzes emerging standards from groups like DMTF, SNIA, and CSA. Best practices are outlined to minimize lock-in for IaaS, PaaS and SaaS. The document concludes that while lock-in exists now, interoperability is improving and portability
Presentation given at the 2nd International Workshop on Web Intelligence & Virtual Enterprises (WIVE'10), held at the 11th IFIP Working Conference on Virtual Enterprises (PRO-VE'10)
http://www.emse.fr/wive/
Cloud Trends for 2017 and Actions You Can Take NowRightScale
This document discusses 10 cloud trends for 2017 and actions that can be taken. The trends include: 1) Enterprises using multiple public clouds like Azure and Google Cloud; 2) Infrastructure as a service (IaaS) and platform as a service (PaaS) converging; 3) Governance becoming more important; 4) Cloud security matching on-premises security; 5) Growing use of containers with orchestration tools; 6) Emergence of serverless computing via functions as a service (FaaS); 7) Slowing price cuts from cloud providers; 8) Opportunities to reduce cloud spending by 30-45%; 9) Lack of cloud expertise being a top challenge; 10) High demand
OSCON 2013 - The Hitchiker’s Guide to Open Source Cloud ComputingMark Hinkle
And while the Hitchhiker’s Guide to the Galaxy (HHGTTG) is a wholly remarkable book, it doesn’t cover the nuances of cloud computing. Whether you want to build a public, private or hybrid cloud, there are free and open source tools that can help provide you a complete solution or help augment your existing Amazon or other hosted cloud solution. That’s why you need the Hitchhiker’s Guide to (Open Source) Cloud Computing (HHGTCC), or at least to attend this talk to understand the current state of open source cloud computing. This talk will cover infrastructure-as-a-service, platform-as-a-service and developments in big data, and how to more effectively deploy and manage open source flavors of these technologies. Specifically, the guide will cover:
Infrastructure-as-a-Service – The Systems Cloud – Get a comparison of the open source cloud platforms including OpenStack, Apache CloudStack, Eucalyptus and OpenNebula
Platform-as-a-Service – The Developers Cloud – Learn about the tools that abstract the complexity for developers and are used to build portable auto-scaling applications on CloudFoundry, OpenShift, Stackato and more.
Data-as-a-Service – The Analytics Cloud – Want to figure out the who, what, where, when and why of big data? You’ll get an overview of open source NoSQL databases and technologies like MapReduce to help parallelize data mining tasks and crunch massive data sets in the cloud.
Network-as-a-Service – The Network Cloud – The final pillar for truly fungible network infrastructure is network virtualization. We will give an overview of software-defined networking including OpenStack Quantum, Nicira, Open vSwitch and others.
Finally, this talk will provide an overview of the tools that can help you really take advantage of the cloud. Do you want to auto-scale to serve millions of web pages and scale back down as demand fluctuates? Are you interested in automating the total lifecycle of cloud computing environments? You’ll learn how to combine these tools into tool chains to provide continuous deployment systems that will help you become agile and spend more time improving your IT rather than simply maintaining it.
[Finally, for those of you that are Douglas Adams fans please accept the deepest apologies for bad analogies to the HHGTTG.]
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Similar to SNS Analysis using Cloud Computing Services
Cloud lockin and interoperability v2 indic threads cloud computing conferen...IndicThreads
This document discusses cloud lock-in and interoperability. It begins with recapping cloud computing concepts like deployment models and service models. It then defines cloud lock-in and discusses how portability and interoperability can help address lock-in concerns. Emerging standards from groups like DMTF, SNIA, and CSA that aim to improve interoperability are presented. Best practices for vendors and customers to reduce lock-in are outlined. While lock-in exists now due to proprietary systems, the future of interoperability is promising. Standards and informed customer decisions can help minimize negative impacts of lock-in.
Cloud lockin and interoperability v2 indic threads cloud computing conferen...IndicThreads
This document discusses cloud lock-in and interoperability. It begins with recapping cloud computing concepts like deployment models and service models. It then defines lock-in, portability, and interoperability. Lock-in occurs when there are significant costs to switch cloud vendors. The document discusses how portability and interoperability benefit customers by increasing choice and lowering costs. It provides examples of lock-in for different cloud platforms and analyzes emerging standards from groups like DMTF, SNIA, and CSA. Best practices are outlined to minimize lock-in for IaaS, PaaS and SaaS. The document concludes that while lock-in exists now, interoperability is improving and portability
Presentation given at the 2nd International Workshop on Web Intelligence & Virtual Enterprises (WIVE'10) held at the 11th IFIP Working Conference on Virtual Enterprises (PRO-VE'10)
http://www.emse.fr/wive/
Cloud Trends for 2017 and Actions You Can Take NowRightScale
This document discusses 10 cloud trends for 2017 and actions that can be taken. The trends include: 1) Enterprises using multiple public clouds like Azure and Google Cloud; 2) Infrastructure as a service (IaaS) and platform as a service (PaaS) converging; 3) Governance becoming more important; 4) Cloud security matching on-premises security; 5) Growing use of containers with orchestration tools; 6) Emergence of serverless computing via functions as a service (FaaS); 7) Slowing price cuts from cloud providers; 8) Opportunities to reduce cloud spending by 30-45%; 9) Lack of cloud expertise being a top challenge; 10) High demand
OSCON 2013 - The Hitchiker’s Guide to Open Source Cloud ComputingMark Hinkle
And while the Hitchhiker’s Guide to the Galaxy (HHGTTG) is a wholly remarkable book, it doesn’t cover the nuances of cloud computing. Whether you want to build a public, private or hybrid cloud, there are free and open source tools that can provide a complete solution or augment your existing Amazon or other hosted cloud solution. That’s why you need the Hitchhiker’s Guide to (Open Source) Cloud Computing (HHGTCC), or at least to attend this talk to understand the current state of open source cloud computing. This talk will cover infrastructure-as-a-service, platform-as-a-service and developments in big data, and how to more effectively deploy and manage open source flavors of these technologies. Specifically, the guide will cover:
Infrastructure-as-a-Service – The Systems Cloud – Get a comparison of the open source cloud platforms including OpenStack, Apache CloudStack, Eucalyptus and OpenNebula
Platform-as-a-Service – The Developers Cloud – Learn about the tools that abstract the complexity for developers and are used to build portable auto-scaling applications on CloudFoundry, OpenShift, Stackato and more.
Data-as-a-Service – The Analytics Cloud – Want to figure out the who, what, where, when and why of big data? You’ll get an overview of open source NoSQL databases and technologies like MapReduce to help parallelize data mining tasks and crunch massive data sets in the cloud.
Network-as-a-Service – The Network Cloud – The final pillar for truly fungible network infrastructure is network virtualization. We will give an overview of software-defined networking including OpenStack Quantum, Nicira, Open vSwitch and others.
Finally, this talk will provide an overview of the tools that can help you really take advantage of the cloud. Do you want to auto-scale to serve millions of web pages and scale back down as demand fluctuates? Are you interested in automating the total lifecycle of cloud computing environments? You’ll learn how to combine these tools into tool chains to provide continuous deployment systems that will help you become agile and spend more time improving your IT rather than simply maintaining it.
[Finally, for those of you that are Douglas Adams fans please accept the deepest apologies for bad analogies to the HHGTTG.]
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Project Management Semester Long Project - Acuityjpupo2018
Acuity is an innovative learning app designed to transform the way you engage with knowledge. Powered by AI technology, Acuity takes complex topics and distills them into concise, interactive summaries that are easy to read & understand. Whether you're exploring the depths of quantum mechanics or seeking insight into historical events, Acuity provides the key information you need without the burden of lengthy texts.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Digital Marketing Trends in 2024 | Guide for Staying AheadWask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
What do a Lego brick and the XZ backdoor have in common? Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might have in common the fact that they are both building blocks, or dependencies of creative and software projects. The reality is that a Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to immerse yourself in a story of interoperability, standards and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several events, migrations and training activities related to LibreOffice. She previously worked on LibreOffice migrations and training courses for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not following her passion for computers and for Geeko she cultivates her curiosity about astronomy (from which her nickname deneb_alpha derives).
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
1. PlatformDay2009
SNS Analysis using Cloud Computing Services
DHT-based Key-Value Storage and MapReduce-based Analysis
DongWoo Lee
oiko.cloud@gmail.com
S Oiko
Laboratory
D SocialFlow
OikoLab 2
CloudKR
2. Agenda
‣ Introduction
• Social Network Service
• Motivation : Visualization, Social Network Analysis
• SocialFlow
• Scale Out Technologies : Cloud Computing
‣ SNS Analysis Architecture based on Cloud
• Overall Process
• Crawling
• DHT Storage (CouchDB)
• MapReduce
• Pair-Wise Similarity
‣ Cloud Computing Service
• Amazon Web Service
• EC2 / S3 / Elastic MapReduce
• Tips
‣ References
3. Introduction
Social Network, Cloud Computing, Mobile Device
4. Social Network Service
“Social Applications = Social Networks”
“A social network is a collection of people bound together
through a specific set of social relations.”
“A collection of people is a social network if and only if it is
possible for something to spread virally through that collection.”
13. SocialFlow
‣ Thoughts, Feelings, Interests, Relationship and Information of SNS
‣ Real-time Massive Social Data Streams
‣ Difficult to follow the Social Streams
‣ Need a way to get a summary or clustered information based on Common Interests
14. SocialFlow
‣ Getting Common Flows of people through Content Similarities
‣ Reflecting Short-Term Interests of People
‣ Extracting Hot Issues
‣ Revealing Relationships among In/Out Resources
‣ Implementing Scale-Out Technologies
‣ Evolving toward a Recommendation System based on Collective Intelligence
19. Experimental Project
‣ Python / Django / Boto
‣ ML / Data Mining
‣ DHT / CouchDB
‣ Cloud / AWS S3, EC2, Hadoop MapReduce
20. Workflow
[Diagram: SNS → Crawler → MapReduce → Post-Processing → CDN → User, split between an In-house Cluster (Local DataCenter) and a Cloud Service.]
21. Technologies : Before
‣ Crawler (several instances)
‣ Consistent DHT: Hash_ring
‣ MapReduce: CouchJS
‣ Key-Value Storage: CouchDB
‣ Machine Learning: Home Made
22. Technologies : After
‣ Crawler (several instances)
‣ Storage: S3
‣ EC2
‣ Consistent DHT: Hash_ring
‣ MapReduce: Hadoop
‣ Key-Value Storage: CouchDB
‣ Machine Learning: Home Made
23. Crawling
‣ Fetching recent postings of SNS
‣ Storing fetched postings to CouchDB Storage through the DHT Layer (which selects a server)
‣ Pushing raw data into the Cloud to process them with MapReduce
[Diagram: several Crawlers write postings through the DHT (with Replication) into CouchDB DBs. An Indexer (Mapper) reads the DBs and writes [ term, doc ] records to an Index File.]
24. Consistent DHT (Distributed Hash Table)
‣ Uniform key distribution and load balancing with a good hash function
‣ Minimizing the effects of a storage crash or temporary outage
‣ High availability with a replication scheme
[Diagram: hash ring with key positions 0, 1, 2, ..., k-1, k, k+1, ..., N-1, each owned by a node (Node k-1, Node k, Node k+1, ..., Node N-1). Replicas of Node k are kept on its neighbours: Replicate(k, k-1, k+1).]
‣ Notice: A real node has non-linear portions of the total key space.
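The ring on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Hash_ring library the deck uses: the class name `HashRing`, the `vnodes` parameter and the choice of MD5 are assumptions. Placing each node at many scattered points is one reading of the notice about non-linear portions of the key space. As a simplification, replicas go to the next distinct nodes clockwise rather than to both neighbours as in Replicate(k, k-1, k+1).

```python
import bisect
import hashlib


def key_position(key, space=2**32):
    """Map a string to a ring position with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % space


class HashRing:
    """Hypothetical consistent-hash ring; not the deck's actual Hash_ring code."""

    def __init__(self, nodes, vnodes=64):
        # Every real node owns several scattered points on the ring, so its
        # share of the key space is made of many non-contiguous arcs.
        self._points = sorted(
            (key_position("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._points]
        self._node_count = len(set(nodes))

    def lookup(self, key, replicas=2):
        """Return the owner of `key` followed by the next distinct nodes clockwise."""
        wanted = min(replicas, self._node_count)
        index = bisect.bisect_right(self._positions, key_position(key))
        found = []
        while len(found) < wanted:
            node = self._points[index % len(self._points)][1]
            if node not in found:
                found.append(node)
            index += 1
        return found
```

A crawler would call `HashRing(["couch-a", "couch-b", "couch-c"]).lookup(posting_id)` (server names are placeholders) and write the posting to every server returned. When a server disappears, only the keys it owned move to another server; all other keys keep their owner.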
25. Consistent DHT (Distributed Hash Table)
[Diagram: Admin traffic goes to the Admin View and anonymous user traffic goes to the User View. The SNS Crawler and SNS Analysis produce Generated Contents (html, image) stored on AWS S3. Behind a DHT Front End with a Memory Cache sits the DHT ring of nodes (Node k-1, Node k, Node k+1, ..., Node N-1).]
26. Consistent DHT : Replication
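* Replica = 2
[Diagram: ring of four nodes A, B, C and D. Each node stores two key ranges: A holds A and B, B holds B and C, C holds C and D, D holds D and A. Range B, for example, therefore exists as two replicas.]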
30. Map & Reduce : Pair-Wise Similarity
[Diagram: four chained stages]
DB → Indexer (Mapper) → Index File: [ term, doc ]
Index File → Grouper (Reducer) → Doc Group File: [ term, { docs } ]
Doc Group File → Combinator (Mapper) → Doc Candidate File: [ doc1, doc2 ]
Doc Candidate File → DocPair Counter (Reducer) → Result File: [ freq, doc1, doc2 ]
‣ Indexer and Grouper for Processing Korean.
‣ No NLP and No Structural Analysis.
‣ Produce a pairwise similarity between two postings.
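The four stages can be tried in memory before moving them to Hadoop. The sketch below is only an illustration of the record shapes named on the slide: the function names are hypothetical, and splitting on whitespace is an assumption standing in for whatever tokenisation the real Indexer applies to Korean text.

```python
from collections import defaultdict
from itertools import combinations


def index_map(doc_id, text):
    """Indexer (mapper): emit one [term, doc] record per distinct term."""
    for term in set(text.lower().split()):  # whitespace split is an assumption
        yield term, doc_id


def group_reduce(records):
    """Grouper (reducer): collect the records into [term, {docs}]."""
    groups = defaultdict(set)
    for term, doc_id in records:
        groups[term].add(doc_id)
    return groups


def combine_map(groups):
    """Combinator (mapper): emit a [doc1, doc2] candidate for every shared term."""
    for docs in groups.values():
        for pair in combinations(sorted(docs), 2):
            yield pair


def count_reduce(pairs):
    """DocPair counter (reducer): [freq, doc1, doc2], most similar pairs first."""
    counts = defaultdict(int)
    for pair in pairs:
        counts[pair] += 1
    return sorted(((freq, d1, d2) for (d1, d2), freq in counts.items()), reverse=True)


def pairwise_similarity(postings):
    """Run the four stages over a {doc_id: text} dictionary."""
    records = (rec for doc_id, text in postings.items() for rec in index_map(doc_id, text))
    return count_reduce(combine_map(group_reduce(records)))
```

Here the similarity of two postings is simply the number of terms they share, which matches the "No NLP" bullet: no parsing is needed, only counting.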
31. Map & Reduce : Optimization
‣ Concerns
• Consider Key Group Size Distribution
• Data Load Balancing
• Barrier Point
‣ Sample Data
• Two months of postings from my friends
• Reachable graph: 4,060 people
• Total postings: 206,115
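Why the key group size distribution matters can be shown with a little arithmetic: a term shared by n postings yields n(n-1)/2 candidate pairs, so one very common term can dominate the whole job. The helper names below are hypothetical and the group sizes in the usage note are made up for illustration; they are not measurements from the sample data.

```python
def candidate_pairs(group_size):
    """Number of [doc1, doc2] pairs produced by one term group of this size."""
    return group_size * (group_size - 1) // 2


def pair_load(group_sizes):
    """Return (total pairs, share of the total produced by the largest group)."""
    loads = [candidate_pairs(size) for size in group_sizes]
    total = sum(loads)
    share = max(loads) / total if total else 0.0
    return total, share
```

For example, `pair_load([2, 2, 2, 1000])` reports that the single 1,000-posting group produces almost all of the pairs. Whichever worker receives that group finishes last and holds up the barrier before the next stage, which is presumably why skewed groups are listed as a load-balancing concern.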
39. Before the Cloud Age
‣ Smart Shell Guru’s Daily Work : Parallel Sort
$ wc -l data
$ split -l 1000k data
(scp / NFS to the worker machines)
$ nohup ./work.sh data1 > data1.processed
$ nohup sort -r data1.processed > data1.sorted
(scp / NFS back from the worker machines)
$ sort -rm data*.sorted > data.sorted
Complexity
➡ Need to prepare/maintain physical machines and resources
➡ Need to monitor job progress (wait and see job’s status)
➡ Need to cope with machine failure (slave nodes / storages / networks)
➡ Need to schedule multiple jobs
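The same split, sort and merge pattern fits in one short Python function, which makes the structure of the shell recipe easier to see. This is a sketch under assumptions: threads stand in for the separate machines, the `./work.sh` processing step is left out, and the function names are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split(lines, chunk_size):
    """Like `split -l`: cut the input into pieces of at most chunk_size lines."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def sort_chunk(chunk):
    """Like `sort -r` on one piece: descending order."""
    return sorted(chunk, reverse=True)


def parallel_sort(lines, chunk_size=1000, workers=4):
    """Sort the pieces concurrently, then merge them like `sort -rm`."""
    chunks = split(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sort_chunk, chunks))
    # heapq.merge only works on inputs that are already sorted, which is
    # exactly the guarantee `sort -m` relies on as well.
    return list(heapq.merge(*sorted_chunks, reverse=True))
```

None of the four complexity points on the slide go away in this version: provisioning, monitoring, failure handling and scheduling are what a MapReduce service takes over.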
43. AWS : Paid AMI / The Cloud Market
AMI: Amazon Machine Image
Paid AMI
44. AWS : How to make an AMI (1)
Loopback File
# dd if=/dev/zero of=new_image.fs bs=1M count=1024
Make ext3 file system
# mke2fs -F -j new_image.fs
# mkdir /mnt/ec2-fs
# mount -o loop new_image.fs /mnt/ec2-fs
# mkdir /mnt/ec2-fs/dev
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x console
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x null
# /sbin/MAKEDEV -d /mnt/ec2-fs/dev -x zero
# mkdir /mnt/ec2-fs/etc
Create /mnt/ec2-fs/etc/fstab (Add /dev/sda1 --> /, /dev/pts, /dev/shm, /proc, /sys)
Create yum-xen.conf
# mkdir /mnt/ec2-fs/proc
# mount -t proc none /mnt/ec2-fs/proc
# yum -c yum-xen.conf --installroot=/mnt/ec2-fs -y groupinstall Base
Edit /mnt/ec2-fs/etc/sysconfig/network-scripts/ifcfg-eth0
Edit /mnt/ec2-fs/etc/sysconfig/network
Edit /mnt/ec2-fs/etc/fstab (Add /dev/sda2 --> /mnt, /dev/sda3 --> swap)
# chroot /mnt/ec2-fs /bin/sh
Edit services
45. AWS : How to make an AMI (2)
Building an AMI
# yum install ruby
# rpm -i ec2-ami-tools-noarch.rpm (Download from public s3 bucket)
# ec2-bundle-image -i new_image.fs -k my-private-key.key -u aws-user-id
Local Machine Root File System
# ec2-bundle-vol -k my-private-key.key -s 1000 -u aws-user-id
Upload to S3
# ec2-upload-bundle -b my-bucket -m image.manifest -a my-aws-access-key-id -s my-secret-key-id
Register AMI
# ec2-register my-bucket/image.manifest
IMAGE ami-xxxx
Testing
# ec2-describe-images ami-xxxx
Deregister AMI
# ec2-deregister ami-xxxx
Running AMI
# ec2-run-instances ami-xxxx -n 1
http://docs.amazonwebservices.com/AWSEC2/2006-06-26/DeveloperGuide/
56. AWS: Elastic MapReduce
‣ Failed tasks will be rescheduled in other Hadoop slaves.
‣ If a task is finished, the same instance will be killed by a tracker.
58. AWS: SocialFlow Automation
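[Diagram: three zones, Home IDC, Amazon and the Wild World, with Local and Global results. Admin and the DHT (Read/Write) sit on the home side, S3 and Users (Read Only) on the public side, with a Renderer between them. A Python script using boto launches the EC2 pool.]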
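Launching the EC2 pool from Python might look like the sketch below. It assumes boto 2.x, where an EC2 connection offers `run_instances` and `terminate_instances`. The helper names, the default instance type and the key pair argument are illustrative and not taken from the deck. The connection is passed in as a parameter so the helpers can be exercised without touching AWS.

```python
def launch_pool(conn, ami_id, count, key_name, instance_type="m1.small"):
    """Start `count` instances of one AMI and return their instance ids.

    `conn` is assumed to be a boto 2.x EC2 connection, for example the
    object returned by boto.connect_ec2().
    """
    reservation = conn.run_instances(
        ami_id,
        min_count=count,
        max_count=count,
        key_name=key_name,
        instance_type=instance_type,
    )
    return [instance.id for instance in reservation.instances]


def terminate_pool(conn, instance_ids):
    """Shut the pool down once the job has finished, so billing stops."""
    if instance_ids:
        conn.terminate_instances(instance_ids=instance_ids)
```

In a real run, `conn` would come from `boto.connect_ec2()` with credentials in the environment or in a boto config file, and `ami_id` would be the image registered with `ec2-register`.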
64. 10 Cent Tips
‣ AWS EC2
‣ Minimizing set-up time with prepared shell scripts
‣ Use Boto for automating deployments
‣ Use S3 (data transfer between S3 and EC2 in the same region is free of charge)
‣ $0.030 per GB through June 30, 2000 ($0.1 per GB normal price)
‣ AWS Elastic MapReduce
‣ Enabling the SSH port (22) and Hadoop-related ports (9100, 9101)
‣ Access to Master Node: ssh -i keypair hadoop@public_dns_name
‣ Double Check (PATH, etc)
‣ Debug, Debug, Debug
‣ Use EC2 for Hadoop (e.g. Cloudera’s Hadoop AMI) (No extra cost for Hadoop!)
65. 10 Cent Tips
‣ AWS S3
‣ Setting HTTP header for images and static resources.
‣ Cache-Control: max-age=31536000
‣ Block Search Bots
‣ robots.txt at the root of a Bucket
‣ User-agent: *
‣ Disallow: /
‣ Using BitTorrent for large files
‣ http://s3.xyz.com/xfile.zip?torrent
‣ Compress Rendered HTML with gzip
‣ Content-Encoding: gzip
$ s3cmd put index.html s3://s3.xyz.com/www \
    --mime-type "text/html" \
    --add-header "Content-Encoding: gzip" \
    --acl-public
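The Content-Encoding: gzip header only tells the browser how to decode the object, so the file has to be compressed before it is uploaded. A minimal sketch of that step using Python's standard gzip module follows; the function names and the compression level are illustrative choices, not part of the deck.

```python
import gzip


def gzip_html(html_text, level=9):
    """Return the gzip-compressed bytes of a rendered HTML page."""
    return gzip.compress(html_text.encode("utf-8"), compresslevel=level)


def write_gzipped(html_text, path):
    """Write the compressed page to `path`, ready to be uploaded as-is."""
    with open(path, "wb") as out:
        out.write(gzip_html(html_text))
```

The renderer would call `write_gzipped(page, "index.html")` and then upload that file with the s3cmd options shown on the slide.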