This document presents a case study comparing a traditional single-node approach and a cloud-based approach for analyzing a large dataset of over 150 million domain names to determine which are hosted by SoftLayer. The single-node approach ran on a single server and took approximately 300 hours to complete at a cost of $102.67. A cloud-based approach using multiple servers in parallel could complete the task much faster and potentially at a lower overall cost by leveraging elastic computing resources in the cloud.
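The trade-off the case study describes can be sketched with back-of-envelope arithmetic. The 300-hour and $102.67 single-node figures come from the study; the node count, per-node-hour price, and scaling efficiency below are illustrative assumptions, not numbers from the document.

```python
# Back-of-envelope comparison of a single-node run vs. a parallel cloud run.
# Assumes the workload is divisible across nodes with some scaling efficiency.

def parallel_estimate(total_hours, nodes, price_per_node_hour, efficiency=1.0):
    """Estimate wall-clock hours and total cost when splitting a
    divisible workload across `nodes` machines."""
    wall_hours = total_hours / (nodes * efficiency)
    cost = wall_hours * nodes * price_per_node_hour
    return wall_hours, cost

# Hypothetical: 30 cloud nodes at $0.10/hour with 90% scaling efficiency.
hours, cost = parallel_estimate(300, nodes=30, price_per_node_hour=0.10,
                                efficiency=0.9)
# Wall-clock time drops from 300 hours to roughly 11 hours.
```

The point is that total machine-hours (and hence cost) stay roughly constant while wall-clock time shrinks, which is why elastic cloud resources can be both faster and cost-competitive.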
International Refereed Journal of Engineering and Science (IRJES) is a peer-reviewed online journal for professionals and researchers in the field of computer science. Its main aim is to resolve emerging and outstanding problems revealed by recent social and technological change. IRJES provides a platform for researchers to present and evaluate their work from both theoretical and technical aspects and to share their views.
Is your organization drowning in siloed cybersecurity data? Are you eager to put Big Data to work on your cybersecurity haystack? Are you planning an Apache Metron deployment? Early in 2018, T-Mobile began their journey to cybersecurity at scale. Come learn how one of the largest wireless carriers in the US successfully operationalized Apache Metron, a horizontally scalable cybersecurity analytics platform that ingests, enriches and triages events in real time. Hear why T-Mobile chose Metron and how they planned and executed their deployment. Learn how the team leveraged built-in Metron components and tapped into existing event pipelines to get ingestion up and running quickly. Dive into the details on tuning ingest on a real event feed. Finally get tips and best practices for staying on top of security event monitoring in today’s challenging threat landscape. We discuss migrating log sources to Metron, monitoring and troubleshooting ingest, adapting security configurations to find new attacks, as well as capacity planning.
Load Balancing in Cloud Computing Environment: A Comparative Study of Service...Eswar Publications
Load balancing is a computer networking method for distributing workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, in order to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The load balancing service is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server. In this paper, the existing static algorithms used for simple cloud load balancing are identified, and a hybrid algorithm is suggested for future development.
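Two of the classic static strategies such surveys cover can be sketched in a few lines. This is a minimal illustration, not code from the paper; the server names and weights are made up.

```python
from itertools import cycle

def round_robin(servers):
    """Static round-robin: rotate through servers in fixed order,
    ignoring their current load."""
    return cycle(servers)

def weighted_round_robin(servers_with_weights):
    """Static weighted round-robin: repeat each server according to a
    fixed weight (e.g. its capacity), then rotate."""
    expanded = [s for s, w in servers_with_weights for _ in range(w)]
    return cycle(expanded)

rr = round_robin(["a", "b", "c"])
picks = [next(rr) for _ in range(4)]  # -> ["a", "b", "c", "a"]
```

A hybrid algorithm of the kind the paper suggests would layer dynamic load information on top of a static rotation like this.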
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Sebastian Verheughe
Learn how Telenor uses Neo4j to protect data in business critical services running in production. Sebastian will discuss lessons learned both with technology and our experience after running it in production for half a year, backing many of our mission critical services.
Achieving Real-time Ingestion and Analysis of Security Events through Kafka a...Kevin Mao
Strata Hadoop World 2017 San Jose
Today’s enterprise architectures are often composed of a myriad of heterogeneous devices. Bring-your-own-device policies, vendor diversification, and the transition to the cloud all contribute to a sprawling infrastructure, the complexity and scale of which can only be addressed by using modern distributed data processing systems.
Kevin Mao outlines the system that Capital One has built to collect, clean, and analyze the security-related events occurring within its digital infrastructure. Raw data from each component is collected and preprocessed using Apache NiFi flows. This raw data is then written into an Apache Kafka cluster, which serves as the primary communications backbone of the platform. The raw data is parsed, cleaned, and enriched in real time via Apache Metron and Apache Storm and ingested into ElasticSearch, allowing operations teams to detect and monitor events as they occur. The refined data is also transformed into the Apache ORC data format and stored in Amazon S3, allowing data scientists to perform long-term, batch-based analysis.
Kevin discusses the challenges involved with architecting and implementing this system, such as data quality, performance tuning, and the impact of additional financial regulations relating to data governance, and shares the results of these efforts and the value that the data platform brings to Capital One.
Bulletproof Kafka with Fault Tree Analysis (Andrey Falko, Lyft) Kafka Summit ...confluent
We recently learned about “Fault Tree Analysis” and decided to apply the technique to bulletproof our Apache Kafka deployments. In this talk, learn about fault tree analysis and what you should focus on to make your Apache Kafka clusters resilient.
This talk provides a framework for answering the following common questions a Kafka operator or user might have:
What guarantees can I promise my users?
What should my replication factor be?
What should the ISR setting be?
Should I use RAID or not?
Should I use external storage such as EBS or local disks?
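The replication-factor and ISR questions above have a simple arithmetic core. With replication factor R, min.insync.replicas M, and producers using acks=all, an acknowledged write survives the loss of up to R - M brokers. A small sketch (the specific values are common defaults, not recommendations from the talk):

```python
def tolerable_failures(replication_factor, min_insync_replicas):
    """Number of broker failures an acknowledged write can survive,
    assuming producers use acks=all."""
    if min_insync_replicas > replication_factor:
        raise ValueError("min.insync.replicas cannot exceed the "
                         "replication factor")
    return replication_factor - min_insync_replicas

# A typical production setting: 3 replicas, min ISR of 2,
# tolerating the loss of one broker without losing acknowledged data.
survivable = tolerable_failures(3, 2)  # -> 1
```

Fault tree analysis, as the talk describes, is about enumerating which combinations of failures push you past this margin.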
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...confluent
Do you know who is knocking on your network’s door? Have new regulations left you scratching your head over how to handle what is happening in your network? Network flow data helps answer many questions across a multitude of use cases, including network security, performance, capacity planning, routing, operational troubleshooting, and more. Modern streaming data pipelines need tools that can scale to meet the demands of these service providers while continuing to provide responsive answers to difficult questions. In addition to stream processing, data needs to be stored in a redundant, operationally focused database to provide fast, reliable answers to critical questions. Together, Kafka and Druid create such a pipeline.
In this talk Eric Graham and Rachel Pedreschi will discuss these pipelines and cover the following topics:
-Network flow use cases and why this data is important.
-Reference architectures from production systems at a major international Bank.
-Why Kafka and Druid and other OSS tools for Network Flows.
-A demo of one such system.
Cloud computing is an emerging technology. Because it processes huge amounts of data, the scheduling mechanism plays a vital role in cloud computing. The protocol proposed here is designed to minimize switching time, improve resource utilization, and improve server performance and throughput. The method is based on scheduling jobs in the cloud so as to address the drawbacks of existing protocols. Each job is assigned a priority, which improves overall performance while minimizing waiting time and switching time, thereby improving the efficiency and throughput of the server.
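Priority-based job scheduling of the kind described above is commonly implemented with a min-heap. This is an illustrative sketch, not the paper's protocol; the job names and the lowest-number-first priority scheme are assumptions.

```python
import heapq

def schedule(jobs):
    """Run jobs lowest-priority-number-first; ties are broken FIFO by
    using the arrival sequence number as a secondary key."""
    heap = [(priority, seq, name) for seq, (name, priority) in enumerate(jobs)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order

order = schedule([("backup", 3), ("billing", 1), ("report", 2)])
# -> ["billing", "report", "backup"]
```

Keeping the heap keyed on (priority, arrival order) is what bounds the waiting time of equal-priority jobs: none can be starved by a later arrival at the same priority.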
Serverless Streaming Architectures and Algorithms for the EnterpriseArun Kejariwal
In recent years, serverless has gained momentum in the realm of cloud computing. Broadly speaking, it comprises function as a service (FaaS) and backend as a service (BaaS). The distinction between the two is that under FaaS, one writes and maintains the code (e.g., the functions) for serverless compute; in contrast, under BaaS, the platform provides the functionality and manages the operational complexity behind it. Serverless provides a great means to boost development velocity. With greatly reduced infrastructure costs, more agile and focused teams, and faster time to market, enterprises are increasingly adopting serverless approaches to gain a key advantage over their competitors.
Early use cases of serverless include data transformation in batch and ETL scenarios and data processing using MapReduce patterns. As a natural extension, serverless is now being used in streaming contexts such as real-time bidding, fraud detection, and intrusion detection. Serverless is arguably well suited to extracting insights from fast data, that is, high-volume, high-velocity data. Example tasks include filtering and reducing noise in the data and leveraging machine learning and deep learning models to provide continuous insights about business operations.
We walk the audience through the landscape of streaming systems for each stage of an end-to-end data processing pipeline—messaging, compute, and storage. We overview the inception and growth of the serverless paradigm. Further, we deep dive into Apache Pulsar, which provides native serverless support in the form of Pulsar functions, and paint a bird’s-eye view of the application domains where Pulsar functions can be leveraged.
Baking intelligence into a serverless flow is paramount from a business perspective. To this end, we detail different serverless patterns—event processing, machine learning, and analytics—for different use cases and highlight the trade-offs. We present perspectives on how advances in hardware technology and the emergence of new applications will impact the evolution of serverless streaming architectures and algorithms. The topics covered include an introduction to streaming, an introduction to serverless, serverless and streaming requirements, Apache Pulsar, application domains, serverless event processing patterns, serverless machine learning patterns, and serverless analytics patterns.
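At its core, a serverless streaming function such as a Pulsar function is a stateless callback applied to each message on an input topic, with results routed downstream. The plain-Python sketch below mimics that shape without the Pulsar SDK; the event fields, the 0.5 threshold, and the noise-filtering rule are all illustrative assumptions.

```python
def noise_filter(event):
    """A filtering task typical of serverless stream processing:
    drop low-signal events, pass the rest downstream."""
    if event.get("score", 0.0) < 0.5:
        return None  # filtered out: nothing is published downstream
    return {"id": event["id"], "score": event["score"]}

# Simulate applying the function to each message on an input stream.
stream = [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.1}]
kept = [out for e in stream if (out := noise_filter(e)) is not None]
# -> [{"id": 1, "score": 0.9}]
```

In an actual Pulsar deployment, the framework (not the author of the function) handles subscription, invocation per message, and publishing of non-None results, which is what makes the model "serverless".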
This presentation describes how NoSQL solutions such as the Neo4j graph database and a Lucene/Solr index were used in a classic middleware stack at Telenor to solve performance and scalability issues.
COST-MINIMIZING DYNAMIC MIGRATION OF CONTENT DISTRIBUTION SERVICES INTO HYBR...Nexgen Technology
LARGE SCALE IMAGE PROCESSING IN REAL-TIME ENVIRONMENTS WITH KAFKA csandit
Recently, real-time image data has been increasing not only in resolution but also in volume. This large-scale image data originates from a large number of camera channels. GPUs can be used for high-speed image processing, but a single GPU cannot efficiently handle large-scale image processing. In this paper, we present a new method for constructing a distributed environment for real-time processing of large-scale images using the open source Apache Kafka. The method makes it possible to gather related data onto a single node for high-speed processing using GPGPU or Xeon Phi processors.
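The "gather related data onto a single node" idea maps naturally onto Kafka's keyed partitioning: if every frame is keyed by its camera channel, all frames from that channel land on the same partition, and therefore on the same consumer node. A minimal sketch of that routing, with the partition count and key scheme as assumptions rather than details from the paper:

```python
import hashlib

def partition_for(channel_id, num_partitions):
    """Deterministically map a camera channel to a partition, so all
    frames from one channel are processed on the same node."""
    digest = hashlib.md5(channel_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Frames from the same channel always map to the same partition:
p1 = partition_for("camera-42", 12)
p2 = partition_for("camera-42", 12)
assert p1 == p2
```

In practice one would simply pass the channel id as the message key and let Kafka's producer apply its own (murmur2-based) partitioner; the sketch just makes the determinism explicit.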
A real-time speech-to-text conversion system converts spoken words into text. Speech-to-text technology enables us to convert audio to text by applying powerful neural network models. It has a number of applications for users with and without disabilities. Speech-to-text has been used for voice search, to help writers boost their productivity, and to provide alternate access to a computer for individuals with physical impairments. Other applications include speech recognition for foreign language learning, voice-activated products for the blind, and many familiar mainstream technologies. It is a driving force behind the success of new-age voice-controlled speakers like Amazon Echo and Google Home.
zenoh: zero overhead pub/sub store/query computeAngelo Corsaro
zenoh unifies data in motion, data in use, data at rest, and computations.
It carefully blends traditional pub/sub with distributed queries, while retaining a level of time and space efficiency that is well beyond any of the mainstream stacks.
It provides built-in support for geo-distributed storage and distributed computations.
Microservices, Kafka Streams and KafkaEsqueconfluent
Speakers: Patrick Schuh, Bearing Point + Patrik Kleindl, Bearing Point
Abstract:
- Managing topic configurations and dependencies in a microservice deployment
- Managing Kafka Streams configurations
- KafkaEsque: an open source support tool for Apache Kafka® development (https://github.com/patschuh/KafkaEsque)
Dynamic Cloud Partitioning and Load Balancing in Cloud Shyam Hajare
Cloud computing is an emerging and transformational paradigm in the field of information technology. It focuses mostly on providing services on demand; resource allocation and secure data storage are among them. Storing huge amounts of data, and accessing data from such stores, is a new challenge. Distributing and balancing the load over a cloud using cloud partitioning can ease the situation. Implementing load balancing that considers both static and dynamic parameters can improve the performance of the cloud service provider and improve user satisfaction. Implementing the model provides a dynamic way of selecting resources depending on the state of the cloud environment at the time cloud provisions are accessed, based on cloud partitioning. The model can provide an effective load balancing algorithm for the cloud environment, better refresh-time methods, and better load-status evaluation methods.
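The dynamic part of such a scheme, picking a partition based on its current load status, can be sketched simply. The threshold and the load representation below are illustrative assumptions, not the paper's exact model.

```python
def choose_partition(partitions, overload_threshold=0.9):
    """Route a request to the least-loaded cloud partition.

    partitions: dict mapping partition name -> current load in [0, 1].
    Returns None when every partition is overloaded, leaving the caller
    to queue or reject the request.
    """
    candidates = {n: load for n, load in partitions.items()
                  if load < overload_threshold}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

best = choose_partition({"p1": 0.7, "p2": 0.3, "p3": 0.95})  # -> "p2"
```

The "better load status evaluation" the abstract mentions would replace the single load number here with a score combining static capacity and periodically refreshed dynamic measurements.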
The Church is the Bride of Lord Jeshua Christos [ Christ the Anointed One ]. Lord Jeshua occupies three offices in the Body of Christ, namely that of
- High Priest
- Prophet
- King of Kings
2020 Cloud Data Lake Platforms Buyers Guide - White paper | QuboleVasu S
Qubole's buyer guide explains how a cloud data lake platform helps organizations achieve efficiency and agility by adopting an open data lake platform, and why data lakes are moving to the cloud.
https://www.qubole.com/resources/white-papers/2020-cloud-data-lake-platforms-buyers-guide
Fast Synchronization In IVR Using REST API For HTML5 And AJAXIJERA Editor
A web service is essentially a web page meant for a computer to request and process. The IVR system uses a REST API for data access during calls. When a user wants to access data during a call, the server must validate the request, and the data access must be synchronized with the MySQL server during validation. For large-scale data access, an API is essential. This method uses HTML5 and AJAX for fast data synchronization. REST can support any media type, but XML is expected to be the most popular transport for structured information. The existing IVR system has problems with fast data access because it was built on HTML4; the proposed work targets HTML5 with an AJAX implementation.
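The validate-then-respond step of such a REST endpoint can be sketched as a pure function. The field names, the validation rule, and the stubbed record are illustrative assumptions, not details from the paper's system (which would query MySQL at the marked point).

```python
import json

def handle_data_request(params):
    """Return an (HTTP status, JSON body) pair for an IVR data-access
    request, validating before touching the data store."""
    caller = params.get("caller_id")
    if not caller:
        return 400, json.dumps({"error": "caller_id required"})
    # In the real system this would be a synchronized MySQL lookup;
    # stubbed here with a fixed record.
    record = {"caller_id": caller, "balance": "12.50"}
    return 200, json.dumps(record)

status, body = handle_data_request({"caller_id": "555-0100"})
# -> status 200, body a JSON document for the caller
```

Keeping the handler a pure function of its parameters is what makes it easy to serve the same data to both the IVR flow and an AJAX front end.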
Although network computing has been around since the 1960s, Schmidt was the first to popularize the term "cloud" as we recognize it today. Perhaps you are fielding lots of questions from your boss, your clients, or your development team about why you have not yet moved to the cloud.
Scaling Databricks to Run Data and ML Workloads on Millions of VMsMatei Zaharia
Keynote at Scale By The Bay 2020.
Cloud service developers need to handle massive scale workloads from thousands of customers with no downtime or regressions. In this talk, I’ll present our experience building a very large-scale cloud service at Databricks, which provides a data and ML platform service used by many of the largest enterprises in the world. Databricks manages millions of cloud VMs that process exabytes of data per day for interactive, streaming and batch production applications. This means that our control plane has to handle a wide range of workload patterns and cloud issues such as outages. We will describe how we built our control plane for Databricks using Scala services and open source infrastructure such as Kubernetes, Envoy, and Prometheus, and various design patterns and engineering processes that we learned along the way. In addition, I’ll describe how we have adapted data analytics systems themselves to improve reliability and manageability in the cloud, such as creating an ACID storage system that is as reliable as the underlying cloud object store (Delta Lake) and adding autoscaling and auto-shutdown features for Apache Spark.
Windows Server Deployment Proposal (WINDOWS SE.docx) aulasnilda
Running head: WINDOWS SERVER DEPLOYMENT PROPOSAL
Windows server deployment proposal
My Name
University of Maryland University College
WINDOWS SERVER / CMIT 369
December 8, 2019
Windows server deployment proposal
This proposal describes the implementation and configuration of core IT services as a solution for "We Make Windows" Inc. The solution will meet the company's needs for 2-3 years. Six topics are addressed in detail, and both the business and technical reasoning for each choice is provided. The six topics are: the new features of Windows Server 2016 that the company can take advantage of; deployment and server editions; Active Directory domains; DNS and DHCP designs; deployment of application services; and, last but not least, printer and file sharing. That said, this proposal proceeds as follows.
New features of Windows Server 2016 that WMW can take advantage of
Nano server
One of the new features of Windows Server 2016 that WMW Inc. can take advantage of is the Nano Server feature. It should be understood that a "nano server is the server that is responsible for refactoring the core pieces of the windows server, turning them into their minimally functional state" (Ferrill, 2015). To expand on the refactoring aspect, refactoring is the process of analyzing a given code base, in this case the core pieces of Windows Server, with the goal of simplifying it. Having described Nano Server, it is time to address both the technical and business reasoning for this feature.
One of the technical reasoning for this new feature is that a nano server can run on a bare-metal operating system. In basic terms, a bare metal operating system is basically a hard disk which is the usual medium on which many computer operating systems are installed. So, the capacity of the nano server running on a bare metal operating system is advantageous in that the system will require fewer updates. At the same time, this means that fewer rebooting of the system when the updates are done will be necessary. From the business standpoint, fewer updates and reboots will ensure the business operations remain online and functional most of the time with little interruptions. In other words, there will be little down times. Since down times are costly to the business, this means that the element of cost due to down times will be addressed by the nano server.
Another technical reasoning for this feature is that nano servers are so small that they could be ported across physical sites, data centers as well as other servers. In fact, compared to other installation options, this feature posses a 92% smaller installation. This means that the installation can connected easily across physical sites, data centers, and even across other server ...
Performance comparison on java technologies a practical approachcsandit
Performance responsiveness and scalability is a make-or-break quality for software. Nearly
everyone runs into performance problems at one time or another. This paper discusses about
performance issues faced during one of the project implemented in java technologies. The
challenges faced during the life cycle of the project and the mitigation actions performed. It
compares 3 java technologies and shows how improvements are made through statistical
analysis in response time of the application. The paper concludes with result analysis.
PERFORMANCE COMPARISON ON JAVA TECHNOLOGIES - A PRACTICAL APPROACHcscpconf
Performance responsiveness and scalability is a make-or-break quality for software. Nearly everyone runs into performance problems at one time or another. This paper discusses about
performance issues faced during one of the project implemented in java technologies. The challenges faced during the life cycle of the project and the mitigation actions performed. It compares 3 java technologies and shows how improvements are made through statistical analysis in response time of the application. The paper concludes with result analysis.
This was a talk, largely on Kamaelia & its original context given at a Free Streaming Workshop in Florence, Italy in Summer 2004. Many of the core
concepts still hold valid in Kamaelia today
The Importance of using Small Solutions to solve Big Problems
How to move a mountain (of data)
Christopher Gallo
Technology Evangelist
SoftLayer, an IBM Company
Houston, USA
cgallo@us.ibm.com
Abstract— Designing applications that can produce
meaningful results out of large-scale data sets is a challenging
and often problematic undertaking. The difficulties in these
projects are often compounded by designers using the
improper tool, or worse, designing a new tool that is
inadequate for the task. In the current state of cloud
computing, there exists a myriad of services and software to
handle even the most daunting tasks, however discovering
these tools is often a challenge in and of itself. This paper
presents a case study concerning the design of an application
that uses minimal code to solve a large-data problem as an
exercise in choosing the proper tools and creating a quickly
scalable application in a cloud environment. The study will
take every registered Internet Domain Name and determine if
it is hosted by a specific hosting provider (in this case
SoftLayer, an IBM Company). While the case may seem
simple, the technical challenges presented are both interesting
to solve, and general enough to apply to a wide variety of
similar problems. This case study shows the benefits provided
by Infrastructure as a Service (IaaS), queues as a form of task
distribution, configuration management tools for rapid
scalability, and the importance of leveraging threads for
maximum performance.
Keywords— Infrastructure as a Service; Cloud Scaling; Large-Scale Application Design
I. INTRODUCTION
"The Cloud" is defined by The National Institute of
Standards and Technology as a model for enabling
ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (e.g.,
networks, servers, storage, applications, and services) that
can be rapidly provisioned and released with minimal
management effort or service provider interaction. [1]
Creating an application that is not only capable, but
optimized, for operating in "The Cloud" is challenging in
part due to the very distributed and dynamic nature of "The
Cloud", and to the rapidly changing array of tools that need
to be employed. This case study will solve the same problem
with two different methods, one a traditional single node
approach, and the other a cloud based approach. While many
of the techniques required can, and will be used for the single
node approach, only when we apply these techniques to "The
Cloud" will we see their optimal value.
The problem starts off fairly simply. We are tasked with
iterating through every registered domain name, and
assessing whether it is hosted in a SoftLayer[2] datacenter or
not. The scale of the problem becomes clear when we
discover how many domains there could be. The only
limitation on a domain name is that each label be at most 63 ASCII characters, usually only A-Z and the "-" character [3]. This gives us on the order of 26^63 possible combinations per Top Level Domain (TLD), of which there are now over
800 [4]. To make our task somewhat easier, various registrars
allow access to their list of registered domain names, so we
will restrict our search to only domains we know to exist,
and will not attempt to search every possible domain name
combination, as that would take an eternity. The registrars
behind the most popular TLDs, .COM, .NET, and .ORG, all provide access; together these comprise about 80% of the total registered domains, or around 150,000,000 domains total [5].
We will need to be content with that number, as obtaining
access to 100% of domains is cost prohibitive for this case
study.
This paper will present the case study by first elaborating
on some of the background technical challenges presented by
iterating through one hundred and fifty million records and
how we plan to solve them, along with the methodology we
plan to use for the two cases. Then we will discuss the Base
Case, which would be a traditional single node solution to
this problem, and some of the lessons learned. Next we will
study the Cloud Case, and how it compares to the Base Case.
Finally we will close with some thoughts on what could have
been done better along with some other concluding remarks.
II. BACKGROUND
It might seem unusual that a large IaaS provider like SoftLayer does not have ready access to information on which domains are being hosted on its infrastructure. However, while SoftLayer keeps track of how many servers are online and the number of IP addresses being leased out, it does not track anything that runs on a server once access is handed over to a customer. This leaves SoftLayer in the position of having to determine the number of hosted domains the hard way: by checking each and every registered domain.
Since there are around 150,000,000 domains to check, using a monolithic program where each domain is processed fully before proceeding to the next is simply going to take too long; each task must be broken down and parallelized as much as possible. Multi-threaded programming is generally significantly more challenging than single-threaded programming, to such a degree that many programmers avoid it altogether [18]. Yet here multi-threading is going to be a must in order to get meaningful results in a reasonable amount of time. While multi-threaded programming has not gotten easier since the paper by Bridges et al. was published in 2007, there are now many new tools, explored here, that help make the task easier.
Even on a single machine, being able to take advantage
of every core is paramount to maximizing performance of an
application [19], and the easiest way for this application will
be to split every task into its own program that can run
simultaneously and independently of each other. The tasks
will be broken down as listed below.
a. Domain Parser
This is the script responsible for taking the files provided by the various registrars and adding them to the RabbitMQ server. These zone files are downloaded ahead of time, since they can be fairly large, and are located on the system running the Domain Parser. To help minimize queue transactions, domains are packaged into groups of 25. Each package is a simple array of objects, encoded as JSON. The logic for this code is in Fig. 1.
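Fig. 1 itself is not reproduced in this text, but the parser logic just described can be sketched in a few lines of Python. This is a minimal sketch: the batch size of 25 comes from the text, while the queue name and the pika-style `channel` object are assumptions.

```python
import json

BATCH_SIZE = 25  # the paper packages domains into groups of 25


def batch_domains(lines, size=BATCH_SIZE):
    """Group raw zone-file lines into JSON-encoded packets of `size` domains."""
    batch = []
    for line in lines:
        domain = line.strip().lower()
        if not domain:
            continue
        batch.append({"domain": domain})
        if len(batch) == size:
            yield json.dumps(batch)
            batch = []
    if batch:
        yield json.dumps(batch)  # flush the final partial packet


def publish_zone_file(path, channel, queue="domains"):
    """Feed one registrar zone file into the queue, one packet per message.

    `channel` is any AMQP-style channel exposing basic_publish() (for
    example a pika channel); the queue name is illustrative.
    """
    with open(path) as fh:
        for packet in batch_domains(fh):
            channel.basic_publish(exchange="", routing_key=queue, body=packet)
```

Packaging 25 domains per message trades a little latency for far fewer queue transactions, which matters at 150 million domains.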
b. Domain Resolver
This script takes a packet of domains from the queue,
attempts to resolve each one in a thread, and then adds an
updated packet of domains to a final queue, adding in some
new information about the domain. This section is where
multi-threading will really shine. The average time to resolve
a domain successfully for this project was 0.306 seconds.
However, even with optimizations to Unbound, the time to
unsuccessfully resolve a domain was 2.051 seconds, which is
a very long time for a CPU to wait for a result. Thankfully
threads allow us the ability to continue to attempt to resolve
domains while we wait on a response from the upstream
DNS server. The logic for this code is contained in Fig. 2.
DNS lookups are going to be the biggest bottleneck for
this study, especially since it is expected that about 25% of
the lookups will result in a failure [6], which will
significantly slow down the rate at which we can query
domains. To mitigate this, a local DNS resolver service
(Unbound DNS [7]) will be required so that control can be
exercised over how long to wait on slow DNS servers, and to
limit caching to save on resource utilization. Each domain
will be only queried once, so there should be no need for
caching at all in this project.
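The thread-per-lookup pattern described above can be sketched as follows. This is an illustrative sketch only: `socket.gethostbyname` stands in for queries against the local Unbound resolver, the resolver function is injectable, and the worker count is an assumption.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve_packet(domains, resolver=socket.gethostbyname, workers=40):
    """Resolve a packet of domains concurrently.

    Threads pay off because a failed lookup blocks for ~2 s of pure I/O
    wait; while one thread waits on the upstream DNS server, the others
    keep resolving.  Returns the packet annotated with each domain's IP
    (or None on failure).
    """
    def lookup(domain):
        try:
            return {"domain": domain, "ip": resolver(domain)}
        except (socket.gaierror, OSError):
            return {"domain": domain, "ip": None}

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lookup, domains))  # preserves input order
```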
c. Domain Checker
This script takes a packet of domains from the final
queue, and checks against our database of IP addresses to see
if the IP address of the domain is a SoftLayer IP address or
not. Once the check is complete, the domain object is
updated with that information and finally saved to Elastic
Search. The logic is in Fig. 3.
Figure 1. Domain Parser Logic
Figure 2. Domain Resolver Logic
To control the even distribution of domains to processes
between each program, a message queue will need to be
added. For this project an Advanced Message Queuing
Protocol (AMQP) compatible queue was chosen because it is
an open standard supported by a wide variety of client and
service applications [8]. The AMQP protocol is designed to
be usable from different programming environments,
operating systems, and hardware devices, as well as making
high-performance implementations possible on various
network transports including TCP, SCTP (Stream Control
Transmission Protocol), and InfiniBand [9].
Figure 3. Domain Checker Logic
Specifically, RabbitMQ was chosen for this project due to its ease of setup and support for the Python programming language [20]; however, any AMQP-compatible service would likely have worked just as well.
Although the WHOIS [22] database serves as a great resource for looking up which organization owns an IP address, it will not be used here, as SoftLayer has provided a database containing all of their IP address information. To make querying this database as fast as possible, the IP information will be converted from the common dotted-quad format into its decimal representation using the netaddr python library. These decimal numbers will be stored in an indexed MySQL database to facilitate fast queries [23].
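The dotted-quad-to-decimal trick can be illustrated with the standard library. The paper uses netaddr and an indexed MySQL table; this sketch substitutes Python's ipaddress module and an in-memory binary search over (start, end) integer ranges, which mirrors what the indexed table achieves.

```python
import bisect
import ipaddress


def to_decimal(dotted):
    """Convert a dotted-quad IPv4 address to its integer representation."""
    return int(ipaddress.ip_address(dotted))


class SubnetIndex:
    """Answer 'is this IP inside one of our ranges?' with a binary search,
    much like an indexed MySQL table over (start, end) integers would."""

    def __init__(self, ranges):
        # ranges: iterable of (first_ip, last_ip) dotted-quad pairs
        self.ranges = sorted((to_decimal(a), to_decimal(b)) for a, b in ranges)
        self.starts = [r[0] for r in self.ranges]

    def contains(self, dotted):
        ip = to_decimal(dotted)
        # find the last range starting at or before this address
        i = bisect.bisect_right(self.starts, ip) - 1
        return i >= 0 and self.ranges[i][0] <= ip <= self.ranges[i][1]
```

Integer comparisons make each membership check O(log n) over the range list, regardless of how many addresses the ranges contain.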
Storing the data is the most important technical challenge to solve, since up until this point all the work we have done has been in memory, and would be lost if the services were shut down. NoSQL is defined as a collection of next generation databases mostly addressing some of these points: being non-relational, distributed, open-source and horizontally scalable [10], which are precisely the problems that we will likely encounter. There is a wide variety of NoSQL implementations, and for this project a Document Store style offering is the best fit for how the data will be used after it is stored. In light of the huge variety of NoSQL applications that could possibly work for this project, ElasticSearch was chosen for three main reasons.
• Storing data is fast, and as simple as forming an HTTP PUT request [21].
• Searching through the data is the main purpose of
ElasticSearch, which will be useful for doing post mortem
data analysis.
• Most importantly, Kibana [11] is a fantastic tool to
visualize data stored in ElasticSearch, and was used to
create many of the graphs in this case study.
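To make the first point concrete, here is a sketch of the HTTP PUT in question using only the standard library; the host, index, and document-type names are illustrative, not from the paper.

```python
import json
import urllib.request


def index_request(doc, doc_id, host="http://localhost:9200",
                  index="domains", doc_type="domain"):
    """Build the HTTP PUT request that stores one document in ElasticSearch.

    The body is simply the document serialized as JSON; the URL path is
    /index/type/id (names here are illustrative).
    """
    url = "{}/{}/{}/{}".format(host, index, doc_type, doc_id)
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def store(doc, doc_id):
    """Send the request (requires a running ElasticSearch node)."""
    with urllib.request.urlopen(index_request(doc, doc_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```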
Finally, all of this will be run on the Debian “jessie/sid” operating system, with most of the custom code written in Python 2.7. The operating system and programming language are just personal preferences, however; similar results should be expected with different choices made here.
III. METHODOLOGY
The end goal of this project is to determine with some accuracy the exact number of domains that resolve to a SoftLayer-owned IP address. Yet there are three important milestones that will be observed in trying to reach this goal.
1. The proof of concept. During this phase, the core components of the project are put together, tested, and checked for consistency. This step is critically important for any software project.
2. The Base Case. The first full run through the data set, which will serve as a benchmark for what we could expect performance to look like given a single server approach.
3. The Cloud Case. Here we will attempt to leverage
as many resources as possible to answer our
question in the shortest time possible, and will be
compared against the Base Case.
While finding the answer to our question may be interesting to some, especially SoftLayer, we have set up this study to help answer some questions that might be more relevant to the community, specifically those who lack extensive experience working with cloud technologies and distributed workloads. We hope to address the following general concerns with this case study.
Concern 1
What are the difficulties in solving a large-data problem with
a monolithic approach?
Concern 2
How much time and effort can be saved with a cloud based
approach compared to a monolithic approach?
These concerns are important because they mirror many of the concerns newcomers to the cloud computing space encounter, and addressing them will hopefully alleviate some of the hesitancy to adopt cloud computing.
IV. PROOF OF CONCEPT
Creating a proof of concept version is critical to the success of any application. It is during this phase that we try to answer the most basic question: "can this plan actually work?". Even with most of the technology stack already chosen before attempting the proof of concept, it is important to prove that all the technology works well together before work is wasted on a solution that is impossible. This stage brought to light a collection of issues that had previously not been apparent on the surface.
As mentioned earlier, multi-threaded programming is
inherently difficult, and working out these difficulties is
much easier in the proof of concept phase than in a full
production run. This phase also uncovered an interesting problem: the domain files were being parsed entirely too quickly, which crashed the RabbitMQ server almost instantly by exhausting the available RAM. Thankfully this issue was discovered early, and with some fine tuning of the RabbitMQ settings and some rate limiting on the parsing program, everything ended up running very smoothly afterwards.
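The rate-limiting fix can be sketched as a high-water-mark check against the broker's queue depth. The thresholds, the `message_count` helper, and the queue name are all assumptions; with pika, the depth would come from a passive `queue_declare`.

```python
import time

HIGH_WATER = 500_000   # pause parsing above this many queued messages (assumed)
LOW_WATER = 100_000    # resume parsing below this depth (assumed)


def should_pause(depth, paused):
    """Hysteresis: start pausing above HIGH_WATER, resume under LOW_WATER."""
    if paused:
        return depth > LOW_WATER
    return depth > HIGH_WATER


def throttled_publish(packets, channel, queue="domains", poll=1.0):
    """Publish packets, sleeping whenever the broker's queue grows too deep.

    `channel.message_count(queue)` stands in for pika's
    channel.queue_declare(queue, passive=True).method.message_count.
    """
    paused = False
    for packet in packets:
        while True:
            paused = should_pause(channel.message_count(queue), paused)
            if not paused:
                break
            time.sleep(poll)  # let the resolvers drain the queue
        channel.basic_publish(exchange="", routing_key=queue, body=packet)
```

The two-threshold hysteresis avoids the parser flapping on and off around a single limit.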
Aside from those major issues uncovered, this proof of
concept phase helped illuminate which areas of the program
were likely to break, and where best to put in logging
messages to ensure any errors were being properly reported
and handled. The data structure used to pass domain
information between processes was finalized here, along
with the end document that will eventually be stored in
ElasticSearch.
V. BASE CASE
With the proof of concept finished, it is time to move on to actually running everything together at full speed. This involves ordering a new server, installing the required libraries and packages, configuring everything, and then setting all the programs running.
A. New Problems
Going from a proof of concept to a full run is generally
bound to uncover new problems, and this transition is no
exception. The first unexpected hurdle turned out to be the difficulty of turning a Python program into a background service, which was surprisingly complicated, at least for
someone not intimately familiar with how Debian manages
startup scripts. Secondly, while DNS lookups were expected
to be fairly CPU expensive, they turned out to largely be the
limiting factor in how many processes could be launched at
once. Since none of the DNS lookups being performed
would be in a cache already, the resolver needed to query the
root name servers, then the zone name servers, then finally
the authoritative name servers for each domain.
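The paper does not show its startup scripts; for reference, on a systemd-based Debian the same daemonization comes down to a small unit file. Every name and path below is illustrative.

```ini
# /etc/systemd/system/domain-resolver.service (illustrative)
[Unit]
Description=Domain Resolver worker
After=network.target unbound.service

[Service]
ExecStart=/usr/bin/python /opt/domains/resolver.py
Restart=on-failure
User=domains

[Install]
WantedBy=multi-user.target
```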
Passing messages between processes with RabbitMQ was
incredibly easy, but slightly error prone. The biggest issue
was that occasionally the connector would hit a timeout and
that would cause the resolver program to exit. Once some
logic was added to the programs interacting with RabbitMQ to handle that exception and keep going, everything ran smoothly.
B. The Hardware
The power of a bare metal server has been well documented [12], so for this single server case a single bare metal server will be used to get the most optimal performance. This server is something that would be easily found in any datacenter, or at least something very similar.

Figure 4. Base Case Domains Per Hour
The server will be an Intel Xeon E3-1270 (4 cores @ 3.40 GHz) with 2 hard drives and 8 GB of RAM, costing $0.368/hour [13]. This server was chosen because of its fast clock speed, cheap hourly rate, and enough RAM to hold all our data.
C. Results
Below is a breakdown of the average CPU percentage each part of our solution consumed. These numbers are approximate averages to give a good sense of where most of the time was spent. As noted earlier, Unbound (our DNS resolver) takes up nearly 50% of the CPU time. RabbitMQ
and ElasticSearch are both fairly low on this chart, which
was a little unexpected, however it goes a long way to show
how powerful and well made these tools are. So it should be
no surprise that the code written specifically for this study
performed worse than tools written by industry experts.
TABLE I. CPU USAGE BREAKDOWN

Process               CPU %
Unbound               45%
Domain Resolver x 40  25%
Domain Parser         1%
Domain Checker        1%
RabbitMQ              15%
ElasticSearch         10%
Operating System      3%

Figure 5. RabbitMQ Network Utilization
Overall, the whole system took about 300 hours to run for a grand total of $102.672, averaging between 100 and 200 domains a second. A bargain, considering the cost of just the Intel Xeon E3-1270 v3 processor is $373.11 [24].
Increasing the number of cores will easily help reduce the runtime; however, there are only so many cores you can fit inside a single machine. The biggest hourly server SoftLayer provides is the Intel Xeon E5-2690 v3 (12 cores, 2.60 GHz) at $2.226/hour [14]. Since this server has three times as many cores as our original, it can be generously assumed this process would have taken a third of the time (100 hours). However, 100 hours @ $2.226/hour is significantly more expensive at $222.60.
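That trade-off is just arithmetic on the figures quoted above:

```python
# Figures quoted in the text (base case on the E3-1270, estimate on the E5-2690 v3)
base_hours = 300          # approximate base case runtime
big_rate = 2.226          # $/hour for the 12-core E5-2690 v3

est_hours = base_hours / 3                 # 3x the cores, generously 1/3 the time
est_cost = round(est_hours * big_rate, 2)  # dollars for the scaled-up run
print(est_hours, est_cost)                 # 100.0 222.6
```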
Overall, once all of the programs were set running, the base case performed admirably without supervision. There are still some performance improvements that could have been made to the code and configuration of services, but that would take a significant amount of intimate knowledge about each service and some of the inner workings of the python libraries involved. To get our runtime and overall cost down, it is easier to simply spread everything out into a cloud deployment.
VI. CLOUD CASE
One of the many benefits of Cloud Computing is a smoother scalability path. Cloud Computing empowers any application with architectures that were designed to easily scale with added hardware and infrastructure resources [15].
This path to smoother scalability is exactly what this case
will study. The simplest way to start scaling is to split off
each service into its own bare metal or virtual server. The
RabbitMQ service will get a virtual server with plenty of
RAM, and the ElasticSearch service, MySQL, and the
domain parser will get a bare metal server with plenty of disk
space and ample disk speed. Unbound and the domain
resolver will be paired together on a series of virtual servers
to maximize cores while minimizing costs. The virtual server
will need at least two cores, one to run Unbound, and the
other to go through all of the domain resolver threads. The
domain checker service will also get a series of virtual
servers as it is also only dependent on CPU time, with very
little disk or ram usage.
A. New Problems
The first major problem in adopting a cloud computing deployment is the network. In the Base Case, data was transferred between services via the loopback interface, which is incredibly fast since the data never has to actually go over the wire. In the Cloud Case, however, it quickly became apparent that the default 100Mbps data transfer rate was entirely too slow for our application. Thankfully it is a
simple matter to upgrade to a 1Gbps connection in a cloud
environment, which was plenty of bandwidth, with our
application maxing out at around 250Mbps. Due to the
amounts of data being transferred over the network,
bandwidth costs also become a big concern. Luckily, SoftLayer does not meter traffic over their private network, even across regions [25]. Provided all network traffic is kept
to the private network there will be no additional costs for
splitting out the infrastructure.
Figure: Network traffic handled by RabbitMQ
Configuration management starts to become a real
problem in cloud environments due to the ever increasing
number of nodes requiring configuration. Setting up a single
server is a fairly trivial task for any seasoned administrator,
but managing dozens of nodes that all need to be provisioned
simultaneously becomes a bit of a nightmare. Thankfully
there are a myriad of configuration management tools [16]
that help manage cloud deployments, and for this project Salt
Stack [17] was selected for its ability to easily provision
servers on the SoftLayer platform. Once SaltStack has been fleshed out with the details of the application and its deployment structure, creating the thirty-six servers required for the Cloud Case comes down to one simple command, and it takes about fifteen minutes for all nodes to be provisioned, configured, and running the programs they were told to run.
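The paper does not reproduce its SaltStack configuration, but the "one simple command" workflow it describes matches salt-cloud's map files, roughly as below. Profile and node names are illustrative; the counts follow the hardware list in the next subsection.

```yaml
# /etc/salt/cloud.map -- illustrative; profiles would define SoftLayer sizes
softlayer_2core_vs:
  - resolver-node-1      # ... through resolver-node-25
  - checker-node-1       # ... through checker-node-10
softlayer_rabbit_vs:
  - rabbit-node-1
```

Provisioning the whole map in parallel is then `salt-cloud -m /etc/salt/cloud.map -P`.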
B. The Hardware
a. Domain Master - Hourly Bare Metal - 4 Cores @ 3.50GHz, 32 GB RAM @ $0.595/hour
This server will be responsible for being the master for the SaltStack configuration management along with running the ElasticSearch, Kibana, MySQL, and Domain Parser services. This is the only bare metal server, since this
is the only node where data is actually written to or read
from a disk.
b. Rabbit Node - Virtual Server - 4 cores, 48 GB RAM, 1Gbps Network @ $0.606/hour
Responsible for the RabbitMQ service. 48 GB of RAM is a significant increase from the base case, due to the rate at which domains enter the queue. In the base case we limited the rate of the Domain Parser to keep pace with the Domain Resolver; in this case that rate limit has been removed, since the Domain Resolver will be scaled up significantly and the hardware provisioned here can hold the entirety of the data being worked with. This now makes the network a limiting factor where it was not previously, hence the 1Gbps network connection.
c. 25 Resolver Nodes - Virtual Server - 2 cores, 1 GB RAM @ $0.060/hour
Responsible for Unbound and the Domain Resolver
script. Each node can run about 40 Domain Resolver scripts
before maxing out the CPU. Due to the very dependent
nature of Unbound and Domain Resolver, keeping them
together worked out really well.
d. 10 Checking Nodes - Virtual Server - 2 cores, 1 GB RAM @ $0.060/hour
Responsible for running the Domain Checker script. Each node can run about 80 Domain Checker scripts before maxing out the CPU. The amount of work required for the Domain Checker is significantly less than for the Domain Resolver, which is why the same number of domains could be processed with ten nodes instead of the twenty-five for the Domain Resolver.
Separating out the services in this manner has the very
significant advantage of being able to use more CPUs and
RAM than can fit into a single server. Each server, aside
from the Domain Master, was managed entirely by
SaltStack, from the ordering step all the way to the final
provisioning and running the needed services, without ever having to log in to the server itself.
Overall, the server count here was a bit on the conservative side; however, this setup still completely exceeded expectations, even without hitting any cloud bottlenecks. With 78 cores working, the Cloud Case
managed to progress through between 6000 and 7000
domains a second, which is a huge increase from the Base
Case.
C. Results
From the point domains started being added to RabbitMQ, this project took a little under 6 hours to fully complete the assigned task, and could have been even shorter had more Resolver Nodes been added. Given the extreme length of time the Base Case took, this run was left to go overnight; by the time it was noticed how fast it was going, the project had already completed, which is why no further Resolver Nodes were added.
Despite the significantly higher CPU and RAM count
used in the Cloud Case, the end cost was only $26.998,
roughly a quarter of the Base Case cost. This cost should
hopefully help make it clear how powerful Cloud
Architectures can be in both time and money savings.
Since everything is also specified in SaltStack,
redeploying this environment again is a trivial process,
which is another huge benefit of using a cloud computing
model for solving problems.
VII. CONCLUSION
In the face of increasingly vast and complicated workloads, traditional programming techniques are quickly becoming inadequate and time consuming. Distributing tasks
across a wide array of discrete nodes is going to be a critical
aspect of any large-data project, and being able to master the
plethora of services that assist programmers in this space is a
must for any developer going into the Cloud Era.
Figure 6. Cloud Case Domains Per Hour
The Message Queue as a tool for task distribution, along with NoSQL data stores, are going to play some of the biggest roles in these architectures. Hopefully this paper helped shed some light on how all these services can work together to build a successful application, even without a significant amount of prior knowledge of the products involved.
Finally, we can address fully our concerns from earlier.
Concern 1
The difficulties with solving large-data problems with a
monolithic approach tend to be the limitations imposed by
physical restrictions. Even though both the Cloud Case and
the Base Case used a similar software architecture, the Base
Case simply couldn't get a server big enough to go through
the data in even a reasonable fraction of the time compared
to the Cloud Case. Even though a myriad of unfamiliar
technology was employed here, generally the only
information required was how to get the service installed,
and how to get data into or out of the service in question.
While the inner workings remain a mystery, the services
themselves perform well with intelligently designed defaults.
Concern 2
With the Cloud Case clocking in at around 6 hours and $27, it greatly surpassed the Base Case in both time and cost, as the Base Case took around 300 hours and $103. Although it is counterintuitive, using more computing power can actually be cheaper if it reduces the required computation time for a program. Getting the Cloud Case set up in SaltStack was certainly challenging and time consuming; however, now that the work has been done, redeploying the Cloud Case takes no time at all, whereas redeploying the Base Case would still take a few hours of configuration by hand to get everything working.
In conclusion, it should hopefully be clear that expertise in cloud computing is not required to take advantage of the power it offers. Nor should distributed or parallelized programming techniques be avoided because they are difficult to understand; the performance improvements they allow for are too great to ignore. Work is being done constantly to make these techniques easier to understand, and there are already a great many tools and concepts, such as queues for message transfers between programs, that allow even an inexperienced developer to make great choices in how to solve difficult problems.
VIII. ACKNOWLEDGMENTS
This work was sponsored by SoftLayer, which is why
they were the IaaS vendor of choice in this paper. While the
pricing and servers are specific to SoftLayer, we expect the
findings in this paper to be replicable in any other IaaS
vendor. The developers at SaltStack were also incredibly
helpful in sorting out issues relating to some of the more
complicated configurations in the deployment.
REFERENCES
[1] http://faculty.winthrop.edu/domanm/csci411/Handouts/NIST.pdf
[2] https://softlayer.com
[3] https://tools.ietf.org/html/rfc1035, section 3.1
[4] https://ntldstats.com/
[5] http://www.registrarstats.com/TLDDomainCounts.aspx
[6] J. Jung, E. Sit, H. Balakrishnan, and R. Morris, "DNS Performance and the Effectiveness of Caching," IEEE/ACM Transactions on Networking, vol. 10, no. 5, October 2002.
[7] https://www.unbound.net/
[8] https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol
[9] J. O'Hara, "Toward a commodity enterprise middleware," ACM Queue, vol. 5, no. 4, pp. 48-55, 2007.
[10] http://nosql-database.org/
[11] https://www.elastic.co/products/kibana
[12] J. Ekanayake and G. Fox, "High performance parallel computing with clouds and cloud technologies," Cloud Computing, Springer Berlin Heidelberg, 2010, pp. 294-308.
[13] https://www.softlayer.com/Store/orderHourlyBareMetalInstance/37276/64
[14] https://www.softlayer.com/Store/orderHourlyBareMetalInstance/165559/103
[15] M. Creeger, "Cloud Computing: An Overview," ACM Queue, vol. 7, no. 5, 2009.
[16] https://en.wikipedia.org/wiki/Configuration_management
[17] http://saltstack.com/
[18] M. Bridges et al., "Revisiting the sequential programming model for multi-core," Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, IEEE Computer Society, 2007.
[19] J. Dean and S. Ghemawat, "Distributed programming with MapReduce," Beautiful Code, Sebastopol: O'Reilly Media, Inc., 2007, p. 384.
[20] https://pika.readthedocs.org/en/0.10.0/
[21] https://www.elastic.co/guide/en/elasticsearch/guide/current/create-doc.html
[22] https://whois.icann.org/en/about-whois
[23] B. Schwartz, P. Zaitsev, and V. Tkachenko, High Performance MySQL: Optimization, Backups, and Replication, O'Reilly Media, Inc., 2012, pp. 115-130.
[24] http://amzn.com/B00D697QRM
[25] http://blog.softlayer.com/tag/private-network