The document describes the Cisco 7200 Series Router. It provides high performance routing and processing for applications such as VPN gateways, broadband subscriber aggregation, and enterprise WAN aggregation. It offers modular interfaces that support a wide range of connectivity options from Ethernet and Fast Ethernet to synchronous serial and packet over SONET. The Cisco 7200 provides a cost-effective platform that integrates functions previously requiring separate devices.
Improving HDFS Availability with IPC Quality of Service (DataWorks Summit)
This document discusses how Hadoop RPC quality of service (QoS) helps improve HDFS availability by preventing name node congestion. It describes how certain user requests can monopolize name node resources, causing slowdowns or outages for other users. The solution presented is to implement fair scheduling of RPC requests using a weighted round-robin approach across user queues. This provides performance isolation and prevents abusive users from degrading service for others. Configuration and implementation details are also covered.
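The fair-scheduling idea described above can be sketched in a few lines of Python: requests are partitioned into per-user queues, and a weighted round-robin dispatcher drains them so that no single user can monopolize the server. This is a conceptual sketch only, not Hadoop's actual FairCallQueue implementation; the user names and weights are illustrative.

```python
from collections import deque

class WeightedRoundRobinScheduler:
    """Drain per-user request queues in weighted round-robin order.

    Conceptual sketch of the fair-scheduling idea behind the HDFS
    RPC QoS work; not the real FairCallQueue code.
    """

    def __init__(self, weights):
        # weights: user -> max requests served for that user per round
        self.weights = weights
        self.queues = {user: deque() for user in weights}

    def submit(self, user, request):
        self.queues[user].append(request)

    def next_round(self):
        """Serve up to `weight` requests from each user's queue."""
        served = []
        for user, weight in self.weights.items():
            q = self.queues[user]
            for _ in range(min(weight, len(q))):
                served.append(q.popleft())
        return served

# A heavy user cannot starve a light one: each round is capped by weight.
sched = WeightedRoundRobinScheduler({"heavy": 2, "light": 1})
for i in range(100):
    sched.submit("heavy", f"heavy-req-{i}")
sched.submit("light", "light-req-0")
print(sched.next_round())  # ['heavy-req-0', 'heavy-req-1', 'light-req-0']
```

Even with 100 queued requests from the heavy user, the light user's single request is served in the first round, which is the performance-isolation property the talk describes.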
This document discusses security features in Apache Kafka including SSL for encryption, SASL/Kerberos for authentication, authorization controls using an authorizer, and securing Zookeeper. It provides details on how these security components work, such as how SSL establishes an encrypted channel and SASL performs authentication. The authorizer implementation stores ACLs in Zookeeper and caches them for performance. Securing Zookeeper involves setting ACLs on Zookeeper nodes and migrating security configurations. Future plans include moving more functionality to the broker side and adding new authorization features.
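For orientation, the kinds of broker settings the deck covers look roughly like the following fragment of `server.properties`. Hostnames, file paths, and passwords are placeholders; the `SimpleAclAuthorizer` class shown is the authorizer implementation from the Kafka releases of that era.

```properties
# Expose a SASL-over-SSL listener and use it between brokers
listeners=SASL_SSL://broker1.example.com:9093
security.inter.broker.protocol=SASL_SSL

# SSL: keystore holds the broker's key, truststore the trusted CAs
ssl.keystore.location=/etc/kafka/broker.keystore.jks
ssl.keystore.password=changeit
ssl.truststore.location=/etc/kafka/broker.truststore.jks
ssl.truststore.password=changeit

# SASL/Kerberos: the service principal name clients authenticate against
sasl.kerberos.service.name=kafka

# Authorization: ACLs stored in Zookeeper, cached by the broker
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer

# Secure Zookeeper: have the broker set ACLs on the znodes it creates
zookeeper.set.acl=true
```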
Rich placement constraints: Who said YARN cannot schedule services? (DataWorks Summit)
The rise in popularity of machine learning, streaming, and latency-sensitive online applications in shared production clusters has raised new challenges for cluster schedulers. To optimize their performance and resilience, these applications require precise control of their placements by means of complex constraints. Examples of such scenarios are the following:
• Deep learning applications need to run on GPU machines with specific GPU models and driver/kernel versions.
• Hive or Spark applications benefit from being collocated on the same rack to reduce network cost and thus speed up their execution. At the same time, it is desirable to limit the number of allocations per machine to minimize resource interference.
• Low-latency services such as HBase need to be allocated across failure domains to improve their availability.
• A DNS service might need to run on machines with public IP addresses.
In this talk we present the brand new addition of expressive placement constraints in YARN. We show how applications can leverage such constraints to achieve complex placements, such as collocating their allocations on the same node/rack (affinity), spreading their allocations across nodes/racks (anti-affinity), or allowing up to a specific number of allocations per node group (cardinality) to strike a balance between the two. We describe real use cases from production clusters and show the benefits of placement constraints on large clusters using popular applications in both on-prem and cloud settings.
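A cardinality constraint, as described above, caps how many allocations with a given tag may land on one node; anti-affinity and affinity fall out as special cases. The toy sketch below illustrates the semantics only; it is not the YARN placement-constraint API.

```python
def can_place(node_tags, tag, max_per_node):
    """Cardinality check: allow a new allocation with `tag` on a node
    only if the node currently hosts fewer than `max_per_node` of it.
    max_per_node = 1 with pre-existing tags behaves like anti-affinity
    (spread across nodes); a high cap lets allocations pack together."""
    return node_tags.count(tag) < max_per_node

def place(nodes, tag, max_per_node):
    """Greedy placement respecting the per-node cardinality cap.

    nodes: dict of node name -> list of allocation tags on that node.
    Returns the chosen node, or None if no node satisfies the constraint.
    """
    for name, tags in nodes.items():
        if can_place(tags, tag, max_per_node):
            tags.append(tag)
            return name
    return None

nodes = {"node1": ["hbase"], "node2": []}
print(place(nodes, "hbase", 1))  # node2: node1 already holds one 'hbase'
print(place(nodes, "hbase", 1))  # None: every node is at the cap
```

The real scheduler must, of course, also weigh resource availability, racks, and node attributes; this shows only how a cardinality predicate filters candidate nodes.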
Speakers
Konstantinos Karanasos, Senior Scientist, Microsoft
Wangda Tan, Staff Software Engineer, Hortonworks
Apache Hadoop 3 updates with migration story (Sunil Govindan)
The document discusses migrating Hadoop clusters from version 2 to version 3. It provides an overview of new features in HDFS, YARN, and other components in Hadoop 3, including erasure coding in HDFS, global scheduling and new resource types in YARN. It also covers important considerations for upgrading such as recommended source and target versions, upgrade mechanisms, tooling changes, and ensuring Java 8 is used.
A Secure Public Cache for YARN Application Resources (DataWorks Summit)
This document discusses YARN's shared cache feature for application resources. It provides an overview of how YARN localizes resources for each application and containers. The shared cache aims to address inefficiencies in this process by caching identical resources on NodeManagers and sharing them between applications and containers. The design goals are for the shared cache to be scalable, secure, fault-tolerant and transparent. It works by having a shared cache client interface with a shared cache manager that maintains metadata and persisted resources. This can significantly reduce data transfer and localization costs for applications that reuse common resources.
An overview of securing Hadoop. Content primarily by Balaji Ganesan, one of the leaders of the Apache Argus project. Presented on Sept 4, 2014 at the Toronto Hadoop User Group by Adam Muise.
Bringing Real-Time to the Enterprise with Hortonworks DataFlow (DataWorks Summit)
This document discusses TELUS's journey to enable real-time streaming analytics of data from IPTV set top boxes (STBs) to improve the customer experience. It describes moving from batch processing STB log data every 12 hours to streaming the data in real-time using Apache Kafka, NiFi, and Spark. Key lessons learned include using Java 8 for SSL, Spark 2.0 for Kafka integration, and addressing security challenges in their multi-tenant Hadoop environment.
This document discusses streaming data ingestion and processing options. It provides an overview of common streaming architectures including Kafka as an ingestion hub and various streaming engines. Spark Streaming is highlighted as a popular and full-featured option for processing streaming data due to its support for SQL, machine learning, and ease of transition from batch workflows. The document also briefly profiles StreamSets Data Collector as a higher-level tool for building streaming data pipelines.
Hadoop Cluster/Client Security Using Kerberos (Sarvesh Meena)
This document discusses securing Hadoop clusters using Kerberos. It provides background on Hadoop, describing it as a framework that allows distributed processing of large datasets across computer clusters. It notes that Hadoop clusters are designed specifically for storing and analyzing huge amounts of unstructured data. The document then discusses why Hadoop cluster security is important, as the operating system trusts clients and servers trust any system on the network. It introduces Kerberos as a network protocol that uses secret-key cryptography to authenticate applications, providing encrypted tickets to access services.
LinkedIn leverages the Apache Hadoop ecosystem for its big data analytics. Steady growth of the member base at LinkedIn along with their social activities results in exponential growth of the analytics infrastructure. Innovations in analytics tooling lead to heavier workloads on the clusters, which generate more data, which in turn encourage innovations in tooling and more workloads. Thus, the infrastructure remains under constant growth pressure. Heterogeneous environments embodied via a variety of hardware and diverse workloads make the task even more challenging.
This talk will tell the story of how we doubled our Hadoop infrastructure twice in the past two years.
• We will outline our main use cases and historical rates of cluster growth in multiple dimensions.
• We will focus on optimizations, configuration improvements, performance monitoring and architectural decisions we undertook to allow the infrastructure to keep pace with business needs.
• The topics include improvements in HDFS NameNode performance and fine-tuning of block report processing, the block balancer, and the namespace checkpointer.
• We will reveal a study on the optimal storage device for HDFS persistent journals (SATA vs. SAS vs. SSD vs. RAID).
• We will also describe the Satellite Cluster project, which allowed us to double the objects stored on one logical cluster by splitting an HDFS cluster into two partitions without the use of federation and with practically no code changes.
• Finally, we will take a peek at our future goals, requirements, and growth perspectives.
SPEAKERS
Konstantin Shvachko, Sr Staff Software Engineer, LinkedIn
Erik Krogen, Senior Software Engineer, LinkedIn
Big data processing meets non-volatile memory: opportunities and challenges (DataWorks Summit)
Advanced big data processing frameworks have been proposed to harness the fast data transmission capability of remote direct memory access (RDMA) over InfiniBand and RoCE. However, with the introduction of non-volatile memory (NVM), these designs, along with the default execution models like MapReduce and Directed Acyclic Graph (DAG), need to be re-assessed to discover the possibilities of further enhanced performance.
In this context, we propose an accelerated execution framework (NVMD) for MapReduce and DAG that leverages the benefits of NVM and RDMA. NVMD introduces novel features for MapReduce and DAG, such as a hybrid push-and-pull shuffle mechanism and dynamic adaptation to network congestion. The design has been incorporated into Apache Hadoop and Tez. Performance results illustrate that NVMD can achieve up to 3.65x and 3.18x improvement for Hadoop and Tez, respectively. In this talk, we will also present an NVM-aware HDFS design and its benefits for MapReduce, Spark, and HBase.
Speaker: Shashank Gugnani, PhD Student, Ohio State University
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St... (Ceph Community)
This document discusses best practices for implementing Ceph-powered storage as a service. It covers planning a Ceph implementation based on business and technical requirements. Various use cases for Ceph are described, including OpenStack, cloud storage, web-scale applications, high performance block storage, archive/cold storage, databases and Hadoop. Architectural considerations for redundancy, servers, networking are also discussed. The document concludes with a case study of a university implementing a Ceph-based storage cloud to address storage needs for cancer and genomic research data.
Trend Micro uses Hadoop for processing large volumes of web data to quickly identify and block malicious URLs. They have expanded their Hadoop cluster significantly over time to support growing data and job volumes. They developed Hadooppet to automate deployment and management of their large, customized Hadoop distribution across hundreds of nodes. Profiling tools like Nagios, Ganglia and Splunk help monitor and troubleshoot cluster performance issues.
How the Internet of Things is Turning the Internet Upside Down (DataWorks Summit)
- The document discusses how time series data from sensors can be ingested and analyzed at large scales. It describes how traditional internet architecture concentrates resources at the core while sensors and devices reside at the edge, producing large amounts of time series data.
- It then summarizes techniques for ingesting and analyzing time series data at rates of millions to hundreds of millions of data points per second using technologies like OpenTSDB, HBase, and MapR databases. This involves batching data at the edge and optimized storage designs.
- The document concludes by discussing the advantages of MapR for time series use cases, due to its high ingestion rates and integration with query engines like Drill for flexible analysis of large time series datasets.
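The edge batching mentioned above trades a little latency for far fewer writes to the storage tier. A minimal sketch of the idea, with an illustrative batch size and an in-memory sink standing in for a real database writer:

```python
class EdgeBatcher:
    """Accumulate time-series points at the edge and flush them to a
    sink in batches. Conceptual sketch of edge batching only; the
    batch size and sink are illustrative."""

    def __init__(self, batch_size, sink):
        self.batch_size = batch_size
        self.sink = sink        # callable that receives a list of points
        self.buffer = []

    def add(self, timestamp, value):
        self.buffer.append((timestamp, value))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Push whatever is buffered, including a partial tail batch."""
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []

batches = []
b = EdgeBatcher(batch_size=3, sink=batches.append)
for t in range(7):
    b.add(t, t * 1.5)
b.flush()                 # don't lose the partial final batch
print(len(batches))       # 3 batches: two full, one partial
```

With a batch size in the thousands, the same structure turns millions of individual points per second into a few hundred bulk writes, which is what makes the ingestion rates quoted above feasible.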
The document discusses new features in Apache Hadoop 3, including HDFS erasure coding, which reduces storage overhead; YARN federation, which improves scalability; and the Application Timeline Server, which provides improved visibility into application performance. It also covers HDFS support for multiple standby NameNodes, which enhances high availability, and the future directions of Hadoop, including object storage with Ozone and running HDFS on cloud infrastructure.
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ... (DataWorks Summit)
The proliferation of connected devices and sensors is leading the Digital Transformation. By 2020 there will be over 20 billion connected devices. Data from these devices needs to be ingested at extreme speeds in order to be analyzed before it decays. The life cycle of the data is critical in determining what insights can be revealed and how quickly they can be acted upon.
In this session we will look at past, present, and future architecture trends in streaming analytics. We will look at how to turn all the data from devices into actionable insights, and dive into recommendations for streaming architecture depending on the data streams and the time factor of the data. We will also discuss how to manage all the sensor data, understand the life cycle cost of the data, and how to scale capacity and capability easily with a modern infrastructure strategy.
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401) (Amazon Web Services)
In this session, we walk through the Amazon VPC network presentation and describe the problems we were trying to solve when we created it. Next, we walk through how these problems are traditionally solved, and why those solutions are not scalable, inexpensive, or secure enough for AWS. Finally, we provide an overview of the solution that we've implemented and discuss some of the unique mechanisms that we use to ensure customer isolation, get packets into and out of the network, and support new features like VPC endpoints.
- Kerberos is used to authenticate Hadoop services and clients running on different nodes communicating over a non-secure network. It uses tickets for authentication.
- Key configuration changes are required to enable Kerberos authentication in Hadoop, including setting hadoop.security.authentication to kerberos and generating keytabs containing principal keys for HDFS services.
- Services are associated with Kerberos principals using keytabs, which are then configured for use by the relevant Hadoop processes and services.
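Concretely, the configuration changes described above look roughly like the following fragments; the realm, principal names, and keytab paths are placeholders for site-specific values.

```xml
<!-- core-site.xml: switch authentication from "simple" to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: associate the NameNode service with its Kerberos
     principal and the keytab holding that principal's keys -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytabs/nn.service.keytab</value>
</property>
```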
Tuning Apache Ambari performance for Big Data at scale with 3000 agents (DataWorks Summit)
Apache Ambari manages Hadoop at large scale, and it becomes increasingly difficult for cluster admins to keep the machinery running smoothly as data grows and nodes scale from 30 to 3000 agents. To test at scale, Ambari has a Performance Stack that allows a VM to host as many as 50 Ambari Agents. The simulated stack and 50 Agents per VM can stress-test Ambari Server with the same load as a 3000-node cluster. This talk will cover how to tune the performance of Ambari and MySQL, and share performance benchmarks for features like deploy times, bulk operations, installation of bits, and Rolling & Express Upgrade. Moreover, the speaker will show how to use Ambari Metrics System and Grafana to plot performance, detect anomalies, and share tips on how to improve performance for a more responsive experience. Lastly, the talk will discuss roadmap features in Ambari 3.0 for improving performance and scale.
Intuit CTOF 2011 - Netflix for Mobile in the Cloud (Sid Anand)
The document concisely summarizes Netflix's mobile app development strategy and tips for the Apple iPhone:
1. Netflix develops mobile web pages that run in the iPhone's WebKit to enable A/B testing, fast deployments, and leveraging code across devices like the iPhone and Android.
2. The Netflix iPhone app loads key data like the user's rental history, movie lists, and ratings in parallel to provide a smooth user experience with minimal loading delays.
3. Netflix prefetches non-essential data like movie images, titles, and synopses on list pages to reduce load times when users view individual movie pages.
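The parallel-loading idea in point 2 can be sketched with a thread pool: issue independent requests concurrently so total latency is roughly that of the slowest call rather than the sum of all of them. The fetch functions below are stand-ins for real network calls, not Netflix's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_rental_history(user_id):
    # Stand-in for a network call; returns canned data.
    return ["movie-1", "movie-2"]

def fetch_movie_lists(user_id):
    return {"queue": ["movie-3"]}

def fetch_ratings(user_id):
    return {"movie-1": 5}

def load_home_screen(user_id):
    """Issue the three independent requests in parallel. With real
    network calls, wall-clock time is about max(call latencies)
    instead of their sum."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        history = pool.submit(fetch_rental_history, user_id)
        lists = pool.submit(fetch_movie_lists, user_id)
        ratings = pool.submit(fetch_ratings, user_id)
        return history.result(), lists.result(), ratings.result()

print(load_home_screen(42))
```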
Securing Hadoop's REST APIs with Apache Knox Gateway, Hadoop Summit June 6th, ... (Kevin Minder)
The Apache Knox Gateway is an extensible reverse proxy framework for securely exposing REST APIs and HTTP-based services at a perimeter. It provides out of the box support for several common Hadoop services, integration with enterprise authentication systems, and other useful features. Knox is not an alternative to Kerberos for core Hadoop authentication or a channel for high-volume data ingest/export. It has graduated from the Apache incubator and is included in Hortonworks Data Platform releases to simplify access, provide centralized control, and enable enterprise integration of Hadoop services.
Aspera on demand for AWS (S3 inc) overview (Bhavik Vyas)
Aspera provides high-speed file transfer software and technologies. According to the document:
- Aspera was founded in 2004 and is headquartered in Emeryville, CA. It has 95 employees and over 1,200 customers.
- Aspera's core technology is the fasp protocol, which uses innovative techniques to maximize transfer speeds over any network distance or conditions. It outperforms other transfer methods.
- When transferring data over long distances, TCP performance degrades significantly, but fasp transfers maintain high speeds. This makes fasp especially valuable for applications involving "Big Data" transfers.
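The TCP degradation claim follows from the bandwidth-delay product: a window-limited TCP flow can move at most one window of data per round trip, so throughput falls as distance (and hence RTT) grows. A quick back-of-the-envelope check, using the classic 64 KiB window as an illustrative assumption:

```python
def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Window-limited TCP throughput: at most one window per RTT."""
    return (window_bytes * 8) / (rtt_ms / 1000) / 1_000_000

# 64 KiB window at LAN, cross-country, and intercontinental RTTs
for rtt in (10, 100, 300):
    print(f"RTT {rtt:>3} ms -> "
          f"{max_tcp_throughput_mbps(64 * 1024, rtt):.1f} Mbps")
```

A 10 ms path sustains about 52 Mbps, but at 300 ms the same window caps out near 1.7 Mbps regardless of link capacity, which is the gap a protocol like fasp (which does not throttle on RTT this way) is designed to close.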
(NET301) New Capabilities for Amazon Virtual Private Cloud (Amazon Web Services)
Amazon's Virtual Private Cloud (Amazon VPC) continues to evolve with new capabilities and enhancements. These features give you increasingly greater isolation, control, and visibility at the all-important networking layer. In this session, we review some of the latest changes, discuss their value, and describe their use cases.
AWS re:Invent 2016: Deep Dive: AWS Direct Connect and VPNs (NET402) (Amazon Web Services)
As enterprises move to the cloud, robust connectivity is often an early consideration. AWS Direct Connect provides a more consistent network experience for accessing your AWS resources, typically with greater bandwidth and reduced network costs. This session dives deep into the features of AWS Direct Connect and VPNs. We discuss deployment architectures and demonstrate the process from start to finish. We show you how to configure public and private virtual interfaces, configure routers, use VPN backup, and provide secure communication between sites by using the AWS VPN CloudHub.
Building Real-Time Web Applications with Vortex-Web (Angelo Corsaro)
The Real-Time Web is rapidly growing, and as a consequence an increasing number of applications require soft real-time interactions with the server side as well as with peer web applications. In addition, real-time web technologies are experiencing swift adoption in traditional systems as a means of providing portable and ubiquitously accessible thin client applications.
In spite of this trend, few high level communication frameworks exist that allow efficient and timely data exchange between web applications as well as with the server-side and the back-end system. Vortex Web is one of the first technologies to bring the powerful OMG Data Distribution Service (DDS) abstractions to the world of HTML5 / JavaScript applications. With Vortex Web, HTML5 / JavaScript applications can seamlessly and efficiently share data in a timely manner amongst themselves as well as with any other kind of device or system that supports the standard DDS Interoperability wire protocol (DDSI).
This presentation will (1) introduce the key abstractions provided by Vortex Web, (2) provide an overview of its architecture and explain how Vortex Web uses Web Sockets and Web Workers to provide low latency and high throughput, and (3) get you started developing real-time web applications.
This document introduces and illustrates use cases, benefits, and problems for Kerberos deployment on Hadoop, and shows how Token support and TokenPreauth can help solve those problems. It also briefly introduces the Haox project, a Java client library for Kerberos.
The document is a catalogue from SUNMEDIA Corporation that provides information on routers and switches from Cisco. It includes sections on branch routers, data center interconnect platforms, campus LAN switches for access, and campus LAN switches for core and distribution. The document provides specifications, features and families for each product type.
This document discusses streaming data ingestion and processing options. It provides an overview of common streaming architectures including Kafka as an ingestion hub and various streaming engines. Spark Streaming is highlighted as a popular and full-featured option for processing streaming data due to its support for SQL, machine learning, and ease of transition from batch workflows. The document also briefly profiles StreamSets Data Collector as a higher-level tool for building streaming data pipelines.
Hadoop ClusterClient Security Using KerberosSarvesh Meena
This document discusses securing Hadoop clusters using Kerberos. It provides background on Hadoop, describing it as a framework that allows distributed processing of large datasets across computer clusters. It notes that Hadoop clusters are designed specifically for storing and analyzing huge amounts of unstructured data. The document then discusses why Hadoop cluster security is important, as the operating system trusts clients and servers trust any system on the network. It introduces Kerberos as a network protocol that uses secret-key cryptography to authenticate applications, providing encrypted tickets to access services.
LinkedIn leverages the Apache Hadoop ecosystem for its big data analytics. Steady growth of the member base at LinkedIn along with their social activities results in exponential growth of the analytics infrastructure. Innovations in analytics tooling lead to heavier workloads on the clusters, which generate more data, which in turn encourage innovations in tooling and more workloads. Thus, the infrastructure remains under constant growth pressure. Heterogeneous environments embodied via a variety of hardware and diverse workloads make the task even more challenging.
This talk will tell the story of how we doubled our Hadoop infrastructure twice in the past two years.
• We will outline our main use cases and historical rates of cluster growth in multiple dimensions.
• We will focus on optimizations, configuration improvements, performance monitoring and architectural decisions we undertook to allow the infrastructure to keep pace with business needs.
• The topics include improvements in HDFS NameNode performance, and fine tuning of block report processing, the block balancer, and the namespace checkpointer.
• We will reveal a study on the optimal storage device for HDFS persistent journals (SATA vs. SAS vs. SSD vs. RAID).
• We will also describe Satellite Cluster project which allowed us to double the objects stored on one logical cluster by splitting an HDFS cluster into two partitions without the use of federation and practically no code changes.
• Finally, we will take a peek at our future goals, requirements, and growth perspectives.
SPEAKERS
Konstantin Shvachko, Sr Staff Software Engineer, LinkedIn
Erik Krogen, Senior Software Engineer, LinkedIn
Big data processing meets non-volatile memory: opportunities and challenges DataWorks Summit
Advanced big data processing frameworks have been proposed to harness the fast data transmission capability of remote direct memory access (RDMA) over InfiniBand and RoCE. However, with the introduction of the non-volatile memory (NVM), these designs along with the default execution models, like MapReduce and Directed Acyclic Graph (DAG), need to be re-assessed to discover the possibilities of further enhanced performance.
In this context, we propose an accelerated execution framework (NVMD) for MapReduce and DAG that leverages the benefits of NVM and RDMA. NVMD introduces novel features for MapReduce and DAG, such as a hybrid push and pull shuffle mechanism and dynamic adaptation to the network congestion. The design has been incorporated into Apache Hadoop and Tez. Performance results illustrate that NVMD can achieve up to 3.65x and 3.18x improvement for Hadoop and Tez, respectively. In this talk, we will also present NVM-aware HDFS design and its benefits for MapReduce, Spark, and HBase.
Speaker: Shashank Gugnani, PhD Student, Ohio State University
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
This document discusses best practices for implementing Ceph-powered storage as a service. It covers planning a Ceph implementation based on business and technical requirements. Various use cases for Ceph are described, including OpenStack, cloud storage, web-scale applications, high performance block storage, archive/cold storage, databases and Hadoop. Architectural considerations for redundancy, servers, networking are also discussed. The document concludes with a case study of a university implementing a Ceph-based storage cloud to address storage needs for cancer and genomic research data.
Trend Micro uses Hadoop for processing large volumes of web data to quickly identify and block malicious URLs. They have expanded their Hadoop cluster significantly over time to support growing data and job volumes. They developed Hadooppet to automate deployment and management of their large, customized Hadoop distribution across hundreds of nodes. Profiling tools like Nagios, Ganglia and Splunk help monitor and troubleshoot cluster performance issues.
How the Internet of Things are Turning the Internet Upside DownDataWorks Summit
- The document discusses how time series data from sensors can be ingested and analyzed at large scales. It describes how traditional internet architecture concentrates resources at the core while sensors and devices reside at the edge, producing large amounts of time series data.
- It then summarizes techniques for ingesting and analyzing time series data at rates of millions to hundreds of millions of data points per second using technologies like OpenTSDB, HBase, and MapR databases. This involves batching data at the edge and optimized storage designs.
- The document concludes by discussing the advantages of MapR for time series use cases due to its high ingestion rates and integration with query engines like Drill for flexible analysis of large time series datasets
The document discusses new features in Apache Hadoop 3, including HDFS erasure coding which reduces storage overhead, YARN federation which improves scalability, and the Application Timeline Server which provides improved visibility into application performance. It also covers HDFS multi standby NameNodes which enhances high availability, and the future directions of Hadoop including object storage with Ozone and running HDFS on cloud infrastructure.
Future Architecture of Streaming Analytics: Capitalizing on the Analytics of ...DataWorks Summit
The proliferation of connected devices and sensors is leading the Digital Transformation. By 2020 there will be over 20 billion connected devices. Data from these devices need to be ingested at extreme speeds in order to be analyzed before the data decays. The life cycle of the data is critical in revealing what insight can be revealed and how quickly they can be acted upon.
In this session we will look at the past, present and future architecture trends streaming analytics. We will look at how to turn all the data from devices into actionable insights and dive into recommendations for streaming architecture depending on the data streams and time factor of the data. We will also discuss how to manage all the sensor data, understand the life cycle cost of the data, and how to scale capacity and capability easily with a modern infrastructure strategy.
AWS re:Invent 2016: Another Day, Another Billion Packets (NET401)Amazon Web Services
In this session, we walk through the Amazon VPC network presentation and describe the problems we were trying to solve when we created it. Next, we walk through how these problems are traditionally solved, and why those solutions are not scalable, inexpensive, or secure enough for AWS. Finally, we provide an overview of the solution that we've implemented and discuss some of the unique mechanisms that we use to ensure customer isolation, get packets into and out of the network, and support new features like VPC endpoints.
- Kerberos is used to authenticate Hadoop services and clients running on different nodes communicating over a non-secure network. It uses tickets for authentication.
- Key configuration changes are required to enable Kerberos authentication in Hadoop including setting hadoop.security.authentication to kerberos and generating keytabs containing principal keys for HDFS services.
- Services are associated with Kerberos principals using keytabs, which are then configured for use by the relevant Hadoop processes and services.
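The configuration changes described above can be sketched as follows. The property names are the standard Hadoop/HDFS keys; the realm `EXAMPLE.COM` and the keytab path are placeholders, not values from the document.

```xml
<!-- core-site.xml: switch Hadoop from simple auth to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: point the NameNode at its keytab and principal.
     _HOST is expanded to the node's hostname at runtime;
     realm and path below are illustrative placeholders. -->
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytabs/nn.service.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@EXAMPLE.COM</value>
</property>
```

Equivalent keytab and principal settings exist for the DataNode and other services; each process authenticates to the KDC with its keytab instead of a password.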
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsDataWorks Summit
Apache Ambari manages Hadoop at large scale, and it becomes increasingly difficult for cluster admins to keep the machinery running smoothly as data grows and nodes scale from 30 to 3000 agents. To test at scale, Ambari has a Performance Stack that allows a VM to host as many as 50 Ambari Agents. The simulated stack and 50 Agents per VM can stress-test Ambari Server with the same load as a 3000-node cluster. This talk will cover how to tune the performance of Ambari and MySQL, and share performance benchmarks for features like deploy times, bulk operations, installation of bits, and Rolling & Express Upgrade. Moreover, the speaker will show how to use Ambari Metrics System and Grafana to plot performance, detect anomalies, and pinpoint tips on how to improve performance for a more responsive experience. Lastly, the talk will discuss roadmap features in Ambari 3.0 for improving performance and scale.
Intuit CTOF 2011 - Netflix for Mobile in the CloudSid Anand
This document summarizes Netflix's mobile app development strategy and tips for the Apple iPhone:
1. Netflix develops mobile web pages that run in the iPhone's WebKit to enable A/B testing, fast deployments, and leveraging code across devices like iPhone and Android.
2. The Netflix iPhone app loads key data like the user's rental history, movie lists, and ratings in parallel to provide a smooth user experience with minimal loading delays.
3. Netflix prefetches non-essential data like movie images, titles, and synopses on list pages to reduce load times when users view individual movie pages.
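The parallel-loading pattern in point 2 can be sketched generically. This is not Netflix's code: the resource names and the `fetch` function are illustrative stand-ins, with `time.sleep` simulating network latency, but the sketch shows why issuing independent startup requests concurrently cuts perceived load time to roughly the slowest single call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(resource):
    """Stand-in for an independent startup API call."""
    time.sleep(0.1)               # simulated network latency
    return (resource, "ok")

resources = ["rental_history", "movie_lists", "ratings"]

# Serial: total latency is the sum of the individual calls.
start = time.time()
serial = [fetch(r) for r in resources]
serial_time = time.time() - start

# Parallel: total latency is roughly the slowest single call.
start = time.time()
with ThreadPoolExecutor(max_workers=len(resources)) as pool:
    parallel = list(pool.map(fetch, resources))
parallel_time = time.time() - start

print(f"serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```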
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
The Apache Knox Gateway is an extensible reverse proxy framework for securely exposing REST APIs and HTTP-based services at a perimeter. It provides out of the box support for several common Hadoop services, integration with enterprise authentication systems, and other useful features. Knox is not an alternative to Kerberos for core Hadoop authentication or a channel for high-volume data ingest/export. It has graduated from the Apache incubator and is included in Hortonworks Data Platform releases to simplify access, provide centralized control, and enable enterprise integration of Hadoop services.
Aspera on demand for AWS (S3 inc) overviewBhavik Vyas
Aspera provides high-speed file transfer software and technologies. According to the document:
- Aspera was founded in 2004 and is headquartered in Emeryville, CA. It has 95 employees and over 1,200 customers.
- Aspera's core technology is the fasp protocol, which uses innovative techniques to maximize transfer speeds over any network distance or conditions. It outperforms other transfer methods.
- When transferring data over long distances, TCP performance degrades significantly, while fasp transfers maintain high speeds. This makes fasp well suited for applications involving "Big Data" transfers.
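The TCP degradation claim above can be made concrete with the well-known Mathis et al. approximation for steady-state TCP throughput, rate ≤ MSS / (RTT · √p). This sketch is not from the Aspera document, and the loss rate and RTT values are illustrative, but it shows why throughput collapses as round-trip time grows even when the link itself is fast.

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: rate <= MSS / (RTT * sqrt(p))."""
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

mss, loss = 1460, 0.0001          # 1460-byte segments, 0.01% packet loss

lan = tcp_throughput_bps(mss, 0.010, loss)   # 10 ms RTT (nearby)
wan = tcp_throughput_bps(mss, 0.150, loss)   # 150 ms RTT (transcontinental)

print(f"10 ms RTT:  {lan / 1e6:.1f} Mbit/s")
print(f"150 ms RTT: {wan / 1e6:.1f} Mbit/s")
```

With identical loss, a 15x longer round trip cuts achievable TCP throughput by the same 15x factor — which is the gap that protocols like fasp, running over UDP with their own reliability layer, are designed to close.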
(NET301) New Capabilities for Amazon Virtual Private CloudAmazon Web Services
Amazon's Virtual Private Cloud (Amazon VPC) continues to evolve with new capabilities and enhancements. These features give you increasingly greater isolation, control, and visibility at the all-important networking layer. In this session, we review some of the latest changes, discuss their value, and describe their use cases.
AWS re:Invent 2016: Deep Dive: AWS Direct Connect and VPNs (NET402)Amazon Web Services
As enterprises move to the cloud, robust connectivity is often an early consideration. AWS Direct Connect provides a more consistent network experience for accessing your AWS resources, typically with greater bandwidth and reduced network costs. This session dives deep into the features of AWS Direct Connect and VPNs. We discuss deployment architectures and demonstrate the process from start to finish. We show you how to configure public and private virtual interfaces, configure routers, use VPN backup, and provide secure communication between sites by using the AWS VPN CloudHub.
Building Real-Time Web Applications with Vortex-WebAngelo Corsaro
The Real-Time Web is rapidly growing, and as a consequence an increasing number of applications require soft real-time interactions with the server side as well as with peer web applications. In addition, real-time web technologies are experiencing swift adoption in traditional systems as a means of providing portable and ubiquitously accessible thin-client applications.
In spite of this trend, few high level communication frameworks exist that allow efficient and timely data exchange between web applications as well as with the server-side and the back-end system. Vortex Web is one of the first technologies to bring the powerful OMG Data Distribution Service (DDS) abstractions to the world of HTML5 / JavaScript applications. With Vortex Web, HTML5 / JavaScript applications can seamlessly and efficiently share data in a timely manner amongst themselves as well as with any other kind of device or system that supports the standard DDS Interoperability wire protocol (DDSI).
This presentation will (1) introduce the key abstractions provided by Vortex Web, (2) provide an overview of its architecture and explain how Vortex Web uses Web Sockets and Web Workers to provide low latency and high throughput, and (3) get you started developing real-time web applications.
This document introduces and illustrates use cases, benefits, and problems for Kerberos deployment on Hadoop, and explains how token support and TokenPreauth can help solve those problems. It also briefly introduces the Haox project, a Java client library for Kerberos.
The document is a catalogue from SUNMEDIA Corporation that provides information on routers and switches from Cisco. It includes sections on branch routers, data center interconnect platforms, campus LAN switches for access, and campus LAN switches for core and distribution. The document provides specifications, features and families for each product type.
The Alcatel 7350 ASAM is a multiservice DSLAM and ATM switch that offers cost-effective delivery of broadband internet, voice, and video services. It supports various DSL technologies, ATM, and IP networking to enable high-density access and service delivery. The platform is scalable, provides quality of service and traffic management features, and can be remotely managed through the Alcatel Network Manager.
This document provides information about the Cisco XFP10GEROC192IR product, including:
- It is a 10GBASE-ER and OC192 IR2 XFP module for networking.
- Launch 3 Telecom sells this product and provides same-day shipping, payment options like credit cards, and a warranty.
- They also offer services like repairs, maintenance contracts, installation, and recycling of telecom equipment.
The presentation discusses iDirect's Evolution product line including the iDX 1.0 satellite router, X3 router, and line cards. Key features highlighted are DVB-S2/ACM technology for improved bandwidth efficiency, integration with the X3 satellite router, and software tools for monitoring and adjusting ACM performance. Benefits of iDirect's DVB-S2/ACM implementation include increased throughput and bandwidth savings while easing network configuration.
Cisco Unified Wireless Network and Converged access – Design sessionCisco Russia
This document discusses Cisco's unified wireless network and converged access design session. It provides an overview of wireless standards past and present, including expected developments. Cisco's unified access vision is described, bringing wired and wireless onto a single policy and management framework. The document highlights Cisco's leadership in wireless networking and reviews Cisco's wireless product portfolio, including new access point models. Key capabilities such as RF management and advanced mobility services are also summarized.
This document provides information about the Cisco XFP-10GLR-OC192SR module and how to purchase it from Launch 3 Telecom. It describes Launch 3 Telecom as a supplier of Cisco and telecom equipment, outlines the payment and shipping options for purchasing the module, and details the warranty and support services provided by Launch 3 Telecom.
This document provides information about the Cisco SFPOC48SR product, including:
1) It lists contact information for purchasing the Cisco SFPOC48SR and provides a product description noting it is a Cisco OC-48c/STM-16 Short-Reach Transceiver Module.
2) It describes the company Launch 3 Telecom that sells the product and notes they offer same-day shipping, payment options, warranty, and additional services like repair.
3) It provides an overview of the Cisco 7600 Series Internet Router, which the SFPOC48SR can be used with, highlighting its scalability, interfaces, applications for service providers and enterprises.
The document describes the Cisco 2500 Series Wireless Controller, which enables systemwide wireless functions for small to medium enterprises. It supports up to 75 access points and 1000 clients, and provides centralized security policies, RF management, and quality of service. Key features include scalability, ease of deployment, high performance up to 1 Gbps, comprehensive security, and support for voice, video and guest access.
AudioCodes Voice gateway can be connected to Ms LyncATHLSolutions
The document discusses AudioCodes enhanced gateways that provide connectivity between Microsoft Lync, the public switched telephone network (PSTN), and IP-based phone systems. The gateways come in various sizes to support different branch office needs. They offer a gradual migration path from traditional phone networks to IP and SIP trunking.
Huawei ar2200 series enterprise routers datasheetUmar Yaqub
The AR2200 series enterprise routers provide secure, scalable unified voice and data communications for enterprise headquarters and branch offices. The routers integrate routing, switching, 3G, voice, and security functions into a single modular chassis. They support wired and wireless access including Ethernet, xDSL, fiber, and 3G. The routers also provide built-in PBX, SIP server, and firewall capabilities to enable enterprise-class voice services and network security.
The document describes a Cisco MEM-C6K-CPTFL256M part and provides information about purchasing, shipping, warranty, and services from Launch 3 Telecom. Specifically, it states that Launch 3 Telecom sells the Cisco MEM-C6K-CPTFL256M part, offers same day shipping, and provides a warranty and return policy. It also notes that Launch 3 Telecom offers services like repair, maintenance contracts, de-installation, and recycling.
The document discusses various types of network hardware including:
- Local networking hardware such as network interface cards, cables, connectors, hubs, switches, servers, and workstations.
- Internetworking hardware such as line drivers, transceivers, bridges, switches, routers, and gateways.
Current internetworking devices are mostly confined to switches and routers. The document also examines network interface cards in detail, describing their specifications and evolution over time from older cards to current gigabit Ethernet cards.
The document discusses various types of network hardware including local networking hardware like network interface cards (NICs), cables, connectors, hubs, switches, servers and workstations. It also discusses internetworking hardware like line drivers, transceivers, bridges, switches, routers and gateways. It describes the characteristics of different NICs such as their speed, connector type, and bus technology. It also examines concepts like IRQ, I/O address, base memory address, and DMA used for resource allocation on NICs. Finally, it discusses network connectors and different types of hubs.
Squire Technologies: 9225 Protocol Converter Presentation.
SS7 to PRI ISDN Protocol Converter.
The SS7 to ISDN Protocol Converter is a fully featured, carrier-grade product with a flexible and powerful routing engine, offered in three models (1000, 2000, and 8000) to suit clients' deployment requirements and budgets.
The product has a high pedigree of worldwide SS7 to ISDN PRI signalling interconnect in over 70 countries, catering for both small interconnects and large international points of presence.
How much you know about cisco, cisco routerIT Tech
Cisco is the worldwide leader in networking for the Internet. Cisco provides networking solutions that are the foundation for most corporate, education, and government networks around the world. Cisco's popular router series include the 800, 1800, 1900, 2800, 2900, 3800, 3900, 7200, and 7600 series. These routers offer features such as security, wireless connectivity, VPN support, and voice and video capabilities to meet the needs of small to large networks.
The document discusses ACKSYS's rugged WiFi solutions for communication on buses, trams, and at depots. It describes how ACKSYS's products allow vehicles to automatically load and offload operating data through a wireless network at depots. It also discusses providing real-time communication on vehicles in motion through ACKSYS's WiFi devices that offer fast roaming and high data rates to ensure seamless connectivity. ACKSYS offers a complete solution of devices tailored for both onboard and ground use to enable dynamic data transfers and smart deployment of wireless networks for public transportation.
Building a Raspberry Pi Robot with Dot NET 8, Blazor and SignalRPeter Gallagher
In this session delivered at NDC Oslo 2024, I talk about how you can control a 3D printed Robot Arm with a Raspberry Pi, .NET 8, Blazor and SignalR.
I also show how you can use a Unity app on a Meta Quest 3 to control the arm in VR too.
You can find the GitHub repo and workshop instructions here;
https://bit.ly/dotnetrobotgithub
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"Emmanuel Onwumere
In iOS 18, Apple has introduced a significant revamp to the Control Centre, making it more intuitive and user-friendly. One of the standout features is a quicker and more accessible way to shut down your iPhone. This enhancement aims to streamline the user experience, allowing for faster access to essential functions. Discover how iOS 18's redesigned Control Centre can simplify your daily interactions with your iPhone, bringing convenience right at your fingertips.