Slides from my Scalapeno 2016 talk about BigPanda's journey from node.js to Scala on our core data processing service.
For more details - hit me up at @itrvd on Twitter.
Kubernetes networking is complex with several components including pod networking, service networking, and DNS. Pod networking requires assigning each pod a unique IP and enabling communication between pods using CNI plugins like Flannel or Calico which implement overlay networking. Service networking is done via kube-proxy using either iptables or IPVS mode, with IPVS being more scalable. DNS lookups are handled by CoreDNS to map service names to cluster IPs. Overall, Kubernetes networking takes work to understand but the ecosystem adapts quickly to issues.
Resource optimization of batch systems with Kubernetes Jobs / AbemaTV DevCon 2018 TrackB Session B6 - AbemaTV, Inc.
This document discusses transcoding video files using Kubernetes jobs. It describes creating different job pods for each transcoding task, such as 1080p, 720p, etc. It shows configuring the jobs with specifications for parallelism, completions, and restart policies. The document also discusses using the Kubernetes client API to watch for job status changes in order to manage the transcoding workflow.
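As a rough illustration of the per-resolution Job setup summarized above, the sketch below builds one `batch/v1` Job manifest per target resolution as plain Python dictionaries. Only `parallelism`, `completions`, and the restart policy come from the summary; the image name `example/transcoder`, the `transcode-<height>p` naming scheme, and the container arguments are hypothetical and not taken from the slides.

```python
import json


def transcode_job(source: str, height: int,
                  parallelism: int = 1, completions: int = 1) -> dict:
    """Build a batch/v1 Job manifest for one transcoding rendition."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"transcode-{height}p"},  # hypothetical naming scheme
        "spec": {
            "parallelism": parallelism,   # pods allowed to run at the same time
            "completions": completions,   # successful pods needed to finish the Job
            "template": {
                "spec": {
                    # Job pod templates only accept "Never" or "OnFailure" here.
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "transcoder",
                        "image": "example/transcoder:latest",  # hypothetical image
                        "args": ["--input", source, "--height", str(height)],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # One Job per rendition, as in the 1080p / 720p example from the summary.
    for height in (1080, 720):
        print(json.dumps(transcode_job("input.mp4", height), indent=2))
```

The resulting dictionaries could be serialized to YAML or JSON and submitted with whatever client the workflow already uses; watching for status changes is left out of this sketch.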
Dual-homed connectivity provides connectivity to the internet through two separate internet service providers (ISPs), reducing the risk of an outage due to single point of failure if one ISP fails. The ipv6 traffic-filter command is used to apply an IPv6 access list called MY-LIST to check inbound packets on an interface. NAT64 is the protocol that provides IPv4 internet connectivity to IPv6 devices.
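To make the NAT64 point a little more concrete, here is a minimal sketch of how an IPv4 address can be embedded in an IPv6 address under the well-known prefix `64:ff9b::/96` defined in RFC 6052. The function names are illustrative, and the sketch assumes a /96 prefix only; it does not model the translator itself.

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_nat64(ipv4: str,
                     prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX
                     ) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract_ipv4(ipv6: str,
                 prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX
                 ) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address from a synthesized address."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError(f"{addr} is not inside {prefix}")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


if __name__ == "__main__":
    mapped = synthesize_nat64("192.0.2.33")
    print(mapped, "->", extract_ipv4(str(mapped)))
```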
This document discusses making a local CPAN mirror called DPAN that can provide access to Perl modules even when disconnected from the internet. It describes using the MyCPAN::Indexer and MyCPAN::App::DPAN modules to build indexes of modules from CPAN and backpan.perl.org and store them locally in a format similar to the CPAN distribution files. The DPAN files can then be served to local CPAN clients and updated periodically using SVN to sync with changes to the upstream repositories.
PGroonga – Make PostgreSQL fast full text search platform for all languages! - Kouhei Sutou
PGroonga is an extension for PostgreSQL that provides fast full text search across all languages. It uses a full inverted index to allow phrase searches without needing to perform a slow "recheck" step, unlike the pg_bigm extension. By supporting accurate phrase searches through its index structure, PGroonga is much faster than pg_bigm, especially for queries with many result hits.
This work evaluates the two classic work-stealing algorithms (FIFO and LIFO) together with a newly proposed implementation based on task priority, calculated using the longest path as a metric.
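The three stealing policies can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation evaluated in that work: the priority of a task is assumed to be the cost of the longest path from that task to the end of the task graph, the left end of the deque is assumed to hold the oldest task, and all function names are hypothetical.

```python
from collections import deque
from functools import lru_cache


def longest_path_priorities(successors: dict[str, list[str]],
                            cost: dict[str, int]) -> dict[str, int]:
    """Priority of a task = its cost plus the longest path through its successors."""
    @lru_cache(maxsize=None)
    def longest(task: str) -> int:
        return cost[task] + max(
            (longest(s) for s in successors.get(task, [])), default=0)

    return {task: longest(task) for task in cost}


def steal_fifo(victim: deque) -> str:
    """Steal the oldest task in the victim's queue."""
    return victim.popleft()


def steal_lifo(victim: deque) -> str:
    """Steal the most recently queued task."""
    return victim.pop()


def steal_priority(victim: deque, priority: dict[str, int]) -> str:
    """Steal the queued task with the longest remaining path."""
    best = max(victim, key=priority.__getitem__)
    victim.remove(best)
    return best


if __name__ == "__main__":
    # Diamond-shaped task graph: a -> {b, c} -> d, with made-up costs.
    successors = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    cost = {"a": 1, "b": 5, "c": 2, "d": 1}
    priority = longest_path_priorities(successors, cost)
    print(priority)
    print(steal_priority(deque(["c", "b"]), priority))
```

Scanning the victim's queue for the maximum is linear in its length; a real scheduler would more likely keep a heap or per-priority buckets.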
What is a Service Mesh and what can it do for your Microservices - Matt Turner
We’ll explore what a service mesh is and what it can do for your microservices. Are the claims of observability, resiliency, and WAF features real? Are they useful during development, production, or both? Using pictures and demos, we’ll find out!
This session will also briefly cover how a service mesh works, giving us a mental model with which to explore and evaluate after the talk. Matt will show a simple installation and demo, giving us all the knowledge to go home and try it for ourselves.
Gauntlt is an open source project that allows developers to test the security of their code by running automated attacks defined in Cucumber features. It includes features to scan for open ports with Nmap, crawl a site for passwords, and include other recon, scanning, fuzzing and load testing tools. The project aims to provide a domain specific language and framework to encourage contributors to package common penetration testing tools for automated security testing of code.
This document discusses how to submit a Perl module to CPAN in 3 steps:
1. Generate the module files using h2xs and write documentation.
2. Post about the module on modules@perl.org to get feedback.
3. Upload the distribution files to PAUSE after testing and following the checklist.
NSQ is a real-time distributed messaging platform that provides a simple publish-subscribe interface along with a queuing system. It is open source, written in Go, and supports multiple programming languages. NSQ handles topics which messages can be published to, and channels that allow consuming and processing messages.
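The topic and channel relationship can be pictured with a small in-memory toy model. This is not the NSQ client API and involves no networking; the `Topic` class and its method names are hypothetical. It only illustrates that each channel receives its own copy of every message published to the topic, while consumers reading from the same channel share that channel's queue.

```python
from collections import deque
from typing import Optional


class Topic:
    """Toy model of a topic that fans messages out to its channels."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.channels: dict[str, deque] = {}

    def channel(self, name: str) -> deque:
        """Create the channel on first use and return its queue."""
        return self.channels.setdefault(name, deque())

    def publish(self, message: str) -> None:
        """Copy the message into every channel that exists right now."""
        for queue in self.channels.values():
            queue.append(message)

    def consume(self, channel: str) -> Optional[str]:
        """Take the next message from a channel, or None if it is empty."""
        queue = self.channel(channel)
        return queue.popleft() if queue else None


if __name__ == "__main__":
    events = Topic("events")
    events.channel("metrics")
    events.channel("archive")
    events.publish("m1")
    events.publish("m2")
    # Two consumers on "metrics" split the messages between them...
    print(events.consume("metrics"), events.consume("metrics"))
    # ...while "archive" still sees every message from the start.
    print(events.consume("archive"))
```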
Real-time processing at Streamroot - Golang Paris June 2016 - Simon Caplette
This document summarizes Simon Caplette's work as a Backend Scalability Engineer at streamroot.io, which provides peer-to-peer functionality to save large video broadcasters bandwidth. Key aspects of Streamroot's realtime systems built with Go include high concurrency and scalability components like trackers, signaling servers, and autoscalers. Go was well-suited due to its built-in concurrency and ability to handle many concurrent processes. The document also discusses Streamroot's realtime data pipeline using Kafka and InfluxDB for low latency analytics.
DOD 2016 - Kamil Szczygieł - Patching 100 OpenStack Compute Nodes with Zero-d... - PROIDEA
YouTube: https://www.youtube.com/watch?v=OsgNn-D9KFc&index=15&list=PLnKL6-WWWE_VtIMfNLW3N3RGuCUcQkDMl
Undisclosed vulnerabilities are a very serious threat to cloud security. Once a flaw becomes public, the risk of attack increases dramatically. In our talk we will go through a case study of patching 100 OpenStack compute nodes, hosting 4000 virtual machines, with a zero-day patch within 16 hours. We will talk about the challenges we encountered and how we faced them, and we will answer the most important question: did we make it within 16 hours?
Nmap has several hidden options that provide little value. 8 options are useless except for naughty users or elementary school children. Nmap can only detect one type of malware, the Mydoom worm, through service scanning at high intensity levels. In summary, most of the "hidden truths" about Nmap options provide little practical benefit to users.
Flare and TensorFlare: Native Compilation for Spark and TensorFlow Pipelines ... - Databricks
Spark performance has made impressive progress but there is still a significant gap towards what is achievable by best-of-breed query engines or hand-written low-level C code, on modern server-class hardware. We present Flare, a new back-end for Spark SQL that yields significant speedups by compiling Catalyst query plans to native code.
Flare’s low-level implementation takes full advantage of native execution, using techniques such as NUMA-aware scheduling and data layouts to leverage ‘mechanical sympathy’ and bring execution closer to the metal. In particular, Flare enables easy integration with other native code. We present TensorFlare, a compiler backend that enables full native code generation for AI pipelines that use Spark together with TensorFlow, and demonstrates order-of-magnitude speedups.
The document describes the CyberPro, a portable packet capture and analysis appliance. It offers high-speed packet capture of up to 10Gbps, real-time analytics, active triggers to flag anomalies, and integrated post-processing tools. The CyberPro allows network analysts and security investigators to quickly capture and analyze network traffic to uncover intrusions or performance issues.
David Melendez - Building a drone from scratch with spare parts is a challenging business. To accomplish it, a Linux embedded stability control system is developed entirely from zero. The journey starts with choosing the hardware (a home WiFi router) and ends with a stable, real flight. Unconventional implementations are one of the main topics, such as using WiFi for communication between drone and pilot, HTML5 and COMET to show telemetry from the router's web server, and an entirely new protocol based on 802.11 Beacon Frames to prevent deauthentication attacks.
rtpengine is a media relay, WebRTC bridge, call recorder, media transcoder, and media player. It can relay and manipulate media in real-time by forwarding packets through a kernel module. It supports features like SDP profile transforming, ICE negotiation, DTLS-SRTP encryption, packet recording, transcoding between codecs, and injecting audio streams into calls from files or databases. rtpengine integrates with Kamailio through modules and configuration to manipulate media on SIP calls.
Jdd2014: High performance logging - Peter Lawrey - PROIDEA
In many applications, there is a tension between how much you can log without slowing down your application, and how much information you would like to have.
Chronicle provides a number of solutions that let you record millions of events per second, with microsecond latencies, in a persisted way and without contributing to your garbage.
How does this simplify the design and help you increase the determinism and vertical scalability of your application?
Risks in enterprise IT provisioning. Anatomy of the situation. - HORTUS Digital
Risks are everywhere. The IT world is no exception, but in day-to-day work this is forgotten. The topic of IT risk is close to endless in scope and changes over time. Today's security can easily become tomorrow's biggest risk. Risks associated with IT processes are a very large and real problem which, if ignored, can cost huge sums of money, positions, and careers. At the same time, I regularly see that even when a client initially asks for a very secure and reliable solution, once they see the cost estimate all the previously discussed risks suddenly become unimportant. That is not right, and that is why risks need to be talked about.
Heath Richard Munro is an experienced Operations Manager with over 25 years in the automotive sales sector. He has held several leadership roles, including Area Sales Manager for Skoda UK Ltd and Territory Business Manager for Mazda Motors UK Ltd. Currently, he is seeking a new opportunity in operations management where he can utilize his expertise in sales, marketing, customer satisfaction, and team leadership.
Narayan Sandhle has over 5 years of experience in payroll and finance and administration roles. He has specialized expertise in US, Asian Pacific, and Indian payrolls. He is currently a Process Specialist at Infosys BPO handling the end-to-end payroll for Singapore and South Korea. Previously he worked at EXL Services and Shahi Exports in payroll processing, accounts payable, and reconciliation roles. Narayan holds an MBA in Finance and a BCom in Finance from universities in Bangalore.
Proactively using social media throughout a crisis - Stephen Thompson
The document discusses how social media is developing and how it can be used to manage and mitigate crises. It notes the rise of more visual and private platforms, as well as fake news. It recommends using skepticism and alternative sources to address misinformation risks. It also discusses echo chambers, listening strategies like location and network tools, and crisis fundamentals like pre-planning, internal communication, and ongoing listening across multiple platforms and sources.
After meniscus-repair - POSTSURGICAL MENISCAL REPAIR REHABILITATION PROTOCOL - priyaakumarr
This is an outline of the major exercises that are commonly incorporated. Individual patient response should be considered, and modifications may therefore need to be made. The surgeon should be contacted if concerns arise during rehabilitation.
To know more, visit http://www.dramrajani.com/
This document provides information about Cape Fear HealthNet (CFHN), which aims to create a coordinated healthcare system for the uninsured in Brunswick and New Hanover counties, North Carolina. It discusses CFHN's mission, history, target populations, current safety net providers, goals, and plans to develop a healthcare system for the uninsured through recruiting specialty physicians and establishing advisory committees.
This document provides an overview of a course on computers that teaches students how to use computers and common programs through hands-on practice. The course covers topics like working with computers, using programs, and creating a website, blog, and photo album to give students experience with different digital tools and resources.
Common sense of backpack buckle 2 -- Smart Industrial (Asia) Corporation Limited - Edward Young
This document provides information about Smart Industrial (Asia) Corporation Limited, including their address, website, and contact person. It discusses three major backpack buckle brands, five common buckle types, an example buckle reference SB6462, and common buckle testing types like strain relief, instant pull, opening/closing, temperature, salt spray, and other environmental tests. The purpose is to introduce Smart Industrial as a backpack buckle manufacturer and describe relevant buckle product and testing information.
Science fiction is an exciting genre that depicts extraordinary situations and characters not accepted by mainstream science. The main purpose of the genre is to entertain audiences and take them to an imaginary world. Distant Home, Merc Force, Drifting School, Dead Kids, and Minority Report are some of the best movies in this genre.
The document provides an overview of the autonomic nervous system including its divisions of sympathetic and parasympathetic. Key points covered include:
- The sympathetic nervous system mobilizes the body for "fight or flight" through neurotransmitters like norepinephrine, while the parasympathetic nervous system conserves energy for "rest and digest" using acetylcholine.
- There are multiple pathways for preganglionic neurons in both divisions to reach target organs, often traveling along spinal nerves, ganglia, and plexuses.
- A clinical case of a tumor impinging the brachial plexus and sympathetic trunk is used to demonstrate autonomic effects like ptosis and mydriasis.
The document summarizes discussions from a Central Piedmont Access II board retreat work group session. It outlines plans to transition the organization's structure, including consolidating steering committees, establishing a network executive board, and defining board member roles and responsibilities. Metrics for measuring network performance are presented, along with examples of practice profiles and tools for monitoring individual provider quality. Goals for developing a strategic plan are discussed, including creating a mission, assessing strengths and weaknesses, and setting measurable objectives. A sample budget template is also included.
This document lists the episodes from Season 1 of the TV series "Off The Vine" which has 10 episodes total. Episode 1 kicks off the season, followed by Episodes 2 through 9. The season finale is Episode 10.
Akhdiyat Duta Modjo is the vocalist of Sheila On Seven, a band from Yogyakarta. Born in the United States, he studied at Universitas Gadjah Mada but did not finish because he focused on music. Besides singing, Duta also starred in the film Tak Biasa alongside Melanie Putria.
This resume summarizes Chandan Raj S's personal and professional details. It includes his contact information, education history, work experience in web development and related projects at companies like TekSystems and Arena Animation. His skills include programming languages like Java, PHP, and JavaScript. He has a B.E. in computer science from VTU and an MTech from CMR Institute of Technology. His areas of interest are in web development, computer networks, and cloud computing.
This document advertises the sale of various illegal goods and services, including stolen credit card information, bank login credentials, money transfers through fraudulent means, and document forgery services. Contact information is provided for interested buyers to arrange purchases through email and messaging applications. Pricing and samples are listed for different types of stolen financial and personal information from various countries. Payment is expected through anonymous payment methods like bitcoin or money transfer services.
1) Tom is called to his boss's office fearing he will be fired for a mistake at a company event.
2) Over the weekend, Tom takes advice to apologize to everyone he has wronged, making several situations much worse.
3) On Monday, Tom's boss calls him into the office, but instead of firing him, promotes Tom to manager. However, when others find out, they do not look happy to see Tom, implying they want to complain about him to the boss.
Hungary is an old European country with borders to six other countries. Budapest is its capital city and home to the second oldest metro system in the world. Some key facts about Hungary are that it has over 15 million residents, Hungarians are large consumers of paprika, Lake Balaton is a major healing lake, and the Rubik's Cube was invented there. Hungary has also found success in the Olympics and fields such as literature, music, and chess over the years.
This document provides an overview of Apache Samza, an open source stream processing framework. It discusses why stream processing is useful, Samza's design of processing streams of data across jobs and tasks, how its design is implemented using Apache Kafka for messaging and YARN for resource management, and how to use Samza by developing stream and stateful tasks.
FPGA based 10G Performance Tester for HW OpenFlow Switch - Yutaka Yasuda
SDN operators need to measure the performance of OpenFlow hardware switches on their own site, because latency can differ by a factor of 1000 depending on the specified flow entry. An ASIC can forward a packet in several microseconds, but the software (CPU) path may take milliseconds.
To protect yourself from an unexpected performance plunge, monitor the health of your switches on site.
XMPP and AMQP are messaging protocols. XMPP uses XML and is extendable while AMQP is for message queues. Ejabberd is an XMPP server written in Erlang, while RabbitMQ supports AMQP. Both can be used for communication between clients and servers.
The document summarizes a presentation given by Chris Fregly on end-to-end real-time analytics using Apache Spark. It discusses topics like Spark streaming, machine learning, tuning Spark for performance, and demonstrates live demos of sorting, matrix multiplication, and thread synchronization optimized for CPU cache. The presentation emphasizes techniques like cache-friendly data layouts, prefetching, and lock-free algorithms to improve Spark performance.
Introducing Exactly Once Semantics To Apache Kafka - Apurva Mehta
Here are slides from my talk on introducing exactly once semantics to Apache Kafka. The talk was given at the Kafka Summit NYC, 8 May 2017.
The slides dive into the design of transactions in Apache Kafka.
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka - confluent
The document introduces Apache Kafka's new exactly once semantics that provide exactly once, in-order delivery of records per partition and atomic writes across multiple partitions. It discusses the existing at-least once delivery semantics and issues around duplicates. The new approach uses idempotent producers, sequence numbers, and transactions to ensure exactly once delivery and coordination across partitions. It also provides up to 20% higher throughput for producers and 50% for consumers through more efficient data formatting and batching. The new features are available in Apache Kafka 0.11 released in June 2017.
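The role of sequence numbers in an idempotent producer can be sketched with a toy partition log. This is a simplified in-memory model and not Kafka's implementation or client API: the `PartitionLog` class is hypothetical, and sequence numbers are assumed to be tracked per record and per producer id.

```python
class PartitionLog:
    """Toy partition that drops retried writes using per-producer sequence numbers."""

    def __init__(self) -> None:
        self.records: list[str] = []
        self.last_seq: dict[int, int] = {}  # producer id -> last accepted sequence

    def append(self, producer_id: int, seq: int, record: str) -> bool:
        """Append the record once; return False if it was a duplicate retry."""
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return False  # already written: the producer retried after a lost ack
        if seq != last + 1:
            raise ValueError(f"gap in sequence: expected {last + 1}, got {seq}")
        self.records.append(record)
        self.last_seq[producer_id] = seq
        return True


if __name__ == "__main__":
    log = PartitionLog()
    log.append(producer_id=1, seq=0, record="a")
    log.append(producer_id=1, seq=1, record="b")
    # The ack for seq=1 was lost, so the producer sends it again.
    log.append(producer_id=1, seq=1, record="b")
    print(log.records)
```

With at-least-once delivery alone, the retried write would be stored twice; the sequence check is what turns the retry into a no-op. Atomic writes across partitions need the transaction machinery, which this sketch does not cover.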
This document discusses Reactive Programming and Reactive Streams. It introduces Reactor, a reactive programming framework, and how it addresses issues like latency in microservices architectures. Reactive Streams provide an interoperable way to work with asynchronous data streams in a non-blocking manner. Streams represent sequences of data that can be processed reactively through operators like map and filter.
Essential ingredients for real time stream processing @Scale by Kartik pParam... - Big Data Spain
This document discusses stream processing at scale. It begins with an introduction and agenda. It then discusses scenarios for stream processing like newsfeeds, cybersecurity, and IoT. It presents the canonical stream processing architecture with data buses, real-time and batch processing, and ingestion/serving tiers. The document dives into the essential ingredients for stream processing: scale, reprocessing, accuracy of results, and easy programmability. It provides examples and strategies for each of these essential ingredients to achieve efficient and accurate stream processing at large scales.
Spark Streaming has supported Kafka since it's inception, but a lot has changed since those times, both in Spark and Kafka sides, to make this integration more fault-tolerant and reliable.Apache Kafka 0.10 (actually since 0.9) introduced the new Consumer API, built on top of a new group coordination protocol provided by Kafka itself.
So a new Spark Streaming integration comes to the playground, with a similar design to the 0.8 Direct DStream approach. However, there are notable differences in usage, and many exciting new features. In this talk, we will cover what are the main differences between this new integration and the previous one (for Kafka 0.8), and why Direct DStreams have replaced Receivers for good. We will also see how to achieve different semantics (at least one, at most one, exactly once) with code examples.
Finally, we will briefly introduce the usage of this integration in Billy Mobile to ingest and process the continuous stream of events from our AdNetwork.
The document describes Hermes, a new replication protocol that aims to provide high performance, strong consistency, and fault tolerance for distributed datastores. Hermes uses an invalidating broadcast-based approach where writes are coordinated by a replica that broadcasts invalidations and value updates to other replicas. It allows for local reads from any replica and fast, decentralized, fully concurrent writes that commit in one network round trip. To handle faults, Hermes propagates write values with invalidations to allow replicas to recover from failures without blocking. The goal is to improve on existing protocols by avoiding serialization bottlenecks while maintaining strong consistency under both normal operation and replica failures.
RabbitMQ with python and ruby RuPy 2009Paolo Negri
The document discusses using RabbitMQ, an open-source message broker based on AMQP, for asynchronous messaging between Python and Ruby applications. It provides an overview of AMQP concepts like producers, consumers, exchanges and queues, and how RabbitMQ implements these using Erlang. Code examples are shown for sending and receiving messages asynchronously in Python and Ruby.
Stream processing in python with Apache Samza and BeamHai Lu
Apache Samza is the streaming engine being used at LinkedIn that processes around 2 trillion messages daily. A while back we announced Samza's integration with Apache Beam, a great success which leads to our Samza Beam API. Now an UPGRADE of our APIs - we're now supporting Stream Processing in Python! This work has made stream processing more accessible and enabled many interesting use cases, particularly in the area of machine learning. The Python API is based on our work of Samza runner for Apache Beam. In this talk, we will quickly review our work on Samza runner, and then how we extended it to support portability in Beam (Python specifically). In addition to technical and architectural details, we will also talk about how we bridged Python and Java ecosystems at LinkedIn with the Python API, together with different use cases.
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Natan Silnitsky
At Wix, we have created a universal event-driven programming infrastructure on top of the Kafka message broker.
This infra makes sure messages are eventually successfully consumed and produced no matter what failure it encounters.
In this talk, you will learn about the features we introduced in order to make sure our distributed system can safely handle an ever growing message throughput in a fault tolerant manner.
You will be introduced to such techniques as retry topics, local persistent queues, and cooperative fibers that help make your flows more resilient and performant.
You will also learn how to make this infra work for all programming languages tech stacks with optimal resource manage using the power of Kubernetes and gRPC.
When to use a client library, and when to deploy an external pod (DaemonSet) or even deploy a sidecar.
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 Peopleconfluent
Tom Crayford discusses his experience running hundreds of Apache Kafka clusters on Heroku with a small team. Some key points discussed include:
- Using automation to manage clusters and reduce manual work required
- Common issues encountered like disk growth from log compaction bugs and addressing them by scanning clusters for anomalies
- Kafka's built-in high availability and how it helped during an AWS EBS failure event
- Novel failure cases encountered like a JVM memory leak from gzip usage and working to fix it
- Importance of taking breaks and not wasting time when operating clusters at scale.
Hai Lu presented on the Samza Portable Runner for Apache Beam. The key points are:
1) The Samza Portable Runner allows stream processing to be done in multiple languages like Python by translating Beam pipelines into the Samza execution engine.
2) It provides a high-level Python SDK for building streaming applications on top of Beam's portability framework. Pipelines are translated from Python into the language-independent Beam representation.
3) Performance is improved through batching/bundling messages between the Python and Java processes to reduce round trips. Initial tests showed throughput increasing with larger bundle sizes.
4) Example use cases demonstrated near real-time image OCR, model training,
This document discusses parallel computing and cloud computing. It describes how compute-intensive bioinformatics tasks like OTU picking that take weeks on a desktop can be accelerated by distributing the workload across many processors. Cloud computing provides pay-as-you-go access to large compute clusters without the overhead of maintaining physical hardware. Public clouds like Amazon allow users to provision virtual machines for running analyses and terminate them when finished.
The document discusses intelligent character recognition using deep learning techniques such as recurrent neural networks and connectionist temporal classification. It begins by motivating the need for intelligent character recognition over traditional optical character recognition due to variations in handwritten text. It then reviews previous approaches using preprocessing, feature extraction, and hybrid neural networks before describing an approach using multidimensional recurrent neural networks with LSTM cells trained end-to-end using CTC loss. The proposed architecture involves stacking multiple bidirectional LSTM layers to capture spatial and temporal dependencies in handwritten text. Results on a public dataset show a character error rate of 15% without a lexicon and 12.6% with a lexicon during testing.
Apache Kafka's rise in popularity as a streaming platform has demanded a revisit of its traditional at-least-once message delivery semantics.
In this talk, we present the recent additions to Kafka to achieve exactly-once semantics (EoS) including support for idempotence and transactions in the Kafka clients. The main focus will be the specific semantics that Kafka distributed transactions enable and the underlying mechanics which allow them to scale efficiently.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
From node.js to Scala - with a 100x performance boost
1. FROM NODE.JS TO SCALA
with a 100x perf. boost!
BY ITAMAR RAVID | MAY 3, 2016
2. AGENDA
WE’LL TALK ABOUT…
• What we do, our challenges and what led us to Scala and Akka;
• How we redesigned our core data processing service;
• Some useful lessons and patterns.
There will be relatively little node.js bashing. Promise.
3. BIGPANDA: THE ANSWER TO ALERT FATIGUE
[Diagram: incoming alerts are grouped by the CorrelationAlgorithm]
Alerts:
• RABBIT IS DOWN!
• NO FREE SPACE!
• INBOUND QUEUE OVERFLOWING!
• OUTBOUND QUEUE OVERFLOWING!
• APPLICATION HEALTH CRITICAL!
• TOO MANY FAILED HTTP REQS!
Alert sources and checks:
• rabbit-1, ping
• rabbit-2, disk
• queue-1, size
• queue-2, size
• app1, health
• app2, 500 codes
Grouped by the CorrelationAlgorithm:
• RabbitMQ cluster: ping, disk
• RabbitMQ node 3: queue size, queue size
• API server: health, failed reqs
4. IN TERMS OF STREAMS…
[Diagram: event sources feed a Normalization Stage, whose output feeds a Correlation Stage]
Event sources:
• Nagios event source
• Datadog event source
• AppDynamics event source
Raw alerts:
• RABBIT IS DOWN!
• NO FREE SPACE!
• INBOUND QUEUE OVERFLOWING!
• OUTBOUND QUEUE OVERFLOWING!
• APPLICATION HEALTH CRITICAL!
• TOO MANY FAILED HTTP REQS!
Normalization Stage output:
• rabbit-1, ping
• rabbit-2, disk
• queue-1, size
• queue-2, size
• app1, health
• app2, 500 codes
Correlation Stage output (CorrelationAlgorithm):
• RabbitMQ cluster: ping, disk
• RabbitMQ node 3: queue size, queue size
• API server: health, failed reqs
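The two stages above can be sketched in plain Scala. This is only an illustration of the shape of the data flow: `RawAlert`, `Event`, the `topology` lookup and the grouping rule are all hypothetical stand-ins, since the slides show the diagram but not the actual data model or correlation algorithm.

```scala
object CorrelationSketch {
  // Hypothetical shapes; the slides do not show the real data model.
  final case class RawAlert(source: String, host: String, check: String, message: String)
  final case class Event(host: String, check: String)

  // Normalization stage: reduce source-specific alerts to a common (host, check) form.
  def normalize(alert: RawAlert): Event =
    Event(alert.host.trim.toLowerCase, alert.check.trim.toLowerCase)

  // Correlation stage (stand-in): group events whose hosts map to the same component.
  // `topology` is an assumed host -> component lookup; unknown hosts form their own group.
  def correlate(events: Seq[Event], topology: Map[String, String]): Map[String, Seq[String]] =
    events
      .groupBy(e => topology.getOrElse(e.host, e.host))
      .map { case (component, grouped) => component -> grouped.map(_.check) }

  // The whole stream: normalize every alert, then correlate the results.
  def run(alerts: Seq[RawAlert], topology: Map[String, String]): Map[String, Seq[String]] =
    correlate(alerts.map(normalize), topology)
}
```

With a topology that maps both `rabbit-1` and `rabbit-2` to "RabbitMQ cluster", the ping and disk alerts end up in one group, mirroring the first box in the diagram.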
21. ACTOR-BASED SOLUTION
[Diagram: actor hierarchy]
Node Manager
• Customer A Pipeline: Kafka Reader, Algorithm runner, Mongo Writer, Rabbit Writer
• Customer B Pipeline
• Customer C Pipeline
Diagram labels: SUPERVISION, MESSAGING, customer_a_inputs
22. NEXT-GEN SOLUTION
[Diagram: actor hierarchy]
Node Manager
• Customer A Pipeline: Kafka Reader, Algorithm runner, Mongo Writer, Rabbit Writer
• Customer B Pipeline
• Customer C Pipeline
Diagram labels: SUPERVISION, MESSAGING, FAILURE ISOLATION, customer_a_inputs
23. NEXT-GEN SOLUTION
[Diagram: actor hierarchy]
Node Manager
• Customer A Pipeline: Kafka Reader, Algorithm runner, Mongo Writer, Rabbit Writer
• Customer B Pipeline
• Customer C Pipeline
Diagram labels: SUPERVISION, MESSAGING, SEPARATE DISPATCHERS FOR QOS-TUNING, customer_a_inputs
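The failure-isolation idea from the diagrams can be approximated without Akka. The sketch below is not the talk's implementation: it swaps actor supervision for plain `scala.util.Try` with a bounded retry, and `Pipeline`, `runAll` and `maxRestarts` are hypothetical names. It only shows that one customer's failure need not affect the others; dispatchers and messaging are not modeled.

```scala
import scala.util.{Failure, Try}

object IsolationSketch {
  // Hypothetical per-customer pipeline: a name plus a processing step that may throw.
  final case class Pipeline(customer: String, process: Int => Int)

  // Hypothetical "node manager": runs each customer's pipeline independently and
  // retries a failed one up to `maxRestarts` times, loosely mimicking a restart policy.
  def runAll(pipelines: Seq[Pipeline], input: Int, maxRestarts: Int): Map[String, Try[Int]] =
    pipelines.map { p =>
      def attempt(restartsLeft: Int): Try[Int] =
        Try(p.process(input)) match {
          case Failure(_) if restartsLeft > 0 => attempt(restartsLeft - 1)
          case result                          => result
        }
      p.customer -> attempt(maxRestarts)
    }.toMap
}
```

In an actor system the retry loop would presumably be replaced by the parent's supervision strategy, with each customer pipeline living under the Node Manager as in the diagram.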
28. PRUNING AN INFINITE DATA STREAM
5 6 7 8 9 10 … N
t=8, OK
MISSING ALERTS :-(
PRUNING STREAMS THAT RESULT IN STATE REQUIRES STATE RECOVERY.
29. PRUNING AN INFINITE DATA STREAM
5 6 7 8 9 10 … N
Snapshot Repository:
• <data …>, lastOffset: 4
• <data …>, lastOffset: 8
• <data …>, lastOffset: 10
ON BOOT, THE LATEST SNAPSHOT IS LOADED AND THE STREAM SEEKS TO THE STORED OFFSET.
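A minimal sketch of the boot-time recovery described above, assuming the state is a simple per-key counter and the log is an in-memory sequence sorted by offset. `State`, `Snapshot`, `Record`, `applyRecord` and `recover` are illustrative names, not taken from the talk.

```scala
object SnapshotSketch {
  // Hypothetical state: a running count of events per key.
  type State = Map[String, Int]
  final case class Snapshot(state: State, lastOffset: Long)
  final case class Record(offset: Long, key: String)

  // Fold a single record into the state.
  def applyRecord(state: State, r: Record): State =
    state.updated(r.key, state.getOrElse(r.key, 0) + 1)

  // On boot: load the latest snapshot (if any), then replay only the records
  // whose offset is greater than the snapshot's stored offset.
  def recover(snapshots: Seq[Snapshot], log: Seq[Record]): Snapshot = {
    val start =
      if (snapshots.isEmpty) Snapshot(Map.empty, -1L)
      else snapshots.maxBy(_.lastOffset)
    log.filter(_.offset > start.lastOffset).foldLeft(start) { (snap, r) =>
      Snapshot(applyRecord(snap.state, r), r.offset)
    }
  }
}
```

Because the snapshot carries both the state and the offset it was taken at, records at or below that offset can be pruned from the stream without losing state.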
30. PRUNING AN INFINITE DATA STREAM
CHALLENGES:
- COMPACTNESS
- SCHEMA EVOLUTION
kryo/chill with a manual de/serializer <=> Map[String, Any]
Schema evolution support with some caveats
Big datasets are only a few MBs in size
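One way to read the "manual de/serializer <=> Map[String, Any]" line is sketched below. It is an assumption about the approach, not the talk's code: `AlertState` and its fields are hypothetical, and the step that hands the resulting map to a binary serializer such as Kryo is left out. The point is that reading missing keys with a default lets older snapshots keep loading after a field is added.

```scala
object SnapshotCodecSketch {
  // Hypothetical snapshot payload; `severity` is imagined as a field added in a later version.
  final case class AlertState(host: String, check: String, severity: String)

  // Manual serializer: flatten the state into a generic map.
  def toMap(s: AlertState): Map[String, Any] =
    Map("host" -> s.host, "check" -> s.check, "severity" -> s.severity)

  // Manual deserializer: keys missing from older snapshots fall back to a default.
  def fromMap(m: Map[String, Any]): AlertState =
    AlertState(
      host = m("host").toString,
      check = m("check").toString,
      severity = m.get("severity").map(_.toString).getOrElse("unknown")
    )
}
```

Renamed or retyped fields would need explicit handling in `fromMap`, which may be the kind of caveat the slide alludes to.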
31. KEY TAKEAWAYS
• USE SNAPSHOTS TO PRUNE STREAMS
• JSON IS NOT THE ONLY SOLUTION!
47. FINAL NUMBERS AND BENEFITS
OVERALL RATE IMPROVEMENT:
~ 16 events/s on a single node.js process at peak
1600-2500 events/s on a single pipeline at peak
ISOLATION: Actor-per-Customer; failure isolation
SCALABILITY: More nodes => more actors; reduced I/O
COMPLETE DETERMINISM: Actions determined entirely by Kafka contents; amazing for debugging!