Lambda Architecture for Real-time Big Data, by Trieu Nguyen
Lambda Architecture in Real-time Big Data Projects
Concepts & Techniques: “Thinking with Lambda”
Case studies from real projects
Why is Lambda Architecture the right solution for big data?
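The core idea behind the Lambda Architecture is simple: a batch layer periodically recomputes views over an immutable master dataset, a speed layer absorbs events that arrived since the last batch run, and a serving layer merges both at query time. A minimal Python sketch of that flow (all names here are illustrative, not from the slides):

```python
# Minimal Lambda Architecture sketch: the batch layer recomputes views from
# the immutable master dataset; the speed layer counts events that arrived
# after the last batch run; the serving layer merges both at query time.
from collections import Counter

master_dataset = []        # append-only log of (user, event) pairs
batch_view = Counter()     # recomputed periodically from master_dataset
realtime_view = Counter()  # incremental counts since the last batch run

def ingest(user, event):
    """New events land in the master dataset AND the speed layer."""
    master_dataset.append((user, event))
    realtime_view[user] += 1

def run_batch():
    """Batch layer: full recomputation; the speed layer is then reset."""
    global batch_view
    batch_view = Counter(user for user, _ in master_dataset)
    realtime_view.clear()

def query(user):
    """Serving layer: merge the batch view with the real-time delta."""
    return batch_view[user] + realtime_view[user]

ingest("alice", "click")
ingest("alice", "view")
run_batch()
ingest("alice", "click")   # arrives after the batch run
print(query("alice"))      # 3: two from the batch view plus one real-time
```

Note that the speed layer is disposable: its state is cleared every time the batch layer catches up, which is what keeps the real-time path simple.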
Big Data Day LA 2016 / Hadoop / Spark / Kafka track: Building an Event-oriented... (Data Con LA)
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we’ll follow the flow of data through an end-to-end system built to handle tens of terabytes per day of event-oriented data, providing real-time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. This session is especially recommended for data infrastructure engineers and architects planning, building, or maintaining similar systems.
The right architecture is key for any IT project. This is especially the case for big data projects, where there are no standard architectures which have proven their suitability over years. This session discusses the different big data architectures which have evolved over time, including traditional Big Data architecture and Streaming Analytics architecture as well as the Lambda and Kappa architectures, and presents the mapping of components from both the Open Source and the Oracle stack onto these architectures.
Leveraging Spark to Democratize Data for Omni-Commerce, with Shafaq Abdullah (Databricks)
Insnap, a hyper-personalized ML-based platform acquired by The Honest Company, has been used to build a real-time data platform based on Apache Spark, Cassandra and Redshift. Users’ behavioral and transactional data have been used to build data models and ML models, and to drive use cases for marketing, growth, finance and operations.
Learn how Honest Company has used Spark as a workhorse for 1) collecting, transforming (ETL) and storing data from various sources including MySQL, Mongo, JDE, Google Analytics, Facebook, Localytics and REST APIs; 2) building data models, and aggregating and generating reports on revenue, order fulfillment tracking, data pipeline monitoring and subscriptions; 3) using ML to build models for user acquisition, LTV and recommendation use cases. Spark replaced the monolithic codebase with flexible, scalable and robust pipelines. Databricks helped The Honest Company focus on data instead of maintaining infrastructure. While Honest users got delightful recommendations that improved their experience, data users at Honest understood users much better in terms of segmenting with behavioral information and advanced ML models, leading to increased revenue and retention.
Real-Time Robot Predictive Maintenance in Action (DataWorks Summit)
Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary, bespoke monitoring systems built at tremendous cost.
We believe that it is now possible, using a well-chosen selection of enterprise open source big data projects, to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech makers. We propose to show a working prototype of just such a system, and explain in some detail how it was made.
Our presentation describes a working real-time ML-based anomaly detection system. We show a working industrial robot-analog installed with a wireless movement sensor. Our system scores the data in a cloud-based cluster. For added realism, the system we demonstrate live includes a working augmented-reality headset that can show the real-time status overlaid on the working robot.
This talk demonstrates a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the excellent H2O distributed machine learning tool. Our goal is for our attendees to get a feel for what can be realistically achieved by a few non-genius-level engineers in a few months of effort, using the best in open source technology for real-time streams (Kafka) and machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
Speaker
Cao Yi, MapR
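As a toy stand-in for the kind of scoring such a pipeline performs (this is not the H2O model from the talk, just an illustrative rolling z-score detector), one microservice consuming sensor readings might flag anomalies like this:

```python
# Illustrative stream anomaly scoring: flag a sensor reading as anomalous
# when it deviates from the rolling mean by more than 3 standard deviations.
from collections import deque
from statistics import mean, stdev

class RollingAnomalyDetector:
    def __init__(self, window=20, threshold=3.0):
        self.window = deque(maxlen=window)  # recent readings only
        self.threshold = threshold

    def score(self, value):
        """Return True if `value` is anomalous w.r.t. the rolling window."""
        anomalous = False
        if len(self.window) >= 2:
            mu, sigma = mean(self.window), stdev(self.window)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                anomalous = True
        self.window.append(value)
        return anomalous

detector = RollingAnomalyDetector(window=10)
readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 9.9]
flags = [detector.score(r) for r in readings]
print(flags.index(True))  # prints 10: only the spike at 9.9 is flagged
```

In a production setup this scoring step would sit behind a Kafka consumer, with the flagged events published to a downstream alerting topic.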
Cloud Experience: Data-driven Applications Made Simple and Fast (Databricks)
A complex real-time data workflow implementation is very challenging. This session will describe the architecture of a data platform that provides a single, secure, high-performance system that can be deployed in hybrid cloud architectures. We will present how to support simultaneous, consistent and high-performance access through multiple industry open source and cloud-compatible standards of streaming, table, TSDB, object, and file APIs. A new serverless technology is also used in the architecture to support dynamic and flexible implementations. The presenter will also outline how the platform was integrated with the Spark ecosystem, including AI and ML tools, to simplify the development process.
A series of tweets I posted about my 11-hour struggle to make a cup of tea with my WiFi kettle ended up going viral, was picked up by the national and then international press, and led to thousands of retweets, comments and references in the media. In this session we’ll take the data I recorded on this Twitter activity over the period and use Oracle Big Data Graph and Spatial to understand what caused the breakout and made the tweet go viral, who the key influencers and connectors were, and how the tweet spread over time and geography from my original series of posts in Hove, England.
A thorough review of what makes a great ecosystem for entrepreneurs. It’s not enough to have a great idea or a great team; as an entrepreneur, you need a supporting environment that will help you succeed. In this presentation you can find the main ingredients that create a good ecosystem for startups, along with a review of the main European startup ecosystems.
Spotify in the Cloud: An Evolution of Data Infrastructure, Strata NYC (Josh Baer)
Slides from a presentation given by Alison Gilles and Josh Baer during Strata NYC 2017.
Covers the decisions, challenges and strategy (technical, organizational, people) for migrating the data and processing of Spotify's 2,500-node Hadoop cluster to Google Cloud.
Finally, it touches on Spotify's resulting infrastructure on GCP.
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio (Alluxio, Inc.)
Alluxio Bay Area Meetup March 14th
Join the Alluxio Meetup group: https://www.meetup.com/Alluxio
Alluxio Community slack: https://www.alluxio.org/slack
In this presentation, we start by briefly talking about why configuration management and automation tools are becoming increasingly important, along with our general approach and the community that supports it. We will also provide a comprehensive overview of the technologies used with Puppet, so expect to learn more about Puppet Enterprise, Puppet, PuppetDB, MCollective, Forge and more. Other programs that help people learn about Puppet, like training and certification programs, are also covered.
This presentation explains how open-source Apache Nifi can be used to easily consume AWS Cloud Services. Featuring drag and drop interactions with many cloud capabilities, it enables teams to quickly start handling their big data on the cloud. Both small agile and large enterprise teams can benefit from this easy to learn, rapid to implement approach to data processing. For more information, go to www.calculatedsystems.com.
Marco Pozzan
Power BI consultant & Trainer
A real-time usage scenario for Power BI. This session introduces the theory behind the real-time dashboarding offered by Power BI, then focuses on a practical case of a real-time dataset in hybrid mode, used to build a monitoring dashboard with write-back support so the user can perform what-if analysis.
Designing a social network offers some exciting challenges to engineers. The system needs to operate at scale, to provide a responsive user experience and to be able to inspect user activity in order to both generate new content and improve how the existing content is delivered.
Event-driven architectures are particularly suitable for handling this kind of challenge, and highly scalable messaging systems such as Apache Kafka have been designed specifically to support the requirements of modern high-volume applications.
In this talk we describe how the Crowdmix back-end has been designed as an event-based system running on top of Kafka. We present the overall system architecture and discuss in more detail some of the sub-components processing those events in different fashions, from stream processing to batch processing, passing through a lambda-style cooperation of batch and stream.
We conclude by describing some lessons learned from our one-year journey in implementing and operating the system.
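The batch-and-stream cooperation described above rests on one property of Kafka-like logs: the log is append-only and each consumer tracks its own offset, so a fast streaming consumer and a slow batch consumer can read the same events independently. A minimal in-memory sketch of that idea (illustrative only, no Kafka API involved):

```python
# Toy append-only event log with Kafka-like offset semantics: producers
# append, and each consumer polls from its own offset independently.
class EventLog:
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        """Return all events at or after `offset`, plus the next offset."""
        return self._events[offset:], len(self._events)

log = EventLog()
stream_offset = 0  # streaming consumer: polls often, small batches
batch_offset = 0   # batch consumer: polls rarely, replays large ranges

for e in ("follow", "post", "like"):
    log.append(e)

# The streaming consumer catches up first...
stream_batch, stream_offset = log.read_from(stream_offset)

log.append("comment")

# ...while the batch consumer later replays everything it has not seen.
batch_batch, batch_offset = log.read_from(batch_offset)

print(len(stream_batch), len(batch_batch))  # prints: 3 4
```

Because consumption never mutates the log, the batch path can always be re-run from offset 0, which is exactly the reprocessing guarantee a lambda-style design relies on.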
Discover how the world of big data is evolving and becoming faster, more reliable and better organized, powering many of the cooler new features that you see in the client today!
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2l2Rr6L.
Doug Daniels discusses the cloud-based platform they have built at Datadog and how it differs from a traditional datacenter-based analytics stack. He walks through the decisions they have made at each layer, covers the pros and cons of these decisions and discusses the tooling they have built. Filmed at qconsf.com.
Doug Daniels is a Director of Engineering at Datadog, where he works on high-scale data systems for monitoring, data science, and analytics. Prior to joining Datadog, he was CTO at Mortar Data and an architect and developer at Wireless Generation, where he designed data systems to serve more than 4 million students in 49 states.
Similar to Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to produce real-time insights, by Josh Baer
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017 (Big Data Spain)
Insights can only be as good as the data. The data quality domain is enormously large, so you need to understand your company's pain points to know what to focus on first.
https://www.bigdataspain.org/2017/talk/big-data-big-quality
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
Scaling a backend for a big data and blockchain environment by Rafael Ríos at... (Big Data Spain)
2gether is a financial platform based on Blockchain, Big Data and Artificial Intelligence that allows interaction between users and third-party services in a single interface.
https://www.bigdataspain.org/2017/talk/scaling-a-backend-for-a-big-data-and-blockchain-environment
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017 (Big Data Spain)
All modern Big Data solutions, like Hadoop, Kafka or the rest of the ecosystem tools, are designed as distributed processes and as such include some sort of redundancy for High Availability.
https://www.bigdataspain.org/2017/talk/disaster-recovery-for-big-data
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha... (Big Data Spain)
In this presentation, attendees will see how to speed up existing Hadoop and Spark deployments by just making Apache Ignite responsible for RAM utilization. No code modifications, no new architecture from scratch!
https://www.bigdataspain.org/2017/talk/boost-hadoop-and-spark-with-in-memory-technologies
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ... (Big Data Spain)
This talk shows the power of a new set of tools for data science. It is really easy to start applying these techniques in your current workflow.
https://www.bigdataspain.org/2017/talk/data-science-for-lazy-people-automated-machine-learning
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ... (Big Data Spain)
GPUs in the cloud as Infrastructure as a Service (IaaS) seem like a commodity. However, efficiently distributing deep learning tasks across several GPUs is challenging.
https://www.bigdataspain.org/2017/talk/training-deep-learning-models-on-multiple-gpus-in-the-cloud
Unbalanced data: Same algorithms, different techniques by Eric Martín at Big D... (Big Data Spain)
Unbalanced data is a specific data configuration that appears commonly in nature. Applying machine learning techniques to this kind of data is a difficult process, usually addressed by unbalanced reduction techniques.
https://www.bigdataspain.org/2017/talk/unbalanced-data-same-algorithms-different-techniques
State of the art time-series analysis with deep learning by Javier Ordóñez at... (Big Data Spain)
Time series related problems have traditionally been solved using engineered features obtained by heuristic processes.
https://www.bigdataspain.org/2017/talk/state-of-the-art-time-series-analysis-with-deep-learning
Trading at market speed with the latest Kafka features by Iñigo González at B... (Big Data Spain)
Not long ago only banks and hedge funds could afford automated and high-frequency trading, that is, the ability to send orders to buy commodities at microsecond intervals.
https://www.bigdataspain.org/2017/talk/trading-at-market-speed-with-the-latest-kafka-features
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data... (Big Data Spain)
The shift to stream processing at LinkedIn has accelerated over the past few years. We now have over 200 Samza applications in production processing more than 260B events per day.
https://www.bigdataspain.org/2017/talk/apache-samza-jake-maes
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... (Big Data Spain)
IBM has built a “Data Science Experience” cloud service that exposes Notebook services at web scale.
https://www.bigdataspain.org/2017/talk/the-analytic-platform-behind-ibms-watson-data-platform
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da... (Big Data Spain)
Artificial Intelligence and Data-centric businesses.
https://www.bigdataspain.org/2017/talk/tbc
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017 (Big Data Spain)
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world.
https://www.bigdataspain.org/2017/talk/why-big-data-didnt-end-causal-inference
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at... (Big Data Spain)
The Meme Index will become the standard way to analyze and predict the facts and sensations circulating on the Internet.
https://www.bigdataspain.org/2017/talk/meme-index-analyzing-fads-and-sensations-on-the-internet
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat... (Big Data Spain)
Geotab is a leader in the expanding world of Internet of Things (IoT) and telematics industry with Big Data.
https://www.bigdataspain.org/2017/talk/vehicle-big-data-that-drives-smart-city-advancement
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P... (Big Data Spain)
The talk will focus on explaining why operational databases do not scale due to limitations in legacy transactional management.
https://www.bigdataspain.org/2017/talk/end-of-the-myth-ultra-scalable-transactional-management
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart... (Big Data Spain)
In recent years Machine Learning (ML) and especially Deep Learning (DL) have achieved great success in many areas such as visual recognition, NLP or even aiding in medical research.
https://www.bigdataspain.org/2017/talk/attacking-machine-learning-used-in-antivirus-with-reinforcement
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ... (Big Data Spain)
The primary function of the banking sector is promoting economic activity, which means “commerce”: exchanging what someone produces or has for something that someone else consumes or desires.
https://www.bigdataspain.org/2017/talk/more-people-less-banking-blockchain
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017 (Big Data Spain)
Bol.com has been an early Hadoop user: since 2008, when the cluster was first built for a recommendation algorithm.
https://www.bigdataspain.org/2017/talk/make-the-elephant-fly-once-again
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I wondered, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure and operations point of view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply them to our own infrastructure and make them work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could be beneficial or limiting for your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPath Community)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
3. Who am I?
• Technical Product Owner at Spotify
• Working with fast processing infrastructure
• Previously, building out Spotify’s 2500 node
Hadoop cluster
@l_phant
4. Spotify Launches in 2008
• Instant access to a gigantic catalog of music
• Click to play, instantaneously!
20. In the Beginning…
• Spotify was almost completely on-premise/bare metal
• 2500 node Hadoop cluster, over 10K machines in
production at four globally distributed data centers
• Grew with users: from 1M in 2009, over 100M in 2016
21. Why Move to the Cloud?
• Cloud providers have matured, decreasing in cost while increasing in reliability and variety of services offered
• Owning and operating physical machines is not a
competitive advantage for Spotify
22. Why Google’s Cloud?
• We believe Google's industry-leading background in Big Data technologies will give us a data processing advantage
24. BigQuery
• Ad-hoc and interactive querying service for massive datasets
• Like Hive, but without needing to manage Hadoop and servers
• Leverages Google’s internal tech
• Dremel (query execution engine)
• Colossus (distributed storage)
• Borg (distributed compute)
• Juniper (network)
Source: https://cloud.google.com/blog/big-data/2016/01/bigquery-under-the-hood
25. BigQuery vs. Hive
• Example Query: Find the top 10 songs by
popularity in Spain during October
• BigQuery (1.50 TB processed): 108s
• Hive (15.5 TB processed): 2,647s
Note: Hive performance unoptimized. Version used (0.14), input format (Avro), run on a ~2500-node Yarn cluster. This is not considered to be a thorough benchmark.
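The shape of this query — count streams per track in one country, then take the top N — can be sketched in plain Python. This is an illustrative stand-in for the BigQuery/Hive SQL, with made-up event tuples, not Spotify's actual schema:

```python
from collections import Counter

def top_tracks(plays, country, n=10):
    """Count streams per track for one country and return the top n.

    `plays` is an iterable of (country, track) stream events -- a
    stand-in for the event rows the BigQuery/Hive query would scan.
    """
    counts = Counter(track for c, track in plays if c == country)
    return counts.most_common(n)

plays = [("ES", "Safari"), ("ES", "Safari"), ("ES", "Closer"), ("SE", "Closer")]
print(top_tracks(plays, "ES", n=2))  # [('Safari', 2), ('Closer', 1)]
```

The real queries scan terabytes of event logs; the logic, however, is exactly this group-count-sort pattern.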
26. BigQuery vs. Hive (example #2)
• Example Query: Find the total hours of music
listening in Spain during October
• BigQuery (780 GB processed): 33s
• Hive (15.5 TB processed): 969s
Note: Hive performance unoptimized. Version used (0.14), input format (Avro), run on a ~2500-node Yarn cluster. This is not considered to be a thorough benchmark.
27. Top 10 Tracks in Spain during October 2016
Rank Artist(s) Track Name
1 J Balvin Safari
2 DJ Snake Let Me Love You
3 Ricky Martin Vente Pa' Ca
4 Sebastian Yatra Traicionera
5 Zion & Lennox (feat. J Balvin) Otra Vez
6 Carlos Vives, Shakira La Bicicleta
7 The Chainsmokers Closer
8 Major Lazer (feat. Justin Bieber & MØ) Cold Water
9 Sia The Greatest
10 IAmChino (feat. Pitbull, Yandel & Chacal) Ay Mi Dios
28. Time Spent Listening to
Spotify by users in Spain
during October
Nearly 10,000 Years!
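The "nearly 10,000 years" figure is just a unit conversion on total listening hours. A quick sanity check (the hour total below is illustrative, chosen to match the stated figure, not Spotify's actual number):

```python
def hours_to_years(hours):
    # 1 year ≈ 365.25 days × 24 hours = 8,766 hours
    return hours / (365.25 * 24)

# ~87.7 million listening hours corresponds to ~10,000 years
print(round(hours_to_years(87_660_000)))  # 10000
```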
29. BigQuery at Spotify
• Interactive and ad-hoc querying immediately started to transfer to BigQuery once the data was available on the cloud
• The pace of learning increases as the friction to asking questions decreases
30. Cloud Pub/Sub
• At least once globally distributed message queue
• For high-volume, low-topic-count (<10,000) publish/subscribe workloads
• Like Kafka, but without needing to operate servers and supporting services (ZooKeeper)
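"At least once" is the key semantic above: a message stays in flight until the subscriber acks it, and an unacked message is redelivered, possibly as a duplicate. A toy in-memory model (class and method names are hypothetical, not the Pub/Sub client API):

```python
import collections

class AtLeastOnceQueue:
    """Toy model of at-least-once delivery: messages stay in flight
    until acked; an unacked message is redelivered (possibly as a
    duplicate -- which is why subscribers must be idempotent)."""

    def __init__(self):
        self.pending = collections.deque()
        self.in_flight = {}
        self.next_id = 0

    def publish(self, msg):
        self.pending.append((self.next_id, msg))
        self.next_id += 1

    def pull(self):
        if not self.pending:
            return None
        msg_id, msg = self.pending.popleft()
        self.in_flight[msg_id] = msg
        return msg_id, msg

    def ack(self, msg_id):
        self.in_flight.pop(msg_id, None)

    def nack(self, msg_id):
        # Ack deadline expired / subscriber crashed: redeliver.
        if msg_id in self.in_flight:
            self.pending.appendleft((msg_id, self.in_flight.pop(msg_id)))

q = AtLeastOnceQueue()
q.publish("track_played")
msg_id, msg = q.pull()
q.nack(msg_id)           # subscriber failed before acking
print(q.pull()[1])       # "track_played" is delivered again
```

The real service adds global distribution and automatic ack deadlines, but the contract consumers program against is this one.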
31. Cloud Pub/Sub at Spotify
• 800K events/second? No problem
• P99 Latency of ingestions into ES: 500ms
• Ingestion from globally distributed non-GCP
datacenters is painless
32. • Managed Service for running batch and streaming jobs
• Unified API for batch and streaming modes
• Inspired by internal Google tools like FlumeJava and MillWheel
• Programming model open-sourced as Apache Beam (currently incubating)
Cloud Dataflow
33. • Usually run via Scio: https://github.com/spotify/scio
• Scio provides a Scala API for running Dataflow jobs and provides easy integrations with BigQuery
• New batch processing jobs @Spotify are being
written in Scio/Dataflow
Cloud Dataflow (Batch) at Spotify
34. • Exactly-once stream processing framework
• A replacement for Spark/Flink streaming and Storm workloads at Spotify
• Optimizes for consistency, which can complicate real-time workloads
Cloud Dataflow (Streaming) at Spotify
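One common way to get exactly-once results on top of an at-least-once feed is to deduplicate by message id before aggregating. A minimal sketch of that idea (a simplified illustration, not how Dataflow implements it internally):

```python
def dedup_sum(events):
    """Sum `value` per key, skipping messages whose id was already seen.

    `events` is an iterable of (msg_id, key, value) tuples; duplicates
    (redeliveries) share a msg_id and must not be counted twice.
    """
    seen = set()
    totals = {}
    for msg_id, key, value in events:
        if msg_id in seen:
            continue  # redelivered duplicate -- already counted
        seen.add(msg_id)
        totals[key] = totals.get(key, 0) + value
    return totals

events = [(1, "plays", 1), (2, "plays", 1), (2, "plays", 1)]  # id 2 redelivered
print(dedup_sum(events))  # {'plays': 2}
```

Tracking seen ids is the consistency cost the slide alludes to: the state must itself be stored durably and consistently, which is what complicates real-time workloads.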
35. Spotify + Google Cloud Timeline (2015–2016)
• Beginning of Google Cloud evaluation
• Spotify + Google Cloud announcement
• BigQuery begins to replace Hive
• Cloud Pub/Sub begins to replace Kafka
• Dataflow (streaming) begins to replace Storm
• Dataflow (batch) begins to replace Map/Reduce
Note: Dates are approximations
39. Introducing “Pulsar”
• An internal name for the system aggregating data from Access Points and feeding it into Cloud Pub/Sub
• Replaces the Kafka real-time event feed
42. Dataflow
• Subscribes to critical event Pub/Sub topics
• Aggregates events into minute windows
• Always running, no need to schedule or wait for results
49. Problem
As a developer, I want to be able to instantly explore data being logged by the clients.
50. Solution
• Produce a topic for all employee client events
• Store in Elasticsearch
• Visualize in Kibana
53. Benefits
• Able to understand what's being sent by the client as it happens
• Exploring events, visualizing distributions (e.g. does this field actually get populated?)
• Prototyping analysis based on a sample
• Dashboards for Employee Releases
57. Dataflow to the Rescue!
• We created a library that allows teams to build maps/filters with simple Java code
• Code gets translated into a Dataflow job
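The idea — users chain simple map/filter functions and the library turns the chain into a pipeline — can be sketched with a tiny builder. All names here are hypothetical illustrations, not Spotify's internal library (which is Java, per the slide):

```python
class MiniPipeline:
    """Toy map/filter pipeline builder. Users chain steps; the recorded
    step list is what a real library would translate into a Dataflow job."""

    def __init__(self):
        self.steps = []

    def map(self, fn):
        self.steps.append(("map", fn))
        return self  # return self so calls chain fluently

    def filter(self, pred):
        self.steps.append(("filter", pred))
        return self

    def run(self, records):
        # Locally, just apply the steps in order over an iterable.
        out = records
        for kind, fn in self.steps:
            out = (map if kind == "map" else filter)(fn, out)
        return list(out)

p = MiniPipeline().filter(lambda e: e["type"] == "play").map(lambda e: e["track"])
print(p.run([{"type": "play", "track": "Safari"},
             {"type": "skip", "track": "Closer"}]))  # ['Safari']
```

The payoff of this pattern is that teams write only the per-event logic; execution, scaling, and operations stay with the managed runner.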
60. No Ops!
• For our users:
• Event-feed managed through Cloud Pub/Sub
• Dataflow managed by Google
• Shared Elasticsearch cluster (managed by an
infra team)
61. Low Ops :/
• Dataflow is improving, but it’s had some stability
issues with streaming jobs
• Teams may need to set up their own Elasticsearch cluster if they require a higher SLA than the default
65. Live Results for X-Factor
• X-Factor: television music
competition
• Contest songs get loaded onto Spotify immediately after the show airs
• Listener behavior determines the
order of contestants on the playlist
70. Cloud to the Rescue!
• Spotify has leveled up our ability to gain actionable insights by leveraging Google Cloud tools such as Pub/Sub, Dataflow, and BigQuery
71. The Value of a Fast Feedback Loop
• Detecting problems in data early avoids long backfills or long-term data loss
• Instant insights on newly developed features allow teams to iterate more quickly and take risks
• Providing a faster ad-hoc querying engine allows teams to ask more questions and learn faster
72. Use Anything and Everything
• Open source and other cloud providers offer many alternatives to the stack we've used
• Open-source tools, like Elasticsearch/Kibana, and proprietary solutions, like Tableau, have also been useful additions
74. Stream Processing First
• The sun never sets on Spotify, so why impose boundaries on our datasets?
• What's the shortest distance between two lines? Zero!
• Can we reduce the feedback cycle to zero?