To capture and analyze the data stream generated by the VRT NU and VRT NWS platforms, we built a high-performance, real-time data streaming platform based on Apache Kafka.
These slides contain the lessons learned and a brief introduction to our open-sourced architecture.
I'll provide guidelines for thinking about empirical performance evaluation of parallel programs in general and of Spark jobs in particular. It's easier to be systematic about this if you think in terms of "what's the effective network bandwidth we're getting?" instead of "how fast does this particular job run?" In addition, the figure of merit for parallel performance isn't necessarily obvious. If you want to minimize your AWS bill, you should almost certainly run on a single node (but your job may take six months to finish). You may think you want answers as quickly as possible, but if you could make a job finish in 55 minutes instead of 60 minutes while doubling your AWS bill, would you do it? No? Then what exactly is the metric that you should optimize?
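A minimal back-of-the-envelope sketch of that framing in Python: the effective-bandwidth and cost formulas follow the reasoning above, but every constant (data volume, runtimes, prices) is an invented example rather than a figure from the talk.

# Put numbers on the trade-off the talk describes: effective bandwidth achieved
# versus a total cost that includes both the cluster and your waiting time.
# All constants below are made-up examples, not measurements from the talk.

def effective_bandwidth_gbps(bytes_processed: float, wall_clock_s: float) -> float:
    """Aggregate bytes processed divided by wall-clock time, in GB/s."""
    return bytes_processed / wall_clock_s / 1e9

def total_cost(wall_clock_s: float, nodes: int, node_usd_per_hour: float,
               your_usd_per_hour: float) -> float:
    """Cluster cost plus the (often dominant) cost of waiting for the answer."""
    hours = wall_clock_s / 3600.0
    return hours * (nodes * node_usd_per_hour + your_usd_per_hour)

# Example: a 2 TB job on 10 nodes (60 min) vs. 20 nodes (55 min).
for nodes, runtime_s in [(10, 60 * 60), (20, 55 * 60)]:
    bw = effective_bandwidth_gbps(2e12, runtime_s)
    cost = total_cost(runtime_s, nodes, node_usd_per_hour=1.0, your_usd_per_hour=50.0)
    print(f"{nodes} nodes: {bw:.2f} GB/s effective, ${cost:.2f} total")

Whether the second configuration "wins" depends entirely on the value you assign to your own waiting time, which is exactly the point being made above.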
Stitch Fix aspires to help you find the style that you will love. Data, the backbone of the business, is used to help with styling recommendations, demand modeling, user acquisition, and merchandise planning and also to influence business decisions throughout the organization. These decisions are backed by algorithms and data collected and interpreted based on client preferences. Neelesh Srinivas Salian offers an overview of the compute infrastructure used by the data science team at Stitch Fix, covering the architecture, tools within the larger ecosystem, and the challenges that the team overcame along the way.
Apache Spark plays an important role in Stitch Fix’s data platform, and the company’s data scientists use Spark for their ETL and Presto for their ad hoc queries. The goal for the team running the compute infrastructure is to understand and make the data scientists’ lives easier, particularly in terms of usability of Spark, by building tools that expedite the process of getting started with Spark and transitioning from an ad hoc to a production workflow. The compute infrastructure is a part of the data platform that is responsible for all the needs of data scientists at Stitch Fix.
Neelesh shares Stitch Fix’s journey, exploring its ad hoc and production infrastructure and detailing its in-house tools and how they work in synergy with open source frameworks in a cloud environment. Neelesh also discusses the additional improvements to the infrastructure that help persist information for future use and optimization and explains how the implementation of Amazon’s EMR FS has helped make it easier to read from the S3 source.
10,000 Foot View of Big Data. Presentation slides by Peter Smith, PhD, ACL, at the Canadian Executive Cloud & DevSecOps Summit, Toronto, May 4, 2018, and Vancouver, May 11, 2018, hosted by TriNimbus.
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ... (Spark Summit)
The central premise of DataXu is to apply data science to better marketing. At its core is the Real-Time Bidding Platform, which processes 2 petabytes of data per day and responds to ad auctions at a rate of 2.1 million requests per second across 5 different continents. Sitting on top of this platform is DataXu’s analytics engine, which gives clients insightful analytics reports that address their marketing business questions. Some common requirements for both of these platforms are the ability to do real-time processing, scalable machine learning, and ad-hoc analytics. This talk will showcase DataXu’s successful use cases of the Apache Spark framework and Databricks to address all of the above challenges while maintaining the agility and rapid prototyping strengths needed to take a product from the initial R&D phase to full production. The team will share their best practices and highlight the steps of large-scale Spark ETL processing and model testing, all the way through to interactive analytics.
DataEngConf: Parquet at Datadog: Fast, Efficient, Portable Storage for Big Data (Hakka Labs)
By Doug Daniels (Director of Engineering, Datadog)
At Datadog, we collect hundreds of billions of metric data points per day from hosts, services, and customers all over the world. In addition to charting and monitoring this data in real time, we also run many large-scale offline jobs to apply algorithms and compute aggregations on the data. In recent months, we’ve migrated our largest data sets over to Apache Parquet, an efficient, portable columnar storage format.
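As a rough sketch of what such a migration can involve (this is not Datadog's actual pipeline; the bucket paths, column names, and partition key below are invented for illustration), converting a day of JSON metrics to partitioned Parquet with PySpark might look like this:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-migration").getOrCreate()

# Hypothetical input path and columns, for illustration only.
metrics = spark.read.json("s3://example-bucket/metrics/2016/06/01/")

# Partition by a low-cardinality column so downstream queries can prune files,
# and let Parquet's columnar layout skip the columns a query never touches.
(metrics
    .write
    .partitionBy("metric_name")
    .mode("overwrite")
    .parquet("s3://example-bucket/metrics-parquet/2016/06/01/"))

# Downstream jobs read back only the partitions and columns they need.
daily_avg = (spark.read
    .parquet("s3://example-bucket/metrics-parquet/2016/06/01/")
    .where("metric_name = 'cpu.user'")
    .groupBy("host")
    .avg("value"))
daily_avg.show()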
Slides from the August 2021 St. Louis Big Data IDEA meeting from Sam Portillo. The presentation covers AWS EMR including comparisons to other similar projects and lessons learned. A recording is available in the comments for the meeting.
Michael Cutler (CTO and cofounder of TUMRA) provides a high-level introduction to Apache Spark in a presentation given at ‘Big Data Week 2014’ #BDW14, held at University College London.
TUMRA were early adopters of Spark after a brief PoC in Dec ’12 and took it to production just a few months later. The main motivation to do so was the inflexibility and high latency of Hadoop Map/Reduce jobs and the knock-on effect for technology that utilises it (Mahout machine learning, Hive data warehousing, Cascading).
With two primary use cases, ‘Ecommerce Personalisation’ and ‘Marketing Automation’, TUMRA currently stream around 29 million ‘user engagement events’ (JSON) each day through Apache Kafka and Spark Streaming, at peak rates of up to 800 events per second.
TUMRA use Apache Spark on Amazon Web Services (EC2) in production for a mix of machine learning model building, graph analytics and near-real-time reporting.
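A rough illustration of that kind of Kafka-to-Spark pipeline follows. TUMRA's 2014 setup would have used the original DStream-based Spark Streaming API; the sketch below uses today's Structured Streaming Kafka source instead, and the broker address, topic name, and event schema are assumptions for illustration, not TUMRA's actual configuration.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StringType, StructType, TimestampType

# Needs the spark-sql-kafka-0-10 package on the classpath (e.g. via --packages).
spark = SparkSession.builder.appName("engagement-events").getOrCreate()

# Hypothetical event schema, broker address, and topic name.
schema = (StructType()
          .add("user_id", StringType())
          .add("event_type", StringType())
          .add("event_time", TimestampType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "user-engagement-events")
       .load())

# Kafka delivers raw bytes; parse the JSON payload into columns.
events = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Count events per type in one-minute windows and print to the console.
counts = (events
          .withWatermark("event_time", "5 minutes")
          .groupBy(window(col("event_time"), "1 minute"), col("event_type"))
          .count())

query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()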
To learn more about how we use Spark and the services we can deliver through our Platform please contact: hello@tumra.com
How to teach your data scientist to leverage an analytics cluster with Presto... (Alluxio, Inc.)
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
How to teach your data scientist to leverage an analytics cluster with Presto, Spark, and Alluxio
Katarzyna Orzechowska, Data Scientist (ING Tech)
Mariusz Derela, DevOps Engineer (ING Tech)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Content presented at a talk on Aug. 29th. The purpose is to inform a fairly technical audience about the primary tenets of Big Data and the Hadoop stack. The talk also includes a walk-through of Hadoop and parts of the Hadoop stack, i.e. Pig, Hive, and HBase.
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...Dataconomy Media
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias Johansson, Lead Developer at Valo.io
Watch more from Data Natives Berlin 2016 here: http://bit.ly/2fE1sEo
Visit the conference website to learn more: www.datanatives.io
Follow Data Natives:
https://www.facebook.com/DataNatives
https://twitter.com/DataNativesConf
Stay Connected to Data Natives by Email: Subscribe to our newsletter to get the news first about Data Natives 2017: http://bit.ly/1WMJAqS
About the Author:
Tobias is the technical lead developer for Valo.io in London. He has a background in the financial sector as a front-office developer but changed track in 2013 to be part of a team building a new real-time analytics platform from the ground up. His goal is to outlive the JVM and his tea addiction. This is his first appearance on the conference scene as a speaker.
Amazon Web Services: Lessons for Architecting Data in the Cloud (Safe Software)
Learn tips for transitioning your data architecture into the cloud. We’ll explore storing data, automating data processing, and delivering data with AWS, with lessons from our experience helping clients deploy their data into Amazon. You’ll discover the standard design patterns we live by, and learn best practices for Amazon services including S3, RDS, Lambda, SNS, and SQS. Plus, get a peek at the future of data delivery as we take a look at AWS API Gateway.
The presentation covers the following topics: 1) Hadoop introduction 2) Hadoop nodes and daemons 3) Architecture 4) Hadoop best features 5) Hadoop characteristics. For further knowledge of Hadoop, refer to this link: http://data-flair.training/blogs/hadoop-tutorial-for-beginners/
Getting real-time analytics for device, application, and business monitoring from trillions of events and petabytes of data, as companies like Netflix, Uber, Alibaba, PayPal, eBay, and Metamarkets do.
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar... (Imam Raza)
Google Next Extended (https://cloudnext.withgoogle.com/) is an annual Google event focusing on Google Cloud technologies. This presentation is from a tech talk held at the Google Next Extended 2017 Karachi event.
The strategic relationship between Hortonworks and SAP enables SAP to resell Hortonworks Data Platform (HDP) and provide enterprise support for their global customer base. This means SAP customers can incorporate enterprise Hadoop as a complement within a data architecture that includes SAP HANA, Sybase, and SAP BusinessObjects, enabling a broad range of new analytic applications.
Vikram Andem Big Data Strategy @ IATA Technology Roadmap (IT Strategy Group)
Vikram Andem, Senior Manager, United Airlines. A case for a Big Data Program and Strategy @ IATA Technology Roadmap 2014, October 13th, 2014, Montréal, Canada.
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010 (Bhupesh Bansal)
Jan 22nd, 2010 Hadoop meetup presentation on Project Voldemort and how it plays well with Hadoop at LinkedIn. The talk focuses on the LinkedIn Hadoop ecosystem: how LinkedIn manages complex workflows, data ETL, data storage, and online serving of 100 GB to TBs of data.
Presented at JAX London
MapReduce begat Hadoop begat Big Data. NoSQL moved us away from the stricture of monolithic storage architectures to fit-for-purpose designs. But, Houston, we still have a problem. Architects are still designing systems as if this were the '70s. SOA went from buzzword to the bank with the emergence and evolution of the cloud and on-demand, right-now elasticity. Yet most systems are still designed to store-then-compute rather than to observe, orient, decide, and act on in-flight data.
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D... (Cloudera, Inc.)
3 Things to Learn About:
*How Apache Kudu enables users to do more than ever before with their Analytic and Operational Databases
*How Cloudera has built two versatile databases to help our customers tackle their hardest problems.
*How the addition of Apache Kudu to this mix will enable new use cases around real-time analytics, internet of things, time series data, and more.
Introduction to Redis and its features.pptx (Knoldus Inc.)
Join us for an interactive session where we'll cover the fundamentals of Redis, practical use cases, and best practices for incorporating Redis into your projects. Whether you're a developer, architect, or system administrator, this session will equip you with the knowledge to harness the full potential of Redis for your applications. Get ready to elevate your understanding of in-memory data storage and revolutionize the way you handle data in your projects with Redis.
On a business level, everyone wants to get hold of the business value and other organizational advantages that big data has to offer. Analytics has arisen as the primary path to business value from big data. Hadoop is not just a storage platform for big data; it’s also a computational and processing platform for business analytics. Hadoop is, however, unsuccessful in fulfilling business requirements when it comes to live data streaming. The initial architecture of Apache Hadoop did not solve the problem of live stream data mining. In summary, the traditional approach of equating big data with Hadoop is false; focus needs to be given to business value as well. Data warehousing, Hadoop, and stream processing complement each other very well. In this paper, we review a few frameworks and products which enable real-time data streaming by providing modifications to Hadoop.
https://www.learntek.org/big-data-and-hadoop-training/
Learntek is a global online training provider for Big Data Analytics, Hadoop, Machine Learning, Deep Learning, IoT, AI, Cloud Technology, DevOps, Digital Marketing, and other IT and management courses.
Big data analytics: Technology's bleeding edge (Bhavya Gulati)
There can be data without information, but there cannot be information without data.
Companies without Big Data Analytics are deaf and dumb, mere wanderers on the web.
Simplifying Real-Time Architectures for IoT with Apache Kudu (Cloudera, Inc.)
3 Things to Learn About:
*Building scalable real time architectures for managing data from IoT
*Processing data in real time with components such as Kudu & Spark
*Customer case studies highlighting real-time IoT use cases
This presentation was inspired by a read of "Time Series Databases" by Ted Dunning & Ellen Friedman.
I have tried to summarize a lot of the previous benchmarks. Hope others find it useful. The slides were compiled in early 2015, so some of the results might have changed, but the core literature should still hold.
The FP7 CODE project will be presented at the Big Data Benchmarking Community call. Here, a high-level overview introduces CODE's vision and shows the progress after 6 months.
Generic data integration for semantic diagnosis support in the project... (Florian Stegmaier)
Currently, the diagnostic process for the course of a disease in hospitals is based on a manual assessment of patient data from different points in time and different modalities (e.g. CT scans vs. MRI). These images are stored in very large data archives (Picture Archiving and Communication System, PACS), while individual datasets can only be queried efficiently to a limited extent due to the lack of meaningful semantic annotations.
This presentation introduces a generic approach for connecting the heterogeneous hospital systems through modern, semantically expressive technologies and making them uniformly queryable. Uniform access with respect to storage form and query paradigm enables high-quality semantic diagnosis support on top of this heterogeneous data landscape.
AIR: Architecture for Interoperable Retrieval on Distributed and Heterogeneou... (Florian Stegmaier)
Nowadays, multimedia data is produced and consumed at an ever-increasing rate. In line with this trend, diverse storage approaches for multimedia data have been introduced. These observations lead to the fact that distributed and heterogeneous multimedia repositories exist, whereas a unified and easy access to the stored multimedia data is not given.
This paper presents an architecture, named AIR, that offers the aforementioned retrieval possibilities. To ensure interoperability, AIR makes use of recently issued standards, namely the MPEG Query Format (MPQF), a multimedia query language, and the JPSearch transformation rules for metadata interoperability.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply them to our own infrastructure and get them to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could be beneficial or limiting for your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
2. Learn
Why we need fancy Big Data frameworks.
What the lambda architecture looks like.
How Twitter used to do real-time analytics.
Why Twitter created Storm.
How Storm works.
6. New Tools
Large-scale computation systems such as Hadoop.
Scalable databases such as Cassandra and Riak.
Easy-to-use frameworks such as Storm and Dempsy.
10. Batch Layer
Stores the immutable master dataset.
Precomputes arbitrary batch views.
Home of batch processing and MapReduce systems such as Hadoop.
11. Serving Layer
Read-only random-access to batch views.
Updated by batch layer.
Indexes batch views.
Home of real-time query systems such as Cloudera Impala for Hadoop.
12. Speed Layer
Compensates for high-latency batch views.
Fast, incremental algorithms.
More complex because of random writes.
Home of Apache HBase or Storm.
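To make the interplay of slides 10-12 concrete, here is a toy, in-memory sketch of the query-time merge the lambda architecture implies: the batch layer's precomputed view covers everything up to the last batch run, and the speed layer's incremental view covers only what arrived since then. Real deployments would use Hadoop, Impala, HBase, or Storm as the slides say; the pageview counters and event shape below are invented purely for illustration.

from collections import defaultdict

# Batch view: pageview counts precomputed over the immutable master dataset.
batch_view = {"/home": 10_400, "/pricing": 3_120}

# Speed view: incrementally updated from events that arrived after the batch run.
speed_view = defaultdict(int)

def absorb_event(url: str) -> None:
    """Speed layer: fast, incremental update (random writes, hence more complex)."""
    speed_view[url] += 1

def query_pageviews(url: str) -> int:
    """Serving-layer query: merge the batch view with the recent delta."""
    return batch_view.get(url, 0) + speed_view.get(url, 0)

for url in ["/home", "/home", "/signup"]:
    absorb_event(url)

print(query_pageviews("/home"))    # 10402
print(query_pageviews("/signup"))  # 1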
20. Know
Why we need fancy Big Data frameworks.
What the lambda architecture looks like.
How Twitter used to do real-time analytics.
Why Twitter created Storm.
How Storm works.