This is the story of the ten-year journey of a high-load data ingestion pipeline that handles 500 million posts daily, from choreography-based design to fully centralized orchestration, and the valuable lessons learned along the way.
Fighting Against Chaotically Separated Values with EmbulkSadayuki Furuhashi
We created a plugin-based data collection tool that can read any chaotically formatted files called "CSV" by guessing its schema automatically
Talked at csv,conf,v2 in Berlin
http://csvconf.com/
OpenSource API Server based on Node.js API framework built on supported Node.js platform with Tooling and DevOps. Use cases are Omni-channel API Server, Mobile Backend as a Service (mBaaS) or Next Generation Enterprise Service Bus. Key functionality include built in enterprise connectors, ORM, Offline Sync, Mobile and JS SDKs, Isomorphic JavaScript and Graphical API creation tool.
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...Ontico
NoSQL key-value — популярное решение, но какие преимущества и какой ценой оно даёт?
Скорость? Возможно, но ценой урезанного, по сравнению с реляционными базами данных, функционала. Но данные и приложение всё еще разделены сетевым стеком, а иногда и десятками километров оптоволокна. В крупном проекте, работающем на десятках или сотнях серверов нельзя обеспечить высокую скорость доступа к данным с каждой машины. Если хранилище достаточно быстро, то время обработки запроса окажется значительно меньше затрат на работу с сетью, а производительность приложения будут определять сетевые задержки и частота запросов к БД.
В проекте Облако@Mail.Ru мы ушли от использования чистых key-value хранилищ в пользу микросервисов на Tarantool, что позволило свести общение с хранилищем данных к минимуму.
Да, Tarantool — это еще одна NoSQL база данных, но еще это полноценный сервер приложений. Приложений, расположенных рядом с данными!
Я расскажу, как мы пришли к использованию микросервисов на основе Tarantool. Приведу несколько сценариев использования, которые работают в Облаке и могут быть легко адаптированы для другого web-проекта. Вы узнаете о компонентах, которые разработаны и опубликованы нами уже сейчас, и о дальнейших планах развития.
Fighting Against Chaotically Separated Values with EmbulkSadayuki Furuhashi
We created a plugin-based data collection tool that can read any chaotically formatted files called "CSV" by guessing its schema automatically
Talked at csv,conf,v2 in Berlin
http://csvconf.com/
OpenSource API Server based on Node.js API framework built on supported Node.js platform with Tooling and DevOps. Use cases are Omni-channel API Server, Mobile Backend as a Service (mBaaS) or Next Generation Enterprise Service Bus. Key functionality include built in enterprise connectors, ORM, Offline Sync, Mobile and JS SDKs, Isomorphic JavaScript and Graphical API creation tool.
Tarantool как платформа для микросервисов / Антон Резников, Владимир Перепели...Ontico
NoSQL key-value — популярное решение, но какие преимущества и какой ценой оно даёт?
Скорость? Возможно, но ценой урезанного, по сравнению с реляционными базами данных, функционала. Но данные и приложение всё еще разделены сетевым стеком, а иногда и десятками километров оптоволокна. В крупном проекте, работающем на десятках или сотнях серверов нельзя обеспечить высокую скорость доступа к данным с каждой машины. Если хранилище достаточно быстро, то время обработки запроса окажется значительно меньше затрат на работу с сетью, а производительность приложения будут определять сетевые задержки и частота запросов к БД.
В проекте Облако@Mail.Ru мы ушли от использования чистых key-value хранилищ в пользу микросервисов на Tarantool, что позволило свести общение с хранилищем данных к минимуму.
Да, Tarantool — это еще одна NoSQL база данных, но еще это полноценный сервер приложений. Приложений, расположенных рядом с данными!
Я расскажу, как мы пришли к использованию микросервисов на основе Tarantool. Приведу несколько сценариев использования, которые работают в Облаке и могут быть легко адаптированы для другого web-проекта. Вы узнаете о компонентах, которые разработаны и опубликованы нами уже сейчас, и о дальнейших планах развития.
Clojure has always been good at manipulating data. With the release of spec and Onyx (“a masterless, cloud scale, fault tolerant, high performance distributed computation system”) good became best. In this talk you will learn about a streaming data layer architecture build around Kafka and Onyx that is self-describing, declarative, scalable and convenient to work with for the end user. The focus will be on the power and elegance of describing data and computation with data; the inferences and automations that can be built on top of that; and how and why Clojure is a natural choice for tasks that involve a lot of data manipulation, touching both on functional programming and lisp-specifics such as code-is-data.
We will look at how such an approach can be used to manage a data warehouse by automatically inferring materialized views from raw incoming data or other views based on a combination of heuristics, statistical analysis (seasonality, outlier removal, ...) and predefined ontologies. Doing so is a practical way to maintain a large number of views, increasing their availability and abstracting the complexity into declarative rules, rather than having an ETL pipeline with dozens or even hundreds of hand crafted tasks.
The system described requires relatively little effort upfront but can easily grow with one's needs both in terms of scale as well as scope. With its good introspection capabilities and strong decoupling it is for instance an excellent substrate for putting machine learning algorithms in production, which is the final use-case we will dive into.
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Reactivesummit
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially use cases where the amount of work increases as you process make you really value the back-pressure.
This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
In addition, we will also provide some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaAkara Sucharitakul
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially use cases where the amount of work increases as you process make you really value the back-pressure.
This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
In addition, we will also provide some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Dan Halperin
Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included.
SQL Server 2008
In this event we'll take a look at what's coming down the line for database developers with SQL Server 2008. We'll look at changes to the core engine and the T-SQL language, any changes in the toolset and we'll also take a good look at what's coming with ADO.NET V3.0 in terms of the new Entity Framework and the new Microsoft Synchronisation Framework. If you're a SQL Server developer come along and see what's in store for 2008.
For more details and the original slidedeck visit http://www.microsoft.com/uk/msdn/events/new/Detail.aspx?id=322
Enterprise data is moving into Hadoop, but some data has to stay in operational systems. Apache Calcite (the technology behind Hive’s new cost-based optimizer, formerly known as Optiq) is a query-optimization and data federation technology that allows you to combine data in Hadoop with data in NoSQL systems such as MongoDB and Splunk, and access it all via SQL.
Hyde shows how to quickly build a SQL interface to a NoSQL system using Calcite. He shows how to add rules and operators to Calcite to push down processing to the source system, and how to automatically build materialized data sets in memory for blazing-fast interactive analysis.
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lightbend
Akka Streams and its amazing handling of streaming with back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially in use cases where the amount of work continues to increase as you’re processing it. This is where back-pressure really shines.
In this talk for Architects and Dev Managers by Akara Sucharitakul, Principal MTS for Global Platform Frameworks at PayPal, Inc., we look at how back-pressure based on Akka Streams and Kafka is being used at PayPal to handle very bursty workloads.
In addition, Akara will also share experiences in creating a platform based on Akka and Akka Streams that currently processes over 1 billion transactions per day (on just 8 VMs), with the aim of helping teams adopt these technologies. In this webinar, you will:
*Start with a sample web crawler use case to examine what happens when each processing pass expands to a larger and larger workload to process.
*Review how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
*Look at lessons learned, plus some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
"What I learned through reverse engineering", Yuri ArtiukhFwdays
In recent years, I have gained most of my knowledge through reverse engineering, how I did it and what I learned during this period, I decided to share. All this concerns graphic programming, performance, best practices in the frontend.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
More Related Content
Similar to "From Orchestration to Choreography and Back", Yevhen Bobrov
Clojure has always been good at manipulating data. With the release of spec and Onyx (“a masterless, cloud scale, fault tolerant, high performance distributed computation system”) good became best. In this talk you will learn about a streaming data layer architecture build around Kafka and Onyx that is self-describing, declarative, scalable and convenient to work with for the end user. The focus will be on the power and elegance of describing data and computation with data; the inferences and automations that can be built on top of that; and how and why Clojure is a natural choice for tasks that involve a lot of data manipulation, touching both on functional programming and lisp-specifics such as code-is-data.
We will look at how such an approach can be used to manage a data warehouse by automatically inferring materialized views from raw incoming data or other views based on a combination of heuristics, statistical analysis (seasonality, outlier removal, ...) and predefined ontologies. Doing so is a practical way to maintain a large number of views, increasing their availability and abstracting the complexity into declarative rules, rather than having an ETL pipeline with dozens or even hundreds of hand crafted tasks.
The system described requires relatively little effort upfront but can easily grow with one's needs both in terms of scale as well as scope. With its good introspection capabilities and strong decoupling it is for instance an excellent substrate for putting machine learning algorithms in production, which is the final use-case we will dive into.
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Reactivesummit
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially use cases where the amount of work increases as you process make you really value the back-pressure.
This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
In addition, we will also provide some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaAkara Sucharitakul
Akka Streams and its amazing handling of stream back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially use cases where the amount of work increases as you process make you really value the back-pressure.
This talk takes a sample web crawler use case where each processing pass expands to a larger and larger workload to process, and discusses how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
In addition, we will also provide some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Dan Halperin
Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included.
SQL Server 2008
In this event we'll take a look at what's coming down the line for database developers with SQL Server 2008. We'll look at changes to the core engine and the T-SQL language, any changes in the toolset and we'll also take a good look at what's coming with ADO.NET V3.0 in terms of the new Entity Framework and the new Microsoft Synchronisation Framework. If you're a SQL Server developer come along and see what's in store for 2008.
For more details and the original slidedeck visit http://www.microsoft.com/uk/msdn/events/new/Detail.aspx?id=322
Enterprise data is moving into Hadoop, but some data has to stay in operational systems. Apache Calcite (the technology behind Hive’s new cost-based optimizer, formerly known as Optiq) is a query-optimization and data federation technology that allows you to combine data in Hadoop with data in NoSQL systems such as MongoDB and Splunk, and access it all via SQL.
Hyde shows how to quickly build a SQL interface to a NoSQL system using Calcite. He shows how to add rules and operators to Calcite to push down processing to the source system, and how to automatically build materialized data sets in memory for blazing-fast interactive analysis.
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lightbend
Akka Streams and its amazing handling of streaming with back-pressure should be no surprise to anyone. But it takes a couple of use cases to really see it in action - especially in use cases where the amount of work continues to increase as you’re processing it. This is where back-pressure really shines.
In this talk for Architects and Dev Managers by Akara Sucharitakul, Principal MTS for Global Platform Frameworks at PayPal, Inc., we look at how back-pressure based on Akka Streams and Kafka is being used at PayPal to handle very bursty workloads.
In addition, Akara will also share experiences in creating a platform based on Akka and Akka Streams that currently processes over 1 billion transactions per day (on just 8 VMs), with the aim of helping teams adopt these technologies. In this webinar, you will:
*Start with a sample web crawler use case to examine what happens when each processing pass expands to a larger and larger workload to process.
*Review how we use the buffering capabilities in Kafka and the back-pressure with asynchronous processing in Akka Streams to handle such bursts.
*Look at lessons learned, plus some constructive “rants” about the architectural components, the maturity, or immaturity you’ll expect, and tidbits and open source goodies like memory-mapped stream buffers that can be helpful in other Akka Streams and/or Kafka use cases.
"What I learned through reverse engineering", Yuri ArtiukhFwdays
In recent years, I have gained most of my knowledge through reverse engineering, how I did it and what I learned during this period, I decided to share. All this concerns graphic programming, performance, best practices in the frontend.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
"Micro frontends: Unbelievably true life story", Dmytro PavlovFwdays
A real life story about the experience of using Micro frontends in an existing Enterprise product. Problems and their solutions on the way from the integration of a separate component to an extensible No-code platform.
"Objects validation and comparison using runtime types (io-ts)", Oleksandr SuhakFwdays
A common task in modern JS is parsing, validating and then comparing JSON objects. In this talk I will quickly go through most common ways to parse/validate and compare objects we use today and then focus more on how runtime types (based on io-ts) can help make such tasks easier and quicker to implement.
"JavaScript. Standard evolution, when nobody cares", Roman SavitskyiFwdays
Should we take a look at JavaScript when everyone is writing in TypeScript? What happens to the standard? What did we get last year? What new features can we expect this and next year? And most importantly, when will Observer be standardized?
Let's try to answer all these questions and even a little more, dream about the future, and enjoy that Observer is alive (or not).
"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...Fwdays
Case study of how small team in Preply started with inheriting an existing ranking model to being able to produce a model per day. In this talk we'll cover steps to take if you find yourself in a similar situation: what kind of technology and processes can you introduce in order to achieve a great speedup in a development speed.
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil TopchiiFwdays
In my talk, I will tell about the world of GenAI services beyond GPT-wrappers and how we developed and scaled GenAI-centric applications. I'll share personal experiences about the obstacles, lessons, and strategic tools and methodologies that were key in taking GenAI applications from 0 to 1. I'll talk about the challenges we faced when launching LLM-based and image generative applications and delivering them to end users, and what conclusions and solutions were made.
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
Python engineers are introduced to the transformative potential of Large Language Models (LLMs) in the realm of advanced data analysis and the application of Semantic Kernel techniques. We will talk about how LLMs like ChatGPT can be integrated into Python environments to automate data processing, enhance predictive modeling, and unlock deeper insights from complex datasets. The session will delve into practical strategies for embedding Semantic Kernel methods within Python projects, illustrating how these advanced techniques can refine the accuracy of machine learning models by embedding domain-specific knowledge directly into the analysis process. Attendees will leave with a clear roadmap for leveraging the combined power of LLMs and Semantic Kernels, equipped with actionable knowledge to drive innovation in their data analysis projects and beyond, marking a significant leap forward in the evolution of Python engineering practices.
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
Federated learning. Algorithmic solution to the problem of privacy preserving ML. Pieces involved to support the training with NVIDIA Flare as example. How newest legislation affects federated learning.
"What is a RAG system and how to build it",Dmytro SpodaretsFwdays
Today, large language models are becoming an integral part of almost every IT solution. However, their use is often accompanied by certain limitations, such as the relevance of information or its depth and specificity. One of the ways to overcome these limitations is the method of working with LLMs - RAG (Retrieval Augmented Generation).
In an ideal world, you would write Python code and then it would work perfectly. But unfortunately, it doesn't work in this manner. In my talk, I'll cover how to efficiently debug your programs, especially in cloud environments or inside Kubernetes.
MLOps (Machine Learning Operations) is a recent buzzword, that trends a lot. Let's figure out together how maintaining applications with machine learning components is significantly different from maintaining applications without them.
We will look into MLOps best practices and typical problems and their implementations/solutions in real world production.
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
Ever seen a code base where understanding a simple method meant jumping through tangled class hierarchies? We all have! And while "Favor composition over inheritance!" is almost as old as object-oriented programming, strictly avoiding all types of subclassing leads to verbose, un-Pythonic code. So, what to do?
The discussion on composition vs. inheritance is so frustrating because far-reaching design decisions like this can only be made with the ecosystem in mind – and because there's more than one type of subclassing!
Let's take a dogma-free stroll through the types of subclassing through a Pythonic lens and untangle some patterns and trade-offs together. By the end, you'll be more confident in deciding when subclassing will make your code more Pythonic and when composition will improve its clarity.
"Distributed graphs and microservices in Prom.ua", Maksym KindritskyiFwdays
The current architecture of Prom.ua is built on microservices and GraphQL API, but it was not always like that. In this talk, I'll tell you how far we've come and how we've made using graphs in a microservice architecture convenient and simple. I will talk about the problems we faced and how we overcame them, made our development process more accessible, deployments faster, and the remains of the monolith less loaded.
"Rethinking the existing data loading and processing process as an ETL exampl...Fwdays
ETL stands for extract, transform, load. It's a process that combines data from different sources into a single repository for further processing, analysis, and utilization.
This talk provides an example of how pandas can be used to solve ETL tasks as a stage in the evolution of the data intake component. This involves preliminary validation, filtering, and conversion of data according to a set of business rules and internal representation, with intermediate combination with other sources.
"How Ukrainian IT specialist can go on vacation abroad without crossing the T...Fwdays
I’m confident that many IT professionals are currently facing the same situation I was in a few months ago. Mobilization, uncertainty. How can I be maximally beneficial to the country with my experience and continue professional development in such circumstances? Since the onset of the full-scale invasion, I've been actively volunteering and assisting the army. Mobilization became the next logical step.
I want to share:
My journey in IT, volunteering, and the beginning of my service in the Armed Forces
Impressions from the first few months
Which Soft Skills are helpful in this context
I aim to dispel myths about the mobilization process and projects of the Armed Forces. Address your questions
And yes, military personnel can travel abroad during their leave.
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...Fwdays
The leader must be strong all the time. The leader cannot afford to make mistakes, let alone fail in front of their team. Is that really true? Nick Gicinto, a cybersecurity leader with over 25 years of experience, who has worked for the CIA and has built security systems from scratch at Tesla and Uber, fully hiring teams for these projects, will talk about the importance of being vulnerable to build trust within a team.
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...Fwdays
Sharing open feedback can be difficult because it equals much work on yourself. However, feedback needs attention and a special place in the corporate culture. It helps to grow dynamically, build a team of like-minded people and achieve powerful results.
In the presentation, I will talk about:
The ability to work with feedback as a soft, solid skill in developing technical specialists.
A list of difficulties that prevent quality work with feedback.
The 4A Framework is a tool for successful giving and receiving feedback.
I will also help specialists learn the following:
Form constructive feedback and understand how and when to give it.
Work analytically with the received feedback.
Feel free to share your thoughts and be heard.
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...Fwdays
Will discuss:
Current communication challenges, including mishaps and toxic versus productive interactions.
Ever wondered about PDP? It’s likely because its relevance to career planning, even outside your current company, hasn’t been fully spotlighted.
Exploring how PDP functions within career planning, applicable even if you’re eyeing an exit.
“Who do I aspire to become?”
Summarizing key points with a reference to a practical form you can download to use.
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...Fwdays
This talk will reveal four destructive communication patterns that can undermine team spirit, reduce productivity and cause conflict, and offer effective strategies for neutralizing them.
Let's start with exciting storytelling about a fictional team of developers working on Scrum. You will learn about situations that their team member noticed during team meetings.
Next, we will analyze "The Gottman Four Horsemen" model, which describes the four "horsemen of the apocalypse" of work relationships: criticism, defensiveness, contempt, and stonewalling. For each of these patterns, specific "antidotes" will be offered that allow you to build healthier and more productive relationships in the team.
Finally, we'll look at why this topic is critical to team productivity, drawing on Google's "Project Aristotle" research. Special attention will be paid to the concept of psychological safety, which is a key factor in the success of high-performance teams.
This talk will not only provide valuable insights and tools for improving communication and management in Tech teams, but will also help each member better understand their own contribution to the overall success of the team.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Enhancing Performance with Globus and the Science DMZGlobus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
14. TPL DataFlow
var fetch = new TransformBlock(async url =>
{
using (var web = new WebClient())
return (url, await
web.DownloadAsync(url));
});
var save = new ActionBlock(async data => {
(string url, byte[] image) = data;
await File.WriteAllBytesAsync(filePath,
image);
});
fetch.LinkTo(save); // links the output from the TransformBlock to
the ActionBlock
16. evolution conundrum
extend
o validate
o normalize
o detect sentiment
o detect language
o match
o delay
o send to client
Microservices, anyone?
vs split
o validate
o normalize
o detect sentiment
o detect language
o match
o delay
o send to client
17. disintegration reasons
+ + +
+
service
functionality
code
volatility
scalability &
throughput
fault
tolerance
data
security
Architecture, The Hard Parts by Neal Ford
(https:www.youtube.com/@thoughtworks)
36. reintegration reasons
Architecture, The Hard Parts by Neal Ford
(https:www.youtube.com/@thoughtworks)
database
transactions
structural (data)
dependencies
workflow and
choreography
45. complexity distribution
o validate
o normalize
o detect sentiment
o detect language
…
o match
o delay
o send to client
o detect objects
o detect OCR
o send to client o match keywords
o send to client
o extract
o highlight
o send to client
46. big ball of …
o validate
o normalize
o detect sentiment
o detect language
…
o match
o delay
o send to client
o detect objects
o detect OCR
o send to client
o match keywords
o send to client
o extract
o highlight
o send to client
[isFresh
]
[hasImage
]
[noOCR
]
[highPriority]
[isFromChanne
l]
[hasText]
56. When forming analogies, this bias
may lead individuals to selectively
focus on similarities between
two things while ignoring
differences, even if the analogy
is not valid.
Confirmation Bias
the end