Treasure Data simplifies event analytics for the complex digital
world. Our customers send us 1,000,000 events per second and issue 30,000+ Presto queries everyday to understand their customers better. One of the challenges is designing a cloud database with zero downtime to support a global customer base. We have achieved this goal by developing several open-source technologies; Fluentd and Embulk enable seamless log collection from stream/batch sources, and with MessagePack we can provide an extensible columnar store that accommodates future schema changes. Finally, Presto allows us to serve a wide variety of data processing our customers perform on our service. In this talk, I will present an overview of our system, and how our customers keep using Presto while collecting and extending their data set.
Lessons learned while taking Presto from alpha to production at Twitter. Presented at the Presto meetup at Facebook on 2015.03.22.
Video: https://www.facebook.com/prestodb/videos/531276353732033/
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Matt Fuller
Teradata has been hard at work on Presto, and we want to share with you what we've done so far and our roadmap going forward. From presto-admin, a tool for installing and administering Presto, to YARN/Ambari support, to fully certified JDBC and ODBC drivers, we are committed to making Presto the best, most enterprise-ready SQL-on Hadoop solution out there.
Lessons learned while taking Presto from alpha to production at Twitter. Presented at the Presto meetup at Facebook on 2015.03.22.
Video: https://www.facebook.com/prestodb/videos/531276353732033/
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Matt Fuller
Teradata has been hard at work on Presto, and we want to share with you what we've done so far and our roadmap going forward. From presto-admin, a tool for installing and administering Presto, to YARN/Ambari support, to fully certified JDBC and ODBC drivers, we are committed to making Presto the best, most enterprise-ready SQL-on Hadoop solution out there.
Bullet is an open sourced, lightweight, pluggable querying system for streaming data without a persistence layer implemented on top of Storm. It allows you to filter, project, and aggregate on data in transit. It includes a UI and WS. Instead of running queries on a finite set of data that arrived and was persisted or running a static query defined at the startup of the stream, our queries can be executed against an arbitrary set of data arriving after the query is submitted. In other words, it is a look-forward system.
Bullet is a multi-tenant system that scales independently of the data consumed and the number of simultaneous queries. Bullet is pluggable into any streaming data source. It can be configured to read from systems such as Storm, Kafka, Spark, Flume, etc. Bullet leverages Sketches to perform its aggregate operations such as distinct, count distinct, sum, count, min, max, and average.
An instance of Bullet is currently running at Yahoo against its user engagement data pipeline. We’ll highlight how it is powering internal use-cases such as web page and native app instrumentation validation. Finally, we’ll show a demo of Bullet and go over query performance numbers.
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
Exploring a specific problem of ingesting petabytes of data in Uber and why they ended up building an analytical datastore from scratch using Spark. Then, discuss design choices and implementation approaches in building Hoodie to provide near-real-time data ingestion and querying using Spark and HDFS.
https://spark-summit.org/2017/events/incremental-processing-on-large-analytical-datasets/
Boston Hadoop Meetup: Presto for the EnterpriseMatt Fuller
Presentation Title: Presto for the Enterprise
Presenter(s): Matt Fuller and Kamil Bajda-Pawlikowski
Company: Teradata Center for Hadoop
Short Description:
Teradata will provide a technical overview and demo of Presto, focusing on Presto's architecture and Teradata's contributions to the project and community. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Originally developed by Facebook, Teradata now joins Facebook as the second largest contributor to the open source project. Come join us and learn more about Presto. And how you can join the Presto community.
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
Benjamin Hopp (Solutions Architect) @ Imply:
Druid is an emerging standard in the data infrastructure world, designed for high-performance slice-and-dice analytics (“OLAP”-style) on large data sets.
This talk is for you if you’re interested in learning more about pushing Druid’s analytical performance to the limit.
Perhaps you’re already running Druid and are looking to speed up your deployment, or perhaps you aren’t familiar with Druid and are interested in learning the basics.
Some of the tips in this talk are Druid-specific, but many of them will apply to any operational analytics technology stack.
The most important contributor to a fast analytical setup is getting the data model right.
The talk will center around various choices you can make to prepare your data to get best possible query performance.
We’ll look at some general best practices to model your data before ingestion such as OLAP dimensional modeling (called “roll-up” in Druid), data partitioning, and tips for choosing column types and indexes.
We’ll also look at how more can be less: often, storing copies of your data partitioned, sorted, or aggregated in different ways can speed up queries by reducing the amount of computation needed.
We’ll also look at Druid-specific optimizations that take advantage of approximations; where you can trade accuracy for performance and reduced storage.
You’ll get introduced to Druid’s features for approximate counting, set operations, ranking, quantiles, and more.
And we will finish with the latest and greatest Druid news, including details about the latest roadmap and releases.
Bullet is an open sourced, lightweight, pluggable querying system for streaming data without a persistence layer implemented on top of Storm. It allows you to filter, project, and aggregate on data in transit. It includes a UI and WS. Instead of running queries on a finite set of data that arrived and was persisted or running a static query defined at the startup of the stream, our queries can be executed against an arbitrary set of data arriving after the query is submitted. In other words, it is a look-forward system.
Bullet is a multi-tenant system that scales independently of the data consumed and the number of simultaneous queries. Bullet is pluggable into any streaming data source. It can be configured to read from systems such as Storm, Kafka, Spark, Flume, etc. Bullet leverages Sketches to perform its aggregate operations such as distinct, count distinct, sum, count, min, max, and average.
An instance of Bullet is currently running at Yahoo against its user engagement data pipeline. We’ll highlight how it is powering internal use-cases such as web page and native app instrumentation validation. Finally, we’ll show a demo of Bullet and go over query performance numbers.
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
Exploring a specific problem of ingesting petabytes of data in Uber and why they ended up building an analytical datastore from scratch using Spark. Then, discuss design choices and implementation approaches in building Hoodie to provide near-real-time data ingestion and querying using Spark and HDFS.
https://spark-summit.org/2017/events/incremental-processing-on-large-analytical-datasets/
Boston Hadoop Meetup: Presto for the EnterpriseMatt Fuller
Presentation Title: Presto for the Enterprise
Presenter(s): Matt Fuller and Kamil Bajda-Pawlikowski
Company: Teradata Center for Hadoop
Short Description:
Teradata will provide a technical overview and demo of Presto, focusing on Presto's architecture and Teradata's contributions to the project and community. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Originally developed by Facebook, Teradata now joins Facebook as the second largest contributor to the open source project. Come join us and learn more about Presto. And how you can join the Presto community.
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
Benjamin Hopp (Solutions Architect) @ Imply:
Druid is an emerging standard in the data infrastructure world, designed for high-performance slice-and-dice analytics (“OLAP”-style) on large data sets.
This talk is for you if you’re interested in learning more about pushing Druid’s analytical performance to the limit.
Perhaps you’re already running Druid and are looking to speed up your deployment, or perhaps you aren’t familiar with Druid and are interested in learning the basics.
Some of the tips in this talk are Druid-specific, but many of them will apply to any operational analytics technology stack.
The most important contributor to a fast analytical setup is getting the data model right.
The talk will center around various choices you can make to prepare your data to get best possible query performance.
We’ll look at some general best practices to model your data before ingestion such as OLAP dimensional modeling (called “roll-up” in Druid), data partitioning, and tips for choosing column types and indexes.
We’ll also look at how more can be less: often, storing copies of your data partitioned, sorted, or aggregated in different ways can speed up queries by reducing the amount of computation needed.
We’ll also look at Druid-specific optimizations that take advantage of approximations; where you can trade accuracy for performance and reduced storage.
You’ll get introduced to Druid’s features for approximate counting, set operations, ranking, quantiles, and more.
And we will finish with the latest and greatest Druid news, including details about the latest roadmap and releases.
From Batch to Streaming ET(L) with Apache Apex at Berlin Buzzwords 2017Thomas Weise
https://berlinbuzzwords.de/17/session/batch-streaming-etl-apache-apex
Stream data processing is increasingly required to support business needs for faster actionable insight with growing volume of information from more sources. Apache Apex is a true stream processing framework for low-latency, high-throughput and reliable processing of complex analytics pipelines on clusters. Apex is designed for quick time-to-production, and is used in production by large companies for real-time and batch processing at scale.
This session will use an Apex production use case to walk through the incremental transition from a batch pipeline with hours of latency to an end-to-end streaming architecture with billions of events per day which are processed to deliver real-time analytical reports. The example is representative for many similar extract-transform-load (ETL) use cases with other data sets that can use a common library of building blocks. The transform (or analytics) piece of such pipelines varies in complexity and often involves business logic specific, custom components.
Topics include:
Pipeline functionality from event source through queryable state for real-time insights.
API for application development and development process.
Library of building blocks including connectors for sources and sinks such as Kafka, JMS, Cassandra, HBase, JDBC and how they enable end-to-end exactly-once results.
Stateful processing with event time windowing.
Fault tolerance with exactly-once result semantics, checkpointing, incremental recovery
Scalability and low-latency, high-throughput processing with advanced engine features for auto-scaling, dynamic changes, compute locality.
Recent project development and roadmap.
Following the session attendees will have a high level understanding of Apex and how it can be applied to use cases at their own organizations.
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Databricks
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Airbnb, Comcast, GrubHub, Facebook, FINRA, LinkedIn, Lyft, Netflix, Twitter, and Uber, in the last few years Presto experienced an unprecedented growth in popularity in both on-premises and cloud deployments over Object Stores, HDFS, NoSQL and RDBMS data stores.
Marco Pozzan
Power BI consultant & Trainer
Scenario di utilizzo del real-time di Power BI. In questa sessione verrà introdotta la teoria sul real-time dashboarding offerto da Power BI. Poi ci si focalizzerà sun un caso pratico di real-time dataset in modalità ibrida per la realizzazione di una dashboard di controllo con la possibilità di effettuare il write back e permettere all’utente di effettuare analisi what-if.
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...confluent
(Bruno Simic, Solutions Engineer, Couchbase)
Breakout during Confluent’s streaming event in Munich. This three-day hands-on course focused on how to build, manage, and monitor clusters using industry best-practices developed by the world’s foremost Apache Kafka™ experts. The sessions focused on how Kafka and the Confluent Platform work, how their main subsystems interact, and how to set up, manage, monitor, and tune your cluster.
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumarconfluent
Siphon is a highly available and reliable distributed pub/sub system built using Apache Kafka. It is used to publish, discover and subscribe to near real-time data streams for operational and product intelligence. Siphon is used as a “Databus” by a variety of producers and subscribers in Microsoft, and is compliant with security and privacy requirements. It has a built-in Auditing and Quality control. This session will provide an overview of the use of Kafka at Microsoft, and then deep dive into Siphon. We will describe an important business scenario and talk about the technical details of the system in the context of that scenario. We will also cover the design and implementation of the service, the scale, and real world production experiences from operating the service in the Microsoft cloud environment.
Les objets connectés : de nombreux cas d'usage Jedha Bootcamp
Aujourd'hui, les objets connectés sont partout et nous entourent sans même s'en apercevoir : téléphones, transports, musique, montres, "The Internet of Things" (IoT) a pris une part importante dans notre vie. En nous montrant des cas d'usages des entreprises telles que la NASA, Airbus, Red bull et d'autres, Sean nous expliquera comment ils fonctionnent et comment sont gérées toutes ces données récoltées.
Most data visualisation solutions today still work on data sources which are stored persistently in a data store, using the so called “data at rest” paradigms. More and more data sources today provide a constant stream of data, from IoT devices to Social Media streams. These data stream publish with high velocity and messages often have to be processed as quick as possible. For the processing and analytics on the data, so called stream processing solutions are available. But these only provide minimal or no visualisation capabilities. One was is to first persist the data into a data store and then use a traditional data visualisation solution to present the data.
If latency is not an issue, such a solution might be good enough. An other question is which data store solution is necessary to keep up with the high load on write and read. If it is not an RDBMS but an NoSQL database, then not all traditional visualisation tools might already integrate with the specific data store. An other option is to use a Streaming Visualisation solution. They are specially built for streaming data and often do not support batch data. A much better solution would be to have one tool capable of handling both, batch and streaming data. This talk presents different architecture blueprints for integrating data visualisation into a fast data solution and highlights some of the products available to implement these blueprints.
How bol.com makes sense of its logs, using the Elastic technology stack.Renzo Tomà
Presentation given by Renzo Tomà as "Tech and Use Case Deep Dive", during the Elastic{ON}Tour 2015 event in Amsterdam on October 29th.
Explanation of how bol.com is using the Elastic ELK stack to power a logsearch platform. Lots of details on the types of sources and number of feeds. Some history and reasoning why the current set of in-process JSON based logshippers are used. Links to the bol.com github account for the logshipper projects. The presentation ends with two special sauces: fun things you can do with lots of data in Elasticsearch. The 1st sauce is 'the call stack' - tagging each request with a unique ID, passing that ID along to all service calls and making sure this ID ends up in all access logging, enables you to group all calls together and get a call stack. The 2nd sauce is a way of generating a service map using access logging and some logstash magic.
I love questions and feedback. My mail address can be found in the presentation.
Stream data processing is increasingly required to support business needs for faster actionable insight with growing volume of information from more sources. Apache Apex is a true stream processing framework for low-latency, high-throughput and reliable processing of complex analytics pipelines on clusters. Apex is designed for quick time-to-production, and is used in production by large companies for real-time and batch processing at scale.
This session will use an Apex production use case to walk through the incremental transition from a batch pipeline with hours of latency to an end-to-end streaming architecture with billions of events per day which are processed to deliver real-time analytical reports. The example is representative for many similar extract-transform-load (ETL) use cases with other data sets that can use a common library of building blocks. The transform (or analytics) piece of such pipelines varies in complexity and often involves business logic specific, custom components.
Topics include:
* Pipeline functionality from event source through queryable state for real-time insights.
* API for application development and development process.
* Library of building blocks including connectors for sources and sinks such as Kafka, JMS, Cassandra, HBase, JDBC and how they enable end-to-end exactly-once results.
* Stateful processing with event time windowing.
* Fault tolerance with exactly-once result semantics, checkpointing, incremental recovery
* Scalability and low-latency, high-throughput processing with advanced engine features for auto-scaling, dynamic changes, compute locality.
* Who is using Apex in production, and roadmap.
Following the session attendees will have a high level understanding of Apex and how it can be applied to use cases at their own organizations.
Flink in Zalando's world of Microservices ZalandoHayley
Apache Flink Meetup at Zalando Technology, May 2016
By Javier Lopez & Mihail Vieru, Zalando
In this talk we present Zalando's microservices architecture and introduce Saiki – our next generation data integration and distribution platform on AWS. We show why we chose Apache Flink to serve as our stream processing framework and describe how we employ it for our current use cases: business process monitoring and continuous ETL. We then have an outlook on future use cases.
Berlin Apache Flink Meetup, May 2016
In this talk we present Zalando's microservices architecture and introduce Saiki – our next generation data integration and distribution platform on AWS. We show why we chose Apache Flink to serve as our stream processing framework and describe how we employ it for our current use cases: business process monitoring and continuous ETL. We then have an outlook on future use cases.
By Javier Lopez & Mihail Vieru, Zalando, Zalando SE
Similar to Presto @ Treasure Data - Presto Meetup Boston 2015 (20)
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Taro L. Saito
Scala can be used for developing both frontend (Scala.js) and backend (Scala JVM) applications. A missing piece has been bridging these two worlds using Scala. We built Airframe RPC, a framework that uses Scala traits as a unified RPC interface between servers and clients. With Airframe RPC, you can build HTTP/1 (Finagle) and HTTP/2 (gRPC) services just by defining Scala traits and case classes. It simplifies web application design as you only need to care about Scala interfaces without using existing web standards like REST, ProtocolBuffers, OpenAPI, etc. Scala.js support of Airframe also enables building interactive Web applications that can dynamically render DOM elements while talking with Scala-based RPC servers. With Airframe RPC, the value of Scala developers will be much higher both for frontend and backend areas.
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Taro L. Saito
Arm Treasure Data utilizes Presto as the query engine processing over 1 million queries per day to support the data business of 500+ companies in three regions; US, EU, and Asia. Arm Treasure Data had been using Presto 0.205 and in 2019 started a big migration project to Presto 317. Although we performed extensive query simulations to check any incompatibilities, we faced many unexpected challenges during the migration at production.
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Taro L. Saito
Scala is a powerful language; You can build front-end applications with Scala.js, and efficient backend application servers for JVM. In this session, we will learn how to build everything with Scala by using Airframe OSS framework.
Airframe is a library designed for maximizing the advantages of Scala as a hybrid of object-oriented and functional programming language. In this session, we will learn how to use Airframe to build REST APIs and RPC (with Finagle or gRPC) services, and how to create frontend applications in Scala.js that interact with the servers using functional interfaces for dynamically updating web pages.
Airframe RPC is a framework for building RPC services by using Scala as a unified RPC interface between servers and clients. It supports Finagle (HTTP/1) and gRPC (HTTP/2) backend, and even Scala.js for web application development.
Talk video: https://www.youtube.com/watch?v=qf8wOc2YHmQ&feature=youtu.be
Documentation: https://wvlet.org/airframe/docs/airframe-rpc
Demo source code: https://github.com/wvlet/airframe/tree/master/examples/rpc-examples
Airframe Meetup #3: 2019 Updates & AirSpecTaro L. Saito
Presentation slides of Airframe Meetup #3 https://airframe.connpass.com/event/148169/
- Airframe 19 Milestone
- AirSpec: A new testing library for Scala
-
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
Presentation at Presto Conference Tokyo 2019
- Arm Treasure Data
- Plazma DB Indexes
- Real-time, Archive Storages
- Schema-on-read data processing
- Physical partition maintenance via presto-stella plugin
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoTaro L. Saito
Silk is a framework for building dataflows in Scala. In Silk users write data processing code with collection operators (e.g., map, filter, reduce, join, etc.). Silk uses Scala Macros to construct a DAG of dataflows, nodes of which are annotated with variable names in the program. By using these variable names as markers in the DAG, Silk can support interruption and resume of dataflows and querying the intermediate data. By separating dataflow descriptions from its computation, Silk enables us to switch executors, called weavers, for in-memory or cluster computing without modifying the code. In this talk, we will show how Silk helps you run data-processing pipelines as you write the code.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Presto @ Treasure Data - Presto Meetup Boston 2015
1. Designing An Evolving
Database Service with Presto
Taro L. Saito
leo@tresaure-data.com
Oct 6th, 2015.
Presto Meetup @ Boston
2. Presto Usage at Treasure Data
2
• 100~ customers are actively using Presto
• 30,000~ Presto queries every day
• Importing 1,000,000~ records / sec.
Import Export
Store Analyze with
Presto/Hive
3. Mobile and Web Sources
Mobile SDKs
JavaScript SDK
(web access logs)
3
12. GAMING
BEFORE
AFTER
Daily Upload Delay of 1-2 days
2500+ servers
Real-time
Real-time
2500+ servers
1 Billion records/day
• Reduced TCO
• Real-time collection
• Real-time access to KPIs
Top 10 globally; 40M+ users
x 20
12
13. AD TECH
Publishers’ Dashboard Advertisers’ Dashboard
• 800 B/month
• Live in 2 weeks with 1 engineer!
• 300% growth
Europe’s largest mobile ad-exchange
More than 50 billion impressions/month
13
15. Challenges
• Handle Huge Query Result Output
• SELECT */ CREATE TABLE AS /INSERT INTO
• Parallel Result Upload to S3
• Bypass JSON result generation at the coordinator
• td-presto connector
• Accesses MessagePack based columnar store
• Handle S3 access retry / pipelining
• Future:
• Better query plan visualization
• Quickly find the performance bottleneck and memory consuming tasks
• Storing intermediate query results to disks
• Process large joins, query resource limitation
15
16. Extensible Schema
SQL via Hive, Presto
Unlimited Users, Queries
Enterprise Apps
Enterprise Apps Data Science
Tools
REST API
Ingestion: Streaming, Bulk
BI Tools
treasuredata.com/request_demo