Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Taro L. Saito
Scala is a powerful language; You can build front-end applications with Scala.js, and efficient backend application servers for JVM. In this session, we will learn how to build everything with Scala by using Airframe OSS framework.
Airframe is a library designed for maximizing the advantages of Scala as a hybrid of object-oriented and functional programming language. In this session, we will learn how to use Airframe to build REST APIs and RPC (with Finagle or gRPC) services, and how to create frontend applications in Scala.js that interact with the servers using functional interfaces for dynamically updating web pages.
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
Presentation at Presto Conference Tokyo 2019
- Arm Treasure Data
- Plazma DB Indexes
- Real-time, Archive Storages
- Schema-on-read data processing
- Physical partition maintenance via presto-stella plugin
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Taro L. Saito
Scala can be used for developing both frontend (Scala.js) and backend (Scala JVM) applications. A missing piece has been bridging these two worlds using Scala. We built Airframe RPC, a framework that uses Scala traits as a unified RPC interface between servers and clients. With Airframe RPC, you can build HTTP/1 (Finagle) and HTTP/2 (gRPC) services just by defining Scala traits and case classes. It simplifies web application design as you only need to care about Scala interfaces without using existing web standards like REST, ProtocolBuffers, OpenAPI, etc. Scala.js support of Airframe also enables building interactive Web applications that can dynamically render DOM elements while talking with Scala-based RPC servers. With Airframe RPC, the value of Scala developers will be much higher both for frontend and backend areas.
Airframe RPC is a framework for building RPC services by using Scala as a unified RPC interface between servers and clients. It supports Finagle (HTTP/1) and gRPC (HTTP/2) backend, and even Scala.js for web application development.
Talk video: https://www.youtube.com/watch?v=qf8wOc2YHmQ&feature=youtu.be
Documentation: https://wvlet.org/airframe/docs/airframe-rpc
Demo source code: https://github.com/wvlet/airframe/tree/master/examples/rpc-examples
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Taro L. Saito
Arm Treasure Data utilizes Presto as the query engine processing over 1 million queries per day to support the data business of 500+ companies in three regions; US, EU, and Asia. Arm Treasure Data had been using Presto 0.205 and in 2019 started a big migration project to Presto 317. Although we performed extensive query simulations to check any incompatibilities, we faced many unexpected challenges during the migration at production.
Scala for Everything: From Frontend to Backend Applications - Scala Matsuri 2020Taro L. Saito
Scala is a powerful language; You can build front-end applications with Scala.js, and efficient backend application servers for JVM. In this session, we will learn how to build everything with Scala by using Airframe OSS framework.
Airframe is a library designed for maximizing the advantages of Scala as a hybrid of object-oriented and functional programming language. In this session, we will learn how to use Airframe to build REST APIs and RPC (with Finagle or gRPC) services, and how to create frontend applications in Scala.js that interact with the servers using functional interfaces for dynamically updating web pages.
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
Presentation at Presto Conference Tokyo 2019
- Arm Treasure Data
- Plazma DB Indexes
- Real-time, Archive Storages
- Schema-on-read data processing
- Physical partition maintenance via presto-stella plugin
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Taro L. Saito
Scala can be used for developing both frontend (Scala.js) and backend (Scala JVM) applications. A missing piece has been bridging these two worlds using Scala. We built Airframe RPC, a framework that uses Scala traits as a unified RPC interface between servers and clients. With Airframe RPC, you can build HTTP/1 (Finagle) and HTTP/2 (gRPC) services just by defining Scala traits and case classes. It simplifies web application design as you only need to care about Scala interfaces without using existing web standards like REST, ProtocolBuffers, OpenAPI, etc. Scala.js support of Airframe also enables building interactive Web applications that can dynamically render DOM elements while talking with Scala-based RPC servers. With Airframe RPC, the value of Scala developers will be much higher both for frontend and backend areas.
Airframe RPC is a framework for building RPC services by using Scala as a unified RPC interface between servers and clients. It supports Finagle (HTTP/1) and gRPC (HTTP/2) backend, and even Scala.js for web application development.
Talk video: https://www.youtube.com/watch?v=qf8wOc2YHmQ&feature=youtu.be
Documentation: https://wvlet.org/airframe/docs/airframe-rpc
Demo source code: https://github.com/wvlet/airframe/tree/master/examples/rpc-examples
Journey of Migrating 1 Million Presto Queries - Presto Webinar 2020Taro L. Saito
Arm Treasure Data utilizes Presto as the query engine processing over 1 million queries per day to support the data business of 500+ companies in three regions; US, EU, and Asia. Arm Treasure Data had been using Presto 0.205 and in 2019 started a big migration project to Presto 317. Although we performed extensive query simulations to check any incompatibilities, we faced many unexpected challenges during the migration at production.
Wayfair Use Case: The four R's of Metrics DeliveryInfluxData
Wayfair currently uses both Graphite and InfluxDB as a time series platform - The data is used by their developers, business stakeholders, by their internal alerting engine. Most importantly their 24x7 Ops Monitoring Center is using this data to constantly analyze the vital signs of Wayfair’s IT infrastructure and storefront operations.
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...Chester Chen
Uncovering performance regressions in the TCP SACKs vulnerability fixes
In early July 2019, Databricks noticed some Apache Spark workloads regressing by as much as 6x. In this talk, we'll discuss how we traced these regressions back to the Linux kernel and the fixes for the TCP SACKs vulnerabilities. We will explain the symptoms we were seeing, walk through how we debugged the TCP connections, and dive into the Linux source to uncover the root cause.
Speaker: Chris Stevens (Databricks)
Chris Stevens is a software engineer at Databricks where he works on the reliability, scalability, and security of Apache Spark clusters. His work focuses on auto-scaling compute, auto-scaling storage, node initialization performance, and node health monitoring. Prior to Databricks, Chris founded the Minoca OS project, where he built a POSIX compliant, general purpose OS - from scratch - to run on resource constrained device. He got his start at Microsoft working on the Windows kernel team, porting the Windows boot environment from BIOS to UEFI.
The Dark Side Of Go -- Go runtime related problems in TiDB in productionPingCAP
Ed Huang, CTO of PingCAP, talked at Go System Conference about dealing with the typical and profound issues related to Go’s runtime as your systems become more complex. Taking TiDB as an example, he demonstrated how these problems can be reproduced, located, and analyzed in production.
First impressions of SparkR: our own machine learning algorithmInfoFarm
In june 2015, SparkR was first integrated into SparkR. At InfoFarm we strive to stay on top of new technologies, hence we have tried it out and implemented a few machine learning algorithms as well.
This is the speech Shen Li gave at GopherChina 2017.
TiDB is an open source distributed database. Inspired by the design of Google F1/Spanner, TiDB features in infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for data storage and analysis.
In this talk, we will mainly cover the following topics:
- What is TiDB
- TiDB Architecture
- SQL Layer Internal
- Golang in TiDB
- Next Step of TiDB
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...Equnix Business Solutions
PGConf.ASIA 2019 Bali - 10 September 2019
Speaker: Atsushi Mitani
Room: WAL
Title: How did PostgreSQL Write Load Balancing of Queries Using Transactions?
This is the speech Siddon Tang gave at the 1st Rust Meetup in Beijing on April 16, 2017.
Siddon Tang:Chief Architect of PingCAP
The slide covered the following topics:
- Why do we use Rust in TiKV
- TiKV architecture introduction
- Key technology
- Future plan
This is the speech Max Liu gave at Percona Live Open Source Database Conference 2016.
Max Liu: Co-founder and CEO, a hacker with a free soul
The slide covered the following topics:
- Why another database?
- What kind of database we want to build?
- How to design such a database, including the principles, the architecture, and design decisions?
- How to develop such a database, including the architecture and the core technologies for TiKV and TiDB?
- How to test the database to ensure the quality and stability?
Airframe Meetup #3: 2019 Updates & AirSpecTaro L. Saito
Presentation slides of Airframe Meetup #3 https://airframe.connpass.com/event/148169/
- Airframe 19 Milestone
- AirSpec: A new testing library for Scala
-
Wayfair Use Case: The four R's of Metrics DeliveryInfluxData
Wayfair currently uses both Graphite and InfluxDB as a time series platform - The data is used by their developers, business stakeholders, by their internal alerting engine. Most importantly their 24x7 Ops Monitoring Center is using this data to constantly analyze the vital signs of Wayfair’s IT infrastructure and storefront operations.
SF Big Analytics 2019112: Uncovering performance regressions in the TCP SACK...Chester Chen
Uncovering performance regressions in the TCP SACKs vulnerability fixes
In early July 2019, Databricks noticed some Apache Spark workloads regressing by as much as 6x. In this talk, we'll discuss how we traced these regressions back to the Linux kernel and the fixes for the TCP SACKs vulnerabilities. We will explain the symptoms we were seeing, walk through how we debugged the TCP connections, and dive into the Linux source to uncover the root cause.
Speaker: Chris Stevens (Databricks)
Chris Stevens is a software engineer at Databricks where he works on the reliability, scalability, and security of Apache Spark clusters. His work focuses on auto-scaling compute, auto-scaling storage, node initialization performance, and node health monitoring. Prior to Databricks, Chris founded the Minoca OS project, where he built a POSIX compliant, general purpose OS - from scratch - to run on resource constrained device. He got his start at Microsoft working on the Windows kernel team, porting the Windows boot environment from BIOS to UEFI.
The Dark Side Of Go -- Go runtime related problems in TiDB in productionPingCAP
Ed Huang, CTO of PingCAP, talked at Go System Conference about dealing with the typical and profound issues related to Go’s runtime as your systems become more complex. Taking TiDB as an example, he demonstrated how these problems can be reproduced, located, and analyzed in production.
First impressions of SparkR: our own machine learning algorithmInfoFarm
In june 2015, SparkR was first integrated into SparkR. At InfoFarm we strive to stay on top of new technologies, hence we have tried it out and implemented a few machine learning algorithms as well.
This is the speech Shen Li gave at GopherChina 2017.
TiDB is an open source distributed database. Inspired by the design of Google F1/Spanner, TiDB features in infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for data storage and analysis.
In this talk, we will mainly cover the following topics:
- What is TiDB
- TiDB Architecture
- SQL Layer Internal
- Golang in TiDB
- Next Step of TiDB
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...Equnix Business Solutions
PGConf.ASIA 2019 Bali - 10 September 2019
Speaker: Atsushi Mitani
Room: WAL
Title: How did PostgreSQL Write Load Balancing of Queries Using Transactions?
This is the speech Siddon Tang gave at the 1st Rust Meetup in Beijing on April 16, 2017.
Siddon Tang:Chief Architect of PingCAP
The slide covered the following topics:
- Why do we use Rust in TiKV
- TiKV architecture introduction
- Key technology
- Future plan
This is the speech Max Liu gave at Percona Live Open Source Database Conference 2016.
Max Liu: Co-founder and CEO, a hacker with a free soul
The slide covered the following topics:
- Why another database?
- What kind of database we want to build?
- How to design such a database, including the principles, the architecture, and design decisions?
- How to develop such a database, including the architecture and the core technologies for TiKV and TiDB?
- How to test the database to ensure the quality and stability?
Airframe Meetup #3: 2019 Updates & AirSpecTaro L. Saito
Presentation slides of Airframe Meetup #3 https://airframe.connpass.com/event/148169/
- Airframe 19 Milestone
- AirSpec: A new testing library for Scala
-
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Nelson Calero
Each new version of the Oracle database includes improvements in the upgrade and patching utilities, forcing us to update our procedures to incorporate these changes.
The Fleet Provisioning & Patching (FPP, formerly RHP) utility, together with the change in its licensing announced at OOW 2019 that makes it free in RAC, now makes it possible to centrally manage the software life cycle.
This presentation shows examples of how to use FPP and different configuration options.
This presentation includes a comprehensive introduction to Apache Spark. From an explanation of its rapid ascent to performance and developer advantages over MapReduce. We also explore its built-in functionality for application types involving streaming, machine learning, and Extract, Transform and Load (ETL).
(BDT303) Running Spark and Presto on the Netflix Big Data PlatformAmazon Web Services
In this session, we discuss how Spark and Presto complement the Netflix big data platform stack that started with Hadoop, and the use cases that Spark and Presto address. Also, we discuss how we run Spark and Presto on top of the Amazon EMR infrastructure; specifically, how we use Amazon S3 as our data warehouse and how we leverage Amazon EMR as a generic framework for data-processing cluster management.
Strata NYC 2015: What's new in Spark StreamingDatabricks
As the adoption of Spark Streaming in the industry is increasing, so is the community’s demand for more features. Since the beginning of this year, we have made significant improvements in performance, usability, and semantic guarantees. In particular, some of these features are:
- New Kafka integration for exactly-once guarantees
- Improved Kinesis integration for stronger guarantees
- Addition of more sources to the Python API
Significantly improved UI for greater monitoring and debuggability.
In this talk, I am going to discuss these improvements as well as the plethora of features we plan to add in the near future.
In the big data world, it's not always easy for Python users to move huge amounts of data around. Apache Arrow defines a common format for data interchange, while Arrow Flight introduced in version 0.11.0, provides a means to move that data efficiently between systems. Arrow Flight is a framework for Arrow-based messaging built with gRPC. It enables data microservices where clients can produce and consume streams of Arrow data to share it over the wire. In this session, I'll give a brief overview of Arrow Flight from a Python perspective, and show that it's easy to build high performance connections when systems can talk Arrow. I'll also cover some ongoing work in using Arrow Flight to connect PySpark with TensorFlow - two systems with great Python APIs but very different underlying internal data.
These slides were presented by Hossein Falaki of Databricks to the Atlanta Apache Spark User Group on Thursday, March 9, 2017: https://www.meetup.com/Atlanta-Apache-Spark-User-Group/events/238120227/
Five cool ways the JVM can run Apache Spark fasterTim Ellison
The IBM JVM runs Apache Spark fast! This talk explains some of the findings and optimizations from our experience of running Spark workloads.
The talk was originally presented at the SparkEU Summit 2015 in Amsterdam.
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonDataWorks Summit
Python is one of the most popular programming languages for advanced analytics, data science, machine learning, and deep learning. One of Python’s greatest assets is its extensive set of libraries, such as Numpy, Pandas, Scikit-learn, Theano, TensorFlow, Keras, and so on. Apache Spark is becoming the core component for big data processing and playing important role to help data scientists solve complicated problems. It has a great significance and strong demand to integrate Spark with the extremely rich Python ecosystems to handle challenges in artificial intelligence. In the latest Spark 2.3, some very exciting features were put in, for example: vectorized UDF in PySpark, which leverages Apache Arrow to provide high performance interoperability between Spark and Pandas/Numpy; Image format in dataFrame/dataset, which can improve Spark and TensorFlow (or other deep learning libraries) interoperability; high-efficiency parallel modeling tuning with Spark MLlib, etc. In this talk, we'll share best practice on real use cases and hands-on experiences to illustrate the power of these new features and bring more discussions on this topic.
Speaker: Yanbo Liang, Staff Software Engineer, Hortonworks
Easy enterprise application integration with RabbitMQ and AMQPRabbit MQ
VMware vFabric RabbitMQ Technical Webinar December 2010 by VMware engineer Emile Joubert. Covers common integration patterns, and how RabbitMQ makes these easily implemented, using AMQP as a communications mechanism.
You can view a recording of this presentation on YouTube: http://www.youtube.com/user/SpringSourceDev#p/c/5956C6D9EC319817/0/ABGMjX4K0D8
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021StreamNative
Today we will span from edge to any and all clouds to support data collection, real-time streaming, sensor ingest, edge computing, IoT use cases and edge AI. Apache Pulsar allows us to build computing at the edge and produce and consume messages at scale in any IoT, hybrid or cloud environment. Apache Pulsar supports MoP which allows for MQTT protocol to be used for high speed messaging.
We will teach you to quickly build scalable open source streaming applications regardless of if you are running in containers, pods, edge devices, VMs, on-premise servers, moving vehicles and any cloud.
Similar to td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020 (20)
Presto @ Treasure Data - Presto Meetup Boston 2015Taro L. Saito
Treasure Data simplifies event analytics for the complex digital
world. Our customers send us 1,000,000 events per second and issue 30,000+ Presto queries everyday to understand their customers better. One of the challenges is designing a cloud database with zero downtime to support a global customer base. We have achieved this goal by developing several open-source technologies; Fluentd and Embulk enable seamless log collection from stream/batch sources, and with MessagePack we can provide an extensible columnar store that accommodates future schema changes. Finally, Presto allows us to serve a wide variety of data processing our customers perform on our service. In this talk, I will present an overview of our system, and how our customers keep using Presto while collecting and extending their data set.
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoTaro L. Saito
Silk is a framework for building dataflows in Scala. In Silk users write data processing code with collection operators (e.g., map, filter, reduce, join, etc.). Silk uses Scala Macros to construct a DAG of dataflows, nodes of which are annotated with variable names in the program. By using these variable names as markers in the DAG, Silk can support interruption and resume of dataflows and querying the intermediate data. By separating dataflow descriptions from its computation, Silk enables us to switch executors, called weavers, for in-memory or cluster computing without modifying the code. In this talk, we will show how Silk helps you run data-processing pipelines as you write the code.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
How world-class product teams are winning in the AI era by CEO and Founder, P...
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
1. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Taro L. Saito
Arm Treasure Data
July 31th, 2020
Spark Meetup Tokyo #3
td-spark internals
Extending Spark with Airframe
1
2. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
About Me: Taro L. Saito
2
● Ph.D., Principal Software Engineer of
Arm Treasure Data
● Living US for 5 years
● Created Presto as a service
● Processing 1 million SQL queries /
day on the cloud. Presto Webinar
● OSS:
● Airframe, snappy-java (used in
Parquet, Spark core),
sbt-sonatype, etc.
● Books:
WIP
3. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Challenge: Adding Treasure Data Support to Spark
● PlazmaDB: Cloud Data Store of Treasure Data
● MessagePack-based columnar format (MPC1)
● Each table column is represented as a sequence of MessagePack values
● What was necessary for supporting Spark?
● td-spark driver (td-spark-assembly.jar)
■ MPC1 <-> DataFrame conversion
● Plazma Public API
■ APIs for reading and writing MPC1 files from PlazmaDB
● Created these two components with Airframe OSS
Airframe
3
4. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Airframe: Core Scala Modules of Treasure Data
● Airframe
● Scala OSS assets of our knowledges, production experiences, and design decisions
● 20+ Common Utilities for Scala
● Dependency Injection (DI)
● Airframe RPC
■ HTTP Server, Client builder (ScalaMatsuri. Tokyo, October 2020)
● AirSpec
■ Testing framework for Scala (ScalaDays. Seattle, May 2021)
4
Knowledge
Experiences
Design Decisions
Products
24/7 Services
Business Values
Programming OSS Outcome
Airframe
5. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Airframe Modules Used Inside td-spark
Airframe DI
DataFrame MPC1
airframe-codec
airframe-msgpack
Plazma Public API
airframe-http
airframe-finagle
Airframe DI
Airframe RPC
airframe-fluentd
Master Worker
DesignSparkContext
TDSparkContext TDSparkService
MPC1 Reader/Writer IO Manager
Airframe DI
airframe-http
airframe-config
airframe-launcher
airframe-jmx
airframe-metrics
airframe-control
airframe-metrics
td-spark.jarairframe-log
airframe-log
airframe-codec
airframe-json
Airframe
5
6. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Reading MPC1 Partitions and Column Blocks
6
Table Data
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
Plazma
Public API
Table Data
column blocks
column blocks
column blocks
column blocks
column blocks
td-spark
Table Data
Data Frame
Data Frame
Data Frame
Data Frame
Data Frame
td-pyspark
Parallel Read
User
Programs
Columnar Data Download
DataFrame
7. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Uploading DataFrame as MPC1 Partitions
7
td-spark
td-pyspark
User
Programs
Table Data
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
DataFrame
Format
Conversion
Plazma
Public API
Amazon S3
Parallel Upload
Copy
Transaction
Table Data
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
mpc1 partition
Table DataTable Data
8. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Airframe
Airframe RPC
RPC Interface
Router
Scala.js Client
RPC Web Server
Generate
HTTP/gRPC Client
Open API Spec
RPC Impl
Create
RPC CallsJSON
Cross-Language
RPC Client
Scala.js
Web Application
Micro Servicesbt-airframeairframe-http
airframe-http-finagle
airframe-http-rx
airframe-codec
API Documentation
airframe-gRPC
8
● Use Scala As An RPC Interface
● Generate HTTP Server/Client (REST or gRPC)
● HTTP calls -> JSON/MessagePack data -> Remote Scala function calls
9. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
td-spark: Adding More Functions to Spark
● Using an implicit class to extend SparkSession (spark variable)
● Adding TD-specific functionalities
● Time series data queries
■ e.g., : spark.td.table(“TD’s table”).within(“-1h”).df
● Predicate pushdown for time-series data
● etc.
9
10. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Tips: Avoid Task Serialization Errors with Airframe DI
● Serializable
● spark.conf key-value properties inside SparkContext
● Non-Serializable
● Complex service objects
● Solution: Airframe DI (Dependency Injection)
● Distribute the service design (= how to construct objects) with the jar file
● Build service objects from the design (20+ components and config objects)
Airframe DI
Master
Worker
TDSparkContext TDSparkService
TDSparkContext
td-spark.jar
td-spark.jarSerialization
Error (!) Design
Design
TDSparkService
Airframe
Build OK!
Config
Config
10
11. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Flexible Format Conversion with MessagePack
DataFrame
Airframe
Codec
Pack/Unpack Pack/Unpack
MPC1
JDBC
ResultSet
Plazma Public API
Airframe
11
12. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Spark 3.0 and PySpark Support
12
13. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Resources
● Airframe: https://wvlet.org/airframe/
● Airframe Meetup #1 ~ #3 reports
● ScalaMatsuri 2019 presentation
■ And more!!
● td-spark documentation: https://treasure-data.github.io/td-spark/
● See Also: Spark with Airframe (@smdmts)
● Spark to Spark data transfer with MessagePack-based airframe-codec
● Spark -> AWS service call management with airframe-control
13
Airframe
New!
15. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Treasure Data: A Ready-to-Use Cloud Data Platform
15
Logs
Device
Data
Batch
Data
PlazmaDB
Table Schema
Data Collection Cloud Storage Distributed Data Processing
Jobs
Job Management
SQL Editor
Scheduler
Workflows
Machine
Learning
Treasure Data OSS
Third Party OSS
Data
17. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
td-pyspark
● Supporting PySpark
● Access Scala methods of td-spark:
■ sparkContext._jvm.(jvm package name).method(...)
● Conversion to PySpark’s DataFrame
● DataFrame(Scala DataFrame, sqlContext)
17
18. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Predicate Pushdown
● Traverse DataFrame Column Filters
● Extract time conditions (e.g., -1d, -1w, -7d, etc.)
●
18
19. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Class Loader Hierarchy of Databricks
● Base class loader
● User library class loader
● td-spark.jar will be loaded here
● Shared between multiple notebooks
■ Static variables used inside td-spark.jar
will be shared by multiple notebooks!
● REPL class loader
● Shared between multiple notebooks
● Spark-library class loader
● Notebook-local
● Notebook-local class loader
● Caching local instances to static variables in
td-spark caused ClassNotFound error in
other notebooks
19
20. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
Using Presto with Spark
● presto-jdbc
● Submit select * from (Original SQL) limit 0 => Query result schema
● JDBC ResultSet => Airframe Codec => DataFrame
20
21. Copyright 1995-2020 Arm Limited (or its affiliates). All rights reserved.
td-prestobase: A Proxy Gateway to Presto Clusters
21
● td-prestobase is a proxy gateway to Presto clusters that talks standard presto
protocol to support any Presto clients (e.g., presto-cli, jdbc, odbc, etc.)
● td-spark uses presto-jdbc and td-prestobase APIs for making Presto queries
Airframe