Tez

Rafael de Paula Souza

Technology

What is it?
• Complex directed-acyclic-graph (DAG) tasks for processing data
• Built atop Apache Hadoop YARN
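
To make "directed-acyclic-graph tasks" concrete, here is a minimal sketch in plain Python. It does not use the Tez API (Tez exposes its own Java API); the vertex names and the `execution_order` helper are hypothetical and only illustrate the idea that each task runs after the tasks producing its inputs.

```python
from collections import deque


def execution_order(edges):
    """Order vertices so every producer precedes its consumers (Kahn's algorithm)."""
    vertices = sorted({v for edge in edges for v in edge})
    consumers = {v: [] for v in vertices}
    pending_inputs = {v: 0 for v in vertices}
    for producer, consumer in edges:
        consumers[producer].append(consumer)
        pending_inputs[consumer] += 1

    # Vertices with no inputs can start immediately.
    ready = deque(v for v in vertices if pending_inputs[v] == 0)
    order = []
    while ready:
        vertex = ready.popleft()
        order.append(vertex)
        for nxt in consumers[vertex]:
            pending_inputs[nxt] -= 1
            if pending_inputs[nxt] == 0:
                ready.append(nxt)

    # Any vertex left unscheduled is part of a cycle.
    if len(order) != len(vertices):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return order


# Hypothetical dataflow: two scans feed a join, which feeds an aggregation.
EDGES = [
    ("scan_orders", "join"),
    ("scan_users", "join"),
    ("join", "aggregate"),
]

if __name__ == "__main__":
    print(execution_order(EDGES))
```

The two scans have no inputs, so they could run in parallel; the join waits for both, and the aggregation waits for the join. A graph with a cycle is rejected because no valid execution order exists for it.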

Rafael Souza
@rafael_psouza
rafaelpsouza

1. TEZ Rafael Souza

2. What is it? • Complex directed-acyclic-graph (DAG) tasks for processing data • Built atop Apache Hadoop YARN

3. Stack

4. Key Design

5. DAG Execution Plan

6. Expressive dataflow definition APIs

7. Rafael Souza @rafael_psouza rafaelpsouza rafaelpsouza
