Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org

•Download as PPT, PDF•

2 likes•1,137 views

Cloudera, Inc.

Technology

Prototyping Data Intensive Apps: TrendingTopics.org 09/29/09 1 Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch

Talk Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Data Intensive Web Apps ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

TrendingTopics is Open Source ,[object Object],[object Object],[object Object],[object Object]

Hadoop on Amazon EC2 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Wikipedia Page View Log Data ,[object Object],[object Object],[object Object]

Wikipedia Redirect Data ,[object Object],[object Object]

Loading Data into Hadoop on EC2 ,[object Object],[object Object]

Hive Data Warehouse Layer on EC2 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Robust Regression with R & Hadoop ,[object Object],[object Object],Page View “Spikes”

Exporting Data from our EC2 Cluster ,[object Object],[object Object],[object Object],[object Object],[object Object]

Ruby on Rails Front End ,[object Object]

Automate & Deploy with EC2onRails ,[object Object],[object Object],[object Object],[object Object]

Cron Triggers Hadoop Jobs on EC2 ,[object Object]

Bash Scripts Automate Daily Run ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Capistrano Tasks ,[object Object],[object Object],[object Object],[object Object],[object Object]

Visualization APIs ,[object Object],[object Object],[object Object]

Ideas & Next Steps ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Questions? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Wikipedia Views ~ Google Searches “ Dwight Howard” on TrendingTopics “ Dwight Howard” on Google Trends

Wikipedia Log Analysis with Pig ,[object Object],[object Object]

What's hot

Building a Real-Time Data Pipeline: Apache Kafka at LinkedInAmy W. Tang

SSR: Structured Streaming for R and Machine Learningfelixcss

Analytics at Scale with Apache Spark on AWS with Jonathan FritzDatabricks

Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingDatabricks

(BDT303) Running Spark and Presto on the Netflix Big Data PlatformAmazon Web Services

Hopsworks - Self-Service Spark/Flink/Kafka/HadoopJim Dowling

Strata Hadoop HopsworksJim Dowling

Mobius: C# Language Binding For SparkSpark Summit

Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Databricks

Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...viirya

Big data Lambda Architecture - Batch Layer Hands Onhkbhadraa

Natural Language Query and Conversational Interface to Apache SparkDatabricks

Scalable Scientific Computing with DaskUwe Korn

Just enough DevOps for Data Scientists (Part II)Databricks

Developing apache spark jobs in .net using mobiusshareddatamsft

Data Pipeline with KafkaPeerapat Asoktummarungsri

Spark Summit EU talk by Ruben Pulido Behar VeliqiSpark Summit

AWS April 2016 Webinar Series - Best Practices for Apache Spark on AWSAmazon Web Services

ETL with SPARK - First Spark London meetupRafal Kwasny

Prototyping Data Intensive Apps: TrendingTopics.orgPeter Skomoroch

What's hot (20)

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

SSR: Structured Streaming for R and Machine Learning

Analytics at Scale with Apache Spark on AWS with Jonathan Fritz

Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling

(BDT303) Running Spark and Presto on the Netflix Big Data Platform

Hopsworks - Self-Service Spark/Flink/Kafka/Hadoop

Strata Hadoop Hopsworks

Mobius: C# Language Binding For Spark

Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...

Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...

Big data Lambda Architecture - Batch Layer Hands On

Natural Language Query and Conversational Interface to Apache Spark

Scalable Scientific Computing with Dask

Just enough DevOps for Data Scientists (Part II)

Developing apache spark jobs in .net using mobius

Data Pipeline with Kafka

Spark Summit EU talk by Ruben Pulido Behar Veliqi

AWS April 2016 Webinar Series - Best Practices for Apache Spark on AWS

ETL with SPARK - First Spark London meetup

Prototyping Data Intensive Apps: TrendingTopics.org

Similar to Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org

Introduction to AWS GlueAmazon Web Services

Big Data Analytics from Azure Cloud to Power BI MobileRoy Kim

BDA311 Introduction to AWS GlueAmazon Web Services

Honu - A Large Scale Streaming Data Collection and Processing Pipeline__Hadoo...Yahoo Developer Network

AWS Big Data LandscapeCrishantha Nanayakkara

Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely

Building and deploying LLM applications with Apache AirflowKaxil Naik

Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Jason Dai

Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Cloudera, Inc.

Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiSlim Baltagi

Available platforms for Big Data 2.0Petr Novotný

Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Databricks

Agile data warehousingSneha Challa

Datalake ArchitectureTechYugadi IT Solutions & Consulting

Tackle Your Dark Data Challenge with AWS Glue - AWS Online Tech TalksAmazon Web Services

Apache Arrow: Open Source Standard Becomes an Enterprise NecessityWes McKinney

API Platform 2.1: when Symfony meets ReactJS (Symfony Live 2017)Les-Tilleuls.coop

Hadoop Big Data A big pictureJ S Jodha

Windows Azure HDInsight ServiceNeil Mackenzie

(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...Amazon Web Services

Similar to Hw09 Building Data Intensive Apps A Closer Look At Trending Topics.Org (20)

Introduction to AWS Glue

Big Data Analytics from Azure Cloud to Power BI Mobile

BDA311 Introduction to AWS Glue

Honu - A Large Scale Streaming Data Collection and Processing Pipeline__Hadoo...

AWS Big Data Landscape

Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...

Building and deploying LLM applications with Apache Airflow

Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)

Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...

Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi

Available platforms for Big Data 2.0

Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0

Agile data warehousing

Datalake Architecture

Tackle Your Dark Data Challenge with AWS Glue - AWS Online Tech Talks

Apache Arrow: Open Source Standard Becomes an Enterprise Necessity

API Platform 2.1: when Symfony meets ReactJS (Symfony Live 2017)

Hadoop Big Data A big picture

Windows Azure HDInsight Service

(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55

Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited

Scaling API-first – The story of a global engineering organizationRadu Cotescu

Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC

08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls

Understanding the Laravel MVC ArchitecturePixlogix Infotech

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo

The Evolution of Money: Digital Transformation and CBDCs in Central BankingSelcen Ozturkcan

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j

Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK

A Call to Action for Generative AI in 2024Results

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700

Histor y of HAM Radio presentation slidevu2urc

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh

Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko

Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes

Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...

Injustice - Developers Among Us (SciFiDevCon 2024)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365

Scaling API-first – The story of a global engineering organization

Breaking the Kubernetes Kill Chain: Host Path Mount

08448380779 Call Girls In Civil Lines Women Seeking Men

Understanding the Laravel MVC Architecture

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...

The Evolution of Money: Digital Transformation and CBDCs in Central Banking

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...

Unblocking The Main Thread Solving ANRs and Frozen Frames

A Call to Action for Generative AI in 2024

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...

Histor y of HAM Radio presentation slide

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi

Handwritten Text Recognition for manuscripts and early printed texts

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners

Maximizing Board Effectiveness 2024 Webinar.pptx