Recent years have seen dramatic advancements in the technologies available for managing and processing data. While these technologies provide powerful tools to build data applications, they also require new skills. Ted Malaska and Jonathan Seidman explain how to evaluate these new technologies and build teams to effectively leverage these technologies and achieve ROI with your data initiatives.
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...Cloudera, Inc.
This presentation provides detail on how we are now in the 6th wave of automation, that is based on Machine Learning. In this 6th wave, Cloudera plays a critical role in providing the data platform for Machine Learning and Analytics built for the Cloud.
Data Science and Machine Learning for the EnterpriseCloudera, Inc.
Overview of Machine Learning and how the Cloudera Data Science Workbench provides full access to data while supporting IT SLAs. The presentation includes details on Fast Forward Labs and The Value of Interpretability in Models.
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
Learn how Cloudera provides a unified platform that breaks down data silos commonly seen in organizations. By unifying the data needed for applied machine learning, organizations are better equipped to gather valuable insights from their data.
A deep dive into running data analytic workloads in the cloudCloudera, Inc.
Aishwarya Venkataraman, Jason Wang, Mala Ramakrishnan, Stefan Salandy, and Vinithra Varadharajan lead a deep dive into running data analytic workloads in a managed service capacity in the public cloud and highlight cloud infrastructure best practices.
Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
Self-service Big Data Analytics on Microsoft AzureCloudera, Inc.
In this presentation Microsoft will join Cloudera to introduce a new Platform-as-a-Service (PaaS) offering that helps data engineers use on-demand cloud infrastructure to speed the creation and operation of data pipelines that power sophisticated, data-driven applications - without onerous administration.
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondCloudera, Inc.
Federal organizations increasingly are focused on creating environments that enable more data-driven decisions. Yet ensuring that all data is considered and is current, complete, and accurate is a tall order for most. To make data analytics meaningful to support real-world transformation, agency staff need business tools that provide user-friendly dashboards, on-demand reporting, and methods to manage efficiently the rise of voluminous and varied data sets and types commonly associated with big data. In most cases, existing systems are insufficient to support these requirements. Enter the enterprise data hub (EDH), a software architecture specifically designed to be a unified platform that can economically store unlimited data and enable diverse access to it at scale. Plan to attend this discussion to understand the key considerations to making an EDH the architectural center of your agency’s modern data strategy.
Big data journey to the cloud maz chaudhri 5.30.18Cloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...Cloudera, Inc.
This presentation provides detail on how we are now in the 6th wave of automation, that is based on Machine Learning. In this 6th wave, Cloudera plays a critical role in providing the data platform for Machine Learning and Analytics built for the Cloud.
Data Science and Machine Learning for the EnterpriseCloudera, Inc.
Overview of Machine Learning and how the Cloudera Data Science Workbench provides full access to data while supporting IT SLAs. The presentation includes details on Fast Forward Labs and The Value of Interpretability in Models.
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
Learn how Cloudera provides a unified platform that breaks down data silos commonly seen in organizations. By unifying the data needed for applied machine learning, organizations are better equipped to gather valuable insights from their data.
A deep dive into running data analytic workloads in the cloudCloudera, Inc.
Aishwarya Venkataraman, Jason Wang, Mala Ramakrishnan, Stefan Salandy, and Vinithra Varadharajan lead a deep dive into running data analytic workloads in a managed service capacity in the public cloud and highlight cloud infrastructure best practices.
Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
Self-service Big Data Analytics on Microsoft AzureCloudera, Inc.
In this presentation Microsoft will join Cloudera to introduce a new Platform-as-a-Service (PaaS) offering that helps data engineers use on-demand cloud infrastructure to speed the creation and operation of data pipelines that power sophisticated, data-driven applications - without onerous administration.
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondCloudera, Inc.
Federal organizations increasingly are focused on creating environments that enable more data-driven decisions. Yet ensuring that all data is considered and is current, complete, and accurate is a tall order for most. To make data analytics meaningful to support real-world transformation, agency staff need business tools that provide user-friendly dashboards, on-demand reporting, and methods to manage efficiently the rise of voluminous and varied data sets and types commonly associated with big data. In most cases, existing systems are insufficient to support these requirements. Enter the enterprise data hub (EDH), a software architecture specifically designed to be a unified platform that can economically store unlimited data and enable diverse access to it at scale. Plan to attend this discussion to understand the key considerations to making an EDH the architectural center of your agency’s modern data strategy.
Big data journey to the cloud maz chaudhri 5.30.18Cloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
Learn how Cloudera is using our own platform to build the applications our support teams use every day to solve complex problems in Hadoop.
Recorded Webinar: http://www.cloudera.com/content/www/en-us/resources/recordedwebinar/data-driven-customer-support.html
Making Self-Service BI a Reality in the EnterpriseCloudera, Inc.
For most analysts, the pace of analytics and data science can be frustrating. The common waterfall approach works well for the fixed reports, but it can be a lengthy process to request additional data sets, create new reports, or serve new use cases. So it’s no surprise that organizations are looking to shift towards a self-service model, empowering business users to discover and iterate quickly.
However, it’s not just about opening up this access, but also ensuring the results are accurate and trusted. When there are petabytes of data, how does a user know which tables to use and which are most relevant? How do you strike the balance between discovery and agility, while still meeting enterprise governance standards to truly get more value from your data?
During this webinar, you’ll learn how to empower end-users to make self-service BI a reality within your organization while fostering governance collaboration between all data stakeholders. We’ll discuss and demo:
Strategies of consolidating data across silos for fast, flexible access
Enabling easy discovery and exploration, including understanding which data to trust and where to start
New capabilities for intelligent query assistance as well as immediate performance optimizations and recommendations as-you-go
Collaboration and access outside of just SQL for data science and beyond
In addition, we will walk through best practices and considerations when developing your organizational strategy around self-service analytics, and highlight several real-world success stories from a wide range of industries.
3 things to learn:
Strategies of consolidating data across silos for fast, flexible access
Enabling easy discovery and exploration, including understanding which data to trust and where to start
New capabilities for intelligent query assistance as well as immediate performance optimizations and recommendations as-you-go
Big data journey to the cloud rohit pujari 5.30.18Cloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
Preparing data for analysis and insights is the foundation of any data-driven exercise. Moving workloads to a PaaS, be it data engineering, analytic database, or data science requires a two step leap of faith - in trusting the public cloud, and then your PaaS vendor. In this webinar we will discuss the architecture of a PaaS solution for data management and understand the nitty gritty details of what exactly this involves with the following:
An exploration of the architecture of Cloudera Altus PaaS - the industry’s first multi-function, multi-cloud data and analytic platform-as-a-service
A dive into use cases and a demo of Altus
The synergy between AWS and Altus to help you securely standardize on a combination of public cloud and data management
3 things to learn:
An exploration of the architecture of Cloudera Altus PaaS - the industry’s first multi-function, multi-cloud data and analytic platform-as-a-service
A dive into use cases and a demo of Altus
The synergy between AWS and Altus to help you securely standardize on a combination of public cloud and data management
3 Things to Learn:
-How data is driving digital transformation to help businesses innovate rapidly
-How Choice Hotels (one of largest hoteliers) is using Cloudera Enterprise to gain meaningful insights that drive their business
-How Choice Hotels has transformed business through innovative use of Apache Hadoop, Cloudera Enterprise, and deployment in the cloud — from developing customer experiences to meeting IT compliance requirements
How to Build Continuous Ingestion for the Internet of ThingsCloudera, Inc.
The Internet of Things is moving into the mainstream and this new world of data-driven products is transforming a vast number of industry sectors and technologies.
However, IoT creates a new challenge: how to build and operationalize continual data ingestion from such a wide and ever-changing array of endpoints so that the data arrives consumption-ready and can drive analysis and action within the business.
In this webinar, Sean Anderson from Cloudera and Kirit Busu, Director of Product Management at StreamSets, will discuss Hadoop's ecosystem and IoT capabilities and provide advice about common patterns and best practices. Using specific examples, they will demonstrate how to build and run end-to-end IOT data flows using StreamSets and Cloudera infrastructure.
Kelley Blue Book Uses Big Data to Increase User Engagement Over 100%Cloudera, Inc.
Kelley Blue Book Customer Use Case In this webinar, you will learn how KBB has: - Experienced a 37% increase in ad spend efficiency - Drove an incremental 1 billion impressions from its target segments - Observed a 24% lift in website engagement
Extreme Sports & Beyond: Exploring a new frontier in data with GoProCloudera, Inc.
GoPro is a powerful global brand, thanks in large part to its innovative cameras and accessories that capture moments other cameras just miss: surfing in Maui, skiing in Tahoe, recording your child’s first steps. And today, the company is nearly as well known for its user-generated social and content networks.
Join us for this special webinar hosted by Tableau, Trifacta, and Cloudera—featuring GoPro. We’ll dive into GoPro’s data strategy and architecture, from ingest and processing to data prep and reporting, all on AWS.
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
This session will provide an executive overview of the Apache Hadoop ecosystem, its basic concepts, and its real-world applications. Attendees will learn how organizations worldwide are using the latest tools and strategies to harness their enterprise information to solve business problems and the types of data analysis commonly powered by Hadoop. Learn how various projects make up the Apache Hadoop ecosystem and the role each plays to improve data storage, management, interaction, and analysis. This is a valuable opportunity to gain insights into Hadoop functionality and how it can be applied to address compelling business challenges in your agency.
Risk Management for Data: Secured and GovernedCloudera, Inc.
Cloudera Tech Day Presentation by Eddie Garcia, Chief Security Architect, Cloudera. Protecting enterprise data is an increasingly complex challenge given the diversity and sophistication of threat actors and their cyber-tactics. In this session, participants will hear a comprehensive introduction to Hadoop Security, including the “three A’s” for secure operating environments: Authentication, Authorization, and Audit. In addition, the presenter will cover strategies to orchestrate data security, encryption, and compliance, and will explain the Cloudera Security Maturity Model for Hadoop. Attendees will leave with a greater understanding of how effective INFOSEC relies on an enterprise big data governance and risk management approach.
Using Hadoop to Drive Down Fraud for TelcosCloudera, Inc.
Communication Service Providers (CSPs) lose around $38 Billion to fraud every year. Check out this webinar to learn more about the Cloudera - Argyle Data real-time fraud analytics platform and how Telcos can utilize Apache Hadoop to drive down fraud.
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
In this webinar, Cloudera and AtScale will showcase:
How a company can modernize their analytic architecture to deliver flexibility and agility to more end-users.
How using AtScale’s Universal Semantic layer can end the data chaos and allow business users to use the data in the modern platform.
Highlight the performance of AtScale and Cloudera’s analytic database with newly completed TPC-DS standard benchmarking.
Best practices for migrating from legacy appliances.
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Cloudera, Inc.
We have seen the evolution with the Bi and Data Science fields from the structured data warehouse to data lake and finally, to the data hub. This session will cover the key steps required to building a data hub, examining how best to align and engage stakeholders and develop architectural sanction to enable your organisations to realise new customer insights and better enable you to achieve business objectives.
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)Cloudera, Inc.
In this workshop, we will look outside the box and help expand the problem space to include issues you may not have thought were possible before Big Data. From Near Real Time (NRT) recommendation engines, loan applications to churn detection, Big Data is answering new questions and providing organisations with a competitive edge through revenue increase, cost savings and risk mitigation. We will take a special look at the role the Cloud can play in elevating your analytics environment. We will discuss real world examples of how Big Data answers these questions and does it at a lower cost outlay.
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18Cloudera, Inc.
Webinar on Cloudera Enterprise 6.0 where we will discuss how to build new applications on the modern platform for machine learning and analytics. This webinar will take a look at the latest software enhancements and how they’ll help you improve your productivity and innovate new analytics applications.
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Cloudera, Inc.
Cloudera Enterprise can be used as an adaptive, high-performance analytic database, complementing existing data warehouses by relieving the pressure of growing numbers of ETL jobs and BI analytics. But where do you get started when developing your offload strategy? How can you identify which workloads are the best fit for which system? And once you’re up and running, how can you constantly adapt to Hadoop’s changing data needs?
Cloudera Navigator Optimizer eases the path for moving the right workloads to Hadoop and then actively manages data allowing you to take advantage of Hadoop’s benefits. Now generally available with the recent release of Cloudera 5.8 and a unique part of Cloudera’s analytic database solution, Navigator Optimizer gives you the workload visibility and assessments to build a predictable offload plan, adapt to evolving data and workload demands, and optimize query performance for Hadoop technologies
3 Things to Learn:
Join Ewa Ding, Senior Product Manager at Cloudera, as she discusses:
-An overview of Cloudera Navigator Optimizer and its key features
-A live demo and key use cases of this web-based tool
-What’s next for active data optimization in Hadoop
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
Learn how organizations are deriving unique customer insights, improving product and services efficiency, and reducing business risk with a modern big data architecture powered by Cloudera on AWS. In this webinar, you see how fast and easy it is to deploy a modern data management platform—in your cloud, on your terms.
Data is being generated at a feverish pace and many businesses want all of it at their disposal to solve complex strategic problems. As decision making moves to real-time, enterprises need data ready for analysis immediately. Sean Anderson and Amandeep Khurana will discuss common pipeline trends in modern streaming architectures, Hadoop components that enable streaming capabilities, and popular use cases that are enabling the world of IOT and real-time data science.
Cloudera Data Science Workbench: sparklyr, implyr, and More - dplyr Interfac...Cloudera, Inc.
You like to use R, and you need to use big data. dplyr, one of the most popular packages for R, makes it easy to query large data sets in scalable processing engines like Apache Spark and Apache Impala.
But there can be pitfalls: dplyr works differently with different data sources—and those differences can bite you if you don’t know what you’re doing.
Ian Cook is a data scientist, an R contributor, and a curriculum developer at Cloudera University. In this webinar, Ian will show you exactly what you need to know about sparklyr (from RStudio) and the package implyr (from Cloudera). He will show you how to write dplyr code that works across these different interfaces. And, he will solve mysteries:
Do I need to know SQL to use dplyr?
When is a “tbl” not a “tibble”?
Why is 1 not always equal to 1?
When should you collect(), collapse(), and compute()?
How can you use dplyr to combine data stored in different systems?
3 things to learn:
Do I need to know SQL to use dplyr?
When should you collect(), collapse(), and compute()?
How can you use dplyr to combine data stored in different systems?
Learn how Cloudera is using our own platform to build the applications our support teams use every day to solve complex problems in Hadoop.
Recorded Webinar: http://www.cloudera.com/content/www/en-us/resources/recordedwebinar/data-driven-customer-support.html
Making Self-Service BI a Reality in the EnterpriseCloudera, Inc.
For most analysts, the pace of analytics and data science can be frustrating. The common waterfall approach works well for the fixed reports, but it can be a lengthy process to request additional data sets, create new reports, or serve new use cases. So it’s no surprise that organizations are looking to shift towards a self-service model, empowering business users to discover and iterate quickly.
However, it’s not just about opening up this access, but also ensuring the results are accurate and trusted. When there are petabytes of data, how does a user know which tables to use and which are most relevant? How do you strike the balance between discovery and agility, while still meeting enterprise governance standards to truly get more value from your data?
During this webinar, you’ll learn how to empower end-users to make self-service BI a reality within your organization while fostering governance collaboration between all data stakeholders. We’ll discuss and demo:
Strategies of consolidating data across silos for fast, flexible access
Enabling easy discovery and exploration, including understanding which data to trust and where to start
New capabilities for intelligent query assistance as well as immediate performance optimizations and recommendations as-you-go
Collaboration and access outside of just SQL for data science and beyond
In addition, we will walk through best practices and considerations when developing your organizational strategy around self-service analytics, and highlight several real-world success stories from a wide range of industries.
3 things to learn:
Strategies of consolidating data across silos for fast, flexible access
Enabling easy discovery and exploration, including understanding which data to trust and where to start
New capabilities for intelligent query assistance as well as immediate performance optimizations and recommendations as-you-go
Big data journey to the cloud rohit pujari 5.30.18Cloudera, Inc.
We hope this session was valuable in teaching you more about Cloudera Enterprise on AWS, and how fast and easy it is to deploy a modern data management platform—in your cloud and on your terms.
Preparing data for analysis and insights is the foundation of any data-driven exercise. Moving workloads to a PaaS, be it data engineering, analytic database, or data science requires a two step leap of faith - in trusting the public cloud, and then your PaaS vendor. In this webinar we will discuss the architecture of a PaaS solution for data management and understand the nitty gritty details of what exactly this involves with the following:
An exploration of the architecture of Cloudera Altus PaaS - the industry’s first multi-function, multi-cloud data and analytic platform-as-a-service
A dive into use cases and a demo of Altus
The synergy between AWS and Altus to help you securely standardize on a combination of public cloud and data management
3 things to learn:
An exploration of the architecture of Cloudera Altus PaaS - the industry’s first multi-function, multi-cloud data and analytic platform-as-a-service
A dive into use cases and a demo of Altus
The synergy between AWS and Altus to help you securely standardize on a combination of public cloud and data management
3 Things to Learn:
-How data is driving digital transformation to help businesses innovate rapidly
-How Choice Hotels (one of largest hoteliers) is using Cloudera Enterprise to gain meaningful insights that drive their business
-How Choice Hotels has transformed business through innovative use of Apache Hadoop, Cloudera Enterprise, and deployment in the cloud — from developing customer experiences to meeting IT compliance requirements
How to Build Continuous Ingestion for the Internet of ThingsCloudera, Inc.
The Internet of Things is moving into the mainstream and this new world of data-driven products is transforming a vast number of industry sectors and technologies.
However, IoT creates a new challenge: how to build and operationalize continual data ingestion from such a wide and ever-changing array of endpoints so that the data arrives consumption-ready and can drive analysis and action within the business.
In this webinar, Sean Anderson from Cloudera and Kirit Busu, Director of Product Management at StreamSets, will discuss Hadoop's ecosystem and IoT capabilities and provide advice about common patterns and best practices. Using specific examples, they will demonstrate how to build and run end-to-end IOT data flows using StreamSets and Cloudera infrastructure.
Kelley Blue Book Uses Big Data to Increase User Engagement Over 100%Cloudera, Inc.
Kelley Blue Book Customer Use Case In this webinar, you will learn how KBB has: - Experienced a 37% increase in ad spend efficiency - Drove an incremental 1 billion impressions from its target segments - Observed a 24% lift in website engagement
Extreme Sports & Beyond: Exploring a new frontier in data with GoProCloudera, Inc.
GoPro is a powerful global brand, thanks in large part to its innovative cameras and accessories that capture moments other cameras just miss: surfing in Maui, skiing in Tahoe, recording your child’s first steps. And today, the company is nearly as well known for its user-generated social and content networks.
Join us for this special webinar hosted by Tableau, Trifacta, and Cloudera—featuring GoPro. We’ll dive into GoPro’s data strategy and architecture, from ingest and processing to data prep and reporting, all on AWS.
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
This session will provide an executive overview of the Apache Hadoop ecosystem, its basic concepts, and its real-world applications. Attendees will learn how organizations worldwide are using the latest tools and strategies to harness their enterprise information to solve business problems and the types of data analysis commonly powered by Hadoop. Learn how various projects make up the Apache Hadoop ecosystem and the role each plays to improve data storage, management, interaction, and analysis. This is a valuable opportunity to gain insights into Hadoop functionality and how it can be applied to address compelling business challenges in your agency.
Risk Management for Data: Secured and GovernedCloudera, Inc.
Cloudera Tech Day Presentation by Eddie Garcia, Chief Security Architect, Cloudera. Protecting enterprise data is an increasingly complex challenge given the diversity and sophistication of threat actors and their cyber-tactics. In this session, participants will hear a comprehensive introduction to Hadoop Security, including the “three A’s” for secure operating environments: Authentication, Authorization, and Audit. In addition, the presenter will cover strategies to orchestrate data security, encryption, and compliance, and will explain the Cloudera Security Maturity Model for Hadoop. Attendees will leave with a greater understanding of how effective INFOSEC relies on an enterprise big data governance and risk management approach.
Using Hadoop to Drive Down Fraud for TelcosCloudera, Inc.
Communication Service Providers (CSPs) lose around $38 Billion to fraud every year. Check out this webinar to learn more about the Cloudera - Argyle Data real-time fraud analytics platform and how Telcos can utilize Apache Hadoop to drive down fraud.
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
In this webinar, Cloudera and AtScale will showcase:
How a company can modernize their analytic architecture to deliver flexibility and agility to more end-users.
How using AtScale’s Universal Semantic layer can end the data chaos and allow business users to use the data in the modern platform.
Highlight the performance of AtScale and Cloudera’s analytic database with newly completed TPC-DS standard benchmarking.
Best practices for migrating from legacy appliances.
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Cloudera, Inc.
We have seen the evolution with the Bi and Data Science fields from the structured data warehouse to data lake and finally, to the data hub. This session will cover the key steps required to building a data hub, examining how best to align and engage stakeholders and develop architectural sanction to enable your organisations to realise new customer insights and better enable you to achieve business objectives.
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)Cloudera, Inc.
In this workshop, we will look outside the box and help expand the problem space to include issues you may not have thought were possible before Big Data. From Near Real Time (NRT) recommendation engines, loan applications to churn detection, Big Data is answering new questions and providing organisations with a competitive edge through revenue increase, cost savings and risk mitigation. We will take a special look at the role the Cloud can play in elevating your analytics environment. We will discuss real world examples of how Big Data answers these questions and does it at a lower cost outlay.
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18Cloudera, Inc.
Webinar on Cloudera Enterprise 6.0 where we will discuss how to build new applications on the modern platform for machine learning and analytics. This webinar will take a look at the latest software enhancements and how they’ll help you improve your productivity and innovate new analytics applications.
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Cloudera, Inc.
Cloudera Enterprise can be used as an adaptive, high-performance analytic database, complementing existing data warehouses by relieving the pressure of growing numbers of ETL jobs and BI analytics. But where do you get started when developing your offload strategy? How can you identify which workloads are the best fit for which system? And once you’re up and running, how can you constantly adapt to Hadoop’s changing data needs?
Cloudera Navigator Optimizer eases the path for moving the right workloads to Hadoop and then actively manages data allowing you to take advantage of Hadoop’s benefits. Now generally available with the recent release of Cloudera 5.8 and a unique part of Cloudera’s analytic database solution, Navigator Optimizer gives you the workload visibility and assessments to build a predictable offload plan, adapt to evolving data and workload demands, and optimize query performance for Hadoop technologies
3 Things to Learn:
Join Ewa Ding, Senior Product Manager at Cloudera, as she discusses:
-An overview of Cloudera Navigator Optimizer and its key features
-A live demo and key use cases of this web-based tool
-What’s next for active data optimization in Hadoop
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
Learn how organizations are deriving unique customer insights, improving product and services efficiency, and reducing business risk with a modern big data architecture powered by Cloudera on AWS. In this webinar, you see how fast and easy it is to deploy a modern data management platform—in your cloud, on your terms.
Data is being generated at a feverish pace and many businesses want all of it at their disposal to solve complex strategic problems. As decision making moves to real-time, enterprises need data ready for analysis immediately. Sean Anderson and Amandeep Khurana will discuss common pipeline trends in modern streaming architectures, Hadoop components that enable streaming capabilities, and popular use cases that are enabling the world of IOT and real-time data science.
Cloudera Data Science Workbench: sparklyr, implyr, and More - dplyr Interfac...Cloudera, Inc.
You like to use R, and you need to use big data. dplyr, one of the most popular packages for R, makes it easy to query large data sets in scalable processing engines like Apache Spark and Apache Impala.
But there can be pitfalls: dplyr works differently with different data sources—and those differences can bite you if you don’t know what you’re doing.
Ian Cook is a data scientist, an R contributor, and a curriculum developer at Cloudera University. In this webinar, Ian will show you exactly what you need to know about sparklyr (from RStudio) and the package implyr (from Cloudera). He will show you how to write dplyr code that works across these different interfaces. And, he will solve mysteries:
Do I need to know SQL to use dplyr?
When is a “tbl” not a “tibble”?
Why is 1 not always equal to 1?
When should you collect(), collapse(), and compute()?
How can you use dplyr to combine data stored in different systems?
3 things to learn:
Do I need to know SQL to use dplyr?
When should you collect(), collapse(), and compute()?
How can you use dplyr to combine data stored in different systems?
How to successfully engage enterprise software vendors – software selectionJohn Cachat
It All Starts With The Sales Team
It Doesn’t Do What We Want
Vendors Tiers And Costs
Finding The Right Tier
Contract Language
SaaS Alternative
johncachat@hotmail.com
www.peproso.com
A Primer for Your Next Data Science Proof of Concept on the CloudAlton Alexander
Learn how to quickly create a highly scalable solution using AWS. We introduce the benefits and challenges you may face. We discuss scope and establish realistic expectations, budgets, and constraints for these type of projects. Finally we end with a demo for website event tracking and analysis.
Lean Analytics is a set of rules to make data science more streamlined and productive. It touches on many aspects of what a data scientist should be and how a data science project should be defined to be successful. During this presentation Richard will present where data science projects go wrong, how you should think of data science projects, what constitutes success in data science and how you can measure progress. This session will be loaded with terms, stories and descriptions of project successes and failures. If you're wondering whether you're getting value out of data science, how to get more value out of it and even whether you need it then this talk is for you!
What you will take away from this session
Learn how to make your data science projects successful
Evaluate how to track progress and report on the efficacy of data science solutions
Understand the role of engineering and data scientists
Understand your options for processes and software
How a global manufacturing company built a data science capability from scratchCarlo Torniai
In less than a year, Pirelli, a global manufacturing company best known for high-performance tires and motorsports, grew an impactful data science capability from the ground up. I am sharing a how-to guide for doing the same in your organization, equipping you with arguments to marshal and concrete tips to follow, while calling out pitfalls to watch out for along the way.
Transform Banking with Big Data and Automated Machine Learning 9.12.17Cloudera, Inc.
Banks are rich in valuable data and can build and maintain a competitive advantage by identifying and executing on high-value machine learning projects leveraging the rich data available.This webinar will describe use cases fit for big data and machine learning in the banking sector (commercial, consumer, regulatory, and markets) and the impact they can have for your organization.
3 things to learn:
* How to create a next generation data platform and why it is important
* How to monetize big data using predictive modeling and machine learning
* What is needed for automated machine learning as a sustainable, cost-effective, and efficient solution
7 things to consider when choosing your IaaS provider for ISV/SaaSFrederik Denkens
As an ISV or SaaS company, choosing the right IaaS provider can be a challenge. I hope to give you some things to think about to guide you in your decision.
You can off course always call us if you need help choosing!
SAP Inside Track 2018 - "Quidquid agis, prudenter agas ..." - Learnings from ...Christian Lechner
Successfully developing Cloud Native applications is often regarded from a purely technological perspective - what design aspects have to be considered, what frameworks are best fitting and so on. However, there are many more especially non-technological factors that you have to take into account and that will influence your success. In this talk, I would like to draw your attention to these factors and share the experiences and learnings that we made when developing our first Cloud Native applications.
In the world of agile, there is theory and then there is practice. We like to talk about self-organizing teams, asynchronous execution, BDD, TDD, and emergent architecture. We also talk about cross-functional teams: how analysts, testers, architects, technical writers, and UX designers belong on the same team, right next to programmers. It all sounds nice in theory, but how does this work in reality? What do these people actually do? How do they interact? What does it look like? Is there really a pragmatic way to make this work?
In this simulation, a cross-functional team will actually build a piece of software. Every specialist will have a hand in the process. Every specialist will also act as a generalist. Everyone will add value. And as a team, we’ll get something DONE.
This is your opportunity to see agile development in practice, and to bridge the gap between what agilists say and what teams do. And it’s not as new or as difficult as you think – affinity between testers, BA’s, coders, and other team members has really been at the root of effective development practices all along. Let’s just finally acknowledge that it works, demonstrate its capabilities, and encourage it going forward.
This IS agile development.
Speaker: Venkatesh Umaashankar
LinkedIn: https://www.linkedin.com/in/venkateshumaashankar/
What will be discussed?
What is Data Science?
Types of data scientists
What makes a Data Science Team? Who are its members?
Why does a DS team need Full Stack Developer?
Who should lead the DS Team
Building a Data Science team in a Startup Vs Enterprise
Case studies on:
Evolution Of Airbnb’s DS Team
How Facebook on-boards DS team and trains them
Apple’s Acqui-hiring Strategy to build DS team
Spotify -‘Center of Excellence’ Model
Who should attend?
Managers
Technical Leaders who want to get started with Data Science
Data Governance in an Agile SCRUM Lean MVP WorldDATAVERSITY
Most of us learned data modeling via a waterfall-driven methodology lens. Yet Agile and other modern development methods have for the most part assumed that data governance is an anti-pattern to just getting things (software) done. Well look at questions such as:
•Are Agile and Data Governance Enemies?
•How can we get stuff done AND get systems delivered?
•And what do we do about existing systems delivered without data governance attention?
We'll also look at how data modeling fits in the answers to these questions.
Profit from AI & Machine Learning: The Best Practices for People & ProcessTony Baer
Presents the results of a detailed survey of organizations highly experienced with AI projects in production, providing best practices for when to apply AI; how to organize AI project teams; how to manage the AI project lifecycle and keep projects aligned; and what are the added challenges that AI introduces compared to managing traditional data science projects.
10 bezcennych lekcji dla software developera stającego się szefem firmyWojciech Seliga
[Originally Polish lecture with English slides - with a few exceptions]
Przez wiele lat byłem software developerem. Koncentrowałem się na kodzie, projektach software'owych oraz interakcjach w moim zespole i z klientami. Byłem pewny, że Agile rozwiązuje wszystkie problemy tego świata. Śmiałem się z komiksów Scotta Adamsa i stworzonej przez niego karykatury szefa (PHB). Życie było proste i piękne...
Teraz od ponad 8 lat prowadzę firmę software'ową, którą przy blisko 90 osobach trudno już nazwać maleństwem. Sam stałem się "szefem" na pełen etat.
Podczas prezentacji podzielę się z Wami różnymi doświadczeniami oraz naukami (nieraz bolesnymi) jakie wyniosłem w ostatnich latach podczas mojej stopniowej przemiany z developera/inżyniera w przedsiębiorcę i szefa firmy. O ile zapewne nie wszystkie sytuacje i wnioski mają lub mogą mieć (o ile marzysz o własnym startupie czy zespole) zastosowanie w Twoim życiu, same sobie ich uświadomienie może oszczędzić Ci w przyszłości straty mnóstwa czasu, energii i pieniędzy oraz uniknąć przykrych rozczarowań.
Data Con LA 2019 - The challenges of data science for veteran media organizat...Data Con LA
An overview of the tools, engineering infrastructure, and decisions that we've been making in concert with our newsroom and executive leadership.- What tools might you use and why? Python, Scala, Spark, AWS, Google GCP, BigQuery, Redshift, Tableau, SQL?- What engineering tradeoffs do you have to make when building data science platforms and choosing big data tools?- How can you empower analysts, engineers, data scientists, and privacy advocates to build a data driven culture ecosystem at at any size company?- How do you recruit, train, and retain world class data talent against competitors like Netflix, Google, and Snap?
360DigitalTransformation is a leader in providing Hadoop, Hive, Elastic Search, Spark, Mobile, Digital Marketing, Talend, Tableo, Mobile Development, Big Data Analytics Trainings online. Learn these cutting edge technologies from professional who have a long career with Industry and develop your skill set.
Similar to Managing Successful Data Projects: Technology Selection and Team Building (20)
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
This annual program recognizes organizations who are moving swiftly towards the future and building innovative solutions by making what was impossible yesterday, possible today.
The winning organizations' implementations demonstrate outstanding achievements in fulfilling their mission, technical advancement, and overall impact.
The 2021 Data Impact Awards recognize organizations' achievements with the Cloudera Data Platform in seven categories:
Data Lifecycle Connection
Data for Enterprise AI
Cloud Innovation
Security & Governance Leadership
People First
Data for Good
Industry Transformation
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
Cloudera is proud to present the 2020 Data Impact Awards Finalists. This annual program recognizes organizations running the Cloudera platform for the applications they've built and the impact their data projects have on their organizations, their industries, and the world. Nominations were evaluated by a panel of independent thought-leaders and expert industry analysts, who then selected the finalists and winners. Winners exemplify the most-cutting edge data projects and represent innovation and leadership in their respective industries.
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
Cloudera Fast Forward Labs’ latest research report and prototype explore learning with limited labeled data. This capability relaxes the stringent labeled data requirement in supervised machine learning and opens up new product possibilities. It is industry invariant, addresses the labeling pain point and enables applications to be built faster and more efficiently.
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
In this session, we will cover how to move beyond structured, curated reports based on known questions on known data, to an ad-hoc exploration of all data to optimize business processes and into the unknown questions on unknown data, where machine learning and statistically motivated predictive analytics are shaping business strategy.
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
Watch this webinar to understand how Hortonworks DataFlow (HDF) has evolved into the new Cloudera DataFlow (CDF). Learn about key capabilities that CDF delivers such as -
-Powerful data ingestion powered by Apache NiFi
-Edge data collection by Apache MiNiFi
-IoT-scale streaming data processing with Apache Kafka
-Enterprise services to offer unified security and governance from edge-to-enterprise
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
Cloudera’s Data Science Workbench (CDSW) is available for Hortonworks Data Platform (HDP) clusters for secure, collaborative data science at scale. During this webinar, we provide an introductory tour of CDSW and a demonstration of a machine learning workflow using CDSW on HDP.
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
Join Cloudera as we outline how we use Cloudera technology to strengthen sales engagement, minimize marketing waste, and empower line of business leaders to drive successful outcomes.
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
Learn how organizations are deriving unique customer insights, improving product and services efficiency, and reducing business risk with a modern big data architecture powered by Cloudera on Azure. In this webinar, you see how fast and easy it is to deploy a modern data management platform—in your cloud, on your terms.
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
Join us to learn about the challenges of legacy data warehousing, the goals of modern data warehousing, and the design patterns and frameworks that help to accelerate modernization efforts.
Explore new trends and use cases in data warehousing including exploration and discovery, self-service ad-hoc analysis, predictive analytics and more ways to get deeper business insight. Modern Data Warehousing Fundamentals will show how to modernize your data warehouse architecture and infrastructure for benefits to both traditional analytics practitioners and data scientists and engineers.
Explore new trends and use cases in data warehousing including exploration and discovery, self-service ad-hoc analysis, predictive analytics and more ways to get deeper business insight. Modern Data Warehousing Fundamentals will show how to modernize your data warehouse architecture and infrastructure for benefits to both traditional analytics practitioners and data scientists and engineers.
Explore new trends and use cases in data warehousing including exploration and discovery, self-service ad-hoc analysis, predictive analytics and more ways to get deeper business insight. Modern Data Warehousing Fundamentals will show how to modernize your data warehouse architecture and infrastructure for benefits to both traditional analytics practitioners and data scientists and engineers.
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
Cloudera SDX is by no means no restricted to just the platform; it extends well beyond. In this webinar, we show you how Bardess Group’s Zero2Hero solution leverages the shared data experience to coordinate Cloudera, Trifacta, and Qlik to deliver complete customer insight.
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
Join Cloudera Fast Forward Labs Research Engineer, Mike Lee Williams, to hear about their latest research report and prototype on Federated Learning. Learn more about what it is, when it’s applicable, how it works, and the current landscape of tools and libraries.
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
451 Research Analyst Sheryl Kingstone, and Cloudera’s Steve Totman recently discussed how a growing number of organizations are replacing legacy Customer 360 systems with Customer Insights Platforms.
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
In this webinar, you will learn how Cloudera and BAH riskCanvas can help you build a modern AML platform that reduces false positive rates, investigation costs, technology sprawl, and regulatory risk.
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
How can companies integrate data science into their businesses more effectively? Watch this recorded webinar and demonstration to hear more about operationalizing data science with Cloudera Data Science Workbench on Cazena’s fully-managed cloud platform.
In this webinar, we’ll show you how Cloudera SDX reduces the complexity in your data management environment and lets you deliver diverse analytics with consistent security, governance, and lifecycle management against a shared data catalog.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
2. About the presenters
▪Technical Group Architect at Blizzard Entertainment
- Cloud, Build, Deployment, Data
▪Principal Solutions Architect at Cloudera
▪Big Data Architect at FINRA
▪Contributor to Apache HDFS, HBase, Flume, Avro, Pig, Spark, YARN,
Sqoop, Kudu, Kafka, …
▪Co-Author of O’Reilly’s Hadoop Application Architectures
▪O’Reilly Online Trainer Course Creator
▪Adviser to the Board of MetiStream
▪Video Gamer
Ted Malaska
3. About the presenters
▪Software Engineer at Cloudera
▪Co-Author of O’Reilly’s Hadoop Application
Architectures
▪Previously Technical Lead on the big data team at
Orbitz, co-founder of the Chicago Hadoop User Group
and Chicago Big Data
Jonathan Seidman
4. Ted Malaska & Jonathan Seidman
Foundations
forArchitecting
Data Solutions
MANAGING SUCCESSFUL DATA PROJECTS
9. Before we get started…
§ A story of two paths:
- Vendor
- Customer
10. There’s a lot on the line
§ Limited amount of time to make decisions
§ Time and resource cost money
§ Don’t want to paint yourself into a corner
§ Everyday something new comes up
§ Mistakes can be career/project damaging
11. And many voices
§ There are the high profile companies
§ There are the venders
§ There are the hype trains
§ There are the demands from your business
§ There is the desire to build up your resume
17. Build or buy? It depends
§ Internal culture
§ Technical religions
§ Skill sets
§ Opportunity for upside
- Incremental
- Revolutionary
- Control
§ Effectiveness of a vender
18. Consider buy in
§ Internal buy in
§ Allow for fail fast and restarting
19. How to stay focused
§ Where is the Business Value
§ All things are possible but not all things are easy
§ Stay dumb
§ You must hate the technology before you select it
§ Treat a vender as a vender not a friend
§ There must be a desire
20. Reducing risk
§ Interface design
§ Fail fast
§ Cloud based provisioning
§ Containers
§ Identify weak points with a passion
24. Instead, build well rounded teams
Sysadmins Developers Analysts Data Scientists
Other roles:
Data Protection Officer Network/Systems EngineersProduct Managers
25. How to find people?
Start with people you already have, but make sure you invest in
training…
§Linux, network, DBAs –> sysadmins
§Developers –> developers
- Easy if you’re at a company like Orbitz, otherwise maybe not so much.
§Analysts –> analysts
§It’s not an easy path though.
- Set goals instead of micro-managing development.
- Be prepared to iterate, don’t be afraid to fail.
26. Also don’t forget other teams
Communication is key
DBAs Other Project Teams
27. Think beyond just skills
§ Also look for complementary personalities.
§ And avoid toxic personalities.
- But what if they’re really talented?
- See above.
28. Customer Engagement
§ Your teams should work closely with your customers, whether they’re external or internal.