Robin Bloor and Teradata
Live Webcast on April 22, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=2e69345c0a6a4e5a8de6fc72652e3bc6
Can you replace the data warehouse with Hadoop? Is Hadoop an ideal ETL subsystem? And what is the real magic of Hadoop? Everyone is looking to capitalize on the insights that lie in the vast pools of big data. Generating the value of that data relies heavily on several factors, especially choosing the right solution for the right context. With so many options out there, how do organizations best integrate these new big data solutions with the existing data warehouse environment?
Register for this episode of The Briefing Room to hear veteran analyst Dr. Robin Bloor as he explains where Hadoop fits into the information ecosystem. He’ll be briefed by Dan Graham of Teradata, who will offer perspective on how Hadoop can play a critical role in the analytic architecture. Bloor and Graham will interactively discuss big data in the big picture of the data center and will also seek to dispel several common misconceptions about Hadoop.
Visit InsideAnalysis.com for more information.
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
VMware Tanzu
Pivotal HAWQ, one of the world’s most advanced enterprise SQL-on-Hadoop technologies, coupled with the Hortonworks Data Platform, the only 100% open source Apache Hadoop data platform, can turbocharge your analytic efforts. The slides from this technical webinar present a deep dive on this powerful modern data architecture for analytics and data science.
Learn more here: http://pivotal.io/big-data/pivotal-hawq
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Inside Analysis
The Briefing Room with Dr. Robin Bloor and Splice Machine
Live Webcast August 11, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=e1b33c9d45b178e13784b4a971a4c1349
The ETL process was born out of necessity, and for decades it has been the glue between data sources and target applications. But as data growth soars and increased competition demands real-time data, standard ETL has become brittle and often unmanageable. Scaling up resources can do the trick, but it’s very costly and only a matter of time before the processes hit another bottleneck. When outmoded ETL stands in the way of real-time analytics, it might be time to consider a completely new approach.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he explains how modern, data-driven architectures must adopt an equally capable data integration strategy. He’ll be briefed by Rich Reimer of Splice Machine, who will discuss how his company solves ETL performance issues and enables real-time analytics and reports on big data. He will show that by leveraging the scale-out power of Hadoop and the in-memory speed of Spark, users can bring both analytical and operational systems together, eventually performing transformations only when needed.
Visit InsideAnalysis.com for more information.
BDM39: HP Vertica BI: Sub-second big data analytics your users and developers...
Big Data Montreal
Despite how fantastic pigs look with lipstick on and how magical elephants look with wings attached, there remains a large gap between what popular big data stacks offer and what end users demand in terms of reporting agility and speed. Join us to learn how Montreal-based AdGear, an advertising technology company, faced challenges as its data volume increased. You will hear how AdGear's data stack evolved to meet these challenges, and how HP Vertica's architecture and features changed the game.
(by Mina Naguib, Technical Director of Platform Engineering at AdGear).
https://youtu.be/tzQUUCuVjVc
HAWQ: a massively parallel processing SQL engine in Hadoop
BigData Research
HAWQ, developed at Pivotal, is a massively parallel processing SQL engine sitting on top of HDFS. As a hybrid of an MPP database and Hadoop, it inherits the merits of both. It adopts a layered architecture and relies on the distributed file system for data replication and fault tolerance. In addition, it is standard SQL compliant, and unlike other SQL engines on Hadoop, it is fully transactional. This paper presents the novel design of HAWQ, including query processing, the scalable software interconnect based on the UDP protocol, transaction management, fault tolerance, read-optimized storage, the extensible framework for supporting various popular Hadoop-based data stores and formats, and various optimization choices we considered to enhance query performance. The extensive performance study shows that HAWQ is about 40x faster than Stinger, which is reported to be 35x-45x faster than the original Hive.
Level Up – How to Achieve Hadoop Acceleration
Inside Analysis
The Briefing Room with Robin Bloor and HP Vertica
Live Webcast on August 26, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=3dd6d1b068fe395f665c75adb682ac41
Hadoop has long passed the point of being a nascent technology, but many users have found that when left to its own devices, Hadoop can be a one-trick pony. To get the most out of Hadoop, organizations need a flexible platform that empowers analysts and data managers with a complete set of information lifecycle management and analytics tools without a performance tradeoff.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he outlines Hadoop’s role in a big data architecture. He’ll be briefed by Walt Maguire of HP Vertica, who will showcase his company’s big data solutions, including HAVEn and the HP Big Data Platform. He will demonstrate how HP Vertica acts as a complement to Hadoop, and how the combination of the two provides a versatile and highly performant solution.
Visit InsideAnalysis.com for more information.
A Big Data Journey: Bringing Open Source to Finance
Slim Baltagi
Slim Baltagi & Rick Fath. Closing Keynote: Big Data Executive Summit. Chicago 11/28/2012.
PART I – Hadoop at CME: Our Practical Experience
1. What’s CME Group Inc.?
2. Big Data & CME Group: a natural fit!
3. Drivers for Hadoop adoption at CME Group
4. Key Big Data projects at CME Group
5. Key Learnings
PART II - Bringing Hadoop to the Enterprise: Challenges & Opportunities
1. What is Hadoop, what isn’t it, and what can it help you do?
2. What are the operational concerns and risks?
3. What organizational changes to expect?
4. What are the observed Hadoop trends?
Mr. Slim Baltagi is a Systems Architect at Hortonworks, with over 4 years of Hadoop experience working on 9 Big Data projects: Advanced Customer Analytics, Supply Chain Analytics, Medical Coverage Discovery, Payment Plan Recommender, Research Driven Call List for Sales, Prime Reporting Platform, Customer Hub, Telematics, Historical Data Platform; with Fortune 100 clients and global companies from Financial Services, Insurance, Healthcare and Retail.
Mr. Slim Baltagi has worked in various architecture, design, development and consulting roles at Accenture, CME Group, TransUnion, Syntel, Allstate, TransAmerica, Credit Suisse, Chicago Board Options Exchange, Federal Reserve Bank of Chicago, CNA, Sears, USG, ACNielsen, and Deutsche Bahn.
Mr. Baltagi also has over 14 years of IT experience with an emphasis on full life cycle development of Enterprise Web applications using Java and Open-Source software. He holds a master’s degree in mathematics and is an ABD in computer science from Université Laval, Québec, Canada.
Languages: Java, Python, JRuby, JEE, PHP, SQL, HTML, XML, XSLT, XQuery, JavaScript, UML, JSON
Databases: Oracle, MS SQL Server, MySQL, PostgreSQL
Software: Eclipse, IBM RAD, JUnit, JMeter, YourKit, PVCS, CVS, UltraEdit, Toad, ClearCase, Maven, iText, Visio, JasperReports, Alfresco, YSlow, Terracotta, SoapUI, Dozer, Sonar, Git
Frameworks: Spring, Struts, AppFuse, SiteMesh, Tiles, Hibernate, Axis, Selenium RC, DWR Ajax, XStream
Distributed Computing/Big Data: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, HBase, R, RHadoop, Cloudera CDH4, MapR M7, Hortonworks HDP 2.1
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks
Real-time monitoring requires a highly scalable infrastructure made up of a message bus, a database, distributed event processing and a scalable analytics engine. By bringing together the leading open source projects Apache Kafka, Apache HBase, Apache Storm and Apache Hive, the Hortonworks Data Platform offers a comprehensive real-time analysis platform. In this session, we will provide an in-depth overview of all the key technology components and demonstrate a working solution for monitoring a fleet of trucks.
Audience: Developers, Architects and System Engineers from the Hortonworks Technology Partner community.
Recording: https://hortonworks.webex.com/hortonworks/lsr.php?RCID=0278dc8aa49a9991e1ce436c71f53d30
Slides from the joint webinar. Learn how Pivotal HAWQ, one of the world’s most advanced enterprise SQL-on-Hadoop technologies, coupled with the Hortonworks Data Platform, the only 100% open source Apache Hadoop data platform, can turbocharge your Data Science efforts.
Together, Pivotal HAWQ and the Hortonworks Data Platform provide businesses with a Modern Data Architecture for IT transformation.
2015 nov 27_thug_paytm_rt_ingest_brief_final
Adam Muise
Paytm Labs provides a quick overview of their Hadoop data ingest platform. We cover our journey from a batch-focused ingest system with Sqoop to a streaming ingest supported by Kafka, Confluent.io, Hadoop, Cassandra, and Spark Streaming. This presentation also provides an overview of our complete data platform, including our feature creation template.
The Briefing Room with Dr. Robin Bloor and HP Vertica
Live Webcast on April 15, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=ad9b301880f27007e836560cf3cd8904
Hadoop has emerged as a chief solution for big data challenges, and businesses are eager to capture the potential value from the vast pools of newly available information assets. But when the data lake is comprised of semi-structured data – click stream logs, sensor data, text files – it can make access and performance a bit more difficult to achieve. One way to clear the hurdle is by using an analytic platform, specifically one that has been designed to enable exploration of Hadoop data using standard SQL queries.
Register for this episode of The Briefing Room to hear from veteran Analyst Robin Bloor as he explains the role big data plays in enterprise analytics. He’ll be briefed by Jeff Healey and Eamon O'Neill of HP Vertica, who will tout their company’s Flex Zone, a new component of its Analytics Platform. They will discuss how Flex Zone empowers data scientists and business analysts by tapping into Hadoop via SQL, providing a one-stop shop for real-time analytics on massive volumes of data.
Visit InsideAnalysis.com for more information.
YARN Ready: Integrating to YARN with Tez
Hortonworks
The YARN Ready webinar series helps developers integrate their applications with YARN. Tez is one vehicle for doing that. We take a deep dive, including a code review, to help you get started.
Hadoop Reporting and Analysis - Jaspersoft
Hortonworks
Hadoop is deployed for a variety of uses, including web analytics, fraud detection, security monitoring, healthcare, environmental analysis, social media monitoring, and other purposes.
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar) ...
Hortonworks
No matter if you are new to Hadoop or have a mature cluster in production, scale will be a critical factor of your success with Hadoop. Are you ready to take the next big step as you scale out your data architecture?
Talend and Hortonworks discuss how to implement an effective big data and Hadoop strategy across your IT infrastructure. You will learn:
How to grow a pilot into production
How to scale out architecture & systems affordably
How to leverage the flexibility of Hadoop to optimize your data integration processes
Recording: http://www.talend.com/resources/webinars/starting-small-and-scaling-big-with-hadoop
Introduction to Designing and Building Big Data Applications
Cloudera, Inc.
Learn what the course covers, from capturing data to building a search interface; the spectrum of processing engines, Apache projects, and ecosystem tools available for converged analytics; who is best suited to attend the course and what prior knowledge you should have; and the benefits of building applications with an enterprise data hub.
Introduction To Big Data with Hadoop and Spark - For Batch and Real Time Proc...
Agile Testing Alliance
Introduction To Big Data with Hadoop and Spark - For Batch and Real Time Processing, by Sampat Kumar of Harman. The presentation was given at #doppa17, the DevOps++ Global Summit 2017. All copyrights are reserved by the author.
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
Hortonworks
There certainly is no shortage of hype when it comes to the term “Big Data”. One thing we can be sure of is that massive data volumes are driving a new modern data architecture that includes Hadoop in the mix. But what does that architecture look like for Business Intelligence Data Strategy?
Join Hortonworks and MicroStrategy, where we’ll:
• Discuss the modern architecture for Business Intelligence on top of Hadoop as a data source.
• Learn how our joint solution helps enterprises store, process and analyze vast amounts of structured and unstructured data to deliver business insights throughout an organization.
• Discover what new benefits Hadoop 2.0 offers and how the MicroStrategy Analytics platform leverages those new features to improve performance, achieve faster access times, and allow for true interactive visual data discovery.
Not Your Father’s Data Warehouse: Breaking Tradition with Innovation
Inside Analysis
The Briefing Room with Dr. Robin Bloor and Teradata
Live Webcast on May 20, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=f09e84f88e4ca6e0a9179c9a9e930b82
Traditional data warehouses have been the backbone of corporate decision making for over three decades. With the emergence of Big Data and popular technologies like open-source Apache™ Hadoop®, some analysts question the lifespan of the data warehouse and the future role it will play in enterprise information management. But it’s not practical to believe that emerging technologies provide a wholesale replacement of existing technologies and corporate investments in data management. Rather, a better approach is for new innovations and technologies to complement and build upon existing solutions.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he explains where tomorrow’s data warehouse fits in the information landscape. He’ll be briefed by Imad Birouty of Teradata, who will highlight the ways in which his company is evolving to meet the challenges presented by different types of data and applications. He will also tout Teradata’s recently-announced Teradata® Database 15 and Teradata® QueryGrid™, an analytics platform that enables data processing across the enterprise.
Visit InsideAnalysis.com for more information.
5 Years of Progress in Active Data Warehousing
Teradata
A presentation by Teradata's Dan Graham from the 2010 Teradata User Group meetings, covering Active Data Warehousing over the last five years.
For more information on Active Data Warehousing, please visit Teradata.com
Conceptual framing for educational research through Deleuze and Guattari
David R Cole
This presentation will address the issue of conceptual framing for educational research through the philosophy of Deleuze & Guattari. The picture of what this means is complicated by the fact that in their combined texts, Deleuze and Guattari present different notions of conceptual framing. In their final joint text, What is Philosophy? conceptual framing appears in the context of concept creation, and helps with the analysis of western philosophy through concepts such as ‘geophilosophy’. In their joint texts on Capitalism and Schizophrenia, concepts are aligned with pre-personal and individualising flows that pass through any context. This presentation will make sense of the disparate deployment of concepts in the work of Deleuze & Guattari to aid clear conceptual work in the growing international field of educational research inspired by their philosophy.
Smart companies know that business intelligence surfaces insights. With complex analytics, data mining and everything in between, it takes many moving parts to serve up the big picture. The key is to provide full-stack visibility into the entire BI environment, ensuring solid service and system performance.
Learn more at http://www.insideanalysis.com
Building an Intelligent Biobank to Power Research Decision-Making
Denodo
This presentation belongs to the workshop: "Building an Intelligent Biobank to Power Research Decision-Making", from ISBER 2015 Annual Meeting by Lori A. Ball (Chief Operating Officer, President of Integrated Client Solutions at BioStorage Technologies, Inc), Brian Brunner (Senior Manager, Clinical Practice at LabAnswer) and Suresh Chandrasekaran (Senior Vice President at Denodo).
The workshop covers three different topic areas:
- Research sample intelligence: the growing need for Global Data Integration (Biobank Sample and Data Stakeholders).
- Building a research data integration plan and cloud sourcing strategy (data integration).
- How data virtualization works and the value it delivers (a data virtualization introduction, solution portfolio and current customers in the Life Sciences industry).
The biomedical R&D environment is increasingly dependent on data meta-analysis and bioinformatics to support research advancements. The integration of biorepository sample inventory data with biomarker and clinical research information has become a priority to R&D organizations. Therefore, a flexible IT system for managing sample collections, integrating sample data with clinical data and providing a data virtualization platform will enable the advancement of research studies. This workshop provides an overview of how sample data integration, virtualization and analytics can lead to more streamlined and unified sample intelligence to support global biobanking for future research.
A Bigger Magnifying Glass: Analyzing the Internet of Things Eric Kavanagh
The Briefing Room with Richard Hackathorn and Teradata
The blessing and curse of IoT is the size of the equation. The opportunities are everywhere, but so are the challenges. With so many moving parts, and such diversity of systems and data involved, the Internet of Things promises to fundamentally change the way companies use data. In many cases, success will require a bigger magnifying glass, and greater discipline in designing and managing analytic architectures.
Register for this episode of The Briefing Room to learn from veteran analyst Richard Hackathorn, who will share insights from his recent report on IoT use cases. He'll be joined by Dan Graham of Teradata, who will explain why getting value from the Internet of Things will rely on 'Analytics-of-Things' -- a refined practice of carefully designing and managing analytical workflows to leverage the vast amounts of sensor data coming down the turnpike.
Data Lake vs. Data Warehouse: Which is Right for Healthcare? Health Catalyst
The data lake style of a data warehouse architecture is a flexible alternative to a traditional data warehouse. It allows for unstructured data. When a warehousing approach requires that the data be in a structured format, there are constraints on the analyses that can be performed because not all of the data can be structured early. The data lake concept is very similar to our Late-Binding approach in that data lakes are our source marts. We increase the efficiency and effectiveness of these through: 1. Metadata, 2. Source Mart Designer, and 3. Subject Area Mart Designer.
Data virtualization, Data Federation & IaaS with JBoss Teiid Anil Allewar
Enterprises have always grappled with the problem of information silos that needed to be merged using multiple data warehouses (DWs) and business intelligence (BI) tools so that they could mine this disparate data for business decisions and strategy. Traditionally, this data integration was done with ETL by consolidating multiple DBMSs into a single data storage facility.
Data virtualization enables abstraction, transformation, federation, and delivery of data taken from a variety of heterogeneous data sources as if they were a single virtual data source, without the need to physically copy the data for integration. It allows consuming applications or users to access data from these various sources via a request to a single access point and delivers information-as-a-service (IaaS).
In this presentation, we will explore what data virtualization is and how it differs from the traditional data integration architecture. We’ll also look at validating the data virtualization and federation concepts by working through an example (see videos at the GitHub repo) to federate data across two heterogeneous data sources, MySQL and MongoDB, using the JBoss Teiid data virtualization platform.
Enterprise Hadoop is Here to Stay: Plan Your Evolution Strategy Inside Analysis
The Briefing Room with Neil Raden and Teradata
Live Webcast on August 19, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=1acd0b7ace309f765dc3196001d26a5e
Modern enterprises have been able to solve information management woes with the data warehouse, now a staple across the IT landscape that has evolved to a high level of sophistication and maturity with thousands of global implementations. Today’s modern enterprise has a similar challenge; big data and the fast evolution of the Hadoop ecosystem create plenty of new opportunities but also a significant number of operational pains as new solutions emerge.
Register for this episode of The Briefing Room to hear veteran Analyst Neil Raden as he explores the details and nature of Hadoop’s evolution. He’ll be briefed by Cesar Rojas of Teradata, who will share how Teradata solves some of the Hadoop operational challenges. He will also explain how the integration between Hadoop and the data warehouse can help organizations develop a more responsive and robust data management environment.
Visit InsideAnalysis.com for more information.
Are you confused by Big Data? Get in touch with this new "black gold" and familiarize yourself with undiscovered insights through our complimentary introductory lesson on Big Data and Hadoop!
The next generation user experience should move to customer engagement zones along their preferred channels with desired action to outcome approaches. With scores of information ranging from inventory to inquiry, weather to warehouse alerts, product to promotion info at disposal, enterprise digitization can create value at every customer touch point. Attendees witnessed the manifestation of TCS’ Thought Leadership in the Game of Retail.
Hadoop and the Data Warehouse: When to Use Which DataWorks Summit
In recent years, Apache™ Hadoop® has emerged from humble beginnings to disrupt the traditional disciplines of information management. As with all technology innovation, hype is rampant, and data professionals are easily overwhelmed by diverse opinions and confusing messages.
Even seasoned practitioners sometimes miss the point, claiming for example that Hadoop replaces relational databases and is becoming the new data warehouse. It is easy to see where these claims originate since both Hadoop and Teradata® systems run in parallel, scale up to enormous data volumes and have shared-nothing architectures. At a conceptual level, it is easy to think they are interchangeable, but the differences overwhelm the similarities. This session will shed light on the differences and help architects, engineering executives, and data scientists identify when to deploy Hadoop and when it is best to use MPP relational database in a data warehouse, discovery platform, or other workload-specific applications.
Two of the most trusted experts in their fields, Steve Wooledge, VP of Product Marketing at Teradata, and Jim Walker of Hortonworks, will examine how big data technologies are being used today by practitioners.
The Hadoop Guarantee: Keeping Analytics Running On Time Inside Analysis
The Briefing Room with Dr. Robin Bloor and Pepperdata
Live Webcast September 15, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=32f198185d9d0c4cf32c27bdd1498b2a
Industry researchers agree: the importance of Hadoop will continue to grow as more companies recognize the range of benefits they can reap, from lower-cost storage to better business insights. At the same time, advances in the Hadoop ecosystem are addressing many of the key concerns that have hampered adoption, including performance and reliability. As a result, Hadoop is fast becoming a first-class citizen in the world of enterprise computing.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor explain how the Hadoop ecosystem is evolving into a mature foundation for managing enterprise data. He’ll be briefed by Sean Suchter of Pepperdata, who will explain how his company’s software brings predictability and reliability to Hadoop through dynamic, policy-based controls and monitoring. He’ll show how to guarantee service-level agreements by slowing down low-priority tasks as needed. He’ll also discuss the holy grail of Hadoop: how to enable mixed workloads.
Visit InsideAnalysis.com for more information.
Hitachi Data Systems Hadoop Solution. Customers are seeing exponential growth of unstructured data, from their social media websites to operational sources. Their enterprise data warehouses are not designed to handle such high volumes and varieties of data. Hadoop, the latest software platform that scales to process massive volumes of unstructured and semi-structured data by distributing the workload through clusters of servers, is giving customers a new option to tackle data growth and deploy big data analysis to help better understand their business. Hitachi Data Systems is launching its latest Hadoop reference architecture, which is pre-tested with the Cloudera Hadoop distribution to provide a faster time to market for customers deploying Hadoop applications. HDS, Cloudera and Hitachi Consulting will present together and explain how to get you there. Attend this WebTech and learn how to: Solve big-data problems with Hadoop. Deploy Hadoop in your data warehouse environment to better manage your unstructured and structured data. Implement Hadoop using HDS Hadoop reference architecture. For more information on Hitachi Data Systems Hadoop Solution please read our blog: http://blogs.hds.com/hdsblog/2012/07/a-series-on-hadoop-architecture.html
5 Things that Make Hadoop a Game Changer
Webinar by Elliott Cordo, Caserta Concepts
There is much hype and mystery surrounding Hadoop's role in analytic architecture. In this webinar, Elliott presented, in detail, the services and concepts that make Hadoop a truly unique solution - a game changer for the enterprise. He talked about the real benefits of a distributed file system, the multi-workload processing capabilities enabled by YARN, and the 3 other important things you need to know about Hadoop.
To access the recorded webinar, visit the event site: https://www.brighttalk.com/webcast/9061/131029
For more information on the services and solutions that Caserta Concepts offers, please visit http://casertaconcepts.com/
How Hewlett Packard Enterprise Gets Real with IoT Analytics Arcadia Data
Learn how HPE uses visual analytics within a data lake to create an “Industrial Internet of Things” model that solves their data analytics problem at scale.
Overview of Apache Trafodion (incubating), Enterprise Class Transactional SQL-on-Hadoop DBMS, with operational use cases, what it takes to be a world class RDBMS, some performance information, and the new company Esgyn which will leverage Apache Trafodion for operational solutions.
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud DataWorks Summit
The world’s largest enterprises run their infrastructure on Oracle, DB2 and SQL and their critical business operations on SAP applications. Organisations need this data to be available in real-time to conduct necessary analytics. However, delivering this heterogeneous data at the speed it’s required can be a huge challenge because of the complex underlying data models and structures and legacy manual processes which are prone to errors and delays.
Unlock these silos of data and enable the new advanced analytics platforms by attending this session.
Find out how:
• Enterprises can overcome common challenges when trying to access their SAP data
• You can integrate SAP data in real-time with change data capture (CDC) technology
• Organisations are using Attunity Replicate for SAP to stream SAP data into Kafka
Speakers:
John Hol, Regional Director, Attunity
Mike Hollobon, Director Business Development, IBT
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop Caserta
In our most recent Big Data Warehousing Meetup, we learned about transitioning from Big Data 1.0 with Hadoop 1.x with nascent technologies to the advent of Hadoop 2.x with YARN to enable distributed ETL, SQL and Analytics solutions. Caserta Concepts Chief Architect Elliott Cordo and an Actian Engineer covered the complete data value chain of an Enterprise-ready platform including data connectivity, collection, preparation, optimization and analytics with end user access.
For more information on our services or upcoming events, please visit our website at http://www.casertaconcepts.com/.
The Anywhere Enterprise – How a Flexible Foundation Opens Doors Inside Analysis
The Briefing Room with Dr. Robin Bloor and InfiniDB
Live Webcast on August 12, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=1e562c3a4b9e9cb9a054f0ec216d578b
Today’s organizations need all kinds of data, from a wide and growing array of sources. Marshaling all that data into one location can be difficult, even unrealistic. Increasingly, innovative companies are taking a much more distributed approach to storing and processing data. The end result is an information architecture that supports a broader range of business activities, and reduces dependence on costly data movement.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor, as he explains how a distributed approach to data management can open doors to new business opportunities. He’ll be briefed by Jim Tommaney of InfiniDB who will explain how his company’s database has the flexibility to run on-prem, in the cloud, with cluster file systems or even Hadoop’s HDFS. He’ll also show how InfiniDB can serve as a conduit to companies looking to transform their information architecture to better satisfy changing market demands.
Visit InsideAnalysis.com for more information.
Agile, Automated, Aware: How to Model for Success Inside Analysis
The Briefing Room with David Loshin and Embarcadero
Live Webcast October 27, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=eea9877b71c653c499c809c5693eae8fe
Data management teams face some tough challenges these days. Organizations need business-driven visibility that enables understanding and awareness of enterprise data assets – without worrying about definitions and change management. But with information architectures evolving into a hybrid mix of data objects and data services built over relational databases as well as big data stores, serving up accurately defined, reusable data can become a complex issue.
Register for this episode of The Briefing Room to learn from veteran Analyst David Loshin as he explains the importance of agile, automated workflows in today’s enterprise. He’ll be briefed by Ron Huizenga of Embarcadero, who will discuss how his company’s ER/Studio suite approaches data modeling and management from a modern architecture standpoint. He will explain that unifying the way information is represented can not only eliminate the need for costly workarounds, but also foster collaboration between data architects, developers and business users.
Visit InsideAnalysis.com for more information.
First in Class: Optimizing the Data Lake for Tighter Integration Inside Analysis
The Briefing Room with Dr. Robin Bloor and Teradata RainStor
Live Webcast October 13, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=012bb2c290097165911872b1f241531d
Hadoop data lakes are emerging as peers to corporate data warehouses. However, successful data management solutions require a fusion of all relevant data, new and old, which has proven challenging for many companies. With a data lake that’s been optimized for fast queries, solid governance and lifecycle management, users can take data management to a whole new level.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he discusses the relevance of data lakes in today’s information landscape. He’ll be briefed by Mark Cusack of Teradata, who will explain how his company’s archiving solution has developed into a storage point for raw data. He’ll show how the proven compression, scalability and governance of Teradata RainStor combined with Hadoop can enable an optimized data lake that serves as both a reservoir for historical data and a “system of record” for the enterprise.
Visit InsideAnalysis.com for more information.
Fit For Purpose: Preventing a Big Data Letdown Inside Analysis
The Briefing Room with Dr. Robin Bloor and RedPoint Global
Live Webcast October 6, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=9982ad3a2603345984895f279e849d35
Gartner recently placed Big Data in its “trough of disillusionment,” reflective of many leaders’ struggle to prove the value of Hadoop within their organization. While the promise of enhanced data integration and enrichment is obvious, measurable results have remained elusive. This episode of The Briefing Room will outline how to successfully tie Big Data to existing business applications, preventing your next Hadoop project from being another “Big Data letdown.”
Register today to learn from veteran Analyst Dr. Robin Bloor as he discusses the importance of converging enterprise data integration with intelligence and scalability. He’ll be briefed by George Corugedo of RedPoint Global, who will provide concrete examples of how the convergence of scalable cloud platforms, ever-expanding data sources and intelligent execution can turn the Big Data hype into demonstrable business value.
Visit InsideAnalysis.com for more information.
To Serve and Protect: Making Sense of Hadoop Security Inside Analysis
The Briefing Room with Dr. Robin Bloor and HP Security Voltage
Live Webcast September 22, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=45ece7082b1d7c2cc8179bc7a1a69ea5
Hadoop is rapidly becoming a development platform and dominant server environment, and organizations are keen to take advantage of its massively scalable – and relatively inexpensive – resources. It is not, however, without its limitations, and it often requires a contingent of complementary components in order to behave within an information architecture. One area often overlooked is security, a factor that, if not considered from the onset, can insert great risk when putting sensitive data in Hadoop.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he discusses how security was never a design point for Hadoop and what organizations can do about it. He’ll be briefed by Sudeep Venkatesh of HP Security Voltage, who will explain the intricacies surrounding a secure Hadoop implementation. He will show how techniques like format-preserving and partial-field encryption can allow for analytics over protected data, with zero performance impact.
Visit InsideAnalysis.com for more information.
Special Edition with Dr. Robin Bloor
Live Webcast September 9, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=e8b9ac35d8e4ffa3452562c1d4286a975
Do the math: algebra will transform information management. Just as the relational database revolutionized the information landscape, so will a just-released, complete algebra of data overhaul the industry itself. So says Dr. Robin Bloor in his new book, the Algebra of Data, which he’ll outline in this special one-hour webcast.
Once organizations learn how to express their data sets algebraically, the benefits will be significant and far-reaching. Data quality problems will slowly subside; queries will run orders of magnitude faster; integration challenges will fade; and countless tedious jobs in the data management space will bid their farewell. But first, software companies must evolve, and that will take time.
Visit InsideAnalysis.com for more information.
The Role of Data Wrangling in Driving Hadoop Adoption Inside Analysis
The Briefing Room with Mark Madsen and Trifacta
Live Webcast September 1, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=eb655874d04ba7d560be87a9d906dd2fd
Like all enterprise software solutions, Hadoop must deliver business value in order to be a success. Much of the innovation around the big data industry these days therefore addresses usability. While there will always be a technical side to the Hadoop equation, the need for user-friendly tools to manage the data will continue to focus on business users. That’s why self-service data preparation or "data wrangling" is a serious and growing trend, one which promises to move Hadoop beyond the early adopter phase and more into the mainstream of business.
Register for this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature explain why business users will play an increasingly important role in the evolution of big data. He’ll be briefed by Trifacta's Will Davis and Alon Bartur, who will demonstrate how Trifacta's solution empowers business users to “wrangle” data of all shapes and sizes faster and more easily than ever before. They’ll discuss why a new approach to accessing and preparing diverse data is required and how it can accelerate and broaden the use of big data within organizations.
Visit InsideAnalysis.com for more information.
Ahead of the Stream: How to Future-Proof Real-Time Analytics Inside Analysis
Business seems to move faster by the day, with the most cutting edge companies taking advantage of real-time data streams for heavy duty analytics. But with so much innovation happening in so many places, how can companies stay ahead of the game? One answer is to future-proof your analytics architecture by using an abstraction layer that can translate your business use-case or work-flow to one of many leading innovative technologies to address the growing number of use cases in this dynamic field.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor, as he explains how a data flow architecture can harness a wide range of streaming solutions. He'll be briefed by Anand Venugopal of Impetus Technologies, who will showcase his company's StreamAnalytix platform, which was designed from the ground up to leverage multiple major streaming engines available today, including Apache Spark, Apache Storm and others. He'll demonstrate how StreamAnalytix provides enterprise-class performance while incorporating best-of-breed open-source components.
View the archive at: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=925d1e9b639b78c6cf76a1bbbf485b2b
All Together Now: Connected Analytics for the Internet of Everything Inside Analysis
The Briefing Room with Mark Madsen and Cisco
Live Webcast August 18, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=0eff120f8b2879b582b77f4ff207ee54
Today's digital enterprises are seeing an explosion of data at the edge. The Internet of Everything is fast approaching a critical mass that will demand a sea change in how companies process data. This new world of information is widely distributed, streaming, and overall becoming too big to move. Experts predict that within two to three years, the bulk of analytic processing will take place on the fringes of information architectures. As a result, forward-thinking companies are dramatically shifting their analytic strategies.
Register for this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature explain how a new era of information architectures is now unfolding, paving the way to much more responsive and agile business models. He'll be briefed by Kim Macpherson of the Cisco Data and Analytics Business Unit, who will explain how her company's platform is uniquely suited for this new, federated analytic paradigm. She'll demonstrate how edge analytics can help companies address opportunities quickly and effectively.
Visit InsideAnalysis.com for more information.
The Biggest Picture: Situational Awareness on a Global Level Inside Analysis
The Briefing Room with Dr. Robin Bloor and Modus Operandi
Live Webcast July 28, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=efc4082d9b0b0adfcd753a7435d2d6a1b
The analytic bottlenecks of yesterday need not apply today. The boundaries are also falling thanks in large part to the abundance of third-party data. The most data-driven companies these days are finding creative ways to dynamically incorporate data from within and beyond the firewall, thus building highly accurate, multidimensional views of their business, customer, competition or other subject areas.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor as he explains the magnitude of change that's occurring in the world of data, why it's happening now, and how you can take advantage. He'll be briefed by Mike Gilger and Boris Pelakh, who will showcase their company's enterprise analytics platform, which combines a range of battle-tested functionality to deliver dynamic situational awareness that can leverage a comprehensive array of data sets. They'll explain how the platform's reasoner benefits from a highly scalable rules engine, and a flexible modeling capability that can optimize data storage virtually on the fly.
Visit InsideAnalysis.com for more information.
Structurally Sound: How to Tame Your Architecture Inside Analysis
The Briefing Room with Krish Krishnan and Teradata
Live Webcast July 21, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=602b2a8413e8719d39465f4d6291d505
Technology changes all the time, but the basic needs of the business are the same: BI and analytics. With new types of data, various analytics engines and multiple systems, giving business users seamless access to enterprise data can be a rather daunting process. One solution is to provide a complete fabric that spans the organization, touching all data points and masking the complexity behind disparate sources.
Register for this episode of The Briefing Room to learn from veteran Analyst Krish Krishnan as he explores how and why architectures have changed over the years. He’ll be briefed by Imad Birouty of Teradata, who will discuss his company’s QueryGrid, an analytics solution designed to provide access to data across all systems. He will show how QueryGrid essentially creates a logical data warehouse and enables users to leverage SQL over multiple data types.
Visit InsideAnalysis.com for more information.
SQL In Hadoop: Big Data Innovation Without the Risk Inside Analysis
The Briefing Room with Dr. Robin Bloor and Actian
Live Webcast July 14, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=bbd4395ea2f8c60a03cfefc68c7aa823
Innovation often implies risk, which is why businesses have many issues to weigh when considering change. Yet the remarkable growth of data is driving many traditional systems into the ground, forcing information workers to take a critical look at their existing tools. Technologies like Hadoop offer economical solutions to big data management, but to truly take advantage of its capabilities, organizations must modernize their infrastructure.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he explains how and why organizations should improve legacy systems. He’ll be briefed by Todd Untrecht of Actian, who will tout his company’s Actian Vortex, a SQL-in-Hadoop solution. He will show how integrating a SQL engine directly in the Hadoop cluster can lead to faster analytics and greater control, while still maintaining existing investments.
Visit InsideAnalysis.com for more information.
The Briefing Room with Dr. Robin Bloor and SYSTAP
Live Webcast June 30, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=0ff3889293f6c090483295fd7362c5a4
There's a reason why the biggest Web companies these days leverage graph technology: it is incredibly powerful for revealing a wide range of insights. Unlike other analytical databases, graph can very quickly identify the kinds of patterns that lead to better business decisions. Though relatively nascent in existing data centers, graph databases are proving to be well-suited for all kinds of business use cases, from clustering and hypothesis generation to failure detection and cyber analytics.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he discusses how semantic technology fits in the spectrum of database and discovery solutions. He’ll be briefed by Brad Bebee of SYSTAP, who will showcase his company’s Blazegraph products and Mapgraph technology. He will explain how SYSTAP’s approach overcomes the challenge of scalability, and how graph technology’s powerful data management capabilities can deliver better enterprise performance and analytics using GPUs and other approaches.
Visit InsideAnalysis.com for more information.
A Revolutionary Approach to Modernizing the Data Warehouse Inside Analysis
Hot Technologies with Rick Sherman, Dr. Robin Bloor and Snowflake Computing
Live Webcast June 25, 2015
Watch the Archive: https://bloorgroup.webex.com/bloorgroup/onstage/g.php?MTID=e6e6de6cdfa8926e7a9d52e099a1a08e2
Enterprise software tends to advance in one of two ways: evolutionary and revolutionary. Evolutionary advances happen through incremental improvements made to an existing code base over a long period of time. Revolutionary advances happen when a new solution is designed from scratch, breaking cleanly from legacy approaches to take advantage of technology innovations that can span from hardware to software and methodologies.
Register for this episode of Hot Technologies to hear veteran analysts Rick Sherman of Athena IT Solutions and Dr. Robin Bloor along with Bob Muglia, CEO of Snowflake Computing, explain how a confluence of advances in the data world has opened up new doors for revolutionary advances in data warehousing. They will discuss new technology innovations and how they can be used to create data warehouses with the power, flexibility, and resiliency that modern enterprises need without the complexities and latencies inherent to traditional approaches.
Visit InsideAnalysis.com for more information.
The Maturity Model: Taking the Growing Pains Out of Hadoop Inside Analysis
The Briefing Room with Rick van der Lans and Think Big, a Teradata Company
Live Webcast on June 16, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=197f8106531874cc5c14081ca214eaff
Hadoop is arguably one of the most disruptive technologies of the last decade. Once lauded solely for its ability to transform the speed of batch processing, it has marched steadily forward and promulgated an array of performance-enhancing accessories, notably Spark and YARN. Hadoop has evolved into much more than a file system and batch processor, and it now promises to stand as the data management and analytics backbone for enterprises.
Register for this episode of The Briefing Room to learn from veteran Analyst Rick van der Lans, as he discusses the emerging roles of Hadoop within the analytics ecosystem. He’ll be briefed by Ron Bodkin of Think Big, a Teradata Company, who will explore Hadoop’s maturity spectrum, from typical entry use cases all the way up the value chain. He’ll show how enterprises that already use Hadoop in production are finding new ways to exploit its power and build creative, dynamic analytics environments.
Visit InsideAnalysis.com for more information.
Rethinking Data Availability and Governance in a Mobile World Inside Analysis
The Briefing Room with Malcolm Chisholm and Druva
Live Webcast on June 9, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=baf82d3835c5dfa63202dcbe322a3ad7
The emergence of the mobile workforce has left an indelible mark on the enterprise; every employee is now mobile, and business data continues to be dispatched to the far reaches of the enterprise. While this has added enormous opportunity for increased productivity, it has also muddied the waters when it comes to controlling and protecting valuable data assets. As companies quickly evolve to address the new set of challenges posed by this shift in data usage, IT must ensure that all data, no matter where it’s generated or stored, is available and governed just as if it were still safely behind the corporate firewall.
Register for this episode of The Briefing Room to hear veteran Analyst Malcolm Chisholm as he explains the myriad challenges that mobile data introduces when addressing regulations and compliance needs, requiring new approaches to data governance. He’ll be briefed by Dave Packer of Druva, who will outline his company’s converged data protection strategy, which brings data center class capabilities to backup, availability and governance for the mobile workforce. He will share strategies to meet regional data residency, data recovery, legal hold and eDiscovery requirements and more.
Visit InsideAnalysis.com for more information.
JMeter webinar - integration with InfluxDB and Grafana
RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Hadoop and the Data Warehouse: Point/Counter Point
1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
2. Hadoop and the Data Warehouse: Point/Counter Point
The Briefing Room
3. Twitter Tag: #briefr
The Briefing Room
Welcome
Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com
@eric_kavanagh
4. Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
5. Topics
This Month: BIG DATA
May: DATABASE
June: ANALYTICS & MACHINE LEARNING
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
7. Analyst: Robin Bloor
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com
@robinbloor
8. Teradata
• Teradata is known for its analytics data solutions with a focus on integrated data warehousing, big data analytics and business applications
• It offers a broad suite of technology platforms and solutions and a wide range of data management applications
• Teradata’s SQL-H allows users and applications to join Hadoop data to the Teradata Data Warehouse and the Aster Discovery Platform
9. Guest: Dan Graham
Dan Graham is the Technical Marketing Director for Teradata. With over 30 years in IT, Dan joined Teradata Corporation in 1989 where he was the senior product manager for the DBC/1012 parallel database computer. He then joined IBM where he wrote product plans and launched the RS/6000 SP parallel server. He then became Strategy Executive for IBM’s Global Business Intelligence Solutions. As Enterprise Systems General Manager at Teradata, Dan was responsible for strategy, go-to-market success, and competitive differentiation for the Active Enterprise Data Warehouse platform. He currently leads Teradata’s technical marketing activities.
10. HADOOP AND THE DATA WAREHOUSE
Point, Counterpoint
Myths and Magic
11. AGENDA
• Words and modern terminology
• Is Hadoop a data integration product?
• Is Hadoop a data warehouse?
• Hadoop – the magic
• Teradata and Hadoop
http://www.teradata.com/analyst-reports/Hadoop-and-the-Data-Warehouse-Competitive-or-Complementary/
11 Copyright Teradata
12. Our Host, the Word Smith
@RobinBloor
14. Now we have something that can provide us “Real-time” in Hadoop
• At least most of the time
• Queries are significantly faster but not always instantaneous
> Simple selects → A couple of seconds
> Join queries → 10s of seconds
Source: Slideshare, Real Time Interactive Queries IN HADOOP: Big Data Warehousing Meetup, June 2013
15. Hadoop Translator
• Real time query
> Hadoop meaning
– Self-service interactive queries that run in under minutes, preferably < 10s of seconds
> BI/DW meaning
– Query responses in milliseconds
– + Advanced query prioritization
• SQL
> Hadoop meaning
– Subset of ANSI 92 SQL
– Primary data types
– UDFs
> BI/DW meaning
– ANSI 2008 SQL +
– Some/all ANSI SQL 2011
– All SQL data types
– Integrity constraints, window functions, UDFs, triggers, XML
– ACID transactions (start transaction, commit, rollback)
– Geospatial, temporal
• OLAP
> Hadoop meaning
– Any query < 10 seconds
> BI/DW meaning
– Subsecond multi-dimensional aggregate queries
– Roll-up, drill-down hierarchies
– MOLAP and ROLAP
See: Wikipedia
16. Shoop, Shoop Hadoop!
• Real-Time Query: Real-Time is really business time. It is almost always performance critical (otherwise why would you engineer for it?).
• SQL sophistication depends on what you want to use it for. SQL-92 is rather primitive. There are consequences – performance consequences.
• The appropriateness of Hadoop Interactive (OLAP) capability is user dependent. But why would you use Hadoop for this?
17. Current HDFS Availability & Data Integrity
• Simple design, storage fault tolerance
> Storage: Rely on the OS’s file system rather than use raw disk
> Storage Fault Tolerance: multiple replicas, active monitoring
> Single NameNode Master
– Persistent state: multiple copies + checkpoints
– Restart on failure
• How well did it work?
> Lost 19 out of 329 Million blocks on 10 clusters with 20K nodes in 2009
– 7-9’s of reliability
– Fixed in 20 and 21.
> 18 months Study: 22 failures on 25 clusters - 0.58 failures per year per cluster
– Only 8 would have benefitted from HA failover!! (0.23 failures per cluster year)
> NN is very robust and can take a lot of abuse
– NN is resilient against overload caused by misbehaving apps
Source: Slideshare, NameNode HA, 2011
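The figures on this slide can be checked with a little arithmetic. The sketch below is a reader's check, not part of the original deck. It assumes that "7-9's" means seven nines of block durability and that all 25 clusters were observed for the full 18 months. Under those assumptions the overall failure rate comes out near the quoted 0.58, while the HA-relevant rate comes out nearer 0.21 than the quoted 0.23, so the slide may have used a slightly different denominator.

```python
import math

# Figures quoted on the slide.
BLOCKS_LOST = 19
BLOCKS_TOTAL = 329_000_000
STUDY_MONTHS = 18
CLUSTERS = 25

# Fraction of blocks lost, and the number of leading nines in the
# resulting durability figure (assumed reading of "7-9's").
loss_fraction = BLOCKS_LOST / BLOCKS_TOTAL
nines = math.floor(-math.log10(loss_fraction))


def failures_per_cluster_year(failures, clusters=CLUSTERS, months=STUDY_MONTHS):
    """Failures divided by cluster-years (assumes every cluster ran the whole study)."""
    return failures / clusters / (months / 12)


all_failures = failures_per_cluster_year(22)
ha_relevant = failures_per_cluster_year(8)

print(f"loss fraction: {loss_fraction:.2e} ({nines} nines)")
print(f"all failures per cluster-year: {all_failures:.2f}")
print(f"HA-relevant failures per cluster-year: {ha_relevant:.2f}")
```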
18. Hadoop Translator
• High Availability
> Hadoop meaning
– Data replication
– Name node failover
> BI/DW meaning
– Redundant access paths (network, nodes, disks)
– RAID storage, high quality hardware
– Minimized planned downtime
– No single point of failure
– HA administration tools, event alerts tracking and auto recovery
– Backups
• Fault tolerant
> Hadoop meaning
– Query automatically restarts on another node without resubmission using replicated data
> BI/DW meaning
– Nonstop system (no unplanned system halt or reboot)
– Extreme hardware reliability
– 99.999% uptime
– Fault isolation and containment
– Graceful degradation
– Rolling upgrades
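To put the "99.999% uptime" entry in concrete terms, an uptime percentage can be converted into a yearly downtime budget. This is a minimal sketch added for illustration. It assumes a 365-day year, and the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(uptime_percent):
    """Minutes of downtime per year permitted by a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


five_nines = allowed_downtime_minutes(99.999)
three_nines = allowed_downtime_minutes(99.9)

print(f"99.999% uptime allows about {five_nines:.2f} minutes of downtime per year")
print(f"99.9% uptime allows about {three_nines / 60:.2f} hours of downtime per year")
```

Five nines leaves only a few minutes per year, which is why the BI/DW column pairs it with rolling upgrades and minimized planned downtime.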
19. Hadoop Falling Over!
• Hadoop was built for the recovery of large batch jobs on large commodity grids.
• The goal was not to lose the work
• This is really about disk failure
• HA/FT is always configured according to workload characteristics. Enterprise HA is best thought of as “transactional” and OLTP, at the least, if not a real-time event.
20. Is Hadoop a Data Integration Platform?
• Yes
> “Lots of customers doing ETL in Hadoop”
> Data refineries
> Unstructured data
– Weblogs and sensor data
> Data Hub/Data Lake
• No
> No built-ins
– Data quality tools
– Transformations
> All do-it-yourself code
> No ETL process management
> No metadata repository
21. Hadoop Is Not a Data Integration Solution
• Data integration requires a method for rationalizing inconsistent semantics, which helps developers rationalize various sources of data (depending on some of the metadata and policy capabilities that are entirely absent from the Hadoop stack).
• Data quality is a key component of any appropriately governed data integration project. The Hadoop stack offers no support for this, other than the individual programmer's code, one data element at a time, or one program at a time.
• Because Hadoop work streams are independent (and separately programmed for specific use cases), there is no method for relating one to another, nor for identifying or reconciling underlying semantic differences.
29 January 2013
22. Unstructured Data in the Data Warehouse
• Facebook, Twitter, LinkedIn
• Sensor data
• XML
• Web logs
• JSON
• eMail
• Documents
• Images
• Not so much
> Audio
> Video
[Bar chart, 2013: vertical axis 0% to 20%; categories: Social, XML, Docs, eMail, Web logs, JPGs, A/V, Sensor]
Sources: Derived from TDWI, Wikibon, Gartner, IDC
23. Hadoop For Data Integration!
• Hadoop serves a useful function as a data reservoir.
• The revenge of the ISAM file
• Some ETL
• Some cleansing
• Some analytics
• Personally, I would want drag and drop ETL, ELT. Those who write code maintain code.
24. Is Hadoop a Data Warehouse?
25. Scaling the Facebook Data Warehouse to 300 PB
At Facebook, we have unique storage scalability challenges when it comes to our data warehouse. Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the amount of data stored. Given this growth trajectory, storage efficiency is and will continue to be a focus for our warehouse infrastructure.
There are many areas we are innovating in to improve storage efficiency for the warehouse – building cold storage data centers, adopting techniques like RAID in HDFS to reduce replication ratios (while maintaining high availability), and using compression for data reduction before it’s written to HDFS. The most widely used system at Facebook for large data transformations on raw logs is Hive, a query engine based on Corona Map Reduce used for processing and creating large tables in our data warehouse. In this post, we will focus primarily on how we evolved the Hive storage format to compress raw data as efficiently as possible into the on-disk data format.
https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/
April 10, 2014
26. What is a Data Warehouse?
• A data design pattern, an architecture
> Size doesn’t matter
> A perpetual evolution
• Definition: Gartner (2005) /Inmon (1992) /Wikipedia
> Subject oriented
– Detailed data + modeling of sales, inventory, finance, etc.
> Integrated logical model
– Merged data
– Consistent, standardized data formats and values
> Nonvolatile
– Data stored unmodified for long periods of time
> Time variant
– Record versioning or temporal services
> Persistent storage, not virtual, not federated
Source: Gartner: Of Data Warehouses, Operational Data Stores, Data Marts and Data 'Outhouses'; Bill Inmon, Building the Data Warehouse, 1992, Wiley and Sons
27. Subject Areas: A Model of ‘Our’ Business
Price history
Point of Sale
Product/Services
Inventory
Supplier
Contracts
Labor
Associate
E-Commerce
Channels
Customer
Sales transactions
Carrier Shipment
Campaigns
Promotion
Warehouse
Each subject area has numerous large FACT tables (=big joins)
28. You Wish You Had Redundant Data!
Match keys (source records)
App | Cust_ID   | First   | Last     | DOB      | Social      | Address
ERP | 30391-244 | William | Franks   | 04/12/00 | 563-49-1234 | 123 Oak, Atlanta
CRM | 30391244  | W.      | Franks   | 04/12/70 | 563491234   |
SCM | 30391244  | Bill    | Franks   | 04/12/70 |             | Atlanta
XYZ | 30391-244 | Frank   | Williams |          | 563491234   | 123 Oak St. #14
Final integrated record (after ETL)
Cust_ID  | First   | Last   | DOB      | Social    | Address
30391244 | William | Franks | 04/12/70 | 563491234 | 123 Oak St. #14
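The kind of work the ETL step does here can be sketched in a few lines. The example below is an illustration only, not Teradata's method: it normalizes the match keys by stripping punctuation, then takes a majority vote per field across the four source records. Which cells are blank is an assumed reading of the slide, and the function names are hypothetical. Fields such as first name and address need richer survivorship rules than a simple vote, so they are left out.

```python
from collections import Counter

# Source records transcribed from the slide; None marks a cell assumed blank.
RECORDS = [
    {"app": "ERP", "cust_id": "30391-244", "last": "Franks",
     "dob": "04/12/00", "social": "563-49-1234"},
    {"app": "CRM", "cust_id": "30391244", "last": "Franks",
     "dob": "04/12/70", "social": "563491234"},
    {"app": "SCM", "cust_id": "30391244", "last": "Franks",
     "dob": "04/12/70", "social": None},
    {"app": "XYZ", "cust_id": "30391-244", "last": "Williams",
     "dob": None, "social": "563491234"},
]


def normalize_key(value):
    """Keep digits only, so '30391-244' and '30391244' compare equal."""
    if value is None:
        return None
    return "".join(ch for ch in value if ch.isdigit())


def majority(values):
    """Most frequent non-blank value, or None when every value is blank."""
    present = [v for v in values if v]
    if not present:
        return None
    return Counter(present).most_common(1)[0][0]


def integrate(records):
    """Normalize the key fields, then vote field by field."""
    cleaned = [
        dict(r, cust_id=normalize_key(r["cust_id"]), social=normalize_key(r["social"]))
        for r in records
    ]
    fields = ("cust_id", "last", "dob", "social")
    return {field: majority(r[field] for r in cleaned) for field in fields}


merged = integrate(RECORDS)
print(merged)
```

The redundancy is what makes the vote possible: the outlying birth date and the swapped surname are each outnumbered by agreeing copies from the other systems.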
29. What is a Data Mart?
• A targeted project that will be finished
> A subset of data, not all the data
> Not for all of the people
• Often heavily denormalized
• Volatility
> Often completely reloaded
• Time variance and currency
> Can restate the data “as of” a point in time
• Virtualization option
> Can be a logical set of views, cubes
Source: Gartner: Of Data Warehouses, Operational Data Stores, Data Marts and Data 'Outhouses'; Inmon, Building the Data Warehouse, 1992, Wiley and Sons
30. Why Hadoop Is Not a Data Warehouse
31. Words Matter!
• The meaning of data warehouse is changing:
> JSON (hierarchical capability)
> Network queries (possibly offload)
> Analytics
• The meaning of data warehouse is extending. But it still includes “optimization.”
• It’s no longer a data staging area, it’s a reservoir.
35. Data Lake Benefits: The Landing Zone
• Rapid ingest
> File copy vs database load
• Temporary data
• Data not ready for the data warehouse
• Data that never → data warehouse
• Archives
> Alternative to magnetic tape
36. Hadoop Enables Another Data Platform
• Ad hoc projects
> One-shot complex analytics
> Hurry up, short term efforts
• Alternative analytics
> Not SQL-friendly algorithms
> Markov chains, random forest
> JPG, audio analysis
• Sandbox – hunting in the dark
> Prototyping
> Data exploration
> Trial and error new algorithms
37. What Hadoop Is For!
• Data reservoir
• Prototyping
• Analytical or BI sandboxing (data wrangling)
• Archive
• File system API (HDFS)
38. Teradata Unified Data Architecture: System Conceptual View
[Diagram labels]
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
Layers: MOVE, MANAGE, ACCESS
Platforms: DATA PLATFORM, INTEGRATED DATA WAREHOUSE, INTEGRATED DISCOVERY PLATFORM
Products shown: TERADATA DATABASE, HORTONWORKS, TERADATA ASTER DATABASE
ANALYTIC TOOLS & APPS: Marketing Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Business Analysts, Data Scientists
39. Teradata SQL-H
• Joint R&D with Hortonworks
> Donated to Apache
• Business user query with favorite BI tools
• Join Hadoop data to
> Teradata Data Warehouse
> Aster Discovery Platform
• Teradata 15.0
> Bi-directional SQL
> Push down filters to Hive
• Fast, secure, reliable
[Diagram labels: Teradata SQL-H; Aster SQL-H; Hadoop: MR, Hive, Pig, HCatalog; Hadoop Layer: HDFS; Data; Data Filtering]
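The "push down filters" idea can be illustrated without any vendor API. The sketch below is a generic Python illustration and does not reflect SQL-H syntax or behavior: every name and all the data are hypothetical. It compares shipping every remote row to the warehouse and filtering there with applying the filter on the remote side so that only matching rows cross the network.

```python
# Hypothetical rows living on the remote (Hadoop) side.
WEBLOG_ROWS = [{"cust_id": i % 5, "bytes": 100 * i} for i in range(1, 21)]

# Hypothetical customer table living in the warehouse.
CUSTOMERS = {1: "Franks", 2: "Lee"}


def remote_scan(predicate=None):
    """Stand-in for the remote system: return rows that pass the predicate."""
    if predicate is None:
        return list(WEBLOG_ROWS)
    return [row for row in WEBLOG_ROWS if predicate(row)]


def join_without_pushdown():
    """Ship every remote row, then filter on the warehouse side."""
    shipped = remote_scan()
    kept = [row for row in shipped if row["cust_id"] in CUSTOMERS]
    return len(shipped), kept


def join_with_pushdown():
    """Send the filter to the remote side so only matching rows are shipped."""
    shipped = remote_scan(lambda row: row["cust_id"] in CUSTOMERS)
    return len(shipped), shipped


moved_all, result_a = join_without_pushdown()
moved_filtered, result_b = join_with_pushdown()

print(f"rows shipped without push-down: {moved_all}")
print(f"rows shipped with push-down: {moved_filtered}")
print(f"same result either way: {result_a == result_b}")
```

Both paths return the same rows; the difference is how much data has to move before the join can run.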
40. Teradata 15: Teradata QueryGrid™
[Diagram labels]
Users: Business users, Data Scientists
IDW: TERADATA DATABASE
Discovery: TERADATA ASTER DATABASE (SQL, SQL-MR, SQL-GR)
Connected systems:
• Teradata Systems: TERADATA DATABASE, TERADATA ASTER DATABASE
• OTHER DATABASES: Remote Data
• LANGUAGES: SAS, Perl, Python, R, Ruby, etc.
• HADOOP: Push-down to Hadoop
41. Market Possibilities
• The scale-out file system will not die (because it’s only an API)
• YARN (& Cascading) will prosper
• Hadoop will play a role in data flow
• It will never replace the EDW, except by deception
• The struggle for a unified architecture will continue
42. Hadoop and the Data Warehouse: Competitive or Complementary?
http://www.teradata.com/analyst-reports/Hadoop-and-the-Data-Warehouse-Competitive-or-Complementary/