Oracle's Big Data solutions comprise a number of new products designed to help customers gain maximum business value from data sets such as weblogs, social media feeds, smart meters, sensors and other devices that generate massive volumes of data (commonly called 'Big Data'), data that is not readily accessible in today's enterprise data warehouses and business intelligence applications.
Slides from a presentation I gave at the 5th SOA, Cloud + Service Technology Symposium (September 2012, Imperial College, London). The goal of this presentation was to explore with the audience use cases at the intersection of SOA, Big Data and Fast Data. If you are working with both SOA and Big Data, I would be very interested to hear about your projects.
Strata 2015 presentation from Oracle for Big Data. We are announcing several new big data products, including GoldenGate for Big Data, Big Data Discovery, Oracle Big Data SQL and Oracle NoSQL.
Expand a Data Warehouse with Hadoop and Big Data (jdijcks)
After investing years in the data warehouse, are you now supposed to start over? Nope. This session discusses how to leverage Hadoop and big data technologies to augment the data warehouse with new data, new capabilities and new business models.
A modern approach to streaming data integration and event processing with a big data (kappa-style) architecture. Key patterns are discussed, with the pros and cons of newer approaches and open source technologies. Focus on Oracle and GoldenGate technology. OpenWorld 2018 presentation.
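The core idea behind the kappa-style architecture mentioned above is a single append-only event log from which every materialized view is derived by replay. A minimal sketch of that idea, with all names purely illustrative and not tied to GoldenGate or any other product:

```python
# Kappa-style sketch: one append-only event log is the source of truth,
# and any view is (re)built by replaying the log from the beginning.
from collections import defaultdict

event_log = []  # the single append-only log

def append(event):
    event_log.append(event)

def replay(view_fn):
    """Rebuild a materialized view by replaying the full log."""
    state = defaultdict(int)
    for event in event_log:
        view_fn(state, event)
    return dict(state)

def count_by_user(state, event):
    state[event["user"]] += 1

append({"user": "alice", "action": "click"})
append({"user": "bob", "action": "click"})
append({"user": "alice", "action": "view"})

print(replay(count_by_user))  # {'alice': 2, 'bob': 1}
```

Because every view is a pure function of the log, changing the view logic only requires a replay, not a separate batch pipeline; that is the main contrast with lambda-style architectures.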
Hortonworks Oracle Big Data Integration (Hortonworks)
Slides from joint Hortonworks and Oracle webinar on November 11, 2014. Covers the Modern Data Architecture with Apache Hadoop and Oracle Data Integration products.
Oracle Big Data Appliance and Big Data SQL for advanced analytics (jdijcks)
Overview presentation showing Oracle Big Data Appliance and Oracle Big Data SQL in combination, and why this really matters. Big Data SQL brings you the unique ability to analyze data across the entire spectrum of systems: NoSQL, Hadoop and Oracle Database.
The Value of the Modern Data Architecture with Apache Hadoop and Teradata (Hortonworks)
This webinar discusses why Apache Hadoop is most typically the technology underpinning "Big Data", how it fits into a modern data architecture, and the current landscape of databases and data warehouses already in use.
Hadoop-based data lakes have become increasingly popular within today's modern data architectures for their scalability, ability to handle data variety, and low cost. Many organizations start slow with their data lake initiatives, but as the lakes grow, they run into challenges with data consistency, quality and security, and lose confidence in the initiative.
This talk will discuss the need for good data governance mechanisms for Hadoop data lakes, their relationship with productivity, and how they help organizations meet regulatory and compliance requirements. The talk advocates adopting a different mindset for designing and implementing flexible governance mechanisms on Hadoop data lakes.
The Next Generation of Big Data Analytics (Hortonworks)
Apache Hadoop has evolved rapidly to become a leading platform for managing and processing big data. If your organization is examining how you can use Hadoop to store, transform, and refine large volumes of multi-structured data, please join us for this session, where we will discuss: the emergence of "big data" and opportunities for deriving business value; the evolution of Apache Hadoop and future directions; essential components required in a Hadoop-powered platform; and solution architectures that integrate Hadoop with existing data discovery and data warehouse platforms.
10 Amazing Things To Do With a Hadoop-Based Data Lake (VMware Tanzu)
Greg Chase, Director, Product Marketing, presents "Big Data: 10 Amazing Things to do With A Hadoop-based Data Lake" at the Strata Conference + Hadoop World 2014 in NYC.
Insurance companies of all sizes are challenged to keep up with emerging technologies that deliver a competitive advantage. Recording: https://www.brighttalk.com/webcast/9573/192877
Big data holds the key to greater customer insight and stronger customer relationships. But risk of sensitive data exposure — and compliance violations — keeps many insurers from pursuing big data initiatives and reaping the rewards of business-driven analytics. Join Dataguise and Hortonworks for this live webinar to learn how you can free your organization from traditional information security constraints and unlock the power of your most valuable business assets.
• What do you need to know about PII/PHI privacy before embarking on big data initiatives?
• Why do so many big data initiatives fail before they’ve even begun—and what can you do about it?
• How can IT security organizations help data scientists extract more business value from their data?
• How are leading insurance companies leveraging big data to gain competitive advantage?
Hortonworks and Clarity Solution Group (Hortonworks)
Many organizations are leveraging social media to understand consumer sentiment and opinions about brands and products. Analytics in this area, however, is in its infancy and does not always provide a compelling result for effective business impact. Learn how consumer organizations can benefit by integrating social data with enterprise data to drive more profitable consumer relationships. This webinar is presented by Hortonworks and Clarity Solution Group, and will focus on the evolution of Hadoop, the clear advantage of Hortonworks distribution, and business challenges solved by “Consumer720.”
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork... (Hortonworks)
As big data analytics and the Apache Hadoop ecosystem have matured and gained traction in established industries, with faster adoption in the insurance market than originally anticipated, it is clear that the potential benefits for data management and business intelligence are staggering. At the same time, many big data programs have stalled or failed to deliver on their aspirational value proposition, leaving a substantial gap between the expectations of analytics consumers and what big data analytics programs actually deliver. Join Hortonworks and Clarity as we review the common needs of Property and Casualty (P&C) insurers and how to unlock the true value of big data analytics:
- Information agility: centralization of data and decentralization of analysis
- Expanded capability: conventional analysis combined with real-time analytics demands
- Reduced expense: lower costs through cheaper storage while maintaining scalability
We will discuss a modern data architecture that constitutes a mature, enterprise-strength Hadoop framework for P&C insurers and answers the need for governance processes across the enterprise stack. We will cover how a modern data architecture allows organizations to collect, store, analyze and manipulate massive quantities of data on their own terms, regardless of the source of that data, accelerating the real lifetime value of big data and Hadoop analytics for claims, customer sentiment and telematics.
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle) - Rittman Analytics
Set of product roadmap + capabilities slides from Oracle Data Integration Product Management, and thoughts on data integration on big data implementations by Mark Rittman (Independent Analyst)
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake... (NoSQLmatters)
Come to this deep dive on how Pivotal's Data Lake Vision is evolving by embracing next generation in-memory data exchange and compute technologies around Spark and Tachyon. Did we say Hadoop, SQL, and what's the shortest path to get from past to future state? The next generation of data lake technology will leverage the availability of in-memory processing, with an architecture that supports multiple data analytics workloads within a single environment: SQL, R, Spark, batch and transactional.
Create a Smarter Data Lake with HP Haven and Apache Hadoop (Hortonworks)
An organization’s information is spread across multiple repositories, on-premise and in the cloud, with limited ability to correlate information and derive insights. The Smart Content Hub solution from HP and Hortonworks enables a shared content infrastructure that transparently synchronizes information with existing systems and offers an open standards-based platform for deep analysis and data monetization.
- Leverage 100% of your data: Text, images, audio, video, and many more data types can be automatically consumed and enriched using HP Haven (powered by HP IDOL and HP Vertica), making it possible to integrate this valuable content and insights into various line of business applications.
- Democratize and enable multi-dimensional content analysis: empower your analysts, business users, and data scientists to search and analyze Hadoop data with ease, using the 100% open source Hortonworks Data Platform.
- Extend the enterprise data warehouse: Synchronize and manage content from content management systems, and crack open the files in whatever format they happen to be in.
- Dramatically reduce complexity with enterprise-ready SQL engine: Tap into the richest analytics that support JOINs, complex data types, and other capabilities only available with HP Vertica SQL on the Hortonworks Data Platform.
Speakers:
- Ajay Singh, Director, Technical Channels, Hortonworks
- Will Gardella, Product Management, HP Big Data
Hadoop 2.0: YARN to Further Optimize Data Processing (Hortonworks)
Data is exponentially increasing in both types and volumes, creating opportunities for businesses. Watch this video and learn from three Big Data experts: John Kreisa, VP Strategic Marketing at Hortonworks, Imad Birouty, Director of Technical Product Marketing at Teradata and John Haddad, Senior Director of Product Marketing at Informatica.
Multiple systems are needed to exploit the variety and volume of data sources, including a flexible data repository. Learn more about:
- Apache Hadoop 2 and YARN
- Data Lakes
- Intelligent data management layers needed to manage metadata and usage patterns as well as track consumption across these data platforms.
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede... (StampedeCon)
At StampedeCon 2014, Stephen O’Sullivan (Silicon Valley Data Science) presented "Beyond a Big Data Pilot: Building a Production Data Infrastructure."
Creating a data architecture involves many moving parts. By examining the data value chain, from ingestion through to analytics, we will explain how the various parts of the Hadoop and big data ecosystem fit together to support batch, interactive and realtime analytical workloads.
By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including data acquisition, ingestion, storage, data services, analytics and data management. Most importantly, we’ll leave you with a framework for understanding these options and making choices.
Big Data Discovery + Analytics = Datengetriebene Innovation! (Harald Erb)
Talk from the DOAG 2015 conference: Implementing data projects does not have to be left to so-called data scientists alone. Data and tool complexity in dealing with big data are no longer insurmountable hurdles for the teams that are already responsible for building and maintaining the data warehouse and for managing and evolving the business intelligence platform. In an interdisciplinary team, business users and business analysts contribute their domain knowledge to the data project from the start, alongside the technical roles.
If you've also got the Big Data itch, here is something to ease the pain :-)
Answers to these questions will be available soon (more info in the attached link).
Which Big Data Appliance should YOU use?
(click on the attached link for Poll results)
Appliances are Small and Quick, Right?
Revealing the 6 Types of Big Data Appliances
Uncovering the Main Players
Challenges, Pitfalls, and Winning the Big Data Game
Where is all this leading YOU to?
Agile BI Development Through Automation (Manta Tools)
How can code life cycle automation satisfy the growing demands in modern enterprise business intelligence?
Whilst an agile approach to BI development is useful for delivering value in general, the use of advanced automation techniques can also save significant resources, prevent production errors, and shorten time to market.
Speakers from Data To Value, Manta Tools, Volkswagen and M&G Investments presented and discussed different approaches to agile BI development. Take a look!
Actionable Data: Mastering the Hybrid Analytics Mix (Perficient, Inc.)
With an increase in the adoption of cloud applications, most organizations today are in some form of hybrid state (i.e. using a combination of on-premise and cloud applications to run their business). Regardless of where the data resides, you need a complete view of the company spanning across different parts of the business, combining insightful data across both onsite and public cloud instances.
In this webinar, we looked at multiple approaches that organizations have successfully used to consolidate data from multiple cloud and on-premise applications and to perform seamless analytics across these varied data sources.
2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani... (Boris Glavic)
We study the problem of introducing errors into clean databases for the purpose of benchmarking data-cleaning algorithms. Our goal is to provide users with the highest possible level of control over the error-generation process, and at the same time develop solutions that scale to large databases. We show in the paper that the error-generation problem is surprisingly challenging, and in fact, NP-complete. To provide a scalable solution, we develop a correct and efficient greedy algorithm that sacrifices completeness, but succeeds under very reasonable assumptions. To scale to millions of tuples, the algorithm relies on several non-trivial optimizations, including a new symmetry property of data quality constraints. The trade-off between control and scalability is the main technical contribution of the paper.
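To make the benchmarking setup concrete, here is a toy sketch of the basic idea of error generation: perturb values in clean tuples while recording ground truth so a cleaning tool's recall can be measured. This is purely illustrative and is not the constraint-aware, NP-complete problem or the greedy algorithm from the paper; all names are hypothetical.

```python
# Toy error generator: swap adjacent characters in some values to
# simulate typos, and keep the indices of dirtied rows as ground truth.
import random

def inject_typos(rows, column, error_rate, seed=0):
    rng = random.Random(seed)  # seeded for reproducible benchmarks
    dirty, errors = [], []
    for i, row in enumerate(rows):
        row = dict(row)  # don't mutate the clean input
        if rng.random() < error_rate and len(row[column]) > 1:
            s = list(row[column])
            j = rng.randrange(len(s) - 1)
            s[j], s[j + 1] = s[j + 1], s[j]  # adjacent-character swap
            row[column] = "".join(s)
            errors.append(i)  # ground truth for evaluating cleaners
        dirty.append(row)
    return dirty, errors

clean = [{"city": "Chicago"}, {"city": "Boston"}, {"city": "Berlin"}]
dirty, changed = inject_typos(clean, "city", error_rate=0.5)
```

A real generator, as the abstract stresses, must also control *which* constraints each error violates and remain detectable or repairable, which is where the hardness and the greedy optimizations come in.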
Leveraging Hadoop with OBIEE 11g and ODI 11g - UKOUG Tech'13 (Mark Rittman)
The latest releases of OBIEE and ODI come with the ability to connect to Hadoop data sources, using MapReduce to integrate data from clusters of "big data" servers complementing traditional BI data sources. In this presentation, we will look at how these two tools connect to Apache Hadoop and access "big data" sources, and share tips and tricks on making it all work smoothly.
You've seen the basic 2-stage example Spark Programs, and now you're ready to move on to something larger. I'll go over lessons I've learned for writing efficient Spark programs, from design patterns to debugging tips.
The slides are largely just talking points for a live presentation, but hopefully you can still make sense of them for offline viewing as well.
Left Brain, Right Brain: How to Unify Enterprise Analytics (Inside Analysis)
The Briefing Room with Robin Bloor and Teradata
Live Webcast on Jan. 29, 2013
Despite its name, effective Data Science requires a certain amount of artistic flair. Analysts must be creative about how and where they find the insights that will drive business value. One classic roadblock to that kind of frictionless process? Programming. Not everyone can code Java, which makes the unstructured domain of Hadoop quite challenging for the average business analyst.
Check out the slides from this episode of the Briefing Room to hear veteran Analyst Dr. Robin Bloor explain how a new generation of analytical platforms will solve the complexity of unifying structured and unstructured data. He'll be briefed by Steve Wooledge of Teradata Aster who will tout his company's Big Data Appliance, which leverages the SQL-H bridge, an innovation designed to connect Hadoop with SQL.
Visit: http://www.insideanalysis.com
SAS Big Data Forum - Transforming Big Data into Corporate Gold (Louis Fernandes)
Synopsis: How SAS believes organisations can turn Big Data into competitive advantage through the use of High Performance Analytics.
In this presentation, we look at how SAS is seeing organisations take the outputs from big data analysis and turn them into tangible business outcomes through real-time decision-making.
In it, we explore:
- Why we believe organisations need to exploit their data assets to create the insights that build competitive advantage
- How to develop the infrastructures required to support multi-dimensional insight
- What SAS is doing to make this a reality
Key topics include:
- Data governance
- Big data infrastructure
- High performance analytics
- Data visualisation
About SAS:
- World’s largest privately held software company
- 35 years old
- Focus on advanced and predictive analytics right from the word go
- Big data has been in our DNA since before it became mainstream
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris... (Cloudera, Inc.)
This presentation will explore how Hadoop and Big Data are re-inventing enterprise workflows, and the pivotal role of the Data Analyst. It will examine the changing face of analytics and the streamlining of iterative queries through evolved user interfaces. The speaker will cut through hype around “shorter time to insight” and explain how combining Hadoop and SQL-based analytics help companies discover emergent trends hidden in unstructured data, without having to retrain data miners or restaff. In particular, it will highlight changes to Big Data analysis from this paradigm and illustrate stepwise how analysts can now connect to Big Data platforms, assemble working data sets from disparate sources, analyze and mine that data for actionable insight, publish the results as visualizations and for feeding reporting tools, and operationalize Map-Reduce and Big Data outcomes into company workflows – all without touching the command line.
Two Keys to Analytic Success: Cooperation, Collaboration (Inside Analysis)
The Briefing Room with Robin Bloor and ParAccel
Live Webcast on Feb. 19, 2013
Experienced analysts know there is no single platform that can handle all types of analytic processing efficiently. Invariably, data-driven organizations will use a variety of engines to refine their raw data into usable insights. There are several down sides to this heterogeneity, not the least of which is poor collaboration. But that's starting to change, as many companies focus on creative ways to foster analytical cooperation.
Check out the slides from this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor explain why collaboration in the design and use of analytical applications can have wide-ranging impacts on an organization. He'll be briefed by John Santaferraro of ParAccel, who will tout his company's Cooperative Analytic Processing Architecture, designed to perform sophisticated deep analytics on large amounts of data quickly. CAPA can orchestrate the processing power of other engines in its ecosystem, including data warehouses and Hadoop implementations.
Visit: http://www.insideanalysis.com
Almost all developers face the challenge of reactively debugging failed business transaction processes. Not only does this require extensive navigation of enormous volumes of log data, but determining root cause becomes a laborious and time-consuming task.
Additionally, business managers often ask developers and operations to provide analytics on applications, resulting in the tedious task of charting the information, usually from intangible data. Learn how to capture, extract and analyze your event data by having analytics embedded in the application. Download the white paper that details how to gain Application Intelligence through effective logging.
Check out the webinar here: http://www.splunk.com/goto/analytics_webcast
Simplifying Big Data Analytics for the Business (Teradata Aster)
Tasso Argyros, Co-Founder & Co-President, Teradata Aster presents at the 2012 Big Analytics Roadshow.
The opportunity exists for organizations in every industry to unlock the power of iterative, big data analysis with new applications such as digital marketing optimization and social network analysis to improve their bottom line. Big data analysis is not just the ability to analyze large volumes of data, but the ability to analyze more varieties of data by performing more complex analysis than is possible with more traditional technologies. This session will demonstrate how to bring the science of data to the art of business by empowering more business users and analysts with operationalized insights that drive results. See how data science is making emerging analytic technologies more accessible to businesses while providing better manageability to enterprise architects across retail, financial services, and media companies.
This talk was given by Bruno Ungermann at the 13th meeting, on Sept 23rd, 2014.
Conceptual overview of Hadoop-based analytics, a comparison between data warehouse architecture and Big Data architecture, characteristics of “schema on read”, typical Big Data use cases such as customer analytics, operational analytics and EDW optimization, and a short software demo.
This talk was held at the 13th meeting on Sept 23rd 2014 by André Vocat.
In the process of proposing a highly available, redundant and performant infrastructure for a large Swiss telco operator, the project team opted for Cassandra as one of the key components. After more than a year in operation, the resulting platform has proven to be the right choice. The session will show the chosen architecture, give insight into the development and deployment, and present the current status of the platform, which is just about to see its first upgrade.
This talk was held at the 12th meeting on July 22 2014 by Karen Zhang.
Customers in business-to-consumer (B2C) and business-to-business (B2B) markets go through a similar buying journey: need, search, evaluate, and finally order. Thus similar customer analytics approaches are applicable to both scenarios. However, a company’s go-to-market strategies usually differ between B2C and B2B. This study discusses the unique characteristics of analytic methodologies applied in B2B vs. B2C. Two case studies will be presented to illustrate the similarities and differences.
This talk was held at the 12th meeting on July 22 2014 by Romeo Kienzler.
After giving a short contextual overview of SQL-on-Hadoop projects in the ecosystem (Hive, Impala, Presto, Cascading Lingual, ...), we will hear about the latest SQL features in Big SQL. Big SQL delivers some exciting capabilities, including low-latency and high-performance queries, while maintaining backwards compatibility with Hive and HCatalog. This is achieved by an optimizer and a dedicated execution framework, which will be covered in detail. Finally, a demo of Big SQL v3.0 on a cluster in the Silicon Valley Lab (SVL) will be shown.
This talk was held at the 11th meeting on April 7 2014 by Marcel Kornacker.
Impala (impala.io) raises the bar for SQL query performance on Apache Hadoop. With Impala, you can query Hadoop data – including SELECT, JOIN, and aggregate functions – in real time to do BI-style analysis. As a result, Impala makes a Hadoop-based enterprise data hub function like an enterprise data warehouse for native Big Data.
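As a rough illustration of the BI-style SQL that Impala serves, here is the SELECT/JOIN/aggregate shape using Python's built-in sqlite3 as a stand-in engine (Impala itself is queried via its shell or ODBC/JDBC; the tables and values here are hypothetical):

```python
import sqlite3

# sqlite3 stands in for Impala here; the SQL shape (SELECT/JOIN/GROUP BY)
# is the same kind of query a BI tool would submit.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (region_id INTEGER, amount REAL);
    CREATE TABLE regions (region_id INTEGER, name TEXT);
    INSERT INTO sales VALUES (1, 100.0), (1, 50.0), (2, 75.0);
    INSERT INTO regions VALUES (1, 'EMEA'), (2, 'APAC');
""")
rows = con.execute("""
    SELECT r.name, SUM(s.amount) AS total
    FROM sales s JOIN regions r ON s.region_id = r.region_id
    GROUP BY r.name
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('EMEA', 150.0), ('APAC', 75.0)]
```

The point of Impala is that this same query pattern runs interactively over data already sitting in HDFS, without an ETL step into a separate warehouse.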
This talk was held at the 11th meeting on April 7 2014 by Karolina Alexiou.
Analysis of big data is useless (and a lot harder to sell) when you can't measure whether the resulting insights are correct. In order to develop sophisticated data analysis methodologies tailored to your particular use case, you need to be able to figure out what works and what doesn't. It is crucial to gather data independently of your analysis (ground truth), compare it to your results using the correct metrics, and account for biases. The sheer volume of data means that you also need a strategy for slicing and dicing the data to isolate the really valuable parts, and a keen eye for visualization so that you can quickly compare methodologies and support the validity of your insights to third parties.
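The comparison against ground truth can be as simple as set overlap with precision and recall; a minimal sketch with hypothetical item IDs:

```python
def precision_recall(predicted, ground_truth):
    # Compare the analysis's positives against independently gathered ground truth.
    predicted, ground_truth = set(predicted), set(ground_truth)
    tp = len(predicted & ground_truth)  # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical example: items flagged by an analysis vs. hand-labelled truth.
p, r = precision_recall(predicted={"a", "b", "c", "d"}, ground_truth={"b", "c", "e"})
print(p, r)  # precision 2/4 = 0.5, recall 2/3
```

Tracking both metrics matters: a method can look impressive on one while quietly failing on the other.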
This talk was held at the 10th meeting on February 3rd 2014 by Daniel Fasel.
Many traditional Swiss companies, such as banks, insurance companies and government agencies, are highly interested in Big Data and Data Science but don’t know exactly what the business value of Big Data is for them. Often Big Data is misinterpreted as merely large amounts of data, and companies are unaware of the innovation behind the new technologies of Big Data and how these technologies can be profitable for them. In this presentation, I discuss sample cases that demonstrate a set of these new technologies and how they can be applied not only to large web-scale data but also to the data sets of traditional companies. First, I demonstrate how multi-structured data can be indexed and searched using Autonomy. I then show how quickly new analytical applications can be built, using a real-time streaming example with STORM, Redis and Node.js. The last demonstration shows how machine learning algorithms and visualization can be applied to improve analytics using AsterData.
This talk was held at the 10th meeting on February 3rd 2014 by Sean Owen.
Having collected Big Data, organizations are now keen on data science and “Big Learning”. Much of the focus has been on data science as exploratory analytics: offline, in the lab. However, building from that a production-ready large-scale operational analytics system remains a difficult and ad-hoc endeavor, especially when real-time answers are required. Design patterns for effective implementations are emerging, which take advantage of relaxed assumptions, adopt a new tiered "lambda" architecture, and pick the right scale-friendly algorithms to succeed. Drawing on experience from customer problems and the open source Oryx project at Cloudera, this session will provide examples of operational analytics projects in the field, and present a reference architecture and algorithm design choices for a successful implementation.
This talk was held at the 10th meeting on February 3rd 2014 by Dr. Thilo Stadelmann.
Many companies are struggling with Big Data. Some argue that Big Data is the new answer to all problems while others are more critical about it. What is common to many discussions with IT professionals is that almost everyone has a different understanding of the topic. Moreover, many enterprises find it very hard to recruit the perfect data scientist to solve Big Data problems.
In this talk we give an overview of our understanding of data science and present the driving factors for the newly established Datalab at Zurich University of Applied Sciences. The goal of the lab is to establish a sound curriculum and research agenda to prepare data scientists for the ever-increasing demand from industry, and to allow industry partners to collaborate with academia on problems that go beyond everyday routines.
Big data is an opportunity for communications service providers (CSPs) to create the intelligence for operating their infrastructures more efficiently, to analyze the success of their services, and to create a better personal experience for their customers.
CSP top executives, network and IT managers, and marketing leaders are eager to exploit the large amounts of available information to make better business decisions. They expect their Chief Technical Officer to provide end-to-end analytic solutions based on the data available in their IT and network infrastructure.
This presentation analyzes the complete value chain that can transform CSPs’ data to knowledge. It covers the sources of information, the data collection tools, the analytic platforms providing quick data access, and finally the business intelligence use cases with the presentation and visualization of the results and predictions.
The "Babelfish" system is built with Scala and runs in the Java Virtual Machine. For graph persistence, a neo4j database with Lucene index is used. A generic importer module reads data from various data sources and persists them in a version-aware way, using the domain model as a schema. The schema is used by our domain specific language to statically verify queries. Query results can either be in the form of graphs or tables. For the latter, an additional step uses an in-memory SQL-Database for further processing of the results. Queries in the generated DSL can be submitted via a REST interface. The server uses json4s for serialization of the results. This interface as well as the deployable war-file is generated by the web framework Scalatra.
While user tracking with WebTrends, comScore, Google Analytics etc. is a de-facto standard in the online world, tracking visitors in the real world is still fragmented. From a wide perspective, potential tracking data is produced by various sensors. For a real ‘bricks and mortar’ store, one can identify several possible sensors: customer frequency counters at the doors, the cashier system, free WiFi access points, video capture, temperature, background music, smells and many more. Many of these sensors would require additional hardware and software, but for a few, solutions already exist, e.g. video capture with face or even eye recognition. The most interesting sensor data that doesn’t require additional hardware and software may well be the WiFi access points, especially given that many visitors carry WiFi-enabled mobile phones. This talk demonstrates how WiFi access point log files can be used to answer different questions for a particular store.
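As a toy version of the idea, here is a sketch that answers one such question, distinct visitors per hour, from access-point log lines. The log format shown is hypothetical, not that of any particular AP vendor:

```python
from collections import defaultdict

# Hypothetical AP log format: "timestamp MAC event", one association per line.
LOG = """\
2014-07-22T10:05 aa:bb:cc:00:00:01 assoc
2014-07-22T10:17 aa:bb:cc:00:00:02 assoc
2014-07-22T10:40 aa:bb:cc:00:00:01 assoc
2014-07-22T11:02 aa:bb:cc:00:00:03 assoc
"""

def visitors_per_hour(log_text):
    # Count distinct devices (MAC addresses) seen in each hour;
    # re-associations by the same device do not inflate the count.
    seen = defaultdict(set)
    for line in log_text.splitlines():
        timestamp, mac, _event = line.split()
        seen[timestamp[:13]].add(mac)  # truncate to YYYY-MM-DDTHH
    return {hour: len(macs) for hour, macs in seen.items()}

print(visitors_per_hour(LOG))  # {'2014-07-22T10': 2, '2014-07-22T11': 1}
```

Note that counting MACs approximates counting people; MAC randomization on modern phones is a known source of bias for this approach.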
ParaView is an open-source graphical user interface for VTK with additional functionality including the capability to perform rendering in parallel and a client-server architecture enabling visualization and analysis to be performed on a server while being viewed and driven from a client. ParaView, like VTK, is open-sourced under a BSD license and its development is overseen by the commercial entity, Kitware, Inc. ParaView is multi-platform, extensible via its plugin architecture, and natively supports many common data analysis tasks and data formats. As it builds upon VTK, any VTK functionality can in principle be invoked. In practice not all VTK functionality is exposed by default but can easily be exposed or extended via the plugin architecture previously mentioned and discussed in more detail below. Exposing VTK functionality is as easy as writing a short XML file. In this talk I present the process of plugging into ParaView to do visualization and analysis of terabytes of data in real time.
Apache Drill [1] is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is a design goal to scale to 10,000 servers or more and to be able to process Petabytes of data and trillions of records in seconds. Since its inception in mid 2012, Apache Drill has gained widespread interest in the community. In this talk we focus on how Apache Drill enables interactive analysis and query at scale. First we walk through typical use cases and then delve into Drill's architecture, the data flow and query languages as well as data sources supported.
[1] http://incubator.apache.org/drill/
Essentials of Automations: Optimizing FME Workflows with Parameters – Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... – UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf – 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Key Trends Shaping the Future of Infrastructure.pdf – Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview – Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio’s cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Transcript: Selling digital books in 2024: Insights from industry leaders - T... – BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GraphRAG is All You Need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
DevOps and Testing slides at DASA Connect – Kari Kakkonen
Slides from me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality – Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Generating a custom Ruby SDK for your web service or Rails API using Smithy – g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
15. What is ?
• Brings R’s statistical functionality to the Oracle Database
• Eliminates R’s memory constraints
• Allows R to run on very large data sets
• Oracle R is architected for enterprise production infrastructure
• Automatically exploits database parallelism without requiring parallel R programming
• Oracle R leverages the latest R algorithms and packages
• R is an embedded component of the DBMS server
• Part of Oracle Advanced Analytics (+ODM)
16. Oracle R Architecture
[Diagram: Development (R workspace console) → Production (function push-down to the Oracle statistics engine for data transformation and statistics) → Consumption (OBIEE, Web Services)]
• Leverages SQL for data prep, analysis and an enhanced statistics engine
• R engine runs on database nodes for production enablement of R models
• Leverages Exadata: Oracle R workloads run in-database and can be bound to database nodes for workload isolation
• Enriches OBIEE dashboards with Oracle R statistics and analytics
17. Oracle Data Mining (ODM)
Data mining can answer questions that cannot be addressed through simple query and reporting techniques.
• Data Mining: Insight from discovering relationships
• Knowledge about what happened in the past
• Characterization, segmentation, comparisons, discrimination
• Descriptive models of patterns
• Predictive Analytics: Making better decisions and forecasts
• Knowledge about what is happening right now and in the future
• Classification and prediction of patterns
• Rule- and model-driven
18. Data Mining – Some Definitions
Supervised Learning
• Classification – Predict customer response to an affinity card program
• Regression – Predict a customer’s age
• Attribute Importance – Find the most significant predictors; data preparation
19. Data Mining – Some Definitions
Unsupervised Learning
• Anomaly Detection – Identify customer purchasing behavior that is significantly different from the norm
• Association Rules – Find the items that tend to be purchased together and specify their relationship (market basket analysis)
• Clustering – Segment demographic data into clusters and rank the probability that an individual will belong to a given cluster
• Feature Extraction – Group the attributes into general characteristics of the customers
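The supervised/unsupervised split in the definitions above can be made concrete with a toy sketch (plain Python, not ODM's API): a labelled 1-nearest-neighbour classifier next to an unlabelled two-means clustering over one-dimensional data. The ages and labels are hypothetical.

```python
# Supervised: labels are given, and the model predicts them for new data.
def nearest_neighbor(train, query):
    # 1-NN classification over (value, label) pairs: copy the closest label.
    return min(train, key=lambda vl: abs(vl[0] - query))[1]

train = [(18, "young"), (22, "young"), (55, "senior"), (61, "senior")]
print(nearest_neighbor(train, 25))  # 'young'

# Unsupervised: no labels; k-means discovers segments in the data itself.
def two_means(values, iters=10):
    a, b = min(values), max(values)  # initial centroids at the extremes
    for _ in range(iters):
        ca = [v for v in values if abs(v - a) <= abs(v - b)]
        cb = [v for v in values if abs(v - a) > abs(v - b)]
        a, b = sum(ca) / len(ca), sum(cb) / len(cb)  # recompute centroids
    return sorted([a, b])

print(two_means([18, 22, 25, 55, 61, 58]))  # two age-segment centroids
```

The classifier needs the "young"/"senior" labels up front; the clustering finds the same two age groups without ever seeing a label, which is exactly the distinction the tables draw.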
While it could never be described as a sleepy business, since there have been several profound changes in the course of its evolution, it doesn’t really take an industry pundit to observe that the current Analytics market is marked by an accelerating pace of change. Changes comparable to those now taking place in a matter of a few years took decades to play out in the early days of BI and EPM.
In the 80s we saw database reporting tools rule the roost, and most applications shipped with some sort of hardwired reporting capabilities built in, providing visibility but no subsequent interactivity. You could get your first question answered really well, but if you had a follow-on question, you were out of luck.
Come the 90s, most BI platforms evolved to three-tier architectures, supporting more users and subject-area-specific data marts and BI environments for functional areas such as marketing, sales and supply chain.
The broad-based adoption of the internet saw BI tools in the 2000s increase their footprint to become true analytical platforms deployed on enterprise data warehouses. These data warehouses supported the decision support needs of all users of an extended enterprise, with capabilities that spanned production reporting to highly interactive ad hoc analysis.
Big changes, no doubt, but played out over a 20+ year time horizon. In the last 2-3 years, though, we are seeing technology disruptions opening up new possibilities in Analytics at a pace that is nothing short of breathtaking:
- There is an explosion of business-relevant data now on the internet. It is incredibly varied, generated at great velocity and already enormous in volume. How will it be analyzed?
- Apple and others have revolutionized the tablet as an internet and general content consumption device that is now well ensconced within corporations, certainly at the highest echelons. What will analytics on these smaller and intensely personal devices come to mean?
- The real cost of in-memory technology has declined dramatically. What transformative power could this hold for companies looking to live – and win – “in the moment”?
- The maturity and consequent acceptance of the cloud has introduced a low-friction delivery model for software delivered as a service to enterprises. How will Analytics be transformed by, or how might it transform, the Cloud?
These dramatic changes are sweeping through the enterprise computing landscape now. They each come with their own set of challenges, but for those who view them instead as opportunities, we believe that tremendous competitive advantage can be unlocked. And we believe that Oracle Business Analytics provides you with the tools to do just that.
Enables Map-Reduce style R calculations with the Big Data Appliance and HDFS. Supports compute-intensive parallelism for simulations. ORCH provides optimized R algorithms that are robust, numerically accurate and linearly scalable on Hadoop and the Big Data Appliance: more cores achieve a proportional decrease in run times, and the experience matches that of R users. Algorithms include:
- Linear models and logistic models
- General feed-forward neural networks
- Regression models
- Matrix factorization (algorithms for large-scale matrix problems)
- K-means clustering
- PCA (principal component analysis)
- Correlations
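The Map-Reduce style of calculation mentioned above can be sketched in plain Python (a stand-in for, not the actual, ORCH API): each mapper emits a partial result for its data split and a reducer combines them, shown here for a distributed mean.

```python
from functools import reduce

# Each mapper summarizes its split as (sum, count); splits stand in for HDFS blocks.
def map_partial(split):
    return (sum(split), len(split))

# The reducer combines partials associatively, so it scales out across nodes.
def reduce_partials(p, q):
    return (p[0] + q[0], p[1] + q[1])

splits = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]]
total, count = reduce(reduce_partials, map(map_partial, splits))
print(total / count)  # 3.5
```

The key design point is that the per-split partials are small and combine in any order, which is what lets more cores yield a proportional decrease in run time.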