This document provides an overview of analytics tools and methods from the perspective of guest lecturer Scott Allen Mongeau. It includes details about Mongeau's background and experience in data science, as well as sections on people and roles in data science, technologies and tools, processes and methods, predictive analytics using machine learning, descriptive analytics using unsupervised learning, and causal modeling.
An overview of the application of data science methods and data analytics tools to complement cyber risk quantification, cyber insurance valuation, and cyber risk assessment.
AI & ML in Cyber Security - Why Algorithms Are Dangerous (Raffael Marty)
Every single security company is talking in some way or another about how they are applying machine learning. Companies go out of their way to make sure they mention machine learning and not statistics when they explain how they work. Recently, that's not enough anymore either. As a security company you have to claim artificial intelligence to be even part of the conversation.
Guess what. It's all baloney. We have entered a state in cyber security that is, in fact, dangerous. We are blindly relying on algorithms to do the right thing. We are letting deep learning algorithms detect anomalies in our data without having a clue what that algorithm just did. In academia, they call this the lack of explainability and verifiability. But rather than building systems with actual security knowledge, companies are using algorithms that nobody understands and in turn discover wrong insights.
In this talk I will show the limitations of machine learning, outline the issues of explainability, and show where deep learning should never be applied. I will show examples of how the blind application of algorithms (including deep learning) actually leads to wrong results. Algorithms are dangerous. We need to revert to experts and invest in systems that learn from, and absorb the knowledge of, experts.
Ensuring security of a company’s data and infrastructure has largely become a data analytics challenge. It is about finding and understanding patterns and behaviors that are indicative of malicious activities or deviations from the norm. Data, Analytics, and Visualization are used to gain insights and discover those malicious activities. These three components play off of each other, but also have their inherent challenges. A few examples will be given to explore and illustrate some of these challenges.
Malware detection within enterprise networks is a critical component of an effective information security strategy. Instances of malware attacks are increasing – making them especially important to detect – and data science can help. This presentation outlines data science driven approaches to finding domains that have time and user-based co-occurrence relationships. It also includes a demonstration of a scalable and operationalizable framework to detect domain associations by analyzing the web traffic of users in any organization.
Additional information:
http://www.datasciencecentral.com/video/dsc-webinar-series-data-science-driven-approaches-to-malware
Leading organizations today all have data scientists and analytics teams. A key challenge is establishing cross-functional teams that can collaboratively derive insights from data and move exploratory interactive analytics into automated production systems. Boston Consulting Group, founded on quantitative decision making, guides global F500 companies in the technical and organizational structures that will provide a foundation for agility, innovation, and competitive advantage. This talk will outline key strategies for building effective cloud-native analytics teams.
Building a Real-Time Security Application Using Log Data and Machine Learning... (Sri Ambati)
Building a Real-Time Security Application Using Log Data and Machine Learning - Karthik Aaravabhoomi
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
It is almost impossible to escape the topic of Data Science. While the core of Data Science has remained the same over the last decade, its emergence to the forefront is spurred by both the availability of new data types and a true realization of the value that it delivers. In this session, we will provide an overview of data science and the different classes of machine learning algorithms, and deliver an end-to-end demonstration of performing Machine Learning Using Hadoop. Audience: Developers, Data Scientists, Architects and System Engineers.
Recording: https://hortonworks.webex.com/hortonworks/lsr.php?RCID=4175a7421d00257f33df146f50c41af8
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera (Cloudera, Inc.)
Transitioning to a Big Data architecture is a big step, and the complexity of moving existing analytical services onto modern platforms like Cloudera can seem overwhelming.
Introduction to Deep Learning and AI at Scale for Managers (DataWorks Summit)
Deep Learning and the new wave of AI are inevitably coming to your business area. If you are a manager and you are trying to make sense of all the buzzwords, this session is for you. We will show you what Deep Learning is in a way that lets you understand how it works and how you can apply it. We then expand the scope and apply deep learning and AI techniques in the Big Data context. You will learn about things that don't work out so well, and about the risks and challenges in both applying and developing with deep learning and AI technologies. We conclude with practical guidance on how to add the exciting deep learning and AI capabilities to your next project.
Outline:
- The path to Deep Learning
- From machine learning to Deep Learning
- But how does it work?
- Deep Learning architectures
- Deep Learning applications
- Deep Learning at scale
- Running AI at scale
- Deep learning at Scale using Spark
- The trouble with AI
- Application challenges
- Development challenges
- How to start your first Deep Learning project
The 5 Biggest Data Myths in Telco: Exposed (Cloudera, Inc.)
More than any business, telecommunications firms have long been dealing with huge, diverse sets of data. Big Data. Data that is unstructured, unwieldy and disorganised, making it difficult to analyse and costly to manage. Your landscape is fiercely competitive and you instinctively know it's exactly that data that would allow you to be more innovative. Data that would set you apart from the competition. You would like to realise its true potential yet you have concerns around security, RoI or integration with existing data management solutions.
In this talk, we introduce the Data Scientist role, differentiate investigative and operational analytics, and demonstrate a complete Data Science process using Python ecosystem tools such as IPython Notebook, Pandas, Matplotlib, NumPy, SciPy and Scikit-learn. We also touch on the use of Python in a Big Data context, using Hadoop and Spark.
To Serve and Protect: Making Sense of Hadoop Security (Inside Analysis)
The Briefing Room with Dr. Robin Bloor and HP Security Voltage
Live Webcast September 22, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=45ece7082b1d7c2cc8179bc7a1a69ea5
Hadoop is rapidly becoming a development platform and dominant server environment, and organizations are keen to take advantage of its massively scalable – and relatively inexpensive – resources. It is not, however, without its limitations, and it often requires a contingent of complementary components in order to behave within an information architecture. One area often overlooked is security, a factor that, if not considered from the outset, can introduce great risk when putting sensitive data in Hadoop.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he discusses how security was never a design point for Hadoop and what organizations can do about it. He’ll be briefed by Sudeep Venkatesh of HP Security Voltage, who will explain the intricacies surrounding a secure Hadoop implementation. He will show how techniques like format-preserving and partial-field encryption can allow for analytics over protected data, with zero performance impact.
Visit InsideAnalysis.com for more information.
Session 4 - Telemetry and the Internet of Things in Research (Jürgen Ambrosi)
In this session we will see a practical demonstration of the enabling technologies of the Internet of Things and together analyze real-world use cases, from research to industry.
How Cloudera SDX can aid GDPR compliance 6.21.18 (Cloudera, Inc.)
In this webinar, we will cover:
Technical capabilities required in your data platform including metadata classification on ingest, column-level lineage, fine-grained authorization, encryption, and more
How a shared data experience can facilitate the safe handling of metadata
Ways to enable your data platform for GDPR success
Session 1, Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga... (Jürgen Ambrosi)
Data is the new capital: like financial capital, it is a resource that must be managed, collected, and kept safe, but it must also be invested by organizations that want to gain a competitive advantage. Data is not a new resource, but only today, for the first time, is it available in abundance together with the technologies needed to maximize its return. In just the same way, electricity was a laboratory curiosity for a long time, until it was made available to the masses and thereby completely changed the face of modern industry. That is why accelerating change requires an innovative approach to executing Big Data initiatives: an analytics laboratory as a catalyst for innovation (Data Lab). In this webinar on Oracle technologies, we will use our usual storytelling approach based on use cases and concrete experience.
An overview of some methods and principles for big data visualization. The presentation quickly hits on the topic of dashboards and some cyber security uses. The topic of a big data lake is also briefly discussed in the context of a cyber security big data setup.
2016 Cybersecurity Analytics State of the Union (Cloudera, Inc.)
3 Things to Learn About:
- Ponemon Institute's 2016 big data cybersecurity analytics research report
- Quantifiable returns organizations are seeing with big data cybersecurity analytics
- Trends in the industry that are affecting cybersecurity strategies
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede... (StampedeCon)
At StampedeCon 2014, Stephen O’Sullivan (Silicon Valley Data Science) presented "Beyond a Big Data Pilot: Building a Production Data Infrastructure."
Creating a data architecture involves many moving parts. By examining the data value chain, from ingestion through to analytics, we will explain how the various parts of the Hadoop and big data ecosystem fit together to support batch, interactive and realtime analytical workloads.
By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including data acquisition, ingestion, storage, data services, analytics and data management. Most importantly, we’ll leave you with a framework for understanding these options and making choices.
This presentation was prepared by one of our renowned tutors, "Suraj".
If you are interested in learning more about Big Data, Hadoop, and Data Science, join our free Introduction class on 14 Jan at 11 AM GMT. To register your interest, email us at info@uplatz.com
A talk from AnacondaCON presenting my personal journey from physics to finance to biology and how collaborative team-based data science has been the big enabler. The talk looks at Python, Big Data, Jupyter Notebooks, Anaconda. Discusses CERN LHCb particle physics computing, protein structure determination, and patterns in data science.
Due to recent advances in technology, humanity is collecting vast amounts of data at an unprecedented rate, making the skills necessary to mine insights from this data increasingly valuable. So what does it take for a Developer to enter the world of data science?
Join me on a journey into the world of big data and machine learning where we will explore what the work actually looks like, identify which skills are most important, and design a road map for how you too can join this exciting and profitable industry.
TechWise with Eric Kavanagh, Dr. Robin Bloor and Dr. Kirk Borne
Live Webcast on July 23, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=59d50a520542ee7ed00a0c38e8319b54
Analytical applications are everywhere these days, and for good reason. Organizations large and small are using analytics to better understand any aspect of their business: customers, processes, behaviors, even competitors. There are several critical success factors for using analytics effectively: 1) know which kind of apps make sense for your company; 2) figure out which data sets you can use, both internal and external; 3) determine optimal roles and responsibilities for your team; 4) identify where you need help, either by hiring new employees or using consultants; 5) manage your program effectively over time.
Register for this episode of TechWise to learn from two of the most experienced analysts in the business: Dr. Robin Bloor, Chief Analyst of The Bloor Group, and Dr. Kirk Borne, Data Scientist, George Mason University. Each will provide their perspective on how companies can address each of the key success factors in building, refining and using analytics to improve their business. There will then be an extensive Q&A session in which attendees can ask detailed questions of our experts and get answers in real time. Registrants will also receive a consolidated deck of slides, not just from the main presenters, but also from a variety of software vendors who provide targeted solutions.
Visit InsideAnalysis.com for more information.
Extract business value by analyzing large volumes of multi-structured data from various sources such as databases, websites, blogs, social media, smart sensors...
Watch full webinar here: https://bit.ly/3H4vrlD
Key points: data is a strategic imperative for any company to compete; a new, common self-service data experience is required for all things intelligent; a modern data platform focuses on producing data products; data platform, product, people, and process are the key solution ingredients; and Denodo is the future, so the time to get started is now.
LoQutus helps organisations to innovate with analytics and to get insights with data visualisation. We also build large scale data layers to enable interaction with core data, and develop data-driven applications to deliver the insights our customers need. During this session we’ll share what we have learned along the way. We’ll show you our framework for self-service analytics & insights, and some successful case studies.
Building enterprise advance analytics platform (Haoran Du)
By Raymond Fu - Practice Architect
This lecture talks about the best practices in building an advanced analytics platform to help companies apply machine learning, deep learning and data science to their structured and unstructured data.
Presented at the Southern California Data Science Conference, Sept. 25, 2016, at USC
http://socaldatascience.org/
http://www.datalaus.com/en/
Agile & Data Modeling – How Can They Work Together? (DATAVERSITY)
A tenet of the Agile Manifesto is ‘Working software over comprehensive documentation’, and many have interpreted that to mean that data models are not necessary in the agile development environment. Others have seen the value of data models for achieving the other core tenets of ‘Customer Collaboration’ and ‘Responding to Change’.
This webinar will discuss how data models are being effectively used in today’s Agile development environment and the benefits that are being achieved from this approach.
Building the Artificially Intelligent Enterprise (Databricks)
This session looks at where we are today with data and analytics and what is needed to transition to the Artificially Intelligent Enterprise.
How do you mobilise developers to exploit what data scientists and business analysts have built? How do you align it all with business strategy to maximise business outcomes? How do you combine BI, predictive and prescriptive analytics, automation and reinforcement learning to get maximum value across the enterprise? What is the blueprint for building the artificially intelligent enterprise?
• Data and analytics – Where are we?
• Why is the journey only half-way done?
• 2021 and beyond – The new era of AI usage and not just build
• The requirement – event-driven, on-demand and automated analytics
• Operationalising what you build – DataOps, MLOps and RPA
• Mobilising the masses to integrate AI into processes – what needs to be done?
• Business strategy alignment – the guiding light to AI utilisation for high reward
• Agility step change – the shift to no-code integration of AI by citizen developers
• Recording decisions, and analysing business impact
• Reinforcement-learning – transitioning to continuous reward
Data lakes can scale in step with the cloud, break down integration barriers and data silos, and pave the way for new business opportunities. All of this helps give management and employees a better basis for decisions. Come and hear how.
David Bojsen, Architect, Microsoft
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E... (DATAVERSITY)
Many data scientists are well grounded in how to succeed in the enterprise, but many come from outside – from academia, from PhD programs and research. They have the necessary technical skills, but it doesn’t count until their product gets to production and is in use. The speaker recently helped a struggling data scientist understand his organization and how to create success in it. That turned into this presentation, because many new data scientists struggle with the complexities of an enterprise.
Building a Data Driven Culture and AI Revolution With Gregory Little | Curren... (HostedbyConfluent)
Building a Data Driven Culture and AI Revolution With Gregory Little | Current 2022
Transforming business or mission through AI/ML doesn't start with technology but with culture…and an audit. At least that much is true for the US Department of Defense (DoD), which presents significant modernization challenges because of its mission scope, expansive global footprint, and massive size: with over 2.8 million people, it is the largest employer in the world. Greg Little discusses how establishing the DoD’s annual audit became a surprising accelerator for the department’s data and analytics journey. It revealed the foundational data management needs of running an enterprise with $3 trillion in assets, and its successful implementation required breaking through deeply entrenched cultural and organizational resistance across DoD.
In this session, Greg will discuss what it will take to guide the evolution of technology and culture in parallel: leadership, technology that enables rapid scale and a complete & reliable data flow, and a data driven culture.
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...Hortonworks
There certainly is no shortage of hype when it comes to the term “Big Data”. One thing we can be sure of is that massive data volumes are driving a new modern data architecture that includes Hadoop in the mix. But what does that architecture look like for Business Intelligence Data Strategy?
Join Hortonworks and MicroStrategy, where we’ll:
• Discuss the modern architecture for Business Intelligence on top of Hadoop as a data source.
• Learn how our joint solution helps enterprises store, process and analyze vast amounts of structured and unstructured data to deliver business insights throughout an organization.
• Discover what new benefits Hadoop 2.0 offers and how the MicroStrategy Analytics platform leverages those new features to improve performance, achieve faster access times, and allow for true interactive visual data discovery.
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects demand and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home (“WFH”), alongside the ever-expanding need for data storage as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment, will drive market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow to more than 3.6x their current value in 2026, will likely help propel data center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
2. Education
• PhD (ABD)
• MBA
• MA Financial Mgmt
• Cert. Finance
• GD IT Mgmt
• MA Com Tech
Experience
• SAS Institute: Sr. Mgr. Business Solutions
• Deloitte: Manager Analytics
• Nyenrode University: Lecturer Analytics
• SARK7: Owner / Principal Consultant
• Genentech Inc. / Roche: Principal Analyst / Sr. Mgr.
• Atradius: Sr. R&D Engineer
• CFSI: CIO
Data Scientist
Cyber Analytics
scott.mongeau@sas.com
+31 (0)64 235 3427
Scott Allen Mongeau
Certified Analytics Professional (CAP)
YouTube
• Introduction to Advanced Analytics
• Introduction to Cognitive Analytics
• TedX RSM: Data Analytics
Blog: sctr7.com
Twitter: sark7
Web: sark7.com
IT solutions, research methods, finance, data analytics, consulting
3. DATA ANALYTICS MARKET LEADER
• 40 years of business analytics
• #1: world’s largest privately held software company
• 14,000 SAS employees worldwide
• 93 of the top 100 companies on the Global 500 list
• 80,000+ customer sites in 148 countries
• US $3.2 B: continuous revenue growth since 1976
• 23% annual reinvestment in R&D
5. MOORE’S LAW: EXPONENTIAL GROWTH OF COMPUTING POWER
Figure labels: 25,000 x; Home computers; High-capacity servers; Smartphone explosion; Cloud, AI / Watson, IoT; 2015
38. Fair use: illustrate publication and article of issue in question. The Economist.
http://en.wikipedia.org/wiki/Category:Fair_use_The_Economist_magazine_covers
41. Public domain Agricultural Research Service
http://en.wikipedia.org/wiki/File:Orange_juice_1.jpg
GNU Free Documentation License: Ibanix Suzuki Shahid DL650 motorcycle
http://commons.wikimedia.org/wiki/File:Suzuki_vstrom_dl650_motorcycle.jpg
43. Supervised learning - predictive
• k-Nearest Neighbors (k-NN)
• Decision Trees (DT) (random forests, boosted trees)
• Naïve Bayes classifier
• Neural networks
• Support Vector Machine (SVM)
• Ensembles / Ensemble Learning
Decision Tree
Machine Learning
Support Vector Machines
44. MACHINE LEARNING PREDICTION (SUPERVISED)
Figure labels: CAR Engine; Training set; Validation set; Non-criminal / Criminal; NORMAL / UNUSUAL; Device; Time of day; Source location; IP; Threat intelligence; Amount; At risk profile; Destination location; Secure profile; Known devices; Average amount; Known location; Known destination
45. EXAMPLE MACHINE LEARNING TOOLS
Open source
• R
• Python
• Weka
Commercial
• SAS BASE & JMP
• SAS Enterprise Miner
• IBM SPSS
• Oracle Data Mining
• RapidMiner
Ranjit Bose (2009), "Advanced analytics: opportunities and challenges", Industrial Management & Data Systems, Vol. 109 Iss 2, pp. 155-172. http://dx.doi.org/10.1108/02635570910930073
48. 4848
• Data preparation
• Model development
• Model management
• Model deployment
http://www.sas.com/en_gb/insights/articles/analytics/Industrialize-your-analytics-today.html
50. CONFUSION MATRIX
A confusion matrix separates out the decisions made by the classifier, making explicit how one class is being confused for another. In this way different sorts of errors may be dealt with separately.
Provost & Fawcett. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking. Chapter 7: Decision Analytic Thinking
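As a minimal sketch of the confusion matrix described on this slide, the following counts the four decision outcomes of a binary classifier. The `confusion_matrix` helper, the "fraud" / "ok" labels, and the data are hypothetical and invented purely for illustration; they are not taken from the slides.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct (TP) or a false alarm (FP).
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a miss (FN) or correct (TN).
            counts["FN" if a == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical labels, invented for illustration only.
actual = ["fraud", "ok", "ok", "fraud", "ok", "ok", "fraud", "ok"]
predicted = ["fraud", "ok", "fraud", "ok", "ok", "ok", "fraud", "ok"]

cm = confusion_matrix(actual, predicted, positive="fraud")
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print(cm, accuracy)
```

Keeping false positives and false negatives as separate counts is what lets the two kinds of error be costed and handled differently, rather than being folded into a single accuracy figure.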
51. RECEIVER OPERATING CHARACTERISTICS (ROC) & AREA UNDER THE CURVE (AUC)
“A ROC graph is a two-dimensional plot of a classifier with false positive rate on the x axis against true positive rate on the y axis. ROC graph depicts relative trade-offs that a classifier makes between benefits (true positives) and costs (false positives).”
Provost; Fawcett. Data Science for Business. Chapter 8: Visualizing Model Performance
Area Under the Curve (AUC): area under a classifier’s curve expressed as a fraction of the unit square. Its value ranges from zero to one.
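The ROC and AUC definitions on this slide can be sketched in a few lines, assuming a classifier that outputs a score per case. The function names `roc_points` and `auc`, and the labels and scores, are hypothetical illustrations rather than anything from the slides.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points,
    obtained by lowering the decision threshold one score at a time."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case sharing this score (ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: 1 = positive class, scores from some classifier.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points, area)
```

An AUC of 0.5 corresponds to the random diagonal and 1.0 to a perfect ranking; in this toy data 8 of the 9 positive/negative pairs are ranked correctly.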
52. CUMULATIVE RESPONSE / LIFT CURVE
• How much the line representing the model performance is lifted up over the random performance diagonal
Provost; Fawcett. Data Science for Business. Chapter 8: Visualizing Model Performance
• E.g. “our model gives a two times (or a 2X) lift”: this means that at the chosen threshold (often not mentioned), the lift curve shows that the model’s targeting is twice as good as random
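A small sketch of how a lift figure such as "2X" can be computed, assuming lift is defined as the response rate among the top-scored fraction of cases divided by the overall response rate. The `lift_at` helper and the data are hypothetical and chosen only to illustrate the dependence on the threshold.

```python
def lift_at(labels, scores, fraction):
    """Lift when targeting the top `fraction` of cases ranked by score."""
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: -pair[0])]
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(ranked[:k]) / k            # response rate in the targeted group
    base_rate = sum(ranked) / len(ranked)     # response rate under random targeting
    return top_rate / base_rate


# Hypothetical data: 4 responders among 10 cases.
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

lift_top_half = lift_at(labels, scores, 0.5)
lift_top_fifth = lift_at(labels, scores, 0.2)
print(lift_top_half, lift_top_fifth)
```

The same model yields a different lift at each threshold (2.0 for the top half, 2.5 for the top fifth here), which is why a quoted lift is hard to interpret without the threshold.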
59. DATA ANALYTICS DRIVERS: V4C
Social and mobile
Data analytics
Interactive platforms
Real-Time systems
V4C
• VOLUME
• VELOCITY
• VARIETY
• VARIABILITY
• COMPLEXITY
60. MODEL ERRORS: INHERENT RANDOMNESS
• Cases where prediction is not “deterministic”
• Bayes rate
• Theoretical maximum accuracy that can be achieved for a problem
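To make the Bayes rate concrete, here is a minimal sketch assuming a problem whose true probabilities are fully known, which is never the case in practice. The segments, their probabilities, and the `bayes_rate` helper are hypothetical. Within each segment the best possible classifier predicts the more likely class, so its accuracy there is max(p, 1 - p).

```python
def bayes_rate(p_segment, p_positive_given_segment):
    """Maximum achievable accuracy when the true probabilities are known."""
    return sum(p_segment[s] * max(p, 1 - p)
               for s, p in p_positive_given_segment.items())


# Hypothetical, fully known distribution (invented for illustration).
p_segment = {"known device": 0.7, "new device": 0.3}
p_fraud_given_segment = {"known device": 0.05, "new device": 0.40}

ceiling = bayes_rate(p_segment, p_fraud_given_segment)
print(ceiling)
```

With these invented numbers no model using only the segment can exceed roughly 84.5% accuracy, however much data it is trained on, because the outcome is not deterministic given the input.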
61. MODEL ERRORS: BIAS
• Bias: even with ‘Big Data’, model will never reach perfect accuracy of true model
• Example
• Linear regression model to predict response to an advertising campaign…
• Model is an abstraction…
• True model always more complex
62. MODEL ERRORS: VARIANCE
• Variance: procedures with more variance tend to produce models with larger errors
• Accuracy tends to vary across training sets
• Given finite sample set…
• Different models emerge from different samples
• Different models tend to have different accuracy
63. BALANCE: BIAS VERSUS VARIANCE
Big Data
• Complex model
• Many variables
• Low bias…
• but high variance
• Subject to overfitting
Strong models
– Tested abstraction
– Few, but significant variables
– Low variance…
– but high bias
Jno. T-62 tank in Russian service. http://www.aviation.ru/jno/Kubinka02
http://commons.wikimedia.org/wiki/File:T-62_tank_in_Russian_service_(2).jpg
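The bias versus variance balance can be illustrated with a small simulation. This is a sketch under stated assumptions: the true function (a sine wave), the noise level, the sample size, and the two model families (a constant-mean predictor as the simple model, a 1-nearest-neighbour predictor as the flexible one) are all choices made here for illustration and do not come from the slides.

```python
import math
import random
from statistics import mean, pvariance


def true_f(x):
    # Assumed "true model"; unknown in any real problem.
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0, noise_sd)) for x in xs]


def fit_constant(train):
    """Simple model: always predict the mean training response."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_1nn(train):
    """Flexible model: predict the response of the nearest training point."""
    return lambda x: min(train, key=lambda pt: abs(pt[0] - x))[1]


def bias2_and_variance(fit, rng, repeats=200):
    """Estimate squared bias and variance over many resampled training sets."""
    test_xs = [i / 20 + 0.025 for i in range(20)]
    preds = [[] for _ in test_xs]
    for _ in range(repeats):
        model = fit(sample_training_set(rng))
        for i, x in enumerate(test_xs):
            preds[i].append(model(x))
    bias2 = mean((mean(p) - true_f(x)) ** 2 for p, x in zip(preds, test_xs))
    variance = mean(pvariance(p) for p in preds)
    return bias2, variance


rng = random.Random(0)
simple_bias2, simple_var = bias2_and_variance(fit_constant, rng)
flexible_bias2, flexible_var = bias2_and_variance(fit_1nn, rng)
print("constant model: bias^2=%.3f variance=%.3f" % (simple_bias2, simple_var))
print("1-NN model:     bias^2=%.3f variance=%.3f" % (flexible_bias2, flexible_var))
```

In this setup the constant model barely changes from one training set to the next but is systematically wrong, while the nearest-neighbour model tracks the true curve on average but changes noticeably with each sample.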
67. EXPLANATORY ANALYTICS
• Explanatory performance NOT EQUAL to predictive efficacy (and vice versa), difference between inductive and deductive methods/thinking
• This is a (sometimes heated) methodological debate amongst practitioners/academics…
• Is it really a debate, or a religious (professional/Kuhnian) dispute? Econometrics + machine learning (H. Varian)
68. MACHINE LEARNING AND ECONOMETRICS
• Varian, Hal R. 2014. Machine Learning and Econometrics. Stanford lecture slides: https://web.stanford.edu/class/ee380/Abstracts/140129-slides-Machine-Learning-and-Econometrics.pdf
• Varian, Hal R. 2013. Big Data: New Tricks for Econometrics. Paper: http://people.ischool.berkeley.edu/~hal/Papers/2013/ml.pdf
69. MODEL MANAGEMENT
• Ensemble learning…
• Promising – averages over many predictive cases to reduce impact of variance
• However, is CORRELATIVE, not CAUSAL
• CAUSAL data analysis requires
• Investment in data acquisition
• Similarity measurements
• Expected value calculations
• Correlation understanding
• Identifying informative variables
• Fitting equations to data
• Significance testing
• Domain knowledge
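The claim that ensemble learning averages over many predictors to reduce the impact of variance can be sketched with bagging (bootstrap aggregation), one common ensemble technique. The slides do not name a specific method, so bagging is a choice made here for illustration. The true function, noise level, and the 1-nearest-neighbour base model are likewise assumptions.

```python
import math
import random
from statistics import mean, pvariance


def true_f(x):
    # Assumed "true model", for simulation purposes only.
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0, noise_sd)) for x in xs]


def predict_1nn(train, x):
    """High-variance base model: response of the nearest training point."""
    return min(train, key=lambda pt: abs(pt[0] - x))[1]


def predict_bagged(train, x, rng, n_models=25):
    """Average 1-NN predictions over bootstrap resamples of the training set."""
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]  # sample with replacement
        preds.append(predict_1nn(boot, x))
    return mean(preds)


def prediction_variance(predict, rng, repeats=100):
    """Average variance of predictions across independent training sets."""
    test_xs = [0.1, 0.3, 0.5, 0.7, 0.9]
    preds = [[] for _ in test_xs]
    for _ in range(repeats):
        train = sample_training_set(rng)
        for i, x in enumerate(test_xs):
            preds[i].append(predict(train, x))
    return mean(pvariance(p) for p in preds)


rng = random.Random(1)
single_var = prediction_variance(predict_1nn, rng)
bagged_var = prediction_variance(
    lambda train, x: predict_bagged(train, x, rng), rng)
print("single 1-NN variance: %.3f" % single_var)
print("bagged 1-NN variance: %.3f" % bagged_var)
```

The averaged predictor varies less across training sets than a single base model, but it remains a correlative predictor: nothing in the averaging step says anything about cause and effect.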