This document discusses the vast amount of data available today and the limits of human data processing. It proposes using algorithms and servers to analyze large amounts of data in order to maximize return on information. It recommends shifting the awareness curve to the right through a single access point for data, and shifting the action curve to the left through tools that are easy to use, relevant, personalized, and flexible. This would inject collaboration into processes and provide benefits such as increased reaction speed and decision efficiency while decreasing decision delays and barriers to use.
Data Science deals with the extraction of valuable insights from an incredible number of sources in an endless number of formats. This session will go through a typical workflow using practical tools and tricks. This will give you a basic understanding of Data Science in the Cloud. The examples will show the steps that are needed to build and deploy a model to predict traffic collisions with weather data.
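The session's actual pipeline isn't reproduced here, but as a rough sketch of the final modeling step (relating weather features to collision counts), here is a minimal, self-contained example on synthetic data. The feature names, the assumed relationship, and the linear model are all illustrative assumptions, not the session's method.

```python
import numpy as np

# Synthetic stand-ins for weather features; the real session's data,
# column names, and model choice are unknown, so everything is illustrative.
rng = np.random.default_rng(42)
n = 500
precip = rng.exponential(2.0, n)          # daily precipitation (mm)
vis = rng.uniform(1.0, 10.0, n)           # visibility (km)
# Assumed relationship: rain raises collision counts, visibility lowers them.
collisions = 5.0 + 0.8 * precip - 0.3 * vis + rng.normal(0.0, 1.0, n)

# Fit a linear model by ordinary least squares.
A = np.column_stack([np.ones(n), precip, vis])
coef, *_ = np.linalg.lstsq(A, collisions, rcond=None)
pred = A @ coef
r2 = 1 - np.sum((collisions - pred) ** 2) / np.sum((collisions - collisions.mean()) ** 2)
print(f"coefficients: {coef.round(2)}, R^2 = {r2:.2f}")
```

Because the synthetic data contain real signal, the fitted coefficients recover the assumed ones; a cloud deployment would wrap a model like this behind a prediction endpoint.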
My recent presentation covering what Big Data is, why there is so much hype now, startling facts, the opportunity, the history, important research papers such as GFS and MapReduce, technology platforms and organizations (Hadoop, Cassandra), an introduction to Hadoop, and the contributions of Indians working on various Big Data technologies at Google, Cloudera, Hortonworks, Yahoo, Facebook, and Aadhaar. "All your answers lie in data" (@Sameer Sawhney)
A discussion of what happens in 60 seconds, big data, and the 4 V's of big data. The presentation also covers analytics, its evolution, and its applications.
We all know real-time data has value. But how do you quantify that value in order to create a business case for becoming more data-driven, or event-driven? The first half of this talk will explore the value of data across a variety of organizations, starting with the five most valuable companies in the world: Apple, Alphabet (Google), Microsoft, Amazon, and Facebook. We will go on to discuss other digital natives (Uber, eBay, Netflix, and LinkedIn) before exploring more traditional companies across retail, finance, and automotive. Whether organizations are using data to create new business products and services, improve user experiences, increase productivity, manage risk, or influence global power, we'll see that fast and interconnected data, or 'event streaming', is increasingly important. After showing that data value can be quantified, the second half of this talk will explain the five steps to creating a business case around Kafka use cases.
Most businesses focus on:
1. Making more money, or conferring competitive advantage to make more money
2. Increasing efficiency to save money, and / or
3. Mitigating risk to the business, to protect money.
We’ll walk through examples of real business cases, discuss how business cases have evolved over the years and show the power of a sound business case. If you’re interested in Big Money and Big Business, as well as Big Data, this talk is for you.
COMEX 2017 Smart Talks by Amjid Ali, Muscat, Oman. Covering an introduction to big data, big data definitions, the big data revolution, a big data timeline, Hadoop and MapReduce, the importance of storage (including DNA storage), OceanStor 9000, Microsoft R, and Spark.
Elasticsearch: breakfast briefing of March 13, 2014, ALTER WAY
Elasticsearch is a very powerful open-source search engine built on Apache Lucene. It can index millions of records and search and analyze them in real time. The Elasticsearch tools are already used by leading players such as FourSquare, GitHub, OpenDataSoft, and Dailymotion. Alter Way and Elasticsearch invite you to come discover the Elasticsearch suite, finally available in version 1.0 and ready for production!
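As a small illustration of the indexing workflow the briefing describes, the sketch below builds the newline-delimited JSON body that Elasticsearch's `_bulk` API expects. The index name and documents are made up for the example, and no cluster is contacted here.

```python
import json

def bulk_body(index, documents):
    """Serialize documents into the newline-delimited JSON the _bulk API expects:
    one action line ({"index": ...}) followed by one source line per document."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the body must end with a newline

docs = [
    {"user": "foursquare", "message": "checked in"},
    {"user": "github", "message": "pushed a commit"},
]
body = bulk_body("events", docs)
# POST this body to http://localhost:9200/_bulk with
# Content-Type: application/x-ndjson to index both documents.
print(body)
```

Batching documents this way, rather than one HTTP request per document, is what makes indexing millions of records practical.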
How does LightsOutPlanning support you in the challenge of the digital supply chain? Seize the chance to leverage your data via smart algorithms and machine learning, stay at the leading edge of real-time planning, and make sound decisions.
Discover the key features of the tool: (1) the virtual twin, (2) the E2E optimizer, (3) the smart parameters module, (4) E2E visibility, and (5) the control tower.
Start your digital supply chain planning journey today!
Presentation at Logipharma event 2018 by bluecrux.
My perspective on the evolution of big data as a distributed systems researcher and engineer: the background of how it got started, the scale-out paradigm, industry use cases, the open-source development paradigm, and interesting future challenges.
SF Big Analytics Meetup - Exact Count Distinct with Apache Kylin, presented by Samantha Berlant
With over 450 million customers, Didi (world’s largest rideshare company) conducts complex user behavior analysis on huge datasets daily. Exact Count Distinct is one of Didi’s most critical metrics, but it is known for being computationally heavy and notoriously slow. The difference between exact Count Distinct and approximate Count Distinct can cost Didi millions of dollars. In this talk, Kaige Liu of the Apache Kylin project will explain how Didi uses Apache Kylin to return exact Distinct Count on billions of rows of data with sub-second latency to generate the most accurate picture of its business.
You will also learn about the latest development in modern OLAP technologies. Kaige will share how Didi and Truck Alliance (a truck-hailing company that processes $100 billion worth of goods yearly) use Apache Kylin to power their analytics platforms that allow 100s of analysts to achieve sub-second latency on petabyte-scale data.
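As a toy illustration of why exact Count Distinct can be made both mergeable and fast, here is the bitmap idea in miniature. Kylin uses compressed Roaring bitmaps over dictionary-encoded IDs; plain Python integers stand in here, and the user IDs are made up.

```python
# Each data segment keeps a bitmap of the distinct user IDs it saw.
# Merging segments is then a cheap bitwise OR, and the final exact
# cardinality is just the population count of the merged bitmap.

def segment_bitmap(user_ids):
    """Encode a segment's distinct user IDs as set bits of one integer."""
    bm = 0
    for uid in user_ids:
        bm |= 1 << uid
    return bm

segment_a = segment_bitmap([1, 5, 9, 5])   # duplicates collapse automatically
segment_b = segment_bitmap([2, 5, 10])

merged = segment_a | segment_b             # merge = bitwise OR
exact_distinct = bin(merged).count("1")
print(exact_distinct)                      # 5 distinct users: {1, 2, 5, 9, 10}
```

Unlike approximate sketches such as HyperLogLog, this union loses no information, which is why precomputed bitmaps can return exact results with sub-second latency.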
Building Confidence in Big Data - IBM Smarter Business 2013, IBM Sverige
Success with big data comes down to confidence. Without confidence in the underlying data, decision makers may not trust and act on analytic insight. You need confidence in your data: that it is correct, trusted, and protected through automated integration, visual context, and agile governance. You need confidence in your ability to accelerate time to value, with fast deployments of big data appliances. Learn how clients have succeeded with big data by building confidence in their data, their ability to deploy, and their skills. Presenter: David Corrigan, Big Data specialist, IBM. More from the day at http://bit.ly/sb13se
Yes, we face a data deluge and big data seems to be largely about how to deal with it. But 99% of what has been written about big data is focused on selling hardware and services. The truth is that until the concept of big data can be objectively defined, any measurements, claims of success, quantifications, etc. must be viewed skeptically and with suspicion. While both the need for and approaches to these new requirements are faced by virtually every organization, jumping into the fray ill-prepared has (to date) reproduced the same dismal IT project results.
• The very real, very rapid, very great increases in data of all forms (charts showing data types and volume increases)
• Challenges faced by virtually all data management programs
• Means by which big data techniques can complement existing data management practices
• Necessary but insufficient prerequisites to exploiting big data techniques
• The prototyping nature of practicing big data techniques
You can sign up for future Data-Ed webinars here: http://www.datablueprint.com/resource-center/webinar-schedule/
7. 2.5 QBytes*
Some web-data facts:
• 340 million tweets a day.
• 684,000 new shares on Facebook a day.
• 72 hours of video uploaded to YouTube a minute.
• $2.7 billion spent on web shopping a day.
• 2 million search queries on Google a minute.
• 27,000 new posts on Tumblr a minute.
• 400,000 comments on WordPress posts a day.
• 3,600 new photos on Instagram a minute.
• 571 new websites launched a minute.
THE ISSUE… LOTS OF DATA (WEB-ENABLED DATA)
Sources: IDG, McKinsey, The Economist, IBIS World Group, IBM
8. 2.5 QBytes*
Some examples:
• 2.5 petabytes of Walmart transactions.
• 10 terabytes of sensor data from a 30-minute flight.
• 1 terabyte of NYSE data per day.
• $6 trillion big data cost.
THE ISSUE… LOTS OF DATA (INTERNAL DATA)
Sources: IDG, McKinsey, The Economist, IBIS World Group, IBM
28. INJECT COLLABORATION IN PROCESSES
DATA FEED MANAGEMENT • AWARENESS RAISING • KNOWLEDGE TRANSFER • DECISION MAKING • ANALYTICS • TASK EXECUTION • INSIGHTS DISTRIBUTION • REPORTING
Having fallen in love with web technologies, 8 years back I started my first company. 18 months back I started my 4th project. And what we learned in this crazy, intense time about the way corporate organizations function, and could function even more efficiently in the future, is something I want to share with you. I warn you up front: at times it may appear a pretty eclectic journey, as we will stumble upon all sorts of things like big data, the quality-of-data economy, information intelligence, enterprise software consumerization, novel concepts of human resource management, even Leonardo da Vinci. But in the end it made perfect sense, at least for us. Everything for us started with a vaccine malfunction somewhere in Brazil. The pharmaceutical company that produced the vaccine sold it to a health organization, that organization didn't store it properly, and the vaccine unfortunately caused a lot of serious damage once injected. For some reason, the producer learned about it a whole 26 days later, when it was really too late. It could not intervene on time, spare a lot of suffering, and avoid the quite expensive litigation case afterwards. Because one of my partners in our current venture was a close professional witness to what actually happened, we got a lot of specific insights into why it came to this. At first sight it seemed the reporting system had let this pharma company down. Driven by inertia, it was neglecting the free web in general and, because of processing…
During the R&D stage.
We felt that we (my partners and I) had what it takes to change this for good:
We took a deep look into the matter and realized that the pharma company was let down by its information intelligence system. It relied heavily on….. All three of these points are inappropriate for the times we live in. Why were they inappropriate?
How can you process the 2.5 QB of information produced every day? Sue Feldman: "In today's market, there are massive volumes and a variety of data being generated at an alarming speed."
It is physiologically impossible for the human brain to process all the available information. We all know why businesses need to process that data. 1.25 terabytes: the amount of data the human brain can hold; it performs at roughly 100 teraflops (Ray Kurzweil, as cited by IBM's Tony Pearson).
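To put the two figures quoted above side by side, a quick back-of-the-envelope calculation shows why human processing cannot keep up:

```python
# Using the slide's figures: ~2.5 quintillion bytes of data generated per day
# ("2.5 QBytes") versus a 1.25 TB human brain capacity (Kurzweil/Pearson).
DATA_PER_DAY_BYTES = 2.5e18        # 2.5 quintillion bytes per day
BRAIN_CAPACITY_BYTES = 1.25e12     # 1.25 terabytes

brains_per_day = DATA_PER_DAY_BYTES / BRAIN_CAPACITY_BYTES
print(f"{brains_per_day:,.0f} brain-capacities of data per day")  # 2,000,000
```

On these numbers, the world produces about two million brain-capacities of data every day, which is the gap the algorithms-and-servers approach is meant to close.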
…because they need to find relevant facts, patterns and relationships that help them act efficiently. Because gut feel is no longer acceptable.
This is why the most competitive organizations globally turned to a combination of mathematical techniques and computing power to uncover vital intelligence that exists in the data. This empowers businesses to tap into the wealth of available data to make decisions based on "what they know to be true" rather than "what they believe to be true". In general, every company out there tries to maximize ROI on information.
This ROI on information is actually a sweet spot every enterprise is looking for, where, at an affordable expense, the company gets data good enough to keep the losses due to low data quality in check.
As we wanted to solidify and improve this situation, it was paramount to find ways to shift the violet curve to the right and lean the red curve towards the left as much as possible, helping companies end up with higher data quality at the same or even lower expense and, as a result, dramatically lower costs due to low data quality.
During the R&D stage.
It was relatively easy to realize that automation, unification, and centralization would deliver on the first one.
As for the second one, we had our assumptions, as all the partners involved were big-business guys and had dealt with this for quite a long time. Our approach was partly empirical and partly scientific: we had our assumptions and went to verify them with real businesses, carefully chosen from pharma, energy, telecom, finance, and FMCG. Within these companies we looked for answers across the various departments. We heard a lot of stories. In the end, the ingredients of the magic sauce that would help us lean the red curve shaped up.
Quality-of-data industry chart: a $19 billion industry.
COLLABORATION