This document provides a high-level summary of NoSQL and Big Data:
1) It discusses the history of databases from COBOL to SQL and the development of NoSQL in response to the need to handle large, unstructured datasets.
2) It outlines some of the opportunities that NoSQL databases provide for storing and analyzing massive amounts of diverse data types.
3) It briefly mentions some examples of popular NoSQL databases like MongoDB, Cassandra, and DynamoDB that are well-suited for Big Data applications.
The term big data refers to datasets that are so large and voluminous that they cannot be managed with traditional, conventional management tools and databases. The main problems in working with this kind of data relate to capturing and collecting, storing, searching, sharing, analyzing, and visualizing it. Many experts acknowledge that big data, as one of the key emerging technologies, can have a tremendous impact. Today, with the spread of social networks and the emergence of new information sources, the volume of generated data is growing day by day. Comments from social network users, shared content, and information recorded by various sensors are all among the sources that play a part in this information explosion. By analyzing larger volumes of data, better and more advanced analyses can be carried out for various purposes, including commercial, medical, and security purposes, and more suitable results can be obtained. The link between big data and open-source tools clearly began with Hadoop, and this trend has since gathered speed.
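These are the slides presented at Little Big Data #1: Data Science Stories from a Variety of People.
Feel free to ask me anytime if you have questions :)
An event recap is available at https://zzsza.github.io/etc/2018/04/21/little-big-data/ !
(Added May 2018) I am currently not with a company, so anyone interested in me is welcome to get in touch :)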
As the Big Data market has evolved, the focus has shifted from data operations (storage, access and processing of data) to data science (understanding, analyzing and forecasting from data). And as new models are developed, organizations need a process for deploying analytics from research into the production environment. In this talk, we'll describe the five stages of real-time analytics deployment:
Data distillation
Model development
Model validation and deployment
Model refresh
Real-time model scoring
We'll review the technologies supporting each stage, and how Revolution Analytics software works with the entire analytics stack to bring Big Data analytics to real-time production environments.
My keynote talk at the San Diego Superdata conference, looking at the history and current state of Analytics and Data Mining, and examining the effects of Big Data.
An Overview of the Emerging Graph Landscape (Oct 2013) - Emil Eifrem
Recent years have seen an explosion of technologies for managing, processing and analyzing graphs, ranging from community projects like Apache Giraph, to vendor-led products such as Neo4j and spin-outs from established companies like Twitter’s FlockDB. The sheer number of technologies makes it difficult to keep track of these tools and what sets them apart, even for those of us who are active in the space!
But not all graph technologies are created equal. This session will provide a high-level framework for making sense of the emerging graph landscape. It will describe the three dominant graph data models today, define top-level categories like graph compute engines (GraphLab, Giraph, Pegasus, YarcData, etc.) and graph databases (Neo4j, FlockDB, OrientDB, etc.), and discuss common characteristics and important properties of each category.
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform - Savita Yadav
KMIS International Conference 2021.
This talk presents insights into, and the performance of, predictive models for Airbnb ratings using Big Data and distributed parallel computing systems. Using two-class classification models, we predict whether a property has a high or a low rating based on the features of the listing. This helps hosts know whether their property is suitable and how their listing compares to other similar listings. We compare the results and the performance of the rating prediction models using accuracy and computing time metrics.
Existing data analysis tools such as R, SQL, Excel and others are still insufficient to cope with today's big data analysis needs.
The author proposes a CUI (Character User Interface) toolset with dozens of functions to neatly handle tabular data in TSV (Tab-Separated Values) files.
It implements many basic and useful functions that have not been implemented in existing software, with each function borrowing ideas from the Unix philosophy and covering the most frequent pre-analysis tasks during the initial exploratory stage of data analysis projects.
It also greatly speeds up basic analysis tasks, such as drawing cross tables, Venn diagrams, etc., whereas existing software requires rather complicated programming and debugging for even these basic tasks.
Here, tabular data mainly means TSV (Tab-Separated Values) files as well as other CSV (Comma-Separated Values)-type files, which are all widely used for storing data and suitable for data analysis.
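The toolset itself is not shown in this abstract. As a rough illustration of the kind of pre-analysis task it describes, the sketch below builds a cross table from TSV text using only the Python standard library. It is not the author's tool: the function names, column names, and sample rows are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: a header row followed by tab-separated records.
SAMPLE = "city\tplan\nTokyo\tfree\nTokyo\tpaid\nOsaka\tfree\nTokyo\tfree\n"


def cross_table(tsv_text, row_field, col_field):
    """Count how often each (row value, column value) pair occurs."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter((rec[row_field], rec[col_field]) for rec in reader)


def format_cross_table(counts):
    """Render the pair counts as a tab-separated grid with sorted labels."""
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    lines = ["\t".join([""] + cols)]
    for r in rows:
        cells = [str(counts.get((r, c), 0)) for c in cols]
        lines.append("\t".join([r] + cells))
    return "\n".join(lines)


print(format_cross_table(cross_table(SAMPLE, "city", "plan")))
```

Keeping the counting step separate from the formatting step follows the Unix-style idea mentioned above: each piece does one job and can be reused on its own.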
Real-time information analysis: social networks and open data - Data Science Society
Plamen Penev - Co-founder of Yatrus Analytics, graduated from the University of Essex (Political Science and IR). He works at Yatrus Analytics in the fields of NLP (Natural Language Processing), text mining, and graph analysis.
"Real-time information analysis: social networks and open data" will focus on the problem of real-time multisource data analytics, including the variety of data sources, how they are combined, and how they are blended in real time. It covers information discovery from Twitter and other sources as seen through the state of NLP (event extraction, classification of events, sentiment analysis) in combination with financial and economic data.
Rating Prediction using Deep Learning and Spark - Jongwook Woo
Distributed deep learning to predict Amazon review ratings in Spark using Analytics Zoo on AWS, published as "Rating Prediction using Deep Learning and Spark" at the 11th International Conference on Internet (ICONI 2019), Hanoi, Vietnam, Dec 15-18, 2019.
SUM TWO is making 'serious investments' in big data, cloud, mobility!
“Big data refers to the datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.”
Another source defines big data the following way: “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures.”
The 3 Vs of Big Data.
Apache Hadoop is 100% open source, and pioneered a fundamentally new way of storing and processing data. Instead of relying on expensive, proprietary hardware and different systems to store and process data, Hadoop enables distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store and process the data, and can scale without limits. With Hadoop, no data is too big. And in today’s hyper-connected world where more and more data is being created every day, Hadoop’s breakthrough advantages mean that businesses and organizations can now find value in data that was recently considered useless.
Hadoop’s cost advantages over legacy systems redefine the economics of data. Legacy systems, while fine for certain workloads, simply were not engineered with the needs of Big Data in mind and are far too expensive to be used for general purpose with today's largest data sets.
One of the cost advantages of Hadoop is that, because it relies on an internally redundant data structure and is deployed on industry-standard servers rather than expensive specialized data storage systems, you can afford to store data that was not previously viable to keep. And we all know that once data is on tape, it’s essentially the same as if it had been deleted - accessible only in extreme circumstances.
Make Big Data the Lifeblood of Your Enterprise
With data growing so rapidly and the rise of unstructured data accounting for 90% of the data today, the time has come for enterprises to re-evaluate their approach to data storage, management and analytics. Legacy systems will remain necessary for specific high-value, low-volume workloads, and complement the use of Hadoop, optimizing the data management structure in your organization by putting the right Big Data workloads in the right systems. The cost-effectiveness, scalability and streamlined architectures of Hadoop will make the technology more and more attractive. In fact, the need for Hadoop is no longer a question.
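The processing model that Hadoop popularized, MapReduce, can be sketched in a few lines. The example below is a single-process toy in plain Python, not Hadoop's API: the function names and the two input chunks are hypothetical, and a real cluster would run the map and reduce steps on the servers that hold each chunk of data.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input: each string stands in for a chunk stored on one server.
CHUNKS = ["big data is big", "data moves fast"]


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map, shuffle, and reduce over all chunks in sequence."""
    mapped = chain.from_iterable(map_phase(c) for c in chunks)
    return reduce_phase(shuffle(mapped))


print(word_count(CHUNKS))
```

Because each chunk is mapped independently and each key is reduced independently, both phases can be spread across many inexpensive machines, which is the property the text above attributes to Hadoop.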
Class lecture by Prof. Raj Jain on Big Data. The talk covers Why Big Data Now?, Big Data Applications, ACID Requirements, Terminology, Google File System, BigTable, MapReduce, MapReduce Optimization, Story of Hadoop, Hadoop, Apache Hadoop Tools, Apache Other Big Data Tools, Other Big Data Tools, Analytics, Types of Databases, Relational Databases and SQL, Non-relational Databases, NewSQL Databases, Columnar Databases. Video recording available on YouTube.
Introduction to Big Data and its Trends - Jongwook Woo
Big Data has been popular for the last 10 years, using Hadoop and Spark for data analysis and prediction with large-scale data sets in distributed parallel computing systems. Its platform has expanded to include NoSQL databases and search engines, and it has become more popular along with cloud computing. Deep Learning has then become a buzzword over the past several years, using GPUs and Big Data. This lets even small companies and labs own supercomputers on a small budget, which is a “Dream Comes True” situation in IT and business. In this talk, the history and trends of Big Data and AI platforms are introduced and Big Data predictive analysis is presented.
Introduction to Big Data and AI for Business Analytics and Prediction - Jongwook Woo
Big Data has been popular for the last 10 years, using Hadoop and Spark for data analysis and prediction with large-scale data sets in distributed parallel computing systems. Its platform has expanded to include NoSQL databases and search engines, and it has become more popular along with cloud computing. Deep Learning has then become a buzzword over the past several years, using GPUs and Big Data. This lets even small companies and labs own supercomputers on a small budget, which is a “Dream Comes True” situation in IT and business. In this talk, the history and trends of Big Data and AI platforms are introduced, along with how predictive analysis should be presented in business using Big Data & AI.
Big Data, NoSQL, NewSQL & The Future of Data Management - Tony Bain
It is an exciting and interesting time to be involved in data. More influential change has occurred in database management in the last 18 months than in the last 18 years. New technologies such as NoSQL & Hadoop, and radical redesigns of existing technologies like NewSQL, will dramatically change how we manage data moving forward.
These technologies bring with them possibilities both in terms of the scale of data retained and in how this data can be utilized as an information asset. The ability to leverage Big Data to drive deep insights will become a key competitive advantage for many organisations in the future.
Join Tony Bain as he takes us through the high-level drivers for the changes in technology, how these are relevant to the enterprise, and an overview of the possibilities a Big Data strategy can start to unlock.
Extract business value by analyzing large volumes of multi-structured data from various sources such as databases, websites, blogs, social media, smart sensors...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg... - i_scienceEU
Network of Excellence Internet Science Summer School. The theme of the summer school is "Internet Privacy and Identity, Trust and Reputation Mechanisms".
More information: http://www.internet-science.eu/
Abstract: Knowledge has played a significant role in human activities throughout human development. Data mining is the process of knowledge discovery in which knowledge is gained by analyzing the data stored in very large repositories; the data is analyzed from various perspectives and the results are summarized into useful information. Due to the importance of extracting knowledge/information from large data repositories, data mining has become a very important and guaranteed branch of engineering, affecting human life in various spheres directly or indirectly. The purpose of this paper is to survey many of the future trends in the field of data mining, with a focus on those which are thought to have the most promise and applicability to future data mining applications.
Keywords: Current and Future of Data Mining, Data Mining, Data Mining Trends, Data Mining Applications.
Big Data involves several issues: data collection, storage, computing, analysis and visualization. Python is a popular scripting language with good code readability and is thus suitable for fast development. In these slides, the author shares how to solve Big Data issues using Python open source tools.
Big Data may well be the Next Big Thing in the IT world. The first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
Based on "18 Minutes" by Peter Bregman, 2 simple steps to Get the Right Things Done by focusing on the right things and making small changes to your Tasks & Calendar.
Computers & Programming for Creativity in Children - Vishy Poosala
Introduce computers and programming to kids to nurture their creativity. Some pointers on using Scratch, Light Bot, blogspot, etc. The main idea is to insert computers into the kid's natural passion (e.g., writing, arts) as a tool and then give the right mental model of programming.
This is a talk I gave at the Yahoo! Architects conference. Come up with innovative solutions to architecture problems, taking inspiration from buildings and nature.
Techniques for brainstorming and lateral thinking.
Do you want to learn how to figure out what you love to do ("your ideal job"), pinpoint what's blocking you from doing it, and start doing it?
This presentation is like a recipe, a working manual for finding out and doing your ideal job, often without risking it all.
A Recipe For Innovation and Creative Thinking [creating the 8th wonder of the... - Vishy Poosala
A simple recipe for how to innovate bigger and better ideas, tools for thinking creatively and brainstorming better, six hats of thinking, and plans for taking ideas to the market.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 - Tobias Schneck
As AI technology pushes into IT, I was wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology can be managed from an infrastructure operations point of view. Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and take you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it working from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I have already got working for real.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
GraphRAG is All You need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Key Trends Shaping the Future of Infrastructure.pdf - Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Generating a custom Ruby SDK for your web service or Rails API using Smithy - g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
5. Billions of Keys & Values
• Google: GFS, Big Table
• Hadoop
• Cassandra
• Dynamo
6. How would you build a super-fast, FB-scale chat service in 2012 (for example)?
7. I want my own DB!
• Main memory: Memcached, redis
• Distr. K-V: MongoDB
• Versions: CouchDB
• Social graphs: Neo4j
8. BIG
Era: 60's, 80-96, 96-'07, '07-
Data size: KB, GB, TB, PB
Data variety: Files, Tables, Semi-Structured, Dynamic
Analytics: Stats, Apps, OLAP Cube, Mahout
Language: COBOL, SQL, XML, NoSQL
9. The following *AMAZING* slides are courtesy of Gregory Piatetsky-Shapiro, kdnuggets.com
You can find all the slides from his talk at:
http://www.slideshare.net/gpiatetskyshapiro/analytics-and-data-mining-industry-overview
10. Data Tsunami
• In 2010 enterprises stored 7 exabytes (= 7,000,000,000 GB) of new data (McKinsey)
• 90 percent of the world's data has been generated in the past two years (IBM)
Image with apologies to KDD-2011
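As a quick check of the unit conversion on this slide, here is a minimal sketch. It assumes decimal (SI) prefixes, i.e. 1 GB = 10^9 bytes and 1 EB = 10^18 bytes; the function name is hypothetical and chosen only for illustration.

```python
# Assumption: decimal (SI) prefixes, so 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a whole number of exabytes to gigabytes (hypothetical helper)."""
    return exabytes * BYTES_PER_EB // BYTES_PER_GB


# The slide's figure: 7 exabytes of new enterprise data in 2010.
print(f"{exabytes_to_gigabytes(7):,} GB")  # prints "7,000,000,000 GB"
```

With binary prefixes (1 GiB = 2^30 bytes) the number would differ, so the slide's figure only matches the decimal convention.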
11. Pre-history
Statistics is the biggest term in the 20th century, but data mining and analytics appear in the late 1990s.
From Google Ngram Viewer (English-language books)
Note: Our analysis uses only English-language data. Other languages, especially Chinese, need to be considered for the full picture.
12. Recent History: Analytics, Data Mining, Knowledge Discovery
Analytics has been used since 1800, but started to rise in 2005.
Data Mining jumps around 1996 (soon after the first KDD conference) but declines after 2003 (TIA controversy, associated with government invasion of privacy).
Knowledge Discovery appears in 1989, jumps in 1996, and plateaus after 2000.
18. Largest Dataset Analyzed?
2011 median dataset size ~10-20 GB, vs 8-10 GB in 2010.
Increase in the 10 GB to 1 PB range.
www.KDnuggets.com/polls/2011/largest-dataset-analyzed-data-mined.html
19. Which methods/algorithms did you use for data analysis in 2011?
Bar chart: % of analysts who used each method (axis 0% to 70%). Methods shown:
Decision Trees
Regression
Clustering
Statistics
Visualization
Time series/Sequence analysis
Support Vector (SVM)
Association rules
Ensemble methods
Text Mining
Neural Nets
Boosting
Bayesian
Bagging
Factor Analysis
Anomaly/Deviation detection
Social Network Analysis
Survival Analysis
Genetic algorithms
Uplift modeling
www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
20. Cloud Analytics is not common (yet)
www.KDnuggets.com/polls/2011/algorithms-analytics-data-mining.html
21. Shortage of Skills
• McKinsey: shortage by 2018 in the US of
– 140,000-190,000 people with deep analytical skills
– 1.5 M managers/analysts with the know-how to use the analysis of big data to make effective decisions.
Source: www.mckinsey.com/mgi/publications/big_data/
24. “Ground” Analytics (LinkedIn Skills)
~ 75,000 with Data Mining skill
~ 7,000 with Predictive Modeling
Also ~ 20,000 with Predictive Analytics (not related to Predictive Modeling??)