The document summarizes Joel Lord's presentation on machine learning given at the Web à Québec conference on April 4, 2017. The presentation covered topics including artificial intelligence vs machine learning, big data and deep learning, basic machine learning algorithms like naive Bayes classification and sentiment analysis, and genetic algorithms. Live code demos were included to illustrate naive Bayes classification and sentiment analysis.
Scikits.learn (http://scikit-learn.sourceforge.net/) is a scikit (SciPy toolkit) for machine learning that has gained a lot of popularity in recent months. In particular, it can be used for text mining and large-scale database mining.
On the other side, CubicWeb (http://www.cubicweb.org/) is a Python-based framework for semantic web applications that has been used in a range of application fields (library, museum, conference, and intranet applications).
The aim of this talk is to present how these tools can be used together for semantic data mining of RSS feeds (clustering, prediction), and for building a news aggregator similar to Google News.
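As a minimal illustration of the clustering side, feed headlines can be compared with TF-IDF-weighted cosine similarity; the sketch below is a hand-rolled, dependency-free stand-in for what scikit-learn's text vectorizers do (the data and function names are invented for the example):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute simple TF-IDF vectors for a list of text documents."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                       # document frequency per term
    for tokens in tokenized:
        df.update(set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

headlines = [
    "python machine learning tutorial",
    "machine learning with python",
    "museum opens new exhibit",
]
vecs = tfidf_vectors(headlines)
sim_ml = cosine(vecs[0], vecs[1])       # two ML headlines: similar
sim_other = cosine(vecs[0], vecs[2])    # unrelated headlines: dissimilar
print(sim_ml, sim_other)
```

A clustering step (e.g. k-means, as scikit-learn provides) would then group headlines whose pairwise similarities are high.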
Full description : http://www.euroscipy.org/talk/4291
A quick description of the presentation:
• What ElasticSearch is and how it works.
• How ElasticSearch analyzes data by splitting a document into meaningful portions and indexing each portion separately, so that whenever a new search request comes in, it knows where to look.
• Features and advantages of ElasticSearch, such as built-in sharding defaults, fail-safe node clusters, and adding a new node automatically without having to reboot.
• Out-of-the-box features for today's applications, like faceted search, reverse search using percolators, and pre-built analyzers.
The tutorial covers big data search, the contenders, an introduction to ElasticSearch, more than just search, and uncharted territory. It begins with a brief look at big data search in terms of rapid consumption and the challenges big data search faces. A section on the contenders follows, covering Lucene, Apache Solr, Sphinx, and ElasticSearch itself.
There is also an introduction to ElasticSearch as a search server and its features, such as push replication, node auto-discovery, and fail-safety, along with analyzing data and ways of indexing it right. Afterwards, a section on more than just search covers facets (range, histogram, and geo facets), percolators, and percolating in ElasticSearch.
The last section of the tutorial covers uncharted territory: ElasticSearch as a NoSQL database, "what if" scenarios, and references.
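The analyze-then-index behaviour described above can be sketched with a toy inverted index; this illustrates the general technique, not ElasticSearch's actual implementation (all names here are invented for the example):

```python
import re
from collections import defaultdict

def analyze(text):
    """A toy analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class ToyIndex:
    """A minimal inverted index: documents are split into terms, and each
    term is indexed separately, so searches know where to look."""
    def __init__(self):
        self.postings = defaultdict(set)   # term -> set of doc ids

    def index(self, doc_id, text):
        for term in analyze(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: return docs containing every query term.
        sets = [self.postings.get(t, set()) for t in analyze(query)]
        return set.intersection(*sets) if sets else set()

idx = ToyIndex()
idx.index(1, "ElasticSearch is a distributed search server")
idx.index(2, "Sphinx is a full-text search engine")
print(idx.search("search server"))   # {1}
```

Real engines add per-field analyzers, scoring, and sharding on top of this basic structure.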
Hibernate Tips ‘n’ Tricks - 15 Tips to solve common problems, by Thorben Janssen
Hibernate can do a lot more than just mapping a database table to an entity. It also provides a lot of features that make the implementation of your business logic quick and easy.
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine, by Trey Grainger
Search engines frequently miss the mark when it comes to understanding user intent. This talk will describe how to overcome this by leveraging Lucene/Solr to power a knowledge graph that can extract phrases, understand and weight the semantic relationships between those phrases and known entities, and expand the query to include those additional conceptual relationships. For example, if a user types in (Senior Java Developer Portland, OR Hadoop), you or I know that the term “senior” designates an experience level, that “java developer” is a job title related to “software engineering”, that “portland, or” is a city with a specific geographical boundary, and that “hadoop” is a technology related to terms like “hbase”, “hive”, and “map/reduce”. Out of the box, however, most search engines just parse this query as text:((senior AND java AND developer AND portland) OR (hadoop)), which is not at all what the user intended. We will discuss how to train the search engine to parse the query into this intended understanding, and how to reflect this understanding to the end user to provide an insightful, augmented search experience. Topics: Semantic Search, Finite State Transducers, Probabilistic Parsing, Bayes Theorem, Augmented Search, Recommendations, NLP, Knowledge Graphs
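A rough sketch of the phrase-extraction step, using a hard-coded phrase dictionary as a stand-in for a real knowledge graph (the dictionary, entity types, and function names are invented for illustration):

```python
# Hypothetical phrase dictionary; a real system would derive this from a
# knowledge graph rather than hard-coding it.
PHRASES = {
    "senior": ("experience_level", "senior"),
    "java developer": ("job_title", "java developer"),
    "portland, or": ("city", "portland, or"),
    "hadoop": ("technology", "hadoop"),
}

def parse_intent(query, phrases=PHRASES):
    """Greedy longest-match phrase extraction over a query string."""
    tokens = query.lower().split()
    entities = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at token i.
        for j in range(len(tokens), i, -1):
            candidate = " ".join(tokens[i:j])
            if candidate in phrases:
                entities.append(phrases[candidate])
                i = j
                break
        else:
            entities.append(("keyword", tokens[i]))
            i += 1
    return entities

entities = parse_intent("Senior Java Developer Portland, OR Hadoop")
print(entities)
```

Each extracted entity could then be rewritten into a typed sub-query (a geo filter for the city, a boosted expansion for related technologies) instead of a bag of ANDed keywords.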
State of Play: Data Science on Hadoop in 2015, by Sean Owen at Big Data Spain
http://www.bigdataspain.org/2014/conference/state-of-play-data-science-on-hadoop-in-2015-keynote
Machine learning is not new. Big machine learning is qualitatively different: more data beats algorithmic improvements, scale trumps noise and sample-size effects, and manual tasks can be brute-forced.
Session presented at Big Data Spain 2014 Conference
18th Nov 2014
Kinépolis Madrid
http://www.bigdataspain.org
Event promoted by: http://www.paradigmatecnologico.com
Slides: https://speakerdeck.com/bigdataspain/state-of-play-data-science-on-hadoop-in-2015-by-sean-owen-at-big-data-spain-2014
FIFA fails, Guy Kawasaki and real estate in SF - find out about all three by ... (Elżbieta Bednarek)
How to use ObjectPath, the agile query language, to effectively extract relevant data from JSON documents with complex or even unknown structure, and how to quickly build a web app using the insights you discover with ObjectPath.
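An ObjectPath expression such as `$..city` pulls a field out of a document wherever it occurs; the sketch below reimplements that recursive-descent idea in plain Python so the example stays dependency-free (the sample document is invented):

```python
def find_all(doc, key):
    """Recursively collect every value stored under `key`, regardless of
    where it sits in a JSON document of unknown structure."""
    results = []
    if isinstance(doc, dict):
        for k, v in doc.items():
            if k == key:
                results.append(v)
            results.extend(find_all(v, key))
    elif isinstance(doc, list):
        for item in doc:
            results.extend(find_all(item, key))
    return results

doc = {"listings": [
    {"address": {"city": "San Francisco"}, "price": 950000},
    {"address": {"city": "Oakland"}, "price": 620000},
]}
print(find_all(doc, "city"))   # ['San Francisco', 'Oakland']
```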
Big Data Analytics: Finding diamonds in the rough with Azure, by Christos Charmatzis
This session presents the main workflows and technologies for getting value from Big Data stored in the enterprise using Azure:
- When we have a Big Data problem
- Finding the best solution for our Big Data
- Working inside the data team
- Extracting the true value of our data
AUTOMATED DATA EXPLORATION - Building efficient analysis pipelines with Dask, by Víctor Zabalza
# Talk given at PyCon UK 2017
The first step in any data-intensive project is understanding the available data. To this end, data scientists spend a significant part of their time carrying out data quality assessments and data exploration. In spite of this being a crucial step, it usually requires repeating a series of menial tasks before the data scientist gains an understanding of the dataset and can progress to the next steps in the project.
In this talk I will detail the inner workings of a Python package that we have built which automates this drudge work, enables efficient data exploration, and kickstarts data science projects. A summary is generated for each dataset, including:
- General information about the dataset, including data quality of each of the columns;
- Distribution of each of the columns through statistics and plots (histogram, CDF, KDE), optionally grouped by other categorical variables;
- 2D distribution between pairs of columns;
- Correlation coefficient matrix for all numerical columns.
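A stripped-down version of such a per-column summary might look like the following, written in plain Python rather than Dask so the sketch stays self-contained; it is an illustration of the idea, not the package's actual code:

```python
import statistics

def summarize(rows):
    """Build a tiny per-column summary of a tabular dataset (list of dicts):
    value count, missing count, and mean for all-numeric columns."""
    columns = {key for row in rows for key in row}
    report = {}
    for col in sorted(columns):
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        entry = {"count": len(present), "missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            entry["mean"] = statistics.mean(present)
        report[col] = entry
    return report

rows = [{"age": 31, "city": "Leeds"}, {"age": 45}, {"age": None, "city": "York"}]
report = summarize(rows)
print(report)
```

Dask's contribution is running exactly this kind of per-column computation lazily and in parallel over partitions of a much larger dataframe.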
Building this tool has provided a unique view into the full Python data stack, from the parallelised analysis of a dataframe within a Dask custom execution graph, to the interactive visualisation with Jupyter widgets and Plotly. During the talk, I will also introduce how Dask works, and demonstrate how to migrate data pipelines to take advantage of its scalable capabilities.
Querying your database in natural language was presented during PyData Silicon Valley 2014, based on the Quepy software project. More information at:
http://pydata.org/sv2014/abstracts/#197
https://github.com/machinalis/quepy
Querying your database in natural language, by Daniel Moisset, PyData SV 2014
Most end users can't write a database query, and yet they often need to access information that keyword-based searches can't retrieve precisely. Lately, there's been an explosion of proprietary Natural Language Interfaces to knowledge databases, like Siri, Google Now and Wolfram Alpha. On the open side, huge knowledge bases like DBpedia and Freebase exist, but access to them is typically limited to formal database query languages. We implemented Quepy as an approach to this problem. Quepy is an open source framework to transform natural language questions into semantic database queries that can be used with popular knowledge databases such as DBpedia and Freebase. So instead of requiring end users to learn to write some query language, a Quepy application can fill the gap, allowing end users to make their queries in "plain English". In this talk we will discuss the techniques used in Quepy, what additional work can be done, and its limitations.
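The core idea behind this approach can be sketched as a question template mapped to a SPARQL query template; note that this is not Quepy's real API, just the underlying question-to-query pattern (the rule, property names, and function names are illustrative):

```python
import re

# One hypothetical rule: a question pattern paired with a SPARQL template.
RULES = [
    (re.compile(r"who is (?P<name>.+)\?", re.I),
     'SELECT ?abstract WHERE {{ ?p rdfs:label "{name}"@en . '
     "?p dbo:abstract ?abstract . }}"),
]

def question_to_query(question):
    """Return a query for the first matching rule, or None if nothing matches."""
    for pattern, template in RULES:
        match = pattern.match(question)
        if match:
            return template.format(**match.groupdict())
    return None

print(question_to_query("Who is Alan Turing?"))
```

A real Quepy application expresses these rules with a grammar DSL and an ontology mapping rather than raw regexes, but the flow (match question, fill query template) is the same.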
Joel Grus gives a funny and beginner-friendly talk about his journey on the road to data science. For animations, see the original slides here: https://docs.google.com/presentation/d/1gqs54MMCgJpIVgcXUFm82MKfdmHA_pQRvThT6fAdb6g/edit?usp=sharing. More insights in Joel's new book, "Data Science from Scratch."
To scale or not to scale: Key/Value, Document, SQL, JPA – What’s right for my... (Uri Cohen)
This presentation focuses on the various data and querying models available in today’s distributed data stores landscape. It reviews which models and APIs are available and discusses the capabilities each provides, the applicable use cases, and what each means for your application’s performance and scalability.
Leveraging NLP and Deep Learning for Document Recommendations in the Cloud, by Databricks
Efficient recommender systems are critical for the success of many industries, such as job recommendation, news recommendation, and ecommerce. This talk will illustrate how to build an efficient document recommender system by leveraging Natural Language Processing (NLP) and Deep Neural Networks (DNNs). The end-to-end flow of the document recommender system is built on AWS at scale, using Analytics Zoo for Spark and BigDL. The system first turns text-rich documents into embeddings by incorporating Global Vectors (GloVe), then trains a K-means model using native Spark APIs to cluster users into several groups. It further trains a recommender model for each group and gives an ensemble prediction for each test record. By adopting the end-to-end Analytics Zoo pipeline, we saw roughly a 10% improvement in mean reciprocal rank and 6% in precision, respectively, compared to the search recommendations in a job recommendation study.
Speaker: Guoqiong Song
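The GloVe-embedding step can be illustrated by averaging word vectors into a single document embedding. The toy 2-dimensional vectors below stand in for real pre-trained GloVe vectors (which are typically 50-300 dimensions and loaded from a file); none of this is the talk's actual code:

```python
# Toy word vectors standing in for pre-trained GloVe embeddings.
WORD_VECTORS = {
    "data": [1.0, 0.1], "science": [0.9, 0.2],
    "cooking": [0.0, 1.0], "recipes": [0.1, 0.9],
}

def embed(document):
    """Average the word vectors of known tokens: the simplest way to turn a
    text-rich document into a fixed-length embedding for clustering."""
    vectors = [WORD_VECTORS[t] for t in document.lower().split()
               if t in WORD_VECTORS]
    if not vectors:
        return None   # no known words, no embedding
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

vec = embed("data science")
print(vec)
```

Once every document has such a vector, a K-means step (as the talk does with Spark) can cluster users by the documents they interact with.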
Big data classification refers to the process of categorizing or classifying data based on predefined categories or classes. It involves using machine learning algorithms and statistical techniques to analyze large volumes of data and assign them to specific classes or categories.
Here are some key aspects of big data classification:
Training Data: Classification algorithms require a labeled dataset for training. This dataset consists of examples where the class or category of each data point is known. The training data is used to build a classification model that can generalize patterns and relationships between the input features and the corresponding classes.
Feature Selection: In classification, the features or attributes of the data play a crucial role in determining the class labels. Feature selection involves identifying the most relevant and informative features that contribute to accurate classification. This helps reduce dimensionality and improve the performance of the classification model.
Classification Algorithms: Various machine learning algorithms can be used for big data classification, such as decision trees, random forests, support vector machines (SVM), logistic regression, and deep learning techniques like neural networks. Each algorithm has its strengths, limitations, and suitability for different types of data and classification tasks.
Model Training and Evaluation: The classification model is trained using the labeled training data. The model learns the patterns and relationships between the features and the corresponding class labels. After training, the model is evaluated using evaluation metrics such as accuracy, precision, recall, F1 score, and area under the receiver operating characteristic (ROC) curve to assess its performance and generalization ability.
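The precision, recall, and F1 metrics mentioned above are straightforward to compute from paired true labels and predictions, as this small self-contained sketch shows:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(p, r, f1)
```

At big data scale the counting happens in a distributed job (e.g. Spark), but the formulas are identical.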
Predictive Classification: Once the classification model is trained and evaluated, it can be used to predict the classes of new, unseen data. The model takes the input features of the new data and applies the learned patterns to assign the appropriate class label.
Handling Big Data Challenges: Big data classification comes with specific challenges, including the volume, velocity, variety, and veracity of data. Processing large-scale datasets requires distributed computing frameworks like Apache Hadoop or Apache Spark. Additionally, data preprocessing, feature engineering, and model optimization techniques are used to handle the complexity and scalability of big data.
Big data classification finds applications in various domains, including customer segmentation, fraud detection, image recognition, sentiment analysis, and medical diagnosis, among others. It enables organizations to extract valuable insights from massive datasets and automate the process of categorizing and organizing data.
To successfully perform big data classification, it is important to have a good understanding of the data, the domain, and the appropriate choice of algorithms and techniques.
The core Search frameworks in Liferay 7 have been significantly retooled to benefit not only from Liferay's new modular architecture, but also from one of the most innovative players in the market: Elasticsearch, which replaces Lucene as the default search engine in Portal. This session will cover topics like clustering and scalability, unveil improvements (both Elasticsearch and Solr) like aggregations, filters, geolocation, "more like this" and other new query types, and also hot new features for the Enterprise like out-of-the-box Marvel cluster monitoring and Shield security.
André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
From Caesar Cipher To Quantum Cryptography, by Joel Lord
Humans have used codes and ciphers throughout history. Some of the greatest wars in history have been won thanks to good encryption, or lost due to great cryptographers. Even if we don’t think about it, encryption and cryptography are a big part of our lives, now that HTTPS is the de facto standard for the web. While most modern developers want to ensure that their data is secured, most of them don’t understand how the data is encrypted or how cryptography works. During this talk, attendees will learn where ciphers come from by going on a journey through the history of cryptography. With examples from the Caesar cipher all the way to quantum cryptography, the speaker will explain in simple terms how cryptography evolved into what it is today and how it should be used to secure user data.
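The Caesar cipher itself is one line's worth of modular arithmetic: shift each letter a fixed number of positions around the alphabet and wrap at the end. A small sketch:

```python
def caesar(text, shift):
    """Shift each letter by `shift` positions, wrapping within the alphabet;
    non-letters pass through unchanged, as in the classical cipher."""
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)

secret = caesar("Attack at dawn", 3)
print(secret)                 # Dwwdfn dw gdzq
print(caesar(secret, -3))     # Attack at dawn
```

Decryption is just encryption with the negated shift, which is also why the cipher falls to a 25-key brute force.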
I Don't Care About Security (And Neither Should You), by Joel Lord
In this talk, attendees will learn about OAuth, JWTs and OpenID Connect. Understanding these flows helps developers make applications more secure and saves significant development time. Using simple examples, the speaker aims to make this talk both informative and entertaining.
- OAuth
- What is OAuth
- The authorization code grant
- The implicit grant
- JWTs
- What is a token
- Anatomy of a JWT
- What is a refresh token
- Simple OAuth server code samples and demo
- OpenID Connect
- General flow
- OIDC demo
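The "anatomy of a JWT" point above boils down to three base64url-encoded, dot-separated segments: header, payload, and signature. A sketch that decodes the header and payload without verifying the signature (which a real application must always do with a proper JWT library):

```python
import base64
import json

def decode_jwt(token):
    """Split a JWT into its three dot-separated parts and base64url-decode
    the header and payload. Illustration only: no signature verification."""
    header_b64, payload_b64, signature_b64 = token.split(".")

    def b64url_decode(segment):
        # JWTs use unpadded base64url; restore the padding before decoding.
        segment += "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(segment))

    return b64url_decode(header_b64), b64url_decode(payload_b64), signature_b64

def b64url_encode(obj):
    """Helper to build a sample (unsigned) token for the demo."""
    raw = base64.urlsafe_b64encode(json.dumps(obj).encode())
    return raw.rstrip(b"=").decode()

token = ".".join([b64url_encode({"alg": "none", "typ": "JWT"}),
                  b64url_encode({"sub": "user42"}), "sig"])
header, payload, _ = decode_jwt(token)
print(header["alg"], payload["sub"])   # none user42
```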
I Don't Care About Security (And Neither Should You), by Joel Lord
Presented at Twin Cities Code Camp 23
Remember when setting up a login page was easy? It seems like nowadays it can take weeks to start a project--creating a signup form, a login form, a password recovery screen, and all the validation in between. And you haven’t even started on security considerations yet. During this presentation, the attendees will be introduced to OpenID Connect and OAuth. They will also learn how to leverage these technologies to create more secure applications. Most importantly, they will learn how to delegate authorization and authentication so they can focus on their real work and forget about all that security stuff.
Every month, we hear about a new data breach and billions of user passwords are being shared as we speak. How can we stop this? There is a simple solution: let’s stop using passwords! From email links to biometrics, more and more technologies are available to help developers handle different types of credentials. During this presentation, the attendees will learn about some of the alternatives and how to implement them in the context of an OAuth flow.
Presented at South Florida Code Camp '19
Every month, a new security breach makes the news. And with each security breach, millions of usernames and passwords are shared. How can we stop this carnage? There is a simple solution: let’s stop using passwords! From email links to biometric sensors, more and more technologies are available to help software developers handle different types of credentials. During this presentation, the attendees will learn about alternatives to passwords and how to implement them in the context of an OAuth flow.
When starting to dabble with Javascript, the biggest challenge for most developers is understanding how to deal with asynchronous development. During this talk, we will cover some of the different ways to handle async programming like callbacks, promises, generators, async/await and events. As we cover those, we will also plunge into some of the mechanics of the NodeJs engine, namely the event loop. Developers attending this talk will have a better understanding of asynchronous programming and will add a few new tools to their belt to tackle those issues as they come.
From chatbots to your home thermostat, it seems like machine learning algorithms are everywhere nowadays. How about understanding how this works now? In this talk, you will learn about the basics of machine learning through various basic examples, without the need for a PhD or deep knowledge of assembly. At the end of this talk, you will know what the Naive Bayes classifiers, sentiment analysis and basic genetic algorithms are and how they work. You will also see how to create your own implementations in Javascript.
Chances are sooner or later your shiny new single page application will need authentication. Add some security and resource access control to that list as well. But how can we integrate all of this into a single page application that is entirely public? How can we ensure that our users only have access to the resources they are authorized to see, and can’t simply hack their way in via the console? In this talk, the attendees will learn about JSON Web Tokens (JWT) and see how they can be used to properly secure single page applications.
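A hedged sketch of the idea (names are illustrative, not from the talk): the API behind the SPA checks the token on every request, so nothing typed into the console can grant access the token doesn't carry. A real server must also verify the signature with a JWT library; this only shows the shape of the logic:

```javascript
// Check the Authorization header of an incoming API request.
// Returns whether the request may proceed; real code would also
// verify the token signature before trusting any claim.
function authorize(authorizationHeader) {
  if (!authorizationHeader || authorizationHeader.indexOf("Bearer ") !== 0) {
    return { ok: false, status: 401 };
  }
  var token = authorizationHeader.slice(7);
  var payload = JSON.parse(
    Buffer.from(token.split(".")[1], "base64").toString("utf8")
  );
  // exp is a Unix timestamp in seconds (RFC 7519)
  if (payload.exp && payload.exp * 1000 < Date.now()) {
    return { ok: false, status: 401 };
  }
  return { ok: true, scope: payload.scope };
}
```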
Remember when setting up an auth system was easy? Me neither. From the signup form, the login form, password reset form, and all the validation in between it can easily take weeks if not months to get something basic up and running. Then you have to deal with all the security considerations. No thanks. During this presentation, the attendees will be introduced to OpenID and OAuth. They will learn how to leverage these technologies to create secure applications, but most importantly, they will learn why and how to delegate authorization and authentication so they can focus on their real work and forget about all that security stuff.
This is a talk given at CharmCityJS on May 2nd 2018.
Chances are sooner or later your shiny new single page application will need authentication. Add some security and resource access control to that list as well. But how can we integrate all of this into a single page application that is entirely public? How can we ensure that our users only have access to the resources they are authorized to see, and can’t simply hack their way in via the console? In this talk, the attendees will learn about JSON Web Tokens (JWT) and see how they can be used to properly secure single page applications.
When starting to dabble with Javascript, the biggest challenge for most developers is understanding how to deal with asynchronous development. During this talk, we will cover some of the different ways to handle async programming like callbacks, promises, reactive streams and events. As we cover those, we will also plunge into some of the mechanics of the NodeJs engine, namely the event loop. Developers attending this talk will have a better understanding of asynchronous programming and will add a few new tools to their belt to tackle those issues as they come.
Remember when setting up a login page was easy? It seems like nowadays it takes many weeks to start a project just to create a signup form, a login form and a forgot-password screen. And that is if you don’t need two-factor authentication or passwordless authentication. In a world of security breaches and privacy violations, it is important for developers to understand how modern identity works.
This talk will be in two parts.
For starters, the attendees will be introduced to modern security protocols like OpenID Connect and OAuth. The basics of token authentication will be explained using simple examples that are easy to understand. The concept of tokens will also be explained, more specifically how JWTs work.
In the second part of this presentation, the attendees will learn how to implement their own authentication server, how to secure their APIs and how to protect their Single Page Applications by making use of the protocols described in the first part.
Finally, the presenter will show the participants how to add Auth0 as an authentication server with minimal code changes and will demonstrate the simplicity of using a third party to handle login, signups and lost passwords, as well as two-factor authentication or passwordless logins.
Remember when setting up a login page was easy? It seems like nowadays it takes many weeks to start a project just to create a signup form, a login form and a forgot-password screen. During this presentation, the attendees will be introduced to OpenID and OAuth. They will also learn how to leverage these technologies to create more secure applications. Most importantly, they will learn how to delegate authorization and authentication so they can focus on their real work and forget about all that security stuff.
Remember when setting up a login page was easy? It seems like nowadays it takes many weeks to start a project just to create a signup form, a login form and a forgot-password screen. And that is if you don’t need two-factor authentication or passwordless authentication. During this presentation, the attendees will be introduced to OpenID and OAuth. They will also learn how to leverage these technologies to create secure applications or, most importantly, how to delegate to a third party so they can focus on their real work.
A quick demo that shows the attendees how to secure an SPA (could be React, Angular or VueJs) using Auth0. Can be adapted based on the time available for the presentation.
4. @joel__lord
#WAQ17
Agenda
• Artificial intelligence vs. machine learning
• Big data and deep learning
• The basic algorithms
• Naive Bayes classifier
• Sentiment analysis
• Genetic algorithm
• All sprinkled with demos
7. Artificial intelligence (AI) is intelligence demonstrated by machines. In computer science, the field of AI research defines itself as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chances of successfully achieving a goal.
22. Big data and deep learning
A LITTLE MORE THEORY
23. Big Data
WHAT IS IT?
• Exponential growth of digital data
• Too complex to process with traditional methods
• Mainly used for prediction or for analyzing user behaviour
24. Deep learning
WHAT IS IT?
• Uses neural networks to process the data
• Ideal for complex classifiers
• One way to process big data
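To make "uses neural networks" concrete, here is a purely illustrative sketch (not from the deck) of the single unit such networks stack by the thousands: a weighted sum of the inputs pushed through an activation function:

```javascript
// One artificial neuron: weighted sum of the inputs plus a bias,
// squashed through a sigmoid activation into the 0..1 range.
function neuron(inputs, weights, bias) {
  var sum = inputs.reduce(function (acc, x, i) {
    return acc + x * weights[i];
  }, bias);
  return 1 / (1 + Math.exp(-sum)); // sigmoid
}
```

Deep learning is, in essence, wiring many layers of these units together and learning the weights from data.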
27. Supervised learning
WHAT IS IT?
• Requires feedback
• Starts with no knowledge and builds up its understanding
• Useless when the data is of poor quality
• Use cases:
• Classification
28. Unsupervised learning
THE OPPOSITE OF SUPERVISED?
• Needs no feedback
• Useful when there is no right or wrong answer
• Helps find patterns or structure in the data
• Use cases:
• "You might also be interested in…"
• Grouping customers by their behaviour
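The "grouping customers by their behaviour" use case can be sketched with a tiny one-dimensional k-means (illustrative only, not from the deck): given numeric behaviour scores and no labels at all, the algorithm discovers two groups on its own:

```javascript
// Tiny unsupervised example: split numeric values into two groups by
// repeatedly assigning each value to the nearest center and re-averaging.
function kmeans2(values, iterations) {
  var centers = [Math.min.apply(null, values), Math.max.apply(null, values)];
  for (var it = 0; it < (iterations || 10); it++) {
    var groups = [[], []];
    values.forEach(function (v) {
      var nearest = Math.abs(v - centers[0]) <= Math.abs(v - centers[1]) ? 0 : 1;
      groups[nearest].push(v);
    });
    centers = groups.map(function (g, i) {
      if (g.length === 0) return centers[i]; // keep an empty cluster's center
      return g.reduce(function (a, b) { return a + b; }, 0) / g.length;
    });
  }
  return centers; // the two cluster centers
}
```

No feedback is ever given; the structure emerges from the data, which is exactly the point of unsupervised learning.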
30. Naive Bayes classification
DEFINITION
• A supervised algorithm
• A simple way to classify and identify information
var classifier = new Classifier();
classifier.classify("J'adore le Javascript", POSITIVE);
classifier.classify('WebStorm est génial', POSITIVE);
classifier.classify('Non, Javascript est mauvais', NEGATIVE);
classifier.classify("Je n'aime pas le brocoli", NEGATIVE);
console.log(classifier.categorize("Javascript est génial"));
// "positive"
console.log(classifier.categorize("J'aime WebStorm"));
// undefined
32. Naive Bayes classification
DEFINING THE STRUCTURE
var Classifier = function() {
this.dictionaries = {};
};
Classifier.prototype.classify = function(text, group) {
};
Classifier.prototype.categorize = function(text) {
};
33. Naive Bayes classification
BUILDING THE CLASSIFIER
Classifier.prototype.classify = function(text, group) {
  var words = text.split(" ");
  if (!this.dictionaries[group]) { this.dictionaries[group] = {}; }
  var self = this;
  // Count each word's occurrences in this group's dictionary
  words.forEach((w) => {
    if (self.dictionaries[group][w]) {
      self.dictionaries[group][w]++;
    } else {
      self.dictionaries[group][w] = 1;
    }
  });
};
34. Naive Bayes classification
AND THE REST…
Classifier.prototype.categorize = function(text) {
  var words = text.split(" ");
  var groups = [];
  var probabilities = {};
  var finals = {};
  // Find the groups
  for (var k in this.dictionaries) { groups.push(k); }
  // For each word, compute its probability of belonging to every group
  for (var i = 0; i < words.length; i++) {
    var word = words[i];
    var sum = 0;
    var counts = {};
    for (var j = 0; j < groups.length; j++) {
      counts[groups[j]] = this.dictionaries[groups[j]][word] || 0;
      sum += counts[groups[j]];
    }
    probabilities[word] = {};
    for (var j = 0; j < groups.length; j++) {
      probabilities[word][groups[j]] = sum ? counts[groups[j]] / sum : 0;
    }
  }
  // Average the per-word probabilities for each group
  for (var j = 0; j < groups.length; j++) {
    var total = 0;
    for (var i = 0; i < words.length; i++) {
      total += probabilities[words[i]][groups[j]];
    }
    finals[groups[j]] = total / words.length;
  }
  // Find the group with the largest probability
  var highestGroup;
  var highestValue = 0;
  for (var group in finals) {
    if (finals[group] > highestValue) {
      highestGroup = group;
      highestValue = finals[group];
    }
  }
  return highestGroup;
};
37. Naive Bayes classification
CATEGORIZATION
Classifier.prototype.categorize = function(text) {
  …
  // For each word, count its occurrences in every group's dictionary
  for (var j = 0; j < groups.length; j++) {
    counts[groups[j]] = this.dictionaries[groups[j]][word] || 0;
    sum += counts[groups[j]];
  }
  …
};
48. Sentiment analysis
CODE EXAMPLE
var twit = require("twit");
var sentiment = require("sentiment");
var keyword = "#waq17";
var t = new twit(require("./credentials"));
// Stream tweets matching the keyword and score each one
var stream1 = t.stream("statuses/filter", {track: keyword});
stream1.on("tweet", function (tweet) {
  var score = sentiment(tweet.text).score;
  console.log("---\nNew Tweet\n" + tweet.text + "\n" + (score > 0 ? "Positive" : "Negative"));
});
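Under the hood, the `sentiment` module is essentially an AFINN-style lexicon lookup. A minimal sketch of the idea (the mini-lexicon below is made up for illustration; the real module ships thousands of scored words):

```javascript
// Score a text by summing the scores of the words found in a lexicon.
// Unknown words count for zero; a positive total means positive sentiment.
var lexicon = { "love": 3, "great": 3, "awesome": 4, "bad": -3, "hate": -3 };
function score(text) {
  return text.toLowerCase().split(/\W+/).reduce(function (sum, word) {
    return sum + (lexicon[word] || 0);
  }, 0);
}
```

This is why lexicon-based scoring is fast but naive: "not great" still scores positive, since each word is scored in isolation.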
51. Genetic algorithms
HOW IT WORKS
• Create a population of random individuals
• Keep the individuals closest to the solution
• Keep a few random individuals
• Introduce random mutations
• Randomly create "children"
• Magically arrive at a solution!
54. Genetic algorithms
CODE EXAMPLE
var population = [];
const TARGET = 200;
const MIN = 0;
const MAX = TARGET - 1;
const IND_COUNT = 4;
const POP_SIZE = 100;
const CLOSE_ENOUGH = 0.001;
var RETAIN = 0.02;
var RANDOM_SELECTION = 0.05;
var MUTATION_PROBABILITY = 0.01;
55. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {
return Math.round(random(min, max));
}
function random(min, max) {
if (max == undefined) { max = min; min = 0; }
if (max == undefined) { max = 100; }
return (Math.random()*(max-min)) + min;
}
56. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {
  var sum = individual.reduce((a,b) => a + b, 0);
  return Math.abs(TARGET - sum);
}
function sortByFitness(population) {
population.sort((a, b) => {
var fitA = fitness(a); var fitB = fitness(b);
return fitA > fitB ? 1 : -1;
});
return population;
}
57. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {
var individual = [];
for (var i = 0; i < IND_COUNT; i++) {
individual.push(random(MIN, MAX));
}
return individual;
}
function randomPopulation(size) {
var population = [];
for (var i = 0; i < size; i++) {
population.push(randomIndividual());
}
return population;
}
58. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {…}
function randomPopulation(size) {…}
function mutate(population) {
for (var i=0; i < population.length; i++) {
if (MUTATION_PROBABILITY > Math.random()) {
var index = randomInt(population[i].length - 1);
population[i][index] = random(MIN, MAX);
}
}
return population;
}
59. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {…}
function randomPopulation(size) {…}
function mutate(population) {…}
function reproduce(father, mother) {
  var half = father.length / 2;
  var child = [];
  child = child.concat(father.slice(0, half), mother.slice(half, mother.length));
  return child;
}
60. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {…}
function randomPopulation(size) {…}
function mutate(population) {…}
function reproduce(father, mother) {…}
function evolve(population) {
var parents = [];
//Keep the best solutions
parents=sortByFitness(population).slice(0,Math.round(POP_SIZE*RETAIN));
//Randomly add new elements
for (var i = parents.length; i < POP_SIZE - parents.length; i++) {
if (RANDOM_SELECTION > Math.random()) {
parents.push(randomIndividual());
}
}
}
61. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {…}
function randomPopulation(size) {…}
function mutate(population) {…}
function reproduce(father, mother) {…}
function evolve(population) {
//Random Stuff
parents = mutate(parents);
var rndMax = parents.length - 1;
while (parents.length < POP_SIZE) {
var father = randomInt(rndMax);
var mother = randomInt(rndMax);
if (father != mother) {
father = parents[father]; mother = parents[mother];
parents.push(reproduce(father, mother));
}
}
return parents;
}
62. Genetic algorithms
CODE EXAMPLE
//Declare Consts
function randomInt(min, max) {…}
function random(min, max) {…}
function fitness(individual) {…}
function sortByFitness(population) {…}
function randomIndividual() {…}
function randomPopulation(size) {…}
function mutate(population) {…}
function reproduce(father, mother) {…}
function evolve(population) {…}
function findSolution() {
var population = randomPopulation(POP_SIZE);
var generation = 0;
while (fitness(population[0]) > CLOSE_ENOUGH) {
generation++;
population = evolve(population);
}
return {solution: population[0], generations: generation};
}
var sol = findSolution();
63. Genetic algorithms
CODE EXAMPLE
var population = [];
const TARGET = 200;
const MIN = 0;
const MAX = TARGET - 1;
const IND_COUNT = 4;
const POP_SIZE = 100;
const CLOSE_ENOUGH = 0.001;
var RETAIN = 0.02;
var RANDOM_SELECTION = 0.05;
var MUTATION_PROBABILITY = 0.01;
function randomInt(min, max) {
return Math.round(random(min, max));
}
function random(min, max) {
if (max == undefined) { max = min; min = 0; }
if (max == undefined) { max = 100; }
return (Math.random()*(max-min)) + min;
}
function fitness(individual) {
  var sum = individual.reduce((a,b) => a + b, 0);
  return Math.abs(TARGET - sum);
}
function sortByFitness(population) {
population.sort((a, b) => {
var fitA = fitness(a); var fitB = fitness(b);
return fitA > fitB ? 1 : -1;
});
return population;
}
function randomIndividual() {
var individual = [];
for (var i = 0; i < IND_COUNT; i++) {
individual.push(random(MIN, MAX));
}
return individual;
}
function randomPopulation(size) {
var population = [];
for (var i = 0; i < size; i++) {
population.push(randomIndividual());
}
return population;
}
function mutate(population) {
for (var i=0; i < population.length; i++) {
if (MUTATION_PROBABILITY > Math.random()) {
var index = randomInt(population[i].length - 1);
population[i][index] = random(MIN, MAX);
}
}
return population;
}
function reproduce(father, mother) {
var half = father.length / 2;
var child = [];
child = child.concat(father.slice(0, half), mother.slice(half, mother.length));
return child;
}
function evolve(population) {
var parents = [];
//Keep the best solutions
parents = sortByFitness(population).slice(0, Math.round(POP_SIZE*RETAIN));
//Randomly add new elements
for (var i = parents.length; i < POP_SIZE - parents.length; i++) {
if (RANDOM_SELECTION > Math.random()) {
parents.push(randomIndividual());
}
}
//Mutate elements
parents = mutate(parents);
var rndMax = parents.length - 1;
while (parents.length < POP_SIZE) {
var father = randomInt(rndMax);
var mother = randomInt(rndMax);
if (father != mother) {
father = parents[father];
mother = parents[mother];
parents.push(reproduce(father, mother));
}
}
return parents;
}
function findSolution() {
var population = randomPopulation(POP_SIZE);
var generation = 0;
while (fitness(population[0]) > CLOSE_ENOUGH) {
generation++;
population = evolve(population);
}
return {solution: population[0], generations: generation};
}
var sol = findSolution();
console.log("Found solution in " + sol.generations + " generations.", sol.solution);
64. Seeing is believing
GENETIC ALGORITHMS
65. That’s all folks!
Questions?
JOEL LORD
April 4th, 2017
TWITTER: @JOEL__LORD
GITHUB: HTTP://GITHUB.COM/JOELLORD