Binary classification and linear separators. Perceptron, ADALINE, artificial neurons. Artificial neural networks (ANNs), activation functions, and the universal approximation theorem. Linear versus non-linear classification problems. Typical tasks, architectures, and loss functions. Gradient descent and back-propagation. Support Vector Machines (SVMs), soft margins, and the kernel trick. Connections between ANNs and SVMs.
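As a quick illustration of the perceptron and its update rule listed above, here is a minimal stdlib-only Python sketch; the AND-gate data, learning rate, and epoch count are illustrative assumptions, not part of the syllabus:

```python
# Minimal perceptron for a linearly separable binary problem (the AND gate).
# Illustrative sketch: data, learning rate, and epoch count are assumptions.

def train_perceptron(samples, labels, lr=0.1, epochs=20):
    w = [0.0, 0.0]  # weights
    b = 0.0         # bias
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = y - pred                # zero when the prediction is correct
            w[0] += lr * err * x[0]       # perceptron update rule
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]  # logical AND is linearly separable
w, b = train_perceptron(X, y)
preds = [1 if w[0] * a + w[1] * c + b > 0 else 0 for a, c in X]
print(preds)  # converges to [0, 0, 0, 1] on this separable problem
```

Replacing the labels with XOR shows the linear/non-linear distinction from the syllabus: the same loop never converges, which is what motivates multi-layer networks.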
Overview of the course. Introduction to image sciences, image processing, and computer vision. Basics of machine learning, terminology, and paradigms. No-free-lunch theorem. Supervised versus unsupervised learning. Clustering and K-Means. Classification and regression. Linear least squares and polynomial curve fitting. Model complexity and overfitting. Curse of dimensionality. Dimensionality reduction and principal component analysis. Image representation, the semantic gap, image features, and classical computer vision pipelines.
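The K-Means topic above can be illustrated with a minimal Lloyd's-algorithm sketch on 1-D points; the data, the two initial centers, and the iteration count are illustrative assumptions:

```python
# Minimal K-Means (Lloyd's algorithm) on 1-D points; stdlib only.
def kmeans(points, centers, iters=10):
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        # update step: each center moves to its cluster mean (kept if empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
final = kmeans(data, centers=[0.0, 5.0])
print(final)  # the centers move to roughly 1.0 and 8.0
```

The same assign/update alternation generalizes to higher dimensions by swapping the absolute difference for Euclidean distance.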
Localization and classification. OverFeat: class-agnostic versus class-specific localization, fully convolutional neural networks, greedy merge strategy. Multi-object detection. Region proposals and selective search. R-CNN, Fast R-CNN, Faster R-CNN, and YOLO. Image segmentation. Semantic segmentation and transposed convolutions. Instance segmentation and Mask R-CNN. Image captioning. Recurrent Neural Networks (RNNs). Language generation. Long Short-Term Memory (LSTM) networks. DeepImageSent, Show and Tell, and Show, Attend and Tell algorithms.
Machine Learning: The Bare Math Behind Libraries (J On The Beach)
During this presentation, we will answer how much you’ll need to invest in a superhero costume to be as popular as Superman. We will generate a unique logo which will stand against the ever popular Batman and create new superhero teams. We shall achieve it using linear regression and neural networks.
Machine learning is one of the hottest buzzwords in technology today as well as one of the most innovative fields in computer science – yet people use libraries as black boxes without basic knowledge of the field. In this session, we will strip them to bare math, so next time you use a machine learning library, you’ll have a deeper understanding of what lies underneath.
During this session, we will first provide a short history of machine learning and an overview of two basic teaching techniques: supervised and unsupervised learning.
We will start by defining what machine learning is and equip you with an intuition of how it works. We will then explain the gradient descent algorithm with the use of simple linear regression to give you an even deeper understanding of this learning method. Then we will project it to supervised neural networks training.
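The gradient-descent-on-linear-regression step described above can be sketched as follows; the talk's own examples are in Octave, so this is an equivalent stdlib-Python sketch, and the data, learning rate, and step count are illustrative assumptions:

```python
# Gradient descent on simple linear regression (mean squared error loss).
def fit_line(xs, ys, lr=0.05, steps=500):
    a, b = 0.0, 0.0  # model: y_hat = a * x + b
    n = len(xs)
    for _ in range(steps):
        # gradients of the MSE with respect to a and b
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        a -= lr * grad_a  # step against the gradient
        b -= lr * grad_b
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # exactly y = 2x + 1
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))  # approaches 2.0 and 1.0
```

Replacing the single (a, b) pair with a weight matrix and the line with a layered non-linear model is, in essence, the projection to supervised neural network training mentioned above.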
Within unsupervised learning, you will become familiar with Hebb’s learning and learning with concurrency (winner takes all and winner takes most algorithms). We will use Octave for examples in this session; however, you can use your favourite technology to implement presented ideas.
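As a rough illustration of the winner-takes-all rule mentioned above, here is a stdlib-Python sketch (the talk's examples are in Octave; the data, initial weights, and learning rate here are assumptions):

```python
# Winner-takes-all competitive learning sketch (unsupervised); stdlib only.
def wta_train(inputs, weights, lr=0.5, epochs=20):
    for _ in range(epochs):
        for x in inputs:
            # winner: the unit whose weight vector is closest to the input
            win = min(range(len(weights)),
                      key=lambda i: sum((xi - wi) ** 2
                                        for xi, wi in zip(x, weights[i])))
            # only the winner's weights move toward the input (Hebb-like rule)
            weights[win] = [wi + lr * (xi - wi)
                            for xi, wi in zip(x, weights[win])]
    return weights

data = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1)]
w = wta_train(data, weights=[[0.3, 0.3], [0.7, 0.7]])
print(w)  # one unit settles near (0, 0), the other near (1, 1)
```

Winner-takes-most differs only in also nudging the runner-up units, with a smaller learning rate.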
Our aim is to show the mathematical basics of neural networks for those who want to start using machine learning in their day-to-day work or use it already but find it difficult to understand the underlying processes. After viewing our presentation, you should find it easier to select parameters for your networks and feel more confident in your selection of network type, as well as be encouraged to dive into more complex and powerful deep learning methods.
Using PSO to optimize a logit model with TensorFlow (Yi-Fan Liou)
This project aims to use particle swarm optimization (PSO), one of the evolutionary algorithms, to optimize the weights and bias of a logistic regression model in TensorFlow.
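The idea can be sketched without TensorFlow: a plain-Python PSO searching the weight and bias of a 1-D logistic regression. All hyperparameters and the toy dataset below are illustrative assumptions, not taken from the project:

```python
import math
import random

random.seed(0)

# Toy 1-D binary classification data (an assumption for illustration).
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]

def loss(params):
    """Mean cross-entropy of sigmoid(w * x + b) over the toy data."""
    w, b = params
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        p = min(max(p, 1e-9), 1 - 1e-9)  # clip for numerical safety
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

n, dims = 10, 2  # 10 particles over the (w, b) plane
pos = [[random.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
vel = [[0.0] * dims for _ in range(n)]
pbest = [p[:] for p in pos]          # each particle's best position so far
gbest = min(pbest, key=loss)         # swarm's best position so far

for _ in range(100):
    for i in range(n):
        for d in range(dims):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (0.7 * vel[i][d]                         # inertia
                         + 1.5 * r1 * (pbest[i][d] - pos[i][d])  # cognitive
                         + 1.5 * r2 * (gbest[d] - pos[i][d]))    # social
            pos[i][d] += vel[i][d]
        if loss(pos[i]) < loss(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=loss)

print(loss(gbest))  # the data is separable, so the swarm drives the loss low
```

The appeal of PSO here is that the loss only needs to be evaluated, never differentiated, which is why it can serve as a drop-in alternative to gradient-based training.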
"You Can Do It" by Louis Monier (Altavista Co-Founder & CTO) & Gregory Renard (CTO & Artificial Intelligence Lead Architect at Xbrain) for Deep Learning keynote #0 at Holberton School (http://www.meetup.com/Holberton-School/events/228364522/)
If you want to attend a similar keynote for free, check out http://www.meetup.com/Holberton-School/
From Research Objects to Reproducible Science Tales (Bertram Ludäscher)
University of Southampton. Electronics & Computer Science. Research Seminar (Invited Talk).
TITLE: From Research Objects to Reproducible Science Tales
ABSTRACT. Rumor has it that there is a reproducibility crisis in science. Or maybe there are multiple crises? What do we mean by reproducibility and replicability anyways? In this talk I will first make an attempt at sorting out some of the terminological confusion in this area, focusing on computational aspects. The PRIMAD model is another attempt to describe different aspects of reproducibility studies by focusing on the "delta" between those studies and the original study. In addition to these more theoretical investigations, I will discuss practical efforts to create more reproducible and more transparent computational platforms such as the one developed by the Whole-Tale project: here 'tales' are executable research objects that may combine data, code, runtime environments, and narratives (i.e., the traditional "science story"). I will conclude with some thoughts about the remaining challenges and opportunities to bridge the large conceptual gaps that continue to exist despite the recognition of problems of reproducibility and transparency in science.
ABOUT the Speaker. Bertram Ludäscher is a professor at the School of Information Sciences at the University of Illinois, Urbana-Champaign and a faculty affiliate with the National Center for Supercomputing Applications (NCSA) and the Department of Computer Science at Illinois. Until 2014 he was a professor at the Department of Computer Science at the University of California, Davis. His research interests range from practical questions in scientific data and workflow management, to database theory and knowledge representation and reasoning. Prior to his faculty appointments, he was a research scientist at the San Diego Supercomputer Center (SDSC) and an adjunct faculty at the CSE Department at UC San Diego. He received his M.S. (Dipl.-Inform.) in computer science from the University of Karlsruhe (now K.I.T.), and his PhD (Dr. rer. nat.) from the University of Freiburg, in Germany.
Neural Networks and Deep Learning for Physicists (Héloïse Nonne)
Introduction to neural networks and deep learning. Seminar given by Héloïse Nonne on February 19th, 2015 at CINaM (Centre Interdisciplinaire de Nanosciences de Marseille) at Aix-Marseille University
Information about the study and career opportunities promoted by the Computer Science program of the University of São Paulo, São Carlos campus.
An introduction to the Business Intelligence tools of the Hadoop ecosystem:
Business Intelligence and Big Data
Big Data warehousing
Architecture of a data warehouse
Hadoop and Apache Hive
Extract Transform Load
Data warehouse vs. operational database
OLAP – Online Analytical Processing
Apache Kylin
Conventional OLAP solutions
Advanced Analytics with Apache Mahout
On the Support of a Similarity-Enabled Relational Database Management System ... (Universidade de São Paulo)
Crowdsourcing solutions can be helpful to extract information from disaster-related data during crisis management. However, certain information can only be obtained through similarity operations. Some of them also depend on additional data stored in a Relational Database Management System (RDBMS). In this context, several works focus on crisis management supported by data. Nevertheless, none of them provides a methodology for employing a similarity-enabled RDBMS in disaster-relief tasks. To fill this gap, we introduce a similarity-enabled methodology together with a supporting architecture named Data-Centric Crisis Management (DCCM), which employs our methods over an RDBMS. We evaluate our proposal through three tasks: classification of incoming data regarding current events, identifying relevant information to guide rescue teams; filtering of incoming data, enhancing decision support by removing near-duplicate data; and similarity retrieval of historical data, supporting analytical comprehension of the crisis context. To make this possible, similarity-based operations were implemented within one popular, open-source RDBMS. Results using real data from Flickr show that the proposed methodology over DCCM is feasible for real-time applications. In addition to high performance, accurate results were obtained with a proper combination of techniques for each task. Finally, given its accuracy and efficiency, we expect our work to provide a framework for further developments on crisis management solutions.
Effective and Unsupervised Fractal-based Feature Selection for Very Large Dat... (Universidade de São Paulo)
Given a very large dataset of moderate-to-high dimensionality, how to mine useful patterns from it? In such cases, dimensionality reduction is essential to overcome the "curse of dimensionality". Although there exist algorithms to reduce the dimensionality of Big Data, unfortunately, they all fail to identify/eliminate non-linear correlations between attributes. This paper tackles the problem by exploring concepts of Fractal Theory and massive parallel processing to present Curl-Remover, a novel dimensionality reduction technique for very large datasets. Our contributions are: Curl-Remover eliminates linear and non-linear attribute correlations as well as irrelevant ones; it is unsupervised and suits analytical tasks in general – not only classification; it presents linear scale-up; it does not require the user to guess the number of attributes to be removed; and it preserves the attributes' semantics. We performed experiments on synthetic and real data spanning up to 1.1 billion points, and Curl-Remover outperformed a PCA-based algorithm, being up to 8% more accurate.
Fire Detection on Unconstrained Videos Using Color-Aware Spatial Modeling and... (Universidade de São Paulo)
The semantic segmentation of events in emergency contexts involves the identification of previously defined events of interest. In this work, the semantic event of interest is the presence of fire in videos. The literature presents several methods for automatic video fire detection, but these methods were built under assumptions, such as stationary cameras and controlled lighting conditions, that are often in contrast with the videos acquired by hand-held devices. To fill this gap, we propose a fire detection method called SPATFIRE. Our method innovates on three aspects: (1) it relies on a specifically tailored color model, named Fire-like Pixel Detector, able to improve the accuracy of fire detection; (2) it employs a new technique for motion compensation, diminishing the problems observed in videos captured with non-stationary cameras; and (3) it defines a segmentation method able to identify not only the presence of fire in a video but also the segments of the video where fire occurs. We evaluated our proposal on two video datasets with different characteristics and summarize the results to demonstrate its superior efficacy, in terms of true positives and negatives, as compared to state-of-the-art methods.
Can we use information from social media and crowdsourced images to detect smoke and assist rescue forces? While there are computer vision methods for detecting smoke, they require movement information extracted from video data. In this paper we propose SmokeBlock: a method that is able to segment and detect smoke in still images. SmokeBlock uses superpixel segmentation and extracts local color and texture features from images to spot smoke. We used real data from Flickr and compared SmokeBlock against state-of-the-art methods for feature extraction. Our method achieved performance superior to the competitors in the task of smoke detection. Our findings shall support further investigations in the field of image analysis, in particular concerning images captured with mobile devices.
Vertex Centric Asynchronous Belief Propagation Algorithm for Large-Scale Graphs (Universidade de São Paulo)
Inference problems on networks and their algorithms have always been important subjects, but even more so now, with so much data available and so little time to make sense of it.
Common applications range from product recommendation to social networks and protein interaction.
One of the main inference approaches in these types of networks is the guilt-by-association method, where labeled nodes propagate their information throughout the network, towards unlabeled nodes.
While there is a widely used algorithm for this context, called Belief Propagation, it lacks the necessary convergence guarantees for loopy networks.
More recently, a new alternative method was proposed, called LinBP, and while it solved the convergence issue, scalability for large graphs that do not fit in memory remains a challenge.
Additionally, most works that try to use BP on large-scale graphs rely on specific infrastructure such as supercomputers and computational clusters.
Therefore, we propose a new algorithm that leverages state-of-the-art asynchronous vertex-centric parallel processing techniques in conjunction with the state-of-the-art BP alternative LinBP, to provide a scalable framework for large graph inference that runs on a single commodity machine.
Our results show that our algorithm is up to 200 times faster than LinBP's SQL implementation on the tested networks, while achieving the same accuracy rate.
We also show that, due to the asynchronous processing, our algorithm needs fewer iterations to converge than LinBP when using the same parameters.
Finally, we believe that our methodology highlights the not yet fully explored parallelism available on commodity machines, leaning towards a more cost-efficient computational paradigm.
Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model (Universidade de São Paulo)
Recent graph computation approaches have demonstrated that a single PC can perform efficiently on billion-scale graphs. While these approaches achieve scalability by optimizing I/O operations, they do not fully exploit the capabilities of modern hard drives and processors. To overcome their performance, in this work, we introduce Bimodal Block Processing (BBP), an innovation that is able to boost graph computation by minimizing the I/O cost even further. With this strategy, we achieved the following contributions: (1) M-Flash, the fastest graph computation framework to date; (2) a flexible and simple programming model to easily implement popular and essential graph algorithms, including the first single-machine billion-scale eigensolver; and (3) extensive experiments on real graphs with up to 6.6 billion edges, demonstrating M-Flash's consistent and significant speedup.
StructMatrix: large-scale visualization of graphs by means of structure detec... (Universidade de São Paulo)
Given a large-scale graph with millions of nodes and edges, how to reveal macro patterns of interest, like cliques, bipartite cores, stars, and chains? Furthermore, how to visualize such patterns altogether, getting insights from the graph to support wise decision-making? Although there are many algorithmic and visual techniques to analyze graphs, none of the existing approaches is able to present the structural information of graphs at large scale. Hence, this paper describes StructMatrix, a methodology aimed at highly scalable visual inspection of graph structures with the goal of revealing macro patterns of interest. StructMatrix combines algorithmic structure detection and adjacency matrix visualization to present cardinality, distribution, and relationship features of the structures found in a given graph. We performed experiments on real, large-scale graphs with up to one million nodes and millions of edges. StructMatrix revealed that graphs of high relevance (e.g., Web, Wikipedia, and DBLP) have characterizations that reflect the nature of their corresponding domains; our findings have not been seen in the literature so far. We expect that our technique will bring deeper insights into large graph mining, leveraging its use for decision making.
Several graph visualization tools exist. However, they are not able to handle large graphs, and/or they do not allow interaction. We are interested in large graphs, with hundreds of thousands of nodes. Such graphs bring two challenges: the first is that any straightforward interactive manipulation will be prohibitively slow. The second is sensory overload: even if we could plot and replot the graph quickly, the user would be overwhelmed by the vast volume of information, because the screen would be too cluttered as nodes and edges overlap each other. The GMine system addresses both these issues by using summarization and multi-resolution. GMine offers multi-resolution graph exploration by partitioning a given graph into a hierarchy of communities-within-communities and storing it in a novel R-tree-like structure which we name G-Tree. GMine offers summarization by implementing an innovative subgraph extraction algorithm and then visualizing its output.
Techniques for effective and efficient fire detection from social media images (Universidade de São Paulo)
Social media provides information, in the form of images, that is valuable to a vast set of human activities, including salvage and rescue in crisis situations (such as accidents, explosions, and fires). However, these services produce images at a rate that is impossible for human beings to absorb and analyze; thus, methods for automatic analysis are a requirement. Despite the multiple works on image analysis, there are no studies on the specific topic of fire detection over social media. To fill this gap, this work describes the use and evaluation of an ample set of content-based image retrieval and classification techniques in the task of fire detection. To this end, we (1) built a ground-truth set of annotated images regarding fire occurrence; (2) engineered the Fast-Fire Detection and Retrieval (FFDnR) architecture to combine configurations of feature extractors and distance functions to work with instance-based learning; and (3) evaluated 36 image descriptors in the task of fire detection. Our results demonstrated that, for fire detection, the best image descriptors concerning efficacy (F-measure, Precision-Recall, and ROC) and processing efficiency (wall-clock time) are achieved with the MPEG-7 feature extractors Color Structure and Scalable Color, and with the distance functions City-Block and Euclidean. Our work shall provide a basis for further developments regarding the monitoring of images from social media.
Multimodal graph-based analysis over the DBLP repository: critical discoverie... (Universidade de São Paulo)
The use of graph theory for analyzing network-like data has gained central importance with the rise of the Web 2.0. However, many graph-based techniques are neither well-disseminated nor explored to their full potential, which might depend on a complementary approach achieved with the combination of multiple techniques. This paper describes the systematic use of graph-based techniques of different types (multimodal), combining the resulting analytical insights around a common domain, the Digital Bibliography & Library Project (DBLP). To do so, we introduce an analytical ensemble based on statistical (degree and weakly-connected-component distributions), topological (average clustering coefficient and effective diameter evolution), algorithmic (link prediction/machine learning), and algebraic techniques to inspect non-evident features of DBLP, at the same time interpreting the heterogeneous discoveries found along the work. As a result, we have put together a set of techniques demonstrating over DBLP what we call multimodal analysis, an innovative process of information understanding that demands wide technical knowledge and a deep understanding of the data domain. We expect that our methodology and our findings will foster other multimodal analyses and also bring light to Computer Science research.
Currently, link recommendation has gained more attention as networked data becomes abundant in several scenarios. However, existing methods for this task have failed to consider solely the structure of dynamic networks for improved performance and accuracy. Hence, in this work, we present a methodology based on the use of multiple topological metrics in order to achieve prospective link recommendations considering time constraints. The combination of such metrics is used as input to binary classification algorithms that state whether a pair of authors will/should define a link. We experimented with five algorithms, which allowed us to reach high rates of accuracy and to evaluate the different classification paradigms. Our results also demonstrated that time parameters and the activity profile of the authors can significantly influence the recommendation. In the context of DBLP, this research is strategic as it may assist in identifying potential partners, research groups with similar themes, research competition (absence of obvious links), and related work.
Relational databases are rigid-structured data sources characterized by complex relationships among a set of relations (tables). Making sense of such relationships is a challenging problem because users must consider multiple relations, understand their ensemble of integrity constraints, interpret dozens of attributes, and draw complex SQL queries for each desired data exploration. In this scenario, we introduce a twofold methodology: we use a hierarchical graph representation to efficiently model the database relationships and, on top of it, we design a visualization technique for rapid relational exploration. Our results demonstrate that the exploration of databases is profoundly simplified, as the user is able to visually browse the data with little or no knowledge about its structure, dismissing the need for complex SQL queries. We believe our findings will bring a novel paradigm to relational data comprehension.
http://www.icmc.usp.br/~junio/PublishedPapers/RodriguesJr_et_al_Frequency_Plot-SIBGRAPI2003.pdf
Jose Rodrigues, Agma J. M. Traina, Caetano Traina Jr. (2003). Frequency Plot and Relevance Plot to Enhance Visual Data Exploration. In: XVI Brazilian Symposium on Computer Graphics and Image Processing, pp. 117-124. IEEE Press.
@inproceedings{DBLP:conf/sibgrapi/RodriguesTT03,
title = "Frequency Plot and Relevance Plot to Enhance Visual Data Exploration",
year = "2003",
author = "Jose Rodrigues and Agma J M Traina and Caetano Traina Jr",
booktitle = "XVI Brazilian Symposium on Computer Graphics and Image Processing",
pages = "117-124",
publisher = "IEEE Press",
doi = "10.1109/SIBGRA.2003.1240999",
url = "http://www.icmc.usp.br/~junio/PublishedPapers/RodriguesJr_et_al_Frequency_Plot-SIBGRAPI2003.pdf",
urllink = "http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=1240999&",
abstract = "We present two techniques aiming at exploring databases through multivariate visualizations. Both techniques intend to deal with the problem caused by the limited amount of elements that can be presented simultaneously in traditional visual exploration procedures. The first technique, the Frequency Plot, combines data frequency with interactive filtering to identify clusters and trends in subsets of the database. Thus, graphical elements (lines, pixels, icons, or graphical marks) are color differentiated proportionally to how frequent the value being represented is, while interactive filtering allows the selection of interesting partitions of the database. The second technique, the Relevance Plot, corresponds to assigning different levels of color distinguishably to visual elements according to their relevance to a user's specified data properties set, which can be chosen visually and dynamically.",
keywords = "Computer science , Data analysis , Data visualization , Filtering , Frequency , Humans , Image databases , Information retrieval , Layout , Visual databases"}
Adjusting primitives for graph: SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential vs OpenMP-based vector multiply.
2. Comparing various launch configs for CUDA-based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential vs OpenMP-based vector element sum.
2. Performance of memcpy-based vs in-place CUDA vector element sum.
3. Comparing various launch configs for CUDA-based vector element sum (memcpy).
4. Comparing various launch configs for CUDA-based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA-based vector element sum (in-place).
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
1. A gentle introduction to Deep Learning
Jose Fernando Rodrigues-Jr
University of Sao Paulo, Brazil
Supervision: Sihem Amer-Yahia, Université Grenoble Alpes, France
Funding: Fundação de Amparo à Pesquisa do Estado de São Paulo (Fapesp), Grant 2018/17620-5
2. Laboratoire d’Informatique de Grenoble /66
About me and my university
● The University of Sao Paulo
-Ranked number 2 among Latin American universities
-Ranked in the 250-300 stratum in the world (UGA is in the 300-350 stratum)
(source: Times Higher Education, 2019)
● Faculty at the University of Sao Paulo since 2010, associate professor since 2014
● My campus is in the city of Sao Carlos, in the countryside of the state of Sao Paulo
● The HDI of Sao Carlos is 0.805 (Brazil is 0.754 and France is 0.897)
4. Deep Learning
-From the IEEE top 10 computing trends 2018, Deep Learning is number 1
https://www.computer.org/press-room/2017-news/top-technology-trends-2018;
-Not new: most of the techniques are 20, 30, even 50 years old;
-Not necessarily deep: some architectures have a single (hidden) layer;
-Myth: it is about artificial intelligence, not artificial consciousness.
5. Deep Learning
Specifically, Deep Learning refers to the revival of artificial intelligence (artificial neural networks) due to four factors:
1) lots of data: while a child learns what a dog looks like from three images, a computer demands 3 million images;
2) computing power: 20xx computers have memory and processing power orders of magnitude higher than 19xx computers; GPUs scaled the process even more;
3) algorithmic improvements: gradient descent, back-propagation and architectural innovations amplified the range of possibilities;
4) robust frameworks: Theano, TensorFlow, Keras, and many others made complex parallel math computing accessible.
6. Image classification breakthrough
Large Scale Visual Recognition Challenge (ILSVRC) - ImageNet for short
Training: 1.2 million images
Validation: 150,000 images
Test: 50,000 images
1,000 classes
Super-human performance reached (2017)
9. No more feature engineering
● The idea of feature engineering applies to data processing problems that demand feature extraction. This is not always the case;
● Yet, it is still possible to use Artificial Neural Networks with manually extracted features - sometimes it is the only course of action, as in regression problems.
10. Promising results
Soon (or already?) better than human skills:
-Computer Vision
-Text translation
-Text generation
-Games: Go, Chess, …
-Medicine: heart attack, neurodegenerative diseases, oncology, …
Esteva, A. et al.; Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017
-super-human performance on classifying skin lesions
-identified classes still unknown in the literature
-1500+ citations in one year
11. Turing award 2018
Yoshua Bengio, French-Canadian (theoretical background)
Geoffrey Hinton, British-Canadian (back-propagation, AlexNet)
Yann LeCun, French (convolutional networks and engineering)
“For conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.”
14. Biology inspiration
● Inspiration only - not simulation. It is not yet fully understood how the brain works.
15. Existential parenthesis
Why neurons?
-The universe is made up of sets (for Computer Science, unordered lists without repetition)
-In a world of sets, what do smart things do?
Ans.: they build up functions (or maps, for CS)
-What is a function, broadly speaking?
Given two sets X and Y, a function defines a mapping between them: f: X → Y
X and Y can be anything: objects, emotions, concepts, abstractions, skills, music, …
-To do that, nature (evolution) designed specialized cells, named neurons
A very big bunch of neurons is able to build functions!
First key concept:
1) An Artificial Neural Network is a function;
17. Existential parenthesis
Compared to numeric Math sets, smart beings deal with sets whose elements cannot be completely foreseen, not even exhaustively.
Math function: f: ℕ → ℝ; for example f(x) = xe
Open function: f: {all possible dogs} → {all known dog breeds}
(figure: a dog photo mapped to the set {Akita, Alaskan husky, Bichon Frisé, Border Terrier, Boxer, Brazilian Mastiff, …})
20. Principle - artificial neuron
In matrix form ⇒ Very important
● 1 input 1 x n feature vector;
● 1 processing n x 1 neuron.
(figure: a 1 x n row vector of features alongside an n x 1 column vector of weights)
21. Principle - artificial neuron
In matrix form ⇒ Very important
● j input 1 x n feature vectors, stacked as a matrix I (j x n);
● k processing n x 1 neurons, stacked as a matrix M (n x k).
Now, remember the matrix dot product, and the neuron principle.
(figure: the j x n input matrix I, the n x k weight matrix M, and the dot-product rule)
23. Principle - artificial neuron
Suppose:
● j input 1 x n=10 feature vectors = I ⇒ j x 10 matrix (10 features each);
● k=5 processing 10 x 1 neurons = M ⇒ 10 x 5 matrix (10 weights per neuron, 5 neurons).
(figure: the j x 10 input matrix I and the 10 x 5 weight matrix M)
24. Principle - artificial neuron
The processing of the j 1 x 10 vectors by the five 10 x 1 neurons corresponds to the dot product I(j x 10) · M(10 x 5).
The output is a matrix O corresponding to j new vectors, each with 5 transformed features, that is O(j x 5).
(figure: I(j x 10) · M(10 x 5) = O(j x 5))
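The matrix view above can be sketched in a few lines of NumPy. The shapes follow the slides (I is j x 10, M is 10 x 5); the value of j and the random numbers are placeholders chosen here for illustration:

```python
import numpy as np

# A layer of k neurons processing j feature vectors is one matrix product.
j = 4                                  # number of input vectors (arbitrary example)
rng = np.random.default_rng(0)
I = rng.normal(size=(j, 10))           # j input 1 x 10 feature vectors, stacked
M = rng.normal(size=(10, 5))           # 5 neurons, each with 10 weights

O = I @ M                              # dot product I(j x 10) . M(10 x 5)
print(O.shape)                         # (4, 5): j vectors with 5 transformed features
```

Each column of M is one neuron; each row of O is one input vector re-expressed by the 5 neurons.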
25. Supervised learning
Training: I know the answer → Learning, building the model
Testing: I do not know the answer → Evaluation, using the model
29. After all, an optimization problem
*Biases omitted for simplicity
(figure: the loss as a function of the parameters - mostly, weights)
30. Principle - artificial neuron
Suppose:
● j input 1 x n=10 feature vectors = I ⇒ j x 10 matrix;
● k=5 processing 10 x 1 neurons = M ⇒ 10 x 5 matrix.
(figure: the same input and weight matrices as before)
This weight matrix is the object of the optimization: what weights lead to the desired output?
31. After all, an optimization problem
Second key concept:
1) An Artificial Neural Network is a function;
2) The training of an ANN is an optimization problem;
32. After all, an optimization problem
Attention:
- This presentation covers only the basics; in fact, it covers concepts of Artificial Neural Networks;
- When feature extraction is involved, as in image and audio processing, the process is much more complex;
- Actually, the deepness of "Deep Learning" has to do with these more complex problems;
- Nevertheless, the principles are the same.
33. Overall (theoretical) process
1. Specify a structure and a loss function to guide the optimization;
2. Feed forward with matrix multiplication and non-linear activations;
3. while (not satisfactory results)
a. Compute the parameters’ adjustment using gradient descent;
b. The network backpropagates using the multivariate chain rule;
c. Update the weights accordingly;
d. Classification/Regression.
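The loop above can be sketched end to end for a tiny one-layer network. This is a minimal illustration in NumPy, not the deck's own code: the sigmoid activation, cross-entropy gradient, synthetic data and learning rate are all choices made here:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))                  # 100 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = (X @ true_w > 0).astype(float)             # synthetic binary labels

w = np.zeros(3)                                # parameters to optimize
lr = 0.5                                       # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):                           # while (not satisfactory results)
    p = sigmoid(X @ w)                         # feed forward
    grad = X.T @ (p - y) / len(y)              # gradient of cross-entropy loss w.r.t. w
    w -= lr * grad                             # update the weights
    # (with more layers, the chain rule would carry the gradient backwards here)

pred = (sigmoid(X @ w) > 0.5).astype(float)
print((pred == y).mean())                      # training accuracy, near 1.0
```

Deep networks differ only in scale: more layers, more parameters, and the gradient propagated through each layer by the chain rule.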
35. Error landscape
● The set of parameters defines an error landscape
● We want to move along this landscape to find the best minimum (preferably the global minimum)
36. How to converge to the proper parameters?
The standard solution is the gradient descent algorithm:
1. Calculate the partial derivative ∂Loss/∂W;
2. Backpropagate, updating W as W ← W − η·∂Loss/∂W;
3. Use the chain rule to propagate through all the layers.
The learning rate η states how much to move in the direction contrary to the gradient.
(figure: the loss as a function of W, with steps descending toward a minimum)
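Gradient descent in its simplest form can be shown on a one-dimensional error landscape. The quadratic loss below is an example chosen here to make the derivative trivial:

```python
# Gradient descent on a 1-D error landscape: loss(w) = (w - 3)^2
lr = 0.1                      # learning rate
w = 0.0
for _ in range(100):
    grad = 2 * (w - 3)        # partial derivative of the loss w.r.t. w
    w -= lr * grad            # move contrary to the gradient
print(round(w, 4))            # → 3.0 (the minimum of the loss)
```

A larger learning rate moves faster but can overshoot the minimum; a smaller one is safer but slower — the trade-off the adaptive optimizers on the next slides try to manage.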
39. How to converge to the proper parameters?
Third key concept:
1) An Artificial Neural Network is a function;
2) The training of an ANN is an optimization problem;
3) Gradient descent is the ultimate method to move along the error landscape.
40. Different gradient descent methods
● There are many gradient descent-based optimizers;
● They vary with respect to the speed of convergence, processing cost, learning rate, and decay factor;
● Adadelta is the most robust and widely used;
○ It is stochastic, hence more robust against local minima.
Adadelta uses an adaptive learning rate; the closer to a minimum, the smaller the learning rate.
42. Millions of parameters
● Warning: even for mid-sized networks, the number of weights sums up to thousands, even millions;
● This is responsible for the high computational cost of deep learning.
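To make the scale concrete, the weight count of a small fully connected network can be tallied directly. The layer sizes below are illustrative (an MNIST-sized MLP), not taken from the slides:

```python
# Each fully connected layer of shape (n_in, n_out) has n_in*n_out weights
# plus n_out biases.
layers = [(784, 512), (512, 512), (512, 10)]   # hypothetical layer sizes
total = sum(n_in * n_out + n_out for n_in, n_out in layers)
print(total)                                   # → 669706 parameters
```

Three modest layers already give two-thirds of a million parameters; convolutional image models routinely reach tens of millions.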
44. Deep Learning Frameworks
Implementing all these concepts from scratch is very hard (really!);
To ease the process, academic and industrial players built frameworks that:
-make linear algebra expressions as simple as scalar algebra expressions;
-calculate partial derivatives automatically (one line of code);
-perform back-propagation;
-distribute the computation over GPUs.
Main frameworks: Theano, Google TensorFlow, Microsoft Cognitive Toolkit, PyTorch, Keras, Apache MXNet, NVIDIA Caffe, Chainer, and others.
45. Deep Learning Frameworks
Oh, it is so easy! - NO!
You still have to:
-model the data input and output: a great deal of numbers organized in multi-dimensional arrays;
-model the layers in terms of size and connectivity -- matrix dimensionality will give you headaches;
-implement the neurons’ computations;
-implement the updating scheme;
-get used to symbolic coding.
46. How to choose a Framework
● You are a PhD student or Postdoc on DL itself: Theano, TensorFlow, Torch
● You want to use DL only to get features: Keras, Caffe
● You work in industry: TensorFlow, Caffe
● You started your 2-month internship: Keras, Caffe
● You want to give practice works to your students: Keras, Caffe
● You are curious about deep learning: Caffe
● You don’t even know Python: Keras, Torch
Source: https://project.inria.fr/deeplearning/files/2016/05/DLFrameworks.pdf
48. Pitfalls
● Proper pre-processing;
● Optimizing the structure can be a never-ending process;
● Preventing over- or under-fitting;
● Getting it to converge (to a high-quality local minimum);
● Making sure you have the right loss function;
● Doing data augmentation correctly.
Time-consuming
● Testing a single idea can take a week or more;
● Preprocessing large data takes a long time;
● Symbolic programming is tough;
● Hyper-parameters + process variations ⇒ the number of possible settings explodes.
49. Further concepts beyond the introduction
● Regularization (L1, L2, …)
● Cost (Loss) Function (exponential, cross-entropy, Hellinger, …)
● Activation Function (ReLU, hyperbolic tangent, sigmoid, …)
● Output layer (softmax, linear, …)
● Linear algebra using broadcasting
● Specialized layers (convolution, pooling, embedding, …)
● Dropout, masking, padding, …
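A few of the activation and output functions listed above are one-liners in NumPy. This is a minimal sketch, not framework code:

```python
import numpy as np

def relu(z):    return np.maximum(0.0, z)     # rectified linear unit
def tanh(z):    return np.tanh(z)             # hyperbolic tangent
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def softmax(z):                               # typical output layer for classification
    e = np.exp(z - z.max())                   # shift by max for numerical stability
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(relu(z))                                # → [0. 0. 2.]
print(softmax(z))                             # three probabilities summing to 1
```

The non-linearity is essential: without it, stacked layers collapse into a single matrix product, i.e. one linear map.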
50. The DL zoo
https://towardsdatascience.com/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464
51. Key concepts
1) An Artificial Neural Network is a function;
2) The training of an ANN is an optimization problem;
3) Gradient descent is the ultimate method to move along the error landscape.