This document presents two tensor factorization methods: Exponential Family Tensor Factorization (ETF) and Full-Rank Tensor Completion (FTC). ETF generalizes Tucker decomposition by allowing for different noise distributions in the tensor and handles mixed discrete and continuous values. FTC completes missing tensor values without reducing dimensionality by kernelizing Tucker decomposition. The document outlines these methods and their motivations, discusses Tucker decomposition, and provides an example applying ETF to anomaly detection in time series sensor data.
Generalization of Tensor Factorization and Applications
1. Generalization of Tensor Factorization and Applications
Kohei Hayashi
Collaborators:
T. Takenouchi, T. Shibata, Y. Kamiya, D. Kato, K. Kunieda, K. Yamada, K. Ikeda, R. Tomioka, H. Kashima
May 14, 2012
1 / 29
2. Relational data
Relational data is a collection of relationships among multiple objects.
• relationships of a pair ⇔ matrix
• relationships of a 3-tuple ⇔ 3-dim. array (tensor)
Relational data can be represented as a tensor with missing values.
5. Issue of tensor representation
Tensor representations are generally high-dimensional and large-scale.
• e.g. 1,000 users × 1,000 items × 1,000 times = 1,000,000,000 relationships in total
Dimensionality reduction techniques such as tensor factorization are therefore used.
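The size blow-up above is easy to make concrete. A minimal sketch (the rank K = 10 is a hypothetical choice for illustration, not a number from the slides) comparing full-tensor storage with Tucker-factor storage:

```python
# Storage for a full 1,000 x 1,000 x 1,000 tensor versus its Tucker
# factors: three D x K factor matrices plus a K x K x K core tensor.
D = 1000   # users = items = times = 1,000
K = 10     # hypothetical low rank, for illustration only

full_entries = D ** 3
tucker_entries = 3 * D * K + K ** 3

print(full_entries)    # 1000000000 -- one billion relationships
print(tucker_entries)  # 31000 -- about five orders of magnitude smaller
```

Factorization thus replaces a billion-entry array with a few tens of thousands of parameters, which is what makes the reduction worthwhile.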
6. Tensor factorization
Tucker decomposition [Tucker 1966] is a tensor factorization method assuming:
1. the observation noise is Gaussian;
2. the underlying tensor is low-dimensional.
These assumptions are generally reasonable ... but they are not always true.
7. Contributions
We generalize Tucker decomposition and propose two new models:
1. Exponential family tensor factorization (ETF)
[Joint work with Takenouchi, Shibata, Kamiya, Kunieda, Yamada, and Ikeda]
• Generalizes the noise distribution.
• Can handle a tensor containing mixed discrete and continuous values.
2. Full-rank tensor completion (FTC)
[Joint work with Tomioka and Kashima]
• Kernelizes Tucker decomposition.
• Completes missing values without reducing the dimensionality.
13. Vectorized form
Let x ∈ R^D and z ∈ R^K denote the vectorized X and Z, respectively. Then
x = Wz + ε, where W ≡ U^(3) ⊗ U^(2) ⊗ U^(1).
• D ≡ D1·D2·D3, K ≡ K1·K2·K3
• ⊗: the Kronecker product
Tucker decomposition = a linear Gaussian model:
x ∼ N(x | Wz, σ²I)
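The vectorization identity above can be checked numerically. A minimal NumPy sketch (sizes are arbitrary; column-major vectorization is assumed, which is what makes the Kronecker factors appear in the order U^(3) ⊗ U^(2) ⊗ U^(1)):

```python
import numpy as np

# Small, arbitrary dimensions for the data tensor X and core tensor Z.
D1, D2, D3 = 4, 5, 6
K1, K2, K3 = 2, 3, 2

rng = np.random.default_rng(0)
U1 = rng.standard_normal((D1, K1))
U2 = rng.standard_normal((D2, K2))
U3 = rng.standard_normal((D3, K3))
Z = rng.standard_normal((K1, K2, K3))

# Noise-free Tucker model: X[i,j,k] = sum_{a,b,c} U1[i,a] U2[j,b] U3[k,c] Z[a,b,c]
X = np.einsum('ia,jb,kc,abc->ijk', U1, U2, U3, Z)

# Vectorized form: x = W z with W = U3 kron U2 kron U1,
# using column-major (Fortran-order) vectorization.
W = np.kron(U3, np.kron(U2, U1))
x = W @ Z.flatten(order='F')

assert np.allclose(x, X.flatten(order='F'))
print(W.shape)  # (120, 12) -- i.e. (D1*D2*D3, K1*K2*K3)
```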
14. Rank of tensor
The dimensionalities of the core tensor are called the rank of the tensor.
• The rank of X is (K1, K2, K3).
The rank of a tensor represents its complexity:
• a low-rank tensor carries less information.
17. Motivation: heterogeneous tensor
Tucker decomposition assumes Gaussian noise
• not appropriate for heterogeneous data such as mixed discrete and continuous values.
Approach: generalize Tucker decomposition, assuming a different distribution for each element.
18. Exponential family tensor factorization
Likelihood:
x ∼ ∏_{d=1}^{D} Expon_d(x_d | θ_d), θ ≡ Wz (an exponential-family Tucker decomposition)
where Expon(x | θ) ≡ exp[xθ − ψ(θ) + F(x)].
The exponential family is a class of distributions:
[Figure: densities at θ = 0 for the Gaussian, with ψ(θ) = θ²/2; the Poisson, with ψ(θ) = exp[θ]; and the Bernoulli, with ψ(θ) = ln(1 + exp[θ]).]
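The ψ(θ) functions above can be exercised directly. A minimal sketch (function and dict names are mine; the base-measure term F(x) is dropped, so only the Bernoulli check below, where F(x) = 0, is an exact normalization test):

```python
import numpy as np

# Log-partition functions psi(theta) for the three exponential-family
# members on the slide: Gaussian, Poisson, Bernoulli.
psi = {
    'gaussian':  lambda t: t ** 2 / 2,          # unit-variance Gaussian
    'poisson':   lambda t: np.exp(t),
    'bernoulli': lambda t: np.log1p(np.exp(t)),
}

def log_density(x, theta, family):
    """log p(x | theta) = x*theta - psi(theta)  (the F(x) term is omitted)."""
    return x * theta - psi[family](theta)

theta = 0.7
# Bernoulli: probabilities of x = 0 and x = 1 must sum to one...
p = np.exp([log_density(0, theta, 'bernoulli'),
            log_density(1, theta, 'bernoulli')])
assert np.isclose(p.sum(), 1.0)
# ...and the mean is psi'(theta), i.e. the sigmoid of theta.
assert np.isclose(p[1], 1 / (1 + np.exp(-theta)))
```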
19. Priors and joint log-likelihood
Priors:
• for z: a Gaussian prior N(0, I)
• for U^(m): a Gaussian prior N(0, α_m⁻¹ I)
Joint log-likelihood:
L = xᵀWz − Σ_{d=1}^{D} ψ_{h_d}(w_dᵀz) − (1/2)||z||² − Σ_{m=1}^{M} (α_m/2) ||U^(m)||²_Fro + const.
Parameters are estimated by Bayesian inference.
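The objective L above translates almost line by line into code. A minimal sketch (all names are mine; psi_list holds one ψ_{h_d} per tensor element, reflecting that each element may follow a different family):

```python
import numpy as np

def joint_log_likelihood(x, W, z, U_list, alpha, psi_list):
    """L = x^T W z - sum_d psi_{h_d}(w_d^T z)
           - ||z||^2 / 2 - sum_m (alpha_m / 2) ||U^(m)||_Fro^2   (+ const.)"""
    theta = W @ z                                    # natural parameter per element
    L = x @ theta
    L -= sum(p(t) for p, t in zip(psi_list, theta))  # element-wise log-partitions
    L -= 0.5 * (z @ z)                               # Gaussian prior on z
    L -= sum(a / 2.0 * np.sum(U ** 2)                # Gaussian priors on U^(m)
             for a, U in zip(alpha, U_list))
    return L

# Tiny example: two Gaussian elements and one Poisson element.
x = np.array([0.5, -1.0, 2.0])
psis = [lambda t: t ** 2 / 2, lambda t: t ** 2 / 2, np.exp]
val = joint_log_likelihood(x, np.eye(3), np.zeros(3), [np.eye(3)], [0.1], psis)
print(round(float(val), 2))  # -1.15
```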
20. Bayesian inference
Marginal-MAP estimator:
argmax_{U^(1),…,U^(M)} ∫ exp[L(z, U^(1), …, U^(M))] dz
• The integral is not analytically tractable.
We develop an efficient yet accurate approximation using the Laplace approximation and Gaussian processes.
• Computational cost is still higher than Tucker decomposition.
• For a long and thin tensor (e.g. a time series), an online algorithm is applicable (see thesis).
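The Laplace step can be illustrated on a toy integral. A minimal sketch (my own toy code, not the thesis algorithm; for a Gaussian integrand the Laplace approximation is exact, which gives something to check against):

```python
import numpy as np

def laplace_log_integral(L, grad, hess, z0, steps=200, lr=0.1):
    """Approximate log ∫ exp[L(z)] dz by Laplace's method: climb to the
    mode z_hat by gradient ascent, then replace exp[L] by the Gaussian
    matching its curvature H = L''(z_hat) at the mode."""
    z = z0
    for _ in range(steps):
        z = z + lr * grad(z)           # gradient ascent to the mode
    K = np.size(z)
    H = np.atleast_2d(hess(z))
    _, logdet = np.linalg.slogdet(-H)  # -H must be positive definite at a mode
    return L(z) + 0.5 * K * np.log(2 * np.pi) - 0.5 * logdet

# Sanity check on L(z) = -z^2 / 2, where ∫ exp[L] dz = sqrt(2*pi) exactly.
approx = laplace_log_integral(L=lambda z: -0.5 * z ** 2,
                              grad=lambda z: -z,
                              hess=lambda z: -1.0,
                              z0=np.array(3.0))
assert np.isclose(approx, 0.5 * np.log(2 * np.pi))
```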
22. Anomaly detection by ETF
Purpose: find irregular parts of tensor data.
Method: apply the distance-based outlier criterion DB(p, D) [Knorr+ VLDB'00] to the estimated factor matrix U^(m).
Definition of DB(p, D):
"An object O is a DB(p, D) outlier if at least a fraction p of the objects lie at a distance greater than D from O."
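The DB(p, D) rule above is short enough to implement directly. A minimal sketch (the data, p, and D are hypothetical; the rows stand in for the rows of an estimated factor matrix U^(m)):

```python
import numpy as np

def db_outliers(points, p, D):
    """DB(p, D) outliers [Knorr+ VLDB'00]: an object is an outlier if at
    least a fraction p of the other objects lie at distance > D from it."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    n = len(points)
    frac_far = (dist > D).sum(axis=1) / (n - 1)  # self-distance 0 never counts
    return frac_far >= p

# Rows of a hypothetical factor matrix: a tight cluster plus one anomaly.
rng = np.random.default_rng(1)
rows = np.vstack([rng.normal(0.0, 1.0, size=(20, 2)),  # regular behaviour
                  [[8.0, 8.0]]])                       # irregular row
flags = db_outliers(rows, p=0.9, D=4.0)
print(np.flatnonzero(flags))  # [20] -- only the anomalous row is flagged
```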
23. Data set
A time series of multi-sensor measurements:
• Each sensor recorded the behavior (e.g. position) of researchers in an NEC lab for 8 months.
• 6 (sensors) × 21 (persons) × 1,927 (hours)

    Sensor                Type          Min      Max
X1  # of sent emails      Count         0.00     14.00
X2  # of received emails  Count         0.00     15.00
X3  # of typed keys       Count         0.00     50422.00
X4  X coordinate          Real          −550.17  3444.15
X5  Y coordinate          Real          128.71   2353.55
X6  Movement distance     Non-negative  0.00     203136.96
24. Evaluation
A list of irregular events is provided.
Examples of irregular events:
Date    Time         Description
Dec 21  All day      Private incident
Dec 22  15:00–16:00  Monthly seminar
Jun 15  13:00        Visiting tour
...
• Evaluated as a binary classification task
• 50% of the values are missing; 10 trials
• ETF applied with the online algorithm