This presentation covers the basics of probability theory and the Naive Bayes classifier, with practical examples. It also introduces graphical models for representing joint probability distributions.
In this presentation we describe the formulation of the HMM as consisting of hidden states that generate the observables. We introduce the three basic problems: the evaluation problem of finding the probability of a sequence of observations given the model, the decoding problem of finding the hidden states given the observations and the model, and the training problem of determining the model parameters that generate the given observations. We discuss the Forward, Backward, Viterbi and Forward-Backward algorithms.
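The evaluation problem can be made concrete with a minimal Python sketch of the forward algorithm for a toy two-state HMM; the transition, emission, and initial probabilities below are illustrative, not taken from the slides:

```python
# Forward algorithm for the evaluation problem: P(O | model).
# Toy 2-state HMM with 2 output symbols; all numbers are illustrative.
A  = [[0.7, 0.3],   # A[i][j]: transition probability from state i to j
      [0.4, 0.6]]
B  = [[0.9, 0.1],   # B[i][k]: probability that state i emits symbol k
      [0.2, 0.8]]
pi = [0.6, 0.4]     # initial state distribution

def forward(obs):
    """Sum over all hidden state paths via the alpha recursion."""
    alpha = [pi[i] * B[i][obs[0]] for i in range(2)]
    for o in obs[1:]:
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(2))
                 for j in range(2)]
    return sum(alpha)

print(forward([0, 1, 0]))  # probability of observing the sequence 0, 1, 0
```

Summing `forward` over every possible observation sequence of a fixed length yields 1, which is a handy sanity check on an implementation.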
This presentation discusses the concept of Language Models in Natural Language Processing. N-gram models and Markov chains are discussed, along with smoothing techniques such as add-1 smoothing, interpolation and discounting.
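Add-1 (Laplace) smoothing fits in a few lines. The sketch below computes smoothed bigram probabilities over a toy corpus; the corpus and vocabulary are illustrative:

```python
from collections import Counter

# Add-1 (Laplace) smoothed bigram model; the corpus is illustrative.
tokens = "the cat sat on the mat the cat ran".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
V = len(unigrams)  # vocabulary size

def p_add1(w_prev, w):
    # P(w | w_prev) = (count(w_prev, w) + 1) / (count(w_prev) + V)
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + V)

print(p_add1("the", "cat"))  # a seen bigram
print(p_add1("cat", "on"))   # an unseen bigram still gets nonzero mass
```

Adding 1 to every count moves probability mass from seen to unseen bigrams, which is exactly the behavior the smoothed distribution needs.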
This presentation is part of the ML course and deals with basic concepts such as the different types of learning, definitions of classification and regression, decision surfaces, etc. The slide set also outlines the Perceptron Learning algorithm as a starter for the more complex models that follow in the rest of the course.
A deep dive into the world of word vectors. We will cover the bigram model, Skip-gram, CBOW and GloVe. Starting from the simplest models, we will journey through key results and ideas in this area.
This presentation discusses state space problem formulation and different search techniques to solve such problems. Techniques such as Breadth First Search, Depth First Search, Uniform Cost Search and the A* algorithm are covered with examples. We also discuss where such techniques are useful and their limitations.
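As one concrete instance of these techniques, here is a minimal A* search sketch over a small hypothetical graph; the graph, edge costs, and heuristic values are made up for illustration:

```python
import heapq

# A* search on a tiny illustrative graph; h is an admissible heuristic
# (it never overestimates the true remaining cost to the goal).
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
h = {"A": 3, "B": 2, "C": 1, "D": 0}  # made-up admissible estimates

def astar(start, goal):
    # Each frontier entry is (f = g + h, g, node, path so far).
    frontier = [(h[start], 0, start, [start])]
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in best_g and best_g[node] <= g:
            continue  # already expanded with a cheaper cost
        best_g[node] = g
        for nxt, cost in graph[node]:
            heapq.heappush(frontier,
                           (g + cost + h[nxt], g + cost, nxt, path + [nxt]))
    return None

print(astar("A", "D"))  # cheapest path A -> B -> C -> D with cost 4
```

Setting every heuristic value to zero turns the same code into Uniform Cost Search.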
MS CS - Selecting Machine Learning Algorithm (Kaniska Mandal)
ML algorithms usually solve an optimization problem: we need to find the parameters of a given model that minimize a combination of
— Loss function (prediction error)
— Model complexity (regularization)
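A concrete instance of this trade-off is ridge regression, where a squared-error loss is combined with an L2 penalty on the weights. The sketch below uses the closed-form solution; the data and regularization strength are illustrative:

```python
import numpy as np

# Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
# Closed form: w* = (X^T X + lam I)^-1 X^T y. Data is synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)

def ridge_fit(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w = ridge_fit(X, y, lam=0.1)
print(w)  # close to true_w; a larger lam shrinks the weights toward zero
```

The `lam` knob directly trades prediction error against model complexity: larger values give smaller weights and a simpler, possibly underfit, model.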
Algorithm Design and Complexity - Courses 1 & 2 (Traian Rebedea)
Courses 1 & 2 for the Algorithm Design and Complexity course at the Faculty of Engineering in Foreign Languages - Politehnica University of Bucharest, Romania
In the quest to improve the quality of education, Flexudy leverages the power of AI to help people learn more efficiently.
During the talk, I will show how we trained an automatic extractive text summarizer based on concepts from Reinforcement Learning, Deep Learning and Natural Language Processing. I will also talk about how we use pre-trained NLP models to generate simple questions for self-assessment.
Algorithm Design and Complexity - Course 4 - Heaps and Dynamic Programming (Traian Rebedea)
Course 4 for the Algorithm Design and Complexity course at the Faculty of Engineering in Foreign Languages - Politehnica University of Bucharest, Romania
This presentation is about how global terminology can evolve without a centralized organisation. The simple idea is that everybody has to disclose the identity of at least two identifiers for the same thing. These local semantic handshakes will have the effect of global terminological alignment.
Speaker: Youngsam Kim (PhD, Seoul National University)
Date: August 2018
With the 2015 Atari game control task and AlphaGo's 2016 victory over the world's best Go players, reinforcement learning captured the attention of many machine learning researchers, yet in natural language processing no clear strategy for applying it has emerged. This talk identifies the "reward sparsity" problem as one of the main reasons reinforcement learning is hard to apply to NLP, and discusses model-based reinforcement learning and memory-based approaches as possible remedies. To illustrate this possibility, we carried out tasks measuring the sentiment values of words and the state values of medical statements using temporal-difference learning, and we discuss the methods and their implications based on these experiments.
Deep Learning Paper Review: DirectCLR (taeseon ryu)
In deep learning image classification there are self-supervision methods: training without labels via pretext tasks such as context prediction or jigsaw puzzles. These self-supervision tasks, however, suffer from a chronic problem called dimensional collapse, where the learned representations occupy only a subset of the embedding dimensions rather than all of them. Contrastive learning, a form of self-supervision that pulls positive pairs together and pushes negative pairs apart, might intuitively seem robust to dimensional collapse, but it turns out not to be. We introduce DirectCLR, a method proposed to address this problem.
From the background of the paper to a detailed explanation of DirectCLR, Jaeyoon Lee of the Fundamentals Team kindly provided this detailed review.
Thank you in advance for your interest!
Words and sentences are the basic units of text. In this lecture we discuss the basics of operations on words and sentences, such as tokenization, text normalization, tf-idf, cosine similarity measures, vector space models and word representation.
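A minimal sketch of tf-idf vectors and cosine similarity, using toy documents (all text below is illustrative):

```python
import math
from collections import Counter

# Toy tf-idf vectors and cosine similarity; the documents are illustrative.
docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are pets"]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})
N = len(docs)
df = {w: sum(w in doc for doc in tokenized) for w in vocab}  # document frequency

def tfidf(doc):
    tf = Counter(doc)
    return [tf[w] * math.log(N / df[w]) for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = [tfidf(doc) for doc in tokenized]
print(cosine(vecs[0], vecs[1]))  # docs 0 and 1 share several terms
print(cosine(vecs[0], vecs[2]))  # no shared terms, so similarity is zero
```

Note that "cat" and "cats" get no credit here: without normalization (stemming or lemmatization), surface-form mismatches defeat exact-match term weighting, which is one motivation for the normalization steps covered in the lecture.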
Deep Learning For Practitioners, lecture 2: Selecting the right applications... (ananth)
In this presentation we articulate when deep learning techniques yield the best results from a practitioner's viewpoint. Should we apply deep learning techniques to every machine learning problem? What characteristics of an application lend themselves to deep learning? Does more data automatically imply better results regardless of the algorithm or model? Does "automated feature learning" obviate the need for data preprocessing and feature design?
Overview of TensorFlow For Natural Language Processing (ananth)
TensorFlow, recently open-sourced by Google, is one of the key frameworks that support the development of deep learning architectures. In this slide set, part 1, we get started with a few basic primitives of TensorFlow. We also discuss when and when not to use TensorFlow.
Natural Language Processing: L01 introduction (ananth)
This presentation introduces the course Natural Language Processing (NLP) by enumerating a number of applications, course positioning, challenges presented by Natural Language text and emerging approaches to topics like word representation.
This presentation discusses decision trees as a machine learning technique, introducing the problem with several examples: cricket player selection, medical C-section diagnosis and a mobile phone price predictor. It presents the ID3 algorithm and discusses how the decision tree is induced. The definition and use of concepts such as Entropy and Information Gain are discussed.
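The entropy and information gain computations at the heart of ID3 can be sketched as follows; the toy weather-style dataset is illustrative, not one of the examples from the slides:

```python
import math
from collections import Counter

# Entropy and information gain as used by ID3 to pick the best split.
# Each row is (attribute value, class label); the data is illustrative.
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
        ("rainy", "yes"), ("rainy", "yes"), ("rainy", "no"),
        ("overcast", "yes"), ("sunny", "no")]

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(data):
    labels = [y for _, y in data]
    base = entropy(labels)          # entropy before the split
    remainder = 0.0                 # weighted entropy after the split
    for value in {x for x, _ in data}:
        subset = [y for x, y in data if x == value]
        remainder += len(subset) / len(data) * entropy(subset)
    return base - remainder

print(entropy([y for _, y in data]))  # 1.0 for a 4-yes / 4-no split
print(info_gain(data))
```

ID3 evaluates `info_gain` for every candidate attribute and splits on the one with the highest gain, recursing on each branch.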
This slide set on convolutional neural networks is meant to be supplementary material to the slides from Andrej Karpathy's course. In this slide set we explain the motivation for CNNs and describe how to understand a CNN coming from a standard feed-forward neural network perspective. For detailed architecture and discussions refer to the original slides. I might post more detailed slides later.
In this presentation we discuss several concepts that include Word Representation using SVD as well as neural networks based techniques. In addition we also cover core concepts such as cosine similarity, atomic and distributed representations.
Deep Learning techniques have enabled exciting novel applications. Recent advances hold lot of promise for speech based applications that include synthesis and recognition. This slideset is a brief overview that presents a few architectures that are the state of the art in contemporary speech research. These slides are brief because most concepts/details were covered using the blackboard in a classroom setting. These slides are meant to supplement the lecture.
Operations Management Study in Textured Jersey Lanka Limited (Hansi Thenuwara)
Textured Jersey is renowned for the production of superior quality knit fabrics. The company's history goes back nearly half a century, when its British roots gave it a solid grounding in the textile industry. Textured Jersey saw ownership change hands in the year 2000, and this modest operation entered a new era in Sri Lanka.
Deep learning is receiving phenomenal attention due to breakthrough results in several AI tasks and significant research investment by top technology companies like Google, Facebook, Microsoft and IBM. For someone who has not been introduced to this technology, it may be daunting to learn several concepts such as feature learning, Restricted Boltzmann Machines, Autoencoders, etc. all at once and start applying them to their own AI applications. This presentation is the first of several in this series intended for practitioners.
The discrete or atomic representation of words doesn't scale well to support rich semantics. Distributed representations associate a word with a vector based on the contexts in which the word occurs. In this presentation we describe the problem of word representation with a few illustrations and describe the approach taken by word2vec. We also discuss the limitations of a static database approach.
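One count-based route to distributed representations, often contrasted with word2vec, is SVD of a word-word co-occurrence matrix. A minimal sketch with a tiny illustrative corpus:

```python
import numpy as np

# Distributed word vectors via SVD of a word-word co-occurrence matrix.
# The tiny corpus and window size are illustrative.
corpus = "I like deep learning . I like NLP . I enjoy flying .".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):          # symmetric window of size 1
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            C[idx[w], idx[corpus[j]]] += 1

U, S, Vt = np.linalg.svd(C)
vectors = U[:, :2] * S[:2]              # keep 2 dimensions as the embedding

def cos(a, b):
    va, vb = vectors[idx[a]], vectors[idx[b]]
    return va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))

print(cos("like", "enjoy"))  # words appearing in similar contexts
```

Words that co-occur with the same neighbors (here, "like" and "enjoy" both follow "I") end up with similar rows in C and therefore nearby embeddings, which is the distributional hypothesis in miniature.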
Development of a website related to the tourism field in Sri Lanka, which provides tourism and related service information, advertising space for service providers, and space for users to share their experiences.
Mathematical Background for Artificial Intelligence (ananth)
Mathematical background is essential for understanding and developing AI and Machine Learning applications. In this presentation we give a brief tutorial that encompasses basic probability theory, distributions, mixture models, anomaly detection, graphical representations such as Bayesian Networks, etc.
From Data to Knowledge in Mons, Belgium: "Taken from the cover of Atlas of Science, Katy Borner (MIT Press, 2010), where one reads: Noise becomes data when it has a cognitive pattern. Data becomes information when assembled into a coherent whole, which can be related to other information. Information becomes knowledge when integrated with other information in a form useful for making decisions and determining actions. Knowledge becomes understanding when related to other knowledge in a manner useful in anticipating, judging and acting. Understanding becomes wisdom when informed by purpose, ethics, principles, memory and projection. George Santayana"
Random variables and probability distributions - Random Va.docx (catheryncouper)
Random variables and probability distributions
Random Variable
The outcome of an experiment need not be a number, for example, the outcome when a
coin is tossed can be 'heads' or 'tails'. However, we often want to represent outcomes
as numbers. A random variable is a function that associates a unique numerical value
with every outcome of an experiment. The value of the random variable will vary from
trial to trial as the experiment is repeated.
There are two types of random variable - discrete and continuous.
A random variable has either an associated probability distribution (discrete random
variable) or probability density function (continuous random variable).
Examples
1. A coin is tossed ten times. The random variable X is the number of tails that are
noted. X can only take the values 0, 1, ..., 10, so X is a discrete random variable.
2. A light bulb is burned until it burns out. The random variable Y is its lifetime in
hours. Y can take any positive real value, so Y is a continuous random variable.
Expected Value
The expected value (or population mean) of a random variable indicates its average or
central value. It is a useful summary value (a number) of the variable's distribution.
Stating the expected value gives a general impression of the behaviour of some random
variable without giving full details of its probability distribution (if it is discrete) or its
probability density function (if it is continuous).
Two random variables with the same expected value can have very different
distributions. There are other useful descriptive measures which affect the shape of the
distribution, for example variance.
The expected value of a random variable X is symbolised by E(X) or µ.
If X is a discrete random variable with possible values x1, x2, x3, ..., xn, and p(xi)
denotes P(X = xi), then the expected value of X is defined by:
E(X) = x1 p(x1) + x2 p(x2) + ... + xn p(xn) = Σ xi p(xi)
where the elements are summed over all values of the random variable X.
If X is a continuous random variable with probability density function f(x), then the
expected value of X is defined by:
E(X) = ∫ x f(x) dx
where the integral is taken over the whole range of X.
Example
Discrete case : When a die is thrown, each of the possible faces 1, 2, 3, 4, 5, 6 (the xi's)
has a probability of 1/6 (the p(xi)'s) of showing. The expected value of the face showing
is therefore:
µ = E(X) = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = 3.5
Notice that, in this case, E(X) is 3.5, which is not a possible value of X.
See also sample mean.
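The die computation above is easy to verify programmatically (a quick sketch):

```python
from fractions import Fraction

# E(X) = sum of x * p(x) over the faces of a fair die.
faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)
expected = sum(x * p for x in faces)
print(float(expected))  # 3.5, which is not itself a possible value of X
```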
Variance
The (population) variance of a random variable is a non-negative number which gives
an idea of how widely spread the values of the random variable are likely to be; the
larger the variance, the more scattered the observations are likely to be.
Last time we talked about propositional logic, a logic of simple statements.
This time we will talk about first-order logic, a logic of quantified statements.
First-order logic is much more expressive than propositional logic.
The topics on first-order logic are:
1-Quantifiers
2-Negation
3-Multiple quantifiers
4-Arguments of quantified statements
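For the negation topic, the key equivalences can be stated compactly (the standard De Morgan laws for quantifiers):

```latex
\neg \forall x\, P(x) \;\equiv\; \exists x\, \neg P(x) \qquad
\neg \exists x\, P(x) \;\equiv\; \forall x\, \neg P(x)
```

For example, "not every integer is even" is equivalent to "some integer is not even".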
Generative Adversarial Networks: Basic architecture and variants (ananth)
In this presentation we review the fundamentals behind GANs and look at different variants. We quickly review the theory such as the cost functions, training procedure, challenges and go on to look at variants such as CycleGAN, SAGAN etc.
Convolutional Neural Networks: Popular Architectures (ananth)
In this presentation we look at some of the popular architectures, such as ResNet, that have been successfully used for a variety of applications. Starting from AlexNet and VGG, which showed that deep learning architectures can deliver unprecedented accuracy for image classification and localization tasks, we review more recent architectures such as ResNet, GoogLeNet (Inception) and SENet that have won ImageNet competitions.
Artificial Neural Networks have been very successfully used in several machine learning applications. They are often the building blocks when building deep learning systems. We discuss the hypothesis, training with backpropagation, update methods, regularization techniques.
In this presentation we discuss the convolution operation, the architecture of a convolution neural network, different layers such as pooling etc. This presentation draws heavily from A Karpathy's Stanford Course CS 231n
Artificial Intelligence Course: Linear models (ananth)
In this presentation we present the linear models, regression and classification, illustrated with several examples. Concepts such as underfitting (bias) and overfitting (variance) are presented. Linear models can be used as standalone classifiers for simple cases, and they are essential building blocks of larger deep learning networks.
The Naive Bayes classifier is a machine learning technique that is exceedingly useful for addressing several classification problems. It is often used as a baseline classifier to benchmark results. It is also used as a standalone classifier for tasks such as spam filtering, where the naive assumption (conditional independence) made by the classifier seems reasonable. In this presentation we discuss the mathematical basis of Naive Bayes and illustrate it with examples.
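A minimal multinomial Naive Bayes spam filter with add-1 smoothing, sketched in Python; the training messages are invented for illustration and are not from the presentation:

```python
import math
from collections import Counter, defaultdict

# Multinomial Naive Bayes with Laplace smoothing; training data is illustrative.
train = [("buy cheap pills now", "spam"),
         ("cheap meds buy now", "spam"),
         ("meeting schedule for tomorrow", "ham"),
         ("lunch tomorrow with the team", "ham")]

class_docs = defaultdict(list)
for text, label in train:
    class_docs[label].extend(text.split())
vocab = {w for words in class_docs.values() for w in words}
priors = {c: sum(1 for _, l in train if l == c) / len(train) for c in class_docs}

def score(text, c):
    # log P(c) + sum of log P(word | c), assuming conditional independence
    counts = Counter(class_docs[c])
    total = len(class_docs[c])
    s = math.log(priors[c])
    for w in text.split():
        s += math.log((counts[w] + 1) / (total + len(vocab)))
    return s

def classify(text):
    return max(class_docs, key=lambda c: score(text, c))

print(classify("cheap pills"))
print(classify("team lunch"))
```

Working in log space avoids underflow from multiplying many small probabilities, and the add-1 smoothing keeps unseen words from zeroing out a class.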
This is the first lecture of the AI course offered by me at PES University, Bangalore. In this presentation we discuss the different definitions of AI, the notion of Intelligent Agents, distinguish an AI program from a complex program such as those that solve complex calculus problems (see the integration example) and look at the role of Machine Learning and Deep Learning in the context of AI. We also go over the course scope and logistics.
This is the first lecture on Applied Machine Learning. The course focuses on the emerging and modern aspects of this subject such as Deep Learning, Recurrent and Recursive Neural Networks (RNN), Long Short Term Memory (LSTM), Convolution Neural Networks (CNN), Hidden Markov Models (HMM). It deals with several application areas such as Natural Language Processing, Image Understanding etc. This presentation provides the landscape.
Recurrent Neural Networks have been shown to be very powerful models, as they can propagate context over several time steps. They can therefore be applied effectively to several problems in Natural Language Processing, such as language modelling, tagging problems, speech recognition, etc. In this presentation we introduce the basic RNN model and discuss the vanishing gradient problem. We describe LSTM (Long Short Term Memory) and Gated Recurrent Units (GRU), and discuss Bidirectional RNNs with an example. RNN architectures can be considered deep learning systems where the number of time steps is the depth of the network. It is also possible to build an RNN with multiple hidden layers, each having recurrent connections from the previous time step, representing abstraction both in time and in space.
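The basic RNN recurrence h_t = tanh(W_xh x_t + W_hh h_(t-1) + b) can be sketched in a few lines of NumPy; the dimensions and random weights below are illustrative:

```python
import numpy as np

# One forward pass of a vanilla RNN: the hidden state h carries context
# from step to step. Shapes and weights are illustrative.
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
Wxh = rng.normal(scale=0.1, size=(d_h, d_in))  # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden-to-hidden (recurrent)
bh = np.zeros(d_h)

def rnn_forward(xs):
    h = np.zeros(d_h)
    states = []
    for x in xs:                    # one step per input vector
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        states.append(h)
    return states

seq = [rng.normal(size=d_in) for _ in range(5)]
hs = rnn_forward(seq)
print(len(hs), hs[-1].shape)
```

Because each step multiplies gradients through Whh and the tanh derivative, backpropagating through many such steps is exactly where the vanishing gradient problem, and the motivation for LSTM/GRU gating, arises.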
In this presentation we discuss the hypothesis of MaxEnt models, describe the role of feature functions and their applications to Natural Language Processing (NLP). The training of the classifier is discussed in a later presentation.
Providing Globus Services to Users of JASMIN for Environmental Data Analysis (Globus)
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
A Comprehensive Look at Generative AI in Retail App Testing.pdf (kalichargn70th171)
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... (Shahin Sheidaei)
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... (Globus)
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G... (Globus)
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Top Nidhi Software Solution Free Download (vrstrong314)
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
Large Language Models and the End of Programming (Matt Welsh)
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Paketo Buildpacks: the best way to build OCI images? DevopsDa... (Anthony Dahanne)
Buildpacks have existed for more than 10 years! At first they were used to detect and build an application before deploying it to certain PaaS platforms. Later, with their latest generation, Cloud Native Buildpacks (a CNCF incubating project), we could create Docker (OCI) images. Are they a good alternative to a Dockerfile? What are the Paketo buildpacks? Which communities support them, and how?
Come find out in this ignite session.
Quarkus Hidden and Forbidden Extensions (Max Andersen)
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
SOCRadar Research Team: Latest Activities of IntelBroker (SOCRadar)
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled what happened over the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar's Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
How to Position Your Globus Data Portal for Success: Ten Good Practices (Globus)
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
2. Why Probability for NLP?
• Consider the following example scenarios:
• You are travelling in an autorickshaw on a busy road in Bangalore
and are on a call with your friend.
• We are watching a Hollywood English film. Often we may not
understand exactly every word that is spoken, either due to the
accent of the speaker or because the word is slang that not
everyone outside the context can relate to.
• We are reading tweets that are cryptic, with several misspelled
words, emoticons, hashtags and so on.
• The commonality in all the above cases is the presence of noise along
with the signal
• The noise or ambiguities result in uncertainty of interpretation
• For an NLP application to process such text, we need
appropriate mathematical machinery.
• Probability theory is our tool for handling such cases.
3. What is a Random Variable?
• A is a Boolean-valued RV if A denotes an event
and there is some degree of uncertainty as to
whether A occurs.
• Example: It will rain in Manchester during the 4th
Cricket test match between India and England
• The probability of A is the fraction of possible
worlds in which A is true
• In the slide's diagram, the enclosing rectangle of
all possible worlds has area 1; the shaded region
covers the worlds where A is true, and the
remainder covers the worlds where A is false.
• A Random Variable is not a variable in the
traditional sense. It is rather a function
mapping outcomes to values.
4. Axioms of Probability
The following axioms always hold:
• 0 <= P(A) <= 1
• P(True) = 1
• P(False) = 0
• P(A or B) = P(A) + P(B) – P(A and B)
Note: We can represent the above diagrammatically and verify these
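The axioms can also be verified numerically on a toy sample space. The sketch below uses a fair six-sided die (a hypothetical example invented for illustration, not from the slides) and checks the inclusion-exclusion axiom exactly with rational arithmetic:

```python
from fractions import Fraction

# Toy sample space: a fair six-sided die. Each outcome has probability 1/6.
outcomes = range(1, 7)
p = {w: Fraction(1, 6) for w in outcomes}

A = {w for w in outcomes if w % 2 == 0}   # event A: even roll
B = {w for w in outcomes if w >= 4}       # event B: roll of 4 or more

def prob(event):
    # P(event) = sum of probabilities of the worlds where it is true
    return sum(p[w] for w in event)

assert prob(set(outcomes)) == 1           # P(True) = 1
assert prob(set()) == 0                   # P(False) = 0
# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
p_or = prob(A | B)
assert p_or == prob(A) + prob(B) - prob(A & B)
print(p_or)  # prints 2/3
```

Using `Fraction` keeps every probability exact, so the axiom checks are equalities rather than floating-point approximations.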
5. Multivalued Discrete Random Variables
Examples of multivalued RVs
• Number of hashtags in a Tweet
• Number of URLs in a tweet
• Number of tweets that mention the
term iPhone in a given day
• Number of tweets sent by Times Now
channel per day
• The value a word position can take in a
natural language sentence over a
given vocabulary
6. Random Variable Example
• We can treat the number of hashtags contained in a given tweet (h) as a Random
Variable because:
• A tweet has a non-negative number of hashtags: 0 to n, where n is the maximum number of
hashtags bounded by the size of the tweet. For example, if a tweet has 140 characters and if
the minimum string length required to specify a hashtag is 3 (one # character, one character
for the tag, one delimiter), then n = floor(140 / 3) = 46
• As we observe, h is a discrete RV taking values: 0 <= h <= 46
• Thus the probabilities over hashtag counts in a tweet satisfy:
0 <= P(h = k) <= 1 for each value k
P(h = 0 ∨ h = 1 ∨ h = 2 … ∨ h = 46) = 1, where h is the number of hashtags in the tweet
P(h = hi, h = hj) = 0 when i ≠ j, since hi and hj are 2 different values of the hashtag count and so are mutually exclusive
• If we are given a set of tweets, e.g., 500, we can compute the average number of
hashtags per tweet. This is real valued and hence a continuous variable.
• We can find the probability distribution for the continuous variable also (see later
slides)
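The hashtag count h can be estimated empirically from a corpus. Below is a minimal sketch on an invented mini-corpus (the slides use real tweet datasets that are not reproduced here); the regex `#\w+` is an assumed, simplified definition of a hashtag:

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration
tweets = [
    "#budget2024 announced today",
    "No tags in this one",
    "#rail #budget highlights #india",
    "Watching the session live",
]

def hashtag_count(tweet):
    # h = number of hashtags in the tweet (the discrete RV from the slide)
    return len(re.findall(r"#\w+", tweet))

counts = Counter(hashtag_count(t) for t in tweets)
n = len(tweets)
# Empirical estimate of P(h = k): fraction of tweets with exactly k hashtags
p_h = {k: c / n for k, c in counts.items()}

assert abs(sum(p_h.values()) - 1.0) < 1e-9   # probabilities sum to 1
assert p_h == {1: 0.25, 0: 0.5, 3: 0.25}
```

The average of `hashtag_count` over the corpus is the continuous-valued quantity mentioned in the last bullet above.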
7. Joint Distribution of Discrete Random Variables
• Consider 2 RVs X and Y, where X and Y
can take discrete values. The joint
distribution is given by:
P(X = x, Y = y)
The above satisfies:
1. P(X, Y) >= 0
2. Σi Σj P(X = xi, Y = yj) = 1, where the
summation runs over all i and all j
This concept can be generalized to n
variables
Sample per-tweet counts:
HASHTAG  RT  URL  SNAME
   0      0    1     0
   3      0    1     1
   1      0    0     0
   0      0    0     0
   0      0    1     0
   0      0    0     0
   1      0    0     1
   0      0    0     0
   0      0    0     0
   0      0    0     0
   1      0    1     0
   1      1    0     0
   1      0    1     0
   1      0    0     0
   0      0    1     0
   3      0    0     0
   0      1    0     0
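An empirical joint distribution can be computed directly from the table above. The sketch below tabulates P(HASHTAG, URL) from the 17 rows and checks the two defining properties from this slide:

```python
from collections import Counter
from fractions import Fraction

# The per-tweet counts from the slide's table: (HASHTAG, RT, URL, SNAME)
rows = [
    (0, 0, 1, 0), (3, 0, 1, 1), (1, 0, 0, 0), (0, 0, 0, 0),
    (0, 0, 1, 0), (0, 0, 0, 0), (1, 0, 0, 1), (0, 0, 0, 0),
    (0, 0, 0, 0), (0, 0, 0, 0), (1, 0, 1, 0), (1, 1, 0, 0),
    (1, 0, 1, 0), (1, 0, 0, 0), (0, 0, 1, 0), (3, 0, 0, 0),
    (0, 1, 0, 0),
]

# Empirical joint distribution of X = HASHTAG and Y = URL
joint = Counter((h, u) for h, _, u, _ in rows)
n = len(rows)
p_joint = {xy: Fraction(c, n) for xy, c in joint.items()}

assert all(p >= 0 for p in p_joint.values())   # 1. P(X, Y) >= 0
assert sum(p_joint.values()) == 1              # 2. sums to 1 over all i, j
print(p_joint[(0, 1)])  # fraction of tweets with 0 hashtags and 1 URL
```

The same `Counter` over longer tuples generalizes this to the joint distribution of all four variables.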
8. Frequency Distributions for our Tweet Corpora
Some Observations
• The frequency distributions show distinguishable
patterns across the different corpora. For example, in
the case of the Union budget, the number of tweets that
have one or more hashtags is roughly 50% of total
tweets, while that for the Railway budget is roughly 25%
• The expectation values of
hashtags/retweets/URLs/screen names per tweet
can be treated as a vector of real-valued Random
Variables and used for classification.
[Charts: frequency distributions (tweet counts vs. number of occurrences, 0–10) for three corpora — Dataset 10 (Tweets on General Elections), Dataset 1 (Tweets on Railway Budget), Dataset 19 (Tweets on Union Budget); series: htfreq, rtfreq, urlfreq, snfreq]
9. Random Variables: Illustration
• Let us illustrate the definition of Random Variables and derive the sum and product
rules of conditional probability with the example below:
• Suppose we follow tweets from 2 media channels, say, Times News (x1) and Business News
(x2). x1 is a general news channel and x2 is focused more on business news.
• Suppose we measured the tweets generated by either of these 2 channels over a 1-month
duration and observed the following:
• Total tweets generated (union of tweets from both channels) = 1200
• Break-up of the tweets received as given in the table below (rows are tweet
types y1, y2; columns are sources x1, x2):

        x1    x2
y1     600    40
y2     200   360
10. Random Variables
• We can model the example we stated using 2
random variables:
• Source of tweets (X), Type of tweet (Y)
• X and Y are 2 RVs that can take values:
• X = x where x ∈ {x1, x2}; Y = y where y ∈ {y1, y2}
• Also, from the given data, we can compute
prior probabilities and likelihoods:
P(X = x1) = (600 + 200)/1200 = 2/3 ≈ 0.67
P(X = x2) = 1 – 2/3 = 1/3 ≈ 0.33
P(Y = y1 | X = x1) = 600 / (600 + 200) = 3/4 = 0.75
P(Y = y2 | X = x1) = 0.25
P(Y = y1 | X = x2) = 40 / (40 + 360) = 0.1
P(Y = y2 | X = x2) = 0.9

        x1    x2
y1     600    40
y2     200   360
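The priors and likelihoods above follow mechanically from the table of counts. A minimal sketch, using exact rational arithmetic so the fractions match the slide:

```python
from fractions import Fraction

# Counts from the slide's table: (source, tweet type) -> number of tweets
counts = {("x1", "y1"): 600, ("x1", "y2"): 200,
          ("x2", "y1"): 40,  ("x2", "y2"): 360}
N = sum(counts.values())  # 1200 total tweets

def prior(x):
    # P(X = x): fraction of all tweets coming from source x
    return Fraction(sum(c for (xi, _), c in counts.items() if xi == x), N)

def likelihood(y, x):
    # P(Y = y | X = x): fraction of source-x tweets that are of type y
    return Fraction(counts[(x, y)],
                    sum(c for (xi, _), c in counts.items() if xi == x))

assert prior("x1") == Fraction(2, 3)              # = 0.67
assert likelihood("y1", "x1") == Fraction(3, 4)   # = 0.75
assert likelihood("y1", "x2") == Fraction(1, 10)  # = 0.1
```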
11. Conditional Probability
• Conditional probability is the probability
of one event, given that the other event has
occurred.
• Example:
• Assume that we know the probability of finding a
hashtag in a tweet. Suppose we have a tweet
corpus C on a particular domain, where there is an
increased probability of finding a hashtag. In this
example, we have some prior notion of the
probability of finding a hashtag in a tweet. When
given the additional fact that the corpus from
which the tweet was drawn was C, we can revise
our probability estimate on the hashtag, which is:
P(hashtag|C). This is called the posterior
probability
12. Sum Rule
In our example:
P(X = x1) =
P(X = x1, Y = y1) + P(X = x1, Y = y2)
We can calculate the other
probabilities similarly: P(x2), P(y1), P(y2)
Note:
P(X = x1) + P(X = x2) = 1

        x1    x2
y1     600    40
y2     200   360
13. Product Rule and Generalization
From the product rule, we have:
P(X, Y) = P(Y|X) P(X)
We can generalize this into the chain rule:
P(An, …, A1) = P(An | An-1, …, A1) P(An-1, …, A1)
For example, with n = 4:
P(A4, A3, A2, A1) = P(A4 | A3, A2, A1) P(A3 | A2, A1)
P(A2 | A1) P(A1)

        x1    x2
y1     600    40
y2     200   360
14. Bayes Theorem
From the product rule, we have:
P(X, Y) = P(Y|X) P(X)
We know P(X, Y) = P(Y, X), hence:
P(Y|X) P(X) = P(X|Y) P(Y)
From the above, we derive:
P(Y|X) = P(X|Y) P(Y) / P(X)
This is Bayes Theorem
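Bayes theorem can be applied directly to the running tweet example: given that a tweet is of type y1, what is the probability it came from Times News (x1)? A sketch with the numbers from the earlier table:

```python
from fractions import Fraction

# Counts from the running example (Times News = x1, Business News = x2)
N = 1200
p_x1 = Fraction(600 + 200, N)             # prior P(X = x1) = 2/3
p_y1_given_x1 = Fraction(600, 600 + 200)  # likelihood P(Y = y1 | X = x1) = 3/4
p_y1 = Fraction(600 + 40, N)              # evidence P(Y = y1) = 640/1200

# Bayes theorem: P(X = x1 | Y = y1) = P(Y = y1 | X = x1) P(X = x1) / P(Y = y1)
posterior = p_y1_given_x1 * p_x1 / p_y1

# Sanity check: counting directly from the table gives 600 of the 640
# y1-type tweets coming from x1
assert posterior == Fraction(600, 640) == Fraction(15, 16)
```

The agreement between the Bayes computation and the direct count is exactly the point of the theorem: both are ways of reading the same joint distribution.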
15. Independence
• Independent Variables: knowing Y does not alter our belief about X
From the product rule, we know:
P(X, Y) = P(X|Y) P(Y)
If X and Y are independent random variables:
P(X|Y) = P(X)
Hence: P(X, Y) = P(X) P(Y)
We write X ⊥ Y to denote that X and Y are independent
• Conditional Independence
• Informally, suppose X and Y are not independent taken alone, but become independent on
observing another variable Z. This is denoted by: X ⊥ Y | Z
• Definition: Let X, Y, Z be discrete random variables. X is conditionally independent of Y given Z
if the probability distribution governing X is independent of the value of Y given a value of Z:
P(X|Y, Z) = P(X|Z)
Also: P(X, Y | Z) = P(X|Y, Z) P(Y|Z) = P(X|Z) P(Y|Z)
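The definition can be checked exhaustively on a small distribution. The sketch below builds a hypothetical joint over three binary variables so that P(x, y, z) = P(z) P(x|z) P(y|z) by construction, then verifies P(X|Y, Z) = P(X|Z) for every assignment (all numbers are invented for illustration):

```python
from fractions import Fraction
from itertools import product

# Hypothetical conditional tables, chosen so that X ⊥ Y | Z holds
p_z = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_x_given_z = {0: {0: Fraction(1, 4), 1: Fraction(3, 4)},
               1: {0: Fraction(2, 3), 1: Fraction(1, 3)}}
p_y_given_z = {0: {0: Fraction(1, 2), 1: Fraction(1, 2)},
               1: {0: Fraction(1, 5), 1: Fraction(4, 5)}}

# Joint built by the factorization P(x, y, z) = P(z) P(x|z) P(y|z)
joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product((0, 1), repeat=3)}

def p(pred):
    # Probability of the event described by predicate pred(x, y, z)
    return sum(v for k, v in joint.items() if pred(*k))

# Check the definition: P(X=x | Y=y, Z=z) == P(X=x | Z=z) for all values
for x, y, z in product((0, 1), repeat=3):
    lhs = (p(lambda a, b, c: a == x and b == y and c == z)
           / p(lambda a, b, c: b == y and c == z))
    rhs = (p(lambda a, b, c: a == x and c == z)
           / p(lambda a, b, c: c == z))
    assert lhs == rhs
```

Note that X and Y here are generally not independent unconditionally: marginalizing out Z mixes the two conditional tables.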
16. Graph Models: Bayesian Networks
• Graph models are also called Bayesian networks, belief networks and
probabilistic networks
• They consist of nodes and edges between the nodes
• Each node corresponds to a random variable X, and the value of the node is
the probability of X
• If there is a directed edge from vertex X to vertex Y, it means X has an
influence on Y
• This influence is specified by the conditional probability P(Y|X)
• The graph is a DAG (directed acyclic graph)
• Nodes and edges define the structure of the network, and the conditional
probabilities are the parameters given the structure
17. Examples
• Preparation for the exam R, and the marks obtained in the exam M
• Marketing budget B, and the advertisements A
• Nationality of team N, and the chance of qualifying for the quarter final of
the World Cup, Q
• In all cases, the probability distribution P respects the graph G
18. Representing the joint distributions
• Consider P(A, B, C) = P(A) P(B|A) P(C|A, B). This can
be represented as a graph (fig a)
• Key Concept: Factorization: each vertex contributes a factor
• Exercise: Draw the graph for P(X1, X2, X3, X4, X5)
• The joint probability distribution with conditional
probability assumptions respects the associated
graph (fig b)
• The graph describing the distribution is useful in 2
ways:
• Visualization of conditional dependencies
• Inferencing
• Determining the conditional independence properties of a
distribution is vital for tractable inference
[Fig (a): DAG over A, B, C for the full factorization; Fig (b): DAG over the same variables under the conditional-independence assumptions]
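The factorization idea can be made concrete: each vertex contributes one conditional table, and multiplying the factors reconstructs the joint. A sketch with hypothetical conditional probability tables (the numbers are invented, not from the slides):

```python
from fractions import Fraction
from itertools import product

# Hypothetical CPTs for the factorization P(A, B, C) = P(A) P(B|A) P(C|A, B)
p_a = {0: Fraction(3, 5), 1: Fraction(2, 5)}
p_b_given_a = {0: {0: Fraction(1, 2), 1: Fraction(1, 2)},
               1: {0: Fraction(1, 4), 1: Fraction(3, 4)}}
p_c_given_ab = {(a, b): {0: Fraction(1, 3), 1: Fraction(2, 3)}
                for a, b in product((0, 1), repeat=2)}

# Each vertex contributes one factor; the product over vertices is the joint
joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_ab[(a, b)][c]
         for a, b, c in product((0, 1), repeat=3)}

assert sum(joint.values()) == 1   # the factors compose a valid joint
```

Dropping a parent from a CPT (e.g. making C depend only on B) is exactly the conditional-independence assumption that removes an edge from the graph.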
19. Continuous Random Variables
A probability density function f satisfies:
f(x) ≥ 0 for all x
∫ f(x) dx = 1
P(0 ≤ X ≤ 1) = ∫₀¹ f(x) dx
• A random variable X is said to be continuous if it takes
continuous values – for example, a real-valued number
representing some physical quantity that can vary at random.
• Suppose there is a sentiment analyser that outputs a sentiment
polarity value as a real number in the range (-1, 1). The output of
this system is an example of a continuous RV
• While the polarity mentioned in the above example is a scalar
variable, we also encounter vectors of continuous random
variables. For example, some sentic computing systems model
emotions as a multi-dimensional vector of real numbers.
• Likewise, a vector whose elements are the average values of
hashtags, URLs, screen names and retweets per tweet, averaged over
a corpus, constitutes a vector of continuous Random Variables
23. Expectation Value
• For discrete variables:
• Expectation value: E[f(X)] = Σx f(x) p(x)
• If a random sample is picked from the distribution, the expectation is simply
the average value of f(x) over the sample
• Thus, the values of hashtags, screen names etc. shown in the previous slide
correspond to their respective expectation values
• For continuous variables:
• E[f(X)] = ∫ f(x) p(x) dx over the range of values X can take
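The discrete formula translates directly into code. The sketch below uses a hypothetical distribution over hashtag counts (invented numbers, not the corpus values from the slides):

```python
# Hypothetical distribution over hashtag counts per tweet: P(X = k)
p = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def expectation(f, p):
    # E[f(X)] = sum over x of f(x) p(x)
    return sum(f(x) * px for x, px in p.items())

mean = expectation(lambda x: x, p)            # average hashtags per tweet
second_moment = expectation(lambda x: x * x, p)
variance = second_moment - mean ** 2          # Var[X] = E[X^2] - E[X]^2

print(mean)  # prints 0.75
```

The continuous case replaces the sum with an integral; the numerical version of that appears in the normal-distribution example that follows.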
24. Common Distributions
• Normal: X ~ N(μ, σ²)
E.g. the height of an entire population
f(x) = (1 / (σ √(2π))) exp(−(x − μ)² / (2σ²))
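The density formula can be implemented and sanity-checked directly; the μ = 170, σ = 10 values below are an invented stand-in for a height distribution:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x: (1/(sigma*sqrt(2*pi))) exp(-(x-mu)^2/(2*sigma^2))."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# At the mean, the density peaks at 1 / (sigma * sqrt(2 pi))
peak = normal_pdf(170.0, mu=170.0, sigma=10.0)
assert abs(peak - 1.0 / (10.0 * math.sqrt(2.0 * math.pi))) < 1e-12

# Numerically integrate the standard normal density over [-8, 8];
# as required of any density, the total should be ~1
step = 0.01
total = sum(normal_pdf(x * step, 0.0, 1.0) * step
            for x in range(-800, 800))
assert abs(total - 1.0) < 1e-3
```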
26. Naïve Bayes Classifier: Worked-out example
• Classify the tweets into one of the three corpora (Railway, Union,
Elections), given a dataset
• Split the dataset into 75% training and 25% test
• Estimate the parameters using the training data
• Evaluate the model with the test data
• Report the classifier accuracy
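The steps above can be sketched as a minimal multinomial Naive Bayes classifier with add-1 smoothing. The training examples below are invented stand-ins (the slides' actual tweet datasets are not reproduced here); only the three class labels come from the slide:

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (text, label) pairs for the three corpora
train = [
    ("railway fares trains budget", "Railway"),
    ("new trains announced railway", "Railway"),
    ("union budget tax finance", "Union"),
    ("finance minister union budget", "Union"),
    ("elections polls voting results", "Elections"),
    ("voting begins elections today", "Elections"),
]

class NaiveBayes:
    def fit(self, docs):
        # Estimate parameters: per-class word counts and class counts
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        for text, label in docs:
            self.class_counts[label] += 1
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        n = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for label in self.class_counts:
            # log prior + sum of log likelihoods with add-1 smoothing
            lp = math.log(self.class_counts[label] / n)
            total = sum(self.word_counts[label].values())
            for w in text.split():
                lp += math.log((self.word_counts[label][w] + 1)
                               / (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = label, lp
        return best

model = NaiveBayes().fit(train)
pred = model.predict("railway budget trains")
```

In the full exercise, `train` would be the 75% split, and accuracy would be the fraction of the held-out 25% whose predicted label matches the true corpus.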