1. Sarcasm detection poses a challenge for sentiment analysis systems because sarcasm involves stating the opposite of the intended sentiment. This "Achilles' heel" is important to address from both business and research perspectives.
2. The deck describes a solution for sarcasm detection that combines features extracted from convolutional neural networks pretrained for sentiment analysis and emotion detection with features from a baseline model.
3. Evaluation on a test set showed improved performance over the baseline models; future work includes collecting more data and exploring attention mechanisms and recurrent neural networks. Sarcasm detection was presented as an important problem at the intersection of natural language processing and domain knowledge.
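The approach in point 2 can be sketched as feature concatenation: baseline (e.g. bag-of-words) features joined with activations transferred from pretrained networks, with the combined vector fed to a classifier. A minimal sketch, assuming made-up function names and dimensions; in particular, `cnn_features` is only a stand-in that fakes pretrained-CNN activations with a deterministic hash-based projection, not the deck's actual model:

```python
import zlib
import numpy as np

def baseline_features(tokens, vocab):
    """Bag-of-words counts over a fixed vocabulary (the baseline model)."""
    vec = np.zeros(len(vocab))
    for t in tokens:
        if t in vocab:
            vec[vocab[t]] += 1.0
    return vec

def cnn_features(tokens, dim=8):
    """Stand-in for penultimate-layer activations of a CNN pretrained on
    sentiment/emotion data; here a deterministic hash-seeded projection,
    averaged over tokens, purely for illustration."""
    out = np.zeros(dim)
    for t in tokens:
        rng = np.random.default_rng(zlib.crc32(t.encode()))
        out += rng.standard_normal(dim)
    return out / max(len(tokens), 1)

def sarcasm_features(text, vocab):
    """Concatenate baseline features with transferred CNN features,
    as described in the summary above."""
    tokens = text.lower().split()
    return np.concatenate([baseline_features(tokens, vocab),
                           cnn_features(tokens)])
```

The combined vector would then be fed to a standard classifier (e.g. logistic regression or an SVM) trained on sarcasm labels.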
A brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cell utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Kreb's cycle. The Kreb's cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Kreb's - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
introduction to WARBERG PHENOMENA:
WARBURG EFFECT Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose than do normal cells from outside.
Otto Heinrich Warburg (; 8 October 1883 – 1 August 1970) In 1931 was awarded the Nobel Prize in Physiology for his "discovery of the nature and mode of action of the respiratory enzyme.
WARNBURG EFFECT : cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest
imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters
spanning 0.4−0.9µm) and novel JWST images with 14 filters spanning 0.8−5µm, including 7 mediumband filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data
at > 2.3µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and
30.3-31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric
redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts
z = 11.5 − 15. These objects show compact half-light radii of R1/2 ∼ 50 − 200pc, stellar masses of
M⋆ ∼ 107−108M⊙, and star-formation rates of SFR ∼ 0.1−1 M⊙ yr−1
. Our search finds no candidates
at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to
infer the properties of the evolving luminosity function without binning in redshift or luminosity that
marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the
impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results,
and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5
from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical
models for evolution of the dark matter halo mass function.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for the ultra-fast high-resolution imaging of cellular processes over time and space and were studied in its natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provide insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enables researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allows for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Richard's entangled aventures in wonderlandRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Richard's aventures in two entangled wonderlandsRichard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is highly conserved process of posttranscriptional gene silencing by which double stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes ranging from worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non- coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the
development of the worm C. elegans.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi ( non coding RNA)
MiRNA
Length (23-25 nt)
Trans acting
Binds with target MRNA in mismatch
Translation inhibition
Si RNA
Length 21 nt.
Cis acting
Bind with target Mrna in perfect complementary sequence
Piwi-RNA
Length ; 25 to 36 nt.
Expressed in Germ Cells
Regulates trnasposomes activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The single-stranded RNA retained in RISC then pairs with the homologous mRNA, which is destroyed.
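The sequence-specific step of this mechanism can be illustrated with a short sketch (illustrative only, not from the slides): an siRNA guide strand targets an mRNA when the guide's reverse complement appears in the transcript. The sequences below are made up for the example.

```python
# Illustrative model of sequence-specific targeting in RNAi: a siRNA guide
# strand silences an mRNA when its reverse complement occurs in the transcript.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def is_target(guide: str, mrna: str) -> bool:
    """True if the guide pairs perfectly with some region of the mRNA
    (the siRNA case; miRNA tolerates mismatches)."""
    return reverse_complement(guide) in mrna

# A 21-nt guide and an mRNA that contains its perfect pairing site.
guide = "UAGCUUAUCAGACUGAUGUUG"
mrna = "AAGG" + reverse_complement(guide) + "CCUA"
print(is_target(guide, mrna))  # True
```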
THE RISC COMPLEX:
RISC is a large (>500 kDa) RNA-protein complex that triggers mRNA degradation in response to dsRNA.
The double-stranded siRNA is unwound by an ATP-independent helicase.
The active component of RISC is the Argonaute (Ago) protein, an endonuclease that cleaves the target mRNA.
DICER: an endonuclease of the RNase III family
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1. PAZ (PIWI/Argonaute/Zwille): recognizes the target mRNA.
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of the mRNA (RNase H activity).
miRNA:
These double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
4. Sarcasm
● Greek: sarkázein (to speak bitterly; use of irony to mock)
● French: sarcasme
● Nuanced form of language where the speaker often explicitly states the opposite of
what she implies.
● The speaker deliberately means the opposite of what is said on the surface.
“This talk looks like great fun ;)”
5. Importance of sarcasm detection
Business Perspective:
● Organizations tap into social media for public opinion on their products &
services and real time customer assistance.
● To assist this, sentiment analysis is a key offering in any and every CRM tool.
● Customers often use sarcasm to express their frustration with
products/services.
6. ● Most sentiment analysis systems (SAS) fail to detect sarcasm and wrongly
infer the sentiment
● Both the systems got fooled by the word “love”.
● Most SAS lack the sophistication needed to detect sarcasm.
[Screenshots: Stanford's sentiment analysis demo and Aylien's sentiment analysis demo]
7. ● This places an extra burden on customer care teams.
● Owing to the volume and velocity of traffic, the subtlety of language, and background &
cultural differences, agents can miss sarcasm completely.
● Missing/Misinterpreting = PR disasters for brands.
8. Research Perspective:
● Much like QA, text summarization, and machine translation, sarcasm detection
involves the complexity of language and is believed to be a much harder task.
● Any progress in sarcasm detection is a positive step towards pushing the
boundaries of NLP.
● Only recently have people started to look into it.
● With improvement in our understanding and approaches to sentiment
analysis, researchers started focusing on more difficult cases
○ Aspect based sentiment analysis
○ Sarcasm detection
So, be it from a business or a research perspective, it is worth
investing time and energy in sarcasm detection.
9. ● Sarcasm: “a sharp, bitter, or cutting expression or remark; a bitter gibe or taunt”.
● Sarcasm is negative sentiment.
○ You are never sarcastically positive
What makes sarcasm detection difficult?
● It is deliberate - people employ play of language.
● It is subtle: often just a word, a phrase, or a punctuation mark here or there.
● Even humans can find it hard to understand.
● Sarcasm is often used on social media platforms like Twitter.
● Sarcasm on Twitter comes with additional challenges: fewer word cues (280
characters), spelling mistakes, acronyms, slang, and an ever-evolving vocabulary.
Key Characteristics
10. Problem Statement
Business problem: Build a sentiment analysis system capable of handling
sarcasm.
Abstract problem: Given an unlabeled tweet T from user U, a solution should
automatically detect if T is sarcastic or not.
[Flowchart: input text → "Sarcasm?" → Yes: negative sentiment; No: regular sentiment analysis]
11. Scope and Assumptions
● Consider the following sarcasm: “If Hillary wins, she will surely be pleased to
recall Monica each time she enters Oval office”.
Detecting this requires:
● Anaphora resolution
● Fact extraction
● Logical reasoning
● Such complex cases are beyond the scope of this work.
● Further, we assume all information necessary to detect sarcasm is contained
in the same sentence (Twitter data).
[Detecting sarcasm in paragraphs and articles is a much harder problem]
12. Dataset
Manually identified sources for sarcasm:
● Hashtags : #sarcasm, #not, #irony
● Handles : @sarcastic_us, @heissarcastic, @SarcasmMsg ….
What is not sarcasm? Everything else.
● For this we also used Twitter datasets for sentiment analysis.
● Being short (280 characters), all information necessary to detect sarcasm is
contained in the same sentence.
● After cleaning, we were left with ~100K data points, ~50K per class.
● Built a test set of 20K data points in a similar fashion, but from a different timeline.
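The labeling heuristic described above can be sketched in a few lines (a hypothetical sketch, not the authors' actual pipeline): a tweet is treated as sarcastic if it carries one of the sarcasm hashtags or comes from one of the curated sarcastic handles. Only the hashtags and handles named on the slide are used; the real handle list is longer.

```python
# Hypothetical sketch of the distant-supervision labeling described above.
SARCASM_HASHTAGS = {"#sarcasm", "#not", "#irony"}
SARCASM_HANDLES = {"@sarcastic_us", "@heissarcastic", "@sarcasmmsg"}

def label_tweet(text: str, author: str) -> int:
    """Return 1 (sarcastic) or 0 (not sarcastic) based on hashtag/handle cues."""
    tokens = {t.lower() for t in text.split()}
    if tokens & SARCASM_HASHTAGS or author.lower() in SARCASM_HANDLES:
        return 1
    return 0

print(label_tweet("Great, another Monday #sarcasm", "@someone"))  # 1
print(label_tweet("Lovely weather today", "@someone"))            # 0
```

In practice the matched hashtags would also be stripped from the text before training, so the model cannot simply memorize the label source.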
13. Literature Survey
● Until very recently, hand-coded features were used extensively.
○ Unigrams, bigrams, trigrams, n-grams, dictionary-based lexical features.
○ Pragmatic features such as emoticons, capitalization, punctuation.
○ Presence of a positive sentiment in close proximity to a negative situation phrase as a
feature for sarcasm detection.
○ Features based on frequency (gap between rare and common words).
○ Incongruity: number of times a word is followed by a word of opposite polarity.
○ # positive words, # negative words, length of the longest sequence without a polarity flip.
[Tsur et al., 2010; González-Ibáñez et al., 2011; Riloff et al., 2013; Buschmeier et al.,
2014; Joshi et al., 2015]
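The incongruity-style features listed above are simple to compute. A minimal sketch (the tiny lexicon is invented for illustration; real systems use full sentiment lexicons):

```python
# Illustrative polarity-flip counter: how often a sentiment-bearing word is
# followed by the next sentiment-bearing word of opposite polarity.
POLARITY = {"love": 1, "great": 1, "fun": 1, "pain": -1, "hate": -1, "awful": -1}

def polarity_flips(tokens):
    """Count adjacent (positive, negative) or (negative, positive) pairs,
    skipping neutral words in between."""
    signs = [POLARITY[t] for t in tokens if t in POLARITY]
    return sum(1 for a, b in zip(signs, signs[1:]) if a * b < 0)

print(polarity_flips("i love the pain present in the breakups".split()))  # 1
```

Counts like this, plus # positive words, # negative words, and the longest run without a flip, form the kind of hand-coded feature vector the cited papers used.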
14. ● These features are based on certain observations in the dataset. Thus, they
are mostly dataset specific.
● While this can give great performance on the dataset in hand, from a product
point of view one would like to have more robust features.
● Features that are not brittle and generalize to other datasets.
● Recently, people have started to apply DL models.
15. Baseline
● Treated this as a binary classification problem.
● Single-layer RNNs (LSTM, GRU).
● Failed to generalize (F1 score of ~68%).
● Owing to the lack of data, they overfit very quickly.
● A simple CNN did far better (F1 score of ~76%).
16. Need for stronger signals
Literature on sarcasm detection has typically used 3 clues :
1. Sentiment
2. Emotion
3. Personality
Let us understand each one of them in detail.
17. Sentiment
● Most sarcastic sentences show a shift in sentiment
“I love the pain present in the breakups”
(shift in sentiment)
● There is a contradiction between sentiment of “love” and “pain of breakups”.
This is a hallmark of sarcasm.
● Thus, including sentiment clues should help in sarcasm detection.
Traditionally this was done via sentiment lexicons.
○ # negative words, # positive words, # sentiment shifts across adjacent words
● Instead, we use features extracted from a neural network trained for sentiment.
18. Emotion
● Emotion: feelings such as happiness, anger, jealousy, grief, etc. One can
have many emotions simultaneously. Subjective in nature.
● Sentiment: opinion or mental attitude produced by emotions about something.
This is much more objective.
● Sarcastic sentences are rich in emotions.
● "My stellar programming career: job offer; Ctrl+C, Ctrl+V; resignation. Repeat."
● Pain, sadness, anger, disgust, etc.
● Thus, including emotion clues should help.
● We use features extracted from a neural network trained for emotion.
19. Personality
● There is a body of work that argues that sarcasm is not just a linguistic
phenomenon but also a behavioral one: it is not just about what is
being said; who is saying it matters too.
● i.e., sarcasm is user specific: some users have a stronger tendency to be
sarcastic than others*.
● This body of work factors in the history of the user in question to derive
features that model the behavioral aspect. For this, they use the past tweets of the
user.
● Researchers have shown substantial gains using personality based signals.
* There are systematic studies that establish positive correlation between ability to create & understand sarcasm and higher cognitive ability.
20. Personality
● However, from the viewpoint of a production system this is very challenging.
● One has to either store features from every user's past timeline,
● or retrieve a user's history at run time and compute features on the fly.
● Given typical volumes, both these choices have severe implications on
throughput and resources.
● Hence, we did not use personality based indicators.
21. Our solution in a nutshell
[Diagram: input text is fed to a sentiment model, an emotion model, and the baseline model (CNN / linear models); the sentiment features, emotion features, and features from the baseline are combined by a final classifier.]
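A minimal sketch of this fusion step (all dimensions and weights are invented for illustration): penultimate-layer features from the three models are concatenated and passed to a final linear classifier.

```python
# Illustrative fusion of penultimate-layer features from three models into a
# final classifier. Feature sizes are made up for the example.
import numpy as np

rng = np.random.default_rng(42)
sent_feat = rng.standard_normal(64)   # sentiment model, second-last layer
emo_feat = rng.standard_normal(64)    # emotion model, second-last layer
base_feat = rng.standard_normal(128)  # baseline sarcasm CNN features

fused = np.concatenate([sent_feat, emo_feat, base_feat])  # shape (256,)

# Final classifier: a single logistic unit (a linear model, as on the slide).
w, b = rng.standard_normal(fused.size), 0.0
p_sarcastic = 1.0 / (1.0 + np.exp(-(fused @ w + b)))
print(fused.shape)
```

In the real system the weights come from training the final classifier on the fused features; only the three feature extractors are frozen.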
23. Sentiment Model
● The objective is to build a sentiment model whose second-last layer will
be used to extract features.
● We built a (standard) CNN for this.
○ Alternating layers of convolution and max pooling
○ Followed by fully connected layers
○ Softmax output
● Text is tokenized (we used the tweet tokenizer by Alan Ritter)
● The embedding layer is initialized with pretrained GloVe word embeddings
for Twitter.
24. Sentiment Model
● To train this network we used a dataset for sentiment analysis.
○ 3 classes - negative, positive and neutral
○ Public dataset + custom data
● All convolutions are 1D convolutions.
○ The height of the filter varies.
○ The width of the filter equals the embedding dimension.
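The filter geometry above can be sketched in a few lines of numpy (shapes are illustrative, not the actual network's): each filter spans the full embedding width, covers a varying number of words, and max-over-time pooling keeps one value per filter.

```python
# Minimal sketch of 1D text convolution + max-over-time pooling.
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim = 10, 50                # 10 tokens, 50-d GloVe-style vectors
x = rng.standard_normal((seq_len, emb_dim))

def conv_maxpool(x, filter_height, n_filters, rng):
    """Convolve filters of a given height over the token axis, then take
    the max over time, yielding one value per filter."""
    W = rng.standard_normal((n_filters, filter_height, x.shape[1]))
    n_windows = x.shape[0] - filter_height + 1
    fmap = np.array([[np.sum(W[f] * x[i:i + filter_height])
                      for i in range(n_windows)]
                     for f in range(n_filters)])
    fmap = np.maximum(fmap, 0.0)         # ReLU
    return fmap.max(axis=1)              # max over time -> (n_filters,)

# Filters of heights 3, 4, and 5, concatenated as in Kim-style text CNNs.
features = np.concatenate([conv_maxpool(x, h, 4, rng) for h in (3, 4, 5)])
print(features.shape)  # (12,)
```

The concatenated vector is what the fully connected layers (and, in this work, the feature extraction) operate on.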
26. Emotion Model
● Build an emotion model whose second-last layer will be used to extract
features.
● We built a (standard) CNN for this.
○ Alternating layers of convolution and max pooling
○ Followed by fully connected layers
○ Softmax output
● Text is tokenized
● The embedding layer is initialized with pretrained GloVe word embeddings
for Twitter.
27. Emotion Model
● To train this network we used a dataset for emotion analysis.
○ 6 classes - anger, disgust, surprise, sadness, joy and fear
○ Public dataset + custom data
● All convolutions are 1D convolutions.
28. Model Architecture
● Owing to the scarcity of data, we kept the networks simple.
● The embedding layer was frozen, not fine-tuned.
29. Our solution in a nutshell
[Diagram: input text is fed to the pretrained sentiment model, pretrained emotion model, and pretrained baseline model (CNN / linear models); the sentiment features, emotion features, and features from the baseline are combined by a final classifier.]
30. Results
● Test Data came from a different timeline.
● ~20K balanced test set.
31. Future work
● Train our own word embeddings
● Character n-gram embeddings
● Retry RNNs
● Attention networks
● Collect more data!
○ Collecting the right data for the negative class (not sarcasm) is very important.
○ Adding public datasets of sentiment to the negative class helped us a lot.
● It will be interesting to see the impact of factoring in user behaviour.
32. Learnings
● Sarcasm detection is an important problem.
● It is difficult:
○ long-term dependencies
○ a subtle change of a word or punctuation mark can flip the polarity
○ needs facts and external knowledge
● Present sentiment analysis systems are bad at detecting sarcasm.
● Pretrained (sub-task) specific CNNs can work in text as well.
● This is an example of domain knowledge + Deep Learning.
● Data collection strategy is important
○ Comprehensively collecting what is not sarcasm.
○ Adding public datasets of sentiment to the negative class helped us a lot.