Me12tt tub


The document evaluates feature selection and transformation methods for bag-of-words approaches to video categorization. It finds that feature selection can improve results by filtering out non-informative terms. Metadata-based features such as tags, titles and descriptions outperform visual and ASR (speech transcript) features; selection clearly helps metadata and ASR terms and helps visual features only slightly. The best official run uses filtered terms derived from tags, titles and descriptions (mAP 0.5225).

Feature Selection Methods for Bag-of-(visual)-Words Approaches
Schmiedeke, Kelm and Sikora
Communication Systems Group
Technische Universität Berlin

4 October, 2012

Motivation

sports

Schmiedeke: “Feature Selection Methods for BoW Approaches”

Lessons from last year

Features derived from metadata (esp. tags)
outperform visual and ASR ones
• Metadata: Naive Bayes (non-translated)
• Visual features: SVM (avg. pooled histograms)
• ASR transcripts: kNN (Jensen-Shannon divergence, JSD)

Uploaders mainly contribute to a single category
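
As a rough illustration of the kNN (JSD) setting listed above, the sketch below computes the Jensen-Shannon divergence between two term histograms and uses it as the distance in a nearest-neighbour vote. The function names, the toy histograms and the value of k are assumptions made for illustration; the slides do not describe the authors' implementation.

```python
import numpy as np
from collections import Counter

def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two non-negative vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()          # normalise counts to probability distributions
    q = q / q.sum()
    m = 0.5 * (p + q)        # mixture distribution

    def kl(a, b):
        mask = a > 0         # 0 * log(0) is treated as 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors with the smallest JSD."""
    distances = [jsd(query, v) for v in train_vectors]
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical term histograms (three terms) with category labels.
train_vectors = [[5, 1, 0], [4, 2, 0], [0, 1, 6], [1, 0, 5]]
train_labels = ["religion", "religion", "health", "health"]
prediction = knn_predict([3, 1, 0], train_vectors, train_labels, k=3)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint support), which makes distances comparable across videos of different length.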


This year's question

Does feature selection improve the results achieved
with the BoW model?


Feature Selection / Transformation

Mutual information:

Term Frequency:

PCA (Eigenvalue decomposition):
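
The formulas on this slide did not survive extraction. The sketch below uses common textbook definitions as stand-ins: mutual information between term presence and class membership, term frequency as the total count of a term, and PCA via eigenvalue decomposition of the covariance matrix. These definitions, the function names and the toy data are assumptions, not necessarily the exact variants the authors used.

```python
import numpy as np

def mutual_information(X, y):
    """MI (in bits) between term presence and class membership, per term.

    X: (n_docs, n_terms) count matrix; y: (n_docs,) boolean class labels.
    """
    present = np.asarray(X) > 0
    y = np.asarray(y, dtype=bool)
    n = present.shape[0]
    mi = np.zeros(present.shape[1])
    for t_val in (True, False):
        for c_val in (True, False):
            joint = ((present == t_val) & (y == c_val)[:, None]).sum(axis=0) / n
            p_t = (present == t_val).sum(axis=0) / n
            p_c = (y == c_val).sum() / n
            with np.errstate(divide="ignore", invalid="ignore"):
                term = joint * np.log2(joint / (p_t * p_c))
            mi += np.where(joint > 0, term, 0.0)  # empty cells contribute 0
    return mi

def term_frequency(X):
    """Total number of occurrences of each term over all documents."""
    return np.asarray(X).sum(axis=0)

def pca_eig(X, n_components):
    """Project X onto the leading eigenvectors of its covariance matrix."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    return Xc @ eigvecs[:, order], eigvals[order]

# Toy data: 4 documents, 4 terms; the first two documents belong to the class.
X = np.array([[2, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 3, 1, 1],
              [0, 1, 1, 0]])
y = [True, True, False, False]
mi = mutual_information(X, y)
tf = term_frequency(X)

points = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
projected, eigvals = pca_eig(points, n_components=1)
```

In the toy data, the first two terms separate the classes perfectly and receive the maximum score of 1 bit, while a term that occurs in every document scores 0.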


Feature Selection

Concepts for term selection:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama      (0.0495)
jesu      (0.0797)        obama      (0.1113)       health    (0.0378)
god       (0.0796)        polit      (0.0982)       report    (0.0357)
unleaven  (0.0782)        grittv     (0.0881)       harta     (0.0227)
eeli      (0.0782)        flander    (0.0861)       exceric   (0.0211)
davideel  (0.0781)        laura      (0.0855)       yoga      (0.0203)
ministri  (0.0780)        economi    (0.0747)       study     (0.0192)
…                         …                         …
daytripp  (0.0)           sonnet     (0.0)          ilsr      (0.0)
adagio    (0.0)           screenplai (0.0)          resystem  (0.0)
acustica  (0.0)           acustica   (0.0)          acustica  (0.0)


Feature Selection

Top-k-Union:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama      (0.0495)
jesu      (0.0797)        obama      (0.1113)       health    (0.0378)
god       (0.0796)        polit      (0.0982)       report    (0.0357)
unleaven  (0.0782)        grittv     (0.0881)       harta     (0.0227)
eeli      (0.0782)        flander    (0.0861)       exceric   (0.0211)
davideel  (0.0781)        laura      (0.0855)       yoga      (0.0203)
ministri  (0.0780)        economi    (0.0747)       study     (0.0192)
…                         …                         …
daytripp  (0.0)           sonnet     (0.0)          ilsr      (0.0)
adagio    (0.0)           screenplai (0.0)          resystem  (0.0)
acustica  (0.0)           acustica   (0.0)          acustica  (0.0)
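
The slide text does not define Top-k-Union. One plausible reading is to take the k best-scoring terms of every class and merge them into a single vocabulary. The sketch below follows that reading; the function name and the value of k are assumptions, and the scores are a subset of the values listed above.

```python
def top_k_union(scores, k):
    """Union over all classes of each class's k highest-scoring terms."""
    selected = set()
    for per_class in scores.values():
        ranked = sorted(per_class, key=per_class.get, reverse=True)
        selected.update(ranked[:k])
    return selected

# Per-class term scores (subset of the slide's table).
scores = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "god": 0.0796, "acustica": 0.0},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "polit": 0.0982, "acustica": 0.0},
    "health": {"jama": 0.0495, "health": 0.0378, "report": 0.0357, "acustica": 0.0},
}
vocabulary = top_k_union(scores, k=2)
```

Under this reading every class contributes the same number of terms, so the vocabulary size is at most k times the number of classes.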


Feature Selection

Top-k:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama      (0.0495)
jesu      (0.0797)        obama      (0.1113)       health    (0.0378)
god       (0.0796)        polit      (0.0982)       report    (0.0357)
unleaven  (0.0782)        grittv     (0.0881)       harta     (0.0227)
eeli      (0.0782)        flander    (0.0861)       exceric   (0.0211)
davideel  (0.0781)        laura      (0.0855)       yoga      (0.0203)
ministri  (0.0780)        economi    (0.0747)       study     (0.0192)
…                         …                         …
daytripp  (0.0)           sonnet     (0.0)          ilsr      (0.0)
adagio    (0.0)           screenplai (0.0)          resystem  (0.0)
acustica  (0.0)           acustica   (0.0)          acustica  (0.0)


Feature Selection

Union>Th:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama      (0.0495)
jesu      (0.0797)        obama      (0.1113)       health    (0.0378)
god       (0.0796)        polit      (0.0982)       report    (0.0357)
unleaven  (0.0782)        grittv     (0.0881)       harta     (0.0227)
eeli      (0.0782)        flander    (0.0861)       exceric   (0.0211)
davideel  (0.0781)        laura      (0.0855)       yoga      (0.0203)
ministri  (0.0780)        economi    (0.0747)       study     (0.0192)
…                         …                         …
daytripp  (0.0)           sonnet     (0.0)          ilsr      (0.0)
adagio    (0.0)           screenplai (0.0)          resystem  (0.0)
acustica  (0.0)           acustica   (0.0)          acustica  (0.0)
th = 0.0002               th = 0.0002               th = 0.0001
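
A possible reading of Union>Th is that a term is kept if its score exceeds the threshold of at least one class. The sketch below assumes exactly that, and also assumes that the three values in the last row are per-class thresholds; the function name is hypothetical and the scores are a subset of the table above.

```python
def union_above_threshold(scores, thresholds):
    """Terms whose score exceeds the class threshold in at least one class."""
    selected = set()
    for cls, per_class in scores.items():
        selected.update(t for t, s in per_class.items() if s > thresholds[cls])
    return selected

# Per-class term scores (subset of the slide's table).
scores = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "daytripp": 0.0, "acustica": 0.0},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "sonnet": 0.0, "acustica": 0.0},
    "health": {"jama": 0.0495, "health": 0.0378, "ilsr": 0.0, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}
vocabulary = union_above_threshold(scores, thresholds)
```

Unlike a fixed k, a threshold lets informative classes contribute more terms, and zero-scored terms such as "acustica" drop out for every class.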


Feature Selection

Intersection>Th:

Top terms for religion:   Top terms for politics:   Top terms for health:
bibl      (0.0897)        lunch      (0.1200)       jama      (0.0495)
jesu      (0.0797)        obama      (0.1113)       health    (0.0378)
god       (0.0796)        polit      (0.0982)       report    (0.0357)
…                         …                         …
web                       appl                      gossip
python                    googl                     interview
xbox                      teen                      iphon
big                       music                     san
expo                      tv                        texa
…                         …                         …
daytripp  (0.0)           sonnet     (0.0)          ilsr      (0.0)
adagio    (0.0)           screenplai (0.0)          resystem  (0.0)
acustica  (0.0)           acustica   (0.0)          acustica  (0.0)
th = 0.0002               th = 0.0002               th = 0.0001
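
The slide does not spell out the Intersection>Th rule. A plausible reading is that a term is kept only if its score exceeds the threshold in every class. The sketch below assumes this reading; the function name is hypothetical, and apart from the scores of "bibl" and "obama" in their own classes, all scores are invented for illustration.

```python
def intersection_above_threshold(scores, thresholds):
    """Terms whose score exceeds the class threshold in every class."""
    per_class_sets = [
        {t for t, s in per_class.items() if s > thresholds[cls]}
        for cls, per_class in scores.items()
    ]
    return set.intersection(*per_class_sets)

# Hypothetical scores: every term is scored for every class.
scores = {
    "religion": {"bibl": 0.0897, "obama": 0.0, "music": 0.0010, "tv": 0.0008, "acustica": 0.0},
    "politics": {"bibl": 0.0, "obama": 0.1113, "music": 0.0012, "tv": 0.0009, "acustica": 0.0},
    "health": {"bibl": 0.0, "obama": 0.0, "music": 0.0007, "tv": 0.0, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002, "health": 0.0001}
vocabulary = intersection_above_threshold(scores, thresholds)
```

Under this reading, strongly class-specific terms such as "bibl" are dropped because they score zero elsewhere, and only terms that carry some information for all classes remain.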


Official runs

Bag of clustered SURF features transformed
using PCA
• Result does not benefit from transformation

        official run   without FS/FT
mAP     0.2301         0.2309
CA      41.63 %        41.71 %


Official runs

Bag of filtered ASR transcript terms (Union>Th)
• Result does benefit from selection

        official run   without FS/FT
mAP     0.1035         0.0522
CA      32.53 %        26.54 %


Official runs

Bag of clustered SURF features filtered using MI
and the Intersection>Th strategy
• Result does slightly benefit from selection

        official run   without FS/FT
mAP     0.2259         0.2221
CA      40.80 %        40.78 %


Official runs

Bag of filtered terms derived from tags, titles and
descriptions (Union>Th)
• Result does benefit from selection

        official run   without FS/FT
mAP     0.5225         0.4146
CA      58.18 %        55.70 %


Official runs

Bag of clustered SURF features transformed
using PCA and decision fusion using uploader
• Result does benefit from transformation

        official run   without FS/FT
mAP     0.3304         0.2988
CA      52.14 %        49.19 %


Conclusion & Future Work

FS showed potential for improving the results

The choice between MI and TF is not critical; both
methods achieve roughly the same results
• Metadata (mAP): MI_12004 (0.5277) vs. TF_14976 (0.5275)

Investigation of different scaling schemes (NB)

Use of a class-independent selection score (MI)


Backup

Extracting visual features

SURF features are extracted from each key frame
• At keypoints and on a regular grid

Vocabulary is built using hierarchical clustering
on SURF features of the development set
• 4096/8196 codewords

The term vector for a single video is obtained by bin-wise
pooling of the key frames' term vectors
• avg
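
The pooling step above can be sketched as follows: each key frame's descriptors are assigned to their nearest codeword to form a histogram, and the video's term vector is the bin-wise average of those histograms. The function names, the tiny two-codeword vocabulary and the two-dimensional descriptors are assumptions for illustration; real SURF descriptors have 64 or 128 dimensions, and the slides do not say whether frame histograms are normalised before pooling.

```python
import numpy as np

def frame_histogram(descriptors, codebook):
    """Count how many descriptors fall on each codeword (nearest neighbour)."""
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    nearest = (diffs ** 2).sum(axis=2).argmin(axis=1)
    return np.bincount(nearest, minlength=len(codebook)).astype(float)

def video_term_vector(frames, codebook):
    """Bin-wise average pooling of the key frames' histograms."""
    histograms = np.stack([frame_histogram(f, codebook) for f in frames])
    return histograms.mean(axis=0)

# Hypothetical vocabulary of two codewords and two key frames.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
frames = [
    np.array([[0.0, 1.0], [9.0, 9.0]]),   # one descriptor per codeword
    np.array([[1.0, 0.0], [0.0, 0.0]]),   # both descriptors near codeword 0
]
term_vector = video_term_vector(frames, codebook)
```

Average pooling keeps the term vector on the same scale regardless of how many key frames a video has.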


MediaEval 2012: Tagging Task

Question: What is each video's blip.tv category?
Blip.tv database (cc): ~ 3300 h
• 5288 training videos
• 9550 test videos
Official evaluation measure is Mean Average Precision (mAP)
Workshop will be held 4-5 October 2012 in Pisa, Italy
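
For readers unfamiliar with the measure, the sketch below computes average precision for one category from a ranked list of relevance flags and then averages over categories. It is a generic formulation that assumes every relevant video appears somewhere in the ranked list; the official evaluation tooling may differ in details, and the function names and toy rankings are hypothetical.

```python
def average_precision(ranked_relevance):
    """Mean of precision@rank taken at the ranks of the relevant items."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(rankings):
    """Average of the per-category average precision values."""
    return sum(average_precision(r) for r in rankings.values()) / len(rankings)

# Hypothetical rankings: 1 marks a video that truly belongs to the category.
rankings = {
    "sports": [1, 0, 1],
    "health": [0, 1],
}
map_score = mean_average_precision(rankings)
```

Because precision is sampled only at relevant ranks, pushing a correct video higher in the list raises the score even when the set of retrieved videos stays the same.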


The document evaluates different feature selection methods for bag-of-words approaches. It finds that feature selection can improve results achieved with bag-of-words models, depending on the features and selection method used. Selection improved mean average precision and classification accuracy for filtered ASR transcript terms and for terms from tags, titles and descriptions, and gave a slight gain for clustered SURF features filtered with mutual information. PCA on clustered SURF features alone brought no benefit, but did help in the run that also used uploader-based decision fusion. The choice of mutual information or term frequency for selection was not critical, as both achieved similar results.

Me12tt tub

1. Feature Selection Methods for Bag- of-(visual)-Words Approaches Schmiedeke, Kelm and Sikora Communication Systems Group Technische Universität Berlin 4 October, 2012

2. Motivation 2 sports Schmiedeke: “Feature Selection Methods for BoW Approaches”

3. Lessons from last year 3 Features derived from metadata (esp. tags) outperform visual and ASR ones • Metadata: Naive Bayes (non translated) • Visual feat.: SVM (avg. pooled histograms) • ASR transcripts: kNN (JSD) Uploader mainly contribute to a single category Schmiedeke: “Feature Selection Methods for BoW Approaches”

4. This year‘s question 4 Does feature selection improve results achieved with BoW model? Schmiedeke: “Feature Selection Methods for BoW Approaches”

5. Feature Selection/ Transformation 5 Mutual information: Term Frequency: PCA (Eigenvalue decomposition): Schmiedeke: “Feature Selection Methods for BoW Approaches”

6. Feature Selection
Concepts for term selection:
Top terms for religion:  Top terms for politics:  Top terms for health:
bibl (0.0897)            lunch (0.1200)           jama (0.0495)
jesu (0.0797)            obama (0.1113)           health (0.0378)
god (0.0796)             polit (0.0982)           report (0.0357)
unleaven (0.0782)        grittv (0.0881)          harta (0.0227)
eeli (0.0782)            flander (0.0861)         exceric (0.0211)
davideel (0.0781)        laura (0.0855)           yoga (0.0203)
ministri (0.0780)        economi (0.0747)         study (0.0192)
…                        …                        …
daytripp (0.0)           sonnet (0.0)             ilsr (0.0)
adagio (0.0)             screenplai (0.0)         resystem (0.0)
acustica (0.0)           acustica (0.0)           acustica (0.0)

7. Feature Selection
Top-k-Union:
Top terms for religion:  Top terms for politics:  Top terms for health:
bibl (0.0897)            lunch (0.1200)           jama (0.0495)
jesu (0.0797)            obama (0.1113)           health (0.0378)
god (0.0796)             polit (0.0982)           report (0.0357)
unleaven (0.0782)        grittv (0.0881)          harta (0.0227)
eeli (0.0782)            flander (0.0861)         exceric (0.0211)
davideel (0.0781)        laura (0.0855)           yoga (0.0203)
ministri (0.0780)        economi (0.0747)         study (0.0192)
…                        …                        …
daytripp (0.0)           sonnet (0.0)             ilsr (0.0)
adagio (0.0)             screenplai (0.0)         resystem (0.0)
acustica (0.0)           acustica (0.0)           acustica (0.0)

8. Feature Selection
Top-k:
Top terms for religion:  Top terms for politics:  Top terms for health:
bibl (0.0897)            lunch (0.1200)           jama (0.0495)
jesu (0.0797)            obama (0.1113)           health (0.0378)
god (0.0796)             polit (0.0982)           report (0.0357)
unleaven (0.0782)        grittv (0.0881)          harta (0.0227)
eeli (0.0782)            flander (0.0861)         exceric (0.0211)
davideel (0.0781)        laura (0.0855)           yoga (0.0203)
ministri (0.0780)        economi (0.0747)         study (0.0192)
…                        …                        …
daytripp (0.0)           sonnet (0.0)             ilsr (0.0)
adagio (0.0)             screenplai (0.0)         resystem (0.0)
acustica (0.0)           acustica (0.0)           acustica (0.0)

9. Feature Selection
Union>th:
Top terms for religion:  Top terms for politics:  Top terms for health:
bibl (0.0897)            lunch (0.1200)           jama (0.0495)
jesu (0.0797)            obama (0.1113)           health (0.0378)
god (0.0796)             polit (0.0982)           report (0.0357)
unleaven (0.0782)        grittv (0.0881)          harta (0.0227)
eeli (0.0782)            flander (0.0861)         exceric (0.0211)
davideel (0.0781)        laura (0.0855)           yoga (0.0203)
ministri (0.0780)        economi (0.0747)         study (0.0192)
…                        …                        …
daytripp (0.0)           sonnet (0.0)             ilsr (0.0)
adagio (0.0)             screenplai (0.0)         resystem (0.0)
acustica (0.0)           acustica (0.0)           acustica (0.0)
0.0002                   0.0002                   0.0001

10. Feature Selection
Intersection>Th:
Top terms for religion:  Top terms for politics:  Top terms for health:
bibl (0.0897)            lunch (0.1200)           jama (0.0495)
jesu (0.0797)            obama (0.1113)           health (0.0378)
god (0.0796)             polit (0.0982)           report (0.0357)
…                        …                        …
web appl gossip python googl interview xbox teen iphon big music san expo tv texa
…                        …                        …
daytripp (0.0)           sonnet (0.0)             ilsr (0.0)
adagio (0.0)             screenplai (0.0)         resystem (0.0)
acustica (0.0)           acustica (0.0)           acustica (0.0)
0.0002                   0.0002                   0.0001
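
The slides name four selection strategies without defining them. The sketch below shows one plausible reading of each name and should be treated as an assumption rather than the authors' definition: Top-k-Union as the union of every class's k best terms, Top-k as the k best terms overall, Union>th as terms above the threshold of at least one class, and Intersection>th as terms above the threshold of every class. All function names are hypothetical, and the scores for "web" are invented for illustration.

```python
def top_k_union(scores, k):
    """Union of the k best-scoring terms of every class."""
    selected = set()
    for term_scores in scores.values():
        ranked = sorted(term_scores, key=term_scores.get, reverse=True)
        selected.update(ranked[:k])
    return selected

def top_k(scores, k):
    """k best terms overall, ranking each term by its best score in any class."""
    best = {}
    for term_scores in scores.values():
        for term, s in term_scores.items():
            best[term] = max(best.get(term, s), s)
    return set(sorted(best, key=best.get, reverse=True)[:k])

def union_above(scores, thresholds):
    """Terms whose score exceeds the threshold of at least one class."""
    return {t for c, ts in scores.items() for t, s in ts.items() if s > thresholds[c]}

def intersection_above(scores, thresholds):
    """Terms whose score exceeds the threshold of every class."""
    per_class = [{t for t, s in ts.items() if s > thresholds[c]}
                 for c, ts in scores.items()]
    return set.intersection(*per_class)

# Scores for bibl, jesu, lunch, obama and acustica follow the slides;
# the scores for "web" are made up.
scores = {
    "religion": {"bibl": 0.0897, "jesu": 0.0797, "web": 0.0010, "acustica": 0.0},
    "politics": {"lunch": 0.1200, "obama": 0.1113, "web": 0.0015, "acustica": 0.0},
}
thresholds = {"religion": 0.0002, "politics": 0.0002}

print(sorted(top_k_union(scores, 1)))
print(sorted(top_k(scores, 2)))
print(sorted(union_above(scores, thresholds)))
print(sorted(intersection_above(scores, thresholds)))
```

On this toy input the union-based variants keep class-specific terms such as bibl and obama, while the intersection keeps only web, the single term that clears the threshold in both classes.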

11. Official runs
Bag of clustered SURF features transformed using PCA
• Result does not benefit from transformation
       official run   without FS/FT
mAP    0.2301         0.2309
CA     41.63 %        41.71 %
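
For orientation, PCA by eigenvalue decomposition can be sketched as follows. This is a generic textbook version using NumPy, not the authors' implementation; the function name `pca_transform` and the toy matrix are hypothetical, and the number of retained components used in the official run is not stated on the slides.

```python
import numpy as np

def pca_transform(X, n_components):
    """Project the rows of X onto the leading eigenvectors of the covariance matrix."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is suited to symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return X_centered @ eigvecs[:, order], eigvals[order]

# Toy data: three "videos" described by two histogram bins (made-up values).
X = np.array([[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]])
Z, variances = pca_transform(X, n_components=1)
print(Z.shape, variances)
```

The sign of each projected coordinate depends on the eigenvector orientation chosen by the solver, so only magnitudes and explained variances are comparable across runs.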

12. Official runs
Bag of filtered ASR transcript terms (Union>Th)
• Result does benefit from selection
       official run   without FS/FT
mAP    0.1035         0.0522
CA     32.53 %        26.54 %

13. Official runs
Bag of clustered SURF features filtered using MI and intersection>th strategy
• Result does slightly benefit from selection
       official run   without FS/FT
mAP    0.2259         0.2221
CA     40.80 %        40.78 %

14. Official runs
Bag of filtered terms derived from tags, title and descriptions (Union>Th)
• Result does benefit from selection
       official run   without FS/FT
mAP    0.5225         0.4146
CA     58.18 %        55.70 %

15. Official runs
Bag of clustered SURF features transformed using PCA and decision fusion using uploader
• Result does benefit from transformation
       official run   without FS/FT
mAP    0.3304         0.2988
CA     52.14 %        49.19 %

16. Conclusion & Future Work
FS showed potential for improving the results
The choice between MI and TF is not critical; both methods achieve roughly the same results
• Metadata (mAP): MI12004 (0.5277) vs. TF14976 (0.5275)
Investigation of different scaling schemes (NB)
Use of a class-independent selection score (MI)

17. Backup

18. Backup

19. Extracting visual features
SURF features are extracted from each key frame
• At keypoints and at a regular grid
Vocabulary is built using hierarchical clustering on SURF features of the development set
• 4096/8196 codewords
Term vector for a single video is obtained by bin-wise pooling of the key frames' term vectors
• avg
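
The bin-wise average pooling step can be sketched in a few lines. This is a minimal illustration assuming each key frame is already represented as a codeword histogram of equal length; the function name `avg_pool` and the toy histograms are hypothetical.

```python
def avg_pool(frame_histograms):
    """Bin-wise average of per-key-frame codeword histograms."""
    n_frames = len(frame_histograms)
    # zip(*...) groups the counts of the same codeword across all key frames.
    return [sum(bins) / n_frames for bins in zip(*frame_histograms)]

# Two key frames over a three-codeword vocabulary (made-up counts).
video_vector = avg_pool([[2, 0, 1], [0, 4, 1]])
print(video_vector)
```

Averaging rather than summing keeps the video-level vector comparable between videos with different numbers of key frames.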

20. MediaEval 2012: Tagging Task
Question: What is the video's blip.tv category?
Blip.tv database (cc): ~ 3300 h
• 5288 training videos
• 9550 test videos
Official evaluation measure is Mean Average Precision (mAP)
Workshop will be held 4-5 October 2012 in Pisa, Italy
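
As a reminder of how the evaluation measure works, the sketch below computes mean average precision from one ranked list per category. It uses the common non-interpolated definition and assumes each ranked list contains all relevant videos; the official MediaEval scoring tool may differ in details, and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """ranked_relevance: list of booleans, best-ranked video first."""
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """rankings: dict mapping category -> ranked relevance list."""
    return sum(average_precision(r) for r in rankings.values()) / len(rankings)

# Made-up rankings for two categories.
rankings = {
    "sports": [True, False, True],
    "health": [False, True],
}
print(mean_average_precision(rankings))
```

Because every category contributes equally to the mean, small categories weigh as much as large ones, unlike classification accuracy (CA), which is computed over all videos.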
