- The document describes a method called "dynamic parcellation by aggregation of clusters" (dypac) to generate individual brain parcellations from large fMRI datasets.
- The method was applied to cneuromod datasets totaling over 30 hours of fMRI data from movies and TV shows.
- Results show the individual parcellations have high reproducibility across similar contexts like different seasons of a TV show, and more moderate reproducibility across different stimuli like movies and tasks. Individual parcellations also better predict brain activity than group parcellations.
An introduction to the Courtois NeuroMod project: intensive brain scanning of six participants (fMRI, MEG) to help train artificial neural networks, with a focus on the first data release, cneuromod-2020.
This document provides tips for surviving a PhD program, including finding the right institution and advisor with matching research interests and working styles, developing strong research habits like writing every day in the same place and time, prioritizing goals and actions for the week, publishing research as early as possible, and ensuring your thesis is completed and signed. The key aspects of a PhD discussed are mastery of your subject, independent research skills, and communicating results to others.
EKAW2016 - Interest Representation, Enrichment, Dynamics, and Propagation: A ... - GUANGYUAN PIAO
Microblogging services such as Twitter have been widely adopted due to the highly social nature of the interactions they facilitate. With the rich information generated by users on these services, user modeling aims to acquire knowledge about a user's interests, which is a fundamental step towards personalization as well as recommendation. To this end, researchers have explored different dimensions such as (1) Interest Representation, (2) Content Enrichment, (3) Temporal Dynamics of user interests, and (4) Interest Propagation using semantic information from a knowledge base such as DBpedia. However, those dimensions of user modeling have largely been studied separately, and there is a lack of research on the synergetic effect of those dimensions for user modeling. In this paper, we address this research gap by investigating 16 different user modeling strategies produced by various combinations of those dimensions. Different user modeling strategies are evaluated in the context of a personalized link recommender system on Twitter. Results show that Interest Representation and Content Enrichment play crucial roles in user modeling, followed by Temporal Dynamics. The user modeling strategy considering Interest Representation, Content Enrichment and Temporal Dynamics provides the best performance among the 16 strategies. On the other hand, Interest Propagation has little effect on user modeling in the case of leveraging a rich Interest Representation or considering Content Enrichment.
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp... - Tulipp.Eu
- Computer vision has improved with more data and processing power, but global scene understanding remains challenging.
- The document proposes a multidisciplinary approach combining CNNs and human visual cognition to better model scene understanding, with the goal of applications like autonomous vehicles.
- It describes experiments observing how humans and primates recognize scenes to inform modeling, incorporating global and local descriptors with relationships. This approach aims to advance scene understanding capabilities.
CMPE 258: image object detection and ad suggestion based on a provided image. The application is hosted on AWS, and the web application is built with Django. ImageAI, Keras, and TensorFlow were used to build the object detection model; YOLOv3, RetinaNet, and YOLO-Tiny were among the models used.
The document provides an introduction to computer vision. It discusses key topics including:
- What computer vision is and why it is useful: it uses mathematical and computational tools to extract information from images and augment human vision.
- Some basic concepts in computer vision including digital images, sampling, noise removal, segmentation, and feature extraction techniques.
- Where computer vision is used such as healthcare, autonomous vehicles, augmented/virtual reality, industry, social media, security, agriculture, and fashion.
- A brief history of computer vision including classical approaches and the revolution enabled by advances in artificial intelligence and deep learning.
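Among the basic operations the introduction lists, noise removal is easy to illustrate. The sketch below is a generic 3x3 mean (box) filter in NumPy, not code taken from the document:

```python
import numpy as np

def mean_filter(img, k=3):
    """Smooth a 2-D grayscale image with a k x k mean (box) filter.
    Border pixels are handled by edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

# A flat image with one "salt" noise pixel is smoothed toward the background.
img = np.full((5, 5), 10.0)
img[2, 2] = 100.0  # noise spike
smoothed = mean_filter(img)
```

Real pipelines would use a vectorized or library implementation (e.g. a median or Gaussian filter), but the sliding-window idea is the same.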
The Frontier of Deep Learning in 2020 and Beyond - NUS-ISS
This talk will be a summary of the recent advances in deep learning research, current trends in the industry, and the opportunities that lie ahead.
We will discuss topics in research such as:
Transformers, GPT-3, BERT
Neural Architecture Search, Evolutionary Search
Distillation, self-learning
NeRF
Self-Attention
Also shifting industry trends such as:
The move to free data
Rising importance of 3D vision
Using synthetic data (Sim2Real)
Mobile vision & Federated Learning
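Of the research topics listed above, self-attention is the core operation behind Transformers, GPT-3, and BERT. A minimal single-head scaled dot-product sketch in NumPy (illustrative only, not material from the talk):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                   # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
```

Production Transformers add multiple heads, masking, and learned projections, but every head computes exactly this weighted mixture of value vectors.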
This document summarizes a project on real-time object detection using computer vision techniques. It discusses a system that can recognize objects in a video stream from a camera and annotate them with bounding boxes and class labels. It notes that most video surveillance footage is uninteresting unless there are moving objects. The project aims to address this by building an accurate, fast object detection system that can run on resource-constrained devices. It proposes using a hybrid CNN-SVM model trained on a large dataset to recognize objects and discusses the training and detection phases of the system.
This document is a tribute to Jiebo Luo's academic mentor and a recounting of Luo's personal journey in research. It describes key lessons Luo learned from his mentor including curiosity, open-mindedness, passion for research, and integrity. It then summarizes Luo's research areas including image processing, video coding, scene understanding, image classification, image captioning, visual event recognition, and using social multimedia to study social problems. It concludes with celebrating remaining forever young in academic research.
Real Time Object Detection using machine learning - pratik pratyay
This document discusses the development of a real-time object detection system using computer vision techniques. It aims to recognize and label moving objects in video streams from monitoring cameras with high accuracy and in a short amount of time. The system will use a hybrid model of convolutional neural networks and support vector machines for feature extraction and classification of objects from camera feeds into predefined classes. It is intended to help analyze surveillance video by only flagging clips that contain objects of interest like people or vehicles, reducing wasted storage and review time.
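The deck does not include code, but the classification half of a hybrid CNN-SVM pipeline can be sketched: feature vectors (which would come from a CNN backbone) are fed to a linear SVM trained by subgradient descent on the hinge loss. The toy clusters below stand in for CNN embeddings; all names are illustrative:

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Linear SVM via subgradient descent on the regularized hinge loss.
    X: (n, d) feature vectors (e.g. CNN embeddings); y: labels in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                    # inside margin: hinge gradient
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                             # only the regularization gradient
                w -= lr * lam * w
    return w, b

# Two well-separated clusters standing in for CNN features of two classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (20, 4)), rng.normal(2, 0.5, (20, 4))])
y = np.array([-1] * 20 + [1] * 20)
w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
```

In practice one would use an off-the-shelf SVM (e.g. scikit-learn's `LinearSVC`) on features extracted from a pretrained CNN; the sketch only shows why the two stages compose cleanly.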
The document summarizes a presentation on localized learning approaches for human activity recognition using sensor data. It discusses developing a wearable system to monitor vital signs of hospital patients in real-time. The presentation covers data preparation and feature extraction, and using machine learning algorithms like LS-SVM and KNN for modeling. It evaluates the approaches on synthetic and real-world activity recognition datasets, finding localized learning handles class imbalance and outperforms global models in terms of time performance and ability to handle streaming data.
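The KNN classifier mentioned above is small enough to sketch directly. This is a generic majority-vote implementation with made-up activity labels, not the presenters' code:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to each sample
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy sensor features: two "rest" samples near the origin, two "walk" samples.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["rest", "rest", "walk", "walk"])
label = knn_predict(X_train, y_train, np.array([4.8, 5.1]), k=3)
```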
Elderly Assistance - Deep Learning Theme detection - Tanvi Mittal
This was a capstone project for the AMPBA class of Winter 2019. It uses deep learning to analyse the theme of a video: it combines various pre-trained models, enhances them using transfer learning for the context of elderly assistance, and gives a real-time warning score for any suspicious activity.
Time series analysis: Refresher and Innovations - QuantUniversity
This document provides an overview of a presentation on time series analysis using the QuSandbox platform. The presentation was given by Sri Krishnamurthy, founder and CEO of QuantUniversity, at a QuantUniversity meetup in Boston on November 29, 2018. It covered topics including machine learning techniques for time series analysis, case studies analyzing temperature and swap rate data, and a demonstration of modeling time series data with neural networks.
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning - Ali Alkan
The document provides an introduction to image processing and recognition using machine learning. It discusses how deep learning uses hierarchical neural networks inspired by the human brain to learn representations of image data without requiring manual feature engineering. Deep learning has been applied successfully to problems like computer vision through convolutional neural networks. The document also describes how KNIME can be used as an open-source platform to visually build and run deep learning models for image processing tasks and integrate with other tools. It highlights several image processing and deep learning nodes available in KNIME.
Surveillance scene classification using machine learning - Utkarsh Contractor
The problem of scene classification in surveillance footage is of great importance for ensuring security in public areas. With challenges such as low-quality feeds, occlusion, viewpoint variations, and background clutter, the task is both challenging and error-prone, so it is important to keep false positives low to maintain a high detection accuracy. In this paper, we adapt high-performing CNN architectures to identify abandoned luggage in a surveillance feed. We explore several CNN-based approaches, from transfer learning on the ImageNet dataset to object classification using Faster R-CNNs on the COCO dataset. Using network visualization techniques, we gain insight into what the neural network sees and the basis of its classification decisions. The experiments have been conducted on real-world datasets and highlight the complexity of such classifications. The obtained results indicate that a combination of the proposed techniques outperforms the individual approaches.
Emotion recognition and drowsiness detection using python.ppt - Gopi Naidu
This document proposes a system that uses artificial intelligence and digital image processing techniques for facial recognition, emotion recognition, drowsiness detection, and ID card detection. It describes the key components of the system including facial recognition using kNN, emotion recognition using CNNs, drowsiness detection by analyzing eye blinking, and ID card detection through shape and color detection. The overall goal of the project is to build an affordable and efficient surveillance and feedback system using these computer vision and AI techniques.
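The summary says drowsiness is detected by analyzing eye blinking but gives no formula. Blink detection is commonly implemented with the eye aspect ratio (EAR) over six eye landmarks; the sketch below assumes that standard definition (an assumption, not something stated in the document):

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks p1..p6:
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|). Low values indicate a closed eye."""
    p1, p2, p3, p4, p5, p6 = eye
    v1 = np.linalg.norm(p2 - p6)   # vertical distance, outer pair
    v2 = np.linalg.norm(p3 - p5)   # vertical distance, inner pair
    h = np.linalg.norm(p1 - p4)    # horizontal eye width
    return (v1 + v2) / (2.0 * h)

# Synthetic landmark sets for an open and a nearly closed eye.
open_eye = np.array([[0, 0], [1, 1], [2, 1], [3, 0], [2, -1], [1, -1]], dtype=float)
closed_eye = np.array([[0, 0], [1, 0.1], [2, 0.1], [3, 0], [2, -0.1], [1, -0.1]], dtype=float)
EAR_THRESHOLD = 0.25  # typical cutoff; consecutive frames below it count as a blink
```

A drowsiness monitor would flag the driver when the EAR stays below the threshold for many consecutive frames rather than on any single blink.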
Interest in Deep Learning has been growing in the past few years. With advances in software and hardware technologies, Neural Networks are making a resurgence. With interest in AI based applications growing, and companies like IBM, Google, Microsoft, NVidia investing heavily in computing and software applications, it is time to understand Deep Learning better!
In this lecture, we will get an introduction to Autoencoders and Recurrent Neural Networks and understand the state-of-the-art in hardware and software architectures. Functional Demos will be presented in Keras, a popular Python package with a backend in Theano. This will be a preview of the QuantUniversity Deep Learning Workshop that will be offered in 2017.
This document proposes and evaluates methods for video-to-video translation using CycleGAN. It begins with a baseline method that applies CycleGAN to each video frame independently, resulting in inconsistent translations between frames. An improved method adds a flow-guided loss term to CycleGAN that considers optical flow between frames, producing more temporally coherent translations. Evaluation shows the flow-guided method generates higher-quality translations that better preserve details and consistency across frames when translating videos between day and night domains. Further optimizations to the model are suggested to improve results.
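The flow-guided idea can be sketched without a GAN: warp the previous translated frame along the optical flow and penalize its disagreement with the current translated frame. The integer-valued flow and L1 penalty below are simplifications for illustration, not the paper's actual loss:

```python
import numpy as np

def warp(frame, flow):
    """Warp a 2-D frame by an integer optical-flow field.
    flow[i, j] = (dy, dx) says pixel (i, j) came from (i - dy, j - dx)."""
    h, w = frame.shape
    out = np.zeros_like(frame)
    for i in range(h):
        for j in range(w):
            dy, dx = flow[i, j]
            si, sj = i - dy, j - dx
            if 0 <= si < h and 0 <= sj < w:
                out[i, j] = frame[si, sj]
    return out

def temporal_consistency_loss(translated_t, translated_prev, flow):
    """L1 disagreement between frame t and the flow-warped frame t-1."""
    return np.abs(translated_t - warp(translated_prev, flow)).mean()

prev = np.arange(16.0).reshape(4, 4)
flow = np.zeros((4, 4, 2), dtype=int)
flow[..., 1] = 1                      # every pixel shifted one column right
curr = warp(prev, flow)               # frame t that follows the flow exactly
loss_match = temporal_consistency_loss(curr, prev, flow)
loss_mismatch = temporal_consistency_loss(prev, prev, flow)
```

A translation that respects the motion field incurs zero penalty, while frame-independent translations that flicker are pushed toward coherence; real systems use subpixel bilinear warping and an occlusion mask.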
Introduction to computer vision with Convoluted Neural Networks - MarcinJedyk
Introduction to computer vision with Convoluted Neural Networks - going over history of CNNs, describing basic concepts such as convolution and discussing applications of computer vision and image recognition technologies
Creating 3D neuron reconstructions from image stacks and virtual slides - MBF Bioscience
This presentation was given at a workshop that focused on reconstructing neurons from image stacks to study neuron morphology. It covers strategies for capturing image stacks optimized for neuron reconstruction with Neurolucida 360, a new software product that makes it much easier to trace neurons from image stacks.
YU CS Summer 2021 Project | TensorFlow Street Image Classification and Object... - JacobSilbiger1
YU CS Summer 2021 Project | TensorFlow Street Image Classification and Object Detection Model
By: Nissim Cantor, Avi Radinsky, Jacob Silbiger
Github: https://github.com/ndcantor/tensorflow-street-classifier
Demo: https://www.youtube.com/watch?v=ItXdPJ3okMo
The document describes a project to perform object detection in videos. The team's scope was to identify, list, localize and bound objects in video frames using machine learning. They chose the MS-COCO dataset and the SSD model for its efficiency and speed at object detection. A comparative analysis found SSD_MOBILENET_V1_COCO to have the best balance of speed and accuracy. The team performed transfer learning to customize the model for new object types. They developed a web application using Flask that streams video frames from the client to perform object detection and returns bounding box coordinates.
The document discusses object detection pipelines. It begins by defining object detection as identifying objects in images and locating them with bounding boxes. The main components of an object detection pipeline are datasets, preprocessing, model selection and training, testing and evaluation. Popular models discussed are Faster R-CNN, R-FCN, and SSD which use deep convolutional neural networks as feature extractors and classifiers. Key evaluation metrics are mean average precision and prediction time/memory usage. Popular datasets mentioned are MSCOCO, Pascal VOC, and LSVRC. The document provides information on preprocessing, training including fine-tuning pre-trained models, and codes/models available on GitHub.
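The mean-average-precision metric mentioned above builds on intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal sketch, with boxes as (x1, y1, x2, y2) tuples and not tied to any specific pipeline in the document:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle: max of the top-left corners, min of the bottom-right.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two partially overlapping 10x10 boxes: intersection 25, union 175.
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

A detection is usually counted as a true positive when its IoU with a ground-truth box meets a threshold (0.5 in Pascal VOC; COCO averages over thresholds from 0.5 to 0.95), and mAP averages precision over recall levels and classes on top of that matching.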
Video has become ubiquitous on the Internet, TV, as well as personal devices. Recognition of video content has been a fundamental challenge in computer vision for decades, where previous research predominantly focused on recognizing videos using a predefined yet limited vocabulary. Thanks to the recent development of deep learning and knowledge graph techniques, researchers in multiple communities are now striving to bridge videos with natural language in order to move beyond classification to interpretation, which should be regarded as the ultimate goal of video understanding. We will present recent advances in exploring the synergy of video understanding and language processing techniques, including video entity linking, video-language alignment, and video captioning, and discuss how domain knowledge can fit in to improve the performance.
Does deep learning solve all the machine learning problems? Where would domain knowledge fit in? While it is common in medical data analytics to incorporate domain knowledge, we focus on one emerging area in computer vision and language processing, video+language, to answer these questions.
The Frontier of Deep Learning in 2020 and BeyondNUS-ISS
This talk will be a summary of the recent advances in deep learning research, current trends in the industry, and the opportunities that lie ahead.
We will discuss topics in research such as:
Transformers, GPT-3, BERT
Neural Architecture Search, Evolutionary Search
Distillation, self-learning
NeRF
Self-Attention
Also shifting industry trends such as:
The move to free data
Rising importance of 3D vision
Using synthetic data (Sim2Real)
Mobile vision & Federated Learning
This document summarizes a project on real-time object detection using computer vision techniques. It discusses using a system that can recognize objects in a video stream from a camera and label them with bounding boxes and labels. It notes that most video surveillance footage is uninteresting unless there are moving objects. The project aims to address this by building an accurate, fast object detection system that can run on resource-constrained devices. It proposes using a hybrid CNN-SVM model trained on a large dataset to recognize objects and discusses the training and detection phases of the system.
This document is a tribute to Jiebo Luo's academic mentor and a recounting of Luo's personal journey in research. It describes key lessons Luo learned from his mentor including curiosity, open-mindedness, passion for research, and integrity. It then summarizes Luo's research areas including image processing, video coding, scene understanding, image classification, image captioning, visual event recognition, and using social multimedia to study social problems. It concludes with celebrating remaining forever young in academic research.
Real Time Object Dectection using machine learningpratik pratyay
This document discusses the development of a real-time object detection system using computer vision techniques. It aims to recognize and label moving objects in video streams from monitoring cameras with high accuracy and in a short amount of time. The system will use a hybrid model of convolutional neural networks and support vector machines for feature extraction and classification of objects from camera feeds into predefined classes. It is intended to help analyze surveillance video by only flagging clips that contain objects of interest like people or vehicles, reducing wasted storage and review time.
The document summarizes a presentation on localized learning approaches for human activity recognition using sensor data. It discusses developing a wearable system to monitor vital signs of hospital patients in real-time. The presentation covers data preparation and feature extraction, and using machine learning algorithms like LS-SVM and KNN for modeling. It evaluates the approaches on synthetic and real-world activity recognition datasets, finding localized learning handles class imbalance and outperforms global models in terms of time performance and ability to handle streaming data.
Elderly Assistance- Deep Learning Theme detectionTanvi Mittal
It was a Capstone project for AMPBA class of 2019 Winter. It uses Deep Learning to analyse the theme of Video. It combines various pre-trained models, enhances them using Transfer learning for the context of Elderly assistance and gives us a Warning Score in real time for any suspicious activity.
Time series analysis : Refresher and InnovationsQuantUniversity
This document provides an overview of a presentation on time series analysis using the QuSandbox platform. The presentation was given by Sri Krishnamurthy, founder and CEO of QuantUniversity, at a QuantUniversity meetup in Boston on November 29, 2018. It covered topics including machine learning techniques for time series analysis, case studies analyzing temperature and swap rate data, and a demonstration of modeling time series data with neural networks.
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine LearningAli Alkan
The document provides an introduction to image processing and recognition using machine learning. It discusses how deep learning uses hierarchical neural networks inspired by the human brain to learn representations of image data without requiring manual feature engineering. Deep learning has been applied successfully to problems like computer vision through convolutional neural networks. The document also describes how KNIME can be used as an open-source platform to visually build and run deep learning models for image processing tasks and integrate with other tools. It highlights several image processing and deep learning nodes available in KNIME.
Surveillance scene classification using machine learningUtkarsh Contractor
The problem of scene classification in surveillance footage is of great importance for ensuring security in public areas. With challenges such as low quality feeds, occlusion, viewpoint variations, background clutter etc. The task is both challenging and error-prone. Therefore it is important to keep the false positives low to maintain a high accuracy of detection. In this paper, we adapt high performing CNN architectures to identify abandoned luggage in a surveillance feed. We explore several CNN based approaches, from Transfer Learning on the Imagenet dataset to object classification using Faster R-CNNs on the COCO dataset. Using network visualization techniques, we gain insight into what the neural network sees and the basis of classification decision. The experiments have been conducted on real world datasets, and highlights the complexity in such classifications. Obtained results indicate that a combination of proposed techniques outperforms the individual approaches.
Emotion recognition and drowsiness detection using python.pptGopi Naidu
This document proposes a system that uses artificial intelligence and digital image processing techniques for facial recognition, emotion recognition, drowsiness detection, and ID card detection. It describes the key components of the system including facial recognition using kNN, emotion recognition using CNNs, drowsiness detection by analyzing eye blinking, and ID card detection through shape and color detection. The overall goal of the project is to build an affordable and efficient surveillance and feedback system using these computer vision and AI techniques.
Interest in Deep Learning has been growing in the past few years. With advances in software and hardware technologies, Neural Networks are making a resurgence. With interest in AI based applications growing, and companies like IBM, Google, Microsoft, NVidia investing heavily in computing and software applications, it is time to understand Deep Learning better!
In this lecture, we will get an introduction to Autoencoders and Recurrent Neural Networks and understand the state-of-the-art in hardware and software architectures. Functional Demos will be presented in Keras, a popular Python package with a backend in Theano. This will be a preview of the QuantUniversity Deep Learning Workshop that will be offered in 2017.
This document proposes and evaluates methods for video-to-video translation using CycleGAN. It begins with a baseline method that applies CycleGAN to each video frame independently, resulting in inconsistent translations between frames. A improved method adds a flow-guided loss term to CycleGAN that considers optical flow between frames, producing more temporally coherent translations. Evaluation shows the flow-guided method generates higher quality translations that better preserve details and consistency across frames when translating videos between day and night domains. Further optimizations to the model are suggested to improve results.
Introduction to computer vision with Convoluted Neural NetworksMarcinJedyk
Introduction to computer vision with Convoluted Neural Networks - going over history of CNNs, describing basic concepts such as convolution and discussing applications of computer vision and image recognition technologies
Creating 3D neuron reconstructions from image stacks and virtual slidesMBF Bioscience
This presentation was given at a workshop that focused on reconstructing neurons from image stacks to study neuron morphology. It covers strategies for capturing image stacks optimized for neuron reconstruction with Neurolucida 360, a new software product that makes it much easier to trace neurons from image stacks.
YU CS Summer 2021 Project | TensorFlow Street Image Classification and Object...JacobSilbiger1
YU CS Summer 2021 Project | TensorFlow Street Image Classification and Object Detection Model
By: Nissim Cantor, Avi Radinsky, Jacob Silbiger
Github: https://github.com/ndcantor/tensorflow-street-classifier
Demo: https://www.youtube.com/watch?v=ItXdPJ3okMo
The document describes a project to perform object detection in videos. The team's scope was to identify, list, localize and bound objects in video frames using machine learning. They chose the MS-COCO dataset and the SSD model for its efficiency and speed at object detection. A comparative analysis found SSD_MOBILENET_V1_COCO to have the best balance of speed and accuracy. The team performed transfer learning to customize the model for new object types. They developed a web application using Flask that streams video frames from the client to perform object detection and returns bounding box coordinates.
The document discusses object detection pipelines. It begins by defining object detection as identifying objects in images and locating them with bounding boxes. The main components of an object detection pipeline are datasets, preprocessing, model selection and training, testing and evaluation. Popular models discussed are Faster R-CNN, R-FCN, and SSD which use deep convolutional neural networks as feature extractors and classifiers. Key evaluation metrics are mean average precision and prediction time/memory usage. Popular datasets mentioned are MSCOCO, Pascal VOC, and LSVRC. The document provides information on preprocessing, training including fine-tuning pre-trained models, and codes/models available on GitHub.
Video has become ubiquitous on the Internet, TV, as well as personal devices. Recognition of video content has been a fundamental challenge in computer vision for decades, where previous research predominantly focused on recognizing videos using a predefined yet limited vocabulary. Thanks to the recent development of deep learning and knowledge graph techniques, researchers in multiple communities are now striving to bridge videos with natural language in order to move beyond classification to interpretation, which should be regarded as the ultimate goal of video understanding. We will present recent advances in exploring the synergy of video understanding and language processing techniques, including video entity linking, video-language alignment, and video captioning, and discuss how domain knowledge can fit in to improve the performance.
Does deep learning solve all the machine learning problems? Where would domain knowledge fit in? While it is common in medical data analytics to incorporate domain knowledge, we focus on one emerging area in computer vision and language processing, video+language, to answer these questions.
Open Source Contributions to Postgres: The Basics (POSETTE 2024) ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
The Ipsos AI Monitor 2024 Report (Social Samosa)
According to Ipsos AI Monitor's 2024 report, 65% of Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
1. **Introduction to Jio Cinema**:
- Brief overview of Jio Cinema as a streaming platform.
- Its significance in the Indian market.
- Introduction to retention and engagement strategies in the streaming industry.
2. **Understanding Retention and Engagement**:
- Define retention and engagement in the context of streaming platforms.
- Importance of retaining users in a competitive market.
- Key metrics used to measure retention and engagement.
3. **Jio Cinema's Content Strategy**:
- Analysis of the content library offered by Jio Cinema.
- Focus on exclusive content, originals, and partnerships.
- Catering to diverse audience preferences (regional, genre-specific, etc.).
- User-generated content and interactive features.
4. **Personalization and Recommendation Algorithms**:
- How Jio Cinema leverages user data for personalized recommendations.
- Algorithmic strategies for suggesting content based on user preferences, viewing history, and behavior.
- Dynamic content curation to keep users engaged.
5. **User Experience and Interface Design**:
- Evaluation of Jio Cinema's user interface (UI) and user experience (UX).
- Accessibility features and device compatibility.
- Seamless navigation and search functionality.
- Integration with other Jio services.
6. **Community Building and Social Features**:
- Strategies for fostering a sense of community among users.
- User reviews, ratings, and comments.
- Social sharing and engagement features.
- Interactive events and campaigns.
7. **Retention through Loyalty Programs and Incentives**:
- Overview of loyalty programs and rewards offered by Jio Cinema.
- Subscription plans and benefits.
- Promotional offers, discounts, and partnerships.
- Gamification elements to encourage continued usage.
8. **Customer Support and Feedback Mechanisms**:
- Analysis of Jio Cinema's customer support infrastructure.
- Channels for user feedback and suggestions.
- Handling of user complaints and queries.
- Continuous improvement based on user feedback.
9. **Multichannel Engagement Strategies**:
- Utilization of multiple channels for user engagement (email, push notifications, SMS, etc.).
- Targeted marketing campaigns and promotions.
- Cross-promotion with other Jio services and partnerships.
- Integration with social media platforms.
10. **Data Analytics and Iterative Improvement**:
- Role of data analytics in understanding user behavior and preferences.
- A/B testing and experimentation to optimize engagement strategies.
- Iterative improvement based on data-driven insights.
"Semantic Indexing of Wearable Camera Images: Kids’Cam Concepts"
1. Semantic Indexing of Wearable Camera Images: Kids’Cam Concepts
Alan F. Smeaton (Dublin City University) … and …
2. ... Kevin McGuinness and Cathal Gurrin and Jiang Zhou and Noel E. O’Connor and Peng Wang and Brian Davis and Lucas Azevedo and Andre Freitas and Louise Signal and Moira Smith and James Stanley and Michelle Barr and Tim Chambers and Cliona Ní Mhurchu
3. Overview
• Automatic assignment of one-per-class concept detectors is now commonplace.
• We’re interested in the challenging case of processing images from wearable cameras, where improvement is necessary.
• We try to exploit some limited manual annotations to improve the accuracy of automatic concept weights.
• This work is not complete, it’s ongoing, but the story is interesting.
4. Analysis of Visual Media
• More progress made within the last few years than in the previous decade
• Incorporation of deep learning plus availability of huge searchable image resources and training data
• Automatic image tagging is now hosted and offered by websites like Aylien, Imagga, Clarifai, and others, and is very cost-effective.
5. Analysis of Visual Media
• These developments are welcome … but … restrictive tagging vocabularies.
• How do these map to the vocabulary users actually use when formulating queries?
• An alternative approach is tagging at query time, but it’s expensive and not scalable to huge collections.
• Almost all work on concept detection is based on one concept at a time.
• TRECVid tried simultaneous detection of concept pairs like “computer-screen with telephone” and “airplane with clouds”.
• Limited success, but “Government Leader with Flag” was OK!
• Detecting concepts independently needs a course correction because:
– It doesn’t avail of all available information sources
– It doesn’t map to a user’s search vocabulary
6. Long-term approach …
[Diagram: Images are mapped to a Concept Set, which in turn is mapped to the user’s search vocabulary]
How can a single image be mapped to two different vocabularies?
7. Using NL for image search … tagging
• NL is fraught with complexities and ambiguities at all levels:
– Lexical-level polysemy
– Syntactic-level structural ambiguity
– Semantic interpretations
– Discourse-level pronoun resolution
• Plus vocabulary limitations when finding a word or phrase to describe something
• When using computers to help search for image data, these language challenges are exacerbated, yet we assume a “simplistic” approach of tagging by a set of concepts, notwithstanding what we’re seeing with captioning here today
• Tagging is very useful for smaller, niche applications in restricted domains with manual tagging, but we see scalability problems
– Addressed with progress in automatic tagging, but we’re tolerant of inaccuracies!
8. In this paper …
• We are interested in images from wearable cameras, with lots of juicy challenges.
• Notoriously difficult to process automatically because:
– Blurring caused by the wearer moving at image capture
– Occlusions from the wearer’s hands
– Lighting conditions
– Fisheye lens for a wider perspective, causing distortion
– First-person viewpoint, but not what the wearer sees
– Content varies hugely across subjects
• Applications in memory support, behaviour recording and analysis, security, other work-related uses, and QS (quantified self).
• In this paper we work with wearable camera data from school children, for analysis of their environments
10. The Kids’Cam Project
• Child obesity is a significant public health concern worldwide.
• Unequivocal evidence that marketing of energy-dense and nutrient-poor foods and beverages is a causal factor in child obesity.
• Children’s total exposure to advertising of poor foodstuffs has not been quantified.
• The Kids’Cam study aimed to determine the frequency, nature and duration of children’s exposure to such marketing.
• 169 randomly selected children aged 11 to 13 from 16 schools in Wellington, NZ, each wore an Autographer and carried a GPS for 4 days … images every 7 seconds, GPS every 5 seconds.
– 1.5M images, 2.5M GPS datapoints
• Manual annotation for food/beverage marketing using a 3-level, 53-concept ontology … inter-annotator reliability of 90%.
15. Processing the Kids’Cam Data
• Following integration of the different data sources, and after manual annotation of the images, we processed the image collection in the following way …
20. Training Free Refinement
• Current concept-at-a-time classifiers do not consider inter-concept relationships or dependencies, yet these do exist
• To improve one-per-class detectors, we post-process detection scores
– We take advantage of concept co-occurrence and re-occurrence, which depend on the particular collection
– We take advantage of local (temporal) neighbourhood information, where concepts are likely to re-occur close in time
– We use GPS location information: concepts identified by a person at a location may re-occur subsequently at that same location
• TFR is based on non-negative matrix factorisation, described elsewhere
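TFR itself is described elsewhere, but the underlying idea can be sketched. The following is a minimal, illustrative Python example, not the actual TFR algorithm: the scores, the rank, and the plain multiplicative-update NMF are all invented for illustration. A low-rank non-negative factorisation of an images-by-concepts score matrix reconstructs each score from collection-wide co-occurrence structure, filling gaps and damping outliers:

```python
import numpy as np

def refine_scores(V, rank=1, iters=200, eps=1e-9):
    """Smooth an images-by-concepts score matrix with a low-rank
    non-negative factorisation (Lee-Seung multiplicative updates)."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # per-image loadings
    H = rng.random((rank, m)) + eps   # per-concept patterns
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W @ H  # reconstruction uses co-occurrence to fill gaps

# Toy scores: rows = consecutive images, cols = two co-occurring
# concepts. The zero at [1, 0] is a plausible detector "gap" for a
# concept its neighbouring images both contain.
V = np.array([[0.9, 0.8],
              [0.0, 0.7],
              [0.8, 0.9]])
V_smooth = refine_scores(V, rank=1)
print(V_smooth.round(2))
```

Because the two concepts co-occur across the collection, the rank-1 reconstruction pulls the missing score at [1, 0] up from zero, which is the kind of within-collection smoothing the slide describes.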
22. • We do not know the accuracy of assignment of the 1,000 concepts, but we do know the accuracy of assignment of the 53 concepts … and we have 1.5M images, each mapped into 2 concept spaces
• Can we adjust the values in (b), anchored and pivoting around (a), in addition to having already used local, within-collection distributions?
[Figure: two concept-weight plots: (a) manual, correct; (b) automatic, unknown accuracy]
24. Cross-mapping concept spaces
• Distributional semantics – a corpus-driven approach – based on the hypothesis that words co-occurring in similar contexts have similar meaning
• Using word2vec in DINFRA, we can map all words in a vocabulary to an n-dimensional vector space, where we can obtain relatedness scores among the words
• The figure illustrates an example
• For each image in Kids’Cam we can evaluate relatedness between the human annotation and the automatic concepts with highest probability
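In vector-space terms, the relatedness score between two words reduces to cosine similarity between their embeddings. A minimal sketch, using invented 3-d vectors in place of the real word2vec embeddings served by DINFRA:

```python
import numpy as np

def relatedness(u, v):
    """Cosine similarity between two word vectors (range -1..1)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-d "embeddings"; real word2vec vectors would have hundreds of
# dimensions and come from a trained model, not be written by hand.
vecs = {
    "burger":    np.array([0.9, 0.1, 0.0]),
    "fast_food": np.array([0.8, 0.2, 0.1]),
    "bicycle":   np.array([0.0, 0.1, 0.9]),
}

r_similar = relatedness(vecs["burger"], vecs["fast_food"])
r_unrelated = relatedness(vecs["burger"], vecs["bicycle"])
print(round(r_similar, 3), round(r_unrelated, 3))
```

For an image, one would score each top-ranked automatic concept against the manual annotation this way, giving the per-concept relatedness values used on the next slide.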
26. • We have the top-ranked concepts, their confidences, and their relatedness to the manual tags …
• Our first effort is to simply multiply, as in the Table, but it’s hard to see the impact of this
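The “simply multiply” step amounts to scaling each automatic concept’s confidence by its relatedness to the manual tag and re-ranking. A toy sketch, with invented concept names, confidences, and relatedness values:

```python
def adjust(confidence, relatedness):
    """First-pass reweighting: scale a detector confidence by the
    concept's relatedness to the manual annotation."""
    return confidence * relatedness

# (concept, detector confidence, relatedness to the manual tag)
ranked = [("burger", 0.80, 0.95),
          ("table",  0.70, 0.40),
          ("tree",   0.60, 0.10)]

adjusted = sorted(((c, adjust(conf, rel)) for c, conf, rel in ranked),
                  key=lambda t: t[1], reverse=True)
print(adjusted)
```

Concepts related to the (trusted) manual annotation keep most of their weight, while unrelated ones are suppressed; the slide’s point is that on real data the effect of this simple product is hard to see.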
27. And the result is …
• … and that’s where we currently are !
28. Conclusions and Future Work
• Since automatic concept detection using pre-defined models has made so much progress recently, we’re seeing vocabulary / concept-space mismatching
• Using 1.5M Kids’Cam images from wearable cameras, we have used within-collection distributions to “smooth” concept weights (outliers and gaps) in TFR
• We are trying to pivot around some manual annotations in order to improve concept accuracies
• But we need …
– More concepts – a richer vocabulary of them
– More varied manual annotations, not just fast-food adverts
– A more global or collection-wide way to combine concept confidences and relatedness to known manual annotations
– Some validation of the accuracy of the automatic concepts, to measure the accuracy of our post-processing
29. Finally, a plug …
• TRECVid Video Captioning Pilot task 2016
• 2,000 Vine videos, each manually annotated with captions, twice
• 8 participating groups (CMU, CUHK, DCU, GMU, NII, UvA, Sheffield)
• Two tasks …
– For each video, rank the 2,000 captions – metric is MRR
– For each video, generate your own caption – metrics are BLEU, METEOR, and the UMBC STS (Semantic Textual Similarity) service
• Lots of lessons learned, which we will build upon for a full task in 2017, probably again using Vine videos
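For reference, the MRR metric used in the caption-ranking task is the mean, over videos, of the reciprocal of the rank at which the correct caption appears. A minimal sketch with invented video and caption identifiers:

```python
def mean_reciprocal_rank(rankings, correct):
    """MRR: for each video, 1 / rank of its correct caption in the
    system's ranked list (ranks start at 1), averaged over videos."""
    total = 0.0
    for vid, ranked_captions in rankings.items():
        rank = ranked_captions.index(correct[vid]) + 1
        total += 1.0 / rank
    return total / len(rankings)

# Two toy videos: the correct caption for v1 is ranked 2nd,
# for v2 it is ranked 1st, so MRR = (1/2 + 1/1) / 2 = 0.75.
rankings = {"v1": ["c3", "c1", "c2"], "v2": ["c2", "c1", "c3"]}
correct = {"v1": "c1", "v2": "c2"}
print(mean_reciprocal_rank(rankings, correct))  # 0.75
```

In the actual task each ranked list would contain all 2,000 candidate captions rather than three.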