Bibliotheca Digitalis. Reconstitution of Early Modern Cultural Networks. From Primary Source to Data.
DARIAH / Biblissima Summer School, 4-8 July 2017, Le Mans, France.
1st day, July 4th – Digital sources: theoretical fundamentals.
From pixels to content.
Jean-Yves Ramel – Professor of Computer Science, Computer Laboratory, University of Tours.
Abstract: https://bvh.hypotheses.org/3294#conf-JYRamel
Self-Directing Text Detection and Removal from Images with Smoothing, by Priyanka Wagh
This document presents a method for self-directing text detection and removal from images using smoothing and exemplar-based inpainting (TLES+EBI). It introduces the problem of existing text detection systems requiring technical skills and outlines objectives to improve automaticity and visually plausible region filling. The methodology applies L0 gradient minimization smoothing to text detection, followed by exemplar-based inpainting for hole filling. Experimental results show smoothing improves detection rates while exemplar-based inpainting decreases MSE and increases PSNR compared to other methods. The document concludes the approach achieves better text detection and visually plausible region filling.
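The quality metrics cited above, MSE and PSNR, are standard and easy to compute. A minimal NumPy sketch (the function names are ours, not taken from the paper):

```python
import numpy as np

def mse(original, restored):
    """Mean squared error between two same-shaped 8-bit images."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    return np.mean(diff ** 2)

def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    error = mse(original, restored)
    if error == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / error)
```

A lower MSE and a higher PSNR for the inpainted region, as reported in the paper, indicate a reconstruction closer to the ground truth.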
Text mining is the process of extracting useful information and patterns from large collections of unstructured documents. It involves preprocessing texts, applying techniques like categorization, clustering, and summarization, and presenting or visualizing the results. While text mining has many applications in business, science, and other domains, it also faces challenges related to linguistics, analytics, and integrating domain knowledge. The document outlines the definition, techniques, applications, advantages, and limitations of text mining.
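As a toy illustration of the similarity computation underlying categorization and clustering, here is a minimal bag-of-words cosine similarity in plain Python (a sketch, not any particular system's implementation):

```python
from collections import Counter
import math

def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Bag-of-words cosine similarity between two documents (0.0 to 1.0)."""
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

Real text-mining pipelines would add preprocessing (tokenization, stemming, stop-word removal) and weighting such as TF-IDF before this step.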
Image Compression Through Combination Advantages From Existing Techniques, by CSCJournals
The tremendous growth of digital data has created a strong need for compression, whether to reduce memory usage or transmission time. Although many techniques already exist, there is still room and need for new ones in this area of study. With this paper we introduce a new technique for data compression through pixel combinations, usable for both lossless and lossy compression. The technique can work as a standalone solution or as an add-on to another data compression method, providing better results. It is applied here only to images, but it can easily be modified to work on any other type of data. We present a side-by-side comparison, in terms of compression rate, of our technique with other widely used image compression methods, and show that the compression ratio it achieves ranks among the best in the literature while the algorithm itself remains simple and easily extensible. Finally, we make the case for our method's ability to intrinsically support and enhance methods used for cryptography, steganography and watermarking.
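The paper's pixel-combination algorithm is not reproduced here, but the compression-rate comparison it describes can be illustrated with a generic ratio measurement; this sketch uses zlib purely as a stand-in compressor:

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Uncompressed size divided by compressed size; higher is better."""
    compressed = zlib.compress(data, level=9)
    return len(data) / len(compressed)
```

Running the same measurement over the same corpus for each candidate method is what makes a side-by-side comparison of compressors meaningful.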
This document discusses deep learning and convolutional neural networks. It defines artificial intelligence, machine learning, and deep learning, explaining that deep learning is a type of machine learning inspired by the human brain's neural networks. The document then describes commonly used CNN layers like convolution, pooling, padding, and fully connected layers. It provides examples of classic CNN architectures, including LeNet-5, AlexNet, and VGGNet, detailing their layer configurations and explaining how they were influential early models in deep learning.
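The convolution and pooling layers described here can be sketched in a few lines of NumPy — a didactic single-channel forward pass, not an actual framework implementation:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = feature_map[i * size:(i + 1) * size,
                                    j * size:(j + 1) * size].max()
    return out
```

Architectures such as LeNet-5 stack exactly these two operations (plus nonlinearities and fully connected layers) several times over multiple channels.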
Deep learning is receiving phenomenal attention due to breakthrough results in several AI tasks and significant research investment by top technology companies such as Google, Facebook, Microsoft and IBM. For someone who has not been introduced to this technology, it may be daunting to learn several concepts such as feature learning, Restricted Boltzmann Machines and Autoencoders all at once and start applying them to their own AI applications. This presentation is the first of several in a series intended for practitioners.
This document provides information on an image processing course offered at the university. The 3-credit, semester-long course is open to students in computer science, information science, electronics, and electrical engineering. The course introduces fundamental concepts of digital image processing including image formation, representation, and basic operations. It covers topics like image transformation, filtering, restoration, and compression. The course aims to familiarize students with principles and applications of image processing to help with final year projects.
The document outlines the course objectives, outcomes, examination scheme, and units of a Computer Graphics course. The course aims to acquaint students with basic concepts, algorithms, and techniques of computer graphics through understanding, applying, and creating graphics using OpenGL. Students will learn about primitives, transformations, projections, lighting, shading, animation and gaming. The course assessment includes a mid-semester test, end-semester test, and covers topics ranging from graphics primitives to fractals and animation.
The document discusses optical character recognition (OCR), including its history, current capabilities, and challenges. OCR is a technology that uses optical mechanisms to automatically recognize text characters, similar to how humans read. It involves converting scanned images of text into machine-encoded text. The summary discusses some of the key difficulties in OCR, such as distinguishing similar characters like 'O' and '0' or interpreting text against backgrounds. It also provides an overview of the paper, which will analyze the advancements and limitations of existing OCR systems to determine if it is suitable for different needs.
Seminar presentation about:
Automatic Image Annotation structure: shallow and deep,
pros and cons of different features and classification methods in AIA, and
useful information about databases, toolboxes and authors.
This document provides an overview of computer graphics systems. It discusses the basic components of a graphics system including input, computation, and output. For input, it describes common devices like mice, keyboards, and scanners. The computation stage involves transformations and rasterization to generate images. For output, it explains framebuffers, bit depths, and different display technologies like CRT, LCD, and projection displays. It provides details on how these displays work and their advantages/disadvantages.
This document summarizes a presentation on deep image processing and computer vision. It introduces common deep learning techniques like CNNs, autoencoders, variational autoencoders and generative adversarial networks. It then discusses applications including image classification using models like LeNet, AlexNet and VGG. It also covers face detection, segmentation, object detection algorithms like R-CNN, Fast R-CNN and Faster R-CNN. Additional topics include document automation using character recognition and graphical element analysis, as well as identity recognition using face detection. Real-world examples are provided for document processing, handwritten letter recognition and event pass verification.
This document summarizes a presentation by Dr. S. Ducasse on dedicated tools and research for software business intelligence at Tisoca 2014. It discusses:
- The need for dedicated tools tailored to specific problems to aid in maintenance, decision making, and reducing costs.
- The Moose technology for building custom analysis tools through its language-independent meta-model and ability to import different data sources.
- Examples of how analysis tools built with Moose have helped companies with challenges like migration, reverse engineering, and decision support.
- The benefits of an inventive toolkit approach that allows building multi-level dashboards, code analyzers, impact analyzers, and other custom tools to address specific problems.
This document discusses Motaz El Saban's research experience and interests which focus on analyzing, modeling, learning from, and predicting digital media content such as text, images, and speech. Some key areas of research include real-time video stitching, annotating mobile videos, object and activity recognition from videos, and facial expression recognition using deep learning techniques. The document also outlines El Saban's educational background and provides an agenda for his upcoming presentation.
YCIS_Forensic PArt 1 Digital Image Processing.pptx, by SharmilaMore5
Basics of Digital Image Processing
Use of DIP in Society
Digital Image Processing Process
Why do we process images?
Image Enhancement and Edge detection
Python
How are we using Python in DIP
This document discusses multimodal learning analytics (MLA), which examines learning through multiple modalities like video, audio, digital pens, etc. It provides examples of extracting features from these modalities to analyze problem solving, expertise levels, and presentation quality. Key challenges of MLA are integrating different modalities and developing tools to capture real-world learning outside online systems. While current accuracy is limited, MLA is an emerging field that could provide insights beyond traditional learning analytics.
Introduction, graphics primitives: pixel, resolution, aspect ratio, frame buffer. Display devices, and applications of computer graphics.
Scan conversion. Line drawing algorithms: Digital Differential Analyzer (DDA) and Bresenham's. Circle drawing algorithms: DDA, Bresenham's, and Midpoint.
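Both line-drawing algorithms listed in this unit are short enough to sketch directly (function names are ours, for illustration):

```python
def dda_line(x0, y0, x1, y1):
    """Digital Differential Analyzer: step along the longer axis, rounding each point."""
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]
    x_inc, y_inc = dx / steps, dy / steps
    x, y = float(x0), float(y0)
    points = []
    for _ in range(steps + 1):
        points.append((round(x), round(y)))
        x += x_inc
        y += y_inc
    return points

def bresenham_line(x0, y0, x1, y1):
    """Bresenham's algorithm: integer-only error accumulation (works in all octants)."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points
```

The practical difference the syllabus highlights: DDA uses floating-point increments and rounding, while Bresenham's needs only integer additions and comparisons.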
This document provides an overview of computer graphics systems. It discusses the basic components of a graphics system including input, computation, and output. For output, it describes raster display technologies like cathode ray tubes (CRTs) and liquid crystal displays (LCDs). It also discusses graphics memory and framebuffers for storing pixel color values, as well as color depth and dithering techniques. The goal of computer graphics is to solve the color function for each pixel on the display.
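Dithering, mentioned above, trades spatial resolution for apparent color depth. A minimal ordered-dither sketch with a 2x2 Bayer matrix (illustrative only, not taken from the document):

```python
import numpy as np

# Normalised 2x2 Bayer threshold matrix for ordered dithering
BAYER_2X2 = np.array([[0, 2],
                      [3, 1]]) / 4.0

def ordered_dither(gray):
    """Reduce an 8-bit grayscale image to 1 bit using a tiled Bayer threshold matrix."""
    h, w = gray.shape
    thresholds = np.tile(BAYER_2X2, (h // 2 + 1, w // 2 + 1))[:h, :w]
    return (gray.astype(np.float64) / 255.0 > thresholds).astype(np.uint8)
```

Because the threshold varies per pixel in a fixed pattern, mid-gray regions come out as checkerboard-like mixes of on and off pixels, which the eye averages back into gray.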
Kim Steenstrup Pedersen, Associate Professor, Image Section, Department of Computer Science, University of Copenhagen.
An overview of artificial intelligence and digital image analysis. Artificial intelligence, and especially the analysis of digital images and film, is currently developing at a rapid pace. The press regularly reports fantastic new breakthroughs in artificial intelligence (many of these stories originate from large companies such as Google, Facebook and Amazon). It is natural to ask: can I apply artificial intelligence to my image collection? In this talk I will give an overview of what artificial intelligence and digital image analysis are and what they can be used for. I will also give an insight into the strengths and weaknesses of existing methods, and in particular what to be aware of if you want to apply artificial intelligence to your own image collections.
Implementation of Computer Vision Applications using OpenCV in C++, by IRJET Journal
This document discusses implementing computer vision applications using OpenCV in C++. It explores integrating OpenCV and C++ for developing robust CV applications. Examples are presented demonstrating seamless integration, including facial recognition, motion detection, and augmented reality. Four use cases are implemented: color detection, barcode decoding, text recognition, and text on images. While some pre-trained libraries were unavailable in C++, the expected outcomes were achieved through alternative methods using OpenCV functions and algorithms. The future of OpenCV in C++ is discussed to include industrial automation, security, real-time systems, and integration with deep learning.
This document provides an introduction and syllabus for a machine vision course. The 2-day course covers MATLAB basics, image processing fundamentals, and 4 projects involving content-based image retrieval, depth from stereo images, image segmentation, and object recognition. The instructor provides several demonstrations of computer vision applications and encourages students to directly start working on projects.
This document discusses a method for handwritten character recognition using a K-nearest neighbors (K-NN) classification algorithm. It begins by introducing the problem of handwritten character recognition and the challenges involved. It then describes the main steps of the proposed method: preprocessing the image data, extracting features, and classifying characters using K-NN. The document tests the method on the MNIST dataset of handwritten digits, achieving an accuracy of 97.67%. It concludes that the method is able to accurately recognize handwritten characters independently of size, font, or writer style.
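The classification step of such a K-NN pipeline can be sketched in a few lines — a generic nearest-neighbour vote, not the paper's exact preprocessing or feature extraction:

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify one feature vector by majority vote of its k nearest neighbours."""
    dists = np.linalg.norm(train_x - query, axis=1)  # Euclidean distance to each sample
    nearest = np.argsort(dists)[:k]                  # indices of the k closest samples
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

For MNIST-style digits, `train_x` would hold flattened (or feature-extracted) images and `train_y` the digit labels; most of the reported accuracy comes from the preprocessing and feature-extraction steps that precede this vote.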
Adaptive membership functions for hand written character recognition by voron..., JPINFOTECH JAYAPRAKASH
The document proposes a new adaptive membership function approach for handwritten character recognition using image zoning. Existing zoning methods use static, non-adaptive membership functions that cannot model local feature distributions. The proposed system introduces adaptive membership functions selected for each zone using a genetic algorithm. It determines the optimal zoning topology and adaptive membership functions in a single process. Experimental results show it performs better than traditional zoning methods.
Introduction to Digital Image Processing Using MATLAB, by Ray Phan
This was a 3-hour presentation given to undergraduate and graduate students at Ryerson University in Toronto, Ontario, Canada, introducing Digital Image Processing using the MATLAB programming environment. It covers the basics of performing the most common image processing tasks, as well as an introduction to how digital images work and how they are formed.
You can access the images and code that I created and used here: https://www.dropbox.com/sh/s7trtj4xngy3cpq/AAAoAK7Lf-aDRCDFOzYQW64ka?dl=0
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by Madhu, Madhu Rock
This document summarizes an integrated approach to content-based image retrieval. It discusses extracting both color and texture features from images using color moments and local binary patterns. The system is tested on a database of 1000 images across 10 classes. Results show the integrated approach of using both color and texture features provides more accurate retrievals than using either feature alone. Evaluation metrics like precision, recall and accuracy are calculated to quantitatively analyze the system's performance. Overall, the proposed multi-feature approach is found to improve content-based image retrieval compared to single-feature methods.
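The two feature types mentioned, color moments and local binary patterns, are simple to compute. A NumPy sketch with our own function names (the paper's exact variants may differ):

```python
import numpy as np

def color_moments(image):
    """First three colour moments (mean, std, skewness) per channel of an HxWx3 image."""
    feats = []
    for c in range(image.shape[2]):
        channel = image[:, :, c].astype(np.float64).ravel()
        mean = channel.mean()
        std = channel.std()
        skew = np.cbrt(np.mean((channel - mean) ** 3))  # cube root keeps units comparable
        feats.extend([mean, std, skew])
    return np.array(feats)

def lbp_codes(gray):
    """Basic 3x3 local binary pattern code for each interior pixel."""
    g = gray.astype(np.int32)
    # 8 neighbours in clockwise order, as (row offset, col offset)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = g.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = g[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        codes |= ((neighbour >= g[1:h - 1, 1:w - 1]).astype(np.uint8) << bit)
    return codes
```

A retrieval system would typically histogram the LBP codes, concatenate that with the nine colour moments, and rank database images by distance to the query's combined feature vector.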
Mind the Gap: Another look at the problem of the semantic gap in image retrieval, by Jonathon Hare
Multimedia Content Analysis, Management and Retrieval 2006, San Jose, California, USA, 17 - 19 Jan 2006
http://eprints.soton.ac.uk/261887/
This paper attempts to review and characterise the problem of the semantic gap in image retrieval and the attempts being made to bridge it. In particular, we draw from our own experience in user queries, automatic annotation and ontological techniques. The first section of the paper describes a characterisation of the semantic gap as a hierarchy between the raw media and full semantic understanding of the media's content. The second section discusses real users' queries with respect to the semantic gap. The final sections of the paper describe our own experience in attempting to bridge the semantic gap. In particular we discuss our work on auto-annotation and semantic-space models of image retrieval in order to bridge the gap from the bottom up, and the use of ontologies, which capture more semantics than keyword object labels alone, as a technique for bridging the gap from the top down.
Hybrid Image Retrieval in Digital Libraries by Jean-Philippe Moreux & Guillau..., Europeana
1) A large-scale experiment was conducted on deep learning techniques for hybrid image retrieval across multiple collections relating to World War 1.
2) An extract-transform-load approach was used to mine data from catalogs, images, OCR text and tables of contents, and enrich illustrations with metadata and topic modeling.
3) Deep learning models were used for image genre classification with 90% accuracy, and visual recognition tasks like person and soldier detection achieved modest results of 55-60% recall compared to 21% for keyword searches.
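The recall figures quoted above come from standard set-based retrieval metrics, which can be computed as follows (a generic sketch, not the project's evaluation code):

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one retrieval run."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Comparing recall between visual recognition (55-60%) and keyword search (21%) over the same ground-truth set is exactly the kind of evaluation this function supports.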
By Marie-Luce Demonet (CESR). 25 November 2022, 2022 general assembly of the Bibliothèques Virtuelles Humanistes research programme. CESR, Tours.
Similar to Bibliotheca Digitalis Summer school: From pixels to content - Jean-Yves Ramel (20)
Par Marie-Luce Demonet (CESR). Le 25 novembre 2022, Assemblée générale 2022 du programme de recherche Bibliothèques Virtuelles Humanistes. CESR, Tours.
Par Anne-Laure ALLAIN, Marlène ARRUGA, Sarra FERJANI et Sandrine BREUIL. Le 25 novembre 2022, Assemblée générale 2022 du programme de recherche Bibliothèques Virtuelles Humanistes. CESR, Tours.
Par Toshinori Uetani (CESR), Guillaume Porte (ARCHE, Strasbourg). Le 25 novembre 2022, Assemblée générale 2022 du programme de recherche Bibliothèques Virtuelles Humanistes. CESR, Tours.
Par Nicole Dufournaud (LISAA), Philippe Gambette (LIGM), Toshinori Uetani (CESR), le 25 novembre 2022. Assemblée générale 2022 du programme de recherche Bibliothèques Virtuelles Humanistes. CESR, Tours.
Par Toshinori UETANI (CESR, BVH), Guillaume PORTE (ARCHE, Université de Strasbourg), Flora POUPINOT (Equipex Biblissima, CESR, BVH). Le 15 décembre 2021. CESR, Tours. https://bibfr.bvh.univ-tours.fr
Build applications with generative AI on Google CloudMárton Kodok
We will explore Vertex AI - Model Garden powered experiences, we are going to learn more about the integration of these generative AI APIs. We are going to see in action what the Gemini family of generative models are for developers to build and deploy AI-driven applications. Vertex AI includes a suite of foundation models, these are referred to as the PaLM and Gemini family of generative ai models, and they come in different versions. We are going to cover how to use via API to: - execute prompts in text and chat - cover multimodal use cases with image prompts. - finetune and distill to improve knowledge domains - run function calls with foundation models to optimize them for specific tasks. At the end of the session, developers will understand how to innovate with generative AI and develop apps using the generative ai industry trends.
06-20-2024-AI Camp Meetup-Unstructured Data and Vector DatabasesTimothy Spann
Tech Talk: Unstructured Data and Vector Databases
Speaker: Tim Spann (Zilliz)
Abstract: In this session, I will discuss the unstructured data and the world of vector databases, we will see how they different from traditional databases. In which cases you need one and in which you probably don’t. I will also go over Similarity Search, where do you get vectors from and an example of a Vector Database Architecture. Wrapping up with an overview of Milvus.
Introduction
Unstructured data, vector databases, traditional databases, similarity search
Vectors
Where, What, How, Why Vectors? We’ll cover a Vector Database Architecture
Introducing Milvus
What drives Milvus' Emergence as the most widely adopted vector database
Hi Unstructured Data Friends!
I hope this video had all the unstructured data processing, AI and Vector Database demo you needed for now. If not, there’s a ton more linked below.
My source code is available here
https://github.com/tspannhw/
Let me know in the comments if you liked what you saw, how I can improve and what should I show next? Thanks, hope to see you soon at a Meetup in Princeton, Philadelphia, New York City or here in the Youtube Matrix.
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/141-10June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
https://www.meetup.com/unstructured-data-meetup-new-york/events/301383476/?slug=unstructured-data-meetup-new-york&eventId=301383476
https://www.aicamp.ai/event/eventdetails/W2024062014
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of March 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
Discover the cutting-edge telemetry solution implemented for Alan Wake 2 by Remedy Entertainment in collaboration with AWS. This comprehensive presentation dives into our objectives, detailing how we utilized advanced analytics to drive gameplay improvements and player engagement.
Key highlights include:
Primary Goals: Implementing gameplay and technical telemetry to capture detailed player behavior and game performance data, fostering data-driven decision-making.
Tech Stack: Leveraging AWS services such as EKS for hosting, WAF for security, Karpenter for instance optimization, S3 for data storage, and OpenTelemetry Collector for data collection. EventBridge and Lambda were used for data compression, while Glue ETL and Athena facilitated data transformation and preparation.
Data Utilization: Transforming raw data into actionable insights with technologies like Glue ETL (PySpark scripts), Glue Crawler, and Athena, culminating in detailed visualizations with Tableau.
Achievements: Successfully managing 700 million to 1 billion events per month at a cost-effective rate, with significant savings compared to commercial solutions. This approach has enabled simplified scaling and substantial improvements in game design, reducing player churn through targeted adjustments.
Community Engagement: Enhanced ability to engage with player communities by leveraging precise data insights, despite having a small community management team.
This presentation is an invaluable resource for professionals in game development, data analytics, and cloud computing, offering insights into how telemetry and analytics can revolutionize player experience and game performance optimization.
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of May 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
Bangalore ℂall Girl 000000 Bangalore Escorts Service
Bibliotheca Digitalis Summer school: From pixels to content - Jean-Yves Ramel
1. Bibliotheca Digitalis
Reconstitution of Early Modern Cultural Networks
From Primary Source to Data
DARIAH / Biblissima Summer School
Le Mans, 4-8 July 2017
From Pixels to Contents
Jean-Yves Ramel
Professor of Computer Science,
Computer Laboratory, University of Tours
1st day, July 4th – Digital sources: theoretical fundamentals
2. From pixels to content: An overview of the main techniques used in DIA
Jean-Yves RAMEL
3. From Pixels to Contents
IMAGES / CONTENTS
Introduction
Document Image Analysis
Machine Learning
Content-Based Image Retrieval
Tools to automatically extract information inside the images / documents
Generation and utilization of meta-data
4. From Pixels … to Contents
Outline
From pixels…
What is an image?
Image (pre-)processing
… to Text
Transcription and Layout analysis
Segmentation and content extraction
An overview of Pattern Recognition
… but also to non-Text
Content characterization and signatures
Content retrieval and spotting
Back to meta-data?
From descriptive to perceptual meta-data
Are there adequate encoding formats?
Conclusions and perspectives
6. From Pixels…
Digitization = Set of pixels?
Images come from a grid of microscopic photosensitive cells called PIXELS
Sampling: space partitioning; continuous values (xi, yi) → discrete values (xi, yi) → pixels
Quantization: assignment of a numerical value derived from the lighting energy received per pixel (grid unit); it determines the range of colors that each pixel can take
[Figure: acquisition system – source of illumination → initial object (real world) → radiant energy → analogical image → sampling (space partitioning) → quantization (color information encoded by the sensors) → digitized object]
7. From Pixels…
What is an image?
Image Quantization
Binary images: I(i,j) = 0 (black) or I(i,j) = 1 (white)
Gray-level images (8 bits/pixel): I(i,j) = 0…255, from the darkest (0) to the lightest (255)
Color images (24 bits/pixel): 3 lighting-intensity values, Red, Green, Blue: I1(i,j) = 0…255, I2(i,j) = 0…255, I3(i,j) = 0…255
Image Representation & Processing
Image = array(s) of pixels = matrices of values
1 pixel = a position (i,j) inside the image + 1 color (1 to 3 values)
The value I(i,j) associated with each pixel s(i,j) represents its brightness intensity
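The matrix view above can be made concrete in a few lines of NumPy (a minimal sketch; the tiny 2x2 images and the threshold value are invented for illustration):

```python
import numpy as np

# A gray-level image is a matrix of intensities I(i,j) in 0..255;
# a color image stacks three such planes (Red, Green, Blue).
gray = np.array([[0, 128],
                 [255, 64]], dtype=np.uint8)      # 2x2 gray-level image
color = np.zeros((2, 2, 3), dtype=np.uint8)       # 2x2 RGB image
color[0, 0] = (255, 0, 0)                         # one pure red pixel

# Binarisation: I(i,j) = 1 (white) above a threshold, 0 (black) below.
binary = (gray > 127).astype(np.uint8)
print(binary)
```

Accessing a pixel is just matrix indexing: `gray[i, j]` gives one intensity, `color[i, j]` gives the three R, G, B values.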
8. From Pixels…
What is an image?
• Sampling → image resolution
• Number of pixels per length unit, in dpi (dots per inch) or ppp (points par pouce)
• When the resolution decreases, the precision decreases
• Example: A4 page = 21 x 29.7 cm
• 200 dpi: 1650 x 2340 pixels = 3,861,000 pixels
• 300 dpi: 3500 x 2480 pixels = 8,680,000 pixels
• 16M colors, 1 pixel = 3 bytes → 10 to 25 MB per page!
• A trade-off between quality and quantity/time is mandatory
• Fidelity of the numerical version
• Storage size – transmission / processing time
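The figures above follow from a short back-of-the-envelope computation (a sketch; the helper names are ours, and exact pixel counts differ slightly from the slide's rounded values):

```python
# Quality/size trade-off for an A4 page (21 x 29.7 cm ~ 8.27 x 11.69 in).
def page_pixels(dpi, width_cm=21.0, height_cm=29.7):
    width_px = round(width_cm / 2.54 * dpi)       # 1 inch = 2.54 cm
    height_px = round(height_cm / 2.54 * dpi)
    return width_px * height_px

def page_megabytes(dpi, bytes_per_pixel=3):       # 24-bit color = 3 bytes/pixel
    return page_pixels(dpi) * bytes_per_pixel / 1_000_000

print(round(page_megabytes(200), 1))   # about 11.6 MB per page
print(round(page_megabytes(300), 1))   # about 26.1 MB per page
```

This is exactly the quality/quantity trade-off: tripling nothing but the resolution from 200 to 300 dpi more than doubles the storage and transmission cost of every page.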
9. From Pixels…
Why are a few pixels so important?
Because, even at high resolution, some isolated patterns may correspond to only a small number of pixels (from 20 to 50)!
20 to 40% of the useful information is situated on the boundaries of the shapes and is modified randomly by the digitization process
The loss of a few pixels can change the appearance of characters to the point of confusion
Loss of topological information → broken shapes or touching characters → OCR errors
Human perception is not very sensitive to resolution
10. From Pixels…
What is image (pre-)processing?
After digitization, the images usually still have a lot of defects:
Curvature and skew due to scanning
Noise on boundaries, dots, blur, …
11. From Pixels…
Image (pre-)processing
Curvature and skew correction is possible on text images
12. From Pixels…
Image (pre-)processing
The problem is more complicated in the case of heterogeneous content
15. From pixels … to text
Automatic transcription but also layout analysis
Preservation of the link between the Text and the Image
1-Offrant appres plusieurs rues sur ce faites
2-au lieu et temps acostume aly livre et
3-accenste la somme et quantite dessoulz
4-estempte du voloir et consentement de
5-plusieurs des borgeys de la dite ville
6-de chastellion ______________IIIIxXX V flor(ins)
7-receu dudit franceys rotier pour
8-une albaleste dehue a cause de la
9-dite ferme dudit trezein avec la
10-somme et quantite dessus estempte ?
11- ______________II flor(ins)
12-receu de Alexandre ? escoffier al loup ?
13-dudit lieu de chatellion pour la
14-rense et ferme du commun de la dite
15-ville et communaute ? pour ung an commettre ?
16-alafeste nativite fi jehan baptiste lan
17-mil quatre cent et cinquante? A ly par le
18-pris dessouls esempt accense et hure ?
19-comme au plus offrant du voloyr et
20-et consentement de pluseures des bourgeys
21-de la dite ville de chastellion apres
22-pluseures rues ….refaytes es lieus ?
16. From pixels … to text
An overview of OCR mechanisms
First step: image segmentation
• Transformation of the image (a set of pixels) into higher-level patterns (regions of interest): Elements of Content (EoC)
• These EoC can be very simple (parts of characters) or more sophisticated (paragraphs, illustrations, …)
• EoC extraction: background (white) / foreground (black) separation
• Color image → grayscale → binarisation
17. From pixels … to text
An overview of OCR mechanisms
Just to illustrate the difficulties…
Most of the segmentation methods need a binarisation
Threshold selection?
18. From pixels … to text
An overview of OCR mechanisms
Just to illustrate the difficulties…
Most of the segmentation methods need a binarisation
Global threshold
Local thresholds
Niblack: T = m + k·s with k = -0.2 (m: local mean, s: local standard deviation)
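The Niblack rule can be sketched as follows (a deliberately naive sliding-window version for clarity; production code uses integral images, and the window size and test image here are arbitrary choices of ours):

```python
import numpy as np

def niblack_binarise(img, window=15, k=-0.2):
    """Local thresholding: T(i,j) = m + k*s over a window centred on (i,j)."""
    img = img.astype(np.float64)
    h, w = img.shape
    r = window // 2
    padded = np.pad(img, r, mode="edge")          # replicate borders
    out = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + window, j:j + window]
            t = patch.mean() + k * patch.std()
            out[i, j] = 1 if img[i, j] >= t else 0   # 1 = white, 0 = black
    return out

# A light page (value 200) with a dark vertical stroke (value 30):
page = np.full((20, 20), 200)
page[5:15, 9:11] = 30
result = niblack_binarise(page)
```

Because m and s are computed locally, the threshold adapts to uneven illumination, which is why local methods like Niblack behave better than a single global threshold on degraded historical pages.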
19. From pixels … to text
An overview of OCR mechanisms
First step: image segmentation / connected components
Then, we can try to group black pixels together to localize and recognize higher-level Elements of Content (EoC)
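Grouping black pixels is classically done by connected-component labelling; a minimal BFS flood-fill sketch with 8-connectivity (real DIA systems use much faster two-pass algorithms):

```python
import numpy as np
from collections import deque

def connected_components(binary):
    """Label 8-connected foreground regions (1 = black ink) of a binary image."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for si in range(h):
        for sj in range(w):
            if binary[si, sj] and not labels[si, sj]:
                count += 1                        # start a new component
                labels[si, sj] = count
                queue = deque([(si, sj)])
                while queue:
                    i, j = queue.popleft()
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (0 <= ni < h and 0 <= nj < w
                                    and binary[ni, nj] and not labels[ni, nj]):
                                labels[ni, nj] = count
                                queue.append((ni, nj))
    return labels, count

# Two separate 2x2 blobs -> two candidate Elements of Content.
binary = np.zeros((5, 5), dtype=int)
binary[0:2, 0:2] = 1
binary[3:5, 3:5] = 1
labels, count = connected_components(binary)
```

Each labelled component (with its bounding box) becomes a candidate EoC, and the broken/touching-character problems of the previous slides show up here as one character split into several components, or several characters merged into one.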
20. From pixels … to text
An overview of OCR mechanisms
Next step: layout analysis
• Connected components → words → lines → paragraphs → page
• The results have to be saved in an XML format (ALTO, …)
• Choosing how to organize the XML tree (physical / logical) is not so easy…
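The first grouping step of that chain can be sketched with bounding boxes (a toy heuristic of ours based on vertical overlap; the coordinates are invented, and real layout analysis uses many more cues):

```python
def group_into_lines(boxes):
    """Group (x, y, w, h) bounding boxes into text lines by vertical overlap,
    then order each line left to right (a deliberately naive heuristic)."""
    lines = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):   # top to bottom
        x, y, w, h = box
        for line in lines:
            _, ly, _, lh = line[-1]
            if y < ly + lh and ly < y + h:        # vertical intervals overlap
                line.append(box)
                break
        else:
            lines.append([box])                   # start a new text line
    return [sorted(line) for line in lines]

# Three components: two on a first line, one on a second line below.
boxes = [(50, 0, 10, 10), (0, 2, 10, 10), (0, 30, 10, 10)]
lines = group_into_lines(boxes)
```

The same overlap-and-proximity logic, applied recursively with larger gaps, yields words, paragraphs and finally the page tree that gets serialised to XML.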
21. From pixels … to text
An overview of OCR mechanisms
Next step: layout analysis
Two kinds of structures have been identified by researchers in DIA:
• The logical structure: the generic one, corresponding to a priori knowledge about the content of the document
• The physical structure: the analysed instance, corresponding to the EoC extracted inside the image, each associated with descriptive features (size, position, number of sub-patterns, …)
• Layout analysis tries to recognize these 2 structures (EoC identification)
[Figure: physical structure mapped onto the logical structure (sub-section, paragraph, sentence)]
22. From pixels … to text
An overview of OCR mechanisms
Next step: layout analysis
• The analysis / identification of the EoC is usually achieved with a rule-based system, defined through a (static) grammar or defined interactively by the users
European project Meta-e (http://meta-e.aib.uni-linz.ac.at/): the first commercial system for automatic layout analysis (Dutch books of the 18th century)
BVH – Paradiit project (https://sites.google.com/site/paradiitproject/): AGORA, a user-driven system for content extraction in historical printed books
23. From pixels … to text
An overview of OCR mechanisms
Next step: pattern / EoC recognition (toward Machine Learning)
How can computers recognize objects?
• We need a large set of (labelled) examples similar to the patterns to be recognized: a training set
• We need a list of stable and discriminative features (shape, color, size, …) used to describe the patterns (both the labelled ones and the unknown ones)
[Figure: a training set of labelled samples (A, E, …) projected as points in a 2-D representation space (x1, x2) built from stable and discriminative features]
24. From pixels … to text
An overview of OCR mechanisms
Next step: pattern / EoC recognition (toward Machine Learning)
How can computers recognize objects?
• When an unknown EoC arrives, we compute its features and compare it with the content of the training set (or with the models built from it)
[Figure: an unknown pattern "?" placed in the 2-D representation space (x1, x2) of the training set, close to the labelled "A" samples]
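The comparison step can be sketched as nearest-neighbour classification, the simplest instance of this scheme (the 2-D feature vectors and labels below are invented for illustration):

```python
import math

# Labelled training set: (feature vector, class label).
training = [((0.9, 0.1), "A"), ((0.8, 0.2), "A"),
            ((0.1, 0.9), "E"), ((0.2, 0.8), "E")]

def classify(features):
    """Label an unknown EoC after its nearest neighbour in the training set."""
    _, label = min(training, key=lambda ex: math.dist(ex[0], features))
    return label
```

For example, `classify((0.85, 0.15))` falls among the "A" samples. The quality of the result depends entirely on the two ingredients of the previous slide: a representative training set and features that keep the classes well separated in the representation space.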
26. From pixels … to text
An overview of OCR mechanisms
Why are commercial OCR systems not working well on historical documents?
Noise and degradations
Unusual layouts
Unsuited training sets
[Figure: FineReader output on a historical page]
27. From pixels … to text
An overview of OCR mechanisms
Why are commercial OCR systems not working well on historical documents?
Lack of data, knowledge and experience
Unusual fonts and characters → training data needs to be created
Unusual languages → lexicons, dictionaries and language models need to be created
Context often allows us to modify our understanding of what is perceived by our senses
Until now, we tried to recognize EoC without using their context
The same EoC could be interpreted differently according to its surrounding context
Results of OCR are highly correlated with the adequacy of the word dictionary used
Are there methods that need less a priori knowledge?
Could processing non-textual parts be a good source of inspiration?
29. From pixels … to non-textual contents
Pictorial content is also of high interest
Ornamental letters (more than 20,000)
Figures (more than 1,500)
BATYR: http://www.bvh.univ-tours.fr
30. From pixels … to non-textual contents
An overview of Content-Based Image Retrieval
Perceptual meta-data instead of classical meta-data
Computation of signatures for all the images, or even for sub-parts of the images (EoC)
• Signatures = visual features → perceptual meta-data
• Computation time is not crucial (off-line step)
[Figure: set of images to index → signature computation → index database (statistics, perceptual meta-data)]
31. From pixels … to non-textual contents
An overview of Content-Based Image Retrieval
Using images as queries instead of words
[Figure: back-office (indexation) / front-office (search engine)]
32. From pixels … to non-textual contents
An overview of Content-Based Image Retrieval
It is again a question of features… we speak about signatures
Off-line indexation: the signatures of the image dataset are computed and stored in an index database of perceptual signatures
On-line retrieval: the signature of the query image is computed and compared, in the representation space, with the indexed ones to return similar images
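Both phases can be sketched with one of the simplest possible signatures, a normalised gray-level histogram (an illustrative choice of ours; real CBIR systems combine many richer shape, color and texture features):

```python
import numpy as np

def signature(img, bins=8):
    """Off-line indexation: a perceptual signature (normalised histogram)."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist / hist.sum()

def retrieve(query_img, index):
    """On-line retrieval: rank indexed images by signature distance to the query."""
    q = signature(query_img)
    return sorted(index, key=lambda item: np.linalg.norm(signature(item[1]) - q))

# A dark query image should rank the dark indexed image first.
index = [("light", np.full((4, 4), 240)), ("dark", np.full((4, 4), 10))]
ranked = retrieve(np.full((4, 4), 20), index)
```

Note that nothing here recognizes what the images depict: the signatures are purely perceptual meta-data, which is exactly what makes the approach usable when no reliable recognition (OCR) is available.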
33. From pixels … to non-textual contents
An overview of Content-Based Image Retrieval
From CBIR to word spotting
Off-line indexation: EoC → signature computation → clustering
34. From pixels … to non-textual contents
An overview of Content-Based Image Retrieval
From CBIR to word spotting
On-line retrieval: query images → results: retrieved images
35.
From pixels … to non-textual contents
An overview of Content Based Image Retrieval
From CBIR to Word spotting
36. Is it just a question of meta-data?
IMAGES / CONTENTS
37. The standard model
Descriptive meta-data + transcription
We usually have:
• Descriptive meta-data in standard formats (MARC, EAD, Dublin Core, MODS, …), edited manually: "semantical" information
• Text transcription associated with additional meta-data (TEI), semi-automatically transcribed or manually edited: "semantical" information
[Figure: digitization → images → image analysis → text regions (XML); keyboarding → text (TEI); descriptive meta-data (MARC); all yielding "semantical" meta-data]
38. The standard model
Perceptual meta-data
It seems that CBIR can help to extract and save supplementary information about image content (EoC) without going up to the semantical level (recognition)
Regions of interest → visual features → perceptual signatures and index
Shapes, positions, colors, textures, … → numerical values (vectors)
[Figure: the previous pipeline extended with non-text regions → CBIR, adding "perceptual" meta-data to the descriptive meta-data]
39. Is it just a problem of meta-data?
What about the encoding formats?
Structuring the data and choosing file formats is a difficult problem
• Data architects are needed
• Some interesting formats linked with the previous discussions:
o METS – Metadata Encoding and Transmission Standard
o ALTO – Analyzed Layout and Text Object
o TEI – Text Encoding Initiative
[Figure: images → METS; image → ALTO; enriched documents → TEI …]
40. Is it just a problem of meta-data?
ALTO for OCR
ALTO = Analyzed Layout and Text Object
Standard XML format
Created in 2003 during the METAe project
Developed by the Graz, Linz and Innsbruck universities
Describes the content and the physical layout of one page
Used by several OCR software packages
Adapted and used by the BnF and other libraries
Drawbacks: huge / static
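The nesting ALTO uses can be sketched by generating a skeletal fragment (heavily simplified: the real schema requires a namespace, measurement units and many more attributes than shown; the coordinates below are invented):

```python
import xml.etree.ElementTree as ET

# Skeleton of ALTO's page description:
# Layout > Page > PrintSpace > TextBlock > TextLine > String,
# with the recognized text and its position carried as attributes.
alto = ET.Element("alto")
layout = ET.SubElement(alto, "Layout")
page = ET.SubElement(layout, "Page", ID="P1", WIDTH="2480", HEIGHT="3508")
space = ET.SubElement(page, "PrintSpace")
block = ET.SubElement(space, "TextBlock", ID="TB1")
line = ET.SubElement(block, "TextLine")
ET.SubElement(line, "String", CONTENT="Bibliotheca", HPOS="120", VPOS="80")
ET.SubElement(line, "String", CONTENT="Digitalis", HPOS="540", VPOS="80")
xml_text = ET.tostring(alto, encoding="unicode")
```

Because every word carries its coordinates, ALTO preserves the link between the text and the image discussed earlier, but encoding each String this way is also why the files become huge.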
41. Is it just a problem of meta-data?
TEI for transcription and enriched contents
TEI = Text Encoding Initiative – standard XML format
Content tagging and logical structure encoding (full document)
Used a lot by libraries
Too "open"? Quite "complex"?
42. Is it just a problem of meta-data?
METS – Metadata Encoding and Transmission Standard
• Open XML standard created in 2001 by the Digital Library Federation, maintained by the METS Editorial Board
• XSD schema
• Linking between multimedia objects
• Complete description of digitized content (images, texts, audio, sculptures, …)
• Physical / logical structures
• Descriptive meta-data (DC, MODS, MARC, …)
• …
First-level elements of a METS document:
metsHdr – header (info about author, software, …)
dmdSec – descriptive metadata section (bibliographic notice)
amdSec – administrative metadata section (copyright, …)
fileSec – file inventory section (file localisation)
structMap – structural maps (physical and logical)
structLink – structural map linking (links between structures)
Link between meta-data?
46. From Pixels to Contents
Conclusions
Building tools for the valorisation of digitized historical content is a pluri-disciplinary task:
Meta-data production → experts of the domains
Selection and verification of the data → experts + data curators
Structuration of the data and system → data / system architects
Computer vision, machine learning → data scientists
Manual indexing is needed
Descriptive meta-data → semantical meta-data
Standard formats for data encoding
Annotations could be seen as supplementary meta-data?
Operational methods and tools are available
Acquisition devices
Automatic tools: low-level image processing, OCR
Perceptual meta-data should be added: CBIR
47. From Pixels to Contents
Conclusions – Perspectives
The actual context: big data and heterogeneous collections
Connection between data, mutual enrichment, interoperability
Introduction and management of additional knowledge
Facing the diversity of the types of contents and usages
Quality of the interaction instead of only the quantity
Semantic Web: query reformulation, smart crawlers, automatic categorisation