PyCon APAC 2017: Page Layout Analysis of 19th Century Siamese Newspapers using Python and OpenCV
1. Page Layout Analysis of 19th Century Siamese Newspapers using Python and OpenCV
Mark Hollow
PyCon APAC, 2017
2. about me...
graduated in classical music · self-taught in computing
programming python since 2002 · 20 years working in IT
IT infrastructure · UNIX sysadmin · project management
software engineering · data systems · product management
3. once upon a time...
Dr Dan Beach Bradley - หมอ บรัดเลย
Born 18th July 1804, New York; died 23rd June 1873, Bangkok
Graduated as Doctor of Medicine from New York University
American Protestant missionary in Siam
Arrives in Bangkok on 18th July 1835 from Boston via Singapore
Brings with him the first printing press to Siam
Many notable achievements & firsts in Siam:
first surgery, first vaccination, first printed Royal edict, first newspaper, first classified & commercial advertising, first copyright transaction, first printing of Siamese law, first Siamese translation of the Old Testament, first monolingual Siamese dictionary
4. the first siamese newspaper
The Bangkok Recorder - หนังสือจดหมายเหตุ
1844–1845: magazine-like, fact-based, introduces western ideas, knowledge, science and Christianity
1865–1867: more social commentary and introduction of western liberalism -- rather controversial
a lot of historical information...
thai society (seen from a western perspective)
regional and global news/information
prices of goods, services, imports and exports
5. there is no online searchable database of this historical information.
6. digital bangkok recorder project
DigitalBangkokRecorder · markhollow.com
objectives:
- scan all the surviving editions
- transcribe all text
- make all text available online
- learn how to do all of this
in this presentation:
- cleaning scanned images
- detecting the page layout
- extracting all text lines
- preparing for transcription
7. page layout
2 column layout
front page: title & date lines
last page: tabular data
some illustrations
some full-width tables
8. a closer look...
large header on cover
dual-language headings
column separator line
topic separators
unique typeface
the first ever thai typeface
now-obsolete characters
not supported by modern ocr
11. what is opencv?
“OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.” - opencv.org
Written in C++; bindings for Python and others
v3.2 used here, v3.1 probably works
v2.x won’t work - different API structure
many v2.x blogs/articles still online - beware!
12. opencv basics: installation
$ pip install opencv-python
or
$ pip install opencv-contrib-python
(slide callout on the contrib package: non-free, patented stuff!!)
No FFmpeg, GTK or Carbon support - limits some features.
Works well in jupyter/ipython.
13. opencv basics: loading/saving images
loading images...
>>> import cv2
>>> img = cv2.imread('image001.jpg')
>>> type(img)
<type 'numpy.ndarray'>
saving images...
>>> cv2.imwrite('newfile.png', img)
OpenCV images are numpy arrays!
All common formats supported.
Extra args supported for image formats.
15. removing background noise (1)
- binarization: set pixel value based on threshold
- types: basic, adaptive
- both need experimentation with threshold value
# cv2.threshold returns (threshold used, output image), in that order
th, bin_image = cv2.threshold(image, 192, 255, cv2.THRESH_BINARY)
bin_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                  cv2.THRESH_BINARY, 101, 2)
16. removing background noise (2)
# the threshold argument (0) is ignored; the first return value is the threshold Otsu selected
th, bin_image = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
otsu binarization tries to find the best threshold value
example:
manual threshold guessed at v=192
otsu selects v=177
improvement in number of artifacts
19. morphological transforms (1)
- erosion: erodes away the boundaries of the foreground object
- dilation: dilates/thickens boundaries
NOTE: black = background, white = foreground
(a scanned page is black text on white, so it is inverted with ~ before and after)
kernel = numpy.ones((5, 5), numpy.uint8)
new_image = ~cv2.erode(~original_image, kernel, iterations=1)
20. morphological transforms (2)
- opening: erosion followed by dilation; used for removing noise
- closing: dilation followed by erosion; closes small holes in objects
kernel = numpy.ones((5, 5), numpy.uint8)
img2 = ~cv2.morphologyEx(~img1, cv2.MORPH_OPEN, kernel, iterations=1)
21. contours
“a curve joining all the continuous points having same color or intensity”
# OpenCV 3.x returns three values; 4.x returns only (contours, hierarchy)
_, contours, hierarchy = cv2.findContours(
    ~img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
col_img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
cv2.drawContours(col_img, contours, -1, (255,0,0), 5)
findContours return values:
contours: list of contours
hierarchy: contour structure
22. finding page margins
“open” removes artifacts; “dilate” emphasizes text
[Figure: the page after being opened & dilated]
findContours() to group blocks; filter out small contours.
get margin from contour edges
24. morphological transforms (revisited)
structuring element (kernel) array is made of 1’s & 0’s
it’s compared to each pixel
erode: takes minimum value
dilate: takes maximum value
a linear structuring element will operate on linear patterns
[Figure: dilation example showing kernel, input image and output image]
Diagram: http://docs.opencv.org/3.1.0/d1/dee/tutorial_moprh_lines_detection.html
>>> cv2.getStructuringElement(cv2.MORPH_RECT, (10, 1))
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=uint8)
25. page segmentation
[Figure: three numbered stages of line extraction]
- erode & dilate with a long linear structuring element: extracts lines to mask (once for horizontal, then for vertical lines)
- findContours() on the mask gets contour coordinates
- draw contour to remove line
- centre line of contours used as page section boundaries
26. page segmentation (full page)
[Figure: full page with section boundaries and page margin marked]
blank areas from average values of multiple adjacent lines
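One possible reading of "average values of multiple adjacent lines" is a row projection: compute the share of ink in each pixel row and report runs of adjacent, nearly empty rows. The sketch below follows that assumption; the function name and both thresholds are hypothetical, and the project's actual method may differ.

```python
import numpy as np


def blank_row_bands(ink, min_height=5, max_ink=0.01):
    """Return (start, end) row ranges, end exclusive, that are nearly ink-free.

    `ink` is a binarized page with non-zero ink on a zero background.
    """
    row_ink = (ink > 0).mean(axis=1)    # fraction of ink pixels in each row
    is_blank = row_ink <= max_ink

    bands, start = [], None
    for y, blank in enumerate(is_blank):
        if blank and start is None:
            start = y                   # a blank run begins
        elif not blank and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None                # the blank run ends
    if start is not None and len(is_blank) - start >= min_height:
        bands.append((start, len(is_blank)))
    return bands


# Synthetic page with two blocks of text separated by blank space.
ink = np.zeros((100, 50), dtype=np.uint8)
ink[10:30, 5:45] = 255
ink[60:80, 5:45] = 255

print(blank_row_bands(ink))
```

Requiring several adjacent blank rows (`min_height`) keeps ordinary line spacing from being reported as a blank area.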
28. template matching: finding objects in an image
a template is a small image segment
cv2.matchTemplate() returns match scores
result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
_, maxval, _, maxloc = cv2.minMaxLoc(result)
29. structural analysis complete
- margins identified
- horizontal and vertical lines detected
- original lines removed
- blank areas identified
- removed decorative markers with templates
- use template matching to identify titles
- and therefore page style (eg. first or other page)
35. transcription
- transcribe enough text for developing an OCR model
- regular ocr is very inaccurate due to the unique font
- hire typists or amazon mechanical turk
- there are a few problems to solve:
  - transcription cost, guidelines needed due to archaic text & unique typeface
  - how to develop an OCR system?
    - retrain tesseract 4.x’s LSTM (Long Short-Term Memory) neural network?
    - use tensorflow or similar?
- perhaps that’s my next PyCon presentation!
36. appendix
Not enough time to cover these topics… :-(
- Removing page frames *
- Skew correction *
- Detecting tables †
- Detecting pictures
* See https://markhollow.com/
† Coming soon
Other resources:
- ocropus / ocropy: python document analysis tools
- scantailor: GUI for cleaning scanned documents
- CE316 / CE866: Computer Vision, University of Essex, UK
  http://orb.essex.ac.uk/ce/ce316/
37. in summary...
opencv basics · thresholds · morphological transformations · contours · masks · template matching and a little bit of numpy
...plus a practical application to document layout analysis
38. thank you for listening.
questions?
Mark Hollow
markhollow.com
DigitalBangkokRecorder